Java Get File Size Efficiently

java get file size efficiently

Well, I tried to measure it with the code below:

For runs = 1 and iterations = 1, the URL method is fastest most of the time, followed by channel. I ran this about 10 times, each time from a fresh start with a pause in between. So for one-time access, using the URL is the fastest way I can think of (all times are in microseconds):

LENGTH sum: 10626, per Iteration: 10626.0

CHANNEL sum: 5535, per Iteration: 5535.0

URL sum: 660, per Iteration: 660.0

For runs = 5 and iterations = 50, the picture looks different.

LENGTH sum: 39496, per Iteration: 157.984

CHANNEL sum: 74261, per Iteration: 297.044

URL sum: 95534, per Iteration: 382.136

File must be caching the calls to the filesystem, while channels and URL have some overhead.

Code:

import java.io.*;
import java.net.*;
import java.util.*;

public enum FileSizeBench {

    LENGTH {
        @Override
        public long getResult() throws Exception {
            File me = new File(FileSizeBench.class.getResource(
                    "FileSizeBench.class").getFile());
            return me.length();
        }
    },
    CHANNEL {
        @Override
        public long getResult() throws Exception {
            File me = new File(FileSizeBench.class.getResource(
                    "FileSizeBench.class").getFile());
            // try-with-resources closes the stream even if opening or size() fails
            try (FileInputStream fis = new FileInputStream(me)) {
                return fis.getChannel().size();
            }
        }
    },
    URL {
        @Override
        public long getResult() throws Exception {
            URL url = FileSizeBench.class
                    .getResource("FileSizeBench.class");
            try (InputStream stream = url.openStream()) {
                // available() is only an estimate of the bytes readable
                // without blocking; it is not guaranteed to be the file size
                return stream.available();
            }
        }
    };

    public abstract long getResult() throws Exception;

    public static void main(String[] args) throws Exception {
        int runs = 5;
        int iterations = 50;

        EnumMap<FileSizeBench, Long> durations = new EnumMap<>(FileSizeBench.class);

        for (int i = 0; i < runs; i++) {
            for (FileSizeBench test : values()) {
                if (!durations.containsKey(test)) {
                    durations.put(test, 0L);
                }
                long duration = testNow(test, iterations);
                durations.put(test, durations.get(test) + duration);
                // System.out.println(test + " took: " + duration + ", per iteration: " + ((double) duration / (double) iterations));
            }
        }

        for (Map.Entry<FileSizeBench, Long> entry : durations.entrySet()) {
            System.out.println();
            System.out.println(entry.getKey() + " sum: " + entry.getValue() + ", per Iteration: " + ((double) entry.getValue() / (double) (runs * iterations)));
        }
    }

    // Runs the test `iterations` times and returns the elapsed time in microseconds.
    private static long testNow(FileSizeBench test, int iterations)
            throws Exception {
        long expected = -1;
        long before = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long result = test.getResult();
            if (expected == -1) {
                expected = result;
                // System.out.println(result);
            } else if (result != expected) {
                // every iteration must report the same size as the first one
                throw new Exception("variance detected!");
            }
        }
        return (System.nanoTime() - before) / 1000;
    }

}

Get total size of file in bytes

You can use the length() method on File which returns the size in bytes.
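As a minimal sketch, here is File.length() next to the NIO equivalent Files.size() (Java 7+). The class name FileSizeExample and the path myfile.txt are placeholders chosen for illustration. The practical difference is in failure handling: File.length() returns 0 for a missing file, while Files.size() throws an IOException.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class FileSizeExample {

    // Legacy API: returns 0L if the file does not exist.
    static long sizeWithFile(String name) {
        return new File(name).length();
    }

    // NIO API (Java 7+): throws an IOException if the size cannot be read,
    // for example when the file does not exist.
    static long sizeWithNio(String name) throws IOException {
        Path path = Paths.get(name);
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        // "myfile.txt" is a placeholder; pass a real path as the first argument.
        String name = args.length > 0 ? args[0] : "myfile.txt";
        System.out.println("File.length(): " + sizeWithFile(name));
        System.out.println("Files.size():  " + sizeWithNio(name));
    }
}
```

Both calls report the size in bytes; which one to use mostly depends on whether a silent 0 or an exception is the more useful signal for a missing file.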

File size caching & efficient retrieval of file sizes in Java

File systems generally store the length as part of the file description. This is how the OS knows where the end of the file is. This information is cached when accessed, so repeated requests for it are served from the cache.

Note: the OS often reads more data from disk than you ask for, because disk accesses are expensive and memory is relatively cheap. For example, when you get the length of one file, the OS may read in the details of many files on the assumption that you might want information about those files too. In other words, the first time you ask for a file's information, it is likely to be cached already.

Get size of folder or file

java.io.File file = new java.io.File("myfile.txt");
file.length();

This returns the length of the file in bytes, or 0 if the file does not exist. There is no built-in way to get the size of a folder. You have to walk the directory tree recursively (using the listFiles() method of a File object that represents a directory) and accumulate the directory size yourself:

public static long folderSize(File directory) {
    long length = 0;
    for (File file : directory.listFiles()) {
        if (file.isFile())
            length += file.length();
        else
            length += folderSize(file);
    }
    return length;
}

WARNING: This method is not sufficiently robust for production use. directory.listFiles() may return null and cause a NullPointerException. It also doesn't consider symlinks, and it may have other failure modes. For production code, prefer a more robust approach, such as walking the tree with java.nio.file.Files.walkFileTree (Java 7+).
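One way to address the weaknesses listed in the warning is to walk the tree with the NIO file visitor API. The following is only a sketch under stated assumptions: the class name FolderSize is made up for illustration, symbolic links are not followed (the default for walkFileTree), and entries that cannot be read are silently skipped, which may or may not be what you want.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

final class FolderSize {

    // Sums the sizes of all regular files below root.
    static long of(Path root) throws IOException {
        final AtomicLong total = new AtomicLong();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Symbolic links are reported here as links, not as regular
                // files, so they do not contribute to the total.
                if (attrs.isRegularFile()) {
                    total.addAndGet(attrs.size());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: skip unreadable entries instead of aborting.
                return FileVisitResult.CONTINUE;
            }
        });
        return total.get();
    }
}
```

Calling FolderSize.of(Paths.get("some/dir")) returns the accumulated size in bytes; the path is a placeholder.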

What is the most efficient way to determine length of a text file?

Use an ArrayList (your option #1). Read in your text file line by line with BufferedReader's readLine() method. It's simple, efficient and maintainable.
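A minimal sketch of that approach, assuming "length" here means the number of lines and that the file is UTF-8 encoded (the original question does not say). The class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class LineReader {

    // Reads the whole file line by line into an ArrayList.
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        // Assumption: UTF-8; pass a different Charset if the file uses another encoding.
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

The number of lines is then readLines(path).size(), and the list grows as needed, so the length does not have to be known before reading.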

Java: How to Count File Size Correctly in a Real-time manner

It is common for applications to buffer output and only push out data in lumps.

I suspect this is not the case here. Instead, I suspect Lucene is using memory-mapped files. A memory-mapped file grows with each allocation you make. Each allocation is expensive, but allocating more than you need is rather cheap, because it uses virtual memory and only consumes main memory and disk as you touch it. The most efficient thing to do is therefore to allocate large blocks and fill them up lazily. (E.g. I allocate 128 MB at a time with a 64-bit JVM.)
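The effect can be sketched with a plain FileChannel mapping. This is an illustration only, not how Lucene does it: the class name MappedGrowth and the 1 MB region size are assumptions, and the behaviour relied on (a read-write mapping beyond the end of the file extends the file) is what OpenJDK does as far as I know.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

final class MappedGrowth {

    // Maps regionSize bytes of the file read-write, writes a single byte,
    // and returns what File.length() reports afterwards.
    static long lengthAfterMapping(File file, long regionSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, regionSize);
            buffer.put(0, (byte) 1); // only one byte is actually used
        }
        return file.length();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped", ".bin");
        file.deleteOnExit();
        long regionSize = 1L << 20; // 1 MB, chosen arbitrarily for the demo
        System.out.println("length before: " + file.length());
        System.out.println("length after:  " + lengthAfterMapping(file, regionSize));
    }
}
```

The reported length jumps to the size of the mapped region even though only one byte was written, which is why the length of such a file moves in large steps rather than tracking the data written.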

File.length gives you the extent of the file, not how much of it has actually been used, or even how much disk space it occupies. You can see how much disk space is used with du on Unix, and possibly with some tool in Java 7 (I have only found the space used for file system roots, not for individual files).

Even so, this only tells you how many pages have been touched. The only way to know accurately how much has been used is to read the file, and even that has limited accuracy if the file is being modified while you read it.

EDIT: On Windows 7 the space appears to be reserved immediately, so you cannot create a sparse file larger than the size of the file system (as you can on ext4 file systems).
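For reference, the per-file-system figures mentioned above are available through java.nio.file.FileStore in Java 7. The sketch below, with the made-up class name StoreUsage, reports the space used on the store that holds a given path. It does not report the disk space used by an individual file.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class StoreUsage {

    // Bytes in use on the file store (partition or volume) that holds path.
    static long usedBytes(Path path) throws IOException {
        FileStore store = Files.getFileStore(path);
        return store.getTotalSpace() - store.getUnallocatedSpace();
    }

    public static void main(String[] args) throws IOException {
        // "." is a placeholder; pass any existing path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("used on store: " + usedBytes(path) + " bytes");
    }
}
```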


