Java get file size efficiently
Well, I tried to measure it with the code below (all times are in microseconds).
For runs = 1 and iterations = 1, the URL method is fastest most of the time, followed by channel. I ran this from a fresh start, with some pause in between, about 10 times. So for one-time access, using the URL is the fastest way I can think of:
LENGTH sum: 10626, per Iteration: 10626.0
CHANNEL sum: 5535, per Iteration: 5535.0
URL sum: 660, per Iteration: 660.0
For runs = 5 and iterations = 50, the picture looks different:
LENGTH sum: 39496, per Iteration: 157.984
CHANNEL sum: 74261, per Iteration: 297.044
URL sum: 95534, per Iteration: 382.136
Repeated File.length() calls are presumably served from the filesystem cache, while the channel and URL approaches carry the overhead of opening a stream on every call.
Code:
import java.io.*;
import java.net.*;
import java.util.*;

public enum FileSizeBench {

    LENGTH {
        @Override
        public long getResult() throws Exception {
            File me = new File(FileSizeBench.class.getResource(
                    "FileSizeBench.class").getFile());
            return me.length();
        }
    },
    CHANNEL {
        @Override
        public long getResult() throws Exception {
            File me = new File(FileSizeBench.class.getResource(
                    "FileSizeBench.class").getFile());
            // try-with-resources closes the stream and avoids a
            // NullPointerException in cleanup if opening the file fails
            try (FileInputStream fis = new FileInputStream(me)) {
                return fis.getChannel().size();
            }
        }
    },
    URL {
        @Override
        public long getResult() throws Exception {
            URL url = FileSizeBench.class
                    .getResource("FileSizeBench.class");
            try (InputStream stream = url.openStream()) {
                // available() is only an estimate of the bytes readable
                // without blocking; it is not guaranteed to be the file size
                return stream.available();
            }
        }
    };

    public abstract long getResult() throws Exception;

    public static void main(String[] args) throws Exception {
        int runs = 5;
        int iterations = 50;
        EnumMap<FileSizeBench, Long> durations = new EnumMap<FileSizeBench, Long>(FileSizeBench.class);
        for (int i = 0; i < runs; i++) {
            for (FileSizeBench test : values()) {
                if (!durations.containsKey(test)) {
                    durations.put(test, 0L);
                }
                long duration = testNow(test, iterations);
                durations.put(test, durations.get(test) + duration);
                // System.out.println(test + " took: " + duration + ", per iteration: " + ((double)duration / (double)iterations));
            }
        }
        for (Map.Entry<FileSizeBench, Long> entry : durations.entrySet()) {
            System.out.println();
            System.out.println(entry.getKey() + " sum: " + entry.getValue() + ", per Iteration: " + ((double)entry.getValue() / (double)(runs * iterations)));
        }
    }

    // Runs the test `iterations` times and returns the elapsed time in microseconds.
    private static long testNow(FileSizeBench test, int iterations)
            throws Exception {
        long result = -1;
        long before = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long current = test.getResult();
            if (result == -1) {
                result = current;
                //System.out.println(result);
            } else if (current != result) {
                // compare against the previous value; the original check
                // assigned and compared the same variable, so it never fired
                throw new Exception("variance detected!");
            }
        }
        return (System.nanoTime() - before) / 1000;
    }
}
Get total size of file in bytes
You can use the length() method on File, which returns the size in bytes.
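As a minimal sketch of the above, the following reads a file's size with File.length() and, for comparison, with java.nio.file.Files.size(). The NIO call is not part of the answer; it reports the same byte count but throws an exception for a missing file instead of returning 0. The class name FileSizeExample, the helper names, and the temp file are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileSizeExample {

    /** Size in bytes via java.io.File; returns 0 if the file does not exist. */
    static long sizeWithFile(Path path) {
        return path.toFile().length();
    }

    /** Size in bytes via NIO; throws an IOException if the file is missing. */
    static long sizeWithNio(Path path) throws IOException {
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("size-example", ".txt");
        try {
            // "hello" is 5 bytes in UTF-8
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(sizeWithFile(tmp)); // prints 5
            System.out.println(sizeWithNio(tmp));  // prints 5
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```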
File size caching & efficient retrieval of file sizes in Java
File systems generally store the length as part of the file's metadata, so the OS knows where the file ends. This information is cached when it is accessed, and repeated calls for it will also be served from the cache.
Note: the OS often reads more data from disk than you ask for, because disk access is expensive and memory is relatively cheap. For example, when you get the length of one file, it may read in the details of many files on the assumption that you might want information about those too. In other words, the first time you ask for a file's information, it is likely to be cached already.
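One rough way to observe this, as a sketch only: time several consecutive File.length() calls on the same file and compare the first call with the later ones. The class LengthCacheProbe and its method are hypothetical names. The numbers depend heavily on the OS, the file system, and JVM warm-up, and a file that was touched recently (such as a freshly created temp file) is already in the cache, so pass the path of a file that has not been read for a while to see any difference.

```java
import java.io.File;
import java.io.IOException;

public class LengthCacheProbe {

    /** Calls file.length() `calls` times and returns each call's duration in nanoseconds. */
    static long[] timeLengthCalls(File file, int calls) {
        long[] nanos = new long[calls];
        for (int i = 0; i < calls; i++) {
            long start = System.nanoTime();
            file.length(); // result ignored; only the call's cost is measured
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) throws IOException {
        File file;
        if (args.length > 0) {
            file = new File(args[0]);
        } else {
            // Fallback for a quick run; this file is already cached by the OS
            file = File.createTempFile("probe", ".bin");
            file.deleteOnExit();
        }
        long[] nanos = timeLengthCalls(file, 5);
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("call " + (i + 1) + ": " + nanos[i] + " ns");
        }
    }
}
```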
Get size of folder or file
java.io.File file = new java.io.File("myfile.txt");
file.length();
This returns the length of the file in bytes, or 0 if the file does not exist. There is no built-in way to get the size of a folder; you have to walk the directory tree recursively (using the listFiles() method of a File object that represents a directory) and accumulate the directory size yourself:
public static long folderSize(File directory) {
    long length = 0;
    for (File file : directory.listFiles()) {
        if (file.isFile())
            length += file.length();
        else
            length += folderSize(file);
    }
    return length;
}
WARNING: This method is not sufficiently robust for production use. directory.listFiles() may return null and cause a NullPointerException. Also, it doesn't consider symlinks and possibly has other failure modes. Prefer a more robust approach for production code.
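One possible sturdier variant, sketched here with the NIO Files.walkFileTree API rather than listFiles(): it does not follow symbolic links by default and lets you skip entries that cannot be read instead of failing on null. The class name FolderSize and the decision to silently skip unreadable entries are assumptions made for this example, not something the answer prescribes.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

public class FolderSize {

    /** Sums the sizes of all regular files under root, without following symlinks. */
    static long folderSize(Path root) throws IOException {
        final AtomicLong total = new AtomicLong();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Symlinks and special files are not regular files, so they are skipped
                if (attrs.isRegularFile()) {
                    total.addAndGet(attrs.size());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: unreadable entries are ignored rather than aborting the walk
                return FileVisitResult.CONTINUE;
            }
        });
        return total.get();
    }

    public static void main(String[] args) throws IOException {
        Path root = java.nio.file.Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(folderSize(root) + " bytes");
    }
}
```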
What is the most efficient way to determine length of a text file?
Use an ArrayList (your option #1). Read in your text file line by line with BufferedReader's readLine() method. It's simple, efficient and maintainable.
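A minimal sketch of that approach, assuming the file is UTF-8 text and that "length" means the number of lines; the class name LineReader and the method readLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LineReader {

    /** Reads the whole file into an ArrayList, one element per line. */
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of file
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = readLines(Paths.get(args[0]));
        System.out.println(lines.size() + " lines");
    }
}
```

The list grows as needed, so the number of lines does not have to be known before reading.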
Java: How to Count File Size Correctly in a Real-time manner
It is common for applications to buffer output and only push out data in lumps.
I suspect this is not the case here. Instead, I suspect Lucene is using memory-mapped files. When you grow a memory-mapped file, it grows with each allocation you make. Since each allocation is expensive, but allocating more than you need is rather cheap (it uses virtual memory and only consumes main memory and disk as you touch it), the most efficient thing to do is to allocate large blocks and then fill them up lazily. (E.g. I allocate 128 MB at a time with a 64-bit JVM.)
File.length gives you the extent of the file, not how much has actually been used or even how much disk space is used. You can see how much disk space has been used with du on Unix and possibly with some tool in Java 7 (I have only found the space used for file system roots, not files).
Even so, this only tells you how many pages have been touched. The only way to know accurately how much has been used is to read the file, and this has limited accuracy if the file is being modified while you read it.
EDIT: On Windows 7 the space appears to be reserved immediately, so you cannot create a sparse file larger than the size of the file system (as you can on ext4 filesystems).
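To illustrate why File.length() can be much larger than the data written so far, here is a small sketch that sets a file's extent up front, maps it, and writes only a few bytes. It is not Lucene's actual code; the class name MappedExtent, the method name, and the sizes used are assumptions for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedExtent {

    /**
     * Sets the file's length to `extent`, maps it, writes `data` at the start,
     * and returns what File.length() reports afterwards.
     */
    static long preallocateAndWrite(File file, long extent, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            raf.setLength(extent); // reserve the full extent in one step
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, extent);
            buffer.put(data);      // touch only the first few bytes
            buffer.force();        // flush the written pages to the file
        }
        return file.length();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped", ".bin");
        file.deleteOnExit();
        long reported = preallocateAndWrite(file, 1L << 20, new byte[10]);
        // Only 10 bytes were written, but the reported length is the full extent
        System.out.println(reported); // prints 1048576
    }
}
```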