How to tell if a file is gzip compressed?
The magic number for gzip compressed files is 1f 8b. Although testing for this is not 100% reliable, it is highly unlikely that "ordinary text files" start with those two bytes; in UTF-8 that sequence is not even legal.
Usually gzip compressed files sport the suffix .gz, though. Even gzip(1) itself won't unpack files without it unless you --force it to. You could conceivably use that, but you'd still have to deal with a possible IOError (which you have to do in any case).
One problem with your approach is that gzip.GzipFile() will not throw an exception if you feed it an uncompressed file; only a later read() will. This means that you would probably have to implement some of your program logic twice. Ugly.
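One way to keep that duplicated logic in a single place is a small helper that tries the gzip path first and falls back to a plain read. This is only a sketch: the name read_maybe_gzip is made up for illustration, and it assumes the whole file fits in memory.

```python
import gzip


def read_maybe_gzip(path):
    """Return the file's bytes, decompressing them if the file is gzip."""
    try:
        # Opening succeeds even for a plain file; the header is parsed lazily.
        with gzip.open(path, "rb") as f:
            return f.read()  # a "not a gzip file" error surfaces here
    except OSError:
        # gzip.BadGzipFile (Python 3.8+) is a subclass of OSError.
        with open(path, "rb") as f:
            return f.read()
```

Note that a truncated or corrupted gzip file can raise other exceptions (EOFError, zlib.error), which this sketch deliberately lets through.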
How to check if a file is gzip compressed?
There is a magic number at the beginning of the file. Just read the first two bytes and check whether they are 0x1f followed by 0x8b.
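In Python, that check could look roughly like this. The function name is_gzip_file is just an illustrative choice, not something from a library.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip_file(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as f:
        # A file shorter than two bytes yields a shorter result and fails the test.
        return f.read(2) == GZIP_MAGIC
```

This only inspects the first two bytes, so it says the file claims to be gzip, not that the rest of it decompresses cleanly.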
How to check whether file is gzip or not in Java
Use this utility class, which I found on Google:
package example;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.zip.GZIPInputStream;

public class GZipUtil {

    /**
     * Checks if an input stream is gzipped.
     *
     * @param in the stream to inspect
     * @return true if the first two bytes are the gzip magic number
     */
    public static boolean isGZipped(InputStream in) {
        if (!in.markSupported()) {
            // Caveat: reset() below only rewinds this local wrapper, so the
            // caller's original stream has still lost the bytes read here.
            in = new BufferedInputStream(in);
        }
        in.mark(2);
        int magic = 0;
        try {
            // GZIP_MAGIC is 0x8b1f, i.e. the bytes 1f 8b read little-endian.
            magic = in.read() & 0xff | ((in.read() << 8) & 0xff00);
            in.reset();
        } catch (IOException e) {
            e.printStackTrace(System.err);
            return false;
        }
        return magic == GZIPInputStream.GZIP_MAGIC;
    }

    /**
     * Checks if a file is gzipped.
     *
     * @param f the file to inspect
     * @return true if the first two bytes are the gzip magic number
     */
    public static boolean isGZipped(File f) {
        int magic = 0;
        // try-with-resources closes the file even if a read fails.
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            magic = raf.read() & 0xff | ((raf.read() << 8) & 0xff00);
        } catch (IOException e) {
            e.printStackTrace(System.err);
        }
        return magic == GZIPInputStream.GZIP_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        File gzf = new File("/tmp/1.gz");
        // Check if a file is gzipped.
        System.out.println(isGZipped(gzf));
        // Check if an input stream is gzipped.
        try (InputStream in = new FileInputStream(gzf)) {
            System.out.println(isGZipped(in));
        }
    }
}
Is it possible to check whether a file (.gz) has been compressed more than once?
You can check for a valid gzip header within the file. A gzip file should contain a defined header starting with two bytes with the values 0x1f and 0x8b (see the gzip spec, RFC 1952). You can check these bytes to see if they match the header values:
byte[] b = new byte[2];
int n;
// try-with-resources makes sure the stream is closed again
try (InputStream is = new FileInputStream(new File(filePath))) {
    n = is.read(b);
}
if (n == 2 && b[0] == (byte) 0x1f && b[1] == (byte) 0x8b) {
    // starts with the 2-byte gzip header
} else {
    // not a gzip file
}
These two bytes alone have a roughly 1 in 65,536 chance of occurring at random, but depending on the data you expect to receive, that can be enough to base your decision on. To be more confident of the call, you can read further into the header to be sure it follows valid spec values (see the spec above; for example, the third byte is typically, but not always, an 8 for DEFLATE compression, and so on).
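To illustrate the stricter header check, and the original question about multiple compression passes, here is a rough Python sketch. The names looks_like_gzip_header and count_gzip_layers are hypothetical, as is the max_layers safety limit. It assumes the data fits in memory, and it treats any compression method other than 8 as "not gzip", which is stricter than the "typically" above.

```python
import gzip
import zlib


def looks_like_gzip_header(data):
    """Check a few fixed fields of the 10-byte gzip header (RFC 1952)."""
    if len(data) < 10:
        return False
    id1, id2, cm, flg = data[0], data[1], data[2], data[3]
    return (
        id1 == 0x1F and id2 == 0x8B  # magic number
        and cm == 8                  # compression method: DEFLATE
        and (flg & 0xE0) == 0        # reserved flag bits must be zero
    )


def count_gzip_layers(data, max_layers=10):
    """Count how many times `data` appears to have been gzip-compressed."""
    layers = 0
    while layers < max_layers and looks_like_gzip_header(data):
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            break  # header looked right, but the body is not valid gzip
        layers += 1
    return layers
```

A result greater than 1 means the decompressed payload itself looks like a gzip stream, which is what you would expect from a file that was compressed more than once.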
How to check if a file is gzipped or not in Firefox/Firebug
You can tell by looking at HTTP Response Headers - look for 'Content-Encoding: gzip'
You can probably tell by drilling into the Net tab in Firebug, but I always used to use the Web Developer Toolbar (a Firefox extension) for checking response headers. There is also a lesser-featured extension called Live HTTP Headers. https://addons.mozilla.org/en-US/firefox/addon/3829/
Alternatively, you can google for a website such as this, to check for you:
http://www.gidnetwork.com/tools/gzip-test.php
hth
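If you would rather script the header check than click through a browser extension, something along these lines should work on any mapping of response headers. The function name is_gzip_encoded is made up for this example.

```python
def is_gzip_encoded(headers):
    """Return True if a mapping of HTTP headers declares gzip encoding."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-encoding":
            # The value may list several codings, e.g. "gzip, br".
            codings = [c.strip().lower() for c in value.split(",")]
            return "gzip" in codings
    return False
```

For example, is_gzip_encoded({"Content-Encoding": "gzip"}) returns True. Keep in mind that many servers only compress when the request carries an Accept-Encoding: gzip header, so a client that does not send it may see an uncompressed response.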
Handling gzip.open or open in a with statement
The hook_compressed() function of the standard module fileinput does exactly what you are asking for:
Transparently opens files compressed with gzip and bzip2 (recognized by the extensions .gz and .bz2) using the gzip and bz2 modules. If the filename extension is not .gz or .bz2, the file is opened normally (i.e., using open() without any decompression).
import fileinput

# hook_compressed() requires the mode as its second argument.
with fileinput.hook_compressed("test.txt.gz", "rb") as f:
    f.read()
with fileinput.hook_compressed("test.txt", "rb") as f:
    f.read()
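Since hook_compressed() decides by file extension only, a variant that looks at the magic number instead might be useful when the suffix cannot be trusted. This is just a sketch and open_maybe_gzip is a made-up name.

```python
import gzip


def open_maybe_gzip(path, mode="rb"):
    """Open `path` with gzip.open if it starts with the gzip magic number."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    return opener(path, mode)
```

The returned object works in a with statement either way, e.g. with open_maybe_gzip("test.txt.gz") as f: f.read(). Passing mode="rt" gives text instead of bytes in both cases.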