How to Check If a File Is Gzip Compressed

How to tell if a file is gzip compressed?

The magic number for gzip compressed files is 1f 8b. Testing for it is not 100% reliable, but it is highly unlikely that an "ordinary text file" starts with those two bytes; the sequence is not even valid UTF-8.

Usually gzip compressed files sport the suffix .gz though. Even gzip(1) itself won't unpack files without it unless you --force it to. You could conceivably rely on the suffix, but you'd still have to deal with a possible IOError (which you have to handle in any case).

One problem with your approach is that gzip.GzipFile() will not throw an exception if you feed it an uncompressed file; only a later read() will. This means you would probably have to implement some of your program logic twice. Ugly.
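To make the deferred failure concrete, here is a minimal sketch. The helper name read_gzip_or_none and the temporary file names are made up for illustration; the sketch assumes Python 3, where the error raised on a non-gzip file is an OSError (gzip.BadGzipFile from 3.8 on).

```python
import gzip
import os
import tempfile


def read_gzip_or_none(path):
    """Return the decompressed bytes, or None if path does not hold gzip data."""
    f = gzip.open(path, "rb")  # no exception here, even for a plain file
    try:
        return f.read()        # the gzip header is only parsed on the first read
    except OSError:            # gzip.BadGzipFile (3.8+) is a subclass of OSError
        return None
    finally:
        f.close()


# Demo with two throwaway files (hypothetical names).
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "plain.txt")
gz_path = os.path.join(tmpdir, "data.gz")
with open(plain_path, "wb") as f:
    f.write(b"hello, world\n")
with gzip.open(gz_path, "wb") as f:
    f.write(b"hello, world\n")

print(read_gzip_or_none(plain_path))  # None
print(read_gzip_or_none(gz_path))     # b'hello, world\n'
```

Note that a truncated gzip file raises EOFError instead, which this sketch deliberately does not catch.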

How to check if a file is gzip compressed?

There is a magic number at the beginning of the file. Just read the first two bytes and check whether they are 0x1f and 0x8b, in that order.
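A minimal Python sketch of that check follows. The function name is_gzip and the demo file names are illustrative only; the bytes are compared in file order, so no byte-swapping is involved.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def is_gzip(path):
    """Return True if the file at path starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


# Demo with throwaway files (hypothetical names).
demo_dir = tempfile.mkdtemp()
text_file = os.path.join(demo_dir, "notes.txt")
gzip_file = os.path.join(demo_dir, "notes.bin")  # gzip data, no .gz suffix
with open(text_file, "wb") as f:
    f.write(b"just text\n")
with open(gzip_file, "wb") as f:
    f.write(gzip.compress(b"just text\n"))

print(is_gzip(text_file))  # False
print(is_gzip(gzip_file))  # True
```

An empty or one-byte file simply yields fewer than two bytes and is reported as not gzip.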

How to check whether file is gzip or not in Java

Use this package that I found on google:

package example;
 
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.zip.GZIPInputStream;
 
public class GZipUtil {
 
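 /**
  * Checks if an input stream is gzipped by peeking at its first two bytes.
  * If the stream does not support mark/reset it is wrapped locally, so the
  * caller's own stream will already have been advanced on return.
  *
  * @param in the stream to inspect
  * @return true if the stream starts with the gzip magic number
  */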
 public static boolean isGZipped(InputStream in) {
  if (!in.markSupported()) {
   in = new BufferedInputStream(in);
  }
  in.mark(2);
  int magic = 0;
  try {
   magic = in.read() & 0xff | ((in.read() << 8) & 0xff00);
   in.reset();
  } catch (IOException e) {
   e.printStackTrace(System.err);
   return false;
  }
  return magic == GZIPInputStream.GZIP_MAGIC;
 }
 
 /**
  * Checks if a file is gzipped.
  *
  * @param f
  * @return
  */
 public static boolean isGZipped(File f) {
  int magic = 0;
  // try-with-resources closes the file even if a read fails
  try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
   magic = raf.read() & 0xff | ((raf.read() << 8) & 0xff00);
  } catch (IOException e) {
   e.printStackTrace(System.err);
  }
  return magic == GZIPInputStream.GZIP_MAGIC;
 }
 
 public static void main(String[] args) throws IOException {
  File gzf = new File("/tmp/1.gz");

  // Check if a file is gzipped.
  System.out.println(isGZipped(gzf));

  // Check if an input stream is gzipped; close the stream afterwards.
  try (InputStream in = new FileInputStream(gzf)) {
   System.out.println(isGZipped(in));
  }
 }
}

Is it possible to check whether a file (.gz) has been compressed more than once?

You can check for a valid gzip header within the file. A gzip file should contain a defined header starting with a 2-byte number with values 0x1f and 0x8b (see the gzip specification, RFC 1952). You can check these bytes to see if they match the header values:

try (InputStream is = new FileInputStream(new File(filePath))) {
    byte[] b = new byte[2];
    int n = is.read(b);
    if (n != 2) {
        // fewer than two bytes read: not a gzip file
    } else if (b[0] == (byte) 0x1f && b[1] == (byte) 0x8b) {
        // 2-byte gzip magic number found
    }
}

These two bytes alone have a roughly 1 in 65,536 chance of occurring at random, which, depending on the data you expect to receive, can be enough to base your decision on. To be more confident of the call, you can read further into the header and check that it follows the spec (e.g. the third byte is typically, but not always, 8 for DEFLATE compression, and so on).
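A stricter check along those lines might look like the following Python sketch. It is an assumption-laden illustration rather than a full validator: the name looks_like_gzip_header is made up, and the rules applied (compression method 8, reserved flag bits 5 to 7 clear, a fixed header of at least 10 bytes) are taken from RFC 1952.

```python
import gzip


def looks_like_gzip_header(data):
    """Heuristically check the fixed 10-byte gzip header at the start of data."""
    if len(data) < 10:              # ID1 ID2 CM FLG MTIME(4) XFL OS
        return False
    id1, id2, cm, flg = data[0], data[1], data[2], data[3]
    if (id1, id2) != (0x1F, 0x8B):  # magic number
        return False
    if cm != 8:                     # 8 = DEFLATE, the only method RFC 1952 defines
        return False
    return (flg & 0xE0) == 0        # reserved flag bits must be zero


real = gzip.compress(b"payload")
fake = b"\x1f\x8b" + bytes(8)       # right magic, compression method 0

print(looks_like_gzip_header(real))  # True
print(looks_like_gzip_header(fake))  # False
```

Running the same check on the decompressed payload is one way to spot a file that was gzipped more than once.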

how to check if a file is gzipped or not in firefox/firebug

You can tell by looking at HTTP Response Headers - look for 'Content-Encoding: gzip'

You can probably tell by drilling into the Net tab in Firebug, but I always used to use the Web Developer Toolbar (a Firefox extension) for checking response headers. There is also a lesser-featured extension called Live HTTP Headers. https://addons.mozilla.org/en-US/firefox/addon/3829/

Alternatively, you can google for a website such as this, to check for you:

http://www.gidnetwork.com/tools/gzip-test.php
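The same header check can also be scripted outside the browser. The Python sketch below is only an illustration: is_gzip_encoded and fetch_is_gzip are hypothetical helper names, and fetch_is_gzip assumes the server honours an Accept-Encoding: gzip request header.

```python
import urllib.request


def is_gzip_encoded(headers):
    """Return True if a header mapping declares Content-Encoding: gzip."""
    for name, value in headers.items():
        if name.lower() == "content-encoding":  # header names are case-insensitive
            codings = [c.strip().lower() for c in value.split(",")]
            return "gzip" in codings
    return False


def fetch_is_gzip(url):
    """Request url while advertising gzip support and inspect the response headers."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as resp:
        return is_gzip_encoded(resp.headers)


print(is_gzip_encoded({"Content-Encoding": "gzip"}))   # True
print(is_gzip_encoded({"Content-Type": "text/html"}))  # False
```

Calling fetch_is_gzip("https://example.com/") (a placeholder URL) would perform the live check.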

hth

handling gzip.open or open in with statement

The hook_compressed() function of the standard fileinput module does exactly what you are asking for:

Transparently opens files compressed with gzip and bzip2 (recognized by the extensions .gz and .bz2) using the gzip and bz2 modules. If the filename extension is not .gz or .bz2, the file is opened normally (ie, using open() without any decompression).

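import fileinput

# hook_compressed() requires an explicit mode; "rb" yields bytes in both cases
with fileinput.hook_compressed("test.txt.gz", "rb") as f:
    f.read()
with fileinput.hook_compressed("test.txt", "rb") as f:
    f.read()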
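Since hook_compressed() decides by file extension only, a variant that decides by the magic number can be sketched as follows. This is not part of fileinput; open_maybe_gzip is a hypothetical helper, and the demo file name is made up.

```python
import gzip
import os
import tempfile


def open_maybe_gzip(path, mode="rb"):
    """Open path with gzip.open() if it starts with the gzip magic, else open()."""
    with open(path, "rb") as probe:
        is_gz = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gz else open
    return opener(path, mode)


# Demo: gzip data stored under a name without the .gz suffix.
work_dir = tempfile.mkdtemp()
disguised = os.path.join(work_dir, "test.txt")
with open(disguised, "wb") as f:
    f.write(gzip.compress(b"payload\n"))

with open_maybe_gzip(disguised) as f:
    content = f.read()
print(content)  # b'payload\n'
```

Both gzip.open() and open() return context managers, so the with statement works the same way in either case.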

