Which Compression Method to Use in PHP

Which compression method to use in PHP?

gzencode(), gzcompress(), and gzdeflate() can all be used. There are subtle differences between the three:

  • gzencode() uses the GZIP file format, the same as the gzip command line tool. This file format has a header containing optional metadata, DEFLATE compressed data, and a footer containing a CRC32 checksum and a length check.
  • gzcompress() uses the ZLIB format. It has a shorter header serving only to identify the compression format, DEFLATE compressed data, and a footer containing an ADLER32 checksum.
  • gzdeflate() uses the raw DEFLATE algorithm on its own, which is the basis for both of the other formats.

All three use the same algorithm under the hood, so they won't differ in speed or efficiency. gzencode() adds the ability to include the original file name and other environmental data (this is unused when you are just compressing a string). gzencode() and gzcompress() both add a checksum, so the integrity of the archive can be verified, which can be useful over unreliable transmission and storage methods. If everything is stored locally and you don't need any additional metadata then gzdeflate() would suffice. For portability I'd recommend gzencode() (GZIP format) which is probably better supported than gzcompress() (ZLIB format) among other tools.
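As a rough illustration of the three formats, here is a minimal sketch that compresses one string with each function and decodes it with the matching counterpart. The sample text in $text is made up for the example; the sizes you see will depend on your own data.

```php
<?php
// Illustrative sample input (an assumption, not from the original answer).
$text = str_repeat('Hello, compression! ', 50);

$gzip    = gzencode($text);   // GZIP format, same container as the gzip tool
$zlib    = gzcompress($text); // ZLIB format
$deflate = gzdeflate($text);  // raw DEFLATE stream, no header or checksum

// Each format has its own decoder; they are not interchangeable.
$fromGzip    = gzdecode($gzip);
$fromZlib    = gzuncompress($zlib);
$fromDeflate = gzinflate($deflate);

printf("original:   %d bytes\n", strlen($text));
printf("gzencode:   %d bytes\n", strlen($gzip));
printf("gzcompress: %d bytes\n", strlen($zlib));
printf("gzdeflate:  %d bytes\n", strlen($deflate));

// All three decode back to the same input.
var_dump($fromGzip === $text, $fromZlib === $text, $fromDeflate === $text);
```

Keeping track of which function produced a blob matters, since passing gzencode() output to gzinflate() or gzuncompress() will fail rather than silently work.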

When compressing very short strings, the fixed overhead of each method becomes relevant because it can make up a significant part of the output. The overhead for each method, measured by compressing an empty string, is:

  • gzencode('') = 20 bytes
  • gzcompress('') = 8 bytes
  • gzdeflate('') = 2 bytes
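The numbers above can be reproduced with a short sketch like the following, which just measures the length of each function's output for an empty string. The $overhead array name is only for illustration.

```php
<?php
// Compress an empty string with each function and record the output size.
$overhead = [
    'gzencode'   => strlen(gzencode('')),
    'gzcompress' => strlen(gzcompress('')),
    'gzdeflate'  => strlen(gzdeflate('')),
];

foreach ($overhead as $function => $bytes) {
    printf("%-10s %2d bytes\n", $function, $bytes);
}
```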

Which PHP compression method is better?

Do not compress or otherwise encode text before putting it in the database unless you are 110% positive that you will never need to perform anything other than simple storage and retrieval on it. If you think you might ever need to issue a SELECT based on something contained in that text you're going to be completely hosed.
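To see why searching becomes a problem, here is a small sketch. $stored is a hypothetical stand-in for a compressed column value; the point is only that a substring search has to run on the decompressed text, which the database cannot do for you in a plain SELECT ... LIKE.

```php
<?php
// Hypothetical text that would have been written to a database column.
$text   = str_repeat('The needle is somewhere in this haystack. ', 10);
$stored = gzcompress($text);

// Searching the compressed bytes directly: the plain text is not
// expected to be visible in there.
var_dump(strpos($stored, 'needle'));

// The value has to be decompressed first, row by row, before searching.
$position = strpos(gzuncompress($stored), 'needle');
var_dump($position); // int(4)
```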

That said, if you plan on storing large amounts of data in a DB table, many DBMSes have transparent compression built in, e.g. MySQL's InnoDB compression.

Lastly, the difference between gzcompress() and gzdeflate() is negligible IMO. Just remember to never set the compression level to 9 unless you want your CPU to burst into flames for no raisin.
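If you want to judge the cost of the compression level yourself, a rough sketch like this one compares a few levels on the same input. The sample data and the chosen levels are assumptions for illustration; highly repetitive text like this is not a realistic benchmark, so substitute your own data before drawing conclusions.

```php
<?php
// Illustrative input; replace with data that resembles your real workload.
$data = str_repeat('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ', 2000);

$sizes = [];
foreach ([1, 6, 9] as $level) {
    $start      = microtime(true);
    $compressed = gzcompress($data, $level); // second argument is the level (0-9)
    $elapsedMs  = (microtime(true) - $start) * 1000;

    $sizes[$level] = strlen($compressed);
    printf("level %d: %d bytes in %.3f ms\n", $level, $sizes[$level], $elapsedMs);
}
```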

Best way to compress string in PHP

How well your string compresses depends on the data itself. If it consists mainly of random data, you won't see much of a reduction in size. There are many algorithms out there that have been designed for specific kinds of data.

You should try to determine what the data you want to compress mainly consists of and then select a suitable compression method.
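A quick way to check how compressible your data is: run it through gzdeflate() and compare sizes. The sketch below does that for two made-up samples, one repetitive and one random, to show the two extremes.

```php
<?php
// Two illustrative inputs of the same length (both are assumptions).
$samples = [
    'repetitive text' => str_repeat('abcabcabc ', 1000),
    'random bytes'    => random_bytes(10000),
];

$ratios = [];
foreach ($samples as $label => $input) {
    $output         = gzdeflate($input);
    $ratios[$label] = strlen($output) / strlen($input);

    printf(
        "%-16s %d -> %d bytes (%.1f%% of original)\n",
        $label,
        strlen($input),
        strlen($output),
        100 * $ratios[$label]
    );
}
```

The random sample should come out at roughly its original size or slightly larger, while the repetitive one shrinks to a small fraction.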

For now I can only refer you to bzcompress(); bzip2 usually achieves higher compression ratios than gzip.
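Rather than taking that on faith, you can compare the two on your own data. This is a minimal sketch with a made-up, log-like sample; it assumes the bz2 extension may not be installed, so the bzip2 part is guarded with function_exists().

```php
<?php
// Illustrative, log-like sample data (an assumption).
$data = str_repeat("timestamp=2024-01-01 level=INFO message=ok\n", 500);

$gz = gzcompress($data, 9);
printf("gzcompress: %d bytes\n", strlen($gz));

$bz = null;
if (function_exists('bzcompress')) {
    // bzcompress() returns an integer error code on failure.
    $result = bzcompress($data, 9);
    if (is_string($result)) {
        $bz = $result;
        printf("bzcompress: %d bytes\n", strlen($bz));
    }
} else {
    echo "bz2 extension not available, skipping bzcompress\n";
}
```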

Compression Algorithm in PHP

To work with binary data in PHP, your data type of choice is a string, because PHP strings are plain byte arrays:

$bytes = '';

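There's no simple base 2 notation (0100101) for the bytes of a string in PHP. Integer literals can be written as 0b01000010 since PHP 5.4, but there is no equivalent string escape, so the next best option is usually hex notation: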

$bytes = "\x42";  // 0100 0010

You'll need to convert between base 2 and base 16 notation back and forth in your head, but once you get used to that it's typically easier to follow and work with than long strings of 1s and 0s.

To test or manipulate binary data you'll want to get used to the binary operators:

if (($bytes[3] & "\x02") === "\x02") {
    // the second bit of the fourth byte in the sequence is set (0000 0010)
}

// setting the second bit of the seventh byte
// (compound operators such as |= cannot be applied to string offsets)
$bytes[6] = $bytes[6] | "\x02";
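An alternative that some find easier to read is to convert the byte to an integer with ord(), use integer bit operators, and write it back with chr(). This is only a sketch; the contents of $bytes are made up for the example.

```php
<?php
// Seven illustrative bytes; the fourth is 0x03 and the seventh is 0x04.
$bytes = "\x10\x20\x30\x03\x40\x50\x04";

// Test the second bit (0000 0010) of the fourth byte.
$isSet = (ord($bytes[3]) & 0x02) === 0x02;
var_dump($isSet); // bool(true)

// Set the second bit of the seventh byte: 0x04 | 0x02 = 0x06.
$bytes[6] = chr(ord($bytes[6]) | 0x02);

// Inspect the result in base 2 and base 16.
$seventhBits = sprintf('%08b', ord($bytes[6]));
echo $seventhBits, "\n";     // 00000110
echo bin2hex($bytes), "\n";  // 10203003405006
```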

Windev Compress(): which compression format is used

PCSOFT use their own WDZ format for almost everything. It's not a standard format, so you won't be able to open it with the usual tools. I heard they build on some of the usual libs like zlib but put their own twist on them, so you can probably only work with it through their tool WDZIP.EXE.

You should look at other functions, such as zipCreate(), that create archives in standard formats:
http://doc.windev.com/en-US/?3082007

Hope this helps :)

PHP - Compress a txt file while maintaining the file extension

Normally gzip does not use the name in the gzip header when decompressing, and you didn't store a name in the header anyway. It simply removes the .gz from the file name. You would need to name the file test.txt.gz to get it to decompress to test.txt.
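In PHP terms, that means appending .gz to the full original file name when you write the compressed file. A minimal sketch, assuming a throwaway test.txt in the system temp directory:

```php
<?php
// Create a small sample file (path and contents are illustrative).
$source = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'test.txt';
file_put_contents($source, "some text\n");

// Keep the original extension and append .gz: test.txt -> test.txt.gz
$target = $source . '.gz';
file_put_contents($target, gzencode(file_get_contents($source)));

// gzip strips the .gz suffix when decompressing, so this is the name
// the decompressed file would get.
$restoredName = basename($target, '.gz');
echo $restoredName, "\n"; // test.txt
```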


