How to Compress a String in Java

How to compress a String in Java?

Compression algorithms almost always carry some space overhead, so they are only effective when the input is large enough that the space saved exceeds that overhead.

Compressing a string that is only 20 characters long is difficult, and it is not always possible. If the string contains repetition, Huffman coding or simple run-length encoding might be able to shrink it, but probably not by very much.
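
To make the overhead point concrete, here is a minimal sketch that compares the GZIP output size for a short string and for a long repetitive one. The class name CompressionOverheadDemo and the helpers gzipSize and repeat are hypothetical names chosen for illustration, and GZIP is just one possible algorithm to measure.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: measures how many bytes GZIP produces for a string.
class CompressionOverheadDemo {

    // Returns the size in bytes of the GZIP-compressed UTF-8 encoding of text.
    static int gzipSize(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return baos.size();
    }

    // Builds a string made of `times` copies of `unit`.
    static String repeat(String unit, int times) {
        StringBuilder sb = new StringBuilder(unit.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String shortText = "abcdefghijklmnopqrst";      // 20 characters, no repetition
        String longText = repeat("abcdefghij", 1000);    // 10,000 characters, very repetitive

        System.out.println("short: " + shortText.length() + " -> " + gzipSize(shortText));
        System.out.println("long:  " + longText.length() + " -> " + gzipSize(longText));
    }
}
```

The short string should come out larger than it went in, because the GZIP header and trailer alone take 18 bytes, while the long repetitive string should shrink considerably.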

Compression and decompression of String data in Java

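The problem is caused by this line:

String outStr = obj.toString("UTF-8");

Compressed output is arbitrary binary data, not valid UTF-8 text, so decoding it into a String replaces the invalid byte sequences and the original bytes cannot be recovered afterwards.

Instead, return the byte[] that you get from your ByteArrayOutputStream and pass it unchanged to a ByteArrayInputStream when you construct your GZIPInputStream. The following changes need to be made to your code.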

byte[] compressed = compress(string); // In the main method

public static byte[] compress(String str) throws Exception {
    ...
    ...
    return obj.toByteArray();
}

public static String decompress(byte[] bytes) throws Exception {
    ...
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
    ...
}
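
For reference, a complete round trip along these lines might look like the sketch below. The elided parts of the original code are not known, so the method bodies here are an assumption about one reasonable way to fill them in, and the class name GzipStrings is hypothetical. The main method also shows why passing the compressed bytes through a UTF-8 String loses data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class: GZIP round trip that keeps compressed data as byte[].
class GzipStrings {

    static byte[] compress(String str) {
        ByteArrayOutputStream obj = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer before we read the bytes.
        try (GZIPOutputStream gzip = new GZIPOutputStream(obj)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return obj.toByteArray();
    }

    static String decompress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[4096];
            int len;
            while ((len = gis.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only the decompressed bytes are text, so only they are decoded as UTF-8.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "hello hello hello hello";
        byte[] compressed = compress(original);
        System.out.println(decompress(compressed).equals(original)); // true

        // Decoding the compressed bytes as UTF-8 and encoding them again changes them.
        byte[] mangled = new String(compressed, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(compressed, mangled)); // false
    }
}
```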

How can I easily compress and decompress Strings to/from byte arrays?

You can try

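import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// An enum with no constants acts as a non-instantiable utility class.
enum StringCompressor {
    ;

    public static byte[] compress(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Closing the stream flushes the remaining compressed data into baos.
        try (OutputStream out = new DeflaterOutputStream(baos)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new AssertionError(e);
        }
        return baos.toByteArray();
    }

    public static String decompress(byte[] bytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[8192];
            int len;
            // Copy decompressed data until the end of the stream is reached.
            while ((len = in.read(buffer)) > 0) {
                baos.write(buffer, 0, len);
            }
            return new String(baos.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }
}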

Compress the string in Java

The problem comes from str.charAt(i) + (char)count. Both operands are chars, so Java promotes them to int and adds their numeric values instead of concatenating them.

Solve that by using consecutive append() calls:

str_new.append(str.charAt(i)).append(count);
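
The difference can be seen in isolation with a small sketch. The class CharAdditionDemo and its two methods are hypothetical names used only for illustration.

```java
// Hypothetical demo class: contrasts char addition with chained append() calls.
class CharAdditionDemo {

    // Buggy variant: char + char is promoted to int, so the sum is appended as a number.
    static String withPlus(char c, int count) {
        StringBuilder sb = new StringBuilder();
        sb.append(c + (char) count);
        return sb.toString();
    }

    // Fixed variant: each value is appended separately, so nothing is added together.
    static String withChainedAppend(char c, int count) {
        return new StringBuilder().append(c).append(count).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus('a', 3));          // prints 100, because 'a' is 97
        System.out.println(withChainedAppend('a', 3)); // prints a3
    }
}
```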

You can shorten the code by using an outer for loop and a ternary operator in the append, and by incrementing only i in the inner while loop after saving its starting value in count:

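int count;
for (int i = 0; i < str.length(); i++) {
    count = i; // index where the current run starts
    // Advance i to the last character of the run.
    while (i < str.length() - 1 && str.charAt(i) == str.charAt(i + 1)) {
        i++;
    }
    // Append the character, followed by the run length only if it is greater than 1.
    str_new.append(str.charAt(i)).append((i - count) == 0 ? "" : (i - count + 1));
}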
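
Wrapped in a method, the loop above can be run as is. The sketch below assumes str_new is a StringBuilder and that single characters are written without a count; the class name RunLengthEncoder and the variable runStart are hypothetical.

```java
// Hypothetical wrapper class: simple run-length encoding of a string.
class RunLengthEncoder {

    static String compress(String str) {
        StringBuilder strNew = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            int runStart = i; // index where the current run starts
            // Advance i to the last character of the run.
            while (i < str.length() - 1 && str.charAt(i) == str.charAt(i + 1)) {
                i++;
            }
            int runLength = i - runStart + 1;
            strNew.append(str.charAt(i));
            // Runs of length 1 are written without a count.
            if (runLength > 1) {
                strNew.append(runLength);
            }
        }
        return strNew.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("aaabccdd")); // prints a3bc2d2
    }
}
```

Note that with this scheme a string without repeated characters is returned unchanged, so it never gets shorter.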

