How to Extract the MIME Type from a byte[]

How to extract the MIME type from a byte[]

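Try the Java Mime Magic Library (jMimeMagic):

// Magic and MagicMatch come from the net.sf.jmimemagic package.
// getMagicMatch throws checked exceptions, e.g. MagicMatchNotFoundException when nothing matches.
byte[] data = ...
MagicMatch match = Magic.getMagicMatch(data);
String mimeType = match.getMimeType();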

How to fetch the MIME type from byte array in Java 6?

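Java 6 provides the MimetypesFileTypeMap class, whose purpose is to map files to MIME types, but it works on File objects and file names rather than on raw bytes.

For a byte array, wrap the bytes in a stream and let URLConnection.guessContentTypeFromStream inspect the leading bytes. The stream must support mark/reset, and the method returns null when it cannot determine the type:

byte[] content = ...;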
InputStream is = new BufferedInputStream(new ByteArrayInputStream(content));
String mimeType = URLConnection.guessContentTypeFromStream(is);
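
As a self-contained illustration of the snippet above, here is a minimal sketch that feeds a GIF header to the JDK's guesser. The class name StreamSniffer and the method guess are hypothetical names chosen for this example; only URLConnection.guessContentTypeFromStream is the real JDK call. The JDK recognises only a small set of signatures, so a null result is common for other formats.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

// Hypothetical helper class for illustration only.
final class StreamSniffer {

    /** Returns the guessed MIME type, or null when the JDK does not recognise the leading bytes. */
    static String guess(byte[] content) {
        // ByteArrayInputStream already supports mark/reset, which the guesser requires.
        try (InputStream is = new ByteArrayInputStream(content)) {
            return URLConnection.guessContentTypeFromStream(is);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] gifHeader = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(guess(gifHeader)); // prints "image/gif"
    }
}
```

Because the result may be null, callers would typically fall back to a default such as application/octet-stream.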

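For a File you can use the code below. Note that MimetypesFileTypeMap (javax.activation) looks the type up from the file name extension, not from the file's contents: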

MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();
String mime = mimeTypesMap.getContentType(file);

How to extract file extension from byte array

If this is for storing a file that is uploaded:

  • create a column for the filename extension
  • create a column for the mime type as sent by the browser

If you don't have the original file, and you only have bytes, you have a couple of good solutions.

If you're able to use a library, look at using mime-util to inspect the bytes:

http://technopaper.blogspot.com/2009/03/identifying-mime-using-mime-util.html

If you have to build your own byte detector, here are the starting bytes (magic numbers) of many popular formats, written as Ruby hash entries:

"BC" => bitcode,
"BM" => bitmap,
"BZ" => bzip,
"MZ" => exe,
"SIMPLE"=> fits,
"GIF8" => gif,
"GKSM" => gks,
[0x01,0xDA].pack('c*') => iris_rgb,
[0xF1,0x00,0x40,0xBB].pack('c*') => itc,
[0xFF,0xD8].pack('c*') => jpeg,
"IIN1" => niff,
"MThd" => midi,
"%PDF" => pdf,
"VIEW" => pm,
[0x89].pack('c*') + "PNG" => png,
"%!" => postscript,
"Y" + [0xA6].pack('c*') + "j" + [0x95].pack('c*') => sun_rasterfile,
"MM*" + [0x00].pack('c*') => tiff,
"II*" + [0x00].pack('c*') => tiff,
"gimp xcf" => gimp_xcf,
"#FIG" => xfig,
"/* XPM */" => xpm,
[0x23,0x21].pack('c*') => shebang,
[0x1F,0x9D].pack('c*') => compress,
[0x1F,0x8B].pack('c*') => gzip,
"PK" + [0x03,0x04].pack('c*') => pkzip,
"MZ" => dos_os2_windows_executable,
[0x7F].pack('c*') + "ELF" => unix_elf,
[0x99,0x00].pack('c*') => pgp_public_ring,
[0x95,0x01].pack('c*') => pgp_security_ring,
[0x95,0x00].pack('c*') => pgp_security_ring,
[0xA6,0x00].pack('c*') => pgp_encrypted_data,
[0xD0,0xCF,0x11,0xE0].pack('c*') => docfile
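
To show how such a table could drive a home-grown detector, here is a minimal Java sketch that checks a byte array against a handful of the signatures listed above. The class MagicBytes, the nested Signature type, and the detect method are hypothetical names for this example, and only six entries are included; it is a starting point under those assumptions, not a complete detector.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical prefix-matching detector for illustration only.
final class MagicBytes {

    /** One signature: the leading bytes and the label used in the list above. */
    private static final class Signature {
        final byte[] prefix;
        final String label;

        Signature(byte[] prefix, String label) {
            this.prefix = prefix;
            this.label = label;
        }
    }

    private static final List<Signature> SIGNATURES = new ArrayList<>();

    static {
        // A small subset of the table; extend it with further entries as needed.
        add(new byte[] {(byte) 0x89, 'P', 'N', 'G'}, "png");
        add(ascii("GIF8"), "gif");
        add(ascii("%PDF"), "pdf");
        add(new byte[] {'P', 'K', 0x03, 0x04}, "pkzip");
        add(new byte[] {(byte) 0xFF, (byte) 0xD8}, "jpeg");
        add(new byte[] {0x1F, (byte) 0x8B}, "gzip");
    }

    private static void add(byte[] prefix, String label) {
        SIGNATURES.add(new Signature(prefix, label));
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Returns the label of the first signature that prefixes data, or null if none matches. */
    static String detect(byte[] data) {
        for (Signature s : SIGNATURES) {
            if (startsWith(data, s.prefix)) {
                return s.label;
            }
        }
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

For example, MagicBytes.detect("%PDF-1.4".getBytes(StandardCharsets.US_ASCII)) returns "pdf". Short signatures such as two-byte prefixes can match unrelated data, so a match is a hint rather than proof of the format.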

In C#, how can I know the file type from a byte[]?

I'm not sure, but you could look into magic numbers (file signatures).

Update:
Having read up on it, I don't think this approach is very reliable.

Using .NET, how can you find the MIME type of a file based on the file signature, not the extension?

In Urlmon.dll, there's a function called FindMimeFromData.

From the documentation:

MIME type detection, or "data sniffing," refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file extension, and/or the data itself. Usually, only the first 256 bytes of data are significant.

So, read the first (up to) 256 bytes from the file and pass it to FindMimeFromData.
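
The "read the first (up to) 256 bytes" step is language-independent. Here is a minimal sketch of it in Java, to match the other examples on this page. HeadReader and readHead are hypothetical names, and the call to FindMimeFromData itself (a native Windows function) is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class HeadReader {

    /** Reads at most limit bytes from the start of the stream; returns fewer if the stream is shorter. */
    static byte[] readHead(InputStream in, int limit) throws IOException {
        byte[] buf = new byte[limit];
        int total = 0;
        // A single read() may return fewer bytes than requested, so loop until full or end of stream.
        while (total < limit) {
            int n = in.read(buf, total, limit - total);
            if (n < 0) {
                break; // end of stream reached before the limit
            }
            total += n;
        }
        return Arrays.copyOf(buf, total);
    }
}
```

With a file, this could be called as readHead(new FileInputStream(path), 256) inside a try-with-resources block, and the resulting array handed to whichever detector is in use.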

Convert from byte array and MIME type to string / object

Jersey's MediaType has a static valueOf method that parses a MIME type string.

Jersey also has support for creating an object from a value stream. Unfortunately, it looks like that support cannot be used separately.


