How to Identify Whether the Contents of a Byte[] Are a JPEG

How to identify whether the contents of a byte[] are a JPEG?

From Wikipedia:

JPEG image files begin with FF D8 and end with FF D9.

http://en.wikipedia.org/wiki/Magic_number_(programming)
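
As a minimal sketch of that check in Java (the class and method names here are made up for illustration), the first two and last two bytes of the array can be compared against the markers. Note that this is a strict test: some real-world files may carry extra bytes after the FF D9 marker, so you might prefer to check only the start.

```java
final class JpegBounds {
    private JpegBounds() {}

    /** Returns true if data begins with SOI (FF D8) and ends with EOI (FF D9). */
    static boolean hasJpegMarkers(byte[] data) {
        if (data == null || data.length < 4) {
            return false; // too short to hold both two-byte markers
        }
        int n = data.length;
        // Mask with 0xFF because Java bytes are signed.
        boolean startsWithSoi = (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8;
        boolean endsWithEoi = (data[n - 2] & 0xFF) == 0xFF && (data[n - 1] & 0xFF) == 0xD9;
        return startsWithSoi && endsWithEoi;
    }
}
```

This only inspects four bytes; it says nothing about whether the data in between is a decodable image.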

Validating JPEG from byte array - specifically the APP segment

I thought a JPEG is a JPEG is a JPEG.

Actually, most files referred to as "a JPEG file" are either JFIF or Exif. :-)

Exif uses the structure of JFIF, so you can parse them just the same. But because JFIF specifies that the first APP segment must be APP0/JFIF, and Exif says that for Exif the first APP segment must be APP1/Exif, they are not really compatible. Some JFIFs contain an Exif APP segment in a later position and use it for metadata. Some "JPEG"s contain neither an Exif nor a JFIF APP segment, but still contain valid JPEG code streams. Most software glosses over this fact though.

Is there a good reason for filtering based on particular values of the APP segment?

Depends. For example, if you want to filter out Exif only, or ISO JPEG only, then yes. If you want to read as many "JPEG"s as possible, then you obviously don't want this.

Some software (e.g. the default Java JPEGImageReaderSpi used by ImageIO, as you mention Java) uses just the SOI marker (0xFF, 0xD8) to identify JPEG. Making sure the next byte is 0xFF is of course an extra precaution, to filter out false positives.
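
To see which format ImageIO itself would claim for a byte array, one option is to ask it for a matching reader. The following is a sketch under the assumption that the standard ImageIO plugins are available in your runtime; the class and method names are hypothetical, and the exact format name string returned depends on the installed reader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

final class ImageIoSniffer {
    private ImageIoSniffer() {}

    /** Returns the format name of the first ImageIO reader that claims the bytes, or null. */
    static String sniffFormat(byte[] data) {
        try (ImageInputStream in =
                 new MemoryCacheImageInputStream(new ByteArrayInputStream(data))) {
            // Each registered reader provider peeks at the stream header.
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                return null; // no installed reader recognizes the header
            }
            ImageReader reader = readers.next();
            try {
                return reader.getFormatName();
            } finally {
                reader.dispose();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

No pixel data is decoded here; the readers only look at the first few bytes, which is why this is cheap but also why it can report "JPEG" for data that later fails to decode.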

How exactly does the APP segment affect the JPEG image?

Some APP segments affect how the compressed JPEG data is to be interpreted. Most JPEG reading software needs to be aware of at least APP0/JFIF, APP1/Exif, APP2/ICC_PROFILE and APP14/Adobe to properly interpret and convert color from the compressed data. Ignoring these will most likely produce images with strange-looking or inaccurate colors.

Other segments, like the APP0/JFXX (thumbnail extension), APP13/Photoshop 3.0 and APP1/XMP tags are used mainly for metadata, and can probably be ignored.

Also note that each APPn segment carries a null-terminated ASCII identifier string right after the APPn marker and its two-byte length field, to fully identify the APP segment type. It's not enough to just look at the marker.
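
A rough sketch of reading that identifier in Java follows. It only looks at the first segment after SOI and assumes the usual layout of marker, two-byte big-endian length, then the identifier string; the class and method names are invented for this example.

```java
import java.nio.charset.StandardCharsets;

final class JpegAppSegment {
    private JpegAppSegment() {}

    /**
     * Returns "APPn/identifier" for the first segment after SOI, or null
     * if the data does not start with SOI followed by an APPn marker.
     */
    static String firstAppSegment(byte[] data) {
        // Need SOI (2 bytes) + marker (2 bytes) + length (2 bytes) at minimum.
        if (data == null || data.length < 6) return null;
        if ((data[0] & 0xFF) != 0xFF || (data[1] & 0xFF) != 0xD8) return null;
        if ((data[2] & 0xFF) != 0xFF) return null;

        int marker = data[3] & 0xFF;
        if (marker < 0xE0 || marker > 0xEF) return null; // not APP0..APP15

        // Big-endian segment length; it counts the two length bytes themselves.
        int length = ((data[4] & 0xFF) << 8) | (data[5] & 0xFF);
        int start = 6;
        int end = Math.min(data.length, 4 + length);

        // The identifier runs up to the first NUL byte (or the end of the segment).
        int i = start;
        while (i < end && data[i] != 0) {
            i++;
        }
        String id = new String(data, start, i - start, StandardCharsets.US_ASCII);
        return "APP" + (marker - 0xE0) + "/" + id;
    }
}
```

A full parser would walk every segment using the length field, since (as noted above) an Exif segment may appear after the JFIF one.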

PS: To read JPEGs in Java, you might want to have a look at my TwelveMonkeys ImageIO library, to expand the number of "JPEG" varieties ImageIO can read.

Is it possible to find an image file extension from its byte array contents?

Most file formats have a "magic number" at the beginning: a few bytes that indicate what kind of file it is.

GIF starts with "GIF89a" or "GIF87a", i.e. bytes 47,49,46,38,39,61 or 47,49,46,38,37,61 (both hexadecimal)

JPEG starts with FF,D8 and ends with FF,D9

PNG begins with 89,50,4E,47,0D,0A,1A,0A

See more about Magic Numbers on Wikipedia

For other formats you can google the magic number. Some don't have any though.
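
Put together, the three signatures listed above can be checked with a simple prefix comparison. This is a minimal Java sketch with hypothetical names; it returns a conventional extension for the detected format, or null when none of the three prefixes match.

```java
final class ImageMagic {
    private ImageMagic() {}

    private static final byte[] PNG =
        {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF87A = {0x47, 0x49, 0x46, 0x38, 0x37, 0x61};
    private static final byte[] GIF89A = {0x47, 0x49, 0x46, 0x38, 0x39, 0x61};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8};

    /** Returns "png", "gif" or "jpg" based on the leading bytes, or null if unknown. */
    static String guessExtension(byte[] data) {
        if (startsWith(data, PNG)) return "png";
        if (startsWith(data, GIF87A) || startsWith(data, GIF89A)) return "gif";
        if (startsWith(data, JPEG)) return "jpg";
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

The two-byte JPEG prefix is the weakest of the three, so it is checked last; more formats can be added as further constants.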

How can I know what image format I get from a stream?

You can check the Image.RawFormat property. Once you have loaded the image from the stream, you can test:

if (ImageFormat.Jpeg.Equals(image.RawFormat))
{
    // JPEG
}
else if (ImageFormat.Png.Equals(image.RawFormat))
{
    // PNG
}
else if (ImageFormat.Gif.Equals(image.RawFormat))
{
    // GIF
}
... etc

C#: How can I test whether a file is a JPEG?

Several options:

You can check for the file extension:

static bool HasJpegExtension(string filename)
{
    // add other possible extensions here
    return Path.GetExtension(filename).Equals(".jpg", StringComparison.InvariantCultureIgnoreCase)
        || Path.GetExtension(filename).Equals(".jpeg", StringComparison.InvariantCultureIgnoreCase);
}

or check for the correct magic number in the header of the file:

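static bool HasJpegHeader(string filename)
{
    using (BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open, FileAccess.Read)))
    {
        // BinaryReader reads little-endian, so the bytes FF D8 come back as 0xD8FF.
        UInt16 soi = br.ReadUInt16();    // Start of Image (SOI) marker (FF D8)
        UInt16 marker = br.ReadUInt16(); // next marker, e.g. JFIF (FF E0) or Exif (FF E1)

        // The mask accepts any marker from FF E0 upward, but rejects JPEGs
        // whose first segment is not an APPn segment.
        return soi == 0xd8ff && (marker & 0xe0ff) == 0xe0ff;
    }
}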

Another option is to load the image and check its type. This is less efficient (unless you are going to load the image anyway), since it adds the cost of loading and decompression as well as exception handling, but it will probably give you the most reliable result:

static bool IsJpegImage(string filename)
{
    try
    {
        using (System.Drawing.Image img = System.Drawing.Image.FromFile(filename))
        {
            // Two image formats can be compared using the Equals method
            // See http://msdn.microsoft.com/en-us/library/system.drawing.imaging.imageformat.aspx
            //
            return img.RawFormat.Equals(System.Drawing.Imaging.ImageFormat.Jpeg);
        }
    }
    catch (OutOfMemoryException)
    {
        // Image.FromFile throws an OutOfMemoryException
        // if the file does not have a valid image format or
        // GDI+ does not support the pixel format of the file.
        //
        return false;
    }
}

Java get image extension from byte array

There are many solutions. A very simple one, for example:

String contentType = URLConnection.guessContentTypeFromStream(new ByteArrayInputStream(imageBytes));

Alternatively, you can use a third-party library such as Apache Tika:

String contentType = new Tika().detect(imageBytes);
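
Both calls return a MIME type rather than an extension, so one more step is needed. The following sketch uses only the JDK call shown above and maps a few common image types to extensions; the class name, method name and the mapping itself are assumptions made for illustration. The JDK method may return null when it does not recognize the header, so the null case is handled explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

final class ContentTypeToExtension {
    private ContentTypeToExtension() {}

    /** Returns "jpg", "png" or "gif" for recognized image bytes, or null otherwise. */
    static String extensionFor(byte[] imageBytes) {
        String contentType;
        // ByteArrayInputStream supports mark/reset, which the JDK method requires.
        try (InputStream in = new ByteArrayInputStream(imageBytes)) {
            contentType = URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (contentType == null) {
            return null; // header not recognized
        }
        switch (contentType) {
            case "image/jpeg": return "jpg";
            case "image/png":  return "png";
            case "image/gif":  return "gif";
            default:           return null; // recognized, but not mapped here
        }
    }
}
```

If you need broader coverage, the same mapping step can be applied to the string returned by Tika instead.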

