Fast Way to Get Image Dimensions (Not Filesize)

How to get dimensions of image without loading the image

There is a complete article with working code on CodeProject:

http://www.codeproject.com/Articles/35978/Reading-Image-Headers-to-Get-Width-and-Height

The author reads only the header information to get the dimensions of the image. Performance looks good.

Get Image size WITHOUT loading image into memory

As the comments suggest, PIL does not load the image into memory when calling .open. Looking at the source of PIL 1.1.7, the docstring for .open says:

def open(fp, mode="r"):
    "Open an image file, without loading the raster data"

There are a few file operations in the source like:

...
prefix = fp.read(16)
...
fp.seek(0)
...

but these hardly constitute reading the whole file. In fact, on success .open simply returns an image object that holds on to the file object and the filename. In addition, the docs say:

open(file, mode="r")

Opens and identifies the given image file.

This is a lazy operation; this function identifies the file, but the actual image data is not read from the file until you try to process the data (or call the load method).

Digging deeper, we see that .open calls _open, which is an image-format-specific overload. Each implementation of _open lives in its own file; e.g. JPEG files are handled in JpegImagePlugin.py. Let's look at that one in depth.

Here things get a bit trickier: the method contains an infinite loop that is only broken out of when the JPEG start-of-scan marker is found:

while True:

    s = s + self.fp.read(1)
    i = i16(s)

    if i in MARKER:
        name, description, handler = MARKER[i]
        # print hex(i), name, description
        if handler is not None:
            handler(self, i)
        if i == 0xFFDA: # start of scan
            rawmode = self.mode
            if self.mode == "CMYK":
                rawmode = "CMYK;I" # assume adobe conventions
            self.tile = [("jpeg", (0,0) + self.size, 0, (rawmode, ""))]
            # self.__offset = self.fp.tell()
            break
        s = self.fp.read(1)
    elif i == 0 or i == 65535:
        # padded marker or junk; move on
        s = "\xff"
    else:
        raise SyntaxError("no marker found")

This looks like it could read the whole file if the file were malformed. If it reads the markers OK, however, it should break out early, at the start of the scan data. The handler function ultimately sets self.size, which holds the dimensions of the image.
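For comparison, here is a minimal standard-library sketch of the same idea: walk the JPEG segments and stop at the first start-of-frame (SOF) marker. It is not PIL's code. The names jpeg_dimensions, SOF_MARKERS and STANDALONE are made up for this illustration, and the sketch assumes a reasonably well-formed file.

```python
import struct
from typing import BinaryIO, Tuple

# Start-of-frame markers carry the frame size.
# 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) are not SOF markers.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}
# Stuffed bytes and markers that have no length field (TEM, RSTn, SOI, EOI).
STANDALONE = {0x00, 0x01} | set(range(0xD0, 0xDA))


def jpeg_dimensions(fp: BinaryIO) -> Tuple[int, int]:
    """Return (width, height), reading only up to the first SOF segment."""
    if fp.read(2) != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    while True:
        byte = fp.read(1)
        if not byte:
            raise ValueError("no SOF marker found")
        if byte != b"\xff":
            continue  # skip junk until the next marker prefix
        marker = fp.read(1)
        while marker == b"\xff":  # fill bytes may precede the marker code
            marker = fp.read(1)
        if not marker:
            raise ValueError("no SOF marker found")
        code = marker[0]
        if code in STANDALONE:
            continue
        header = fp.read(2)
        if len(header) < 2:
            raise ValueError("truncated segment")
        (length,) = struct.unpack(">H", header)  # length includes these 2 bytes
        if code in SOF_MARKERS:
            frame = fp.read(5)
            if len(frame) < 5:
                raise ValueError("truncated SOF segment")
            # Layout: 1 byte sample precision, then height, then width.
            _precision, height, width = struct.unpack(">BHH", frame)
            return width, height
        if code == 0xDA:
            raise ValueError("reached scan data before any SOF marker")
        fp.read(max(length - 2, 0))  # skip the rest of this segment
```

Called as jpeg_dimensions(open(path, "rb")), this reads only as far as the frame header. It uses read rather than seek, so it should also work on streams that are not seekable.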

Is it possible to detect the dimensions of an image at a URL without downloading the entire image?

I don't know whether it will help you speed up your application, but it can be done. Check out these two articles:

http://www.anttikupila.com/flash/getting-jpg-dimensions-with-as3-without-loading-the-entire-file/ for JPEG

http://www.herrodius.com/blog/265 for PNG

They are both for ActionScript, but the principle applies to other languages as well.

I made a sample using C#. It's not the prettiest code and it only works for JPEGs, but it can easily be extended to PNG too:

var request = (HttpWebRequest) WebRequest.Create("http://unawe.org/joomla/images/materials/posters/galaxy/galaxy_poster2_very_large.jpg");
using (WebResponse response = request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
{
    int r;
    bool found = false;
    while (!found && (r = responseStream.ReadByte()) != -1)
    {
        if (r != 255) continue;

        int marker = responseStream.ReadByte();

        // App specific (APP0..APP15): skip the payload
        if (marker >= 224 && marker <= 239)
        {
            int payloadLengthHi = responseStream.ReadByte();
            int payloadLengthLo = responseStream.ReadByte();
            int payloadLength = (payloadLengthHi << 8) + payloadLengthLo;
            for (int i = 0; i < payloadLength - 2; i++)
                responseStream.ReadByte();
        }
        // SOF0
        else if (marker == 192)
        {
            // Length of payload - don't care
            responseStream.ReadByte();
            responseStream.ReadByte();

            // Bit depth - don't care
            responseStream.ReadByte();

            // In the frame header, height is stored before width
            int heightHi = responseStream.ReadByte();
            int heightLo = responseStream.ReadByte();
            int height = (heightHi << 8) + heightLo;

            int widthHi = responseStream.ReadByte();
            int widthLo = responseStream.ReadByte();
            int width = (widthHi << 8) + widthLo;

            Console.WriteLine(width + "x" + height);
            found = true;
        }
    }
}

EDIT:
I'm no Python expert, but this article seems to describe a Python library doing just that (last sample): http://effbot.org/zone/pil-image-size.htm
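If you would rather stay with the Python standard library, one option is to ask the server for only the first few kilobytes with an HTTP Range header. The sketch below is an illustration, not a tested downloader: range_request and fetch_prefix are hypothetical helper names, the default of 4096 bytes is an arbitrary guess, and a server is free to ignore the Range header, which is why the read is capped as well.

```python
import urllib.request


def range_request(url: str, nbytes: int) -> urllib.request.Request:
    """Build a GET request asking for only the first nbytes bytes."""
    # Byte ranges are inclusive, so the last index is nbytes - 1.
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{nbytes - 1}"})


def fetch_prefix(url: str, nbytes: int = 4096, timeout: float = 10.0) -> bytes:
    """Download at most nbytes from the start of the resource."""
    with urllib.request.urlopen(range_request(url, nbytes), timeout=timeout) as resp:
        # A server may ignore Range and send the whole body, so cap the read;
        # leaving the with-block closes the connection without reading the rest.
        return resp.read(nbytes)
```

The returned bytes can then be handed to whatever header parser you use. If the size information sits further into the file than the prefix (a JPEG with a large embedded thumbnail, for example), you would need to request a bigger prefix.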

How to get image width and height in pixels without loading file in memory

I think exiftool fits the bill nicely here. It runs on all platforms, is very controllable and, crucially, it can recurse on its own, so it doesn't incur the overhead of being started once per file.

As a rough first attempt, you'd want something like this to process PNGs and JPEGs, recursing down from the current directory (i.e. .):

exiftool -csv -ImageHeight -ImageWidth -r -ext jpg -ext jpeg -ext png .

Sample Output

black.png,80,80
blue.png,80,80
c.jpg,1,1
deskew/skew40.png,800,800
deskew/gradient.png,800,800

You may want to add -q to exclude the summary if you are parsing the output.

As a rough guide, the above command runs in 9 seconds on a directory containing 10,000 images on my Mac.
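If you then want the results in Python, the CSV can be read with the standard csv module. This is a rough sketch: parse_exiftool_csv is a hypothetical helper, and it assumes either a header row naming SourceFile, ImageHeight and ImageWidth or, when there is no header, the column order of the sample output above (file, height, width).

```python
import csv
import io
from typing import Dict, Tuple


def parse_exiftool_csv(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each file path to (width, height) from exiftool CSV output."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return {}
    if rows[0][0] == "SourceFile":
        header, rows = rows[0], rows[1:]
    else:
        # Assumption: no header row, columns ordered as in the sample output.
        header = ["SourceFile", "ImageHeight", "ImageWidth"]
    col = {name: index for index, name in enumerate(header)}

    sizes: Dict[str, Tuple[int, int]] = {}
    for row in rows:
        try:
            width = int(row[col["ImageWidth"]])
            height = int(row[col["ImageHeight"]])
        except (KeyError, IndexError, ValueError):
            continue  # file without usable size tags
        sizes[row[col["SourceFile"]]] = (width, height)
    return sizes
```

The text argument could come from subprocess.run(..., capture_output=True, text=True).stdout, for example. Files for which exiftool reports no size are simply skipped.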

fast way to read height width and file size of jpeg without decoding

The file size can be checked with the FileManager API.

Image width and height can be checked via the CGImageSource functions (ImageIO.framework) without loading the image into memory:

import Foundation
import ImageIO

// filePath is assumed to be a String holding the path of the image file
do {
    let attribute = try FileManager.default.attributesOfItem(atPath: filePath)

    // Filesize
    let fileSize = attribute[FileAttributeKey.size] as! Int

    // Width & Height
    let imageFileUrl = URL(fileURLWithPath: filePath)
    if let imageSource = CGImageSourceCreateWithURL(imageFileUrl as CFURL, nil) {
        if let imageProperties = CGImageSourceCopyPropertiesAtIndex(imageSource, 0, nil) as Dictionary? {

            let width = imageProperties[kCGImagePropertyPixelWidth] as! Int
            let height = imageProperties[kCGImagePropertyPixelHeight] as! Int

            if (height > 4096 || width > 4096 || height < 256 || width < 256) {
                print("Size not valid")
            } else {
                print("Size is valid")
            }
        }
    }

} catch {
    print("File attributes cannot be read")
}

Getting image dimensions without reading the entire file

Your best bet, as always, is to find a well-tested library. However, you said that is difficult, so here is some dodgy, largely untested code that should work for a fair number of cases:

using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Linq;

namespace ImageDimensions
{
    public static class ImageHelper
    {
        const string errorMessage = "Could not recognize image format.";

        private static Dictionary<byte[], Func<BinaryReader, Size>> imageFormatDecoders = new Dictionary<byte[], Func<BinaryReader, Size>>()
        {
            { new byte[]{ 0x42, 0x4D }, DecodeBitmap},
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 }, DecodeGif },
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 }, DecodeGif },
            { new byte[]{ 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, DecodePng },
            { new byte[]{ 0xff, 0xd8 }, DecodeJfif },
        };

        /// <summary>
        /// Gets the dimensions of an image.
        /// </summary>
        /// <param name="path">The path of the image to get the dimensions of.</param>
        /// <returns>The dimensions of the specified image.</returns>
        /// <exception cref="ArgumentException">The image was of an unrecognized format.</exception>
        public static Size GetDimensions(string path)
        {
            using (BinaryReader binaryReader = new BinaryReader(File.OpenRead(path)))
            {
                try
                {
                    return GetDimensions(binaryReader);
                }
                catch (ArgumentException e)
                {
                    if (e.Message.StartsWith(errorMessage))
                    {
                        throw new ArgumentException(errorMessage, "path", e);
                    }
                    else
                    {
                        // Rethrow without resetting the stack trace
                        throw;
                    }
                }
            }
        }

        /// <summary>
        /// Gets the dimensions of an image.
        /// </summary>
        /// <param name="binaryReader">A reader positioned at the start of the image data.</param>
        /// <returns>The dimensions of the specified image.</returns>
        /// <exception cref="ArgumentException">The image was of an unrecognized format.</exception>
        public static Size GetDimensions(BinaryReader binaryReader)
        {
            int maxMagicBytesLength = imageFormatDecoders.Keys.OrderByDescending(x => x.Length).First().Length;

            byte[] magicBytes = new byte[maxMagicBytesLength];

            // Read one byte at a time so the stream sits right after the signature
            for (int i = 0; i < maxMagicBytesLength; i += 1)
            {
                magicBytes[i] = binaryReader.ReadByte();

                foreach(var kvPair in imageFormatDecoders)
                {
                    if (magicBytes.StartsWith(kvPair.Key))
                    {
                        return kvPair.Value(binaryReader);
                    }
                }
            }

            throw new ArgumentException(errorMessage, "binaryReader");
        }

        private static bool StartsWith(this byte[] thisBytes, byte[] thatBytes)
        {
            for(int i = 0; i < thatBytes.Length; i+= 1)
            {
                if (thisBytes[i] != thatBytes[i])
                {
                    return false;
                }
            }
            return true;
        }

        // Note: despite the name, this reverses the bytes as it reads them, so on a
        // little-endian host it decodes big-endian values (what PNG and JPEG store).
        private static short ReadLittleEndianInt16(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(short)];
            for (int i = 0; i < sizeof(short); i += 1)
            {
                bytes[sizeof(short) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt16(bytes, 0);
        }

        // Same caveat as above: the stream value is treated as big-endian.
        private static int ReadLittleEndianInt32(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(int)];
            for (int i = 0; i < sizeof(int); i += 1)
            {
                bytes[sizeof(int) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt32(bytes, 0);
        }

        private static Size DecodeBitmap(BinaryReader binaryReader)
        {
            // Skip the rest of the file header and the DIB header size field
            binaryReader.ReadBytes(16);
            int width = binaryReader.ReadInt32();
            int height = binaryReader.ReadInt32();
            return new Size(width, height);
        }

        private static Size DecodeGif(BinaryReader binaryReader)
        {
            int width = binaryReader.ReadInt16();
            int height = binaryReader.ReadInt16();
            return new Size(width, height);
        }

        private static Size DecodePng(BinaryReader binaryReader)
        {
            // Skip the IHDR chunk length and chunk type
            binaryReader.ReadBytes(8);
            int width = binaryReader.ReadLittleEndianInt32();
            int height = binaryReader.ReadLittleEndianInt32();
            return new Size(width, height);
        }

        private static Size DecodeJfif(BinaryReader binaryReader)
        {
            while (binaryReader.ReadByte() == 0xff)
            {
                byte marker = binaryReader.ReadByte();
                // Lengths are unsigned 16-bit; mask so values above 32767 stay positive
                int chunkLength = binaryReader.ReadLittleEndianInt16() & 0xFFFF;

                // 0xc0 = baseline frame (SOF0), 0xc2 = progressive frame (SOF2)
                if (marker == 0xc0 || marker == 0xc2)
                {
                    binaryReader.ReadByte();

                    int height = binaryReader.ReadLittleEndianInt16() & 0xFFFF;
                    int width = binaryReader.ReadLittleEndianInt16() & 0xFFFF;
                    return new Size(width, height);
                }

                binaryReader.ReadBytes(chunkLength - 2);
            }

            throw new ArgumentException(errorMessage);
        }
    }
}

Hopefully the code is fairly obvious. To add a new file format, add it to imageFormatDecoders, with the key being an array of the "magic bytes" that appear at the beginning of every file of the given format and the value being a function that extracts the size from the stream. Most formats are simple enough; the only real stinker is JPEG.
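The same table-driven design translates to other languages. Below is a rough Python sketch of it, offered as an illustration rather than a port: DECODERS and image_dimensions are hypothetical names, JPEG is left out to keep it short, and the BMP decoder assumes the common header variant with 32-bit width and height fields.

```python
import struct
from typing import BinaryIO, Callable, Dict, Tuple

Size = Tuple[int, int]  # (width, height)


def _decode_gif(fp: BinaryIO) -> Size:
    # Logical screen width and height: little-endian uint16, right after the signature.
    width, height = struct.unpack("<HH", fp.read(4))
    return width, height


def _decode_png(fp: BinaryIO) -> Size:
    fp.read(8)  # skip the IHDR chunk length and chunk type
    width, height = struct.unpack(">II", fp.read(8))  # big-endian uint32
    return width, height


def _decode_bmp(fp: BinaryIO) -> Size:
    fp.read(16)  # skip the rest of the file header and the DIB header size
    width, height = struct.unpack("<ii", fp.read(8))
    return width, abs(height)  # a negative height marks a top-down bitmap


# Magic bytes at the start of the file -> function that extracts the size.
DECODERS: Dict[bytes, Callable[[BinaryIO], Size]] = {
    b"BM": _decode_bmp,
    b"GIF87a": _decode_gif,
    b"GIF89a": _decode_gif,
    b"\x89PNG\r\n\x1a\n": _decode_png,
}


def image_dimensions(fp: BinaryIO) -> Size:
    """Identify the format from its magic bytes, then read only the size fields."""
    longest = max(len(magic) for magic in DECODERS)
    prefix = b""
    # Read byte by byte so the stream sits right after the signature
    # when the matching decoder takes over.
    while len(prefix) < longest:
        byte = fp.read(1)
        if not byte:
            break
        prefix += byte
        decoder = DECODERS.get(prefix)
        if decoder is not None:
            return decoder(fp)
    raise ValueError("Could not recognize image format.")
```

Supporting another format means adding one entry to DECODERS. A truncated file surfaces as struct.error from the decoder, which a real implementation would probably want to catch and report more clearly.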

Fast way to get remote image dimensions

I think this gem does what you want: https://github.com/sdsykes/fastimage

"FastImage finds the size or type of an image given its uri by fetching as little as needed"


