Getting Pixel Format from CGImage

Getting pixel format from CGImage

Some years later, and after testing my findings in production, I can share them with good confidence, though I'm still hoping someone with deeper theoretical knowledge will explain things better. Good places to refresh your memory:

  • Wikipedia: RGBA color space – Representation
  • Apple Lists: Byte Order in CGBitmapContextCreate
  • Apple Lists: kCGImageAlphaPremultiplied First/Last

Based on that, you can use the following extensions:

public enum PixelFormat
{
    case abgr
    case argb
    case bgra
    case rgba
}

extension CGBitmapInfo
{
    public static var byteOrder16Host: CGBitmapInfo {
        return CFByteOrderGetCurrent() == Int(CFByteOrderLittleEndian.rawValue) ? .byteOrder16Little : .byteOrder16Big
    }

    public static var byteOrder32Host: CGBitmapInfo {
        return CFByteOrderGetCurrent() == Int(CFByteOrderLittleEndian.rawValue) ? .byteOrder32Little : .byteOrder32Big
    }
}
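
As a quick illustration of what the helpers above are for (a minimal sketch, relying on the byteOrder32Host extension just defined), a context created with the host byte order lays out its 32-bit pixels to match the CPU, which on little-endian Macs means BGRA in memory:

import CoreGraphics

// Alpha-first pixels in host byte order: BGRA in memory on little-endian hosts.
let hostInfo: UInt32 = CGImageAlphaInfo.premultipliedFirst.rawValue | CGBitmapInfo.byteOrder32Host.rawValue
let hostContext = CGContext(data: nil, width: 1, height: 1, bitsPerComponent: 8,
                            bytesPerRow: 4, space: CGColorSpace(name: CGColorSpace.sRGB)!,
                            bitmapInfo: hostInfo)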

extension CGBitmapInfo
{
    public var pixelFormat: PixelFormat? {

        // AlphaFirst – the alpha channel is next to the red channel: argb and bgra are both alpha-first formats.
        // AlphaLast – the alpha channel is next to the blue channel: rgba and abgr are both alpha-last formats.
        // LittleEndian – blue comes before red: bgra and abgr are little-endian formats.
        // Little-endian ordered pixels are BGR (BGRX, XBGR, BGRA, ABGR, BGR).
        // BigEndian – red comes before blue: argb and rgba are big-endian formats.
        // Big-endian ordered pixels are RGB (XRGB, RGBX, ARGB, RGBA, RGB).

        let alphaInfo: CGImageAlphaInfo? = CGImageAlphaInfo(rawValue: self.rawValue & type(of: self).alphaInfoMask.rawValue)
        let alphaFirst: Bool = alphaInfo == .premultipliedFirst || alphaInfo == .first || alphaInfo == .noneSkipFirst
        let alphaLast: Bool = alphaInfo == .premultipliedLast || alphaInfo == .last || alphaInfo == .noneSkipLast
        let endianLittle: Bool = self.contains(.byteOrder32Little)

        // This is slippery… while the host byte order is little-endian, bytes with no explicit
        // byte order are stored big-endian. Here we assume that if no byte order is given,
        // plain RGB ordering (i.e., big-endian) is used.

        if alphaFirst && endianLittle {
            return .bgra
        } else if alphaFirst {
            return .argb
        } else if alphaLast && endianLittle {
            return .abgr
        } else if alphaLast {
            return .rgba
        } else {
            return nil
        }
    }
}

Note that you should always pay attention to the colour space – it directly affects how raw pixel data is stored. CGColorSpace(name: CGColorSpace.sRGB) is probably the safest choice – it stores colours in plain form: if you deal with red in RGB, it will be stored just like that, (255, 0, 0), while the device colour space will give you something like (235, 73, 53).
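
As a quick illustration (a hedged sketch using AppKit's NSColor conversions), the same nominal red produces different component values once converted between colour spaces:

import AppKit

// The same nominal red in two colour spaces – component values differ.
let red = NSColor(srgbRed: 1, green: 0, blue: 0, alpha: 1)
if let deviceRed = red.usingColorSpace(.deviceRGB) {
    print(deviceRed.redComponent, deviceRed.greenComponent, deviceRed.blueComponent) // likely not exactly (1, 0, 0)
}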

To see this in practice, drop the above and the following into a playground. You'll need two one-pixel red images, one with an alpha channel and one without.

import AppKit
import CoreGraphics

extension CFData
{
    public var pixelComponents: [UInt8] {
        let count: Int = CFDataGetLength(self)
        let buffer: UnsafeMutablePointer<UInt8> = UnsafeMutablePointer.allocate(capacity: count)
        defer { buffer.deallocate() }
        CFDataGetBytes(self, CFRange(location: 0, length: count), buffer)
        return Array(UnsafeBufferPointer(start: buffer, count: count))
    }
}

let color: NSColor = .red
Thread.sleep(forTimeInterval: 2)

// Must flip coordinates to capture what we want…
let screen: NSScreen = NSScreen.screens.first(where: { $0.frame.contains(NSEvent.mouseLocation) })!
let rect: CGRect = CGRect(origin: CGPoint(x: NSEvent.mouseLocation.x - 10, y: screen.frame.height - NSEvent.mouseLocation.y), size: CGSize(width: 1, height: 1))

Swift.print("Will capture image with \(rect) frame.")

let screenImage: CGImage = CGWindowListCreateImage(rect, [], kCGNullWindowID, [])!
let urlImageWithAlpha: CGImage = NSImage(byReferencing: URL(fileURLWithPath: "/Users/ianbytchek/Downloads/red-pixel-with-alpha.png")).cgImage(forProposedRect: nil, context: nil, hints: nil)!
let urlImageNoAlpha: CGImage = NSImage(byReferencing: URL(fileURLWithPath: "/Users/ianbytchek/Downloads/red-pixel-no-alpha.png")).cgImage(forProposedRect: nil, context: nil, hints: nil)!

Swift.print(screenImage.colorSpace!, screenImage.bitmapInfo, screenImage.bitmapInfo.pixelFormat!, screenImage.dataProvider!.data!.pixelComponents)
Swift.print(urlImageWithAlpha.colorSpace!, urlImageWithAlpha.bitmapInfo, urlImageWithAlpha.bitmapInfo.pixelFormat!, urlImageWithAlpha.dataProvider!.data!.pixelComponents)
Swift.print(urlImageNoAlpha.colorSpace!, urlImageNoAlpha.bitmapInfo, urlImageNoAlpha.bitmapInfo.pixelFormat!, urlImageNoAlpha.dataProvider!.data!.pixelComponents)

let formats: [CGBitmapInfo.RawValue] = [
    CGImageAlphaInfo.premultipliedFirst.rawValue,
    CGImageAlphaInfo.noneSkipFirst.rawValue,
    CGImageAlphaInfo.premultipliedLast.rawValue,
    CGImageAlphaInfo.noneSkipLast.rawValue,
]

for format in formats {

    // This "paints" and prints out the components in the order they are stored in the data.

    let context: CGContext = CGContext(data: nil, width: 1, height: 1, bitsPerComponent: 8, bytesPerRow: 32, space: CGColorSpace(name: CGColorSpace.sRGB)!, bitmapInfo: format)!
    let components: UnsafeBufferPointer<UInt8> = UnsafeBufferPointer(start: context.data!.assumingMemoryBound(to: UInt8.self), count: 4)

    context.setFillColor(red: 1 / 0xFF, green: 2 / 0xFF, blue: 3 / 0xFF, alpha: 1)
    context.fill(CGRect(x: 0, y: 0, width: 1, height: 1))
    Swift.print(context.colorSpace!, context.bitmapInfo, context.bitmapInfo.pixelFormat!, Array(components))
}

This will output the following. Pay attention to how the screen-captured image differs from the ones loaded from disk.

Will capture image with (285.7734375, 294.5, 1.0, 1.0) frame.
<CGColorSpace 0x7fde4e9103e0> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; iMac) CGBitmapInfo(rawValue: 8194) bgra [27, 13, 252, 255]
<CGColorSpace 0x7fde4d703b20> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; Color LCD) CGBitmapInfo(rawValue: 3) rgba [235, 73, 53, 255]
<CGColorSpace 0x7fde4e915dc0> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; Color LCD) CGBitmapInfo(rawValue: 5) rgba [235, 73, 53, 255]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 2) argb [255, 1, 2, 3]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 6) argb [255, 1, 2, 3]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 1) rgba [1, 2, 3, 255]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 5) rgba [1, 2, 3, 255]


How to determine and interpret the pixel format of a CGImage

To ensure device independence, it may be better to use a CGBitmapContext to populate the data for you.

Something like this should work:

// Get the CGImageRef
CGImageRef imageRef = [theImage CGImage];

// Find width and height
NSUInteger width = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);

// Set up the color space
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

// Allocate the buffer that the image data will be drawn into
unsigned char *rawData = malloc(height * width * 4);

// Create a CGBitmapContext to draw the image into
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;
CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                             bitsPerComponent, bytesPerRow, colorSpace,
                                             kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease(colorSpace);

// Draw the image, which populates rawData
CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
CGContextRelease(context);

for (NSUInteger y = 0; y < height; y++) {
    for (NSUInteger x = 0; x < width; x++) {
        NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;

        // Components are in RGBA order, premultiplied, in the 0–255 range.
        CGFloat red = rawData[byteIndex];
        CGFloat green = rawData[byteIndex + 1];
        CGFloat blue = rawData[byteIndex + 2];
        CGFloat alpha = rawData[byteIndex + 3];
    }
}

free(rawData);
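
The same technique might look like the following in Swift (a sketch; the function name is illustrative). It redraws an arbitrary CGImage into a context whose format we control, so the bytes come back as RGBA8888 regardless of how the source was stored:

import CoreGraphics

// A sketch: redraw any CGImage into a known format, then read the bytes back
// as RGBA, premultiplied, 8 bits per component.
func rgbaBytes(of image: CGImage) -> [UInt8]? {
    let width = image.width, height = image.height
    let bytesPerRow = width * 4
    var data = [UInt8](repeating: 0, count: height * bytesPerRow)
    let drawn: Bool = data.withUnsafeMutableBytes { buffer in
        guard let context = CGContext(data: buffer.baseAddress, width: width, height: height,
                                      bitsPerComponent: 8, bytesPerRow: bytesPerRow,
                                      space: CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue) else { return false }
        context.draw(image, in: CGRect(x: 0, y: 0, width: width, height: height))
        return true
    }
    return drawn ? data : nil
}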

Getting RGBA values for all pixels of a CGImage in Swift

You can access the underlying pixels in a vImage buffer to do this.

For example, given an image named cgImage, use the following code to populate a vImage buffer:

import Accelerate

guard
    let format = vImage_CGImageFormat(cgImage: cgImage),
    let buffer = try? vImage_Buffer(cgImage: cgImage, format: format) else {
        exit(-1)
}

let rowStride = buffer.rowBytes / MemoryLayout<Pixel_8>.stride / format.componentCount

Note that a vImage buffer's data may be wider than the image (see: https://developer.apple.com/documentation/accelerate/finding_the_sharpest_image_in_a_sequence_of_captured_images), which is why I've added rowStride.

To access the pixels as a single buffer of interleaved values, use:

do {
    let n = rowStride * Int(buffer.height) * format.componentCount
    let start = buffer.data.assumingMemoryBound(to: Pixel_8.self)
    let ptr = UnsafeBufferPointer(start: start, count: n)

    print(Array(ptr)[0 ... 15]) // prints the first 16 interleaved values
}

To access the pixels as a buffer of Pixel_8888 values, use the following (make sure that format.componentCount is 4):

do {
    let n = rowStride * Int(buffer.height)
    let start = buffer.data.assumingMemoryBound(to: Pixel_8888.self)
    let ptr = UnsafeBufferPointer(start: start, count: n)

    print(Array(ptr)[0 ... 3]) // prints the first 4 pixels
}

How to get pixel data from a UIImage (Cocoa Touch) or CGImage (Core Graphics)?

FYI, I combined Keremk's answer with my original outline, cleaned up the typos, generalized it to return an array of colors, and got the whole thing to compile. Here is the result:

+ (NSArray*)getRGBAsFromImage:(UIImage*)image atX:(int)x andY:(int)y count:(int)count
{
    NSMutableArray *result = [NSMutableArray arrayWithCapacity:count];

    // First get the image into your data buffer
    CGImageRef imageRef = [image CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
    NSUInteger bytesPerPixel = 4;
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);

    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    // Now rawData contains the image data in the RGBA8888 pixel format,
    // premultiplied by alpha, so un-premultiply the components before building colors.
    NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;
    for (int i = 0; i < count; ++i)
    {
        CGFloat alpha = ((CGFloat) rawData[byteIndex + 3]) / 255.0f;
        CGFloat divisor = alpha > 0 ? alpha : 1; // avoid dividing by zero for fully transparent pixels
        CGFloat red = ((CGFloat) rawData[byteIndex]) / 255.0f / divisor;
        CGFloat green = ((CGFloat) rawData[byteIndex + 1]) / 255.0f / divisor;
        CGFloat blue = ((CGFloat) rawData[byteIndex + 2]) / 255.0f / divisor;
        byteIndex += bytesPerPixel;

        UIColor *acolor = [UIColor colorWithRed:red green:green blue:blue alpha:alpha];
        [result addObject:acolor];
    }

    free(rawData);

    return result;
}

Re: Get pixel data as array from UIImage/CGImage in Swift

You probably forgot the CGImageAlphaInfo parameter. For color images, if you assume bytesPerPixel to be 4, you need to specify either an RGBA or an ARGB layout when creating the context. The following is an example of an RGBA-ordered layout where the alpha byte is skipped (NoneSkipLast):

// RGBA format
let ctx = CGBitmapContextCreate(&data, pixelsWide, pixelsHigh, 8,
                                bitmapBytesPerRow, colorSpace, CGImageAlphaInfo.NoneSkipLast.rawValue)

According to the documentation, you have these options:

enum CGImageAlphaInfo : UInt32 {
    case None
    case PremultipliedLast
    case PremultipliedFirst
    case Last
    case First
    case NoneSkipLast
    case NoneSkipFirst
    case Only
}
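
In modern Swift naming, the same call might look like the following sketch (a small RGBX context; the dimensions are illustrative):

import CoreGraphics

// noneSkipLast = RGBX: red, green, blue, then a skipped (ignored) byte.
let context = CGContext(data: nil, width: 2, height: 2, bitsPerComponent: 8,
                        bytesPerRow: 8, space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue)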

Is there any way to get pixel format information from vImage_Buffer in Swift?

The vImage_Buffer only describes a rectangular array of pixel data. The type of the data (unorm8, float, etc.) is inferred from the name of the function that operates on it. This all should be fairly clear.

From vImage's perspective, the channel order is whatever you say it is. For most vImage functions, the channel order doesn't matter since all the channels are treated the same. They may be named _ARGB8888 but really, they are _XXXX8888. For other vImage functions, (e.g. PremultiplyData) only one channel is treated differently. In that case, it is only important that the alpha channel appear either in the first or last channel as described by the function name. The ordering of the other channels doesn't matter because they are treated the same. For the particular function you are talking about, it is your job to know the ordering of the red, green and blue channels and adjust the ordering of the coefficients matrix accordingly.

The channel ordering in your data is probably set by whatever produced your image data in the first place. Often that is CoreGraphics / ImageIO. In that case – it's a bit complicated – the color channel order matches the order of the colors in the CGImageRef's colorspace. The alpha comes either first or last (if present) based on the CGImage's bitmap info, part of which is the CGImageAlphaInfo.

As a final complication, the entire thing may be subject to a 16- or 32-bit endianness transform. If the size of a channel is smaller than the endianness-transform quantum (16 or 32 bits), then the order of the channels relative to one another has been swapped around per the endianness transform. By far the most common case of this is BGRA unorm8 data, which is encoded as ARGB 8-bit data with a 32-bit little-endian transform tacked on to it. However, it is possible that one might get 16-bit-per-channel grayscale-alpha data with the GA order transposed due to a 32-bit endian transform, which would simultaneously swap G and A relative to one another and convert the 16-bit samples to 16-bit little-endian. (Note that this probably never happens in nature, since you could more clearly classify that as alpha-first with a 16-bit little-endian transform. The encoding is legal, though.)
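
The BGRA case is easy to see in a sketch (assuming an sRGB context): describe the pixels as alpha-first with a 32-bit little-endian transform, fill with pure red, and the bytes land in memory as B, G, R, A:

import CoreGraphics

// Alpha-first + 32-bit little-endian = BGRA in memory.
let info = CGImageAlphaInfo.premultipliedFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue
let bgraContext = CGContext(data: nil, width: 1, height: 1, bitsPerComponent: 8,
                            bytesPerRow: 4, space: CGColorSpace(name: CGColorSpace.sRGB)!,
                            bitmapInfo: info)!
bgraContext.setFillColor(red: 1, green: 0, blue: 0, alpha: 1)
bgraContext.fill(CGRect(x: 0, y: 0, width: 1, height: 1))
let bytes = bgraContext.data!.assumingMemoryBound(to: UInt8.self)
print(bytes[0], bytes[1], bytes[2], bytes[3]) // 0 0 255 255 – B G R A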

There are a few examples in the vImage_Utilities.h C header (not sure about the Swift version) that show common CG encodings.

Getting pixel data from CGImageRef contains extra bytes?

The bitmap data has padding at the end of each row of pixels, to round the number of bytes per row up to a larger value. (In this case, a multiple of 16 bytes.)

This padding is added to make it faster to process and draw the image.

You should use CGImageGetBytesPerRow() to find out how many bytes each row takes. Don't assume that it's the same as CGImageGetWidth() * CGImageGetBitsPerPixel() / 8; the bytes per row may be larger.
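
In Swift terms, the safe indexing looks like the following sketch (the function name is illustrative):

import CoreGraphics

// Address pixel (x, y) using the image's actual row stride – bytesPerRow
// may include padding beyond width * bytesPerPixel.
func pixelOffset(in image: CGImage, x: Int, y: Int) -> Int {
    let bytesPerPixel = image.bitsPerPixel / 8
    return y * image.bytesPerRow + x * bytesPerPixel
}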

Keep in mind that the data behind an arbitrary CGImage may not be in the format that you expect. You cannot assume that all images are 32-bit-per-pixel ARGB with no padding. You should either use the CG functions to figure out what format the data might be, or redraw the image into a bitmap context that's in the exact format you expect. The latter is typically much easier -- let CG do the conversions for you.

(You don't show what parameters you're passing to CGBitmapContextCreate. Are you calculating an exact bytesPerRow or are you passing in 0? If you pass in 0, CG may add padding for you, and you may find that drawing into the context is faster.)


