Get Pixel Value from CVPixelBufferRef in Swift

Get pixel value from CVPixelBufferRef in Swift

baseAddress is an unsafe mutable pointer, or more precisely an UnsafeMutableRawPointer (an UnsafeMutablePointer<Void> before Swift 3). You can easily access the memory once you have converted the raw pointer to a more specific type:

// Convert the (unwrapped) base address to a typed pointer of the appropriate type
let byteBuffer = baseAddress.assumingMemoryBound(to: UInt8.self)

// read the data (returns value of type UInt8)
let firstByte = byteBuffer[0]

// write data
byteBuffer[3] = 90

Make sure you use the correct type (8, 16 or 32 bit unsigned int). It depends on the video format. Most likely it's 8 bit.

Update on buffer formats:

You can specify the format through the videoSettings of the AVCaptureVideoDataOutput instance when you set it up. You basically have the choice of:

BGRA: a single plane where the blue, green, red and alpha values of a pixel take 8 bits each and are packed together into one 32 bit integer
420YpCbCr8BiPlanarFullRange: Two planes, the first containing a byte for each pixel with the Y (luma) value, the second containing the Cb and Cr (chroma) values for groups of pixels
420YpCbCr8BiPlanarVideoRange: The same as 420YpCbCr8BiPlanarFullRange but the Y values are restricted to the range 16 – 235 (for historical reasons)

If you're interested in the color values and speed (or rather maximum frame rate) is not an issue, then go for the simpler BGRA format. Otherwise take one of the more efficient native video formats.
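The format is requested through the output's videoSettings dictionary. A minimal sketch, assuming the output is later added to an AVCaptureSession; the variable name `videoOutput` is arbitrary and not taken from the original answer:

```swift
import AVFoundation

// Ask the capture output to deliver frames as single-plane 32-bit BGRA.
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.videoSettings = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
]

// For the bi-planar native formats, pass
// kCVPixelFormatType_420YpCbCr8BiPlanarFullRange or
// kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange instead.
```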

If you have two planes, you must get the base address of the desired plane (see video format example):

Video format example

let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
// Lock before touching the memory; .readOnly is enough because we only read
CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
// Plane 0 holds the Y (luma) bytes
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)!
let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let byteBuffer = baseAddress.assumingMemoryBound(to: UInt8.self)

// Get luma value for pixel (43, 17)
let luma = byteBuffer[17 * bytesPerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)
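For the bi-planar 4:2:0 formats, the chroma of a pixel lives in plane 1, where each interleaved Cb/Cr byte pair covers a 2x2 block of pixels. The following is a minimal sketch of the index arithmetic only. `chromaOffset` and `readYCbCr` are hypothetical helper names, and the plane pointers and row strides are assumed to come from CVPixelBufferGetBaseAddressOfPlane and CVPixelBufferGetBytesPerRowOfPlane (planes 0 and 1) while the buffer is locked:

```swift
/// Byte offset of the Cb value for pixel (x, y) inside the chroma plane.
/// The matching Cr value is the next byte.
func chromaOffset(x: Int, y: Int, chromaBytesPerRow: Int) -> Int {
    // One chroma row per two luma rows, one Cb/Cr pair per two luma columns.
    return (y / 2) * chromaBytesPerRow + (x / 2) * 2
}

/// Reads (Y, Cb, Cr) for one pixel from the luma and chroma planes.
func readYCbCr(x: Int, y: Int,
               lumaPlane: UnsafePointer<UInt8>, lumaBytesPerRow: Int,
               chromaPlane: UnsafePointer<UInt8>, chromaBytesPerRow: Int)
    -> (y: UInt8, cb: UInt8, cr: UInt8) {
    let luma = lumaPlane[y * lumaBytesPerRow + x]
    let offset = chromaOffset(x: x, y: y, chromaBytesPerRow: chromaBytesPerRow)
    return (luma, chromaPlane[offset], chromaPlane[offset + 1])
}
```

Neighbouring pixels such as (42, 16), (43, 16), (42, 17) and (43, 17) share the same Cb/Cr pair, which is why the chroma plane is smaller than the luma plane.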

BGRA example

let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)!
// CVPixelBufferGetBytesPerRow returns bytes; each BGRA pixel occupies 4 of them
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / MemoryLayout<UInt32>.size
let int32Buffer = baseAddress.assumingMemoryBound(to: UInt32.self)

// Get BGRA value for pixel (43, 17)
let bgra = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)
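Once you have the 32-bit value of a pixel, the individual channels can be separated with shifts and masks. This sketch assumes the kCVPixelFormatType_32BGRA byte order (B, G, R, A in memory) read on a little-endian CPU, which is the case on current Apple hardware; `BGRAPixel` is a hypothetical type introduced for illustration:

```swift
/// The four 8-bit channels of one BGRA pixel.
struct BGRAPixel {
    var blue: UInt8
    var green: UInt8
    var red: UInt8
    var alpha: UInt8

    /// Splits a packed pixel that was loaded from memory as a little-endian UInt32.
    /// Memory order B, G, R, A means blue ends up in the lowest byte.
    init(packed: UInt32) {
        blue  = UInt8(packed & 0xFF)
        green = UInt8((packed >> 8) & 0xFF)
        red   = UInt8((packed >> 16) & 0xFF)
        alpha = UInt8((packed >> 24) & 0xFF)
    }
}

let pixel = BGRAPixel(packed: 0xFF102030)
print(pixel.red, pixel.green, pixel.blue, pixel.alpha) // prints "16 32 48 255"
```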

How can I read individual pixels from a CVPixelBuffer

You have to use the CVPixelBuffer APIs to get the right format to access the data via unsafe pointer manipulations. Here is the basic way:

CVPixelBufferRef pixelBuffer = _lastDepthData.depthDataMap;

CVPixelBufferLockBaseAddress(pixelBuffer, 0);

size_t cols = CVPixelBufferGetWidth(pixelBuffer);
size_t rows = CVPixelBufferGetHeight(pixelBuffer);
Float32 *baseAddress = CVPixelBufferGetBaseAddress( pixelBuffer );

// This next step is not necessary, but I include it here for illustration,
// you can get the type of pixel format, and it is associated with a kCVPixelFormatType
// this can tell you what type of data it is e.g. in this case Float32

OSType type = CVPixelBufferGetPixelFormatType( pixelBuffer);

if (type != kCVPixelFormatType_DepthFloat32) {
    NSLog(@"Wrong type");
}

// Arbitrary values of x and y to sample
int x = 20; // must be lower than cols
int y = 30; // must be lower than rows

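// Get the pixel.  You could iterate here of course to get multiple pixels!
// Rows may be padded, so step by the row stride rather than by cols.
size_t floatsPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / sizeof(Float32);
const Float32 pixel = baseAddress[y * floatsPerRow + x];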

CVPixelBufferUnlockBaseAddress( pixelBuffer, 0 );

Note that the first thing you need to determine is what type of data is in the CVPixelBuffer - if you don't know this then you can use CVPixelBufferGetPixelFormatType() to find out. In this case I am getting depth data at Float32, if you were using another type e.g. Float16, then you would need to replace all occurrences of Float32 with that type.

Note that it's important to lock and unlock the base address using CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress.
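One way to make sure every lock is paired with an unlock is to wrap the access in a small helper that unlocks in a `defer` block. A minimal Swift sketch; `withLockedBaseAddress` and `depthValue` are hypothetical helper names rather than CoreVideo APIs, and the buffer is assumed to be single-plane with 32-bit float pixels:

```swift
import CoreVideo

/// Locks `buffer`, calls `body` with the base address and the bytes per row,
/// and unlocks again even if `body` throws. Returns nil if locking fails.
func withLockedBaseAddress<T>(of buffer: CVPixelBuffer,
                              flags: CVPixelBufferLockFlags = .readOnly,
                              _ body: (UnsafeMutableRawPointer, Int) throws -> T) rethrows -> T? {
    guard CVPixelBufferLockBaseAddress(buffer, flags) == kCVReturnSuccess else { return nil }
    defer { CVPixelBufferUnlockBaseAddress(buffer, flags) }
    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return nil }
    return try body(base, CVPixelBufferGetBytesPerRow(buffer))
}

/// Reads the Float32 value at (x, y), honoring any row padding.
func depthValue(in buffer: CVPixelBuffer, x: Int, y: Int) -> Float32? {
    guard x >= 0, y >= 0,
          x < CVPixelBufferGetWidth(buffer),
          y < CVPixelBufferGetHeight(buffer) else { return nil }
    return withLockedBaseAddress(of: buffer) { base, bytesPerRow -> Float32 in
        // Move to the start of row y in bytes, then index by pixel within the row.
        let row = (base + y * bytesPerRow).assumingMemoryBound(to: Float32.self)
        return row[x]
    }
}
```

Pass `[]` as `flags` instead of the default `.readOnly` when the closure writes to the buffer.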

Correct way to draw/edit a CVPixelBuffer in Swift in iOS

You need to call CVPixelBufferLockBaseAddress(pixelBuffer, 0) before creating the bitmap CGContext and CVPixelBufferUnlockBaseAddress(pixelBuffer, 0) after you have finished drawing to the context.

Without locking the pixel buffer, CVPixelBufferGetBaseAddress() returns NULL. This causes your CGContext to allocate new memory to draw into, which is subsequently discarded.

Also double check your colour space. It's easy to mix up your components.

e.g.

guard
    CVPixelBufferLockBaseAddress(pixelBuffer, []) == kCVReturnSuccess,
    let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                            width: width,
                            height: height,
                            bitsPerComponent: 8,
                            bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                            space: colorSpace,
                            bitmapInfo: alphaInfo.rawValue)
else {
    return nil
}

context.setFillColor(red: 1, green: 0, blue: 0, alpha: 1.0)
context.fillEllipse(in: CGRect(x: 0, y: 0, width: width, height: height))

CVPixelBufferUnlockBaseAddress(pixelBuffer, [])

adapter?.append(pixelBuffer, withPresentationTime: time)

How to get the value of kCVPixelFormatType_DepthFloat16 (half-precision float)?

I have solved my problem already. This can be done in two ways.

The first way is to use kCVPixelFormatType_DepthFloat32 instead of kCVPixelFormatType_DepthFloat16; the depth map keeps the same dimensions and frame rate as before. You can then read the values as Swift Float like this:

let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)

CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
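let floatBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(buffer), to: UnsafeMutablePointer<Float>.self)
// Rows can be padded, so index by floats per row rather than by width
let floatsPerRow = CVPixelBufferGetBytesPerRow(buffer) / MemoryLayout<Float>.size

for y in 0 ..< height {
    for x in 0 ..< width {
        let pixel = floatBuffer[y*floatsPerRow+x]
    }
}
// Unlock the same buffer that was locked above
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))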

The second way is to read the value as its raw UInt16 bit pattern and let vImageConvert_Planar16FtoPlanarF (from the Accelerate framework) widen it to a 32-bit Float:

// to access the point height = y, width = x, thanks to this project https://github.com/edvardHua/Articles/tree/master/%5BAR:MR%20%E5%9F%BA%E7%A1%80%5D%20%E5%88%A9%E7%94%A8%20iPhone%20X%20%E7%9A%84%E6%B7%B1%E5%BA%A6%E7%9B%B8%E6%9C%BA(TruthDepth%20Camera)%E8%8E%B7%E5%BE%97%E5%83%8F%E7%B4%A0%E7%82%B9%E7%9A%84%E4%B8%89%E7%BB%B4%E5%9D%90%E6%A0%87/Obtain3DCoordinate
let rowData = CVPixelBufferGetBaseAddress(buffer)! + Int(y) *  CVPixelBufferGetBytesPerRow(buffer)
var f16Pixel = rowData.assumingMemoryBound(to: UInt16.self)[x]
var f32Pixel = Float(0.0)
var src = vImage_Buffer(data: &f16Pixel, height: 1, width: 1, rowBytes: 2)
var dst = vImage_Buffer(data: &f32Pixel, height: 1, width: 1, rowBytes: 4)
vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
let depth = f32Pixel // depth in meters (CoreVideo defines DepthFloat16 values in meters)
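If pulling in Accelerate for a single value feels heavy, the half-precision bit pattern can also be decoded by hand. The following is a sketch of the standard IEEE 754 binary16 to binary32 widening, not code from the original answer; `floatFromHalfBits` is a hypothetical name:

```swift
/// Converts an IEEE 754 binary16 (half-precision) bit pattern to Float.
func floatFromHalfBits(_ half: UInt16) -> Float {
    let sign = UInt32(half & 0x8000) << 16      // sign bit moved to bit 31
    let exponent = UInt32(half & 0x7C00) >> 10  // 5-bit exponent, bias 15
    let mantissa = UInt32(half & 0x03FF)        // 10-bit fraction

    if exponent == 0 {
        // Zero or subnormal: the value is mantissa * 2^-24.
        let magnitude = Float(mantissa) * Float(sign: .plus, exponent: -24, significand: 1)
        return sign == 0 ? magnitude : -magnitude
    }
    if exponent == 0x1F {
        // Infinity (mantissa == 0) or NaN (mantissa != 0).
        return Float(bitPattern: sign | 0x7F80_0000 | (mantissa << 13))
    }
    // Normal number: rebias the exponent from 15 to 127 and widen the fraction.
    return Float(bitPattern: sign | ((exponent + 112) << 23) | (mantissa << 13))
}

print(floatFromHalfBits(0x3C00)) // prints "1.0"
```

It takes the same kind of UInt16 value that is read from the row pointer with assumingMemoryBound(to: UInt16.self).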

How can I get the x and y coordinates from UnsafeMutableBufferPointer

Your image has a width and height where size = width * height.

Add .enumerated() to buffer in your for loop to get the index of the pixel. Divide and mod by the width to find the (x, y) coordinates of the pixel:

for (index, pixel) in buffer.enumerated() {
    let x = index % width
    let y = index / width

    var r : UInt32 = 0
    var g : UInt32 = 0
    var b : UInt32 = 0
    if cgImage.byteOrderInfo == .orderDefault || cgImage.byteOrderInfo == .order32Big {
        r = pixel & 255
        g = (pixel >> 8) & 255
        b = (pixel >> 16) & 255
    } else if cgImage.byteOrderInfo == .order32Little {
        r = (pixel >> 16) & 255
        g = (pixel >> 8) & 255
        b = pixel & 255
    }
}
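The divide-and-mod mapping assumes the rows are tightly packed, that is, exactly `width` pixels per row. When the pixels come from a buffer with row padding, the same idea works with the row stride in pixels in place of the width. A small sketch; `pixelsPerRow` and both function names are assumptions for illustration:

```swift
/// Converts a linear pixel index into (x, y).
/// `pixelsPerRow` is the row stride in pixels (bytes per row / bytes per pixel);
/// it equals `width` when rows are not padded.
func coordinates(ofIndex index: Int, pixelsPerRow: Int, width: Int) -> (x: Int, y: Int)? {
    let x = index % pixelsPerRow
    let y = index / pixelsPerRow
    // Columns at or past `width` point into padding, not at a real pixel.
    guard x < width else { return nil }
    return (x, y)
}

/// Inverse mapping: linear index of pixel (x, y).
func linearIndex(x: Int, y: Int, pixelsPerRow: Int) -> Int {
    return y * pixelsPerRow + x
}
```

For example, with 60 visible pixels per row stored at a stride of 64, index 70 maps to (6, 1), while index 62 falls into the padding of row 0 and yields nil.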
