OpenGL ES 2.0 to Video on iPad/iPhone

I just got something similar to this working in my open source GPUImage framework, based on the above code, so I thought I'd provide my working solution to this. In my case, I was able to use a pixel buffer pool, as suggested by Srikumar, instead of the manually created pixel buffers for each frame.

I first configure the movie to be recorded:

NSError *error = nil;

assetWriter = [[AVAssetWriter alloc] initWithURL:movieURL fileType:AVFileTypeAppleM4V error:&error];
if (error != nil)
{
NSLog(@"Error: %@", error);
}

NSMutableDictionary * outputSettings = [[NSMutableDictionary alloc] init];
[outputSettings setObject: AVVideoCodecH264 forKey: AVVideoCodecKey];
[outputSettings setObject: [NSNumber numberWithInt: videoSize.width] forKey: AVVideoWidthKey];
[outputSettings setObject: [NSNumber numberWithInt: videoSize.height] forKey: AVVideoHeightKey];

assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:outputSettings];
assetWriterVideoInput.expectsMediaDataInRealTime = YES;

// You need to use BGRA for the video in order to get realtime encoding. I use a color-swizzling shader to line up glReadPixels' normal RGBA output with the movie input's BGRA.
NSDictionary *sourcePixelBufferAttributesDictionary = [NSDictionary dictionaryWithObjectsAndKeys: [NSNumber numberWithInt:kCVPixelFormatType_32BGRA], kCVPixelBufferPixelFormatTypeKey,
[NSNumber numberWithInt:videoSize.width], kCVPixelBufferWidthKey,
[NSNumber numberWithInt:videoSize.height], kCVPixelBufferHeightKey,
nil];

assetWriterPixelBufferInput = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterVideoInput sourcePixelBufferAttributes:sourcePixelBufferAttributesDictionary];

[assetWriter addInput:assetWriterVideoInput];
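One detail not shown in the snippet above: before any pixel buffers can be appended, the asset writer has to be started and a session opened. A minimal sketch (in practice you'd also watch assetWriter.status):

// Not part of the configuration code above; start the writer before appending frames
if ([assetWriter startWriting])
{
    [assetWriter startSessionAtSourceTime:kCMTimeZero];
}
else
{
    NSLog(@"Could not start writing: %@", assetWriter.error);
}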

Then I use this code to grab each rendered frame using glReadPixels():

CVPixelBufferRef pixel_buffer = NULL;

CVReturn status = CVPixelBufferPoolCreatePixelBuffer (NULL, [assetWriterPixelBufferInput pixelBufferPool], &pixel_buffer);
if ((pixel_buffer == NULL) || (status != kCVReturnSuccess))
{
    return;
}
else
{
    CVPixelBufferLockBaseAddress(pixel_buffer, 0);
    GLubyte *pixelBufferData = (GLubyte *)CVPixelBufferGetBaseAddress(pixel_buffer);
    glReadPixels(0, 0, videoSize.width, videoSize.height, GL_RGBA, GL_UNSIGNED_BYTE, pixelBufferData);
}

// May need to add a check here, because if two consecutive times with the same value are added to the movie, it aborts recording
CMTime currentTime = CMTimeMakeWithSeconds([[NSDate date] timeIntervalSinceDate:startTime],120);

if(![assetWriterPixelBufferInput appendPixelBuffer:pixel_buffer withPresentationTime:currentTime])
{
    NSLog(@"Problem appending pixel buffer at time: %lld", currentTime.value);
}
else
{
    // NSLog(@"Recorded pixel buffer at time: %lld", currentTime.value);
}
CVPixelBufferUnlockBaseAddress(pixel_buffer, 0);

CVPixelBufferRelease(pixel_buffer);

One thing I noticed is that if I tried to append two pixel buffers with the same integer time value (in the basis provided), the entire recording would fail and the input would never take another pixel buffer. Similarly, if I tried to append a pixel buffer after retrieval from the pool failed, it would abort the recording. Thus, the early bailout in the code above.
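One way to guard against the duplicate-time failure (not something from the code above, just a sketch) is to remember the last appended timestamp and drop any frame that doesn't advance it, right before the appendPixelBuffer: call:

// Hypothetical guard: previousFrameTime is a CMTime instance variable
// initialized to kCMTimeNegativeInfinity before recording starts
if (CMTimeCompare(currentTime, previousFrameTime) <= 0)
{
    // Drop the frame rather than abort the whole recording
    // (remember to unlock and release pixel_buffer before bailing out here)
    return;
}
previousFrameTime = currentTime;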

In addition to the above code, I use a color-swizzling shader to convert the RGBA rendering in my OpenGL ES scene to BGRA for fast encoding by the AVAssetWriter. With this, I'm able to record 640x480 video at 30 FPS on an iPhone 4.
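The color-swizzling fragment shader itself is nothing exotic; a minimal version of the idea (the varying and uniform names here are placeholders, and the actual GPUImage shader may differ) simply reorders the channels when sampling:

varying highp vec2 textureCoordinate;
uniform sampler2D inputTexture;

void main()
{
    // Swap red and blue so the rendered output lines up with the movie input's BGRA layout
    gl_FragColor = texture2D(inputTexture, textureCoordinate).bgra;
}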

Again, all of the code for this can be found within the GPUImage repository, under the GPUImageMovieWriter class.

OpenGL ES and external display on iOS

It works very well with one context and two drawables. One just has to be careful to destroy the renderbuffer before detaching from the old CAEAGLLayer, and to allocate a new one against the new CAEAGLLayer; most of the code to do that is provided in the EAGLView class of the OpenGL ES app template in Xcode. And of course, you need to reconfigure whatever draws your OpenGL ES content for the size of the new layer.
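A rough sketch of the renderbuffer swap (assuming names like context, colorRenderbuffer, defaultFramebuffer, and newEAGLLayer, which will differ in your own setup):

// Tear down the renderbuffer that was attached to the old CAEAGLLayer
glDeleteRenderbuffers(1, &colorRenderbuffer);

// Create a new one backed by the new layer and attach it to the framebuffer
glGenRenderbuffers(1, &colorRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[context renderbufferStorage:GL_RENDERBUFFER fromDrawable:newEAGLLayer];
glBindFramebuffer(GL_FRAMEBUFFER, defaultFramebuffer);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, colorRenderbuffer);

// Pick up the new drawable size so the viewport and projection can be reconfigured
GLint backingWidth = 0, backingHeight = 0;
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_WIDTH, &backingWidth);
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_HEIGHT, &backingHeight);
glViewport(0, 0, backingWidth, backingHeight);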

Camera direct to OpenGL texture on iOS

The output you get from the camera on iOS is a CMSampleBufferRef with a CVPixelBufferRef inside (see the documentation here). Starting with iOS 5, the CoreVideo framework provides CVOpenGLESTextureCache, which lets you create an OpenGL ES texture directly from a CVPixelBufferRef, avoiding any copies.

Check the RosyWriter sample in Apple's developer website, it's all there.
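A condensed sketch of the per-frame path inside the AVCaptureVideoDataOutput delegate callback (videoTextureCache here is a placeholder for a cache created once with CVOpenGLESTextureCacheCreate(); RosyWriter has the complete version, including cleanup):

CVPixelBufferRef cameraFrame = CMSampleBufferGetImageBuffer(sampleBuffer);
int bufferWidth = (int)CVPixelBufferGetWidth(cameraFrame);
int bufferHeight = (int)CVPixelBufferGetHeight(cameraFrame);

CVOpenGLESTextureRef texture = NULL;
CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                            videoTextureCache,
                                                            cameraFrame,
                                                            NULL,
                                                            GL_TEXTURE_2D,
                                                            GL_RGBA,
                                                            bufferWidth,
                                                            bufferHeight,
                                                            GL_BGRA,
                                                            GL_UNSIGNED_BYTE,
                                                            0,
                                                            &texture);
if (!err)
{
    glBindTexture(CVOpenGLESTextureGetTarget(texture), CVOpenGLESTextureGetName(texture));
    // ...draw using the texture, then release it and flush the cache when done with the frame
    CFRelease(texture);
    CVOpenGLESTextureCacheFlush(videoTextureCache, 0);
}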

Choose OpenGL ES 1.1 or OpenGL ES 2.0?

Whether to use OpenGL ES 1.1 or 2.0 depends on what you want to do in your application and on how many devices you need it to be compatible with. All iOS devices support OpenGL ES 1.1, whereas only the iPhone 3GS and newer devices (iPhone 3GS, iPhone 4, the iPads, and the 3rd- and 4th-generation iPod touch) support OpenGL ES 2.0. However, every iOS device Apple currently ships is compatible with OpenGL ES 2.0, and the percentage of devices that don't support it is dropping every day. All iPads have supported OpenGL ES 2.0 from launch, so you're guaranteed to have support if you target that form factor.
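If you do decide to support both, the usual pattern is to attempt an OpenGL ES 2.0 context at launch and fall back to 1.1 if that fails; a minimal sketch:

EAGLContext *context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
if (context == nil)
{
    // This device doesn't support ES 2.0, so fall back to the fixed-function pipeline
    context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES1];
}
[EAGLContext setCurrentContext:context];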

OpenGL ES 2.0 and 1.1 use different and largely incompatible rendering pipelines. OpenGL ES 1.1 uses a fixed-function pipeline, where you feed in geometry and texture data, set up lighting and other state, and let OpenGL handle the rest for you. OpenGL ES 2.0 is based around a programmable pipeline, where you supply vertex and fragment shaders to handle the specifics of how your content is rendered to the screen.
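To make the difference concrete, about the smallest useful ES 2.0 program is a passthrough vertex shader plus a fragment shader that outputs a flat color (a sketch; the identifier names are arbitrary):

// Vertex shader: transform each incoming vertex by a model-view-projection matrix
attribute vec4 position;
uniform mat4 modelViewProjectionMatrix;

void main()
{
    gl_Position = modelViewProjectionMatrix * position;
}

// Fragment shader: output a constant color for every fragment
precision mediump float;

void main()
{
    gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0);
}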

Because you have to write your own code to replace even the most basic built-in functions, using OpenGL ES 2.0 for simple 3-D applications may require more effort than OpenGL ES 1.1. Also, most sample applications and writeups that you find out there will be geared toward 1.1, so it can be more difficult to get started with the 2.0 API.

However, the fact that you can write your own routines for dealing with your geometry and textures and how they are displayed to the screen means that OpenGL ES 2.0 lets you do things that simply would not be possible (or would require a tremendous amount of effort) to do in OpenGL ES 1.1. These include effects like cartoon shading and ambient occlusion lighting, as well as letting you do something interesting like offloading massively parallel work to the GPU.

If you care to see some examples of what you can do with OpenGL ES 2.0, video for the class I taught on the subject is available on iTunes U, and I created two sample applications here and here.

When it comes to cross-platform compatibility, shaders have been available on desktop OpenGL for a little while now, so anything you build using either OpenGL ES 1.1 or 2.0 should be fairly portable to the desktop.

If you can do the rendering that you want in OpenGL ES 1.1, you could go that way to provide the maximum device compatibility. However, a significant majority of iOS devices in use today support OpenGL ES 2.0 (I can't find the statistics on this right now, but it was based on units shipped), and that number will only grow over time. OpenGL ES 2.0 lets you pull off some stunning effects (see Epic Citadel) that could help you set your application apart from the others out there.

Displaying/Processing iPhone Camera Frames with OpenGL ES 2.0

This sample application of mine has three shaders that perform various levels of processing and display of camera frames to the screen. I explain how this application works in my post here, as well as in the OpenGL ES 2.0 session of my class on iTunes U.

In fact, the shaders here look to be direct copies of the ones I used for the direct display of video frames, so you're probably already using that application as a template. I assume my sample application runs just fine on your device?

If so, then there has to be some difference between the starting sample and your application. You appear to have simplified my sample by pulling some of the code I had in the -drawFrame method into the end of your delegate method, which should work fine, so that's not the problem. I'll assume that the rest of your OpenGL setup is identical to what I had in that sample, so the scene is configured properly.

Looking through my code and comparing it to what you've posted, all I can see that's different is a missing glUseProgram() call in your code. If you've properly compiled and linked the shader program in code outside of what you've shown here, you just need to call glUseProgram() somewhere before you update the uniform values.

Also, you're binding the renderbuffer, but you may need

[context presentRenderbuffer:GL_RENDERBUFFER];

after your last line there to make sure the contents get to the screen (where context is your EAGLContext instance).
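In other words, the per-frame order of operations would look roughly like the following (the uniform, matrix, and renderbuffer names are placeholders for your own):

glUseProgram(program);                                   // make the compiled and linked program current
glUniformMatrix4fv(uniformMVP, 1, GL_FALSE, mvpMatrix);  // uniform updates now apply to that program

// ...bind attributes and textures, then issue your draw calls here...

glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER];           // push the finished frame to the screen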

Faster alternative to glReadPixels in iPhone OpenGL ES 2.0

As of iOS 5.0, there is now a faster way to grab data from OpenGL ES. It isn't readily apparent, but it turns out that the texture cache support added in iOS 5.0 doesn't just work for fast upload of camera frames to OpenGL ES, but it can be used in reverse to get quick access to the raw pixels within an OpenGL ES texture.

You can take advantage of this to grab the pixels for an OpenGL ES rendering by using a framebuffer object (FBO) with an attached texture, with that texture having been supplied from the texture cache. Once you render your scene into that FBO, the BGRA pixels for that scene will be contained within your CVPixelBufferRef, so there will be no need to pull them down using glReadPixels().

This is much, much faster than using glReadPixels() in my benchmarks. I found that on my iPhone 4, glReadPixels() was the bottleneck in reading 720p video frames for encoding to disk. It limited the encoding from taking place at anything more than 8-9 FPS. Replacing this with the fast texture cache reads allows me to encode 720p video at 20 FPS now, and the bottleneck has moved from the pixel reading to the OpenGL ES processing and actual movie encoding parts of the pipeline. On an iPhone 4S, this allows you to write 1080p video at a full 30 FPS.

My implementation can be found within the GPUImageMovieWriter class within my open source GPUImage framework, but it was inspired by Dennis Muhlestein's article on the subject and Apple's ChromaKey sample application (which was only made available at WWDC 2011).

I start by configuring my AVAssetWriter, adding an input, and configuring a pixel buffer input. The following code is used to set up the pixel buffer input:

NSDictionary *sourcePixelBufferAttributesDictionary = [NSDictionary dictionaryWithObjectsAndKeys: [NSNumber numberWithInt:kCVPixelFormatType_32BGRA], kCVPixelBufferPixelFormatTypeKey,
[NSNumber numberWithInt:videoSize.width], kCVPixelBufferWidthKey,
[NSNumber numberWithInt:videoSize.height], kCVPixelBufferHeightKey,
nil];

assetWriterPixelBufferInput = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterVideoInput sourcePixelBufferAttributes:sourcePixelBufferAttributesDictionary];

Once I have that, I configure the FBO that I'll be rendering my video frames to, using the following code:

if ([GPUImageOpenGLESContext supportsFastTextureUpload])
{
    CVReturn err = CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, NULL, (__bridge void *)[[GPUImageOpenGLESContext sharedImageProcessingOpenGLESContext] context], NULL, &coreVideoTextureCache);
    if (err)
    {
        NSAssert(NO, @"Error at CVOpenGLESTextureCacheCreate %d", err);
    }

    CVPixelBufferPoolCreatePixelBuffer (NULL, [assetWriterPixelBufferInput pixelBufferPool], &renderTarget);

    CVOpenGLESTextureRef renderTexture;
    CVOpenGLESTextureCacheCreateTextureFromImage (kCFAllocatorDefault, coreVideoTextureCache, renderTarget,
                                                  NULL, // texture attributes
                                                  GL_TEXTURE_2D,
                                                  GL_RGBA, // opengl format
                                                  (int)videoSize.width,
                                                  (int)videoSize.height,
                                                  GL_BGRA, // native iOS format
                                                  GL_UNSIGNED_BYTE,
                                                  0,
                                                  &renderTexture);

    glBindTexture(CVOpenGLESTextureGetTarget(renderTexture), CVOpenGLESTextureGetName(renderTexture));
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, CVOpenGLESTextureGetName(renderTexture), 0);
}

This pulls a pixel buffer from the pool associated with my asset writer input, creates and associates a texture with it, and uses that texture as a target for my FBO.
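The snippet above assumes a framebuffer object has already been generated and bound as the render target; if not, that part is just the standard two calls (movieFramebuffer being your own GLuint):

glGenFramebuffers(1, &movieFramebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, movieFramebuffer);
// ...then attach the texture from the cache via glFramebufferTexture2D(), as shown above,
// and check glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE before rendering.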

Once I've rendered a frame, I lock the base address of the pixel buffer:

CVPixelBufferLockBaseAddress(pixel_buffer, 0);

and then simply feed it into my asset writer to be encoded:

CMTime currentTime = CMTimeMakeWithSeconds([[NSDate date] timeIntervalSinceDate:startTime],120);

if(![assetWriterPixelBufferInput appendPixelBuffer:pixel_buffer withPresentationTime:currentTime])
{
    NSLog(@"Problem appending pixel buffer at time: %lld", currentTime.value);
}
else
{
    // NSLog(@"Recorded pixel buffer at time: %lld", currentTime.value);
}
CVPixelBufferUnlockBaseAddress(pixel_buffer, 0);

if (![GPUImageOpenGLESContext supportsFastTextureUpload])
{
    CVPixelBufferRelease(pixel_buffer);
}

Note that at no point here am I reading anything manually. Also, the textures are natively in BGRA format, which is what AVAssetWriters are optimized to use when encoding video, so there's no need to do any color swizzling here. The raw BGRA pixels are just fed into the encoder to make the movie.

Aside from the use of this in an AVAssetWriter, I have some code in this answer that I've used for raw pixel extraction. It also experiences a significant speedup in practice when compared to using glReadPixels(), although less than I see with the pixel buffer pool I use with AVAssetWriter.

It's a shame that none of this is documented anywhere, because it provides a huge boost to video capture performance.

iOS OpenGL ES perform pinch zoom in 2d world

This is my solution:

I have two classes, one that takes care of all the OpenGL stuff (RenderViewController) and another one that takes care of all the gesture recognizers and communication between the OpenGL part and other parts of the app (EditViewController).

This is the code regarding the gestures:

EditViewController

It captures the gestures and sends the info about them to the RenderViewController. You have to be careful because of the different coordinate systems.

- (void) generateGestureRecognizers {

    //Setup gesture recognizers
    UIRotationGestureRecognizer *twoFingersRotate = [[UIRotationGestureRecognizer alloc] initWithTarget:self action:@selector(twoFingersRotate:)];
    [self.hitView addGestureRecognizer:twoFingersRotate];

    UIPinchGestureRecognizer *twoFingersScale = [[UIPinchGestureRecognizer alloc] initWithTarget:self action:@selector(twoFingersScale:)];
    [self.hitView addGestureRecognizer:twoFingersScale];

    UIPanGestureRecognizer *oneFingerPan = [[UIPanGestureRecognizer alloc] initWithTarget:self action:@selector(oneFingerPan:)];
    [self.hitView addGestureRecognizer:oneFingerPan];

    [twoFingersRotate setDelegate:self];
    [twoFingersScale setDelegate:self];
    [oneFingerPan setDelegate:self];
}

- (void) oneFingerPan:(UIPanGestureRecognizer *) recognizer {

    //Handle pan gesture
    CGPoint translation = [recognizer translationInView:self.hitView];
    CGPoint location = [recognizer locationInView:self.hitView];

    //Send info to renderViewController
    [self.renderViewController translate:translation];

    //Reset recognizer so change doesn't accumulate
    [recognizer setTranslation:CGPointZero inView:self.hitView];
}

- (void) twoFingersRotate:(UIRotationGestureRecognizer *) recognizer {

    //Handle rotation gesture
    CGPoint locationInView = [recognizer locationInView:self.hitView];
    locationInView = CGPointMake(locationInView.x - self.hitView.bounds.size.width/2, locationInView.y - self.hitView.bounds.size.height/2);

    if ([recognizer state] == UIGestureRecognizerStateBegan || [recognizer state] == UIGestureRecognizerStateChanged) {

        //Send info to renderViewController
        [self.renderViewController rotate:locationInView degrees:recognizer.rotation];

        //Reset recognizer
        [recognizer setRotation:0.0];
    }
}

- (void) twoFingersScale:(UIPinchGestureRecognizer *) recognizer {

    //Handle scale gesture
    CGPoint locationInView = [recognizer locationInView:self.hitView];
    locationInView = CGPointMake(locationInView.x - self.hitView.bounds.size.width/2, locationInView.y - self.hitView.bounds.size.height/2);

    if ([recognizer state] == UIGestureRecognizerStateBegan || [recognizer state] == UIGestureRecognizerStateChanged) {

        //Send info to renderViewController
        [self.renderViewController scale:locationInView amount:recognizer.scale];

        //Reset recognizer
        [recognizer setScale:1.0];
    }
}

//This allows gesture recognizers to happen simultaneously
- (BOOL)gestureRecognizer:(UIGestureRecognizer *)gestureRecognizer shouldRecognizeSimultaneouslyWithGestureRecognizer:(UIGestureRecognizer *)otherGestureRecognizer {
    if (gestureRecognizer.view != otherGestureRecognizer.view)
        return NO;

    if ([gestureRecognizer isKindOfClass:[UILongPressGestureRecognizer class]] || [otherGestureRecognizer isKindOfClass:[UILongPressGestureRecognizer class]])
        return NO;

    return YES;
}

RenderViewController

For every frame, the modelViewMatrix is recalculated from three temporary matrices (translation, scale, and rotation).

- (void) setup {

    //Creates the modelViewMatrix from the initial position, rotation and scale
    translatemt = GLKMatrix4Translate(GLKMatrix4Identity, initialPosition.x, initialPosition.y, 0.0);
    scalemt = GLKMatrix4Scale(GLKMatrix4Identity, initialScale, initialScale, 1.0);
    rotatemt = GLKMatrix4Rotate(GLKMatrix4Identity, initialRotation, 0.0, 0.0, 1.0);
    self.modelViewMatrix = GLKMatrix4Multiply(GLKMatrix4Multiply(GLKMatrix4Multiply(translatemt, rotatemt), scalemt), GLKMatrix4Identity);

    //Set these back to identities to take further modifications (they'll update the modelViewMatrix)
    scalemt = GLKMatrix4Identity;
    rotatemt = GLKMatrix4Identity;
    translatemt = GLKMatrix4Identity;

    //Rest of the OpenGL setup
    [self setupOpenGL];
}

//public interface
- (void) translate:(CGPoint) location {
    //Update the translation temporary matrix
    translatemt = GLKMatrix4Translate(translatemt, location.x, -location.y, 0.0);
}

//public interface
- (void) rotate:(CGPoint) location degrees:(CGFloat) degrees {
    //Update the rotation temporary matrix
    rotatemt = GLKMatrix4Translate(GLKMatrix4Identity, location.x, -location.y, 0.0);
    rotatemt = GLKMatrix4Rotate(rotatemt, -degrees, 0.0, 0.0, 1.0);
    rotatemt = GLKMatrix4Translate(rotatemt, -location.x, location.y, 0.0);
}

//public interface
- (void) scale:(CGPoint) location amount:(CGFloat) amount {
    //Update the scale temporary matrix
    scalemt = GLKMatrix4Translate(GLKMatrix4Identity, location.x, -location.y, 0.0);
    scalemt = GLKMatrix4Scale(scalemt, amount, amount, 1.0);
    scalemt = GLKMatrix4Translate(scalemt, -location.x, location.y, 0.0);
}

- (void)update {

    //This is done before every render update. It generates the modelViewMatrix from the temporary matrices
    self.modelViewMatrix = GLKMatrix4Multiply(GLKMatrix4Multiply(GLKMatrix4Multiply(rotatemt, translatemt), scalemt), self.modelViewMatrix);

    //And then set them back to identities
    translatemt = GLKMatrix4Identity;
    rotatemt = GLKMatrix4Identity;
    scalemt = GLKMatrix4Identity;

    //Set the modelViewMatrix for the effect (this assumes OpenGL ES 2.0 with GLKit, but it would be similar for previous versions)
    self.effect.transform.modelviewMatrix = self.modelViewMatrix;
}
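For completeness, and assuming RenderViewController is a GLKViewController driving a GLKView (which the -update method above suggests; vertexCount here is a placeholder), the draw side then just uses the effect as usual:

- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect {
    glClearColor(0.0, 0.0, 0.0, 1.0);
    glClear(GL_COLOR_BUFFER_BIT);

    //The effect already carries the modelview matrix composed in -update
    [self.effect prepareToDraw];
    glDrawArrays(GL_TRIANGLE_STRIP, 0, vertexCount);
}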

Displaying images in open gl es 2.0 on iPhone

I show how to display an image as a texture in OpenGL ES 2.0 on the iPhone in this example, where the image is a frame of video from the camera, and this example, where I pull in a PVR-compressed texture as an image. The former example is described in the writeup here.

I describe how the vertex and fragment shaders work, along with all the supporting code for them, in the video for the OpenGL ES 2.0 class which is part of my freely available iOS development course on iTunes U. Additionally, I highly recommend you read Jeff LaMarche's series of posted chapters from his unpublished OpenGL ES 2.0 book.
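If your image starts life as a UIImage, one common (if not the fastest) way to get it into a texture is to redraw it into a byte buffer via Core Graphics and hand that to glTexImage2D(). A sketch, assuming a UIImage named image:

CGImageRef cgImage = image.CGImage;
size_t width = CGImageGetWidth(cgImage);
size_t height = CGImageGetHeight(cgImage);

// Redraw the image into a raw RGBA buffer
GLubyte *imageData = (GLubyte *)calloc(width * height * 4, sizeof(GLubyte));
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef imageContext = CGBitmapContextCreate(imageData, width, height, 8, width * 4,
                                                  colorSpace, kCGImageAlphaPremultipliedLast);
CGColorSpaceRelease(colorSpace);
CGContextDrawImage(imageContext, CGRectMake(0.0, 0.0, width, height), cgImage);
CGContextRelease(imageContext);

// Upload it as a texture; clamp-to-edge and no mipmaps so non-power-of-two sizes work in ES 2.0
GLuint myTexture;
glGenTextures(1, &myTexture);
glBindTexture(GL_TEXTURE_2D, myTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, (GLsizei)width, (GLsizei)height, 0, GL_RGBA, GL_UNSIGNED_BYTE, imageData);

free(imageData);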

In short, once you have the texture loaded in, you'll need to attach it as a uniform to your shader program using code like the following:

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, myTexture);
glUniform1i(myTextureUniform, 0);

Within your fragment shader, you'll need to define the texture uniform:

uniform sampler2D myTexture;

and then sample the color from the texture at the appropriate point:

gl_FragColor = texture2D(myTexture, textureCoordinate);
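Putting those two pieces together, a complete passthrough fragment shader for textured drawing would look something like this (textureCoordinate being whatever varying your vertex shader passes through):

precision mediump float;

varying vec2 textureCoordinate;
uniform sampler2D myTexture;

void main()
{
    gl_FragColor = texture2D(myTexture, textureCoordinate);
}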

Again, I go into this in more detail within my class, and you can use these examples as starting points to work from.


