Image scaling (KeepAspectRatioByExpanding) through OpenGL

You can simply copy the "keep aspect ratio" branch (provided that it is working) and just flip the ratio comparison sign, i.e.:

if (ratiox > ratioy)

becomes

if (ratiox <= ratioy)

But I'm not sure it actually works (ratio calculations have always bugged me, and yours is tricky), and I don't have Qt at the moment so I can't try. But that should do it. Note that the image will be centered (not left-aligned as in your image), but that can be fixed pretty easily.
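For a quick illustration of what flipping the sign does, here is a small standalone sketch (the helper name `ScaledSize` is made up for this example): the "expanding" mode always produces a quad at least as large as the screen, while "keep aspect ratio" always produces one that fits inside it.

```cpp
#include <cassert>
#include <utility>

// Displayed quad size in pixels. expand == true behaves like
// KeepAspectRatioByExpanding (covers the screen), expand == false
// like KeepAspectRatio (fits inside the screen). The only difference
// between the two modes is the direction of the ratio comparison.
std::pair<float, float> ScaledSize(float img_w, float img_h,
    float scr_w, float scr_h, bool expand)
{
    float ratioImg = img_w / img_h;
    float ratioScreen = scr_w / scr_h;
    bool fit_width = expand? (ratioImg < ratioScreen) : (ratioImg > ratioScreen);
    if(fit_width)
        return std::make_pair(scr_w, scr_w / ratioImg); // width matches the screen
    return std::make_pair(scr_h * ratioImg, scr_h); // height matches the screen
}
```

E.g. a 1280x720 frame on a square 1000x1000 viewport keeps its height at 1000 and overflows horizontally in expanding mode, but shrinks to 1000x562.5 in keep mode.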

EDIT:

Here is source code that works in a GLUT application (no Qt, sorry):

static void DrawObject(void)
{
    int img_width = 1280;//_frame->width();
    int img_height = 720;//_frame->height();
    GLfloat offset_x = -1;
    GLfloat offset_y = -1;

    int p_viewport[4];
    glGetIntegerv(GL_VIEWPORT, p_viewport); // don't have Qt :'(

    GLfloat gl_width = p_viewport[2];//width(); // GL context size
    GLfloat gl_height = p_viewport[3];//height();

    int n_mode = 0;
    switch(n_mode) {
    case 0: // KeepAspectRatioByExpanding
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            if(ratioImg < ratioScreen) {
                gl_width = 2;
                gl_height = 2 * ratioScreen / ratioImg;
            } else {
                gl_height = 2;
                gl_width = 2 / ratioScreen * ratioImg;
            }
            // calculate image size
        }
        break;

    case 1: // IgnoreAspectRatio
        gl_width = 2;
        gl_height = 2;
        // OpenGL normalized coordinates are -1 to +1, hence width (or height) = +1 - (-1) = 2
        break;

    case 2: // KeepAspectRatio
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            if(ratioImg > ratioScreen) {
                gl_width = 2;
                gl_height = 2 * ratioScreen / ratioImg;
            } else {
                gl_height = 2;
                gl_width = 2 / ratioScreen * ratioImg;
            }
            // calculate image size

            offset_x = -1 + (2 - gl_width) * .5f;
            offset_y = -1 + (2 - gl_height) * .5f;
            // center on screen
        }
        break;
    }

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    // just a simple ortho view, no fancy transform ...

    glBegin(GL_QUADS);
    glTexCoord2f(0, 0);
    glVertex2f(offset_x, offset_y);

    glTexCoord2f(img_width, 0);
    glVertex2f(offset_x + gl_width, offset_y);

    glTexCoord2f(img_width, img_height);
    glVertex2f(offset_x + gl_width, offset_y + gl_height);

    glTexCoord2f(0, img_height);
    glVertex2f(offset_x, offset_y + gl_height);
    glEnd();
    // draw a single quad
}

This works by comparing the screen aspect ratio to the image aspect ratio. You, on the other hand, are comparing the ratio of image width to screen width with the ratio of image height to screen height. That is suspicious at the very least, if not outright wrong.

Also, normalized OpenGL coordinates (assuming a simple orthographic view) range from (-1, -1) at the lower-left corner to (1, 1) at the upper-right. That means the normalized width and height are both 2, and the offset is (-1, -1). The rest of the code should be self-explanatory. In case the texture comes out flipped (I tested with a fairly generic texture and am not sure it was upright), just swap the texture coordinates in the respective direction (exchange the 0s with the width (or height) coordinate and vice versa).
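That mapping can be written down explicitly; the sketch below (a hypothetical helper, assuming the plain orthographic view described above) shows that the lower-left pixel lands at (-1, -1) and the opposite corner at (1, 1):

```cpp
#include <cassert>
#include <utility>

// Map a pixel position to normalized device coordinates, assuming an
// identity projection over the whole viewport: each axis spans 2 units.
std::pair<float, float> PixelToNDC(float x, float y, float vp_width, float vp_height)
{
    return std::make_pair(-1 + 2 * x / vp_width, // x = 0 maps to -1, x = vp_width to +1
                          -1 + 2 * y / vp_height);
}
```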

EDIT2:

Using pixel coordinates (instead of normalized OpenGL coordinates) is even simpler. You can use:

static void DrawObject(void)
{
    int img_width = 1280;//_frame->width();
    int img_height = 720;//_frame->height();
    GLfloat offset_x = 0;
    GLfloat offset_y = 0;

    int p_viewport[4];
    glGetIntegerv(GL_VIEWPORT, p_viewport);

    GLfloat gl_width = p_viewport[2];//width(); // GL context size
    GLfloat gl_height = p_viewport[3];//height();

    int n_mode = 0;
    switch(n_mode) {
    case 0: // KeepAspectRatioByExpanding
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            if(ratioImg < ratioScreen)
                gl_height = gl_width / ratioImg;
            else
                gl_width = gl_height * ratioImg;
            // calculate image size
        }
        break;

    case 1: // IgnoreAspectRatio
        break;

    case 2: // KeepAspectRatio
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            GLfloat orig_width = gl_width;
            GLfloat orig_height = gl_height;
            // remember those to be able to center the quad on screen

            if(ratioImg > ratioScreen)
                gl_height = gl_width / ratioImg;
            else
                gl_width = gl_height * ratioImg;
            // calculate image size

            offset_x = 0 + (orig_width - gl_width) * .5f;
            offset_y = 0 + (orig_height - gl_height) * .5f;
            // center on screen
        }
        break;
    }

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(-1, -1, 0);
    glScalef(2.0f / p_viewport[2], 2.0f / p_viewport[3], 1.0);
    // just a simple ortho view for vertex coordinate to pixel matching

    glBegin(GL_QUADS);
    glTexCoord2f(0, 0);
    glVertex2f(offset_x, offset_y);

    glTexCoord2f(img_width, 0);
    glVertex2f(offset_x + gl_width, offset_y);

    glTexCoord2f(img_width, img_height);
    glVertex2f(offset_x + gl_width, offset_y + gl_height);

    glTexCoord2f(0, img_height);
    glVertex2f(offset_x, offset_y + gl_height);
    glEnd();
    // draw a single quad
}

Note that both versions of the code use rectangle (NPOT) textures, addressed in pixels. To adapt the code to your object, one would do something like this:

void GLWidget::paintEvent(QPaintEvent *event)
{
    QPainter painter(this);
    painter.setRenderHint(QPainter::Antialiasing);

    qDebug() << "> GLWidget::paintEvent OpenGL:" << ((painter.paintEngine()->type() != QPaintEngine::OpenGL &&
        painter.paintEngine()->type() != QPaintEngine::OpenGL2)? "disabled" : "enabled");

    QGLContext *context = const_cast<QGLContext*>(QGLContext::currentContext());
    if(!context) {
        qDebug() << "> GLWidget::paintEvent !!! Unable to retrieve OGL context";
        return;
    }
    context->makeCurrent();

    painter.fillRect(QRectF(QPoint(0, 0), QSize(1280, 768)), Qt::black);

    painter.beginNativePainting();

    /* Initialize GL extensions */
    GLenum err = glewInit();
    if(err != GLEW_OK) {
        qDebug() << "> GLWidget::paintEvent !!! glewInit failed with: " << err;
        return;
    }
    if(!GLEW_VERSION_2_1) { // check that the machine supports the 2.1 API
        qDebug() << "> GLWidget::paintEvent !!! System doesn't support GLEW_VERSION_2_1";
        return;
    }

    /* Set up the texture and transfer the data to the GPU */

    static GLuint texture = 0;
    if(texture == 0)
        glGenTextures(1, &texture);
    // generate the texture object once and reuse it on subsequent paints

    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
    glBindTexture(GL_TEXTURE_RECTANGLE_ARB, texture);

    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0,
        GL_LUMINANCE, _frame->width(), _frame->height() + _frame->height() / 2, 0,
        GL_LUMINANCE, GL_UNSIGNED_BYTE, _frame->bits());

    assert(glGetError() == GL_NO_ERROR);

    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
    glEnable(GL_TEXTURE_RECTANGLE_ARB);

    glClearColor(0.3, 0.3, 0.4, 1.0);

    /* Initialize shaders and execute them */
    _init_shaders();

    int img_width = _frame->width();
    int img_height = _frame->height();
    GLfloat offset_x = 0;
    GLfloat offset_y = 0;
    GLfloat gl_width = width(); // GL context size
    GLfloat gl_height = height();

    qDebug() << "paint(): gl_width:" << gl_width << " gl_height:" << gl_height <<
        " img:" << _frame->width() << "x" << _frame->height();

    int fill_mode = 0;
    switch(fill_mode) {
    case 0: // KeepAspectRatioByExpanding
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            if(ratioImg < ratioScreen)
                gl_height = gl_width / ratioImg;
            else
                gl_width = gl_height * ratioImg;
            // calculate image size
        }
        break;

    case 1: // IgnoreAspectRatio
        break;

    case 2: // KeepAspectRatio
        {
            float ratioImg = float(img_width) / img_height;
            float ratioScreen = gl_width / gl_height;

            GLfloat orig_width = gl_width;
            GLfloat orig_height = gl_height;
            // remember those to be able to center the quad on screen

            if(ratioImg > ratioScreen)
                gl_height = gl_width / ratioImg;
            else
                gl_width = gl_height * ratioImg;
            // calculate image size

            offset_x = 0 + (orig_width - gl_width) * .5f;
            offset_y = 0 + (orig_height - gl_height) * .5f;
            // center on screen
        }
        break;
    }

    glDisable(GL_CULL_FACE); // might cause problems if enabled
    glBegin(GL_QUADS);
    glTexCoord2f(0, 0);
    glVertex2f(offset_x, offset_y);

    glTexCoord2f(img_width, 0);
    glVertex2f(offset_x + gl_width, offset_y);

    glTexCoord2f(img_width, img_height);
    glVertex2f(offset_x + gl_width, offset_y + gl_height);

    glTexCoord2f(0, img_height);
    glVertex2f(offset_x, offset_y + gl_height);
    glEnd();
    // draw a single quad

    painter.endNativePainting();
}

I can't guarantee this last code snippet is error-free since I don't have Qt, but in case there are any typos they should be rather straightforward to fix.

High performance QImage output to display

Yes, render the frames to a QGLWidget and let the video card handle the scaling. That's how Qt MultimediaKit, Phonon and others do it.

Some time ago I shared some code that demonstrated how to accomplish this task: Image scaling (KeepAspectRatioByExpanding) through OpenGL

Problems converting YV12 to RGB through GLSL

The problem here is that the image is not laid out the way the code expects: the chrominance and luminance planes are not interleaved, but stored in separate blocks. This can be solved in two ways: either interleave the planes on the CPU before loading them into the texture and use the rest of the code as-is, or do it in the shader. I removed iostream and replaced it with stdio (I'm using a rather old compiler). Here is my code for loading the image and interleaving it:

GLubyte *memblock;
{
    FILE *p_fr = fopen("data.yv12", "rb");
    if(!p_fr) {
        fprintf(stderr, "!!! Failed to load yuv file\n");
        return;
    }
    fseek(p_fr, 0, SEEK_END);
    int yuv_file_sz = ftell(p_fr);
    fseek(p_fr, 0, SEEK_SET);
    memblock = new GLubyte[yuv_file_sz];
    if(!memblock) {
        fprintf(stderr, "!!! Failed to allocate memblock\n");
        return;
    }
    if(fread(memblock, yuv_file_sz, 1, p_fr) != 1) {
        fprintf(stderr, "!!! Failed to read yuv file\n");
        return;
    }
    fclose(p_fr);
}
// load .raw file

ImgWidth = 1280;
ImgHeight = 720;
ImageYUV = new GLushort[ImgWidth * ImgHeight];
// allocate an image

int chromaWidth = ImgWidth / 2;
int chromaHeight = ImgHeight / 2; // 2x2 chroma subsampling
const GLubyte *pCb = memblock + ImgWidth * ImgHeight; // Cb block after Y
const GLubyte *pCr = pCb + chromaWidth * chromaHeight; // Cr block after Cb
// get pointers to the smaller Cb and Cr blocks (the image is *not* interleaved)

for(int i = 0; i < ImgWidth * ImgHeight; ++ i) {
    int x = i % ImgWidth;
    int y = i / ImgWidth;
    GLubyte cb = pCb[(x / 2) + (y / 2) * chromaWidth];
    GLubyte cr = pCr[(x / 2) + (y / 2) * chromaWidth];
    ImageYUV[i] = (memblock[i] << 8) | ((x & 1)? cr : cb);
}
// convert (interleave) the planar data to packed Y/chroma pairs

This is pretty straightforward, and can be used with the shader above.
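The packing can also be checked in isolation; the sketch below (a hypothetical `InterleaveYUV` helper, exercised on a tiny 4x2 frame) uses the same addressing as the loop above, so even columns pick up Cb and odd columns Cr:

```cpp
#include <cassert>

// Interleave a planar buffer (Y plane, then Cb, then Cr, with 2x2 chroma
// subsampling) into 16-bit values: Y in the high byte, chroma in the low byte.
void InterleaveYUV(const unsigned char *planar, unsigned short *packed, int w, int h)
{
    int cw = w / 2, ch = h / 2;
    const unsigned char *pCb = planar + w * h; // Cb plane follows Y
    const unsigned char *pCr = pCb + cw * ch; // Cr plane follows Cb
    for(int i = 0; i < w * h; ++ i) {
        int x = i % w, y = i / w;
        unsigned char c = ((x & 1)? pCr : pCb)[(x / 2) + (y / 2) * cw];
        packed[i] = (unsigned short)((planar[i] << 8) | c);
    }
}
```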

Now, what if we wanted to skip the interleaving? First, I'm going to figure out how the addressing works (we are going to treat the image as a slightly taller monochrome image, with the chrominance planes taking up the space above the luminance plane):

for(int y = 0; y < ImgHeight; ++ y) {
    for(int x = 0; x < ImgWidth; ++ x) {
        int CbY = ImgHeight + (y / 4);
        int CrY = ImgHeight + chromaHeight / 2 + (y / 4);
        int CbCrX = (x / 2) + chromaWidth * ((y / 2) & 1);
        // calculate x, y of the Cr and Cb pixels in the grayscale image
        // where the Y, Cb and Cr blocks are laid out next to each other

        assert(&memblock[CbCrX + CbY * ImgWidth] == &pCb[(x / 2) + (y / 2) * chromaWidth]);
        assert(&memblock[CbCrX + CrY * ImgWidth] == &pCr[(x / 2) + (y / 2) * chromaWidth]);
        // make sure the addresses are correct (and they are)

        GLubyte cb = memblock[CbCrX + CbY * ImgWidth];
        GLubyte cr = memblock[CbCrX + CrY * ImgWidth];
        GLubyte Y = memblock[x + y * ImgWidth];

        ImageYUV[x + y * ImgWidth] = (Y << 8) | ((x & 1)? cr : cb);
    }
}
// convert (interleave) the data a slightly different way, using the physical layout in memory

That has pretty much the same effect. Now we can take the code that calculates the positions and put it in the shader.

static const char *p_s_fragment_shader =
"#extension GL_ARB_texture_rectangle : enable\n"
"uniform sampler2DRect tex;"
"uniform float ImgHeight, chromaHeight_Half, chromaWidth;"
"void main()"
"{"
" vec2 t = gl_TexCoord[0].xy;" // get texcoord from fixed-function pipeline
" float CbY = ImgHeight + floor(t.y / 4.0);"
" float CrY = ImgHeight + chromaHeight_Half + floor(t.y / 4.0);"
" float CbCrX = floor(t.x / 2.0) + chromaWidth * floor(mod(t.y, 2.0));"
" float Cb = texture2DRect(tex, vec2(CbCrX, CbY)).x - .5;"
" float Cr = texture2DRect(tex, vec2(CbCrX, CrY)).x - .5;"
" float y = texture2DRect(tex, t).x;" // redundant texture read optimized away by texture cache
" float r = y + 1.28033 * Cr;"
" float g = y - .21482 * Cb - .38059 * Cr;"
" float b = y + 2.12798 * Cb;"
" gl_FragColor = vec4(r, g, b, 1.0);"
"}";

By using this shader, we can upload the raw data directly to the texture, except the texture is a little taller and only GL_LUMINANCE:

glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0,
    GL_LUMINANCE, ImgWidth, ImgHeight + ImgHeight / 2, 0, // !!
    GL_LUMINANCE, GL_UNSIGNED_BYTE, memblock); // !!
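As a sanity check for the conversion constants in the shader, the same math can be run on the CPU (the function name is made up for this example); with zero chroma offsets the result has to collapse to a pure gray equal to Y:

```cpp
#include <cassert>

// Mirror of the shader's YCbCr -> RGB arithmetic; Cb and Cr are
// already shifted to be centered on zero, as in the shader.
void YCbCrToRGB(float y, float Cb, float Cr, float &r, float &g, float &b)
{
    r = y + 1.28033f * Cr;
    g = y - .21482f * Cb - .38059f * Cr;
    b = y + 2.12798f * Cb;
}
```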

I will leave it at that. Here are complete source codes:

interleaving in shader (faster, preferable)

manual interleaving in "C"

Sorry for the quick end, I'll be in trouble if I don't leave my workplace ASAP :).

one finger image scaling / Cropping

I'm not quite sure what you mean by one-finger cropping, but I made this library for cropping on iOS. Maybe it can help: https://github.com/nicholjs/BFCropInterface

Bad rendering with GL_TEXTURE_MIN_FILTER GL_LINEAR

Looking at the source code, I see the following in the shader:

float CbY = ImgHeight + floor(t.y / 4.0);
float CrY = ImgHeight + chromaHeight_Half + floor(t.y / 4.0);

I have no idea why you add ImgHeight to the texture coordinates; all it would do is wrap around if that were the texture height. You are also packing the different color components into a single texture, so you must take extra care to calculate the offsets correctly. That one-pixel-high, off-colored line is an indication that your texture coordinates are wrong. GL_NEAREST coerces the coordinates to integers, but with GL_LINEAR they must match the texel centers exactly. I suggest replacing floor with round.
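The texel-center issue can be demonstrated without OpenGL. For a rectangle texture, texel i covers the interval [i, i+1) with its center at i + 0.5, so a GL_LINEAR lookup at an integer coordinate sits exactly on the boundary between two texels and blends them 50/50. A 1-D sketch of the filtering rule (a hypothetical helper, clamp-to-edge assumed):

```cpp
#include <cassert>
#include <cmath>

// 1-D GL_LINEAR lookup on an unnormalized (rectangle-texture style)
// coordinate, with GL_CLAMP_TO_EDGE behavior at the borders.
float SampleLinear(const float *texels, int size, float coord)
{
    float pos = coord - .5f; // texel centers sit at i + 0.5
    int i0 = (int)floorf(pos), i1 = i0 + 1;
    float frac = pos - i0;
    if(i0 < 0) i0 = 0;
    if(i0 >= size) i0 = size - 1;
    if(i1 < 0) i1 = 0;
    if(i1 >= size) i1 = size - 1; // clamp to edge
    return texels[i0] * (1 - frac) + texels[i1] * frac;
}
```

Sampling at 0.5 (the center of texel 0) returns texel 0 exactly, while sampling at 1.0 (the edge) returns the average of texels 0 and 1, which is exactly the kind of chroma bleeding seen in the rendering.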

Decode video frames on iPhone GPU

If you are willing to use an iOS 5-only solution, take a look at the sample app ChromaKey from the 2011 WWDC session on AVCaptureSession.

That demo captures 30 FPS of video from the built-in camera and passes each frame to OpenGL as a texture. It then uses OpenGL to manipulate the frame, and optionally writes the result out to an output video file.

The code uses some serious low-level magic to bind a Core Video Pixel buffer from an AVCaptureSession to OpenGL so they share memory in the graphics hardware.

It should be fairly straightforward to change the AVCaptureSession to use a movie file as input rather than camera input.

You could probably set up the session to deliver frames in Y/UV form rather than RGB, where the Y component is luminance. Failing that, it would be a pretty simple matter to write a shader that would convert RGB values for each pixel to luminance values.
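For the RGB-to-luminance shader mentioned above, the body would be a single dot product; here is the same computation on the CPU using the Rec. 601 weights (a common choice for video, an assumption on my part):

```cpp
#include <cassert>
#include <cmath>

// Rec. 601 luma from RGB components in [0, 1]; the weights sum to 1,
// so white maps to 1 and any gray maps to itself.
float Luminance(float r, float g, float b)
{
    return .299f * r + .587f * g + .114f * b;
}
```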

You should be able to do all this on all frames, not just every 10th frame.


