OpenGL: glFlush() vs. glFinish()

Bear in mind that these commands have existed since the early days of OpenGL. glFlush ensures that previous OpenGL commands must complete in finite time (OpenGL 2.1 spec, page 245). If you draw directly to the front buffer, this ensures that the OpenGL driver starts drawing without much delay. Think of a complex scene that appears on screen object by object when you call glFlush after each object. However, with double buffering, glFlush has practically no effect at all, since the changes won't be visible until you swap the buffers.

glFinish, by contrast, "does not return until all effects from previously issued commands [...] are fully realized". This means that your program waits at this call until every last pixel is drawn and OpenGL has nothing more to do. If you render directly to the front buffer, glFinish is the call to make before using operating-system calls to take screenshots. It is far less useful with double buffering, because you don't see the changes you forced to complete.
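As a sketch of that screenshot case (assuming a current, single-buffered context; the function name and the width/height parameters are hypothetical):

```c
#include <stdlib.h>
#include <GL/gl.h>

/* Hypothetical helper: assumes a current single-buffered GL context
 * and known framebuffer dimensions. Caller frees the result. */
unsigned char *grab_front_buffer(int width, int height)
{
    unsigned char *pixels = malloc((size_t)width * height * 4);

    glFinish();                          /* wait until every pixel is drawn   */
    glReadBuffer(GL_FRONT);              /* read from the front buffer        */
    glPixelStorei(GL_PACK_ALIGNMENT, 1); /* tightly packed rows               */
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    return pixels;                       /* note: rows come back bottom-up    */
}
```

Without the glFinish, glReadPixels would still return correct data (it synchronizes internally), but an OS-level screen grab taken at this point might capture a half-drawn frame.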

So if you use double buffering, you will probably need neither glFlush nor glFinish. SwapBuffers triggers an implicit flush, so there's no need to call glFlush first. And don't worry about stressing the OpenGL driver: glFlush will not choke on too many commands. The call is not guaranteed to return immediately (whatever that means), so it can take whatever time it needs to process your commands.
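A typical double-buffered frame loop therefore needs neither call; the buffer swap does the flushing. A minimal sketch (using WGL on Windows; `draw_scene`, `running` and the device context `hdc` are assumed to be set up elsewhere):

```c
#include <windows.h>

/* Hypothetical per-frame loop for a double-buffered WGL context. */
while (running) {
    draw_scene();       /* issue GL commands into the back buffer      */
    SwapBuffers(hdc);   /* swaps buffers and performs an implicit flush */
    /* no glFlush()/glFinish() needed here */
}
```

On other platforms the swap call differs (glXSwapBuffers, eglSwapBuffers, aglSwapBuffers), but the implicit-flush behaviour is the same.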

glFlush() vs [[self openGLContext] flushBuffer] vs glFinish vs glSwapAPPLE vs aglSwapBuffers

Have you looked at this? It explains when to use glFlush() and glFinish(). Both are OpenGL functions that control the execution and synchronization of commands. Generally you would want these functions when doing multi-threaded rendering; otherwise there shouldn't be any need.

glSwapAPPLE() and aglSwapBuffers() are extensions provided by Apple to display the contents of the back buffer on screen (on Windows it's wglSwapBuffers()). You should use one or the other but not both, as they really do the same thing. I would stick with the AGL method, as it's analogous to WGL, EGL, etc.

[[self openGLContext] flushBuffer] is probably an Objective-C wrapper around glFlush(), from the looks of it. I can't imagine it doing anything else.

Is calling glFinish necessary when synchronizing resources between OpenGL contexts?

If you manipulate the contents of any object in thread A, those contents are not visible to some other thread B until two things have happened:

  1. The commands modifying the object have completed. glFlush does not complete commands; you must use glFinish or a sync object to ensure command completion.

    Note that completion needs to be communicated to thread B, but the synchronization command has to be issued on thread A. So if thread A uses glFinish, it must then use some CPU-side synchronization to tell thread B that the commands have finished. If you use fence sync objects instead, you create the fence on thread A, then hand it over to thread B, which can test or wait on that fence.

  2. The object must be re-bound to the context of thread B. That is, you have to bind it to that context after the commands have completed (either directly with a glBind* command or indirectly by binding a container object that has this object attached to it).

This is detailed in Chapter 5 of the OpenGL specification.
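A sketch of the fence-based variant of these two steps (assuming GL 3.2+ or ARB_sync; `sharedTexture` and the hand-over channel are hypothetical, and error handling is omitted):

```c
#include <GL/gl.h>

/* Thread A: after issuing the commands that modify the object */
GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
glFlush();  /* ensure the fence itself is submitted to the GPU */
/* ... hand `fence` over to thread B via your own thread-safe channel ... */

/* Thread B: wait for completion (step 1), then re-bind (step 2) */
GLenum status = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                 1000000000ull /* 1 s timeout, in ns */);
if (status == GL_ALREADY_SIGNALED || status == GL_CONDITION_SATISFIED) {
    glDeleteSync(fence);
    glBindTexture(GL_TEXTURE_2D, sharedTexture);  /* make new contents visible */
}
```

The glFlush after glFenceSync matters: without it, the fence may sit in thread A's unsubmitted queue and thread B's wait could time out.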

glFinish() vs glFenceSync() + glClientWaitSync()

All of the options you mention will affect the application's performance, since they stall the pipeline. The modern way of measuring timing in OpenGL is to use timer queries: you tell OpenGL to save a timestamp when the query executes on the GPU, so no synchronization between GPU and CPU is required. The code could look, for example, like this:

GLuint64 startTime, stopTime;
unsigned int queryID[2];

// generate two queries
glGenQueries(2, queryID);
...
// issue the first query
// Records the time only after all previous
// commands have been completed
glQueryCounter(queryID[0], GL_TIMESTAMP);

// call a sequence of OpenGL commands
...
// issue the second query
// records the time when the sequence of OpenGL
// commands has been fully executed
glQueryCounter(queryID[1], GL_TIMESTAMP);
...

// get query results
// (automatically blocks until results are available)
glGetQueryObjectui64v(queryID[0], GL_QUERY_RESULT, &startTime);
glGetQueryObjectui64v(queryID[1], GL_QUERY_RESULT, &stopTime);

printf("Time spent on the GPU: %f ms\n", (stopTime - startTime) / 1000000.0);

(Code taken from Lighthouse3d.com).

Another option is to use glBeginQuery with the GL_TIME_ELAPSED parameter, which is also described in the linked article.
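For comparison, a sketch of that GL_TIME_ELAPSED variant, which measures one interval with a single query object instead of two timestamps:

```c
#include <stdio.h>
#include <GL/gl.h>

GLuint query;
GLuint64 elapsed;  /* result in nanoseconds */

glGenQueries(1, &query);

glBeginQuery(GL_TIME_ELAPSED, query);
// ... sequence of OpenGL commands to measure ...
glEndQuery(GL_TIME_ELAPSED);

// blocks until the result is available, like the example above
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &elapsed);
printf("Time spent on the GPU: %f ms\n", elapsed / 1000000.0);
```

To avoid the blocking read, you can poll GL_QUERY_RESULT_AVAILABLE and only fetch the result a frame or two later.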

OpenGL buffers, glFlush and glutSwapBuffers()

There is a huge difference on modern platforms, in the sense that compositing window managers (e.g. Aero on Windows Vista+) effectively own the front buffer. If you draw single-buffered, a buffer swap never occurs, and the end result is that nothing will ever be displayed on the screen.

This also affects some implementations of hybrid GPUs (e.g. Intel integrated + NVIDIA discrete on laptops) even without a compositing window manager. On such a system, the buffer swap operation is what copies the discrete GPU's framebuffer to the integrated GPU for final output.

There is almost no reason to use single-buffered rendering on modern GPUs. It used to be that having to maintain two color buffers ate a lot of memory, which was also a compelling argument against triple-buffering, but these days the amount of memory required for the color buffer is a minute fraction of VRAM.

OpenGL ES 2.0 - why does glFinish give me a lower framerate on my new Android phone compared with an old one?

I think a speculative answer is as good as it's going to get, so — apologies for almost certainly repeating a lot of what you already know:

Commands sent to OpenGL go through three states, named relative to the GPU side of things:

  1. unsubmitted
  2. submitted but pending
  3. completed

Communicating with the GPU is usually expensive. So most OpenGL implementations accept your calls and just queue the work up inside your memory space for a while. At some point the driver decides that a communication is justified and pays the cost of transferring all the calls at once, promoting them to the submitted state. The GPU then completes each one (potentially out of order, subject to not breaking the API).

glFinish:

... does not return until the effects of all previously called GL
commands are complete. Such effects include all changes to GL state,
all changes to connection state, and all changes to the frame buffer
contents.

So for some period when that CPU thread might have been doing something else, it now definitely won't be. But if you don't call glFinish, your output will probably still appear; it's just unclear when. glFlush is often the correct way forward: it advances everything to submitted but doesn't wait for completed, so everything will definitely appear shortly, you just don't bother waiting for it.

OpenGL bindings to the OS vary a lot; in general, though, you almost certainly want to flush rather than finish if your environment permits it. If you neither flush nor finish and the OS isn't pushing things along for you, you may incur some extra latency (e.g. the commands you issue in one frame may not reach the GPU until you fill up the unsubmitted queue again during the next frame), but if you're doing GL work indefinitely then output will almost certainly still proceed.

Android sits upon EGL. Per section 3.9.3 of the EGL spec:

... eglSwapBuffers and eglCopyBuffers perform an implicit flush operation
on the context ...

I therefore believe you are not required to perform either a flush or a finish on Android if you're double buffering. A call to swap the buffers will cause a buffer swap as soon as drawing is complete, without blocking the CPU.

As to the real question: the S7 has an Adreno 530 GPU, while the S2 has a Mali T760MP6 GPU. The Malis are produced by ARM, the Adrenos by Qualcomm, so they're completely different architectures with different driver implementations. The difference that causes the blocking could therefore be almost anything, but it is permitted to exist: glFinish isn't required and is a very blunt instrument, so it's probably not one of the drivers' major optimisation targets.


