[osg-users] Rendering performance issues

Todd J. Furlong todd at inv3rsion.com
Tue Jul 8 10:03:36 PDT 2008

Not sure if this is related, but...

When timing an OpenGL application, we always make sure to call 
glFinish() before recording the time at the end of a draw.  glFinish() 
blocks until the OpenGL commands in the pipeline have completed, so it 
gives a truer measurement of the time.  Without waiting, the draw 
appears faster than it really is until the FIFO buffer fills, because 
the calls return as soon as the commands are queued.


Paul Martz wrote:
>     My main frustration is demonstrated by the second and third timing
>     images I had posted, both with VSync on (so 60Hz max frame rate),
>     with a single GPU.  Both Event/Update timings are trivial.
>     1. One timing had Cull(2.22)+Draw(3.81)+GPU(3.86)=9.89ms @ 59.99Hz 
>     (good)
>     Now all I did was trackball the eyepoint to another location in the
>     scene, and got:
>     2. Cull(0.64)+Draw(0.90)+GPU(0.92)=2.46ms @ 46.46Hz frame rate.
>     This makes no sense to me.  With 4 times more rendering time (#1) we
>     can achieve max frame rate, but with a very light rendering
>     load (#2), our frame rate is substantially degraded?  How frustrating
>     is that! 
> Have you tried adding your own timing code to verify that you really are 
> no longer getting 60Hz framerate? ("Trust but verify" as Reagan once said.)
>     Regarding Paul's suggestion that queued buffer swaps are the
>     cause of the Draw/GPU gap, I'm not sure I quite understand how or
>     why that would happen, or, if that's a potential cause, why moving
>     the eyepoint in my example above to a lighter rendering load would
>     cause it to happen. 
>  I can try to explain it better: The graphics hardware has a FIFO for 
> receiving input. When the upper limit on queued swaps is reached, the hardware 
> blocks the OS from putting more stuff into the FIFO until one of the 
> swaps already in the FIFO gets processed. Only then will the FIFO accept 
> the new data.
> Illustration: Assume the hardware has a max queued swap limit of 2 
> swaps, and the application is running at an ungodly fast pace...
>   App sends frame 0 data
>     Hardware starts processing frame 0 data
>   App issues swap for frame 0
>   App sends frame 1 data
>   App issues swap for frame 1
>   App is now blocked because 2 swaps are queued
>     Hardware executes swap to display frame 0
>     Hardware starts processing frame 1 data
>   App sends frame 2 data
>   App issues swap for frame 2
>   App is now blocked because 2 swaps are queued
>     Hardware executes swap to display frame 1
>     Hardware starts processing frame 2 data
>   App sends frame 3 data
>   App issues swap for frame 3
>   App is now blocked because 2 swaps are queued
>     Hardware executes swap to display frame 2
>     Hardware starts processing frame 3 data
> Etc. Thus, there will be a gap between when OSG sends the data, and the 
> hardware begins processing it.
> I'm not sure this has anything to do with your third case, where the 
> rendering load is lighter yet the framerate appears to drop. I believe 
> these are two separate issues.
> Paul Martz
> *Skew Matrix Software LLC*
> http://www.skew-matrix.com
> +1 303 859 9466
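
Paul's illustration above can be turned into a small simulation.  This
is only a sketch of the simplified model he describes, not of any real
driver: the function name, the struct, and all parameters are
hypothetical, times are in integer microseconds, and it assumes the
hardware executes at most one swap per vsync and starts on frame n only
after the swap for frame n-1 has been executed.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One entry per frame; all times in microseconds since the start.
struct FrameTrace {
    std::int64_t sendStartUs;  // app begins sending this frame's data
    std::int64_t gpuStartUs;   // hardware begins processing that data
    std::int64_t swapUs;       // hardware executes the swap (on a vsync)
};

// Hypothetical model of a FIFO with a limit on queued swaps.
// appUs: CPU time to send one frame; gpuUs: GPU time to process it;
// vsyncUs: refresh period. Assumes maxQueuedSwaps >= 1.
std::vector<FrameTrace> simulateSwapQueue(int frames, int maxQueuedSwaps,
                                          std::int64_t appUs,
                                          std::int64_t gpuUs,
                                          std::int64_t vsyncUs)
{
    std::vector<FrameTrace> trace;
    std::int64_t appFree = 0;  // when the app finished the previous frame

    for (int n = 0; n < frames; ++n) {
        // The app is blocked while maxQueuedSwaps swaps are pending, i.e.
        // until the swap for frame (n - maxQueuedSwaps) has executed.
        std::int64_t sendStart = appFree;
        if (n >= maxQueuedSwaps)
            sendStart = std::max(sendStart, trace[n - maxQueuedSwaps].swapUs);
        const std::int64_t swapIssued = sendStart + appUs;

        // The hardware starts this frame after executing the previous swap.
        std::int64_t gpuStart = sendStart;
        if (n > 0)
            gpuStart = std::max(gpuStart, trace[n - 1].swapUs);

        // The swap executes on the first vsync after rendering is done.
        const std::int64_t ready = std::max(gpuStart + gpuUs, swapIssued);
        std::int64_t swap = ((ready + vsyncUs - 1) / vsyncUs) * vsyncUs;
        if (n > 0 && swap <= trace[n - 1].swapUs)
            swap = trace[n - 1].swapUs + vsyncUs;

        trace.push_back({sendStart, gpuStart, swap});
        appFree = swapIssued;
    }
    return trace;
}
```

With a limit of 2 swaps, a 16667 us refresh period, and 1000 us each of
app and GPU work per frame, the difference gpuStartUs - sendStartUs is
zero for frame 0 and settles at one full refresh period from frame 2
onward, which is the gap between sending the data and the hardware
starting on it.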

Todd J. Furlong
Inv3rsion, LLC
