[osg-users] Using SSE within OSG
sebastian.messerschmidt at gmx.de
Tue Jul 29 08:01:14 PDT 2008
Regarding question 2:
Wouldn't it be possible to dynamically link different versions of the
library? There would be two versions of the DLLs, one with the
SSE optimizations and one with the straightforward code.
Some years ago I saw games that loaded different versions of their
DLLs depending on the machine the program was running on.
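To make the idea concrete, here is a minimal sketch of the selection step, assuming an x86 build with MSVC or GCC/Clang. The library names "osgmath_sse2" and "osgmath_plain" are made up for illustration and are not real OSG libraries, and the actual loading call (LoadLibrary or dlopen) is left out.

```cpp
#include <string>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#endif

// Returns true if the CPU we are running on reports SSE2 support.
inline bool cpuHasSSE2()
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                    // CPUID leaf 1: feature flags
    return (regs[3] & (1 << 26)) != 0;   // EDX bit 26 = SSE2
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;                        // unknown platform: use the plain build
#endif
}

// Picks which build of the (hypothetical) math library to load.
inline std::string mathLibraryName(bool hasSSE2)
{
    return hasSSE2 ? "osgmath_sse2" : "osgmath_plain";
}
```

At startup the application would call mathLibraryName(cpuHasSSE2()) once and hand the result to the platform's loader, so the rest of the code never needs to know which build it got.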
> Dear All,
> There's a discussion going on at the moment over in osg-submissions,
> and it has been raised that this ought to be opened up to the
> non-submissions community for feedback. Note that the following is my
> reading of the issues, and certainly doesn't represent the consensus
> view of the osg-submissions crowd, so feel free to challenge what I'm
> saying.
> Several people already use SSE instructions
> (http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions) alongside OSG
> to obtain speed improvements through parallelising math operations.
> The general point that has been raised is that, under the hood, OSG
> does quite a lot that could benefit from the potential performance
> boost given by SSE operations. Obvious targets include some of the
> Vec/Matrix routines, for example. SSE is now sufficiently mainstream
> that the risk of processor incompatibility is felt to be low.
> *Question 1 : Where could the core OSG include SSE?*
> Most people follow the sensible approach of profiling to determine
> their bottlenecks, and then optimising particular methods in order to
> gain speed-up. This would be a sensible approach to follow, as SSEing
> all methods would probably be a waste of effort. It would therefore
> be instructive firstly to know if anybody is using SSE with OSG, and
> where. Secondly, for those who have profiling data and know how much
> time they spend in Vec/Matrix/whatever methods, it would be useful to
> know which methods the community considered good targets for SSEing.
> Any other maths "heavy lifting" going on? (e.g. Intersection testing?
> Delaunay triangulation? etc.)
> *Question 2 : How could the core OSG include SSE?*
> SSE code benefits from aligned data. Hence there are several ways in
> which OSG could include SSE:
> a) Provide an aligned Vec4f and aligned Matrix4f class, which support
> SSE operations. This would appear (to me) to be the least intrusive.
> b) Provide branching code within the existing Vec4/Matrix4 methods for
> detecting whether data is aligned, and performing the correct
> operations. This would appear to me to be the most user-transparent.
> Although the branch would appear to be a performance hit, testing so far on
> some specific code supports the argument that the speed gains
> from SSE outweigh the branch cost; more testing is needed, I guess.
> c) Robert suggested that SSE enabled array operators (e.g. providing a
> cross-product operator for Vec3Array) might be appropriate and provide
> the best speed improvement for those who want it. Certainly using SSE
> on large array type data sets is where one gains the most performance.
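As a rough illustration of option (c), here is a sketch of a batched cross product. It assumes the data is stored as separate x, y and z float arrays (structure-of-arrays), which is not how osg::Vec3Array is laid out, so a real implementation would need a conversion or shuffle step. The function name crossArrays is made up, and the code falls back to plain scalar code when SSE is not available at compile time.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Computes c[i] = a[i] x b[i] for n vectors stored as separate component arrays.
inline void crossArrays(const float* ax, const float* ay, const float* az,
                        const float* bx, const float* by, const float* bz,
                        float* cx, float* cy, float* cz, std::size_t n)
{
    std::size_t i = 0;
#if SKETCH_HAVE_SSE
    // Four cross products per iteration; unaligned loads keep the sketch simple.
    for (; i + 4 <= n; i += 4)
    {
        __m128 x1 = _mm_loadu_ps(ax + i), y1 = _mm_loadu_ps(ay + i), z1 = _mm_loadu_ps(az + i);
        __m128 x2 = _mm_loadu_ps(bx + i), y2 = _mm_loadu_ps(by + i), z2 = _mm_loadu_ps(bz + i);
        _mm_storeu_ps(cx + i, _mm_sub_ps(_mm_mul_ps(y1, z2), _mm_mul_ps(z1, y2)));
        _mm_storeu_ps(cy + i, _mm_sub_ps(_mm_mul_ps(z1, x2), _mm_mul_ps(x1, z2)));
        _mm_storeu_ps(cz + i, _mm_sub_ps(_mm_mul_ps(x1, y2), _mm_mul_ps(y1, x2)));
    }
#endif
    // Scalar tail (and the whole loop when SSE is not compiled in).
    for (; i < n; ++i)
    {
        cx[i] = ay[i] * bz[i] - az[i] * by[i];
        cy[i] = az[i] * bx[i] - ax[i] * bz[i];
        cz[i] = ax[i] * by[i] - ay[i] * bx[i];
    }
}
```

The per-element branch and alignment questions from (a) and (b) mostly disappear here, because the decision is made once per array rather than once per vector.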
> This question includes the possibility of linking out to, or pulling
> source code out of, an external optimised math library.
> Any other suggestions?
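For comparison, options (a) and (b) could look roughly like the sketch below. AlignedVec4f, isAligned16 and add4 are hypothetical names, not existing OSG classes or functions, and alignas needs a C++11 compiler; older compilers would need a vendor-specific alignment attribute instead.

```cpp
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Option (a): a separate vector type whose storage is always 16-byte aligned.
struct alignas(16) AlignedVec4f
{
    float v[4];
};

// True if p sits on a 16-byte boundary.
inline bool isAligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Option (b): one entry point that checks alignment at run time.
// Computes out = a + b for four floats.
inline void add4(const float* a, const float* b, float* out)
{
#if SKETCH_HAVE_SSE
    if (isAligned16(a) && isAligned16(b) && isAligned16(out))
    {
        // Aligned fast path.
        _mm_store_ps(out, _mm_add_ps(_mm_load_ps(a), _mm_load_ps(b)));
        return;
    }
#endif
    // Unaligned data, or no SSE at compile time: plain scalar code.
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
}
```

With (a) the check in add4 would always succeed and could be dropped; with (b) existing Vec4 data keeps working unchanged and only pays for the three pointer tests.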
> *Question 3 : (possibly the biggest) Should the core OSG include SSE?*
> There are several downsides to including SSE. Firstly, cross-platform
> provision of SSE may be tricky due to the way different compilers
> define aligned data, and how SSE instructions are used within the
> code. I personally don't have much experience here, so any feedback on
> cross-platform issues would be useful.
> Secondly, the code readability drops, and the "use the source"
> argument may be trickier when many might not know much SSE.
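On the first point, the differences in how compilers declare aligned data can usually be hidden behind one macro. The sketch below assumes MSVC, GCC/Clang, or a C++11 compiler; SKETCH_ALIGN16 and Vec4fStorage are illustrative names, not existing OSG identifiers.

```cpp
#include <cstdint>

// Each compiler family spells "align this type to 16 bytes" differently.
#if defined(_MSC_VER)
#  define SKETCH_ALIGN16 __declspec(align(16))
#elif defined(__GNUC__)
#  define SKETCH_ALIGN16 __attribute__((aligned(16)))
#else
#  define SKETCH_ALIGN16 alignas(16)   // standard C++11 fallback
#endif

// The macro goes between the struct keyword and the type name,
// a position that all three spellings accept.
struct SKETCH_ALIGN16 Vec4fStorage
{
    float v[4];
};

// True if the object sits on a 16-byte boundary.
inline bool onSixteenByteBoundary(const Vec4fStorage& s)
{
    return (reinterpret_cast<std::uintptr_t>(&s) & 15u) == 0;
}
```

This only covers declarations with static or automatic storage; heap allocations of such types may need an aligned allocator as well, depending on the compiler and language version.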
> So - your opinion, experience and suggestions welcome!