Here is a small preview of the new pawaaaaaaaarrrrrrr:
http://www.newtondynamics.com/downloads ... Solver.wmv
This is not optimized, and it only implements the solver on the GPU.
Broad phase, narrow phase collision, island assembly, and body integration are still implemented on the CPU.
What is worse, the entire scene is serialized to the GPU on entry to the solver and serialized back to the CPU on exit.
Even with all these inefficiencies you can see that on the CPU (a single core of a 2.66 GHz Core 2) the profile is off the chart, while the GPU (a GeForce 280) runs at about 30 fps.
Once the other systems have their own proxy representation in GPU memory, all the memory copies will go away and the whole system should be way, way faster.
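To make the cost of that round trip concrete, here is a minimal host-only sketch in C++. It is not the Newton code: `BodyState`, `FakeDevice`, `runRoundTrip`, and `runResident` are hypothetical names, a plain vector copy stands in for the host-to-GPU transfer, and the "solver" is a placeholder that just halves the linear velocity. It only counts how many bytes cross the bus when the scene is copied in and out every frame, versus when the data stays resident in device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-body data; the real engine layout is not shown in this post.
struct BodyState {
    float velocity[3];
    float omega[3];
};

// Stand-in for GPU memory: a second buffer plus a count of bytes copied in or out.
struct FakeDevice {
    std::vector<BodyState> memory;
    std::size_t bytesCopied = 0;

    // Models serializing the scene to the GPU.
    void upload(const std::vector<BodyState>& host) {
        memory = host;
        bytesCopied += host.size() * sizeof(BodyState);
    }

    // Models serializing the result back to the CPU.
    void download(std::vector<BodyState>& host) {
        host = memory;
        bytesCopied += memory.size() * sizeof(BodyState);
    }

    // Placeholder for the solver: halves each linear velocity component.
    void solve() {
        for (BodyState& body : memory)
            for (float& v : body.velocity) v *= 0.5f;
    }
};

// Current situation: copy the whole scene in and out around every solver call.
std::size_t runRoundTrip(std::vector<BodyState>& scene, int frames) {
    assert(frames >= 0);
    FakeDevice gpu;
    for (int i = 0; i < frames; ++i) {
        gpu.upload(scene);
        gpu.solve();
        gpu.download(scene);
    }
    return gpu.bytesCopied;
}

// Target situation: the data lives on the device, so it is copied once each way.
std::size_t runResident(std::vector<BodyState>& scene, int frames) {
    assert(frames >= 0);
    FakeDevice gpu;
    gpu.upload(scene);
    for (int i = 0; i < frames; ++i) gpu.solve();
    gpu.download(scene);
    return gpu.bytesCopied;
}
```

Both functions leave the scene in the same state, but the round-trip version moves `frames` times as many bytes as the resident one. The resident version only becomes possible once the systems that still run on the CPU no longer need the data back every frame.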
The reason I started this in reverse order (solver first) is that the Newton solver is the most critical part of the engine, and if this part does not work, then all the rest is useless. This is an experience I had when trying to implement the OpenMP version and running into the impossibility of efficient thread management.
Anyway, I will continue adding all of the major systems, and then I will move to the simpler and trivial stuff like cloth, soft bodies, and SPH fluids.
This works on all CUDA-ready GeForce cards.
In fact this could even be implemented in a pixel shader, but that is something I will not dare to do.
The CUDA version will be a temporary GPU implementation until the more generic OpenCL language is ready, and then it will be available on all GPUs and multicore CPUs.
I cannot wait for Larrabee, which I think will smoke the GeForce into the dust.