I now have the code that generates the grid hash every frame.
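For readers who have not seen a grid hash before, here is a minimal CPU-side sketch of the idea, not the engine's actual kernel. Everything in it is an assumption for illustration: the names `Vec3`, `GridEntry`, `PackCell` and `BuildGridHash` are hypothetical, each body is registered only in the cell that contains its center, and cell coordinates are assumed to fit in 21 bits per axis.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

struct GridEntry {
    std::uint64_t cellKey; // packed integer cell coordinates
    std::uint32_t bodyId;  // index of the body in the input array
};

// Pack three cell coordinates into one 64-bit key, 21 bits per axis.
// The bias shifts negative coordinates into the unsigned range.
inline std::uint64_t PackCell(std::int32_t x, std::int32_t y, std::int32_t z)
{
    const std::uint64_t mask = (1ull << 21) - 1;
    const std::int32_t bias = 1 << 20;
    return ((std::uint64_t(x + bias) & mask) << 42) |
           ((std::uint64_t(y + bias) & mask) << 21) |
            (std::uint64_t(z + bias) & mask);
}

// Rebuild the whole table from scratch, as would be done every frame.
// Sorting by key puts all bodies of one cell next to each other.
std::vector<GridEntry> BuildGridHash(const std::vector<Vec3>& centers, float cellSize)
{
    assert(cellSize > 0.0f);
    std::vector<GridEntry> grid;
    grid.reserve(centers.size());
    for (std::uint32_t i = 0; i < centers.size(); ++i) {
        const Vec3& p = centers[i];
        const std::uint64_t key = PackCell(
            std::int32_t(std::floor(p.x / cellSize)),
            std::int32_t(std::floor(p.y / cellSize)),
            std::int32_t(std::floor(p.z / cellSize)));
        grid.push_back({key, i});
    }
    std::sort(grid.begin(), grid.end(),
              [](const GridEntry& a, const GridEntry& b) {
                  return a.cellKey < b.cellKey ||
                        (a.cellKey == b.cellKey && a.bodyId < b.bodyId);
              });
    return grid;
}
```

On the GPU the same two steps would typically be a key-generation kernel followed by a parallel sort; the sketch only shows the data layout that results.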
There is still a lot more work to come. To stress test it, I tried a 125k scene.
The scene runs at 5 fps, but that is not a concern, since these molton objects will have an option, like set as background Gravity, which will be used for stuff like
gravity will be fixed, so no callback.
The material will also be fixed.
And for the GPU there will be interop.
The physics takes about 22 ms, but 20 ms of that is just setting the transforms.
The GPU takes 2 ms, so the GPU is not a bottomless pit of performance; it is starting to show signs of the load.
But this is a very heavy load, and I am very impressed.
The one concern that keeps growing is that, for some reason, as kernels are executed the driver keeps inserting those silent gaps. In the capture below, you can see that it takes 2 ms, but there are three silences of about 150 us.

- Untitled.png (59.34 KiB)
Of course this is a stress test; my expectations are far more humble. If we can get
8 to 10 bodies on a mid-range GPU taking a very small fraction of the GPU, I would consider it a success. We cannot take over the GPU for physics, but in graphics there is a lot of spare idle GPU time. Anyway, I do not know what to make of those silent gaps, but they keep showing up.
I am now on to the generation of new colliding pairs.
This phase does the following:
-Generate all pairs.
-Prune duplicates.
-Merge with the existing pairs and leave only the new pairs.
-Delete dead pairs.
-Copy that array to the CPU so that the engine generates the contact joints.
And that will complete the broad phase.
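The steps listed above can be sketched on the CPU with standard sorted-range algorithms. This is only an illustration under stated assumptions, not the engine's GPU implementation: `CellEntry`, `PairKey`, `MakePair`, `GeneratePairs` and `UpdatePairs` are hypothetical names, pairs are only formed between bodies that share a cell, and a pair counts as dead when the current frame no longer generates it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct CellEntry { std::uint64_t cellKey; std::uint32_t bodyId; };
using PairKey = std::uint64_t; // (low id << 32) | high id

// Order the ids so that (a, b) and (b, a) give the same key.
inline PairKey MakePair(std::uint32_t a, std::uint32_t b)
{
    if (a > b) std::swap(a, b);
    return (PairKey(a) << 32) | b;
}

// Step 1: generate all pairs. The grid must be sorted by cellKey.
// A body listed in several cells produces the same pair more than once.
std::vector<PairKey> GeneratePairs(const std::vector<CellEntry>& grid)
{
    std::vector<PairKey> pairs;
    for (std::size_t begin = 0; begin < grid.size();) {
        std::size_t end = begin;
        while (end < grid.size() && grid[end].cellKey == grid[begin].cellKey) ++end;
        for (std::size_t i = begin; i < end; ++i)
            for (std::size_t j = i + 1; j < end; ++j)
                if (grid[i].bodyId != grid[j].bodyId)
                    pairs.push_back(MakePair(grid[i].bodyId, grid[j].bodyId));
        begin = end;
    }
    return pairs;
}

struct PairUpdate {
    std::vector<PairKey> alivePairs; // pair list carried to the next frame
    std::vector<PairKey> newPairs;   // array that would be copied to the CPU
    std::vector<PairKey> deadPairs;  // pairs that were deleted this frame
};

// Steps 2 to 4. 'existing' must be sorted and free of duplicates.
PairUpdate UpdatePairs(std::vector<PairKey> candidates,
                       const std::vector<PairKey>& existing)
{
    assert(std::is_sorted(existing.begin(), existing.end()));
    // Step 2: prune duplicates.
    std::sort(candidates.begin(), candidates.end());
    candidates.erase(std::unique(candidates.begin(), candidates.end()),
                     candidates.end());
    PairUpdate out;
    // Step 3: candidates that are not already known are the new pairs.
    std::set_difference(candidates.begin(), candidates.end(),
                        existing.begin(), existing.end(),
                        std::back_inserter(out.newPairs));
    // Step 4: known pairs that were not generated this frame are dead.
    std::set_difference(existing.begin(), existing.end(),
                        candidates.begin(), candidates.end(),
                        std::back_inserter(out.deadPairs));
    out.alivePairs = std::move(candidates);
    return out;
}
```

Keeping every list sorted is the design choice that makes the merge and the dead-pair removal single linear passes; a GPU version would presumably replace `std::sort`, `std::unique` and the set differences with parallel sort and compaction kernels.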