Maybe I have some inspiration for cloth simulation.
I was playing around with this some time ago, out of frustration with Softimage XSI cloth.
The yellow rings on the green cloth rig are built from a pair of particles, which are constrained to stay on the skeleton and to keep a minimum distance from neighbouring particles.
The cloth itself is constrained to keep a maximum distance to the rig and, more importantly, not to cross the plane at the rig's attachment point.
This keeps the cloth stable: it's impossible for a leg to pass through a long skirt, for example, no matter whether the character is kung fu fighting or teleporting around.
The downside is that the cloth rig requires knowledge of the particular ragdoll, so this adds complexity to content creation.
However, if the crossing becomes a problem (I think that's the "drag and jitter" in cloth sim), it may be an option.
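To make the two cloth constraints above a bit more concrete, here is a minimal sketch in Python. It is not the code from the demo; the function names and the choice to simply project the vertex back are my assumptions, and the plane normal is assumed to be unit length and to point to the side where the cloth is allowed to be.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def clamp_max_distance(p, anchor, max_dist):
    """Pull cloth vertex p back to at most max_dist from its rig anchor."""
    d = sub(p, anchor)
    length = math.sqrt(dot(d, d))
    if length <= max_dist or length == 0.0:
        return p  # already within range
    return add(anchor, scale(d, max_dist / length))

def keep_in_front_of_plane(p, plane_point, plane_normal):
    """Project p back onto the plane if it crossed to the negative side.

    plane_normal is assumed to be unit length (hypothetical convention).
    """
    depth = dot(sub(p, plane_point), plane_normal)
    if depth >= 0.0:
        return p  # still on the allowed side
    return sub(p, scale(plane_normal, depth))
```

In a simulation step you would apply both functions to each cloth vertex after integration, using the attachment point on the rig as `anchor` and `plane_point`.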
For collision detection, each cloth vertex traverses the ragdoll capsules as a tree to quickly find the set of skin vertices linked to a specific region on a capsule.
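A rough sketch of that kind of lookup, again in Python and under my own assumptions: the `Capsule` class, the `buckets` list (skin vertex indices grouped into slices along the capsule axis) and the `margin` parameter are hypothetical, not taken from the demo. This version visits every capsule in the tree; a real one would presumably skip whole subtrees using some bounding volume.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Capsule:
    a: tuple        # segment start
    b: tuple        # segment end
    radius: float
    buckets: list   # buckets[i] = skin vertex indices near slice i along a->b
    children: list = field(default_factory=list)

def closest_t(p, a, b):
    """Parameter t in [0, 1] of the point on segment a->b closest to p."""
    ab = tuple(b[i] - a[i] for i in range(3))
    ap = tuple(p[i] - a[i] for i in range(3))
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        return 0.0  # degenerate capsule (a sphere)
    t = sum(ap[i] * ab[i] for i in range(3)) / denom
    return max(0.0, min(1.0, t))

def candidates(capsule, p, margin, out=None):
    """Collect skin vertex indices from capsule regions close to p."""
    if out is None:
        out = []
    t = closest_t(p, capsule.a, capsule.b)
    q = tuple(capsule.a[i] + t * (capsule.b[i] - capsule.a[i]) for i in range(3))
    if capsule.buckets and math.dist(p, q) <= capsule.radius + margin:
        # Pick the slice along the axis that contains the closest point.
        i = min(int(t * len(capsule.buckets)), len(capsule.buckets) - 1)
        out.extend(capsule.buckets[i])
    for child in capsule.children:
        candidates(child, p, margin, out)
    return out
```

The cloth vertex then only needs to be tested against the returned skin vertices instead of the whole skin mesh.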
A GPU grid may be better, as it can also handle self-collisions in one go.
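The grid idea can be sketched on the CPU like this. It is a plain uniform grid in Python, not GPU code and not something from the demo; the function names are made up, and it assumes the query radius is no larger than the cell size so that checking the 27 surrounding cells is enough. If cloth and skin vertices go into the same grid, one query finds both cloth-skin and cloth-cloth neighbours.

```python
import math
from collections import defaultdict

def cell_of(p, cell_size):
    """Integer grid cell containing point p."""
    return tuple(math.floor(c / cell_size) for c in p)

def build_grid(points, cell_size):
    """Map each occupied cell to the indices of the points inside it."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[cell_of(p, cell_size)].append(i)
    return grid

def neighbours(grid, points, i, radius, cell_size):
    """Indices j != i closer than radius to points[i].

    Assumes radius <= cell_size, so only the 3x3x3 block of cells
    around the point has to be searched.
    """
    cx, cy, cz = cell_of(points[i], cell_size)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    if j != i and math.dist(points[i], points[j]) < radius:
                        found.append(j)
    return found
```

On a GPU the dictionary would typically be replaced by sorting particles by cell index, but the neighbourhood query stays the same.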
Performance is good: the demo video was done on a single CPU thread, including Catmull-Clark subdivision of the skin in realtime, with everything unoptimised and no SIMD.
I planned to port it to OpenCL, but a year ago CUDA was twice as fast... has that changed in the meantime?
Video:
http://www.megaupload.com/?d=KOGVQBA6