Development of self balancing biped with inverse Dynamics

by **Julio Jerez** » Sun Feb 13, 2022 8:19 pm

ok, guys this is the moment we have been waiting for.
:mrgreen:

I dare anyone to pick an object from the scene.

known issue: if you collide with the objects, the arm can push them through the ground very easily.
the problem can be fixed, but for now let us leave it like that.

the joints are mighty powerful, but there are many ways to address that. in reality, if a real robot did that, it would most likely destroy the box, damage itself, or trigger some safety that would make it stop. we can mimic that behavior.
The easiest way to deal with that is to lower the arm torque limits; right now they are almost unlimited.

anyway, I think with this, other than some tweaks here and there, this proof of concept is a done deal.

Now we can move to the next application of this solver.

I was thinking that before moving to a biped, maybe it is better to try a quadruped like the Spot robot from Boston Dynamics. A biped combines two problems at once:
- how to coordinate multiple effectors
- the self balancing.

While a quadruped removes the balance part of the problem. Then, after we master the coordination of multiple effectors, we move on to the biped.

this will be my version of spot.

Attachment: Untitled.png (26.09 KiB)

by **JoeJ** » Mon Feb 14, 2022 4:31 am

Haha, lifted the box without issues. All precise and perfect. :mrgreen:

Code super easy as well.

Btw, i posted the ML video just to show the state of the competition. But if you're eager to become an ML expert, just go ahead. There are similar works with ML driving physics simulation, like this: https://www.youtube.com/watch?v=eksOgX3vacs Though, that's the stuff i could do traditionally and without any mocap samples, i'm sure of that.

Another impressive one, aiming to reduce input latency: https://www.youtube.com/watch?v=14tNq-fqTmQ

by **Julio Jerez** » Mon Feb 14, 2022 3:54 pm

if you look at these videos,
https://www.youtube.com/watch?v=zJz2HtPUolk
https://www.youtube.com/watch?v=KlFO8QMXVCs
it is along the lines of what I wish to achieve, but dynamic and parametric.

if we get a self-balancing controller that operates after a procedural footstep generator, then it is just a matter of solving the footstep placement in the environment, and the physics does the rest.

by **JoeJ** » Mon Feb 14, 2022 5:52 pm

Yeah, that's also my goal.
The 3Ds videos above do not look very natural, but once you add balancing, it becomes natural automatically without mocap.
And we have precise control and better performance.

The main advantage of ML seems to be that they can add many skills with very little work, if they have proper samples.
In contrast, it would likely take me weeks or months to get some melee boxing or karate to work, even if i already had walking and running.
And because i know nothing about martial arts (did not even play Street Fighter), my results may end up pretty bad.
But i'm more interested in shooting anyway, which is simple.

Though, looking into my crystal ball, i can well imagine we are close to a ML revolution affecting games, and character simulation seems the proper application. A company like Epic may add this to their Meta Humans to offload the training hurdle from users. And boom - suddenly indies can use characters like the big boys. This could have some good effect on games, but also a bad one by furthering the decline of custom engines, which blocks innovation in the long run.

I'll continue to observe the AAA Titanic from a distance to find out...

by **Julio Jerez** » Mon Feb 14, 2022 6:19 pm

JoeJ wrote:Though, looking into my crystal ball, i can well imagine we are close to a ML revolution affecting games, and character simulation seems the proper application. A company like Epic may add this to their Meta Humans to offload the training hurdle from users.

I do not think that will happen any time soon.
Neural networks are regression calculators. All they do is fit the coefficients of a curve in multiple dimensions. Before, people were doing that with principal component analysis and Gaussian networks, and even then it was very difficult. Everything you can do with a neural network can also be done with a Gaussian network at a smaller scale.

basically, what a network does is take an entire animation clip as a single sample in a search space, and then from those samples it can give you the set of all samples that are close in that space.

it is not that simple to program a neural network. we are living in a world where misinformation is what sells. all the fuss you see about neural networks is the byproduct of one algorithm called backpropagation, which allows networks with hundreds of layers to be trained efficiently

and since Google and Apple bought them, everyone jumped on the bandwagon. even IBM thought they were going to be able to find a cure for cancer with neural networks and instead ended up playing Jeopardy.
The investment in programming neural networks is just too high for video games, when the same thing can be done using a blend graph.

by **JoeJ** » Mon Feb 14, 2022 7:43 pm

basically, what a network does is take an entire animation clip as a single sample in a search space, and then from those samples it can give you the set of all samples that are close in that space.

That's pretty much how i imagined it to work. I won't go there - feel too old to learn it. I would be happy if i could learn the basics of the math you explained some posts above in my lifetime ;D

However, despite always being critical of hyped stuff, i became more open minded towards ML.
It first occurred to me here on this forum. Pretty early when i came here, and i think this was before the hype. An ML researcher contacted me by PM because he was interested in using it for character animation. We talked a bit, and he told me things hard to believe. He said his stuff could even learn to talk. Though, it might take ten years for the training.
Now, all he said became true. A year ago i had an intellectual conversation with an ML chatbot. It did not pass the Turing test on me, but the talk was meaningful. The bot's reactions to my cynical remarks about the meaning of life from the perspective of an AI bot were eerily relaxed and almost outsmarting, reflecting my trolling with simple truth.

But what really convinced me was this: https://soundcloud.com/yichao-zhou-555747812/sets/bandnet-sound-samples-1
As i'm a hobby musician, i always said 'ML will never be able to compose earworms'. That's the highest art in music. Nobody has a recipe. We simply don't know why some melodies stick in our heads. And if you compose one yourself, it feels like a gift coming from the outside. You just pick it up and eventually refine it.
Those examples from the link are generated by ML from Beatles samples. And the songs are, while somewhat unconventional, pretty good. I know many human composers who never get there in their whole life. The examples surely are cherry picked, but that's no argument. Finding one good song from a million examples would be not much different than composing one yourself. Thus, many songs the ML has generated must have some higher quality i guess.
So, ML achieved what i personally thought would be impossible for a machine.
The magic surely comes from the Beatles samples, which are human, but the resulting songs are new ones. And the first one is really catchy. Could become a radio hit or at least a good commercial.

ML really seems useful. Especially for tasks which are fuzzy and lack a precise definition of the problem. Which i think is not true for our robotic topic here.
Still, my crystal ball remains showing me the picture of games industry jumping on this. You explained the reasons yourself. :mrgreen:

by **Julio Jerez** » Sat Jul 12, 2025 12:10 pm

Hey Joe, I know this is an old topic, but I thought you might still find it interesting.

Remember a few years back when we realized that even with a robust physics solver, building a self-balancing robot wasn't as straightforward as we expected?
We discovered that contact behavior could become extremely brittle, and we both agreed that the solution likely lay in using a soft contact model.
At the time, I had this idea to modify the contact joint to allow for penetration, thinking that would add the needed softness.
I spent a lot of time trying to make that work, but reality eventually caught up with me and showed me that it was a fundamentally flawed approach.
Not just wrong, but flat out boneheaded. Looking back, I’m not sure what I was thinking.
Allowing penetration made no sense, especially for humanoid robots where the contact surface (feet) is so thin. It just didn’t align with the physical realities of the system.

Ironically, I had already solved this problem in a different context: vehicle and track simulations.
These setups use small contact areas too, but they behave just fine. The trick, as it turns out, is that vehicle tires emulate soft contacts not via the contact joint itself, but through the suspension system.
I hadn’t realized that this was essentially the same problem, just tackled from a different angle.

Fast forward to training legged robots: I ran into the same issue again, in the worst possible way. Every time a leg hit the ground, the impulse generated by the contact would completely disrupt the learning process.
To deal with it, I added extra variables to predict foot strike timing and tried to mitigate the impulse, but the results were underwhelming.

Inverse dynamics (ID) based training for target positions performed poorly, while forward dynamics using joint torques yielded slightly better results.
At that point, I started questioning my assumption that inverse dynamics could produce superior outcomes. Maybe I was wrong?

Then, looking at robots from Boston Dynamics, I noticed one common feature across all their designs:
a soft pad at the feet.
That’s basically a soft contact, but how to replicate that in simulation?
The breakthrough idea was simple: split the lower leg (calf) and connect it to the foot with a sliding spring.

Attachment: Untitled.png (9.9 KiB)

This allows the foot to absorb impact energy via the spring, without compromising the leg’s joint angles solver. The result should be a smoother, more organic motion.
This also helps on the machine learning front.
Neural networks struggle with sharp discontinuities.
One way to handle that is to use large hidden layers, but that introduces a whole new set of challenges.
Now, I think I’ve found a better balance: eliminate the discontinuities at the source using mechanical modeling (soft contacts),
then train using inverse dynamics, combining the best of both worlds.
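
A minimal sketch of the sliding spring idea, reduced to one dimension. This is not the Newton joint used in the post: `SlidingFoot`, `SpringDamperForce`, `Step`, and all numeric values are hypothetical names and assumptions chosen only to show how a spring-damper on the slide axis turns a hard impact into a smooth compression.

```cpp
#include <cassert>

// Hypothetical 1D model: the load carried by the foot slides along the calf
// and is pushed back toward its rest position by a spring-damper.
struct SlidingFoot
{
    double stiffness; // spring constant k, N/m (assumed value)
    double damping;   // damper constant c, N*s/m (assumed value)
    double mass;      // mass carried by the spring, kg
    double offset;    // compression along the slide axis, m
    double velocity;  // compression rate, m/s
};

// Force the spring-damper applies against the compression.
double SpringDamperForce(const SlidingFoot& foot)
{
    return -foot.stiffness * foot.offset - foot.damping * foot.velocity;
}

// One semi-implicit Euler step under a constant external load (for example weight).
void Step(SlidingFoot& foot, double externalForce, double dt)
{
    assert(foot.mass > 0.0 && dt > 0.0);
    const double accel = (externalForce + SpringDamperForce(foot)) / foot.mass;
    foot.velocity += accel * dt; // update velocity first
    foot.offset += foot.velocity * dt; // then position, using the new velocity
}
```

Stepping a 10 kg load with k = 2000 N/m and c = 200 N*s/m under its own weight settles at a compression of weight / k, with no velocity jump along the way. That continuity is the property the post is after for training.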

by **Julio Jerez** » Sat Jul 12, 2025 1:05 pm

actually, once we have the idea, we can elaborate and make better models.
here is one.

Attachment: Untitled1.png (25.12 KiB)

this is the trick I use for the tracked vehicle: the tracks are very thin,
so they have a chance to penetrate a polygonal floor, but they also have an upper collision shape that prevents them from falling through the floor.

here it is very similar: most of the time the ball collides, but it has an upright spring damper.
therefore, when the leg hits the floor, the first impact is damped by the spring damper, and the ball stays put.
The rest of the leg keeps going down for a few steps, each time attenuating the impact.

two things can happen: the leg speed can be high, so the calf hits the floor, but at that point the impulse is much smaller, or ideally the ball spring absorbs the full impact.

this is a better model, because when the robot walks, the calf can be almost horizontal to the floor,
nullifying the longitudinal spring, while the sphere has a special spring that aligns the movement along the contact normal, or maybe the gravity.

This is more challenging for the solver, because the ball must be very light.
I just have to experiment.

To me, this is the closest I can come to emulating that soft pad.

by **JoeJ** » Mon Jul 14, 2025 5:55 am

Interesting. I had thought about such a suspension system too, but never tried it because i didn't want to make the model more complex.

Also, the problem i was facing was different. My current approach of using a single custom 'contact' joint works very well. And while i have explained it before, i will give a summary for the record.

So my problem is different because the 'averaged' contacts as generated by Newton were at the wrong place. With 'averaged' i mean summing up contact positions weighted by contact force magnitude, which i did for debug visuals to see if results match expectations.
They didn't. Overall the average is too close to the downprojected com, and too far away from the predicted center of pressure as generated by motor torques.
Also the point jumps around in discontinuous ways, seemingly caused by some contact caching trying to keep the contacts in place for some time.
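
For reference, the force-weighted average described above can be sketched as follows. The `Contact` record and `AverageContactPoint` are hypothetical names, not Newton API; how the positions and force magnitudes are read from the engine is left out on purpose.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { double x, y, z; };

// Hypothetical contact record: a position and the magnitude of the contact force there.
struct Contact
{
    Vec3 position;
    double forceMagnitude;
};

// Sum of contact positions weighted by force magnitude, divided by the total force.
// Returns false when no contact carries any load.
bool AverageContactPoint(const std::vector<Contact>& contacts, Vec3& result)
{
    double total = 0.0;
    Vec3 sum = {0.0, 0.0, 0.0};
    for (const Contact& c : contacts)
    {
        assert(c.forceMagnitude >= 0.0);
        sum.x += c.position.x * c.forceMagnitude;
        sum.y += c.position.y * c.forceMagnitude;
        sum.z += c.position.z * c.forceMagnitude;
        total += c.forceMagnitude;
    }
    if (total <= 0.0)
    {
        return false;
    }
    result.x = sum.x / total;
    result.y = sum.y / total;
    result.z = sum.z / total;
    return true;
}
```

With two contacts at x = 0 and x = 1 carrying 10 and 30 units of force, the average lands at x = 0.75, closer to the more heavily loaded contact.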

Thus, the contact force is actually unpredictable, and i had expected this would cause issues with your learning too. Maybe your current solution to become more continuous is enough so the learning can predict, but if not it might haunt you again when you move from dog to biped, having a much smaller contact area.

My current solution is to disable default collisions, and instead generate one contact per foot at the desired location. My contact also models friction and counteracts rotation around the normal, so standing on one foot works.

However, i still have the problem along the line between both contacts when they are both active.
Then, the average contact force of both again ends up too close to com.
I tried to control it by setting max friction, but it did not work. My estimate of contact forces is too approximate, and also it could happen that one foot sank into the floor.

Still, i think it's good enough as is. I have not yet tried walking over dynamic bodies, but i'm optimistic.

A better way might be to 'hack' the contact forces of default contacts, but i guess this would invalidate solver results and isn't that easy.
Maybe you could also pay more attention to motor forces while solving for contacts, so they can be controlled by them. But maybe that's a chicken and egg problem too.

Anyway, it looks good so far.
I'm currently working on it again. Details like lifting / putting down feet at the right time. It's hairy... :wink:

by **Julio Jerez** » Tue Jul 15, 2025 1:14 pm

Your approach actually aligns with many robotic manipulation strategies that rely on the concept of the Zero Moment Point (ZMP). So it's hard to argue against it.

In general, these methods gather all contact forces and calculate an effective contact point by averaging, typically using sensors that measure acceleration. They then analyze the robot's body configuration to determine whether the contact point required to counteract the acceleration lies within the support polygon. If it does, the robot is considered stable.
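
As a rough illustration of that stability test, here is a sketch under the common cart-table simplification: all mass concentrated at the com, constant com height, flat ground, no vertical acceleration. `EstimateZmp` and `InsideSupportPolygon` are hypothetical helper names, and the polygon is assumed convex with counterclockwise vertices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { double x, y; };

// Cart-table approximation: zmp = com - (height / gravity) * com acceleration,
// all measured in the ground plane.
Vec2 EstimateZmp(const Vec2& comGround, const Vec2& comAccel, double comHeight, double gravity)
{
    assert(gravity > 0.0);
    const double scale = comHeight / gravity;
    Vec2 zmp;
    zmp.x = comGround.x - scale * comAccel.x;
    zmp.y = comGround.y - scale * comAccel.y;
    return zmp;
}

// True when p lies inside or on a convex polygon given with counterclockwise vertices.
bool InsideSupportPolygon(const std::vector<Vec2>& polygon, const Vec2& p)
{
    const std::size_t count = polygon.size();
    if (count < 3)
    {
        return false;
    }
    for (std::size_t i = 0; i < count; ++i)
    {
        const Vec2& a = polygon[i];
        const Vec2& b = polygon[(i + 1) % count];
        const double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
        if (cross < 0.0)
        {
            return false; // p is on the outer side of edge a->b
        }
    }
    return true;
}
```

For a 0.2 m by 0.1 m foot centered under the com and a height / gravity ratio of 0.1, a forward acceleration of 0.5 m/s^2 keeps the point inside the foot, while 2 m/s^2 pushes it 0.2 m behind the com and outside the polygon.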

I've always felt that these approaches are quite limited, and indeed, they are. The resulting motion isn't very convincing. The robot must keep the support point within the foot area at all times, which is why we often see robots with oversized feet.

This limitation arises because the underlying theory comes from classical control theory, which relies on second-order homogeneous differential equations.
These equations have nice mathematical properties. Their solutions are typically of the form:

X = A * exp(s * t), with a complex exponent s = -d + i * k

which has two factors in the time domain, of the form:
X = A * exp(-d * t) * (cos(k * t) + i * sin(k * t))

the real part of the exponent, -d, sets how fast the transient response decays,
and the imaginary part, k, sets the frequency of the oscillation.

It's easy to see that if the real part of the exponent is negative (d > 0), then the transient component of the motion decays over time.
A large part of classical control theory revolves around techniques for analyzing and manipulating these kinds of equations.
These differential equations are often transformed into the frequency domain using the Laplace or Fourier Transform, which turns them into algebraic equations.
But despite all the effort, these controllers can only reliably predict one step into the future.

With today’s powerful computing and fast physics engines, it's now possible to simulate multiple frames ahead at each control step. I believe this is the approach used by Boston Dynamics.

In the context of Reinforcement Learning (RL), these predictive methods are now called model-based approaches. The advantage of RL is that it can move beyond the constraints of classical homogeneous equations.

Instead of simulating the entire robot, RL simulates control inputs. The process is trial and error: you take an action, evaluate a reward to determine if the action was beneficial, and then repeat this many times, saving the sequence of actions.
Over time, you evaluate which actions lead to survival and success and which do not.
The system gradually changes the landscape to favor good actions and avoid bad ones.

After enough iterations, the RL agent develops a controller that knows what action to take in any given configuration.

Here’s where the challenge arises: improving actions in RL involves shaping the action space into a smooth reward landscape. Optimization follows the gradient of that landscape. However, when the landscape is sharp or has cliffs, often caused by spiky contact forces, the gradients are no longer smooth. This creates instability in training and performance.

I'm skipping many details, but that’s the general idea.

I’m still working on this and exploring how it plays out in practice. One of the things I like most about the RL approach is that you don’t need a local solver to simulate a second or more into the future. Instead, RL embeds that future planning into the weights of the neural network, often by optimizing using the Bellman equation.
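
To make the Bellman idea concrete, here is a minimal sketch using tabular value iteration on a toy problem. This is not the training setup from the post: the states on a line, the two actions, the reward of 1 at the goal, and the name `ValueIteration` are all assumptions for illustration. A neural network plays the role of the table when the state space is too large to enumerate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy problem (assumed): states 0..n-1 on a line, actions "left" and "right",
// reward 1 for stepping into the last state, which is terminal.
std::vector<double> ValueIteration(std::size_t stateCount, double discount, int sweeps)
{
    assert(stateCount >= 2 && discount > 0.0 && discount < 1.0);
    std::vector<double> value(stateCount, 0.0);
    const std::size_t goal = stateCount - 1;
    for (int sweep = 0; sweep < sweeps; ++sweep)
    {
        for (std::size_t s = 0; s < goal; ++s)
        {
            const std::size_t left = (s == 0) ? 0 : s - 1;
            const std::size_t right = s + 1;
            const double rewardRight = (right == goal) ? 1.0 : 0.0;
            // Bellman optimality backup: immediate reward plus discounted
            // value of the next state, maximized over the two actions.
            const double qLeft = discount * value[left];
            const double qRight = rewardRight + discount * value[right];
            value[s] = std::max(qLeft, qRight);
        }
    }
    return value;
}
```

With five states and a discount of 0.9, the values converge to 0.729, 0.81, 0.9, 1 and 0 (the terminal state). The further a state is from the reward, the lower its value, so the plan for several steps ahead is stored in the table and needs no forward simulation at run time.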

by **JoeJ** » Wed Jul 16, 2025 4:19 am

Julio Jerez wrote:Your approach actually aligns with many robotic manipulation strategies that rely on the concept of the Zero Moment Point (ZMP)

I initially experimented with ZMP, but i don't use it.
But maybe the ZMP is equivalent to the center of pressure, as long as the CoP is inside the support polygon. I did read something like that iirc. Then yes, my stuff is all about controlling CoP.

The robot must keep the support point within the foot area at all times, which is why we often see robots with oversized feet.

Well, the CoP can't be outside the contact area of the foot, as there can't be a contact between ground and thin air. So a larger area makes it easier to control the CoP, ofc.
I would say it this way: 'If you would need to place the CoP outside the foot to prevent the fall, then you won't be able to prevent the fall. Unless you can lift a foot and put it at the desired CoP in time.'
(It makes sense to talk about a 'desired CoP', shortly dCOP, as seen in papers.)
So the primary way my stuff works is to move the COM to a target as fast as possible by placing the CoP on the edges of the support polygon.
There is an analytical solution for this, which has many transient functions like exp().
But i doubt i use classical control theory, because i never learned anything about that. I also didn't see something sounding like my approach in robotics papers. My guess is that Boston Dynamics indeed does the same thing, but they kept it secret and did not expose it in papers. But ofc. idk.
Currently i see many more nice robots, but from other companies. So i thought that's maybe mostly ML.
Haha, and on YT i do even see robots which look like hot females, but i guess that's fake. : )

I lose you pretty quickly when you start talking math, and can't confirm or disagree.

But have you tried the 'Instantaneous Capture Point'?
Afaik BD uses it to plan steps, or to prevent a fall.
I have experimented with this often, and still do. Seems useful.
The math to calculate it is very simple, and i do not really understand why it works. I assume it's a kind of approximation.

This is my code, but better look for 'proper' resources. ; )

Code:

```cpp
sVec3 CalcICP (const sVec3 &gravity, const sVec3 &linvel) const // instantaneous capture point relative to com
{
	float g = gravity.Length();
	sVec3 jP = comL + qL.Rotate(jointL); // position of inverted pendulum ankle joint
	float zGc = fabs(sVec3(jP - com).Dot(*gravityDir)); // com is from both IP bodies
	sVec3 IcP = linvel * sqrt (zGc / g);
	IcP -= *gravityDir * gravityDir->Dot (IcP); // gravityDir is a pointer to a Vec3 and a unit vector, making this code really ugly
	return IcP;
}
```
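
On why the formula works: under the linear inverted pendulum model (point mass, constant com height h, massless leg), the horizontal com motion obeys x'' = (g / h) * (x - p), where p is the pivot on the ground. Its solution has a growing term exp(t * sqrt(g / h)), and placing the pivot at p = x + v * sqrt(h / g) makes the coefficient of that term zero, so the com coasts to rest above p. It is therefore exact for that simplified model and an approximation for a real body. A self-contained sketch of the same computation, with `Vec2` and `CapturePointOffset` as hypothetical names:

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Capture point offset from the com on the ground plane, for a linear inverted pendulum.
// comVelocity: horizontal com velocity (m/s); comHeight: com height above the pivot (m).
Vec2 CapturePointOffset(const Vec2& comVelocity, double comHeight, double gravity)
{
    assert(comHeight > 0.0 && gravity > 0.0);
    const double timeConstant = std::sqrt(comHeight / gravity); // sqrt(h / g)
    Vec2 offset;
    offset.x = comVelocity.x * timeConstant;
    offset.y = comVelocity.y * timeConstant;
    return offset;
}
```

For a com at 1 m height moving at 1 m/s, the offset is about 0.32 m in the direction of motion, which is roughly where a foot would need to land to stop.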

by **JoeJ** » Wed Jul 16, 2025 4:28 am

One of the things I like most about the RL approach is that you don’t need a local solver to simulate a second or more into the future.

Hehe, no, that's not what you like about RL.
What you guys like about ML is the fact that you do not need to know how X (balancing) actually works.
You only need to provide enough examples, train, and benefit.
But after that you did not gain any insights on how X (balancing) works, still. :mrgreen:

Altman still does not know how talking works either. Otherwise he would not talk so much bullshit.
hihihihi :twisted:

by **Julio Jerez** » Wed Jul 16, 2025 11:10 am

that's partially true
Machine learning is just regression.
Basically, it lets you fit a surface to approximate data points and then
interpolate and extrapolate among them.
People have been doing that for centuries with look-up tables, splines, Gaussian processes, and so on, and no one thinks that an interpolated value is less valuable than one you got from an equation.
The difference now is that neural nets have a very large tolerance to error and also seem to be unlimited in the amount of data they can encode.

The problem with neural network machine learning is that it is extremely prone to self-correction.
To make an analogy, imagine a mountain range in which you drop a ball.

now imagine you drop it many, many times from the same start location.
each time the ball will roll downhill, but it will follow a different path, just like this guy says.
https://www.youtube.com/watch?v=5cVLUPwrSmU

there are tiny imperfections that make the ball follow a different path; however, every time the ball will end at a minimum, a point from which the ground rises in every direction.

now imagine starting from different locations. now you have a big imperfection at the start, but the ball still goes downhill until it finds a new local minimum.

the thing is that, just like a mountain range has many high mountains,
there are also many low points, and all those are partial solutions.
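
The ball analogy can be reproduced with plain gradient descent on a made-up one-dimensional landscape. The function (x^2 - 1)^2 + 0.2 * x and the names `Height`, `Slope`, and `Descend` are assumptions chosen only because the function has two valleys of different depth; this is not any particular training setup.

```cpp
#include <cassert>

// Assumed toy landscape with two valleys: f(x) = (x^2 - 1)^2 + 0.2 x.
double Height(double x)
{
    const double a = x * x - 1.0;
    return a * a + 0.2 * x;
}

// Slope of the landscape: f'(x) = 4 x (x^2 - 1) + 0.2.
double Slope(double x)
{
    return 4.0 * x * (x * x - 1.0) + 0.2;
}

// Roll downhill from `start` with fixed-step gradient descent.
double Descend(double start, double stepSize, int steps)
{
    assert(stepSize > 0.0 && steps >= 0);
    double x = start;
    for (int i = 0; i < steps; ++i)
    {
        x -= stepSize * Slope(x); // always move against the slope
    }
    return x;
}
```

Starting at x = 2 with a step size of 0.01, the ball settles in the shallow valley near x = 0.97. Starting at x = -2, it settles in the deeper valley near x = -1.02. Both are valid resting points, but only one is the lowest, which is the sense in which a local minimum is a partial solution.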

you wouldn't believe the number of experiments out there that are fundamentally wrong
but still find a partial solution. in fact, even some of the tricks are based on introducing bugs in the process to force the search to change direction.
that's on the neural net side.
now when you apply bad data, you just get bad results, but that's what the user wants.
it is still the same old garbage in, garbage out.

JoeJ wrote:
Altman still does not know how talking works either. Otherwise, he would not talk so much bullshit.
hihihihi

yes, that's one of the bad guys who want to use the tool to create the most amount of damage.
He is not a good person and has a chip on his shoulder about Elon Musk,
he wants to be the next Elon Musk's mini-me.
https://www.youtube.com/watch?v=EuWMcl0bhu4

by **Julio Jerez** » Thu Jul 24, 2025 3:17 pm

ok, Joe, here is a proof of concept of that idea.
https://youtu.be/mVZn7E-jS6I

It may need some calibration to make it harder, but I wanted to test it before moving on.
To my surprise, the first results seem better than I expected.
you can see that the controller starts generating long trajectories just after the first epoch.
in fact, in the second epoch it learns to flip the direction of motion.
with the hard hinge, this effect happens after several hundred epochs.

anyway, I am still finishing the final touches for GPU, I need to speed the training :shock:

but this seems like a very good trick worth keeping. :mrgreen:

by **JoeJ** » Sat Aug 02, 2025 2:58 am

With such a big difference in training i guess the 'telescope leg' helps a lot with smoothness. Might try this too at some point. Feet getting in contact do cause a little shock wave for me too, but it seems to be no problem so far.

But i just noticed some other issues while trying to make a conveyor belt.
For that i have replaced my static floor box body with a kinematic body, then i set a velocity on the body once at creation. (Maybe i need to set velocity every frame...)

And i have some objects on the floor, but they all show issues:
A simple box moves as expected over the floor, but after some time it stops moving.
And worse, it starts swinging. Its contacts focus on one corner and the diagonally opposite corner.
There are no contacts generated at the other two corners, thus it oscillates like a hanging pendulum.

A simple inverted pendulum model from two boxes and one joint behaves similarly.
Initially it moves as expected, but then at the exact same time as the box above, it just stops moving and gets to rest.

The ragdoll model, made of capsules and joints of the same kind, never moves with the floor.
I make it fall before the other bodies stop, so many contacts are generated, but it does not move.

---

Another issue i have is that i can not disable collisions for the ragdoll feet bodies.
Sometimes it works, but mostly Newton ignores my setup and collisions are generated.
For a reliable workaround, i iterate all contacts per frame and remove them in model update.
(i do this only for the feet - the other bodies do generate contacts, and so the fallen ragdoll should move on the conveyor belt.)

I'll report if i find out more...

Edit: Setting conveyor velocity every frame has no effect. After 10-15 seconds everything stops moving.
Removing complex models and using just boxes also still shows the same problem.
