Editor’s note: Today we hear from Perry Nightingale, SVP of Creative AI at WPP, about the workflow that cuts training time for humanoid robots from days to minutes, plus access to the open-source code to do it yourself.

Robots are pushing the boundaries of what content creators and directors can capture. These technologies have become critical in the film industry because they open up new possibilities for controlled camera moves in locations where traditional methods would be unsafe or infeasible. 

We’ve found that programming these robots is arguably as technical and complicated as the shoots they’re being tested on. To achieve our goals, we needed a hardware stack as advanced as the robots we’d be programming. In this post, we explore how WPP used the new G4 VM instance powered by NVIDIA RTX PRO 6000 Blackwell on Google Cloud, which is a great fit for the unique challenges of training physical AI.

Robotics is always complex, especially in the environments we were working in. Thanks to the software stack we were using, we were able to reduce training cycles from 24 hours to less than one hour.

Below, we detail the specific reinforcement learning (RL) workflow, the challenges of the “sim-to-real” gap, and the infrastructure that makes it possible. 

The lessons learned here apply far beyond entertainment. Our process for mastering complex, natural movement on a film set can be replicated across industries to overcome the massive computational complexity of training robots.

Where we started: redefining the agency model

WPP is one of the world’s largest marketing organizations, handling $70 billion of media for enterprise clients. For us, building AI into our production workflows meant fundamentally redefining the agency model in terms of processes, relationships, and tools. Notably, last year we launched WPP Open, our proprietary AI operating platform, where we’re able to take the best of Gemini’s multimodal intelligence, along with other models, and incorporate them directly into every creative step.

The results were immediate. For one of our clients, Verizon, we built an AI-infused promo pipeline that delivered 15 videos in 70% less time, with 50% to 70% efficiency gains across the production cycle. WPP Open has proven so effective for our teams that we’ve begun offering it to our clients, so they can tackle projects in new ways and we can collaborate faster and better.

WPP Open has also challenged and inspired us to look for more ambitious applications of AI. With the latest advancements in Google Cloud’s AI Infrastructure, we saw an opportunity to tackle more complex problems at the cutting edge of creative development.

Why teach a robot to dance?

Our robotics work started with teaching a machine to dance, and not just because we knew a dancing robot would make for a compelling demo. Dance, along with martial arts, is generally accepted as the cutting edge of complex human motion. Mastering these movements is a critical step toward achieving natural robotic motion.

For our benchmarking project, we trained our robot to perform a dance sequence captured in a previous project with Universal Music Group.

The workflow

To achieve this complex motion, we needed a workflow that could iterate fast.

We captured human motion with the OptiTrack mocap system and retargeted it to an official OpenUSD digital twin of the robot. This is a complex engineering challenge: a human body has over 200 degrees of freedom, compared to just 29 on the robot. So our team needed to remap the human skeletal data to the far more constrained physical structure of our robot, creating a sophisticated 3D model.

  • MuJoCo, an open source physics engine by Google DeepMind, was a critical piece of simulation software that validated our accuracy continuously, in real-time.
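The remapping step described above can be pictured, in its simplest form, as a per-joint mapping with clamping to the robot's joint limits. The sketch below is only an illustration: the joint names, the mapping table, and the limits are hypothetical, and a production retargeting pipeline would typically solve for the whole-body pose rather than copy angles one joint at a time.

```python
# Hypothetical mapping: robot joint -> (mocap joint, (lower, upper) limit in radians).
# These names and limits are placeholders, not the real robot's specification.
ROBOT_JOINTS = {
    "left_knee": ("LeftLeg", (0.0, 2.4)),
    "right_knee": ("RightLeg", (0.0, 2.4)),
    "waist_yaw": ("Spine", (-2.6, 2.6)),
}


def retarget_frame(human_angles):
    """Map one frame of human joint angles onto the robot's smaller joint set.

    Human joints with no robot counterpart are dropped; missing human joints
    default to 0.0; every angle is clamped to the robot joint's limits.
    """
    robot_frame = {}
    for robot_joint, (human_joint, (lower, upper)) in ROBOT_JOINTS.items():
        angle = human_angles.get(human_joint, 0.0)
        robot_frame[robot_joint] = min(max(angle, lower), upper)
    return robot_frame


# One mocap frame: the knee bend exceeds the (hypothetical) robot limit,
# and the neck has no counterpart on the robot, so it is ignored.
frame = retarget_frame({"LeftLeg": 3.0, "Spine": -0.5, "Neck": 0.3})
```

Clamping is what makes the reduced degrees of freedom visible: any motion the robot cannot physically reproduce is either discarded or pulled back inside its range.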

The workflow then moves to reinforcement learning. In our previous environment using single on-premises GPUs, training took ten hours. This time, we used G4 VMs, together with the NVIDIA Isaac Sim image on Google Cloud Marketplace. Google Cloud’s unique innovations, such as… with G4 enabled us to utilize a P2P topology that moves data directly between GPUs without the bottleneck of central processing.

  • Using Google Cloud’s AI Hypercomputer, we saw speed increases in excess of 10x, taking training times down to less than one hour.

Over the training run, the digital twins of the robots are rewarded for getting closer to the intended motion sequence while under the simulated effects of gravity, momentum, and friction, plus small simulated “pushes” that mimic forces the robots might encounter in the real world.
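To make the reward idea concrete, here is a minimal sketch of a motion-tracking reward and a random push sampler. It assumes an exponential tracking term, a common choice in motion-imitation RL; the post does not state the actual reward formula, and the function names, `sigma`, and `max_force` values are illustrative assumptions.

```python
import math
import random


def tracking_reward(joint_pos, ref_pos, sigma=0.5):
    """Return a reward in (0, 1] that grows as the pose nears the reference.

    Uses exp(-squared_error / sigma^2), so a perfect match scores 1.0.
    """
    sq_err = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_pos))
    return math.exp(-sq_err / (sigma ** 2))


def random_push(rng, max_force=50.0):
    """Sample a small horizontal push (in newtons) with a random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    magnitude = rng.uniform(0.0, max_force)
    return (magnitude * math.cos(angle), magnitude * math.sin(angle))


reference = [0.1, 0.2, 0.3]
close = tracking_reward([0.1, 0.25, 0.3], reference)
far = tracking_reward([0.9, -0.4, 1.0], reference)
push_x, push_y = random_push(random.Random(0))
```

A pose near the reference scores close to 1, and a pose far from it scores close to 0, which gives the learning algorithm a smooth signal to climb even while the simulated robot is being pushed off balance.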

As the simulation training begins, the robots fall almost instantly (top). After about 3 billion simulations, the robots have learned the complex dance sequence (bottom).

Bridging the “sim-to-real” gap

The “sim-to-real” gap is one of the toughest challenges in robotics. A policy that works perfectly in simulation often fails in the physical world due to unmodeled physics or sensor noise. A foot might land differently due to small changes in carpet friction or a gap in the floor.

We ran billions of simulations to develop the reinforcement learning model, which was then condensed into an ONNX policy and deployed to the robots. These policies take real-time observations from the robot — IMU data, joint positions, etc. — and return the necessary movements.

Through high-volume simulation, the humanoid learned how to respond to these small changes and determine what move to make next to keep the sequence on track. Again, we used MuJoCo for critical real-time validation, ensuring that the robot’s ability to adapt in simulation translated to safety and stability in the real world.
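One common way to expose a policy to this kind of variation is domain randomization, where each simulated episode draws slightly different physics and sensor parameters. The post does not say which parameters were varied or over what ranges, so everything in the sketch below (names and ranges alike) is a hypothetical illustration.

```python
import random

# Hypothetical ranges; the values used in the actual training runs are not published here.
RANDOMIZATION_RANGES = {
    "floor_friction": (0.4, 1.2),
    "floor_height_offset_m": (-0.01, 0.01),
    "imu_noise_std": (0.0, 0.02),
}


def sample_episode_params(rng):
    """Draw one set of physics and sensor parameters for a single episode."""
    return {
        name: rng.uniform(lower, upper)
        for name, (lower, upper) in RANDOMIZATION_RANGES.items()
    }


def noisy_reading(rng, true_value, noise_std):
    """Add zero-mean Gaussian noise to a simulated sensor reading."""
    return true_value + rng.gauss(0.0, noise_std)


rng = random.Random(42)
params = sample_episode_params(rng)
gyro_z = noisy_reading(rng, 0.3, params["imu_noise_std"])
```

Because every episode sees a slightly different floor and slightly different sensors, the policy cannot overfit to one idealized world and has to learn corrections that hold across the whole range.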

What’s next

To accompany this project, Unitree has released their in-house reinforcement learning code as a sample project on GitHub. Alongside the NVIDIA Isaac Sim image on Google Cloud Marketplace, this means almost instant access to advanced robotic motion research.

Check it out for yourself:

Author: wp_admin - This post was originally published on this site