This is the real story buried under the simulation angle. If you can generate
reliable 3D LiDAR from 2D video, every dashcam on earth becomes training data.
Every YouTube driving video, every GoPro clip, every security camera feed.
Waymo's fleet is ~700 cars. The internet has millions of hours of driving
footage. This technique turns the entire internet into a sensor suite. That's a bigger deal than the simulation itself.
It's not unheard of: there are a handful [0] of metric monodepth methods whose output is not unlike a really inaccurate 3D lidar, though theirs certainly looks SOTA.
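For anyone wondering what "not unlike a really inaccurate 3D lidar" means in practice: once a model gives you per-pixel depth in meters, getting a point cloud is just pinhole back-projection. A minimal sketch, assuming you already have a metric depth map and the camera intrinsics. The function name `depth_to_points`, the `max_range` cutoff, and the intrinsics values are all made up for illustration, not taken from any particular method.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_range: float = 80.0,  # hypothetical cutoff, meters
) -> List[Point]:
    """Back-project a metric depth map into camera-frame 3D points.

    depth[v][u] is the depth in meters at pixel column u, row v.
    fx, fy, cx, cy are pinhole intrinsics in pixels (assumed known).
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Skip invalid (zero, negative, NaN) or too-distant pixels.
            if not (0.0 < z <= max_range):
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map: one invalid pixel (0.0) and one beyond max_range.
toy_depth = [[10.0, 0.0], [20.0, 100.0]]
cloud = depth_to_points(toy_depth, fx=1000.0, fy=1000.0, cx=0.5, cy=0.5)
print(len(cloud))
```

The hard part is obviously the depth estimate and knowing the intrinsics for random internet footage, not this step. Any error in the predicted depth shows up directly as error in every point.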
IMO, access to DeepMind and Google infra is a hugely understated advantage for Waymo, one that no competitor can replicate.