The approach uses a set of point clouds equipped with semantic logits and 3D Gaussians to represent busy city streets.
A group of researchers from Zhejiang University and Li Auto has introduced Street Gaussians, a new method for modeling busy urban streets. It represents a scene as a set of point clouds equipped with semantic logits and 3D Gaussians, each associated with either a foreground vehicle or the background.
To overcome limitations in existing approaches, the method models the motion of foreground vehicles by optimizing each object's point cloud together with its tracked poses, and it captures their changing appearance with a dynamic spherical harmonics model. According to the team, this explicit representation makes it easy to compose object vehicles with the background, which enables scene editing operations. The team also reports rendering speeds of 133 frames per second at a resolution of 1066×1600, achieved within just half an hour of training.
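To give a rough sense of what composing objects and background can look like, here is a minimal Python sketch. It is not the authors' code: it assumes each vehicle stores its Gaussian centers in its own local frame plus a per-frame tracked pose (a rotation and a translation), and all names (`yaw_rotation`, `object_to_world`, `compose_scene`, the `means` and `poses` keys) are hypothetical. It handles only the centers and leaves out covariances, semantic logits, the spherical harmonics appearance, and the pose optimization itself.

```python
import math

def yaw_rotation(angle):
    """3x3 rotation about the vertical (z) axis, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def object_to_world(local_means, rotation, translation):
    """Apply x_world = R @ x_local + t to every Gaussian center."""
    world = []
    for p in local_means:
        world.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return world

def compose_scene(background_means, objects, frame):
    """Merge background centers with every object tracked in `frame`."""
    scene = list(background_means)
    for obj in objects:
        pose = obj["poses"].get(frame)  # assumed: pose from a tracker
        if pose is None:
            continue  # the object is not visible in this frame
        rotation, translation = pose
        scene.extend(object_to_world(obj["means"], rotation, translation))
    return scene

# Toy data: four background centers and one car with two centers,
# tracked only in frame 0 (turned 90 degrees, 10 units along x).
background = [(0.0, 0.0, 0.0)] * 4
car = {
    "means": [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "poses": {0: (yaw_rotation(math.pi / 2), (10.0, 0.0, 0.0))},
}
scene = compose_scene(background, [car], frame=0)
```

Because every object is an explicit set of points with its own pose, editing the scene in this sketch comes down to changing or dropping an entry in `poses` before composing, which is the kind of operation the explicit representation is said to make easy.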
"The proposed method is evaluated on multiple challenging benchmarks, including KITTI and Waymo Open datasets," reads the research paper. "Experiments show that the proposed method consistently outperforms state-of-the-art methods across all datasets. Furthermore, the proposed representation delivers performance on par with that achieved using precise ground-truth poses, despite relying only on poses from an off-the-shelf tracker."
And here are some more examples of dynamic 3D Gaussian scenes achieved with this method:
You can learn more about Street Gaussians here and access the code over here.