DiffusionLight thinks about lighting and renders a chrome ball for you.
Image credit: Pakkapon Phongthawee, Worameth Chinchuthakun et al.
Lighting is no easy task, so wouldn't it be nice to have someone (or something) do it for you? There are some tools to aid you with this, and DiffusionLight is one of them. Presented by researchers from VISTEC, Tokyo Tech, Google Research, Stability AI, and Pixiv, it can estimate lighting in a single image by rendering a chrome ball into the picture.
To achieve its results, the method leverages diffusion models trained on billions of standard images, whose learned prior helps keep the estimated lighting realistic.
"Despite its simplicity, this task remains challenging: the diffusion models often insert incorrect or inconsistent objects and cannot readily generate images in HDR format. Our research uncovers a surprising relationship between the appearance of chrome balls and the initial diffusion noise map, which we utilize to consistently generate high-quality chrome balls."
So how does it work? According to the paper, titled DiffusionLight: Light Probes for Free by Painting a Chrome Ball, the method takes an input image and estimates the scene's lighting as an HDR environment map. It inpaints a chrome ball into the image using a diffusion model and then unwraps the ball into the map.
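To give a feel for the unwrapping step, here is a minimal sketch of how a pixel on a mirror ball can be mapped to a direction in an environment map. It assumes an orthographic camera looking down the -z axis and a particular equirectangular layout; the function names and conventions are illustrative choices for this sketch, not the authors' code.

```python
import math

def ball_pixel_to_direction(x, y):
    """Reflect the viewing ray off a mirror ball at normalized pixel (x, y).

    x and y lie in [-1, 1] with the ball centered at the origin and y pointing up.
    Assumes an orthographic camera looking down the -z axis (sketch assumption).
    Returns the reflected unit direction, or None outside the ball's silhouette.
    """
    r2 = x * x + y * y
    if r2 > 1.0:
        return None
    nz = math.sqrt(1.0 - r2)  # the surface normal is (x, y, nz)
    # Mirror reflection r = v - 2 (v . n) n with view direction v = (0, 0, -1)
    return (2.0 * nz * x, 2.0 * nz * y, 2.0 * nz * nz - 1.0)

def direction_to_equirect(d):
    """Map a unit direction to (u, v) in [0, 1] on an equirectangular map.

    Hypothetical convention: u = 0.5 points back toward the camera (+z),
    and v = 0 is straight up (+y).
    """
    dx, dy, dz = d
    lon = math.atan2(dx, dz)
    lat = math.asin(max(-1.0, min(1.0, dy)))
    return (lon / (2.0 * math.pi) + 0.5, 0.5 - lat / math.pi)
```

Under these assumptions, the center of the ball reflects the direction back toward the camera, while pixels near the rim reflect what lies behind the ball. That is why a single chrome ball can cover almost the entire sphere of incoming light, although the rim is heavily compressed.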
The setup is based on Stable Diffusion XL with a depth-conditioned ControlNet, which takes an image, its depth map, and an inpainting mask as input. To place the chrome ball, the team predicts a depth map from the image using an off-the-shelf depth prediction network and then paints a circle at the center of both the depth map, filled with the value closest to the camera, and the inpainting mask.
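The circle-painting step can be sketched in a few lines. The example below uses plain Python lists to stay dependency-free (a real pipeline would work on image arrays), and it assumes that larger depth values mean "nearer to the camera", as in inverse-depth maps. The function name and that convention are assumptions of this sketch.

```python
def paint_ball_region(depth, radius):
    """Paint a centered circle into a depth map and build the matching mask.

    depth:  2D list of floats; larger values are assumed to be nearer (sketch assumption).
    radius: circle radius in pixels.
    Returns (new_depth, mask) and leaves the input depth map untouched.
    """
    h, w = len(depth), len(depth[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    nearest = max(max(row) for row in depth)  # value closest to the camera
    new_depth = [row[:] for row in depth]
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                new_depth[y][x] = nearest  # the ball sits in front of everything
                mask[y][x] = 1             # region the diffusion model may repaint
    return new_depth, mask
```

The modified depth map tells the depth-conditioned model that something round sits in front of the scene, and the mask restricts inpainting to that same circle so the rest of the image is left alone.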
"We feed them along with the input image and the prompt "a perfect mirrored reflective chrome ball sphere" to the diffusion model. We make two improvements to the above base model. First, we propose a technique called 'iterative inpainting' to locate a neighborhood of good initial noise maps that lead to consistent and high-quality chrome balls. Second, to further improve the generated appearance and generate multiple LDR images for exposure bracketing, we fine-tune the diffusion model using LoRA on a set of synthetically generated chrome balls with varying exposures."
This technique can estimate lighting for indoor and outdoor scenes, close-up shots, paintings, and photos of human faces. Using the environmental light estimates, the researchers could seamlessly insert 3D objects into an existing picture.
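The exposure bracketing mentioned in the quote above is what turns several LDR chrome balls into one HDR result. The sketch below shows the general idea for a single pixel: undo the display gamma, divide out each exposure's scale, and ignore samples that are clipped. It is a generic merge written for illustration, not the paper's exact procedure; the gamma of 2.2, the saturation threshold, and the exposure values used in the test are all assumptions.

```python
def expose(radiance, ev, gamma=2.2):
    """Simulate an LDR sample: scale by 2**ev, clip to 1.0, apply display gamma."""
    return min(1.0, radiance * 2.0 ** ev) ** (1.0 / gamma)

def merge_exposures(ldr_values, evs, gamma=2.2, saturation=0.99):
    """Estimate linear radiance for one pixel from bracketed LDR samples.

    ldr_values: samples in [0, 1], one per exposure.
    evs:        exposure values (0 is the base exposure, negative is darker).
    Clipped samples are ignored; if every sample is clipped, the darkest
    exposure is used, which only gives a lower bound on the radiance.
    """
    total, count = 0.0, 0
    for value, ev in zip(ldr_values, evs):
        if value >= saturation:
            continue  # a clipped sample carries no usable radiance information
        total += (value ** gamma) / (2.0 ** ev)
        count += 1
    if count == 0:
        i = min(range(len(evs)), key=lambda k: evs[k])
        return (ldr_values[i] ** gamma) / (2.0 ** evs[i])
    return total / count
```

A bright light source that is blown out at the base exposure is still recoverable from a darker exposure, which is why generating the ball at several exposures matters for HDR output.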
You can read more about the details of the method in the paper and even try it out for yourself.