26 - 30 April 2026
National Harbor, Maryland, US
Conference 14029 > Paper 14029-22

Inverse black-box diffusion modeling: multimodal parameter estimation from synthetically-rendered imagery

29 April 2026 • 11:10 AM - 11:30 AM EDT | National Harbor 7

Abstract

Inverse problems in deep learning involving the inference of initial causal parameters from a final output present a significant challenge, particularly when the generative process is a "black-box" function. These problems are often ill-posed and non-bijective: a single output can map to multiple valid sets of input parameters. The solution space is therefore inherently multimodal, rendering point-estimate methods fundamentally insufficient. We propose a conditional denoising diffusion probabilistic model (DDPM) that learns to sample from the full posterior distribution of plausible rendering parameters given an observed image. Our architecture couples a frozen ResNet-50 encoder, which compresses each input image into a 2048-dimensional context vector, with a conditional 1D U-Net whose multi-headed cross-attention layers fuse visual context into every denoising step. We demonstrate the framework on a practical inverse rendering use case: recovering nine rendering parameters (camera position, sun position, sun color, sun intensity, and trailer orientation) used to produce a 3D vehicle scene in Blender via an API developed in-house. Trained on synthetically generated image-parameter pairs, the model produces, from a single unseen image, posterior distributions that consistently encapsulate the ground-truth values. Re-rendering from the posterior mean yields images with high visual fidelity to the originals, confirming the practical utility of the recovered parameters. We further provide an honest assessment of failure modes, including degraded performance under simulation-to-real domain shift. These results establish conditional diffusion models as a powerful and principled approach for multimodal parameter estimation in domains where the forward simulator is complex, non-differentiable, and opaque.
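The conditioning scheme described above can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the toy denoiser below treats each of the nine rendering parameters as a token, injects a timestep embedding, and fuses a 2048-dimensional image context via multi-head cross-attention, mirroring the coupling of the frozen ResNet-50 encoder and the conditional denoiser at a conceptual level. The context vector is random here (standing in for ResNet-50 features), and all layer sizes, names, and the single training step are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionDenoiser(nn.Module):
    """Toy denoiser: predicts the noise added to a 9-dim parameter vector,
    conditioned on a 2048-dim image context via multi-head cross-attention.
    (Hypothetical stand-in for the paper's conditional 1D U-Net.)"""
    def __init__(self, n_params=9, ctx_dim=2048, d_model=64, n_heads=4, n_steps=1000):
        super().__init__()
        self.param_proj = nn.Linear(1, d_model)      # each scalar parameter -> one token
        self.ctx_proj = nn.Linear(ctx_dim, d_model)  # project image features to model width
        self.time_emb = nn.Embedding(n_steps, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(d_model, 1)

    def forward(self, x_t, t, ctx):
        # x_t: (B, 9) noisy parameters; t: (B,) timesteps; ctx: (B, 2048) image context
        tokens = self.param_proj(x_t.unsqueeze(-1)) + self.time_emb(t).unsqueeze(1)
        kv = self.ctx_proj(ctx).unsqueeze(1)         # context as a single key/value token
        fused, _ = self.attn(tokens, kv, kv)         # cross-attention fuses visual context
        return self.out(fused).squeeze(-1)           # predicted noise, shape (B, 9)

# One DDPM training step on synthetic data. In the paper the context comes
# from a frozen ResNet-50; here a random vector stands in for those features.
torch.manual_seed(0)
model = CrossAttentionDenoiser()
betas = torch.linspace(1e-4, 0.02, 1000)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)        # cumulative noise schedule

params = torch.rand(8, 9)                            # ground-truth rendering parameters
ctx = torch.randn(8, 2048)                           # stand-in for ResNet-50 features
t = torch.randint(0, 1000, (8,))
noise = torch.randn_like(params)
# Forward diffusion: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
x_t = alpha_bar[t].sqrt().unsqueeze(1) * params \
    + (1.0 - alpha_bar[t]).sqrt().unsqueeze(1) * noise

loss = nn.functional.mse_loss(model(x_t, t, ctx), noise)  # standard epsilon-prediction loss
loss.backward()
```

At inference time, repeatedly denoising from pure Gaussian noise with different random seeds (all conditioned on the same image context) yields samples from the learned posterior over rendering parameters, which is what makes the approach multimodal rather than a point estimator.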

Presenter

Alexander Li
Massachusetts Institute of Technology (United States)
Mr. Li is a second-year undergraduate student studying AI and mathematics at MIT. He is a current Air Force ROTC cadet with academic interests including computer vision, ML engineering, and full-stack development. When not pursuing academics, he can be found fencing on the MIT Men's Fencing team or singing a cappella as a member of the MIT Logs.
Application tracks: AI/ML
Presenter/Author
Alexander Li
Massachusetts Institute of Technology (United States)
Author
Jared Augsburger
Air Force Research Lab. (United States)
Author
Nathan Jones
Air Force Research Lab. (United States)