
Overview of the DM4CT benchmark. (a) The reconstruction pipeline, where representative diffusion and baseline methods are applied to measured sinograms using the same forward model. (b) The datasets used in the benchmark, including two simulated CT datasets (medical and industrial) and one real-world dataset acquired at a synchrotron facility. (c) The five simulation configurations used to evaluate robustness to limited views, noise, and ring artifacts. Two example FBP reconstructions under noise and ring artifact conditions are shown. (d) The evaluation metrics, including both qualitative (visual) and quantitative (image quality and computational efficiency) criteria.
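The pipeline in (a) can be summarized as a linear inverse problem: a sinogram y = Ax + n is measured through a forward operator A, and sparse-view acquisition makes the system underdetermined. A minimal sketch, using a random matrix as a hedged stand-in for the actual CT projector (the variable names are illustrative, not from the DM4CT codebase):

```python
import numpy as np

# Toy stand-in for the CT forward model: a random wide matrix A plays the
# role of the sparse-view projector (fewer measurements than pixels), and
# x_true is a flattened "image". All names here are illustrative.
rng = np.random.default_rng(0)
n_pixels, n_meas = 64, 32          # sparse-view: fewer rays than pixels
A = rng.standard_normal((n_meas, n_pixels)) / np.sqrt(n_meas)
x_true = rng.standard_normal(n_pixels)

# Noisy sinogram y = A x + n, mimicking the noisy configurations in (c).
noise = 0.01 * rng.standard_normal(n_meas)
y = A @ x_true + noise

# Minimum-norm least-squares baseline; with n_meas < n_pixels the system is
# underdetermined, so any reconstructor must supply a prior for the null space.
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
print("data residual:", np.linalg.norm(A @ x_ls - y))
print("reconstruction error:", np.linalg.norm(x_ls - x_true))
```

The data residual is tiny while the reconstruction error stays large: the measurements alone cannot pin down the null-space content, which is exactly the gap a diffusion prior is meant to fill.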

Abstract

Diffusion models have recently emerged as powerful priors for solving inverse problems. While computed tomography (CT) is theoretically a linear inverse problem, it poses many practical challenges. These include correlated noise, artifact structures, reliance on system geometry, and misaligned value ranges, which make the direct application of diffusion models more difficult than in domains like natural image generation. To systematically evaluate how diffusion models perform in this context and compare them with established reconstruction methods, we introduce DM4CT, a comprehensive benchmark for CT reconstruction. DM4CT includes datasets from both medical and industrial domains with sparse-view and noisy configurations. To explore the challenges of deploying diffusion models in practice, we additionally acquire a high-resolution CT dataset at a high-energy synchrotron facility and evaluate all methods under real experimental conditions. We benchmark ten recent diffusion-based methods alongside seven strong baselines, including model-based, unsupervised, and supervised approaches. Our analysis provides detailed insights into the behavior, strengths, and limitations of diffusion models for CT reconstruction. The real-world dataset is publicly available at zenodo.org/records/15420527, and the codebase is open-sourced at github.com/DM4CT/DM4CT.

Methods

Diffusion-based methods evaluated in DM4CT. Columns under Technique refer to implementation choices (e.g., latent-space diffusion or DDIM-based sampling). Columns under Reconstruction Strategy denote how measurement conditioning is incorporated, including data consistency gradient steering (DC-grad), separate optimization steps (DC-step), plug-and-play priors, and use of approximate pseudoinverse solutions. A ✓* indicates that the method takes only a single-step update toward the pseudoinverse solution. A ✓† indicates that data fidelity is enforced via a conjugate-gradient solve rather than a direct pixel-space optimization step.

Method Year Latent DDIM DC-grad DC-step Plug-and-Play Pseudo Inv Variational Bayes
MCG 2022 *
DPS 2023
PSLD 2023
PGDM 2023
DDS 2024
Resample 2024
DMPlug 2024
Reddiff 2024
HybridReg 2025
DiffStateGrad 2025
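The two dominant conditioning families in the table can be contrasted on a toy problem. In DC-grad (DPS/MCG style), the iterate takes one gradient step on the data-fit term; in DC-step (ReSample/DDS style), a data-fidelity subproblem is solved to near-optimality between denoising steps. A hedged numpy sketch, with A, y, and x0 as synthetic stand-ins rather than the benchmark's operators:

```python
import numpy as np

# Synthetic setup: x0 stands in for the current diffusion iterate (the
# "denoised estimate"), A and y for the projector and sinogram.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 40))
x_true = rng.standard_normal(40)
y = A @ x_true
x0 = rng.standard_normal(40)       # pretend this came from the diffusion prior

# DC-grad: a single gradient step on ||A x - y||^2 with step size eta.
eta = 1e-3                         # illustrative value
x_grad = x0 - eta * 2 * A.T @ (A @ x0 - y)

# DC-step: solve the data-fidelity subproblem (to near machine precision here;
# DDS-style methods use a few conjugate-gradient iterations instead).
x_step = np.linalg.lstsq(A, y, rcond=None)[0]

print("residual after DC-grad:", np.linalg.norm(A @ x_grad - y))
print("residual after DC-step:", np.linalg.norm(A @ x_step - y))
```

One gradient step only nudges the residual down, while the optimization step drives it to (numerical) zero; the trade-off is that aggressive data fitting can override the prior, as explored below.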

Main Visualization

Reconstruction results of diffusion-based and other established methods. Top: medical dataset (config iv, 80 angles with noise & ring artifacts); middle: industrial dataset (config ii, 20 angles with mild noise); bottom: real-world synchrotron dataset (60 angles). Red and green boxes show zoom-in regions. PSNR and SSIM appear in the top-left and top-right of each image. A dash (–) indicates that the method exceeded the 40 GB GPU memory limit for single-slice reconstruction and was therefore not run. Images are consistently linearly rescaled across methods to improve contrast.



Some Interesting Results

Tradeoff between Prior and Data Consistency & Reconstruction Uncertainty

(a) Impact of data consistency step size η on PSNR and data fit in DPS. Moderate values improve both, while large η disrupts denoising and causes collapse. Visual examples in the plot highlight the transition from prior-dominated to noise-dominated reconstructions. (b) Mean and standard deviation of ten MCG reconstructions conditioned on the same real measurement. Note that the real measurement used in (b) is different from the one used for (a).
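The collapse seen at large η in (a) is already visible on a plain quadratic data-fit term: gradient descent on ||Ax − y||² is stable only while η stays below the reciprocal of the largest eigenvalue of AᵀA. A toy illustration (synthetic operator with a known spectrum; the η values are illustrative, not the ones used in DPS):

```python
import numpy as np

# Diagonal operator with known spectrum: eigenvalues of A^T A lie in
# [0.25, 4.0], so the stability limit for the step size is 1/4.
rng = np.random.default_rng(2)
A = np.diag(np.linspace(0.5, 2.0, 10))
y = rng.standard_normal(10)
x0 = rng.standard_normal(10)

def residual_after(eta, steps=20):
    """Run `steps` gradient steps on ||A x - y||^2 and return the residual."""
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * 2 * A.T @ (A @ x - y)   # gradient of ||A x - y||^2
    return np.linalg.norm(A @ x - y)

print("moderate eta=0.1 :", residual_after(0.1))   # shrinks the residual
print("too large eta=0.5:", residual_after(0.5))   # blows up (0.5 > 1/4)
```

In DPS the same mechanism interacts with the denoising schedule, which is why moderate η improves both PSNR and data fit while large η disrupts denoising entirely.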

Prior Contribution and Consistency: A Null Space Perspective

Decomposition of reconstructions into range and null space components for different data consistency strategies under config (i). For each method, the full reconstruction is shown on the left, with zoomed-in red insets of the range component in the center and the corresponding null component on the right. The top-left of each null component indicates its relative L2 energy as a percentage of the total reconstruction, reflecting the extent of content introduced by the prior. Zoom in for details.
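The decomposition itself is a pseudoinverse projection: x_range = A⁺Ax is the part of the reconstruction determined by the measurements, and x_null = (I − A⁺A)x is content the prior must supply. A minimal sketch at toy sizes (random A standing in for the projector):

```python
import numpy as np

# Underdetermined toy operator: 20 measurements, 50 unknowns.
rng = np.random.default_rng(3)
A = rng.standard_normal((20, 50))
x = rng.standard_normal(50)              # stand-in for a reconstruction

# Range/null split via the Moore-Penrose pseudoinverse.
A_pinv = np.linalg.pinv(A)
x_range = A_pinv @ A @ x                 # measurement-determined content
x_null = x - x_range                     # prior-supplied content

# Relative L2 energy of the null component, as annotated in the figure.
null_energy = 100 * np.linalg.norm(x_null) ** 2 / np.linalg.norm(x) ** 2
print(f"null-space energy: {null_energy:.1f}%")
print("A @ x_null norm:", np.linalg.norm(A @ x_null))   # ~0 by construction
```

Because A(AA⁺A − A) = 0, the null component is invisible to the measurements: two reconstructions with identical data fit can differ arbitrarily in x_null, which is precisely what the figure visualizes per method.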

Data Consistency for Latent Diffusion: Gradient or Optimization?

Reconstruction results of latent diffusion methods using only data consistency gradients (PSLD) versus additional optimization steps (ReSample) under noise-free (40 projections) and noisy (80 projections) scenarios. ADMM-PDTV serves as a classical model-based baseline that applies data consistency optimization with a heuristic prior. Red insets show magnified regions.
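For latent methods the data-fit term is chained through the decoder, and the same gradient-versus-optimization contrast applies in latent space. A hedged sketch with a linear "decoder" D as a stand-in for the VAE decoder (a real decoder is nonlinear, so the optimization step below would become an inner minimization rather than a closed-form solve):

```python
import numpy as np

# Synthetic stand-ins: A is the measurement operator, D a linear "decoder"
# mapping a 16-dim latent z to a 64-dim image x = D z.
rng = np.random.default_rng(4)
A = rng.standard_normal((30, 64))
D = rng.standard_normal((64, 16))
z0 = rng.standard_normal(16)             # current latent iterate
y = A @ D @ rng.standard_normal(16)      # consistent measurements
AD = A @ D                               # composed operator

# PSLD-style: one gradient step on ||A D z - y||^2, chained through D.
eta = 1e-5                               # illustrative step size
z_grad = z0 - eta * 2 * AD.T @ (AD @ z0 - y)

# ReSample-style: solve the latent data-fidelity subproblem exactly.
z_opt = np.linalg.lstsq(AD, y, rcond=None)[0]

print("residual after gradient step:", np.linalg.norm(AD @ z_grad - y))
print("residual after optimization :", np.linalg.norm(AD @ z_opt - y))
```

The optimization step enforces consistency exactly at the cost of an inner solve per denoising step, mirroring the quality/compute trade-off between ReSample and PSLD observed in the figure.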

Computational Efficiency

(a) Reconstruction time and GPU memory, with times measured on the medical dataset. (b) Training time and GPU memory for pixel-space diffusion, latent diffusion, and SwinIR.

Fine-Tuning Existing Natural Image Encoders (SDXL)

Comparison of autoencoder reconstruction, unconditional diffusion generation, and CT reconstruction across different autoencoders. The VQ-VAE used in our benchmark produces consistently superior representations and reconstructions, while SDXL AutoencoderKL variants exhibit reduced stability and quality.

Comparison between Training Stages on Reconstruction Performance

Visualization of unconditional generation and CT reconstruction using different stages of the trained diffusion models. The early-stage model produces noisy unconditional generations, yet it yields the sharpest structures and the best fine-detail recovery for CT reconstruction.

Acknowledgments

The authors acknowledge financial support by the European Union H2020-MSCA-ITN-2020 under grant agreement no. 956172 (xCTing). JS is also supported by grants from the Dutch Research Council under grant nos. ENWSS.2018.003 (UTOPIA) and NWA.1160.18.316 (CORTEX). The computation in this work is supported by the SURF Snellius HPC infrastructure under grant no. EINF-15060. Synchrotron data acquisition was financially supported by the Dutch Research Council, project no. 016.Veni.192.23.

BibTeX

@inproceedings{
shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}