Image: Previous high-order solvers are unstable for guided sampling. Samples are generated with pre-trained DPMs on ImageNet 256×256 at a classifier guidance scale of 8.0, varying the sampler (and the solver order) under a small number of function evaluations; DDIM is used with dynamic thresholding. The proposed DPM-Solver++ (detailed in Algorithm 2 of the paper) generates better samples than the first-order DDIM, while the other high-order samplers are worse than DDIM.
Credit: Beijing Zhongke Journal Publishing Co. Ltd.
Diffusion probabilistic models (DPMs) have achieved impressive success on various tasks, such as high-resolution image synthesis, image editing, text-to-image generation, voice synthesis, 3D generation, molecule generation, video generation and data compression. Compared with other deep generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs), DPMs can achieve even better sample quality by leveraging an essential technique called guided sampling, which uses additional guidance models to improve the sample fidelity and the condition-sample alignment. With guided sampling, DPMs in text-to-image and image-to-image tasks can generate high-resolution photorealistic and artistic images that closely match the given condition, bringing a new trend in artificial intelligence art.
The sampling procedure of DPMs gradually removes the noise from pure Gaussian random variables to obtain clear data, which can be viewed as discretizing either the diffusion stochastic differential equations (SDEs) or the diffusion ordinary differential equations (ODEs) defined by a parameterized noise prediction model or data prediction model. Guided sampling of DPMs can also be formalized with such discretization by combining an unconditional model with a guidance model, where a hyperparameter controls the scale of the guidance model (i.e., guidance scale). The commonly-used method for guided sampling is denoising diffusion implicit models (DDIM), which is proven as a first-order diffusion ODE solver, and it generally needs 100 to 250 steps of large neural network evaluations to converge, which is time-consuming.
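As an illustration of how the unconditional model and the guidance model are combined, the classifier-free variant of guided sampling simply blends the conditional and unconditional noise predictions under a guidance-scale hyperparameter. The following sketch uses our own function and variable names, not the paper's:

```python
import numpy as np

def guided_noise_prediction(eps_cond, eps_uncond, scale):
    """Classifier-free-style guidance: blend the conditional and
    unconditional noise predictions. scale = 1.0 recovers the purely
    conditional model; larger scales (e.g., 8.0) push samples harder
    toward the given condition."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy example: with scale 8.0 the conditional direction is amplified 8x.
eps_c = np.array([1.0, -0.5])
eps_u = np.array([0.0, 0.0])
print(guided_noise_prediction(eps_c, eps_u, 8.0))
```

Large scales amplify the conditional direction, which is exactly the regime where the paper observes high-order solvers becoming unstable.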
Dedicated high-order diffusion ODE solvers can generate high-quality samples in 10 to 20 steps for sampling without guidance. However, their effectiveness for guided sampling has not been carefully examined before. In this work published in Machine Intelligence Research, researchers demonstrate that previous high-order solvers for DPMs generate unsatisfactory samples for guided sampling, even worse than the simple first-order solver DDIM. Researchers identify two challenges of applying high-order solvers to guided sampling: 1) The large guidance scale narrows the convergence radius of high-order solvers, making them unstable; and 2) the converged solution does not fall into the same range as the original data (a.k.a. "train-test mismatch").
Based on the observations, researchers propose DPM-Solver++, a training-free fast diffusion ODE solver for guided sampling. Researchers find that the parameterization of the DPM critically impacts the solution quality. Subsequently, they solve the diffusion ODE defined by the data prediction model, which predicts the clean data given the noisy ones. Researchers derive a high-order solver for solving the ODE with the data prediction parameterization, and adopt dynamic thresholding methods to mitigate the train-test mismatch problem. Furthermore, researchers develop a multistep solver which uses smaller step sizes to address instability.
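A minimal sketch of these two ingredients, the data-prediction parameterization and dynamic thresholding, is shown below. Function names and the percentile value are ours, chosen only to illustrate the kind of hyperparameter dynamic thresholding uses:

```python
import numpy as np

def data_prediction(x_t, eps_pred, alpha_t, sigma_t):
    """Recover a clean-data estimate from a noise prediction:
    x_t = alpha_t * x0 + sigma_t * eps  =>  x0 = (x_t - sigma_t * eps) / alpha_t."""
    return (x_t - sigma_t * eps_pred) / alpha_t

def dynamic_threshold(x0, percentile=99.5):
    """Dynamic thresholding: clip the predicted data to [-s, s], where s
    is a high percentile of |x0| (but at least 1), then rescale into
    [-1, 1] so the sample stays in the range of the training data."""
    s = max(np.percentile(np.abs(x0), percentile), 1.0)
    return np.clip(x0, -s, s) / s
```

Because the solver works directly on the data prediction, the thresholding step can be applied after every model evaluation, which is what mitigates the train-test mismatch.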
It can be observed from the figures in the paper that DPM-Solver++ can generate high-quality samples in only 15 steps, which is much faster than all the previous training-free samplers for guided sampling. Researchers’ additional experimental results show that DPM-Solver++ can generate high-fidelity samples and almost converge within only 15 to 20 steps, for a wide variety of guided sampling applications, including both pixel-space DPMs and latent-space DPMs.
In Section 2, researchers review DPMs and their sampling methods from three aspects: Fast sampling for DPMs by diffusion ODEs, guided sampling for DPMs, and exponential integrators and high-order ODE solvers.
In Section 3, researchers examine the performance of existing high-order diffusion ODE solvers and highlight the challenges. The first challenge is that a large guidance scale causes high-order solvers to be unstable. The second is the "train-test mismatch" problem.
In Section 4, researchers design novel high-order diffusion ODE solvers for faster guided sampling. As discussed in Section 3, previous high-order solvers suffer from instability and "train-test mismatch" issues at large guidance scales. The "train-test mismatch" issue arises from the ODE itself, and researchers find that the parameterization of the ODE is critical for the converged solution to be bounded. While previous high-order solvers are designed for the noise prediction model, researchers instead solve the diffusion ODE (Equation (4) in the paper) defined by the data prediction model, which has advantages of its own and also admits thresholding methods to keep the samples bounded. Researchers further propose a multistep solver to address the instability issue.
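Under the data-prediction parameterization, the second-order multistep update combines the two most recent data predictions. The sketch below follows the multistep update reported in the paper, but the notation is ours: `lam` denotes the half-log-SNR lambda = log(alpha / sigma) at each time point:

```python
import numpy as np

def dpmpp_2m_step(x_prev, d_prev1, d_prev2, lam, lam_prev1, lam_prev2,
                  alpha_t, sigma_t, sigma_prev1):
    """One second-order multistep (2M) update in the data-prediction
    parameterization.

    d_prev1, d_prev2: data predictions at the two previous time points.
    lam, lam_prev1, lam_prev2: half-log-SNR values at the current and
    two previous time points."""
    h = lam - lam_prev1                  # current step size in lambda
    r = (lam_prev1 - lam_prev2) / h      # ratio of previous to current step
    # Second-order correction: linearly extrapolate the data prediction.
    D = (1.0 + 1.0 / (2.0 * r)) * d_prev1 - (1.0 / (2.0 * r)) * d_prev2
    # Exponential-integrator step: expm1(-h) = exp(-h) - 1.
    return (sigma_t / sigma_prev1) * x_prev - alpha_t * np.expm1(-h) * D
```

Reusing previous model evaluations lets the multistep scheme take smaller effective step sizes at the same number of function evaluations, which is how it improves stability over single-step high-order solvers.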
Section 5 focuses on fast solvers for diffusion SDEs. First, it points out that sampling by diffusion models can alternatively be implemented by solving diffusion SDEs, and presents the expression of diffusion SDEs and their transformed form based on the log-SNR. Then, researchers derive the exact solution of diffusion SDEs by applying the variation-of-constants formula. Finally, they propose SDE-DPM-Solver-1, SDE-DPM-Solver++(1), SDE-DPM-Solver-2M, and SDE-DPM-Solver++(2M), along with their corresponding formulas under different assumptions.
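As background on the variation-of-constants formula used in this derivation (a standard result for linear ODEs, shown here only for illustration, not as the paper's exact equations):

```latex
\frac{\mathrm{d}x_t}{\mathrm{d}t} = a(t)\,x_t + b(t)
\quad\Longrightarrow\quad
x_t = e^{\int_s^t a(\tau)\,\mathrm{d}\tau}\, x_s
    + \int_s^t e^{\int_\tau^t a(\xi)\,\mathrm{d}\xi}\, b(\tau)\,\mathrm{d}\tau .
```

The linear part is solved exactly in closed form, so only the integral involving the nonlinear term (here, the neural-network prediction) needs numerical approximation. This is what allows exponential-integrator-style solvers to take large steps without losing accuracy.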
In essence, all training-free sampling methods for DPMs can be understood as either discretizing diffusion SDEs or discretizing diffusion ODEs. As DPM-Solver++ is designed for solving diffusion ODEs, in Section 6, researchers discuss the relationship between DPM-Solver++ and other diffusion ODE solvers, and they also briefly discuss other fast sampling methods for DPMs.
In Section 7, researchers show that DPM-Solver++ can speed up both pixel-space DPMs and latent-space DPMs for guided sampling. Researchers vary the number of function evaluations (NFE) and compare DPM-Solver++ with previous state-of-the-art fast samplers for DPMs, including DPM-Solver, DEIS, PNDM and DDIM. Researchers also convert discrete-time DPMs to continuous-time ones in order to use these continuous-time solvers.
Section 8 is the conclusion. Researchers study the problem of accelerating guided sampling of DPMs. They demonstrate that previous high-order solvers based on noise prediction models are unstable and generate worse-quality samples than the first-order solver DDIM for guided sampling with large guidance scales. To address this issue and speed up guided sampling, researchers propose DPM-Solver++, a training-free fast diffusion ODE solver for guided sampling. DPM-Solver++ is based on the diffusion ODE with data prediction models, which can directly adopt thresholding methods to further stabilize the sampling procedure. Researchers propose both singlestep and multistep variants of DPM-Solver++. Experimental results show that DPM-Solver++ can generate high-fidelity samples and almost converge within only 15 to 20 steps, for both pixel-space and latent-space DPMs.
See the article:
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
http://doi.org/10.1007/s11633-025-1562-4
Journal: Machine Intelligence Research
Article Title: DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
Article Publication Date: 23-Jun-2025