1 Shenzhen Key Laboratory of Ubiquitous Data Enabling, Shenzhen International Graduate School, Tsinghua University, 2 Department of Electronic Engineering, Tsinghua University, 3 Pengcheng Laboratory.
where $k_p,k_i,k_d$ are the proportional, integral, and derivative gains, $U_{\theta}(\cdot)=\log p_\theta(\cdot)$ is the log-density (negative energy), and $\xi_t\sim\mathcal{N}(\mathbf{0},\mathbf{I})$ is standard Gaussian noise.
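For concreteness, one plausible form of the PID-controlled update, assembled from the symbols above and the standard Langevin step (the exact gain placement and noise scaling here are our assumptions, not the paper's verbatim formula):

```latex
x_{t+1} = x_t + \frac{\epsilon}{2}\Big(
    \underbrace{k_p\,\nabla_x U_\theta(x_t)}_{\text{P: current score}}
  + \underbrace{k_i \textstyle\sum_{s \le t} \nabla_x U_\theta(x_s)}_{\text{I: accumulated scores}}
  + \underbrace{k_d\,\big(\nabla_x U_\theta(x_t) - \nabla_x U_\theta(x_{t-1})\big)}_{\text{D: score difference}}
\Big) + \sqrt{\epsilon}\,\xi_t
```

With $k_p=1$ and $k_i=k_d=0$ this reduces to vanilla Langevin dynamics.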
PIDLD Algorithm Flowchart
1: Require: Score function \( \nabla_x U_\theta(x)=\nabla_x\log p_\theta(x)=\nabla_x(-f_\theta(x)) \); number of steps \(T\); step size \(\epsilon\); control parameters \(k_p,k_i,k_d\); decay rate \(\gamma < 1\); initial point \(x_0\).
⋮
11: Decay integral gain: \(k_i \leftarrow k_i \cdot \gamma\).
12: End for
13: Return: \(\hat{x} = x_T\).
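The loop above can be sketched in a few lines of NumPy. This is a minimal reading of the flowchart, not the reference implementation: the function name `pidld_sample`, the exact placement of the step size, and the noise scale are our assumptions.

```python
import numpy as np

def pidld_sample(score, x0, n_steps=100, eps=0.01,
                 kp=1.0, ki=0.1, kd=0.1, gamma=0.9, rng=None):
    """PID-controlled Langevin dynamics (sketch).

    `score(x)` returns grad_x log p(x). The proportional term is the
    current score, the integral term accumulates past scores, the
    derivative term is a finite difference of consecutive scores, and
    the integral gain `ki` decays by `gamma` each step (line 11 above).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    integral = np.zeros_like(x)
    prev_score = np.zeros_like(x)
    for _ in range(n_steps):
        s = score(x)
        integral += s
        control = kp * s + ki * integral + kd * (s - prev_score)
        x += 0.5 * eps * control + np.sqrt(eps) * rng.standard_normal(x.shape)
        prev_score = s
        ki *= gamma  # decay integral gain
    return x
```

Setting `kp=1, ki=0, kd=0` recovers vanilla Langevin dynamics, which is a convenient sanity check.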
📊 Experiments
In the experiments, we evaluate our method against standard Langevin sampling on mainstream generative models (SGMs and EBMs). We first use toy experiments on 2-D point datasets to validate the effectiveness of the integral and derivative terms, and then use benchmark datasets in classical image generation and reasoning tasks to further demonstrate the superiority of our method.
Toy Experiments
In the toy experiments, we sample points from a 2-d Gaussian mixture distribution using PID-controlled Langevin dynamics with different control parameters, and compare the sampling speed and quality.
Adding the $I$ term, the $D$ term, or both significantly accelerates the sampling process and produces lower KL divergence.
Increasing the derivative coefficient $k_d$ produces lower KL divergence.
Increasing the integral coefficient $k_i$ produces lower KL divergence, but an excessive $k_i$ may lead to a rebounding phenomenon.
Decaying the integral coefficient $k_i$ ensures a smoother transition from early exploration to later convergence, alleviating the rebounding phenomenon.
Effect of $k_i$ and $k_d$ on bias. Adding $I$ term (left) excels in mitigating bias.
Effect of $k_i$ and $k_d$ on oscillation. Adding $D$ term (right) excels in reducing oscillation.
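The toy setup can be reproduced in spirit with an analytic mixture score; the mixture placement, gains, and step counts below are our own illustrative choices, not the paper's exact configuration. The sketch runs a (PID-capable) Langevin chain on a 2-D Gaussian mixture and checks that both modes are covered:

```python
import numpy as np

# Equal-weight mixture of two unit-covariance Gaussians at (-3, 0) and (3, 0).
MEANS = np.array([[-3.0, 0.0], [3.0, 0.0]])

def gmm_score(x):
    """Analytic score grad_x log p(x) for the mixture; x has shape (n, 2)."""
    logw = -0.5 * np.sum((x[:, None, :] - MEANS) ** 2, axis=-1)  # (n, 2)
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)  # component responsibilities
    return np.sum(w[:, :, None] * (MEANS[None, :, :] - x[:, None, :]), axis=1)

def pid_langevin(x, steps=200, eps=0.1, kp=1.0, ki=0.0, kd=0.0,
                 gamma=0.9, rng=None):
    """Langevin chain with optional I and D control terms (our sketch)."""
    rng = np.random.default_rng(0) if rng is None else rng
    integral = np.zeros_like(x)
    prev = np.zeros_like(x)
    for _ in range(steps):
        s = gmm_score(x)
        integral += s
        x = x + 0.5 * eps * (kp * s + ki * integral + kd * (s - prev)) \
              + np.sqrt(eps) * rng.standard_normal(x.shape)
        prev = s
        ki *= gamma  # decay integral gain
    return x

samples = pid_langevin(np.random.default_rng(1).standard_normal((500, 2)))
left = np.mean(samples[:, 0] < 0)  # fraction of particles in the left mode
```

From the same starting particles, one can rerun with e.g. `ki=0.1, kd=0.1` and compare how quickly the empirical distribution matches the target.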
Computer Vision Tasks
For the computer vision tasks, we use a score-based generative model (NCSNv2) and an energy-based model (IGEBM) to sample images from the CIFAR10 and CelebA datasets. Our proposed PIDLD method outperforms all baselines across different settings.
CIFAR10 samples with 25 sampling steps using score-based generative model.
CelebA samples with 25 sampling steps using score-based generative model.
FID comparison for SGM and EBM across different NFEs on the CIFAR10 and CelebA datasets (lower is better).

CIFAR10:

| Method | SGM 25×1 | SGM 100×1 | SGM 232×1 | SGM 232×3 | SGM 232×5 | EBM 10 | EBM 20 | EBM 30 | EBM 40 |
|---|---|---|---|---|---|---|---|---|---|
| Vanilla | 46.8 | 17.2 | 16.0 | 12.8 | 12.5 | 135.8 | 58.1 | 40.3 | 35.3 |
| Ours | 18.3 | 12.1 | 11.7 | 11.6 | 11.4 | 99.0 | 46.1 | 32.8 | 33.2 |

CelebA:

| Method | SGM 50×1 | SGM 250×1 | SGM 500×1 | SGM 500×3 | SGM 500×5 | EBM 15 | EBM 20 | EBM 25 | EBM 30 |
|---|---|---|---|---|---|---|---|---|---|
| Vanilla | 25.0 | 13.6 | 14.0 | 11.3 | 9.5 | 109.1 | 63.5 | 41.3 | 35.4 |
| Ours | 8.0 | 5.7 | 5.9 | 5.9 | 5.6 | 58.0 | 38.9 | 32.2 | 30.0 |
Ablation study of PIDLD on image sampling tasks. P+I+D denotes the complete model, while P+I, P+D, and P denote models with the derivative term, the integral term, or both removed, respectively. Numbers on the bars indicate the performance improvement of each variant over P.
Reasoning Tasks
For the reasoning tasks, we use an energy-based model (IRED) to sample solutions for the Sudoku and Connectivity tasks. Our proposed PIDLD method outperforms all baselines across different settings.
Accuracy comparison of the baselines and our model on the Sudoku and Connectivity tasks under different NFEs (EBM accuracy, %, higher is better).

Sudoku:

| Method | NFEs 5 | 10 | 15 | 30 | 40 | 80 |
|---|---|---|---|---|---|---|
| Vanilla | 45.99 | 51.00 | 50.93 | 50.77 | 53.63 | 55.02 |
| MILD | 49.75 | 54.82 | 53.55 | 55.25 | 56.56 | 56.64 |
| Ours | 50.54 | 55.48 | 55.55 | 55.94 | 57.02 | 56.64 |

Connectivity:

| Method | NFEs 1 | 2 | 3 | 4 | 5 | 10 |
|---|---|---|---|---|---|---|
| Vanilla | 86.16 | 87.22 | 87.22 | 87.48 | 87.38 | 87.49 |
| MILD | 86.16 | 88.54 | 89.21 | 89.75 | 90.15 | 90.33 |
| Ours | 86.16 | 91.32 | 92.31 | 92.82 | 92.95 | 93.28 |
Ablation study of PIDLD on reasoning tasks. P+I+D denotes the complete model, while P+I, P+D, and P denote models with the derivative term, the integral term, or both removed, respectively. Percentages indicate the performance improvement of each variant over P.
📚 Citation
If you find the idea useful for your research, please consider citing:
@inproceedings{chen2025pidcontrolled,
title={{PID}-controlled Langevin Dynamics for Faster Sampling on Generative Models},
author={Hongyi Chen and Jianhai Shu and Jingtao Ding and Yong Li and Xiao-Ping Zhang},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=y9LHDCKeeN}
}