Notes on Score Diffusion Models (updating)

This is a summary of the unified framework for diffusion models based on stochastic differential equations (SDEs), along with related topics.

Background

In score-based diffusion models [1], the authors propose a unified framework that connects the score-based model NCSN and DDPM through the lens of stochastic differential equations (SDEs). They interpret the forward process (adding noise) and the backward process (denoising and sampling) as an SDE and its reverse-time SDE, respectively.

Data Perturbation with Itô SDE

The diffusion process $\{x_t\}_{t=0}^{T}$ from the original input $x_0$ to Gaussian noise $x_T$, with continuous time variable $t \in [0, T]$, can be modeled by the Itô SDE

$$dx = f(x,t)\,dt + G(x,t)\,dw_t \tag{1}$$

We consider the case $G(x,t) = g(t)\,I$, which is independent of $x$, together with a drift that is linear in $x$, $f(x,t) = f(t)\,x$. Then, Eq. (1) can be rewritten as

$$dx = f(t)\,x\,dt + g(t)\,dw_t \tag{2}$$

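The perturbation kernel of this SDE has the form

$$p(x_t \mid x_0) = \mathcal{N}\left(x_t;\; s_t x_0,\; s_t^2 \sigma_t^2 I\right) \tag{3}$$

where

$$s_t = \exp\left(\int_0^t f(\xi)\,d\xi\right) \quad\text{and}\quad \sigma_t^2 = \int_0^t \left(\frac{g(\xi)}{s(\xi)}\right)^2 d\xi \tag{4}$$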
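As a rough numerical illustration of Eqs. (3) and (4), the sketch below integrates $f$ and $g$ on a grid to obtain $s_t$ and $\sigma_t$, then draws one sample from the perturbation kernel. The coefficients $f(t) = -1/2$ and $g(t) = 1$, the helper names `trapezoid` and `scale_and_sigma`, and the example $x_0$ are assumptions chosen for illustration only; they do not come from the papers discussed here.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def scale_and_sigma(f, g, t, n=20001):
    """Evaluate Eq. (4) numerically and return (s_t, sigma_t)."""
    xi = np.linspace(0.0, t, n)
    f_vals = f(xi)
    # Cumulative trapezoid of f gives log s(xi) at every grid point.
    increments = 0.5 * (f_vals[1:] + f_vals[:-1]) * np.diff(xi)
    log_s = np.concatenate(([0.0], np.cumsum(increments)))
    s = np.exp(log_s)
    sigma2 = trapezoid((g(xi) / s) ** 2, xi)
    return float(s[-1]), float(np.sqrt(sigma2))

# Assumed illustrative coefficients: f(t) = -1/2, g(t) = 1.
f = lambda t: -0.5 * np.ones_like(t)
g = lambda t: np.ones_like(t)

t = 1.0
s_t, sigma_t = scale_and_sigma(f, g, t)

# Draw x_t ~ N(s_t x_0, s_t^2 sigma_t^2 I), as in Eq. (3).
rng = np.random.default_rng(0)
x0 = np.array([1.0, -2.0])
xt = s_t * x0 + s_t * sigma_t * rng.standard_normal(x0.shape)
```

For these assumed coefficients the integrals have closed forms, $s_t = e^{-t/2}$ and $\sigma_t^2 = e^{t} - 1$, which gives a quick sanity check on the numerical values.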

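Here, I’ll use $s_t$ and $s(t)$, and $\sigma_t$ and $\sigma(t)$, interchangeably for simplicity. The corresponding marginal distribution is obtained by

\begin{align}
p(x_t) &= \int p(x_t \mid x_0)\, p_{\text{data}}(x_0)\, dx_0 \tag{5} \\
&= s_t^{-d} \int p_{\text{data}}(x_0)\, \mathcal{N}\left(x_t s_t^{-1};\; x_0,\; \sigma_t^2 I\right) dx_0 \tag{6} \\
&= s_t^{-d} \int p_{\text{data}}(x_0)\, \mathcal{N}\left(x_t s_t^{-1} - x_0;\; 0,\; \sigma_t^2 I\right) dx_0 \tag{7} \\
&= s_t^{-d} \left[ p_{\text{data}} * \mathcal{N}\left(0, \sigma_t^2 I\right) \right]\left(x_t s_t^{-1}\right) \tag{8} \\
&= s_t^{-d}\, p\left(x_t s_t^{-1};\; \sigma_t\right) \tag{9}
\end{align}

where $d$ is the dimension of $x$ and $*$ denotes convolution.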
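The chain from Eq. (5) to Eq. (9) can be checked numerically in one dimension. The sketch below assumes a toy two-component Gaussian mixture for $p_{\text{data}}$ (an illustrative choice, as are the values of `s_t` and `sigma_t`), for which the convolution in Eq. (8) has a closed form, and compares it with a direct quadrature of Eq. (5).

```python
import numpy as np

def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Assumed toy data distribution in d = 1: a two-component Gaussian mixture.
weights = np.array([0.3, 0.7])
means = np.array([-1.0, 2.0])
stds = np.array([0.5, 0.8])

def p_data(x0):
    return sum(w * normal_pdf(x0, m, sd) for w, m, sd in zip(weights, means, stds))

def p_sigma(y, sigma):
    """p(y; sigma) = [p_data * N(0, sigma^2)](y), closed form for a mixture."""
    return sum(w * normal_pdf(y, m, np.sqrt(sd ** 2 + sigma ** 2))
               for w, m, sd in zip(weights, means, stds))

s_t, sigma_t, d = 0.6, 1.3, 1        # arbitrary illustrative values
x_t = np.linspace(-4.0, 4.0, 9)      # points where the marginal is evaluated

# Eq. (5): integrate the perturbation kernel against p_data on a fine grid.
x0 = np.linspace(-12.0, 12.0, 48001)
dx0 = x0[1] - x0[0]
kernel = normal_pdf(x_t[:, None], s_t * x0[None, :], s_t * sigma_t)
direct = np.sum(kernel * p_data(x0)[None, :], axis=1) * dx0

# Eq. (9): rescaled, noise-smoothed data density.
via_eq9 = s_t ** (-d) * p_sigma(x_t / s_t, sigma_t)

max_gap = float(np.max(np.abs(direct - via_eq9)))
```

The two evaluations should agree up to quadrature error, which reflects that the marginal is the data density blurred by $\mathcal{N}(0, \sigma_t^2 I)$ and then rescaled by $s_t$.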

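The Fokker-Planck equation describes how the marginal distribution $p_t(x)$, used interchangeably with $p(x_t)$, evolves over time under the effect of drift forces and random (noise) forces. It can be written as

\begin{align}
\frac{\partial p_t(x)}{\partial t} &= -\nabla_x \cdot \left[ f(x,t)\, p_t(x) \right] + \frac{1}{2} \nabla_x \cdot \nabla_x \cdot \left[ G(x,t) G^{T}(x,t)\, p_t(x) \right] \tag{10} \\
&= -\sum_{i=1}^{d} \frac{\partial}{\partial x_i} \left[ f_i(x,t)\, p_t(x) \right] + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} \frac{\partial^2}{\partial x_i \partial x_j} \left[ \sum_{k=1}^{d} G_{ik}(x,t)\, G_{jk}(x,t)\, p_t(x) \right] \tag{11} \\
&= -\sum_{i=1}^{d} \frac{\partial}{\partial x_i} \left[ f_i(x,t)\, p_t(x) - \frac{1}{2} \left[ \nabla_x \cdot \left[ G(x,t) G^{T}(x,t) \right] + G(x,t) G^{T}(x,t)\, \nabla_x \log p_t(x) \right]_i p_t(x) \right] \tag{12} \\
&= -\sum_{i=1}^{d} \frac{\partial}{\partial x_i} \left[ \tilde{f}_i(x,t)\, p_t(x) \right] \tag{13}
\end{align}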

where

$$\tilde{f}_i(x,t) = f_i(x,t) - \frac{1}{2} \left[ \nabla_x \cdot \left[ G(x,t) G^{T}(x,t) \right] + G(x,t) G^{T}(x,t)\, \nabla_x \log p_t(x) \right]_i.$$

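If we consider the case $G(x,t) = g(t)\,I$, which is independent of $x$, then Eq. (13) can be rewritten as

\begin{align}
\frac{\partial p_t(x)}{\partial t} &= -\nabla_x \cdot \left[ f(x,t)\, p_t(x) \right] + \frac{1}{2} g^2(t)\, \nabla_x \cdot \nabla_x \left[ p_t(x) \right] \tag{14} \\
&= -\sum_{i=1}^{d} \frac{\partial}{\partial x_i} \left[ \left[ f_i(x,t) - \frac{1}{2} g^2(t) \left[ \nabla_x \log p_t(x) \right]_i \right] p_t(x) \right] \tag{15}
\end{align}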
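Eq. (15) can be sanity-checked with finite differences in a case where $p_t$ is known in closed form. The sketch below assumes $d = 1$, $f(x,t) = -x/2$, $g(t) = 1$ and Gaussian data $p_{\text{data}} = \mathcal{N}(0, c^2)$ with $c^2 = 4$; these are illustrative assumptions, under which $p_t = \mathcal{N}(0, v(t))$ with $v(t) = e^{-t}(c^2 + e^{t} - 1)$.

```python
import numpy as np

# Assumed toy setup (d = 1): f(x, t) = -x/2, g(t) = 1, p_data = N(0, c2).
c2 = 4.0

def var(t):
    # Marginal variance s_t^2 (c2 + sigma_t^2), with s_t = e^{-t/2}, sigma_t^2 = e^t - 1.
    return np.exp(-t) * (c2 + np.exp(t) - 1.0)

def p(x, t):
    v = var(t)
    return np.exp(-0.5 * x ** 2 / v) / np.sqrt(2.0 * np.pi * v)

x = np.linspace(-5.0, 5.0, 2001)
t, h = 0.5, 1e-4

# Left-hand side: time derivative of the density (central difference in t).
lhs = (p(x, t + h) - p(x, t - h)) / (2.0 * h)

# Right-hand side of Eq. (15): -d/dx [ (f - 0.5 g^2 d/dx log p_t) p_t ].
score = -x / var(t)                     # d/dx log p_t(x) for a zero-mean Gaussian
f_tilde = -0.5 * x - 0.5 * 1.0 ** 2 * score
rhs = -np.gradient(f_tilde * p(x, t), x)

# Ignore the two boundary points, where np.gradient is only first-order accurate.
max_residual = float(np.max(np.abs(lhs - rhs)[1:-1]))
```

A small residual indicates that the drift-only form in Eq. (15) reproduces the same density evolution as the drift-plus-diffusion form in Eq. (14), at least for this toy case.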

Probability Flow ODE

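According to the Fokker-Planck equation, there exists an ODE whose solutions share the same marginal distribution $p_t(x)$ as the SDE. Eq. (15) has the form of a Fokker-Planck equation with drift $f(x,t) - \frac{1}{2} g^2(t) \nabla_x \log p_t(x)$ and zero diffusion, so the corresponding SDE reduces to the ODE

\begin{align}
dx &= \left[ f(x,t) - \frac{1}{2} g^2(t)\, \nabla_x \log p_t(x) \right] dt + 0\, dw_t \tag{16} \\
&= \left[ f(t)\, x - \frac{1}{2} g^2(t)\, \nabla_x \log p_t(x) \right] dt \tag{17}
\end{align}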

This ODE is called the probability flow ODE (PF ODE).
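The claim that the PF ODE and the SDE share marginals can be illustrated by simulating both from the same initial samples. This is a minimal sketch under assumed toy settings ($d = 1$, $f(t) = -1/2$, $g(t) = 1$, $p_{\text{data}} = \mathcal{N}(0, 4)$), where the score is available in closed form; in practice the score would come from a trained network.

```python
import numpy as np

c2 = 4.0                       # assumed data variance, p_data = N(0, c2)
f = lambda t: -0.5             # assumed drift coefficient f(t)
g = lambda t: 1.0              # assumed diffusion coefficient g(t)

def var(t):
    # Closed-form marginal variance for this toy setup.
    return np.exp(-t) * (c2 + np.exp(t) - 1.0)

def score(x, t):
    # grad_x log p_t(x) for the Gaussian marginal N(0, var(t)).
    return -x / var(t)

rng = np.random.default_rng(0)
n_samples, n_steps, T = 50_000, 500, 1.0
dt = T / n_steps

x_sde = np.sqrt(c2) * rng.standard_normal(n_samples)   # x_0 ~ p_data
x_ode = x_sde.copy()
for i in range(n_steps):
    t = i * dt
    # Euler-Maruyama step of the SDE, Eq. (2).
    x_sde = x_sde + f(t) * x_sde * dt + g(t) * np.sqrt(dt) * rng.standard_normal(n_samples)
    # Euler step of the PF ODE, Eq. (17); no noise is injected.
    x_ode = x_ode + (f(t) * x_ode - 0.5 * g(t) ** 2 * score(x_ode, t)) * dt

std_sde, std_ode, std_exact = x_sde.std(), x_ode.std(), np.sqrt(var(T))
```

Individual trajectories differ (the ODE path is deterministic given $x_0$), but the empirical standard deviations of both sample sets should be close to `std_exact`.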

If we instead express the ODE in terms of $s_t$ and $\sigma_t$, we have

$$f(t) = \dot{s}_t\, s_t^{-1} \quad\text{and}\quad g(t) = s_t \sqrt{2\, \dot{\sigma}_t\, \sigma_t}. \tag{18}$$

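The derivation can be found in the EDM paper [2] (Eqs. 28 and 34). Then, Eq. (17) can be rewritten as

\begin{align}
dx &= \left[ \dot{s}_t\, s_t^{-1}\, x - s_t^2\, \dot{\sigma}_t\, \sigma_t\, \nabla_x \log p_t(x) \right] dt \tag{19} \\
&= \left[ \dot{s}_t\, s_t^{-1}\, x - s_t^2\, \dot{\sigma}_t\, \sigma_t\, \nabla_x \log p\left(x s_t^{-1};\; \sigma_t\right) \right] dt \tag{20} \\
&= \left[ -\dot{\sigma}_t\, \sigma_t\, \nabla_x \log p\left(x_t;\; \sigma_t\right) \right] dt \quad \text{where } s_t = 1 \tag{21}
\end{align}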

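where the marginal is $p_t(x) = s_t^{-d}\, p\left(x_t s_t^{-1};\; \sigma_t\right)$ and

$$p\left(x_t s_t^{-1};\; \sigma_t\right) = \left[ p_{\text{data}} * \mathcal{N}\left(0, \sigma_t^2 I\right) \right]\left(x_t s_t^{-1}\right)$$

as shown in Eq. (9). The factor $s_t^{-d}$ does not depend on $x$, so it vanishes under $\nabla_x \log$, which gives the step from Eq. (19) to Eq. (20).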
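Eq. (18) can be tried out directly on concrete schedules. The sketch below uses central finite differences for $\dot{s}_t$ and $\dot{\sigma}_t$; the helper names `ddt` and `f_and_g` are hypothetical. The first schedule is an assumed example chosen so that the result is easy to verify by hand; the second, $s_t = 1$ and $\sigma_t = t$, is to my understanding the schedule advocated in the EDM paper.

```python
import numpy as np

def ddt(fn, t, h=1e-5):
    """Central finite-difference derivative of a scalar function of t."""
    return (fn(t + h) - fn(t - h)) / (2.0 * h)

def f_and_g(s, sigma, t):
    """Eq. (18): f(t) = s'(t) / s(t), g(t) = s(t) * sqrt(2 sigma'(t) sigma(t))."""
    f_t = ddt(s, t) / s(t)
    g_t = s(t) * np.sqrt(2.0 * ddt(sigma, t) * sigma(t))
    return f_t, g_t

# Schedule 1 (assumed example): s(t) = e^{-t/2}, sigma(t) = sqrt(e^t - 1).
f1, g1 = f_and_g(lambda t: np.exp(-0.5 * t),
                 lambda t: np.sqrt(np.exp(t) - 1.0),
                 t=1.0)

# Schedule 2: s(t) = 1, sigma(t) = t.
f2, g2 = f_and_g(lambda t: 1.0 + 0.0 * t,
                 lambda t: t,
                 t=2.0)
```

Schedule 1 should recover $f = -1/2$ and $g = 1$, and schedule 2 should give $f = 0$ and $g = \sqrt{2t}$, which is the case $s_t = 1$ used in Eq. (21).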

Connection Between PF ODE and SDE

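According to Eq. (102) in [2], the authors derived a family of SDEs, valid for any choice of $g(t)$. The SDE is given by

\begin{align}
dx &= \left( \frac{1}{2} g^2(t) - \dot{\sigma}_t\, \sigma_t \right) \nabla_x \log p\left(x;\; \sigma_t\right) dt + g(t)\, dw_t \tag{22} \\
&= \hat{f}(x,t)\, dt + g(t)\, dw_t \tag{23}
\end{align}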

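The PF ODE is the special case of this SDE with $g(t) = 0$. The derivation is given in the EDM paper (Appendix B.5); it is somewhat long but not difficult to follow. Furthermore, the authors parameterize $g(t) = \sqrt{2 \beta(t)}\, \sigma_t$, so that Eq. (23) becomes

$$dx = -\dot{\sigma}_t\, \sigma_t\, \nabla_x \log p\left(x;\; \sigma_t\right) dt + \beta(t)\, \sigma_t^2\, \nabla_x \log p\left(x;\; \sigma_t\right) dt + \sqrt{2 \beta(t)}\, \sigma_t\, dw_t \tag{24}$$

where $\beta(t)$ is a free function.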
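To illustrate that $\beta(t)$ is free, the sketch below simulates Eq. (24) forward in time with Euler-Maruyama for several constant values of $\beta$ and checks that the marginal at the final time is unaffected. It assumes $d = 1$, $s_t = 1$, $\sigma_t = t$, and Gaussian data $p_{\text{data}} = \mathcal{N}(0, 1)$, so that $p(x; \sigma) = \mathcal{N}(0, 1 + \sigma^2)$ and the score is known exactly; the function name `simulate` and all numeric settings are illustrative.

```python
import numpy as np

c2 = 1.0  # assumed data variance: p_data = N(0, c2), so p(x; sigma) = N(0, c2 + sigma^2)

def score(x, sigma):
    # grad_x log p(x; sigma) for the Gaussian toy model.
    return -x / (c2 + sigma ** 2)

def simulate(beta, t0=0.5, t1=2.0, n_steps=500, n_samples=40_000, seed=0):
    """Euler-Maruyama for Eq. (24) with s_t = 1 and sigma_t = t (sigma_dot = 1)."""
    rng = np.random.default_rng(seed)
    dt = (t1 - t0) / n_steps
    x = np.sqrt(c2 + t0 ** 2) * rng.standard_normal(n_samples)  # x ~ p(x; sigma_{t0})
    for i in range(n_steps):
        t = t0 + i * dt
        sc = score(x, t)
        drift = -1.0 * t * sc + beta * t ** 2 * sc       # first two terms of Eq. (24)
        noise = np.sqrt(2.0 * beta) * t * np.sqrt(dt) * rng.standard_normal(n_samples)
        x = x + drift * dt + noise
    return float(x.std())

target = float(np.sqrt(c2 + 2.0 ** 2))   # std of p(x; sigma_{t1}) with t1 = 2
stds = {beta: simulate(beta) for beta in (0.0, 0.5, 2.0)}
```

With $\beta = 0$ this is the PF ODE; larger $\beta$ injects more noise but adds a matching score-driven drift, so all three standard deviations should land near `target`.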

Reverse SDE

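The reverse diffusion process from $x_T$ to $x_0$ can also be modeled by an Itô SDE, according to [3]:

\begin{align}
dx &= \left[ \hat{f}(x,t) - g^2(t)\, \nabla_x \log p_t(x) \right] dt + g(t)\, dw_t \tag{25} \\
&= -\dot{\sigma}_t\, \sigma_t\, \nabla_x \log p\left(x;\; \sigma_t\right) dt \; \underbrace{-\, \beta(t)\, \sigma_t^2\, \nabla_x \log p\left(x;\; \sigma_t\right) dt + \sqrt{2 \beta(t)}\, \sigma_t\, dw_t}_{\text{Langevin dynamics: noise cancellation}} \tag{26}
\end{align}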
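Eq. (26) can be turned into a simple sampler by stepping backward in time from $T$ toward $0$. The sketch below is a plain Euler-Maruyama discretization, not the sampler from the EDM paper, and it assumes a toy setting ($d = 1$, $s_t = 1$, $\sigma_t = t$, $p_{\text{data}} = \mathcal{N}(0, 1)$, constant $\beta$) in which the exact score replaces a learned one. The name `reverse_sample` and the step counts are illustrative.

```python
import numpy as np

c2 = 1.0  # assumed data variance: p_data = N(0, c2)

def score(x, sigma):
    # Exact score of p(x; sigma) = N(0, c2 + sigma^2); a trained model would go here.
    return -x / (c2 + sigma ** 2)

def reverse_sample(beta, T=2.0, t_end=1e-3, n_steps=500, n_samples=40_000, seed=0):
    """Integrate Eq. (26) from t = T down to t = t_end with sigma_t = t."""
    rng = np.random.default_rng(seed)
    ts = np.linspace(T, t_end, n_steps + 1)
    x = np.sqrt(c2 + T ** 2) * rng.standard_normal(n_samples)   # x_T ~ p(x; sigma_T)
    for t, t_next in zip(ts[:-1], ts[1:]):
        dt = t_next - t                                  # negative: time runs backward
        sc = score(x, t)
        drift = -t * sc - beta * t ** 2 * sc             # drift terms of Eq. (26)
        # The Brownian increment has variance |dt|.
        noise = np.sqrt(2.0 * beta) * t * np.sqrt(-dt) * rng.standard_normal(n_samples)
        x = x + drift * dt + noise
    return x

samples = reverse_sample(beta=1.0)
sample_std = float(samples.std())
```

Starting from the wide distribution at $t = T$, the samples should contract to roughly unit standard deviation, matching the assumed data distribution.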

Unified Framework of Score Diffusion Models

NCSN (VE SDE)

DDPM (VP SDE)

DDIM

Preconditioning

Sampling

Related Topics

Flow Matching

Consistency Model

References

1. Score-Based Generative Modeling through Stochastic Differential Equations
   Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S. and Poole, B., 2021. International Conference on Learning Representations.
2. Elucidating the Design Space of Diffusion-Based Generative Models
   Karras, T., Aittala, M., Aila, T. and Laine, S., 2022. Advances in Neural Information Processing Systems, Vol 35, pp. 26565-26577.
3. Reverse-time diffusion equation models
   Anderson, B.D., 1982. Stochastic Processes and their Applications, Vol 12(3), pp. 313-326. Elsevier.