Kantorivich Dual Problem with Quadratic Cost

Author

Evan Tufte

Introduction

The Kantorovich dual problem is a common reformulation of the Kantorovich problem. The case c(x,y)=12|xy|2 is an important special case for several reasons. First off, the minimum cost is the 2-Wasserstein distance. From the point of view of mathematical analysis, the quadratic cost is special because the c-transform for c(x,y)=12|xy|2 is closely related to convex conjugates, and many tools from convex analysis can be used. Moreover, in the quadratic case the Monge optimal transport map is of the form T=ϕ for a convex function ϕ (as discussed in the section on Brenier’s theorem).

Definitions

Definition of the c-Transform and c-Concavity

Let X and Y be Polish spaces, and let c:X×YR{+} be a measurable function. The c-transform of f:XR{} is defined by fc(y)=infxX(c(x,y)f(x)) the c-transform of g:YR{} is gc(x)=infyY(c(x,y)g(y)). Moreover, a function f is called c-concave if there exists g:YR{} such that f=gc.

A comment: when c is symmetric (i.e., X=Y and c(x,y)=c(y,x) for all x,yX=Y), the c-transform and c-transform are identical. In this case we will just call them both the c-transform.

Definition of the convex conjugate

Suppose f:RdR{+}. The convex conjugate of f is defined by f(y)=supxRd(xyf(x)). Here, xy is the dot product of x and y. The map ff is called the Legendre transform.

The Relationship between c-Concavity and Convex Conjugates

The following is Proposition 1.21 in (). The proof below mirrors the proof there.

Theorem 1 (Relationship between the c-transform and Legendre transform) Given a proper function f:RdR{}, define uf:RdR{+} by uf(x)=12|x|2f(x). Then ufc=(uf). Moreover, a proper function g:RdR{} is c-concave if and only if x12|x|2g(x) is convex and lower semi-continuous.

Proof. Let f:RdR{}, and let xRd. We compute:

ufc(x)=12|x|2fc(x)=12|x|2infyRd(12|xy|2f(y))=12|x|2+supyRd(12|xy|2+f(y))=supyRd(12|x|212|xy|2+f(y))=supyRd(12|x|212(|x|22xy+|y|2)+f(y))=supyRd(xy(12|y|2f(y)))=(uf)(x). This proves ufc=(uf).

It remains to show that g:RdR{} is c-concave if and only if ug is convex and lower semi-continuous.

Let g:RdR{}. Assume g is c-concave. Then, g=fc for some f:RdR{}. Now, ug=ufc=uf. By Fenchel Moreau this implies ug is convex and lower semicontinous.

Now, assume ug is convex and lower semicontinuous. Then, by Fenchel Moreau, ug=h for some function h:RdR{+}. Define f(x)=12|x|2h(x) (so uf(x)=h(x)). Then, ug=(uf)=ufc, which implies g=fc. Thus, g is c-concave.

The Dual problem

Let X and Y be Polish spaces. Let μ and ν be probability measures on X and Y respectively. Let c:X×Y[0,] be lower semicontinuous. The dual problem of the Kantorovich problem is (1)supf(x)+g(y)c(x,y)f(x)dμ(x)+g(y)dν(y) where the supremum is take over all fL1(μ),gL1(ν) such that f(x)+g(y)c(x,y).

There is one thing here that isn’t standard: sometimes you take instead fCb(X), gCb(Y) instead of fL1(μ),gL1(ν). That is, you maximize over the smaller set of continuous and bounded functions. Whenever X and Y are Polish spaces Cb is dense in L1 so the supremum is the same. However, working in L1 is can be convenient because sometimes maximizers (f,g) of are non-continuous L1 functions.

Solvability of the Dual Problem with Quadratic Cost

Theorem 2 Let μ and ν be (Borel) probability measures on Rd. Assume Rd|x|2dμ(x)<+andRd|y|2dμ(x)<+. Then, there is a maximizer (f,g) to the dual Kantorovich problem. Moreover, uf and ug are both convex and uf=ug.

This is Theorem 1.40 in () (p.32), a proof can be found there. For more general c(x,y), the dual problem still has a solution (and the maximum of the dual problem is equal to the infimum in the primal problem) as is discussed here. However, much of the proof in the non-quadratic case relies on theory of the c-transform instead of convex analysis. The book () is also a good resource to see the proof in the general case.

References

Santambrogio, Filippo. 2015. “Optimal Transport for Applied Mathematicians.” Birkäuser, NY 55 (58-63): 94.