Muirhead's inequality

In mathematics, Muirhead's inequality, named after Robert Franklin Muirhead, also known as the "bunching" method, generalizes the inequality of arithmetic and geometric means.

Preliminary definitions

The "a-mean"

For any real vector

a = (a_{1}, \dots, a_{n})

define the "a-mean" [a] of positive real numbers x₁, ..., x_n by

[a] = \frac{1}{n!} \sum_{σ} x_{σ_{1}}^{a_{1}} \dots x_{σ_{n}}^{a_{n}},

where the sum extends over all permutations σ of { 1, ..., n }.

In case a = (1, 0, ..., 0), this is just the ordinary arithmetic mean of x₁, ..., x_n. In case a = (1/n, ..., 1/n), it is the geometric mean of x₁, ..., x_n. (When n = 2 and a = (α, 1 − α), the a-mean is the Heinz mean.)
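The definition can be checked numerically with a minimal Python sketch. The function name `a_mean` and the sample values are illustrative assumptions, not part of the original text; the code sums over all n! orderings directly, so it is only practical for small n.

```python
from itertools import permutations
from math import factorial, prod

def a_mean(a, xs):
    """Return the a-mean [a] of the positive numbers xs (illustrative helper)."""
    n = len(xs)
    if len(a) != n:
        raise ValueError("a and xs must have the same length")
    total = 0.0
    # Each ordering of xs corresponds to one permutation sigma;
    # the value in position i is raised to the exponent a[i].
    for perm in permutations(xs):
        total += prod(x ** e for x, e in zip(perm, a))
    return total / factorial(n)

xs = [1.0, 2.0, 4.0]                           # assumed sample values
arithmetic = a_mean([1, 0, 0], xs)             # should match (1 + 2 + 4) / 3
geometric = a_mean([1 / 3, 1 / 3, 1 / 3], xs)  # should match (1 * 2 * 4) ** (1 / 3)
```

With these sample values the two special cases reproduce the arithmetic and geometric means up to floating-point rounding.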

Doubly stochastic matrices

An n × n matrix P is doubly stochastic precisely if both P and its transpose P^T are stochastic matrices. A stochastic matrix is a square matrix of nonnegative real entries in which the sum of the entries in each column is 1. Thus, a doubly stochastic matrix is a square matrix of nonnegative real entries in which the sum of the entries in each row and the sum of the entries in each column is 1.
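As a rough illustration, the definition translates into a short check on a matrix stored as a list of rows. The helper name `is_doubly_stochastic` and the tolerance `tol` are assumptions made for this sketch.

```python
def is_doubly_stochastic(P, tol=1e-9):
    """Check that P is square, nonnegative, and has all row and column sums equal to 1."""
    n = len(P)
    if n == 0 or any(len(row) != n for row in P):
        return False
    # Entries must be nonnegative (up to the tolerance).
    if any(entry < -tol for row in P for entry in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in P)
    cols_ok = all(abs(sum(P[i][j] for i in range(n)) - 1.0) <= tol
                  for j in range(n))
    return rows_ok and cols_ok

uniform = [[0.5, 0.5], [0.5, 0.5]]   # every row and column sums to 1
rows_only = [[1.0, 0.0], [1.0, 0.0]] # rows sum to 1, but the columns sum to 2 and 0
```

The second matrix shows why both conditions matter: its rows each sum to 1, yet it is not doubly stochastic.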

The inequality

Muirhead's inequality states that [a] ≤ [b] holds for all x with every x_i > 0 if and only if there is some doubly stochastic matrix P for which a = Pb.

The proof makes use of the fact that every doubly stochastic matrix is a weighted average of permutation matrices (Birkhoff-von Neumann theorem).
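The "if" direction can be spot-checked numerically. The sketch below builds a doubly stochastic P as a weighted average of three permutation matrices, sets a = Pb, and compares [a] with [b] on random positive inputs. The weights, permutations, vector b, and helper names are all assumptions chosen for illustration, and a random test of this kind is evidence, not a proof.

```python
import random
from itertools import permutations
from math import factorial, prod

def a_mean(a, xs):
    """a-mean [a] of positive numbers xs, by direct summation over permutations."""
    return sum(prod(x ** e for x, e in zip(perm, a))
               for perm in permutations(xs)) / factorial(len(xs))

def permutation_matrix(perm):
    """Matrix with a 1 in row i, column perm[i], and 0 elsewhere."""
    n = len(perm)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]

def weighted_average(weights, matrices):
    """Entrywise weighted average; weights are assumed nonnegative and sum to 1."""
    n = len(matrices[0])
    return [[sum(w * M[i][j] for w, M in zip(weights, matrices))
             for j in range(n)] for i in range(n)]

def mat_vec(P, b):
    return [sum(p * y for p, y in zip(row, b)) for row in P]

# Assumed example data: three permutations with weights 0.5, 0.3, 0.2.
perms = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
P = weighted_average([0.5, 0.3, 0.2], [permutation_matrix(p) for p in perms])
b = [3.0, 1.0, 0.0]
a = mat_vec(P, b)  # approximately (1.8, 1.1, 1.1)

rng = random.Random(0)
samples = ([rng.uniform(0.1, 5.0) for _ in range(3)] for _ in range(1000))
holds = all(a_mean(a, xs) <= a_mean(b, xs) + 1e-9 for xs in samples)
```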

Another equivalent condition

Because of the symmetry of the sum, no generality is lost by sorting the exponents into decreasing order:

a_{1} \geq a_{2} \geq \dots \geq a_{n}

b_{1} \geq b_{2} \geq \dots \geq b_{n} .

Then the existence of a doubly stochastic matrix P such that a = Pb is equivalent to the following system of inequalities:

a_{1} \leq b_{1}

a_{1} + a_{2} \leq b_{1} + b_{2}

a_{1} + a_{2} + a_{3} \leq b_{1} + b_{2} + b_{3}

\vdots

a_{1} + \dots + a_{n - 1} \leq b_{1} + \dots + b_{n - 1}

a_{1} + \dots + a_{n} = b_{1} + \dots + b_{n} .

(The last one is an equality; the others are weak inequalities.)

The sequence $b_{1}, \dots, b_{n}$ is said to majorize the sequence $a_{1}, \dots, a_{n}$ .
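The system of inequalities above gives a direct test for majorization. A minimal sketch follows; the function name `majorizes` and the tolerance are assumptions for this example. It sorts both sequences into decreasing order, so the inputs need not be pre-sorted.

```python
from itertools import accumulate

def majorizes(b, a, tol=1e-12):
    """Return True if the sequence b majorizes the sequence a."""
    if len(a) != len(b) or not a:
        return False
    # Partial sums of the entries sorted into decreasing order.
    a_partial = list(accumulate(sorted(a, reverse=True)))
    b_partial = list(accumulate(sorted(b, reverse=True)))
    # The first n - 1 partial sums satisfy weak inequalities.
    prefix_ok = all(pa <= pb + tol
                    for pa, pb in zip(a_partial[:-1], b_partial[:-1]))
    # The full sums must be equal.
    total_ok = abs(a_partial[-1] - b_partial[-1]) <= tol
    return prefix_ok and total_ok
```

For instance, `majorizes([2, 0], [1, 1])` is true, while `majorizes([1, 1], [2, 0])` is false because the first partial sums fail 2 ≤ 1.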

Symmetric sum-notation tricks

A special notation for symmetric sums is convenient. Once an inequality has been written in this form, testing it reduces to checking whether one exponent sequence ( $α_{1}, \dots, α_{n}$ ) majorizes the other.

\sum_{sym} x_{1}^{α_{1}} \dots x_{n}^{α_{n}}

This notation requires expanding over every permutation of the variables, which produces an expression with n! monomials. For instance:

\begin{aligned} \sum_{sym} x^{3} y^{2} z^{0} & = x^{3} y^{2} z^{0} + x^{3} z^{2} y^{0} + y^{3} x^{2} z^{0} + y^{3} z^{2} x^{0} + z^{3} x^{2} y^{0} + z^{3} y^{2} x^{0} \\ & = x^{3} y^{2} + x^{3} z^{2} + y^{3} x^{2} + y^{3} z^{2} + z^{3} x^{2} + z^{3} y^{2} \end{aligned}
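The expansion can be generated mechanically. The following sketch lists the n! monomials of a symmetric sum as text; the helper names `symmetric_sum_terms` and `format_term` are assumptions for this illustration, and like terms are deliberately not combined, so the count is always n!.

```python
from itertools import permutations

def symmetric_sum_terms(variables, exponents):
    """List the n! monomials, each as a tuple of (variable, exponent) pairs."""
    terms = []
    for perm in permutations(variables):
        # Factors with exponent 0 equal 1 and are dropped from the monomial.
        terms.append(tuple((v, e) for v, e in zip(perm, exponents) if e != 0))
    return terms

def format_term(term):
    """Render a monomial such as x^3 y^2; the empty monomial is rendered as 1."""
    return " ".join(v if e == 1 else f"{v}^{e}" for v, e in term) or "1"

terms = symmetric_sum_terms(["x", "y", "z"], [3, 2, 0])
expansion = " + ".join(format_term(t) for t in terms)
# expansion == "x^3 y^2 + x^3 z^2 + y^3 x^2 + y^3 z^2 + z^3 x^2 + z^3 y^2"
```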

Deriving the arithmetic-geometric mean inequality

Let

a_{G} = (\frac{1}{n}, \dots, \frac{1}{n})

a_{A} = (1, 0, 0, \dots, 0)

we have

a_{A 1} = 1 > a_{G 1} = \frac{1}{n}

a_{A 1} + a_{A 2} = 1 > a_{G 1} + a_{G 2} = \frac{2}{n}

\vdots

a_{A 1} + \dots + a_{A n} = a_{G 1} + \dots + a_{G n} = 1

so a_A majorizes a_G, and then by Muirhead's inequality

[a_A] ≥ [a_G]

which is

\frac{1}{n!} (x_{1}^{1} \cdot x_{2}^{0} \dots x_{n}^{0} + \dots + x_{1}^{0} \dots x_{n}^{1}) (n - 1)! \geq \frac{1}{n!} (x_{1} \cdot \dots \cdot x_{n})^{\frac{1}{n}} n!

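Each x_i appears in the left-hand sum (n − 1)! times, so this simplifies to

\frac{x_{1} + \dots + x_{n}}{n} \geq (x_{1} \cdots x_{n})^{\frac{1}{n}},

yielding the inequality.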

Examples

Suppose we want to prove that x² + y² ≥ 2xy by bunching (Muirhead's inequality). First rewrite it in symmetric-sum notation:

\sum_{sym} x^{2} y^{0} \geq \sum_{sym} x^{1} y^{1} .

The sequence (2, 0) majorizes the sequence (1, 1), so the inequality holds by bunching. Similarly, to prove

x^{3} + y^{3} + z^{3} \geq 3 x y z

write it as

\sum_{sym} x^{3} y^{0} z^{0} \geq \sum_{sym} x^{1} y^{1} z^{1}

which expands to

2 x^{3} + 2 y^{3} + 2 z^{3} \geq 6 x y z .

The sequence (3, 0, 0) majorizes the sequence (1, 1, 1), so the inequality holds by bunching.
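The second example can be sanity-checked numerically. In this sketch, `sym_sum` is an assumed helper that evaluates a symmetric sum at given positive values; the random sampling range and seed are arbitrary choices, and passing the check supports the inequality without proving it.

```python
import random
from itertools import permutations
from math import prod

def sym_sum(exponents, xs):
    """Evaluate the symmetric sum of the monomial with the given exponents at xs."""
    return sum(prod(x ** e for x, e in zip(perm, exponents))
               for perm in permutations(xs))

# At (x, y, z) = (1, 2, 3): the left side is 2 * (1 + 8 + 27), the right side is 6 * 1 * 2 * 3.
lhs = sym_sum((3, 0, 0), (1.0, 2.0, 3.0))
rhs = sym_sum((1, 1, 1), (1.0, 2.0, 3.0))

rng = random.Random(1)
samples = [[rng.uniform(0.1, 10.0) for _ in range(3)] for _ in range(1000)]
# A small relative slack guards against floating-point rounding near equality.
all_hold = all(sym_sum((3, 0, 0), xs) >= sym_sum((1, 1, 1), xs) * (1 - 1e-12)
               for xs in samples)
```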

