Muirhead's inequality


In mathematics, Muirhead's inequality, named after Robert Franklin Muirhead, also known as the "bunching" method, generalizes the inequality of arithmetic and geometric means.

Preliminary definitions

The "a-mean"

For any real vector

$$a = (a_1, \dots, a_n)$$

define the "a-mean" $[a]$ of nonnegative real numbers $x_1, \dots, x_n$ by

$$[a] = \frac{1}{n!} \sum_{\sigma} x_{\sigma_1}^{a_1} \cdots x_{\sigma_n}^{a_n},$$

where the sum extends over all permutations σ of { 1, ..., n }.

In case a = (1, 0, ..., 0), this is just the ordinary arithmetic mean of $x_1, \dots, x_n$. In case a = (1/n, ..., 1/n), it is the geometric mean of $x_1, \dots, x_n$. (When n = 2 and a = (α, 1 − α), the a-mean is the Heinz mean.)
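The definition and its two special cases can be checked numerically. The following is a minimal sketch, not an optimized implementation: the function name `a_mean` and the sample values are illustrative assumptions, and the brute-force loop over all n! permutations is only practical for small n.

```python
from itertools import permutations
from math import factorial, prod

def a_mean(a, x):
    """Return the a-mean [a] of the nonnegative numbers x.

    Brute force: averages x_{s1}^{a1} * ... * x_{sn}^{an} over all n! orderings of x.
    """
    n = len(x)
    if len(a) != n:
        raise ValueError("a and x must have the same length")
    total = sum(
        prod(xi ** ai for xi, ai in zip(perm, a))
        for perm in permutations(x)
    )
    return total / factorial(n)

x = [1.0, 4.0, 16.0]
arithmetic = a_mean([1, 0, 0], x)            # a = (1, 0, 0): arithmetic mean of x
geometric = a_mean([1 / 3, 1 / 3, 1 / 3], x)  # a = (1/3, 1/3, 1/3): geometric mean of x
```

For these sample values the arithmetic mean is 7 and the geometric mean is 4 (up to floating-point rounding).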

Doubly stochastic matrices

An n × n matrix P is doubly stochastic precisely if both P and its transpose $P^T$ are stochastic matrices. A stochastic matrix is a square matrix of nonnegative real entries in which the sum of the entries in each column is 1. Thus, a doubly stochastic matrix is a square matrix of nonnegative real entries in which every row sum and every column sum is 1.
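The definition translates directly into a check on row and column sums. This is a small sketch under the assumption that the matrix is given as a list of rows; the name `is_doubly_stochastic` and the tolerance `tol` are illustrative choices.

```python
from math import isclose

def is_doubly_stochastic(P, tol=1e-9):
    """Return True if P is square, entrywise nonnegative, and all row and column sums are 1."""
    n = len(P)
    if any(len(row) != n for row in P):
        return False  # not square
    if any(entry < 0 for row in P for entry in row):
        return False  # negative entry
    rows_ok = all(isclose(sum(row), 1.0, abs_tol=tol) for row in P)
    cols_ok = all(
        isclose(sum(P[i][j] for i in range(n)), 1.0, abs_tol=tol)
        for j in range(n)
    )
    return rows_ok and cols_ok

uniform = [[0.5, 0.5], [0.5, 0.5]]   # rows and columns all sum to 1
lopsided = [[1.0, 0.0], [1.0, 0.0]]  # rows sum to 1, columns sum to 2 and 0
```

Here `is_doubly_stochastic(uniform)` is true, while `lopsided` fails because its column sums are not 1.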

The inequality

Muirhead's inequality states that $[a] \le [b]$ for all $x_i \ge 0$ if and only if there is some doubly stochastic matrix P for which a = Pb.

The proof makes use of the fact that every doubly stochastic matrix is a weighted average of permutation matrices (Birkhoff-von Neumann theorem).
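As an illustration of the statement (not of the proof), the sketch below builds a doubly stochastic P as a weighted average of permutation matrices, computes a = Pb, and spot-checks $[a] \le [b]$ on random nonnegative inputs. All helper names, the choice b = (3, 0, 0), and the sampling setup are assumptions made for this example; a random check is evidence, not a proof.

```python
from itertools import permutations
from math import factorial, prod
import random

def a_mean(a, x):
    """Brute-force a-mean: average of prod(x_perm ** a) over all orderings of x."""
    return sum(
        prod(xi ** ai for xi, ai in zip(p, a)) for p in permutations(x)
    ) / factorial(len(x))

def permutation_matrix(perm):
    """Matrix M with M[i][perm[i]] = 1, so that (M b)[i] = b[perm[i]]."""
    n = len(perm)
    return [[1.0 if j == perm[i] else 0.0 for j in range(n)] for i in range(n)]

def weighted_average(weights, matrices):
    """Entrywise weighted average of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(w * M[i][j] for w, M in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]

def mat_vec(P, b):
    """Matrix-vector product P b."""
    return [sum(pij * bj for pij, bj in zip(row, b)) for row in P]

b = [3.0, 0.0, 0.0]
cyclic = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
P = weighted_average([1 / 3, 1 / 3, 1 / 3], [permutation_matrix(p) for p in cyclic])
a = mat_vec(P, b)  # approximately [1.0, 1.0, 1.0]

rng = random.Random(0)  # fixed seed so the check is reproducible
samples = [[rng.uniform(0.0, 5.0) for _ in range(3)] for _ in range(200)]
holds = all(a_mean(a, x) <= a_mean(b, x) + 1e-9 for x in samples)
```

With these choices P has every entry equal to 1/3, and the inequality being sampled is $xyz \le (x^3 + y^3 + z^3)/3$.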

Another equivalent condition

Because of the symmetry of the sum, no generality is lost by sorting the exponents into decreasing order:

$$a_1 \ge a_2 \ge \cdots \ge a_n$$
$$b_1 \ge b_2 \ge \cdots \ge b_n.$$

Then the existence of a doubly stochastic matrix P such that a = Pb is equivalent to the following system of inequalities:

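$$\begin{aligned}
a_1 &\le b_1 \\
a_1 + a_2 &\le b_1 + b_2 \\
a_1 + a_2 + a_3 &\le b_1 + b_2 + b_3 \\
&\;\;\vdots \\
a_1 + \cdots + a_{n-1} &\le b_1 + \cdots + b_{n-1} \\
a_1 + \cdots + a_n &= b_1 + \cdots + b_n.
\end{aligned}$$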

(The last one is an equality; the others are weak inequalities.)

The sequence $b_1, \dots, b_n$ is said to majorize the sequence $a_1, \dots, a_n$.
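The system of inequalities above is easy to test mechanically. The following sketch sorts both sequences into decreasing order, compares partial sums, and requires equal totals; the function name `majorizes` and the tolerance `tol` are illustrative assumptions.

```python
def majorizes(b, a, tol=1e-12):
    """Return True if the sequence b majorizes the sequence a."""
    if len(a) != len(b):
        return False
    a_sorted = sorted(a, reverse=True)
    b_sorted = sorted(b, reverse=True)
    partial_a = partial_b = 0.0
    # Weak inequalities on the first n - 1 partial sums.
    for ai, bi in zip(a_sorted[:-1], b_sorted[:-1]):
        partial_a += ai
        partial_b += bi
        if partial_a > partial_b + tol:
            return False
    # The full sums must be equal.
    return abs(sum(a_sorted) - sum(b_sorted)) <= tol
```

For example, `majorizes((3, 0, 0), (1, 1, 1))` is true, while `majorizes((1, 1, 1), (3, 0, 0))` is false because the first partial sum already fails.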

Symmetric sum-notation tricks

It is convenient to use a special notation for these sums. Once an inequality has been reduced to this form, the only thing left to test is whether one exponent sequence $(\alpha_1, \dots, \alpha_n)$ majorizes the other.

$$\sum_{\text{sym}} x_1^{\alpha_1} \cdots x_n^{\alpha_n}$$

This notation stands for the sum over every permutation of the variables, so it expands into n! monomials. For instance:

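$$\begin{aligned}
\sum_{\text{sym}} x^3 y^2 z^0 &= x^3 y^2 z^0 + x^3 z^2 y^0 + y^3 x^2 z^0 + y^3 z^2 x^0 + z^3 x^2 y^0 + z^3 y^2 x^0 \\
&= x^3 y^2 + x^3 z^2 + y^3 x^2 + y^3 z^2 + z^3 x^2 + z^3 y^2
\end{aligned}$$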
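The expansion into n! monomials can be generated mechanically. The sketch below represents each monomial by its tuple of exponents (one per variable) and counts repeats, so repeated exponents show up as coefficients. The names `symmetric_sum_terms` and `format_terms`, and the `^`/`*` output format, are illustrative choices rather than established notation.

```python
from collections import Counter
from itertools import permutations

def symmetric_sum_terms(exponents):
    """Map each exponent tuple to its coefficient in the symmetric sum.

    permutations() treats equal exponents as distinct positions, so all n!
    terms are generated and identical monomials are merged by the Counter.
    """
    return Counter(permutations(exponents))

def format_terms(terms, names):
    """Render the collected terms as a string, largest exponent tuple first."""
    parts = []
    for exps, coeff in sorted(terms.items(), reverse=True):
        mono = "*".join(f"{v}^{e}" for v, e in zip(names, exps) if e != 0)
        if not mono:
            parts.append(str(coeff))       # constant term
        elif coeff == 1:
            parts.append(mono)
        else:
            parts.append(f"{coeff}*{mono}")
    return " + ".join(parts)

six_terms = format_terms(symmetric_sum_terms((3, 2, 0)), "xyz")
cubes = format_terms(symmetric_sum_terms((3, 0, 0)), "xyz")
```

The exponent sequence (3, 2, 0) gives six distinct monomials, whereas (3, 0, 0) gives only three, each with coefficient 2.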

Deriving the arithmetic-geometric mean inequality

Let

$$a_G = \left(\tfrac{1}{n}, \dots, \tfrac{1}{n}\right)$$
$$a_A = (1, 0, 0, \dots, 0)$$

we have

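$$\begin{aligned}
a_{A1} = 1 &> a_{G1} = \tfrac{1}{n} \\
a_{A1} + a_{A2} = 1 &> a_{G1} + a_{G2} = \tfrac{2}{n} \\
&\;\;\vdots \\
a_{A1} + \cdots + a_{An} &= a_{G1} + \cdots + a_{Gn} = 1
\end{aligned}$$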

then

[aA] ≥ [aG]

which is

$$\frac{1}{n!}\left(x_1^1 x_2^0 \cdots x_n^0 + \cdots + x_1^0 \cdots x_n^1\right)(n-1)! \;\ge\; \frac{1}{n!}\,(x_1 \cdots x_n)^{1/n}\, n!$$

yielding the inequality.
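To make the factors (n − 1)! and n! concrete, here is the same computation written out for n = 3, as an illustrative special case. Each variable occupies the position with exponent 1 in (n − 1)! = 2 of the 6 permutations, and with equal exponents all 3! = 6 permutations give the same monomial.

```latex
% Worked instance for n = 3 (illustrative; x_1, x_2, x_3 >= 0)
\begin{align*}
[a_A] &= \frac{1}{3!}\,\bigl(x_1 + x_2 + x_3\bigr)\,2! = \frac{x_1 + x_2 + x_3}{3}, \\
[a_G] &= \frac{1}{3!}\,\bigl(x_1 x_2 x_3\bigr)^{1/3}\,3! = \sqrt[3]{x_1 x_2 x_3}, \\
[a_A] \ge [a_G] &\iff \frac{x_1 + x_2 + x_3}{3} \ge \sqrt[3]{x_1 x_2 x_3}.
\end{align*}
```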

Examples

Suppose you want to prove that $x^2 + y^2 \ge 2xy$ by using bunching (Muirhead's inequality). We transform it into symmetric-sum notation:

$$\sum_{\text{sym}} x^2 y^0 \ge \sum_{\text{sym}} x^1 y^1.$$

The sequence (2, 0) majorizes the sequence (1, 1), thus the inequality holds by bunching. Similarly, to prove

$$x^3 + y^3 + z^3 \ge 3xyz$$

we write it as

$$\sum_{\text{sym}} x^3 y^0 z^0 \ge \sum_{\text{sym}} x^1 y^1 z^1$$

which expands to

$$2x^3 + 2y^3 + 2z^3 \ge 6xyz.$$

The sequence (3, 0, 0) majorizes the sequence (1, 1, 1), thus the inequality holds by bunching.
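The three-variable example can be sanity-checked by evaluating both symmetric sums numerically. This is only a sketch: the helper `sym_sum`, the sample point, and the random test points are assumptions made for illustration, and passing a random check does not replace the majorization argument.

```python
from itertools import permutations
from math import prod
import random

def sym_sum(exponents, values):
    """Symmetric sum: prod(v ** e) added up over all n! assignments of exponents to values."""
    return sum(
        prod(v ** e for v, e in zip(values, perm))
        for perm in permutations(exponents)
    )

x, y, z = 2.0, 3.0, 5.0
lhs = sym_sum((3, 0, 0), (x, y, z))  # equals 2 * (x**3 + y**3 + z**3)
rhs = sym_sum((1, 1, 1), (x, y, z))  # equals 6 * x * y * z

rng = random.Random(1)  # fixed seed so the check is reproducible
points = [[rng.uniform(0.0, 10.0) for _ in range(3)] for _ in range(500)]
never_violated = all(
    sym_sum((3, 0, 0), p) >= sym_sum((1, 1, 1), p) - 1e-9 for p in points
)
```

At the sample point the two sides are 320 and 180, matching $2x^3 + 2y^3 + 2z^3 \ge 6xyz$.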

References