Margin Infused Relaxed Algorithm

Margin Infused Relaxed Algorithm (MIRA)[1] is an online machine learning algorithm for multiclass classification problems. It learns a set of parameters (a vector or a matrix) by processing the training examples one by one and updating the parameters after each example, so that the current example is classified correctly with a margin over each incorrect classification at least as large as that classification's loss.[2] The change to the parameters is kept as small as possible.

A two-class version called binary MIRA[1] simplifies the algorithm because it does not require solving a quadratic programming problem (see below). Used in a one-vs.-all configuration, binary MIRA can be extended to a multiclass learner that approximates full MIRA but may be faster to train.

The flow of the algorithm[3][4] looks as follows:

  Input: Training examples ${\displaystyle T=\{x_{i},y_{i}\}}$
  Output: Set of parameters ${\displaystyle w}$

  ${\displaystyle i}$ ← 0, ${\displaystyle w^{(0)}}$ ← 0
  for ${\displaystyle n}$ ← 1 to ${\displaystyle N}$
      for ${\displaystyle t}$ ← 1 to ${\displaystyle |T|}$
          ${\displaystyle w^{(i+1)}}$ ← update ${\displaystyle w^{(i)}}$ according to ${\displaystyle \{x_{t},y_{t}\}}$
          ${\displaystyle i}$ ← ${\displaystyle i+1}$
      end for
  end for
  return ${\displaystyle {\frac {\sum _{j=1}^{N\times |T|}w^{(j)}}{N\times |T|}}}$
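
The loop above can be sketched in Python as follows. This is only an illustrative sketch, not a reference implementation: the function name `train_averaged`, the parameters `dim` and `epochs` (standing in for the number of passes N), and the representation of the parameters as a flat list of floats are all assumptions made for the example. The update rule is passed in as a callable, so the sketch shows only the iteration and the final averaging of the intermediate parameter vectors.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Example = Tuple[Sequence[float], int]  # (features x_t, label y_t)


def train_averaged(
    examples: Sequence[Example],
    update: Callable[[Vector, Sequence[float], int], Vector],
    dim: int,
    epochs: int,
) -> Vector:
    """Make `epochs` passes over `examples` and return the averaged weights.

    `update` is a caller-supplied (hypothetical) function that maps the
    current weights and one training example to the next weights.
    """
    w = [0.0] * dim          # w^(0) <- 0
    total = [0.0] * dim      # running sum of w^(1) ... w^(N*|T|)
    steps = 0                # the counter i from the pseudocode
    for _ in range(epochs):          # n <- 1 to N
        for x, y in examples:        # t <- 1 to |T|
            w = update(w, x, y)      # w^(i+1) from w^(i) and (x_t, y_t)
            total = [s + wi for s, wi in zip(total, w)]
            steps += 1
    if steps == 0:
        return w                     # nothing was processed; weights stay zero
    return [s / steps for s in total]
```

For instance, with a toy additive update `lambda w, x, y: [wi + y * xi for wi, xi in zip(w, x)]` (a stand-in, not the MIRA update), one pass over two examples averages the two intermediate weight vectors rather than returning only the last one.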


The update step is then formalized as a quadratic programming[2] problem: find the ${\displaystyle w^{(i+1)}}$ that minimizes ${\displaystyle \|w^{(i+1)}-w^{(i)}\|}$ subject to ${\displaystyle \operatorname {score} (x_{t},y_{t})-\operatorname {score} (x_{t},y')\geq L(y_{t},y')\ \forall y'}$, where the scores are computed with the new parameters ${\displaystyle w^{(i+1)}}$. That is, the score of the correct label ${\displaystyle y_{t}}$ must exceed the score of every other possible ${\displaystyle y'}$ by at least the loss (number of errors) of that ${\displaystyle y'}$ relative to ${\displaystyle y_{t}}$.
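
To make the constraint concrete, the following sketch implements a common simplification rather than the full quadratic program: it enforces only the single most violated constraint, which admits a closed-form solution. It assumes a linear model with one weight row per label, so that score(x, y) is the dot product of row y with x, and a 0/1 loss. The names `mira_update_1best`, `score`, and `zero_one_loss` are hypothetical and chosen for this example only.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]  # one weight row per label (an assumption)


def zero_one_loss(y_true: int, y_pred: int) -> float:
    """Loss L(y_t, y'): 0 for the correct label, 1 otherwise."""
    return 0.0 if y_true == y_pred else 1.0


def score(W: Matrix, x: Sequence[float], y: int) -> float:
    """Linear score of label y: dot product of weight row y with x."""
    return sum(wi * xi for wi, xi in zip(W[y], x))


def mira_update_1best(
    W: Matrix,
    x: Sequence[float],
    y_true: int,
    loss: Callable[[int, int], float] = zero_one_loss,
) -> Matrix:
    """Smallest change to W that satisfies the most violated constraint.

    Simplification: only one incorrect label is considered, so the
    remaining constraints of the full problem may still be violated.
    """
    s_true = score(W, x, y_true)

    def violation(y: int) -> float:
        # How far label y falls short of the required margin L(y_t, y).
        return loss(y_true, y) - (s_true - score(W, x, y))

    wrong = [y for y in range(len(W)) if y != y_true]
    sq_norm = sum(xi * xi for xi in x)
    new_W = [row[:] for row in W]
    if not wrong or sq_norm == 0.0:
        return new_W
    y_bad = max(wrong, key=violation)
    if violation(y_bad) <= 0.0:
        return new_W  # constraint already holds, so no change is needed
    # Moving row y_true by +tau*x and row y_bad by -tau*x raises the margin
    # by 2*tau*||x||^2, so this tau closes the gap exactly.
    tau = violation(y_bad) / (2.0 * sq_norm)
    new_W[y_true] = [w + tau * xi for w, xi in zip(W[y_true], x)]
    new_W[y_bad] = [w - tau * xi for w, xi in zip(W[y_bad], x)]
    return new_W
```

Starting from all-zero weights with three labels and x = (1, 0), the update moves only the rows of the correct label and of one competing label. The margin against that competitor then equals the loss exactly, while the margin against the third label can remain too small, which is the price of dropping the other constraints.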

References

1. {{#invoke:Citation/CS1|citation |CitationClass=journal }}
2. {{#invoke:citation/CS1|citation |CitationClass=conference }}
3. Watanabe, T. et al. (2007): Online Large Margin Training for Statistical Machine Translation. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 764–773.
4. Bohnet, B. (2009): Efficient Parsing of Syntactic and Semantic Dependency Structures. In: Proceedings of Conference on Natural Language Learning (CoNLL), Boulder, 67–72.