The '''leftover hash lemma''' is a [[lemma (mathematics)|lemma]] in [[cryptography]] first stated by [[Russell Impagliazzo]], [[Leonid Levin]], and [[Michael Luby]].
 
Imagine that you have a secret [[key (cryptography)|key]] <math>\scriptstyle X</math> that has <math>\scriptstyle n</math> uniform random [[bit]]s, and you would like to use this secret key to encrypt a message. Unfortunately, you were a bit careless with the key and know that an [[adversary (cryptography)|adversary]] was able to learn about <math>\scriptstyle t \;<\; n</math> bits of that key, but you do not know which ones. Can you still use your key, or do you have to throw it away and choose a new one? The leftover hash lemma tells us that we can produce a key of almost <math>\scriptstyle n \,-\, t</math> bits, over which the adversary has almost no knowledge. Since the adversary knows all but <math>\scriptstyle n \,-\, t</math> bits, this is almost optimal.
 
More precisely, the leftover hash lemma tells us that we can extract about <math>\scriptstyle H_\infty(X)</math> bits (the [[min-entropy]] of <math>\scriptstyle X</math>) from a [[random variable]] <math>\scriptstyle X</math>, and that these bits are almost uniformly distributed. In other words, an adversary who has some partial knowledge about <math>\scriptstyle X</math> will have almost no knowledge about the extracted value. This is why the lemma is also said to provide '''privacy amplification''' (see the privacy amplification section of the article [[Quantum key distribution]]).
 
[[Randomness extractor]]s achieve the same result, but typically use less randomness.
 
==Leftover hash lemma==
Let <math>\scriptstyle X</math> be a random variable over <math>\scriptstyle \mathcal X</math> and let <math>\scriptstyle m \;>\; 0</math> be an integer. Let <math>\scriptstyle h :\; \mathcal{S} \,\times\, \mathcal{X} \;\rightarrow\; \{0,\, 1\}^m</math> be a 2-[[universal hashing|universal]] [[hash function]]. If
:<math>m \leq H_\infty(X) - 2 \log\left(\frac{1}{\varepsilon}\right)</math>
 
then for <math>\scriptstyle S</math> uniform over <math>\scriptstyle \mathcal S</math> and independent of <math>\scriptstyle X</math>, we have
:<math>\delta[(h(S, X), S), (U, S)] \leq \varepsilon</math>
 
where <math>\scriptstyle U</math> is uniform over <math>\scriptstyle \{0,\, 1\}^m</math> and independent of <math>\scriptstyle S</math>.
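For instance, with <math>\scriptstyle H_\infty(X) \;=\; 128</math> and <math>\scriptstyle \varepsilon \;=\; 2^{-32}</math>, the bound allows extracting up to <math>\scriptstyle m \;=\; 128 \,-\, 64 \;=\; 64</math> nearly uniform bits. The following is a minimal sketch, not part of the lemma's statement, of one standard choice of 2-universal family: <math>\scriptstyle h(S, X) \;=\; SX</math> over GF(2), where the seed <math>\scriptstyle S</math> is a uniformly random <math>\scriptstyle m \,\times\, n</math> binary matrix. The sizes and key value below are illustrative only; the quantities <math>\scriptstyle H_\infty(X)</math> and <math>\scriptstyle \delta</math> are defined after the lemma.

<syntaxhighlight lang="python">
import secrets

# Illustrative sizes only; in practice choose m <= H_inf(X) - 2*log2(1/eps).
n, m = 8, 2

def sample_seed():
    """Seed S: a uniformly random m-by-n binary matrix; x -> S.x is a 2-universal family."""
    return [[secrets.randbelow(2) for _ in range(n)] for _ in range(m)]

def extract(seed, x_bits):
    """h(S, X) = S.X over GF(2): each output bit is a random parity of the key bits."""
    return [sum(s * x for s, x in zip(row, x_bits)) % 2 for row in seed]

x = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical partially compromised key X
seed = sample_seed()          # S is public and chosen independently of X
print(extract(seed, x))       # m bits, close to uniform when X has enough min-entropy
</syntaxhighlight>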
 
<math>\scriptstyle H_\infty(X) \;=\; -\log \max_x \Pr[X=x]</math> is the [[min-entropy]] of <math>\scriptstyle X</math>, which measures the amount of randomness <math>\scriptstyle X</math> has. The min-entropy is always less than or equal to the [[Shannon entropy]]. Note that <math>\scriptstyle \max_x \Pr[X=x]</math> is the probability of correctly guessing <math>\scriptstyle X</math>; the best strategy is to guess the most probable value. Therefore, the min-entropy measures how difficult it is to guess <math>\scriptstyle X</math>.
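As a small illustration (the distribution below is hypothetical, not part of the lemma), the min-entropy of a finite distribution follows directly from this definition:

<syntaxhighlight lang="python">
import math

def min_entropy(probs):
    """H_inf(X) = -log2(max_x Pr[X = x])."""
    return -math.log2(max(probs))

# A biased four-valued distribution: the best guess succeeds with probability 1/2.
probs = [0.5, 0.25, 0.125, 0.125]
print(min_entropy(probs))  # 1.0 bit, whereas the Shannon entropy is 1.75 bits
</syntaxhighlight>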
 
<math>\scriptstyle \delta(X,\, Y) \;=\; \frac{1}{2} \sum_v \left | \Pr[X=v] \,-\, \Pr[Y=v] \right |</math> is the [[statistical distance]] between <math>\scriptstyle X</math> and <math>\scriptstyle Y</math>.
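Computed over a common support, this definition gives, for example (values hypothetical, continuing the distribution above):

<syntaxhighlight lang="python">
def statistical_distance(p, q):
    """delta(X, Y) = (1/2) * sum over v of |Pr[X = v] - Pr[Y = v]|."""
    return 0.5 * sum(abs(pv - qv) for pv, qv in zip(p, q))

biased  = [0.5, 0.25, 0.125, 0.125]  # the distribution from the min-entropy example
uniform = [0.25, 0.25, 0.25, 0.25]
print(statistical_distance(biased, uniform))  # 0.25
</syntaxhighlight>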
 
==See also==
* [[Universal hashing]]
* [[Min-entropy]]
* [[Rényi entropy]]
* [[Information theoretic security]]
 
==References==
*[http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=45477 C. H. Bennett, G. Brassard, and J. M. Robert. ''Privacy amplification by public discussion''. SIAM Journal on Computing, 17(2):210–229, 1988.]
*[http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=73009 R. Impagliazzo, L. A. Levin, and M. Luby. ''Pseudo-random generation from one-way functions''. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC '89), pages 12–24. ACM Press, 1989.]
*[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=10153&arnumber=476316&type=ref C. H. Bennett, G. Brassard, C. Crépeau, and U. Maurer. ''Generalized privacy amplification''. IEEE Transactions on Information Theory, 41, 1995.]
*[http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=312213 J. Håstad, R. Impagliazzo, L. A. Levin, and M. Luby. ''A Pseudorandom Generator from any One-way Function''. SIAM Journal on Computing, 28(4):1364–1396, 1999.]
 
[[Category:Theory of cryptography]]
[[Category:Probability theorems]]
