Diffusion map: Difference between revisions

Latest revision as of 10:09, 7 January 2015

@@ Line 1: / Line 1: @@
-In [[natural language processing]], '''semantic compression''' is a process of compacting a lexicon used to build
+Land Economist Reister from Holland Landing, usually spends time with hobbies including house brewing, ganhando dinheiro na internet and pc activities. Lately has paid a visit to Archaeological Site of Olympia.<br><br>Feel free to surf to my web site; [http://www.comoganhardinheiro101.com/slide-central/ ganhar dinheiro]
-a textual document (or a set of documents) by reducing language heterogeneity, while maintaining text [[semantics]].
-As a result, the same ideas can be represented using a smaller set of words.
-Semantic compression is a form of [[lossy compression]]: some data is discarded, and the original document
-cannot be reconstructed by reversing the process.
-==Semantic compression by generalization==
-Semantic compression is achieved in two steps, using [[frequency list|frequency dictionaries]] and a [[semantic network]]:
-#	determining cumulated term frequencies to identify target lexicon,
-#	replacing less frequent terms with their hypernyms ([[generalization]]) from target lexicon.<ref>[http://dx.doi.org/10.1007/978-3-642-12090-9_10 D. Ceglarek, K. Haniewicz, W. Rutkowski, Semantic Compression for Specialised Information Retrieval Systems], Advances in Intelligent Information and Database Systems, vol. 283, p. 111-121, 2010</ref>
-Step 1 requires assembling word frequencies and
-information on semantic relationships, specifically [[hyponymy]]. Moving upwards in the word hierarchy,
-a cumulative concept frequency is calculated by adding the sum of the hyponyms' cumulative frequencies to the frequency of their hypernym:
-<math>cum f(k_{i}) = f(k_{i}) + \sum_{j} cum f(k_{j})</math> where <math>k_{i}</math> is a hypernym of <math>k_{j}</math>.
-Then, the desired number of words with the highest cumulative frequencies is chosen to build the target lexicon.
-In the second step, compression mapping rules are defined for the remaining words, in order to handle every occurrence
-of a less frequent hyponym as its hypernym in output text.
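The two steps above can be illustrated with a minimal Python sketch. The toy frequencies, the hyponym-to-hypernym links, and the function names (`cumulative_frequencies`, `target_lexicon`, `compression_map`) are all hypothetical and not taken from the cited paper. The sketch also assumes that each term has at most one hypernym and that the hierarchy contains no cycles.

```python
from collections import defaultdict

# Hypothetical toy data: raw term frequencies and hyponym -> hypernym links.
FREQ = {"animal": 1, "insect": 2, "bee": 5, "wasp": 3, "paper wasp": 1}
HYPERNYM = {
    "insect": "animal",
    "bee": "insect",
    "wasp": "insect",
    "paper wasp": "wasp",
}


def cumulative_frequencies(freq, hypernym):
    """Step 1: cumf(k_i) = f(k_i) + sum of cumf(k_j) over direct hyponyms k_j."""
    children = defaultdict(list)
    for hypo, hyper in hypernym.items():
        children[hyper].append(hypo)

    cum = {}

    def visit(term):
        # Memoised post-order walk; assumes the hierarchy is acyclic.
        if term not in cum:
            cum[term] = freq.get(term, 0) + sum(visit(c) for c in children[term])
        return cum[term]

    for term in set(freq) | set(hypernym) | set(hypernym.values()):
        visit(term)
    return cum


def target_lexicon(cum, size):
    """Keep the `size` terms with the highest cumulative frequency."""
    ranked = sorted(cum, key=lambda t: (-cum[t], t))  # ties broken alphabetically
    return set(ranked[:size])


def compression_map(terms, hypernym, lexicon):
    """Step 2: map each term to its nearest ancestor found in the lexicon."""
    mapping = {}
    for term in terms:
        current = term
        while current not in lexicon and current in hypernym:
            current = hypernym[current]
        # Terms with no ancestor in the lexicon are left unchanged.
        mapping[term] = current if current in lexicon else term
    return mapping


cum = cumulative_frequencies(FREQ, HYPERNYM)
lexicon = target_lexicon(cum, 3)
mapping = compression_map(FREQ, HYPERNYM, lexicon)
```

With this toy data and a target lexicon of three words, "wasp" and "paper wasp" are both mapped to "insect", while "bee" is frequent enough to stay in the lexicon.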
-;Example
-The fragment of text below has been processed by semantic compression. Words in bold have been replaced by their hypernyms.
-<blockquote>They are both '''nest''' building '''social insects''', but '''paper wasps''' and honey '''bees''' '''organize''' their '''colonies'''
-in very different '''ways'''. In a new study, researchers report that despite their '''differences''', these insects
-'''rely on''' the same network of genes to guide their '''social behavior'''. The study appears in the Proceedings of the
-'''Royal Society B''': Biological Sciences. Honey '''bees''' and '''paper wasps''' are separated by more than 100 million years of
-'''evolution''', and there are '''striking differences''' in how they divvy up the work of '''maintaining''' a '''colony'''.</blockquote>
-The procedure outputs the following text:
-<blockquote>They are both '''facility''' building '''insect''', but '''insect''' and honey '''insects''' '''arrange''' their '''biological groups'''
-in very different '''structure'''. In a new study, researchers report that despite their '''difference of opinions''', these insects
-'''act''' the same network of genes to '''steer''' their '''party demeanor'''. The study appears in the proceeding of the
-'''institution bacteria''' Biological Sciences. Honey '''insects''' and '''insect''' are separated by more than hundred million years of
-'''organic process''', and there are '''impinging difference of opinions''' in how they divvy up the work of '''affirming''' a '''biological group'''.</blockquote>
-==Implicit semantic compression==
-A natural tendency to keep natural language expressions concise can be perceived as a form of implicit semantic compression, achieved by omitting meaningless words or redundant meaningful words (especially to avoid [[pleonasm]]s).<ref>[http://dx.doi.org/10.3115/990100.990155 N. N. Percova, On the types of semantic compression of text],
-COLING '82 Proceedings of the 9th Conference on Computational Linguistics, vol. 2, p. 229-231, 1982</ref>
-==Applications and advantages==
-In the [[vector space model]], compacting a lexicon leads to a reduction of [[curse of dimensionality|dimensionality]], which results in lower
-[[computational complexity]] and a positive influence on efficiency.
-Semantic compression is advantageous in information retrieval tasks, improving their effectiveness (in terms of both precision and recall).<ref>[http://dl.acm.org/citation.cfm?id=1947662.1947683 D. Ceglarek, K. Haniewicz, W. Rutkowski, Quality of semantic compression in classification] Proceedings of the 2nd International Conference on Computational Collective Intelligence: Technologies and Applications, vol. 1, p. 162-171, 2010</ref> This is due to more precise descriptors (reduced effect of language diversity – limited language redundancy, a step towards a controlled dictionary).
-As in the example above, it is possible to display the output as natural text (re-applying inflexion, adding stop words).
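The dimensionality reduction described in this section can be shown with a small bag-of-words sketch in Python. The `MAPPING` table, the `bag_of_words` helper, and the two sample documents are hypothetical illustrations, not part of any cited system.

```python
from collections import Counter

# Hypothetical compression mapping (hyponym -> hypernym), for illustration only.
MAPPING = {"bees": "insect", "wasps": "insect"}


def bag_of_words(tokens, mapping=None):
    """Count tokens, optionally replacing each one through the mapping."""
    mapping = mapping or {}
    return Counter(mapping.get(tok, tok) for tok in tokens)


docs = [
    "bees build a nest".split(),
    "wasps build a nest".split(),
]

# Each distinct term is one dimension of the vector space.
raw_vocab = set().union(*(bag_of_words(d) for d in docs))
compressed_vocab = set().union(*(bag_of_words(d, MAPPING) for d in docs))
```

Here the raw vocabulary has five dimensions and the compressed one has four. After compression the two documents have identical term vectors, which illustrates the reduced effect of language diversity.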
-==See also==
-* [[Text simplification]]
-* [[Lexical substitution]]
-* [[Information theory]]
-* [[Quantities of information]]
-==References==
-<references/>
-==External links==
-* [http://semantic.net.pl/semantic_compression.php Semantic compression on Project SENECA (Semantic Networks and Categorization) website]
-[[Category:Information retrieval]]
-[[Category:Natural language processing]]
-[[Category:Quantitative linguistics]]
-[[Category:Computational linguistics]]
