Radon–Nikodym theorem: Difference between revisions

en>Koumz
No edit summary
 
en>Yobot
m WP:CHECKWIKI error fixes using AWB (9773)
In [[computer science]], '''top-down parsing''' is a parsing strategy where one first looks at the highest level of the [[parse tree]] and works down the parse tree by using the rewriting rules of a [[formal grammar]]. [[LL parser]]s are a type of parser that uses a top-down parsing strategy.
 
Top-down parsing is a strategy of analyzing unknown data relationships by hypothesizing general [[parse tree]] structures and then considering whether the known fundamental structures are compatible with the hypothesis. It occurs in the analysis of both natural [[language]]s and [[computer language]]s.
 
Top-down parsing can be viewed as an attempt to find left-most derivations of an input-stream by searching for [[parse tree|parse-trees]] using a top-down expansion of the given [[formal grammar]] rules. Tokens are consumed from left to right. Inclusive choice is used to accommodate [[ambiguity]] by expanding all alternative right-hand-sides of grammar rules.<ref name="AhoSethiUllman 1986">{{cite book |last1=Aho |first1=Alfred V. |authorlink1=Alfred Aho |last2=Sethi |first2=Ravi |authorlink2=Ravi Sethi |last3=Ullman |first3=Jeffrey D. |authorlink3=Jeffrey Ullman |title=Compilers, principles, techniques, and tools |year=1986 |publisher=Addison-Wesley Pub. Co. |isbn=978-0201100884 |edition=Rep. with corrections.}}</ref>
 
Simple implementations of top-down parsing do not terminate for [[left recursion|left-recursive]] grammars, and top-down parsing with backtracking may have [[Exponential time|exponential]] time complexity with respect to the length of the input for ambiguous [[Context-free grammar|CFGs]].<ref name="AhoUllman 1972">{{cite book |last1=Aho |first1=Alfred V. |authorlink1=Alfred Aho |last2=Ullman |first2=Jeffrey D. |authorlink2=Jeffrey Ullman |title=The Theory of Parsing, Translation, and Compiling (Volume 1: Parsing.) |year=1972 |publisher=Prentice-Hall |location=Englewood Cliffs, NJ |isbn=978-0139145568 |edition=Repr.}}</ref> However, Frost, Hafiz, and Callaghan<ref name="FrostHafizCallaghan 2007">Frost, R., Hafiz, R. and Callaghan, P. (2007) "Modular and Efficient Top-Down Parsing for Ambiguous Left-Recursive Grammars." ''10th International Workshop on Parsing Technologies (IWPT), ACL-SIGPARSE'', Pages: 109 - 120, June 2007, Prague.</ref><ref name="FrostHafizCallaghan 2008">Frost, R., Hafiz, R. and Callaghan, P. (2008) "Parser Combinators for Ambiguous Left-Recursive Grammars." ''10th International Symposium on Practical Aspects of Declarative Languages (PADL), ACM-SIGPLAN'', Volume 4902/2008, Pages: 167-181, January 2008, San Francisco.</ref> have created more sophisticated top-down parsers that [[#Accommodating left recursion in top-down parsing|accommodate ambiguity and left recursion]] in polynomial time and generate polynomial-sized representations of the potentially exponential number of parse trees.
 
==Programming language application==
A [[compiler]] translates input written in a programming language into assembly language or an internal representation; its parser does so by matching the incoming symbols to [[Formal_grammar#The_syntax_of_grammars|production rules]]. Production rules are commonly defined using [[Backus-Naur form]]. An [[LL parser]] is a type of parser that does top-down parsing: it applies a production rule to the incoming symbols, works through the right-hand side of that rule starting from its left-most symbol, and, whenever it encounters a non-terminal symbol, descends into the production rules for that non-terminal before continuing with the next symbol of the current rule. Parsing therefore starts at the left of the right-hand side of a production rule and proceeds down the parse tree for each new non-terminal before moving on to the next symbol.
 
For example:
 
* <math>A \rightarrow aBC</math>
* <math>B \rightarrow c \mid cd</math>
* <math>C \rightarrow df \mid eg</math>
 
Here the parser matches <math>A \rightarrow aBC</math> first and attempts <math>B \rightarrow c \mid cd</math> next; then <math>C \rightarrow df \mid eg</math> is tried. Some grammars need more lookahead than others. If, for every non-terminal, no string produced by one production starts with the same symbol as a string produced by another production, the parser can choose the production by reading ahead one token; such a grammar is LL(1), where the (1) signifies one token of lookahead. When alternatives share a leading symbol, as both alternatives for <math>B</math> do here, the parser must look ahead more than one symbol. This grammar is LL(3): after <math>c</math>, a following <math>d</math> may belong either to <math>B</math> or to <math>C</math>, and only the third token decides. An [[ambiguity|ambiguous]] grammar, in which some input has more than one parse tree, cannot be parsed deterministically by an LL(''k'') parser for any ''k''.
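The lookahead decision for the example grammar can be sketched as a small hand-written recursive descent parser. This is only an illustrative sketch: the function names (<code>parse_A</code>, <code>expect</code>, <code>peek</code>) and the tuple-based tree shape are assumptions made for the example, and the input is taken to be a string of single-character tokens.

```python
def parse_A(tokens):
    """Parse A -> a B C with B -> c | cd and C -> df | eg.

    Returns a nested tuple as a parse tree, or raises SyntaxError.
    """
    pos = 0  # index of the next unread token

    def expect(sym):
        # Consume one token, which must equal `sym`.
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == sym:
            pos += 1
        else:
            raise SyntaxError(f"expected {sym!r} at position {pos}")

    def peek(k):
        # Look k tokens ahead without consuming anything.
        return tokens[pos + k] if pos + k < len(tokens) else None

    def parse_B():
        expect("c")
        # Both alternatives start with c. The d belongs to B only if
        # it is followed by the start of C (d or e); otherwise it
        # already belongs to C -> df. Counting the c, this inspects
        # three tokens.
        if peek(0) == "d" and peek(1) in ("d", "e"):
            expect("d")
            return ("B", "c", "d")
        return ("B", "c")

    def parse_C():
        # One token of lookahead is enough here: d versus e.
        if peek(0) == "d":
            expect("d")
            expect("f")
            return ("C", "d", "f")
        expect("e")
        expect("g")
        return ("C", "e", "g")

    expect("a")
    b = parse_B()
    c = parse_C()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected input at position {pos}")
    return ("A", "a", b, c)
```

For instance, <code>parse_A("acdf")</code> chooses <math>B \rightarrow c</math>, while <code>parse_A("acddf")</code> chooses <math>B \rightarrow cd</math>; the two inputs agree on their first three tokens and differ only in what follows the first <code>d</code>.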
 
The common solution to this problem is to use an [[LR parser]], which is a type of [[shift-reduce parser]], and does [[bottom-up parsing]].
 
== Accommodating left recursion in top-down parsing ==
A [[formal grammar]] that contains [[left recursion]] cannot be [[parsing|parsed]] by a naive [[recursive descent parser]] unless it is converted to a weakly equivalent right-recursive form. However, more recent research demonstrates that left-recursive grammars (along with all other forms of general [[Context-free grammar|CFGs]]) can be accommodated in a more sophisticated top-down parser by use of curtailment. In 2006, Frost and Hafiz described a [[recognizer|recognition]] algorithm that accommodates [[ambiguity|ambiguous]] grammars and curtails an ever-growing direct left-recursive parse by imposing depth restrictions with respect to input length and current input position.<ref name="FrostHafiz2006">Frost, R. and Hafiz, R. (2006) "A New Top-Down Parsing Algorithm to Accommodate Ambiguity and Left Recursion in Polynomial Time." ''ACM SIGPLAN Notices'', Volume 41 Issue 5, Pages: 46 - 54.</ref> In 2007, Frost, Hafiz and Callaghan extended that algorithm to a complete [[parsing]] algorithm that accommodates indirect left recursion (by comparing previously computed context with current context) as well as direct left recursion in [[polynomial]] time, and that generates compact polynomial-size representations of the potentially exponential number of parse trees for highly ambiguous grammars.<ref name="FrostHafizCallaghan 2007"/> The algorithm has since been implemented as a set of [[parser combinator]]s written in the [[Haskell (programming language)|Haskell]] programming language. The implementation details of this set of combinators can be found in a paper by the authors<ref name="FrostHafizCallaghan 2008"/> that was presented at PADL'08.
The [http://www.cs.uwindsor.ca/~hafiz/proHome.html X-SAIGA] site has more about the algorithms and implementation details.
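The idea of curtailing a left-recursive descent with a depth bound tied to the remaining input can be illustrated with a deliberately simplified recognizer. This is a sketch under stated assumptions, not the published algorithm: it handles only the toy grammar shown, omits memoization and the context comparison used for indirect left recursion, and the names <code>GRAMMAR</code> and <code>recognize</code> are hypothetical.

```python
GRAMMAR = {
    # E -> E "+" "n" | "n"   (directly left-recursive)
    "E": [["E", "+", "n"], ["n"]],
}

def recognize(symbol, tokens, pos, depth=None):
    """Return the set of positions at which `symbol`, started at `pos`, can end."""
    depth = depth or {}
    if symbol not in GRAMMAR:
        # Terminal: matches exactly one token.
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()

    key = (symbol, pos)
    count = depth.get(key, 0)
    # Curtailment: a non-terminal may not be re-entered at the same
    # position more often than the remaining input could justify.
    if count > len(tokens) - pos + 1:
        return set()
    inner = dict(depth)       # counts are local to the current call chain
    inner[key] = count + 1

    ends = set()
    for alternative in GRAMMAR[symbol]:
        starts = {pos}
        for sym in alternative:
            reached = set()
            for start in starts:
                reached |= recognize(sym, tokens, start, inner)
            starts = reached
        ends |= starts
    return ends
```

With the token list <code>["n", "+", "n", "+", "n"]</code>, the call <code>recognize("E", tokens, 0)</code> terminates despite the left recursion and reports that an <code>E</code> can end after one, three, or five tokens; the input is accepted as a whole when <code>len(tokens)</code> is in the returned set.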
 
== Time and space complexity of top-down parsing ==
When a top-down parser tries to parse an [[ambiguous]] input with respect to an ambiguous CFG, it may need an exponential number of steps (with respect to the length of the input) to try all alternatives of the CFG in order to produce all possible parse trees, which would eventually require exponential memory space. The problem of exponential time complexity in top-down parsers constructed as sets of mutually recursive functions was solved by Norvig in 1991.<ref name=" Norvig 1991">Norvig, P. (1991) "Techniques for automatic memoisation with applications to context-free parsing." ''Computational Linguistics'', Volume 17, Issue 1, Pages: 91 - 98.</ref> His technique is similar to the use of dynamic programming and state-sets in [[Earley parser|Earley's algorithm]] (1970), and to the tables in the [[CYK algorithm]] of Cocke, Younger and Kasami.
 
The key idea is to store the results of applying a parser <code>p</code> at position <code>j</code> in a memotable and to reuse those results whenever the same situation arises. Frost, Hafiz and Callaghan<ref name="FrostHafizCallaghan 2007"/><ref name="FrostHafizCallaghan 2008"/> also use [[memoization]] to avoid redundant computations, accommodating any form of CFG in [[polynomial]] time ([[Big O notation|Θ]](''n''<sup>4</sup>) for left-recursive grammars and [[Big O notation|Θ]](''n''<sup>3</sup>) for non-left-recursive grammars). Their top-down parsing algorithm also requires only polynomial space for a potentially exponential number of ambiguous parse trees, by means of 'compact representation' and 'local ambiguities grouping'. Their compact representation is comparable with Tomita's compact representation of [[bottom-up parsing]].<ref name=" Tomita1985">Tomita, M. (1985) "Efficient Parsing for Natural Language." ''Kluwer, Boston, MA''.</ref>
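The memotable idea can be sketched with Python's standard <code>functools.lru_cache</code>, which here plays the role of the table keyed by parser and position. This is a minimal recognizer under stated assumptions, not the implementation of any of the cited authors: the grammar is a made-up ambiguous one without left recursion, and <code>GRAMMAR</code>, <code>make_recognizer</code> and <code>ends</code> are hypothetical names.

```python
from functools import lru_cache

GRAMMAR = {
    # S -> "x" S S | (empty)   highly ambiguous, but not left-recursive
    "S": [["x", "S", "S"], []],
}

def make_recognizer(tokens):
    """Build a memoized recognizer over one fixed input."""

    @lru_cache(maxsize=None)  # the memotable, keyed by (symbol, pos)
    def ends(symbol, pos):
        # Return every position at which `symbol`, started at `pos`, can end.
        if symbol not in GRAMMAR:
            matched = pos < len(tokens) and tokens[pos] == symbol
            return frozenset({pos + 1}) if matched else frozenset()
        result = set()
        for alternative in GRAMMAR[symbol]:
            starts = {pos}
            for sym in alternative:
                # Each (sym, start) pair is computed once and then reused.
                starts = {end for start in starts for end in ends(sym, start)}
            result |= starts
        return frozenset(result)

    return ends
```

For an input of ''n'' tokens the table holds at most one entry per grammar symbol and position, so the number of distinct computations stays polynomial even though the number of parse trees grows very quickly. After <code>ends = make_recognizer("xxxx")</code>, the call <code>ends("S", 0)</code> yields every position from 0 to 4, and <code>ends.cache_info()</code> shows how many table entries were needed.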
 
Packrat parsers, which use [[Parsing expression grammar|parsing expression grammars]] (PEGs), another representation of grammars, provide an elegant and powerful parsing algorithm. See [[Parsing expression grammar]].
 
==Disagreement with Facts==
While LR parsing can be said to be better suited to programming languages, its grammars can also feel backwards to specify.{{Citation needed|date=June 2013}}
 
LL parsers can be small, powerful and readable, although they can be slower.{{Citation needed|date=June 2013}} The time taken depends greatly on the BNF grammar (table), just as a bad table would cause problems for an LALR parser.{{Citation needed|date=June 2013}} LL parsers can require a lot of memory.{{Citation needed|date=June 2013}} [http://sourceforge.net/projects/bnf2xml/ bnf2xml] supports several styles of recursion, and expressions are not a problem for LL parsers.{{Citation needed|date=June 2013}} An LALR parser with a long path back to the top has more problems than an LL parser with a short path to the next element.{{Citation needed|date=June 2013}}
 
==See also==
* [[Bottom-up parsing]]
* [[Parsing]]
* [[Recursive descent parser]]
* [[Parsing expression grammar]]
 
==References==
{{reflist}}
 
== External links ==
* [http://www.cs.uwindsor.ca/~hafiz/proHome.html X-SAIGA] - eXecutable SpecificAtIons of GrAmmars
 
{{DEFAULTSORT:Top-Down Parsing}}
[[Category:Parsing algorithms]]
 
[[cs:Syntaktická analýza shora dolů]]
[[ko:하향식 파싱]]
[[hr:Parsiranje od vrha prema dnu]]
[[ja:トップダウン構文解析]]
[[pl:Analiza zstępująca]]
[[ro:Parsare top-down]]
[[ru:Нисходящий синтаксический анализ]]
[[sr:Анализа наниже]]
[[uk:Метод рекурсивного спуску]]

Revision as of 21:17, 4 December 2013
