Most efficient DCG for text parsing?

jan · July 4, 2021, 7:21am

A good option is to use term_expansion/2 to generate the 127 rules for you; typing them out by hand is tedious. From an implementation point of view, though, the explicit rules are most likely the best solution.
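As a rough sketch of what such generation could look like in SWI-Prolog: the thread does not show the 127 rules themselves, so `ascii_code//1`, the marker term `generate_code_rules/2` and the range 1..127 are hypothetical stand-ins. The only system predicates assumed are term_expansion/2 and dcg_translate_rule/2.

```prolog
% Hypothetical marker term: when the compiler reads generate_code_rules(Low, High),
% term_expansion/2 replaces it with one translated DCG rule per character code.
term_expansion(generate_code_rules(Low, High), Clauses) :-
    findall(Clause,
            ( between(Low, High, Code),
              % Build the rule  ascii_code(Code) --> [Code]  and translate
              % it to an ordinary clause.
              dcg_translate_rule((ascii_code(Code) --> [Code]), Clause)
            ),
            Clauses).

% This line expands into 127 clauses for ascii_code//1 (range is an assumption).
generate_code_rules(1, 127).
```

After loading, a goal such as `phrase(ascii_code(C), [0'a])` should succeed with one of the generated rules. Because every rule has a distinct first argument, the generated clauses can also be selected by clause indexing when the code is already known.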

Opinions vary. Roughly, you have three options:

1. Use var/1 (nonvar/1) checks in the rules.
2. Use delays (when/2, freeze/2).
3. Write two DCGs.

All three have their merits. The advantage of the first two is that they keep the code for parsing and generation together, which makes it easier to stay consistent if you change the rules over time. That is particularly true for the delay version.

Delays are relatively slow though, and you have to be careful if you also want to use cuts (or if->then;else): you must make sure all relevant delayed calls have materialized before committing. Committing is more or less obligatory in real-world DCGs, in particular for artificial languages, as not doing so typically leads to practically infinite backtracking in case of a syntax error. You also need to be sure not to leave any residual goals behind.

An explicit var/nonvar split is typically more efficient, but uglier.

Two distinct DCGs avoid all the var/nonvar tests and delays, but make it harder to keep the two in sync. On the other hand, the two DCGs are typically different anyway when it comes to comments, layout handling and other input ambiguities (escaped strings, floating point numbers, etc.). The parsing one often simply skips all that, while the generating one may wish to keep track of nesting to emit line breaks and indentation. Trying to combine all of that in one implementation is not always a good idea.
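To make the first option concrete, here is a minimal sketch of a var/nonvar split for a non-negative integer. The names `nat//1`, `digits//1` and `codes//1` are made up for illustration and are not from any library. It also shows a cut used to commit to the longest run of digits while parsing.

```prolog
% nat(?N)// : emit N as decimal digits if N is bound, otherwise parse it.
nat(N) -->
    { nonvar(N) }, !,                        % generation mode: N is known
    { integer(N), N >= 0, number_codes(N, Codes) },
    codes(Codes).
nat(N) -->                                   % parsing mode: N is unbound
    digits(Codes),
    { Codes = [_|_], number_codes(N, Codes) }.

% Parse the longest run of ASCII digits; the cut commits to each digit.
digits([D|Ds]) -->
    [D],
    { integer(D), D >= 0'0, D =< 0'9 }, !,
    digits(Ds).
digits([]) --> [].

% Emit (or match) a known list of codes.
codes([]) --> [].
codes([C|Cs]) --> [C], codes(Cs).
```

With this, `atom_codes('123', Cs), phrase(nat(N), Cs)` parses, and `phrase(nat(42), Cs)` generates the codes for "42". The mode test at the top of `nat//1` is the "uglier" part: every bidirectional rule needs one, but neither direction pays for delayed goals.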

… it depends on the use case, requirements and preferences …
