Confused with lazy_list_location

Hi,
Relative newcomer to Prolog. I’ve been experimenting with DCGs and hit
an oddity. As part of outputting the transformed input I wanted to add
the file and line where the input was found. I added:
lazy_list_location(Loc)
to my rule and Loc to the output term. At some point I discovered
the parser was occasionally hiccuping. Hours later I tracked it down
to the use of lazy_list_location//1. I read the docs as saying that I
could call this at any time (but the use of “(error)” makes me wonder…)

Below is some sample code. Create an input file with 4 items per line.
My sample looks like (tab separated):

abcdefghi       IN      A      127.0.0.1
Repeat the line until the file size exceeds about 40K, then run:
?- open(output,write,S),set_output(S),proc_file(test_data),close(S).

Search the output file for “eh?”. It’s notable, I suspect, that it loses
track very near 4K boundaries!

Happens on both 8.0.1 and 7.6, on Linux.

Is this one of those “surprising” things in Prolog? :-) Is there a better
way to get file and location?

Cheers,
Eddie

a_space(Char):- char_type(Char, space).
ws --> [W], { a_space(W) }, ws.
ws --> [W], { a_space(W) }.

word_([H|T])  --> [H], {\+ a_space(H)}, word__(T).
word__([H|T]) --> [H], {\+ a_space(H)}, word__(T).
word__([])    --> [].
word(W) --> word_(W1), {atom_string(W1,W)}.

record --> [].
record --> word(A), ws, word(B), ws, word(C), ws, word(D), "\n",
	   lazy_list_location(Loc),
	   {writeln(record(A,B,C,D,Loc))}, record.
record --> [_], {writeln('eh?')}, record.

proc_file(F):- phrase_from_file(record, F).

I have never used lazy_list_location//1, so I can’t answer that directly, but I have used phrase_from_file/2, which did not keep track of the line and position. As I noted in the other topic referenced below, my solution was to just thread counters through the DCG. If you find yourself adding many variables to a DCG, have a look at the edcgs package.
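
A minimal sketch of what threading a line counter through the DCG could look like. This is not the original poster's grammar: records//2, field//1, blanks//0 and the result terms rec/5 and bad/1 are hypothetical names chosen for illustration, and results are collected in a list rather than written out. It assumes the input is a list of character codes, which is what phrase_from_file/2 produces by default.

```prolog
:- use_module(library(pure_input)).   % phrase_from_file/2

% records(+Line0, -Records)//0: parse every line, threading the line number.
% rec/5 and bad/1 are hypothetical result terms, not from the original post.
records(Line0, [rec(Line0,A,B,C,D)|Rs]) -->
    field(A), blanks, field(B), blanks, field(C), blanks, field(D),
    [0'\n], !,
    { Line is Line0 + 1 },
    records(Line, Rs).
records(Line0, [bad(Line0)|Rs]) -->       % malformed line: skip it, keep counting
    [C], !,
    ( { C == 0'\n } -> [] ; rest_of_line ),
    { Line is Line0 + 1 },
    records(Line, Rs).
records(_, []) --> [].                    % end of input

% field(-Atom)//0: one or more non-whitespace codes.
field(Atom) --> nonblank(C), nonblanks(Cs), { atom_codes(Atom, [C|Cs]) }.

nonblanks([C|Cs]) --> nonblank(C), !, nonblanks(Cs).
nonblanks([])     --> [].

nonblank(C) --> [C], { \+ code_type(C, space) }.

% blanks//0: one or more spaces or tabs, never a newline.
blanks --> blank, blanks_rest.

blanks_rest --> blank, !, blanks_rest.
blanks_rest --> [].

blank --> [C], { code_type(C, white) }.

% rest_of_line//0: consume up to and including the next newline (or to the end).
rest_of_line --> [0'\n], !.
rest_of_line --> [_], !, rest_of_line.
rest_of_line --> [].
```

With these definitions, a call such as `?- phrase_from_file(records(1, Rs), test_data).` would number lines starting from 1. In this sketch a final line without a trailing newline is reported as bad/1.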

A related question of interest: Line_count/2 and phrase_from_file

Surely lazy_list_location//1 is intended for error messages. It is pretty slow as it has to walk over the list until the freeze node that realizes the lazy part, then has to analyze that, and do quite a bit of work to convert the file position at the end of the block back to the real position.

I think you have two options. One is to count characters and/or lines as @EricGT suggests. In this case though, I’d probably go for a loop using read_line_to_codes/2 and use a DCG per line. In fact, I’d probably discard DCGs altogether and use read_line_to_string/2 and split_string/4, followed by appropriate type conversion for the resulting fields. I admit that is not really elegant Prolog. The problem is so simple, though, that the full power of DCGs is overkill, and the low-level byte-managing predicates most probably do the job quicker and more simply.
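
A rough sketch of the read_line_to_string/2 plus split_string/4 loop, under the assumption that each line holds four fields separated by tabs or spaces. The predicate names file_records/2, stream_records/3 and line_record/3 and the result terms rec/5 and bad/2 are made up for illustration. Fields are left as strings, so any type conversion would still need to be added.

```prolog
:- use_module(library(readutil)).   % read_line_to_string/2
:- use_module(library(apply)).      % exclude/3

% file_records(+File, -Records): one result term per input line.
file_records(File, Records) :-
    setup_call_cleanup(
        open(File, read, In),
        stream_records(In, 1, Records),
        close(In)).

% stream_records(+In, +LineNo, -Records): loop over lines, counting as we go.
stream_records(In, LineNo, Records) :-
    read_line_to_string(In, Line),
    (   Line == end_of_file
    ->  Records = []
    ;   line_record(Line, LineNo, Record),
        Records = [Record|Rest],
        Next is LineNo + 1,
        stream_records(In, Next, Rest)
    ).

% line_record(+Line, +LineNo, -Record): split one line into its four fields.
line_record(Line, LineNo, Record) :-
    split_string(Line, "\t ", "", Parts0),
    exclude(==(""), Parts0, Parts),       % drop empties from repeated separators
    (   Parts = [A,B,C,D]
    ->  Record = rec(LineNo, A, B, C, D)
    ;   Record = bad(LineNo, Line)        % wrong number of fields
    ).
```

Here the line number comes for free from the loop counter, so nothing has to be recovered from the stream position afterwards.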

Thanks for the clarification. I had no idea it would be doing so much
work. The actual problem is considerably more complex so I think DCGs
are still a good approach. And now that I’m being forced to think (!)
I realise that tracking the line number is probably not as hard as I
originally thought.

The edcgs package also looks good for a future iteration. Thanks to both.

Author of the package edcgs here. I am not sure if @EricGT really wanted to point to this package or to edcg instead, which lets you define more accumulators, e.g., to count things. On the other hand, edcgs might assist you in automatically getting a corresponding parse tree based on the nonterminals encoded in the DCG.

In order to stop confusing people with the very similar names, I decided to rename the edcgs package to dcg4pt (short for DCG for Parse Trees). @jan, I don’t know how to delete the legacy edcgs package from the list online, maybe you can just unpublish it?

Yes, thanks.

Thanks, that would help.