Using phrase_from_file

The lazy parsing predicate phrase_from_file/2,3 is useful, especially since it provides nice feedback for file position when an error is encountered.

It does, however, place certain restrictions on the DCG, particularly in handling eof.
I asked Jan to outline those restrictions, and he suggested I ask on Discourse so everyone could benefit. So this is me asking Jan (or anybody else who can help) what steps one needs to take to make one’s DCG happy with phrase_from_file.

I doubt I can produce an exact description of the limitations. I think all the normal clean DCG stuff should work fine. If you get access to the input list, though, you have to be a bit careful, as only predicates that nicely unify against the input list work. So, for example, length/2 doesn’t work. There could be other corner cases, but I guess the best way to get hold of these is to discuss them starting with a concrete case.

If you have a grammar you think should be fine but that doesn’t work with phrase_from_file/2, try using read_file_to_codes/3 and run the grammar on the concrete list. If the two yield different results, something is wrong …
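
A minimal sketch of that comparison, in case it helps. The helper name compare_parsers/4 is made up for illustration; it only reports whether the grammar succeeds each way, and it assumes the grammar consumes the whole input.

```prolog
:- use_module(library(pure_input)).   % phrase_from_file/2
:- use_module(library(readutil)).     % read_file_to_codes/3

:- meta_predicate compare_parsers(//, +, -, -).

% compare_parsers(:Grammar, +File, -Lazy, -Plain)
% Hypothetical helper: Lazy and Plain are true/false depending on whether
% Grammar succeeds on the lazy list and on the fully read code list.
% \+ is used so that bindings made by the first run do not leak into the second.
compare_parsers(Grammar, File, Lazy, Plain) :-
    (   \+ phrase_from_file(Grammar, File)
    ->  Lazy = false
    ;   Lazy = true
    ),
    read_file_to_codes(File, Codes, []),
    (   \+ phrase(Grammar, Codes)
    ->  Plain = false
    ;   Plain = true
    ).
```

If Lazy and Plain differ for the same file, that file and grammar make a good concrete case to post.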

I might be wrong about this, but since there is no “Rest” argument to phrase_from_file, you need to use remainder//1 instead. This is already in the example code snippet in the docs to phrase_from_file/2.
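
For reference, a sketch along the lines of that documentation example (written from memory, so check it against the current docs). file_contains/2 and match//1 are just the names used here; Pattern is assumed to be a code list.

```prolog
:- use_module(library(pure_input)).   % phrase_from_file/2
:- use_module(library(dcg/basics)).   % string//1, remainder//1

% file_contains(+File, +Pattern)
% True if the code list Pattern occurs somewhere in File.
file_contains(File, Pattern) :-
    phrase_from_file(match(Pattern), File).

% Skip anything, match Pattern, then let remainder//1 take the rest,
% since phrase_from_file/2 has no Rest argument.
match(Pattern) -->
    string(_),
    string(Pattern),
    remainder(_).
```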

Are you working with students doing an example using courses, students and exam scheduling? I am working on a similar question on StackOverflow. The code works without phrase_from_file, but when I change it to phrase_from_file it fails on remainder//1 from dcg/basics, because the data coming in from phrase_from_file is a lazy list (open list) and remainder//1 is expecting a closed list.

No, that’s not me, but this is a classic example of the same thing. I’m working on bvh_animation.

I’ll submit a small test case later, at work now.

I take it that a concrete list is a closed list or more commonly just a list. Any other names for list that I might be missing? I know of list, closed list, partial list, open list and difference list.

If so I need to update my Difference List wiki.

Concrete list is not a well-defined term. Sorry. You should add lazy list to your list though. A lazy list is a list whose tail is an attributed variable, and when you unify against the tail the list gets longer (and will either terminate in [] or in a new attributed variable).
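
To make the distinction concrete, here is a small sketch that classifies a list by its tail without unifying against it. list_kind/2 and tail_of/2 are hypothetical names, and the `attributed` case only tells you the tail is an attributed variable, not that it came from phrase_from_file.

```prolog
% tail_of(+List, -Tail)
% Walk the list cells and return whatever terminates them.
% The var/1 check comes first so an unbound (or attributed) tail
% is returned as-is rather than being unified with a list cell.
tail_of(T, T) :- var(T), !.
tail_of([_|T0], T) :- !, tail_of(T0, T).
tail_of(T, T).

% list_kind(+List, -Kind)
% Kind is closed, attributed (e.g. a lazy list), partial, or not_a_list.
list_kind(List, Kind) :-
    tail_of(List, Tail),
    (   Tail == []   -> Kind = closed
    ;   attvar(Tail) -> Kind = attributed
    ;   var(Tail)    -> Kind = partial
    ;   Kind = not_a_list
    ).
```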

In

student(Student) -->
    "\t",
    string_without("\n", S),
    { atom_codes(Student, S) },
    "\n".

it requires the student line to end with a \n, but in the real world a file may not end with a \n, hence the need for remainder//1.
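
One possible way to tolerate a missing final newline, as a sketch only: line_end//0 is a name invented here, and it relies on eos//0 from dcg/basics. I have only reasoned about it on a plain code list, so it is worth comparing against phrase_from_file on a real file.

```prolog
:- use_module(library(dcg/basics)).   % string_without//2, eos//0

student(Student) -->
    "\t",
    string_without(`\n`, S),
    { atom_codes(Student, S) },
    line_end.

% line_end//0 (hypothetical helper): a newline, or the end of the input
% when the last line of the file has no trailing newline.
line_end --> "\n", !.
line_end --> eos.
```

With this, both `\talice\n` and a final `\talice` with nothing after it are accepted.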

In my case it’s necessary to read files that already exist, and modifying them is not really an option. The file format is only defined by usage, but there are millions of BVH files in the world.

The reason I am taking a strong stance on this one is that I have worked on real-world production systems where files were transferred from outside companies. They would work fine for quite some time (weeks/months/years), then a new programmer on the sending side would make a change to the code, and the new file would come over and cause the receiving process to fail on our side. Since it was a background task, the problem would typically first be noticed by an end user on our side and then escalate, and by the time it reached my desk several layers of management had been in the loop. So in my world, handling missing end of lines is an option I take. :wink:
