`term_string/2` returns `end_of_file` with null string at the second arg

jan · April 29, 2024, 6:29pm

We can clear end-of-file on streams. That is used for the console where ^D (end-of-file) terminates a consult of ?- [user]. The system resets the end-of-file marker such that we can continue. But, term_string/2 is a conversion for exactly one term.

peter.ludemann · April 29, 2024, 7:28pm

I strongly prefer pure logical code – in fact, that was the original aim of my thesis years ago (which got sidetracked quite a bit).

But bitter experience has taught me that pure logical code is difficult to debug, and the current state of the art for debuggers isn’t at all “capable” of finding these bugs. I would argue that the solution isn’t a better debugger but better declarations, such as det/1 or SSU, plus type inferencing on steroids.

(Constraints don’t play well with cuts; and experience has taught me that they’re even more difficult to get right.)

So, in the absence of other tools, I prefer errors to failure when it’s not obvious which is “best”. Before the det/1 directive, I wrote my own “must_succeed” support because tracking down unexpected failures was too difficult (my code tended to use enormous data structures, which weren’t very suitable for the debugger). A related problem was predicates that left unexpected backtrack points, for which det/1 also helped. But, as I’ve said, det/1 and SSU aren’t enough; which is why I prefer throwing an error when there’s an error (as opposed to an inconsistency, for which failure is the best approach).
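A “must_succeed” wrapper of the kind described above could look roughly like the following. This is only a sketch: the original code is not shown in this thread, and both the predicate name must_succeed/1 and the error term must_succeed_failed/1 are assumptions made for illustration.

```prolog
:- meta_predicate must_succeed(0).

% must_succeed(:Goal)
% Run Goal, committing to its first solution. If Goal fails, raise an
% error instead of failing silently, so an unexpected failure is reported
% where it happens rather than far up the call chain.
must_succeed(Goal) :-
    (   call(Goal)
    ->  true
    ;   throw(error(must_succeed_failed(Goal), _))  % hypothetical error term
    ).
```

The if-then-else also discards any choice points left by Goal, which addresses the “unexpected backtrack points” problem only by hiding it; det/1 reports it instead.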

ridgeworks · April 30, 2024, 3:01pm

Different strokes for different folks. I do distinguish between low-level code (built-ins and, to a lesser extent, library APIs) and application code, where it’s the programmer’s choice, and maybe even a requirement, that errors be generated.

For this particular case, if I had a vote, I would prefer fail - logically speaking, there is no term which corresponds to the empty string. The status quo (end_of_file) is the least attractive option, because it’s ambiguous:

?- term_string(T,""), term_string(T,"end_of_file").
T = end_of_file.

but I believe this has been fixed in 9.3.5. (Also read_term_from_atom/3, atom_to_term/3 ?)

However, I don’t anticipate any changes in current API’s. I guess I would like an efficient way of implementing:

silent(G) :- catch(G,_,fail).

So two things:

- somehow avoid meta-call (use goal expansion?)
- avoid building and throwing exception if handler = fail
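The first of the two points could be approached with goal expansion along the following lines. This is a sketch, not an existing library facility: it only inlines silent/1 into a direct catch/3 call at compile time, and does nothing about the second point, since the exception is still built and thrown.

```prolog
:- meta_predicate silent(0).

% Runtime definition, used when silent/1 is reached through call/1
% or from code that was compiled before the expansion rule was loaded.
silent(G) :- catch(G, _, fail).

:- multifile user:goal_expansion/2.
:- dynamic user:goal_expansion/2.

% Compile-time rewrite: replace silent(G) in clause bodies by the
% equivalent catch/3 call, saving one level of meta-calling.
user:goal_expansion(silent(G), catch(G, _, fail)).
```

The rule only affects clauses compiled after it is loaded, so it has to appear before the code that uses silent/1.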

Off-topic, but I’ve never really understood the remark that constraints don’t play well with cuts. Do you mean constraints as in library(clpfd) and dif/2, or constraints as in arbitrary user-defined code that has its execution conditionally deferred?

In some ways deferred execution is like concurrency - how does one manage errors in, e.g., concurrent_maplist?

kuniaki.mukai · April 30, 2024, 4:05pm

Thanks. I have checked that my Prolog CGI pages now work as expected with the new “end_of_file” syntax error for term_string.

However, I notice that we still see end_of_file or -1 for queries like those below. Since at_end_of_stream/0,1 is available, IMHO it seems better to hide such “end_of_file” marks from the user. I know -1 has been used since Edinburgh Prolog, so I guess such explicit but artificial end-of-file terms exist mainly for historical reasons, but I may be wrong.

?- open_string("",  S), get_code(S, C), close(S).
S = <stream>(0x600001ee8000),
C = -1.

?- open_string("",  S), at_end_of_stream(S), close(S).
S = <stream>(0x600001e72000).

?- open_string("",  S), at_end_of_stream(S), read(S, X), close(S).
S = <stream>(0x600001ef7100),
X = end_of_file.

jan · May 1, 2024, 7:30am

Both are part of tradition as well as the ISO standard. There is nothing wrong with -1, as it is not a valid character and thus an unambiguous indication of end-of-file. The problem with read/2 is that there is no “out-of-band” term. We do need it. Consider

hello. /* this is the end */

read/2 returns hello, leaving the file position after the full-stop (. + white). So, at_end_of_stream/1 still fails. Calling read again must return something. This is end_of_file. So, if your input is guaranteed to not contain end_of_file, you can test for that. If you want unambiguous handling of terms you need to check for read/2 to return end_of_file and test at_end_of_stream/1 to succeed.
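The combined test described above can be sketched as a read loop. The predicate name read_all_terms/2 is made up for this illustration; it treats end_of_file as the end of input only when at_end_of_stream/1 also succeeds.

```prolog
% read_all_terms(+In, -Terms)
% Collect all terms from stream In. A term end_of_file counts as the
% end of input only if the stream is really exhausted; otherwise it is
% kept as an ordinary term and reading continues.
read_all_terms(In, Terms) :-
    read(In, Term),
    (   Term == end_of_file,
        at_end_of_stream(In)
    ->  Terms = []
    ;   Terms = [Term|Rest],
        read_all_terms(In, Rest)
    ).
```

For example, after open_string("end_of_file. a. ", S), a call to read_all_terms(S, Ts) keeps the leading end_of_file as a term. A literal end_of_file. that is the very last thing in the input, with nothing after the full stop, presumably remains indistinguishable from the real end of file.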

Of course we could also throw an exception on end-of-file, but this is IMO wrong. First of all, in general we would like to avoid the need for exception handling to do perfectly normal things, and second, you want to be able to handle exceptions at a higher level. In this use case you’d typically have to handle it completely locally.

Requiring local handling is the situation @ridgeworks is in: he would like to handle all exceptions as locally as possible, as failures. That is indeed not very well supported in the current system. If it needs to be resolved I’d go for a more efficient catch/3 implementation. That is surely possible, but as far as I’m concerned not very high on the priority list. Even for @ridgeworks’ use cases, spending the same time on general optimization is likely to have more impact.

ridgeworks · May 1, 2024, 3:14pm

Totally agree. I rarely use catch/3 and instead depend on causing “premature failure” using a guard (usually type filters which are VM instructions) before calling a predicate which may generate an error. So this is more of a philosophical niggle (small code bloat) than a practical concern.
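A guard of that kind might look as follows. This is a minimal sketch with a made-up predicate name, assuming SWI-Prolog in its default (non-ISO) mode and that Len is unbound or an integer when called.

```prolog
% guarded_atom_length(+Text, -Len)
% Fail, rather than raise an error, when Text is not suitable input
% for atom_length/2.
guarded_atom_length(Text, Len) :-
    atomic(Text),            % cheap type test: fails for variables and compounds
    atom_length(Text, Len).
```

Without the guard, atom_length/2 raises an instantiation error for an unbound first argument and a type error for a compound term; with it, both cases simply fail and no exception is ever constructed.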

On the other hand, general optimizations are always welcome.

kuniaki.mukai · May 1, 2024, 4:09pm

I agree with this on -1.

While testing your comment on end_of_file, I was confused by the results of some preparatory queries. They seem to suggest that the end_of_file atom is not necessary for read, given at_end_of_stream, because read correctly raises a syntax error for a malformed item on the input stream. I’m afraid I might be missing the point.

% ?- open_string("end_of_file. end_of_file. ", S), read(S, X), read(S, Y), read(S, Z).
%@ S = <stream>(0x600001de4400),
%@ X = Y, Y = Z, Z = end_of_file.

% ?- open_string("a. b. ", S), read(S, X), read(S, Y), read(S, Z).
%@ S = <stream>(0x600001de2800),
%@ X = a,
%@ Y = b,
%@ Z = end_of_file.

% ?- open_string("a.\nb.\n", S),  read(S, X), read(S, Y), read(S, Z).
%@ S = <stream>(0x600001ddae00),
%@ X = a,
%@ Y = b,
%@ Z = end_of_file.

% ?- open_string("a. b. /* / */", S),  read(S, X), read(S, Y), read(S, Z).
%@ S = <stream>(0x600001ddaf00),
%@ X = a,
%@ Y = b,
%@ Z = end_of_file.

% ?- open_string("a. b. / / */ ", S),  read(S, X), read(S, Y), read(S, Z).
%@ ERROR: Stream <stream>(0x600001ddb000):1:14 Syntax error: Unexpected end of file
%@ ^  Exception: (4) setup_call_cleanup('$toplevel':notrace(call_repl_loop_hook(begin, 0)), '$toplevel':'$query_loop'(0), '$toplevel':notrace(call_repl_loop_hook(end, 0))) ? creep

jan · May 1, 2024, 4:37pm

I don’t understand this. All results are to be expected. The problem is that if we do not return end_of_file on reaching the end of file, but rely on at_end_of_stream/1, we cannot detect that b is the last term in your example.

After reading b, at_end_of_stream/1 is still false. So, we must call read/2 one more time. What should that call do? Sure, it will read up to end-of-file, such that after this call at_end_of_stream/1 is true. We can succeed, but then we need to bind the output term to something (or not, but a variable is also a valid Prolog term, so that does not help). Or we can fail, but that also leads to pretty awkward code. Or we can raise an exception, which is even worse. So, we have little choice but to bind the term to something. Traditionally this is end_of_file.

kuniaki.mukai · May 1, 2024, 5:02pm

Thanks, I got the point.
I thought it might not be difficult to add to at_end_of_stream an action that skips fillers and comments up to the start of the next possible Prolog term, or up to the true end of file.

jan · May 2, 2024, 7:14am

at_end_of_stream/1 is not aware of the content of the stream. You would need several of these predicates, depending on the content. That sounds hopelessly complicated. I think it is as good as it can be. Another line of thought I once had is to allow defining “out-of-band” values. Something like that already exists for dicts, which are compounds with a functor that has no Prolog syntax. Similarly, we could define an out-of-band end-of-file constant and a predicate to test for it. Given that, we could do

read(In, Term),
(   is_end_of_file(Term)
->  done
;   ...
).

I proposed things along these lines with the version 7 extensions, but there was no warm welcome …

kuniaki.mukai · May 2, 2024, 8:05am

Sorry, I had not imagined it would be that difficult to implement. You are always busy with more important issues, so I will not insist and will say no more on this.

BTW, the is_end_of_file you describe seems almost equivalent to what I had in mind: functionality for skipping filler content on the stream, which would free the ‘end_of_file’ atom from its reserved but inelegant role. I hope it could still be a conservative, elegant extension of the current stream I/O.
