Hello,
can you give me a small, simple example of how to load a file containing the three words "for me do" and parse it in SWIPL?
Thanks for helping
Jens
What is there to parse in this file?
$ echo "for me do" > small_file
$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.1.21)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).
?- phrase_from_file("for me do\n", small_file).
true.
Could you at least explain what kind of AST you expect to get from such input?
To parse into lists of codes:
words([W|Ws]) -->
    word(W),
    next_words(Ws).

word([]) --> [].
word([C|Cs]) -->
    [C],
    { code_type(C, alnum) },
    word(Cs).

next_words([]) --> "\n".
next_words(Ws) -->
    " ",
    words(Ws).
Usage:
?- once(phrase_from_file(words(Ws), 'small_file')).
Ws = [[102, 111, 114], [109, 101], [100, 111]].
portray_text/1 is handy for displaying code lists of length 3 or greater as text. After running ?- portray_text(true). the same query prints:
?- once(phrase_from_file(words(Ws), 'small_file')).
Ws = [`for`, [109, 101], [100, 111]].
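If you would rather not depend on the portray_text setting, one option is to convert each code list into an atom after parsing. This is only a minimal sketch: codes_atom/2 and codes_to_atoms/2 are hypothetical helper names, not library predicates.

```prolog
:- use_module(library(apply)).   % maplist/3

% codes_atom(+Codes, -Atom): atom_codes/2 with its arguments swapped,
% so that it can be passed to maplist/3 directly.
codes_atom(Codes, Atom) :-
    atom_codes(Atom, Codes).

% codes_to_atoms(+CodeLists, -Atoms): convert every word to an atom.
codes_to_atoms(CodeLists, Atoms) :-
    maplist(codes_atom, CodeLists, Atoms).
```

Usage:
?- codes_to_atoms([[102, 111, 114], [109, 101], [100, 111]], As).
As = [for, me, do].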
You really should be using library(dcg/basics) and library(dcg/high_order) for such things. I would nevertheless wait for @paule32 to show what AST they would expect from their example input. Still:
Your words//1 could be a sequence//2
Your word//1 could maybe be csym//1 or nonblanks//1?
There are also integer//1, xdigits//1, …
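As a rough sketch of what the library-based version might look like: here I assume words are separated by exactly one space and the line ends with a newline, I use sequence//3 (the variant that takes a separator) together with nonblank//1 and nonblanks//1, and I return atoms instead of code lists. The word//1 below is my own illustration, not something the libraries provide.

```prolog
:- use_module(library(dcg/basics)).      % nonblank//1, nonblanks//1
:- use_module(library(dcg/high_order)).  % sequence//3

% words(-Ws): space-separated words followed by a newline.
words(Ws) -->
    sequence(word, " ", Ws),
    "\n".

% word(-W): one or more non-blank codes, returned as an atom.
word(W) -->
    nonblank(C),
    nonblanks(Cs),
    { atom_codes(W, [C|Cs]) }.
```

Usage:
?- once(phrase_from_file(words(Ws), 'small_file')).
Ws = [for, me, do].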
I get other output:
It does not help beginners understand a simple example if parts of the puzzle are hidden in a library they don’t immediately know how to inspect.
I like DCGs which are specific/explicit, to prevent surprises.
It’s the same as my 1st of 2 showings.
Sorry for the delay, I was doing housework…
Anyway, the AST in flex could be:
digits [0-9]*
id [_a-zA-Z0-9]
then either:
"for" { do something }
"me" { do other thing }
"do" { do simple }
or:
id { get id name, and handle the name. then do things on it }
and in the grammar:
start
    : /* could be empty */
    | for_token
    | me_token
    | do_token
    ;

for_token
    : for do me
    | do me
    | do
    ;

An example source file:
/* empty or whitespace-only line(s) */
do me
me
for me do
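For what it's worth, a file like this can be read with a small DCG that returns one list of keyword atoms per line. This is only a sketch under my own assumptions: every line ends with a newline, tokens are separated by spaces, and the only tokens are for, me and do. All nonterminal names (lines//1, line//1, tokens//1, keyword//1, spaces//0) are made up for illustration.

```prolog
% lines(-Lines): zero or more newline-terminated lines.
lines([L|Ls]) --> line(L), "\n", !, lines(Ls).
lines([]) --> [].

% line(-Tokens): optional leading spaces, then keywords.
line(Ts) --> spaces, tokens(Ts).

% tokens(-Tokens): keywords, each optionally followed by spaces.
tokens([T|Ts]) --> keyword(T), !, spaces, tokens(Ts).
tokens([]) --> [].

% The three keywords. Note: no check that a keyword ends at a word
% boundary, so "done" is not handled specially; it simply fails to parse.
keyword(for) --> "for".
keyword(me)  --> "me".
keyword(do)  --> "do".

% spaces: zero or more space characters.
spaces --> " ", !, spaces.
spaces --> [].
```

Usage:
?- phrase(lines(Ls), `\ndo me\nme\nfor me do\n`).
Ls = [[], [do, me], [me], [for, me, do]].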
Did I miss something?
You show:
Ws = [`for`, [109, 101], [100, 111]].
I get:
Ws = [[102, 111, 114], [109, 101], [100, 111]].
Or did you mean using portray_text/1?
This is just busy work. The source code is out there, you can always read it.
Don’t worry. I haven’t used SWIPL for a while.
Thanks for your hints.
I have a bit of trouble following.
How familiar are you with DCGs? The main point is that to parse anything, you should probably use a DCG, unless there is a good reason not to. Then, you could additionally use code_type/2 as @brebs did in their example. From the docs of char_type/2:
csym
Char is a letter (upper- or lowercase), digit or the underscore (_). These are valid C and Prolog symbol characters.
So your [_a-zA-Z0-9] seems to be exactly a csym?
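To make that concrete, here is a minimal sketch of an identifier nonterminal built on code_type/2 with csym. The names id//1 and id_rest//1 are hypothetical. One caveat: as far as I know code_type/2 is Unicode-aware, so csym may also accept letters outside the ASCII range, which the flex class [_a-zA-Z0-9] would not.

```prolog
% id(-Codes): one or more csym codes (letter, digit or underscore).
id([C|Cs]) -->
    [C],
    { code_type(C, csym) },
    id_rest(Cs).

% id_rest(-Codes): greedily take the remaining csym codes.
id_rest([C|Cs]) -->
    [C],
    { code_type(C, csym) },
    !,
    id_rest(Cs).
id_rest([]) --> [].
```

Usage:
?- phrase(id(Cs), `foo_1 bar`, Rest), atom_codes(A, Cs), atom_codes(R, Rest).
A = foo_1, R = ' bar'.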
It gets interesting when you take a Pascal- or C-style language and then use DCGs to parse it into an AST, but that is not clear from your example so far.