No permission to set the encoding of a (user_) stream?

I am trying to set the encoding of the standard I/O streams to UTF-8:

:- set_stream(user_input,  encoding(utf8)).
:- set_stream(user_output, encoding(utf8)).
:- set_stream(user_error,  encoding(utf8)).

This works without problems in Emacs (prolog-mode / shell-mode), but raises an error on the command line:

1 ?- set_stream(user_input,  encoding(utf8)).
ERROR: No permission to encoding stream `user_input'
ERROR: In:
ERROR:   [10] set_stream(user_input,encoding(utf8))
ERROR:    [9] toplevel_call(user:user: ...) at c:/program files/swipl/boot/toplevel.pl:1117

Why is there a difference between Emacs and the command line? Is there a way to solve this problem on the command line?

Thanks.

My environment:

Windows 7, command line with code page 65001 (UTF-8)

SWI-Prolog version 8.4.1 for x64-win64

It depends on whether you use swipl-win.exe or swipl.exe, but the result is AFAIK the same. With swipl-win.exe, I/O is bound to the SWI-Prolog console, which exchanges text using wchar_t units. With swipl.exe, when you talk to a normal Windows console window, it again switches the encoding to wchar_t and uses the Windows console Unicode API. Both should ensure that all Unicode characters are exchanged correctly (well, in fact only those up to 0xffff). You can change the encoding only if I/O is bound to something else.
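
To see which encoding the standard streams currently have, stream_property/2 can be queried. This is only a minimal sketch: report_std_encodings/0 is a hypothetical helper name, and the values printed depend on what the streams are bound to.

```prolog
% report_std_encodings/0 is a hypothetical helper, not part of SWI-Prolog.
% It prints the encoding property of each standard stream, if there is one.
report_std_encodings :-
    forall(member(S, [user_input, user_output, user_error]),
           (   stream_property(S, encoding(E))
           ->  format("~w: ~w~n", [S, E])
           ;   format("~w: no encoding property~n", [S])
           )).
```

Running `?- report_std_encodings.` once in a console window and once inside Emacs should show whether the two environments are set up differently.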


I see, thanks.

BTW (maybe OT), do you know what exactly 65001 (UTF8) means in Windows?

codepage

I once heard that the Windows command line doesn’t support UTF-8, so why do they provide code page 65001 (UTF8)? It doesn’t seem to help at all. This has confused me for a long time :sweat_smile:.

Does this answer your question?
https://stackoverflow.com/questions/1629437/is-codepage-65001-and-utf-8-the-same-thing

Also: Code Page Identifiers - Win32 apps | Microsoft Docs

Still confusing.

What happens if swipl sends UTF-8 characters to a Windows console that is using code page 65001 (UTF8)?

As said, SWI-Prolog uses the Unicode console API if it detects a genuine Windows console. That API talks UTF-16, although SWI-Prolog only handles UCS-2, i.e., the Unicode code points up to 0xffff. Internally, SWI-Prolog represents characters as UCS-2 (UCS-4 on non-Windows) regardless of code pages (or locale, as it is known elsewhere). The locale/code pages only matter when reading (or writing) a file in text mode without explicitly specifying the encoding. When reading or writing files without a given encoding, SWI-Prolog relies on the C runtime functions to translate I/O to/from Unicode.

I’m not sure whether or not the I/O encoding initialization always works correctly on Windows. You can use set_prolog_flag(encoding, utf8) to make SWI-Prolog read/write files in UTF-8 regardless of the code pages/locale.
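
For files you open yourself, the encoding can also be passed explicitly to open/4, which makes the result independent of the code page and of the encoding flag. A minimal sketch, assuming plain text files; write_utf8/2 and read_utf8/2 are hypothetical helper names:

```prolog
% write_utf8/2 and read_utf8/2 are hypothetical helpers for illustration.

% Write Text to File as UTF-8, regardless of locale or code page.
write_utf8(File, Text) :-
    setup_call_cleanup(
        open(File, write, Out, [encoding(utf8)]),
        write(Out, Text),
        close(Out)).

% Read the whole File as UTF-8 into a string.
read_utf8(File, String) :-
    setup_call_cleanup(
        open(File, read, In, [encoding(utf8)]),
        read_string(In, _, String),
        close(In)).
```

For Prolog source files, an `:- encoding(utf8).` directive at the top of the file should have a similar effect for the remainder of that file.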

If this does not answer your question, please tell us what is going wrong.

Frankly speaking, there are no serious problems at the moment.

I am using Emacs to write Prolog programs. Like most Emacs users, I have Emacs configured to use UTF-8. However, by default SWI-Prolog does not use UTF-8 for standard I/O on Windows.

So by default, if I run the following program in Emacs (prolog-mode / shell), I get garbled output and errors:

;; encoding.pl, which is UTF-8

:- op(600, xfx, →).

X→Y :- write("我"), write_canonical(X→Y).

main :- X→Y.

?- consult(["encoding.pl"]).
true.

?- main.
 \316ŇĄ\372(_,_)
true.

?- ?- 1→2.
Warning: user_input:16:9: Illegal multibyte Sequence
ERROR: Syntax error: Operator expected
ERROR: ?- 
ERROR: ** here **
ERROR: 1\uFFFD .

Note that these complaints (or unexpected outputs) don’t appear on the command line with code page 65001 (UTF8).

My current solution is to call set_prolog_flag/2 and set_stream/2 in init.pl:

;; init.pl

:- set_prolog_flag(encoding, utf8).
:- set_stream(user_input,  encoding(utf8)).
:- set_stream(user_output, encoding(utf8)).
:- set_stream(user_error,  encoding(utf8)).

Then everything is OK, except that when I run swipl on the command line, it complains:

C:\Users\Chansey>swipl
ERROR: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:3:
ERROR:    set_stream/2: No permission to encoding stream `user_input'
Warning: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:3:
Warning:    Goal (directive) failed: user:set_stream(user_input,encoding(utf8))
ERROR: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:4:
ERROR:    set_stream/2: No permission to encoding stream `user_output'
Warning: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:4:
Warning:    Goal (directive) failed: user:set_stream(user_output,encoding(utf8))
ERROR: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:5:
ERROR:    set_stream/2: No permission to encoding stream `user_error'
Warning: c:/users/chansey/appdata/roaming/swi-prolog/init.pl:5:
Warning:    Goal (directive) failed: user:set_stream(user_error,encoding(utf8))
Welcome to SWI-Prolog (threaded, 64 bits, version 8.4.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

1 ?-

In fact, this doesn’t affect subsequent use (everything still works). It’s just that the error messages are a bit annoying.

I see. catch/3 can help to get rid of the error. More interesting might be to figure out whether your Windows installation prefers UTF-8. In this specific case, the question is how this is handled in Emacs (on Windows), and whether we can detect that the environment in general, and Emacs in particular, wants to chat in UTF-8.
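
A minimal sketch of the catch/3 approach for init.pl, assuming the error is the permission_error shown in the messages above; try_utf8/1 is a hypothetical helper name:

```prolog
% init.pl

:- set_prolog_flag(encoding, utf8).

% try_utf8/1 is a hypothetical helper: switch Stream to UTF-8 if allowed,
% and silently ignore the permission error raised when the stream is bound
% to a genuine Windows console. Other errors are not caught.
try_utf8(Stream) :-
    catch(set_stream(Stream, encoding(utf8)),
          error(permission_error(_, _, _), _),
          true).

:- try_utf8(user_input).
:- try_utf8(user_output).
:- try_utf8(user_error).
```

With this, the same init.pl should load quietly both in Emacs, where the encoding is changed, and in a console window, where the request is skipped.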

In general Windows and UTF-8 is hard. Windows has 8-bit applications that use code pages to map 8-bit characters to Unicode. It seems to have OEM (console) and GUI applications that are distinct. As of Windows 10, it can also provide system-wide UTF-8 support (see links below). As is, portable applications like R, Emacs, etc. all seem to have their own ways to specify they want to use UTF-8. So does SWI-Prolog :slight_smile: Microsoft appears to be moving toward UTF-8 as well. If anyone knows the best way out, please share.

There are some pointers in windows - How can I manually determine the CodePage and Locale of the current OS - Server Fault. Another interesting read is UTF-8 Support on Windows - The R Blog as well as https://helpcenter.nshift.com/hc/en-us/articles/360016886479-Errors-caused-by-Windows-10-Unicode-UTF-8-encoding


I have pushed a patch that uses the Windows GetACP() function to get the current code page. If this is 65001, it seems Windows wants UTF-8, and thus we set up SWI-Prolog’s I/O to use UTF-8.

It should be in tomorrow’s daily binary. Please give it a try and report back (ideally this should mean you do not need this stuff in your init.pl and everything should work as expected).


I think this should be a good option. Thanks.

I just tried “swipl-w64-2022-05-04.exe Tue May 3 20:23:43 2022 12,968,860” on Windows 7 (in VMware). It doesn’t seem to fix anything:

Given a UTF-8 file encoding.pl:

If I set nothing in init.pl, then

On the Windows 7 command line with code page 65001:

c:\work-pl\prolog>swipl encoding.pl
ERROR: c:/work-pl/prolog/encoding.pl:3:21: Syntax error: illegal_character
ERROR: c:/work-pl/prolog/encoding.pl:5:1: Syntax error: End of file in quoted string
Warning: c:/work-pl/prolog/encoding.pl:19:
Warning:    'c:/work-pl/prolog/encoding.pl':19:8: Illegal multibyte Sequence
Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.10-53-g5637fed39)

In Emacs on Windows 7 with UTF-8 encoding:

?- consult(["encoding.pl"]).
ERROR: c:/work-pl/prolog/encoding.pl:3:21: Syntax error: illegal_character
ERROR: c:/work-pl/prolog/encoding.pl:5:1: Syntax error: End of file in quoted string
Warning: c:/work-pl/prolog/encoding.pl:19:
Warning:    'c:/work-pl/prolog/encoding.pl':19:8: Illegal multibyte Sequence

If I set

:- set_prolog_flag(encoding, utf8).

in init.pl, then

On the Windows 7 command line with code page 65001:

It is OK.

In Emacs on Windows 7 with UTF-8 encoding:

?- consult(["encoding.pl "]).
ERROR: c:/work-pl/prolog/encoding.pl :3:21: Syntax error: illegal_character
ERROR: c:/work-pl/prolog/encoding.pl :5:1: Syntax error: End of file in quoted string
Warning: c:/work-pl/prolog/encoding.pl :20:
Warning:    'c:/work-pl/prolog/encoding.pl ':20:0: Illegal multibyte Sequence

Therefore, to fix the problem in Emacs, I still need the following in init.pl:

:- set_prolog_flag(encoding, utf8).
:- set_stream(user_input,  encoding(utf8)).
:- set_stream(user_output, encoding(utf8)).
:- set_stream(user_error,  encoding(utf8)).

But that causes the “No permission” problem when running on the command line.

All in all, nothing has changed.

I think the issue is that GetACP() doesn’t return the 65001 code page here. You only get that on later Windows 10 and Windows 11 systems, where you can switch the Windows ANSI encoding to UTF-8 as explained in one of the links I quoted.

Now the question becomes how one figures out that the console uses the 65001 code page. Even if we find out, this would be a logical choice for swipl.exe, but I’m a bit less convinced that it would be the right choice for swipl-win.exe.

Does anyone know how to get the console code page?

Anyway, Windows 7 is dead, no?

Ha, I use Windows 7 because the current installation on my computer has been working continuously for over 10 years. It is very stable (e.g., no automatic updates and no 100% disk usage problem). Also, I have a lot of old software installed, and no one knows whether it would still work after upgrading.

So it’s not dead to me :wink:.