Hi, I was looking at code_type/2 (in SWI-Prolog 9.2.5) and noticed:
?- code_type(65, to_upper(U)).
U = 97.
But that excludes this alternative:
?- code_type(65, to_upper(65)).
true.
I would then expect code_type(65, to_upper(U)) to also provide the U = 65 alternative on backtracking.
Is this intended behaviour, or a bug?
Or should code_type(65, to_upper(65)) not succeed at all? Its success seems to contradict the definition of to_upper under char_type/2 (which code_type/2 refers to).
code_type/2 is not really intended as pure logical access to character classes. It is more of a single foreign predicate that provides access to the C library character classification and conversion API. to_upper and to_lower are somewhat dubious classifications. code_type(U, upper(L)) says that U is the uppercase alternative of L, which implies L must be a lowercase letter. In many cases, though, you want the uppercase version of a code or, if the code is not lowercase, the original code. That is what to_upper does.
Probably the case conversion operations should not have been made part of code_type/2.
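For illustration, here is a minimal sketch of the "uppercase version, or the original if the code is not lowercase" conversion, written only in terms of the lower(Upper) class so that it does not depend on how to_upper orders its arguments. The name code_to_upper/2 is hypothetical and not part of SWI-Prolog, and the sketch assumes Code is bound to an integer:

```prolog
% code_to_upper(+Code, -Upper)
% Hypothetical helper, not a library predicate.
% If Code is a lowercase letter, Upper is its uppercase counterpart;
% otherwise Upper is Code itself. Deterministic for a bound Code.
code_to_upper(Code, Upper) :-
    (   code_type(Code, lower(U))
    ->  Upper = U
    ;   Upper = Code
    ).
```

Unlike a relational definition, this only converts in one direction; it does not enumerate the codes that map to a given Upper.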
To get a fully relational Unicode upper-case conversion (for the purposes of writing a case-insensitive search), I came up with:
code_upper(C, U) :-
    % Is already considered to be upper
    (   C = U,
        % Unicode in swi-prolog
        between(0, 1114111, C),
        % Is not lower-case
        \+ code_type(C, lower(_))
    % Or lower and upper are different
    ;   code_type(U, upper(C))
    ).
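As a rough sketch of how this could feed a case-insensitive comparison, the following repeats code_upper/2 so that it loads on its own and adds two helpers. The names code_match_ci/2 and codes_match_ci/2 are hypothetical, both helpers assume their arguments are bound, and two codes are treated as matching only when they are identical or one is the upper-case form of the other:

```prolog
% code_upper/2 as posted above, repeated so this sketch is self-contained.
code_upper(C, U) :-
    (   C = U,
        between(0, 1114111, C),
        \+ code_type(C, lower(_))
    ;   code_type(U, upper(C))
    ).

% code_match_ci(+C1, +C2)
% Hypothetical helper: C1 and C2 are the same code, or one of them is
% the upper-case form of the other. The cut keeps it deterministic.
code_match_ci(C1, C2) :-
    (   C1 == C2
    ;   code_upper(C1, C2)
    ;   code_upper(C2, C1)
    ),
    !.

% codes_match_ci(+Codes1, +Codes2)
% Hypothetical helper: two code lists of equal length match pairwise.
codes_match_ci(Xs, Ys) :-
    maplist(code_match_ci, Xs, Ys).
```

With these definitions, a query such as atom_codes('Hello', Xs), atom_codes(hELLO, Ys), codes_match_ci(Xs, Ys) should succeed, while lists that differ in any letter other than by case should fail. A real search would still need to slide this comparison over the text being searched.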