Dear SWI-Prolog user,
SWI-Prolog 8.5.14 is ready for download. This release contains
many visible changes whose implementation touched a lot of code.
As a result, some regression is not unlikely. People using the
development version for production purposes should thoroughly
test this version first. Suspect areas are I/O, atom and string
manipulation in Prolog or C, notably when non-ASCII characters
are involved. Regression is more likely on Windows, but may also
affect other platforms.
Highlights:
- Update character tables to Unicode 14.0.0
- Distinguish Unicode decimal digits and act on them regardless
  of the script. It is not allowed to mix digits from multiple
  scripts in the same number.
- Change internal wchar_t text on systems where wchar_t is 2 bytes
  (Windows) to deal with the encoding as UTF-16. This allows for
  the full Unicode range on all platforms. These patches also
  provide UTF-16 I/O on all platforms. The code units reserved for
  UTF-16 surrogate pairs (0xD800 … 0xDFFF) are now considered
  illegal code points. Note that these changes may well have
  caused regressions. Also, several of the extensions still handle
  wchar_t on Windows as UCS-2 (notably xpce).
- Added string_bytes/3 to convert between Unicode text and byte
  sequences. Used by the updated base64_encoded/3.
- The Java interface now avoids recursion for exchanging terms.
  Contributed by Paul Singleton.
- Several msys2 portability fixes by @mgondan1.
- Several fixes to pack_install/1, in part by Peter.Ludemann.
- Avoid a crash on too deeply nested C->Prolog->C call stacks.
  Partial implementation (fully functional on Linux, approximation
  on MacOS and not on Windows).
- Added PL_scan_options() to the foreign API to simplify processing
  option lists and make their processing consistent. The new API
  is a slight generalization of an old internal API.
Enjoy --- Jan
SWI-Prolog Changelog since version 8.5.13
- ENHANCED: base64_encoded/3: added option encoding(Encoding) and
  bootstrap base64/2 and base64url/2 from this predicate. base64url/2
  now uses UTF-8 encoding (MODIFIED).
- ADDED: string_bytes/3, get the bytes for representing a (Unicode)
  string in a given encoding.
- FIXED: Avoid C-stack overflow in recursive C->Prolog->C calls by
  demanding a minimum of 100Kbytes stack before calling Prolog.
- FIXED: pack_install/1: if the pack is already installed at an older
  version, upgrade it.
- ENHANCED: Make git probe silent on all possible errors.
- FIXED: A GIT URL must be a valid absolute URL to begin with.
- FIXED: pack_install/1: version comparison for already installed
  versions.
- ADDED: PL_scan_options() public API to deal with option lists.
- ADDED: pack_install/1 now tests whether a URL is a GIT URL using
  git_remote_branches/2.
- MODIFIED: Fixed various internal recoding issues to ENC_WCHAR.
  These changes also ensure that canonical text as used for atoms and
  strings contains only valid Unicode code points. As a result, passing
  invalid strings to Prolog using the foreign API may result in a failure.
- DOC: Unicode and UTF-16 issues.
- MODIFIED: Be consistent about valid character codes. These are the
  Unicode code points 0…U+10FFFF, while the range reserved for UTF-16
  surrogate pairs is excluded (U+D800…U+DFFF).
- DOC: Base64 encoding issues.
- PORT: add_package_path/1 also under Windows. This change allows for
  a GNU-style directory structure also under Windows.
- DOC: Rename section label for the statistics section of the manual to
  avoid a clash with the library documentation, which was hiding the
  statistics/2 docs.
- TEST: Avoid surrogates for all encodings. There are a number of
  visible changes.
- UTF: Fixed length handling in setenv/2.
- PORT: Replace most wint_t by int for character classification purposes
  because Windows wint_t is 2 bytes, so we cannot classify anything
  > 0xffff.
- UTF: atom_concat/3 can now handle UTF-16 sequences.
- UTF: Make PL_cmp_text() and PL_unify_text_range() deal with UTF-16
  strings.
- CLEANUP: sub_atom/5: more consistent typing and better reuse of
  primitives. This patch fixes handling atoms longer than 2G code points.
- FIXED: Reading terms from Unicode symbol sequences.
- ADDED: Use UTF-16 for canonical text on Windows. This is a first
  step that implements some of the basic handling for creating and
  writing atoms with code points > U+FFFF.
- MODIFIED: Official encoding names for UTF-16. Now also allows
  aliases for the IANA names for specifying the encoding (UTF-8,
  UTF-16BE, UTF-16LE).
- ENHANCED: Allow reading and writing UTF-16 files.
- PORT: MSYS2, add %MINGW_PREFIX%/bin to dll search.
- MODIFIED: PL_get_char() now returns a domain or representation error if
  the code is outside the Unicode range (domain) or cannot be represented
  by the system (representation).
- FIXED: built-in option list processing should raise a type error if
  the list is cyclic.
- FIXED: incr_invalidate_calls/1: succeed if no tabling happened in
  the calling thread yet (and thus there is no variant table).
- ADDED: Allow floats from other scripts.
- DOC: various Unicode related updates, including handling of non-ASCII
  decimal number characters.
- MODIFIED: Make the Prolog parser parse decimal numbers in other
  scripts to integers.
- ADDED: char_type/2 type decimal.
- MODIFIED: Updated character classification for read/1 and friends to
  be based on Unicode 14.0.0.
- MODIFIED: Updated to Unicode 14.0.0 (from 6.0.0).
- CLEANUP: library(unicode/unicode_data) to avoid conflict with table/1.
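To illustrate the rule that decimal digits from any script are accepted but may not be mixed within one number, here is a minimal C sketch. It is not SWI-Prolog's implementation: the names zero_digits, decimal_weight and parse_decimal are hypothetical, the table lists only four scripts (the real character tables cover every Unicode decimal digit), and overflow is not checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Code point of digit zero for a few scripts (illustrative subset only).
   Unicode decimal digits always come in contiguous runs 0..9. */
static const uint32_t zero_digits[] = {
    0x0030, /* ASCII */
    0x0660, /* Arabic-Indic */
    0x0966, /* Devanagari */
    0x0E50  /* Thai */
};
#define N_ZEROS (sizeof zero_digits / sizeof zero_digits[0])

/* Return the weight (0..9) of cp and store the zero of its script in *zero.
   Return -1 if cp is not a decimal digit known to this sketch. */
static int decimal_weight(uint32_t cp, uint32_t *zero)
{
    for (size_t i = 0; i < N_ZEROS; i++) {
        if (cp >= zero_digits[i] && cp <= zero_digits[i] + 9) {
            *zero = zero_digits[i];
            return (int)(cp - zero_digits[i]);
        }
    }
    return -1;
}

/* Parse a sequence of code points as a non-negative integer.
   Returns 1 on success; 0 if the input is empty, contains a non-digit,
   or mixes digits from different scripts. Overflow is not checked. */
static int parse_decimal(const uint32_t *cps, size_t len, unsigned long *value)
{
    uint32_t script_zero = 0;
    unsigned long v = 0;

    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        uint32_t zero = 0;
        int w = decimal_weight(cps[i], &zero);

        if (w < 0)
            return 0;               /* not a decimal digit */
        if (i == 0)
            script_zero = zero;     /* first digit fixes the script */
        else if (zero != script_zero)
            return 0;               /* digits from two scripts */
        v = v * 10 + (unsigned long)w;
    }
    *value = v;
    return 1;
}
```

With this sketch, the Arabic-Indic sequence U+0661 U+0662 U+0663 parses to 123, while U+0031 followed by U+0662 is rejected because it mixes ASCII and Arabic-Indic digits.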
Package clib
- ADDED: library(sched), providing a start at accessing the OS scheduling
  primitives.
- FIXED: read_line_to_codes/2: avoid line ending with \r if the \n is
  found just after a flush. This patch also includes a rewrite of this
  predicate and read_stream_to_codes/3 to use UTF-8 as intermediate
  representation rather than wchar_t, avoiding the need for UTF-16
  surrogate pairs.
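As background for the UTF-8 versus UTF-16 remarks above, the following self-contained C sketch encodes one code point both ways. It follows the standard encoding rules and the valid range stated in this changelog (0…U+10FFFF without U+D800…U+DFFF). The function names are hypothetical and this is not the code used inside SWI-Prolog.

```c
#include <stddef.h>
#include <stdint.h>

/* Valid code points: 0..0x10FFFF, excluding the surrogate range. */
static int is_valid_code_point(uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/* Encode cp as UTF-16. Returns the number of 16-bit units written
   (1 or 2), or 0 if cp is not a valid code point. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (!is_valid_code_point(cp))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Encode cp as UTF-8. Returns the number of bytes written (1..4),
   or 0 if cp is not a valid code point. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (!is_valid_code_point(cp))
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For U+1F600 the UTF-16 form is the surrogate pair D83D DE00, whereas the UTF-8 form is the four bytes F0 9F 98 80. Code that buffers text as UTF-8 therefore handles code points above U+FFFF in the same way on every platform, independent of the size of wchar_t.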
Package cpp
- ADDED: PlTerm casts for bool and uint32_t
Package jpl
- ENHANCED: Non-recursive Term.getTerm reimplementation. Replaces the
  original recursive implementation of Term.getTerm() etc. (which runs
  out of JVM stack for e.g. lists of more than a few thousand members)
  with a depth-unlimited non-recursive version (see Term.getLoop)
  and adds a couple of JUnit tests.
- ENHANCED: Non-recursive Term.put reimplementation. Replaces the original
  recursive implementation of Term.put() etc. (which runs out of JVM
  stack for e.g. lists of more than a few thousand members) with a
  depth-unlimited non-recursive version (see Term.putLoop) and adds a
  couple of JUnit tests.
- TEST: Cleanup and enhancements.
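The general technique behind the two non-recursive reimplementations, replacing recursion by an explicit work stack on the heap, can be sketched as follows. This is a C illustration under stated assumptions and not the JPL code: the node type and count_nodes function are hypothetical, and the actual Java implementation in Term.getLoop and Term.putLoop may be organised differently.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical binary term node; a leaf has both children NULL. */
typedef struct node {
    struct node *left;
    struct node *right;
} node;

/* Count the nodes reachable from root without recursion. Pending nodes
   are kept on a heap-allocated stack, so the nesting depth is limited by
   available memory rather than by the C stack.
   Returns 1 on success and 0 if memory allocation fails. */
static int count_nodes(const node *root, size_t *count)
{
    size_t cap = 16, top = 0, n = 0;
    const node **stack = malloc(cap * sizeof *stack);

    if (!stack)
        return 0;
    if (root)
        stack[top++] = root;

    while (top > 0) {
        const node *t = stack[--top];

        n++;
        if (top + 2 > cap) {        /* make room for two children */
            const node **tmp = realloc(stack, 2 * cap * sizeof *stack);

            if (!tmp) {
                free(stack);
                return 0;
            }
            stack = tmp;
            cap *= 2;
        }
        if (t->left)
            stack[top++] = t->left;
        if (t->right)
            stack[top++] = t->right;
    }

    free(stack);
    *count = n;
    return 1;
}
```

A list of a hundred thousand cells linked through the right field is processed with a work stack that never holds more than one entry, where a naive recursive walk would need one stack frame per cell.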