Dear SWI-Prolog user,
SWI-Prolog 8.5.14 is ready for download. This release contains
many visible changes whose implementation touched a lot of code.
As a result, regressions are likely. People using the development
version for production purposes should thoroughly test this version
first. Suspect areas are I/O and atom and string manipulation in
Prolog or C, notably when non-ASCII characters are involved.
Regressions are more likely on Windows, but may also affect other
platforms.
Highlights:
- Update character tables to Unicode 14.0.0
- Distinguish Unicode decimal digits and act on them regardless
  of the script. It is not allowed to mix digits from multiple
  scripts in the same number.
- Change internal wchar_t text on systems where wchar_t is 2 bytes
  (Windows) to deal with the encoding as UTF-16. This allows for
  the full Unicode range on all platforms. These patches also
  provide UTF-16 I/O on all platforms. The code units reserved for
  UTF-16 surrogate pairs (0xD800 .. 0xDFFF) are now considered
  illegal code points. Note that these changes may well have
  caused regressions. Also, several of the extensions still handle
  wchar_t on Windows as UCS-2 (notably xpce).
- Added string_bytes/3 to convert between Unicode text and byte
  sequences. Used by the updated base64_encoded/3.
- The Java interface (JPL) now avoids recursion for exchanging
  terms. Contributed by Paul Singleton.
- Several msys2 portability fixes by @mgondan1
- Several fixes to pack_install/1, in part by Peter Ludemann.
- Avoid a crash on too deeply nested C->Prolog->C call stacks.
  Partial implementation (fully functional on Linux, an
  approximation on macOS and not implemented on Windows).
- Added PL_scan_options() to the foreign API to simplify processing
  option lists and make their processing consistent. The new API
  is a slight generalization of an old internal API.

Enjoy --- Jan
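
The code point rules in the highlights above can be illustrated with a
small, self-contained C sketch. It is not taken from the SWI-Prolog
sources: the function names is_valid_code_point() and utf16_encode()
are hypothetical, and the code only shows the general UTF-16 scheme,
assuming valid codes are 0..0x10FFFF with the surrogate range excluded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Valid character codes are 0..0x10FFFF, excluding the range reserved
   for UTF-16 surrogates (0xD800..0xDFFF).
   Hypothetical helper, not part of the SWI-Prolog API. */
int is_valid_code_point(uint32_t c)
{
    return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

/* Encode one code point as UTF-16 code units in out[0..1].
   Returns the number of units written (1 or 2), or 0 if c is not a
   valid code point. */
size_t utf16_encode(uint32_t c, uint16_t out[2])
{
    assert(out != NULL);
    if (!is_valid_code_point(c))
        return 0;
    if (c < 0x10000) {                 /* fits in a single code unit */
        out[0] = (uint16_t)c;
        return 1;
    }
    c -= 0x10000;                      /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800 + (c >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (c & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+1F600 needs two code units (0xD83D 0xDE00), which is why
treating 2-byte wchar_t text as UCS-2 cannot represent it.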
SWI-Prolog Changelog since version 8.5.13
- ENHANCED: base64_encoded/3: added option encoding(Encoding) and
  bootstrap base64/2 and base64url/2 from this predicate. base64url/2
  now uses UTF-8 encoding (MODIFIED).
- ADDED: string_bytes/3, get the bytes for representing a (Unicode)
  string in a given encoding.
- FIXED: Avoid C-stack overflow in recursive C->Prolog->C calls by
  demanding a minimum of 100Kbytes stack before calling Prolog.
- FIXED: pack_install/1: if the pack is already installed at an older
  version, upgrade it.
- ENHANCED: Make git probe silent on all possible errors.
- FIXED: GIT URLs must be a valid absolute URL to begin with
- FIXED: pack_install/1: version comparison for already installed
  versions.
- ADDED: PL_scan_options() public API to deal with option lists.
- ADDED: pack_install/1 to test whether a URL is a GIT URL using
  git_remote_branches/2.
- MODIFIED: Fixed various internal recoding issues to ENC_WCHAR.
  These changes also ensure that canonical text as used for atoms and
  strings only contains valid Unicode code points. As a result, passing
  invalid strings to Prolog using the foreign API may result in a
  failure.
- DOC: Unicode and UTF-16 issues.
- MODIFIED: Be consistent about valid character codes. These are the
  Unicode code points 0..U+10FFFF, while the range reserved for UTF-16
  surrogate pairs is excluded (U+D800..U+DFFF).
- DOC: Base64 encoding issues.
- PORT: add_package_path/1 also under Windows. This change allows for
  a GNU-style directory structure also under Windows.
- DOC: Rename section label for the statistics section of the manual to
  avoid a clash with the library documentation, hiding statistics/2 docs.
- TEST: Avoid surrogates for all encodings. There are a number of
  visible changes.
- UTF: Fixed length handling in setenv/2.
- PORT: Replace most wint_t by int for character classification purposes
  because Windows wint_t is 2 bytes, so we cannot classify anything
  above 0xffff.
- UTF: atom_concat/3 can now handle UTF-16 sequences.
- UTF: Make PL_cmp_text() and PL_unify_text_range() deal with UTF-16
  strings.
- CLEANUP: sub_atom/5: more consistent typing and better reuse of
  primitives. This patch fixes handling atoms longer than 2G code points.
- FIXED: Reading terms from Unicode symbol sequences.
- ADDED: Use UTF-16 for canonical text on Windows. This is a first
  step that implements some of the basic handling for creating and
  writing atoms with code points > U+FFFF.
- MODIFIED: Official encoding names for UTF-16. Now also allows
  aliases for the IANA names for specifying the encoding (UTF-8,
  UTF-16BE, UTF-16LE).
- ENHANCED: Allow reading and writing UTF-16 files.
- PORT: MSYS2, add %MINGW_PREFIX%/bin to dll search
- MODIFIED: PL_get_char() now returns a domain or representation error if
  the code is outside the Unicode range (domain) or cannot be represented
  by the system (representation).
- FIXED: built-in option list processing should raise a type error if
  the list is cyclic.
- FIXED: incr_invalidate_calls/1: succeed if no tabling happened in
  the calling thread yet (and thus there is no variant table).
- ADDED: Allow floats from other scripts.
- DOC: various Unicode related updates, including handling of non-ASCII
  decimal number characters.
- MODIFIED: Make the Prolog parser parse decimal numbers in other
  scripts to integers.
- ADDED: char_type/2 type decimal.
- MODIFIED: Updated character classification for read/1 and friends to
  be based on Unicode 14.0.0
- MODIFIED: Updated to Unicode 14.0.0 (from 6.0.0)
- CLEANUP: library(unicode/unicode_data) to avoid conflict with table/1.
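
The rule that decimal digits from any script are accepted, but may not
be mixed within one number, can be sketched in C as follows. This is an
illustration under stated assumptions, not the parser's actual code:
digit_block() and parse_decimal() are hypothetical names, and the table
lists only three scripts, where a real implementation would use the
full Unicode decimal digit data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code point of digit zero for a few scripts. Illustrative subset only;
   each script's digits 0..9 are contiguous from this code point. */
static const uint32_t zero_digits[] = {
    0x0030, /* ASCII */
    0x0660, /* Arabic-Indic */
    0x0966, /* Devanagari */
};

/* Return the index of the digit block c belongs to and store its
   weight (0..9) in *weight, or return -1 if c is not a known digit. */
int digit_block(uint32_t c, int *weight)
{
    assert(weight != NULL);
    for (size_t i = 0; i < sizeof zero_digits / sizeof zero_digits[0]; i++) {
        if (c >= zero_digits[i] && c <= zero_digits[i] + 9) {
            *weight = (int)(c - zero_digits[i]);
            return (int)i;
        }
    }
    return -1;
}

/* Parse n code points as a decimal integer. Returns 1 on success and
   0 if the input is empty, contains a non-digit, or mixes digits from
   two different blocks. Overflow is not checked in this sketch. */
int parse_decimal(const uint32_t *s, size_t n, long *value)
{
    assert(s != NULL && value != NULL);
    int block = -1;          /* block of the first digit seen */
    long v = 0;
    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        int w = 0;
        int b = digit_block(s[i], &w);
        if (b < 0 || (block >= 0 && b != block))
            return 0;        /* not a digit, or a second script */
        block = b;
        v = v * 10 + w;
    }
    *value = v;
    return 1;
}
```

With this sketch, the Devanagari sequence U+0967 U+0968 parses as 12,
while the ASCII digit 1 followed by U+0662 is rejected as mixed.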
Package clib
- ADDED: library(sched), providing a start at accessing the OS scheduling
  primitives.
- FIXED: read_line_to_codes/2: avoid line ending with \r if the \n is
  found just after a flush. This patch also includes a rewrite of this
  predicate and read_stream_to_codes/3 to use UTF-8 as intermediate
  representation rather than wchar_t, avoiding the need for UTF-16
  surrogate pairs.
Package cpp
- ADDED: PlTerm casts for bool and uint32_t
Package jpl
- ENHANCED: Non-recursive Term.getTerm reimplementation. Replaces the
  original recursive implementation of Term.getTerm() etc. (which runs
  out of JVM stack for e.g. lists of more than a few thousand members)
  with a depth-unlimited non-recursive version (see Term.getLoop)
  and adds a couple of JUnit tests.
- ENHANCED: Non-recursive Term.put reimplementation. Replaces the original
  recursive implementation of Term.put() etc. (which runs out of JVM
  stack for e.g. lists of more than a few thousand members) with a
  depth-unlimited non-recursive version (see Term.putLoop) and adds a
  couple of JUnit tests.
- TEST: Cleanup and enhancements.
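
The general technique behind the non-recursive JPL changes, replacing
call-stack recursion by an explicit heap-allocated stack, can be
sketched in C. This is not the JPL code (which is Java): the term
struct and count_nodes() are hypothetical and only show why the depth
limit moves from the call stack to available memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical binary term: a leaf has both children NULL. */
typedef struct term {
    struct term *left, *right;
} term;

/* Count the nodes of t without recursion. Pending nodes are kept on a
   heap-allocated stack that grows on demand, so nesting depth is
   bounded by memory rather than by the C stack.
   Returns 0 for an empty term or on allocation failure. */
size_t count_nodes(const term *t)
{
    size_t cap = 16, top = 0, count = 0;
    const term **stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return 0;
    if (t != NULL)
        stack[top++] = t;
    while (top > 0) {
        const term *n = stack[--top];
        count++;
        if (top + 2 > cap) {           /* room for two more pushes */
            const term **grown = realloc(stack, 2 * cap * sizeof *stack);
            if (grown == NULL) {
                free(stack);
                return 0;
            }
            stack = grown;
            cap *= 2;
        }
        if (n->left != NULL)
            stack[top++] = n->left;
        if (n->right != NULL)
            stack[top++] = n->right;
        assert(top <= cap);
    }
    free(stack);
    return count;
}
```

A list-like term that nests 100,000 levels deep through its right
child is handled with a small explicit stack here, whereas a
recursive version would need one call frame per level.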