Ann: SWI-Prolog 10.1.8

Dear SWI-Prolog user,

SWI-Prolog 10.1.8 is ready for download. This is a stabilising
release on top of the large Unicode overhaul shipped in 10.1.7. The
main themes are tightening up the Unicode source syntax (bracket-pair
atoms, solo characters, write/quote behaviour), rejecting surrogate
code points at every Prolog API surface, a new UTS #39 confusable
identifier linter, a Unicode character-name lookup library, and a
macOS packaging overhaul that ships a swipl.framework and a signed,
notarised split-layout .pkg installer.

Highlights

  • Surrogate code points are now consistently rejected by every
    Prolog API and stream decoder.
  • New library(unicode_security) (UTS #39 / UAX #24) and
    list_confusable_identifiers/{0,1} in library(check) for
    spotting mixed-script and look-alike identifiers.
  • New library(uniname) with unicode_name/2 for Unicode
    character-name lookup, generation and enumeration.
  • macOS runtime is now a standalone swipl.framework; ninja pkg
    produces a signed and notarised .pkg installer with the
    split /Applications + /Library/Frameworks layout.

Unicode source syntax — follow-ups

  • ENHANCED: Unicode bracket pairs (Ps/Pe) now behave
    consistently with {}. An empty pair (optionally with layout in
    between) reads as the atom '<open><close>', the same name acts
    as a functor when followed by (, and '⟨⟩'(X) writes as ⟨X⟩.
    The bare bracket-pair atom is also written unquoted.
  • ADDED: pattern_syntax and prolog_solo categories in
    char_type/2 and code_type/2. pattern_syntax exposes UAX #31
    Pattern_Syntax membership; prolog_solo exposes the kernel’s own
    “stands as a token on its own” flag.
  • ADDED: write_term/2 option pattern_syntax_solo. When set,
    single-character atoms outside the immutable UAX #31 Pattern_Syntax
    set are quoted, so atoms like '€', '·', '🎉' round-trip
    safely across future Unicode versions. write_canonical/1 enables
    the option by default; write/1 and writeq/1 are unchanged.
  • FIXED: solo characters are no longer needlessly quoted by
    writeq/1 ($needs_quotes/1 re-uses the kernel’s own quote
    decision; the underlying unquoted_atom helper was mis-routing
    UCS atoms through the byte-oriented branch).
  • FIXED: compare/3 on wide atoms now orders by code point
    rather than by wchar_t unit. On Windows (16-bit wchar_t) a
    supplementary-plane atom like '\U0001D11E' is stored as a
    surrogate pair and previously wrongly ordered below BMP atoms in
    U+DC00..U+FFFF.
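
The compare/3 item can be illustrated with a small sketch. It is not the kernel code; utf16_units/2 and order_by/4 are hypothetical helper names introduced here to contrast ordering by UTF-16 unit with ordering by code point.

```prolog
% utf16_units(+CodePoint, -Units): the UTF-16 code units for one code point.
utf16_units(Code, [Code]) :-
    Code < 0x10000,
    !.
utf16_units(Code, [High, Low]) :-
    Offset is Code - 0x10000,
    High is 0xD800 + (Offset >> 10),      % lead (high) surrogate
    Low  is 0xDC00 + (Offset /\ 0x3FF).   % trail (low) surrogate

% order_by(+How, -Order, +CodeA, +CodeB): compare two single code points
% either by their UTF-16 units or by the code point value itself.
order_by(units, Order, A, B) :-
    utf16_units(A, UnitsA),
    utf16_units(B, UnitsB),
    compare(Order, UnitsA, UnitsB).
order_by(code_point, Order, A, B) :-
    compare(Order, A, B).
```

For U+1D11E against U+FFFD, order_by(units, O, 0x1D11E, 0xFFFD) gives O = (<) because the lead surrogate 0xD834 is below 0xFFFD, while order_by(code_point, O, 0x1D11E, 0xFFFD) gives O = (>). The second result is the ordering the fix describes.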

Surrogate code points are now rejected everywhere

A well-formed Unicode text never contains an isolated surrogate code
point. Several Prolog API surfaces still accepted them, leaking
invalid Unicode into atoms, strings and streams.

  • FIXED: UTF-8 and UTF-16 stream decoders (_PL__utf8_code_point,
    Sgetcode’s inline UTF-8 branch, and get_utf16) now substitute
    U+FFFD with the usual SIO_WARN diagnostic on a surrogate
    sequence, matching the convention for other decode errors.
  • FIXED: atom_codes/2, put_code/{1,2}, put_char/{1,2},
    put/{1,2} and format/{2,3} ~c now reject surrogates with
    type_error(character_code, Code) (or format_argument_type for
    ~c) before reaching the stream layer.
  • FIXED: Sputcode() itself now calls reperror() on a
    surrogate, so foreign callers cannot emit invalid UTF-8 or
    raw-wchar_t bytes via Sputcode(0xD800, s).
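
A quick way to see which behaviour a given installation has is to probe atom_codes/2 and report the outcome instead of letting the error propagate. This is a minimal sketch; probe_code/2 and surrogate_code/1 are hypothetical helper names, not part of the release.

```prolog
% probe_code(+Code, -Result): try to build a one-character atom from Code.
% Result is `accepted` or rejected(Formal), where Formal is the formal
% part of the error term that was raised.
probe_code(Code, Result) :-
    catch(( atom_codes(_Atom, [Code]),
            Result = accepted
          ),
          error(Formal, _Context),
          Result = rejected(Formal)).

% surrogate_code(+Code): Code lies in the surrogate range U+D800..U+DFFF.
surrogate_code(Code) :-
    integer(Code),
    between(0xD800, 0xDFFF, Code).
```

Going by the notes above, probe_code(0xD800, R) on 10.1.8 should report a rejection with type_error(character_code, 55296), whereas an older version may report accepted.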

New: identifier security and Unicode names

  • ADDED: library(unicode_security) (in packages/utf8proc)
    implements UTS #39 and UAX #24 over generated tables:
    unicode_script/2, unicode_script_extensions/2,
    unicode_identifier_status/2, unicode_identifier_type/2,
    unicode_skeleton/2, unicode_confusable/{2,3},
    unicode_resolved_scripts/2 and unicode_restriction_level/2.
    UCD source files are no longer vendored; the regen-uts39 target
    re-runs etc/gen_uts39.pl from a locally fetched UCD copy.
  • CHANGE: unicode_script/2, unicode_identifier_type/2,
    unicode_script_extensions/2 and unicode_identifier_status/2
    are now plain semidet: they fail on code points with no entry in
    the table rather than absorbing the missing case with a default
    (common, [], restricted). Callers that want a default can
    add their own fall-through clause.
  • ADDED: list_confusable_identifiers/{0,1} in library(check),
    autoloaded when library(unicode_security) is available and
    registered as a check:checker/2. Walks every clause in the
    selected modules (default [user]) and warns on
    • mixed-script identifiers (worse than single_script), and
    • confusable identifier collisions (distinct identifiers with
      the same UTS #39 skeleton).
  • ADDED: library(uniname) exporting unicode_name/2. The
    backing C plugin uses a compact 360 KB table (ICU / GNU libunistring
    layout) and supports forward (+,-) and reverse (-,+) lookup as
    semidet, plus full enumeration (-,-) via a stateful foreign
    iterator (>100k solutions in roughly 50 ms).
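
For callers affected by the semidet change, the "fall-through clause" mentioned above could look roughly like the sketch below. lookup_or_default/4 and demo_script/2 are hypothetical names; demo_script/2 is a stand-in table so the example runs without the new library.

```prolog
% lookup_or_default(:Lookup, +Default, +Key, -Value)
% Call Lookup(Key, Found); fall back to Default when the lookup fails.
lookup_or_default(Lookup, Default, Key, Value) :-
    (   call(Lookup, Key, Found)
    ->  Value = Found
    ;   Value = Default
    ).

% Stand-in for a table-backed semidet lookup (illustration only).
demo_script(0'a, latin).
demo_script(0x3B1, greek).
```

With library(unicode_security) loaded, a call such as lookup_or_default(unicode_script, common, Code, Script) would bring back the old default, assuming unicode_script/2 takes the code point as its first argument; the notes above do not spell out the argument order.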

macOS packaging

  • ADDED: BUILD_MACOS_FRAMEWORK installs libswipl as
    swipl.framework (Versions/A/swipl), so third-party apps can
    link -framework swipl. A new findHomeFromFramework() uses
    dladdr() on the framework binary to locate the Prolog home
    without environment variables or a swipl.home file.
  • ADDED: BUILD_MACOS_BUNDLE arranges the split layout the
    installer ships: swipl-win.app in /Applications,
    swipl.framework in /Library/Frameworks, with install rpaths
    that resolve the framework both relative to the app and at the
    absolute system path.
  • ADDED: ninja pkg produces a single Developer-ID-signed and
    Apple-notarised .pkg (pkgbuild + productbuild flow with
    welcome / license / conclusion pages and a left-pane logo).
    MacPorts/Homebrew dylibs are bundled into the framework with
    rewritten @rpath/<basename> references and a hardened-runtime
    signature. Universal builds merge the arm64 and x86_64 trees,
    name the pkg fat, and drive signing, notarisation and stapling
    in a single run.

xpce

  • ADDED: library(pce_symbol_picker) — a non-modal singleton
    (and modal pick_symbol/1) to browse Unicode blocks and curated
    code ranges, type the picked symbol into the focused window, and
    remember recents. Supports user-defined code_range/3 lists,
    matching pairs, and a filter that searches either block names or
    Unicode character names.
  • ADDED: PceEmacs / Epilog insert_symbol command bound to
    C-x 8 RET and C-x 8 s. Opens the symbol picker targeting the
    invoking editor or terminal.
  • ADDED: PceEmacs normalize_region / normalize_buffer
    M-x commands to apply a Unicode normalisation form (nfc, nfd,
    nfkc, nfkd) to the region or whole buffer (conditional on
    library(unicode)).
  • ADDED: window ->pdf prints the full bounding box of a window
    rather than only <-area; display_manager <->focus_message
    fires a message on keyboard-focus changes; text_item gains an
    optional clear_image clickable icon.
  • ENHANCED: text <-pointed gains a round argument selecting
    caret-style rounding (snap to nearest gap) versus exact
    hit-testing, and now uses the Pango layout for hit-testing rather
    than summing per-character widths — clicks land on the right
    glyph for proportional fonts and font-fallback runs (emoji,
    Greek, math).
  • ENHANCED: SDL backend Windows font fallback chains add Thai /
    Lao (Leelawadee UI) and Yi syllables (Microsoft Yi Baiti); new display
    methods query and override SDL’s on-screen-keyboard policy.
  • FIXED: ws_discard_input() no longer reads a stale fd
    after a socket-based dispatch hook. Replaced the persistent
    dispatch_fd cache with a small console registry populated at
    init time (stdin) and by Epilog pty creation; ws_dispatch()
    watches an fd only for the duration of one call, and discard
    uses tcflush() / FlushConsoleInputBuffer() rather than
    read().
  • FIXED: Timer and Frame callbacks now hold a code reference
    around SDL_PushEvent, eliminating a use-after-free when the
    object is destroyed before the queued event is drained.
  • FIXED: graphical ->pdf negates the page offset
    (previously a graphical at a non-zero position produced a blank
    page); xpce printf %c handles non-ASCII code points via Put()
    rather than snprintf’s byte-collapsing path.
  • FIXED: man class hierarchy icons; thread monitor icons;
    PceEmacs class menu (raised a type error).
  • MODIFIED: auto_copy class variable is now also defined on
    terminal_image.

libedit

  • ADDED: Ctrl+Left / Ctrl+Right (xterm ESC[1;5D / ESC[1;5C)
    are bound to word motion (ed-prev-word / em-next-word, and the
    matching vi-mode bindings). The xpce terminal emits the same
    sequences for the Ctrl modifier on the cursor keys.

C API and C++ binding

  • MODIFIED: PL_predicate() now takes a UTF-8 string.
    Identifier names are program objects rather than externally
    encoded data.
  • C++ binding (packages/cpp):
    • PlModule, PlPredicate, PlFunctor and the
      PlCompound(functor, args) convenience constructors now
      interpret their functor-name argument as UTF-8 (consistent
      with PL_predicate()) and no longer take a PlEncoding
      argument. Text-parsing PlCompound(text[,enc]) constructors
      still take PlEncoding.
    • PlTerm::unify_atom(const std::string&) gains a
      PlEncoding parameter for symmetry with the const char*
      overload.
    • pl2cpp.plx documents the PlEncoding enum, the
      ENC_INPUT / ENC_OUTPUT defaults and the trailing-encoding
      constructor / method forms.

Other

  • FIXED: Reset cached JIT index decisions when a predicate’s
    supervisor changes. A “not indexable” verdict made on a transient
    clause shape (e.g. while an autoload triggered through
    trapUndefined() re-entered Prolog) could otherwise stick and
    disable JIT indexing for the eventual stable clause set.
  • FIXED: engine_destroy/1 no longer self-deadlocks; destroying
    a running engine from another thread no longer crashes.
  • FIXED: Confusable detection now handles zero-arity compounds.
  • MODIFIED: The code walker does not track below call/1. This
    avoids false positives for undefined-predicate detection when
    using call(Goal).
  • FIXED (packages/clib): bsd-crypt.c no longer
    unconditionally #includes crypt.h, so the fallback compiles
    on systems that ship neither libc crypt() (glibc ≥ 2.39) nor
    libxcrypt (crypt.h is now guarded by HAVE_CRYPT_H).
  • FIXED (packages/http): the proxy test suite reserves its
    unused port by holding a bound, unlistened socket, so concurrent
    ctest jobs cannot grab the port and turn an expected
    ECONNREFUSED into a successful connect.
  • FIXED (packages/ltx2htm): \verb/\verbatim bodies round-trip
    as UTF-8 (issue #8), so literal non-ASCII content no longer renders
    double-encoded in HTML output.
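
The port-reservation idea from the packages/http item can be sketched with library(socket). This is not the test suite's code; reserve_port/2, connect_refused/1 and release_port/1 are hypothetical names, and the refused connect assumes the usual behaviour of a bound socket that is not listening.

```prolog
:- use_module(library(socket)).

% reserve_port(-Socket, -Port): bind a socket to a free port picked by the
% OS, but never call tcp_listen/2 on it. While Socket stays open, no other
% process can bind Port.
reserve_port(Socket, Port) :-
    tcp_socket(Socket),
    tcp_bind(Socket, Port).

% connect_refused(+Port): true if connecting to Port on localhost raises
% an error. If the connect unexpectedly succeeds, close the stream and fail.
connect_refused(Port) :-
    catch(( tcp_connect(localhost:Port, Stream, []),
            close(Stream),
            fail
          ),
          error(_, _),
          true).

% release_port(+Socket): give the port back.
release_port(Socket) :-
    tcp_close_socket(Socket).
```

A test would call reserve_port(S, P), run the scenario that expects a refused connection on P, and call release_port(S) afterwards.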

Documentation pipeline

  • The PDF manual builds without the utf8proc package: lualatex
    handles the Unicode content directly.

    Enjoy — Jan

Thanks for that Jan. OK if I close out my pull request, as it’s superseded by this?

That’s an interesting, creative way to handle Ps/Pe, but the
implementation may be a little too lax. For example, I find that the
behaviour is not always consistent when the brackets don’t appear in pairs.

Here is an example:

?- X = foo(⟩).
X = foo('⟩').

?- X = foo(}).
ERROR: Syntax error: Illegal start of term

These two happen in a clean install, e.g. after starting Windows Sandbox and installing SWI-Prolog there. I think they have been around for a while and did not start with 10.1.8. Yes, it is a long list. I do get that the new Epilog is a big jump, and that this is a development version.

  1. On Windows, there are issues with the directory in which SWI-Prolog starts, i.e. which directory is current after clicking a file in Explorer or in the Task Bar jump list. From the jump list (which lists pl-files that have been clicked in Explorer) the directory is always c:/windows/system32. If clicked in Explorer, it is always Documents/Prolog. The current directory used to be the directory where the pl-file is located in the file system.
  2. Also, the pl-file extension is not properly registered with the system: clicking a pl-file in Explorer pops up a “Choose executable” dialog. This is probably caused by item 1.
  3. There is a problem with input lines that wrap onto another line. If you type ‘write(test),write(test)…’ repeatedly so that the input extends onto the next line and then press backspace on the upper line, the cursor jumps to the line above. This could affect the Linux version as well; I haven’t checked 10.1.8.
  4. Pasting is done with Ctrl-V in Epilog and Shift-Ctrl-V in PceEmacs, which is annoying. And ‘paint it and it gets copied to the cut buffer’ (which is incredibly handy) is gone.
  5. Sometimes Epilog is waiting for something, and splitting a window (a very handy feature!) then messes up the whole display, on both Windows and Linux.

Thanks for the list! I’ll go through them in more detail later.

No, copying on selection (item 4) is not gone, but it is off by default. Add *.auto_copy: @on to the resource file and it should work again. It seems that this is the direction things are moving. The resource is @off by default on Windows and macOS and @on on other (Unix-oriented) systems.

I’m not sure I get point (1). Version 10.1.9 (which will appear soon) should fix associating .pl files with swipl-win.exe. An extra check-box in the installer allows you to choose whether or not to make this association. Normally (on Windows):

  • The desktop icon/start shortcut should start swipl-win.exe --win-app. This causes it to switch to Documents/Prolog, creating this directory on first use.
  • (New) opening a .pl file runs swipl-win <file.pl>, which (on Windows) causes Prolog to start, change directory to the directory of <file.pl> and load the file.
  • 10.1.9 also supports dropping files on Epilog (consults the file) and PceEmacs (opens the file in a tab).

Point (3) does not reproduce for me. Please give the width of the window (tty_size/2) and the exact text entered.

Yes. Ctrl-V is a basic Emacs command (Page down). It means nothing in Epilog. Would it help to add a binding Shift-Ctrl-V for Epilog?

By waiting, do you mean it is running some Prolog goal, or is it locked up? I’ve seen a couple of cases where the GUI freezes. In those cases it is properly waiting for a new SDL event, but that event never arrives. I’m nowhere near a sequence that makes it freeze though, which makes it rather hard to debug. Scenarios are welcome …