A safer C API

That kind of technology is provided by the "ffi" pack for SWI-Prolog. This parses the C header files and provides both the types and access to the #define constants. That comes with its own problems :frowning:

I guess we could add some sort of enum facility to PL_scan_options(). The API quickly gets rather complicated though :frowning: Next step would be range checks for numeric options … Possibly C++ can do a better job here as I think it can provide access to the names and values of enum types. That still does not fix the whole problem as enum values are typically either prefixed (C) or in some C++ name space. In most cases remove these when defining the Prolog API.

A possible alternative would be to do something similar as the embedded Python library, which comes in a release and debug version. The debug version has a lot of additional runtime checks. As is, the SWI-Prolog C API has very few checks, i.e., if you feed it something illegal the behaviour is fully undefined (read: typically it simply crashes). We could add runtime checks. For a term_t, this means it must be part of one of the valid “foreign frames”. Typically there are not many valid term_t values and thus some other integer can be reported as a runtime error that would probably have avoided this discussion :slight_smile: Something similar, though less selective, can be done for the other integer handles. We could also consider a runtime switch for these checks, although that may have too much impact on performance.

I gave it a try to add basic validity checks to the API functions. As is, this checks the term_t, atom_t and functor_t parameters. Well, I probably forgot some checks … Anyway, somewhat to my surprise, the performance impact appears negligible. In part this is because this protects the public API, while the most time critical usage of the C API internally bypasses the public API.

Re-running @Boris example, we now get

swipl -g run_tests -t halt swiplite.pl
[10/10] sqlite3:bad_mode ..
ERROR: API error: invalid term_t 531973 (out of range)
Time: Thu May 30 10:55:21 2024
Inferences: 668871
Thread: 1 (main)
C-stack trace labeled "API":
  [0] save_backtrace() at /home/jan/src/swipl-devel/src/os/pl-cstack.c:337 [0x7fbf095d5708]
  [1] vsysError() at /home/jan/src/swipl-devel/src/pl-init.c:1845 [0x7fbf0953b889]
  [2] vfatalError() at /home/jan/src/swipl-devel/src/pl-init.c:1897 [0x7fbf0953bae4]
  [3] valid_term_t___LD() at /home/jan/src/swipl-devel/src/pl-fli.c:143 [0x7fbf0955691a]
  [4] PL_domain_error() at /home/jan/src/swipl-devel/src/pl-error.c:851 [0x7fbf094f97f2]
  [5] pl_sqlite3_open() at /home/jan/Bugs/sqlite/swiplite.c:46 (discriminator 1) [0x7fbf09841321]
  [6] PL_next_solution___LD() at /home/jan/src/swipl-devel/src/pl-vmi.c:4506 (discriminator 2) [0x7fbf094ebb5b]

This is now enabled by default. If anyone experiences significant impact on performance, it can be disabled using cmake -DVALIDATE_API=OFF at build time.

Note that this may cause broken extensions to fail fatally while before they were lucky enough to avoid a fatal crash. This happened to the bundled sgml/xml parser.

Please report issues such as performance degradation or new crashes. If there is too much impact we should disable this by default and establish a more careful path.

@peter.ludemann: following earlier discussion, I added PL_api_error() that is similar to raising a system error, except that it prints ERROR: API error:, indicating we are dealing with illegal use of the API rather than an issue in the Prolog system itself.

With this flag, we could define atom_t etc al as structs, and catch all such mistakes at compile time, without worrying about performance.

[ moved to a new topic ]

I think we have been there :slight_smile: There are two main issues with that. One is that we can no longer compare atoms and functors. So, this type of code no longer works. That is a no-go

  if ( PL_get_atom_ex(term, &a) )
  { if ( a == ATOM_hello )
    else if ( a == ATOM_world )

I know it can work in C++, but not in C. The second worry is about ABI compatibility (and possibly slow ABIs for struct passing still being in use).

I (still) think that the more promising route would be to cast them to some pointer type. As these integers are all pointer-sized (intptr_t), this is possible. That may also have some performance impact on some CPUs, but probably limited because SWI-Prolog (as almost all Prolog systems) already heavily depends on casting between pointers and integers.

I pushed the second incarnation of these runtime checks (actually forgot to push the first …). This patch adds PL_free_term_ref(), which (I think) can solve the issues of piling term_t handles with the C++ interface. It has little use for the C interface as, rather than creating and freeing, we can typically easily reuse the handle. This currently does not deal with PL_new_term_refs(n). As far as the system are concerned, these are just independent term handles. This could be added if useful (enough).

More importantly though, the new checking distinguishes between term_t from predicate arguments and those created by the user. The idea is that we cannot free term_t handles that are predicate arguments. But, we can also not use the PL_put_*() functions on these. These checks are now implemented, which raised three issues in the various extensions. Two were awkward, but fairly innocent. The last was a real bug waiting to happen. As often, using PL_put_*() on an argument is not allowed, but in some cases you can get away with it. Note that a type-based check does not work here, but a runtime check does.