Changes to SWI-cpp.h

peter.ludemann · July 30, 2022, 2:24am

(For PlTerm)
This doesn’t work for long, int, bool, float (and term_t and string) because they’re either reserved words or types; they need a get_XXX. So, for consistency, I think it makes sense to leave the “get_” everywhere except for PlTerm::type(), PlTerm::c_str() and PlTerm:wc_str() (char* and wchar_t* can probably be made into implicit conversion operators, so c_str() and wc_str() wouldn’t be needed in most places). Also, casting to size_t interferes with casting to term_t (I got a error message from compiler), so we need a PlTerm::get_size_t() …or perhaps PlTerm::get_int_sz()? [Even though size_t is the same as uint64_t on most platforms, I’d prefer to keep it separate]

(No need for the “_sz” with PlTermv::size(), as long as I rename the field size to size_, which is a common C++ convention that the Google style guide likes)

There may be other tweaks I need to make … I haven’t yet made all the changes to SWI-cpp.h and also haven’t propagated them to the places where it’s used.

jan · July 30, 2022, 11:51am

We are talking about using the C interface from C++, no? If all the other conversions are removed we could keep this one, no? Otherwise the user can write term_t{t} or t.ref (If I recall well). The theory should be that using the native C interface from C++ should be an exception to be used with some rarely used C functions that are not worth making C++ alternatives for (I think). Normally, the C++ alternative should wrap the C primitive in an OO style and (at least) add exceptions.

Bad luck

peter.ludemann · July 30, 2022, 6:23pm

If C/C++ ever gets a PEG/packrat parser, then it’ll be able to remove the notion of “keywords” (one of the things that PL/I got right, although I have no idea what technology they used for parsing - probably some ad-hoc thing). And then we’ll be able to add methods with names like long(), int(), etc.

Anyway, I found some of the problems with ambiguous conversions when I changed uses of ref to *this, depending on the PlTerm → term_t conversion. This implicit conversion shouldn’t be needed outside of SWI-cpp.h, unless it’s for a relatively rare function in SWI-Prolog.h; and if it is, the best solution is to add it as a method to class PlTerm. (I think that you and I are in violent agreement)

(BTW, I’ve moved ref into private, so it can only be accessed now by term_t{t} or t.get_term_t())

jan · July 31, 2022, 1:11pm

I don’t know whether we want to cover all of SWI-Prolog.h in the C++ interface. At least the current implementation is more geared towards the relatively small subset that support 99% of the applications. As this starts to look like a completely new C++ interface we should also consider that the current C interface defines a lot of functions that should probably be considered deprecated. Should we officially deprecate these? Might be hard as some functions merely have a better alternative for specific use cases.

peter.ludemann · July 31, 2022, 4:06pm

Marking the deprecated items in SWI-Prolog.h would be great. A few are marked as deprecated in the documentation, but I suspect that there are more.
Is there a C equivalent for [[deprecated("Use XXX")]]?

peter.ludemann · August 2, 2022, 6:55pm

I’ve been adding some methods for wchar_t* and std::wstring … these cause ambiguities, for example with PlCall(). And, if in future we add support for UTF8 strings, there’ll be even more ambiguities.

So, I think it’s best to make most (all?) of the implicit conversion operators into explicit and to suggest programmers use method calls (t.c_str(), t.wcstring(), etc.) rather than cast operators or type initializers (I suspect that most C++ programmers don’t understand the subtle differences\ between int(x), static_cast<int>(x), and int{x}, nor know whether their compiler optimizes these or not).

One classic mistake I’ve seen C++ programmers make is using c_str() when calling an overloaded function that supports both std::string and const char*. Making the conversions explicit increases the chances of making this mistake, but I don’t see an easy way around it – it’s just one of the C++'s problems because of its history that derives from C (which in turn made the mistake of using ASCIZ encoding to save a byte).

It’s probably a good idea to encourage use of std::wstring everywhere – there’s probably a lot of foreign function code that would break if given kanji or even Cyrillic. I don’t think that the API can do anything to encourage this, other than making the conversion to std::wstring implicit and all other conversion operators explicit. Any opinions?

jan · August 3, 2022, 7:28am

So be it, I guess. I never liked C++ Too complicated for me Well, the exception handling is still practical …

peter.ludemann · August 3, 2022, 3:05pm

Me too (I used several Pascal-like systems programming languages before C, and found C to be a terrifying step backwards) – but I like C even less. (As long as I stick with the “safe” subset of C++)

Anyway, I’ve more-or-less decided to make a new API (SWI-cpp2.h) – the changes aren’t large, but they are incompatible with the existing API; it should be easy to migrate from the current API to the new one.

The main changes will be:

support for std::string, std::wstring
PlTerm becomes a “base class” without a public constructor, and the constructors that take a string become a subclass PlTermAtom
Rename subclass PlString to PlTermString for consistency
New subclasses PlTermInt and PlTermFloat for the constructors that take numbers
New subsclass PlTermVar for creating a term ref with an uninstantiated variable

As before, I’ll check this out with rolog.cpp, rocksdb, swipl-win, and I’ll write some more test cases. I don’t know when I’ll get this done; working on this is just a hobby.

peter.ludemann · August 7, 2022, 12:06am

Status update: I’ve mostly finished the changes I proposed, but the documentation is only partially updated. (The SWI-cpp.h file also needs a bit more clean-up).
There’s a short summary of the changes starting at line 50 of packages-cpp/pl2cpp.doc at fec3ff6141f9e42fa57f87fba42a03238abd547f · kamahen/packages-cpp · GitHub

You can see my changes here: GitHub - kamahen/packages-cpp at m2
and the changes to the programs that use it here:
Comparing SWI-Prolog:master...kamahen:cpp · SWI-Prolog/packages-swipl-win · GitHub
Comparing JanWielemaker:master...kamahen:cpp · JanWielemaker/rocksdb · GitHub
Comparing mgondan:main...kamahen:cpp · mgondan/rolog · GitHub

swipl-win appears to work OK.
I’ve run the tests for rocksdb … also, it has undergone more changes, to make better use of the C++ API, so it supplements the tests in test_cpp.pl.
I’ve compiled rolog but haven’t tested it.

Are there any other users of SWI-cpp.h that I’ve missed?

As usual, comments, criticisms, or (even) applause are all welcome.

kuniaki.mukai · August 7, 2022, 11:29am

It is impressive to hear comment on Pascal. I am not familiar with Pascal, but it was not the first time for me to hear similar comments on C compared with Pascal. Also around before 1970, a Japanese group proposed AlgolN as a successor of Algol 60 and Algo 68. A leading member of the goup was Nobuo Yoneda, who is famous rather in category theory for Yoneda’s lemma. Around 20 years ago, I read some article on why Algol did not get popular. The author explained, it was because some authorized commitee did not exploit Yoneda’s proposal for some reason. Although I am not appropriate to judge anything, but it is just a rumor for me. Anyway your comment on Pascal reminde me of fore-runners great efforts on Programming languages before the monster C, though I feel like C personally because it is more close to Turing machine language, i.e., ridiculous reason.

peter.ludemann · August 7, 2022, 5:32pm

I’m hoping that someone will volunteer to do a code review (please contact me offline, because I’m still making small changes both to the code and the documentation). I think that it’s “good enough” for now, given the small number of people using it; and I can make incremental improvements.

We need to coordinate the (next?) release of SWI-Prolog and the changes to rocksdb and rolog (swipl-win isn’t a problem because it’s part of the regular release process). Is it possible to put a version dependency into a pack?

I can prepare PRs, and @jan can merge the ones for packages/cpp, packages/swipl-win, rocksdb; hopefully @mgondan1 can do the same for rolog. Hopefully, nobody else is affected.

peter.ludemann · August 8, 2022, 10:23pm

I’ve created PRs for the changes. I’m not 100% happy with what I’ve done (there are possibly some naming conventions that need to be improved, particularly in distinguishing between methods that throw an exception for failure and those that return a bool to indicate success. Also, it needs a lot more unit tests. (And so does rocksdb, which depends heavily on these changes)

@jan: ENHANCED: Version 2 of C++ interface by kamahen · Pull Request #16 · SWI-Prolog/packages-cpp · GitHub
@jan: Changes to SWI-cpp.h API by kamahen · Pull Request #17 · SWI-Prolog/packages-swipl-win · GitHub
@jan: Changes to SWI-cpp.h API by kamahen · Pull Request #12 · JanWielemaker/rocksdb · GitHub
@mgondan1: Changes to SWI-cpp.h API by kamahen · Pull Request #2 · mgondan/rolog · GitHub

peter.ludemann · August 9, 2022, 4:12pm

@Jan sent me a private note with some suggestions and questions. I’ll paraphrase his comments here. (And a few I’ll put as “code review” comments)

All classes in the end embed (only) a C handle. Doesn’t it make sense
to have the same method for accessing this, e.g. c, so we can do
PlFunctor f;
...
PL_something(f.c(), ...)

I know it is common to provide access functions for members to keep
binary compatibility, but as this is only inline code I doubt that makes
much sense, so f.c and c as a public member would do as well?

The general convention with C++ is for struct to be for POD (plain old data) that might have some convenience functions and class is for everything else; the biggest difference is whether the default visibility is “public” (struct) or “private” (class); and typically fields in a class follow a convention (suffix “_” at Google, prefix “m_” at GNU_, etc.). The C++ wrappers in SWI-cpp.h seem to be half-way in between – they’re POD in the sense that they just wrap a handle, but they provide quite a few “convenience” functions, so I’ve decided that they should be “class”.

As Jan says, there’s no need for getters/setters for binary compatibility; the question is how to access the underlying handle when needed. The compiler will expand PL_something(f.c(), ...) to PL_something(f.functor_, ...) so efficiency isn’t the issue. In general, when someone wants to access a PL_something() in SWI-Prolog.h, we should want to make that available in SWI-cpp.h (as f.something(...)), so the convenience of accessing the wrapped handle shouldn’t be a significant issue.

Perhaps leave the handle private and access it by C()? Making the method name upper-case would also make it easy to find with a regular text editor.

Why PlAtom::reset() and PlTerm::reset()?

I’m following the style of “smart pointers”, by having a function for setting the value to “null” (for atom_t and term_t, the “null” is 0, although I don’t think it’s documented anywhere). I forgot to include PlAtom::reset(atom_t) and PlTerm::reset(term_t). [I’ve also created “null” constants – they’re “private” for now.].

If we access the wrapped handle using C(), then we need a method for setting the value. If we make the wrapped handle public (as in the original SWI-cpp.h, then there’s no need for C() and reset(). I don’t have a strong opinion on this; the atom and term wrappers aren’t really smart pointers, so it could be confusing to follow the smart pointer conventions.

(We could, of course define a dereference operator (*) instead of C(), but I think that would be confusing because atom_t and term_t don’t always act like pointers in the SWI-Prolog.h functions.

some questions about is_valid()

The idea is that a constructor should be written like this:

PlAtom::PlAtom(const char *text) 
    : handle_(PL_new_atom(text)) { ... }

and not like this:

PlAtom::PlAtom(...) {
    handle_ = PL_new_atom(text);
    ....
}

If the system guarantees that PL_new_atom() always succeeds or else terminates the program with a fatal error, then there’s no need for is_valid() in the constructor.

If PL_new_atom() fails due to out-of-memory, does that always give a fatal error? From looking at various bits of foreign function code and swipl internal code, it appears that a “resource error” is a possibility; and that’s why I added the test to the constructors.

However, there’s a second use for is_valid() – some code uses a “null” value (that is: 0) to indicate that the value hasn’t been set. We could create a operator bool() to check for validity, but there is also a “get_bool” defined for terms (which calls PL_get_bool(), which checks whether the term is an atom with the value true or false (and throws an exception if it’s any other value). This is confusing, so I decided to remove the operator bool() and require the check to be explicit.

[BTW, Jan originally wrote PlAtom constructor an initializer list, but some other constructors weren’t that way and I’ve modified them to use an initializer list where possible]

I’m unsure about the unify_*_ex() functions. Seems a lot of stuff that is barely ever needed. Why not PlTerm::must_be_unbound()? You typically only need that for returning blobs, so it could also be part of a C++ blob infrastructure.

When I was applying my changes to users of SWI-cpp.h, I found a number of places where unification was assumed to succeed. However, Jan told me in a private email that a check is always needed because even when unifying with a newly created term (from PL_new_term_ref(), which is encapsulated in the PlTerm_var constructor), there can be failure, typically from out-of-memory.

So, I annotated the regular unification methods with [[nodiscard]] to catch such errors and added the “must succeed” methods that throw an exception if they fail (with the suffix “_ex”). If my understanding is wrong, I could remove the “_ex” methods, and programmers can use write, e.g., (void)t.unify_term(t2) to turn off the warning.

peter.ludemann · August 10, 2022, 3:27am

On further thought, I think that “*” for accessing the underlying atom_t, term_t, etc. works quite well – if you think of it as “unwrapping” rather than “dereferencing” (and reset() is part of the convention for smart pointers, so that fits nicely). So, t.atom().get_atom_t() becomes *t.atom().

If nobody objects, I’ll make this change and update the PRs (together with the other things that @jan has commented on.

As for blobs … I intend to work on an API for them. It’ll probably need at least C++-11 and possibly C++-17, but I assume that’s acceptable (rocksdb already requires C++-17 to build librocksdsb.so.

jan · August 10, 2022, 8:44am

But why? I might be wrong (no C++ user), but I thought smart pointers are there to deal with life time of the object pointed at by means of reference counters, no? The Prolog handles do not serve that role (possibly except for atom_t where you can (un)register the atom. It is rarely needed though and (un)register at constructor/destructor potentially seriously impact performance on highly concurrent applications). You normally never reset these handles, the same way that it is rare for C variables bound to one of these types to be reassigned.

That is a bit of an open issue. As is, PL_new_atom() cannot fail. It could make sense to allow this to fail when creating huge atoms and malloc fails. For small atoms it makes little sense. It is pretty unlikely you can meaningfully recover without the recovery running out of something again. So yes, there is some reason for an is_valid() test, but it must by simply testing for null. Prolog has no means to verify that some other handle is valid or not. Well, it could test that the value might be valid. The value of that is limited though as it may become invalid asynchronously due to atom-GC.

At least GCC’s C warning for unused return is not fooled by casting it to void Anyway. Unification in Prolog can almost always fail and the foreign predicate should almost always return with failure (after cleaning up if necessary). Failure of PL_unify() has two reasons: logical failure and resource errors. In 99% of the cases the foreign predicate doesn’t care as it needs to do the same thing. The only exception are predicates that create some resource and make it available to Prolog as a blob. In that scenario we better consider it an error if the output unification fails (logically). I’d implement that is part of the (future) blob wrapper though. As is, term.unify() should return true/false for logical results and raise a C++ exception otherwise (by re-throwing PL_exception(0)).

Note that the C style for dealing with the Prolog API is typically like this if no cleanup is required (which is often not the case as Prolog backtracking will do its job if the sequence fails somewhere in the middle).

  ...
  return ( PL_*() && PL_*() && ... );

The C++ version will involve quite a bit more code because the exception and logical failure cases run different code paths.

This looks like C++ level two I’m usually in favor of shorter notations though

peter.ludemann · August 10, 2022, 5:10pm

I have been guilty of overly complicating things and mis-naming.

So …

I’ll make the wrapped value (atom_t, term_t, etc.) available as “public” - no need for a “getter” (or “*” operator). For now, I’ll call it “C” for each class (so far, PlFunctor, PlAtom, PlTerm).
PlCompound’s fields will remain private. I haven’t made any changes to PlFrame, PlQuery, PlEngine, etc. (yet). (I’ll probably do minor cosmetic changes there, and remove copy constructors, as per one filed Issue.)

Each class will also have a null and methods is_null() and not_null() - as @jan pointed out, is_valid() has implications beyond just testing for null.

For unification, I think it makes sense to introduce a pseudo-exception PlUnificationFailure. This would be an additional “catch” in the code for PREDICATE and would simply return false – the existing foreign function call would notice if there was also a regular exception and handle things appropriately.
This will allow the C-style code of

  return ( PL_f1(t, ...) && PL_f2(t, ...) && ... );

to be done in C++ as

  t.f1(...);
  t.f2(...);

and without a need to check the return values of unification unless it’s needed for the logic of the code (typically, for non-deterministic foreign predicates).
I’ll also add a PlCheck(…) that can be used to wrap a C function (in SWI-Prolog.h) to cause an exit from the function. This would allow writing the C-style code like this, if the desired methods are missing from SWI-cpp.h:

  PlCheck(PL_f1(t.C, ...));
  PlCheck(PL_f2(t.C, ...));

As for blobs: they’ll be a future PR. There are too many changes already in this PR.

peter.ludemann · August 11, 2022, 12:12am

I’ve pushed the changes to the code … the documentation has not been updated.

peter.ludemann · August 21, 2022, 8:37pm

It seems that g++ 12 has problems with SWI-cpp.h – discussion is here: Problems compiling Rocksdb
I intend to install g++ 12 and see if I can reproduce the problems … it takes a while to download and build the compiler, so I don’t know when I’ll be able to do this.

EDIT: I could not reproduce the problem after building and installing g++ 12

jan · August 22, 2022, 7:12am

Weren’t you using Ubuntu 22.04? gcc-12 is available there. Just the default is gcc-11.

peter.ludemann · August 22, 2022, 2:11pm

I have Ubuntu 20.04, whose default is gcc-9 and which doesn’t have a gcc-12 in a PPA. (An upgrade has been offered, but last time I tried that, I ended up having to a complete re-install, so I’m not switching to 22.04 right now.)
And my Chromebook uses Debian 11 bullseye, whose default is gcc-10, and for which I haven’t figured out how to add an “unauthorized” PPA (such as for swi-prolog).

Topic		Replies	Views
Ann: SWI-Prolog 9.1.17 Releases	19	492	October 24, 2023
Raw to byte or character conversion Help!	23	1368	July 18, 2022
Install/use swi-prolog for C++ General	16	784	August 19, 2023
Cpp2 exceptions General	73	1622	April 29, 2023
Ann: SWI-Prolog 9.1.0 Releases	7	608	December 7, 2022

Changes to SWI-cpp.h

Related topics