Apple M1 can now be compiled with gcc-mp-12

After updating to Apple Ventura 12.2.1, Clang 14.0, updating MacPorts, and installing gcc12 from MacPorts as gcc-mp-12, SWI-Prolog now builds and runs on the M1. Earlier versions raised quite a few issues in the test suite. One remains:

 % [53/120] cpp:new_chars_2 ..libc++abi: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc

@peter.ludemann ?

After configuring with

CC=gcc-mp-12 CXX=g++-mp-12 cmake -DCMAKE_BUILD_TYPE=PGO -G Ninja ..

It runs the PGO training benchmarks 10% faster than the Clang 14 version (and 15% faster than using gcc12 on AMD3950X).


My guess is that this is from a stress test of SWI-cpp.h (not SWI-cpp2.h) and can be fixed by adding this to the definition of PREDICATE:

	  } catch ( std::bad_alloc& ) \
	  { return PlResourceError("memory").plThrow(); \
	  } \

If that doesn’t fix things, I’ll have to do some more research (ideally with a way of reproducing this on Ubuntu, because I don’t have an M1).
[There is a way of preventing std::bad_alloc, but that would require testing the results of new for being non-nullptr, which is probably something we don’t want to do.]

EDIT: I’ve added this to SWI-cpp2.h before the catch(PlException&) [because I’ll be making PlException a subclass of std::exception]; and there’s a test for it in test_cpp.{cpp,pl}, which only tests SWI-cpp2.h. (It’ll be a “little while” before I can update SWI-cpp2.h, and I have no plans to add tests for SWI-cpp.h unless @jan wants me to.)

It is from test_cpp.pl, using test_cpp.cpp, including SWI-cpp2.h. So far we did not see this, so it might still be an issue with gcc on the M1. Previous versions produced a Prolog binary that failed many of the tests. I’ll investigate it when I have time. This was just a quick check after updating Macports.

Presumably, it’s this test, where PREDICATE’s catch should turn the std::bad_alloc into a Prolog resource_error(memory):

test(new_chars_2, error(resource_error(memory))) :-
    too_big_alloc_request(Request),
    new_chars(Request, Result),
    delete_chars(Result).

There are some comments with this test about maximum values with ASAN – it’s possible that MacOS has different maximum allocation values and can’t properly handle the std::bad_alloc with the amount this test requests. Perhaps reduce the Request value in too_big_alloc_request/1? (We only need a value that’s bigger than all available memory, to ensure that the test raises an allocation error.)

There’s one other test in test_cpp.pl that could potentially cause a std::bad_alloc – test square_roots_2b – although that ought to be intercepted by the Prolog engine and turned into resource_error(stack).

Looks like a compiler bug. There is clearly a catch of std::bad_alloc, yet that is exactly the exception it complains is not caught. I tried lowering the request: 2^56 should be enough, as AFAIK no 64-bit CPU on the market can address more. Same result. Tried a catch(...); still the same result.

Played a bit more with the request. Using 1TB, the allocation succeeds. This implies the OS overcommits, as my M1 only has 16GB :slight_smile: Bumping this to 256TB causes the Apple Clang version to work and the gcc version to crash on std::bad_alloc. Possibly an ABI incompatibility between Apple’s core libraries and gcc?

Note that both MacOS and Linux normally overcommit, so malloc succeeds as long as there is enough virtual address space. What happens when you start actually using too much of the returned area is probably OS dependent. AFAIK on Linux, the OOM killer kicks in and heuristically selects a process to kill so that the OS survives.

I don’t know whether this means std::bad_alloc is practically non-existent on some OSes, and thus having a test case for it is dubious/not portable? Note that if I change your malloc/2 predicate to throw std::bad_alloc on a request > 1GB, the test does pass. I.e., an explicit throw std::bad_alloc() is processed correctly.

I think that the problem is in my test – I assumed that new[] and malloc() were the same, but they aren’t necessarily the same.

If the test is changed to use new[] and delete[], it probably works correctly.

There are a couple of other bugs in the “out of memory” tests – I must have written them before I had had my morning coffee. I’ll put together a fix for them – hopefully today (Pacific time).

This should fix the M1 tests:

It doesn’t :frowning: Looks like a gcc bug – or more likely an incompatibility with the Apple libraries, but that must be sorted out between them. Still, it isn’t clear what this brings: typically the allocation succeeds due to memory overcommit, and only filling the memory gets the OS into trouble. I don’t know what MacOS does then. On Linux the OOM killer takes action and kills processes on heuristic grounds so the OS survives. catch() doesn’t help here :slight_smile:

Here’s a test case, with output on my machine. It runs the test for both malloc() and new[] - in the former, nullptr is returned and in the latter std::bad_alloc is thrown.

#include <iostream>
#include <cstdlib>
#include <new>  // std::bad_alloc

void test(size_t size) {
  try {
    char* p = new char[size];
    if (p == nullptr) {
      std::cerr << "new[] returned 0 for size=" << size << std::endl;
    } else {
      std::cerr << "new[] returned non-0 for size=" << size << ": " << static_cast<void*>(p) << std::endl;
      delete[] p;
    }
  } catch(const std::bad_alloc& e) {
    std::cerr << "bad_alloc caught(1) with size=" << size << ": " << e.what() << std::endl;
  }
  try {
    char* p = static_cast<char*>(malloc(size));
    if (p == nullptr) {
      std::cerr << "malloc returned 0 for size=" << size << std::endl;
    } else {
      std::cerr << "malloc returned non-0 for size=" << size << ": " << size << ": " << static_cast<void*>(p) << std::endl;
    }
    free(p);
  } catch(const std::bad_alloc& e) {
    std::cerr << "bad_alloc caught(2) with size=" << size << ": " << e.what() << std::endl;
  }
}


int main() {
  test(10);
  test(0xffffffffffffff);
  return 0;
}
$ g++ malloc_bug.cpp && ./a.out
new[] returned non-0 for size=10: 0x1cceeb0
malloc returned non-0 for size=10: 10: 0x1cceeb0
bad_alloc caught(1) with size=72057594037927935: std::bad_alloc
malloc returned 0 for size=72057594037927935

Agreed. However, there are cases where std::bad_alloc is thrown, and for completeness, I wanted PREDICATE to catch it and convert it to a Prolog resource error. There might be other C++ exceptions that should have similar handling. Note that try/catch has no overhead if there’s no exception thrown, so there’s no performance hit in catching lots of different exceptions (except for code bloat).

The rather unexpected result is that all works as expected on g++12 on Linux, clang on Apple and g++ 12 on Apple …

I’m curious – what is the output of my test program on an M1?

BTW, malloc() has a “no throw guarantee” according to the documentation I’ve read. Also, there’s a “nothrow” variant of new[] - if you wish, I can add that to the little test program.

jan-m1 ~/Bugs/Apple > c++ -o t t.cpp    
jan-m1 ~/Bugs/Apple > ./t
new[] returned non-0 for size=10: 0x600001484040
malloc returned non-0 for size=10: 10: 0x600001484040
bad_alloc caught(1) with size=72057594037927935: std::bad_alloc
malloc returned 0 for size=72057594037927935
jan-m1 ~/Bugs/Apple > g++-mp-12 -o t2 t.cpp 
jan-m1 ~/Bugs/Apple > ./t2
new[] returned non-0 for size=10: 0x6000007cc040
malloc returned non-0 for size=10: 10: 0x6000007cc040
bad_alloc caught(1) with size=72057594037927935: std::bad_alloc
malloc returned 0 for size=72057594037927935

Ok. That makes your current test more portable. Both MacOS and Linux do throw. Now it gets really weird, as the gcc-12 catch does work in a simple standalone project, but not when creating a SWI-Prolog plugin!?

I’ll make a better standalone test and also add it to the test suite. I don’t want to report a bug when there isn’t one.


Something is weird here … my original test mistakenly used malloc() instead of new[], and I saw an exception (which I assumed was from C++, but maybe it was from Prolog?); this was on Debian. I’ll try to reproduce that older test and proceed carefully from there. I’ll also improve my standalone test to make the messages clearer.

A couple of big picture questions:

  • Does this mean we can get back to performance numbers on Intel similar to what was observed pre-M1 days?
  • The last couple of dev releases have been built with libbf because of build issues with GMP; is that still the case?

Not likely. gcc for the M1 cannot cross-compile for Intel, and while it is probably possible to make it do so, it is not easy. Even if it could, it is unclear whether profile-guided optimization would work through the Intel emulation on the M1.

Yes. The GMP version crashes some weird way with recent builds. No clue why.

Possibly related – I’m seeing a memory leak from GMP when I load C++ foreign code (but don’t execute it); as far as I know, my C++ code doesn’t do anything with GMP, and the .so doesn’t reference GMP (the swipl executable does). The memory leaks are from __gmp_default_allocate (called by __gmpz_init_set_ui). This might be an artefact of how I run ASAN – I do LD_PRELOAD="/lib/x86_64-linux-gnu/libasan.so.6:libstdc++.so.6" because otherwise ASAN can’t handle C++ throw properly.

libstdc++ is part of the foreign executable and not part of the swipl executable. The references that are in swipl and not in the C++ .so are:
libgmp
libswipl
libtinfo
libz