OK. I compiled for debugging, which reveals that all atom-creating threads hang in reserveAtom(). I managed to make it work using the patch below.
[root@cd3e54ac4605 src]# git diff
diff --git a/src/pl-atom.c b/src/pl-atom.c
index 9fae71203..a0f569606 100644
--- a/src/pl-atom.c
+++ b/src/pl-atom.c
@@ -472,7 +472,7 @@ reserveAtom(void)
 #endif /*O_ATOMGC*/
   for(;;)
-  { index = GD->atoms.highest;
+  { index = __atomic_load_n(&GD->atoms.highest, __ATOMIC_ACQUIRE);
     idx = MSB(index);
     assert(index >= 0);
Now, I’m not completely happy. What I do not get is that, no matter what you change, it works fine after an incremental build. It breaks only after a clean build. Running ninja clean before rebuilding, rather than starting completely from scratch, also triggers the bug. @dmchurch, do you have any idea why this could be?
It looks a lot like a C compiler bug. If I use gdb and step at the instruction level, it loops executing an unconditional jump to itself. That smells fishy to me. The only reasoning I can come up with is this: if gcc decides there is no need to load GD->atoms.highest from memory, and subsequently no need to do any of the other loads either, it can infer that if this loop doesn’t break out the first time it never will, and thus it can just as well do something silly.
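To make that reasoning concrete, here is a minimal sketch of the pattern, not the real reserveAtom(). The names highest_index, publish_highest() and wait_for_highest() are made up for illustration, and the spin bound is only there so the example terminates. The point is that the atomic load must be performed on every iteration, whereas a plain read of a variable that nothing in the loop writes may legally be done once and reused.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GD->atoms.highest; not SWI-Prolog's real layout. */
static size_t highest_index = 0;

/* Called by whichever thread extends the table (assumed writer side). */
static void publish_highest(size_t value)
{ __atomic_store_n(&highest_index, value, __ATOMIC_RELEASE);
}

/* Spin until highest_index reaches `wanted`, giving up after max_spins.
 * The __atomic_load_n() forces a fresh read on each iteration. With a plain
 * `highest_index >= wanted` test and no other memory access in the loop,
 * the compiler would be allowed to read the variable once, and the loop
 * could then never see another thread's update.
 */
static bool wait_for_highest(size_t wanted, unsigned long max_spins)
{ for(unsigned long i = 0; i < max_spins; i++)
  { size_t index = __atomic_load_n(&highest_index, __ATOMIC_ACQUIRE);

    if ( index >= wanted )
      return true;
  }

  return false;
}
```

If that is what happens here, the unbounded for(;;) version with a plain read would collapse into exactly the kind of jump-to-self I see in gdb. Since an unsynchronised plain read racing with a write is a data race under the C11 memory model, gcc would be within its rights, which would make this our bug rather than the compiler's.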
I suspect @dmchurch may be able to give more insightful comments. I’m having a hard time reading assembler, and with all the inlining that takes place there are few anchors that tell me where I am.