@ridgeworks discovered that 8.1.26 is up to nearly twice as fast as the current version on MacOS. I first kind of didn’t believe it, but in the end it proves true. What changed is that I installed the latest (10.15.4) MacOS version on the build machine and I got two Xcode updates since. Current version using clang 11.0.3. I don’t know the other versions
See this on the old but still useful Pereira benchmark suite below.
Anyone has a clue what causes this? Did Apple really mess up so badly or did they change some default flag/added additional security/… that causes this? Hints on how to get back to good old performance are welcome.
8.1.26 from download page
tail_call_atom_atom took 0.38-0.09=0.29/145946 = 2.0 usec/iter.
binary_call_atom_atom took 0.41-0.06=0.35/94737 = 3.7 usec/iter.
cons_list took 0.29-0.06=0.24/91525 = 2.6 usec/iter.
walk_list took 0.36-0.07=0.28/122727 = 2.3 usec/iter.
walk_list_rec took 0.34-0.08=0.26/125581 = 2.1 usec/iter.
args(1) took 0.34-0.07=0.26/120000 = 2.2 usec/iter.
args(2) took 0.29-0.05=0.24/81818 = 3.0 usec/iter.
args(4) took 0.28-0.03=0.25/54545 = 4.5 usec/iter.
args(8) took 0.31-0.02=0.29/33333 = 8.7 usec/iter.
args(16) took 0.32-0.01=0.31/19355 = 15.8 usec/iter.
cons_term took 0.30-0.05=0.24/84375 = 2.9 usec/iter.
walk_term took 0.33-0.07=0.26/110204 = 2.4 usec/iter.
walk_term_rec took 0.48-0.08=0.41/122727 = 3.3 usec/iter.
shallow_backtracking took 0.29-0.27=0.02/415385 = 0.0 usec/iter.
deep_backtracking took 0.23-0.04=0.20/59341 = 3.3 usec/iter.
choice_point took 0.66-0.06=0.60/94737 = 6.3 usec/iter.
trail_variables took 0.56-0.06=0.51/87097 = 5.8 usec/iter.
medium_unify took 0.79-0.51=0.29/771429 = 0.4 usec/iter.
deep_unify took 0.70-0.15=0.55/235161 = 2.3 usec/iter.
integer_add took 0.63-0.03=0.60/49091 = 12.2 usec/iter.
floating_add took 0.53-0.03=0.49/40909 = 12.1 usec/iter.
arg(1) took 0.30-0.03=0.27/40000 = 6.8 usec/iter.
arg(2) took 0.31-0.03=0.28/40909 = 6.9 usec/iter.
arg(4) took 0.28-0.03=0.26/37500 = 6.9 usec/iter.
arg(8) took 0.28-0.03=0.26/38217 = 6.8 usec/iter.
arg(16) took 0.28-0.03=0.26/38298 = 6.8 usec/iter.
index took 0.57-0.07=0.50/100000 = 5.0 usec/iter.
assert_unit took 1.03-0.00=1.03/1525 = 676.1 usec/iter.
access_unit took 0.32-0.02=0.30/26471 = 11.4 usec/iter.
slow_access_unit took 0.02-0.00=0.02/1607 = 11.2 usec/iter.
setof took 0.20-0.01=0.20/7692 = 25.4 usec/iter.
pair_setof took 0.23-0.01=0.22/6522 = 34.3 usec/iter.
double_setof took 0.46-0.00=0.46/1837 = 251.5 usec/iter.
bagof took 0.17-0.01=0.16/10112 = 16.2 usec/iter.
34 benchmarks took 11.16 seconds
8.1.26 rebuilt using latest clang
tail_call_atom_atom took 0.75-0.13=0.62/145946 = 4.2 usec/iter.
binary_call_atom_atom took 0.73-0.08=0.65/94737 = 6.9 usec/iter.
cons_list took 0.68-0.08=0.60/91525 = 6.6 usec/iter.
walk_list took 0.88-0.11=0.77/122727 = 6.2 usec/iter.
walk_list_rec took 0.76-0.12=0.64/125581 = 5.1 usec/iter.
args(1) took 0.69-0.11=0.58/120000 = 4.8 usec/iter.
args(2) took 0.65-0.07=0.58/81818 = 7.1 usec/iter.
args(4) took 0.69-0.05=0.64/54545 = 11.7 usec/iter.
args(8) took 0.76-0.03=0.73/33333 = 22.0 usec/iter.
args(16) took 0.86-0.02=0.84/19355 = 43.6 usec/iter.
cons_term took 0.73-0.09=0.64/84375 = 7.6 usec/iter.
walk_term took 0.89-0.12=0.77/110204 = 7.0 usec/iter.
walk_term_rec took 1.08-0.11=0.97/122727 = 7.9 usec/iter.
shallow_backtracking took 0.42-0.38=0.04/415385 = 0.1 usec/iter.
deep_backtracking took 0.44-0.05=0.39/59341 = 6.5 usec/iter.
choice_point took 1.10-0.09=1.01/94737 = 10.7 usec/iter.
trail_variables took 0.97-0.08=0.89/87097 = 10.3 usec/iter.
medium_unify took 0.98-0.72=0.26/771429 = 0.3 usec/iter.
deep_unify took 0.74-0.22=0.52/235161 = 2.2 usec/iter.
integer_add took 0.85-0.04=0.81/49091 = 16.4 usec/iter.
floating_add took 0.74-0.04=0.70/40909 = 17.2 usec/iter.
arg(1) took 0.49-0.04=0.46/40000 = 11.5 usec/iter.
arg(2) took 0.51-0.04=0.47/40909 = 11.5 usec/iter.
arg(4) took 0.47-0.03=0.43/37500 = 11.5 usec/iter.
arg(8) took 0.48-0.04=0.44/38217 = 11.6 usec/iter.
arg(16) took 0.48-0.04=0.45/38298 = 11.7 usec/iter.
index took 0.95-0.09=0.86/100000 = 8.6 usec/iter.
assert_unit took 0.81-0.00=0.81/1525 = 531.1 usec/iter.
access_unit took 0.57-0.02=0.54/26471 = 20.5 usec/iter.
slow_access_unit took 0.03-0.00=0.03/1607 = 19.9 usec/iter.
setof took 0.24-0.01=0.24/7692 = 30.7 usec/iter.
pair_setof took 0.27-0.01=0.26/6522 = 40.2 usec/iter.
double_setof took 0.54-0.00=0.54/1837 = 292.9 usec/iter.
bagof took 0.24-0.01=0.23/10112 = 22.5 usec/iter.
34 benchmarks took 19.42 seconds