How many days will it realistically take to code up a working prototype of WAM (without optimization)?

I think you sometimes literally compile the last call into iteration,
depending on the determinism of the current clause. This compilation
gives quite a speed-up. But I guess it also accounts for a
further budget of 10-100 days of compiler design and implementation:

app([X|Y], Z, [X|T]) :- app(Y, Z, T).

?- vm_list(app).
       0 h_list_ff(3,4)
       3 h_void
       4 h_list
       5 h_var(3)
       7 h_firstvar(5)
       9 h_pop
      10 i_enter
      11 l_nolco('L1')
      13 l_var(0,4)
      16 l_var(2,5)
      19 i_tcall
L1:   20 b_var(4)
      22 b_var1
      23 b_var(5)
      25 i_depart(app/3)
      27 i_exit

The depart is also found in “A Portable Prolog Compiler”, Clocksin
et al., 1983, as an addition in the section “Some Additions to
the Intermediate Language”; it's not part of the 7 basic instructions.

Clocksin refers to a Warren 1980 paper concerning the depart instruction.
What SWI-Prolog added further is l_nolco and i_tcall, with l_nolco doing
the determinism check, right? And i_tcall doing the looping jump. It seems
you always need the check, since freeze/2 can sneak in non-determinism.
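
To make the freeze/2 remark concrete, here is a minimal sketch. The predicates p/1, q/0 and solutions/1 are hypothetical names made up for this illustration. p/1 has a single clause and its last call is to a fact, so it looks deterministic when read on its own, yet a goal suspended on the argument wakes up when head unification binds it and leaves a choice point behind:

```prolog
% Hypothetical predicates, for illustration only.
p(a) :- q.      % single clause, last call to q/0
q.

% Collect every solution of p(V) while a non-deterministic goal
% is suspended on V. Binding V to a in the head of p/1 wakes up
% member/2, which succeeds twice.
solutions(Vs) :-
    findall(V, ( freeze(V, member(_, [1, 2])), p(V) ), Vs).
```

?- solutions(Vs).
Vs = [a, a].

So the number of clauses alone does not tell whether the frame can be reused, which is presumably why the check has to happen at run time.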

Depart is indeed in there, but not how to implement it (if I recall correctly). As we have a sequence of instructions that build the arguments for the new call followed by a depart, we do not know there is a depart rather than a call. So, old SWI-Prolog created a new frame. It then found the depart and the determinism, overwrote the old frame with the new frame and continued. That is problematic though. In the old days, the environment stacks had internal reference pointers and you could thus not simply copy the frame. Bart Demoen talked me into never having reference pointers in the environments. This means that if we need such a pointer we create a variable on the global stack and make the reference from there. That is a bit of overhead, but stack frame manipulation gets a lot simpler, so it pays off.

But still, we create all arguments on the new frame and then copy them. That is where the two branches come in: if we can do a tail call we modify the current frame. The current instruction set for that is quite limited. If there is no way, we still go the old route, but that is infrequent now. The i_tcall is not a loop call. It could be if there is only one clause, but otherwise we need to restart the clause search (could be improved). Self calls are common and often time critical though, and we save an argument as well as some checks.

The one day budget is almost over. Homework at the end here.
You are right, “depart” is only specified, in the section
“Some Additions to the Intermediate Language”:

[image not reproduced]

In contrast to call:

But the ZIP subscribes to the idea that body goals need a
call instruction, which then leads to the idea that predicate
calls are similar to routine calls in ordinary programming
languages. Not all Prolog systems deal with the body of
clauses this way. See also Paul Tarau's paper “Hitchhiker’s
Guide to Reinventing a Prolog”. If you abandon this idea of
calls, you can really do a Prolog system with 4 instructions.

The thaw/4 variant of data/3 below takes a variables frame V,
and we see that the pop instruction is somewhat redundant if
the functor instruction also carries an arity:

thaw(V, X)   --> [var, K], {arg(K, V, X)}.
thaw(V, C)   --> [functor, F/N], 
                 {functor(C, F, N), C=..[_|L]}, 
                 thaw_list(V, L), 
                 [pop]. 
thaw(_, X)   --> [const, X].

thaw_list(_, [])    --> [].
thaw_list(V, [X|L]) --> thaw(V, X), thaw_list(V, L).
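
The redundancy of pop can be sketched as follows. This is only an illustration under the assumption that every functor instruction carries F/N; the names thaw_nopop//2 and thaw_nopop_list//2 are hypothetical. Since functor/3 already builds exactly N argument slots, the list helper knows when to stop without an end marker:

```prolog
% Pop-free decoder sketch (hypothetical names).
thaw_nopop(V, X) --> [var, K], {arg(K, V, X)}.
thaw_nopop(V, C) --> [functor, F/N],
                     {functor(C, F, N), C =.. [_|L]},
                     thaw_nopop_list(V, L).   % L has exactly N elements
thaw_nopop(_, X) --> [const, X].

thaw_nopop_list(_, [])    --> [].
thaw_nopop_list(V, [X|L]) --> thaw_nopop(V, X), thaw_nopop_list(V, L).
```

?- thaw_nopop(v(Y), S, [functor, f/3, var, 1, functor, g/2,
                        const, a, var, 1, const, b], []).
S = f(Y, g(a, Y), b)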

We can already do a copy_term/2:

?- T=f(X,g(a,X),b), numbervars(T, 1, _), 
    data(T, L, []), !, thaw(v(Y), S, L, []).
S = f(Y, g(a, Y), b)
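
The query relies on data/3, which is not shown in this post. For readers who want to run it, here is a sketch of what data/3 might look like, assuming it is simply the inverse of thaw and treats '$VAR'(K) terms produced by numbervars/3 as variables. This version is a guess, not the original definition, and data_list//1 is a hypothetical helper. thaw is repeated from above so the block loads on its own:

```prolog
% Assumed encoder: term -> instruction list (inverse of thaw).
data(X) --> { nonvar(X), X = '$VAR'(K), integer(K) }, !, [var, K].
data(C) --> { compound(C) }, !,
            { C =.. [F|L], length(L, N) },
            [functor, F/N],
            data_list(L),
            [pop].
data(X) --> [const, X].

data_list([])    --> [].
data_list([X|L]) --> data(X), data_list(L).

% Decoder, as given above.
thaw(V, X)   --> [var, K], {arg(K, V, X)}.
thaw(V, C)   --> [functor, F/N],
                 {functor(C, F, N), C =.. [_|L]},
                 thaw_list(V, L),
                 [pop].
thaw(_, X)   --> [const, X].

thaw_list(_, [])    --> [].
thaw_list(V, [X|L]) --> thaw(V, X), thaw_list(V, L).
```

With this encoder, f(X,g(a,X),b) after numbervars(T, 1, _) becomes
[functor, f/3, var, 1, functor, g/2, const, a, var, 1, pop, const, b, pop].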

Implementing a vanilla interpreter is left as an exercise.

There are many reasons why we would consider constructing a WAM from scratch. The most obvious is to gain a detailed and practical understanding of the Prolog compiler. Another reason is to keep using Prolog while not being restricted to its existing syntax. Built-in syntactic flexibility would keep most of us happy. By syntactic flexibility I mean the ability to reorder and rename terms and operators in a manner equivalent to the “canonical” way.
A typical example would be the ability for the programmer to code in a way closer to the syntax of his/her natural language:
equals(X, Y):- … becomes
… (X, Y)ቢያክል
where the variables precede the predicate name, and the conditional :- is replaced by ቢ, its equivalent in a given natural language.
The absence of such syntactic flexibility is most probably the main reason why we seek a “better” programming language. Most of us are happy with Prolog, though we would want to see improvements of this kind. But the ISO standards are unlikely to yield to such drastic requests. Would development teams consider it feasible within the existing framework? Not sure.