But now I am trying to read/set the value of a prolog variable directly from assembler.
If I have some code like this:
asm_test(A) :-
    call_asm((
        mov r11,<address of A>;
        mov [r11],0x1122334455;
        mov rax, 1;   % this means call_asm succeeds
        ret
    )),
    A == 0x1122334455.
How do I get the address of variable A within the prolog stack?
PS. I know I have to deal with the term_t handles, gmp numbers, and tagged integers, but for now I just want to get the address of the variable (hopefully without calling C code).
What is your actual problem? What leads to this? What are you trying to do/build/make/accomplish? What is the bigger picture? What project are you working on? http://xyproblem.info
I doubt there is much to gain in terms of speed. If it is about access to foreign data without a C compiler, there is the libffi package. Anyway, more low-level access can always be interesting. I’ve been thinking about that as well, leaning towards a good LLVM integration.
By default, a foreign function receives a normal argument vector of term_t objects (there is also a more efficient calling convention that is mostly used internally). A term_t is an offset to the local stack base (to which you have no direct access). There you find objects that are described in src/pl-data.h. There are quite a few indirections involved, mostly defined as C macros and inline functions. Note that you cannot assign anything there: you must use unification.
Rather complicated. The source has a notion of LD, which is the local data for an engine/thread. The structure is defined in pl-global.h and contains stacks.local.base. The LD pointer is thread specific and the way it is implemented depends on the OS. The typical Unix implementation uses pthread_getspecific().
If you use the PL_FA_VARARGS calling convention, your function is called from pl-vmi.c:3736 as
rc = (*f)(h0, DEF->functor->arity, &context);
where context is a struct foreign_context pointer that provides the LD pointer for the calling engine.
Note that you cannot store the local stack base. Of course, every thread has its own. Worse, the stack base address can change on a stack expansion that may be triggered by any of the PL_* functions that allocate space from one of the stacks. The LD pointer for a specific thread never changes.
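To illustrate why a stored stack base goes stale while an offset-style handle stays usable, here is a minimal self-contained sketch in C. It does not use SWI-Prolog's headers: model_engine, model_stack, handle_t and the engine_* functions are hypothetical stand-ins for LD, the local stack and term_t, and the growth policy is an assumption made only to force the stack to move.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT SWI-Prolog's real types. */
typedef uintptr_t handle_t;       /* plays the role of term_t: an offset */

typedef struct {
    intptr_t *base;               /* plays the role of stacks.local.base */
    size_t    used;               /* number of cells in use */
    size_t    capacity;           /* number of cells allocated */
} model_stack;

typedef struct {
    model_stack local;            /* the engine struct itself never moves */
} model_engine;

/* Allocate the initial stack; returns 1 on success, 0 on failure. */
int engine_init(model_engine *e, size_t capacity) {
    e->local.base = malloc(capacity * sizeof *e->local.base);
    e->local.used = 0;
    e->local.capacity = capacity;
    return e->local.base != NULL;
}

void engine_free(model_engine *e) {
    free(e->local.base);
    e->local.base = NULL;
}

/* Push one cell and return its handle through *out.
 * When the stack is full it is copied to a new, larger block,
 * so any raw pointer into the old block becomes invalid. */
int engine_push(model_engine *e, intptr_t value, handle_t *out) {
    if (e->local.used == e->local.capacity) {
        size_t new_cap = e->local.capacity ? e->local.capacity * 2 : 1;
        intptr_t *moved = malloc(new_cap * sizeof *moved);
        if (moved == NULL)
            return 0;
        memcpy(moved, e->local.base, e->local.used * sizeof *moved);
        free(e->local.base);
        e->local.base = moved;
        e->local.capacity = new_cap;
    }
    *out = (handle_t)e->local.used;
    e->local.base[e->local.used++] = value;
    return 1;
}

/* Resolve a handle against the CURRENT base, never a cached one. */
intptr_t *engine_cell(model_engine *e, handle_t h) {
    return &e->local.base[h];
}
```

In this model the engine pointer plays the part of LD: it can be kept for the lifetime of the thread, and the base is re-read through it on every access. A handle obtained before a growth step still resolves to the right cell afterwards, whereas a pointer computed from the old base would not.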
I’m afraid it is hard to avoid function abstraction for accessing what is behind a term_t. Its indirection is needed to tell the system which terms are accessible from foreign code and to allow the system to move the underlying data during GC and stack expansions. Some of these functions might profit from assembly optimization, though. See pl-fli.c.
What are you after? Fun? Performance? Thing you cannot do from C?
For now I am just having fun. We’ll see what it turns into.
Thanks!! I didn’t know about PL_FA_VARARGS passing the LD in context. This will do it I think! Assuming I am staying within the same thread my plan is to:
Store LD and the offset of the local stack base
Any time I need the local stack base I will use the LD and the offset to get to the local stack.
However, because term_ts are behind so much indirection, I may just provide a way to call the PL_unify* functions from within assembler. They will be called at the very end of the assembly routine, to bind the variables in whatever way the assembly code wants to do it.
I am thinking about doing this with dlsym turned into a predicate to give me the address of the functions. Then I can call that from within assembler using the regular C calling conventions.
? term_t is a type alias for (eventually) uintptr_t, i.e. an integer large enough to hold a pointer, which is the (integer) offset. Using the VARARGS calling convention it is the offset of the first predicate argument and the others are simple +1, +2, …
Using the indirection of the local stack base you get a pointer to the argument block of the called predicate.
Not really during. The unification opportunistically hopes there is enough space to trail the assignments and create wakeup nodes for constraints. If this is not the case, it rewinds, garbage collects and/or resizes the stacks, and retries.
test(call_prolog_unify, [ setup(mmap(4096,Map)),
                          cleanup(munmap(Map))
                        ]) :-
    exerun:et_dlsym('PL_unify_integer',PLUnifyAddr),  % get address for PL_unify_integer
    L = [A],                                          % Just to declare variable A
    exerun:et_term_t1(A,AHandle),                     % get term_t for A
    format('ahandle=~w',[AHandle]),
    asm((
        mov rdi, AHandle;      % 1st argument of PL_unify_integer
        mov rsi, 0x1122;       % 2nd argument of PL_unify_integer
        mov rax, PLUnifyAddr;
        call rax;              % Call PL_unify_integer
        mov rax,1;             % Fake return TRUE to see the output of assertion below
        ret
        ),
        C, []),
    % Put the asm code in memory and run it
    C = mcode(Code,_Opts),
    mmap_put(Map,Code),
    mmap_run(Map,0),
    % Now see if it worked
    assertion(A == 0x1122).
Here is the output of the assertion:
ahandle=401
ERROR: /home/u/p.pl:862:
test call_prolog_unify: assertion failed
Assertion: _724388==4386
PL_unify_integer(401,0x1122) is being called properly. Tracing in gdb shows that it fails to unify A with 0x1122 because of this line:
canBind(*p) returns FALSE, because the term_t does not contain a reference to a variable with *p set to 0. In other words, setVar was not called for A.
I would think setVar is called right after the VM evaluates L=[A] but this is not happening.
In gdb, I see that *p == <non-zero integer>, which means p is not pointing to a variable, even though the term_t handle is properly passed to PL_unify_integer.
Here is some of the other code which may help to debug:
Provided I correctly interpret what all this means, this seems problematic. term_t references are only valid during a foreign language call. So, exerun:et_term_t1(A,AHandle) gets a handle for A, but as it completes the handle is invalidated.
You could probably write a predicate in C that takes as arguments a number of Prolog parameters (packed in a term or list) and an address for the code to run. In that case the arguments remain valid because you are still in the scope of the C-defined predicate.
I need the term_t handle (or anything I can use to reference the variable) before running the code, so that the asm((...)) can use the handle in the assembly code.
If I use the first argument with a PL_FA_VARARGS calling convention, and then calculate the offset of variables that way (using LD from the context argument), will it get invalidated also?
Not really, because asm is not a foreign predicate (it does not have access to term_t); it is a basic assembler written in Prolog (I know you will like that).
What I could do, with an API like the one you suggest, is guarantee that the foreign predicate call (to get the variable address/handle) and the asm predicate call are next to each other.
If I bundle the foreign call with the asm predicate call, would it be guaranteed that the address calculated with the LD->stacks.local.base+<offset> does not get invalidated?
I could even dereference that address once or as many times as necessary.
This other option just came to my mind now:
Call the Prolog asm predicate from within the foreign C function. Would the term_t remain valid after a PL_open_query(..)/PL_next_solution()/PL_close_query() within the same foreign function?
LD->stacks.local.base is not constant. It doesn’t change often, but can change almost any time in theory. But, as you now seem to be calling the PL_* functions, you only need to pick up the term_t and consider them pointer-size integers.
You get term_t handles for the arguments. As I said, this is basically just a reference to the location of the argument vector on the stack. A foreign call also creates a foreign environment using PL_open_foreign_frame(), to be closed with PL_close_foreign_frame() on return (well, for foreign calls from the core there is a simplified more low level implementation). This environment contains term_t handles you create using PL_new_term_ref() and similar functions. Note that all these are allocated on the environment stack and foreign frames and predicate invocations must thus be nested consistently.
So, I think the only thing you can write is a foreign C-defined predicate that takes a set of Prolog arguments and a pointer to the assembler generated code. Can be e.g.
call_asm(+Ptr)
call_asm(+Ptr, +A0),
call_asm(+Ptr, +A0, …).
Bind all these to a VARARGS C predicate and you can pick up the handles really easy.
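As a rough illustration of that dispatch, here is a self-contained C model that does not depend on SWI-Prolog's headers. handle_t stands in for term_t, and asm_entry, model_call_asm and record_args are hypothetical names. The model assumes the generated code follows the C calling convention and takes the handle of A0 plus a count. In a real predicate the code address would be read from the Ptr argument with one of the PL_get_* functions; here it is passed in directly to keep the sketch standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for term_t: a pointer-size integer. */
typedef uintptr_t handle_t;

/* Assumed signature for the generated code: it receives the handle of
 * the first user argument (A0) and the number of user arguments, and
 * returns nonzero for success. */
typedef int (*asm_entry)(handle_t first_arg, int nargs);

/* Model of one VARARGS-style predicate body serving call_asm/1,
 * call_asm/2, and so on. h0 is the handle of Ptr; the remaining
 * arguments are h0+1, h0+2, ... */
int model_call_asm(handle_t h0, int arity, asm_entry code) {
    if (arity < 1 || code == NULL)
        return 0;                          /* fail: nothing to run */
    return code(h0 + 1, arity - 1) != 0;   /* skip Ptr, pass A0.. */
}

/* Stand-in for generated code: records what it was given. */
handle_t seen_first;
int seen_count;

int record_args(handle_t first_arg, int nargs) {
    seen_first = first_arg;
    seen_count = nargs;
    return 1;
}
```

The handles are only meaningful while the enclosing foreign call is active, which is why the code is run from inside the predicate rather than after it returns. For instance, model_call_asm(401, 3, record_args) hands the stand-in code handle 402 with a count of 2.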