Setting the value of a variable directly from assembler

I already have code that works to fail/succeed a predicate from assembler, so something like the following works:

succeeds :-
   call_asm((
              mov rax, 1;      % succeed
              ret
   )).

fails :-
   call_asm((
              mov rax, 0;      % fail
              ret
   )).

?- succeeds.
true.

?- fails.
false.

But now I am trying to read/set the value of a prolog variable directly from assembler.

If I have some code like this:

asm_test(A) :-
   call_asm((
      mov r11,<address of A>;
      mov [r11],0x1122334455;
      mov rax, 1; % this means call_asm succeeds
      ret
   )),
   A == 0x1122334455.

How do I get the address of variable A within the prolog stack?

PS. I know I have to deal with the term_t handles, gmp numbers, and tagged integers, but for now I just want to get the address of the variable (hopefully without calling C code).

What is your actual problem? What leads to this? What are you trying to do/build/make/accomplish? What is the bigger picture? What project are you working on? http://xyproblem.info

I am doing an experiment to integrate prolog tightly with assembly. That’s it.

I doubt there is much to gain in terms of speed. If it is about access to foreign data without a C compiler, there is the libffi package. Anyway, more low-level access can always be interesting. I’ve been thinking about that as well, leaning towards a good LLVM integration.

By default, a foreign function receives a normal argument vector of term_t objects (there is also a more efficient calling convention that is mostly used internally). A term_t is an offset relative to the local stack base (to which you have no direct access). There you find objects that are described in src/pl-data.h. There are quite a few indirections involved, mostly defined as C macros and inline functions. Note that you cannot assign anything there: you must use unification.

Hope this gives a start.

Thanks, that is helpful.
Any idea on how to get (indirectly) the local stack base?

Rather complicated. The source has a notion of LD, which is the local data for an engine/thread. The structure is defined in pl-global.h and contains stacks.local.base. The LD pointer is thread-specific, and the way it is implemented depends on the OS. The typical Unix implementation uses pthread_getspecific().

If you use the PL_FA_VARARGS calling convention, your function is called from pl-vmi.c:3736 as

rc = (*f)(h0, DEF->functor->arity, &context);

where context is a struct foreign_context pointer that provides the LD pointer for the calling engine.

Note that you cannot store the local stack base. Of course, every thread has its own. Worse, the stack base address can change on a stack expansion that may be triggered by any of the PL_* functions that allocate space from one of the stacks. The LD pointer for a specific thread never changes.
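To make the point about not storing the base concrete, here is a minimal toy model in C. It is not SWI-Prolog code: `toy_stack`, `handle_t`, `toy_push` and `toy_resolve` are hypothetical names invented for this sketch, and `realloc` merely stands in for a stack expansion. It only illustrates why an offset stays valid across a move while a raw pointer does not.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in SWI-Prolog. */
typedef uintptr_t handle_t;      /* plays the role of term_t: an offset, not a pointer */

typedef struct {
    uintptr_t *base;             /* may change whenever the stack grows */
    size_t     used;             /* number of cells in use */
    size_t     capacity;         /* number of cells allocated */
} toy_stack;

/* Allocate room for `capacity` cells (must be > 0); returns 0 on failure. */
static int toy_init(toy_stack *s, size_t capacity) {
    assert(capacity > 0);
    s->base = malloc(capacity * sizeof *s->base);
    s->used = 0;
    s->capacity = capacity;
    return s->base != NULL;
}

/* Push a cell and return its handle. Growing may move the whole block,
   so any raw pointer obtained earlier must be considered stale. */
static handle_t toy_push(toy_stack *s, uintptr_t value) {
    if (s->used == s->capacity) {
        size_t ncap = s->capacity * 2;
        uintptr_t *nbase = realloc(s->base, ncap * sizeof *nbase);
        assert(nbase != NULL);   /* a toy: real code would report the error */
        s->base = nbase;
        s->capacity = ncap;
    }
    s->base[s->used] = value;
    return (handle_t)s->used++;
}

/* Resolve a handle against the *current* base. The returned pointer
   must not be kept across a call that can grow the stack. */
static uintptr_t *toy_resolve(toy_stack *s, handle_t h) {
    assert(h < s->used);
    return &s->base[h];
}

/* Release the stack storage. */
static void toy_free(toy_stack *s) {
    free(s->base);
    s->base = NULL;
    s->used = s->capacity = 0;
}
```

In this model, a handle obtained from `toy_push` can be resolved correctly after any number of further pushes, whereas a pointer saved from `toy_resolve` before those pushes may point into freed memory.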

I’m afraid it is hard to avoid function abstraction for accessing what is behind a term_t. Its indirection is needed to tell the system which terms are accessible from foreign code and allow the system to move the underlying data during GC and stack expansions. Some of them might profit from assembly optimization though :slight_smile: See pl-fli.c.

What are you after? Fun? Performance? Thing you cannot do from C?

For now I am just having fun :slight_smile: We’ll see what it turns into.

Thanks!! I didn’t know about PL_FA_VARARGS passing the LD in context. This will do it I think! Assuming I am staying within the same thread my plan is to:

  • Store LD and the offset of the local stack base
  • Any time I need the local stack base I will use the LD and the offset
    to get to the local stack.

However, because term_ts are behind so much indirection, I may just provide a way to call the PL_unify* functions from within assembler. They will be called at the very end of the assembly routine, to bind the variables in whatever way the assembly code wants to do it.

I am thinking about doing this with dlsym turned into a predicate to give me the address of the functions. Then I can call that from within assembler using the regular C calling conventions.

BTW, is there an easy way to get the (integer) term_t handle from within prolog? Or do I need to write the (simple) get_term_t foreign function?

? term_t is a type alias for (eventually) uintptr_t, i.e. an integer large enough to hold a pointer, which is the (integer) offset. Using the VARARGS calling convention, it is the offset of the first predicate argument, and the others are simply +1, +2, …

Using the indirection of the local stack base you get a pointer to the argument block of the called predicate.
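As a rough illustration of that addressing scheme, the following toy model in C mimics the shape of a VARARGS call, with argument i found at offset h0 + i from the current base. Everything here is hypothetical (`handle_t`, `toy_context`, `toy_arg`, `toy_call`, `toy_all_equal`); the real structures behind term_t involve tagged words and further indirection that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; these names are illustrative, not SWI-Prolog's. */
typedef uintptr_t handle_t;                    /* plays the role of term_t */

typedef struct {
    uintptr_t *local_base;                     /* stands in for the local stack base */
} toy_context;

typedef int (*toy_pred)(handle_t h0, int arity, toy_context *ctx);

/* Argument i of the call lives at offset h0 + i from the current base. */
static uintptr_t *toy_arg(toy_context *ctx, handle_t h0, int i) {
    assert(i >= 0);
    return &ctx->local_base[h0 + (handle_t)i];
}

/* Example predicate: succeed (1) if all arguments hold the same value. */
static int toy_all_equal(handle_t h0, int arity, toy_context *ctx) {
    for (int i = 1; i < arity; i++) {
        if (*toy_arg(ctx, h0, i) != *toy_arg(ctx, h0, 0))
            return 0;
    }
    return 1;
}

/* Dispatcher with the same shape as a call of the form (*f)(h0, arity, &context). */
static int toy_call(toy_pred f, toy_context *ctx, handle_t h0, int arity) {
    return (*f)(h0, arity, ctx);
}
```

The predicate never stores a pointer: it recomputes each argument address from the context and the handle every time, which is the habit that survives a moved stack.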

But … be careful. Simply unifying against one may move the entire block!

Thanks, especially for this:

I keep forgetting that the stack could be reallocated during unification.

Not really during. The unification opportunistically hopes there is enough space to trail the assignments and create wakeup nodes for constraints. If this is not the case, it rewinds, garbage collects and/or resizes the stacks, and retries :slight_smile:
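The try, rewind, resize, retry pattern can be sketched with a toy trail in C. This is only a model of the control flow described above, under the assumption that "not enough space" is detected mid-operation. `toy_trail`, `toy_try_bind`, `toy_grow` and `toy_bind` are hypothetical names, and garbage collection is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy trail: records bindings so they can be undone. Not SWI-Prolog code. */
typedef struct {
    size_t *entries;
    size_t  top;
    size_t  capacity;
} toy_trail;

/* Optimistically record n bindings. On overflow, rewind to where we
   started and report failure (0) so the caller can make room and retry. */
static int toy_try_bind(toy_trail *t, size_t n) {
    assert(t->top <= t->capacity);
    size_t mark = t->top;
    for (size_t i = 0; i < n; i++) {
        if (t->top == t->capacity) {
            t->top = mark;               /* rewind the partial work */
            return 0;
        }
        t->entries[t->top++] = i;
    }
    return 1;
}

/* Double the trail; the block may move, like a stack expansion. */
static int toy_grow(toy_trail *t) {
    size_t ncap = t->capacity ? t->capacity * 2 : 1;
    size_t *n = realloc(t->entries, ncap * sizeof *n);
    if (n == NULL)
        return 0;
    t->entries = n;
    t->capacity = ncap;
    return 1;
}

/* Try, and on failure grow and retry. Returns the number of retries
   needed, or -1 if memory could not be obtained. */
static int toy_bind(toy_trail *t, size_t n) {
    int retries = 0;
    while (!toy_try_bind(t, n)) {
        if (!toy_grow(t))
            return -1;
        retries++;
    }
    return retries;
}
```

From the caller's point of view the operation either completes or leaves the trail as it was, but the storage may have moved in between, which is why addresses computed before the call cannot be trusted afterwards.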

SWI-Prolog is very smart under the hood! :grin:

I am almost there:

test(call_prolog_unify,[  setup(mmap(4096,Map)),
                          cleanup(munmap(Map))
                       ]) :-
    exerun:et_dlsym('PL_unify_integer',PLUnifyAddr),  % get address of PL_unify_integer
    L = [A],                                          % just to declare variable A
    exerun:et_term_t1(A,AHandle),                     % get term_t for A
    format('ahandle=~w',[AHandle]),
    asm((
            mov rdi, AHandle;        % 1st argument of PL_unify_integer
            mov rsi, 0x1122;         % 2nd argument of PL_unify_integer
            mov rax, PLUnifyAddr;
            call rax;                % call PL_unify_integer
            mov rax, 1;              % fake return TRUE to see the output of the assertion below
            ret
        ),
        C, []),
    %Put the asm code in memory and run it
    C = mcode(Code,_Opts),
    mmap_put(Map,Code),
    mmap_run(Map,0),
    %Now see if it worked
    assertion(A == 0x1122).

Here is the output of the assertion:

ahandle=401
ERROR: /home/u/p.pl:862:
        test call_prolog_unify: assertion failed
        Assertion: _724388==4386

PL_unify_integer(401,0x1122) is being called properly. Tracing in gdb shows that it fails to unify A with 0x1122 because of this line:

canBind(*p) returns FALSE, because the term_t does not contain a reference to a variable with *p set to 0. In other words, setVar was not called for A.

I would think setVar is called right after the VM evaluates L=[A] but this is not happening.

In gdb, I see that *p == <non-zero integer> which means
p is not pointing to a variable, even though the term_t handle
is properly passed to PL_unify_integer.

Here is some of the other code which may help to debug:

static foreign_t
et_dlsym(term_t funcname, term_t addr)
{  char *funcname_str = NULL;
   void *faddr = NULL;

   if (!PL_get_atom_chars(funcname,&funcname_str))
      return PL_type_error("function name",funcname);

   if ((faddr = PL_dlsym(RTLD_DEFAULT,funcname_str)) != NULL)
   {  if (!PL_unify_integer(addr,(intptr_t)faddr))
         return FALSE;
      return TRUE;
   }
   return FALSE;
}

static foreign_t
et_term_t1(term_t aterm1,
               term_t termt1)
{  if (!PL_unify_integer(termt1,(intptr_t) aterm1))
      return FALSE;

   Sdprintf("term_t=%lu %ld",(unsigned long)aterm1,(long)aterm1);  /* was "%lud", which prints a stray 'd' */
   return TRUE;
}

Provided I correctly interpret what all this means, this seems problematic. term_t references are only valid during a foreign language call. So, exerun:et_term_t1(A,AHandle) gets a handle for A, but as it completes the handle is invalidated.

You could probably write a predicate in C that takes as arguments a number of Prolog parameters (packed in a term or list) and an address for the code to run. In that case the arguments remain valid because you are still in the scope of the C-defined predicate.

Ahh…that is why it is breaking.

I need the term_t handle (or anything I can use to reference the variable) before running the code, so that the asm((...)) can use the handle in the assembly code.

If I use the first argument with a PL_FA_VARARGS calling convention, and then calculate the offset of variables that way (using LD from the context argument), will it get invalidated also?

I am proposing something like this:

asm(A1, A2, ....,
        (....))

Would that work?

Not really, because asm is not a foreign predicate (it does not have access to term_t); it is a basic assembler written in Prolog (I know you will like that :slight_smile: ).

What I could do, with an API like the one you suggest, is guarantee that the foreign predicate call (to get the variable address/handle) and the asm predicate call are next to each other.

If I bundle the foreign call with the asm predicate call, would it be guaranteed that the address calculated with the LD->stacks.local.base+<offset> does not get invalidated?
I could even dereference that address once or as many times as necessary.

   asm(ListOfArgs,AssemblerCode) :-
      foreign_helper_to_get_handles_for_args(ListOfArgs,ListOfHandles),
      replace_asm_var_refs_with_addresses(ListOfHandles,AssemblerCode,AssemblerCode1),
      asm1(AssemblerCode1,MachineCode,_Opts).

This other option just came to my mind now:
Call the Prolog asm predicate from within the foreign C function. Would the term_t handles remain valid after a PL_open_query(..)/PL_next_solution()/PL_close_query() sequence within the same foreign function?

LD->stacks.local.base is not constant. It doesn’t change often, but can change almost any time in theory. But, as you now seem to be calling the PL_* functions, you only need to pick up the term_t and consider them pointer-size integers.

You get term_t handles for the arguments. As I said, this is basically just a reference to the location of the argument vector on the stack. A foreign call also creates a foreign environment using PL_open_foreign_frame(), to be closed with PL_close_foreign_frame() on return (well, for foreign calls from the core there is a simplified, lower-level implementation). This environment contains the term_t handles you create using PL_new_term_ref() and similar functions. Note that all these are allocated on the environment stack, so foreign frames and predicate invocations must be nested consistently.

So, I think the only thing you can write is a foreign C-defined predicate that takes a set of Prolog arguments and a pointer to the assembler generated code. Can be e.g.

  • call_asm(+Ptr)
  • call_asm(+Ptr, +A0)
  • call_asm(+Ptr, +A0, …).

Bind all these to a VARARGS C predicate and you can pick up the handles really easily.

That sounds good. I’ll give it a try. Thanks!