How can I encode integers as single bytes?

I am writing a Y86-64 (a simplified version of X86-64 from CSAPP by Bryant and O’Hallaron) implementation in Prolog. I would like to represent memory as a list of bytes. I am having trouble with this implementation though as, according to the General purpose arithmetic manual page, all integers in SWI Prolog are 64 bits.

Using 64 bit integers is problematic for multiple reasons. First off I am using 8 times more memory than I need to, and second it is unnatural to pretend that a 64 bit integer is actually a 8 bit integer, and part of my goal for this project is to code something that maps closely to how a computer really works.

One mention of bytes in the manual I found under Primitive character I/O, but this has to with streams which is not what I am looking for. I have tried to find the source of these predicates to see if I could figure out how the bytes were represented but there are no links to any source in the manual.

Another mention of bytes I have found in the manual is Predicate hex_bytes/2 from the crypto library. While this was promising the documentation states Bytes is a list of integers between 0 and 255 that represent the sequence as a list of bytes, which insinuates that this predicate is just pretending that the integers are 8 bits.

I have thought about using chars. However I cannot do arithmetic on chars easily. I could represent my memory as a list of chars, and then when I go to do arithmetic get the integer value with char_code/2. However this is clumsy.

I would appreciate help.

1 Like

Interesting … on my machine I get: 72057594037927935

Which is 56 bits.

Having such overhead isn’t so good, if what one needs is (much) less bits – I guess for symbolic processing – bit size isn’t a factor …

Dan

Thanks.

I think the question was how to store compactly byte values in an array, so that its memory efficient – if you store values in byte range as a dynamic fact you will still use 64 bit per memory cell.

But, i think, if performance isn’t a factor then you can use, 64 bit as 8 consecutive memory locations for bytes or 4 for 2 byte word each, and use some bit shifting and masking to get to each.

But creating such a compacting would indeed impact performance … one way or another.

Dan

I went to try to change the max/min_tagged_integer flags but could not due to permissions errors. The manual page on set_prolog_flag/2 clearly states the problem - If the flag is a system-defined flag that is not marked changeable above, an attempt to modify the flag yields a permission_error. If the provided Value does not match the type of the flag, a type_error is raised.

So am I correct to say that it is not possible to store integers as single bytes?

Hi Nicholas,

Why don’t you look at bit operations and roll your own implementation:

bitwise shift [1]
bitwise and [2]

[1] f((>>)/2)
[2] f((/\)/2)

Based on max tagged you can store 56 bits in one memory location, or 8 bytes – at least on my machine. So you create dynamic fact structure that maps between memory addresses in 8 increments (arg1) and an integer (arg2)-- with the integer holding 8 consecutive bytes.

To access memory address 8i+n, you access fact at address 8i, and then use masks and shifts to get at byte n

Non existence of a fact at address 8i means that the whole “block” of 8 bytes is zero.

Could that work for you?

Thanks Dan this is close to what I plan to do unless I find a way to actually store individual bytes.

There is one thing you said though that confused me:

isn’t 56 bits 7 bytes? Am I missing something?

Prolog is a dynamically typed language where each value, regardless of its type, is stored as a cell, which is “pointer size”. For integers, this stores small integers. Larger integers are stored using an indirection. Prolog integers are as close as you get to mathematical integers: they are only bound by available memory.

Prolog doesn’t have any other data types, so if you want to deal with compact arrays such as bytes, you need to escape to foreign language (through C). The already mentioned "ffi" pack for SWI-Prolog pack can do this or you need to write your own access functions. There you define what should happen if the Prolog integer doesn’t fit in a byte. No matter how you do this, this gets you out of the nice logical Prolog world.

Yes of course – my (thought) mistake – I was think in 8ts but its 7 …

Dan

A list of bytes is going to have O(N) performance for indexing … if you want faster access, use library(assoc) or library(rbtrees).

You might want to just store your data as a “blob” and have predicates for accessing the bytes inside your blob. There’s a full foreign interface in SWI-Prolog. Here’s some fairly simple code that does serialization/deserialization, which might give you some ideas: contrib-protobufs/protobufs.c at 137a9d87ccc990d29e7ae337bb4a97ad8af05982 · SWI-Prolog/contrib-protobufs · GitHub (I don’t have an example of “blobs”, but I’m sure there are some).

Hi Nicholas,

I am curious - how did the implementation go …

Dan

1 Like

I am working on a more sophisticated version of the project with a full on assembler, linker, and emulator … though I hate to say that I have switched to Common Lisp.

Now i am curious, how does Lisp provide you with better “low level” representation support …

My need for low level representation support is only in the emulator, which I
am writing in C (but haven’t started yet). I figured if I was going to
do it correctly in either Prolog or CL I was gonna need to use a CFFI,
and figured I may as well just write it in C.

The decision to write the assembler and linker in CL was not as much a
technical decision but more about my growing interest in Lisp.