Get floating point representation in memory

Anyone know how to get the exact binary representation of a floating point number in memory, as an integer?

For example

float_memory(0.6,Int), format('~2r',[Int]).

would print:

0_0111_1110__0011_0011_0011_0011_0011_010

(the underscores (’_’) are added only to make it easier to read)
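
Until a builtin exists, here is one pure-Prolog sketch of what is being asked for, restricted to finite, positive, normal doubles (SWI-Prolog floats are 64-bit C doubles, so the pattern has 64 bits rather than the 32 shown above). The names float_bits64/2 and exact_parts/3 are made up for this example.

% Decompose F exactly into an integer M and exponent E with F =:= M * 2.0**E,
% then rebuild the IEEE-754 bit pattern from those parts.
float_bits64(F, Bits) :-
    float(F),
    F > 0.0,                               % sign handling left out
    exact_parts(F, M, E),
    B is msb(M),                           % position of M's highest set bit
    BiasedExp is E + B + 1023,
    BiasedExp > 0, BiasedExp < 2047,       % normal range only
    (   B =< 52
    ->  Shifted is M << (52 - B)
    ;   Shifted is M >> (B - 52)           % shifted-out bits are all zero here
    ),
    Frac is Shifted /\ ((1 << 52) - 1),    % drop the hidden bit
    Bits is (BiasedExp << 52) \/ Frac.

% Double F until no fractional part remains; exact for finite floats.
exact_parts(F, M, E) :-
    (   F =:= truncate(F)
    ->  M is truncate(F),
        E = 0
    ;   F2 is F * 2.0,
        exact_parts(F2, M, E2),
        E is E2 - 1
    ).

% ?- float_bits64(0.6, Bits), format('~16r~n', [Bits]).
% 3fe3333333333333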

Until someone else corrects me:

The only important question is why?

If we forget about the only important question, it seems that the easy way would be to write it in C. The predicate should most probably return the sign, exponent and fraction separately? I also suspect that since the representation in memory might be architecture-dependent, there will be some trickery involved. Which again makes me wonder, why?

I might be wrong but I think that SWI-Prolog uses only C double floats (so you’ll have a total of 64 bits, not 32).

Thanks Boris.

Yes, that is correct.

Double floats are fine.

Certain equations involving logarithms can be simplified if a number is expressed as significand*2^e; and this is very similar to how floating point numbers are stored in memory in the IEEE-754 standard.

UPDATE: to see why this is very useful, look here.
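
As a small concrete check of that claim: the 64-bit double nearest 0.6 is exactly 5404319552844595 * 2^(-53), so its base-2 logarithm reduces to an integer plus the logarithm of a 53-bit integer. A sketch, taking the decomposition as given:

log2_of_0_6(L) :-
    L is -53 + log(5404319552844595) / log(2).

% ?- log2_of_0_6(L), L2 is log(0.6) / log(2), abs(L - L2) < 1.0e-12.
% succeeds: both are about -0.7369656.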

I have some code from my Msgpack implementation that gets a byte representation of a float, which might be useful for you?
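
For reference, the byte-level step is just shifts and masks once the 64-bit pattern is available as an integer (for instance from a sketch like the one under the opening question); bits64_be_bytes/2 is a made-up name here, not the Msgpack code itself, and it produces a big-endian layout like the one MessagePack's float64 uses.

bits64_be_bytes(Bits, Bytes) :-
    findall(Byte,
            (   between(0, 7, I),
                Shift is (7 - I) * 8,          % most significant byte first
                Byte is (Bits >> Shift) /\ 0xFF
            ),
            Bytes).

% ?- bits64_be_bytes(0x3FE3333333333333, Bs).
% Bs = [63, 227, 51, 51, 51, 51, 51, 51].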

There is something similar in Jeff Rosenwald’s Google protocol buffer library that is bundled. That is nice for sending floats over a network. With the updates to float arithmetic we could consider adding a predicate that (dis)assembles a float into/from its three parts. Not really sure how useful it is in the context of Prolog, though. As long as you do arithmetic in Prolog, the advantage mostly disappears in the Prolog overhead. If you push arithmetic down to C, LLVM, Cuda, etc. you go outside the Prolog world anyway and thus accessing the float bits is not a problem. I think that Prolog’s good support for domain specific languages should give it a more prominent role in this area, but that still doesn’t answer why you’d want access to the details of a float …

It is extremely useful in Prolog, because Prolog can be used to derive new algorithms and equations which can then be implemented in C or something else (or the Prolog code may output C code directly).

I think this would be a good addition, since we already have arithmetic functions like float_fractional_part/1.

This would also be another very practical use for such a predicate.

I have to agree with SWI on this one.

One way I think of Prolog is not as a programming language, or even a Turing-complete programming language, but as a language used to research ideas, because it is so much more versatile and is one of the few languages that is homoiconic. Also, because Prolog is much more friendly to creating DSLs with term rewriting and DCGs, it is one of the first languages I would turn to in order to create and validate a DSL.

Strawman proposal:

New builtin arithmetic predicate: float_value(?F, ?S^E)

At least one of the arguments must be ground, either F as a floating point number or S^E where S and E are signed integers. This is not a bit-for-bit representation of a float, but a slightly higher abstraction (exponent bias mapped to signed E, sign and implied significand bit included in S). Subnormals might be a bit tricky, and we have to decide how to treat NaNs and infinities.

Would something like this satisfy requirements? Counter proposals? Bad idea?
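
To make the strawman concrete (float_value/2 does not exist; the answers below are simply what IEEE-754 implies for a 64-bit double, assuming S is normalised to the full 53-bit significand):

% ?- float_value(0.6, S^E).
% S = 5404319552844595,
% E = -53.
%
% ?- float_value(F, 5404319552844595^(-53)).
% F = 0.6.
%
% Round trip through ordinary arithmetic:
% ?- float_value(0.6, S^E), 0.6 =:= S * 2.0**E.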

The C/C++ world has double frexp(double x, int *exponent). I don’t really like the name, but float_value seems strange as well, as the value of a float is, well, the float itself :slight_smile: I think we also want the base, just to accommodate different bases in the future. So, we have some options:

  • Add functions mantissa, base and exponent
  • Add a predicate X(?F, ?Mantissa, ?Base, ?Exponent)
  • Add a predicate Y(?F, ?Mantissa*Base^Exponent)

The first has the big disadvantage that you typically want all three (or two if you merely assume the base is 2). The second is the most efficient, while the third could almost be called float_value/2, as the arithmetic values of the two arguments are equivalent.

X could be float_mantissa_base_exp/4. It is a bit long, but it won’t conflict and it is immediately clear what it means. Predicates don’t get faster by shortening their name :slight_smile: and I do not expect code to call this in a lot of places anyway.

I guess we can deal with NaN and Inf by copying this value to both the Mantissa and Exponent?
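
A pure-Prolog reference sketch of option 2 under C frexp() conventions, i.e. Mantissa a float in [0.5, 1) and Base fixed at 2; the name follows the suggestion above, but the mantissa convention is only an assumption, and zero, Inf and NaN are excluded:

float_mantissa_base_exp(F, M, 2, E) :-
    float(F),
    A is abs(F),
    A > 0.0, A < inf,                 % finite and non-zero only
    frexp_(A, 0, M0, E),
    M is copysign(M0, F).

% Halve or double until the magnitude lies in [0.5, 1); exact for binary floats.
frexp_(A, E0, M, E) :-
    (   A >= 1.0 -> A1 is A / 2.0, E1 is E0 + 1, frexp_(A1, E1, M, E)
    ;   A <  0.5 -> A1 is A * 2.0, E1 is E0 - 1, frexp_(A1, E1, M, E)
    ;   M = A, E = E0
    ).

% ?- float_mantissa_base_exp(6.0, M, B, E).
% M = 0.75, B = 2, E = 3.             % 0.75 * 2**3 =:= 6.0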

Supporting base 2 isn’t a problem because it’s just a simple bit mask/copy, but arbitrary bases are more difficult and potentially involve rounding and all that that entails. For a builtin, perhaps this should be limited to bases that are native, i.e. currently 2. More flexible base conversion can be implemented in Prolog.

Since F is mantissa(F)*base(F)^exponent(F) holds (reconstructing the float), a purely functional solution is interesting. It allows you to do extraction and calculation (and reconstruction?) in the same is/2 evaluation. (It also nicely short-circuits any predicate-name discussions.)

As suggested, for any float(F):
inf is mantissa(inf), -inf is mantissa(-inf), nan is mantissa(nan)
inf is exponent(inf), inf is exponent(-inf), nan is exponent(nan)
2 is base(F) (pending some new supported float format)

The current float_overflow and float_undefined flags would control the error/continuation value treatment for these special values.
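
A hypothetical example of what the functional form would buy (none of mantissa/1, base/1 or exponent/1 exist yet): extraction, adjustment and reconstruction all happen inside a single is/2 evaluation.

% Scale F by base(F)**N without touching the mantissa.
scale_pow(F, N, Scaled) :-
    Scaled is mantissa(F) * base(F) ** (exponent(F) + N).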

Similar functionality; did you have any use case(s) in mind?

Also, did you have any particular reason to go with predicates as opposed to arithmetic functions, and multiple predicates vs. a single one that produces all the pieces?

This would be an acceptable compromise IMHO.

The second one would be my preference because

  1. As you point out, normally you need all three values
  2. I don’t see any real value (and it is not as efficient) in expressing the mantissa/base/exponent as a compound term of the form M*B^E

All in all, great suggestions!

Nice. Any n-ary predicate can be used as an (n-1)-ary function, assuming the correct instantiation pattern. If I recall, ECLiPSe has the same model.
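
For what it is worth, SWI-Prolog already has an opt-in version of that model in library(arithmetic): an (n+1)-ary predicate can back an n-ary evaluable function. A small sketch with made-up names (cube/1 as the function, cube/2 as the backing predicate):

:- use_module(library(arithmetic)).
:- arithmetic_function(cube/1).

cube(X, Y) :- Y is X * X * X.

demo(A) :- A is cube(3) + 1.     % A = 28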

I think that should be it, except we should name them float_* for consistency and possibly use radix rather than base? Which is more popular?

For arithmetic functions I think float_ is implied, but any predicate names should have the prefix. There’s also something to be said for shorter function names for constructing arithmetic expressions. But it may hinge on whether F can be a non-float numeric argument (converted to a float)?

In this context, base and radix seem to be used almost interchangeably (from IEEE 754 - Wikipedia: "a base (also called radix) b …").

My main motivation is to be consistent with the ISO float_fractional_part/1 and float_integer_part/1 functions. This would also suggest _part … Not sure I like that …

I’ve been thinking that in order to allow these predicates to be used to produce novel algorithms that deal with the float’s memory representation directly in addition to the parts, we could add an extra argument called Memory (or something similar).

For example:

float_parts(?F, ?Mantissa, ?Base, ?Exponent, ?Memory)

Where Memory is the exact memory representation as an Integer.

This would allow such algorithms to deal directly with the memory representation (including bias, hidden bit, etc.), in addition to the parts (i.e. mantissa, exponent and base).

I could surmise that novel algorithms could be developed that deal directly with the memory representation and/or the parts, and it doesn’t cost much to add the extra argument.

What do you think?

EDIT: this exact memory representation is also very useful for implementing serialized bitstreams for network transmission/storage/etc
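
To make the relationship concrete, here is a hedged sketch of how the Memory integer of a 64-bit double would map back to a value via its fields (the name memory_float/2 and the conventions are illustrative only; Inf and NaN are excluded):

memory_float(Bits, F) :-
    Sign is (Bits >> 63) /\ 1,
    Exp  is (Bits >> 52) /\ 0x7FF,             % biased exponent field
    Frac is Bits /\ 0xFFFFFFFFFFFFF,           % 52-bit fraction field
    Exp < 0x7FF,                               % rules out Inf and NaN
    (   Exp =:= 0                              % zero or subnormal: no hidden bit
    ->  Mag is Frac * 2.0 ** (-1074)
    ;   Mag is (0x10000000000000 + Frac) * 2.0 ** (Exp - 1075)
    ),
    (   Sign =:= 0 -> F = Mag ; F is -Mag ).

% ?- memory_float(0x3FE3333333333333, F).
% F = 0.6.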

If a predicate solution is favoured, is there any reason not to just use the foreign language API for this? It seems like 20 lines of C code would do the job.

I would say that precisely because it is only 20 lines of C code, it should be part of the builtin functionality. The reason is this:

The barrier to entry for a mathematician/scientist/physicist who is experimenting with this kind of algorithm in Prolog is not the 20 lines of C code, but having to learn the FFI API of SWI-Prolog (not to mention the added learning curve of packaging and distributing .so files with add-ons). Providing this predicate would eliminate that barrier to entry, and it is not very difficult to provide.

I really don’t mind whether it is a predicate or a function.

The overhead of multiple predicates versus one predicate is probably about the same; it is call-pattern dependent and depends on what technical use cases you have in your applications. For example, if you only call:

float_parts(F, _, _, E, _)

most Prolog systems allocate the anonymous variables anyway; your foreign-function predicate computes all results and unifies them with the anonymous variables, and the superfluous results are thrown away since they are not used. It is probably faster if you had:

float_exponent(F, E)

To cater for both single and multiple use, you can of course provide both. This is already seen with rational numbers: for example, we have the numerator/1 and denominator/1 evaluable functions, which are redundant with the rational/3 predicate. No problem.