New version of the units package v0.16

Hello everybody,

I have released a new version of the units pack (v0.16) with the following improvements:

  • Added a whole lot of documentation and improved the README
  • Improved unit conversion speed through dimensional analysis
  • Improved handling of variables in partially instantiated units and quantities

I have put a lot of effort into documenting and adding tests, and I feel that the library is now quite stable (in theory, since I’m practically the only one who has tested the library ^^).
Please try it and drop me a message if you find a bug or if you have any advice on how to improve the library!

Here is an example I am particularly proud of, which demonstrates the unique power of this library:

:- use_module(library(units)).

avg_speed(Distance, Time, Speed) :-
  qeval(Speed =:= Distance / Time as isq:speed).

?- avg_speed(m, s, quantity(Speed)). % traditional forward mode
Speed = 1*isq:speed[si:metre/si:second].

?- avg_speed(quantity(Distance), s, m/s). % "backward" mode
Distance = 1*isq:length[si:metre].

?- avg_speed(X*inch, hour, m/s). % overspecified units
X = 18000000r127.

?- avg_speed(quantity(Distance), quantity(Time), quantity(Speed)). % underspecified units
Distance = _A*isq:length[_B],
Time = _C*isq:time[_D],
Speed = _E*isq:speed[_F],
... % lots of constraints
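
The rational answer in the overspecified query can be checked by hand: one inch is exactly 0.0254 m and one hour is 3600 s, so X inch per hour equals 1 m/s when X = 3600/0.0254. A minimal sketch in plain SWI-Prolog, independent of the pack (the predicate name is made up for illustration, and it assumes a build with rational-number support):

```prolog
% Hypothetical helper, not part of library(units).
% Assumes SWI-Prolog was built with rational-number support (rdiv/2).
inches_per_hour_at_one_mps(X) :-
    InchInMetres is 254 rdiv 10000,      % 1 inch = 0.0254 m exactly
    SecondsPerHour = 3600,
    % X inch/hour = 1 m/s  <=>  X = SecondsPerHour / InchInMetres
    X is SecondsPerHour rdiv InchInMetres.
```

Calling inches_per_hour_at_one_mps(X) should bind X to the same rational, 18000000r127, that the library reports.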

@jan Unfortunately, it seems that the pack page on swi-prolog.org did not update with the latest version. Would it be possible to check why?

It seems the US backend didn’t sync. Restarted. Note that it can take up to 2 hours for the web page to update: one hour for the US backend to sync and one hour for the CDN. Package installation always gets its info from the EU backend, so it is always up to date.

It seems to work now, thank you!

In early experiments with the new version, a couple of things caught my attention, mostly by accident:

  1. API consistency

This works:

?- unit_symbol(si:gram,S).
S = g.

but this doesn’t

?- unit_symbol(si:newton,S).
false.

Similarly:

?- unit_kind(si:gram,K).
K = isq:mass.

?- unit_kind(si:newton,K).
false.

We know si:newton has both a symbol and a kind:

?- qeval(X is 'N').
X = 1*kind_of(isq:length*isq:mass/isq:time**2)[si:newton].

  2. Unexpected domain error

This works:

?- qeval(R is (3*ft)/(2*m)).
R = 3r2*1[international:foot/si:metre].

but this doesn’t:

?- qeval(quantity R is (3*ft)/(2*m)).
ERROR: Domain error: [_17214{when = ...}.._17164] expected, found `international:foot/si:metre'

Why the difference? Should dividing two quantities of the same kind, e.g., isq:length, result in a dimensionless quantity?

Note that in the success case the “normal” form of the quantity is incorrect; shouldn’t R = 3r2*1[international:foot/si:metre] have an explicit kind_of component?

  3. Constraint correctness

Consider:

?- qeval(quantity R =:= quantity D).
R = _A*_C[_B],
D = _A*_C[_D],
% constraints ...

Why should R and D be constrained to have the same numeric value, _A, but allowed to have different unit values?

In a similar vein:

?- qeval(R is quantity(W)/quantity(H)).
R = _A*(_D/_E)[_B/_C],
W = _F*_D[_G],
H = _H*_E[_I],
% constraints

I would have expected the units of R to be [_G/_I] not [_B/_C]

  4. Malformed quantity

Note missing kind_of in second query:

?- qeval(X is cgs:erg).
X = 1*kind_of(isq:length**2*isq:mass/isq:time**2)[cgs:erg].

?- qeval(X is cgs:barye).
X = 1*(isq:mass/(isq:length*isq:time**2))[cgs:barye].

  5. The README says:

The same symbol can be used for multiple units in the library. There are currently no mechanism to avoid name collision, so be extra careful when using them. Known pitfall symbols: ft usually means foot, but can also mean si:femto(si:tonne)

That’s a good warning but doesn’t really tell me when the condition exists or what to do about it. It sounds like a major source of future problems, so why is it not forbidden?

As an aside, does the concept of symbols come from mp-units (the C++ library) or is that a value-added feature in pack units? If it comes from mp-units, how do they deal with this issue?

Could you use library(units) in a search engine, to ask things like the following:

Q: How much does a gallon of milk cost in Paris, expressed in dollars?

Didn’t Chat-80 already have unit conversion? I saw something like that, if I am not mistaken.

Thank you so much for taking the time to test the new version :slight_smile:
Thanks to you, I have found 3 obvious bugs which I have corrected in a new v0.18 release.

  • Here, unit_symbol/2 is working as intended as it is only used for base units not defined in terms of other units. si:newton is defined using the predicate unit_symbol_formula/3.
  • Again, unit_kind/2 is only used to define the quantity kind of basic units. The kinds of derived units like si:newton are automatically derived from the kinds of their basic units.

More generally, these are multifile predicates used to specify the system of units and quantities.
They define the basic data from which we are going to derive all the other rules.
I have made the effort to document these predicates because they are the extension points for users to extend the system with custom units and quantities.

In the library, I have implemented higher-level predicates which do what you want:

  • unit(Unit, Symbol) takes any unit (basic, prefixed or aliased) and associates it with its symbol
  • all_unit_kind/2 takes any unit (even a derived unit) and associates it with its kind.

?- unit_defs:unit(si:newton, S).
S = 'N'.

?- unit_defs:all_unit_kind(si:newton, S).
S = kind_of(isq:length*isq:mass/isq:time**2).

That was a bug, which is fixed in v0.18. The quantity R is ... is taking a different path in the code, which triggered the bug.

So, basically, there is no kind_of(1). It would allow a quantity to be implicitly converted to any ratio of any units, which should not be possible.
That’s why kind_of(isq:length)/kind_of(isq:length) is simplified to 1.
You could argue that it should still be possible to convert back to ratios of any type of lengths but since the information is lost, I can’t do that.

Note that even though the variables are different, the current implementation works as you expect.
The chain of unification is hidden in the constraints because it needs to normalize units before any unification:

?- qeval(quantity R =:= quantity D).
R = _A*_C[_B],
D = _A*_C[_D],
when((ground(_D);ground(_E)), unit_defs:normalize_unit(_D, _E)),
when((ground(_B);ground(_E)), unit_defs:normalize_unit(_B, _E)),
...

?- qeval(quantity R =:= quantity D), qeval(D =:= m).
R = D, D = 1*kind_of(isq:length)[si:metre].

This had a little bug, but also works as expected now:

?- qeval(R is quantity(W)/quantity(H)), qeval(R is m/s).
R = 1*(isq:length/isq:time)[si:metre/si:second],
W = _A*isq:length[si:metre],
H = _B*isq:time[si:second],
::(_A, real(-1.0Inf, 1.0Inf)),
::(_B, real(-1.0Inf, 1.0Inf)).

That is a bug, which is fixed now.

Well, as you have probably guessed, because using symbols is really convenient for writing one-off queries.

Yes, the concept of symbols comes from the original mp-units library.
In their case, all units, quantities and symbols are actually C++ identifiers (like types), meaning they can piggyback on the C++ namespace system to avoid name collisions.
In my case, I lost that feature when translating their definitions of units by wrapping all definitions in predicates like unit_symbol/2 and unit_symbol_formula/3.
It would be so cool to also be able to piggyback on the SWI-Prolog module system to avoid name collisions, but that seems very complicated.
This is the last big usability issue, and I really need to fix it but I have no idea how…

There is an example with currencies if you are interested.
With it, you can do things like this:

?- qeval((
       Rate is 0.9591406100134280 * us_dollar/euro,
       Milk is 1*usc:gallon as isq:volume,
       PriceMilkInParis is 5*euro/litre as currency/isq:volume,
       Price is Milk * PriceMilkInParis * Rate in us_dollar)).
Rate = 0.959140610013428*1[us_dollar/euro],
Milk = 1*isq:volume[usc:gallon],
PriceMilkInParis = 5*(currency/isq:volume)[euro/si:litre],
Price = 18.153710838288895*currency[us_dollar].
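
The final price can be cross-checked with plain floating-point arithmetic. The sketch below does not use the pack; it assumes the definition 1 US liquid gallon = 3.785411784 litres, and the predicate name is only illustrative:

```prolog
% Hypothetical cross-check, independent of library(units).
milk_price_in_dollars(Price) :-
    LitresPerGallon = 3.785411784,   % assumed: US liquid gallon in litres
    EuroPerLitre = 5,                % price of milk from the query above
    Rate = 0.9591406100134280,       % conversion rate from the query above
    Price is 1 * LitresPerGallon * EuroPerLitre * Rate.
```

The result should agree with the Price reported above to within floating-point rounding.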

This library also has the concept of quantities, which allows it to be much more expressive.

I’ve separated my replies by issue

  1. API consistency

Yes, those do almost what I want (accessible through library(units/unit_defs)). But all_unit_kind/2 isn’t fully logical, e.g.,

?- all_unit_kind(Unit,isq:time).
false.

There really are two levels of API here: one for users who just want to use the existing units data (unit_defs) and one for users wishing to define new units. That wasn’t quite clear to me before.

  3. Constraint correctness

OK, I see what’s happening. To permit the two quantities to be equal, the units must be “normalized” to the same value. I assume the same also applies to the second example. Conclusion: you can’t infer too much from the units without considering all the (inter-connected) constraints, which, in practice, can be somewhat difficult.

This is a more demanding project from a constraints perspective than I had originally thought. Hopefully it will all just work and most of the underlying complexity will be hidden.

Missed one:

I’m not following this. The “kind” of a ratio of two lengths is kind_of(isq:length/isq:length), isn’t it? What’s the logic in reducing it to 1? That just loses information, for what gain? Maybe I need an example.

Further documentation says

For example, a speed of 3 metre/second would be represented as 3 * isq:speed[si:metre/si:second].

Followed by the example

?- qeval(Q is 3 * si:metre / si:second).
Q = 3*kind_of(isq:length/isq:time)[si:metre/si:second].

so there’s already inconsistency in the term representation of a quantity. You’re now adding an additional (undocumented?) form (for unification purposes) of a quantity, namely one with kind_of/1 replaced by 1, for reasons which aren’t yet clear to me.

Sorry about this.
I have just checked the original library and they indeed allow kind_of(1).
Thinking about it more, I don’t think my earlier explanation makes any sense ^^
I have just struck through my earlier response and I will re-enable kind_of(1) in the library.

The reason is purely a balance of convenience vs safety.
We don’t want to have too much extra stuff which should just disappear, while still maintaining some safety.
The whole explanation is written in the original C++ library: Dimensionless Quantities - mp-units

I would argue that this is not a case of inconsistency.
Quantities can have different representations through quantity types, as quantity types are related to each other through formulas and hierarchies.
kind_of(isq:length/isq:time) represents the whole subtree of quantity types of any kind of length divided by any kind of time, of which isq:speed is one such specialized quantity type.

But you are right about the kind_of(1). My bad for the confusion.

Unfortunately, in this case, a fully logical predicate would be very challenging, because it has to handle arithmetic expressions of units, which can be arbitrarily nested.
I have tried to maintain logical purity where I could, but anything with arithmetic expressions is very difficult.
This reminds me of when I tried to write a pure DCG for the C language and gave up because of this very problem.

Whoops, this code fragment would seem to do what I want(?):

?- unit(Unit,_),all_unit_kind(Unit,kind_of(isq:time)).
Unit = non_si:hour ;
Unit = non_si:minute ;
Unit = non_si:day ;
Unit = si:deci(non_si:hour) ;
Unit = si:deci(non_si:minute) ;
Unit = si:deci(non_si:day) ;
Unit = si:deci(si:second) ;
Unit = si:deci(si:day) ;
Unit = si:deci(si:minute) ;
Unit = si:deci(si:hour) ;
Unit = si:deci(iau:'Julian_year') ;
Unit = si:deci(iau:day) ;
Unit = si:deci(cgs:second) ;
Unit = si:giga(non_si:hour) ;
Unit = si:giga(non_si:minute) ;
Unit = si:giga(non_si:day) ;
Unit = si:giga(si:second) ;
Unit = si:giga(si:day) ;
Unit = si:giga(si:minute) ;
Unit = si:giga(si:hour) .
%% terminated

So a logical version of this would seem trivial.

Hmm, I did not think of that ^^
A generate-and-test approach is always possible.
In the library, I also used delayed constraints with when/2 to recover logical purity.
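
Both ideas can be sketched on toy data. The tables and predicate names below are invented for illustration and are not the pack’s real definitions; toy_unit_kind/2 merely mimics a predicate that only works once the unit expression is ground:

```prolog
% Toy data, invented for illustration; not the pack's real tables.
toy_unit(second).
toy_unit(hour).
toy_unit(metre).
toy_unit(metre/second).

base_kind(second, time).
base_kind(hour, time).
base_kind(metre, length).

% Mimics a "forward only" predicate: fails unless Unit is ground.
toy_unit_kind(Unit, Kind) :-
    ground(Unit),
    kind_expr(Unit, Kind).

% Derive the kind of a (possibly nested) ratio of units.
kind_expr(A/B, KA/KB) :- !,
    kind_expr(A, KA),
    kind_expr(B, KB).
kind_expr(Unit, Kind) :-
    base_kind(Unit, Kind).

% Generate and test: enumerate candidate units, then check each one.
unit_of_kind(Unit, Kind) :-
    toy_unit(Unit),
    toy_unit_kind(Unit, Kind).

% Delayed constraint: postpone the check until Unit becomes ground.
delayed_unit_kind(Unit, Kind) :-
    when(ground(Unit), toy_unit_kind(Unit, Kind)).
```

With this data, unit_of_kind(Unit, time) enumerates second and hour, while delayed_unit_kind(U, K) succeeds immediately and only computes K once U is bound.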

I wasn’t suggesting symbols should be forbidden, just that duplicate symbols not be allowed. I really think that symbols are the most user-friendly way of specifying units.

I don’t think there’s a simple answer but here’s my take. At the root of the issue is the desire for dynamic loading of new unit data. If permitted, each time you add new unit data, you have to ensure that it’s consistent, however that’s defined, with the existing data. Since we’re not talking about consistency at the predicate level (details of symbols, units, kinds, etc. definitions are arguments), that means the standard module loading isn’t helpful.

So I think you’re having to write your own loader for dynamic unit data; you can’t just use the module loader since there is no mechanism for enforcing consistency. And you’ll have to define where (existing module, dynamic module, ..) and how (as asserted facts, ??) the unit database is manifested.

And you’ll also have to define the rules for consistency. For example are duplicate symbols allowed and if so, on what conditions. Same applies to kinds and units. You can make the call as to whether some sort of name space management is desirable. It seems not unreasonable for units, but qualifying symbols seems to defeat their whole rationale (simple naming).

I don’t think it’s that difficult to construct a dynamic loader, particularly one for flat files of ground facts, but there are probably lots of details to be filled in (analysing file paths, constructing streams for strings for embedded data, ..). This would seem to be a common problem, so I wonder why there isn’t a generic solution with a hook for consistency checking somewhere.
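
As a rough illustration of what such a loader could look like, here is a minimal sketch. Everything in it is hypothetical (the symbol/2 fact format, the predicate names, and the particular consistency rule), and it takes a list of already-read facts rather than dealing with files or streams:

```prolog
:- dynamic symbol_unit/2.

% Hypothetical loader: Facts is a list of symbol(Symbol, Unit) terms,
% e.g. as read from a flat file of ground facts.
load_unit_facts(Facts) :-
    maplist(add_symbol, Facts).

% Consistency rule chosen for this sketch: a symbol may be repeated
% only if it maps to the same unit; anything else is an error.
add_symbol(symbol(Symbol, Unit)) :-
    (   symbol_unit(Symbol, Existing)
    ->  (   Existing == Unit
        ->  true
        ;   throw(error(permission_error(redefine, symbol, Symbol),
                        conflict(Existing, Unit)))
        )
    ;   assertz(symbol_unit(Symbol, Unit))
    ).
```

Loading symbol(ft, usc:foot) and later symbol(ft, si:femto(si:tonne)) raises a permission error instead of silently creating an ambiguous symbol. Note that facts added before the conflict are not rolled back in this sketch.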

Small module housekeeping issue: module unit_defs should use module units.

I have finally managed to piggyback on the SWI-Prolog module system to resolve naming conflicts for symbols and units!
What made it click for me was the use of the module_transparent/1 directive, which allows me to retrieve the module of the calling context and transmit it to the normalize_unit/3 predicate which will recognize the symbols.
Then it was a matter of generating predicates using the symbols of units like this:

:-module(cgs_symbol,[erg/1,dyn/1,dyne/1,'P'/1,poise/1,'St'/1,stokes/1,'Ba'/1,barye/1,s/1,second/1,'K'/1,kayser/1,'Gal'/1,gal/1,g/1,gram/1]).
erg(:(cgs,erg)).
dyn(:(cgs,dyne)).
dyne(:(cgs,dyne)).
'P'(:(cgs,poise)).
poise(:(cgs,poise)).
'St'(:(cgs,stokes)).
stokes(:(cgs,stokes)).
'Ba'(:(cgs,barye)).
barye(:(cgs,barye)).
s(:(cgs,second)).
second(:(cgs,second)).
'K'(:(cgs,kayser)).
kayser(:(cgs,kayser)).
'Gal'(:(cgs,gal)).
gal(:(cgs,gal)).
g(:(cgs,gram)).
gram(:(cgs,gram)).

Now, to use units, users need to import a module (there is one per system) like this:

?- use_module(library(units/systems/si/symbols), except([ft/1])).
?- use_module(library(units/systems/usc/symbols), [ft/1, pk/1 as mypk]).
true.
?- qeval(X is ft), qeval(Y is mypk).
X = 1*kind_of(isq:length)[usc:foot],
Y = 1*kind_of(isq:length**3)[usc:peck].

And amazingly, the module system will take care of checking for collisions!

@ridgeworks What do you think of this approach?
In your last message, you argued for the exact opposite approach, so I am curious whether my approach will convince you ^^

One thing I am a bit anxious about is that this approach introduces ~1600 predicates, with ~1200 for the si system alone.
Although, on my laptop, SWI-Prolog doesn’t seem to have any problem with that.
I have also replaced calls to predicates like unit(Unit, Symbol) with meta-calls of the form call(Module:Symbol, Unit), so I was worried about performance.
But it seems the meta-call is faster than the previous approach!
The runtime of my test suite has been halved! (although I am not sure it is entirely because of this)
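
To make the mechanism concrete, here is a minimal sketch of how a module-transparent predicate can pick up the caller’s module and use it to look up a symbol. It is not the pack’s actual code: the toy_usc and toy_si modules and the symbol_unit/2 name are invented for illustration, and the file is assumed to be loaded into the user module:

```prolog
% Toy modules, created on the fly by module-qualified clauses.
% Module names and unit terms are illustrative only.
toy_usc:ft(usc:foot).
toy_si:ft(si:femto(si:tonne)).

% A transparent predicate runs in the module it was called from,
% so context_module/1 returns the caller's module.
:- module_transparent symbol_unit/2.
symbol_unit(Symbol, Unit) :-
    context_module(Module),
    call(Module:Symbol, Unit).    % same shape as call(Module:Symbol, Unit) above
```

Under these assumptions, toy_usc:symbol_unit(ft, U) should give U = usc:foot, while toy_si:symbol_unit(ft, U) should give U = si:femto(si:tonne): the same symbol resolves differently depending on the calling context.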

Insufficient detail for me to see what’s really going on. At one level it appears that symbol names are now being used for predicate names. If that’s the case, then the module loader should indeed be able to catch any collisions.

I’m a little unsure of the process:

So is module cgs_symbol just used to generate another module file which the user loads, or does it not work like that?

I guess I was arguing that the module loader has limitations that may prevent you from doing all the necessary consistency checking, whereas if units had its own data loader (not that difficult to implement) you would have complete flexibility in what rules you wanted to enforce.

Now, simple symbol conflicts are an obvious issue, but how about:

  • prefixes resulting in a symbol conflict, e.g., your example of ft being translated to si:femto(imperial:ton), but there are others (yd, rd).
  • can the same unit name be defined in two different systems, e.g., imperial:quart, usc:quart? If so, which gets applied in ?- qeval(X is quart).? Or is the user always expected to qualify the unit?

My own suspicion is that using the module system for structuring the unit data isn’t buying you much and you’re having to somewhat contort things to make it work for you. It might be better to start from a clean slate, but that’s your call.

Do you mean predicates or facts/rules? I assume the latter, which shouldn’t be an issue at all. In fact the current version I’m running:

?- module_property(units, program_size(Size)).
Size = 230752.

I’m sure you can find applications that exceed this by many orders of magnitude.

No, if the user wants to use symbols from the cgs system, he just needs to import that module into the context where he wants to use the symbols.
There is one module per system.

So now, both problems are handled by the module system.
The same symbol can be defined (as a predicate) in multiple modules:

  • there is a ft predicate in the imperial, international, usc and si modules
  • to disambiguate which one the user wants, he just needs to import the predicate from the module corresponding to his need:
    • if he wants a si:femto(si:tonne), he needs to import the si symbols module
    • if he wants a usc:foot, he needs to import the usc module.
  • same logic with your quart example, if the user wants imperial:quart, he needs to import the imperial symbol module, if he wants usc, import the usc module.

That’s what I thought before, but I found it wasn’t that much of a contortion when implementing it ^^

I really meant predicates ^^
See the number of predicates in the si symbol module: units/prolog/units/systems/si/symbols.pl at 540d398afee18cf42dfde1eb6886f63f33761095 · kwon-young/units · GitHub
And it’s not really the size, but the number of predicates that worries me…

That seems a bit contradictory. Clearly there are multiple modules defining, e.g., the cgs system or the si system. So I’m interpreting this to mean that using such modules imports their “exports” into a common context and there is just one such context “per system”. That makes some kind of sense to me. It’s also not clear what the user is importing. Is it just the (new) symbols files or is it all the pre-existing data that was used to generate the symbol file?

So the first thing that happens is that the module loader will flag if the user tries to use a “system” that defines a symbol (i.e., predicate) that’s already defined. That’s an error that the user must resolve by “excepting” definitions he doesn’t want. Alternatively he can “cherry pick” just the definitions he wants from any given system, excluding any definitions that might cause difficulty. I guess that’s workable but I’d have to actually try it for a while to give me confidence.

And I see how that gets extended to work my other issues - pretty much everything is now a symbol. And any basic symbol (from pre-existing data) actually gets multiplied by all the prefix definitions. e.g.

qL(:(si,quecto(:(si,litre)))).
rL(:(si,ronto(:(si,litre)))).
yL(:(si,yocto(:(si,litre)))).
zL(:(si,zepto(:(si,litre)))).
aL(:(si,atto(:(si,litre)))).
fL(:(si,femto(:(si,litre)))).
pL(:(si,pico(:(si,litre)))).
nL(:(si,nano(:(si,litre)))).
µL(:(si,micro(:(si,litre)))).
mL(:(si,milli(:(si,litre)))).
cL(:(si,centi(:(si,litre)))).
dL(:(si,deci(:(si,litre)))).
daL(:(si,deca(:(si,litre)))).
hL(:(si,hecto(:(si,litre)))).
kL(:(si,kilo(:(si,litre)))).
'ML'(:(si,mega(:(si,litre)))).
'GL'(:(si,giga(:(si,litre)))).
'TL'(:(si,tera(:(si,litre)))).
'PL'(:(si,peta(:(si,litre)))).
'EL'(:(si,exa(:(si,litre)))).
'ZL'(:(si,zetta(:(si,litre)))).
'YL'(:(si,yotta(:(si,litre)))).
'RL'(:(si,ronna(:(si,litre)))).
'QL'(:(si,quetta(:(si,litre)))).
'L'(:(si,litre)).
litre(:(si,litre)).

I now see where the predicate “explosion” happens. Every basic symbol, e.g., L, requires 24 additional symbol definitions to cover all prefixes. And it seems all easily derivable from the original data. See that’s what I would call a contortion. You needed to take any existing symbol, turn the definition into a predicate for that symbol and for all prefixes applied to that symbol, just so the module loader can be used to detect symbol (and prefixed symbol) conflicts.
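
To illustrate how such facts could be derived from the original data, here is a small sketch. The prefix table is only a subset, the base_symbol/2 facts and predicate names are hypothetical, and the unit terms follow the shapes shown in this thread:

```prolog
% A few SI prefixes with their symbols (a subset of the full table).
prefix_symbol(femto, f).
prefix_symbol(milli, m).
prefix_symbol(kilo,  k).
prefix_symbol(mega,  'M').

% Hypothetical base data: a symbol and the unit it denotes.
base_symbol('L', si:litre).
base_symbol(t,   si:tonne).

% symbol_clause(-Clause): enumerate one fact per base symbol, then one
% per prefixed variant, e.g. mL(si:milli(si:litre)).
symbol_clause(Clause) :-
    base_symbol(Symbol, Unit),
    Clause =.. [Symbol, Unit].
symbol_clause(Clause) :-
    base_symbol(Symbol, Unit),
    prefix_symbol(Prefix, PrefixSymbol),
    atom_concat(PrefixSymbol, Symbol, PrefixedSymbol),
    PrefixedUnit =.. [Prefix, Unit],
    Clause =.. [PrefixedSymbol, si:PrefixedUnit].
```

With these tables, the enumeration includes ft(si:femto(si:tonne)), which is exactly the kind of generated symbol that can collide with a foot symbol from another system.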

BTW, what’s the process now required for a user to define a new system of units?