New version of the units package v0.16

No, I meant the context module as defined by swi-prolog: context_module/1
This means that the user can import ft as usc:foot in module a and as si:femto(si:tonne) in module b, and there won’t be any ambiguity or name clash.

That’s what I thought too.
But given how easy it is to parse and generate Prolog, I would call it only a minor contortion.
By the way, this is not my original idea, but my translation of how this was done in the original mp-units library.
They have also defined lots of symbols as C++ identifiers (although far fewer, because I suspect they mostly did it by hand given how hard it is to parse and generate C++ ^^).
However, the feature set is not exactly the same, because C++ namespaces can be modified inside functions, so users can import symbols only in a specific C++ function, which we cannot do with Prolog predicates.

If the user wants to use symbols in expressions for their new units, they just need to provide an arity-1 predicate mapping each symbol to its fully qualified unit.
See the currency example: units/examples/currency.pl at 98e950c14f984f1e23cc47388019772889d2163a · kwon-young/units · GitHub
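As a rough illustration of that idea (not copied from the linked example), a symbol can be sketched as a one-argument fact whose argument is the fully qualified unit. The symbol names eur and usd and the unit names currency:euro and currency:us_dollar are assumptions made up for this sketch:

```prolog
% Hypothetical symbols for a user-defined currency system.
% Each arity-1 predicate maps a short symbol to a fully qualified unit.
eur(currency:euro).
usd(currency:us_dollar).
```

Exporting eur/1 and usd/1 from the user's module would then let other modules choose which of them to import.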

I think I understand that. Also if I import si/symbols.pl and usc/symbols.pl, I will get an error indicating duplicate definitions of ft. Then the importing code needs to be modified to exclude one of the definitions, before continuing.
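A sketch of what that exclusion could look like, using the standard except/1 import specification of use_module/2. The library paths follow the ones used elsewhere in this thread, and the arity of ft is assumed to be 1 because symbols are described above as arity-1 predicates:

```prolog
:- use_module(library(units)).
:- use_module(library(units/systems/si/symbols)).
% Import every usc symbol except ft/1, so the ft already imported
% from si/symbols stays the only definition visible in this module.
:- use_module(library(units/systems/usc/symbols), except([ft/1])).
```

Swapping the except/1 clause to the si import instead would keep the usc meaning of ft.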

It would seem that as a user the only importing required is the symbols files. All other units data is assumed to be preloaded. It looks like this is currently done by units.pl. Is there any thought to removing this, so the user can selectively choose which systems are needed (just like symbols)? As an example fragment, to load the si system and import all its symbols, a user would:

:- use_module(library(units/systems/si)).
:- use_module(library(units/systems/si/symbols.pl)).

BTW, I’ve now done a fair amount of work with the symbols inherited from mp-units and I’m not thrilled with a lot of their choices. I know this is somewhat subjective, but instead of making things easier to do in a Prolog environment, they seem to be more difficult or more obscure. Case in point: who would have guessed that ft could be interpreted as “femto-tonne”? But I’ll defer the discussion to another post.

Well, yes and no.
The thing is that most systems are actually interdependent. For example, usc is defined using units from the international system, which are in turn defined in terms of si.
So, to use usc, you would actually need to load the si system anyway.
But, you could use si without loading anything else.
Although, with the name clashes resolved around symbols, I don’t really see the benefit now.

Yeah, I think this is a big area for improvement.
We should definitely curate all the defined symbols to avoid pitfalls for users and reduce name clashes between systems as much as possible.
For example, I will remove the ft for femto-tonne symbol as it doesn’t make sense anyway.
Let me know of any other symbols you would like to add/remove or modify !

Wouldn’t the standard technique be that if one system “used” another, then it should import it? So system usc should use_module(library(../international)). In turn, system international should use_module(library(../si)). Doesn’t that do the job without requiring any pre-loading? The user just loads the system(s) they require.

Rather than identify specifics, here’s my guidelines for defining symbols:

  1. Minimize the use of extended keycodes like Ω, µ, Å, etc. These characters may be nice to read, but can be a pain to generate. At least I can find keystrokes (e.g., opt/alt-Z for Ω) to generate these characters; there are others that don’t meet that minimal requirement.
  2. Minimize symbols that have to be quoted. Is 'W' really better than watt? 'A' better than amp? And it’s a little bizarre that use with a prefix often doesn’t require quoting, e.g., kW and mA. A particularly bad choice IMO is 'hp(I)' for horsepower.
  3. Minimize the use of one character symbols to obvious, commonly used cases. For example, I think m for metre and s for second is OK. But I’d rather see ton instead of t, hr instead of h, day instead of d.
  4. Don’t significantly shorten prefixes that aren’t commonly used. So prefixes m, c, k are justifiable (IMO) as short cases, but use long versions for quecto, pico, mega, yotta, etc. I realize the boundary between the two classes may not be that clear, but I would err on the side of long versions because there’s no doubt what they mean. Another example: I wouldn’t advocate for any short versions of the iec prefixes.

So that’s my list of general principles for defining symbols (and prefixes). Comments, additions, changes?

I agree, although we need to find sensible alternatives.
The full name of Ω is si:ohm, but you can already use units without the system qualifier, like this: ohm.
For µ, we could use u ?
For the angstrom, I don’t know since A is already used for Ampere.
By the way, instead of replacing them, I think I will provide plain text alternatives where it makes sense, like u for µ or deg for °.

Well, this is a big pain point of Prolog. Losing all capitalized words is a pain since so many units need capitalization.
By the way, you can already use watt, but amp is a good idea !
The hp(I) symbol is aliased to hp in mp-units, which I failed to translate.

Well, we are following the rules for prolog atom, so I can’t do much about it…

Well, users don’t have to use the short form if they don’t want to.
Do we really want to remove symbols just because they are rarely used ?
For example, h and d could be very useful if you are working with timespans ?

So there’s already an adequate symbol, i.e., ohm, no need for Ω. BTW, did you know:

?- char_type(`Ω`,T).
T = alnum ;
T = alpha ;
T = csym ;
T = csymf ;
T = prolog_var_start ;
T = prolog_identifier_continue ;
T = csymf ;
T = print ;
T = graph ;
T = upper ;
T = upper(ω) ;
T = to_lower('Ω') ;
T = to_upper(ω) ;
false.

i.e., var(Ω) is true.
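A small way to check how a given spelling is read, sketched with the standard term_to_atom/2 under the default flags. The helper name reads_as_variable/1 is made up for this illustration:

```prolog
% True when the text of Atom is parsed by the Prolog reader as a variable,
% i.e. when it would have to be quoted to be used as a symbol.
reads_as_variable(Atom) :-
    term_to_atom(Term, Atom),
    var(Term).
```

With this, reads_as_variable('W') succeeds while reads_as_variable(kW) fails, which is the quoting asymmetry noted in guideline 2; given the char_type/2 output above, Ω would be expected to behave like 'W'.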

Possibly, although as a prefix I think I slightly prefer micro, i.e., full name.

In this case I would use 'A' even though it violates my “Avoiding caps (rule 2)”. That’s supported by (Angstrom - Wikipedia). I would use amp for ampere.

If you have a concern about predicate count, why make it worse for symbols that are already defined? If the user feels that strongly about a particular notation, he can define his own symbol.

I consider this the (small) price to be paid for self-declaring syntax. Besides, this is a library for Prolog users who are used to such issues. And do “so many units” really need capitalization?

For units in which the unqualified unit name is perfectly fine, I see no need to add another equivalent symbol. I think we should be trying to minimize the number of symbol synonyms for any given unit; best case is zero.

Nobody said you could, other than minimizing the number of symbols that require quoting.

Yes! As already noted users can define their own symbols if they feel the need. At the same time we minimize the number of predicates required by the symbol files.

A significant improvement IMO, is to have the system module export the symbols. For example if si.pl was modified as so:

:- module(si, []).
:- reexport(si/symbols).  % export symbols defined in si/symbols.pl

:- use_module(si/constants).
:- use_module(si/prefixes).
:- use_module(si/units).

then the user would just:

:- use_module(library(units/systems/si)).

Same import control (includes/excludes) as currently used on symbol files. And it nicely hides all the details of how the systems themselves are constructed.

Of course all this depends on giving the user total control of all systems to be loaded/used. If the units package attempted to pre-load multiple systems, it would likely run into symbol name conflicts with no way to resolve them.

Full disclosure: I haven’t actually tried this but I have used reexport in other places.

I’m not saying you should use it, but SWI-Prolog has a flag (module sensitive) called var_prefix. After setting this flag to true, variables are written as _<name>. See SWI-Prolog -- Force only underscore to introduce a variable
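A minimal sketch of what that flag changes, assuming it behaves as described in the linked thread (only names starting with an underscore are variables once it is set). The predicate short_symbol/2 and its data are made up for the illustration:

```prolog
:- set_prolog_flag(var_prefix, true).

% With var_prefix enabled, W and A below are read as atoms, not variables,
% so no quoting is needed for capitalized symbols.
short_symbol(W, watt).
short_symbol(A, ampere).
```

Without the flag, the same two facts would each contain a singleton variable and match any first argument.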

I have already used reexport/1 and it is awesome !
It is an interesting suggestion.
One thing I don’t really understand is the benefit of only importing the unit data for a specific system.
This has no benefit compared to the current implementation.
But it forces the user to add an import line for the si system, even if he wants to only use fully qualified names.
While currently, the only line the user needs to type to start using the library is use_module(library(units)). which I find cleaner and simpler.
Basically, the idea is that symbols are optional and the user should use them sparingly.

See, I have the exact opposite opinion. Symbols compose the user’s dictionary and should be the primary way of specifying units, not something optional to be used “sparingly”. Put another way, if I had a complete, well-managed set of symbols, why would I use anything else?

No, I’m importing the “system”. In the process it exports its symbols (checking for conflicts) as well as ensuring that the underlying unit data is installed for qeval to use. (Remember that the unit data files themselves don’t export anything visible to the user.) Benefits are described below.

That’s true, the user must explicitly load whatever system(s) their application requires, whether its symbols are used or not. I actually think that’s a good thing - just like using any other library or resource.

In addition, this eliminates any dependencies between units.pl and any systems (user or pack defined, now or in the future) used by the application. I think that’s a beneficial decoupling.

A minor point: systems that aren’t used, don’t get loaded.

And is it really that onerous to require importing any systems used with the units library, e.g.:

:- use_module(library(units)), use_module(library(units/systems/si)), ...

If so just put whatever you want in a simple script file and load it instead.

I think this scheme is eminently preferable to the status quo, but that’s in part because of my perspective on the importance of symbols.

Thanks for the input. This would be useful for composing the data files in question, but hard to see how it would help on the calling side (top-level or user code).

This is an awesome tip, thank you very much.
I think it could be very useful for users who rely heavily on some capitalized unit, or for convenient one-off top-level queries.
I’m going to document the trick in the README.

After testing this, it feels hard to use actually.
For example, the LSP still thinks that capitalized atoms are variables, and the style checker reports underscore variables when they are used more than once.
Moreover, the top level hides the underscore from the output.

Well, for clarity.
Symbols are inherently ambiguous if you don’t know which system you are working with, or if you don’t have a deep knowledge of the units you are manipulating.

Well, the problem is that the multifile predicates used to define the systems are themselves defined in units.pl.
Meaning that to load a system, you need to first load units.pl. So there is already an inherent coupling in one direction.
Same thing with the fact that all systems are defined in terms of the si system, which is the most heavy weight system of all, meaning that only loading usc won’t have any impact compared to loading all systems.

I’m not completely convinced, so I made a comparative table of the two views:

| User interaction | Status quo | Opt-in system with default symbols |
| --- | --- | --- |
| use si without symbols | use_module(library(units)). | use_module(library(units)). use_module(library(units/si), []). |
| use si with symbols | use_module(library(units)). use_module(library(units/systems/si/symbols)). | use_module(library(units)). use_module(library(units/si)). |
| use si and usc without symbols | use_module(library(units)). | use_module(library(units)). use_module(library(units/si), []). use_module(library(units/usc), []). |
| use si and usc with symbols | use_module(library(units)). use_module(library(units/systems/si/symbols)). use_module(library(units/systems/usc/symbols)). | use_module(library(units)). use_module(library(units/si)). use_module(library(units/usc)). |

Edit: use_module(library(units)). is needed anyway in the opt-in scheme, to import things like qeval etc.

Now that I made the table, I’m not sure what to think anymore ^^
If anyone else is reading this thread, please share your opinion so we can gather more points of view!

How so? Aren’t symbols completely defined in terms of the corresponding unit? If I’m using the si system (because I imported it), how is using m somehow ambiguous and si:metre not? And what “deep knowledge” does si:metre impart? So I really don’t understand the ambiguity you’re talking about.

The only coupling is that units.pl defines the multifile “hook” that is used to define the data. Indeed, that has to be loaded before any system data. But the only requirement is a temporal one. Nothing needs to change in units.pl to add a new system to the library. The model is the same as for many system hooks, e.g., prolog:message//1.
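The hook model can be sketched in a few lines. This is only an illustration of the general multifile technique, not the actual hook in units.pl; the module name unit_hook, the predicate unit/2 and the helper known_unit/1 are invented for the example:

```prolog
% Core side (the role units.pl plays): declare the hook, define no data.
:- multifile unit_hook:unit/2.
:- dynamic unit_hook:unit/2.

% System side (e.g. one file per system): contribute clauses to the hook.
% Adding a system means adding clauses; the core side does not change.
unit_hook:unit(si:metre, length).
unit_hook:unit(usc:foot, length).

% Core-side lookup that sees whatever systems have been loaded so far.
known_unit(Unit) :-
    unit_hook:unit(Unit, _Kind).
```

A system that is never loaded contributes no clauses, so known_unit/1 simply fails for its units.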

In any case, the usc system should import the international system and the si system (as well as any dependencies they might have). But there are lots of other systems (imperial, iec, hep, …) that don’t need to be loaded. si may dominate now, but that doesn’t mean the others don’t have any impact. In any case, I see this as a minor point; there are larger issues to sort out.

I suppose there is the option (comes for free) of importing a system without symbols, but I don’t see myself using it. So maybe it boils down to your ambiguity issue which I don’t understand.

Given the back and forth we had on this topic, I don’t think I will be able to convince you, so here is what I propose as future concrete steps:

  • use_module(library(units)).
    • exports qeval, qformat, the library error integration etc
    • loads only the si system unit data, since it is the root system that will always be used by any other systems
    • notably, this does not export any symbols from the si system
  • use_module(library(units/systems/{MySystem})).
    • this will load the unit data for that system (and any system it is dependent on)
    • export symbols for MySystem, one can use the second argument to modify the export list

I think this proposal fixes all concerns of @ridgeworks :

  • symbols by default when importing a system
  • only loads units data of system used
  • fix transitive dependencies of units data across systems

and mine:

  • no symbols when importing the library to avoid potential name clash
  • a single import will give a functional unit library (albeit only with the si system)
  • if the user wants symbols or another system, he needs to look up how to do it in the documentation. There I can document how to avoid importing symbols or manipulate the import list, as well as warn about potential name clashes with user predicates.

I think this will work for me because pre-loading the si system without symbols is benign, and I can effectively ignore it. But is it true that “it is the root system that will always be used by any other systems”? System angular doesn’t as far as I can see. And maybe some user defined system, e.g., for currency, might not use it either. So I don’t think you need to designate a “root system” and preloading si just seems superfluous to me other than the minor convenience of using si units with a single import.

Can I use the unqualified unit name with prefixes? On v0.16, apparently not:

?- qeval(X is g).
X = 1*kind_of(isq:mass)[cgs:gram].

?- qeval(X is gram).
X = 1*kind_of(isq:mass)[cgs:gram].

?- qeval(X is kg).
X = 1*kind_of(isq:mass)[si:kilo(cgs:gram)].

?- qeval(X is kgram).
ERROR: No rule matches units:eval_(kgram,_23854)

In my case, I kind of like that convenience.
Only loading si was a compromise between your view and mine ^^
Maybe I should check empirically if loading a single system (like a small independent one like currency) will really speed up the logic.
If not, I think I will keep loading all systems data.

No, the idea for now is that an atom is a symbol or a concatenation of symbols.
But you can do k(gram).
I’m not sure it would be wise to increase the number of possible combination between symbols and unit names.

I don’t see what’s stopping you from writing that one-line script for your personal use that loads whatever you want. But let others do the same. And you keep talking about loading “all systems data”, implying that what’s there now is all that will ever be. But I’ll keep insisting that it should load no systems data - which will always be the cheapest - and systems can be incrementally loaded according to the application’s need. IMO, that’s just good design.

I was hoping that there really would be no distinction between an unqualified unit name and a symbol from a user perspective. But now there is one since I can write kg and k(gram) but not k(g) and kgram. That’s unfortunate.

In the case where there isn’t a symbol for a unit, I guess I would like to see the unqualified unit name “promoted” to symbol status. For example, if the 'W' symbol for si:watt was removed, a “default” symbol watt would be defined. I don’t think this should increase the overall number of combinations of symbols and unit names.

Note that you can write k(g) but not kgram.
The idea is that you cannot mix symbol and non-symbol in an atom.
I concede that this limitation is a bit artificial.
On the other hand, it is going to again explode the number of predicates to export in order to use those atoms as symbols.
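To make the rule concrete, here is a toy resolver that reproduces the behaviour described in this exchange (kg, k(g) and k(gram) resolve, kgram does not). It is a sketch under assumed data, not how the units library is implemented; prefix_symbol/2, unit_symbol/2, unit_name/1, resolve/2 and the prefixed/2 result term are all hypothetical:

```prolog
% Hypothetical tables: short symbols and full names are kept separate.
prefix_symbol(k, kilo).
prefix_symbol(m, milli).
unit_symbol(g, gram).
unit_symbol(m, metre).
unit_name(gram).
unit_name(metre).

% resolve(+Term, -Unit): a plain atom is a unit symbol or a unit name.
resolve(Atom, Unit) :-
    atom(Atom),
    (   unit_symbol(Atom, Unit)
    ;   unit_name(Atom),
        Unit = Atom
    ).
% An atom may also be a prefix symbol concatenated with a unit symbol.
% Unit names are not tried here, so kgram does not resolve.
resolve(Atom, prefixed(Prefix, Unit)) :-
    atom(Atom),
    atom_concat(P, U, Atom),
    prefix_symbol(P, Prefix),
    unit_symbol(U, Unit).
% A prefix applied as a compound term accepts symbols and names alike.
resolve(Compound, prefixed(Prefix, Unit)) :-
    compound(Compound),
    Compound =.. [P, Arg],
    prefix_symbol(P, Prefix),
    resolve(Arg, Unit).
```

Allowing kgram in this toy would mean also trying unit_name/1 in the second clause, which is where the extra symbol and name combinations come from.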

That is already the case, see the bit unit in iec: units/prolog/units/systems/iec/units.pl at 7e865b1ec6f9fc218c13a76365a99812261bff20 · kwon-young/units · GitHub
You just repeat the name of the unit in the symbol argument.