Using global (backtrackable) variables to hold return values

Hello,

I’d like to avoid adding an extra parameter to hold return values from predicate calls.

Would it be bad practice if i were to use a global variable, say, ret_value, to set (backtrackable) return values and then use these in subsequent predicates?

Any reason why not to do it?

thank you,

Dan

Don’t have answer, just trying to better understand the question.

Is this referring to Global variables?

yes …
b_setval/2 and friends …

One reason not to do this is that it ends up greatly limiting how predicates can be used. I think in many situations, multiple arguments could be thought of as the “return” value, depending on how you want to use it – e.g. append([1], [2,3], X) or append(X, [2,3], [1,2,3]) or append([1], X, [2,3]).

Thank you James,

Yes. I understand that.

I am mainly thinking about predicates that are not relational but functional.

For example, predicates that assert new structures and return newly created identifies.

In most cases I can ignore the “internal” generated identifier; but in some cases, I’d like to have it – such as when testing code.

So, for those some cases, I am wondering if i can get away with creating a second predicate with an extra argument for binding the returned identifier.

Dan

I still don’t fully understand your question.

This Q&A on StackOverflow might help you.

Why do some DCG test cases use assertion( Rest == [])?

I am the OP of the question and false who gave the answer is believed to be Ulrich Neumerkel but won’t swear to that in a court of law.

Not sure I fully understand your use case. That said, Logtalk parametric objects may provide a solution. Object parameters are logical variables so any bindings are automatically undone by backtracking. Object parameters are also shared by all object predicates and provide O(1) access. Thus, a toy example could be:

:- object(foo(_Id_)).

    :- public(bar/1).

    bar(StructArgs) :-
         create_some_struct(StructArgs, Struct, _Id_),
         assertz(Struct).

:- end_object.

When calling bar/1, the object parameter, _Id_, would be bound to the struct identifier:

% ignore returned identifier
?- foo(_)::bar(...).
...

% access returned identifier
?- foo(Id)::bar(...).
...

Still, not clear why you don’t simply write your predicates such that they always have an output argument, which can trivially be ignored, for the identifiers.

P.S. I see on a regular basis people that appear to think that using global variables is nasty but fast and thus a solution worth looking. Wondering if that’s the reasoning also here…

Given that the OP is talking about backtrackable global variables, let’s compare them agains a cleaner solution that uses logical variables. Global variables are a non-portable, low level mechanism and, I assume, fairly optimized (never looked into the implementation). Thus, it’s reasonable to expect that a global variables based solution be indeed faster. But is that really true? And, if true, how much faster?

The code that follows assumes two predicates that assert structures in user that have an identifier. Moreover, as the OP mentioned, the code assumes that those identifiers should be stored so that they can be used for testing or (I assume) debugging. I’m going to use a module and a parametric object (but, again, this is about global variables vs logical variables):

:- module(foo, [bar/1, baz/1]).

bar(StructArgs) :-
	create_some_struct(bar, StructArgs).

baz(StructArgs) :-
	create_some_struct(baz, StructArgs).

create_some_struct(Functor, StructArgs) :-
	gensym(id, Id),
	Struct =.. [Functor, Id | StructArgs],
	assertz(user:Struct),
	b_setval(Functor, Id).

The equivalent parametric object code using logical variables is:

:- object(foo(_BarId_, _BazId_)).

	:- public(bar/1).
	bar(StructArgs) :-
		create_some_struct(bar, StructArgs, _BarId_).

	:- public(baz/1).
	baz(StructArgs) :-
		create_some_struct(baz, StructArgs, _BazId_).

	create_some_struct(Functor, StructArgs, Id) :-
		gensym::gensym(id, Id),
		Struct =.. [Functor, Id | StructArgs],
		{assertz(Struct)}.

:- end_object.

Let’s try it:

?- set_logtalk_flag(optimize, on).
true.

?- {gensym(loader), foo_2}.
% [ /Users/pmoura/logtalk/library/gensym/gensym.lgt loaded ]
% [ /Users/pmoura/logtalk/library/gensym/loader.lgt loaded ]
% (0 warnings)
% [ /Users/pmoura/Desktop/dgross/foo_2.lgt loaded ]
% (0 warnings)
true.

?- use_module(library(gensym)), [foo].
true.

?- time(true).
% 2 inferences, 0.000 CPU in 0.000 seconds (60% CPU, 333333 Lips)
true.

?- time((foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]))).
% 23 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 547619 Lips)
BarId = id1,
BazId = id2.

?- time((foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]))).
% 22 inferences, 0.000 CPU in 0.000 seconds (91% CPU, 511628 Lips)
BarId = id3,
BazId = id4.

?- time((bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId))).
% 39 inferences, 0.000 CPU in 0.000 seconds (84% CPU, 750000 Lips)
BarId = id1,
BazId = id2.

?- time((bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId))).
% 30 inferences, 0.000 CPU in 0.000 seconds (81% CPU, 750000 Lips)
BarId = id3,
BazId = id4.

But inferences don’t all take the same amount of time. Repeating the queries a fair amount of time gives:

?- time((between(1,100000,_), foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]), fail)).
% 2,300,001 inferences, 0.927 CPU in 1.030 seconds (90% CPU, 2480585 Lips)
false.

?- time((between(1,100000,_), bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId), fail)).
% 3,100,001 inferences, 0.530 CPU in 0.587 seconds (90% CPU, 5848363 Lips)
false.

The global variables version is faster (despite requiring more inferences, mainly due to the 4 extra predicate calls in the query) :scream: The Dark Side can indeed be alluring :smiley:

But what about adding the identifier as an output argument to the predicates? This may sound like an obvious solution but there could be all sorts of information that we may want to make available for e.g. testing or debugging without wanting the overhead of carrying that information in additional and variable number of predicate arguments that would need to be ignored in normal usage. Consider:

:- module(foo2, [bar/2, baz/2]).

bar(StructArgs, Id) :-
	create_some_struct(bar, StructArgs, Id).

baz(StructArgs, Id) :-
	create_some_struct(baz, StructArgs, Id).

create_some_struct(Functor, StructArgs, Id) :-
	gensym(id, Id),
	Struct =.. [Functor, Id| StructArgs],
	assertz(user:Struct).

We this alternative solution we get:

?- [foo2].
true.

?- time((bar([1,2,3], BarId), baz([a,b,c], BazId))).
% 27 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 586957 Lips)
BarId = id800005,
BazId = id800006.

?- time((bar([1,2,3], BarId), baz([a,b,c], BazId))).
% 26 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 565217 Lips)
BarId = id800007,
BazId = id800008.

?- time((between(1,100000,_), bar([1,2,3], BarId), baz([a,b,c], BazId), fail)).
% 2,700,001 inferences, 0.564 CPU in 0.625 seconds (90% CPU, 4787041 Lips)
false.

A bit surprising to me, still slower than the global variables solution but in the same ballpark. Jan, knowing all the fine details, can explain the performance difference here much better than me.

1 Like

Hi Paolo,

Thank you for the comparision.

What i had in mind also, was thread safety …

Dan

Still don’t understand your specific use case, maybe this one will hit closer to the mark.

If the requirements are

  1. Preserve data using assert so that it does not have to be passed back as a parameter.
  2. Has to have thread safety.

You note it is used for testing, but for this repose that is a requirement that can be discarded.

Use library(persistence).

Here is a real example I recently created.

This is a file with path:
'D:/Cellular Information/Source Code/Prolog/PDF/PDF_persistency.pl'

:- module(
        pdf__persistency,
        [
            exists_name_value/3,    % ?Name:atom, ?Hierarchy:list, ?Value:compound
            add_name_value/3        % +Name:atom, ?Hierarchy:list, +Value:compound
        ]
      ).

:- use_module(library(persistency)).

:- persistent
    name_value(name:atom,hierarchy:list,value:compound).

:- initialization(db_attach('D:/Cellular Information/Source Code/Temp/pdf.journal', [])).

exists_name_value(Name,Hierarchy,Value) :-
  with_mutex(pdf_db,name_value(Name,Hierarchy,Value)).

add_name_value(Name,Hierarchy,Value) :-
    with_mutex(pdf_db,assert_name_value(Name,Hierarchy,Value)).

Here is the code that makes uses of it.

This is in a file with path:
‘D:/Cellular Information/Source Code/Prolog/PDF/PDF__parser.pl’


:- use_module('D:/Cellular Information/Source Code/Prolog/PDF/pdf_persistency.pl').

persist_structures(Trailer_dictionary,Xref,Linear_structure) :-
    (
        % Does it exist
        exists_name_value('Trailer directory',[],Trailer_dictionary)
    ;
        % If not then add it
        add_name_value('Trailer directory',[],Trailer_dictionary)
    ),
    (
        % Does it exist
        exists_name_value('Xref',[],Xref)
    ;
        % If not then add it
        add_name_value('Xref',[],Xref)
    ),
    persist_trailer_dictionary(Linear_structure,[],Trailer_dictionary).

Here is a partial list of the contents of:
D:\Cellular Information\Source Code\Temp\pdf.journal

created(1559406518.911974).
assert(name_value('Trailer directory',[],dictionary([key_value(name('Size'),number(78)),key_value(name('Root'),reference(76,0)),key_value(name('Info'),reference(77,0)),key_value(name('ID'),array([hex_number("d8d301b6211e2bc9e7508b45f162d319"),hex_number("d8d301b6211e2bc9e7508b45f162d319")]))]))).
assert(name_value('Xref',[],xref(0,[object(0,0,65535,type(free)),object(1,426961,0,type(in-use)),object(2,16,0,type(in-use)),object(3,4485,0,type(in-use)),object(4,422790,0,type(in-use)),object(5,423567,0,type(in-use)),object(6,424345,0,type(in-use)),object(7,424768,0,type(in-use)),object(8,425611,0,type(in-use)),object(9,400088,0,type(in-use)),object(10,400158,0,type(in-use)),object(11,5215,0,type(in-use)),object(12,427866,0,type(in-use)),object(13,4661,0,type(in-use)),object(14,427042,0,type(in-use)),object(15,5285,0,type(in-use)),object(16,10920,0,type(in-use)),object(17,44189,0,type(in-use)),object(18,427126,0,type(in-use)),object(19,11111,0,type(in-use)),object(20,15716,0,type(in-use)),object(21,51884,0,type(in-use)),object(22,427210,0,type(in-use)),object(23,15907,0,type(in-use)),object(24,17973,0,type(in-use)),object(25,424578,0,type(in-use)),object(26,110066,0,type(in-use)),object(27,427294,0,type(in-use)),object(28,18175,0,type(in-use)),object(29,21552,0,type(in-use)),object(30,256447,0,type(in-use)),object(31,287623,0,type(in-use)),object(32,427378,0,type(in-use)),object(33,21755,0,type(in-use)),object(34,24250,0,type(in-use)),object(35,320180,0,type(in-use)),object(36,356948,0,type(in-use)),object(37,366709,0,type(in-use)),object(38,427462,0,type(in-use)),object(39,24455,0,type(in-use)),object(40,29328,0,type(in-use)),object(41,375685,0,type(in-use)),object(42,427546,0,type(in-use)),object(43,29530,0,type(in-use)),object(44,35517,0,type(in-use)),object(45,427630,0,type(in-use)),object(46,35684,0,type(in-use)),object(47,41042,0,type(in-use)),object(48,427714,0,type(in-use)),object(49,41198,0,type(in-use)),object(50,44043,0,type(in-use)),object(51,399513,0,type(in-use)),object(52,399693,0,type(in-use)),object(53,399792,0,type(in-use)),object(54,399891,0,type(in-use)),object(55,399989,0,type(in-use)),object(56,400226,0,type(in-use)),object(57,400571,0,type(in-use)),object(58,404754,0,type(in-use)),object(59,405195,0,type(in-use)),object(60,411563,0,type(in-use)),object(61,411856,0,type(in-use)),object(62,413215,0,type(in-use)),object(63,413431,0,type(in-use)),object(64,413819,0,type(in-use)),object(65,414385,0,type(in-use)),object(66,422097,0,type(in-use)),object(67,422317,0,type(in-use)),object(68,425787,0,type(in-use)),object(69,426373,0,type(in-use)),object(70,425929,0,type(in-use)),object(71,426666,0,type(in-use)),object(72,425999,0,type(in-use)),object(73,426077,0,type(in-use)),object(74,427798,0,type(in-use)),object(75,427826,0,type(in-use)),object(76,428012,0,type(in-use)),object(77,428082,0,type(in-use))]))).
assert(name_value('Root',['Root'],object(id(76,0),dictionary([key_value(name('Type'),name('Catalog')),key_value(name('Pages'),reference(12,0)),key_value(name('PageLabels'),reference(75,0))])))).
assert(name_value('Pages',['Root','Pages'],object(id(12,0),dictionary([key_value(name('Type'),name('Pages')),key_value(name('Kids'),array([reference(1,0),reference(14,0),reference(18,0),reference(22,0),reference(27,0),reference(32,0),reference(38,0),reference(42,0),reference(45,0),reference(48,0)])),key_value(name('Count'),number(10)),key_value(name('MediaBox'),array([number(0),number(0),number(612),number(792)]))])))).
assert(name_value('Kids',['Root','Pages','Kids'],object(id(1,0),dictionary([key_value(name('Type'),name('Page')),key_value(name('Parent'),reference(12,0)),key_value(name('Resources'),reference(3,0)),key_value(name('Contents'),reference(2,0))])))).
assert(name_value('Resources',['Root','Pages','Kids','Resources'],object(id(3,0),dictionary([key_value(name('ProcSet'),array([name('PDF'),name('Text')])),key_value(name('Font'),dictionary([key_value(name('F1'),reference(4,0)),key_value(name('F2'),reference(5,0)),key_value(name('F3'),reference(6,0)),key_value(name('F5'),reference(7,0)),key_value(name('F6'),reference(8,0))])),key_value(name('ExtGState'),dictionary([key_value(name('GS1'),reference(9,0)),key_value(name('GS2'),reference(10,0))])),key_value(name('ColorSpace'),dictionary([key_value(name('Cs8'),reference(11,0))]))])))).
assert(name_value('F1',['Root','Pages','Kids','Resources','Font','F1'],object(id(4,0),dictionary([key_value(name('Type'),name('Font')),key_value(name('Subtype'),name('Type1')),key_value(name('FirstChar'),number(32)),key_value(name('LastChar'),number(181)),key_value(name('Widths'),array([number(333),number(313),number(552),number(668),number(479),number(896),number(750),number(281),number(354),number(354),number(521),number(667),number(313),number(417),number(313),number(531),number(500),number(500),number(500),number(500),number(500),number(500),number(500),number(500),number(500),number(500),number(313),number(313),number(667),number(667),number(667),number(365),number(927),number(740),number(698),number(740),number(823),number(646),number(594),number(781),number(833),number(365),number(344),number(740),number(604),number(927),number(813),number(844),number(646),number(844),number(771),number(563),number(729),number(771),number(750),number(1000),number(802),number(750),number(750),number(365),number(531),number(365),number(583),number(500),number(333),number(479),number(573),number(490),number(573),number(479),number(323),number(552),number(563),number(292),number(281),number(542),number(271),number(927),number(573),number(573),number(573),number(573),number(396),number(375),number(344),number(573),number(521),number(781),number(531),number(521),number(510),number(396),number(566),number(396),number(521),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(490),number(833),number(333),number(333),number(333),number(333),number(333),number(750),number(333),number(333),number(333),number(333),number(333),number(333),number(333),number(668),number(333),number(333),number(333),number(552)])),key_value(name('Encoding'),name('WinAnsiEncoding')),key_value(name('BaseFont'),name('JONDFF+Bembo-Bold')),key_value(name('FontDescriptor'),reference(56,0))])))).

Also since these structures are rather complex and hard to interpret as single line, using print_term/2 comes in handy.


As you noted that you are looking at b_setval/2 and friends and b_setval/2 reads:

Associate the term Value with the atom Name or replace the currently associated value with Value. If Name does not refer to an existing global variable, a variable with initial value [] is created (the empty list). On backtracking the assignment is reversed.

This example does not implement On backtracking the assignment is reversed. however library(persistency) notes

The persistent/1 expands each declaration into four predicates:

  • name(Arg, ...)
  • assert_name(Arg, ...)
  • retract_name(Arg, ...)
  • retractall_name(Arg, ...)

so the requirement might be satisfied by use of retract_name(Arg, ...) with some added logic :smiley:

However if you are using this for testing or learning then possibly you might want to see all variations of the value, in which case there would not be a need to retract on backtracking and instead save each value with a unique seqential value for sorting such as a time stamp using date_time_stamp/2 or sequential values using library(gensym)

I don’t know what is going on in Logtalk. Is there another assert behind the scene? In any case backtrackable global variables are pretty fast as, regardless of the value, simply sets a pointer and trails this. Most of the time is probably spent in mapping the name to the location (a hash table). There is a fairly high price during garbage collection because the way the locations are stored doesn’t play well with garbage collection and thus we have to do some conversion work before and after GC for each global variable. So, do not create too many of them!

The OP however want to use it as an alternative to argument passing if I read the post correctly. I can’t see the point of that and I do not understand the reasoning behind it. But, it is phrased as a general question. Possibly there are scenarios where the code gets easier to read and (unlikely) faster.

Thank you Jan.

Yes, both – speed and readability, but also thread saftey in case of need in the future.

Dan

Thanks for the detailed overview on the implementation of global variables.

Note that the final comparison on my answer is between two module-based alternatives. That said, the only “assert behind the scenes” is on the gensym/2 calls, both for the Logtalk and the module versions. I just completed a re-run of the tests where the Logtalk version uses SWI-Prolog library(gensym) instead of the Logtalk own version of the library. With that change I get:

?- set_logtalk_flag(optimize, on).
true.

?- use_module(library(gensym)), [foo].
true.

?- {gensym(loader), foo_2}.
% [ /Users/pmoura/logtalk/library/gensym/gensym.lgt loaded ]
% [ /Users/pmoura/logtalk/library/gensym/loader.lgt loaded ]
% (0 warnings)
% [ /Users/pmoura/Desktop/dgross/foo_2.lgt loaded ]
% (0 warnings)
true.

?- time(true).
% 2 inferences, 0.000 CPU in 0.000 seconds (60% CPU, 333333 Lips)
true.

?- time((foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]))).
% 32 inferences, 0.000 CPU in 0.000 seconds (81% CPU, 653061 Lips)
BarId = id1,
BazId = id2.

?- time((foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]))).
% 26 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 962963 Lips)
BarId = id3,
BazId = id4.

?- time((bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId))).
% 33 inferences, 0.000 CPU in 0.000 seconds (70% CPU, 733333 Lips)
BarId = id5,
BazId = id6.

?- time((bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId))).
% 30 inferences, 0.000 CPU in 0.000 seconds (78% CPU, 750000 Lips)
BarId = id7,
BazId = id8.

?- time((between(1,100000,_), foo(BarId, BazId)::bar([1,2,3]), foo(BarId, BazId)::baz([a,b,c]), fail)).
% 2,700,001 inferences, 0.544 CPU in 0.603 seconds (90% CPU, 4965820 Lips)
false.

?- time((between(1,100000,_), bar([1,2,3]), baz([a,b,c]), b_getval(bar, BarId), b_getval(baz, BazId), fail)).
% 3,100,001 inferences, 0.549 CPU in 0.610 seconds (90% CPU, 5647517 Lips)
false.

Thus, it turns out that the performance is roughly the same and the performance difference compared with my first benchmarks as due to the different performances of the gensym libraries. Using the same library indeed makes the comparison between globals variables and logical variables a more fair one :slight_smile: It turns out that the Dark Side doesn’t win this round :grinning:

Why is Logtalk’s gensym portable library slower? To make it thread-aware/thread-safe, all predicates are declared as synchronized, which means that their compiled versions use a with_mutex/2 wrapper. This is the source of the observed slowdown. The SWI-Prolog gensym library uses the proprietary flag/3 built-in predicated, which performs atomic updates, which seems to be why is faster.

1 Like

Flag is quite a bit cheaper than assert+retract and, indeed, atomic. Flags are a simple hash table mapping the flag name to a struct that holds the value. The update is nothing more than storing the (integer) value in native machine format.

1 Like