Linear and logistic regression - Reply 2

@friguzzi These would be perfect to provide baselines for an application of slipcover to ecological data that I am currently working on. Unfortunately, the example on cplint@SWISH throws exceptions:
In logistic regression, the goal example_log_r(100,Coeff) leads to
procedure 'maplist(A,B,C,D,E,F)' does not exist,
while in linear regression, the goal example_lr(100,Coeff) leads to an instantiation error thrown by

mcintyre:mc_sample_arg_raw(noise(1,_1804,5),1,_1790,[_1810]) at /home/mlunife/.local/share/swi-prolog/pack/cplint/prolog/mcintyre.pl:2354

Hi Felix,
thanks for reporting. I recently updated swipl and mcintyre to check arguments. I have now fixed the linear regression example: cplint on SWISH -- Probabilistic Logic Programming
As regards logistic regression, I coded a workaround: cplint on SWISH -- Probabilistic Logic Programming
But @jan, how can one perform maplist on more than 4 lists?

Fabrizio

Good question. The compiled expansion of library(apply_macros) allows for maplist/N with arbitrary arity. It used to do its job whenever loaded. Now it depends on the flags optimise, optimise_apply and apply_macros_scope.

Now it gets a little hard. In my view, library(apply_macros) is a temporary, crude alternative to inlining and (thus) should not have any effect besides performance.

Of course, we can add a maplist/6, which seems to solve your problem. This too is not satisfying IMO (unless there is some agreement about an upper bound).

Although this is also unsatisfactory, I’m tempted to say that the best solution, at least for now, is to add your own copy/pasted version of maplist/6 to your source.

Better suggestions are welcome …

If one is not limited by a sandbox, as one is with SWISH, then adding maplist with more arguments is easy. For the SWISH case noted here, it seems to me that maplist/N with N up to 9 would be of value.

Actually I added this definition

maplist(Goal, List1, List2, List3, List4, List5) :-
    maplist_(List1, List2, List3, List4, List5, Goal).

maplist_([], [], [], [], [], _).
maplist_([Elem1|Tail1], [Elem2|Tail2], [Elem3|Tail3], [Elem4|Tail4], [Elem5|Tail5], Goal) :-
    call(Goal, Elem1, Elem2, Elem3, Elem4, Elem5),
    maplist_(Tail1, Tail2, Tail3, Tail4, Tail5, Goal).

and it was allowed by the sandbox, even though it contains call.
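
For readers who need more than five lists, the same pattern can be generalised by passing the lists as one list of lists. The following is only a minimal sketch: maplist_n/2, head_tail/3 and sum5/6 are hypothetical names introduced here, not part of library(apply), and it relies on the SWI-Prolog built-in apply/2. Whether the SWISH sandbox accepts it has not been checked.

```prolog
% maplist_n(+Goal, ?Lists): hypothetical helper, not part of library(apply).
% Lists is a non-empty list of lists. Goal is called with one element from
% each list, in order, so all lists must have the same length.
maplist_n(Goal, Lists) :-
    Lists = [_|_],
    maplist_n_(Lists, Goal).

% Base case: every list is exhausted.
maplist_n_(Lists, _) :-
    maplist(=([]), Lists).
% Step: split each list into head and tail, then call Goal on the heads.
maplist_n_(Lists, Goal) :-
    maplist(head_tail, Lists, Heads, Tails),
    apply(Goal, Heads),          % built-in: call Goal with Heads appended
    maplist_n_(Tails, Goal).

head_tail([Head|Tail], Head, Tail).

% Illustrative goal with five inputs and one output.
sum5(A, B, C, D, E, Sum) :-
    Sum is A + B + C + D + E.
```

With these definitions, the query maplist_n(sum5, [[1,2],[10,20],[100,200],[1000,2000],[10000,20000],Sums]) binds Sums to [11111,22222]. The sketch leaves a choice point behind and, if used inside a module, would also need a meta_predicate declaration; a fixed-arity copy such as the maplist/6 above avoids both.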

Thanks a lot!
A couple of points I noticed:
It seems that the particular way the matrix calculation is split up in the logistic regression iteration step can be unstable with cleaner input (for instance, setting noise to 1 leads to a zero-divisor error).
I forked a version at
https://cplint.eu/p/my_logistic_regression.pl
that uses a different split and avoids that issue.

Also, I get coefficient values that differ widely from [1,2,3], but I am not sure why that is.

Thanks @felix.weitkaemper, nice solution. I also updated my example with your approach to avoid zero-division errors.

Regarding the coefficients, bear in mind that multiplying all the coefficients by the same nonzero factor gives a set of coefficients identifying the same line.
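
To spell this out, here is a small worked statement. It assumes the coefficients are read as an implicit line in two inputs; the symbols $w_0, w_1, w_2$, $x_1, x_2$ and $k$ are chosen here for illustration and do not come from the example code.

```latex
\[
w_0 + w_1 x_1 + w_2 x_2 = 0
\quad\Longleftrightarrow\quad
k\,w_0 + k\,w_1 x_1 + k\,w_2 x_2 = 0
\qquad (k \neq 0)
\]
```

Under this reading, [1,2,3] and, say, [2,4,6] would describe the same line.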

@friguzzi As I am now finishing up the project I mentioned above 19 months ago, I would like to share my final code on GitHub once I submit the paper for review. Since that includes an adapted version of the regression code (I added things like AIC computations for the logistic regression and F statistics for the linear regression), it would be great if you could license the regression code under the same Artistic License 2.0 you used for cplint and the matrix pack.

Hi @felix.weitkaemper, I published the code at GitHub - friguzzi/logistic_regression: Logistic regression using Iteratively reweighted least squares (IRLS), under an MIT License because it’s simpler. Is that OK?
I will also publish it as a pack soon; I’m trying to solve a dependency compilation problem.

Dear Fabrizio,

Thank you so much! The MIT license makes everything even simpler, of course. Now that the licensing issue is clarified, I can put my code in a public GitHub repository too.