Computing probability distributions for clp(BNR) intervals

Hello,

I’ve got a solver over contingency matrices that generates clp(BNR) intervals. It can of course also enumerate concrete solutions, but since the number of possible solutions grows rapidly with the size of the data, it is infeasible to enumerate them all. Now, clp(BNR) intervals are great, but it would be really nice if I could compute the probability distribution over those intervals! I’m a novice, so I would value your advice on how to do that.

clp(BNR) recently introduced a search/6 predicate that looks like a good candidate for random solution sampling. Would you know of other (better?) ways to randomly sample a solution space without first enumerating all solutions?

Thanks!

Hello,
I recently wondered about something similar, I think.

Just to clarify, when you say:

do you mean that, for example:

  • you have a clpBNR real X with the interval [0, 10]: X::real(0, 10)
  • this real X has an unknown concrete solution inside the interval [0, 10]
  • however, you know the probability distribution of the unknown solution, for example a Gaussian with mean 5 and std 1?
  • meaning that, if you want to search for a solution, it would be better to start enumerating solutions close to 5 first?

I stumbled on a similar problem when thinking about Kalman filters, where we could model the hidden state with a constrained variable.
And instead of an interval, the constraint would store a probability distribution.
A new observation would update the probability distribution, and we could even reject observations that are too unlikely.
Has this idea been explored before ?
I feel there is a connection with probabilistic logic programming, but since I don’t know anything about that, I have no idea what I am talking about…

[edit]
well, after a quick search, it seems my idea has already been explored in cplint: kalman_filter.pl

@kwon-young Hi,

No, I meant the case where the distribution is unknown and we seek to estimate it: you’ve got an interval, but if you enumerate solutions, values close to the distribution’s mode will appear more frequently.

So, that amounts to Bayesian constrained sampling. You’d specify a prior distribution (or just use a default uniform prior if you don’t want to bother) and randomly sample a number of program solutions to get a posterior. I looked through Foundations of Probabilistic Logic Programming, but it’s unclear to me how to do that for an arbitrary program with libraries other than cplint.
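To make the idea concrete, here is a minimal sketch of that sampling scheme in Python (not clp(BNR)); `satisfies_constraints` is a made-up stand-in for whatever the real constraint program would decide, and the prior is a plain uniform:

```python
import random

# Rejection sampling sketch: draw from a prior, keep only draws that
# satisfy the constraints; the survivors approximate the posterior.
# The constraint below (x in [0, 10] with x*x > 20) is a hypothetical
# stand-in for an actual clp(BNR) program.
def satisfies_constraints(x):
    return 0.0 <= x <= 10.0 and x * x > 20.0

def posterior_samples(n, prior=lambda: random.uniform(0.0, 10.0)):
    """Keep drawing from the prior until n draws pass the constraint test."""
    kept = []
    while len(kept) < n:
        x = prior()
        if satisfies_constraints(x):
            kept.append(x)
    return kept
```

With a uniform prior, a histogram of `posterior_samples(10000)` estimates the shape of the feasible region; substituting a Gaussian prior would bias the estimate accordingly.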

Assuming I understand you, I’m afraid clpBNR provides little to no support for this. From a probability perspective, the only guarantee is that there is zero probability that a solution exists outside an interval. The interval itself may contain any number of solutions, including none at all. And the same applies to any sub-interval of that interval.

So the only way of absolutely determining that a solution exists at any point (e.g., its midpoint) within the interval is to unify the clpBNR constrained variable with the point value. If it succeeds there is a 100% probability that a solution exists at that point; if it fails there is a 0% probability.

So it’s hard to see how to avoid enumerating all solutions and then grouping them to provide a “simplified” view of the solution space, which I believe is what you’re looking for.

You might be able to approximate what you want by making some assumptions based on knowledge of the problem, e.g., that at some sufficiently small width an interval contains one solution. Then sample the interval over a covering set of such small sub-intervals to generate the data for a histogram. But all that seems a bit dubious to me.
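A rough sketch of that sub-interval idea, in Python for illustration; the `contains_solution` feasibility test is a hypothetical stand-in for whatever the clpBNR program would report for a given sub-interval:

```python
# Split [lo, hi] into equal-width sub-intervals and record, for each one,
# whether the (stand-in) feasibility test admits a solution there. The
# resulting triples are the raw data for a histogram of the solution space.
def interval_histogram(lo, hi, bins, contains_solution):
    width = (hi - lo) / bins
    hist = []
    for i in range(bins):
        a = lo + i * width
        b = a + width
        hist.append((a, b, contains_solution(a, b)))
    return hist

# Stand-in constraint: solutions are exactly the points with x*x <= 25,
# so a sub-interval is feasible iff it intersects [-5, 5].
hist = interval_histogram(-10.0, 10.0, 20, lambda a, b: a <= 5.0 and b >= -5.0)
```

Note this only records *whether* each sub-interval may contain a solution, which is exactly the weak guarantee described above, not *how many* solutions it contains.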

I’m afraid clpBNR provides little to no support for this.

I think I understand what you mean, but then have I misunderstood your search/6 predicate? As I understand it, search/6 with the indomain_random choice strategy makes it possible to enumerate solutions in a random order. Can’t I use that to randomly sample the solution space for a desired number of solutions, and then build a histogram from that?

For finite domains, indomain_random attempts to remove the solution at a random point in the domain. For continuous domains (reals), it tries to split the domain at a random point (semantics inherited from ECLiPSe’s search library). So I don’t think it does what you want.
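To illustrate why random splitting is not random sampling, here is a rough Python caricature of the continuous case (not clpBNR’s actual implementation; `feasible` is a hypothetical stand-in for constraint propagation, and the real search would retry the other half on backtracking rather than committing):

```python
import random

def random_split_search(lo, hi, feasible, precision=1e-6):
    """Narrow [lo, hi] by splitting at a random interior point and keeping
    a half that still passes the feasibility test. This converges on one
    solution enclosure; it is not a uniform sample of the solution space."""
    while hi - lo > precision:
        mid = random.uniform(lo, hi)
        if feasible(lo, mid):
            hi = mid
        elif feasible(mid, hi):
            lo = mid
        else:
            return None  # neither half admits a solution

    return (lo, hi)

# Stand-in constraint: the only solution is x = 3.0; the search keeps
# the half containing it until the enclosure is narrower than `precision`.
enclosure = random_split_search(0.0, 10.0, lambda a, b: a <= 3.0 <= b)
```

Only the split points are random; the order in which solutions are found is still biased by the shape of the search tree, which is the objection raised above.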

But if you have a sampling strategy, I don’t think it would be difficult to write code to do what you want, e.g., some variation of:

% collect success/failure at N random points in interval Int
test_random_pts(0,_,[]).
test_random_pts(N, Int, [sol(Pt,Bool)|Sols]) :- N>0,
    range(Int,[L,H]), Pt is L+random_float*(H-L),
    test_pt(Int,Pt,Bool),
    N1 is N-1,
    test_random_pts(N1,Int,Sols).

% double negation tests unification without binding Int
test_pt(Int,Pt,B) :- (not(not(Int=Pt)) -> B=true ; B=false).

% ?- X::real(-100,100), test_random_pts(5,X,Sols).

Just a note about x-markup.js here, an interesting “electron” variant:

But it’s a little dry; maybe it could use some hydration? I’m really not
sure whether such topics can be discussed here, since for web 1.0 it’s
fine. It might belong to a web 2.0 topic: one might want a web page to
appear incrementally, much like the browsers of 1995, which were faced
with slow land lines and could progressively render a graphic image.

Now the delay is induced by workers doing some work.

Edit 14.06.2025
Well, there was this French philosophy professor who showed me his
Org-mode. I told him I am only interested in web 2.0. I guess the
approach is the same as in 1995, where graphic images had to be
specially “compiled” server-side to be able to render progressively,
basically storing different increments of the same content on the server.

Hydration refers to various techniques reviving the same idea, but for
text markup forms, I guess. Or you can also draw inspiration from the
game DOOM and mess around with the scrolling of web pages.