Computing probability distributions for clp(BNR) intervals

Hello,

I’ve got a solver over contingency matrices that generates clp(BNR) intervals. It can of course also enumerate concrete solutions, but since the number of possible solutions grows rapidly with the size of the data, it is infeasible to enumerate them all. Now, clp(BNR) intervals are great, but it would be really nice if I could compute the probability distribution over those intervals! I’m a novice, so I would value your advice on how to do that.

clp(BNR) recently introduced a search/6 predicate that looks like a good candidate for random solution sampling. Would you know of other (better?) ways to randomly sample a solution space without first enumerating all solutions?

Thanks!

Hello,
I recently wondered about something similar, I think.

Just to clarify, when you say:

do you mean that, for example:

  • you have a clpBNR real X with the interval [0, 10]: X::real(0, 10)
  • this real X has an unknown concrete solution inside the interval [0, 10]
  • however, you know the probability distribution of the unknown solution, for example a Gaussian with mean 5 and std 1?
  • meaning that, if you want to search for a solution, it would be better to start enumerating solutions close to 5 first?

I stumbled on a similar problem when thinking about Kalman filters, where we could model the hidden state with a constrained variable.
And instead of an interval, the constraint would store a probability distribution.
A new observation would update the probability distribution, and we could even reject observations that are too unlikely.
Has this idea been explored before ?
I feel there is a connection with probabilistic logic programming, but since I don’t know anything about that, I have no idea what I am talking about…

[edit]
well, after a quick search, it seems my idea has already been explored in cplint: kalman_filter.pl

@kwon-young Hi,

No, I meant the case where the distribution is unknown and we seek to estimate it: you’ve got an interval, but if you enumerate solutions, values close to the distribution’s mode will appear more frequently.

So, that amounts to Bayesian constrained sampling. You’d specify a prior distribution (or just use a default uniform prior if you don’t want to bother) and randomly sample a number of program solutions to get a posterior. I looked through Foundations of Probabilistic Logic Programming, but it’s unclear to me how to do that for an arbitrary program with libraries other than cplint.
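To make the idea concrete, here is a minimal sketch of that sampling scheme in Python (not clp(BNR)); `satisfies_constraints` is a made-up stand-in for whatever the real constraint program would decide, and the prior is a plain uniform:

```python
import random

# Rejection sampling sketch: draw from a prior, keep only draws that
# satisfy the constraints; the survivors approximate the posterior.
# The constraint below (x in [0, 10] with x*x > 20) is a hypothetical
# stand-in for an actual clp(BNR) program.
def satisfies_constraints(x):
    return 0.0 <= x <= 10.0 and x * x > 20.0

def posterior_samples(n, prior=lambda: random.uniform(0.0, 10.0)):
    """Keep drawing from the prior until n draws pass the constraint test."""
    kept = []
    while len(kept) < n:
        x = prior()
        if satisfies_constraints(x):
            kept.append(x)
    return kept
```

With a uniform prior, a histogram of `posterior_samples(10000)` estimates the shape of the feasible region; substituting a Gaussian prior would bias the estimate accordingly.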

Assuming I understand you, I’m afraid clpBNR provides little to no support for this. From a probability perspective, the only guarantee is that there is zero probability that a solution exists outside an interval. The interval itself may contain any number of solutions, including none at all. And the same applies to any sub-interval of that interval.

So the only way of absolutely determining that a solution exists at any point (e.g., its midpoint) within the interval is to unify the clpBNR constrained variable with the point value. If it succeeds there is a 100% probability that a solution exists at that point; if it fails there is a 0% probability.

So it’s hard to see how to avoid enumerating all solutions and then grouping them to provide a “simplified” view of the solution space, which I believe is what you’re looking for.

You might be able to approximate what you want by making some assumptions based on knowledge of the problem, e.g., that at some sufficiently small width an interval contains one solution. Then sample the interval over a covering set of such small sub-intervals to generate the data for a histogram. But all that seems a bit dubious to me.
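A rough sketch of that sub-interval idea, in Python for illustration; the `contains_solution` feasibility test is a hypothetical stand-in for whatever the clpBNR program would report for a given sub-interval:

```python
# Split [lo, hi] into equal-width sub-intervals and record, for each one,
# whether the (stand-in) feasibility test admits a solution there. The
# resulting triples are the raw data for a histogram of the solution space.
def interval_histogram(lo, hi, bins, contains_solution):
    width = (hi - lo) / bins
    hist = []
    for i in range(bins):
        a = lo + i * width
        b = a + width
        hist.append((a, b, contains_solution(a, b)))
    return hist

# Stand-in constraint: solutions are exactly the points with x*x <= 25,
# so a sub-interval is feasible iff it intersects [-5, 5].
hist = interval_histogram(-10.0, 10.0, 20, lambda a, b: a <= 5.0 and b >= -5.0)
```

Note this only records *whether* each sub-interval may contain a solution, which is exactly the weak guarantee described above, not *how many* solutions it contains.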

I’m afraid clpBNR provides little to no support for this.

I think I understand what you mean, but then have I misunderstood your search/6 predicate? As I understand it, search/6 with the indomain_random choice strategy makes it possible to enumerate solutions in a random order. Can’t I use that to randomly sample the solution space for a desired number of solutions, and then build a histogram from that?

For finite domains, indomain_random attempts to remove the solution at a random point in the domain. For continuous domains (reals), it tries to split the domain at a random point (semantics inherited from ECLiPSe’s search library). So I don’t think it does what you want.
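To illustrate why random splitting is not random sampling, here is a rough Python caricature of the continuous case (not clpBNR’s actual implementation; `feasible` is a hypothetical stand-in for constraint propagation, and the real search would retry the other half on backtracking rather than committing):

```python
import random

def random_split_search(lo, hi, feasible, precision=1e-6):
    """Narrow [lo, hi] by splitting at a random interior point and keeping
    a half that still passes the feasibility test. This converges on one
    solution enclosure; it is not a uniform sample of the solution space."""
    while hi - lo > precision:
        mid = random.uniform(lo, hi)
        if feasible(lo, mid):
            hi = mid
        elif feasible(mid, hi):
            lo = mid
        else:
            return None  # neither half admits a solution

    return (lo, hi)

# Stand-in constraint: the only solution is x = 3.0; the search keeps
# the half containing it until the enclosure is narrower than `precision`.
enclosure = random_split_search(0.0, 10.0, lambda a, b: a <= 3.0 <= b)
```

Only the split points are random; the order in which solutions are found is still biased by the shape of the search tree, which is the objection raised above.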

But if you have a sampling strategy, I don’t think it would be difficult to write code to do what you want, e.g., some variation of:

% collect success/failure at N random points in interval Int
test_random_pts(0,_,[]).
test_random_pts(N, Int, [sol(Pt,Bool)|Sols]) :- N>0,
    range(Int,[L,H]), Pt is L+random_float*(H-L),
    test_pt(Int,Pt,Bool),
    N1 is N-1,
    test_random_pts(N1,Int,Sols).

% double negation tests unification without binding Int
test_pt(Int,Pt,B) :- (not(not(Int=Pt)) -> B=true ; B=false).

% ?- X::real(-100,100), test_random_pts(5,X,Sols).

Just a note about x-markup.js here, an interesting “electron” variant:

But it’s a little dry; maybe it could use some hydration? I’m really not
sure whether such topics can be discussed here, since for web 1.0 it’s
fine. It might belong to a web 2.0 topic: one might want a web page to
appear incrementally, much like the browsers of 1995, which were faced
with slow land lines and could progressively render a graphic image.

Now the delay is induced by workers doing some work.

Edit 14.06.2025
Well, there was this French philosophy professor who showed me his
Org-mode. I told him I am only interested in web 2.0. I guess the
approach is the same as in 1995, where graphic images had to be
specially “compiled” server-side to be able to render progressively,
basically storing different increments of the same content on the server.

Hydration refers to various techniques reviving the same idea, but for
text markup forms, I guess. Or you can also draw inspiration from the
game DOOM and mess around with the scrolling of web pages.