The facts I’m asserting during data load look like this:
db:resource(Id:atom, Type:atom, Age:int, Profile:atom, Path:atom, Attrs:dict)
Where (Id, Age) is always unique and Age is either 1 or 2, representing the state of the resource identified by Id at different points in time. Type is drawn from an enumeration of < 100. Profile & Path are used for filtering results if required. Attrs is a dict containing the attributes of the resource. These facts are all created during data load, around 50k of them.
There’s another fact type that’s also created during data load which records relationships between two resources, but just for the most recent resource records:
db:link(+SrcId:atom, SrcType:atom, +DstId:atom, +DstType:atom)
Finally there is 2nd-level tabled predicate with multiple instances, with an overall form similar to this, most are significantly more complicated, referring to attributes of multiple resources
symptom(Id, 'Unused resource') :-
db:resource(Id, Type, 1, _, _, _),
\+ db:link(_, _, Id, Type)
It’s these symptom predicates that are tabled subsumptively. Symptoms are then grouped by resource so you can get a list of everything that’s wrong with a resource, that’s also tabled.
Grouped symptoms are then filtered for reporting, e.g. "List all the symptoms of this sort where the resource type is A, the Profile is B and the Path contains the string ‘foo’. Typically there will be ~25k symptoms, listing a subset of interest when tabling is used is fast.
Thanks for the masterclass & tips
I’ll use ignore/1, I’ll compare tabling without subsumption and I’ll take a look at your goodies library as well.