Rdf/{3,4} with excluded graphs

One feature I’m missing in semweb is the ability to query triples while excluding a given subset of graphs. Of course, one can call rdf/4 with an unbound graph variable and then filter the results by graph name, but that is obviously suboptimal, as the excluded graphs can be quite large. Alternatively, one can generate the list of graphs minus the exclusions and run multiple rdf/4 queries, but that breaks the atomicity of the query.
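For reference, the two workarounds could look roughly like this. It is a minimal sketch: the predicate names `rdf_excluding_filter/4`, `rdf_excluding_enum/4` and `graph_name/2` are made up for illustration, and `Excluded` is assumed to be a plain list of graph names.

```prolog
:- use_module(library(semweb/rdf_db)).
:- use_module(library(lists)).

% Workaround 1: query all graphs, then discard the excluded ones.
% Every matching triple is handed to Prolog before it is filtered.
rdf_excluding_filter(S, P, O, Excluded) :-
    rdf(S, P, O, G0),
    graph_name(G0, G),
    \+ memberchk(G, Excluded).

% rdf/4 reports the source either as Graph or as Graph:Line.
graph_name(G:_Line, G) :- !.
graph_name(G, G).

% Workaround 2: enumerate the graphs that are not excluded and
% run one rdf/4 query per graph.
rdf_excluding_enum(S, P, O, Excluded) :-
    rdf_graph(G),
    \+ memberchk(G, Excluded),
    rdf(S, P, O, G).
```

Both versions report a triple once per graph it occurs in, so a triple stored in two non-excluded graphs shows up twice.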

So, I’m just wondering how difficult it would be to implement such functionality natively using graph indexes without major architectural changes?

One could, for example, associate a property like “invisible(true)” with a set of graphs via rdf_graph_property/2 to mark them as excluded from triple walking (unless explicitly addressed by the 4th argument of rdf/4).

It has crossed my mind before. First of all, there is not much of a notion of graphs in rdf_db. The core store is a quadruple store where the graph is one of the arguments. There are graph objects, but these mainly collect some statistics about triples associated with this graph and perform change tracking.

That means that in the end a query over a subset of the graphs is one of the two alternatives you indicate. For filtering, the only gain is that the store could test some bit on the associated graph and skip the triple, rather than returning the triple to Prolog only for Prolog to discard it. For the multiple queries, the system could save some time in pre-processing the query and could run all queries in the same generation. You can do the latter yourself using rdf_transaction/1.
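A minimal sketch of the rdf_transaction/1 route, assuming the per-graph enumeration described above. The name `rdf_excluding_all/5` is hypothetical, and the results are collected eagerly into a list because the transaction has to complete before answers are handed back.

```prolog
:- use_module(library(semweb/rdf_db)).
:- use_module(library(lists)).

% Collect all matches outside Excluded inside one transaction, so
% that every per-graph rdf/4 call sees the same generation.
rdf_excluding_all(S, P, O, Excluded, Triples) :-
    rdf_transaction(
        findall(rdf(S, P, O, G),
                ( rdf_graph(G),
                  \+ memberchk(G, Excluded),
                  rdf(S, P, O, G)
                ),
                Triples)).
```

The price is that the query is no longer lazy: all answers are materialised before the first one is available to the caller.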

I’m mainly concerned about the API. Playing around with global flags on graphs seems dubious.

Agreed, global flags are not the ideal API.

Alternatively, if we treat the graph as just another element of the tuple with a string (or atom) value, an API could be based on rdf_where/1 to specify constraints on graph variables, including all the normal string operations such as prefix, substring, etc.
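To make the intended semantics concrete, here is what a prefix constraint on the graph would mean, written as a plain post-filter. This is only an illustration of the behaviour, not of the proposed implementation: as far as I know rdf_where/1 does not accept constraints on graphs today, and `rdf_excluding_prefix/4` is a hypothetical name.

```prolog
:- use_module(library(semweb/rdf_db)).

% True for triples whose graph name does not start with Prefix.
% This filters in Prolog; a native version would apply the same
% test inside the store before returning the triple.
rdf_excluding_prefix(S, P, O, Prefix) :-
    rdf(S, P, O, G0),
    (   G0 = G:_Line        % source reported as Graph:Line
    ->  true
    ;   G = G0              % source reported as a plain graph name
    ),
    \+ sub_atom(G, 0, _, _, Prefix).
```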