I've put an introductory Prolog tut for SQL programmers on Swish

joeblog · February 24, 2021, 2:35pm

I’ve had my first encounter with NULL redoing the Stanford SQL MooC with this problem:

select sName, sID, cName, major
from Student left outer join Apply using(sID);

There are four students in the Student table who aren’t in the Apply table, and the requirement is not to ignore them (as a normal join would do) but include them with their cName and major set to NULL.

This is how I’ve done it in Prolog:

student(SID, SName, _GPA, _SizeHS),
(   apply(SID, CName, Major, _Decision)
    ;   
    \+apply(SID, CName, Major, _Decision),
    CName = null, Major = null
).

Boris · February 24, 2021, 4:22pm

I will bring this up again, but I promise it is the last time. Do not use the atom null for representing a missing value, esp. if any other column is represented using atoms. This makes the text “null”, represented as the atom null, exactly the same as a missing value.

I am not sure what to use instead. The atomic, not-an-atom [] is one option for SWI-Prolog, but it has other problems. See the comments by @jan and @j4n_bur53.

grossdan · February 24, 2021, 4:32pm

In another domain I had the problem of “missing” value that would cause a fail if not taken care of. I decided to introduce an “custom rolled” optional value based on dict.

Something like below:

Errors would be some error message atoms created and, eventually, checked, throughout.

Not sure how approriate this is in an SQL context — but it could show that Prolog can work with optionals, including via, monads.

	new_value(Value, Optional) :-
		Optional = optional{value:Value,error:nil}. 

	new_error(Error, Optional) :-
		Optional = optional{value:nil,error:Error}. 

	is_error(Optional) :-
		Optional = optional{value:_Value,error:Error},
		\+ Error = nil.  
			
	is_value(Optional) :-
		Optional = optional{value:_Value,error:Error},
		Error = nil.

peter.ludemann · February 24, 2021, 5:33pm

I think the correct solution for null (which has its own problems in SQL – see Errors considered harmful - #102 by peter.ludemann) is to do something like Haskell’s Maybe = Just a | Nothing … that is, a non-defaulty representation. (In Prolog, this would be just(X) and nothing, for example, which can always be distinguished.)
Although, when designing a database, it’s often possible to avoid null completely – with specifying not null for a column when creating a table in SQL, and specifying a suitable default value for strings (typically a zero-length string).

Boris · February 24, 2021, 5:40pm

The first relational database I used seriously was Oracle DB; in Oracle, a zero-length string is in practice a NULL. imagine my shock when I discovered that not all databases do that.

Of course missing values can be avoided completely: just normalize your data. The SQL null is just a pragmatic approach to not normalizing your data. There are again many reasons why it is in practice useful to not normalize your data, there is enough written on the topic and I am definitely not an expert.

PS: faking a missing value instead of using null, just because the column has been defined as “not null”, this doesn’t sound right.

Boris · February 24, 2021, 6:06pm

Yes, but the person and whatever number they have are separate entities. Not having a serial number does not make you less of a person.

This is not what I meant by “normalizing your data”. Fully normalizing the data would mean that you have a table person, and another table linking the person to some number. There will be simply no row in that other table for people that have a missing number.

But I am certain I am not telling you anything you didn’t know.

Boris · February 24, 2021, 6:13pm

One major practical reason for allowing missing values in a relational database is that you can denormalize your data, and keep pieces of data that are usually used at the same time in close physical proximity to each other. So, getting the data is faster/cheaper. But again, those are obvious things I guess?

Boris · February 24, 2021, 6:23pm

There are many things that can be ignored, this is true.

Boris · February 24, 2021, 6:27pm

Well mission accomplished I guess, you made me read 3 Wikipedia articles in rapid succession, since you made me doubt myself

fsaenzperez · February 24, 2021, 7:59pm

No, they are not supported. I think there is no need for them for what DES is targeted to. The number of columns in a predicate depends on the host Prolog system. Currently, it runs in both SICStus and SWI-Prolog. Ciao and GNU Prolog were also supported up to series 3.

fsaenzperez · February 24, 2021, 8:03pm

Maybe techniques as vertical partitioning, as found in big data systems (e.g., Hadoop/Hive) might be appropriate for efficiently dealing with large amounts of data.

fsaenzperez · February 24, 2021, 8:10pm

I agree. In DES, since there are no user compound terms, a null is internally represented as ‘$NULL’(Id), where Id is a unique identifier, so that two nulls are not equal. Moreover, comparisons need to receive a special treatment, as in:

DES> null=null.
{
}
Info: 0 tuples computed.          
DES> null\=null.
{
}
Info: 0 tuples computed.          
DES> null>null.
{
}
Info: 0 tuples computed.          
DES> null=<null.
{
}
Info: 0 tuples computed.

This tries to mimic the SQL behaviour because SQL statements are translated to Datalog.

joeblog · February 24, 2021, 8:13pm

What strikes me as an easy way out of the NULL debacle is this:

student(SID, SName, _GPA, _SizeHS),
(   apply(SID, CName, Major, _Decision)
    ;   
    \+apply(SID, CName, Major, _Decision)
).

Then the undefined variables are simply omitted.

joeblog · February 24, 2021, 8:43pm

I made the mistake of using Facebook’s GraphQL in the past. It’s utter shite nobody should touch with a hazmat and a bargepole.

I wasted hours trying to figure out Facebook’s “technology” clearly developed by a bunch of pimply 11-year-olds who can’t document zilch, only to see Facebook’s “standard” completely rewritten by lawyers with zero computer science after the Cambridge Analytica scandal (thereby breaking everyone’s code).

Long story short, friends don’t let friends use “technology” developed by Facebook, Google, Amazon, Microsoft, IBM…

fsaenzperez · February 24, 2021, 8:48pm

The following paper discusses (though discourages the use of) nulls in DES:

[Sae12a] [BibTeX] F. Sáenz-Pérez, " Outer Joins in a Deductive Database System "

Chris Date, an appreciated author in databases, discourages both nulls and duplicates.

jan · February 28, 2021, 7:46pm

Nice job. I see a lot of findall/3 with sort/2 and sort/4. This works, but breaks the uniformity of solutions being answers to the query. library(solution_sequences) avoids this, providing distinct/1,2, order_by/2, offset/2, limit/2, etc. This library was added to make translating SQL like queries easier

Note that the SWISH menu on queries allow you to quickly add some of these wrappers to the current query.

joeblog · March 1, 2021, 8:26am

I wasn’t aware of the Modify solution sequences, so many thanks for the pointer. I’m in the process of rewriting the tut to use them, So far, I translated

select distinct sName, major
from Student, Apply
where Student.sID = Apply.sID;

to

distinct([SName, Major], (student(_ID, SName, _GPA, _SizeHS), 
           apply(_ID, _CName, Major, _Decision)) ).

It took me a bit of trial and error to figure out ?Witness in distinct (?Witness, :Goal) is a list of variables.

jan · March 1, 2021, 8:50am

Witness can be any term. Well, if there are no variables or the variables do not share with the goal you only get one answer Otherwise it simply uses the variables in the term, regardless of the shape of the term.

wdiestel · March 1, 2021, 4:32pm

I would look out for examples that can’t be done with SQL, or are much more elegant in Prolog.

I had some experience in the past with heavy SQL statements. The differences that matter from my point of view includes:

writing a complex SQL statement with nesting, existence checks, various join types (inner, outer, left, right) can be very painful and consume several hours to get it right
in Prolog if you don’t get the expected results you start the tracer and look through the “joins” = “unifications” step by step. In SQL your best choice is to break the SQL into pieces and debug those seperately and then put them together again, which is quite time consuming
if your SQL is part of an application those additional hours don’t really matter, it’s more important what database is in operation serving concurrent users, which DB drivers you may use etc.
on the other hand if you try to stick together your personal information base from heterogenous sources, where others use MS Excel or MS Access, and use this ad hoc, when someone asks for a particular view/extract of data, you can’t use SQL. You never get the time to consolidate your data to a degree, where an SQL would generate a clean output. Your only choices are to use some kind of Excel thing or use Prolog. Excel is easier at the beginning and painful because of redundancies and manual errors when your data collection grows. In Prolog you need more time and thoughts in the beginning to organize your data, but later you can produce reliable output with decent effort. Normally in the end of a project you have a lot more pressure - so Prolog worked well for me in those conditions.
usecases where this worked out well for me were: dependency analysis of legacy code, identifying missing or inconsistent application documentation (architecture, components, interfaces etc.), specify network topology like cabling between devices and switches, routing, firewall rules etc.

grossdan · March 1, 2021, 6:05pm

Thank you for sharing – very interesting.

Topic		Replies	Views
Pro2sql – a lighweight prolog to SQL translator - Reply 1 General	2	319	July 2, 2023
Some examples of graph databases using Prolog Nice to know	3	2036	March 8, 2021
How many of you realize that SWI-Prolog is an excellent, genuinely excellent language? Nice to know	4	1491	December 4, 2021
What's the difference between Prolog and Datalog? Help!	3	6243	February 15, 2021
Yet another google language General	4	641	April 16, 2021

I've put an introductory Prolog tut for SQL programmers on Swish

Related topics