I’ve translated the graph database examples given by Stanford’s Jennifer Widom into Prolog on Swish, which brings the series of notes I’ve written while redoing her online course to three.
This exercise has reinforced my prejudice against the myriad NoSQL graph database systems hyped by Facebook et al. (admittedly rooted in me not bothering to learn any of them) for two reasons:
Plain vanilla SQL handles graph databases perfectly, though the WITH RECURSIVE pattern (a base-case SELECT, then UNION, then a recursive-case SELECT) isn’t easy to read or write.
Prolog shines at these problems – at least I hope my translations of Widom’s SQL examples show that.
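To make the recursive SQL pattern concrete, here is a minimal sketch in Python using the standard sqlite3 module. The edge table, the reach CTE and the sample data are made up for illustration and are not taken from Widom's course.

```python
import sqlite3

# Hypothetical sample graph: a cycle a -> b -> c -> a, a branch c -> d,
# and a separate component e -> f.
EDGES = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f")]


def reachable(edges, start):
    """Return the sorted list of nodes reachable from `start` via directed edges."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT NOT NULL, dst TEXT NOT NULL)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            -- base case: direct successors of the start node
            SELECT dst FROM edge WHERE src = ?
            -- UNION rather than UNION ALL drops duplicates, so cycles terminate
            UNION
            -- recursive case: successors of anything already reached
            SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(reachable(EDGES, "a"))
```

Because of the cycle, the start node "a" shows up in its own result. The same query in Prolog would be a two-clause predicate, although a naive version needs tabling or a visited list to survive that cycle.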
As always, suggestions for improvements are welcome.
I see two reasons for the NoSQL hype. Note that it says “NoSQL” while it seems that the underlying sentiment is NoSchema or maybe NoRelations. Either way, SQL as an interface to the database has been hated by those who don’t understand relational databases and have never really used SQL, and by those who understand relational databases and have had to use a lot of SQL.
This is the first reason: there are a lot of developers out there, working for businesses, and they would like to use a database to keep data somewhere, but they don’t want to touch SQL in particular. The relational view of the data is hated by proxy.
The other reason is purely technical: there are real-world data problems and workloads that do not work with relational databases. Too much data in total, too much data coming in, lack of structure within the data… The last one deserves attention: there might be some underlying structure, but you’d have to know it upfront; and sometimes all you can afford is to just keep all of it, at least for a while, and only look at it once you must (something breaks, or you get a bright idea and are finally able to formulate a question that you could answer using the data you’ve accumulated).
Sometimes you really don’t need ACID transactions even if you need to keep your data somehow…
I’ve been pondering the pros and cons of schemas while doing this exercise of translating SQL to Prolog, and it ties in with the “Errors considered harmful” thread in this group.
Ultimately, I’ve decided I’m on the side of schemas. A snag with dynamically typed languages is that the initial “savings” in lines of code from not needing to declare variable types are more than lost down the line by the number of times types need to be checked subsequently – and those lines of type-checking code often come at the price of bitter experience from crashes or weird results from garbage input which an SQL or other statically typed system would have simply rejected upfront.
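As a rough illustration of a schema rejecting garbage upfront, here is a minimal Python sketch using the standard sqlite3 module. The person table, its columns and the try_insert helper are invented for this example. SQLite is itself flexibly typed, so the CHECK constraint stands in for what a stricter engine would enforce from the column declaration alone.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE person (
            name TEXT NOT NULL,
            -- the CHECK rejects non-integer and negative ages at insert time
            age  INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age >= 0)
        )
        """
    )
    return conn


def try_insert(conn, name, age):
    """Return True if the row was accepted, False if the schema rejected it."""
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        # Raised for both NOT NULL and CHECK constraint violations.
        return False


if __name__ == "__main__":
    db = make_db()
    for row in [("Ann", 30), ("Bob", "thirty"), ("Cy", None), ("Di", -1)]:
        print(row, try_insert(db, *row))
```

The bad rows never reach the table, so no code downstream has to re-check that age is a non-negative integer.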
It seems to me that a lot of NoSQL boils down to extracting data from huge blobs of text structured into JSON (queried by GraphQL), XML (queried by XPath), or whatever is currently in fashion.
Dabbling with GraphQL and XPath has mainly helped me understand this Perlisism:
“The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.”
My interpretation of that is that using blobs of text as a “universal compound type” leads to lots of headaches in trying to decipher other people’s (and sometimes your own) code down the line.
I have second-hand knowledge of two large projects that could not afford an SQL/relational database. One is at a company in the Frankfurt am Main banking business that just couldn’t figure out how to handle the number of transactions coming in with a relational database. Their product is a graph database implemented in Java. The other is a retail logistics company here in Finland (Relex); they are using a column-based in-memory database, also implemented in Java.
Which actually raises the question, why not Prolog instead?