Music Notation Grammar

Hello,

I have this idea of using Prolog together with CLP over the reals to write a pure, bidirectional grammar of modern music notation.

The idea is to write a single code base able to do both:

  • music notation typesetting (what MuseScore, LilyPond, Sibelius or Finale do)
  • Optical Music Recognition (OMR), the inverse process: recovering the original music score (e.g. as a MusicXML file) from a PDF or an image of a music score.

I have made a proof of concept here: GitHub - kwon-young/music: Music Notation Grammar

Although the project is still very rudimentary (no documentation, primitive workflow, very incomplete music notation support), I wanted to pick the brains of more Prolog-minded people and see whether anyone has already attempted something similar.
Or simply: what do you think?

Also, if anyone knows how to parse XML in Prolog given an XSD/XSL schema, I would welcome some help (this is for parsing MusicXML files).

Very interesting, I especially like the idea of composing clpBNR with R-trees.
Parsing/generating 2D data is not easy, even with sophisticated libraries like OpenCV…

I don’t understand the need for XSD/XSL (isn’t library(xpath) enough?), but I have never investigated MusicXML.

> Parsing/generating 2D data is not easy, even with sophisticated libraries like OpenCV…

Well, the idea here is to do all the low-level image processing steps beforehand and then apply Prolog only to geometric structures such as bounding boxes, segments or points.
Of course, symbol detection is largely a TODO for the future :)
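
To make the "Prolog only on geometric structures" idea concrete, here is a minimal sketch of a relation between two bounding boxes that can be run in both directions. It is an illustration only: it uses library(clpfd) over integer pixel coordinates as a stand-in for CLP over the reals, and the `box/4` term and `right_of/3` predicate are hypothetical names, not taken from the project.

```prolog
:- use_module(library(clpfd)).

% box(Left, Top, Right, Bottom): an axis-aligned bounding box in
% integer pixel coordinates (illustrative representation only).

% right_of(?Gap, ?BoxA, ?BoxB): BoxB starts Gap pixels to the right of
% BoxA, and both boxes share the same vertical extent.
right_of(Gap, box(_, Top, RightA, Bottom), box(LeftB, Top, _, Bottom)) :-
    LeftB #= RightA + Gap.
```

In the typesetting direction, `right_of(5, box(0,0,10,20), box(L,T,_,B))` places the second box (`L = 15, T = 0, B = 20`). In the recognition direction, `right_of(Gap, box(0,0,10,20), box(15,0,30,20))` recovers `Gap = 5` from two detected boxes.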

> I don’t understand the need for XSD/XSL (isn’t library(xpath) enough?), but I have never investigated MusicXML.

I think I read somewhere that a schema lets the parser convert typed attributes, such as numbers, automatically.
Currently, I have to use the atom_number/2 predicate manually, together with coroutines, to parse numbers in XML files correctly.
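
A minimal sketch of that kind of workaround, assuming SWI-Prolog's when/2 for coroutining. The name `attr_number/2` is hypothetical and not taken from the project; the real code may be organised differently.

```prolog
% attr_number(?Atom, ?Number): relate an XML attribute value (an atom)
% to a number. The conversion is suspended until at least one side is
% bound, so the goal can be called before either value is known.
% Hypothetical helper name, for illustration only.
attr_number(Atom, Number) :-
    when(( nonvar(Atom) ; nonvar(Number) ),
         atom_number(Atom, Number)).
```

With this wrapper, `attr_number(A, N), A = '42'` yields `N = 42`, and `attr_number(A, N), N = 42` yields `A = '42'`, whereas a direct call to atom_number/2 with both arguments unbound raises an instantiation error.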

It seems library(xpath) could handle this problem, e.g. from the docs:

 xpath(DOM, //div(@width(number)=W, @height(number)=H), Div)

xpath, like atom_number/2, can’t deal with non-ground terms; it’s not logically pure.

By using an XSD schema, I can parse an XML file at the beginning of the grammar and directly have numbers in the parsed tree, or generate an XML file at the end of my grammar from a tree containing numbers.

The problem with using xpath is that it needs ground values, whereas during the execution of the grammar I use CLP to constrain integer variables, which only become ground once enough constraints have accumulated, ideally by the end of the grammar execution.
So during the grammar, I don’t know whether a number attribute is ground or not.
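
The situation can be sketched as follows. This is an illustration under stated assumptions: library(clpfd) stands in for whichever CLP library the grammar really uses, and `attr_number/2` and `demo/1` are hypothetical names.

```prolog
:- use_module(library(clpfd)).

% Delayed atom <-> number conversion (hypothetical helper).
attr_number(Atom, Number) :-
    when(( nonvar(Atom) ; nonvar(Number) ),
         atom_number(Atom, Number)).

% demo(-Attr): the attribute value only becomes known once the
% constraints have determined the number.
demo(Attr) :-
    attr_number(Attr, Width),   % suspended: neither side is known yet
    Width #= Left + 10,         % Width is constrained, but not ground
    Left = 5.                   % Width becomes 15, waking the conversion
```

Calling `demo(Attr)` gives `Attr = '15'`: the conversion goal is posted first and only runs when constraint propagation grounds `Width`.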

Hopefully useful: Most efficient DCG for text parsing? - #5 by jan

Thanks, that comment is a very good summary of the different choices for making a bi-directional grammar.

In my case, I chose to use delays, in order to really have a single grammar for generation and parsing.
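
As a toy illustration of one grammar serving both directions, here is a small DCG in which the arithmetic is posted as constraints rather than evaluated eagerly. It is a sketch, not the project's grammar: library(clpfd) constraints play the role of the delays, and the `note/1` tokens and `notes//2` nonterminal are hypothetical.

```prolog
:- use_module(library(clpfd)).

% notes(?Durations, ?Total)//: a sequence of note(D) tokens whose
% positive integer durations sum to Total.
notes([], 0) --> [].
notes([D|Ds], Total) -->
    [note(D)],
    { D #> 0, Rest #>= 0, Total #= D + Rest },  % posted before Rest is known
    notes(Ds, Rest).
```

Parsing: `phrase(notes(Ds, T), [note(1), note(2)])` gives `Ds = [1,2], T = 3`. Generating: `phrase(notes([1,2], T), Tokens)` gives `Tokens = [note(1), note(2)], T = 3`. The same clauses are used in both cases because `Total #= D + Rest` does not require its arguments to be instantiated when it is called.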

The idea is that I ignore any performance problems until I find a music score that I cannot parse or generate in a reasonable time :)
Also, the grammar is not meant to be used interactively, so I don’t care about syntax errors.

But developing such a grammar is very cumbersome, since it tends to backtrack forever whenever I make an error…