It seems that you have the bad fortune of asking a question that is simple enough to play with but complex enough for us to have fun and demonstrate some useful code.
Here is my rendition of a possible solution. You don't have to use it. At first glance it may look like some crazy guy wrote an overly complex set of DCGs, but after using DCGs to parse hundreds of files, this is the style I ended up with.
Details
Directory: C:/Users/Groot/Documents
- Change as needed.
File: data.sql
INSERT INTO `CITATIONS` VALUES ('1','0006-2944','1975 Jun','1975-6-1',1975),('10','1873-2968','1975 Sep 01','1975-9-1',1975),('100','0547-6844','1975','1975-1-1',1975),('1000','0264-6021','1975 Sep','1975-9-1',1975),('10000','0006-3002','1976 Sep 28','1976-9-28',1976),('100000','0160-3450','1978 Sep','1978-9-1',1978),('1000000','0006-3363','1976 Dec','1976-12-1',1976);
INSERT INTO `CITATIONS` VALUES ('2','0006-2945','1975 Jun','1975-6-1',1975),('20','1873-2969','1975 Sep 01','1975-9-1',1975),('200','0547-6845','1975','1975-1-1',1975),('2000','0264-6022','1975 Sep','1975-9-1',1975),('20000','0006-3003','1976 Sep 28','1976-9-28',1976),('200000','0160-3451','1978 Sep','1978-9-1',1978),('2000000','0006-3364','1976 Dec','1976-12-1',1976);
In the example data above, the second line is a copy of the first with the PMID and ISSN of each citation changed.
File: persist_citations.pl
Note: The library(persistency) code is not needed to create the Quick Load File. It was added because you mentioned it in your question, there are not many working examples of library(persistency) code, and it was easy to include.
:- module(persist_citations,[
        main/0
    ]).

:- use_module(library(persistency)).

:- set_prolog_flag(double_quotes, codes).
:- set_prolog_flag(back_quotes,string).

:- debug(citation).

:- persistent
    citation(pmid:number,issn:atom,edat:float,pyear:number).

% :- initialization(main).

file(input,'data.sql').

main :-
    db_attach('citation.journal', []),
    load_and_persist_data.

add_citation(Pmid, Issn, Edat, Pyear) :-
    debug(citation,'Pmid: ~w, Issn: ~w, Edat: ~w, Pyear: ~w',[Pmid,Issn,Edat,Pyear]),
    (
        (
            is_of_type(number, Pmid),
            is_of_type(atom, Issn),
            is_of_type(float, Edat),
            is_of_type(number, Pyear)
        )
    ->
        (
            citation(Pmid, Issn, Edat, Pyear), !
        ;
            assert_citation(Pmid, Issn, Edat, Pyear)
        )
    ;
        debug(citation,'Invalid citation data!~n',[])
    ).

load_and_persist_data :-
    setup_call_cleanup(
        (
            file(input,Filename),
            open(Filename,read,Stream)
        ),
        (
            % set_stream(Stream, newline(posix)),  % If you need to preserve cr lf endings.
            read_stream_to_codes(Stream,Codes),
            DCG = 'citation_lines+',
            phrase(DCG,Codes,[])
        ),
        close(Stream)
    ).

'citation_lines+' -->
    citation_line, !,
    'citation_lines*'.

'citation_lines*' -->
    citation_line, !,
    'citation_lines*'.
'citation_lines*' --> [].

citation_line -->
    "INSERT INTO `CITATIONS` VALUES ",
    'citations+',
    semicolon,
    'eol?'.

'citations+' -->
    citation, !,
    'citations*'.

'citations*' -->
    comma,
    citation, !,
    'citations*'.
'citations*' --> [].

citation -->
    open_paren,
    pmid(Pmid),
    comma,
    issn(Issn),
    comma,
    date_unknown,
    comma,
    edat(Edat),
    comma,
    pyear(Pyear),
    close_paren,
    { add_citation(Pmid, Issn, Edat, Pyear) }.

pmid(Number) -->
    single_quote,
    'digit+'(Digit_codes,[]),
    single_quote,
    { number_codes(Number,Digit_codes) }.

issn(Atom) -->
    single_quote,
    'digit+'(T0,T1),
    dash(T1,T2),
    'digit+'(T2,[]),
    single_quote,
    { atom_codes(Atom,T0) }.

date_unknown -->
    single_quote,
    'digit+',
    'month_3_letter?',
    'date?',
    single_quote.

'month_3_letter?' --> sp, month_3_letter, !.
'month_3_letter?' --> [].

'date?' --> sp, 'digit+', !.
'date?' --> [].

edat(TimeStamp) -->
    single_quote,
    'digit+'(Year_codes,[]),
    dash,
    'digit+'(Month_codes,[]),
    dash,
    'digit+'(Day_codes,[]),
    single_quote,
    {
        number_codes(Year,Year_codes),
        number_codes(Month,Month_codes),
        number_codes(Day,Day_codes),
        date_time_stamp(date(Year,Month,Day,0,0,0,0,-,-), TimeStamp)
    }.

pyear(Number) -->
    'digit+'(T0,[]),
    { number_codes(Number,T0) }.

% Recognizers
% If the sequence is successfully recognized the side effect is that the sequence is removed from the input.
month_3_letter --> month_3_letter(_,_).
'digit+' --> 'digit+'(_,_).
'digit*' --> 'digit*'(_,_).
digit --> digit(_,_).
'wsp+' --> 'wsp+'(_,_).
'wsp*' --> 'wsp*'(_,_).
wsp --> wsp(_,_).
'cr?' --> 'cr?'(_,_).
'eol?' --> 'eol?'(_,_).
htab --> htab(_,_).
cr --> cr(_,_).
lf --> lf(_,_).
sp --> sp(_,_).
dash --> dash(_,_).
semicolon --> semicolon(_,_).
comma --> comma(_,_).
open_paren --> open_paren(_,_).
close_paren --> close_paren(_,_).
single_quote --> single_quote(_,_).
month_3_letter([0'J,0'a,0'n|T],T) --> "Jan", !.
month_3_letter([0'F,0'e,0'b|T],T) --> "Feb", !.
month_3_letter([0'M,0'a,0'r|T],T) --> "Mar", !.
month_3_letter([0'A,0'p,0'r|T],T) --> "Apr", !.
month_3_letter([0'M,0'a,0'y|T],T) --> "May", !.
month_3_letter([0'J,0'u,0'n|T],T) --> "Jun", !.
month_3_letter([0'J,0'u,0'l|T],T) --> "Jul", !.
month_3_letter([0'A,0'u,0'g|T],T) --> "Aug", !.
month_3_letter([0'S,0'e,0'p|T],T) --> "Sep", !.
month_3_letter([0'O,0'c,0't|T],T) --> "Oct", !.
month_3_letter([0'N,0'o,0'v|T],T) --> "Nov", !.
month_3_letter([0'D,0'e,0'c|T],T) --> "Dec".
'digit+'(T0,T) -->
    digit(T0,T1), !,
    'digit*'(T1,T).

'digit*'(T0,T) -->
    digit(T0,T1), !,
    'digit*'(T1,T).
'digit*'(T,T) --> [].

digit([C|T],T) -->
    [C],
    { between(0'0,0'9,C) }.

'wsp+'(T0,T) -->
    wsp(T0,T1), !,
    'wsp*'(T1,T).

'wsp*'(T0,T) -->
    wsp(T0,T1), !,
    'wsp*'(T1,T).
'wsp*'(T,T) --> [].

wsp(T0,T) -->
    sp(T0,T), !.
wsp(T0,T) -->
    htab(T0,T).

'cr?'(T0,T) --> cr(T0,T), !.
'cr?'(T,T) --> [].

'eol?'(T0,T) -->
    'cr?'(T0,T1),
    lf(T1,T), !.
'eol?'(T,T) --> [].

htab([0x09|T],T) --> [0x09].
cr([0x0D|T],T) --> [0x0D].
lf([0x0A|T],T) --> [0x0A].
sp([0x20|T],T) --> [0x20].
dash([0'-|T],T) --> "-".
semicolon([0';|T],T) --> ";".
comma([0',|T],T) --> ",".
open_paren([0'(|T],T) --> "(".
close_paren([0')|T],T) --> ")".
single_quote([0''|T],T) --> "'".
If you have questions, please ask; it would take pages just to explain why the code is written like this. I know this is not beginner DCG code, and it does lots of subtle things that will not be easily understood.
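One of those subtle things is the pair of extra arguments on recognizers such as `'digit+'(T0,T)`. As far as the code shows, they form a difference list that records the codes a recognizer consumes. Threading `T0`, `T1`, `T2` through several recognizers and closing the last one with `[]` gives back everything matched in between. Here is a minimal stand-alone sketch of that idea. The recognizers are copied from the code above, except that `dash` uses a code list so the sketch does not depend on the `double_quotes` flag; `issn_demo/1` is a name made up for this illustration.

```prolog
% Each recognizer consumes input AND records what it consumed
% in the difference list formed by its two extra arguments.
'digit+'(T0,T) -->
    digit(T0,T1), !,
    'digit*'(T1,T).

'digit*'(T0,T) -->
    digit(T0,T1), !,
    'digit*'(T1,T).
'digit*'(T,T) --> [].

digit([C|T],T) -->
    [C],
    { between(0'0,0'9,C) }.

dash([0'-|T],T) --> [0'-].

% T0 is the start of the collected codes. Each recognizer extends
% the open tail, and the final [] closes the list.
issn(Atom) -->
    'digit+'(T0,T1),
    dash(T1,T2),
    'digit+'(T2,[]),
    { atom_codes(Atom,T0) }.

% issn_demo/1 is a hypothetical helper for this example only.
issn_demo(Atom) :-
    atom_codes('0006-2944',Codes),
    phrase(issn(Atom),Codes).
```

Calling `issn_demo(A)` should bind `A = '0006-2944'`, the digits and the dash collected in a single pass over the input.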
File: citation.journal
- This is where the data is persisted after running the code.
created(1601723402.391526).
assert(citation(1,'0006-2944',170812800.0,1975)).
assert(citation(10,'1873-2968',178761600.0,1975)).
assert(citation(100,'0547-6844',157766400.0,1975)).
assert(citation(1000,'0264-6021',178761600.0,1975)).
assert(citation(10000,'0006-3002',212716800.0,1976)).
assert(citation(100000,'0160-3450',273456000.0,1978)).
assert(citation(1000000,'0006-3363',218246400.0,1976)).
assert(citation(2,'0006-2945',170812800.0,1975)).
assert(citation(20,'1873-2969',178761600.0,1975)).
assert(citation(200,'0547-6845',157766400.0,1975)).
assert(citation(2000,'0264-6022',178761600.0,1975)).
assert(citation(20000,'0006-3003',212716800.0,1976)).
assert(citation(200000,'0160-3451',273456000.0,1978)).
assert(citation(2000000,'0006-3364',218246400.0,1976)).
Example run with debug enabled.
?- working_directory(_,'C:/Users/Groot/Documents').
true.
?- ['persist_citations.pl'].
true.
?- main.
% Pmid: 1, Issn: 0006-2944, Edat: 170812800.0, Pyear: 1975
% Pmid: 10, Issn: 1873-2968, Edat: 178761600.0, Pyear: 1975
% Pmid: 100, Issn: 0547-6844, Edat: 157766400.0, Pyear: 1975
% Pmid: 1000, Issn: 0264-6021, Edat: 178761600.0, Pyear: 1975
% Pmid: 10000, Issn: 0006-3002, Edat: 212716800.0, Pyear: 1976
% Pmid: 100000, Issn: 0160-3450, Edat: 273456000.0, Pyear: 1978
% Pmid: 1000000, Issn: 0006-3363, Edat: 218246400.0, Pyear: 1976
% Pmid: 2, Issn: 0006-2945, Edat: 170812800.0, Pyear: 1975
% Pmid: 20, Issn: 1873-2969, Edat: 178761600.0, Pyear: 1975
% Pmid: 200, Issn: 0547-6845, Edat: 157766400.0, Pyear: 1975
% Pmid: 2000, Issn: 0264-6022, Edat: 178761600.0, Pyear: 1975
% Pmid: 20000, Issn: 0006-3003, Edat: 212716800.0, Pyear: 1976
% Pmid: 200000, Issn: 0160-3451, Edat: 273456000.0, Pyear: 1978
% Pmid: 2000000, Issn: 0006-3364, Edat: 218246400.0, Pyear: 1976
true.
?- halt.
Example run with `:- debug(citation).` commented out.
?- working_directory(_,'C:/Users/Groot/Documents').
true.
?- ['persist_citations.pl'].
true.
?- main.
true.
?- halt.
Note: If you have not worked with library(persistency) before, you need to run `halt` at the end so that the persistency file will be updated and closed. Until `halt`, the file will be open but will not contain all of the data.
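If quitting Prolog is inconvenient, library(persistency) also exports `db_sync/1`. As I read its documentation, `db_sync(close)` closes the journal file without detaching, which should leave the file complete on disk while the session continues. I have not used this with the citation code above, so treat the following as a sketch only. The predicate `note/1` and the helper `journal_demo/1` are invented for this example.

```prolog
:- use_module(library(persistency)).

% note/1 and journal_demo/1 exist only for this illustration.
:- persistent
    note(text:atom).

journal_demo(File) :-
    db_attach(File,[]),
    assert_note(hello),
    % Close the journal now instead of waiting for halt/0.
    db_sync(close).
```

Because the argument of `db_sync/1` is module sensitive, the equivalent call for the code in this answer would presumably be `db_sync(persist_citations:close)` from the top level.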
Example run that creates the Quick Load File.
?- working_directory(_,'C:/Users/Groot/Documents').
true.
?- ['persist_citations.pl'].
true.
?- main.
true.
?- tell('citation_facts.pl'),listing(persist_citations:citation/4),told.
With a text editor open `citation_facts.pl` and remove the line `:- dynamic citation/4.` I don't know how to create a listing of a dynamic predicate without that appearing.
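One way that might avoid the manual edit is to write the facts with `portray_clause/2` instead of `listing/1`, since `portray_clause/2` writes only the clause it is given. This is an alternative sketch, not what was run above. `save_citation_facts/1` is a hypothetical helper name, and it assumes the facts are available as `persist_citations:citation/4`, for example after `main` has run.

```prolog
% save_citation_facts(+File) is a hypothetical helper, not part of
% persist_citations.pl. It writes each citation/4 fact as a plain
% clause, so no ':- dynamic' directive ends up in the file.
save_citation_facts(File) :-
    setup_call_cleanup(
        open(File,write,Out),
        forall(
            persist_citations:citation(Pmid,Issn,Edat,Pyear),
            portray_clause(Out,citation(Pmid,Issn,Edat,Pyear))
        ),
        close(Out)
    ).
```

With this helper, `save_citation_facts('citation_facts.pl')` would take the place of the `tell/listing/told` line, and the resulting file could go to `qcompile/1` without editing.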
?- qcompile('citation_facts.pl').
true.
Check to make sure `citation_facts.qlf` was created; it is a binary file so not worth trying to read in an editor.
?- halt.
Start up a new SWI-Prolog instance.
?- working_directory(_,'C:/Users/Groot/Documents').
true.
?- ['citation_facts.qlf'].
true.
?- citation(A,B,C,D).
A = 1,
B = '0006-2944',
C = 170812800.0,
D = 1975 ;
A = 10,
B = '1873-2968',
C = 178761600.0,
D = 1975 ;
A = 100,
B = '0547-6844',
C = 157766400.0,
D = 1975 ;
...
?- citation(Pmid,Issn,Edat_timestamp,Pyear),stamp_date_time(Edat_timestamp,Edat,0).
Pmid = 1,
Issn = '0006-2944',
Edat_timestamp = 170812800.0,
Pyear = 1975,
Edat = date(1975, 6, 1, 0, 0, 0.0, 0, -, -) ;
Pmid = 10,
Issn = '1873-2968',
Edat_timestamp = 178761600.0,
Pyear = 1975,
Edat = date(1975, 9, 1, 0, 0, 0.0, 0, -, -) ;
...
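If a printable date is wanted rather than a date/9 term, `format_time/3` can render one. A small sketch follows. `edat_text/2` is a made-up helper name, and it passes `'UTC'` to `stamp_date_time/3` instead of the `0` used in the query above, so that the result does not depend on the local time zone.

```prolog
% edat_text(+Timestamp,-Text) is a hypothetical helper.
% Text is the date of Timestamp in UTC, formatted as YYYY-MM-DD.
edat_text(Timestamp,Text) :-
    stamp_date_time(Timestamp,DateTime,'UTC'),
    format_time(atom(Text),'%Y-%m-%d',DateTime).
```

For the first citation, `edat_text(170812800.0,T)` should give `T = '1975-06-01'`, matching the `'1975-6-1'` value in data.sql.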