s(CASP) biological example

I put together this notebook which demonstrates how haplotype inference can be done in s(CASP). I think it is very elegant.

https://swish.swi-prolog.org/?code=https://raw.githubusercontent.com/samwalrus/scasp_haplotype/master/swish.swinb

A couple of questions, would the query ?- ? genotype(1,[2,1,2]),genotype(2,[1,2,1]). always find the model with the smallest number of haplotype/1 literals first? If not how do I use scasp_model/2 to return the model for me to count the number of haplotype/1 literals in the model?

6 Likes

FYI for others.


pubmed.ncbi.nlm.nih.gov

Haplotype inference

(ref)

The information carried by combination of alleles on the same chromosome, called haplotypes, is of crucial interest in several fields of modern genetics as population genetics or association studies. However, this information is usually lost by sequencing and needs, therefore, to be recovered by inference.


www.genome.gov

haplotype

(ref)

A haplotype is a set of DNA variations, or polymorphisms, that tend to be inherited together. A haplotype can refer to a combination of alleles or to a set of single nucleotide polymorphisms (SNPs) found on the same chromosome.

allele

(ref)

An allele is one of two or more versions of a gene. An individual inherits two alleles for each gene, one from each parent. If the two alleles are the same, the individual is homozygous for that gene. If the alleles are different, the individual is heterozygous. Though the term allele was originally used to describe variation among genes, it now also refers to variation among non-coding DNA sequences.


Wikipedia

(ref)

haplotype

A haplotype (haploid genotype) is a group of alleles in an organism that are inherited together from a single parent

(ref)

allele

An allele is one of two, or more, forms of a given gene variant.

If you don’t mind answering this question. I know I would find it helpful and maybe others would too.

The code you posted

%conflation.
conflation(2,0,1). %Heterozygous
conflation(2,1,0). %Heterozygous
conflation(0,0,0). %Homozygous
conflation(1,1,1). %Homozygous.

conflation_seq([],[],[]).
conflation_seq([G|Gs],[HA1|HAs],[HB1|HBs]):-
       conflation(G,HA1,HB1),
       conflation_seq(Gs,HAs,HBs).

genotype(_G,Genotype):-
    haplotype(Hap1),
    haplotype(Hap2),
    conflation_seq(Genotype,Hap1,Hap2).

haplotype(_X).

looks like standard Prolog. Except for :- use_module(library(scasp)). I would not have guessed it being s(CASP).

Can you demonstrate what the answer would be with normal Prolog and how using s(CASP) changes the results. :slightly_smiling_face:

Also, can conflation_seq/3 be changed out for one of the predicates from library(apply), or does using s(CASP) cause problems?

1 Like

You can query with prolog

?- genotype(1,[2,1,2]),genotype(2,[1,2,1]).
true.

Not very helpful.

But query with scasp (note the ? op):

?- ? genotype(1,[2,1,2]),genotype(2,[1,2,1]).
#### s(CASP) model

* haplotype holds for [0, 1, 0]
* haplotype holds for [1, 0, 1]
* haplotype holds for [1, 1, 1]
* genotype holds for 1, and [2, 1, 2]
* genotype holds for 2, and [1, 2, 1]
* conflation holds for 1, 1, and 1
* conflation holds for 2, 0, and 1
* conflation_seq holds for [], [], and []
* conflation_seq holds for [1], [1], and [1]
* conflation_seq holds for [1, 2], [1, 0], and [1, 1]
* conflation_seq holds for [1, 2, 1], [1, 0, 1], and [1, 1, 1]
* conflation_seq holds for [2], [0], and [1]
* conflation_seq holds for [2, 1], [0, 1], and [1, 1]
* conflation_seq holds for [2, 1, 2], [0, 1, 0], and [1, 1, 1]

#### s(CASP) justification

Expand All +1 -1 Collapse All

* genotype holds for 1, and [2, 1, 2], because

  * haplotype holds for [0, 1, 0], and

  * haplotype holds for [1, 1, 1], and

  * conflation_seq holds for [2, 1, 2], [0, 1, 0], and [1, 1, 1], because

Also, can conflation_seq/3 be changed out for one of the predicates from library(apply), or does using s(CASP) cause problems?

Not sure, I had to redefine some basic predicates before so just keeping it simple.

3 Likes

As is, s(CASP) doesn’t allow high-order calling, e.g., call/N. I don’t know how feasible it would be to add that, neither how useful. s(CASP) only gets interesting when negation is involved. Without negation its semantics is no different from Prolog and it is not rocket science to build up a justification and model from standard Prolog SLD resolution.

To deal with negation, s(CASP) uses constructive negation. From the positive rules it derives negative rules that do not work using NAF (Negation As Failure), but build up a model of the world where some goal is known to be not true. Constraints come handy there. This allows the system to derive from p(a) that not p(X) may be true for any X not equal to a. Where ASP easily explodes during the grounding phase, s(CASP) easily explodes in the complexity of (notably) the negative world. Well, it works differently, but it is surprisingly easy to create what seems to be rather simple programs that perform unexpectedly poor.

1 Like

(off-topic) you just summarized my whole career in one sentence

3 Likes

Yes the idea here would be to try and add constraints on what haplotypes are admitable which might use negation… but complexity will prob make it impractical for real tasks.

Still trying to understand this though…?

The s(CASP) model is just a list of terms. You can use Prolog to inspect it. There is no magic there. Possibly we can define some more abstract operations on them. I do not see much more than “give me all model terms about p/1)”, which would return p(X), not p(X), -p(X) and not -p(X) instances.

I am just confused how to actually get that list… ?

?- ? genotype(1,[2,1,2]),genotype(2,[1,2,1]),scasp_model(M).
o permission to scasp procedure `scasp_embed:scasp_model/1'
In:
  [15] throw(error(permission_error(scasp,procedure,...),_1806))
  [10] scasp_dyncall:body_calls_pi((genotype(1,...),...,...),'9cbd6aed-557a-48e7-b8ae-cdb040dce973',_1866) at /scratch/swish/src/swish/pack/sCASP/prolog/scasp/dyncall.pl:166
   [9] findall_loop(_1922,scasp_dyncall:body_calls_pi(...,'9cbd6aed-557a-48e7-b8ae-cdb040dce973',_1942),_1926,[]) at /home/swish/lib/swipl/boot/bags.pl:99
   [8] setup_call_catcher_cleanup('$bags':'$new_findall_bag','$bags':findall_loop(_2006,...,_2010,[]),_1988,'$bags':'$destroy_findall_bag') at /home/swish/lib/swipl/boot/init.pl:646
   [4] scasp_dyncall:query_callees((...,...),_2068) at /scratch/swish/src/swish/pack/sCASP/prolog/scasp/dyncall.pl:161
   [3] scasp_dyncall:scasp_query_clauses((...,...),_2132) at /scratch/swish/src/swish/pack/sCASP/prolog/scasp/dyncall.pl:113
   [2] scasp_dyncall:scasp((...,...)) at /scratch/swish/src/swish/pack/sCASP/prolog/scasp/dyncall.pl:67
   [1] setup_call_catcher_cleanup(scasp:set_prolog_flag(scasp_show_justification,false),scasp:scasp(...),_2248,scasp:set_prolog_flag(scasp_show_justification,unicode)) at /home/swish/lib/swipl/boot/init.pl:646

Note: some frames are missing due to last-call optimization.
Re-run your program in debug mode (:- debug.) to get more detail.

(In swish).

You are hit by operator precedence. ? is a high priority operator, so ? a,b. is executed as a s(CASP) conjunction. scasp_model/1 however is Prolog, so you need to write

?- (? a,b), scasp_model(M).
2 Likes

ah ok :slight_smile: thanks!

I would say this is abduction, but there is also abducible/1 in scasp but I am not sure how this is meant to be used.

See also :
example from the book simply logical:

and a simple ilp with negation example.

#abducible p(X).

gets implemented in s(CASP) as

p(X) :- not -p(X).
-p(X): not p(X).

if I’m not mistaken. So it creates two different “worlds,” one in which p(X) holds, and one in which it is known not to hold.

1 Like