New versions of packs bio_db (v4.3) and bio_db_repo (v23.10.04) are now available on the pack server.
- bio_db
is a pack serving high-quality biological data. It contains the table definitions and code
for serving the data. The data reside in pack(bio_db_repo).
Currently there are 109 data tables serving 74,147,583 records.
The pack contains relations/data-tables of biological entities in human, pig, mouse and chicken
from a variety of databases.
v4.3 includes data from the VGNC database and introduces cross-species tables (mult token).
There was a major, and painful, harmonisation of build scripts.
There is a new script to build bio_db_repo on a SLURM cluster.
- bio_db_repo
contains fact bases for the bio_db tables. The current .tgz is about 0.5Gb.
It is not necessary to install the whole pack if you only need a small part of the tables.
Instead, bio_db will interactively install only the tables needed on demand.
The data is published about twice a year.
This version was built on an HPC cluster.
- bio_analytics
is a pack that implements some bioinformatics tasks based on tables in pack(bio_db)
and relies heavily on pack(Real) to pass data to underlying R functions.
?- use_module(library(bio_db)).
or
?- use_module(library(lib)).
?- lib(bio_db).
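If the packs are not installed yet, they can be fetched from the pack server before loading. The following is a minimal sketch, assuming network access and SWI-Prolog's built-in pack_install/1. The call to bio_db_version/2 is an assumption based on earlier bio_db releases and should be checked against the installed version.

```prolog
% Install the code pack from the pack server (prompts for confirmation).
?- pack_install(bio_db).

% Optional: install all data up front (about 0.5Gb) instead of
% letting bio_db fetch individual tables on demand.
?- pack_install(bio_db_repo).

% Load the library and report which version was loaded.
% bio_db_version/2 is assumed here; verify that your installed version exports it.
?- use_module(library(bio_db)).
?- bio_db_version(Vers, Date).
```

Skipping the bio_db_repo step keeps the initial download small, since tables are then installed interactively the first time they are used.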
links:
(ICLP 2019 paper): [1909.08254] Advances in Big Data Bio Analytics
http://stoics.org.uk/~nicos/packs/sware/bio_db
GitHub - nicos-angelopoulos/bio_db: Access, use and manage big, biological datasets.
http://stoics.org.uk/~nicos/packs/sware/bio_analytics