What is OpenBiodiv?
What data is in OpenBiodiv?
What knowledge can be obtained from OpenBiodiv?
How to find information about biodiversity in OpenBiodiv?
General search
User applications
Application Programming Interface (API)
OpenBiodiv is a biodiversity database containing knowledge extracted from scientific literature, built as an Open Biodiversity Knowledge Management System (OBKMS). OpenBiodiv consists of a knowledge graph, a Linked Open Dataset, an ontology (OpenBiodiv-O) and a website. The knowledge graph contains semantic statements about authors, articles, treatments, taxonomic names, examined materials, institutions, genomic sequences, habitats, localities, and more. Each entity in the Linked Open Dataset has its own globally unique, persistent and resolvable identifier (GUPRI).
Data is modelled according to the OpenBiodiv-O ontology, which integrates semantic resource types from recognised biodiversity and publishing ontologies with biodiversity-specific resource types that had not been modelled before.
The aim of OpenBiodiv is to make biodiversity knowledge easily findable and accessible by both humans and machines. OpenBiodiv offers several user-oriented applications, a RESTful API, and a SPARQL endpoint where experienced users can write complex queries.
OpenBiodiv gathers knowledge from two sources: semantically enhanced biodiversity-related articles published in Pensoft’s journals (e.g. ZooKeys, PhytoKeys, MycoKeys, Biodiversity Data Journal), and taxonomic treatments harvested and semantically annotated by Plazi from journals of other publishers (e.g. Zootaxa, European Journal of Taxonomy). It exposes the links between and within articles.
OpenBiodiv offers a broad biodiversity-related querying system that answers open-ended queries based on the data. It can be used to obtain new knowledge about taxa, scientific articles and their subsections, examined materials and their metadata, localities, sequences and much more. OpenBiodiv can discover hidden links within biodiversity data and can guide research into how data is used in scholarly articles.
The system can return information, with a relevant visual representation, about any one or a combination of its major data classes within a certain scope and semantic context.
Data classes are:
Examples of data properties are:
Article metadata and sections are:
Semantic classes are article sections grouped by topic:
Using OpenBiodiv one can answer complex questions like these (see Sample SPARQL queries for more detail):
There are four approaches for exploration of data stored in the graph:
The general search is available on the homepage of OpenBiodiv and allows exploration of the knowledge graph based on key terms such as taxonomic names, persons and articles. The user only needs to type the name of an entity of interest belonging to one of these types, and the system finds information about it. Misspelling the name is not a problem, because the Elasticsearch index supports fuzzy matching: a term still matches when it is within a maximum allowed edit distance of the indexed name. The search also automatically determines the semantic type of the searched entity.
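The idea of matching within a maximum edit distance can be illustrated with a small sketch. This is not OpenBiodiv's or Elasticsearch's implementation (Elasticsearch's fuzzy matching differs in details such as how transpositions are counted); the function names, the sample names and the threshold of 2 are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def fuzzy_lookup(term, names, max_edits=2):
    """Return the names within max_edits of term, closest first.
    max_edits is an illustrative threshold, not a documented setting."""
    term = term.lower()
    scored = [(edit_distance(term, name.lower()), name) for name in names]
    return [name for distance, name in sorted(scored) if distance <= max_edits]


# Hypothetical index content; the search term contains a typo ("axiridis").
NAMES = ["Harmonia axyridis", "Harmonia quadripunctata", "Apis mellifera"]
matches = fuzzy_lookup("Harmonia axiridis", NAMES)
print(matches)
```

Here the misspelled term is one substitution away from "Harmonia axyridis", so that name is returned while the other two are rejected.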
This application is designed to answer the following general question: "Find me information about an entity mentioned within a certain article section in OpenBiodiv." The results show the number of mentions of this entity (e.g. a taxonomic name) in each section of interest (e.g. Titles (X), Abstracts (Y), Treatments (Z), etc.), aggregated by article.
By clicking on the hyperlinked number, the user is redirected to the article section where that entity is mentioned.
A simple plot, for example one comparing the Y titles and Z abstracts in which Element X is mentioned, illustrates how the element is distributed across the searched sections.
In addition to being visualised in the web page, the results can be exported to a CSV file for further use.
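The counting and export steps described above can be sketched as follows. The mention records, the section labels and the column names are hypothetical; the actual export format of the application may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical mention records for one entity: (article identifier, section type).
MENTIONS = [
    ("article-1", "Title"),
    ("article-1", "Abstract"),
    ("article-2", "Abstract"),
    ("article-2", "Treatment"),
    ("article-2", "Treatment"),
]


def count_by_section(mentions):
    """Count how many times the entity is mentioned in each section type."""
    return Counter(section for _article, section in mentions)


def to_csv(counts):
    """Serialise the per-section counts as CSV text, sorted by section name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["section", "mentions"])  # assumed column names
    for section, n in sorted(counts.items()):
        writer.writerow([section, n])
    return buffer.getvalue()


counts = count_by_section(MENTIONS)
csv_text = to_csv(counts)
print(csv_text)
```

The same counts could feed a bar plot comparing sections; only the tabular export is shown here.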
This application extends the functionality of the Literature exploration app by adding two or more data elements (named entities), e.g. taxon names, sequences, specimens, specific terms, etc. to be searched together within a given context. For example, some possible questions are:
Give me article sections or taxon treatment sections where Data element 1 and Data element 2 are mentioned together, e.g.:
Taxon name A & Taxon name B
Sequence C & Taxon name Y
Taxon name X & Treatment Y
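A co-mention question of this kind amounts to intersecting the sets of sections in which each entity appears. The sketch below assumes a hypothetical in-memory list of (section, entity) pairs; the section identifiers and function names are illustrative and do not reflect how OpenBiodiv stores or queries its graph.

```python
from collections import defaultdict

# Hypothetical mention pairs: (section identifier, named entity).
MENTIONS = [
    ("article-1#abstract", "Taxon name A"),
    ("article-1#abstract", "Taxon name B"),
    ("article-1#treatment-1", "Taxon name A"),
    ("article-2#treatment-1", "Taxon name A"),
    ("article-2#treatment-1", "Taxon name B"),
    ("article-2#treatment-1", "Sequence C"),
]


def build_index(mentions):
    """Map each entity to the set of sections that mention it."""
    index = defaultdict(set)
    for section, entity in mentions:
        index[entity].add(section)
    return index


def co_mentioned(index, *entities):
    """Return the sections in which all of the given entities occur together."""
    if not entities:
        return []
    sections = [index.get(entity, set()) for entity in entities]
    return sorted(set.intersection(*sections))


index = build_index(MENTIONS)
print(co_mentioned(index, "Taxon name A", "Taxon name B"))
print(co_mentioned(index, "Sequence C", "Taxon name A"))
```

Adding a third or fourth entity to the call narrows the result further, which matches the "two or more data elements" case.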
The basic aim of this data discovery application is to search, discover and display data available from trusted external resources, for example specimens, collections, sequences, taxon names, literature, persons and others. The element of interest may also be present in OpenBiodiv.
This service is also available as an additional step in other apps. For example, when making a bibliographic exploration of a certain named entity, the user could have the option to ask for additional information about that entity from external resources.
The data records and their identifiers obtained from the search across various resources can be stored as a CSV file or as RDF using the SCOR ontology.
OpenBiodiv performs a number of queries at regular intervals to generate reports and send these to the users subscribed to the RSS & E-mail Alert service. The queries can deliver for example:
OpenBiodiv can be explored through arbitrary SPARQL queries, and it also provides an API for programmatic access to the data. The API is documented with Swagger. The design and functionality of the API follow the recommendations elaborated by the Technical Research Infrastructures forum of the BiCIKL project.
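As a rough illustration of programmatic access over SPARQL, the sketch below prepares a query request following the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter). The endpoint URL is a placeholder, not the real OpenBiodiv address, and the query relies only on the generic `rdfs:label` property rather than on OpenBiodiv-O terms; both are assumptions. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: replace with the actual SPARQL endpoint address.
ENDPOINT = "https://example.org/sparql"


def label_search_query(text, limit=10):
    """Build a generic SPARQL query for entities whose label contains text."""
    # Escape backslashes and double quotes so text is a valid string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?label)), LCASE("{escaped}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(endpoint, query):
    """Prepare a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, label_search_query("Harmonia axyridis"))
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` and decoding the response as JSON would execute the query against a real endpoint.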