BBOP is at the cutting edge of developing and applying new artificial intelligence (AI) and machine learning (ML) techniques in bioinformatics and biomedical ontologies. Approaches we are exploring include Knowledge Graphs (KGs) and Large Language Models (LLMs) such as GPT-3/4 and Llama 2.

Below are some examples of AI/ML-related projects we are currently engaged in. Note that this work is evolving quickly, so this page may not be up to date!

OntoGPT: a Python package for the generation of Ontologies and Knowledge Bases using large language models (LLMs)

OntoGPT implements two strategies for knowledge extraction, SPIRES and TALISMAN, both described below.

SPIRES (Structured Prompt Interrogation and Recursive Extraction of Semantics)

  • A zero-shot learning (ZSL) approach to extracting nested semantic structures from text
  • Takes two inputs, (1) a LinkML schema and (2) free text, and outputs knowledge structured to conform to the supplied schema, in JSON, YAML, RDF, or OWL format
  • Source: part of OntoGPT
  • Templates (see sidebar)
  • Blog post
  • Paper: Caufield JH, Hegde H, Emonet V, Harris NL, Joachimiak MP, Matentzoglu N, Kim H, Moxon SAT, Reese JT, Haendel MA, Robinson PN, Mungall CJ. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning. Bioinformatics. 2024. https://doi.org/10.1093/bioinformatics/btae104
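The schema-plus-text idea above can be illustrated with a toy sketch. This is not the OntoGPT API or the published implementation: the `SCHEMA` dictionary is a hypothetical, much-simplified stand-in for a LinkML schema, `fake_llm` is a canned stub in place of a real model call, and the helper names (`build_prompt`, `parse_completion`, `extract`) are invented for illustration. It only shows the general pattern of generating a prompt from a schema, parsing the completion, and recursing into nested classes.

```python
from typing import Callable, Dict

# Hypothetical, simplified stand-in for a LinkML schema:
# class name -> {field name: range}. A trailing "*" marks a multivalued field.
SCHEMA: Dict[str, Dict[str, str]] = {
    "GeneDiseaseReport": {"gene": "string", "diseases": "Disease*"},
    "Disease": {"label": "string", "symptoms": "string*"},
}


def build_prompt(cls: str, text: str) -> str:
    """Turn one schema class into a fill-in-the-fields prompt."""
    fields = "\n".join(f"{name}: <{rng}>" for name, rng in SCHEMA[cls].items())
    return (
        "Split the following text into fields in this format:\n"
        f"{fields}\n\nText:\n{text}\n"
    )


def parse_completion(completion: str) -> Dict[str, str]:
    """Parse 'field: value' lines from a model completion."""
    parsed = {}
    for line in completion.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            parsed[key.strip()] = value.strip()
    return parsed


def extract(cls: str, text: str, llm: Callable[[str], str]) -> dict:
    """Extract an instance of `cls` from `text`, recursing into nested classes."""
    raw = parse_completion(llm(build_prompt(cls, text)))
    result = {}
    for name, rng in SCHEMA[cls].items():
        value = raw.get(name, "")
        multivalued = rng.endswith("*")
        base = rng.rstrip("*")
        if multivalued:
            items = [v.strip() for v in value.split(";") if v.strip()]
        else:
            items = [value]
        if base in SCHEMA:
            # Nested class: interrogate the model again on the sub-span.
            items = [extract(base, item, llm) for item in items]
        result[name] = items if multivalued else items[0]
    return result


def fake_llm(prompt: str) -> str:
    """Canned stub standing in for a real LLM call (assumption for the demo)."""
    if "gene: <string>" in prompt:
        return "gene: FBN1\ndiseases: Marfan syndrome with aortic dilation; ectopia lentis"
    text = prompt.split("Text:\n", 1)[1].strip()
    label, _, rest = text.partition(" with ")
    return f"label: {label}\nsymptoms: {rest}"


result = extract(
    "GeneDiseaseReport",
    "FBN1 variants cause Marfan syndrome with aortic dilation, and ectopia lentis.",
    fake_llm,
)
print(result)
```

Because the output is a plain nested dictionary shaped by the schema, it could be serialized to JSON or YAML with standard libraries; swapping `fake_llm` for a real model client is left as an exercise.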

TALISMAN: a Python package for summarizing gene set functions using large language models (LLMs)

CurateGPT

DRAGON-AI: Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence

  • An ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG).
  • Source: part of CurateGPT
  • Preprint: Toro A, Anagnostopoulos AV, Bello S, Blumberg K, et al. Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence (DRAGON-AI). arXiv [cs.AI]. 2023. https://arxiv.org/abs/2312.10904
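As a rough illustration of retrieval augmented generation applied to ontology terms, the sketch below retrieves the existing terms most similar to a new term and places them in a prompt as worked examples. It is not the DRAGON-AI or CurateGPT implementation: the `TERMS` dictionary is a made-up miniature ontology, the function names are hypothetical, and simple token overlap (Jaccard similarity) is swapped in for whatever similarity measure a real system would use. No model is called; the sketch stops at the prompt.

```python
from typing import Dict, List, Set

# Hypothetical miniature "ontology": term label -> text definition.
TERMS: Dict[str, str] = {
    "kidney epithelial cell": "An epithelial cell of the kidney.",
    "cardiac muscle cell": "A muscle cell of the heart.",
    "neuron": "An electrically excitable cell of the nervous system.",
}


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens of a label."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, used here as a stand-in for real retrieval."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, k: int = 2) -> List[str]:
    """Return the k existing term labels most similar to the query label."""
    ranked = sorted(TERMS, key=lambda label: jaccard(query, label), reverse=True)
    return ranked[:k]


def build_prompt(query: str, k: int = 2) -> str:
    """Assemble a prompt: retrieved terms as examples, then the term to complete."""
    examples = [
        f"term: {label}\ndefinition: {TERMS[label]}" for label in retrieve(query, k)
    ]
    return "\n\n".join(examples + [f"term: {query}\ndefinition:"])


prompt = build_prompt("liver epithelial cell")
print(prompt)
```

The resulting prompt would then be sent to a language model, which is expected to complete the final definition in the style of the retrieved examples.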

Phenomics Assistant

  • A user-friendly interface that uses large language models (LLMs) to enable natural-language interaction with a knowledge graph of biomolecular and biomedical information.
  • Interface: https://monarch-assistant.streamlit.app/
  • Source: Phenomics Assistant
  • Preprint: O’Neil ST, Schaper K, Elsarboukh G, Reese JT, Moxon SAT, Harris NL, Munoz-Torres MC, Robinson PN, Haendel MA, Mungall CJ. Phenomics Assistant: An Interface for LLM-based Biomedical Knowledge Graph Exploration. bioRxiv 2024.01.31.578275. https://doi.org/10.1101/2024.01.31.578275

Other work

More info
