PubChem and Big Data Chemistry

Presented at the Bioinformatics Seminar at the University of Arkansas, Little Rock on November 5, 2021.

PubChem is a popular chemical database at the National Library of Medicine, National Institutes of Health. Arguably, PubChem is one of the largest chemical information resources in the public domain, with 111 million unique chemical structures, 1.39 million biological assays, and 292 million biological activity result outcomes. It also contains significant amounts of scientific research data and the inter-relationships between chemicals, proteins, genes, scientific literature, patents, and more. PubChem is a key resource for big data in chemistry and has been used in many studies for developing bioactivity and toxicity prediction models, discovering polypharmacologic (multi-target) ligands, and identifying new macromolecule targets of compounds (for drug-repurposing or off-target side effect prediction). It has also been used for cheminformatics education as well as chemical health and safety training. This presentation provides a high-level overview of PubChem’s data, tools, and services.
