The Human Annotated and Predicted Protein Interactions (HAPPI) database version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPIs) among 23,060 human proteins, making it the most comprehensive database of human PPI data to date. These PPIs include both physical/direct interactions and high-quality functional/indirect interactions.

Compared with the HAPPI 1.0 release, HAPPI version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides a user-friendly search, curation, and data retrieval interface for retrieving human PPIs together with available annotations on interaction type, interaction quality, drug targeting data for the interacting partners, and disease information. HAPPI-2 is freely accessible to academic users at http://discovery.informatics.uab.edu/HAPPI.


We present a new database framework for performing integrative “gene-set, network, and pathway analysis” (GNPA).

In this framework, we integrated heterogeneous data on pathways, annotated lists, and gene sets (PAGs) into a PAG electronic repository (PAGER). PAGs in the PAGER database are organized into P-type, A-type, and G-type PAGs with a three-letter-code standard naming convention. The PAGER database currently compiles 44,313 genes from 5 species including human, 38,663 PAGs, 324,830 gene–gene relationships, and 3,174,323 PAG–PAG regulatory relationships of two types: co-membership based and regulatory relationship based. To help users assess each PAG’s biological relevance, we developed a cohesion measure called the Cohesion Coefficient (CoCo), which distinguishes biologically significant PAGs from random PAGs with an area-under-curve performance of 0.98. The PAGER database was set up to help users search and retrieve PAGs through its online web interface.
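To make the area-under-curve figure concrete, the following minimal sketch shows how an AUC can be computed for any score that is meant to separate curated PAGs from random gene sets. It does not reproduce the CoCo measure, whose definition is not given here; the function name `auc` and all scores are hypothetical and invented purely for illustration.

```python
def auc(positive_scores, negative_scores):
    """Estimate AUC as the probability that a randomly chosen positive
    outscores a randomly chosen negative (ties count as one half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical cohesion scores (made up; NOT values from PAGER).
curated_scores = [0.9, 0.8, 0.75, 0.6]   # assumed scores of curated PAGs
random_scores = [0.1, 0.3, 0.65, 0.2]    # assumed scores of random gene sets

print(auc(curated_scores, random_scores))
```

An AUC of 1.0 would mean every curated PAG scores above every random gene set, and 0.5 would mean the score is no better than chance.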

PAGER enables advanced users to build PAG–PAG regulatory networks that provide complementary biological insights not found in gene set analysis or individual gene network analysis. We provide a case study using cancer functional genomics data sets to demonstrate how integrative GNPA helps improve network biology data coverage and therefore biological interpretability.
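As a rough illustration of a co-membership based PAG–PAG network, the sketch below links two PAGs when their gene members overlap sufficiently. The Jaccard index, the `min_overlap` threshold, the function names, and the toy PAG identifiers and gene labels are all assumptions made for this example; the scoring actually used in PAGER is not described here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_membership_edges(pags, min_overlap=0.25):
    """Return (pag_a, pag_b, score) edges for PAG pairs whose shared
    membership meets the (assumed) overlap threshold."""
    edges = []
    for (id_a, genes_a), (id_b, genes_b) in combinations(sorted(pags.items()), 2):
        score = jaccard(genes_a, genes_b)
        if score >= min_overlap:
            edges.append((id_a, id_b, round(score, 3)))
    return edges


# Toy memberships with placeholder identifiers (not real PAGER records).
toy_pags = {
    "PAG_1": {"g1", "g2", "g3", "g4"},
    "PAG_2": {"g3", "g4", "g5"},
    "PAG_3": {"g6", "g7"},
}

edges = co_membership_edges(toy_pags)
print(edges)
```

Here only `PAG_1` and `PAG_2` are connected, because they share two of five distinct genes, while `PAG_3` shares none with either.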



We present a new network visualization technique that uses scattered data interpolation and surface rendering, based upon a foundation layout of a scalar field. Contours of the interpolated surfaces are generated to support multi-scale visual interaction for data exploration. Our framework visualizes quantitative attributes of nodes in a network as a continuous surface by interpolating the scalar field, thereby avoiding the scalability issues typical of conventional network visualizations while maintaining the topological properties of the original network. We applied this technique to the study of a biomolecular interaction network integrated with gene expression data for Alzheimer's Disease (AD). In this application, differential gene expression profiles obtained from the human brain are rendered for AD patients with differing degrees of severity and compared with those of healthy individuals. We show that this alternative visualization technique is effective in revealing several types of molecular biomarkers that are traditionally difficult to detect because of noise in data derived from DNA microarray experiments.
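The idea of turning node attributes into a continuous surface can be sketched with inverse distance weighting (IDW), one common scattered data interpolation scheme. The specific interpolation method used in this work is not stated here, so IDW is a stand-in. The node coordinates, attribute values, grid size, and contour levels below are hypothetical.

```python
import math


def idw(x, y, nodes, power=2.0):
    """Inverse-distance-weighted value at (x, y) from (nx, ny, value) nodes."""
    num = 0.0
    den = 0.0
    for nx, ny, value in nodes:
        d = math.hypot(x - nx, y - ny)
        if d == 0.0:
            return value  # query point lies exactly on a node
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def scalar_field(nodes, size=5):
    """Sample the interpolated surface on a size x size grid over the unit square."""
    step = 1.0 / (size - 1)
    return [[idw(c * step, r * step, nodes) for c in range(size)]
            for r in range(size)]


def contour_band(value, levels):
    """Index of the contour band a value falls in (0 = below all levels)."""
    return sum(value >= level for level in levels)


# Hypothetical layout: (x, y, attribute), e.g. a differential expression value.
nodes = [(0.0, 0.0, 2.0), (1.0, 0.0, -1.0), (0.5, 1.0, 0.5)]
field = scalar_field(nodes)
bands = [[contour_band(v, levels=[-0.5, 0.5, 1.5]) for v in row] for row in field]

for row in bands:
    print(row)
```

Because the surface is sampled on a fixed grid, the rendering cost depends on the grid resolution rather than on the number of edges drawn, which is one way to read the scalability argument above.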