Welcome to NetSci's
Lists of Software for
Bioinformatics: Databases

Notice: Statements and opinions made for the products within this listing were supplied by their owners. Network Science Corporation assumes no responsibility for the content of these listings. All product and company names mentioned in this publication are patents, trademarks, registered trademarks or servicemarks of their respective holders. The Tabular Software Listings portion of this site is Copyright © 1995/2006 by Network Science Corporation. All rights reserved.

The Software Section of NetSci is accessed by hundreds of scientists every week. Our goal is to make this resource as comprehensive as possible. If your software program is not included, please send e-mail with a brief description, the categories under which your program should appear, the platforms supported, and contact information.

For programs currently listed in NetSci, please check the table and description and notify us of any changes or additions.


--- A, B, C ---


AceDB is an acronym for A Caenorhabditis elegans Database. AceDB is a genome database designed specifically for flexibly handling bioinformatic data. It includes tools designed to manipulate genomic data, but is increasingly also used for non-biological data.


AiO (All in One) is a program for Windows that combines typical DNA/protein features, such as plasmid map drawing, ORF finding, translation, back-translation and high-quality printing, with a number of databases. These databases allow the management of oligonucleotides, oligonucleotide manufacturers, restriction enzymes, structural DNA and program users in a multi-user/multi-group environment. An AiO-specific website, with the possibility to download, is at:

Christiaan Karreman
Department of Oncological Chemistry, Building 23.12.U1
University of Düsseldorf
Universitätsstrasse 1


The BioCyc collection of databases provides electronic reference sources on the pathways and genomes of different organisms. Currently, detailed organism-specific databases are available for 14 species. In addition, the MetaCyc metabolic pathway database contains literature-derived metabolic pathway data for 160 species.

Scientists can use BioCyc databases to visualize the layout of genes within a chromosome, an individual biochemical reaction, or a complete biochemical pathway. The structures of chemical compounds can be displayed in pathways and reactions. The navigation capabilities of the software allow a user to move from a display of an enzyme to a display of a reaction that the enzyme catalyzes, or to the gene that encodes the enzyme. The interface supports a variety of queries, such as generating a display of the map positions of all genes that code for enzymes within a given biochemical pathway. As well as being used as a reference source to look up individual facts, BioCyc databases support computational studies of metabolism, such as the design of novel biochemical pathways for biotechnology, studies of the evolution of metabolic pathways, and simulation of metabolic pathways.
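
The kind of pathway query described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class names, fields and sample records are hypothetical (the map positions are invented) and do not reflect the actual BioCyc schema or software.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str
    map_position: float  # illustrative value only, not real data


@dataclass
class Enzyme:
    name: str
    gene: Gene  # the gene that encodes this enzyme


@dataclass
class Reaction:
    name: str
    enzymes: list = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    reactions: list = field(default_factory=list)


def gene_positions_in_pathway(pathway):
    """Map gene name -> map position for every gene encoding an enzyme in the pathway."""
    positions = {}
    for reaction in pathway.reactions:
        for enzyme in reaction.enzymes:
            positions[enzyme.gene.name] = enzyme.gene.map_position
    return positions


# Hypothetical records: two reactions, three enzymes, two genes.
gene_a = Gene("geneA", 12.5)
gene_b = Gene("geneB", 47.0)
step1 = Reaction("step 1", [Enzyme("enzyme A", gene_a)])
step2 = Reaction("step 2", [Enzyme("enzyme B", gene_b), Enzyme("enzyme A2", gene_a)])
toy_pathway = Pathway("toy pathway", [step1, step2])

print(gene_positions_in_pathway(toy_pathway))
```

Walking pathway, reaction, enzyme and gene in that order mirrors the navigation path the description mentions; a real database would answer the same question with an indexed query rather than nested loops.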

BioCyc is linked to other biological databases containing protein and nucleic-acid sequence data, bibliographic data, protein structures, and descriptions of different strains.

The EcoCyc and MetaCyc databases are highly curated databases whose content is derived principally from the biomedical literature. EcoCyc is a model-organism database for E. coli, whereas MetaCyc is a nonredundant metabolic pathway database containing pathways from many organisms.

The BioCyc collection is curated by the genomics group at SRI. For additional information, check the BioCyc web site.


Biopendium: Using proprietary bioinformatics applications on one of the world's most sophisticated high speed computer clusters, the Company has developed a formidable relational database known as the Biopendium(TM), which brings together information on sequence, structure and function relationships for all gene products in the public domain. This currently comprises over 100 million such relationships enabling essential information to be derived about potential new targets, as part of the validation and lead optimisation process.

The Biopendium pre-calculated database and integrated data mining and visualisation tools can be used to enhance drug discovery in the following ways:

  • Target Discovery and Validation
  • Structure-Function Annotation
  • Sequence to Structure and Function
  • Identification of the Functional Residues of a Protein
  • Added Confidence in Drug Development

CONTACT: Professor Ken Powell or Margaret Walsh, both of Inpharmatica, +44-171-631-4644, or fax, +44-171-631-4844; Web site is at http://www.inpharmatica.co.uk


CAMELEON is a set of multiple sequence alignment tools with links to databases of known 3D structural fragments. It was developed at Oxford Molecular Group but is no longer available.

--- D, E, F ---


ERGO Light is a curated database of public and proprietary genomic DNA, with connected similarities, functions, pathways, functional models, clusters and more. The system presents these data interconnected with WWW hyperlinks but also allows searches and comparisons. See http://www.ergo-light.com/ERGO/


The Expasy site contains a 2-D gel database, a search engine and links to several gel databases throughout the world.

--- G, H, I ---


GAIA 22 is a Chromosome 22 specific version of the GAIA database. GAIA is a data analysis and storage system for genomic sequence and its annotation. As a data analysis engine it accepts raw genomic sequence and automatically adds significant annotation. As a data storage system, it incorporates such sequence and annotation into its database along with a record of experimental support for the annotation. It facilitates queries against the data and graphical visualization of the query results. The database is available at the University of Pennsylvania Bioinformatics web site.


GeneCards is a database of human genes, their products and their involvement in diseases. It offers concise information about the functions of all human genes that have an approved symbol, as well as selected others. It is especially useful for those who are searching for information about large sets of genes or proteins, e.g. for scientists working in functional genomics and proteomics. The site is hosted at the Weizmann Institute.


GENESEQ was a database of protein and nucleic acid sequences extracted from world-wide patent documents. The program was offered by Oxford Molecular Group but is no longer available.


GeneWorks was an integrated sequence analysis and database searching program for the Macintosh, previously marketed by Oxford Molecular Group. The program is no longer available.


ISYS(TM) is the National Center for Genome Resources' new product that integrates independent bioinformatic software tools and databases. This software tool enables genomic investigators to go beyond single-focus analysis tools by creating an integrated environment in which they can more easily and rapidly discover new knowledge. ISYS is capable of integrating data sources and analysis tools from an investigator's own laboratory with those developed elsewhere. Free evaluation copies of ISYS can be downloaded and installed from the ISYS Web site, http://www.ncgr.org/isys.

ISYS's key technical strength is the platform's ability to couple separately developed Java(TM) software components and analysis tools, harnessing their collective strength while allowing them to evolve independently. ISYS uses DynamicDiscovery(TM) technology to allow users to find appropriate paths from one software component to another, according to their particular data sets and configuration of components. DynamicDiscovery is used to integrate Java programs and to allow the integration of Web pages with those programs. ISYS's Web integration feature works generically with virtually any Web page, or it can be augmented with even richer functionality if pages are specially "marked up" for integration with the system. ISYS has the ability to mark up several popular bioinformatics Web sites and lets users of the system extend it to mark up their own favorite pages. In addition, ISYS has a published API (Application Programming Interface) to allow users to adapt their own programs and databases for integration with the platform. Alternatively, NCGR developers and scientists will work with organizations, one-on-one, to tailor ISYS to meet their specific discovery needs.

--- J, K, L ---

--- M, N, O ---

--- P, Q, R ---


OligoMaster is a multi-user oligonucleotide cataloguing application designed to help biologists manage and organise their oligonucleotide collections, available in versions for Windows, Macintosh and Linux. OligoMaster allows one to quickly and easily retrieve essential information about specific oligonucleotides before running an experiment. One can choose between keeping one's oligonucleotides private from others or sharing them with other users of the database. OligoMaster can run either in single workstation mode, or client-server (multi-user) mode, where multiple clients (which can be mixed platform) can remotely connect to a single database over the network. In addition, OligoMaster has built-in automatic calculation of values, allowing one to enter sequence and OD260 value and have all other values calculated from these. The calculation is entirely scriptable, which means that some or all of the calculations can be done using different algorithms. OligoMaster is available from http://www.oligomaster.com
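
As an illustration of deriving values from a sequence and an OD260 reading, the following sketch uses common rule-of-thumb approximations. The listing does not state which algorithms OligoMaster uses, so the function name, the formulas chosen (an approximate anhydrous molecular weight, roughly 33 micrograms per OD260 unit for single-stranded DNA, and the Wallace rule for melting temperature) and the default volume are all assumptions made for this example.

```python
def oligo_values(sequence, od260, volume_ml=1.0):
    """Estimate basic properties of a single-stranded DNA oligo (approximations only)."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    if not seq or sum(counts.values()) != len(seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")

    # Approximate anhydrous molecular weight (g/mol) of an unphosphorylated oligo.
    mol_weight = (counts["A"] * 313.21 + counts["T"] * 304.2
                  + counts["C"] * 289.18 + counts["G"] * 329.21 - 61.96)

    # Rule of thumb: 1 OD260 unit of ssDNA corresponds to about 33 micrograms per mL.
    micrograms = od260 * 33.0 * volume_ml
    nanomoles = micrograms / mol_weight * 1000.0

    # Wallace rule, usually applied only to short oligos: 2 C per A/T, 4 C per G/C.
    tm_celsius = 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])

    return {
        "length": len(seq),
        "mol_weight": mol_weight,
        "micrograms": micrograms,
        "nanomoles": nanomoles,
        "tm_celsius": tm_celsius,
    }


example = oligo_values("ACGTACGTAC", od260=1.0)
print(example)
```

Because each derived value is computed in its own step, any one of them could be swapped for a different algorithm (for example, a nearest-neighbour melting temperature) without touching the others.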


PhyloPat provides phylogenetic pattern analysis of eukaryotic genes. Phylogenetic patterns show the presence or absence of certain genes or proteins in a set of species. They can also be used to determine sets of genes or proteins that occur only in certain evolutionary branches. Phylogenetic pattern analysis has routinely been applied to protein databases such as COG and OrthoMCL, but not to gene databases. PhyloPat allows the complete Ensembl gene database to be queried using phylogenetic patterns.
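
A phylogenetic pattern query can be sketched as follows. The pattern syntax used here ('1' for present, '0' for absent, '*' for either), the species list and the gene lineages are assumptions invented for illustration; they are not taken from the PhyloPat documentation or from Ensembl data.

```python
# Fixed species order; each pattern position refers to one species.
SPECIES = ["human", "mouse", "chicken", "zebrafish", "fly"]

# Hypothetical presence data: gene lineage -> species in which it occurs.
GENE_PRESENCE = {
    "geneW": {"human", "mouse", "chicken", "zebrafish", "fly"},
    "geneX": {"human", "mouse"},
    "geneY": {"human", "mouse", "chicken"},
    "geneZ": {"fly"},
}


def profile(species_with_gene):
    """Encode presence/absence across SPECIES as a string of 1s and 0s."""
    return "".join("1" if s in species_with_gene else "0" for s in SPECIES)


def matches(pattern, gene_profile):
    """True if every non-wildcard pattern position equals the profile position."""
    if len(pattern) != len(gene_profile):
        raise ValueError("pattern length must equal the number of species")
    return all(p == "*" or p == c for p, c in zip(pattern, gene_profile))


def query(pattern):
    """Return the sorted names of gene lineages whose profile fits the pattern."""
    return sorted(
        name for name, present in GENE_PRESENCE.items()
        if matches(pattern, profile(present))
    )


# Lineages present in human and mouse, absent from zebrafish and fly,
# with chicken left unconstrained.
print(query("11*00"))
```

A pattern such as "00001" would select lineages found only in the last species, which is how patterns can isolate genes restricted to one evolutionary branch.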


ProteinCenter(™) integrates the contents of a large number of public protein sequence databases with your experimental systems biology data. It provides a number of bioinformatics tools and aims to be the most comprehensive bioinformatics analysis tool available for any scientist who works with protein and sequence information. For example, ProteinCenter(™) enables comparing multiple data sets of thousands of proteins in minutes and obtaining the true overlap, independently of the original database source. Analytical results from leading international scientific studies can be imported to see your data in a new context. Data sets can be mined, compared and documented in a matter of minutes using a range of novel methods for handling large proteomics data sets. More information is available at http://www.proxeon.com/protein-sequence-databases-software.html

ProteinCenter(™) Open Access is a free subset of the commercial ProteinCenter(™). It lets you look up protein accession codes and peptides in a global database of more than 25 million protein accession codes, providing a single resource for looking up protein accession codes and identifiers from GenBank, RefSeq, EMBL, UniProt, Swiss-Prot, TrEMBL, PIR, IPI, PDB, Ensembl, etc., including many outdated accession codes. The ProteinCard gives overview annotation for each protein, including mapping of accession codes, GO, disease and pathway annotation, sequence features, etc. It also includes very fast retrieval of richly annotated BLAST neighbors. Tryptic peptides (6-32 AA, no miscleavages) can be looked up to see all proteins that contain them. Try it out at http://www.proxeon.com/proteincenter-open-access.html
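
The tryptic peptide lookup can be illustrated with a simple in-silico digest. This sketch assumes the commonly used cleavage rule (cut after K or R unless the next residue is P); the listing does not say which rule ProteinCenter applies, and the function names and protein sequences below are hypothetical.

```python
def tryptic_peptides(protein, min_len=6, max_len=32):
    """Fully cleaved tryptic peptides (no missed cleavages) within the length window."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        at_end = i == len(protein) - 1
        # Assumed rule: cleave after K or R, except when followed by P.
        cleave = residue in "KR" and not at_end and protein[i + 1] != "P"
        if cleave or at_end:
            peptide = protein[start:i + 1]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def build_peptide_index(proteins):
    """Map each tryptic peptide to the set of protein names that contain it."""
    index = {}
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            index.setdefault(peptide, set()).add(name)
    return index


# Hypothetical sequences, made up for this example.
proteins = {
    "protA": "MKTAYIAKQRPLSDEGHKAAAR",
    "protB": "GGKTAYIAKLLLLLLR",
}
index = build_peptide_index(proteins)
print(sorted(index["TAYIAK"]))
```

Building the index once and then looking peptides up by exact match is one plausible way to make such lookups fast; peptides shorter than 6 or longer than 32 residues are simply never indexed.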


Relibase is a web-based tool for searching and analysing protein-ligand structures in the PDB. It features:

  • An intuitive web-based user interface
  • A fast database search engine
  • Local installation for confidential searching
  • Standard text searching
  • 2D substructure searching
  • 3D protein-ligand interaction searching
  • Similarity searching for ligands
  • Automatic superposition of related binding sites to compare ligand binding modes, water positions, ligand-induced conformational changes, etc.
  • Clear 3D visualisation using RasMol

Relibase is freely available from http://relibase.ccdc.cam.ac.uk and the mirror sites http://relibase.rutgers.edu and http://relibase.ebi.ac.uk

More information about Relibase and the commercial version, Relibase+, is available from the CCDC website at http://www.ccdc.cam.ac.uk/prods/relibase/


ResNet is a comprehensive database of molecular networks and protein interactions, derived from automatic analysis of the whole PubMed. It contains more than 200,000 events of regulation, interaction and modification between 15,000 proteins, cell processes and small molecules.

Ariadne Genomics
9700 Great Seneca Highway, Suite 113
Rockville, MD 20850
Phone: (240) 453-6296
Fax: (240) 453-6208
Web: http://www.ariadnegenomics.com


The Rosetta Resolver System provides high-capacity data storage, retrieval and analysis of gene expression data. The system is ideal for life science research organizations that need to assess compound specificity or toxicity, identify new genes or therapeutic targets, or compare and analyze large databases of expression profiles. The Rosetta Resolver system combines flexibility and ease of use with high-performance algorithms to produce an enterprise solution for rapid analysis of gene expression data. The system can accept and analyze data from a wide variety of expression profiling formats, and applies Rosetta's proprietary error models to yield quality statistics for every gene expression measurement. These statistics are automatically leveraged by all analysis tools, ensuring reliable results.

Additional new features of the Rosetta Resolver system version 2.0 include the following:

  • Analysis of both intensity and ratio-based expression profiles
  • Summarization and analysis of data at the feature, reporter (oligonucleotide, cDNA, or probe pair), sequence, exon, and UniGene cluster levels
  • Automated sequence annotation updates from public and proprietary databases
  • Ability to build ratio-based analyses from intensity-based analyses using advanced statistical error models that leverage replicate hybridizations
  • GEML-compatibility enables users to export data in the GEML format for exchange with collaborators and/or for publication. GEML compatible software such as Rosetta's GEML Conductor tools can be used to visualize the data.

Version 2.0 is available now through Rosetta's strategic partner, Agilent Technologies. The system comes in several flexible packages to meet the needs of a variety of research organizations. Additional information about the Rosetta Resolver system can be found on Rosetta Inpharmatics' Web site at www.rii.com.

--- S, T, U ---


SGD is a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast. As well as indexing many Saccharomyces-related resources, SGD also provides sequence analysis tools, a gene registry and genetic/physical maps for the organism. SGD is hosted at the Stanford University Genome Center.


SRS is a database integration and biological information search system. It is capable of querying 400 different molecular biology, bibliographic, compound data, genetic and medical databases via a single interface. With SRS, LION Bioscience has improved the efficacy of 14 bioinformatics departments of leading life science companies.

SRS is available from:

LION bioscience AG
Im Neuenheimer Feld 515-517
69120 Heidelberg, Germany
phone: ++49(0)6221/4038-0
fax: ++49(0)6221/4038-101


Software Solution for BioMedicine (SSBM) offers high-speed analysis of both public and proprietary genetic databases within the security of the corporate firewall. SSBM provides speed, security, and scalability through the employment of a client-server architecture including an Oracle(R) relational database, analysis tools residing on a UNIX Server, and Java-based graphical user interface. This distributed architecture facilitates collaboration throughout a globally dispersed corporation (or network of several organizations) on multiple discrete research projects. In addition, SSBM is seamlessly integrated with Vector NTI Suite(™), InforMax's desktop sequence analysis software package.

SSBM uses the ResearchLogic(TM) system designed to support the natural process flow of research without unnecessary manual intervention. Alert agents and algorithm chains combine elementary analysis routines into complex procedures and allow the application of those procedures to database objects. System algorithms include BLAST and FASTA Suite, Multiple Sequence Alignment, PCR Primer/Oligo Search and Analysis, Sequence Primer and Hybridization Probe Search, Protein Pattern and Sequence Analysis, DNA/RNA Pattern and Sequence Analysis, Profile/Matrix/Pattern Generation, Profile-Based Database Search, Restriction Analysis, ORF Search, PROSITE Pattern Search, and BLOCKS Profile Search. In addition, SSBM contains ResearchLogic Extensions(TM) which allows the incorporation of user algorithms into ResearchLogic.

Additional information about InforMax can be obtained from:

InforMax, Inc.
444 North Frederick Ave, STE 308
Gaithersburg, MD 20877
Tel: (301)216-0586
Fax: (301)216-0087
URL: http://www.informaxinc.com

--- V, W ---


Vector NTI is a Macintosh- and Windows-based molecular biology support system which offers:

  • Access to internet resources
  • Automatic design of molecules
  • Advanced PCR analysis
  • Maintenance of parent-descendant trees
  • Professional database
  • Dynamic electrophoresis simulation
  • Graphical molecular editor
  • Publication quality graphics

A free demo of Vector NTI is available from:

InforMax, Inc.
444 North Frederick Ave, STE 308
Gaithersburg, MD 20877
Tel: (301)216-0586
Fax: (301)216-0087
URL: http://www.informaxinc.com

--- X, Y, Z ---

NetSci, ISSN 1092-7360, is published by Network Science Corporation. Except where expressly stated, content at this site is copyright (© 1995 - 2010) by Network Science Corporation and is for your personal use only. No redistribution is allowed without written permission from Network Science Corporation. This web site is managed by:

Network Science Corporation
4411 Connecticut Avenue NW, STE 514
Washington, DC 20008
Tel: (828) 817-9811
E-mail: TheEditors@netsci.org
Website Hosted by Total Choice