Information Handling for Natural Products Acquisitions

Gregg R. Dietzman

White Point BioMarine, Inc.
POB 2989, 180 First Street
Friday Harbor, WA 98250
Tel: (360)378-7292, Fax: (360)378-7260
e-mail: gregg@wpbm.com

This paper appeared in Screening Forum, 3(4), 11 (1995).



http://www.netsci.org/Science/Special/feature10.html

When using natural products to support High Throughput Screening (HTS), discovery programs must include effective technologies to support the acquisition and inventory of the natural products sources. A critical element in this process is information handling. Information handling requirements at every stage of this process will grow, and computer technology is available to meet the demand.

With the advance of screening technologies, it is now possible to evaluate upwards of 100,000 test substances per week against several different targets. The required test substance quantities are as low as a few micrograms per assay. In keeping with these trends, natural products acquisition programs are now scoped to collect large numbers of diverse tissue samples, each in only a small wet weight. This format allows collectors to provide large numbers of samples for screening while relying on recollection for follow-on studies. As a result, discovery programs actively testing natural products must track a greater number of samples. In addition, information on known natural products chemistry must be considered in an effort to contain the costs of follow-on studies.

When a natural product registers as a confirmed "hit" in a discovery screen, the people involved simply want to know: what is it, where did it come from, is it novel, and how can we get more? The general information handling system requirements, therefore, fall into two categories: recollection and dereplication.

Recollection must be a cornerstone of any natural products acquisition program; you must be able to go back and get more of the source organism for follow-on studies. Dereplication against the growing number of known chemical compounds is becoming increasingly important for discovery screens. Based on the initial chemical characterization of one or more active fractions, a chemical substructure-based database search can be performed to compare them with, and dereplicate them against, known chemical compounds. Laboratory chemists are at a tremendous advantage when they can generate a list of possible chemical structures to compare with an extract containing unknown compounds. A chemist can then quickly evaluate whether known compounds are likely to be present and make an informed decision about the extract's interest as a source of novel bioactive compounds. This rapid process can save the costs of follow-on isolation and structural elucidation studies and increase a program's efficiency.
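The dereplication lookup described above can be illustrated with a minimal sketch. It is not a true substructure search: as a simplifying assumption, each known compound is reduced to a set of structural features, and a compound is reported as a candidate when it contains every feature observed in the active fraction. The compound names, feature labels, and the function `dereplication_candidates` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownCompound:
    name: str
    features: frozenset  # structural features recorded for this compound


def dereplication_candidates(observed, library):
    """Return sorted names of known compounds containing every observed feature."""
    observed = frozenset(observed)
    return sorted(c.name for c in library if observed <= c.features)


# Hypothetical reference entries; a real library would hold full structures.
LIBRARY = [
    KnownCompound("compound A", frozenset({"lactone", "hydroxyl", "alkene"})),
    KnownCompound("compound B", frozenset({"indole", "bromine"})),
    KnownCompound("compound C", frozenset({"lactone", "alkene", "epoxide"})),
]

# Features assumed to come from initial characterization of an active fraction.
candidates = dereplication_candidates({"lactone", "alkene"}, LIBRARY)
print(candidates)
```

An empty candidate list would suggest, under these assumptions, that the fraction is worth pursuing as potentially novel; a non-empty list gives the chemist specific known structures to rule in or out first.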

The two general requirements, recollection and dereplication, share a common feature: both have a spatial, or mappable, element. The latitude-longitude location of a source organism collection can be easily obtained using satellite navigation technologies and placed on a map. The positions of past collections can be similarly mapped; however, the accuracy of these data will vary. It is possible to map these positions in a geographic information system (GIS), where computer-based mapped data are linked to traditional text/field database records. A comprehensive dataset of natural products chemical discoveries held within a GIS would allow investigators to easily visualize the spatial relationships between collection points to support decisions. Furthermore, expeditions for new acquisition efforts could be planned to target specific regions based on existing information, and to avoid duplicating past collections.
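As a rough illustration of linking mapped positions to database records, the sketch below finds past collections within a given distance of a planned site using the haversine great-circle formula. The record layout, sample identifiers, and coordinates are hypothetical, and a spherical Earth of radius 6371 km is assumed; a real GIS would handle projections and positional accuracy far more carefully.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def collections_within(records, lat, lon, radius_km):
    """Return the records whose position lies within radius_km of (lat, lon)."""
    return [r for r in records
            if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km]


# Hypothetical past collection records (text fields plus a mapped position).
PAST_COLLECTIONS = [
    {"sample_id": "S-001", "taxon": "sponge", "lat": 48.55, "lon": -123.01},
    {"sample_id": "S-002", "taxon": "tunicate", "lat": 48.60, "lon": -123.15},
    {"sample_id": "S-003", "taxon": "sponge", "lat": 21.30, "lon": -157.85},
]

nearby = collections_within(PAST_COLLECTIONS, 48.53, -123.02, 25.0)
print([r["sample_id"] for r in nearby])
```

The same query serves both requirements: for recollection it locates the original site, and for expedition planning it shows which areas have already been sampled.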

One of the most powerful features of a comprehensive information management system that supports natural products acquisition programs is its use of spatial data and GIS technology. GIS technology allows the system to step beyond the requirements for a normalized data model in a relational database design. Data that have a spatial element can be added as overlays. Importantly, spatial data can be selectively added on a need-to-know basis, and, most importantly, the addition of these varied data does not affect the logical design and data model.
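The overlay idea can be sketched as follows, under the simplifying assumption that each overlay is a list of rectangular latitude-longitude regions carrying a value. The `Sample` record stands in for the core data model and is never modified; overlays are looked up by position only. All class names, layer names, and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sample:
    """Core record; its fields do not change when overlays are added."""
    sample_id: str
    lat: float
    lon: float


@dataclass
class Overlay:
    """A named spatial layer of (min_lat, min_lon, max_lat, max_lon, value) regions."""
    name: str
    regions: list = field(default_factory=list)

    def value_at(self, lat, lon):
        """Return the value of the first region containing the point, else None."""
        for min_lat, min_lon, max_lat, max_lon, value in self.regions:
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                return value
        return None


def annotate(sample, overlays):
    """Look up every overlay at the sample's position without altering the sample."""
    return {o.name: o.value_at(sample.lat, sample.lon) for o in overlays}


sample = Sample("S-001", 48.55, -123.01)
overlays = [
    Overlay("habitat", [(48.0, -124.0, 49.0, -122.0, "rocky subtidal")]),
    Overlay("permit_zone", [(20.0, -160.0, 22.0, -155.0, "zone 7")]),
]
print(annotate(sample, overlays))
```

Adding a further layer means appending another `Overlay` to the list; the `Sample` definition stays as it is, which is the property the paragraph above attributes to GIS overlays.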

HTS is an information-intensive industry. The breadth of existing information that must be considered in the discovery process, the increasing number of samples submitted for testing, and the evolving discovery screen models all contribute to a growing dataset. Currently available technologies can meet the information handling requirements for natural products acquisition programs. These technologies, however, must be effectively realized for the application. For example, GIS technology is a powerful tool, but it is typically delivered with extended capabilities that reach well beyond the general requirements described here. In developing an effective solution, the GIS capability must be delivered in a customized format that provides only those features needed by the natural products investigator.

These information management requirements, however, can become assets that are used to guide the collection effort and enhance the probability of success. Screening natural products extracts does not necessarily require a random approach. Examples of structure/activity relationships are beginning to emerge for a variety of biological assays. Published information on natural products chemistry and related biological activity can guide a collection effort to target, but not duplicate collections from, organism groups with known properties. A contemporary drug discovery program requires advanced capabilities for information handling.


