Examining Core Technologies: Protein Science

Allen B. Richon, Ph.D.

Network Science Corporation
4411 Connecticut Avenue NW, STE 514
Washington, DC 20008-8677
USA
E-mail: TheEditors@netsci.org



http://www.netsci.org/Science/Special/feature13.html

Introduction: The number of pharmaceutical research groups actively using some form of protein science in their research programs has increased dramatically during the past ten years. Despite the widespread use of this technology, however, little information has been gathered concerning specific applications and approaches used within the industry. During 1997, Network Science Corporation contacted fifteen companies located in the United States to assess their level of investment in, and their use of, proteins in research. Given the costs required to implement these types of programs, the survey was limited to larger organizations. The specific segments of protein science surveyed were protein supply, structure determination methods, and structural prediction methods. The questionnaire (Appendix 1) was distributed to these fifteen organizations, and the responses from the nine companies that participated are summarized below. In most cases, the companies stated similar strategies for addressing protein-based research. Comments from groups whose approach differed significantly within a given area have been highlighted.

Protein Supply: Proteins are used extensively in all phases of research, including screening and screen design, structural chemistry/biochemistry (X-ray and NMR), and as medicinal agents. Regardless of the application, however, the responsibility for supplying proteins has been centralized in most organizations, with specialists addressing cloning/biology, expression/purification, and separations. Companies that reported using a decentralized approach to protein supply identified the organizational structure as an impediment to successfully addressing research projects.

Staffing levels for protein production range from 3-6 for groups assigned directly to structural chemistry to departments of 20 or more in companies which have centralized groups. However, as with many areas of research, these resources were stated to be sufficient to address only those projects deemed critical by the organization; managers reported that there is very little time for more 'speculative' projects or for basic research. Despite this lack of flexibility in allocating resources, larger organizations are reluctant to use external groups as a means of augmenting in-house efforts. Smaller groups stated that they outsource only a limited number of tasks such as fermentation or purification.

The process of designing proteins for use in structural research projects is generally accomplished by a team of molecular biologists, protein chemists, and structural specialists. The mix of the team is frequently decided by the history of and the objectives for each project. Depending on goals for the project, this process can be long and frustrating for structural chemists due in part to a lack of knowledge within biology about the requirements of protein samples for structural analysis.

The first set of questions for a protein production project addresses the appropriate choice of expression system. The questions most commonly raised in response to the survey were:

  • How much protein is needed?
  • Does the protein require labels?
  • What are the stability criteria that must be addressed?
  • Can a known process be modified to ease purification?

Based on the requirements above, the cell lines most commonly reported for protein expression were E. coli, baculovirus, and mammalian. When possible, human cell lines were identified as the preferred expression system. In the process reported most frequently, transient transfection techniques were used to obtain a stable cell line, which was then worked up to provide a production line. Once reproducible conditions were defined, they were codified and stored in a database of company cell lines. Since this approach has proven readily adaptable to many cell lines, many of the larger companies stated that they are becoming 'functionality driven' and will thus initiate an expression project on any protein of known sequence.

The final question for this section of the survey was the identification of emerging technologies in protein production, purification, and characterization. The most significant technology identified was the shift of expression systems from bacteria to other (usually mammalian) cell lines. Other areas identified were the use of alternative expression media and the use of cell types which suppress proteolytic degradation.

Protein Structure Determination: Unlike the production groups, which were found to be similar across organizations, each of the companies surveyed has a different approach to organizing the protein structure determination group, and none of the companies is satisfied with its current structure. The range of organizational formats is:

  • Protein Supply, X-ray, Computational, and NMR each in separate departments.

  • Structural Group within chemistry which includes X-ray, NMR, Computational, and Informatics. Protein Supply in a separate part of the company.

  • Computational, X-ray, and Protein Supply in one department. NMR in another part of the organization.

  • Structure determination (NMR, X-ray) in one group. Modeling and Informatics in a second group. Protein Supply in a third group.

  • Integrated Structural Group within chemistry.

  • X-ray and NMR in one group. Computational dispersed throughout the company. Protein Supply in a separate part of the organization.

The structural chemistry/biochemistry component of these groups varies considerably from company to company. The averages for the companies which responded to the survey to date are shown below. These figures reflect personnel directly allocated to protein structure determination projects rather than to general research.

X-Ray          NMR            Computational   Equipment
4 Ph.D.        3 Ph.D.        5 Ph.D.         $4 million
2 B.S./M.S.    1 B.S./M.S.    1 B.S./M.S.
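
The staffing averages above can be tallied to give a rough sense of the size and composition of a typical structural group. The short Python sketch below does that arithmetic and also computes the survey response rate (nine of fifteen companies). The dictionary layout and the names used are illustrative assumptions rather than part of the survey, and summing per-discipline averages assumes that each average was taken over the same set of responding companies.

```python
# Average staffing per discipline, copied from the survey figures above.
# The data layout and names are illustrative assumptions, not survey content.
staffing = {
    "X-ray": {"phd": 4, "bs_ms": 2},
    "NMR": {"phd": 3, "bs_ms": 1},
    "Computational": {"phd": 5, "bs_ms": 1},
}


def headcount(group):
    """Total scientists in one discipline (Ph.D. plus B.S./M.S.)."""
    return group["phd"] + group["bs_ms"]


# Per-discipline totals and the combined total across all three disciplines.
totals = {name: headcount(group) for name, group in staffing.items()}
total_staff = sum(totals.values())

# Fraction of the combined staff holding a Ph.D.
phd_share = sum(group["phd"] for group in staffing.values()) / total_staff

# Response rate: nine of the fifteen companies contacted took part.
response_rate = 9 / 15

for name, count in totals.items():
    print(f"{name}: {count} scientists")
print(f"All disciplines: {total_staff} scientists, {phd_share:.0%} Ph.D.")
print(f"Survey response rate: {response_rate:.0%}")
```

On these averages, a structural group comes to about sixteen scientists, three quarters of them at the Ph.D. level. The figures are averages over a small sample, so they are better read as an order of magnitude than as a staffing target.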


Each of the managers reported that two major projects (or one major project in addition to 3-4 start-ups) per scientist was an optimum number for their staff. However, they also stated that their groups were frequently required to take on 'unscheduled' projects. Like scientists in Protein Supply, structure determination groups generally do not use contract services organizations as an additional resource. However, they did comment that they use academic collaborators for selected assignments.

One of the topics that continues to be debated within the pharmaceutical industry is the contribution that structural information and computational chemistry make to drug discovery. Each of the managers contacted for this survey stated that crystallography and modeling made significant contributions to the design of enzyme inhibitors, to the optimization of lead candidates, and to the elimination of difficult targets. In contrast, managers felt that efforts in NMR have not made the impact that was expected by research organizations. While NMR has assisted in examining stability for small proteins or single domains for larger structures, defining crystallization conditions, examining protein stability, assessing binding interactions, and assisting in secondary structure assignment, many research managers feel that the techniques still need to evolve in order to directly impact drug discovery. Areas that were identified as potential 'breakthrough technologies' for NMR included its use in screening and the development of SAR via NMR.

Emerging technologies identified for protein structure determination were the use of synchrotron sources with selenium enrichments for proteins, electron cryo-microscopy for membrane receptors, MAD phasing, and the development of insynchrotron sources.

Protein Structure Prediction: The final topic examined by this survey was the use of structure prediction techniques in protein research. All of the companies contacted stated that they have CADD groups which actively use prediction methods in support of discovery projects. Techniques utilized by the groups are generally based on homology, sequence alignment, and experimental data. In contrast to the structure determination groups which use a significant amount of academic software, the CADD groups generally use commercial software. Determining the size of the groups dedicated to protein modeling is complicated by the fact that computational scientists will frequently cross project types (e.g., a scientist will frequently work on both large and small molecule projects). The size of computational groups surveyed ranged from three specialists to twelve (non-specialists do not contribute to this area of research), with managers reporting that a varying percentage of their group's time is dedicated to protein-based projects.

The assignment and management of projects within the CADD group differs significantly from that in the previous groups. Projects are prioritized by research management, and the success of the group is assessed according to its impact on discovery objectives. However, project management and 'customer' interactions vary from company to company. One group of the CADD managers contacted stated that projects within their organizations are set and managed through ad hoc discussions and contacts with members of the research project teams. In contrast, other managers stated that formal project teams are created by research management and that specialists are assigned to the teams according to the requirements for each project. Since these approaches appear to be dictated by corporate culture, both appear to be effective within their organizations.

As with the structure determination group, managers for CADD reported that their staff is generally tasked with two to three 'major' projects as well as several smaller efforts. Commercial software is used in the majority of companies although four organizations reported that they use academic collaborators to develop new technology. The topics addressed by these projects included knowledge-based folding/alignment techniques, threading, using database mining techniques on classes of proteins, and identification of new classes of templates. None of the organizations reported outsourcing their computational projects.

Notice: All material (Copyright © 1996-2009 by Network Science Corporation. All rights reserved) is published by Network Science Corporation, 4411 Connecticut Avenue NW, STE 514, Washington, DC 20008-8677 Telephone: (828) 817-9811, E-mail: TheEditors@netsci.org.

Material for this survey was collected from the best available sources; however, its accuracy cannot be guaranteed. Network Science Corporation, or any person acting on its behalf, makes no warranty or representation, expressed or implied, with respect to the accuracy, completeness, or usefulness of the material contained in this publication or the programs or data described herein. We assume no responsibility or liability with respect to the use of this publication, any materials contained herein, programs described herein, or for any damages resulting from the use of any of the above. No part of this publication can be reproduced in any form or by any means without permission in writing from Network Science Corporation.



NetSci, ISSN 1092-7360, is published by Network Science Corporation. Except where expressly stated, content at this site is copyright (© 1995 - 2010) by Network Science Corporation and is for your personal use only. No redistribution is allowed without written permission from Network Science Corporation.