Computational Visualization Center University of Texas at Austin   
National Science Foundation


Sub-nanometer structure based fold determination of large biological complexes.
(2003 - 2006)


(a) outer shell of RDV (b) asymmetric unit of RDV outer shell (c) P8 trimer after averaging four types of trimers in an asymmetric unit (d) P8 monomer with helixhunter results
(e) inner shell of RDV (f) asymmetric unit of RDV inner shell with two P3 (g) P3 monomer with helixhunter results (h) putative beta sheet density in P8 (arrow)
The identified helices are annotated as cylinders in (d) and (g): green marks helices identified with high confidence, yellow those identified with lower confidence.


Applications of anisotropic filtering on Rice Dwarf Virus (RDV).
(a) unfiltered slice of RDV (b) filtered slice of RDV. The densities are colored radially. (c) top and side views of the polymerase complex at the 5-fold vertices, extracted after the filter is applied.


Fuzzy C-means segmentation.
(a) Contour Spectrum of Rice Dwarf Virus (RDV) (b) Preliminary results of Fuzzy C-means classification using 5 materials in order to segment the salient structural features of the RDV capsid.
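For readers unfamiliar with the technique named in this caption, the following is a minimal sketch of standard fuzzy C-means applied to scalar density values. It is not the project's implementation: the function names, the evenly spaced initialization, the fuzzifier m = 2, and the toy density values are all assumptions made for illustration. Each value receives a graded membership in every class rather than a single hard label.

```python
def _memberships(values, centers, exponent):
    """Fuzzy membership of each value in each class (every row sums to 1)."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        on_center = [k for k, d in enumerate(dists) if d == 0.0]
        if on_center:
            # Value coincides with one or more centers: share full membership among them.
            share = 1.0 / len(on_center)
            rows.append([share if k in on_center else 0.0 for k in range(len(centers))])
        else:
            inv = [d ** -exponent for d in dists]
            total = sum(inv)
            rows.append([w / total for w in inv])
    return rows


def fuzzy_c_means(values, n_classes, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy C-means on scalar data (hypothetical helper, not project code).

    Returns (centers, memberships), with memberships[i][k] the degree to which
    values[i] belongs to class k.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    if not 1 <= n_classes <= len(values):
        raise ValueError("n_classes must be between 1 and len(values)")
    lo, hi = min(values), max(values)
    # Assumption: deterministic initialization, centers evenly spaced over the data range.
    if n_classes == 1:
        centers = [(lo + hi) / 2.0]
    else:
        centers = [lo + (hi - lo) * k / (n_classes - 1) for k in range(n_classes)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        u = _memberships(values, centers, exponent)
        new_centers = []
        for k in range(n_classes):
            # Center update: mean of the data weighted by membership**m.
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            if wsum > 0.0:
                new_centers.append(sum(w * x for w, x in zip(weights, values)) / wsum)
            else:
                new_centers.append(centers[k])
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Recompute memberships so they correspond to the final centers.
    return centers, _memberships(values, centers, exponent)


# Toy example: six made-up density samples forming two obvious groups.
densities = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, memberships = fuzzy_c_means(densities, n_classes=2)
labels = [row.index(max(row)) for row in memberships]
```

A hard segmentation, such as the five materials mentioned in the caption, can be obtained by assigning each voxel to the class with the largest membership, as the `labels` line does for the toy data.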


Structural genomics initiatives aim to solve the structures of most existing protein folds by x-ray crystallography and NMR spectroscopy, so that most of the remaining proteins can be modeled with useful accuracy based on their similarity to the known structures. While structures of individual proteins or small complexes, such as most of the Protein Data Bank entries, provide important information, they do not necessarily yield the "full picture" of a functional biological complex. The study of large macromolecular complexes, such as viruses, ion channels, the ribosome and other machines of various types, offers a more complete structural and functional description of the protein machinery. In addition to x-ray crystallography, electron cryomicroscopy (cryoEM) of single particles has become a powerful tool for revealing the structures of large complexes at subnanometer resolutions (5-10 Å) [1-7].

Although recent advances have pushed structure determination by cryoEM to subnanometer resolutions, the infrastructure for analysis and visualization of the assemblies remains relatively undeveloped. We shall develop computational and visualization tools for graphical display, feature extraction and modeling of large macromolecular complexes based on the subnanometer resolution data obtained by cryoEM. In view of the progress made in biochemical purification of large complexes and the improved resolution of cryoEM, we expect that the number of macromolecular complex structures solved at subnanometer resolution will continue to increase [8]. While it is not possible to unambiguously determine a full atomic model at this resolution, it is possible to define molecular domains and secondary structure elements, such as helices and sheets. With the help of computational modeling, we will define the connectivity among these observed secondary structure elements and derive the folds of protein domains within the larger complex. With this information in hand, we will also link our structural informatics with other biophysical and biochemical informatics to interpret the functional mechanisms of biological machines and their components.

In this project, we combine the complementary expertise of Chandrajit Bajaj at University of Texas, Austin, in visualization, feature recognition, and geometric modeling, Wah Chiu at Baylor College of Medicine in electron cryomicroscopy, and Andrej Sali at University of California, San Francisco, in protein structure modeling. Both simulated and experimental data will be used for testing and validation of our approaches, as described in the following sections. To facilitate close collaboration between these three groups, monthly video-conferences will be held. An annual meeting at one of the participating institution sites will also be organized, for face-to-face interactions among all the investigators. We will also disseminate our tools via a formal workshop, as well as maintain an open source code policy. We expect that we will create a computational infrastructure to which other investigators will contribute their algorithms, so that both the computational and biological communities are able to extract maximum structural information from large macromolecular complexes. Ultimately, this effort will culminate in a better understanding of the structure and function of proteins, and thus contribute to the usefulness of genome sequencing, structural genomics and functional genomics in biology.

A major emphasis in this proposal is to train undergraduate students, graduate students and postdoctoral fellows in an interdisciplinary scientific computing environment. The proposed research in the field of computational biology shall actively involve graduate students and stimulate classroom teaching. In addition, we will host an annual workshop to disseminate our technology to a broader community.

Principal Investigators:
Chandrajit Bajaj, University of Texas at Austin
Wah Chiu, Baylor College of Medicine at Houston
Andrej Sali, University of California at San Francisco

This material is based upon work supported by the National Science Foundation under Grant No. 0325550.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


