UCSD and the Venter Institute establish a Web-based resource for advancing metagenomic research
March 2007 - The Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA) project launched its Web site to the public on January 23, 2007. By leveraging emerging technologies in distributed data storage and analysis, the CAMERA Web site (camera.calit2.net) is on its way to becoming a community resource that gives researchers access to high-performance computational resources for analyzing raw environmental sequence data and associated metadata. This kind of cyberinfrastructure is not currently available at other gene sequence resources.
Led by a multidisciplinary group from the California Institute for Telecommunications and Information Technology (Calit2) at UCSD, CAMERA is a collaboration of the J. Craig Venter Institute (JCVI) of Rockville, Maryland, the Center for Earth Observations and Applications and the Scripps Genome Center (both at the Scripps Institution of Oceanography), and the San Diego Supercomputer Center. It has brought together leaders in high-throughput DNA sequencing and metagenomic analysis tools with advanced cyberinfrastructure: optically coupled computing, emerging Grid middleware, and user workspaces. Responding to the needs of an exceptionally broad, multidisciplinary scientific community, the CAMERA team is developing and deploying tools that will allow biologists to access and analyze millions of gene sequences in a collaborative Web-based work environment, or Portal.
The Principal Investigator of the project is Dr. Larry Smarr, Director of Calit2. Several groups of UCSD scientists are co-investigators on the CAMERA project, including John Wooley and Peter Arzberger of the Center for Research in Biological Systems (CRBS). Wooley's group, the Joint Center for Structural Genomics (JCSG), is helping CAMERA researchers advance the development of ocean-derived drugs and therapies for combating cancer and other diseases by deploying some of JCSG's high-throughput analysis workflows and collaboration tools. Arzberger's group at the National Biomedical Computation Resource (NBCR) is helping CAMERA leverage NBCR's library of key technologies in grid-enabled computation, data integration and modeling, and interactive visualization so that CAMERA researchers can explore large volumes of metagenomic data.
To address the computationally intense questions associated with CAMERA's large metagenomic data collection, Philip Papadopoulos and computer programmers at the San Diego Supercomputer Center (SDSC) are coupling middleware, compute, and storage capabilities with high-performance computing facilities in a unified and scalable architecture wrapped in a researcher-friendly interface.
Together, the developers from CRBS and SDSC are able to leverage technology developed over the years for other CRBS cyberinfrastructure projects, including Web portal environments, access to scalable computing clusters, and a geographical information system (GIS)-based metadata query interface. Using a service-oriented architecture, CRBS's extensible cyberinfrastructure provides the fundamental framework for hosting collaborative research across interinstitutional and interdisciplinary communities of scientists and non-scientists.
Early in 2006, the Gordon and Betty Moore Foundation awarded UCSD and the J. Craig Venter Institute a seven-year, $24.5 million grant to help build a community resource for advancing metagenomics research. To support an intentional paradigm shift in the way the science of metagenomics develops, CAMERA's knowledge base will encompass massive amounts of genomic sequence and associated metadata in a tool-rich workplace on the Web.
In deciding to fund the CAMERA project, the Moore Foundation recognized the important contributions that marine metagenomics can make if the volumes of metagenomic data being collected are made more accessible.
For at least 3.5 billion years, microbes have played an integral role in the history and function of life on Earth, becoming the fundamental engines that drive cycles of energy and matter on our planet. They represent the single largest source of evolutionary and biochemical diversity on the planet, and yet our understanding of how natural microbial communities function on land and in the oceans is limited to the tiny fraction of microbes that can be successfully cultivated and characterized.
Gene sequencing has the power to change this. The development of shotgun sequencing technology for examining all of the sequences within small samples of water or soil is at the crux of environmental genomics and has the potential to revolutionize how biologists identify and characterize legions of unseen microorganisms. Metagenomics will allow us to explore the enormous potential of this vast reservoir of genetic and biochemical diversity, both to create fundamental knowledge and to construct new synthetic applications that address societal needs, such as energy sources and the treatment of diseases.
Currently, CAMERA's Web site provides access to three datasets, the largest being the genomic data acquired by the Venter Institute's 2003-06 Global Ocean Sampling (GOS) expedition, which collected seawater samples for genomic analysis every 200 miles during its circumnavigation of the world. In the future, additional gene sequences, gene families, and associated environmental metadata will be integrated into the CAMERA Web site. Biologists are invited to use and provide feedback on the analytical tools currently available on the CAMERA Web site.
Serving the needs of the scientific community is the central focus of the CAMERA effort. New software releases, training sessions, and periodic solicitation of feedback will ensure that CAMERA's infrastructure and services evolve to meet the needs and priorities of the genomics, microbiology, and molecular biology communities.
Figure 1. CAMERA Web site (http://camera.calit2.net/)
- John Wooley, Co-Principal Investigator CAMERA; Calit2 and Associate Vice Chancellor of Research, UCSD
- Peter Arzberger, Co-Principal Investigator CAMERA; Director and Principal Investigator, National Biomedical Computation Resource; Chair, Pacific Rim Application and Grid Middleware Assembly
- Lee G. Hornbrook, External Relations (858) 822-0755, email@example.com
About Moore Foundation
The Gordon and Betty Moore Foundation, established in September 2000, works in collaboration with grantees and other partners to achieve significant and measurable outcomes in three areas: environmental conservation, science and the San Francisco Bay Area. In April 2004, the Foundation launched its 10-year Marine Microbiology Initiative, with the goal of attaining new knowledge regarding the composition, function and ecological role of microbial communities in the world's oceans. www.moore.org
About Calit2
The California Institute for Telecommunications and Information Technology, a partnership between UC San Diego and UC Irvine, houses over 1,000 researchers organized around more than 50 projects on the future of telecommunications and information technology and how these technologies will transform a range of applications important to the economy and citizens' quality of life. www.calit2.net
About CRBS
The Center for Research in Biological Systems (CRBS), established in 1996 by UCSD, provides a dynamic environment where researchers from biology, medicine, chemistry, and physics team up with colleagues from computer science and other information technologies in order to drive collaborative studies of complex biological systems across multiple spatial scales. Major foci of the Center's activities include four federally funded research projects: the National Center for Microscopy and Imaging Research (NCMIR); the Biomedical Informatics Research Network (BIRN); the National Biomedical Computation Resource (NBCR); and the Joint Center for Structural Genomics. http://crbs.ucsd.edu
About CEOA
The Center for Earth Observations and Applications was established in November 2005 by UCSD to stimulate support and coordinate sustained research and applications in Earth observations at the university. Led by Scripps Institution of Oceanography in partnership with Calit2 and other campus organizations, CEOA provides an integrating vision for work across the spectrum of natural, physical, and social sciences, engineering, and information technology related to Earth observations and applications. http://ceoa.ucsd.edu
About J. Craig Venter Institute
J. Craig Venter Institute is a not-for-profit research institute dedicated to the advancement of the science of genomics; the understanding of its implications for society; and communication of those results to the scientific community, the public, and policymakers. Founded by J. Craig Venter, Ph.D., Venter Institute is home to approximately 200 staff and scientists with expertise in human and evolutionary biology, genetics, bioinformatics/informatics, information technology, high-throughput DNA sequencing, genomic and environmental policy research, and public education in science and science policy. J. Craig Venter Institute is a 501(c)(3) organization. For additional information, please visit http://www.venterinstitute.org.
About SDSC
In 2005, the San Diego Supercomputer Center (SDSC) celebrated two decades of enabling international science and engineering discoveries through advances in computational science and high-performance computing. Continuing this legacy into the era of cyberinfrastructure, SDSC is a strategic resource to academia and industry, providing leadership in Data Cyberinfrastructure, particularly with respect to data curation, management and preservation, data-oriented high-performance computing, and Cyberinfrastructure-enabled science and engineering. SDSC is an organized research unit of the University of California, San Diego and one of the founding sites of NSF's TeraGrid. www.sdsc.edu