Moving data from the warehouse to the workbench: a bridge to Galaxy ..., 20160629

Written By IUPTI on Wednesday, Apr 19, 2017 | 03:32 PM

 
Galaxy Community Conference 2016, Indiana University - Bloomington | https://gcc2016.iu.edu/

Moving data from the warehouse to the workbench: a bridge to Galaxy from the Tripal community genome database software platform
https://gcc16.sched.com/event/b66d19af0d6b6c6c90c7c6fd5985eea3#

Authors: Margaret Staton (1), Ming Chen (1), Nathan Henry (1), Emily Grau (2), Connor Wytko (3), Brian Soto (3), Sook Jung (3), Kuangching Wang (4), Nick Watts (5), Chun-huai Cheng (3), Lacey A. Sanderson (6), Jill Wegrzyn (2), Doreen Main (3), F. Alex Feltus (7), Stephen P. Ficklin (3)

(1) University of Tennessee Institute of Agriculture Department of Entomology and Plant Pathology, Knoxville, TN 37996, USA
(2) University of Connecticut Department of Ecology and Evolutionary Biology, Storrs, CT 06269 USA
(3) Washington State University Department of Horticulture, Pullman, WA 99164 USA
(4) Clemson University Department of Electrical & Computer Engineering, Clemson, SC 29634 USA
(5) Clemson University, Clemson Computing and Information Technology, Anderson, SC 29625 USA
(6) University of Saskatchewan, Department of Plant Sciences, Saskatoon, Saskatchewan, SK S7N Canada
(7) Clemson University Department of Genetics & Biochemistry, Clemson, SC 29634 USA

Abstract: Online community genome databases offer curated and mission-specific data and information to scientists with shared basic and applied research goals. In an effort to share a common code base, standardize storage formats, and simplify site construction, a coalition of genome databases has developed the software Tripal. Tripal is an open-source platform that bridges Drupal, a popular content management system (CMS), and Chado, a standardized relational database for storage of biological data. Users of community databases need not only to discover, visualize and download genomic information but also to port it directly to analysis workflow software such as the Galaxy platform.

Through development of the new Tripal Galaxy module, site visitors will be able to select custom datasets from within and across Tripal databases and import them directly to a Galaxy instance from within a Tripal-based site. Additionally, a set of pre-designed workflows for common analyses needed by users of community databases will be made publicly available, including functional annotation of gene sequences, genomic variant discovery and genotype/phenotype association. Current efforts are focused on enabling authenticated users to move data from within a Tripal community database to the Tripal community Galaxy instance or a public Galaxy instance, creation of PHP bindings for the Galaxy API, and establishment of the most commonly needed analysis workflows for database users.

Speaker: Margaret Staton, University of Tennessee Knoxville
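
The abstract names bindings for the Galaxy API as the way a Tripal site would hand data to Galaxy. As a rough illustration of the kind of call such a binding wraps, the sketch below builds (but does not send) an HTTP request asking a Galaxy server to create a new history that exported data could be placed into. It is not the Tripal Galaxy module's code, and it uses Python rather than PHP purely for illustration. The host name, API key and history name are hypothetical placeholders. The `/api/histories` endpoint and the `x-api-key` header reflect the Galaxy REST API as it is commonly documented, so they should be checked against the target instance.

```python
import json
import urllib.request


def build_create_history_request(galaxy_url, api_key, history_name):
    """Build (but do not send) a request asking Galaxy to create a history.

    Assumes Galaxy's REST API accepts POST /api/histories with a JSON body
    containing "name", authenticated through the x-api-key header.
    """
    endpoint = galaxy_url.rstrip("/") + "/api/histories"
    body = json.dumps({"name": history_name}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,
        },
    )


# Hypothetical values, for illustration only.
request = build_create_history_request(
    "https://galaxy.example.org/", "PLACEHOLDER_KEY", "Tripal export"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a real server. Keeping construction separate from sending makes the payload easy to inspect before any data leaves the originating site.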