CoNekT Bioenergy

Open data-mining platforms democratize access to analyses that would otherwise be restricted to scientists with computer science skills. CoNekT is a platform for the analysis of expression profiles and co-expression networks; it can be used to study one or more conditions or tissues in an organism and to compare different organisms. Because CoNekT is open source, research groups with particular interests can populate their own instance to mine transcriptome data and raise hypotheses in their field of biology. Our research group studies plants of interest for bioenergy. We recently implemented an instance of CoNekT, named “CoNekT Bioenergy”, for the analysis of plants with publicly available RNA-Seq data, particularly grasses with C3 and C4 metabolism. The instance is hosted on the cloud service “USP Internuvem”, using the Apache2 web server and MariaDB/MySQL as the database management system. To prospect RNA-Seq datasets, we selected plants with C4 metabolism in the genera Setaria, Sorghum, Pennisetum, Panicum, Miscanthus, Paspalum, and Digitaria, as well as sugarcane, and plants with C3 metabolism, including Brachypodium distachyon and Nicotiana tabacum. To select publicly available sequencing runs, we retrieved NCBI metadata using Python scripts connected to an sqlite3 database (through the Python 3 built-in connector) and the Biopython Entrez module, which retrieves NCBI information such as SRA, BioSample, and BioProject metadata. The following criteria were used to download sequencing data: short-read “bulk” RNA-Seq, “paired-end” layout, experiments with an associated publication indexed in PubMed, and “strand-specific” sequencing. Since strandedness information is not available for most datasets, we used Salmon to infer it. So far, we have downloaded sequencing data for leaf experiments of Setaria viridis (C4 model) and for two different organs (leaves and flowers) of Nicotiana tabacum (C3).
For these two species, we used 130 and four datasets, respectively, to quantify transcript expression and generate a TPM matrix, followed by inference of a co-expression network with LSTrAP. CDS were downloaded from Phytozome-JGI (S. viridis) and SolGenomics (N. tabacum), and functional annotation was obtained from the same databases or carried out with InterProScan. Next steps include functional annotation, selection of additional datasets, and generation of expression profiles and co-expression networks for the remaining species, as well as the use of natural language processing tools to better exploit NCBI metadata. We will also study the expression of specific genes and the conservation of co-expression modules in plants of interest, assign Plant Ontology terms to help organize the datasets analysed by users of our platform, include new species, and implement new functionalities.
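
As an illustration of the dataset selection criteria listed in this abstract, the sketch below filters run metadata held in an sqlite3 database using Python's built-in connector. It is not the project's actual code: the table `runs`, its column names, the helper functions, and the demo accessions are all hypothetical, and the values `ILLUMINA`, `TRANSCRIPTOMIC`, and `PAIRED` are assumed to mirror the SRA metadata vocabulary. The `stranded` column is assumed to be filled in after strandedness has been inferred (for example with Salmon), with NULL meaning "not yet known".

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not the tables used by the CoNekT Bioenergy scripts.
SCHEMA = """
CREATE TABLE runs (
    run_accession  TEXT PRIMARY KEY,
    organism       TEXT NOT NULL,
    platform       TEXT,
    library_source TEXT,
    library_layout TEXT,
    pubmed_id      TEXT,    -- NULL when no publication is linked
    stranded       INTEGER  -- 1 = strand-specific, 0 = unstranded, NULL = unknown
);
"""

# One condition per selection criterion: short reads (assumed here to
# mean Illumina), bulk RNA-Seq, paired-end layout, linked publication,
# and strand-specific sequencing.
SELECT_ELIGIBLE = """
SELECT run_accession
FROM runs
WHERE organism = ?
  AND platform = 'ILLUMINA'
  AND library_source = 'TRANSCRIPTOMIC'
  AND library_layout = 'PAIRED'
  AND pubmed_id IS NOT NULL
  AND stranded = 1
ORDER BY run_accession;
"""


def eligible_runs(conn, organism):
    """Return run accessions of `organism` that meet every criterion."""
    rows = conn.execute(SELECT_ELIGIBLE, (organism,)).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create an in-memory database with made-up records for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    demo = [
        # Meets all criteria.
        ("RUN_A", "Setaria viridis", "ILLUMINA", "TRANSCRIPTOMIC", "PAIRED", "PMID_1", 1),
        # Single-end layout.
        ("RUN_B", "Setaria viridis", "ILLUMINA", "TRANSCRIPTOMIC", "SINGLE", "PMID_1", 1),
        # No linked publication.
        ("RUN_C", "Setaria viridis", "ILLUMINA", "TRANSCRIPTOMIC", "PAIRED", None, 1),
        # Single-cell rather than bulk.
        ("RUN_D", "Setaria viridis", "ILLUMINA", "TRANSCRIPTOMIC SINGLE CELL", "PAIRED", "PMID_2", 1),
        # Strandedness not yet inferred.
        ("RUN_E", "Setaria viridis", "ILLUMINA", "TRANSCRIPTOMIC", "PAIRED", "PMID_2", None),
        # Meets all criteria, different species.
        ("RUN_F", "Nicotiana tabacum", "ILLUMINA", "TRANSCRIPTOMIC", "PAIRED", "PMID_3", 1),
    ]
    conn.executemany("INSERT INTO runs VALUES (?, ?, ?, ?, ?, ?, ?)", demo)
    conn.commit()
    return conn


if __name__ == "__main__":
    connection = build_demo_db()
    print(eligible_runs(connection, "Setaria viridis"))
```

Keeping the criteria in a single parameterized query makes it straightforward to rerun the same selection for each additional species as its metadata is loaded.
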

Poster presented at the XIII Simposio de Pos-Graduandos do CENA
Renato Augusto Correa dos Santos
Postdoc - Gene regulatory networks in grasses
Diego Mauricio Riaño-Pachón
Assistant Professor (MS3.2) in Computational, Evolutionary and Systems Biology

I am a computational biologist/bioinformatician at the University of São Paulo, Campus Luiz de Queiroz (Piracicaba/SP, Brazil).

Felipe Vaz Peres
Master's student - PPG - Ciências CENA/USP - Multi-genotype analysis of long non-coding RNAs in sugarcane

Decoding the data-driven secrets of life.

Jorge Mario Muñoz Pérez
PhD student - PPG - Ciências CENA/USP - Sugarcane Co-expression networks

I am a PhD student at PPG - Ciências at the Center for Nuclear Energy in Agriculture, University of São Paulo. I am interested in the intersection of plant biology, programming, and mathematics. I have worked in plant biotechnology, population genomics, and machine learning applied to the discovery of antimicrobial peptides. I currently work on sugarcane omics, focusing on genomic resources and gene co-expression network analysis.

Arthur Shuzo Owtake Cardoso
Undergraduate student - Biotechnology (USP) - CoNekT Bioenergy
