SUMMER - Shiny Utility for Metabolomics and Multiomics Exploratory Research








Background

SUMMER is an R Shiny application for multiomics analysis (metabolomics combined with transcriptomics or proteomics).

SUMMER was developed by the Bioinformatics Core at the Salk Institute. Please email lhuang@salk.edu or mshokhirev@salk.edu with any questions.


Tutorial: summer tutorial

Test Datasets:

Download the test input file for metabolites [1]

Download the test input file for genes [1]

Download the test input file for proteins (transformed from the gene test dataset) [1]

Note: raw intensities may be written in either '1000' or '1,000' format. If the '1,000' format is used, the file must be delimited by tabs or spaces rather than commas. The gene expression table does not allow commas within numbers.
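As an illustration of the note above, the following minimal Python sketch reads a tab-delimited table whose intensities may contain thousands separators. It is not SUMMER's own parser (SUMMER is written in R); the function names and the two-row example table are hypothetical.

```python
import csv
import io


def parse_intensity(text):
    """Convert '1000' or '1,000' to a float by dropping thousands separators."""
    return float(text.strip().replace(",", ""))


def read_intensity_table(handle):
    """Read a tab-delimited table: first column is an ID, the rest are intensities."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = {}
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        rows[fields[0]] = [parse_intensity(value) for value in fields[1:]]
    return header, rows


# Hypothetical example table with KEGG-style compound IDs and made-up intensities.
example = "ID\tA1\tB1\nC00022\t1,000\t2500\nC00024\t12,345\t980\n"
header, table = read_intensity_table(io.StringIO(example))
print(table["C00022"])  # [1000.0, 2500.0]
```

Because the delimiter is a tab, the comma inside '1,000' is never mistaken for a column boundary, which is why a comma-delimited file cannot use that number format.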


More about input data formats:

SUMMER accepts the following input formats:

1. Metabolomics: KEGG [2] compound ID with raw intensity from Mass Spec.

2. Transcriptomics: Entrez Gene ID with FPKM/TPM values or microarray raw intensity.

3. Proteomics: Entrez Gene ID with raw intensity from Mass Spec. Imputation and quantile normalization will be performed on the input data. If your proteomics data is already normalized, you can upload it in the 'Transcriptomics panel' instead. (Currently, either transcriptomics or proteomics data is accepted as input.)
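To make the proteomics preprocessing step more concrete, here is a minimal Python sketch of imputation followed by quantile normalization. The text above does not say which imputation method SUMMER uses; half-minimum imputation is an assumption chosen purely for illustration, and both function names are hypothetical.

```python
def impute_half_min(column):
    """Replace missing values (None) with half of the column's smallest observed value.

    Assumption: half-minimum imputation is used only as an example strategy.
    """
    observed = [v for v in column if v is not None]
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in column]


def quantile_normalize(columns):
    """Quantile-normalize equally long sample columns (ties are not averaged)."""
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n_rows)
    ]
    normalized = []
    for col in columns:
        # Row indices ordered from smallest to largest value in this sample.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, row_index in enumerate(order):
            out[row_index] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples (columns) measured over four proteins (rows).
samples = [[5.0, 2.0, None, 4.0], [4.0, 1.0, 4.5, 2.0]]
imputed = [impute_half_min(col) for col in samples]
normalized = quantile_normalize(imputed)
print(normalized[0])  # [4.75, 2.0, 1.0, 4.0]
print(normalized[1])  # [4.0, 1.0, 4.75, 2.0]
```

After normalization every sample has the same set of values, only assigned to different proteins according to each protein's rank within its sample.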


SUMMER Workflow:

SUMMER uses bootstrap resampling to estimate the plausible range of the change in reaction rate potentials. The idea is similar to the permute-by-gene method in GSEA [3]. Reactions are taken from the existing knowledge about metabolic reactions in the KEGG database [4]; support for user-specified reactions may be added in future updates. Currently, only human and mouse are supported. Metabolomics data together with either transcriptomics or proteomics data are required to construct the reaction network. SUMMER creates an interactive network graph for data exploration and reports the significantly altered pathways based on the metabolite and reaction changes.
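The general idea of using a bootstrap to estimate a range for a change can be sketched as follows. This is a generic percentile bootstrap for the difference in group means, not SUMMER's actual statistic: the definition of the reaction rate potential is not given here, so the function name, the input values, and the choice of a percentile interval are all assumptions.

```python
import random
import statistics


def bootstrap_interval(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(group_b) - mean(group_a)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_b) - statistics.fmean(resample_a))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical log-scale values for one quantity in two groups.
group_a = [10.1, 9.8, 10.3, 10.0]
group_b = [11.2, 11.0, 11.5, 10.9]
lower, upper = bootstrap_interval(group_a, group_b)
print(f"observed difference: {statistics.fmean(group_b) - statistics.fmean(group_a):.2f}")
print(f"95% bootstrap interval: [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero suggests a consistent change between the groups under this simplified setup.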


References:

[1] Elevating acetyl-CoA levels reduces aspects of brain aging. Currais et al., eLife 2019;8:e47866

[2] New Approach for Understanding Genome Variations in KEGG. Kanehisa et al., Nucleic Acids Research 2019; 47(D1):D590-D595.

[3] Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Subramanian et al., PNAS 2005; 102(43): 15545-15550

[4] WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Liao et al., Nucleic Acids Research 2019; 47:W199–W205

SUMMER was developed in the R Shiny environment and uses the following packages:

shiny, limma, WebGestaltR, visNetwork, igraph, DT, ggplot2, plotly, data.table, dplyr, parallel

summer cartoon figure credit: smallq111


Principal Component Analysis (PCA) is a useful tool for checking sample quality through multivariate analysis. Once the input dataset is successfully uploaded and mapped to the KEGG database, a button to run PCA will be shown below.

When multiple inputs are detected, the PCA plots are shown in the order metabolite, gene, and protein (at the bottom). PCA is run on all metabolites and on the top 10% most variable genes/proteins to minimize noise in the datasets.
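The feature selection that precedes PCA for genes/proteins can be sketched in Python as below. It assumes variability is measured by the sample variance across samples, which is not stated above; the function name and the toy table are hypothetical, and a PCA routine would then be run on the selected features only.

```python
import statistics


def top_variable_features(table, fraction=0.10):
    """Return the IDs of the most variable features, keeping at least one."""
    variances = {fid: statistics.variance(values) for fid, values in table.items()}
    n_keep = max(1, int(len(table) * fraction))
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_keep]


# Hypothetical table of 10 genes over 3 samples; variance grows with the index.
genes = {f"gene{i}": [1.0, 1.0 + i, 1.0] for i in range(10)}
selected = top_variable_features(genes)
print(selected)  # ['gene9']
```

Restricting PCA to the most variable features keeps low-variance, mostly noisy features from dominating the components.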








Once the input dataset is successfully uploaded and mapped to the KEGG database, a button to run differential expression (DE) analysis will be shown below.

Metabolites associated with KEGG reactions will be analyzed by limma to identify DE metabolites.

Expressed reactions that have at least one measured substrate and at least one measured product will be tested with a bootstrap method to identify DE reactions.

The comparison is set up as group B vs. group A. Up-regulation means that the level is higher in group B than in group A, and down-regulation means the opposite.

Default cutoff for the barplot (not adjustable): absolute logFC > 0.5 and adjusted p-value or ranking score < 0.05.
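The barplot cutoff and the group B vs. group A direction convention can be written as a short classification rule. This is a sketch of the stated thresholds only; the function name is hypothetical, and logFC is assumed to be the log of group B over group A.

```python
def classify(log_fc, score, fc_cutoff=0.5, score_cutoff=0.05):
    """Label a group B vs. group A change as 'up', 'down', or 'ns' (not significant).

    score is the adjusted p-value (metabolites) or ranking score (reactions).
    """
    if abs(log_fc) > fc_cutoff and score < score_cutoff:
        return "up" if log_fc > 0 else "down"
    return "ns"


print(classify(1.2, 0.01))   # up: higher in group B
print(classify(-0.8, 0.03))  # down: lower in group B
print(classify(0.5, 0.01))   # ns: logFC must be strictly greater than 0.5
print(classify(2.0, 0.20))   # ns: score is not below 0.05
```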




A button to run pathway analysis will be shown after the DE analysis is performed.

Currently, only over-representation analysis is supported.
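Over-representation analysis is commonly computed with a one-sided hypergeometric test, which the following Python sketch illustrates. Whether SUMMER's pathway analysis uses exactly this test and these inputs is an assumption; the function name and the counts are made up for illustration.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric X.

    n_universe: all features that could have been selected
    n_pathway:  features in the universe that belong to the pathway
    n_selected: features selected as significantly changed
    n_overlap:  selected features that belong to the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 features, 5 in the pathway, 4 selected, 3 in both.
p = ora_p_value(20, 5, 4, 3)
print(f"{p:.4f}")  # 0.0320
```

A small p-value means the pathway contains more changed features than expected by chance given its size.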

A network can be constructed for each pathway in the KEGG reaction pathway database, regardless of that pathway's significance, to provide an unbiased overview of what happens at the pathway level. However, care should be taken when interpreting the information in the pathway network. The pathway network is shown at the bottom of the page; please scroll down to find it.


A button to render the master network graph will be available once the DE analysis is performed. Please select a cutoff to control the size of the network. A stringent cutoff yields a small network containing only the most significant changes, whereas a relaxed cutoff yields a larger and more connected network.