An example study and pipeline:  What is the role of tumorin in tumor development and survival in TNBC?

The goal is to understand how central metabolism is impacted by the knockdown, including sources of NADPH needed for proliferation. The user may already have information from Seahorse analysis and on the phenotypic effects of the knockdown. Cell cycle distribution analysis is also important for the overall interpretation.


1.  3 cell lines ± shRNA against tumorin, triplicate experiments, 3 tracers ([U-13C]-Glc, [13C1,2]-Glc, [U-13C,15N]-Gln) = 54 experiments.
Polar + non-polar metabolites = 108 analytical samples.
Protein may be used for additional experiments (e.g. expression), and/or normalization.
Polar metabolites analyzed by NMR and GC-MS or IC/FT-MS = 216 experiments,
+ 54 FT-MS analyses of the non-polar fraction,
total = 270 analyses.
Media samples at 5 time points for each dish = 270 media samples, analyzed by NMR and MS,
total = 540 analyses.
Total analyses = 810.


2.  Same cell lines as orthotopic xenografts in NSG mice (5 mice/group), two tracers = 60 mice.
Tumor tissue + non-tumor tissue + 2 blood samples per mouse = 240 samples.
120 blood samples analyzed by MS and NMR = 240 analyses.
Tissue polar and non-polar fractions = 480 analyses.
Total mouse sample analyses = 720.
Grand total = 1530 analyses.
Including isotopomers and isotopologues, the number of quantified analytes for this study is > 200,000.
At this point, the data are reduced to lists of compounds, their isotopomers and their amounts for biological interpretation.
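The tallies above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the SOPs. It assumes that a mouse "group" means one cell line × one shRNA condition × one tracer, and it copies the figure of 216 polar analyses directly from the text because the platform breakdown behind it is not spelled out. All names are illustrative.

```python
# Design factors as stated in the study description.
CELL_LINES = 3
CONDITIONS = 2          # with and without shRNA
REPLICATES = 3
CULTURE_TRACERS = 3
MEDIA_TIME_POINTS = 5
MICE_PER_GROUP = 5
MOUSE_TRACERS = 2
PLATFORMS = 2           # NMR and MS
POLAR_ANALYSES = 216    # copied from the text, not derived here


def cell_culture_counts():
    """Return sample and analysis counts for the cell culture arm."""
    experiments = CELL_LINES * CONDITIONS * REPLICATES * CULTURE_TRACERS
    extract_samples = experiments * 2                 # polar + non-polar
    extract_analyses = POLAR_ANALYSES + experiments   # + FT-MS of non-polar
    media_samples = experiments * MEDIA_TIME_POINTS
    media_analyses = media_samples * PLATFORMS
    return {
        "experiments": experiments,
        "extract_samples": extract_samples,
        "extract_analyses": extract_analyses,
        "media_samples": media_samples,
        "media_analyses": media_analyses,
        "total_analyses": extract_analyses + media_analyses,
    }


def mouse_counts():
    """Return sample and analysis counts for the xenograft arm."""
    # Assumption: group = cell line x shRNA condition x tracer.
    mice = CELL_LINES * CONDITIONS * MOUSE_TRACERS * MICE_PER_GROUP
    blood_samples = mice * 2                          # 2 blood samples per mouse
    tissue_samples = mice * 2                         # tumor + non-tumor
    blood_analyses = blood_samples * PLATFORMS
    tissue_analyses = tissue_samples * 2 * PLATFORMS  # polar + non-polar
    return {
        "mice": mice,
        "samples": blood_samples + tissue_samples,
        "blood_analyses": blood_analyses,
        "tissue_analyses": tissue_analyses,
        "total_analyses": blood_analyses + tissue_analyses,
    }


if __name__ == "__main__":
    culture = cell_culture_counts()
    mouse = mouse_counts()
    print("cell culture analyses:", culture["total_analyses"])
    print("mouse analyses:", mouse["total_analyses"])
    print("grand total:", culture["total_analyses"] + mouse["total_analyses"])
```

Changing any design factor (for example, the number of tracers) propagates through every downstream count, which is useful when budgeting instrument time for a variant of this design.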


Outline Procedure


Design experiment -> execute biological experiment with tracers -> harvest sample -> prepare sample for analysis -> analytical data acquisition -> data reduction -> information retrieval and interpretation.
Detailed SOPs are available.


Cell Culture


Grow cells in culture in triplicate with each tracer ± shRNA: 18 experiments per cell line (x3 for the three cell lines).
Sample media at defined time points and store (e.g. 0, 3, 6, 9, 24 h; the zero time point is critical) (90 samples per cell line)
Harvest and extract cells
Store metabolite fractions (polar, non-polar, protein)
Dried samples must be reconstituted in an appropriate volume of buffer for each analytical platform, and loaded into correctly labeled sample tubes.
Prepare for NMR – run first for quality control on sample and extraction integrity
Prepare for GC-MS or IC-MS
Prepare for FT-MS
Record spectra on the different platforms
Reduce data to raw isotopologue distributions for each tracer
Repeat any bad experiments
Repeat for next cell line


Tumor Bearing Mice


Treat with tracer, sample blood, harvest tissue
Extract tissues and blood
Prepare for analytical spectroscopy
Record spectra on the different platforms
Reduce data to raw isotopologue distributions for each tracer
Repeat any bad experiments


Data Acquisition and Reduction


GC-MS and IC-MS take 1 h per sample to run. QC/standard samples must be run in interleaved mode.
NMR spectra take 1.2 h per cell or tissue sample to run, and 0.5 h for plasma extracts.
FT-MS for lipids takes 10-15 min per sample.
Data reduction for this density of data takes 1-1.5 h/spectrum.
The results can be interpreted in terms of specific networks related to cell growth or survival, with limited flux information (in this design, only for inputs and outputs).
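The per-sample times above translate into instrument and analyst hours as follows. This is a rough planning sketch only: the platform keys and the example batch composition are hypothetical, and the extra time for interleaved QC/standard runs is not included.

```python
# Hours per sample as (low, high), taken from the figures quoted above.
HOURS_PER_SAMPLE = {
    "gc_or_ic_ms": (1.0, 1.0),
    "nmr_cell_or_tissue": (1.2, 1.2),
    "nmr_plasma": (0.5, 0.5),
    "ftms_lipid": (10 / 60, 15 / 60),
}
DATA_REDUCTION_HOURS = (1.0, 1.5)  # per spectrum


def estimate_hours(sample_counts):
    """Estimate (low, high) hours for acquisition and for data reduction.

    sample_counts maps a platform key in HOURS_PER_SAMPLE to the number
    of samples run on that platform. QC/standard runs are not counted.
    """
    acq_low = sum(n * HOURS_PER_SAMPLE[p][0] for p, n in sample_counts.items())
    acq_high = sum(n * HOURS_PER_SAMPLE[p][1] for p, n in sample_counts.items())
    spectra = sum(sample_counts.values())
    return {
        "acquisition": (acq_low, acq_high),
        "reduction": (spectra * DATA_REDUCTION_HOURS[0],
                      spectra * DATA_REDUCTION_HOURS[1]),
    }


if __name__ == "__main__":
    # Hypothetical batch: one cell line (18 experiments), each extract run
    # once by NMR, once by GC-MS or IC-MS, and once by lipid FT-MS.
    batch = {"nmr_cell_or_tissue": 18, "gc_or_ic_ms": 18, "ftms_lipid": 18}
    for stage, (low, high) in estimate_hours(batch).items():
        print(f"{stage}: {low:.1f} to {high:.1f} h")
```

For this hypothetical batch, data reduction takes longer than acquisition, so analyst time deserves as much attention in scheduling as instrument time.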


Data Processing and Analysis


The data comprise several components, as follows.

  1. Metadata that describe in exact detail the entire workflow from sample receipt to final products. No useful results can be obtained without these data. An Excel spreadsheet is required for these data, and a template is provided.

  2. Raw analytical data, i.e. the streams of bits coming from the instruments. For FT-MS and NMR these represent digitized electrical signals in the form of free induction decays comprising both real and imaginary parts. For other MS data, these are digital representations of ion counts.

  3. Raw analytical data have to be transformed into a usable form, which for FT-MS and NMR is the discrete Fourier transform and associated digital processing to suppress truncation artefacts, optimize signal-to-noise ratios etc. The resulting output is a spectrum of intensity versus frequency. For NMR, the frequency is usually transformed to chemical shift, in ppm, that is independent of magnetic field strength. For FT-MS, the frequencies are mapped onto an m/z range.

  4. Intensities (ordinate values) must be internally normalized to obtain amounts of materials (i.e. numbers of moles of substances or of ions), and back-calculated to the values associated with the original spectrum, on an agreed-upon measure of the amount of that species (such as biomass weight, protein mass etc.). This absolutely requires accurate metadata. The amounts are proportional to peak areas (or volumes), NOT peak heights; appropriate numerical or analytical integration procedures must be correctly applied, taking due account of baseline drift, phasing errors and peak overlap.

  5. For isotopomer and isotopologue analyses, the intensities are usually expressed as mole fractions (“enrichments”). As these are ratios, normalization to cell amount is not needed. For MS, the natural abundance needs to be corrected. Software is available for this [cf. Moseley (2010) Correcting for the effects of natural abundance in stable isotope resolved metabolomics experiments involving ultra-high resolution mass spectrometry. BMC Bioinformatics 11, 139].

  6. Spectral features need to be mapped onto identifiable molecules (“assignment”), using the available spectral information, and by reference to our own and other public databases.

  7. For “profiling” typically one is concerned with case-control comparisons, which require large numbers of specimens (each unique). Multivariate statistics are generally appropriate for initial analyses: are the groups different? What is different about these groups? PCA and OPLS-DA (SIMCA-P) may be used.

  8. Normalization. To compare case and control, the quantity of each metabolite must be normalized to the appropriate amount of specimen. Cell number is generally not appropriate as cell volumes vary widely among types, and also in response to treatment. Total biomass or a surrogate is appropriate (e.g. dry weight, total protein, total DNA).

Total DNA may not be appropriate in a case-control study because the amount of DNA per cell varies twofold during the cell cycle, and the control and treated samples do not necessarily have the same cell cycle distribution. Comparison of different cell types is then further compromised where there are different numbers of chromosomes present (diploid G1 normal cell, triploid cancer cell, tetraploid cells arrested at G2/M).

  9. With SIRM studies, a common question is which pathways were impacted, which requires pathway tracing (SIRM) and biochemical expertise.
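To illustrate the natural abundance correction mentioned in item 5, the sketch below applies a generic binomial correction-matrix approach to a 13C isotopologue distribution and then computes mole fractions and the mean fractional enrichment. It is not the algorithm of the cited Moseley (2010) software. It assumes that only 13C needs correcting (as when other isotopes are resolved at ultra-high resolution), that the tracer positions are fully labeled, and that natural 13C abundance is about 1.07%. Function names are illustrative.

```python
from math import comb

C13_NATURAL_ABUNDANCE = 0.0107  # approximate natural 13C fraction


def correction_matrix(n_carbons, p=C13_NATURAL_ABUNDANCE):
    """m[k][i]: probability that a molecule carrying i tracer-derived 13C
    atoms is observed with k 13C atoms in total, because each of its
    (n_carbons - i) remaining carbons is 13C with probability p."""
    size = n_carbons + 1
    m = [[0.0] * size for _ in range(size)]
    for i in range(size):
        unlabeled = n_carbons - i
        for k in range(i, size):
            extra = k - i
            m[k][i] = (comb(unlabeled, extra)
                       * p ** extra * (1 - p) ** (unlabeled - extra))
    return m


def correct_natural_abundance(observed, p=C13_NATURAL_ABUNDANCE):
    """Return corrected isotopologue mole fractions (M+0 ... M+n).

    observed holds intensities for M+0 ... M+n. The matrix is lower
    triangular, so the system is solved by forward substitution.
    """
    n = len(observed) - 1
    m = correction_matrix(n, p)
    corrected = []
    for k in range(n + 1):
        carried = sum(m[k][i] * corrected[i] for i in range(k))
        corrected.append((observed[k] - carried) / m[k][k])
    corrected = [max(v, 0.0) for v in corrected]  # clip noise-driven negatives
    total = sum(corrected)
    return [v / total for v in corrected]


def fractional_enrichment(mole_fractions):
    """Mean fraction of carbon positions that carry tracer-derived 13C."""
    n = len(mole_fractions) - 1
    return sum(i * f for i, f in enumerate(mole_fractions)) / n


if __name__ == "__main__":
    # Hypothetical lactate (3 carbons) intensities for M+0 ... M+3.
    observed = [4.9e6, 1.6e5, 1.0e6, 3.9e6]
    fractions = correct_natural_abundance(observed)
    print([round(f, 4) for f in fractions])
    print(round(fractional_enrichment(fractions), 4))
```

Because the outputs are mole fractions, they need no normalization to cell amount, consistent with item 5.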

Additional Analyses

Quantitative analyses may also be carried out (e.g. what is the rate of nutrient utilization and waste product excretion?). Kinetic models (flux analysis) based on enzymology can also be applied where needed. These studies need careful consideration of the time dependence of the biomass for accurate normalization of rates. Flux: the number of moles of nutrients (e.g. glucose, glutamine) consumed and the number of moles of products excreted (e.g. lactate, alanine, glutamate) are measured as a function of time, producing a time course of consumption and excretion. To determine rates, it is essential to normalize to the functional unit of metabolism, which is the amount of enzymes present in the system. This is proportional to the concentration of the enzymes times the relevant intracellular volume (unknown).
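One simple way to account for a changing biomass is to divide the net change in the medium by the time-integrated biomass. The sketch below does this with trapezoidal integration. It assumes that a measurable surrogate such as total protein (mg) stands in for the amount of enzyme, which is an approximation and not the facility's prescribed method. Function names and units are illustrative.

```python
def trapezoid_integral(times, values):
    """Integrate values over times with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        total += (times[i] - times[i - 1]) * (values[i] + values[i - 1]) / 2
    return total


def specific_rate(times_h, amounts_umol, biomass_mg):
    """Net change in medium amount per unit of biomass-time (umol/mg/h).

    Negative values indicate consumption, positive values excretion.
    biomass_mg is the biomass surrogate (assumed here: total protein)
    at each sampling time.
    """
    if not (len(times_h) == len(amounts_umol) == len(biomass_mg)):
        raise ValueError("all inputs must have the same length")
    net_change = amounts_umol[-1] - amounts_umol[0]
    biomass_time = trapezoid_integral(times_h, biomass_mg)
    return net_change / biomass_time


if __name__ == "__main__":
    # Hypothetical glucose time course at the example sampling times.
    times = [0, 3, 6, 9, 24]
    glucose = [100.0, 97.5, 94.6, 91.2, 68.0]
    protein = [1.0, 1.1, 1.2, 1.3, 2.0]   # growing culture
    print(specific_rate(times, glucose, protein))
```

Using only the initial biomass for a growing culture would overestimate the specific rate, which is why the time course of the biomass matters.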

  1. With tracers, the time course of the isotopomer distributions can be determined, as can the fraction of glucose (glutamine) consumed that is converted to excreted product (e.g. lactate, alanine, glutamate).

  2. Further statistical analyses as needed should be carried out by statisticians versed in multivariate analyses.
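The conversion fraction described in item 1 can be computed on a carbon basis. The sketch below assumes [U-13C]-glucose as the tracer and counts only the fully labeled (M+3) three-carbon product, such as lactate, so each mole of glucose can yield at most two moles of product. The input values are hypothetical and the function name is illustrative.

```python
def fraction_converted(substrate_consumed_umol, product_excreted_umol,
                       labeled_fraction, substrate_carbons=6,
                       product_labeled_carbons=3):
    """Fraction of consumed substrate carbon recovered in labeled product.

    labeled_fraction is the mole fraction of the excreted product that
    carries the tracer label (e.g. M+3 lactate from [U-13C]-glucose).
    """
    if substrate_consumed_umol <= 0:
        raise ValueError("substrate consumed must be positive")
    labeled_product = product_excreted_umol * labeled_fraction
    carbon_out = labeled_product * product_labeled_carbons
    carbon_in = substrate_consumed_umol * substrate_carbons
    return carbon_out / carbon_in


if __name__ == "__main__":
    # Hypothetical 24 h values: 24 umol glucose consumed,
    # 40 umol lactate excreted, 90% of it M+3.
    print(fraction_converted(24.0, 40.0, 0.90))
```

Applying the same calculation at each media time point gives the time course of the conversion fraction.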