Provides functionality for untargeted LC-MS metabolomics research as specified in the associated publication in the 'Metabolomics Data Processing and Data Analysis—Current Best Practices' special issue of the Metabolites journal (2020). This includes tabular data preprocessing, feature selection and supporting visualizations. Raw data preprocessing and functionality related to biological context, such as pathway analysis, are not included.
Details
In roughly chronological order, the functionality of notame is as follows. Please see the vignettes and paper for more information.
Tabular data preprocessing (reducing unwanted variation and completing the dataset, returning modified objects):
mark_nas: mark specified values as missing
flag_detection: flag features with low detection rate
flag_contaminants: flag contaminants based on blanks
correct_drift: correct drift using cubic spline
ruvs_qc: Remove Unwanted Variation (RUV) between batches
pca_bhattacharyya_dist: Bhattacharyya distance between batches in PCA space
perform_repeatability: compute repeatability measures
assess_quality: assess quality information of features
quality: extract quality information of features
flag_quality: flag low-quality features
flag_report: report of flagged features
impute_rf: impute missing values using random forest
impute_simple: impute missing values using simple imputation
cluster_features: cluster correlated features originating from the same metabolite
compress_clusters: compress clusters of features to a single feature
log: logarithms
exponential: exponential
pqn_normalization: probabilistic quotient normalization
inverse_normalize: inverse-rank normalization
scale: scale
Tabular data preprocessing visualizations (saved to file by default):
visualizations: write all relevant data preprocessing visualizations to PDF
plot_injection_lm: estimate the magnitude of drift
plot_sample_boxplots: plot a boxplot for each sample
plot_dist_density: plot distance density
plot_tsne_arrows, plot_pca_arrows: t-SNE and PCA plots with arrows
plot_tsne_hexbin, plot_pca_hexbin: t-SNE and PCA hexbin plots
plot_dendrogram: sample dendrogram
plot_sample_heatmap: sample heatmap
plot_pca_loadings: PCA loadings plot
plot_quality: plot quality metrics
Feature selection – Univariate analysis (return data.frames):
perform_lm: linear models
perform_logistic: logistic regression
perform_lmer: linear mixed models
perform_oneway_anova: Welch’s ANOVA and classic ANOVA
perform_lm_anova: linear models ANOVA table
perform_t_test: pairwise and paired t-tests
perform_kruskal_wallis: Kruskal-Wallis rank-sum test
perform_non_parametric: pairwise and paired non-parametric tests
perform_correlation_tests: correlation test
perform_auc: area under curve
perform_homoscedasticity_tests: test homoscedasticity
cohens_d: Cohen's D
fold_change: fold change
summary_statistics: summary statistics
summarize_results: statistics cleaning
Feature selection – Supervised learning (return various objects):
muvr_analysis: multivariate modelling with minimally biased variable selection (MUVR2)
mixomics_pls, mixomics_plsda: a simple PLS(-DA) model with set number of components and all features
mixomics_pls_optimize, mixomics_plsda_optimize: test different numbers of components for PLS(-DA)
mixomics_spls_optimize, mixomics_splsda_optimize: test different numbers of components and features for PLS(-DA)
fit_rf, importance_rf: fit random forest and feature importance
perform_permanova: PERMANOVA
Feature-wise visualizations (these are often drawn for a subset of interesting features after analysis, saved by default):
save_beeswarm_plots: save beeswarm plots of each feature by group
save_group_boxplots: save box plots of each feature by group
save_scatter_plots: save scatter plots of each feature against a set variable
save_subject_line_plots: save line plots with mean
save_group_lineplots: save line plots with error bars by group
save_batch_plots: save batch correction plots
Results visualizations (returned by default; save them using save_plot):
plot_p_histogram: histogram of p-values
volcano_plot: volcano plot
manhattan_plot: Manhattan plot
mz_rt_plot: m/z vs retention time plot (cloud plot)
plot_effect_heatmap: heatmap of effects between variables, such as correlations
Object utilities:
read_from_excel: read formatted Excel files
construct_metabosets: construct MetaboSet objects
write_to_excel: write results to Excel file
group_col: get and set the name of the special column for group labels
time_col: get and set the name of the special column for time points
subject_col: get and set the name of the special column for subject identifiers
flag: get and set values in the flag column
drop_flagged: drop flagged features
drop_qcs: drop QC samples
join_fData: join new columns to feature data
join_pData: join new columns to pheno data
combined_data: retrieve both sample information and features
merge_metabosets: merge MetaboSet objects together
merge_objects: merge SummarizedExperiment objects together
fix_object: fix object for functioning of notame
MetaboSet-class: MetaboSet
Other utilities:
References
Klåvus et al. (2020). "notame": Workflow for Non-Targeted LC-MS Metabolic Profiling. Metabolites, 10: 135.