Fix object for functioning of notame — fix_object

Attempts to create missing columns in pheno and feature data. Optionally cleans the object and splits the object by mode. Modifies supplied "Sample_ID" column if needed. Aims to make the object compatible with all of notame.

Usage

fix_object(
  object,
  id_prefix = "ID_",
  id_column = NULL,
  split_by = NULL,
  name = NULL,
  clean = TRUE,
  split_data = FALSE,
  assay.type = NULL
)

Arguments

object: a SummarizedExperiment or MetaboSet object
id_prefix: character, prefix for autogenerated sample IDs, see Details
id_column: character, column name for unique identification of samples
split_by: character vector, in the case where all the modes are in the same object, the column names of feature data used to separate the modes (usually Mode and Column)
name: in the case where the object only contains one mode, the name of the mode, such as "Hilic_neg"
clean: logical, whether to select best classes, reorder columns and consistently rename columns in pheno and feature data
split_data: logical, whether to split data by analytical mode recorded in the "Split" column of feature data. If TRUE, will return a list of objects, one per analytical mode. If FALSE (the default), will return a single object.
assay.type: character, assay to be used in case of multiple assays

Value

A new SummarizedExperiment object or MetaboSet object with a single peak table. If split_data = TRUE, a list containing separate objects for analytical modes.
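
The idea behind split_by and split_data can be sketched in base R on a plain data frame. This is only an illustration under stated assumptions, not the code notame runs: the feature table, its "Column" and "Mode" values and the underscore separator are all made up for the example.

```r
# Hypothetical feature data with all analytical modes in one table
feature_data <- data.frame(
  Feature_ID = c("f1", "f2", "f3", "f4"),
  Column = c("HILIC", "HILIC", "RP", "RP"),
  Mode = c("neg", "pos", "neg", "pos"),
  stringsAsFactors = FALSE
)
split_by <- c("Column", "Mode")

# Combine the split_by columns into one "Split" label per feature
# (the "_" separator is an assumption made for this sketch)
feature_data$Split <- do.call(paste, c(feature_data[split_by], sep = "_"))

# One data frame per analytical mode, analogous to split_data = TRUE
by_mode <- split(feature_data, feature_data$Split)
names(by_mode)
```

Each element of by_mode holds the features of one mode, which mirrors the list of per-mode objects described above.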

Details

Only specify one of split_by and name. The feature data will contain a column named "Split", used to separate features from different modes, and a column named "Flag", used to record flagged features. Unless a column named "Feature_ID" is found in feature data, a feature ID will be generated from the value of "Split", mass and retention time. The function tries to find the mass and retention time columns by checking a few common alternative names, and throws an error if no matching column is found.

Sample information needs to contain a column called "Injection_order", and its values need to be unique. In addition, a sample identifier column, if present, needs to be named "Sample_ID" or be specified in id_column, and its values need to be unique, with the exception of QC samples: any "QC" identifiers will be replaced with "QC_1", "QC_2" and so on. If a "Sample_ID" column is not found, it will be created either from id_prefix and the injection order or by renaming id_column.
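
The sample ID rules can be illustrated with a few lines of base R. This is a minimal sketch of the described behaviour on a made-up data frame, not notame's actual implementation; the pheno table and the variable names is_qc and generated are hypothetical.

```r
# Hypothetical sample information with repeated "QC" identifiers
pheno <- data.frame(
  Injection_order = 1:5,
  Sample_ID = c("QC", "A", "QC", "B", "QC"),
  stringsAsFactors = FALSE
)

# Number the QC samples in order so that every identifier is unique
is_qc <- pheno$Sample_ID == "QC"
pheno$Sample_ID[is_qc] <- paste0("QC_", seq_len(sum(is_qc)))

# With no identifier column at all, IDs could instead be built
# from a prefix and the injection order
id_prefix <- "ID_"
generated <- paste0(id_prefix, pheno$Injection_order)

# Both the identifiers and the injection order must end up unique
stopifnot(!anyDuplicated(pheno$Sample_ID), !anyDuplicated(pheno$Injection_order))
```

After the renaming step the identifiers read "QC_1", "A", "QC_2", "B", "QC_3", so the uniqueness requirement holds even though the input repeated "QC".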

Examples

data(example_set)
ex_set <- example_set
rowData(ex_set)$Flag <- NULL
fixed <- fix_object(ex_set)
#> INFO [2025-06-23 22:36:33] Pheno data was cleaned
#> INFO [2025-06-23 22:36:33] Initializing 'Flag' column with unflagged features
#> INFO [2025-06-23 22:36:33] Feature data was cleaned