For each ID in a data frame, collapses all rows belonging to that ID into a single row. Variables whose names match a specified pattern are summarized according to a chosen method. The output data frame contains one row per unique ID and one column per matched variable. Non-matching variables are ignored.

Usage

OR.collapse(data, ID_varname, pattern, method = "single")

Arguments

data

A data frame containing repeated measurements.

ID_varname

Character string giving the name of the ID variable. Rows with the same ID are treated as belonging to the same individual.

pattern

String containing a regular expression used to select variables. All variable names matching this pattern (via grep()) are collapsed.

method

Character string specifying how multiple values within each ID should be collapsed. Options include:

  • "left": Returns the first non-missing value in order of appearance.

  • "mean": Returns the mean of non-missing values.

  • "max": Returns the maximum of non-missing values.

  • "min": Returns the minimum of non-missing values.

  • "single" (default): Returns the mean of non-missing values and prints a message if more than one unique non-missing value is found.

  • "sum": Returns the sum of non-missing values.
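The method options above can be illustrated on a single vector of repeated values. This is a minimal base R sketch, not the package's implementation: the helper name collapse_values, the message wording, and the choice to return NA when every value is missing are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): collapse the repeated
# values recorded for one ID into a single value according to `method`.
collapse_values <- function(x, method = "single") {
  x <- x[!is.na(x)]                       # every method uses non-missing values only
  if (length(x) == 0) return(NA_real_)    # assumption: all-missing input gives NA
  switch(method,
    left   = x[1],                        # first non-missing value in order of appearance
    mean   = mean(x),
    max    = max(x),
    min    = min(x),
    sum    = sum(x),
    single = {
      # assumption: exact message text is illustrative only
      if (length(unique(x)) > 1) {
        message("More than one unique non-missing value found")
      }
      mean(x)
    },
    stop("Unknown method: ", method)      # unnamed last argument is the default branch
  )
}

x <- c(NA, 4, 4, 10)
collapse_values(x, "left")   # 4
collapse_values(x, "mean")   # 6
collapse_values(x, "sum")    # 18
```

With this input, "single" would also return 6 but would print the message first, because 4 and 10 are two distinct non-missing values.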

Value

A data frame with one row per unique ID and one column for each variable matching pattern, summarized according to method.

Details

A typical use case is a REDCap export, in which the measurements for one patient are spread across multiple event rows.
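The following base R sketch shows the kind of input and result involved. The data values and column names (record_id, event, lab_hb, lab_crp) are made up, and the loop is an illustrative stand-in for the collapsing step rather than the function's actual code. Keeping the ID column in the result is also an assumption.

```r
# Toy data in a REDCap-like layout (made-up values): each patient has one
# row per event, and each measurement is filled in on only one of them.
redcap <- data.frame(
  record_id = c(1, 1, 2, 2),
  event     = c("baseline", "followup", "baseline", "followup"),
  lab_hb    = c(13.1, NA, NA, 11.8),
  lab_crp   = c(NA, 5, 2, NA),
  stringsAsFactors = FALSE
)

# Select variables by regular expression, as described for `pattern`
vars <- grep("^lab_", names(redcap), value = TRUE)

# First non-missing value, the behaviour described for method = "left"
first_non_missing <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[1]
}

# Build one row per unique ID and one column per matched variable
ids <- unique(redcap$record_id)
collapsed <- data.frame(record_id = ids)
for (v in vars) {
  collapsed[[v]] <- vapply(
    ids,
    function(id) first_non_missing(redcap[[v]][redcap$record_id == id]),
    numeric(1)
  )
}
collapsed
```

Here each patient ends up on a single row with both lab values filled in, and the non-matching event column is not carried over. Going by the Usage signature, the corresponding call would presumably be OR.collapse(redcap, "record_id", "^lab_", method = "left").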

See also