
Fits a univariable polynomial regression using Huber M-estimation with degree selected by robust AIC (AICR), then flags observations whose residuals exceed a small-sample–adjusted multiple (k) of the median absolute deviation (MAD).

Usage

OR.outliers.rlm.ggplot(
  x,
  y,
  max.degree = 3,
  p = 0.05,
  tol.min = 1e-04,
  tol.target = 1e-04,
  col.in = "#0072B5FF",
  col.out = "#BC3C29FF",
  echo = FALSE,
  x.breaks = NA,
  x.labels = NA,
  x.title = "",
  y.breaks = NA,
  y.title = ""
)

Arguments

x

Numeric predictor vector.

y

Numeric response vector.

max.degree

Maximum polynomial degree considered. Default = 3. Internally capped at n - 2, where n is the number of non-missing observations.

p

Target two-sided exclusion proportion under normality for the residual-based modified z-score rule. Default = 0.05.

tol.min

M-estimation minimum convergence tolerance. Default = 0.0001.

tol.target

M-estimation target convergence tolerance. Default = 0.0001.

col.in

Color used for the fitted curve, the ribbon around it, and observations not flagged as outliers. Default = "#0072B5FF".

col.out

Color used for observations flagged as outliers. Default = "#BC3C29FF".

echo

Logical. If TRUE, prints the internal data frame used for the plot together with the computed k and MAD values.

x.breaks

Numeric vector specifying x-axis tick locations. If NA, the values of x are used.

x.labels

Labels for the x-axis ticks. If NA, the values of x are used.

x.title

Title for the x-axis.

y.breaks

Numeric vector specifying y-axis tick locations. Horizontal gridlines are drawn at these values.

y.title

Title for the y-axis.
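
To illustrate how an exclusion proportion such as p can translate into a multiple of the MAD, here is a minimal sketch of the large-sample relationship under normality. The helper k_from_p() is hypothetical and not part of this package, and it omits the small-sample adjustment that the function applies, so the value reported by echo = TRUE may differ. It assumes k multiplies the raw (unscaled) MAD; if the MAD is already scaled by 1.4826, as stats::mad() does by default, the multiplier reduces to qnorm(1 - p / 2).

```r
# Hypothetical helper (not a package function): asymptotic multiple of the
# raw MAD that excludes a two-sided proportion p of normally distributed
# residuals. No small-sample adjustment is applied here.
k_from_p <- function(p) {
  stopifnot(is.numeric(p), length(p) == 1, p > 0, p < 1)
  # qnorm(1 - p / 2) is the two-sided normal cutoff in standard-deviation units.
  # Under normality, raw MAD = qnorm(0.75) * sd, so dividing by qnorm(0.75)
  # (about 0.6745) re-expresses the cutoff in raw MAD units.
  qnorm(1 - p / 2) / qnorm(0.75)
}

k_from_p(0.05)  # about 2.91
k_from_p(0.01)  # about 3.82
```

A smaller p gives a larger k, so fewer observations are flagged.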

Details

For numerical stability, x and y are standardized before fitting and back-transformed for plotting.
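
The standardize, fit, back-transform, and flag sequence can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: it uses MASS::rlm() with the Huber psi function, fixes the degree at 2 instead of selecting it by AICR, standardizes with the mean and standard deviation (the package may center and scale differently), and uses the asymptotic normal cutoff with the scaled MAD from stats::mad() rather than the small-sample-adjusted k.

```r
library(MASS)  # provides rlm() and psi.huber

# Data from the Examples section
y <- c(36.3, 47.9, 47.2, 43.9, 47.6, 49.6, 53.2, 59.3, 63.2, 70.8, 75.9, 88.5,
       97.3, 103.6, 6.1, 120.2, 135.8, 139.4)
x <- seq_along(y) - 1

# Standardize both variables (assumed scheme: mean and standard deviation)
xs <- (x - mean(x)) / sd(x)
ys <- (y - mean(y)) / sd(y)

# Huber M-estimation for one fixed polynomial degree (degree selection omitted)
fit <- rlm(ys ~ poly(xs, 2), psi = psi.huber, maxit = 100)

# Back-transform the fitted values to the original y scale
fitted_y <- fitted(fit) * sd(y) + mean(y)
res <- y - fitted_y

# Flag residuals farther than k scaled MADs from the median residual
p <- 0.01
k <- qnorm(1 - p / 2)  # asymptotic cutoff, no small-sample adjustment
flagged <- abs(res - median(res)) > k * mad(res)

which(flagged)  # indices of flagged observations
```

Because the fit is computed on standardized values, the coefficients are not on the original scale; only the fitted values and residuals are back-transformed here.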

See also

Other outliers: OR.kMAD(), OR.outliers(), OR.outliers.rlm()

Examples

y <- c(36.3, 47.9, 47.2, 43.9, 47.6, 49.6, 53.2, 59.3, 63.2, 70.8, 75.9, 88.5,
       97.3, 103.6, 6.1, 120.2, 135.8, 139.4)
x <- seq_along(y) - 1  # 0, 1, ..., length(y) - 1
OR.outliers.rlm.ggplot(x, y, max.degree = 4, p = 0.01, x.title = "X",
                       y.breaks = seq(0, 150, 50), y.title = "Y")