degradr

R
Reliability
Statistics
Provides tools to estimate the Remaining Useful Life (RUL) of degrading systems using linear mixed effects models. Supports both univariate and multivariate degradation signals. For multivariate inputs, signals are fused into a univariate health index prior to modeling. Linear and exponential degradation paths are supported (the latter via a logarithmic transformation). Remaining Life Distributions (RLDs) are computed via Bayesian updating for new units, enabling in-situ predictive maintenance.
Author

Pedro Abraham Montoya Calzada

Published

July 25, 2025

Introduction

Accurately estimating the Remaining Useful Life (RUL) of degrading systems is essential for predictive maintenance, safety assurance, and optimal resource allocation across various engineering domains.

linear mixed effects models offer a principled way to account for both population-level degradation trends and unit-specific deviations. When combined with Bayesian updating, they enable real-time prediction of RUL as new sensor readings are collected. Furthermore, in multivariate settings where multiple sensors monitor the same component, health index construction via data fusion provides a powerful reduction strategy that preserves prognostic value while simplifying the modeling process.

This post introduces the R package degradr, a toolbox designed to estimate RUL from univariate or multivariate degradation signals using linear mixed effects models. The package supports signal fusion, Bayesian posterior computation, and visualization, making it suitable for both exploratory and operational use.

Motivation and Scope

The statistical methodology implemented in degradr is not new: it is based on the degradation modeling framework of Lu and Meeker (Lu and Meeker 1993), extended with multivariate signal fusion and Bayesian prediction techniques as proposed by Liu and Huang (Liu and Huang 2016). Despite its theoretical maturity and demonstrated performance in prognostic tasks, this methodology had not yet been made available in a reusable R library.

The main motivation behind degradr is thus reproducibility and accessibility. By encapsulating the full workflow—from signal fusion to posterior predictive distributions—in a single R package, practitioners and researchers can now apply these techniques without having to reimplement complex routines from scratch. Additionally, the package serves as a didactic tool for understanding degradation modeling via mixed effects models and Bayesian updating.

This implementation aims to bridge the gap between advanced degradation modeling techniques and their practical application, particularly in industrial and reliability engineering settings.

Methodology

We follow the mixed effects degradation modeling framework originally proposed by Lu and Meeker (1993), extended by Liu and Huang (2016) to include multivariate sensor fusion and Bayesian updating.

Lu, C. Joseph, and William O. Meeker. 1993. “Using Degradation Measures to Estimate a Time-to-Failure Distribution.” Technometrics 35 (2): 161–74. https://doi.org/10.1080/00401706.1993.10485038.

A general degradation model with random effects can be written as:

\[ L_t = \eta(\phi, \theta, t) + \varepsilon_t \]

where:

  • \(\eta(\cdot)\) is a parametric degradation function,

  • \(\phi\) is a vector of fixed effects (common to all units),

  • \(\theta\) is a vector of random effects (unit-specific deviations),

  • and \(\varepsilon\_t\) is an i.i.d. error term representing measurement noise.

To operationalize this, we consider a polynomial model of degree ( p ):

\[ L_{i,j,t} = \sum_{k = 0}^{p} \theta_{i,j}^{(k)} t^k + \varepsilon_{i,j,t} \]

where:

  • \(L\_{i,j,t}\) is the degradation signal of unit \(i\), \(j\) at time \(t\),

  • \(p\) is the order of the polynomial model,

  • \(\Gamma_{i,j} = (\theta_{i,j}^{(0)},\dots,\theta_{i,j}^{(p)})\) is the vector of random effects parameters that is often assumed to follow a multivariate normal distribution: \(\Gamma_{i,j} \sim \mathcal{N}_{p+1}(u_j^0,\Sigma_j^0)\)

For exponentially degrading systems and \(p = 2\), the model is expressed as:

\[ x_{i,j,t} = \phi_j + \alpha_{i,j} \exp\left(\theta^{(1)}_{i,j} t + \theta^{(2)}_{i,j} t^2 + \varepsilon_{i,j,t} - \frac{\sigma_j^2}{2}\right) \]

Taking logarithms and subtracting the baseline \(\phi_j\), we obtain a linearized version:

\[ L_{i,j,t} = \log(x_{i,j,t} - \phi_j) = \theta^{(0)}_{i,j} + \theta^{(1)}_{i,j} t + \theta^{(2)}_{i,j} t^2 + \varepsilon_{i,j,t} \]

with \(\theta^{(0)}_{i,j}=ln\alpha_{i,j}-\sigma_j^2/2\).

Posterior Distribution of Random Effects

Let \(L_{i,j} = [L_{i,j,1},\dots,L_{i,j,n_i}]^\top\) be the observed degradation signal for unit \(i\), sensor \(j\), up to time \(n_i\). The posterior distribution of \(\Gamma\_{i,j}\) is:

\[ \Gamma_{i,j} \mid L_{i,j} \sim \mathcal{N}_{p+1}(u^1_j, \Sigma^1_j) \]

where:

\[ u^1_j = \left( \frac{\Psi_i^\top \Psi_i}{\sigma_j^2} + (\Sigma^0_j)^{-1} \right)^{-1} \left( \frac{\Psi_i^\top L_{i,j}}{\sigma_j^2} + (\Sigma^0_j)^{-1} u^0_j \right) \]

\[ \Sigma^1_j = \left( \frac{\Psi_i^\top \Psi_i}{\sigma_j^2} + (\Sigma^0_j)^{-1} \right)^{-1} \]

with:

\[ \Psi_i = \begin{bmatrix} 1 & t_1 & \cdots & t_1^p \\ 1 & t_2 & \cdots & t_2^p \\ \vdots & \vdots & \ddots & \vdots \\ 1 & t_{n_i} & \cdots & t_{n_i}^p \end{bmatrix} \in \mathbb{R}^{n_i \times (p+1)} \]

Remaining Life Distribution

Let \(\tilde{T}\_{i,j}\) be the estimated Remaining Useful Life (RUL) for unit \(i\), sensor \(j\), based on the observed degradation signal.

Assume the failure threshold \(D_j\sim\mathcal{N}(u_j^d,v_j^d)\) is a random variable estimated from historical failure times. The degradation trajectory projected forward is:

\[ L_{i,j,n_i + t} = \theta^{(0)}_{i,j} + \cdots + \theta^{(p)}_{i,j} (n_i + t)^p + \varepsilon_{i,j,n_i+t} \]

which is normally distributed with:

  • Mean:

\[ \tilde{u}_{i,j,n_i+t} = [1, (n_i + t), \dots, (n_i + t)^p] u^1_j \]

  • Variance:

\[ \tilde{\sigma}^2_{i,j,n_i+t} = [1, (n_i + t), \dots, (n_i + t)^p] \, \Sigma^1_j \, [1, (n_i + t), \dots, (n_i + t)^p]^\top \]

Then the conditional cumulative distribution function (cdf) of \(\tilde{T}_{i,j}\), given \(L_{i,j}\), is:

\[ F_{\tilde{T}_{i,j} \mid L_{i,j}}(t) = P(\tilde{T}_{i,j} \le t \mid L_{i,j}) = \Phi\left( \frac{ \tilde{u}_{i,j,n_i+t} - u^d_j }{ \sqrt{ \tilde{\sigma}^2_{i,j,n_i+t} + v^d_j } } \right) = \Phi(g(t)) \]

Truncating to non-negative RUL:

\[ P(\tilde{T}_{i,j} \le t \mid L_{i,j}, \tilde{T}_{i,j} \ge 0) = \frac{ \Phi(g(t)) - \Phi(g(0)) }{ 1 - \Phi(g(0)) } \]

If the failure threshold \(D_j\) is fixed:

\[ F_{\tilde{T}_{i,j} \mid L_{i,j}}(t) = 1 - \Phi\left( \frac{ D_j - \tilde{u}_{i,j,n_i+t} }{ \sqrt{ \tilde{\sigma}^2_{i,j,n_i+t} } } \right) =1 - \Phi(g_2(t)) \]

and the conditional distribution becomes:

\[ P(\tilde{T}_{i,j} \le t \mid L_{i,j}, \tilde{T}_{i,j} \ge 0) = \frac{ \Phi(g_2(0)) - \Phi(g_2(t)) }{ \Phi(g_2(0)) } \]

Data-Level Fusion Based on Selected Degradation Models

When multiple sensor signals are available and correlated with degradation, we adopt a data-level fusion approach to construct a univariate health index. This health index serves as a surrogate for the multivariate degradation process and simplifies the downstream modeling task. This index seeks to jointly minimize the error of the model fit and the variance of the failure threshold \(D_j\)

The health index is obtained by solving the following optimization problem:

\[ \text{obj} = \min_{\boldsymbol{w}} \ \boldsymbol{w}^\top \mathbf{A} \boldsymbol{w} \]

subject to:

\[ \boldsymbol{w}^\top \mathbf{M}^\top \mathbf{1} = 1, \quad \mathbf{M} \boldsymbol{w} \ge 0 \]

For more details see (Liu and Huang 2016).

Features of degradr

The degradr package provides a compact and flexible toolbox to model degradation processes and predict Remaining Useful Life (RUL). It is designed to accommodate both univariate and multivariate degradation signals, and supports linear and exponential trajectories. Its main features include:

1. Support for Univariate and Multivariate Signals

  • Univariate modeling is performed directly using raw degradation measurements from a single sensor.
  • Multivariate modeling leverages a data fusion procedure to construct a health index from multiple correlated sensor readings. This fusion balances model fit and failure threshold variance, as proposed in (Liu and Huang 2016).
Liu, Kaibo, and Shuai Huang. 2016. “Integration of Data Fusion Methodology and Degradation Modeling Process to Improve Prognostics.” IEEE Transactions on Automation Science and Engineering 13 (1): 344–54. https://doi.org/10.1109/TASE.2014.2349733.

2. Linear and Exponential Mixed Effects Models

  • Both linear and exponential degradation paths are supported.
  • Models are estimated via:
    • nlme::lme() for hierarchical mixed-effects modeling, or
    • an alternative method based on per-unit fits (when method = "lm").

3. Bayesian Updating and Posterior RUL Estimation

  • Once a model is trained on historical units, degradr uses Bayesian updating to compute the posterior distribution of model parameters for new units.
  • Remaining Life Distributions (RLDs) are then derived via forward simulation using these posteriors.

4. Core Functions

Function Description
fit_model() Fits a linear or exponential mixed-effects model to univariate signals.
fit_healthindex() Constructs and models a health index from multivariate signals.
predict_rul() Estimates the Remaining Useful Life (RUL) for one or more partially observed degradation signals based on a previously fitted linear or exponential mixed-effects model.
plot_degradr() Visualizes degradation trajectories and thresholds across units.

5. Built-in Example Datasets

The package includes several datasets to facilitate testing and demonstration:

  • filter_train and filter_test: Gas filter degradation data with censored RULs.
  • train_FD001 and test_FD001: Subsets of the NASA C-MAPSS turbofan engine dataset.
  • RUL_FD001: Ground-truth RULs for evaluation.

These datasets allow users to experiment with real-world degradation scenarios and validate model performance.


Together, these features make degradr a practical and educational tool for studying degradation phenomena and predictive maintenance workflows.

Examples

Install the package from GitHub

devtools::install_github("Abraham1011/degradr")

Example for a degradation signal

We begin with a simple example using a univariate degradation signal. The dataset filter_train contains sensor readings from 49 gas filters. Failure is defined when the differential pressure exceeds 600 Pa. Our goal is to estimate the Remaining Useful Life (RUL) for new units using a parametric mixed-effects model.

1. Load and visualize the training data

library(degradr)
library(ggplot2)

data(filter_train)
data(filter_test)

# Rename columns as expected
colnames(filter_train) <- c("t", "x", "unit")
colnames(filter_test) <- c("t", "x", "unit", "RUL")

# Visualize degradation trajectories
g <- ggplot(filter_train, aes(x = t, y = x, group = factor(unit), color = factor(unit))) +
  geom_line(size = 0.7, alpha = 0.8) +
  geom_hline(yintercept = 600, linetype = "dashed", color = "red", linewidth = 1) +
  labs(
    title = "Degradation Trajectories of Filter Units",
    subtitle = "Dashed red line indicates failure threshold (600 Pa)",
    x = "Time",
    y = "Differential Pressure (Pa)",
    color = "Unit ID"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    legend.position = "none",
    plot.title = element_text(face = "bold", size = 16),
    plot.subtitle = element_text(margin = margin(b = 10)),
    axis.title = element_text(face = "bold"),
    panel.grid.minor = element_blank()
  ) +
  scale_color_viridis_d(option = "D")
print(g)
Line plot showing degradation trajectories
Figure 1: Degradation trajectories for gas filters

2. Fit a mixed-effects degradation model

model <- fit_model(
  data = filter_train,
  type = "exponential",
  degree = 1
)

3. Predict Remaining Useful Life (RUL)

rul_estimates <- predict_rul(
  data = filter_test,
  model = model,
  D = 600  # Failure threshold
)

head(rul_estimates)
  unit       RUL
1    1 39.884159
2    2 57.679633
3    3  7.585585
4    4  2.264131
5    5 16.828236
6    6 81.782255

Example with multiple signals

In this example, we use data from the C-MAPSS dataset (train_FD001 and test_FD001) which contains multiple sensor signals from simulated aircraft engine degradation. We aim to construct a health index from these signals and estimate RUL accordingly.

1. Load and preprocess data

library(dplyr)
data(train_FD001)
data(test_FD001)

# Select relevant sensor variables and adjust sign (all signals should trend upward)
train <- train_FD001 %>%
  select(unit, t, T24, T50, P30, Nf, Ps30, phi, NRf, BPR, htBleed, W31, W32) %>%
  mutate(across(c(P30, phi, W31, W32), ~ . * -1))

test <- test_FD001 %>%
  select(unit, t, T24, T50, P30, Nf, Ps30, phi, NRf, BPR, htBleed, W31, W32) %>%
  mutate(across(c(P30, phi, W31, W32), ~ . * -1))

2. Fit a health index model

hi_model <- fit_healthindex(
  data = train,
  type = "exponential",
  degree = 2,
  r = 0.8  
)

3. Predict RUL using the health index model

rul_hi <- predict_rul(
  data = test,
  model = hi_model
)

head(rul_hi)
  unit       RUL
1    1 166.61559
2    2 124.49630
3    3  82.04446
4    4  80.91050
5    5  88.60125
6    6 119.89087