53  statusPartition()

Important: Disclaimer

These packages (Note 1) are a one-person project undergoing rapid evolution. Backward compatibility (per Hadley Wickham) is provided as a courtesy rather than a guarantee.

Until further notice, these packages should

  • not be used as a basis for research grant applications,
  • not be cited as an actively maintained tool in a peer-reviewed manuscript,
  • not be used to support or fulfill requirements for pursuing an academic degree.

In addition, work primarily based on these packages (Note 1) should not be presented at academic conferences or similar scholarly venues.

Furthermore, a person’s ability to use these packages (Note 1) does not necessarily imply an understanding of their underlying mechanisms. Accordingly, demonstration of their use alone should not be considered sufficient evidence of expertise, nor should it be credited as a basis for academic promotion or advancement.

These statements do not apply to the contributors (Tip 1) to these packages (Note 1) with respect to their specific contributions.

These statements do not apply when the maintainer of these packages (Note 1), Tingting Zhan, is credited as the first author, the lead author, and/or the corresponding author in a peer-reviewed manuscript, or as the Principal Investigator or Co-Principal Investigator in a research grant application and/or a final research progress report.

These statements are advisory in nature and do not modify or restrict the rights granted under the GNU General Public License (https://www.r-project.org/Licenses/).

Note

The examples in Chapter 53 require

library(maxEff)

The function maxEff::statusPartition() (v0.2.4, GPL-2)

  1. splits a right-censored Surv (Therneau 2026, v3.8.6, LGPL (>= 2)) object by its survival status, i.e., observed versus right-censored;
  2. partitions the observed and right-censored subjects, respectively, into test/training sets.

See Section 12.2 for the usage of the terms “split” versus “partition”.

Listing 53.1 creates a Surv object capacitor_failure from the example data frame capacitor in package survival (Therneau 2026, v3.8.6, LGPL (>= 2)).

Listing 53.1: Data: right-censored Surv object capacitor_failure
capacitor_failure = survival::capacitor |> 
  with(expr = survival::Surv(time, status))
capacitor_failure
#  [1]  439   904  1092  1105   572   690   904  1090   315   315   439   628   258   258   347   588   959  1065  1065  1087   216   315   455   473   241   315   332   380   241   241   435   455 
# [33] 1105+ 1105+ 1105+ 1105+ 1090+ 1090+ 1090+ 1090+  628+  628+  628+  628+  588+  588+  588+  588+ 1087+ 1087+ 1087+ 1087+  473+  473+  473+  473+  380+  380+  380+  380+  455+  455+  455+  455+
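
As a quick check of what “survival status” refers to here, the following minimal sketch tabulates the status column of the Surv object from Listing 53.1. It assumes only package survival; the name status_col is introduced for illustration and is not part of either package.

```r
library(survival)

# same construction as in Listing 53.1
capacitor_failure = survival::capacitor |>
  with(expr = survival::Surv(time, status))

# the second column of a (time, status) Surv object holds the status:
# 1 = failure observed, 0 = censored (printed with a trailing "+")
status_col = capacitor_failure[, 2L]
table(status_col)
```

Judging from the printout in Listing 53.1, one would expect the two status groups to be of equal size, which is what makes the balanced split in the next listing easy to verify by eye.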

The function statusPartition() (Listing 53.2) is intended to avoid the situation in which a Cox proportional hazards model, survival::coxph(), fitted to one or more of the partitioned data sets is degenerate because all subjects in that partition are censored.

Listing 53.2: Example: statusPartition(), balanced by survival status (Listing 53.1)
set.seed(12); id = capacitor_failure |>
  statusPartition(times = 1L, p = .5) |>
  unlist()
capacitor_failure[id, 2L] |> 
  table()
# 
#  0  1 
# 16 16
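
The idea of partitioning within each survival status can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation in package maxEff: the helper partition_by_status() is hypothetical, the toy status vector only mimics the layout of capacitor_failure, and rounding the group sizes with floor() is an assumption.

```r
# Hypothetical helper, for illustration only; not the maxEff implementation.
partition_by_status = function(status, p = .5) {
  # step 1: split the subject indices by survival status
  idx = split(seq_along(status), f = status)
  # step 2: within each status group, sample a proportion p of the indices
  # (floor() rounding is an assumption of this sketch)
  picked = lapply(idx, FUN = function(i) {
    i[sample.int(length(i), size = floor(length(i) * p))]
  })
  sort(unlist(picked, use.names = FALSE))
}

# toy status vector: 32 observed subjects followed by 32 censored subjects
status = rep(c(1L, 0L), each = 32L)

set.seed(12)
id_sketch = partition_by_status(status, p = .5)
table(status[id_sketch])
# 16 subjects with status 0 and 16 with status 1, for any seed
```

Because the sampling is done separately within each status group, neither the selected subjects nor the remaining ones can end up all censored as long as each group contributes at least one subject to both sides.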

The function statusPartition() extends the widely used function caret::createDataPartition() (Listing 53.3), which stratifies a Surv object by the quantiles of its survival time rather than by its survival status (as of package caret v7.0.1, GPL (>= 2)).

Listing 53.3: Review: caret::createDataPartition(), not balanced by survival status (Listing 53.1)
set.seed(12); id0 = capacitor_failure |>
  caret::createDataPartition(times = 1L, p = .5) |>
  unlist()
capacitor_failure[id0, 2L] |> 
  table()
# 
#  0  1 
# 19 14
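
The imbalance carries over to the subjects that were not selected. As a small arithmetic sketch, assuming the totals of 32 observed and 32 censored subjects read off the printout in Listing 53.1, the complement of the set selected in Listing 53.3 is:

```r
# totals read off the printout in Listing 53.1 (an assumption of this sketch)
total = c(censored = 32L, observed = 32L)
# status counts of the selected subjects, from the output of Listing 53.3
selected = c(censored = 19L, observed = 14L)

# status counts of the subjects left out of the selection
held_out = total - selected
held_out
# censored 13, observed 18
```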