FastSurvival is built for speed, but a fast estimator is only useful if it returns the same answer as the established implementation. This vignette checks the numerical agreement between each FastSurvival function and a reference from a well-known package: the survival package for the Kaplan-Meier estimate, the log-rank test, the Cox hazard ratio, and milestone survival; survRM2 for the restricted mean survival time; and survAH for the average hazard with survival weight. The weighted log-rank and max-combo tests are checked against the nph package, and the robust modestly-weighted test against the nphRCT package. The same comparisons form the basis of the automated test suite shipped with the package.
survfit_fast() evaluates the Kaplan-Meier estimate at a
single time point. We compare against summary(survfit(...))
at the same time on the ovarian dataset.
library(survival)
ord <- order(ovarian$futime)
t_s <- ovarian$futime[ord]
e_s <- ovarian$fustat[ord]
fast <- survfit_fast(t_s, e_s, t_eval = 500, conf.type = "log")
fit <- survfit(Surv(futime, fustat) ~ 1, data = ovarian)
ref <- summary(fit, times = 500)
data.frame(
quantity = c("survival", "std.err"),
fast = c(unclass(fast)["surv"], unclass(fast)["std.err"]),
survival = c(ref$surv, ref$std.err),
row.names = NULL
)
#> quantity fast survival
#> 1 survival 0.59607843 0.59607843
#> 2 std.err 0.09992615 0.09992615survdiff_fast() returns a two-sided chi-square statistic
when side = 2, matching the statistic from
survdiff().
survdiff_fast() also computes Fleming-Harrington G(rho,
gamma) weighted log-rank tests through the weight = "fh"
argument. The nph
package provides logrank.test() with the same family, so we
compare the chi-square statistic across several weight choices on the
ovarian data.
fh_grid <- data.frame(rho = c(0, 1, 0, 1), gamma = c(1, 0, 0, 1))
do.call(rbind, lapply(seq_len(nrow(fh_grid)), function(i) {
r <- fh_grid$rho[i]
g <- fh_grid$gamma[i]
fast <- as.numeric(survdiff_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, side = 2,
weight = "fh", rho = r, gamma = g))
nph_chisq <- nph::logrank.test(ovarian$futime, ovarian$fustat, ovarian$rx,
rho = r, gamma = g)$test$Chisq
data.frame(rho = r, gamma = g, fast = fast, nph = nph_chisq)
}))
#> rho gamma fast nph
#> 1 0 1 0.0001020735 0.0001020735
#> 2 1 0 1.6848546117 1.6848546117
#> 3 0 0 1.0627398613 1.0627398613
#> 4 1 1 0.0033228087 0.0033228087The two implementations agree to numerical precision across the
weight family, including the ordinary log-rank test recovered at
rho = 0, gamma = 0.
coxph_fast() returns the Pike-Halley Estimator, a
closed-form approximation to the Cox partial likelihood maximizer. On
data with a moderate hazard ratio the two agree closely. We report the
log hazard ratio from both.
rmst_fast() integrates the Kaplan-Meier survival curve
up to a horizon. We compare the two-group RMST difference against
survRM2::rmst2().
library(survRM2)
arm <- as.integer(ovarian$rx == 2)
tau <- 500
fast_rmst <- rmst_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, tau = tau)
ref_rmst <- rmst2(time = ovarian$futime, status = ovarian$fustat,
arm = arm, tau = tau)
c(fast = unclass(fast_rmst)["diff"],
survRM2 = ref_rmst$unadjusted.result[1, 1])
#> fast.diff survRM2
#> 98.88034 98.88034milestone_fast() compares Kaplan-Meier survival between
two groups at a milestone timepoint. The per-group survival
probabilities match those from survfit().
tstar <- 500
fast_ms <- milestone_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, tau = tstar, method = "wald")
fit_g <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)
ref_g <- summary(fit_g, times = tstar)
ms_surv <- fast_ms$surv
data.frame(
group = c("control", "treatment"),
fast = c(ms_surv["control"], ms_surv["treatment"]),
survival = ref_g$surv,
row.names = NULL
)
#> group fast survival
#> 1 control 0.5384615 0.5384615
#> 2 treatment 0.6581197 0.6581197ahsw_fast() computes the average hazard with survival
weight of Uno and Horiguchi. We compare the per-group average hazard
against survAH::ah2().
library(survAH)
arm <- as.integer(ovarian$rx == 2)
tau <- 500
fast_ah <- ahsw_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, tau = tau)
ref_ah <- ah2(time = ovarian$futime, status = ovarian$fustat,
arm = arm, tau = tau)
data.frame(
quantity = c("AH (control)", "AH (treatment)"),
fast = unclass(fast_ah)[c("ah.ctrl", "ah.trt")],
survAH = c(ref_ah$ah["AH (arm0)", "Est."],
ref_ah$ah["AH (arm1)", "Est."]),
row.names = NULL
)
#> quantity fast survAH
#> 1 AH (control) 0.001235076 0.001235076
#> 2 AH (treatment) 0.000723445 0.000723445The average hazard is the ratio of the cumulative event probability
to the restricted mean survival time, so for the ovarian
data, where follow-up is measured in days, the values are on the order
of 1e-03 per day in both groups.
ahr_fast() computes the Kalbfleisch-Prentice average
hazard ratio between two groups over a restricted interval. The
reference implementation is ahrKM() from the AHR package,
but that package has been archived on CRAN, so here we check the point
estimates against a direct survival-based computation: each group’s
Kaplan-Meier curve from survfit(), integrated to form the
group shares of the total hazard.
tau <- 500
fast_ahr <- ahr_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, tau = tau)
g <- ovarian$rx
ev <- c(ovarian$futime[g == 1 & ovarian$fustat == 1],
ovarian$futime[g == 2 & ovarian$fustat == 1])
grid <- sort(unique(c(0, ev[ev <= tau], tau)))
km_on_grid <- function(gi) {
sf <- survfit(Surv(futime, fustat) ~ 1, data = ovarian[g == gi, ])
approxfun(sf$time, sf$surv, method = "constant",
yleft = 1, rule = 2, f = 0)(grid)
}
S1 <- km_on_grid(1)
S2 <- km_on_grid(2)
m <- length(grid)
dS1 <- S1 - c(1, S1[-m])
GL <- S1[m] * S2[m]
ref_theta1 <- -sum(S2 * dS1) / (1 - GL)
data.frame(
quantity = c("theta (group 1)", "theta (group 2)", "AHR"),
fast = c(fast_ahr$theta[[1]], fast_ahr$theta[[2]], fast_ahr$ahr),
reference = c(ref_theta1, 1 - ref_theta1, (1 - ref_theta1) / ref_theta1),
row.names = NULL
)
#> quantity fast reference
#> 1 theta (group 1) 0.6965377 0.6965377
#> 2 theta (group 2) 0.3034623 0.3034623
#> 3 AHR 0.4356725 0.4356725The point estimates match the survival-based reference to numerical
precision. The average hazard ratio was additionally cross-checked
against the archived AHR package, the reference implementation used by
Dormuth et al. (2024): the two group shares, the average hazard ratio,
the variances, and both test statistics agree to within the order of
1e-15. That external comparison is reproducible with the
tools/verify_ahr_fast.R script shipped in the package
sources.
maxcombo_fast() computes the max-combo test, the most
extreme of a set of Fleming-Harrington weighted log-rank statistics. The
nph package provides logrank.maxtest(), whose
default weight set is FH(0, 0), FH(0, 1), and FH(1, 0). We request the
same three weights from maxcombo_fast() and compare the
component Z-scores. The individual Z-scores are kept in the
z attribute, signed so that a negative value indicates
benefit for the treatment group under the package convention. Because
nph orients the contrast in the opposite direction, the
signs are mirrored, so we compare absolute values.
rho <- c(0, 0, 1)
gamma <- c(0, 1, 0)
fast_mc <- maxcombo_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
control = 1, side = 2, rho = rho, gamma = gamma)
nph_mc <- nph::logrank.maxtest(ovarian$futime, ovarian$fustat, ovarian$rx)
data.frame(
weight = c("FH(0,0)", "FH(0,1)", "FH(1,0)"),
fast = abs(attr(fast_mc, "z")),
nph = abs(nph_mc$tests$z),
row.names = NULL
)
#> weight fast nph
#> 1 FH(0,0) 1.03089275 1.03089275
#> 2 FH(0,1) 0.01010314 0.01010314
#> 3 FH(1,0) 1.29801950 1.29801950The component statistics agree in absolute value. The final max-combo
statistic is the largest of these, which for the two-sided test is
max(abs(Z)) and is therefore always non-negative. The
p-values from the two packages are close but not identical, since they
use different numerical methods for the joint distribution (a
multivariate normal integral here, a Bonferroni-style combination in
nph).
rmw_fast() computes the robust modestly-weighted (rMW)
test of Magirr and Öhrn, the maximum of the standard log-rank statistic
and a single modestly-weighted log-rank statistic. Each component is a
weighted log-rank test, so we check the two component Z-scores against
nphRCT, the
implementation used by the method’s authors. The modestly-weighted
component uses the survival-threshold parameterization, so
s_star = 1 recovers the standard log-rank test and
s_star = 0.5 gives the modestly-weighted component. As with
the max-combo test the package orients a negative Z toward treatment
benefit while nphRCT uses the opposite sign, so we compare
absolute values.
ov_df <- data.frame(
time = ovarian$futime,
event = ovarian$fustat,
arm = factor(ifelse(ovarian$rx == 1, "control", "experimental"),
levels = c("control", "experimental"))
)
fit_rmw <- rmw_fast(ov_df$time, ov_df$event, ov_df$arm,
control = "control", side = 1, s_star = 0.5)
z_lr_nph <- nphRCT::wlrt(Surv(time, event) ~ arm, data = ov_df,
method = "mw", s_star = 1)$z
z_mw_nph <- nphRCT::wlrt(Surv(time, event) ~ arm, data = ov_df,
method = "mw", s_star = 0.5)$z
data.frame(
component = c("log-rank (s_star = 1)", "modestly-weighted (s_star = 0.5)"),
fast = abs(attr(fit_rmw, "z")),
nphRCT = abs(c(z_lr_nph, z_mw_nph)),
row.names = NULL
)
#> component fast nphRCT
#> 1 log-rank (s_star = 1) 1.0308927 1.0308927
#> 2 modestly-weighted (s_star = 0.5) 0.7582546 0.7582546The component statistics agree in absolute value. The null
correlation of the two components, reported in the corr
attribute, matches the covariance computed by the authors’
find_cor routine to numerical precision (0.98 on the
ovarian data). The combined statistic and one-sided p-value
then follow from the bivariate normal distribution of the two
components, evaluated with mvtnorm::pmvnorm. On the POPLAR
overall-survival data analyzed in the original article,
rmw_fast() reproduces the published one-sided p-values of
0.0028 for the log-rank test, 0.0009 for the modestly-weighted test, and
0.0012 for the rMW test, with a component correlation of 0.97.
Across all functions the FastSurvival results reproduce the reference values from the established packages. The point estimates and test statistics agree to numerical precision for the Kaplan-Meier, log-rank, RMST, milestone, and average-hazard quantities, the average hazard ratio matches a survival-based reference, and the closed-form Cox hazard ratio agrees with the partial-likelihood maximizer to the order expected for the Pike-Halley approximation. This agreement is verified continuously by the package test suite.
Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457-481.
Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports, 50(3), 163-170.
Fleming, T. R., & Harrington, D. P. (1991). Counting Processes and Survival Analysis. New York: John Wiley & Sons.
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2), 187-220.
Homma, G. (2025). One step from Pike to Cox: a closed-form hazard ratio estimator. Manuscript under review.
Royston, P., & Parmar, M. K. B. (2013). Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Medical Research Methodology, 13, 152.
Uno, H., Claggett, B., Tian, L., et al. (2014). Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. Journal of Clinical Oncology, 32(22), 2380-2385.
Tang, Y. (2021). Some new confidence intervals for Kaplan-Meier based estimators from one and two sample survival data. Statistics in Medicine, 40(23), 4961-4976.
Uno, H., & Horiguchi, M. (2023). Ratio and difference of average hazard with survival weight: new measures to quantify survival benefit of new therapy. Statistics in Medicine, 42(7), 936-952.
Kalbfleisch, J. D., & Prentice, R. L. (1981). Estimation of the average hazard ratio. Biometrika, 68(1), 105-112.
Dormuth, I., Pauly, M., Rauch, G., & Herrmann, C. (2024). Sample size calculation under nonproportional hazards using average hazard ratios. Biometrical Journal, 66(6), e202300271.
Karrison, T. G. (2016). Versatile tests for comparing survival curves based on weighted log-rank statistics. The Stata Journal, 16(3), 678-690.
Lin, R. S., Lin, J., Roychoudhury, S., et al. (2020). Alternative analysis methods for time to event endpoints under nonproportional hazards: a comparative analysis. Statistics in Biopharmaceutical Research, 12(2), 187-198.
Magirr, D., & Burman, C.-F. (2019). Modestly weighted logrank tests. Statistics in Medicine, 38(20), 3782-3790.
Magirr, D., & Öhrn, F. (2026). Robust modestly weighted log-rank tests. Pharmaceutical Statistics, 25(1), e70066.