The Quiet Power of S3 – Why R's Simplest OOP System Is So Effective

R’s object-oriented programming landscape includes several systems (S3, S4, R6, and the emerging S7), but for most practical data science and engineering work in 2025, S3 remains the clear winner for its balance of simplicity, power, and seamless integration with the ecosystem.

After years of building production analytics packages – including for Health New Zealand’s immunisation programmes – I’ve seen firsthand how S3’s minimalism lets us deliver rich, extensible behaviour without fighting the language.

A More Realistic Vaccination Example

Real public health analytics rarely treat all vaccines the same. Rules for “fully vaccinated” or “up-to-date” vary by programme.

Let’s mock an event fact table with two vaccines, each with its own deliberately simplified rule:

  • COVID-19: up-to-date = ≥1 dose in the last 6 months
  • MMR: fully vaccinated = exactly 3 doses (ever)

library(dplyr)      # filter(), distinct(), count(), n_distinct()
library(tibble)
library(lubridate)

vaccination_events <- tribble(
  ~event_id, ~person_id, ~vaccine,   ~dose_number, ~administration_date,
  100001,    "P001",     "COVID",    1,            ymd("2025-02-10"),
  100002,    "P001",     "COVID",    2,            ymd("2025-07-15"),  # within last 6 months
  100003,    "P002",     "COVID",    1,            ymd("2024-11-20"),  # >6 months ago
  100004,    "P003",     "MMR",      1,            ymd("2010-03-05"),
  100005,    "P003",     "MMR",      2,            ymd("2010-04-10"),
  100006,    "P003",     "MMR",      3,            ymd("2011-08-20"),
  100007,    "P004",     "MMR",      1,            ymd("2015-06-01"),
  100008,    "P004",     "MMR",      2,            ymd("2015-07-15"),
  100009,    "P005",     "COVID",    1,            ymd("2025-09-01")   # within last 6 months
)

vaccination_events

Console output:

# A tibble: 9 × 5
  event_id person_id vaccine dose_number administration_date
     <dbl> <chr>     <chr>         <dbl> <date>             
1   100001 P001      COVID             1 2025-02-10         
2   100002 P001      COVID             2 2025-07-15         
3   100003 P002      COVID             1 2024-11-20         
4   100004 P003      MMR               1 2010-03-05         
5   100005 P003      MMR               2 2010-04-10         
6   100006 P003      MMR               3 2011-08-20         
7   100007 P004      MMR               1 2015-06-01         
8   100008 P004      MMR               2 2015-07-15         
9   100009 P005      COVID             1 2025-09-01         

Vaccine-Specific Summaries with S3

We create separate S3 classes that share the same raw events but implement different business logic.

# Base constructor – holds the raw events
new_vacc_summary_base <- function(events_df) {
  structure(
    list(events = events_df),
    class = c("vacc_summary_base", "list")
  )
}

# COVID-specific class and methods
new_covid_summary <- function(events_df) {
  obj <- new_vacc_summary_base(events_df |> filter(vaccine == "COVID"))
  class(obj) <- c("covid_summary", class(obj))
  obj
}

vaccinated_people.covid_summary <- function(obj) {
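  # %m-% rolls back to the last valid day of the month; plain `-` returns NA
  # when the target date does not exist (e.g. 31 August minus 6 months)
  recent_cutoff <- today() %m-% months(6)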
  
  obj$events |>
    filter(administration_date >= recent_cutoff) |>
    distinct(person_id) |>
    nrow()
}

# MMR-specific class and methods
new_mmr_summary <- function(events_df) {
  obj <- new_vacc_summary_base(events_df |> filter(vaccine == "MMR"))
  class(obj) <- c("mmr_summary", class(obj))
  obj
}

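vaccinated_people.mmr_summary <- function(obj) {
  obj$events |>
    distinct(person_id, dose_number) |>     # one row per person and dose number
    count(person_id, name = "n_doses") |>   # distinct doses recorded per person
    filter(n_doses == 3) |>
    nrow()
}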

# Generic: UseMethod() calls the method for the first class in class(obj) that has one
vaccinated_people <- function(obj) UseMethod("vaccinated_people")

print.vacc_summary_base <- function(x, ...) {
  cat("Vaccination summary for", class(x)[1], "\n")
  cat("Total events:", nrow(x$events), "\n")
  cat("Unique people:", n_distinct(x$events$person_id), "\n")
  invisible(x)
}

Usage – the magic of dispatch:

covid_sum <- new_covid_summary(vaccination_events)
mmr_sum   <- new_mmr_summary(vaccination_events)

print(covid_sum)
print(mmr_sum)

vaccinated_people(covid_sum)  # → 2 when run in late 2025 (P001 and P005 have recent doses)
vaccinated_people(mmr_sum)    # → 1 (only P003 has all 3 doses)

Example output:

Vaccination summary for covid_summary 
Total events: 4 
Unique people: 3 

Vaccination summary for mmr_summary 
Total events: 5 
Unique people: 2 

> vaccinated_people(covid_sum)
[1] 2

> vaccinated_people(mmr_sum)
[1] 1
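
Why did print() work for both objects when neither class defines its own print method? The answer is in the class vector. Here is a minimal sketch using a hand-built stand-in object (not the constructors above), so it runs with base R alone:

```r
# Stand-in for covid_sum: a list with the same class vector the constructors produce
obj <- structure(
  list(events = data.frame(person_id = c("P001", "P005"))),
  class = c("covid_summary", "vacc_summary_base", "list")
)

# A print method exists only for the base class
print.vacc_summary_base <- function(x, ...) {
  cat("Vaccination summary for", class(x)[1], "\n")
  invisible(x)
}

# The class vector is the dispatch order, most specific class first
dispatch_order <- class(obj)

# inherits() checks membership anywhere in that vector
is_base <- inherits(obj, "vacc_summary_base")

# No print.covid_summary exists, so print() falls through to the base method
printed <- capture.output(print(obj))
printed
```

UseMethod() walks the class vector from left to right and calls the first method it finds. If you later add a print.covid_summary, it takes precedence for COVID objects and the base method needs no change.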

Why This Pattern Was So Powerful in Production

During the COVID-19 response and ongoing immunisation work at Health New Zealand, we used exactly this S3 approach in internal packages:

  • Different vaccines (COVID, flu, childhood schedule) each got their own lightweight S3 class.
  • Custom generics like vaccinated_people(), coverage_by_age(), equity_gap() dispatched to the correct logic automatically.
  • Analysts could write vaccinated_people(obj) without knowing the underlying rules – the object knew how to answer.
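
Adding a new programme under this pattern can be sketched as follows. Everything flu-specific here is an assumption for illustration: the "FLU" vaccine code, the new_flu_summary() constructor, and the rule itself (at least one dose in a given season year). The sketch uses base R only so that it is self-contained:

```r
# Generic, as defined earlier in the article
vaccinated_people <- function(obj) UseMethod("vaccinated_people")

# Hypothetical constructor: same shape as the COVID and MMR ones,
# plus the season year the rule should be evaluated against
new_flu_summary <- function(events_df, season_year) {
  structure(
    list(
      events = events_df[events_df$vaccine == "FLU", , drop = FALSE],
      season_year = season_year
    ),
    class = c("flu_summary", "vacc_summary_base", "list")
  )
}

# Hypothetical rule (assumption): at least one dose during the season year
vaccinated_people.flu_summary <- function(obj) {
  dose_year <- as.integer(format(obj$events$administration_date, "%Y"))
  length(unique(obj$events$person_id[dose_year == obj$season_year]))
}

# Made-up events for the sketch
events <- data.frame(
  person_id = c("P010", "P010", "P011", "P012"),
  vaccine = c("FLU", "FLU", "FLU", "COVID"),
  administration_date = as.Date(
    c("2024-04-02", "2025-04-10", "2024-05-01", "2025-05-05")
  )
)

flu_sum <- new_flu_summary(events, season_year = 2025L)
flu_count <- vaccinated_people(flu_sum)  # 1: only P010 has a 2025 flu dose
```

Nothing in the existing COVID or MMR code has to change. The analyst still calls vaccinated_people(obj), and the new class brings its own rule along.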

At AA Insurance, we applied the same idea to risk models: different product lines (motor, home, contents) shared raw claims data but had product-specific expected_loss() and retention_rate() methods.

The result: clean, extensible code that evolved with policy changes without breaking existing reports or dashboards.
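
One way a policy change can stay local to a single method is sketched below, under stated assumptions. The as_of and window_months arguments are hypothetical, not taken from the packages described above. The generic also takes ... so that methods can accept extra arguments, a small change from the one-argument generic shown earlier. The explicit reference date makes the result reproducible, which today() is not:

```r
# Generic with ... so individual methods can accept extra arguments
vaccinated_people <- function(obj, ...) UseMethod("vaccinated_people")

# Hypothetical variant of the COVID rule with an explicit reference date
# and a configurable look-back window (both are assumptions)
vaccinated_people.covid_summary <- function(obj, as_of = Sys.Date(),
                                            window_months = 6, ...) {
  # Step back window_months calendar months from the reference date (base R)
  cutoff <- seq(as_of, by = paste0("-", window_months, " months"),
                length.out = 2)[2]
  dates <- obj$events$administration_date
  recent <- dates >= cutoff & dates <= as_of
  length(unique(obj$events$person_id[recent]))
}

# The COVID rows from the mock table, built by hand for the sketch
covid_sum <- structure(
  list(events = data.frame(
    person_id = c("P001", "P001", "P002", "P005"),
    administration_date = as.Date(
      c("2025-02-10", "2025-07-15", "2024-11-20", "2025-09-01")
    )
  )),
  class = c("covid_summary", "vacc_summary_base", "list")
)

ref_date <- as.Date("2025-10-01")
six_month <- vaccinated_people(covid_sum, as_of = ref_date)       # 2: P001, P005
twelve_month <- vaccinated_people(covid_sum, as_of = ref_date,
                                  window_months = 12)             # 3: adds P002
```

Reports that call vaccinated_people(covid_sum) keep working unchanged. Only this one method, or its default, changes when the rule does.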

When S3 Wins

For analytics packages, modelling outputs, reporting objects, and any domain where behaviour depends on type but you want minimal ceremony – S3 is ideal.

You get true polymorphism, easy extension by other packages, and objects that sit comfortably alongside tidyverse code – all with almost no boilerplate.

Need mutable state or private fields? Reach for R6 (or watch S7).
But for most real-world data work, S3’s quiet power is hard to beat.
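
The mutable-state point comes down to value semantics: an S3 object is an ordinary R value, so a method that changes it is working on a copy. Here is a minimal sketch with a hypothetical counter class, invented purely for illustration:

```r
# Hypothetical class used only to illustrate copy-on-modify
new_counter <- function() structure(list(n = 0), class = "counter")

increment <- function(obj) UseMethod("increment")

# The method modifies its local copy and returns it
increment.counter <- function(obj) {
  obj$n <- obj$n + 1
  obj
}

counter <- new_counter()

# Result discarded: the caller's object is untouched
invisible(increment(counter))
unchanged_n <- counter$n   # still 0

# Reassigning is how an S3 object "changes"
counter <- increment(counter)
updated_n <- counter$n     # now 1
```

For analysis objects this is usually what you want, since results cannot be altered behind your back. When an object really does need to update itself in place, that is the cue to move to a reference-based system such as R6.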

Enjoying S3 in your own projects? I’d love to hear your favourite pattern.

phil@virtus-solutions.io

#rstats #datascience #OOP #dataengineering #publichealth