26  hyperframe

Important: Disclaimer

These packages (Note 1) are a one-person project undergoing rapid evolution. Backward compatibility (per Hadley Wickham) is provided as a courtesy rather than a guarantee.

Until further notice, these packages should

  • not be used as a basis for research grant applications,
  • not be cited as an actively maintained tool in a peer-reviewed manuscript,
  • not be used to support or fulfill requirements for pursuing an academic degree.

In addition, work primarily based on these packages (Note 1) should not be presented at academic conferences or similar scholarly venues.

Furthermore, a person’s ability to use these packages (Note 1) does not necessarily imply an understanding of their underlying mechanisms. Accordingly, demonstration of their use alone should not be considered sufficient evidence of expertise, nor should it be credited as a basis for academic promotion or advancement.

These statements do not apply to the contributors (Tip 1) to these packages (Note 1) with respect to their specific contributions.

These statements do not apply when the maintainer of these packages (Note 1), Tingting Zhan, is credited as the first author, the lead author, and/or the corresponding author in a peer-reviewed manuscript, or as the Principal Investigator or Co-Principal Investigator in a research grant application and/or a final research progress report.

These statements are advisory in nature and do not modify or restrict the rights granted under the GNU General Public License https://www.r-project.org/Licenses/.

The function hyperframe() creates a hyper data frame, i.e., an R object of S3 class 'hyperframe'. The S3 generic function as.hyperframe() converts R objects of various classes into a hyper data frame. Note 26.1 and Note 1.2 summarize, respectively, the S3 methods for the generic function as.hyperframe() and the S3 methods for the class 'hyperframe' in the spatstat.* family of packages.

S3 methods of spatstat.geom::as.hyperframe (v3.7.3)
                          visible  isS4
as.hyperframe.anylist     TRUE     FALSE
as.hyperframe.data.frame  TRUE     FALSE
as.hyperframe.default     TRUE     FALSE
as.hyperframe.hyperframe  TRUE     FALSE
as.hyperframe.listof      TRUE     FALSE
as.hyperframe.ppx         TRUE     FALSE
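
As a minimal sketch of the two entry points described above, the following code builds a small hyper data frame from toy data with hyperframe(), and converts a regular data frame with as.hyperframe(). The object names toy_ppp, h and d, as well as the simulated point-patterns, are hypothetical and are not part of the flu examples used later in this chapter.

```r
library(spatstat.geom)

set.seed(26)
# three toy point-patterns in the unit square (hypothetical data)
toy_ppp = lapply(c(5L, 8L, 3L), FUN = \(n) {
  ppp(x = runif(n), y = runif(n), window = owin(c(0, 1), c(0, 1)))
})

# hyperframe(): one regular column $id and one ppp-hypercolumn $pattern
h = hyperframe(id = 1:3, pattern = toy_ppp)
inherits(h, what = 'hyperframe')

# as.hyperframe(): dispatches to the S3 method as.hyperframe.data.frame()
d = data.frame(id = 1:3, dose = c(0.1, 0.5, 1)) |>
  as.hyperframe()
dim(d) # number of rows, number of (hyper)columns
```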
Tip: Examples in Chapter 26 Require
library(groupedHyperframe)

Table 26.1 summarizes the S3 methods for the class 'hyperframe' in package groupedHyperframe (v0.4.0, GPL-2),

Table 26.1: S3 methods groupedHyperframe::*.hyperframe (v0.4.0)
                                 visible  generic                                  isS4
aggregate.hyperframe             FALSE    stats::aggregate                         FALSE
as.groupedHyperframe.hyperframe  FALSE    groupedHyperframe::as.groupedHyperframe  FALSE
get_nested_factor.hyperframe     FALSE    groupedHyperframe::get_nested_factor     FALSE
getGroupsFormula.hyperframe      FALSE    nlme::getGroupsFormula                   FALSE
length.hyperframe                FALSE    base::length                             FALSE
superimpose.hyperframe           FALSE    spatstat.geom::superimpose               FALSE
within.hyperframe                FALSE    base::within                             FALSE

26.1 Examples

Listing 26.1 creates a subset of the hyper data frame flu (Section 10.12).

Listing 26.1: Data: fluM, a subset of flu (Section 10.12)
fluM = spatstat.data::flu |>
  spatstat.geom::subset.hyperframe(
    subset = (stain == 'M2-M1') & (virustype == 'wt'),
    select = c('pattern', 'frameid')
  )
fluM
# Hyperframe:
#             pattern frameid
# wt M2-M1 13   (ppp)      13
# wt M2-M1 22   (ppp)      22
# wt M2-M1 27   (ppp)      27
# wt M2-M1 43   (ppp)      43
# wt M2-M1 49   (ppp)      49
# wt M2-M1 65   (ppp)      65
# wt M2-M1 71   (ppp)      71
# wt M2-M1 84   (ppp)      84

Listing 26.2 shows that each member of the ppp-hypercolumn fluM$pattern (Listing 26.1) has one multi-type mark with two levels 'M2' and 'M1'.

Listing 26.2: Review: number of M1 and/or M2 points per point-pattern in fluM$pattern (Listing 26.1)
fluM$pattern |>
  sapply(FUN = \(i) {
    i |> 
      spatstat.geom::marks.ppp() |> 
      table()
  }) |>
  addmargins()
#     wt M2-M1 13 wt M2-M1 22 wt M2-M1 27 wt M2-M1 43 wt M2-M1 49 wt M2-M1 65 wt M2-M1 71 wt M2-M1 84  Sum
# M2          117          65          71         241         150         116          57         104  921
# M1          354         152         143         165         267         202         208         405 1896
# Sum         471         217         214         406         417         318         265         509 2817

26.2 Within

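The S3 method within.hyperframe() evaluates the expression expr in each row of a hyper data frame, and returns the hyper data frame with the newly created (hyper)columns appended. For example, the following code adds a numeric column $npt, the number of points in each member of the ppp-hypercolumn fluM$pattern (Listing 26.1).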

fluM |>
  within(expr = {
    npt = spatstat.geom::npoints(pattern)
  })
# Hyperframe:
#             pattern frameid npt
# wt M2-M1 13   (ppp)      13 471
# wt M2-M1 22   (ppp)      22 217
# wt M2-M1 27   (ppp)      27 214
# wt M2-M1 43   (ppp)      43 406
# wt M2-M1 49   (ppp)      49 417
# wt M2-M1 65   (ppp)      65 318
# wt M2-M1 71   (ppp)      71 265
# wt M2-M1 84   (ppp)      84 509
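
For comparison, a similar column may be obtained with package spatstat.geom alone, by evaluating the expression row by row via the S3 method with.hyperframe() and then assigning the result as a new column. The following is a sketch on hypothetical toy data (the object names toy and npt are made up); it is not meant to describe how within.hyperframe() is implemented.

```r
library(spatstat.geom)

set.seed(26)
W = owin(c(0, 1), c(0, 1))
# toy hyper data frame with a ppp-hypercolumn of 4 and 7 points (hypothetical data)
toy = hyperframe(
  frameid = c(13L, 22L),
  pattern = list(
    ppp(x = runif(4), y = runif(4), window = W),
    ppp(x = runif(7), y = runif(7), window = W)
  )
)

# with.hyperframe() evaluates the expression once per row and simplifies to a vector
npt = with(toy, npoints(pattern))

# append the result as a regular column
toy$npt = unname(npt)
toy
```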

26.2.1 Default \(r_\text{max}\)

Listing 26.3: Example: function .rmax()
fluM |> 
  within(expr = {
    rmax = pattern |>
      .rmax(fun = 'G', i = 'M2', j = 'M1')
  })
# Hyperframe:
#             pattern frameid     rmax
# wt M2-M1 13   (ppp)      13 338.9151
# wt M2-M1 22   (ppp)      22 517.2146
# wt M2-M1 27   (ppp)      27 533.2422
# wt M2-M1 43   (ppp)      43 496.4215
# wt M2-M1 49   (ppp)      49 390.2446
# wt M2-M1 65   (ppp)      65 448.6595
# wt M2-M1 71   (ppp)      71 442.1411
# wt M2-M1 84   (ppp)      84 316.8583

26.2.2 Aggregate Marks-Statistics from ppp-Hypercolumn

Listing 26.4 aggregates the relative frequencies of the 'M2' and 'M1' marks in each member of the ppp-hypercolumn fluM$pattern (Listing 26.1, Listing 26.2).

Listing 26.4: Example: function aggregate_marks.hyperframe(), for relative frequencies (Listing 26.1)
fluM_relfreq = fluM |>
  within(expr = {
    markstats = pattern |>
      aggregate_marks(FUN = \(z) table(z)/length(z))
  })
fluM_relfreq
# Hyperframe:
#             pattern frameid markstats
# wt M2-M1 13   (ppp)      13 (numeric)
# wt M2-M1 22   (ppp)      22 (numeric)
# wt M2-M1 27   (ppp)      27 (numeric)
# ✂️ --- output truncated --- ✂️

Listing 26.5 shows the aggregated relative frequencies of the 'M2' and 'M1' marks in the returned hypercolumn fluM_relfreq$markstats (Listing 26.4).

Listing 26.5: Example: inside numeric-hypercolumn fluM_relfreq$markstats (Listing 26.4)
fluM_relfreq$markstats
# wt M2-M1 13:
#        M2        M1 
# 0.2484076 0.7515924 
# 
# wt M2-M1 22:
#        M2        M1 
# 0.2995392 0.7004608 
# ✂️ --- output truncated --- ✂️
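
The summary function passed to FUN in Listing 26.4 is plain base R. As a sketch, the same relative-frequency computation on the marks of a single point-pattern looks as follows; the toy pattern and the names toy, z and relfreq are hypothetical.

```r
library(spatstat.geom)

set.seed(26)
# toy multi-type point-pattern with 3 'M2' points and 7 'M1' points (hypothetical data)
toy = ppp(
  x = runif(10), y = runif(10), window = owin(c(0, 1), c(0, 1)),
  marks = factor(rep(c('M2', 'M1'), times = c(3L, 7L)), levels = c('M2', 'M1'))
)

# same summary as FUN = \(z) table(z)/length(z) in Listing 26.4
z = marks(toy)
relfreq = table(z) / length(z)
relfreq
```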

The S3 method t.vectorlist() (Section 43.3) is the fastest way to extract a “slice” from the numeric-hypercolumn, e.g., fluM_relfreq$markstats.

Listing 26.6: Advanced: function t.vectorlist() (Listing 26.4)
fluM_relfreq$markstats |>
  groupedHyperframe:::t.vectorlist()
# A 'vectorlist' of 2 vectors 
# Name(s): M2, M1 
# Storage Mode: numeric 
# Individual Vector Length: 8
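
Conceptually, a “slice” is one named element taken across all members of the hypercolumn. The following base-R sketch illustrates the idea with the two members printed in Listing 26.5; it row-binds the vectors and then reads one column. This is an illustration only and is not the implementation of t.vectorlist(); the names markstats, m and slice_M2 are made up.

```r
# the first two members of fluM_relfreq$markstats, copied from Listing 26.5
markstats = list(
  'wt M2-M1 13' = c(M2 = 0.2484076, M1 = 0.7515924),
  'wt M2-M1 22' = c(M2 = 0.2995392, M1 = 0.7004608)
)

# row-bind into a matrix: one row per member, one column per name
m = do.call(what = rbind, args = markstats)

# a "slice": the relative frequency of 'M2' across all members
slice_M2 = m[, 'M2']
slice_M2
```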

Unfortunately, package spatstat.data (v3.1.9, GPL (>= 2)) does not have a hyper data frame with (any) ppp-hypercolumn of

  • 'dataframe' mark-format, to showcase the use of parameter by (Section 36.7.3.1).
  • 'vector' mark-format and numeric marks, to showcase the aggregation by sample mean and standard deviation.

26.2.3 Function-Value-Table on Eligible Marks

Listing 26.7: Data: a hyper data frame fluG (Listing 26.1)
fluG = fluM |>
  within(expr = {
    m.G = pattern |>
      Gcross_(i = 'M1', j = 'M2', r = 0:300, correction = 'none')
  })
fluG
# Hyperframe:
#             pattern frameid  m.G
# wt M2-M1 13   (ppp)      13 (fv)
# wt M2-M1 22   (ppp)      22 (fv)
# wt M2-M1 27   (ppp)      27 (fv)
# ✂️ --- output truncated --- ✂️
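
For readers unfamiliar with function-value-tables, the sketch below calls spatstat.explore::Gcross() directly on a single multi-type point-pattern and inspects the returned 'fv' object. It uses the data set amacrine from package spatstat.data rather than the flu data, and it assumes, from the name only, that Gcross_() in Listing 26.7 is related to spatstat.explore::Gcross().

```r
# cross-type nearest-neighbour distance distribution, from 'off' points to 'on' points
g = spatstat.explore::Gcross(
  X = spatstat.data::amacrine,
  i = 'off', j = 'on',
  correction = 'none' # uncorrected estimate, as in Listing 26.7
)

class(g) # a function-value-table, i.e., class 'fv'
names(g) # the argument r and the estimate(s) of the function
```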

26.2.4 Cumulative Average Vertical Height of Trapezoidal Integration of fv-Hypercolumn

Listing 26.8 creates a numeric-hypercolumn $m.G.cumv from the fv-hypercolumn fluG$m.G in the returned hyper data frame fluG_vt.

Listing 26.8: Example: function cumvtrapz() in hyperframe (Listing 26.7)
fluG_vt = fluG |>
  within(expr = {
    m.G.cumv = cumvtrapz(m.G, drop = TRUE)
  })
fluG_vt
# Hyperframe:
#             pattern frameid  m.G  m.G.cumv
# wt M2-M1 13   (ppp)      13 (fv) (numeric)
# wt M2-M1 22   (ppp)      22 (fv) (numeric)
# wt M2-M1 27   (ppp)      27 (fv) (numeric)
# ✂️ --- output truncated --- ✂️
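
The quantity in the heading can be sketched in base R under the following assumptions, which are inferred from the name and are not confirmed by the source: the cumulative trapezoidal integral of \(y\) over \(x\) is divided by the width of the domain integrated so far, \(x_k - x_1\), and drop = TRUE removes the undefined first element. The helper functions cum_trapz() and cum_vtrapz() are hypothetical and are not the package's cumvtrapz().

```r
# cumulative trapezoidal integral of y over x; the first element is 0
cum_trapz = function(x, y) {
  c(0, cumsum(diff(x) * (y[-1L] + y[-length(y)]) / 2))
}

# cumulative average vertical height: integral divided by the domain width so far
# (assumption: drop = TRUE removes the first element, which is 0/0 = NaN)
cum_vtrapz = function(x, y, drop = TRUE) {
  v = cum_trapz(x, y) / (x - x[1L])
  if (drop) v[-1L] else v
}

r = 0:4
cum_vtrapz(x = r, y = r) # for y = x, the average height up to r is r/2
# [1] 0.5 1.0 1.5 2.0
```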

The S3 method t.vectorlist() (Section 43.3) is the fastest way to extract a “slice” from a numeric-hypercolumn, e.g., fluG_vt$m.G.cumv (Listing 26.8). Such a hypercolumn is in essence a 'vectorlist' (Chapter 43), although the class 'vectorlist' is not supported as a hypercolumn in a hyper data frame as of package spatstat.geom (v3.7.3, GPL (>= 2)). A “slice” of the hypercolumn fluG_vt$m.G.cumv at the 50th index of the \(r\)-vector, i.e., \(r=50\), may also be extracted by calling the S3 method spatstat.geom::with.hyperframe() (v3.7.3, GPL (>= 2)) (Listing 26.10), but the S3 method t.vectorlist() (Listing 26.9) is much faster.

Listing 26.9: Advanced: function t.vectorlist() (Listing 26.8)
tG = fluG_vt$m.G.cumv |>
  as.vectorlist() |>
  t()

Listing 26.10: Review: function with.hyperframe() (Listing 26.8, Listing 26.9)
fluG_vt |>
  spatstat.geom::with.hyperframe(expr = {
    m.G.cumv['50']
  }) |>
  identical(y = tG[[50L]]) |>
  stopifnot()
fluG_vt |>
  spatstat.geom::with.hyperframe(expr = {
    m.G.cumv[50L]
  }) |>
  identical(y = tG[[50L]]) |>
  stopifnot()

26.2.5 Random Re-Labelling Envelope Residual & Test

Listing 26.11 performs the random re-labelling envelope residual and test on the ppp-hypercolumn fluM$pattern (Listing 26.1), and creates a hyper data frame with a numeric column $GET_p.

Listing 26.11: Example: functions rlabelRes.ppp(), GET::global_envelope_test() (Listing 26.1)
fluM |>
  within(expr = {
    GET_p = pattern |>
      rlabelRes(fun = spatstat.explore::Gcross) |>
      GET::global_envelope_test() |>
      attr(which = 'p', exact = TRUE)
  })
# Hyperframe:
#             pattern frameid GET_p
# wt M2-M1 13   (ppp)      13  0.01
# wt M2-M1 22   (ppp)      22  0.02
# wt M2-M1 27   (ppp)      27  0.02
# wt M2-M1 43   (ppp)      43  0.01
# wt M2-M1 49   (ppp)      49  0.02
# wt M2-M1 65   (ppp)      65  0.01
# wt M2-M1 71   (ppp)      71  0.05
# wt M2-M1 84   (ppp)      84  0.01
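
The null hypothesis behind Listing 26.11 is random re-labelling: the point locations stay fixed while the marks are permuted among the points. The sketch below shows one such permutation with spatstat.random::rlabel() on hypothetical toy data; it does not reproduce rlabelRes() or the global envelope test.

```r
library(spatstat.geom)

set.seed(26)
# toy multi-type point-pattern (hypothetical data)
toy = ppp(
  x = runif(10), y = runif(10), window = owin(c(0, 1), c(0, 1)),
  marks = factor(rep(c('M2', 'M1'), times = c(3L, 7L)), levels = c('M2', 'M1'))
)

# one random re-labelling: the marks are permuted, the locations are unchanged
sim = spatstat.random::rlabel(toy)

table(marks(sim)) # same mark counts as the original pattern
```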

26.3 Length

The S3 method length.hyperframe() finds the number of columns and/or hypercolumns of a hyper data frame. Table 26.2 explains its rationale and similarity to other length methods in package base (R version 4.5.3 (2026-03-11)).

Table 26.2: Rationale of S3 Method length.hyperframe()
                           length() of 'data.frame'          length.POSIXlt()        length.hyperframe()
User-Perceived Length      Yes (Listing 26.12),              Yes (Listing 26.15),    Yes (Listing 26.13),
                           the number of columns             the number of elements  the number of (hyper)columns
Internal Structure Length  Yes, the number of list elements  No (Listing 26.16)      No (Listing 26.14)

Listing 26.12 reveals that the data frame Formaldehyde from package datasets (R version 4.5.3 (2026-03-11)) has 2 columns, using the .Primitive S3 generic function length().

Listing 26.12: Review: function length() on data.frame
datasets::Formaldehyde |>
  length()
# [1] 2

Listing 26.13 reveals that the hyper data frame demohyper (Section 10.9) has 3 (hyper)columns. The internal structure length of a hyper data frame (Listing 26.14) is not relevant to end users and may change without notice in package spatstat.geom (v3.7.3, GPL (>= 2)).

Listing 26.13: Example: function length.hyperframe()
spatstat.data::demohyper |>
  length()
# [1] 3

Listing 26.14: Review: length of hyper data frame, internal-structure
spatstat.data::demohyper |>
  unclass() |>
  length()
# [1] 8

The S3 method length.POSIXlt() (R version 4.5.3 (2026-03-11)) (Listing 26.15) returns the user-perceived length of a POSIXlt object, rather than its internal structure length (Listing 26.16).

Listing 26.15: Review: length of a POSIXlt object, user-perceived
tm = Sys.time() |> 
  as.POSIXlt.POSIXct(tz = 'GMT')
tm |> 
  length.POSIXlt()
# [1] 1

Listing 26.16: Review: length of a POSIXlt object, internal-structure (Listing 26.15)
tm |> 
  unclass() |>
  length()
# [1] 11
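
The distinction between the two lengths in Table 26.2 can be illustrated with a toy S3 class. In the sketch below the class 'toyframe', its internal layout and its length method are all hypothetical; they only mimic the idea of reporting a user-perceived length and do not describe the internal structure of a 'hyperframe'.

```r
# a toy object with four internal list elements (hypothetical layout)
toy = structure(
  list(
    df = data.frame(a = 1:2, b = 3:4),     # two regular columns
    hypercolumns = list(p = list('u', 'v')), # one hypercolumn
    ncases = 2L,
    nvars = 3L
  ),
  class = 'toyframe'
)

# user-perceived length: the number of columns plus the number of hypercolumns
length.toyframe = function(x) {
  ncol(x$df) + length(x$hypercolumns)
}

length(toy)          # user-perceived length, via the S3 method
length(unclass(toy)) # internal-structure length, via the default behaviour
```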

26.4 Aggregation

The S3 method aggregate.hyperframe()

  • splits, according to the grouping level specified in the parameter by,
    • the hypercolumn(s) that are ppplist (Chapter 37) into list(s) of ppplist;
    • the hypercolumn(s) that are imlist (Chapter 29) into list(s) of imlist;
    • the hypercolumn(s) that are fvlist (Chapter 21) into list(s) of fvlist;
    • the hypercolumn(s) that are solist (Chapter 39) into list(s) of solist.
  • aggregates, according to the grouping level specified in the parameter by,
    • the regular column(s) by simply taking their unique-value, as the elements in each column must be all.equal within each grouping of by;
  • returns a hyper data frame.
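
The split-and-aggregate steps listed above can be sketched in base R. The objects below are hypothetical stand-ins (character strings in place of point-patterns, and made-up names by, hypercol and regular); the code illustrates the logic only and is not the implementation of aggregate.hyperframe().

```r
by = factor(c('a', 'a', 'b', 'b'))        # grouping level
hypercol = list('p1', 'p2', 'p3', 'p4')   # stand-in for a hypercolumn
regular = c(10, 10, 20, 20)               # regular column, constant within each group

# split the hypercolumn into a list of lists, one per group
split_hypercol = split(hypercol, f = by)
lengths(split_hypercol)

# aggregate the regular column by its unique value within each group
agg_regular = vapply(split(regular, f = by), FUN = \(z) {
  stopifnot(length(unique(z)) == 1L) # values must agree within a group
  z[1L]
}, FUN.VALUE = NA_real_)
agg_regular
```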

When the primary input is a grouped hyper data frame (Chapter 25, e.g., in Section 3.4), the aggregation may be specified at any one of the nested grouping levels (Chapter 49) \(g_1,\cdots,g_{m-1}\). A request to aggregate at the lowest grouping level \(g_m\) is ignored, i.e., no aggregation is performed.

26.5 Adding group

The S3 generic function as.groupedHyperframe() creates a (grouped) hyper data frame. Package groupedHyperframe (v0.4.0, GPL-2) implements the following S3 methods (Table 26.3),

Table 26.3: S3 methods of groupedHyperframe::as.groupedHyperframe (v0.4.0)
                                 visible  generic                                  isS4
as.groupedHyperframe.hyperframe  FALSE    groupedHyperframe::as.groupedHyperframe  FALSE

The S3 method as.groupedHyperframe.hyperframe() converts a hyper data frame into a grouped hyper data frame by inspecting and adding a (nested) grouping structure to the input.

Listing 26.17 adds a nested grouping structure ~id/brick to the hyper data frame osteo (Section 10.20).

Listing 26.17: Example: function as.groupedHyperframe.hyperframe()
spatstat.data::osteo |> 
  as.groupedHyperframe(group = ~ id/brick)
# Grouped Hyper Data Frame: ~id/brick
# 
# 40 brick nested in
# 4 id
# 
#         id shortid brick   pts depth
# 1   c77za4       4     1 (pp3)    45
# 2   c77za4       4     2 (pp3)    60
# 3   c77za4       4     3 (pp3)    55
# 4   c77za4       4     4 (pp3)    60
# ✂️ --- output truncated --- ✂️
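
The group counts printed above (40 brick nested in 4 id) can be cross-checked without package groupedHyperframe. The sketch below counts the distinct id values and the distinct id-by-brick combinations using base R; the object names are made up, and this is not how as.groupedHyperframe.hyperframe() inspects the grouping structure.

```r
library(spatstat.geom) # provides the `$` method for hyper data frames

osteo = spatstat.data::osteo
id = osteo$id
brick = osteo$brick

# number of groups at the higher level ~id
n_id = nlevels(factor(id))

# number of groups at the nested level ~id/brick
n_brick_in_id = nlevels(interaction(id, brick, drop = TRUE))

c(id = n_id, brick_in_id = n_brick_in_id)
```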

26.6 Superimpose

Listing 26.18 summarizes the S3 methods of the generic function superimpose() (v3.7.3, GPL (>= 2)) in the spatstat.* family of packages,

Listing 26.18: S3 methods spatstat.*::superimpose.*
library(spatstat)
.S3methods(generic = 'superimpose', all.names = TRUE) |> 
  attr(which = 'info', exact = TRUE) |>
  subset.data.frame(subset = grepl(pattern = '^spatstat\\.', x = from))
#                      visible            from     generic  isS4
# superimpose.default     TRUE   spatstat.geom superimpose FALSE
# superimpose.lpp         TRUE spatstat.linnet superimpose FALSE
# superimpose.ppp         TRUE   spatstat.geom superimpose FALSE
# superimpose.ppplist     TRUE   spatstat.geom superimpose FALSE
# superimpose.psp         TRUE   spatstat.geom superimpose FALSE
# superimpose.splitppp    TRUE   spatstat.geom superimpose FALSE

The S3 method superimpose.hyperframe() performs a by-element superimpose of the

  • point-pattern (ppp) hypercolumns
  • line-segment-pattern (psp) hypercolumns 🚧

of multiple hyper data frames, if-and-only-if all input hyper data frames have identical

  • dimensions, i.e., dim.hyperframe() (v3.7.3, GPL (>= 2))
  • columns, i.e., unclass(.)$df
  • names and class of the hyper columns, i.e., unclass(.)$hypercolumns
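
The phrase “by-element” means that the first point-pattern of one input is superimposed with the first point-pattern of the other input, the second with the second, and so on. A sketch of this idea with package spatstat.geom alone, on hypothetical toy lists A and B, is given below; it is not the implementation of superimpose.hyperframe().

```r
library(spatstat.geom)

set.seed(26)
W = owin(c(0, 1), c(0, 1))
# helper to simulate a toy point-pattern of n points (hypothetical)
rpp = \(n) ppp(x = runif(n), y = runif(n), window = W)

A = list(rpp(3L), rpp(5L)) # stand-in for a ppp-hypercolumn of one input
B = list(rpp(4L), rpp(2L)) # stand-in for the same hypercolumn of another input

# by-element superimpose: A[[1]] with B[[1]], A[[2]] with B[[2]]
AB = mapply(FUN = superimpose, A, B, SIMPLIFY = FALSE)

# each element keeps the points of its own pair only
n_AB = sapply(AB, FUN = npoints)
n_AB
```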

The hypercolumn fluM$pattern (Listing 26.1) contains 8 point-patterns, each with roughly 200 to 500 points (Listing 26.2). Listing 26.19 creates a hyper data frame fluM1, which has the same columns as fluM except that the ppp-hypercolumn $pattern retains the points with mark M1 only; and another hyper data frame fluM2, which retains the points with mark M2 only.

Listing 26.19: Data: two hyper data frames fluM1 and fluM2 (Listing 26.1)
fluM1 = fluM2 = fluM
fluM1$pattern = fluM$pattern |> 
  spatstat.geom::solapply(
    FUN = spatstat.geom::subset.ppp, 
    subset = (marks == 'M1')
  )
fluM2$pattern = fluM$pattern |> 
  spatstat.geom::solapply(
    FUN = spatstat.geom::subset.ppp, 
    subset = (marks == 'M2')
  )

Listing 26.20 recreates the hyper data frame fluM (Listing 26.1) by superimposing the hyper data frames fluM2 and fluM1 (Listing 26.19). Note that the order of fluM2-then-fluM1 matters, because the points are arranged in M2-then-M1 in the original hypercolumn fluM$pattern (Listing 26.1, Listing 26.2).

Listing 26.20: Example: function superimpose.hyperframe() (Listing 26.1, Listing 26.19)
superimpose(fluM2, fluM1) |>
  identical(y = fluM) |>
  stopifnot()

Note that the S3 method superimpose.ppplist() (v3.7.3, GPL (>= 2)) superimposes all point-patterns from all input point-pattern-lists (Listing 26.21).

Listing 26.21: Review: superimpose.ppplist() not what we need (Listing 26.19)
list(
  'all superimposed' = spatstat.geom::superimpose.ppplist(
    unclass(fluM2)$hypercolumns$pattern, 
    unclass(fluM1)$hypercolumns$pattern
  ) |>
    spatstat.geom::npoints.ppp(),
  'all M2' = fluM2$pattern |> 
    vapply(FUN = spatstat.geom::npoints.ppp, FUN.VALUE = NA_integer_) |>
    sum(),
  'all M1' = fluM1$pattern |> 
    vapply(FUN = spatstat.geom::npoints.ppp, FUN.VALUE = NA_integer_) |>
    sum()
)
# $`all superimposed`
# [1] 2817
# 
# $`all M2`
# [1] 921
# 
# $`all M1`
# [1] 1896

26.7 \(k\)-Means Clustering

Package groupedHyperframe (v0.4.0, GPL-2) implements the following S3 methods for the class 'hyperframekm' (Table 26.4),

Table 26.4: S3 methods groupedHyperframe::*.hyperframekm (v0.4.0)
                    visible  generic      isS4
split.hyperframekm  FALSE    base::split  FALSE

26.7.1 Examples

Listing 26.22 performs \(k\)-means clustering of the ppp-hypercolumn fluM$pattern in the hyper data frame fluM (Listing 26.1).

Listing 26.22: Example: function kmeans.hyperframe() (Listing 26.1)
set.seed(13); flu_k = fluM |>
  within(expr = {
    pattern = pattern |>
      kmeans.ppp(formula = ~ x + y, centers = 3L)
  })
flu_k
# Hyperframe:
#             pattern frameid
# wt M2-M1 13 (pppkm)      13
# wt M2-M1 22 (pppkm)      22
# wt M2-M1 27 (pppkm)      27
# wt M2-M1 43 (pppkm)      43
# wt M2-M1 49 (pppkm)      49
# wt M2-M1 65 (pppkm)      65
# wt M2-M1 71 (pppkm)      71
# wt M2-M1 84 (pppkm)      84
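
The formula ~ x + y in Listing 26.22 indicates clustering on the spatial coordinates. As a rough sketch of that idea on a single point-pattern, the code below passes the coordinates to stats::kmeans() directly. The toy pattern and the object names are hypothetical, and the code is not the implementation of kmeans.ppp().

```r
library(spatstat.geom)

set.seed(13)
# toy point-pattern of 30 points in the unit square (hypothetical data)
toy = ppp(x = runif(30), y = runif(30), window = owin(c(0, 1), c(0, 1)))

# k-means clustering on the x- and y-coordinates
km = stats::kmeans(x = cbind(x = toy$x, y = toy$y), centers = 3L)

# one cluster index per point
table(km$cluster)
```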

26.7.2 Split by \(k\)-Means Clustering

The S3 method split.hyperframekm() splits a hyperframekm by the \(k\)-means clustering indices of the one-and-only-one pppkm-hypercolumn. The returned object is a grouped hyper data frame with grouping structure

  • ~.id/.cluster, if the input is a hyper data frame
  • ~ <existing/grouping/structure>/.cluster, if the input is a grouped hyper data frame. Note that the grouping level .id is believed to be equivalent to the lowest level of existing grouping structure.

Listing 26.23 splits the hyper data frame flu_k (Listing 26.22) by the \(k\)-means clustering of its pppkm-hypercolumn flu_k$pattern.

Listing 26.23: Example: function split.hyperframekm() (Listing 26.22)
set.seed(13); flu_k |>
  groupedHyperframe:::split.hyperframekm()
# Grouped Hyper Data Frame: ~.id/.cluster
# 
# 24 .cluster nested in
# 8 .id
# 
#    pattern .id .cluster frameid
# 1    (ppp)   1        1      13
# 2    (ppp)   1        2      13
# 3    (ppp)   1        3      13
# 4    (ppp)   2        1      22
# 5    (ppp)   2        2      22
# 6    (ppp)   2        3      22
# ✂️ --- output truncated --- ✂️
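
At the level of a single point-pattern, splitting by cluster index amounts to a call to the S3 method spatstat.geom::split.ppp() with the indices as the splitting factor. The sketch below uses a fixed, made-up index vector cluster in place of actual \(k\)-means output; it does not reproduce split.hyperframekm(), which additionally builds the grouping structure.

```r
library(spatstat.geom)

set.seed(13)
# toy point-pattern of 6 points (hypothetical data)
toy = ppp(x = runif(6), y = runif(6), window = owin(c(0, 1), c(0, 1)))

# made-up cluster indices, standing in for k-means output
cluster = factor(c(1L, 1L, 2L, 2L, 2L, 3L))

# split into one point-pattern per cluster
parts = split(toy, f = cluster)

# number of points per cluster
n_parts = sapply(parts, FUN = npoints)
n_parts
```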