A drop-in replacement for tidyr::nest() that optionally takes messages and a headline to store in the history graph.

Usage

# S3 method for class 'trackr_df'
nest(
  .data,
  ...,
  .by = NULL,
  .key = NULL,
  .names_sep = NULL,
  .messages = c("{.count.out} items"),
  .headline = ""
)

Arguments

.data

A data frame.

...

<tidy-select> Columns to nest; these will appear in the inner data frames.

Specified using name-variable pairs of the form new_col = c(col1, col2, col3). The right-hand side can be any valid tidyselect expression.

If not supplied, then ... is derived as all columns not selected by .by, and will use the column name from .key.

[Deprecated]: previously you could write df %>% nest(x, y, z). Convert to df %>% nest(data = c(x, y, z)).
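
As a minimal sketch of the name-variable pair form, the following uses plain tidyr::nest() on an untracked tibble; the tracked method documented here accepts the same selection arguments. The data frame `df` and the column name `inner` are made up for illustration.

```r
library(tidyr)

# Hypothetical toy data: two groups in `g`, with measurements `x` and `y`.
df <- tibble::tibble(
  g = c("a", "a", "b"),
  x = 1:3,
  y = c(10, 20, 30)
)

# Nest x and y into a list-column called `inner`; g stays in the outer frame.
nested <- nest(df, inner = c(x, y))

nrow(nested)              # one row per distinct value of g
names(nested)             # outer columns: g and the new list-column
names(nested$inner[[1]])  # inner columns: x and y
```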

.by

<tidy-select> Columns to nest by; these will remain in the outer data frame.

.by can be used in place of or in conjunction with columns supplied through ....

If not supplied, then .by is derived as all columns not selected by ....
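
The relationship between ... and .by can be sketched as follows, again with plain tidyr::nest() on a made-up tibble (the `.by` argument assumes tidyr 1.3.0 or later). Selecting the inner columns through ... or the outer columns through .by should describe the same nesting.

```r
library(tidyr)

# Hypothetical toy data for illustration.
df <- tibble::tibble(
  g = c("a", "a", "b"),
  x = 1:3,
  y = c(10, 20, 30)
)

# Name the inner columns explicitly; everything else (g) stays outside.
by_dots <- nest(df, data = c(x, y))

# Name the outer column instead; everything else (x, y) is nested
# into a column that takes the default name "data".
by_by <- nest(df, .by = g)

names(by_dots)
names(by_by)
```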

.key

The name of the resulting nested column. Only applicable when ... isn't specified, i.e. in the case of df %>% nest(.by = x).

If NULL, then "data" will be used by default.
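
A short sketch of .key, assuming tidyr 1.3.0 or later and a made-up tibble: when only .by is given, .key names the nested column. The name `measurements` is purely illustrative.

```r
library(tidyr)

# Hypothetical toy data for illustration.
df <- tibble::tibble(
  g = c("a", "a", "b"),
  x = 1:3,
  y = c(10, 20, 30)
)

# No columns are supplied through ..., so .key sets the list-column name.
nested <- nest(df, .by = g, .key = "measurements")

names(nested)
```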

.names_sep

If NULL, the default, the inner names will come from the former outer names. If a string, the new inner names will use the outer names with the new column name and .names_sep automatically stripped from the front. This makes .names_sep roughly symmetric between nesting and unnesting.
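
The prefix stripping can be illustrated with plain tidyr::nest() on a made-up tibble whose column names share the prefix `n_`. This is a sketch of tidyr's behaviour as described above, not of anything specific to the tracked method.

```r
library(tidyr)

# Hypothetical toy data: the columns to nest share the prefix "n_".
df <- tibble::tibble(
  g = c("a", "b"),
  n_x = 1:2,
  n_y = 3:4
)

# The new column is called `n`, so with .names_sep = "_" the prefix "n_"
# is stripped from the inner column names.
nested <- nest(df, n = c(n_x, n_y), .names_sep = "_")

names(nested$n[[1]])
```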

.messages

A set of glue specs. The glue code can use any global variable, grouping variable, {.count.in}, {.count.out} or {.strata}. Defaults to "{.count.out} items".

.headline

A headline glue spec. The glue code can use any global variable, grouping variable, or {.strata}. Defaults to nothing.
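
The following sketch shows how .messages and .headline might be supplied together. It follows the pattern of the examples further down this page, but uses the built-in `iris` dataset, and the message and headline wording is made up for illustration.

```r
library(dplyr)
library(dtrackr)

# Track the built-in iris data, then nest everything except Species.
# The glue specs refer to the row counts before and after nesting.
tracked <- iris %>%
  track() %>%
  tidyr::nest(
    measurements = c(-Species),
    .messages = c("{.count.in} flowers", "{.count.out} species"),
    .headline = "Nest by species"
  )

# Inspect the recorded history.
tracked %>% history()
```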

Value

The data frame returned by tidyr::nest(), with its history graph updated.

See also

tidyr::nest()

Examples

library(dplyr)
library(dtrackr)

starwars %>%
  track() %>%
  tidyr::unnest(starships, keep_empty = TRUE) %>%
  tidyr::nest(world_data = c(-homeworld)) %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 2 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "13 items"

# A problem with `tidyr::unnest` means that overriding the `.messages` option
# will currently most likely fail. Calling the `dtrackr::p_unnest` version
# explicitly works around this until it is, hopefully, resolved in `tidyr`:
starwars %>%
  track() %>%
  p_unnest(
    films,
    .messages = c("{.count.in} characters", "{.count.out} appearances")
  ) %>%
  dplyr::group_by(gender) %>%
  tidyr::nest(
    people = c(-gender, -species, -homeworld),
    .messages = c("{.count.in} appearances", "{.count.out} planets")
  ) %>%
  status() %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [NA]: "3 items"
#> ├ [feminine]: "feminine", "13 items"
#> └ [masculine]: "masculine", "49 items"

# This example includes pivoting and nesting. The CMS patient care data
# has multiple tests per institution in a long format, and observed /
# denominator types. Firstly we pivot the data to allow us to easily calculate
# a total percentage for each institution. This is duplicated for every test
# so we nest the tests to get to one row per institution. Those institutions
# with invalid scores are excluded.
cms_history = tidyr::cms_patient_care %>%
  track() %>%
  tidyr::pivot_wider(names_from = type, values_from = score) %>%
  dplyr::mutate(
    percentage = sum(observed) / sum(denominator) * 100,
    .by = c(ccn, facility_name)
  ) %>%
  tidyr::nest(
    results = c(measure_abbr, observed, denominator),
    .messages = c("{.count.in} test results", "{.count.out} facilities")
  ) %>%
  exclude_all(
    percentage > 100 ~ "{.excluded} facilities with anomalous percentages",
    na.rm = TRUE
  )

print(cms_history %>% dtrackr::history())
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "126 test results", "14 facilities"

# not run in examples:
if (interactive()) {
  cms_history %>% flowchart()
}