Set operations

These functions perform set operations on tracked data frames. Each one merges the history of two (or more) data frames and combines their rows (or columns). The total number of resulting rows is made available to glue specifications as {.count.out}. In all other respects each function performs exactly the same operation as its dplyr equivalent. See dplyr::bind_rows(), dplyr::bind_cols(), dplyr::intersect(), dplyr::union(), dplyr::setdiff(), or dplyr::union_all() for details of the underlying functions.
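The equivalence with dplyr described above can be checked directly. The following is a minimal sketch, assuming the dplyr starwars data set and using illustrative variable names (humans_droids, humans_gungans) that are not part of the package; it compares the row count of a plain dplyr::setdiff() with that of the tracked version.

```r
library(dplyr)
library(dtrackr)

# Drop the list columns so rows can be compared by the set operations
people <- starwars %>% select(-films, -vehicles, -starships)

# Two overlapping subsets (illustrative names, not part of dtrackr)
humans_droids  <- people %>% filter(species %in% c("Human", "Droid"))
humans_gungans <- people %>% filter(species %in% c("Human", "Gungan"))

# Plain dplyr set difference on untracked data frames
plain <- dplyr::setdiff(humans_droids, humans_gungans)

# The same operation on tracked data frames; the two histories are merged
tracked <- setdiff(
  humans_droids %>% track("lhs"),
  humans_gungans %>% track("rhs")
)

# The tracked result should contain the same number of rows as the plain one
nrow(plain) == nrow(tracked)
```

The tracked result additionally carries the merged history, which can be inspected with history() as in the Examples section.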

Usage

# S3 method for class 'trackr_df'
setdiff(
  x,
  y,
  ...,
  .messages = "{.count.out} items in difference",
  .headline = "Difference"
)

Arguments

x, y

Tracked data frames to combine.

...

A collection of tracked data frames to combine. Named arguments are passed on to tidyr::nest:

.data

A data frame.

.by

<tidy-select> Columns to nest by; these will remain in the outer data frame.

.by can be used in place of or in conjunction with columns supplied through ....

If not supplied, then .by is derived as all columns not selected by ....

.key

The name of the resulting nested column. Only applicable when ... isn't specified, i.e. in the case of df %>% nest(.by = x).

If NULL, then "data" will be used by default.

.names_sep

If NULL, the default, the inner names will come from the former outer names. If a string, the new inner names will use the outer names with names_sep automatically stripped. This makes names_sep roughly symmetric between nesting and unnesting.

.messages

A set of glue specs. The glue code can use any global variable, or {.count.out}.

.headline

A glue spec. The glue code can use any global variable, or {.count.out}.
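As an illustration of the .messages and .headline arguments, the sketch below overrides the defaults shown in Usage for setdiff(). The message and headline wording here is purely illustrative; only the {.count.out} placeholder and the argument names come from the documentation above.

```r
library(dplyr)
library(dtrackr)

# Same setup as the Examples section: two overlapping tracked subsets
people <- starwars %>% select(-films, -vehicles, -starships)
chrs <- people %>% track("start")

lhs <- chrs %>% include_any(
  species == "Human" ~ "{.included} humans",
  species == "Droid" ~ "{.included} droids"
)
rhs <- chrs %>% include_any(
  species == "Human" ~ "{.included} humans",
  species == "Gungan" ~ "{.included} gungans"
)

# Custom glue specs (illustrative wording); {.count.out} is the number of
# rows in the result of the set operation
only_lhs <- setdiff(
  lhs, rhs,
  .messages = "{.count.out} rows found only in the first data frame",
  .headline = "Rows unique to lhs"
)

# Inspect the recorded history, including the custom stage
only_lhs %>% history()
```

The custom headline and message take the place of the default "Difference" stage in the flowchart.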

Value

The output of the equivalent dplyr operation, with the history graph updated.

Examples

library(dplyr)
library(dtrackr)

# Set operations
people = starwars %>% select(-films, -vehicles, -starships)
chrs = people %>% track("start")

lhs = chrs %>% include_any(
  species == "Human" ~ "{.included} humans",
  species == "Droid" ~ "{.included} droids"
)

# these are different subsets of the same data
rhs = chrs %>% include_any(
  species == "Human" ~ "{.included} humans",
  species == "Gungan" ~ "{.included} gungans"
) %>% comment("{.count} gungans & humans")


# Unions
set = bind_rows(lhs,rhs) %>% comment("{.count} 2*human,droids and gungans")
# display the history of the result:
set %>% history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "79 2*human,droids and gungans"
nrow(set)
#> [1] 79
# not run - display the flowchart:
# set %>% flowchart()

set = union(lhs,rhs) %>% comment("{.count} human,droids and gungans")
# display the history of the result:
set %>% history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "44 human,droids and gungans"
nrow(set)
#> [1] 44
# not run - display the flowchart:
# set %>% flowchart()

set = union_all(lhs,rhs) %>% comment("{.count} 2*human,droids and gungans")
# display the history of the result:
set %>% history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "79 2*human,droids and gungans"
nrow(set)
#> [1] 79
# not run - display the flowchart:
# set %>% flowchart()

# Intersections and differences

# setdiff(lhs, rhs) keeps the rows of lhs that are absent from rhs (here, the droids)
set = setdiff(lhs,rhs) %>% comment("{.count} droids and gungans")
# display the history of the result:
set %>% history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "6 droids and gungans"
nrow(set)
#> [1] 6
# not run - display the flowchart:
# set %>% flowchart()

set = intersect(lhs,rhs) %>% comment("{.count} humans")
# display the history of the result:
set %>% history()
#> dtrackr history:
#> number of flowchart steps: 5 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "35 humans"
nrow(set)
#> [1] 35
# not run - display the flowchart:
# set %>% flowchart()
