Apply a set of inclusion criteria and record the actions of the filter to the dtrackr history graph. Because of the ... filter specification, all parameters MUST BE NAMED. This function is the opposite of exclude_all(): the filtering criteria identify rows to include, i.e. the results include anything that matches any of the criteria. If na.rm=TRUE they also keep anything that cannot be evaluated by the criteria.
include_any(
.data,
...,
.headline = .defaultHeadline(),
na.rm = TRUE,
.type = "inclusion",
.asOffshoot = FALSE,
.tag = NULL
)
.data: a dataframe which may be grouped
...: a dplyr filter specification as a set of formulae, where the LHS of each formula is a predicate to test the data set against; items that match at least one of the predicates will be included. The RHS is a glue specification defining the message to be entered in the history graph for each predicate matched. This can refer to grouping variables, variables from the environment, and {.included}, {.matched} or {.missing} (included = matched + missing), as well as {.count} and {.total} (group and overall counts respectively), e.g. "excluding {.matched} items and {.missing} with missing values".
.headline: a glue specification which can refer to grouping variables of .data, or any variables defined in the calling environment
na.rm: (default TRUE) if the filter cannot be evaluated for a row, count that row as missing and either exclude it (TRUE) or don't exclude it (FALSE)
.type: (default "inclusion") used to define formatting
.asOffshoot: do you want this comment to be an offshoot of the main flow (default = FALSE).
.tag: if you want the summary data from this step in the future then give it a name with .tag.
the filtered .data dataframe with the history graph updated with the summary of included items as a new stage
library(dplyr)
library(dtrackr)
iris %>% track() %>% group_by(Species) %>% include_any(
Petal.Length > 5 ~ "{.included} long ones",
Petal.Length < 2 ~ "{.included} short ones"
) %>% history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [Species:setosa]: "Species:setosa", "inclusions:", "0 long ones", "50 short ones"
#> ├ [Species:versicolor]: "Species:versicolor", "inclusions:", "1 long ones", "0 short ones"
#> └ [Species:virginica]: "Species:virginica", "inclusions:", "41 long ones", "0 short ones"
# simultaneous evaluation of criteria:
data.frame(a = 1:10) %>%
track() %>%
include_any(
# These two criteria both fail for the same single value (a = 1), so only one item is excluded
a > 1 ~ "{.included} value > 1",
a != min(a) ~ "{.included} everything but the smallest value",
) %>%
status() %>%
history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "9 items"
# for this data the result matches dplyr's filter function (both criteria
# select the same rows, so combining them with AND or OR gives the same count):
data.frame(a=1:10) %>%
dplyr::filter(a > 1, a != min(a)) %>%
nrow()
#> [1] 9
# step-wise evaluation of criteria results in a different output
data.frame(a = 1:10) %>%
track() %>%
# Performing the same exclusion sequentially results in 2 items
# being excluded as the criteria no longer identify the same
# item.
include_any(a > 1 ~ "{.included} value > 1") %>%
include_any(a != min(a) ~ "{.included} everything but the smallest value") %>%
status() %>%
history()
#> dtrackr history:
#> number of flowchart steps: 4 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "8 items"
# the behaviour is equivalent to dplyr's filter function:
data.frame(a=1:10) %>%
dplyr::filter(a > 1) %>%
dplyr::filter(a != min(a)) %>%
nrow()
#> [1] 8
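# a sketch (not taken from the package documentation) of how multiple criteria
# combine, assuming include_any() keeps rows that match ANY criterion as
# described above; the data frame `df` and the thresholds are illustrative only:
library(dplyr)
library(dtrackr)
df <- data.frame(a = 1:10)
df %>%
  track() %>%
  include_any(
    a < 3 ~ "{.included} small values",
    a > 8 ~ "{.included} large values"
  ) %>%
  nrow()
# under that assumption the row count should agree with an OR-combined dplyr filter:
df %>%
  dplyr::filter(a < 3 | a > 8) %>%
  nrow()
#> [1] 4
# whereas comma-separated conditions in dplyr::filter() are combined with AND:
df %>%
  dplyr::filter(a < 3, a > 8) %>%
  nrow()
#> [1] 0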