Slice operations behave as in dplyr, except that the history graph of the
tracked dataframe can be updated with the sizes of the dataframe before and after the operation.
See dplyr::slice(), dplyr::slice_head(), dplyr::slice_tail(),
dplyr::slice_min(), dplyr::slice_max() and dplyr::slice_sample()
for more details on the underlying functions.
Arguments
- .data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
- ...
For slice(): <data-masking> Integer row values. Provide either positive values to keep, or negative values to drop. The values provided must be either all positive or all negative. Indices beyond the number of rows in the input are silently ignored.
For slice_*(), these arguments are passed on to methods.
Named arguments passed on to dplyr::slice_max:
  - .by, by
  <tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
  - .preserve
  Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.
  - n, prop
  Provide either n, the number of rows, or prop, the proportion of rows to select. If neither are supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.
  A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.
  - order_by
  <data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.
  - with_ties
  Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.
  - na_rm
  Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.
  - weight_by
  <data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.
  - replace
  Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.
- .messages
A set of glue specs. The glue code can use any global variable, {.count.in} and {.count.out} for the sizes of the input and output dataframes respectively, and {.excluded} for the difference between them.
- .headline
A glue spec. The glue code can use any global variable, and {.count.in} and {.count.out} for the sizes of the input and output dataframes respectively.
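The sizing rules described for n, prop and with_ties can be checked with a minimal sketch that uses plain dplyr only (no tracking). The toy data frame and the variable names are hypothetical, chosen so that the group sizes match the 5-row and 8-row cases in the argument descriptions.

```r
library(dplyr)

# Hypothetical toy data: group "a" has 5 rows, group "b" has 8 rows.
toy <- tibble(
  g = rep(c("a", "b"), times = c(5, 8)),
  x = c(1, 2, 3, 4, 4, 1, 2, 3, 4, 5, 6, 7, 8)
)

# A negative n is subtracted from each group size: 5 - 2 = 3 and 8 - 2 = 6 rows.
neg_n <- toy %>%
  group_by(g) %>%
  slice_head(n = -2) %>%
  count()

# A negative prop on the 8-row group: 8 * (1 - 0.25) = 6 rows.
neg_prop <- toy %>%
  filter(g == "b") %>%
  slice_head(prop = -0.25)

# Group "a" has two rows tied at the maximum x = 4.
# with_ties = TRUE (the default) keeps both; with_ties = FALSE keeps only one.
ties_kept <- toy %>%
  filter(g == "a") %>%
  slice_max(order_by = x, n = 1)
ties_dropped <- toy %>%
  filter(g == "a") %>%
  slice_max(order_by = x, n = 1, with_ties = FALSE)
```

Because with_ties = TRUE can return more rows than requested, the {.count.out} value reported for a tracked slice_max() or slice_min() step may exceed the nominal n or prop.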
Examples
library(dplyr)
library(dtrackr)
# Subset the data by the maximum of a given value
iris %>%
  track() %>%
  group_by(Species) %>%
  slice_max(prop=0.5, order_by = Sepal.Width,
    .messages="{.count.out} / {.count.in} = {prop} (with ties)",
    .headline="Widest 50% Sepals") %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [setosa]: "Widest 50% Sepals", "31 / 50 = 0.5 (with ties)"
#> ├ [versicolor]: "Widest 50% Sepals", "29 / 50 = 0.5 (with ties)"
#> └ [virginica]: "Widest 50% Sepals", "29 / 50 = 0.5 (with ties)"
# The narrowest 25% of the iris data set by group can be selected with
# slice_min(). Recording this step only requires tracking the data and
# supplying glue specs.
iris %>%
  track() %>%
  group_by(Species) %>%
  slice_min(prop=0.25, order_by = Sepal.Width,
    .messages="{.count.out} / {.count.in} (with ties)",
    .headline="narrowest {sprintf('%1.0f',prop*100)}% {Species}") %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [setosa]: "narrowest 25% setosa", "12 / 50 (with ties)"
#> ├ [versicolor]: "narrowest 25% versicolor", "13 / 50 (with ties)"
#> └ [virginica]: "narrowest 25% virginica", "19 / 50 (with ties)"
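As a further sketch, the {.excluded} variable described for .messages can be combined with {.count.in} and {.count.out} on an ungrouped slice. This assumes that the dtrackr version of slice_head() accepts .messages and .headline in the same way as the slice_max() and slice_min() calls above; the message wording and the variable name tracked are illustrative only, and no output is shown because it has not been verified here.

```r
library(dplyr)
library(dtrackr)

# Keep the first 10 rows of iris and record the step in the history graph.
# Assumption: {.excluded} is available to .messages for slice_head(),
# as described for the .messages argument.
tracked <- iris %>%
  track() %>%
  slice_head(n = 10,
    .messages = "{.count.out} kept, {.excluded} dropped of {.count.in}",
    .headline = "First 10 rows")

# Print the recorded history for this pipeline.
print(history(tracked))
```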
