Slice operations behave as in dplyr, except that the history graph of a tracked
dataframe can be updated with the before and after sizes of the dataframe.
See dplyr::slice(), dplyr::slice_head(), dplyr::slice_tail(), dplyr::slice_min(), dplyr::slice_max() and dplyr::slice_sample() for more details on the underlying functions.
p_slice_sample(
  .data,
  ...,
  .messages = c("{.count.in} before", "{.count.out} after"),
  .headline = "slice data"
)
.data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
...
Arguments passed on to dplyr::slice_sample
n, prop
Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.
A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.
weight_by
<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.
replace
Should sampling be performed with (TRUE) or without (FALSE, the default) replacement?
.messages
A set of glue specs. The glue code can use any global variable, {.count.in} and {.count.out} for the input and output dataframe sizes respectively, and {.excluded} for the difference.
.headline
A glue spec. The glue code can use any global variable, and {.count.in} and {.count.out} for the input and output dataframe sizes respectively.
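As a rough illustration of the two glue arguments, the sketch below samples an ungrouped tracked dataframe and writes out the default .messages explicitly. It assumes dplyr and dtrackr are installed; the headline text "sample 10 rows" and the variable name sampled are made up for this example, and the history() printout is not reproduced here because its exact wording may differ between versions.

```r
library(dplyr)
library(dtrackr)

set.seed(42)  # only so that repeated runs draw the same rows

# `sampled` and the headline text are illustrative names, not part of dtrackr.
sampled <- iris %>%
  track() %>%
  slice_sample(
    n = 10,
    # One glue spec per line of the history entry; {.count.in} and
    # {.count.out} are filled in with the row counts before and after.
    .messages = c("{.count.in} before", "{.count.out} after"),
    .headline = "sample 10 rows"
  )

nrow(sampled)         # number of rows kept by the slice
sampled %>% history() # prints the tracked steps, including the slice entry
```

Passing a character vector to .messages gives one line per element, while .headline takes a single spec.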
The sliced dataframe with the history graph updated.
dplyr::slice_sample()
library(dplyr)
library(dtrackr)
# In this example, 100 rows are sampled with replacement from each Species
# group of the iris dataframe.
iris %>%
  track() %>%
  group_by(Species) %>%
  slice_sample(
    n = 100, replace = TRUE,
    .messages = "{.count.out} / {.count.in} = {n}",
    .headline = "100 {Species}"
  ) %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [Species:setosa]: "100 setosa", "100 / 50 = 100"
#> ├ [Species:versicolor]: "100 versicolor", "100 / 50 = 100"
#> └ [Species:virginica]: "100 virginica", "100 / 50 = 100"
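
The n, prop, weight_by and replace arguments are passed straight through to dplyr, so their behaviour can be checked without any tracking. The following is a minimal sketch that calls dplyr::slice_sample() directly, assuming a dplyr version whose behaviour matches the argument descriptions in this page; df5, df8 and the weight column w are hypothetical toy inputs.

```r
library(dplyr)

# Hypothetical toy inputs, named after their row counts.
df5 <- tibble(id = 1:5)
df8 <- tibble(id = 1:8, w = c(0, 0, 0, 0, 0, 0, 1, 1))

# n larger than the data, without replacement: truncated to the 5 rows present.
n_truncated <- nrow(slice_sample(df5, n = 10))

# The same request with replacement is not limited by the data size.
n_replaced <- nrow(slice_sample(df5, n = 10, replace = TRUE))

# Negative n is subtracted from the size: 5 - 2 = 3 rows.
n_negative <- nrow(slice_sample(df5, n = -2))

# Negative prop: 8 * (1 - 0.25) = 6 rows.
n_negative_prop <- nrow(slice_sample(df8, prop = -0.25))

# prop is rounded towards zero: 5 * 0.5 = 2.5 gives 2 rows.
n_rounded <- nrow(slice_sample(df5, prop = 0.5))

# Rows with zero weight cannot be drawn, so only ids 7 and 8 are eligible.
weighted_ids <- slice_sample(df8, n = 2, weight_by = w)$id

c(n_truncated, n_replaced, n_negative, n_negative_prop, n_rounded)
sort(weighted_ids)
```

The same arguments can be given to slice_sample() on a tracked dataframe, in which case the resulting row counts are what {.count.out} reports.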