Equivalent dplyr functions for mutating, selecting and renaming a data set
act in the normal way. Mutate, select and rename operations generally add
nothing of interest to the documentation, so by default they are left out of
the history. This can be overridden with the .messages or .headline values,
in which case they behave just like a comment().
See dplyr::mutate(), dplyr::add_count(), dplyr::add_tally(), dplyr::transmute(), dplyr::select(), dplyr::relocate(), dplyr::rename(), dplyr::rename_with(), dplyr::arrange() for more details.
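As a minimal sketch of the default-versus-override behaviour described above, the following runs the same add_count() step twice on a tracked copy of iris. The column name n_species and the glue text are illustrative choices, not part of the package.

```r
library(dplyr)
library(dtrackr)

# Default: no .messages / .headline supplied, so the step is expected
# to be left out of the history.
silent <- iris %>%
  track() %>%
  add_count(Species, name = "n_species")

# Override: supplying glue specs should record the step as a new stage,
# much as a comment() would.
recorded <- iris %>%
  track() %>%
  add_count(Species, name = "n_species",
            .headline = "Added per-species count",
            .messages = "new column: {.new_cols}")

# Compare the two histories side by side.
history(silent)
history(recorded)
```

Both pipelines return the same data; only the recorded history should differ.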
p_add_count(x, ..., .messages = "", .headline = "", .tag = NULL)
x
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
...
Arguments passed on to dplyr::add_count
wt
<data-masking> Frequency weights. Can be NULL or a variable:
If NULL (the default), counts the number of rows in each group.
If a variable, computes sum(wt) for each group.
sort
If TRUE, will show the largest groups at the top.
name
The name of the new column in the output. If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.
.drop
Handling of factor levels that don't appear in the data, passed on to group_by(). For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data). For add_count(): deprecated since it can't actually affect the output.
.messages
A set of glue specs. The glue code can use any global variable, grouping variable, {.new_cols} or {.dropped_cols} for changes to columns, {.cols} for the output column names, or {.strata}. Defaults to nothing.
.headline
A headline glue spec. The glue code can use any global variable, grouping variable, {.new_cols}, {.dropped_cols}, {.cols} or {.strata}. Defaults to nothing.
.tag
If you want the summary data from this step in the future then give it a name with .tag.
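The wt and name arguments passed through to dplyr::add_count can be illustrated with plain dplyr on a small made-up tibble. The data and the column names g, w and w_total are hypothetical, chosen so that a column called n already exists.

```r
library(dplyr)

# Hypothetical data: `n` already exists, so the default name is taken.
df <- tibble(
  g = c("a", "a", "b"),
  w = c(1, 2, 5),
  n = c(10, 20, 30)
)

# wt = NULL (the default): counts rows per group.
# Because `n` is already present, dplyr stores the count in `nn`.
counted <- df %>% add_count(g)

# wt = w: computes sum(w) within each group, stored under an explicit name.
weighted <- df %>% add_count(g, wt = w, name = "w_total")

counted$nn        # 2 2 1
weighted$w_total  # 3 3 5
```

Passing name explicitly avoids relying on the n, nn, nnn fallback when the input may already contain count columns.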
The .data dataframe after being modified by the dplyr equivalent function, but with the history graph updated with a new stage if the .messages or .headline parameter is not empty.
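One way to check the return value is to compare the tracked result against plain dplyr. This sketch assumes the tracking information is carried alongside the data rather than as extra columns; n_species is an illustrative name.

```r
library(dplyr)
library(dtrackr)

# Tracked pipeline, using the dtrackr method for add_count().
tracked <- iris %>%
  track() %>%
  add_count(Species, name = "n_species")

# The same step with plain dplyr on untracked data.
plain <- iris %>%
  dplyr::add_count(Species, name = "n_species")

# The data itself is expected to match the dplyr result.
same_counts <- identical(tracked$n_species, plain$n_species)
same_columns <- identical(colnames(tracked), colnames(plain))
```

If the assumption holds, same_counts and same_columns should both be TRUE.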
dplyr::add_count()
library(dplyr)
library(dtrackr)
# mutate and other functions are unitary operations that generally change
# the structure but not the size of a dataframe. In dtrackr these are ignored
# by default but we can change that so that their behaviour is obvious.
# add_count
# adding in a count or tally column as a new column
iris %>%
  track() %>%
  add_count(Species, name="new_count_total",
    .messages="{.new_cols}",
    # .messages="{.cols}",
    .headline="New columns from add_count:") %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 2 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> └ "New columns from add_count:", "new_count_total"
# add_tally
iris %>%
  track() %>%
  group_by(Species) %>%
  dtrackr::add_tally(wt=Petal.Length, name="new_tally_total",
    .messages="{.new_cols}",
    .headline="New columns from add_tally:") %>%
  history()
#> dtrackr history:
#> number of flowchart steps: 3 (approx)
#> tags defined: <none>
#> items excluded so far: <not capturing exclusions>
#> last entry / entries:
#> ├ [Species:setosa]: "New columns from add_tally:", "new_tally_total"
#> ├ [Species:versicolor]: "New columns from add_tally:", "new_tally_total"
#> └ [Species:virginica]: "New columns from add_tally:", "new_tally_total"