
Controlling dtrackr

Functions that control dtrackr

track()
Start tracking the dtrackr history graph
untrack()
Remove tracking from the dataframe
pause()
Pause tracking the data frame.
resume()
Resume tracking the data frame.
print(<trackr_graph>)
Print a history graph to the console
plot(<trackr_graph>)
Plot a history graph as HTML
flowchart()
Flowchart output
history()
Get the dtrackr history graph
capture_exclusions()
Start capturing exclusions on a tracked dataframe.
excluded()
Get the dtrackr excluded data record
tagged()
Retrieve tagged data in the history graph
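
As a rough illustration of how the control functions above fit together, the following sketch tracks a small pipeline on the built-in iris data, retrieves the history graph, and then removes tracking. The choice of dataset and filter condition is an assumption made for the example only.

```r
library(dplyr)
library(dtrackr)

# Start tracking, then apply an ordinary dplyr step; the filter is recorded.
tracked <- iris %>%
  track() %>%
  filter(Petal.Length > 2)

# Retrieve the history graph and print it to the console.
graph <- tracked %>% history()
print(graph)

# Render the history as a flowchart (what is displayed depends on the environment).
# tracked %>% flowchart()

# Drop the tracking metadata to get back to an ordinary dataframe.
plain <- tracked %>% untrack()
```

The data in `plain` is the same as in `tracked`; only the history is discarded.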

Pipeline annotations

dtrackr provides pipeline functions that have no direct equivalent in dplyr or tidyr. They add annotations to the history graph, or include and exclude data according to documented criteria.

comment()
Add a generic comment to the dtrackr history graph
status()
Add a summary to the dtrackr history graph
count_subgroup()
Add a subgroup count to the dtrackr history graph
exclude_all()
Exclude all items matching one or more criteria
include_any()
Include any items matching a criterion
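
A minimal sketch of these annotation functions in use, assuming the built-in iris data and an arbitrary exclusion threshold chosen only for illustration. The message strings are glue-style templates; `{.count}` and `{.excluded}` are filled in by dtrackr when the step runs.

```r
library(dplyr)
library(dtrackr)

annotated <- iris %>%
  track() %>%
  capture_exclusions() %>%
  # A free-text note in the history graph.
  comment(.messages = "starting with {.count} flowers") %>%
  # Each formula is criterion ~ message; rows matching any criterion are removed.
  exclude_all(
    Petal.Length < 1.2 ~ "{.excluded} with very short petals"
  ) %>%
  group_by(Species) %>%
  # A per-group summary added to the history graph.
  status(.messages = "{.count} items remaining")

# The rows that were excluded, as recorded by capture_exclusions().
removed <- annotated %>% excluded()
```

Because `exclude_all()` both filters the data and documents why, the resulting flowchart can show how many items each criterion removed.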

S3 dplyr function extensions

These S3 methods extend a subset of the dplyr and tidyr functions that take a dataframe and return a dataframe, so that each call also adds to the history graph. In general they work the same as the normal dplyr functions, but accept additional parameters that configure how the history is captured as they are executed. dplyr functions that are not listed here can still be used as normal with a tracked dataframe as input and give the same output; that step of the pipeline is simply not recorded as an entry in the history graph. A function may be unsupported because it is terminal (such as count(), tally(), glimpse() or pull()), because it modifies subsequent behaviour (e.g. rowwise()), or because it does not output a dataframe (e.g. group_map() and group_walk()).

add_count(<trackr_df>)
dplyr modifying operations
add_tally()
dplyr modifying operations
arrange(<trackr_df>)
dplyr modifying operations
distinct(<trackr_df>)
Distinct values of data
filter(<trackr_df>)
Filtering data
group_by(<trackr_df>)
Stratifying your analysis
group_modify(<trackr_df>)
Group-wise modification of data and complex operations
mutate(<trackr_df>)
dplyr modifying operations
relocate(<trackr_df>)
dplyr modifying operations
rename(<trackr_df>)
dplyr modifying operations
rename_with(<trackr_df>)
dplyr modifying operations
select(<trackr_df>)
dplyr modifying operations
summarise(<trackr_df>)
Summarise a data set
reframe(<trackr_df>)
Summarise a data set
transmute(<trackr_df>)
dplyr modifying operations
ungroup(<trackr_df>)
Remove a stratification from a data set
anti_join(<trackr_df>)
Anti join
full_join(<trackr_df>)
Full join
inner_join(<trackr_df>)
Inner joins
left_join(<trackr_df>)
Left join
right_join(<trackr_df>)
Right join
semi_join(<trackr_df>)
Semi join
nest_join(<trackr_df>)
Nest join
slice(<trackr_df>)
Slice operations
slice_head(<trackr_df>)
Slice operations
slice_tail(<trackr_df>)
Slice operations
slice_min(<trackr_df>)
Slice operations
slice_max(<trackr_df>)
Slice operations
slice_sample(<trackr_df>)
Slice operations
bind_rows()
Set operations
bind_cols()
Set operations
intersect(<trackr_df>)
Set operations
union(<trackr_df>)
Set operations
union_all(<trackr_df>)
Set operations
setdiff(<trackr_df>)
Set operations
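
To illustrate the additional parameters on the tracked dplyr methods, the sketch below passes a custom `.messages` template to `group_by()` and `filter()`. The iris data and the filter condition are assumptions made for the example; the rest of each call is ordinary dplyr.

```r
library(dplyr)
library(dtrackr)

tracked <- iris %>%
  track() %>%
  # .messages customises the text recorded for this step.
  group_by(Species, .messages = "stratify by species") %>%
  # {.excluded} is replaced by the number of rows the filter removed.
  filter(Sepal.Width > 3, .messages = "{.excluded} with narrow sepals removed") %>%
  ungroup()

# Inspect what was recorded.
tracked %>% history() %>% print()
```

Omitting `.messages` falls back to each method's default text, so existing dplyr pipelines can be tracked without modification.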

S3 function extensions in other packages

dtrackr support for functions from other tidyverse packages is evolving. The focus is on functions that take a dataframe as input and produce a dataframe as output, and naturally fit within a data pipeline. pivot_longer() and pivot_wider() are good examples which are already implemented. Tracking of nest() and unnest() is not yet implemented (but this does not stop you from using these functions in a pipeline), and purrr functions such as map_df(), map_dfc(), map_dfr(), pmap(), pmap_dfr(), pmap_dfc() are potential candidates for future implementation, as are functions that acquire data such as those from the readr package.
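
As a small example of a tracked tidyr function, this sketch reshapes the built-in iris data with `pivot_longer()` inside a tracked pipeline. The column names `measure` and `value` are arbitrary choices for the illustration.

```r
library(dplyr)
library(tidyr)
library(dtrackr)

long <- iris %>%
  track() %>%
  # Gather the four numeric columns into name/value pairs.
  pivot_longer(
    cols = -Species,
    names_to = "measure",
    values_to = "value"
  )

# The reshaping step appears in the history alongside any other tracked steps.
long %>% history() %>% print()
```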

Dot graph rendering

These functions render a dtrackr history once it has been converted to Graphviz DOT format. They can also be used for DOT content that does not come from dtrackr.

dot2svg()
Convert Graphviz DOT content to an SVG
save_dot()
Save DOT content to a file
std_size
Standard paper sizes
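
Since these functions accept any DOT content, a quick way to try them is with a hand-written graph. The sketch below assumes that `dot2svg()` takes a single DOT string and returns the SVG as text; the two-node graph is made up for the example.

```r
library(dtrackr)

# A hand-written DOT graph, unrelated to any dtrackr history.
dot <- "digraph { a -> b }"

# Convert the DOT content to SVG markup.
svg <- dot2svg(dot)

# The named paper sizes available in std_size.
print(names(std_size))
```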

Legacy interface

The following operations are all aliases for the functions listed above. They are generally backend functions and should not be used in new projects.

p_add_count()
dplyr modifying operations
p_add_tally()
dplyr modifying operations
p_anti_join()
Anti join
p_arrange()
dplyr modifying operations
p_bind_cols()
Set operations
p_bind_rows()
Set operations
p_capture_exclusions()
Start capturing exclusions on a tracked dataframe.
p_clear()
Clear the dtrackr history graph
p_comment()
Add a generic comment to the dtrackr history graph
p_copy()
Copy the dtrackr history graph from one dataframe to another
p_count_if()
Simple count_if dplyr summary function
p_count_subgroup()
Add a subgroup count to the dtrackr history graph
p_distinct()
Distinct values of data
p_exclude_all()
Exclude all items matching one or more criteria
p_excluded()
Get the dtrackr excluded data record
p_filter()
Filtering data
p_flowchart()
Flowchart output
p_full_join()
Full join
p_get()
Get the dtrackr history graph
p_get_as_dot()
DOT output
p_group_by()
Stratifying your analysis
p_group_modify()
Group-wise modification of data and complex operations
p_include_any()
Include any items matching a criterion
p_inner_join()
Inner joins
p_intersect()
Set operations
p_left_join()
Left join
p_mutate()
dplyr modifying operations
p_nest_join()
Nest join
p_pause()
Pause tracking the data frame.
p_pivot_longer()
Reshaping data using tidyr::pivot_longer
p_pivot_wider()
Reshaping data using tidyr::pivot_wider
p_reframe()
Summarise a data set
p_relocate()
dplyr modifying operations
p_rename()
dplyr modifying operations
p_rename_with()
dplyr modifying operations
p_resume()
Resume tracking the data frame.
p_right_join()
Right join
p_select()
dplyr modifying operations
p_semi_join()
Semi join
p_set()
Set the dtrackr history graph
p_setdiff()
Set operations
p_slice()
Slice operations
p_slice_head()
Slice operations
p_slice_max()
Slice operations
p_slice_min()
Slice operations
p_slice_sample()
Slice operations
p_slice_tail()
Slice operations
p_status()
Add a summary to the dtrackr history graph
p_summarise()
Summarise a data set
p_tagged()
Retrieve tagged data in the history graph
p_track()
Start tracking the dtrackr history graph
p_transmute()
dplyr modifying operations
p_ungroup()
Remove a stratification from a data set
p_union()
Set operations
p_union_all()
Set operations
p_untrack()
Remove tracking from the dataframe