Controlling dtrackr

Functions that control dtrackr

track()

Start tracking the dtrackr history graph

untrack()

Remove tracking from the dataframe

pause()

Pause tracking the data frame.

resume()

Resume tracking the data frame.

print(<trackr_graph>)

Print a history graph to the console

plot(<trackr_graph>)

Plot a history graph as HTML

flowchart()

Flowchart output

history()

Get the dtrackr history graph

capture_exclusions()

Start capturing exclusions on a tracked dataframe.

excluded()

Get the dtrackr excluded data record

tagged()

Retrieve tagged data in the history graph
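As a rough illustration of how the control functions fit together, the sketch below tracks the built-in iris data, pauses tracking for one step, and then inspects the history. The choice of iris, the filter conditions, and the message text are illustrative assumptions rather than anything prescribed by dtrackr.

```r
library(dplyr)
library(dtrackr)

tracked <- iris %>%
  track() %>%                          # start recording the history graph
  filter(Species != "setosa") %>%      # recorded as a step in the history
  pause() %>%
  filter(Sepal.Length > 5) %>%         # still filters the data, but is not recorded while paused
  resume() %>%
  comment(.messages = "{.count} flowers remain")

graph <- history(tracked)   # the history graph object
print(graph)                # print the history to the console

plain <- untrack(tracked)   # remove the tracking again
```

The data itself is transformed in the same way whether or not tracking is paused; only the recording of steps changes.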

Pipeline annotations

dtrackr adds functions to a data pipeline that have no direct equivalent in dplyr or tidyr.

comment()

Add a generic comment to the dtrackr history graph

status()

Add a summary to the dtrackr history graph

count_subgroup()

Add a subgroup count to the dtrackr history graph

exclude_all()

Exclude all items matching one or more criteria

include_any()

Include any items matching a criterion
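A minimal sketch of these annotation functions in a pipeline follows. It assumes the built-in iris data, and the thresholds and message wording are made up for illustration. Messages are glue templates, and the `criterion ~ "message"` form is how exclude_all() pairs each criterion with its label.

```r
library(dplyr)
library(dtrackr)

annotated <- iris %>%
  track() %>%
  capture_exclusions() %>%     # keep a record of the rows that get excluded
  comment(.messages = "Starting from the iris data set") %>%
  group_by(Species) %>%
  status(.messages = "{.count} of type {Species}") %>%
  exclude_all(
    Petal.Width < 0.3 ~ "excluding {.excluded} with narrow petals"
  ) %>%
  ungroup() %>%
  status(.messages = "{.count} flowers remain")

removed <- excluded(annotated)   # the captured exclusion record
```

Rows matching any criterion passed to exclude_all() are dropped from the data, so `annotated` has fewer rows than iris.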

S3 dplyr function extensions

These S3 methods extend a subset of the dplyr and tidyr commands that take a dataframe and return a dataframe, so that each step can be recorded in the history graph. In general they work the same as the normal dplyr functions, but accept additional parameters that configure how the history is captured as they are executed. dplyr functions that are not listed here can still be used as normal with a tracked dataframe as input and give the same output; that step of the pipeline is simply not captured in the history graph. A function may be unsupported because it is terminal (such as count(), tally(), glimpse() or pull()), because it modifies subsequent behaviour (e.g. rowwise()), or because it does not output a dataframe (e.g. group_map() and group_walk()).

add_count(<trackr_df>)

dplyr modifying operations

add_tally()

dplyr modifying operations

arrange(<trackr_df>)

dplyr modifying operations

distinct(<trackr_df>)

Distinct values of data

filter(<trackr_df>)

Filtering data

group_by(<trackr_df>)

Stratifying your analysis

group_modify(<trackr_df>)

Group-wise modification of data and complex operations

mutate(<trackr_df>)

dplyr modifying operations

relocate(<trackr_df>)

dplyr modifying operations

rename(<trackr_df>)

dplyr modifying operations

rename_with(<trackr_df>)

dplyr modifying operations

select(<trackr_df>)

dplyr modifying operations

summarise(<trackr_df>)

Summarise a data set

reframe(<trackr_df>)

Summarise a data set

transmute(<trackr_df>)

dplyr modifying operations

ungroup(<trackr_df>)

Remove a stratification from a data set

anti_join(<trackr_df>)

Anti join

full_join(<trackr_df>)

Full join

inner_join(<trackr_df>)

Inner joins

left_join(<trackr_df>)

Left join

right_join(<trackr_df>)

Right join

semi_join(<trackr_df>)

Semi join

nest_join(<trackr_df>)

Nest join

slice(<trackr_df>)

Slice operations

slice_head(<trackr_df>)

Slice operations

slice_tail(<trackr_df>)

Slice operations

slice_min(<trackr_df>)

Slice operations

slice_max(<trackr_df>)

Slice operations

slice_sample(<trackr_df>)

Slice operations

bind_rows()

Set operations

bind_cols()

Set operations

intersect(<trackr_df>)

Set operations

union(<trackr_df>)

Set operations

union_all(<trackr_df>)

Set operations

setdiff(<trackr_df>)

Set operations
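The extended verbs are called exactly like their dplyr counterparts, with the extra dtrackr parameters added to the call. The sketch below uses `.messages`, a glue template for the text recorded at each step; the iris data, the derived column and the message wording are illustrative assumptions.

```r
library(dplyr)
library(dtrackr)

tracked <- iris %>%
  track() %>%
  # .messages is the additional dtrackr parameter; {.excluded} is filled in when the step runs
  filter(Sepal.Length > 5, .messages = "{.excluded} flowers with short sepals removed") %>%
  mutate(area = Sepal.Length * Sepal.Width, .messages = "added a sepal area column") %>%
  group_by(Species) %>%
  summarise(mean_area = mean(area), .messages = "mean sepal area per species")

# pull() is not extended by dtrackr: it works as usual but adds nothing to the history
means <- pull(tracked, mean_area)
```

Leaving out `.messages` falls back to each verb's default, which for some verbs records nothing in the history.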

S3 function extensions in other packages

dtrackr support for functions from other tidyverse packages is evolving. The focus is on functions that take a dataframe as input and produce a dataframe as output, and naturally fit within a data pipeline. pivot_longer() and pivot_wider() are good examples which are already implemented. Tracking of nest() and unnest() is not yet implemented (but this does not stop you from using these functions in a pipeline), and purrr functions such as map_df(), map_dfc(), map_dfr(), pmap(), pmap_dfr(), pmap_dfc() are potential candidates for future implementation, as are functions that acquire data such as those from the readr package.

pivot_longer(<trackr_df>)

Reshaping data using tidyr::pivot_longer

pivot_wider(<trackr_df>)

Reshaping data using tidyr::pivot_wider
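As a hedged sketch of a tracked reshape, the example below pivots the four measurement columns of iris into long format. The column names `measure` and `value` and the message text are illustrative, and the `.messages` parameter is assumed to behave as it does for the dplyr extensions.

```r
library(dplyr)
library(tidyr)
library(dtrackr)

long <- iris %>%
  track() %>%
  pivot_longer(
    cols = -Species,              # reshape every column except Species
    names_to = "measure",
    values_to = "value",
    .messages = "reshaped to long format"
  )
```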

Dot graph rendering

Functions for rendering a dtrackr history once it has been converted to Graphviz DOT format. These can also be used for non-dtrackr DOT content.

dot2svg()

Convert Graphviz DOT content to an SVG

save_dot()

Save DOT content to a file

std_size

Standard paper sizes
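Because these helpers accept any DOT content, they can be tried on a hand-written graph. The sketch below is illustrative only: the two-node digraph and the temporary file path are assumptions, and output is restricted to SVG on the assumption that other formats may need additional system libraries.

```r
library(dtrackr)

dot <- "digraph { A -> B }"     # any DOT content, not just a dtrackr history

svg <- dot2svg(dot)             # SVG markup as a character string

out <- tempfile()               # file path without an extension
save_dot(dot, out, formats = "svg")
```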

Legacy interface

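The following operations are aliases for the functions above, or helpers with no counterpart listed above (p_clear(), p_copy(), p_count_if(), p_get_as_dot() and p_set()). They are generally backend functions and should not be used in new projects.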

p_add_count()

dplyr modifying operations

p_add_tally()

dplyr modifying operations

p_anti_join()

Anti join

p_arrange()

dplyr modifying operations

p_bind_cols()

Set operations

p_bind_rows()

Set operations

p_capture_exclusions()

Start capturing exclusions on a tracked dataframe.

p_clear()

Clear the dtrackr history graph

p_comment()

Add a generic comment to the dtrackr history graph

p_copy()

Copy the dtrackr history graph from one dataframe to another

p_count_if()

Simple count_if dplyr summary function

p_count_subgroup()

Add a subgroup count to the dtrackr history graph

p_distinct()

Distinct values of data

p_exclude_all()

Exclude all items matching one or more criteria

p_excluded()

Get the dtrackr excluded data record

p_filter()

Filtering data

p_flowchart()

Flowchart output

p_full_join()

Full join

p_get()

Get the dtrackr history graph

p_get_as_dot()

DOT output

p_group_by()

Stratifying your analysis

p_group_modify()

Group-wise modification of data and complex operations

p_include_any()

Include any items matching a criterion

p_inner_join()

Inner joins

p_intersect()

Set operations

p_left_join()

Left join

p_mutate()

dplyr modifying operations

p_nest_join()

Nest join

p_pause()

Pause tracking the data frame.

p_pivot_longer()

Reshaping data using tidyr::pivot_longer

p_pivot_wider()

Reshaping data using tidyr::pivot_wider

p_reframe()

Summarise a data set

p_relocate()

dplyr modifying operations

p_rename()

dplyr modifying operations

p_rename_with()

dplyr modifying operations

p_resume()

Resume tracking the data frame.

p_right_join()

Right join

p_select()

dplyr modifying operations

p_semi_join()

Semi join

p_set()

Set the dtrackr history graph

p_setdiff()

Set operations

p_slice()

Slice operations

p_slice_head()

Slice operations

p_slice_max()

Slice operations

p_slice_min()

Slice operations

p_slice_sample()

Slice operations

p_slice_tail()

Slice operations

p_status()

Add a summary to the dtrackr history graph

p_summarise()

Summarise a data set

p_tagged()

Retrieve tagged data in the history graph

p_track()

Start tracking the dtrackr history graph

p_transmute()

dplyr modifying operations

p_ungroup()

Remove a stratification from a data set

p_union()

Set operations

p_union_all()

Set operations

p_untrack()

Remove tracking from the dataframe