%>%: Pipe

With dplyr we can perform a series of operations, for example select and then filter, by sending the results of one function to another using what is called the pipe operator: %>%. Some details are included below.

We wrote code above to show three variables (state, region, rate) for states that have murder rates of 0.71 or lower. To do this, we defined the intermediate object new_table. In dplyr, we can write code that looks more like a description of what we want to do, without intermediate objects:

[original data] -> [select] -> [filter]

For such an operation, we can use the pipe %>%. The code looks like this:

murders %>% select(state, region, rate) %>% filter(rate <= 0.71)
#>           state        region  rate
#> 1        Hawaii          West 0.515
#> 2          Iowa North Central 0.689
#> 3 New Hampshire     Northeast 0.380
#> 4  North Dakota North Central 0.595
#> 5       Vermont     Northeast 0.320

This line of code is equivalent to the two lines of code above. What is going on here?

In general, the pipe sends the result of the left side of the pipe to be the first argument of the function on the right side of the pipe.
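
As a minimal sketch of this rule, the snippet below checks that piping a value into a function gives the same result as calling the function directly. It assumes the dplyr package is installed, since loading it makes %>% available.

```r
library(dplyr)  # loading dplyr makes the %>% operator available

# Piping 16 into sqrt() passes 16 as the first argument of sqrt()
piped <- 16 %>% sqrt()

# The same computation written as an ordinary function call
direct <- sqrt(16)

piped
#> [1] 4
identical(piped, direct)
#> [1] TRUE
```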

Remember that the pipe sends values to the first argument, so we can define other arguments as if the first argument is already defined:

16 %>% sqrt() %>% log(base = 2)
#> [1] 2

Therefore, when using the pipe with data frames and dplyr, we no longer need to specify the required first argument since the dplyr functions we have described all take the data as the first argument. In the code we wrote:

murders %>% select(state, region, rate) %>% filter(rate <= 0.71)

murders is the first argument of the select function, and the new data frame (formerly new_table) is the first argument of the filter function.
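
To make the equivalence concrete, here is a small sketch that runs both versions and compares the results. The data frame toy and its values are hypothetical stand-ins for the murders table, used only so the example runs without the dslabs package; the object names with_object and with_pipe are likewise illustrative.

```r
library(dplyr)

# Hypothetical toy data standing in for the murders table
toy <- data.frame(
  state  = c("A", "B", "C"),
  region = c("West", "South", "West"),
  total  = c(5, 32, 7),
  rate   = c(0.5, 3.2, 0.7)
)

# Version 1: store the selected columns in an intermediate object
new_table   <- select(toy, state, region, rate)
with_object <- filter(new_table, rate <= 0.71)

# Version 2: the pipe passes each result on as the first argument
with_pipe <- toy %>% select(state, region, rate) %>% filter(rate <= 0.71)

# Both approaches return the same data frame
identical(with_object, with_pipe)
#> [1] TRUE
```

The piped version reads left to right in the order the steps happen, and it leaves no extra object such as new_table in the workspace.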

Instruction

Run the sample code to see how the %>% structure works.

library(dplyr)
library(dslabs)
data("murders")

# Add the rate column
murders <- mutate(murders, rate = total / population * 100000)

# Without the pipe: nested function calls
filter(select(murders, state, region, rate), rate <= 0.71)

# Using the pipe (%>%) to write the same operation
murders %>% select(state, region, rate) %>% filter(rate <= 0.71)
