Get Name of Dataframe Passed Through Pipe in R

Get name of dataframe passed through pipe in R

This is a first attempt. It's kind of a hack, but it seems to work.

find_chain_parts <- function() {
    # Walk up the call stack until a frame defines "chain_parts"
    i <- 1
    while (!("chain_parts" %in% ls(envir = parent.frame(i))) && i < sys.nframe()) {
        i <- i + 1
    }
    parent.frame(i)
}

printfirstname <- function(df){
    ee <- find_chain_parts()
    print(deparse(ee$lhs))
}

mtcars %>% printfirstname
# [1] "mtcars"

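The pipe function creates an environment that keeps track of the chain parts. I walked up the current execution environments looking for this variable and then used the lhs info stored there to find the symbol at the start of the pipe. This relies on magrittr internals (the chain_parts variable), so it may not work in magrittr versions whose pipe does not create that variable. It isn't well tested.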
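The frame walk itself does not depend on magrittr. A minimal base-R sketch of the same idea, assuming a hypothetical helper find_frame_with() and a made-up variable called marker (neither comes from magrittr or from the answer above), might look like this:

```r
# Hypothetical helper: return the closest calling frame that defines `name`.
find_frame_with <- function(name) {
  # Frames are numbered from the outermost call (1) to the current one,
  # so start just above this function and walk outwards.
  for (i in rev(seq_len(sys.nframe() - 1))) {
    fr <- sys.frame(i)
    if (exists(name, envir = fr, inherits = FALSE)) return(fr)
  }
  NULL  # no calling frame defines `name`
}

outer_fn <- function() {
  marker <- "set in outer_fn"  # stands in for the pipe's internal variable
  inner_fn()
}

inner_fn <- function() {
  fr <- find_frame_with("marker")
  fr$marker
}

result <- outer_fn()
print(result)
```

Returning NULL when nothing is found makes a failed lookup explicit, whereas the loop in the answer stops at the outermost frame whether or not the variable was found there.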

How to get a function input name in R whilst using pipes

By default, the pipe operator %>% passes its left-hand side to the next function as the dot ".", so substitute(x) inside that function returns the dot rather than the original variable name. You can recover the actual name from the lhs that the pipe keeps track of. See the function below (edited to work with or without %>%):

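myfun <- function(x) {
  x <- substitute(x)

  if (!identical(x, quote(.))) {
    # Called directly: x is the expression the caller wrote
    print(deparse(x))
  } else {
    # Called through %>%: walk up to the pipe's frame and read its lhs
    i <- 1
    while (!("chain_parts" %in% ls(envir = parent.frame(i))) && i < sys.nframe()) {
      i <- i + 1
    }
    ee <- parent.frame(i)
    print(deparse(ee$lhs))
  }
}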

mean %>% myfun()
# [1] "mean"

myfun(mean)
# [1] "mean"

Hope it helps.
-Ahmed Alhendi
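For comparison, here is a small sketch with R's native pipe |>, assuming R 4.1 or later. Because |> rewrites the call when the code is parsed, substitute() sees the original expression and no stack walk is needed. name_of() is a hypothetical helper, not part of the answer above.

```r
# Hypothetical helper: return the expression passed as `x`, as a string.
name_of <- function(x) deparse(substitute(x))

direct <- name_of(mtcars)
piped <- mtcars |> name_of()  # parsed as name_of(mtcars)

print(c(direct, piped))
```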

How to set the row names of a data frame passed on with the pipe %>% operator?

With later versions of tibble, a more elegant solution exists:

df <- df %>% pivot(.) %>% tibble::column_to_rownames('ID_full')

Importantly, it also works when the name of the column to turn into row names is passed as a variable, which is very convenient inside a function.
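A minimal sketch of that last point, assuming the tibble package is installed. The wrapper with_rownames() and the toy data frame are made up for illustration; only column_to_rownames() and its var argument come from tibble.

```r
library(tibble)

# Hypothetical wrapper: `id_col` holds the column name as a string.
with_rownames <- function(data, id_col) {
  column_to_rownames(data, var = id_col)
}

# Toy data, invented for this example
df <- data.frame(ID_full = c("a", "b", "c"), value = 1:3)

res <- with_rownames(df, "ID_full")
print(rownames(res))
```

The ID column is moved into the row names, so it no longer appears among the columns of the result.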

%>% .$column_name equivalent for R base pipe |>

We can use getElement().

iris |> getElement('Sepal.Length') |> cut(5)
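As a quick check, and assuming R 4.1 or later, the sketch below compares getElement() with another common workaround, an anonymous function called in place. Both should extract the same column.

```r
# Extract the column with getElement()
a <- iris |> getElement("Sepal.Length")

# Same extraction with an anonymous function called in place
b <- iris |> (\(d) d$Sepal.Length)()

print(identical(a, b))

# Continue the pipe as in the answer above
binned <- a |> cut(5)
print(nlevels(binned))
```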

Update row names of a data frame in a pipe (%>%)

Use the `rownames<-`() replacement function.

library(magrittr)
d %>% `rownames<-`(NULL)
#   X1 X2 X3 X4
# 1  1  4  7 10
# 2  2  5  8 11
# 3  3  6  9 12

Data:

d <- structure(list(X1 = 1:3, X2 = 4:6, X3 = 7:9, X4 = 10:12), class = "data.frame", row.names = c("a", 
"b", "c"))
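magrittr also exports set_rownames() as an alias for `rownames<-`, which some find easier to read in a pipe. A short sketch, using a smaller data frame made up for this example:

```r
library(magrittr)

# Toy data with explicit row names
d <- data.frame(X1 = 1:3, X2 = 4:6, row.names = c("a", "b", "c"))

# set_rownames() is magrittr's alias for `rownames<-`
reset <- d %>% set_rownames(NULL)

print(rownames(reset))
```

The original d is left unchanged; only the piped copy has its row names reset.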

Transform data to data.frame with the pipe operator

After the transpose, convert to a tibble with as_tibble() and set the column names with setNames():

library(dplyr)
library(tibble)
x %>% 
  t %>%
  as_tibble(.name_repair = "unique") %>%
  setNames(c("a", "b"))
# A tibble: 1 x 2
#      a     b
#  <int> <int>
#1     1     2

Alternatively, to keep the OP's syntax, wrap the code in {}:

x %>%
     {data.frame(a = .[1], b = .[2])}
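The same transformation can be sketched in base R with the native pipe. This assumes R 4.1 or later and that x is the integer vector 1:2 (x is not shown in the answer, so that value is a guess based on the printed output). It returns a plain data.frame rather than a tibble.

```r
x <- 1:2  # assumed input, inferred from the output shown above

res <- x |>
  t() |>                    # 1 x 2 matrix
  as.data.frame() |>        # columns V1, V2
  setNames(c("a", "b"))     # rename to a, b

print(res)
```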

How to pass a dataframe column as an argument in a function using piping?

We can use non-standard evaluation with curly-curly ({{}})

library(dplyr)
library(rlang)

fxtop <- function(df, number, column){
   tops <- df %>% top_n(number, {{column}})
   return(tops)
}

and pass unquoted variable names

fxtop(df=df_econ, number=5, pop)

#   date        pce     pop psavert uempmed unemploy
#  <date>      <dbl>   <dbl>   <dbl>   <dbl>    <dbl>
#1 2014-12-01 12062  319746.     7.6    12.9     8717
#2 2015-01-01 12046  319929.     7.7    13.2     8903
#3 2015-02-01 12082. 320075.     7.9    12.9     8610
#4 2015-03-01 12158. 320231.     7.4    12       8504
#5 2015-04-01 12194. 320402.     7.6    11.5     8526

To pass the column name as a (quoted) string instead, use sym() with !!:

fxtop <- function(df, number, column){
  tops <- df %>% top_n(number, !!sym(column))
  return(tops)
}
fxtop(df=df_econ, number=5, 'pop')
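Since top_n() has been superseded by slice_max() in dplyr 1.0.0, the same two patterns can be sketched with the newer verb. This assumes dplyr 1.0.0 or later; the function names top_rows() and top_rows_chr() are hypothetical, and mtcars stands in for df_econ, which is not defined in the answer.

```r
library(dplyr)

# Unquoted column name: forward it with curly-curly
top_rows <- function(df, number, column) {
  df %>% slice_max(order_by = {{ column }}, n = number, with_ties = FALSE)
}

# Column name as a string: look it up through the .data pronoun
top_rows_chr <- function(df, number, column) {
  df %>% slice_max(order_by = .data[[column]], n = number, with_ties = FALSE)
}

by_symbol <- top_rows(mtcars, 3, mpg)
by_string <- top_rows_chr(mtcars, 3, "mpg")

print(by_symbol$mpg)
print(by_string$mpg)
```

with_ties = FALSE keeps the result at exactly n rows; leave it at the default if tied rows should all be returned.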

Get expression that evaluated to dot in function called by `magrittr` pipe

y is not "gone forever", because the pipe calls your function, and it also knows about y. There's a way to recover y, but it requires some traversal of the calling stack. To understand what's happening, we'll use ?sys.frames and ?sys.calls:

‘sys.calls’ and ‘sys.frames’ give a pairlist of all the active calls and frames, respectively, and ‘sys.parents’ returns an integer vector of indices of the parent frames of each of those frames.

If we sprinkle these throughout your x_expression(), we can see what happens when we call y %>% x_expression() from the global environment:

x_expression <- function(x) {
  print( enquo(x) )
  # <quosure>
  #   expr: ^.
  #   env:  0x55c03f142828                <---

  str(sys.frames())
  # Dotted pair list of 9
  #  $ :<environment: 0x55c03f151fa0> 
  #  $ :<environment: 0x55c03f142010> 
  #  ...
  #  $ :<environment: 0x55c03f142828>     <---
  #  $ :<environment: 0x55c03f142940>

  str(sys.calls())
  # Dotted pair list of 9
  #  $ : language y %>% x_expression()    <---
  #  $ : language withVisible(eval(...
  #  ...
  #  $ : language function_list[[k]...
  #  $ : language x_expression(.)
}

I highlighted the important parts with <---. Notice that the quosure captured by enquo lives in the parent environment of the function (second from the bottom of the stack), while the pipe call that knows about y is all the way at the top of the stack.
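To see the ordering of sys.calls() without magrittr involved, here is a small base-R sketch. The function names show_stack() and caller() are made up for illustration.

```r
# sys.calls() lists active calls from the outermost to the innermost.
show_stack <- function() sys.calls()
caller <- function() show_stack()

calls <- as.list(caller())
n <- length(calls)

# Function names of the two innermost calls, outermost first
innermost <- vapply(
  calls[c(n - 1, n)],
  function(cl) as.character(cl[[1]]),
  character(1),
  USE.NAMES = FALSE
)
print(innermost)
```

The function that asked for the stack is always the last entry, which is why x_expression(.) sits at the bottom of the listing above and the pipe call sits at the top.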

There are a couple of ways to traverse the stack. @MrFlick's answer to a similar question as well as this GitHub issue traverse the frames / environments from sys.frames(). Here, I will show an alternative that traverses sys.calls() and parses the expressions to find %>%.

The first piece of the puzzle is to define a function that converts an expression to its Abstract Syntax Tree (AST):

# Recursively constructs Abstract Syntax Tree for a given expression
getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)
# Example: getAST( quote(a %>% b) )
# List of 3
#  $ : symbol %>%
#  $ : symbol a
#  $ : symbol b
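If purrr is not available, the same recursion can be sketched in base R. get_ast() is a hypothetical name for this variant. The example only quotes the pipe expression, so magrittr does not need to be loaded.

```r
# Base-R variant: turn a call into a nested list, recursing into sub-calls.
get_ast <- function(ee) {
  lapply(as.list(ee), function(part) {
    if (is.call(part)) get_ast(part) else part
  })
}

ast <- get_ast(quote(a %>% b(c)))
str(ast)
```

Either version yields the same nested-list shape: the operator first, then the left-hand side, then the right-hand side. The code below keeps using getAST().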

We can now systematically apply this function to the entire sys.calls() stack. The goal is to identify ASTs where the first element is %>%; the second element will then correspond to the left-hand side of the pipe (symbol a in the a %>% b example). If there is more than one such AST, then we're in a nested %>% pipe scenario. In this case, the last AST in the list will be the lowest in the calling stack and closest to our function.

x_expression2 <- function(x) {
  sc <- sys.calls()
  ASTs <- purrr::map( as.list(sc), getAST ) %>%
    purrr::keep( ~identical(.[[1]], quote(`%>%`)) )  # Match first element to %>%

  if( length(ASTs) == 0 ) return( enexpr(x) )        # Not in a pipe
  dplyr::last( ASTs )[[2]]    # Second element is the left-hand side
}

(Minor note: I used enexpr() instead of enquo() to ensure consistent behavior of the function in and out of the pipe. Since sys.calls() traversal returns an expression, not a quosure, we want to do the same in the default case as well.)

The new function is pretty robust and works inside other functions, including nested %>% pipes:

x_expression2(y)
# y

y %>% x_expression2()
# y

f <- function() {x_expression2(v)}
f()
# v

g <- function() {u <- 1; u %>% x_expression2()}
g()
# u

y %>% (function(z) {w <- 1; w %>% x_expression2()})  # Note the nested pipes
# w
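For reference, here is a sketch of the same sys.calls() idea using only base R plus magrittr for the pipe itself. pipe_lhs() is a hypothetical helper, not the answer's x_expression2(). It differs in two ways: it returns substitute(x) outside a pipe, and in a multi-step chain it descends to the leftmost expression rather than returning the whole left-hand side.

```r
library(magrittr)

# Hypothetical helper: expression at the start of the innermost %>% chain.
pipe_lhs <- function(x) {
  is_pipe <- function(e) is.call(e) && identical(e[[1]], as.name("%>%"))

  # Keep only the active calls that are %>% calls
  calls <- Filter(is_pipe, as.list(sys.calls()))
  if (length(calls) == 0) return(substitute(x))  # not in a pipe

  lhs <- calls[[length(calls)]][[2]]    # innermost pipe call, left-hand side
  while (is_pipe(lhs)) lhs <- lhs[[2]]  # descend to the start of the chain
  lhs
}

y <- 1:3
print(y %>% pipe_lhs())
print(y %>% rev() %>% pipe_lhs())
print(pipe_lhs(y))
```

Like the original, this only recognises calls written with %>%, so it will not see the native |> pipe or other magrittr operators.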
