Identifying where value changes in R data.frame column
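A simple solution is to compare each value with the previous one using dplyr's lag function; which() then returns the row indices where a change occurs (the first row compares against NA and is dropped):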
which(df$value != dplyr::lag(df$value))
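For readers without dplyr installed, the same comparison can be sketched in base R. The data frame below is hypothetical (the original question's data is not shown), and the head/tail shift is one assumed way to stand in for lag, not the answerer's code.

```r
# Hypothetical example data; the original df is not shown in the answer.
df <- data.frame(value = c(1, 1, 2, 2, 2, 3, 1))

# tail(x, -1) drops the first element, head(x, -1) drops the last,
# so the comparison pairs each element with its predecessor.
# Adding 1 maps the result back to row numbers of df.
changes <- which(tail(df$value, -1) != head(df$value, -1)) + 1
changes
```

Here rows 3, 6 and 7 are the ones whose value differs from the row before.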
How to determine when a change in value occurs in R
Like this:
df$ind[c(FALSE, diff(as.numeric(df$value)) == 100)]
Determine when columns of a data.frame change value and return indices of the change
In data.table version 1.8.10 (stable version on CRAN), there's an (unexported) function called duplist that does exactly this. And it's also written in C and is therefore terribly fast.
require(data.table) # 1.8.10
data.table:::duplist(x[, 3:5])
# [1] 1 4 5
If you're using the development version of data.table (1.8.11), then there's a more efficient version (in terms of memory), renamed uniqlist, that does exactly the same job. Probably this should be exported for next release. Seems to have come up on SO more than once. Let's see.
require(data.table) # 1.8.11
data.table:::uniqlist(x[, 3:5])
# [1] 1 4 5
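Because duplist and uniqlist are unexported internals, it may help to see the same idea in plain base R. This is only a sketch of the concept (flag a row when any column differs from the row above), not data.table's C implementation, and the data frame x is made up for illustration.

```r
# Hypothetical data; columns chosen so the change points are rows 1, 4, 5.
x <- data.frame(a = c(1, 1, 1, 2, 2),
                b = c("u", "u", "u", "u", "v"),
                stringsAsFactors = FALSE)
n <- nrow(x)

# Compare rows 2..n with rows 1..(n-1), column by column.
differs <- x[-1, , drop = FALSE] != x[-n, , drop = FALSE]

# The first row always starts a new run; later rows start one
# when at least one column differs from the previous row.
changed <- c(TRUE, rowSums(differs) > 0)
idx <- unname(which(changed))
idx
```

This will be much slower than the C routine on large tables, but it does not depend on unexported functions that may change between releases.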
How to identify value change in one column in R?
Perhaps you could look at the transition of drug_type from "A" to "B", or include where the number of distinct drug_brand is greater than 1?
library(tidyverse)
df %>%
  group_by(id) %>%
  filter(any(drug_type == "B" & lag(drug_type) == "A") |
           n_distinct(drug_brand) > 1)
find the place where the variable in a dataframe changes its value
Here is a possible way. Use diff to find the rows where column b changes, but be careful: the first value of b, by definition of change, hasn't changed. (The problem is that diff returns a vector with one less element, so a leading FALSE is prepended to keep the index aligned with the rows.)
inx <- c(FALSE, diff(data$b) != 0)
data[inx, ]
# a b
#4 4 1
After seeing the OP's comment to another post, the following code shows that this method can also solve the issue when b starts with any value, not just zero.
data2 <- data.frame(a=c(1,2,3,4,5,6),b=c(1,1,1,0,0,0))
inx <- c(FALSE, diff(data2$b) != 0)
data2[inx, ]
# a b
#4 4 0
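Note that diff only works on numeric input. If the column were character or factor, one possible alternative is rle, which works on any atomic vector. The sketch below uses a made-up vector b and is an assumed variation on the answer, not part of it.

```r
# Hypothetical non-numeric column.
b <- c("x", "x", "x", "y", "y", "z")

# rle() gives the length of each run of equal values.
runs <- rle(b)

# cumsum() of the lengths gives the last row of each run;
# dropping the final one and adding 1 gives the first row of
# every run after the first, i.e. the rows where b changes.
change_rows <- head(cumsum(runs$lengths), -1) + 1
change_rows
```

As with the diff approach, the first row is never reported as a change.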
How to create column that identifies another column where the row values change?
I suppose you could create a difference matrix for the first 4 columns (using your data frame df):
df_diff <- rbind(0, diff(as.matrix(df[1:4])))
Which would give you:
A B C D
[1,] 0 0 0 0
[2,] 0 0 0 1
[3,] 0 -6 0 1
[4,] 4 0 0 0
Then, using sapply with a row index for your data frame and the difference matrix, you could do the following:
df$F <- sapply(seq_len(nrow(df)), function(i) ifelse(df[i, 5] < 0,
                                                     names(which(df_diff[i, ] != 0))[1],
                                                     NA_character_))
This will check for negative values in column 5 and, for those rows, select the name of the first column with a non-zero entry in the difference matrix. Otherwise, it will put in NA. A new column F will contain this result.
Output
A B C D E F
1 0 6 0 0 0 <NA>
2 0 6 0 1 -5 D
3 0 0 0 2 4 <NA>
4 4 0 0 2 -1 A
Data
df <- structure(list(A = c(0, 0, 0, 4), B = c(6, 6, 0, 0), C = c(0,
0, 0, 0), D = c(0, 1, 2, 2), E = c(0, -5, 4, -1), F = c(NA, "D",
NA, "A")), row.names = c(NA, -4L), class = "data.frame")
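The row-by-row sapply above could also be written with apply over the difference matrix followed by one vectorised ifelse. This is a sketch of an equivalent formulation under the same data (columns A to E rebuilt from the Data block, without the precomputed F); the name first_changed is introduced here for illustration.

```r
# Columns A-E as in the Data block above.
df <- data.frame(A = c(0, 0, 0, 4), B = c(6, 6, 0, 0), C = c(0, 0, 0, 0),
                 D = c(0, 1, 2, 2), E = c(0, -5, 4, -1))

# Row-to-row differences for the first four columns; the first row
# has no predecessor, so it is padded with zeros.
df_diff <- rbind(0, diff(as.matrix(df[1:4])))

# For every row, the name of the first column that changed
# (NA when nothing changed).
first_changed <- apply(df_diff != 0, 1, function(r) names(which(r))[1])

# Keep that name only where column E is negative.
df$F <- ifelse(df$E < 0, first_changed, NA_character_)
df
```

Computing first_changed for every row does a little extra work, but it keeps the condition on E separate from the search for the changed column.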
Finding the column index where a row changes values in R dataframe/Datatable
What about this:
# Iterate over rows (the original 1:NCOL(df) only worked because df was square)
x <- lapply(seq_len(nrow(df)), function(i) rle(df[i, ])$values)
Output of x:
[[1]]
Col2 Col3
1 2 9
[[2]]
Col1 Col2 Col3
2 2 7 6
[[3]]
Col1 Col2 Col3
3 1 5 4
Then if you'd like the full range of before/after values, you could use:
lapply(x,function(i) paste0(i,collapse="->"))
[[1]]
[1] "2->9"
[[2]]
[1] "2->7->6"
[[3]]
[1] "1->5->4"
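If all columns are numeric, the two steps can be combined with apply, which hands each row to rle as a plain vector. The data frame below is an assumption reconstructed from the printed output above, since the original df is not shown.

```r
# Hypothetical data, reconstructed from the output shown above.
df <- data.frame(Col1 = c(2, 2, 1), Col2 = c(2, 7, 5), Col3 = c(9, 6, 4))

# For each row: collapse runs of equal neighbouring values with rle(),
# then join the remaining values into a "before->after" string.
transitions <- unname(apply(df, 1, function(r) {
  paste(rle(r)$values, collapse = "->")
}))
transitions
```

apply converts the data frame to a matrix first, so this form is only suitable when the columns share a type.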
Identifying where Remark column changes in R data.frame based on time stamp
Try this
library(dplyr)
d <- data.frame(ID, Remarks, Date, stringsAsFactors = FALSE)
# Keep rows whose Remarks differ from the previous row's Remarks
d %>% filter(Remarks != lag(Remarks, default = ''))
Output:
ID Remarks Date
1 1 joined 2020/08/01 06:31:38
2 1 newrole 2020/08/01 13:17:07
3 1 transferred 2020/08/01 13:29:01
4 2 joined 2020/08/03 06:31:38
5 2 newrole 2020/08/04 06:31:38
6 2 transferred 2020/08/04 13:17:07
7 3 joined 2020/08/07 13:17:07
8 3 newrole 2020/08/07 13:29:01
9 3 transferred 2020/08/10 13:29:01
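One thing that may be worth checking: the comparison above is not grouped by ID, so if one ID ends with the same remark that the next ID starts with, that first row would be dropped. The base R sketch below uses made-up data to show the difference and one assumed way to handle it; it is not part of the original answer.

```r
# Hypothetical data where ID 1 ends and ID 2 starts with "newrole".
d <- data.frame(ID = c(1, 1, 1, 2, 2),
                Remarks = c("joined", "joined", "newrole",
                            "newrole", "transferred"),
                stringsAsFactors = FALSE)
n <- nrow(d)

# Same logic as lag(Remarks, default = ''): previous remark, "" for row 1.
prev <- c("", head(d$Remarks, -1))
ungrouped <- d$Remarks != prev

# Also treat the first row of every ID as a change
# (assumes rows are already ordered by ID, then by date).
new_id <- c(TRUE, d$ID[-1] != d$ID[-n])
keep <- ungrouped | new_id

d[keep, ]
```

With this data the ungrouped filter keeps rows 1, 3 and 5, while the ID-aware version also keeps row 4, the first row of ID 2.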