How to Use Plyr to Number Rows

How do I use plyr to number rows?

I'd do it like this:

library(plyr)
ddply(df, c("kmer", "cvCut"), transform, newID = seq_along(kmer))
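
For a quick check of what per-group numbering produces, here is a minimal base R sketch that needs no packages. The values are made up for illustration; only the column names kmer and cvCut come from the question. It uses ave with two grouping vectors, which numbers rows within each kmer/cvCut combination and keeps the original row order (ddply returns the rows sorted by group instead).

```r
# Toy data (hypothetical values); only the column names come from the question.
df <- data.frame(
  kmer  = c("AAA", "AAA", "AAA", "CCC", "CCC"),
  cvCut = c(1, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Number rows within each kmer/cvCut combination, keeping the row order.
df$newID <- ave(seq_along(df$kmer), df$kmer, df$cvCut, FUN = seq_along)
df$newID
# 1 2 1 1 2
```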

Numbering rows within groups in a data frame

Use ave, ddply, dplyr or data.table:

df$num <- ave(df$val, df$cat, FUN = seq_along)

or:

library(plyr)
ddply(df, .(cat), mutate, id = seq_along(val))

or:

library(dplyr)
df %>% group_by(cat) %>% mutate(id = row_number())

or (the most memory efficient, as it assigns by reference within DT):

library(data.table)
DT <- data.table(df)

# Either of these adds the within-group row number:
DT[, id := seq_len(.N), by = cat]
DT[, id := rowid(cat)]
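
One detail of the ave variant is worth checking on a toy example: ave returns a vector of the same type as its first argument, so a character val column gives character ids. The sketch below uses made-up data (only the names cat and val come from the snippets above) and shows one way around that, by passing an integer index instead of the column itself.

```r
# Toy data (hypothetical values); cat and val mirror the names used above.
df <- data.frame(
  cat = c("a", "a", "b", "a", "b"),
  val = c("u", "v", "w", "x", "y"),
  stringsAsFactors = FALSE
)

# ave() keeps the type of its first argument, so these ids are character.
ids_chr <- ave(df$val, df$cat, FUN = seq_along)

# Passing an integer index instead keeps the ids integer.
ids_int <- ave(seq_along(df$val), df$cat, FUN = seq_along)
```

With a numeric val, as in the original call, both forms give the same numbers.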

Doing a plyr operation on every row of a data frame in R

Just treat it like an array and work on each row:

library(plyr)
adply(df, 1, transform, max = max(x, y))

Splitting text in a column and adding a row number

I'm biased in favor of cSplit from the "splitstackshape" package, but you might be interested in unnest from "tidyr" in conjunction with "dplyr":

library(dplyr)
library(tidyr)
df %>%
  mutate(b = strsplit(b, ";")) %>%
  unnest(b)
#   a b
# 1 1 g
# 2 1 j
# 3 1 n
# 4 2 x
# 5 2 f
# 6 2 v
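
The question title also asks for a row number, which the pipeline above does not add. As a rough base R sketch of both steps: the input data frame here is reconstructed from the printed output and is an assumption, and the id column name is made up for illustration.

```r
# Input reconstructed from the printed output above (an assumption).
df <- data.frame(a = 1:2, b = c("g;j;n", "x;f;v"), stringsAsFactors = FALSE)

# Split each string, then repeat a once per resulting piece.
parts <- strsplit(df$b, ";", fixed = TRUE)
long <- data.frame(
  a = rep(df$a, lengths(parts)),
  b = unlist(parts),
  stringsAsFactors = FALSE
)

# Row number within each value of a.
long$id <- ave(seq_along(long$a), long$a, FUN = seq_along)
long
```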

ddply() and using length to count within a specific set of rows in R

If you want to stick with plyr:

library(plyr)
df.ddply <- ddply(df, "name", summarise, counter = length(var[var == 1]))
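
One caveat with the length(var[var == 1]) idiom is that it also counts missing values, because indexing by NA returns NA. The base R sketch below illustrates this with made-up data (only the names name and var come from the call above) and uses tapply in place of ddply so that it runs without plyr.

```r
# Toy data (hypothetical values); name and var follow the call above.
df <- data.frame(
  name = c("x", "x", "x", "y", "y"),
  var  = c(1, NA, 1, 2, 1),
  stringsAsFactors = FALSE
)

# length(v[v == 1]) also counts NA, because indexing by NA returns NA.
count_len <- tapply(df$var, df$name, function(v) length(v[v == 1]))

# sum(..., na.rm = TRUE) counts only the values that equal 1.
count_sum <- tapply(df$var, df$name, function(v) sum(v == 1, na.rm = TRUE))
```

If var can contain NA, the sum form inside summarise is probably the safer choice.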

plyr summarize count error row length

This was a bit of a weird one. I didn't use classic plyr, but I think this is roughly what you're looking for. I removed the filtering column, filt, so that its values aren't counted:

library(dplyr)

data %>%
  filter(filt == 1) %>%
  select(-filt) %>%
  purrr::map_df(function(a_column) {
    purrr::map_int(1:4, function(num) sum(a_column == num))
  })

# A tibble: 4 x 4
#       A     B     C     D
#   <int> <int> <int> <int>
# 1     0     1     1     0
# 2     0     1     1     1
# 3     0     0     0     1
# 4     2     0     0     0

Applying a function to every row of a table using dplyr?

As of dplyr 0.2 (I think) rowwise() is implemented, so the answer to this problem becomes:

library(dplyr)

iris %>%
  rowwise() %>%
  mutate(Max.Len = max(Sepal.Length, Petal.Length))

Non rowwise alternative

Five years (!) later this answer still gets a lot of traffic. Since it was given, rowwise is increasingly not recommended, although lots of people seem to find it intuitive. Do yourself a favour and go through Jenny Bryan's Row-oriented workflows in R with the tidyverse material to get a good handle on this topic.

The most straightforward way I have found is based on one of Hadley's examples using pmap:

iris %>%
  mutate(Max.Len = purrr::pmap_dbl(list(Sepal.Length, Petal.Length), max))

Using this approach, you can give an arbitrary number of arguments to the function (.f) inside pmap.

pmap is a good conceptual approach because it reflects the fact that when you're doing row wise operations you're actually working with tuples from a list of vectors (the columns in a dataframe).
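
The same "tuples from a list of vectors" idea can be sketched in base R, which may help if purrr is not available. This is only an illustration of the concept, not the approach from the answer: mapply stands in for pmap, and pmax is the vectorised shortcut that happens to exist for max.

```r
# iris ships with R (datasets package), so no extra packages are needed.
# mapply() walks the two columns in parallel, one tuple per row.
max_len_mapply <- mapply(max, iris$Sepal.Length, iris$Petal.Length)

# For max specifically, pmax() is the vectorised equivalent.
max_len_pmax <- pmax(iris$Sepal.Length, iris$Petal.Length)

# Both approaches should agree row by row.
all.equal(max_len_mapply, max_len_pmax)
```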

Trying to apply the function plyr::ldply on lists to convert into a data.frame of differing number of rows

Actually, you may want to reverse the order in your dplyr::left_join, since info contains more cik values than filing. The latter holds two empty data frames in the original list.

library(dplyr)
library(tibble)

info <- do.call("rbind", lapply(data, "[[", 1))
filing <- do.call("rbind", lapply(data, "[[", 2))

final_df_op <- info %>%
  left_join(filing %>%
              tibble::rownames_to_column(., "cik") %>%
              mutate(cik = gsub("\\..*", "", cik)), by = "cik")

str(final_df_op)
# 'data.frame': 51 obs. of 30 variables:
# $ name : chr "AAR CORP" "AAR CORP" "AAR CORP" "AAR CORP" ...
# $ cik : chr "0000001750" "0000001750" "0000001750" "0000001750" ...
# $ fiscal_year_end : chr "0531" "0531" "0531" "0531" ...
# ...

Should you be interested, consider the base R counterpart with the following changes:

  • Instead of lapply, use mapply to iterate elementwise through the data items and their corresponding names;

  • Run transform to add a column for cik by corresponding list name;

  • Merge both objects with all.x=TRUE for left join specification.

Base R

# info comes from the first element of each list item
info <- do.call("rbind", mapply(function(d, n) transform(d[[1]], cik = n),
                                data, names(data), SIMPLIFY = FALSE, USE.NAMES = FALSE))

# TRY CATCH TO ACCOUNT FOR ZERO-ROW DF ERRORS
filing <- do.call("rbind", mapply(function(d, n)
                                    tryCatch(transform(d[[2]], cik = n),
                                             error = function(e) NA),
                                  data, names(data), SIMPLIFY = FALSE, USE.NAMES = FALSE))

# LEFT JOIN MERGE
final_df <- merge(info, filing, by="cik", all.x=TRUE)

str(final_df)
# 'data.frame': 51 obs. of 30 variables:
# $ cik : chr "0000001750" "0000001750" "0000001750" "0000001750" ...
# $ name : chr "AAR CORP" "AAR CORP" "AAR CORP" "AAR CORP" ...
# $ fiscal_year_end : chr "0531" "0531" "0531" "0531" ...
# ...

