How to rearrange the order of matches between two data frames
In this case I find it easier to reshape the data to long format before merging it with the lookup table.
You could try:
library(tidyr)
library(dplyr)
df1_tmp <- df1
df2_tmp <- df2
#add numerical id to df1_tmp to keep row information
df1_tmp$id <- seq_along(df1_tmp[,1])
#reshape to long format and unnest rows with several strings
df1_tmp <- gather(df1_tmp,key="s_val",value="query_string",-id)
df1_tmp <- df1_tmp %>%
mutate(query_string = strsplit(as.character(query_string), ";")) %>%
unnest(query_string)
df2_tmp$IDs. <- gsub("[()]", "", df2_tmp$IDs.)
#add numerical id to df2_tmp to keep row information
df2_tmp$id <- seq_along(df2_tmp$IDs.)
#unnest rows with several strings
df2_tmp <- df2_tmp %>%
mutate(IDs. = strsplit(as.character(IDs.), ",")) %>%
unnest(IDs.)
res <- merge(df1_tmp,df2_tmp,by.x="query_string",by.y="IDs.")
res$ID_col_n <- paste0(res$id.x,res$s_val)
res$total_id <- 1:nrow(res)
res <- spread(res,s_val,value=query_string,fill=NA)
res
#summarize to get required output
res <- res %>% group_by(id.y) %>%
mutate(No=n()) %>% group_by(id.y,No) %>%
summarise_each(funs(paste(.[!is.na(.)],collapse=","))) %>%
select(-id.x,-total_id)
colnames(res)[colnames(res)=="id.y"]<-"IDs"
res$df1_colMatch_counts <- rowSums(res[,-(1:3)]!="")
df2_counts <- df2_tmp %>% group_by(id) %>% summarize(df2_string_counts=n())
res <- merge(res,df2_counts,by.x="IDs",by.y="id")
res
  IDs No    ID_col_n            s1     s2 df1_colMatch_counts df2_string_counts
1   1  1         4s1        P41182                          1                 2
2   2  1         4s1        P41182                          1                 2
3   3  1         4s1        P41182                          1                 2
4   4  3 2s2,3s1,5s1 Q9Y6Q9,Q09472 Q92831                   2                 4
5  15  1         3s2               P54612                   1                 5
6  16  1         7s2               O15143                   1                 7
Reordering rows in a dataframe to match order of rows in another dataframe
Since you want to order the dataframes according to the Paper ID, first set that column as the index in both dataframes:
df1.set_index('Paper ID', inplace=True)
df2.set_index('Paper ID', inplace=True)
Now you can reindex df2 to match the order of df1:
df2 = df2.reindex(df1.index)
Finally, reset the indices to restore the default index:
df1.reset_index(inplace=True)
df2.reset_index(inplace=True)
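A minimal end-to-end sketch of the steps above, using two small made-up frames. Only the `Paper ID` column name comes from the question; the `Title` and `Score` columns and all values are assumptions for illustration. It uses the non-inplace form of `set_index`, and it assumes the IDs are unique in `df2`, since `reindex` raises an error on a duplicated index.

```python
import pandas as pd

# Hypothetical sample data; only the 'Paper ID' column name comes from the question.
df1 = pd.DataFrame({"Paper ID": [3, 1, 2], "Title": ["c", "a", "b"]})
df2 = pd.DataFrame({"Paper ID": [1, 2, 3], "Score": [10, 20, 30]})

# Index both frames by the key, reorder df2 to follow df1,
# then restore the default integer index.
df1_indexed = df1.set_index("Paper ID")
df2_sorted = df2.set_index("Paper ID").reindex(df1_indexed.index).reset_index()

print(df2_sorted)
```

Any ID that appears in `df1` but not in `df2` would show up as a row of NaN values after `reindex`, so it may be worth checking for missing keys first.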
Matching dataframe row order with another dataframe in Python based on str data
After getting some helpful feedback, I realized that what I really needed was a way to sort by a custom list, so I looked up sorting by a custom list in pandas. The selected answer there did not work for me, but the second answer did. My implementation is below.
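sorter = ['A', 'B', 'C']  # whatever order I want Description to be sorted in
df.Description = df.Description.astype("category")
# set_categories no longer accepts inplace=True in pandas 2.x, so assign the result back
df.Description = df.Description.cat.set_categories(sorter)
# sort_values returns a new dataframe, so keep the result
df = df.sort_values(by=['Code', 'Description'])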
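For reference, a self-contained sketch of the same custom-order sort on made-up data (the values in `Code` and `Description` are assumptions for illustration). It builds the categorical with `pd.Categorical` instead of `set_categories`, which is one way to avoid the `inplace` argument that pandas 2.x no longer accepts.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "Code": [1, 1, 1, 2],
    "Description": ["C", "A", "B", "B"],
})

sorter = ["B", "C", "A"]  # desired order for Description

# An ordered categorical sorts by position in `sorter`, not alphabetically.
df["Description"] = pd.Categorical(df["Description"], categories=sorter, ordered=True)
df_sorted = df.sort_values(by=["Code", "Description"]).reset_index(drop=True)

print(df_sorted)
```

Values of `Description` that are not listed in `sorter` become NaN under this approach, so the list should cover every value that occurs in the column.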
How do I rearrange two columns in a dataframe so that row values match in R?
One way is to match the values in 'x' against the substring of 'y' that remains after removing the prefix with str_remove, and use the resulting index to reorder 'y':
library(stringr)
library(dplyr)
df %>%
mutate_all(as.character) %>%
mutate(y = y[match(x, str_remove(y, ".*-"))])
# x y
#1 L1 E17-L1
#2 L10 G15-L10
#3 L100 G1-L100
#4 L101 E14-L101
Reorder one data.frame using two columns from another data.frame in R
Try using paste to combine your id and lob within a match call.
b[match(paste(a$id,a$lob), paste(b$id,b$lob)),]
id lob val
1 1+ X 1
4 3 X 5
7 2 X 4
8 1 X 3
9 1 Y 2
2 1+ Y 9
1.1 1+ X 1
4.1 3 X 5
4.2 3 X 5
Make same order of similar column in two data frames
We can try with match and order:
DF2[order(match(DF2$RS2, DF1$RS1)),1, drop=FALSE]
# RS2
#1 rs_12
#2 rs_23
#3 rs_23
#5 rs_23
#4 rs_34
#6 rs_34
#7 rs_34
How to reorder a dataframe based on another vector in R
You can order your df using dplyr
library(dplyr)
df %>%
arrange(factor(BRANCH, levels = Branches))
How do I sort the order of columns in a dataframe by another dataframe in R?
One trick is to use names(df1):
df2<-df2[names(df1)]
This gives df2 the same columns, in the same order, as df1, which is very handy if you need to use rbind()!