Duplicating rows in R merge function
The problem is that your variables are not unique, so every merge produces more and more rows. Have a look at what you get when you run:
dt <- data.frame(Level12R$level1.2_are.out$parameters$stdyx.standardized[,1:2])
tail(dt)
paramHeader param
30 ASRREA.ON ATBR10CG
31 ASRREA.ON ATBR10DG
32 ASRREA.ON ATBR10FG
33 ASRREA.ON ATBR12AG
34 Intercepts ASRREA
35 Residual.Variances ASRREA
You can see that the last two variables are the same but come from different headers.
So we have to extend the join key so that each record is unique. Looking at the data, that takes three columns: 1, 2 and 8 ("header", "variable" and "betweenwithin"). Then we can loop through everything without getting duplicate records. Your dt object ends up with 35 records and 51 variables, with NAs where a result set had not 35 records but 34 or even 25.
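library(dplyr)  # provides left_join()

keys <- c("header", "variable", "betweenwithin")
# Start from the key columns (1, 2 and 8) of one result set.
dt <- data.frame(Level12R$level1.2_are.out$parameters$stdyx.standardized[, c(1:2, 8)])
names(dt) <- keys
nomes <- names(Level12R)
for (i in seq_along(Level12R)) {
  # Index the list by name instead of building code with eval(parse()).
  df <- Level12R[[nomes[i]]]$parameters$stdyx.standardized[, c(1:3, 8)]
  # Column 3 is the value being collected; name it after the model.
  names(df) <- c("header", "variable", toupper(substr(nomes[i], 10, 12)), "betweenwithin")
  # Join on the three key columns only, so each record matches at most once.
  dt <- left_join(x = dt, y = df, by = keys)
}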
Normally I would collect the results in a list inside the loop and decide later what to do with the data in the list. That prevents unintended side effects from joins, merges, etc.
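A minimal sketch of that list-based approach, using base R only. The list `results`, its element names `m1` and `m2`, and the value column `est` are hypothetical stand-ins for the per-model parameter tables, not names from the original data:

```r
# Hypothetical stand-ins for the per-model parameter tables.
results <- list(
  m1 = data.frame(header   = c("A.ON", "Intercepts"),
                  variable = c("X1", "A"),
                  est      = c(0.10, 0.50)),
  m2 = data.frame(header   = c("A.ON", "Intercepts", "Residual.Variances"),
                  variable = c("X1", "A", "A"),
                  est      = c(0.20, 0.60, 0.90))
)

keys <- c("header", "variable")

# Rename the value column after the list element so the columns do not clash.
renamed <- lapply(names(results), function(nm) {
  df <- results[[nm]]
  names(df)[names(df) == "est"] <- nm
  df
})

# Full outer join of all tables on the key columns; missing results become NA.
combined <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), renamed)
print(combined)
```

Keeping the tables in a list means each one can be inspected (for example with `sapply(results, nrow)`) before anything is joined.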
Why does using the merge function in R create duplicates?
We can get only the unique rows of DF1 and DF2 and then merge.
DF <- merge(unique(DF1), unique(DF2), by = c("Date", "Time"), all.x= TRUE)
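Note that unique() drops only rows that are identical in every column. The sketch below, on made-up data, shows a case where the key still repeats after unique(), and one possible way to keep a single row per key. Keeping the first row per key is an assumption here; which row is the right one depends on the data:

```r
# Hypothetical data: DF2 has two different rows for the same Date/Time key.
DF1 <- data.frame(Date = c("2020-01-01", "2020-01-02"),
                  Time = c("10:00", "10:00"),
                  a    = 1:2)
DF2 <- data.frame(Date = c("2020-01-01", "2020-01-01", "2020-01-02"),
                  Time = c("10:00", "10:00", "10:00"),
                  b    = c(5, 6, 7))

keys <- c("Date", "Time")

# unique() does not help: no row of DF2 is an exact copy of another.
rows_after_unique <- nrow(unique(DF2))

# The key columns themselves repeat, which is what multiplies rows in merge().
key_repeats <- anyDuplicated(DF2[keys]) > 0

# Keep the first row per key (assumption), then merge.
DF2_one_per_key <- DF2[!duplicated(DF2[keys]), ]
DF <- merge(DF1, DF2_one_per_key, by = keys, all.x = TRUE)
print(DF)
```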
Merge values from one data frame onto another without creating duplicates in R
If df2 has duplicates, we can use unique() to get rid of them, i.e.:
df2_clean <- unique(df2)
library(dplyr)
df1_and_df2 <- df1 %>% left_join(df2_clean)
Explanation for what caused the original problem:
If we join two data sets x and y where the common column is not unique in both of them, the join will combine each observation in x with every observation in y that has the same key value, leading to many duplicated rows.
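A small illustration of that effect with base merge(); the data frames and the column names `id`, `vx` and `vy` are invented for the example:

```r
# Hypothetical data: id 1 appears twice in x and three times in y.
x <- data.frame(id = c(1, 1, 2), vx = c("a", "b", "c"))
y <- data.frame(id = c(1, 1, 1, 2), vy = c("p", "q", "r", "s"))

joined <- merge(x, y, by = "id")

# Each x row with id 1 is paired with every y row with id 1:
# 2 * 3 rows for id 1 plus 1 * 1 row for id 2.
n_joined <- nrow(joined)
print(joined)
```

Comparing `nrow(joined)` with `nrow(x)` before and after a join is a quick way to notice that a key was not unique.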
How to eliminate duplicate rows in R when using the merge function
Use data.frame(FL_ratio, time).
The merge(...) function is not meant for this. Since time and FL_ratio are vectors, merge(FL_ratio, time) will produce a cross-product: for each element of FL_ratio there will be rows for all the values of time. This is why you're getting 10,816 rows. You can see this below:
x <- 1:3
y <- 4:6
merge(x,y)
## x y
## 1 1 4
## 2 2 4
## 3 3 4
## 4 1 5
## 5 2 5
## 6 3 5
## 7 1 6
## 8 2 6
## 9 3 6
data.frame(x,y)
## x y
## 1 1 4
## 2 2 5
## 3 3 6