Combine multiple Data Frames with WHILE loop
You could create a list with all the dataframes and then concatenate them.
Before the while loop, create an empty list to hold the dataframes:
list_of_dfs = []
Then, just before the index += 1 line, append final_list to the list of dataframes:
list_of_dfs.append(final_list)
You probably don't want to append final_list to itself with final_list.append(final_list).
Finally, you could concatenate everything in one call (note that the keyword is axis, not index):
my_df_of_concern = pd.concat(list_of_dfs, axis=0)
See https://pandas.pydata.org/docs/reference/api/pandas.concat.html
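Put together, the pattern above might look like the following minimal sketch. The build_chunk helper and the loop bound are hypothetical stand-ins for whatever produces final_list in the original loop; only the list, the append, and the single pd.concat come from the answer.

```python
import pandas as pd


def build_chunk(index):
    # Hypothetical stand-in for the work that produces final_list on each pass.
    return pd.DataFrame({"index": [index, index],
                         "value": [index * 10, index * 10 + 1]})


list_of_dfs = []  # created once, before the loop
index = 0
while index < 3:  # assumed loop condition, for illustration only
    final_list = build_chunk(index)
    list_of_dfs.append(final_list)  # collect this pass's dataframe
    index += 1

# One concat at the end; ignore_index=True renumbers the rows 0..n-1.
my_df_of_concern = pd.concat(list_of_dfs, axis=0, ignore_index=True)
print(my_df_of_concern)
```

Collecting the pieces in a list and concatenating once avoids copying the growing dataframe on every iteration.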
Merge several data.frames into one data.frame with a loop
You may want to look at the closely related question on Stack Overflow.
I would approach this in two steps: import all the data (with plyr), then merge it together:
filenames <- list.files(path=".../tempDataFolder/", full.names=TRUE)
library(plyr)
import.list <- llply(filenames, read.csv)
That will give you a list with one data frame per file, which you now need to merge together. There are many ways to do this, but here's one approach (with Reduce):
data <- Reduce(function(x, y) merge(x, y, all=TRUE,
    by=c("COUNTRYNAME", "COUNTRYCODE", "Year")), import.list, accumulate=FALSE)
Alternatively, you can do this with the reshape package if you aren't comfortable with Reduce:
library(reshape)
data <- merge_recurse(import.list)
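For readers doing the same thing in pandas, the fold-and-merge idea behind Reduce can be sketched with functools.reduce. This is only an illustration: the three small frames below are made-up stand-ins for the imported files, and how="outer" is used as the rough counterpart of all=TRUE.

```python
from functools import reduce

import pandas as pd

keys = ["COUNTRYNAME", "COUNTRYCODE", "Year"]

# Hypothetical stand-ins for the imported files; each shares the key columns.
import_list = [
    pd.DataFrame({"COUNTRYNAME": ["Chile", "Peru"], "COUNTRYCODE": ["CHL", "PER"],
                  "Year": [2000, 2000], "gdp": [1.0, 2.0]}),
    pd.DataFrame({"COUNTRYNAME": ["Chile", "Peru"], "COUNTRYCODE": ["CHL", "PER"],
                  "Year": [2000, 2000], "pop": [15.0, 26.0]}),
    pd.DataFrame({"COUNTRYNAME": ["Chile"], "COUNTRYCODE": ["CHL"],
                  "Year": [2000], "area": [756.0]}),
]

# Fold the list pairwise: merge the running result with the next frame.
data = reduce(lambda x, y: pd.merge(x, y, how="outer", on=keys), import_list)
print(data)
```

Rows missing from one of the inputs are kept and filled with NaN, which is what an outer merge is expected to do.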
How to create a for loop for combining several data frames and df subsets into one data frame?
You can define a function that sums all numeric columns of a data.frame, leaves the other columns as NA, and appends the result (followed by an all-NA separator row) to the original data frame:
numericCols = sapply(iris,is.numeric)
func = function(df,numCols){
  iris_sums <- colSums(df[,numCols])
  result <- rep(NA,ncol(df))
  names(result) <- colnames(df)
  result[names(iris_sums)] <- iris_sums
  rbind(df,result,rep(NA,ncol(df)))
}
Then we use purrr to map the function over each subset:
library(purrr)
split(iris,iris$Species) %>% map_dfr(func,numCols=numericCols)
How to merge for loop output dataframes into one with python?
A vectorized (read "much faster") solution:
a = np.array(dfa['A'].str.split('').str[1:-1].tolist())
b = np.array(dfb['B'].str.split('').str[1:-1].tolist())
dfb[['disB_1', 'disB_2', 'disB_3']] = (a != b[:, None]).sum(axis=2)
Output:
>>> dfb
    B  disB_1  disB_2  disB_3
0  AC       1       2       1
1  BC       2       1       1
2  CC       2       2       0
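To see why the broadcasting works, here is a small self-contained sketch. The contents of dfa are not shown in the answer, so the values below are an assumption chosen for illustration; dfb matches the output above. The strings are split into characters with a list comprehension instead of str.split(''), which is a substitution made here for clarity.

```python
import numpy as np
import pandas as pd

# dfb matches the output shown; dfa is assumed for illustration.
dfa = pd.DataFrame({"A": ["AB", "BA", "CC"]})
dfb = pd.DataFrame({"B": ["AC", "BC", "CC"]})

# One row per string, one column per character.
a = np.array([list(s) for s in dfa["A"]])  # shape (3, 2)
b = np.array([list(s) for s in dfb["B"]])  # shape (3, 2)

# b[:, None] has shape (3, 1, 2); comparing it with a broadcasts to (3, 3, 2),
# i.e. every string in dfb against every string in dfa, character by character.
mismatches = a != b[:, None]

# Summing over the character axis counts differing positions per pair.
dist = mismatches.sum(axis=2)
dfb[["disB_1", "disB_2", "disB_3"]] = dist
print(dfb)
```

Entry dist[j, i] is the number of positions at which row j of dfb differs from row i of dfa, so no explicit Python loop over pairs is needed.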
Loop for merging multiple dataframes from list of dataframes in R
I'm not a fan of how this ends up with multiple columns with the same name, but that's what you wanted.
You aren't really asking for a merge because that would give 3 x 3 = 9 rows, so I used cbind.
(I changed the name of the list of data.frames to df_list to avoid confusion)
df_list <- list(
  data.frame(ID = 1, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y')),
  data.frame(ID = 1, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y')),
  data.frame(ID = 2, b = c('x', 'y', 'z'), c = c('y', 'z', 'x'), d = c('z', 'x', 'y'))
)
for (i in 1:(length(df_list) - 1)) {
  if (NROW(df_list[[i]]) == NROW(df_list[[i + 1]]) &&
      all(df_list[[i]]$ID == df_list[[i + 1]]$ID)) {
    df_list[[i]] <- cbind(df_list[[i]], df_list[[i + 1]][, -1])
    df_list[[i + 1]] <- list()
  }
}
df_list <- df_list[!sapply(df_list, function(x) NROW(x) == 0)]
df_list
[[1]]
  ID b c d b c d
1  1 x y z x y z
2  1 y z x y z x
3  1 z x y z x y

[[2]]
  ID b c d
1  2 x y z
2  2 y z x
3  2 z x y
Iterating over a merge of multiple dataframes
Solution 1:
Use this if the 'value' column is only in df1 and df2, not in df_master.
dfcon = pd.concat([df1, df2])
df = pd.merge(df_master, dfcon, how='left', on='CAS')
Solution 2:
Use this if the 'value' column is also in df_master (dfcon is built as in Solution 1).
df_master_drop = df_master.drop(columns=['value'])
df_drop = pd.merge(df_master_drop, dfcon, how='left', on='CAS')
df = df_master.combine_first(df_drop)
Notes:
Use dfcon = pd.concat([df1, df2]).drop_duplicates('CAS') if there are duplicates. This keeps the first row for each CAS value.
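Both solutions can be tried on a small made-up example. The CAS numbers and values below are sample data invented for illustration, and master_no_value is a hypothetical name for a master table without a 'value' column.

```python
import pandas as pd

# Sample data (assumed): df_master already knows one value, df1/df2 supply others.
df_master = pd.DataFrame({"CAS": ["50-00-0", "64-17-5", "67-64-1"],
                          "value": [float("nan"), 20.0, float("nan")]})
df1 = pd.DataFrame({"CAS": ["50-00-0"], "value": [10.0]})
df2 = pd.DataFrame({"CAS": ["67-64-1", "50-00-0"], "value": [30.0, 99.0]})

# drop_duplicates keeps the first row per CAS, so 50-00-0 keeps 10.0 from df1.
dfcon = pd.concat([df1, df2]).drop_duplicates("CAS")

# Solution 1: the master table has no 'value' column of its own.
master_no_value = df_master[["CAS"]]
sol1 = pd.merge(master_no_value, dfcon, how="left", on="CAS")

# Solution 2: the master table has 'value'; fill only its missing entries.
df_master_drop = df_master.drop(columns=["value"])
df_drop = pd.merge(df_master_drop, dfcon, how="left", on="CAS")
sol2 = df_master.combine_first(df_drop)

print(sol1)
print(sol2)
```

combine_first aligns on the index, so this relies on df_master and the left-merge result both having the default 0..n-1 index in the same row order.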
Merging multiple dataframe columns into one using for loop
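This happens because every merge result is saved to df_merge, overwriting the previous one. df_merge therefore only ever holds the latest merge, not the accumulation of all merged dataframes.
I would suggest initializing df_merge first and then merging each remaining dataframe into it, like this: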
dfs = [df2, df3, df4, df5, df6]
df_merge = df1
for i in dfs:
    df_merge = pd.merge(df_merge, i, how='left', on='Date')
print("Shape of df_merge = ", df_merge.shape)
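A runnable version of the same loop, using three tiny made-up dataframes in place of df1 to df6 (the column names a, b, c and the dates are assumptions for illustration):

```python
import pandas as pd

dates = pd.to_datetime(["2021-01-01", "2021-01-02"])
df1 = pd.DataFrame({"Date": dates, "a": [1, 2]})
df2 = pd.DataFrame({"Date": dates, "b": [3, 4]})
df3 = pd.DataFrame({"Date": dates[:1], "c": [5]})  # only covers the first date

df_merge = df1  # start from the first dataframe
for other in [df2, df3]:
    # Each pass merges into the accumulated result, not into df1 again.
    df_merge = pd.merge(df_merge, other, how="left", on="Date")

print("Shape of df_merge =", df_merge.shape)
```

Because the merges are left joins, df_merge keeps every row of df1, and dates missing from a later dataframe show up as NaN in that dataframe's columns.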
Merging multiple data frames in a loop
You can try something like this:
Create the 'x' column with all NA values in your first data.frame:
df[,"x"] <- NA
Use your ID column to name the rows of your first data.frame:
rownames(df) <- df$ID
Then use these row names to fill the 'x' column only in the rows that appear in each of your other datasets:
df[df1$ID, "x"] <- df1$x
df[df2$ID, "x"] <- df2$x
Rows whose ID appears in none of the other datasets keep NA in the 'x' column, as in your example. Note that this requires the ID values to be unique, since row names cannot be duplicated.
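The same label-based fill can be sketched in pandas by using the ID column as the index. The frames below are made-up examples, and this is only one possible translation of the row-name approach above, not the original answer's code.

```python
import pandas as pd

# Made-up example data: df is the main table, df1 and df2 carry some 'x' values.
df = pd.DataFrame({"ID": ["a", "b", "c", "d"]})
df1 = pd.DataFrame({"ID": ["a", "c"], "x": [1.0, 3.0]})
df2 = pd.DataFrame({"ID": ["d"], "x": [4.0]})

df["x"] = float("nan")               # start with an all-missing 'x' column
df = df.set_index("ID", drop=False)  # use ID as row labels, keep the column too

for other in (df1, df2):
    # Assign by label: only rows whose ID appears in `other` are touched.
    df.loc[other["ID"].tolist(), "x"] = other["x"].to_numpy()

print(df)
```

As with row names in R, this assumes the ID values in the main table are unique.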