Merging Rows with the Same Id Variable

R: Combine rows with same ID

Here we first group by every column except the Var variables, then use summarise(across(...)), as suggested by @Limey in the comments section.
The key is na.rm = TRUE, which makes sum() skip missing values:

library(dplyr)

df %>%
  group_by(ID, Date, N_Date, type) %>%
  summarise(across(starts_with("Var"), ~sum(., na.rm = TRUE)))

     ID Date   N_Date type    Var1  Var2  Var3  Var4
  <int> <chr>   <int> <chr>  <int> <int> <int> <int>
1     1 4.7.22  50000 normal    12    23     5    54
2     2 4.7.22   4000 normal     0     2     0     0
3     3 5.7.22  20000 normal     7     0     0     0
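
For readers working in pandas rather than dplyr, the same idea can be sketched roughly as follows. The sample data is hypothetical (it is not the asker's data frame), and the sketch relies on GroupBy.sum skipping NaN by default, which plays the role of na.rm = TRUE:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two partial rows for ID 1, one row each for IDs 2 and 3.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3],
    "Date": ["4.7.22", "4.7.22", "4.7.22", "5.7.22"],
    "N_Date": [50000, 50000, 4000, 20000],
    "type": ["normal"] * 4,
    "Var1": [12, np.nan, np.nan, 7],
    "Var2": [np.nan, 23, 2, np.nan],
})

key_cols = ["ID", "Date", "N_Date", "type"]
var_cols = [c for c in df.columns if c.startswith("Var")]

# GroupBy.sum ignores NaN, so a group with only missing values sums to 0.
merged = df.groupby(key_cols, as_index=False)[var_cols].sum()
print(merged)
```

As in the R output above, groups with no observed value for a variable end up with 0 rather than a missing value.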

Merge rows with the same ID but with overlapping variables

I'm not sure this is exactly what you want, but to combine rows of a data frame based on multiple conditions you can use the dplyr package and its summarise() function. I generated some data that can be used in R directly; you will have to adapt the code to your needs.

library(dplyr)

# generate data: 20 IDs, each appearing twice
ID <- rep(1:20, 2)
visitors <- sample(1:50, 40, replace = TRUE)
impact <- sample(rep(c("a", "b", "c", "d", "e"), 8))
arrival <- sample(rep(8:15, 5))
departure <- sample(rep(16:23, 5))

df <- data.frame(ID, visitors, impact, arrival, departure)
df$impact <- as.character(df$impact)

# summarise rows with identical ID, using a different rule per column
df_summary <- df %>%
  group_by(ID) %>%
  summarise(visitors = max(visitors), arrival = min(arrival),
            departure = max(departure), impact = paste0(impact, collapse = ", "))
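
For comparison, a rough pandas counterpart of the same per-column rules is sketched below, using named aggregation. The small data frame is made up for illustration and only the column names follow the R example:

```python
import pandas as pd

# Hypothetical data: each ID appears twice.
df = pd.DataFrame({
    "ID": [1, 2, 1, 2],
    "visitors": [10, 40, 25, 5],
    "impact": ["a", "b", "c", "d"],
    "arrival": [9, 12, 8, 14],
    "departure": [18, 20, 22, 17],
})

# One aggregation rule per output column: (source column, function).
df_summary = df.groupby("ID", as_index=False).agg(
    visitors=("visitors", "max"),
    arrival=("arrival", "min"),
    departure=("departure", "max"),
    impact=("impact", ", ".join),
)
print(df_summary)
```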

Hope this helps!

Pandas Merge and Complete rows with same id

If there is only one non-empty value per group, use:

import numpy as np

df = df.replace('',np.nan).groupby('ID', as_index=False).first().fillna('')
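
A minimal, self-contained sketch of that one-liner is shown here. The data is hypothetical and reuses only the ID, LU and MA column names. Empty strings are turned into NaN so that GroupBy.first can pick the first non-missing value in each column:

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID 201 is split over two rows with complementary gaps.
df = pd.DataFrame({
    "ID": [201, 201, 202],
    "LU": ["B", "", "A"],
    "MA": ["", "C", ""],
})

merged = (
    df.replace("", np.nan)          # empty strings become missing values
      .groupby("ID", as_index=False)
      .first()                      # first non-missing value per column
      .fillna("")                   # restore empty strings where nothing was found
)
print(merged)
```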

If a group can contain multiple values and you need the unique values in their original order, use a lambda function:

print (df)
ID LU MA ME JE VE SA DI
0 201 B C B
1 201 C C C B C


f = lambda x: ','.join(dict.fromkeys(x.dropna()).keys())
df = df.replace('',np.nan).groupby('ID', as_index=False).agg(f)
print (df)
ID LU MA ME JE VE SA DI
0 201 B,C C C B C
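
The lambda can also be written as a named function, which may be easier to read. The sketch below uses made-up data and a hypothetical helper name, join_unique. It relies on dict.fromkeys keeping first-seen order while dropping duplicates:

```python
import numpy as np
import pandas as pd

# Hypothetical data: three rows for the same ID with repeated values.
df = pd.DataFrame({
    "ID": [201, 201, 201],
    "LU": ["B", "C", "B"],
    "MA": ["", "C", "C"],
})

def join_unique(values):
    # Drop missing values, keep each remaining value once in first-seen order.
    return ",".join(dict.fromkeys(values.dropna()))

merged = df.replace("", np.nan).groupby("ID", as_index=False).agg(join_unique)
print(merged)
```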

Merging rows in a dataframe R with duplicate id's

You could use summarize_all, grouped by person_id. For each column, this keeps the first value per person_id that is not NA.
I added a pivot_wider to preserve the different test dates (as pointed out by @Andrea M).

library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr

df1 <- df %>%
  group_by(person_id) %>%
  mutate(id = seq_along(person_id)) %>%
  pivot_wider(names_from = id,
              values_from = test_date,
              names_prefix = "test_date") %>%
  summarize_all(list(~ .[!is.na(.)][1]))

Output

> df1
# A tibble: 2 x 9
  person_id serial_number freezer_number test_1   test_2   test_3 test_4 test_date1 test_date2
  <chr>     <chr>         <chr>          <chr>    <chr>    <lgl>  <lgl>  <chr>      <chr>
1 x         c             d              positive positive NA     NA     01/01/2010 05/01/2010
2 y         e             f              positive NA       NA     NA     02/02/2020 NA

How to merge rows from table based on a common ID? SAS EG

You will have to add some actual SAS code into your Enterprise Guide project to do that.

Create a new variable and use the CATX() function to build the string, with BY-group processing so that one row is written per id1.

data want;
  do until (last.id1);
    set QUERY_FOR_TABLE1 ;
    by id1 ;
    length text $200;
    text=catx(',',text,text1,text2);
  end;
  keep id1 text;
run;

Pandas | merge rows with same id

Use:

  • DataFrame.groupby - Group DataFrame or Series using a mapper or by a Series of columns.
  • GroupBy.last - Compute last of group values.
  • DataFrame.replace - Replace values given in to_replace with value.

Example:

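import numpy as np

# Treat empty or whitespace-only strings as missing so that last() skips them
df = df.replace(r'^\s*$', np.nan, regex=True)
df1 = df.groupby('id',as_index=False,sort=False).last()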
print(df1)

id firstname lastname email updatedate
0 A1 wendy smith smith@mail.com 2019-02-03
1 A2 harry lynn harylynn@mail.com 2019-03-12
2 A3 tinna dickey tinna@mail.com 2013-06-12
3 A4 Tom Lee Tom@mail.com 2012-06-12
4 A5 Ella NaN Ella@mail.com 2019-07-12
5 A6 Ben Lang Ben@mail.com 2019-03-12
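
To make the behaviour concrete, here is a small self-contained sketch with made-up rows. Only the id, firstname and email column names come from the example above. It assumes blank cells are empty or whitespace-only strings, and shows that GroupBy.last takes the last non-missing value of each column independently, while sort=False keeps ids in order of first appearance:

```python
import numpy as np
import pandas as pd

# Hypothetical data: later rows fill in or overwrite earlier ones per id.
df = pd.DataFrame({
    "id": ["A2", "A1", "A2", "A1"],
    "firstname": ["harry", "wendy", "", "wendy"],
    "email": ["old@mail.com", "", "harylynn@mail.com", "smith@mail.com"],
})

# Empty or whitespace-only strings become NaN so they cannot win as "last".
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

# last() picks the last non-missing value per column within each id.
merged = cleaned.groupby("id", as_index=False, sort=False).last()
print(merged)
```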

MYSQL how to merge rows with same field id into a single row

GROUP_CONCAT supports DISTINCT and SEPARATOR:

CREATE TABLE table1 (
`rowid` VARCHAR(139),
`title` VARCHAR(139),
`author_f_name` VARCHAR(139),
`author_m_name` VARCHAR(139),
`author_l_name` VARCHAR(139),
`coauthor_first_name` VARCHAR(139),
`coauthor_middle_name` VARCHAR(139),
`coauthor_last_name` VARCHAR(139)
);

INSERT INTO table1
(`rowid`, `title`, `author_f_name`, `author_m_name`, `author_l_name`, `coauthor_first_name`, `coauthor_middle_name`, `coauthor_last_name`)
VALUES
('1.', 'Blog Title.', 'Roy', NULL, 'Thomas.', 'Joe', 'Shann', 'Mathews'),
('1.', 'Blog Title.', 'Thomas', 'NULL', 'Edison', 'Kunal', NULL, 'Shar');

SELECT 
`rowid`
, GROUP_CONCAT(DISTINCT `title` SEPARATOR ' |||') title
, GROUP_CONCAT(DISTINCT `author_f_name` SEPARATOR ' |||') author_f_name
, GROUP_CONCAT(DISTINCT `author_m_name` SEPARATOR ' |||') author_m_name
, GROUP_CONCAT(DISTINCT `author_l_name` SEPARATOR ' |||') author_l_name
, GROUP_CONCAT(DISTINCT `coauthor_first_name` SEPARATOR ' |||') coauthor_first_name
, GROUP_CONCAT(DISTINCT `coauthor_middle_name` SEPARATOR ' |||') coauthor_middle_name
, GROUP_CONCAT(DISTINCT `coauthor_last_name` SEPARATOR ' |||') coauthor_last_name
FROM table1
GROUP BY `rowid`;

rowid | title | author_f_name | author_m_name | author_l_name | coauthor_first_name | coauthor_middle_name | coauthor_last_name
:---- | :---------- | :------------ | :------------ | :---------------- | :------------------ | :------------------- | :-----------------
1. | Blog Title. | Roy |||Thomas | NULL | Edison |||Thomas. | Joe |||Kunal | Shann | Mathews |||Shar

SELECT 
`rowid`
, GROUP_CONCAT(DISTINCT `title` SEPARATOR ' |||') title
, GROUP_CONCAT(DISTINCT CONCAT(`author_f_name`,' ',COALESCE(`author_m_name`,''),' ',`author_l_name`) SEPARATOR ' |||') author_full_name
, GROUP_CONCAT(DISTINCT CONCAT(`coauthor_first_name`,' ',COALESCE(`coauthor_middle_name`,''),' ',`coauthor_last_name`) SEPARATOR ' |||') coauthor_full_name
FROM table1
GROUP BY `rowid`;

rowid | title | author_full_name | coauthor_full_name
:---- | :---------- | :--------------------------------- | :-------------------------------
1. | Blog Title. | Roy Thomas. |||Thomas NULL Edison | Joe Shann Mathews |||Kunal Shar
