Merge Multiple Rows with Same Id into One Row

Pandas | merge rows with same id

Use

  • DataFrame.groupby - Group DataFrame or Series using a mapper or by a Series of columns.
  • .groupby.GroupBy.last - Compute last of group values.
  • DataFrame.replace - Replace values given in to_replace with value.

Ex.

df = df.replace('',np.nan, regex=True)
df1 = df.groupby('id',as_index=False,sort=False).last()
print(df1)

id firstname lastname email updatedate
0 A1 wendy smith smith@mail.com 2019-02-03
1 A2 harry lynn harylynn@mail.com 2019-03-12
2 A3 tinna dickey tinna@mail.com 2013-06-12
3 A4 Tom Lee Tom@mail.com 2012-06-12
4 A5 Ella NaN Ella@mail.com 2019-07-12
5 A6 Ben Lang Ben@mail.com 2019-03-12

How to concatenate many rows with same id in sql?

In SQL-Server you can do it in the following:

QUERY

SELECT id, displayname = 
STUFF((SELECT DISTINCT ', ' + displayname
FROM #t b
WHERE b.id = a.id
FOR XML PATH('')), 1, 2, '')
FROM #t a
GROUP BY id

TEST DATA

create table #t 
(
id int,
displayname nvarchar(max)
)

insert into #t values
(1 ,'Editor')
,(1 ,'Reviewer')
,(7 ,'EIC')
,(7 ,'Editor')
,(7 ,'Reviewer')
,(7 ,'Editor')
,(19,'EIC')
,(19,'Editor')
,(19,'Reviewer')

OUTPUT

id  displayname
1 Editor, Reviewer
7 Editor, EIC, Reviewer
19 Editor, EIC, Reviewer

MYSQL how to merge rows with same field id into a single row

GROUP_CONCAT supports DISTINCT and SEPARATOR``

CREATE TABLE table1 (
`rowid` VARCHAR(139),
`title` VARCHAR(139),
`author_f_name` VARCHAR(139),
`author_m_name` VARCHAR(139),
`author_l_name` VARCHAR(139),
`coauthor_first_name` VARCHAR(139),
`coauthor_middle_name` VARCHAR(139),
`coauthor_last_name` VARCHAR(139)
);

INSERT INTO table1
(`rowid`, `title`, `author_f_name`, `author_m_name`, `author_l_name`, `coauthor_first_name`, `coauthor_middle_name`, `coauthor_last_name`)
VALUES
('1.', 'Blog Title.', 'Roy', NULL, 'Thomas.', 'Joe', 'Shann', 'Mathews'),
('1.', 'Blog Title.', 'Thomas', 'NULL', 'Edison', 'Kunal', NULL, 'Shar');
SELECT 
`rowid`
, GROUP_CONCAT(DISTINCT `title` SEPARATOR ' |||') tilte
, GROUP_CONCAT(DISTINCT `author_f_name` SEPARATOR ' |||') author_f_name
, GROUP_CONCAT(DISTINCT `author_m_name` SEPARATOR ' |||') author_m_name
, GROUP_CONCAT(DISTINCT `author_l_name` SEPARATOR ' |||') author_l_name
, GROUP_CONCAT(DISTINCT `coauthor_first_name` SEPARATOR ' |||') coauthor_first_name
, GROUP_CONCAT(DISTINCT `coauthor_middle_name` SEPARATOR ' |||') coauthor_middle_name
, GROUP_CONCAT(DISTINCT `coauthor_last_name` SEPARATOR ' |||') coauthor_last_name
FROM table1
GROUP BY `rowid`

rowid | tilte | author_f_name | author_m_name | author_l_name | coauthor_first_name | coauthor_middle_name | coauthor_last_name
:---- | :---------- | :------------ | :------------ | :---------------- | :------------------ | :------------------- | :-----------------
1. | Blog Title. | Roy |||Thomas | NULL | Edison |||Thomas. | Joe |||Kunal | Shann | Mathews |||Shar

db<>fiddle here

SELECT 
`rowid`
, GROUP_CONCAT(DISTINCT `title` SEPARATOR ' |||') tilte
, GROUP_CONCAT(DISTINCT CONCAT(`author_f_name`,' ',COALESCE(`author_m_name`,''),' ',`author_l_name`) SEPARATOR ' |||') author_full_name
, GROUP_CONCAT(DISTINCT CONCAT(`coauthor_first_name`,' ',COALESCE(`coauthor_middle_name`,''),' ',`coauthor_last_name`) SEPARATOR ' |||') coauthor_full_name
FROM table1
GROUP BY `rowid`

rowid | tilte | author_full_name | coauthor_full_name
:---- | :---------- | :--------------------------------- | :-------------------------------
1. | Blog Title. | Roy Thomas. |||Thomas NULL Edison | Joe Shann Mathews |||Kunal Shar

db<>fiddle here

Same ID with multiple records to single row

If you just want to view the output suggested in table B using table A, then use a pivot query:

SELECT
source,
ID,
1 AS Type_1_ID
MAX(CASE WHEN Type_ID = 1 THEN Error_info END) AS Error_info_1,
2 AS Type_2_ID,
MAX(CASE WHEN Type_ID = 2 THEN Error_info END) AS Error_info_2
FROM yourTable
GROUP BY
source,
ID;

R: Combine rows with same ID

Something like this:
Here we first group for all except the Var variables, then we use summarise(across... as suggested by @Limey in the comments section.
Main feature is to use na.rm=TRUE:

library(dplyr)

df %>%
group_by(ID, Date, N_Date, type) %>%
summarise(across(starts_with("Var"), ~sum(., na.rm = TRUE)))
     ID Date   N_Date type    Var1  Var2  Var3  Var4
<int> <chr> <int> <chr> <int> <int> <int> <int>
1 1 4.7.22 50000 normal 12 23 5 54
2 2 4.7.22 4000 normal 0 2 0 0
3 3 5.7.22 20000 normal 7 0 0 0

Combine two rows in a single row with the same ID name

A simple aggregation should do the trick

Example

Select ID
,ID1 = max(ID1)
,ID2 = max(ID2)
,ID3 = max(ID3)
From YourTable
Group By ID

Merge multiple rows (with same ID) into one?

I believe this will work (I tested it on an Oracle 11g, not a Teradata):

SELECT *
FROM (SELECT id_nbr AS ID,
contact_type AS contype,
contact_first_name AS firstName,
contact_last_name AS lastName,
contact_phone_number AS phoneNumber,
contact_address AS address,
contact_email AS email
FROM database.account_info
WHERE contact_type in ('AAA', 'BBB', 'CCC')
)
PIVOT (max(firstName) AS firstName,
max(lastName) AS lastName,
max(phoneNumber) AS phone,
max(email) AS email,
max(address) AS address
FOR contype IN ('AAA', 'BBB', 'CCC')
)

The last line may require an alias, one of:

        ) AS derived_pivot
) derived_pivot


Related Topics



Leave a reply



Submit