Duplicate Rows When Joining Tables

Why do multiple-table joins produce duplicate rows?

If one of the tables M, S, D, or H has more than one row for a given Id (if just the Id column is not the Primary Key), then the query would result in "duplicate" rows. If you have more than one row for an Id in a table, then the other columns, which would uniquely identify a row, also must be included in the JOIN condition(s).

References:

Related Question on MSDN Forum

SQL JOIN two tables and duplicate rows appear

Your not joining each row to just one row.

Table1         Table2

a | b | c x | y | z
---+---+--- ---+---+---
1 | 2 | 3 1 | 2 | 6
2 | 2 | 4 2 | 2 | 7
3 | 2 | 5 3 | 2 | 8

To join these tables without duplicating you would use:

SELECT
*
FROM
Table1
INNER JOIN
Table2
ON Table1.a = Table2.x
AND Table1.b = Table2.y

If, however, I used the following, I'd get 9 rows...

SELECT
*
FROM
Table1
INNER JOIN
Table2
ON Table1.a = Table2.x

This is because each row in Table1 would join to all three rows in Table2

 a | b | c | x | y | z
---+---+---+---+---+---

1 | 2 | 3 | 1 | 2 | 6
1 | 2 | 3 | 2 | 2 | 7
1 | 2 | 3 | 3 | 2 | 8

2 | 2 | 4 | 1 | 2 | 6
2 | 2 | 4 | 2 | 2 | 7
2 | 2 | 4 | 3 | 2 | 8

3 | 2 | 5 | 1 | 2 | 6
3 | 2 | 5 | 2 | 2 | 7
3 | 2 | 5 | 3 | 2 | 8

It doesn't matter if I include any of Table2's columns in the SELECT, it's the JOIN that's the issue.

SELECT
Table1.a,
Table1.b,
Table1.c
FROM
Table1
INNER JOIN
Table2
ON Table1.a = Table2.x

Will give...

 a | b | c
---+---+---

1 | 2 | 3
1 | 2 | 3
1 | 2 | 3

2 | 2 | 4
2 | 2 | 4
2 | 2 | 4

3 | 2 | 5
3 | 2 | 5
3 | 2 | 5


A similar case is when you need to aggregate the data before you join the data.

Table1            Table2

Month | Val Month | Day | Sales
-------+----- -------+-----+-------
1 | 'X' 1 | 1 | 6
2 | 'Y' 1 | 2 | 7
3 | 'Z' 1 | 3 | 6

2 | 1 | 4
2 | 2 | 5
2 | 3 | 6

3 | 1 | 7
3 | 2 | 5
3 | 3 | 3

To join these table without duplication, Table2 needs aggregating to the same granularity as Table1.

SELECT
*
FROM
Table1
INNER JOIN
(
SELECT
Month,
SUM(sales) AS sales
FROM
Table2
GROUP BY
month
)
AS Table2Monthly
ON Table2Monthly.month = Table1.month

Joining multiple tables results in duplicate rows

You May Try either of the Below

SELECT
C.*,
Query = CASE WHEN PVT.Query IS NOT NULL THEN 'Yes' ELSE 'No' END,
Feedback = CASE WHEN PVT.Feedback IS NOT NULL THEN 'Yes' ELSE 'No' END
FROM Customer C
LEFT JOIN FeedBack
PIVOT
(
MAX(Id)
FOR
[Type] IN
(
[Query],[Feedback]
)
)Pvt
ON PVT.CustomerId = c.CustomerId

or Simply

SELECT
C.*,
Query = CASE WHEN EXISTS(SELECT 1 FROM FeedBack WHERE CustomerId = C.CustomerId and [Type]='Query') THEN 'Yes' ELSE 'No' END,
Feedback = CASE WHEN EXISTS(SELECT 1 FROM FeedBack WHERE CustomerId = C.CustomerId and [Type]='Feedback') THEN 'Yes' ELSE 'No' END
FROM Customer C

To Make it more Dynamic ou may Try this

DECLARE @SQL VARCHAR(MAX)

;WITH CTE
AS
(
SELECT
RN = ROW_NUMBER() OVER(PARTITION BY [Type] ORDER BY [Type]),
QRY = LTRIM(RTRIM([Type]))+' = CASE WHEN EXISTS(SELECT 1 FROM FeedBack WHERE CustomerId = C.CustomerId and [Type]='''+LTRIM(RTRIM([Type]))+''') THEN ''Yes'' ELSE ''No'' END'
FROM FeedBack
)
SELECT
@SQL = 'SELECT
C.*'
+SUBSTRING(','+L.List,1,LEN(L.List)-1)
+' FROM Customer C'
FROM
(
SELECT
QRY + ', ' [text()]
FROM CTE
WHERE RN = 1
FOR XML PATH('')
)L(List)

EXEC(@SQL)

Please Refer this Sqlfiddle for detailed example

JOIN results give duplicate rows - SQL Server

Just add group by clause with sum aggregation function

SELECT I.InvoiceNumber, I.AmountPaid,
sum(O.Amount) Amount
FROM OrderInvoice O
JOIN Invoice I
ON O.InvoiceNumber = I.InvoiceNumber group by I.InvoiceNumber, I.AmountPaid

Result :

InvoiceNumber   AmountPaid  Amount
123 10 10
567 10 10

INNER JOIN Duplicate Records

I don't understand why you are using GROUP BY u.username, it doesn't quite make sense here.

However, you will get some duplicate records if you remove it - and that's because you INNER JOIN on just one of the fields, while I'm assuming it should be on both the referrer and the "upliner". If you join on just the one, it will match all the "upliner" of 1122 for every user. If you also join on the users code, you should be getting the result you are looking for.

SELECT r.upliner,
r.refer,
r.date_time,
u.username,
u.name,
u.sponsor
FROM referral_details AS r
INNER JOIN userdetails AS u
ON r.upliner = u.sponsor
AND r.refer = u.code
WHERE u.sponsor = '1122'

Duplicate Rows when self joining tables in SQL

You may alter your join condition to check that the first part number is strictly less than the second one:

SELECT
t1.PARTNUMB, t1.PARTDESC, t1.ITEMCLSS, t2.PARTNUMB, t2.PARTDESC, t2.ITEMCLSS
FROM PARTFIRST t1
INNER JOIN PARTSECOND t2
ON t1.WRHSNUMB = t2.WRHSNUMB AND
t1.ITEMCLSS = t2.ITEMCLSS AND
t1.PARTNUMB < t2.PARTNUMB;

The problem with using FIRST.PARTNUMB <> SECOND.PARTNUMB is that it would report two different part numbers twice, once on the left/right side and vice-versa. By using a strictly less than inequality, we exclude "duplicates," as you view them.

Joining two tables and getting duplicate rows

LEFT JOIN two tables with formatted date and ID

-- MySQL
SELECT t1.ID
, t1.DateYM
, t1.Status
, (CASE WHEN t2.Appointment = 'Registered' then 'Yes' else 'No' end) as Appointment
, t2.state
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.ID = t2.ID
and DATE_FORMAT(t1.DateYM, "%Y-%m") = DATE_FORMAT(t2.Appt_date, "%Y-%m")
ORDER BY t1.ID, t1.DateYM

Please check from url https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ef715b567a2d9a47f9796923386352a6

Mysql query returns duplicate rows when joining tables

You can try below -

SELECT * FROM user_score us LEFT JOIN user_avatar ua ON us.gamerid=ua.userid 
and ua.last_update =
(select max(last_update) from user_avatar ua1 where ua.userid=ua1.userid)


Related Topics



Leave a reply



Submit