SQL Join Using a Mapping Table

SQL JOIN using a mapping table

SELECT
c.*,
p.Name
FROM
Collection c
JOIN Person_Collection pc ON pc.collection_id = c.id
JOIN Person p ON p.id = pc.person_id
ORDER BY p.Name
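
To try the query above end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. Only the table and column names used in the query come from the original; the `title` column and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Collection (id INTEGER PRIMARY KEY, title TEXT);  -- title is hypothetical
    -- Mapping table: one row per (person, collection) pair.
    CREATE TABLE Person_Collection (person_id INTEGER, collection_id INTEGER);

    INSERT INTO Person VALUES (1, 'Bob'), (2, 'Alice');
    INSERT INTO Collection VALUES (10, 'Stamps'), (20, 'Coins');
    INSERT INTO Person_Collection VALUES (1, 10), (2, 10), (2, 20);
""")

# Same join shape as the query above; c.id is added to ORDER BY
# only to make the row order deterministic.
rows = conn.execute("""
    SELECT c.id, c.title, p.Name
    FROM Collection c
    JOIN Person_Collection pc ON pc.collection_id = c.id
    JOIN Person p ON p.id = pc.person_id
    ORDER BY p.Name, c.id
""").fetchall()

for row in rows:
    print(row)
```

A collection shared by two people shows up once per person, which is what you would expect from a many-to-many mapping.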

Tables join using mapping table

If I understood you correctly, you just need to add another full outer join:

select t1.id, t3.id
from table1 t1
full outer join mapping t2 on (t1.col2 = t2.col1)
full outer join table3 t3 on (t1.year = t3.year and t2.col2 = t3.col2)

Just to be clear: a full outer join keeps every record from both joined tables, whether or not there is a match. I've added another full outer join here, but change it to whatever kind of join you actually need.
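
To illustrate the "keeps all the records" point, here is a small sketch in Python with sqlite3. Older SQLite builds lack FULL OUTER JOIN, so this emulates it with a LEFT JOIN plus the unmatched rows from the other side; that emulation is a substitute chosen for portability, not the syntax from the answer. The sample rows are assumptions, and only table1 and mapping are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col2 TEXT);
    CREATE TABLE mapping (col1 TEXT, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');        -- 'b' has no mapping row
    INSERT INTO mapping VALUES ('a', 'x'), ('c', 'z');   -- 'c' has no table1 row
""")

# Emulated full outer join: every table1 row (matched or not),
# plus the mapping rows that matched nothing in table1.
full_rows = conn.execute("""
    SELECT t1.id, t2.col1
    FROM table1 t1
    LEFT JOIN mapping t2 ON t1.col2 = t2.col1
    UNION ALL
    SELECT NULL, t2.col1
    FROM mapping t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.col2 = t2.col1)
""").fetchall()

# Plain inner join for comparison: only the matching pair survives.
inner_rows = conn.execute("""
    SELECT t1.id, t2.col1
    FROM table1 t1
    JOIN mapping t2 ON t1.col2 = t2.col1
""").fetchall()

print(full_rows)
print(inner_rows)
```

The unmatched rows come back with NULL (None in Python) on the side that had no partner, while the inner join drops them entirely.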

Joining tables through a mapping table with a condition

Those are virtually identical, and the most direct route to the data.

You have to establish the relationship between your tables in the FROM clause, which you have done in both queries. Then you have to restrict the result set with your c.name = ? condition, which you have also done in both queries.

It is very likely that MySQL's optimizer will execute both of these queries in the exact same manner. To be sure, you can run EXPLAIN on both and see if there are any differences. For example:

EXPLAIN SELECT
p.*
FROM
category c
JOIN product_category pc
ON c.id = pc.category_id
AND c.name = ?
JOIN product p
ON p.id = pc.product_id
ORDER BY p.name;

See the MySQL documentation for more about EXPLAIN.
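
As a rough check of the "virtually identical" claim, the sketch below runs two variants against an in-memory SQLite database via Python's sqlite3 module. It assumes the two queries differ only in whether c.name = ? sits in the ON clause or in a WHERE clause; the sample rows are made up. Note that EXPLAIN QUERY PLAN is SQLite's counterpart, not MySQL's EXPLAIN, so its output will look different from what MySQL prints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_category (product_id INTEGER, category_id INTEGER);
    INSERT INTO category VALUES (1, 'tools'), (2, 'toys');
    INSERT INTO product VALUES (1, 'hammer'), (2, 'robot'), (3, 'drill');
    INSERT INTO product_category VALUES (1, 1), (2, 2), (3, 1);
""")

# Variant 1: the filter lives in the ON clause.
filter_in_on = """
    SELECT p.id, p.name
    FROM category c
    JOIN product_category pc ON c.id = pc.category_id AND c.name = ?
    JOIN product p ON p.id = pc.product_id
    ORDER BY p.name
"""
# Variant 2: the filter lives in a WHERE clause.
filter_in_where = """
    SELECT p.id, p.name
    FROM category c
    JOIN product_category pc ON c.id = pc.category_id
    JOIN product p ON p.id = pc.product_id
    WHERE c.name = ?
    ORDER BY p.name
"""

rows_on = conn.execute(filter_in_on, ("tools",)).fetchall()
rows_where = conn.execute(filter_in_where, ("tools",)).fetchall()
print(rows_on == rows_where, rows_on)

# Ask the engine how it intends to run the query.
for step in conn.execute("EXPLAIN QUERY PLAN " + filter_in_where, ("tools",)):
    print(step)
```

This equivalence holds for inner joins only. With a LEFT JOIN, moving a condition from ON to WHERE can change which rows come back.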

Lastly, the second one is easier to read, so going that route will ensure that whoever takes this over from you will have less of an urge to murder you in your sleep. Which is a good thing.

How to join a mapping table in my query

You need to join the questions table to the mapping table first, and then join the mapping table to topics.

SELECT questions.* 
, posts.post
, COUNT(DISTINCT posts.id) as total_answers -- DISTINCT: the topic joins repeat each post once per topic
, posts.votes
, posts.id as post_id
, posts.created
, users.id as user_id
, users.username, users.rep
, topics.name
FROM questions
LEFT JOIN posts ON questions.id = posts.question_id
LEFT JOIN users ON questions.user_id = users.id
LEFT JOIN topic_mapping ON questions.id = topic_mapping.question_id
LEFT JOIN topics ON topic_mapping.topic_id = topics.id
GROUP BY questions.id
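
One thing to watch with this shape of query: if a question can carry several topics, the mapping join repeats every post once per topic, so a plain COUNT over posts is inflated. The sketch below shows the effect in SQLite through Python's sqlite3 module. The `title` column and the sample rows are assumptions; the join structure mirrors the query above, with the users join left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT);  -- title is hypothetical
    CREATE TABLE posts (id INTEGER PRIMARY KEY, question_id INTEGER, post TEXT);
    CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE topic_mapping (question_id INTEGER, topic_id INTEGER);

    -- One question with two answers and two topics.
    INSERT INTO questions VALUES (1, 'How do joins work?');
    INSERT INTO posts VALUES (1, 1, 'first answer'), (2, 1, 'second answer');
    INSERT INTO topics VALUES (1, 'sql'), (2, 'mysql');
    INSERT INTO topic_mapping VALUES (1, 1), (1, 2);
""")

raw_count, distinct_count = conn.execute("""
    SELECT COUNT(posts.post), COUNT(DISTINCT posts.id)
    FROM questions
    LEFT JOIN posts ON questions.id = posts.question_id
    LEFT JOIN topic_mapping ON questions.id = topic_mapping.question_id
    LEFT JOIN topics ON topic_mapping.topic_id = topics.id
    GROUP BY questions.id
""").fetchone()

# 2 posts x 2 topics = 4 joined rows, but only 2 distinct posts.
print(raw_count, distinct_count)
```

Counting distinct post IDs keeps the answer total correct regardless of how many topics are attached to the question.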

SQL JOIN - Map value from one table to its equivalent in another during the JOIN

Use a CASE expression in the join's ON clause to do the mapping.

Demo.

Prepare tables:

hive> create table testt1 as select 1 as key;
hive> create table testt2 as select 3 as key;

Join using case:

select t1.key, t2.key
from testt1 t1
left join testt2 t2
on t2.key=case when t1.key=1 then 3 --add more cases
--when t1.key=<some value> then <mapped value>
else t1.key --default mapping t1.key=t2.key
end
;

Result:

OK
1 3
Time taken: 41.191 seconds, Fetched: 1 row(s)
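
The same technique can be tried outside Hive. Below is a minimal sketch using Python's sqlite3 module, which is an assumption of convenience; the CASE-in-ON logic is unchanged. Two extra rows in testt1 (3 and 5) are added here, beyond the original demo, to show the default branch both matching and not matching.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testt1 (key INTEGER);
    CREATE TABLE testt2 (key INTEGER);
    INSERT INTO testt1 VALUES (1), (3), (5);  -- 3 and 5 are extra illustrative rows
    INSERT INTO testt2 VALUES (3);
""")

rows = conn.execute("""
    SELECT t1.key, t2.key
    FROM testt1 t1
    LEFT JOIN testt2 t2
      ON t2.key = CASE WHEN t1.key = 1 THEN 3  -- explicit mapping: 1 -> 3
                       ELSE t1.key             -- default: keys must be equal
                  END
    ORDER BY t1.key
""").fetchall()

for row in rows:
    print(row)
```

Key 1 is mapped to 3 and finds a partner, key 3 matches through the default branch, and key 5 has no partner, so the left join returns it with NULL on the right.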

Join two tables by two columns using a third mapping table

You could JOIN your spanish and english tables with the mapping table and get the equivalent of each word and then get the code:

WITH spanish AS (SELECT 'mesa' origin, 'techo' dest, 'AA' code FROM DUAL
UNION
SELECT 'mesa' origin, 'suelo' dest, 'BB' code FROM DUAL
UNION
SELECT 'suelo' origin, 'mesa' dest, 'CC' code FROM DUAL
UNION
SELECT 'suelo' origin, 'techo' dest, 'DD' code FROM DUAL),
english AS (SELECT 'table' origin, 'floor' dest, 'XX' code FROM DUAL
UNION
SELECT 'table' origin, 'roof' dest, 'YY' code FROM DUAL
UNION
SELECT 'floor' origin, 'table' dest, 'WW' code FROM DUAL
UNION
SELECT 'floor' origin, 'roof' dest, 'ZZ' code FROM DUAL),
map AS (SELECT 'table' english, 'mesa' spanish FROM DUAL
UNION
SELECT 'floor' english, 'suelo' spanish FROM DUAL
UNION
SELECT 'roof' english, 'techo' spanish FROM DUAL)
SELECT spanish.origin, spanish.dest, english.origin, spanish.code, english.code
FROM spanish, english, map map1, map map2
WHERE spanish.origin = map1.spanish
AND spanish.dest = map2.spanish
AND english.origin = map1.english
AND english.dest = map2.english

Note that I changed the mapping table a bit, because some English words were in the Spanish column. Also, I think the results you posted were wrong. This is what I get:

DD ZZ
CC WW
BB XX
AA YY

EDIT: Writing the joins the way I did above is a bad habit of mine. With explicit JOIN syntax, the query would be:

WITH (.....)
SELECT spanish.origin, spanish.dest, english.origin, spanish.code, english.code
FROM spanish JOIN map map1 ON spanish.origin = map1.spanish
JOIN map map2 ON spanish.dest = map2.spanish
JOIN english ON map1.english = english.origin AND map2.english = english.dest

Join tables and sum specific data based on a mapping

Although you are new to SQL, your data structure as it stands will be nothing but trouble. Can the query be done? Yes, but it is harder. I would first like to suggest an alternative way to identify "groups". Create a second table of groups, then associate every company with its group. You could even give each group a clear-text name, such as:

CompanyGroups
CompanyGroupID  CompanyGroupName
1               Eastern Group
2               Northern Group
3               Technical Group
4               Furniture Group

Then the companies:

SourceCompanyId  CompanyGroupID
4626             3
359468           3
7999             3
56167            4
11947            4

So there is one record per company, along with its associated group.

If a company can belong to multiple groups, you can add further records for that company, one per additional group.
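
To show why the suggested structure is easier to work with, here is a hedged sketch in Python with sqlite3. `CompanyGroupMembers` is a hypothetical name for the company-to-group table described above (the original does not name it), and the `Sales` table layout is assumed. The per-company sales figures are the ones quoted later in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CompanyGroups (CompanyGroupID INTEGER PRIMARY KEY, CompanyGroupName TEXT);
    -- Hypothetical table name: one row per company and its group.
    CREATE TABLE CompanyGroupMembers (SourceCompanyId INTEGER, CompanyGroupID INTEGER);
    -- Assumed layout for the sales data.
    CREATE TABLE Sales (CompanyId INTEGER, Sales INTEGER);

    INSERT INTO CompanyGroups VALUES (3, 'Technical Group'), (4, 'Furniture Group');
    INSERT INTO CompanyGroupMembers VALUES
        (4626, 3), (359468, 3), (7999, 3), (56167, 4), (11947, 4);
    INSERT INTO Sales VALUES
        (4626, 1600), (359468, 800), (7999, 500), (56167, 1000), (11947, 300);
""")

# With an explicit membership table, group totals need only two plain joins.
totals = conn.execute("""
    SELECT g.CompanyGroupName, SUM(s.Sales) AS GroupTotalSales
    FROM CompanyGroups g
    JOIN CompanyGroupMembers m ON m.CompanyGroupID = g.CompanyGroupID
    JOIN Sales s ON s.CompanyId = m.SourceCompanyId
    GROUP BY g.CompanyGroupID, g.CompanyGroupName
    ORDER BY GroupTotalSales DESC
""").fetchall()

for row in totals:
    print(row)
```

No UNION or self-join is needed here, because every company, including the one that used to be the "source", is just another member row.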

Now, back to the regularly scheduled program and your query. You need one "common" group so that all targets are associated with it, including the underlying source company itself. For example, 4626 is the source, and the other two, 359468 and 7999, are in the same group. This expands on the other answer, but forces the left-most ID into a primary position.

select distinct
SourceCompanyID as GrpParent,
SourceCompanyID as IncludedCompany
from
CompanyGroup cg
UNION
select
cgParent.SourceCompanyID as GrpParent,
cgTarget.TargetCompanyId as IncludedCompany
from
CompanyGroup cgParent
JOIN CompanyGroup cgTarget
on cgParent.SourceCompanyID = cgTarget.SourceCompanyID

Notice that the first part of the query gets each source only once, even if it is associated with five other targets. We don't want to duplicate counts because of duplicate sources. Each source holds its own ID as both the parent and the company to be included as part of the group.

The second part starts again with the same parent, but gets the target as the included company. So, based on your data:

SourceCompanyId  TargetCompanyId
4626             359468
4626             7999
56167            11947

the result would be:

GrpParent  IncludedCompany
-- first the distinct portion before union
4626       4626
56167      56167
-- now the union portion
4626       359468
4626       7999
56167      11947

You can see the five total records. The 4626 "group" shows all three company IDs, including itself, on the right side. Likewise, 56167 has two entries, each with its included company on the right side.

With this in place, you can join the sales totals by group without causing duplicated aggregations.

select
CompGrps.GrpParent,
sum( CompSales.Sales ) as GroupTotalSales
from
( select distinct
SourceCompanyID as GrpParent,
SourceCompanyID as IncludedCompany
from
CompanyGroup cg
UNION
select
cgParent.SourceCompanyID as GrpParent,
cgTarget.TargetCompanyId as IncludedCompany
from
CompanyGroup cgParent
JOIN CompanyGroup cgTarget
on cgParent.SourceCompanyID = cgTarget.SourceCompanyID
) as CompGrps
JOIN
( SELECT
s.CompanyId,
SUM(s.Sales) AS Sales
FROM
Sales s
group by
s.CompanyId ) CompSales
on CompGrps.IncludedCompany = CompSales.CompanyID
group by
CompGrps.GrpParent
order by
sum( CompSales.Sales ) desc

So notice that the first subquery gets the distinct group-to-company pairs, and the second, which holds per-company sales, is joined to it on the company ID but summed by the common group parent, giving totals at the outer level per group.

I also tacked on a simple ORDER BY to sort the largest sales to the top. As you can see, it's a bit messier with the existing structure, but it can be done.

The output should look something like

GrpParent  GroupTotalSales
4626       2900  (4626 had 1600, 359468 had 800, and 7999 had 500)
56167      1300  (56167 had 1000, 11947 had 300)
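
To check those numbers, here is a minimal sketch that runs the same query shape in SQLite via Python's sqlite3 module. The `Sales` table layout and its rows are assumptions built from the per-company figures above; 4626's 1600 is split across two rows purely to exercise the per-company pre-aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CompanyGroup (SourceCompanyId INTEGER, TargetCompanyId INTEGER);
    CREATE TABLE Sales (CompanyId INTEGER, Sales INTEGER);  -- assumed layout

    INSERT INTO CompanyGroup VALUES (4626, 359468), (4626, 7999), (56167, 11947);
    -- Assumed sample rows; 4626 appears twice (1000 + 600 = 1600).
    INSERT INTO Sales VALUES
        (4626, 1000), (4626, 600), (359468, 800), (7999, 500),
        (56167, 1000), (11947, 300);
""")

totals = conn.execute("""
    SELECT CompGrps.GrpParent, SUM(CompSales.Sales) AS GroupTotalSales
    FROM (
        -- Each source company counts as a member of its own group.
        SELECT DISTINCT SourceCompanyId AS GrpParent,
                        SourceCompanyId AS IncludedCompany
        FROM CompanyGroup
        UNION
        -- Plus every target listed under that source.
        SELECT cgParent.SourceCompanyId, cgTarget.TargetCompanyId
        FROM CompanyGroup cgParent
        JOIN CompanyGroup cgTarget
          ON cgParent.SourceCompanyId = cgTarget.SourceCompanyId
    ) AS CompGrps
    JOIN (
        -- Collapse sales to one row per company before joining.
        SELECT s.CompanyId, SUM(s.Sales) AS Sales
        FROM Sales s
        GROUP BY s.CompanyId
    ) AS CompSales ON CompGrps.IncludedCompany = CompSales.CompanyId
    GROUP BY CompGrps.GrpParent
    ORDER BY GroupTotalSales DESC
""").fetchall()

for row in totals:
    print(row)
```

Because UNION removes duplicate pairs and sales are collapsed per company first, each company's sales are counted exactly once per group.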

