Left Join or Select from Multiple Table Using Comma (,)

Left join or select from multiple table using comma (,)

First of all, to be completely equivalent, the first query should have been written

   SELECT mw.*, 
nvs.*
FROM mst_words mw
LEFT JOIN (SELECT *
FROM vocab_stats
WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no
WHERE (nvs.correct > 0 )
AND mw.level = 1

So that mw.* and nvs.* together produce the same set as the 2nd query's singular *. The query as you have written can use an INNER JOIN, since it includes a filter on nvs.correct.

The general form

TABLEA LEFT JOIN TABLEB ON <CONDITION>

attempts to find TableB records based on the condition. If the fails, the results from TABLEA are kept, with all the columns from TableB set to NULL. In contrast

TABLEA INNER JOIN TABLEB ON <CONDITION>

also attempts to find TableB records based on the condition. However, when fails, the particular record from TableA is removed from the output result set.

The ANSI standard for CROSS JOIN produces a Cartesian product between the two tables.

TABLEA CROSS JOIN TABLEB
-- # or in older syntax, simply using commas
TABLEA, TABLEB

The intention of the syntax is that EACH row in TABLEA is joined to EACH row in TABLEB. So 4 rows in A and 3 rows in B produces 12 rows of output. When paired with conditions in the WHERE clause, it sometimes produces the same behaviour of the INNER JOIN, since they express the same thing (condition between A and B => keep or not). However, it is a lot clearer when reading as to the intention when you use INNER JOIN instead of commas.

Performance-wise, most DBMS will process a LEFT join faster than an INNER JOIN. The comma notation can cause database systems to misinterpret the intention and produce a bad query plan - so another plus for SQL92 notation.

Why do we need LEFT JOIN? If the explanation of LEFT JOIN above is still not enough (keep records in A without matches in B), then consider that to achieve the same, you would need a complex UNION between two sets using the old comma-notation to achieve the same effect. But as previously stated, this doesn't apply to your example, which is really an INNER JOIN hiding behind a LEFT JOIN.

Notes:

  • The RIGHT JOIN is the same as LEFT, except that it starts with TABLEB (right side) instead of A.
  • RIGHT and LEFT JOINS are both OUTER joins. The word OUTER is optional, i.e. it can be written as LEFT OUTER JOIN.
  • The third type of OUTER join is FULL OUTER join, but that is not discussed here.

What's the difference between comma separated joins and join on syntax in MySQL?

There is no difference at all.

First representation makes query more readable and makes it look very clear as to which join corresponds to which condition.

SQL left join vs multiple tables on FROM line?

The old syntax, with just listing the tables, and using the WHERE clause to specify the join criteria, is being deprecated in most modern databases.

It's not just for show, the old syntax has the possibility of being ambiguous when you use both INNER and OUTER joins in the same query.

Let me give you an example.

Let's suppose you have 3 tables in your system:

Company
Department
Employee

Each table contain numerous rows, linked together. You got multiple companies, and each company can have multiple departments, and each department can have multiple employees.

Ok, so now you want to do the following:

List all the companies, and include all their departments, and all their employees. Note that some companies don't have any departments yet, but make sure you include them as well. Make sure you only retrieve departments that have employees, but always list all companies.

So you do this:

SELECT * -- for simplicity
FROM Company, Department, Employee
WHERE Company.ID *= Department.CompanyID
AND Department.ID = Employee.DepartmentID

Note that the last one there is an inner join, in order to fulfill the criteria that you only want departments with people.

Ok, so what happens now. Well, the problem is, it depends on the database engine, the query optimizer, indexes, and table statistics. Let me explain.

If the query optimizer determines that the way to do this is to first take a company, then find the departments, and then do an inner join with employees, you're not going to get any companies that don't have departments.

The reason for this is that the WHERE clause determines which rows end up in the final result, not individual parts of the rows.

And in this case, due to the left join, the Department.ID column will be NULL, and thus when it comes to the INNER JOIN to Employee, there's no way to fulfill that constraint for the Employee row, and so it won't appear.

On the other hand, if the query optimizer decides to tackle the department-employee join first, and then do a left join with the companies, you will see them.

So the old syntax is ambiguous. There's no way to specify what you want, without dealing with query hints, and some databases have no way at all.

Enter the new syntax, with this you can choose.

For instance, if you want all companies, as the problem description stated, this is what you would write:

SELECT *
FROM Company
LEFT JOIN (
Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
) ON Company.ID = Department.CompanyID

Here you specify that you want the department-employee join to be done as one join, and then left join the results of that with the companies.

Additionally, let's say you only want departments that contains the letter X in their name. Again, with old style joins, you risk losing the company as well, if it doesn't have any departments with an X in its name, but with the new syntax, you can do this:

SELECT *
FROM Company
LEFT JOIN (
Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
) ON Company.ID = Department.CompanyID AND Department.Name LIKE '%X%'

This extra clause is used for the joining, but is not a filter for the entire row. So the row might appear with company information, but might have NULLs in all the department and employee columns for that row, because there is no department with an X in its name for that company. This is hard with the old syntax.

This is why, amongst other vendors, Microsoft has deprecated the old outer join syntax, but not the old inner join syntax, since SQL Server 2005 and upwards. The only way to talk to a database running on Microsoft SQL Server 2005 or 2008, using the old style outer join syntax, is to set that database in 8.0 compatibility mode (aka SQL Server 2000).

Additionally, the old way, by throwing a bunch of tables at the query optimizer, with a bunch of WHERE clauses, was akin to saying "here you are, do the best you can". With the new syntax, the query optimizer has less work to do in order to figure out what parts goes together.

So there you have it.

LEFT and INNER JOIN is the wave of the future.

Left join FROM multiple tables

It depends what you want to do. You seem to want a Cartesian product of the first two tables with lookups on the third:

SELECT * 
FROM table1 CROSS JOIN
table2 LEFT JOIN
other_table
ON other_table.id = table2.id AND
other_table.otherId = table1.otherId;

Commas -- which should just be banned permanently from FROM clauses -- are similar to CROSS JOINs. However, the parsing of the SQL statement is different. A comma prevents the tables before it from being used in ON clauses after it; that is the source of your error.

SQL query multiple tables, with multiple joins and column field with comma separated list

If you really cannot modify the table structure, probably the best you can do is one of the old list hacks:

  • Use a JOIN with FIND_IN_SET(value, commaSeparatedString)

    SELECT n.Host, c.Name AS ControlName, s.Name AS ServiceName
    FROM node n
    LEFT JOIN control c ON c.controlID = n.controlID
    LEFT JOIN service s ON FIND_IN_SET(s.serviceID, n.serviceId)
    ORDER BY n.host, s.Name
    ;

  • Use LIKE to detect the presence of a specific serviceID value within the node list

    SELECT n.Host, c.Name AS ControlName, s.Name AS ServiceName
    FROM node n
    LEFT JOIN control c ON c.controlID = n.controlID
    LEFT JOIN service s ON
    CONCAT(',', n.serviceID,',') LIKE
    CONCAT('%,', s.serviceID,',%')
    ORDER BY n.host, s.Name
    ;

SQLFiddle

However, as you already noted that column really should be normalized. While the methods above should work for small data sets, they suffer from the usual problems of working with "lists". Neither method is very index friendly, and as a result, will not scale well. Also, both perform string comparisons. So the slightest difference may cause the matching to fail. For example, 1,4 would match two serviceID's, whereas 1,(space)4 or 1,4.0 would match only one.

Update based on comments:

On second read, I am not sure the above answers the precise question you are asking, but it should provide a good basis to work with ...

If you no longer want a CSV list, just use one of the queries above and output the individual query columns as usual. The result will be one service name per row, ie:

   server1 | Control Name One | Service Name 200
server1 | Control Name One | Service Name 50
..

Otherwise, if you need to preserve the comma separated values, one possibility is to use a <cfoutput group=".."> on the query results. Since the results are ordered by "Host" first, something like the code below. NB: For "group" to work properly, the results must be ordered by Host and you must use multiple cfoutput tags as shown below.

 <cfoutput query="..." group="Host"> 
#Host# |
#ControlName# |
<cfoutput>
#ServiceName#,
</cfoutput>
<br>
</cfoutput>

The result should look like this:

server1 | Control Name One | Service Name 200, Service Name 50, Service Name Four, Service Name One, Service Name Three, Service Name Two, 
server2 | Control Name Two | Service Name 200, Service Name Four, Service Name Three, Service Name Two,
server3 | Control Name Two | Service Name 200, Service Name 50, Service Name Four, Service Name One, Service Name Three, Service Name Two,
server4 | Control Name Three | Service Name 200, Service Name 50, Service Name One, Service Name Two,
server5 | Control Name Three | Service Name Four, Service Name One,


Update 2:

I forgot there is a simpler alternative to cfoutput group in MySQL: GROUP_CONCAT

<cfquery name="qry" datasource="MySQL5">
SELECT n.Host, c.Name AS ControlName, GROUP_CONCAT(s.Name) AS ServiceNameList
FROM node n
LEFT JOIN control c ON c.controlID = n.controlID
LEFT JOIN service s ON FIND_IN_SET(s.serviceID, n.serviceId)
GROUP BY n.Host, c.Name
ORDER BY n.host
</cfquery>

Combine left join with commas

You would simply use proper JOIN or INNER JOIN syntax:

SELECT * 
FROM A a1 JOIN
AP ap1
ON ap1.a_id = a1.achats_id JOIN
P p1
ON p1.p_id = ap1.products_id JOIN
M m1
ON p1.m_id = m1.m_id LEFT JOIN
F f1
ON f1.id_f = p1.f1 LEFT JOIN
C c1
ON p1.c_id = c1.c_id LEFT JOIN
S s1
ON p1.saison_id = s1.s_id
WHERE a1.a_type NOT IN (1) AND a1.type_c IN (1, 2)
ORDER BY a1.a_id ASC;

Voila! Problem solved.

SQL Join on Comma Separated Row

Please try the following solution.

It is using XML and XQuery.

Notable points:

  • No derived tables or CTEs.
  • CData section protects against chars like ampersand, and the like.
  • XPath expression contains text() for maximum performance. SQL Server peculiarity.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, Hosts VARCHAR(1024), Description VARCHAR(30));
INSERT INTO @tbl (Hosts, Description) VALUES
('192.168.0.1,192.168.0.2,192.168.0.3', 'Group A'),
('192.168.0.10,192.168.0.13,192.168.0.15', 'Group B'),
('192.168.0.22', 'Group C');
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = ',';

SELECT ID, [Description]
,x.value('text()[1]', 'VARCHAR(20)') AS Hosts
FROM @tbl
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +
REPLACE(Hosts, @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)) AS t1(c)
CROSS APPLY c.nodes ('/root/r') AS t2(x);

Output

+----+-------------+--------------+
| ID | Description | Hosts |
+----+-------------+--------------+
| 1 | Group A | 192.168.0.1 |
| 1 | Group A | 192.168.0.2 |
| 1 | Group A | 192.168.0.3 |
| 2 | Group B | 192.168.0.10 |
| 2 | Group B | 192.168.0.13 |
| 2 | Group B | 192.168.0.15 |
| 3 | Group C | 192.168.0.22 |
+----+-------------+--------------+


Related Topics



Leave a reply



Submit