How to Update in SQLite Using a Left Join to Select Candidate Rows

UPDATE assoc SET cachedData = NULL
WHERE EXISTS (SELECT * FROM otherTable
WHERE otherTable.Col1 = assoc.Col1 AND otherTable.Col2 = assoc.Col2);

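Be aware that this is not especially performant: the correlated subquery is evaluated for each row of assoc, so an index on the otherTable columns it compares helps considerably.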
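As a rough illustration, the statement can be tried against an in-memory database with Python's standard sqlite3 module. The sample rows, the column types and the index name idx_other are assumptions made for this sketch, and the second condition is assumed to compare Col2 with Col2.

```python
import sqlite3

def clear_matched_cache(conn):
    """Set cachedData to NULL for assoc rows that have a match in otherTable."""
    cur = conn.execute(
        """
        UPDATE assoc SET cachedData = NULL
        WHERE EXISTS (SELECT 1 FROM otherTable
                      WHERE otherTable.Col1 = assoc.Col1
                        AND otherTable.Col2 = assoc.Col2)
        """
    )
    return cur.rowcount  # number of rows the UPDATE touched

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assoc (Col1 INTEGER, Col2 INTEGER, cachedData TEXT);
    CREATE TABLE otherTable (Col1 INTEGER, Col2 INTEGER);
    -- Hypothetical index so each EXISTS probe can be a lookup, not a scan.
    CREATE INDEX idx_other ON otherTable (Col1, Col2);
    INSERT INTO assoc VALUES (1, 1, 'x'), (2, 2, 'y'), (3, 3, 'z');
    INSERT INTO otherTable VALUES (1, 1), (3, 3);
    """
)
updated = clear_matched_cache(conn)
rows = conn.execute("SELECT Col1, cachedData FROM assoc ORDER BY Col1").fetchall()
print(updated, rows)
```

Switching EXISTS to NOT EXISTS would instead target the rows that a LEFT JOIN ... IS NULL filter selects, i.e. the assoc rows without a match.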

SQL: Left joining optional column should show in the table but with different row

A simple method is to add a condition to your ON clause for table D:

LEFT JOIN D ON D.B_id = B.ID AND C.ID IS NULL

If you want two null rows in case you find neither C nor D, use UNION ALL instead:

SELECT A.ID as A_id, B.ID as B_id, C.ID as C_id, NULL as D_id
FROM A
LEFT JOIN B ON B.A_id = A.ID
LEFT JOIN C ON C.B_id = B.ID
UNION ALL
SELECT A.ID as A_id, B.ID as B_id, NULL as C_id, D.ID as D_id
FROM A
LEFT JOIN B ON B.A_id = A.ID
LEFT JOIN D ON D.B_id = B.ID
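To see how the two variants differ, here is a small sketch using Python's sqlite3 module. The table layouts and sample rows are assumptions: B 10 has both a C and a D row, B 20 has only a D row, and B 30 has neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (ID INTEGER PRIMARY KEY);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, A_id INTEGER);
    CREATE TABLE C (ID INTEGER PRIMARY KEY, B_id INTEGER);
    CREATE TABLE D (ID INTEGER PRIMARY KEY, B_id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (10, 1), (20, 2), (30, 3);
    INSERT INTO C VALUES (100, 10);
    INSERT INTO D VALUES (1000, 10), (2000, 20);
    """
)

# Variant 1: D is only joined when no C row was found.
ON_QUERY = """
SELECT A.ID, B.ID, C.ID, D.ID
FROM A
LEFT JOIN B ON B.A_id = A.ID
LEFT JOIN C ON C.B_id = B.ID
LEFT JOIN D ON D.B_id = B.ID AND C.ID IS NULL
"""

# Variant 2: C rows and D rows are produced by separate branches.
UNION_QUERY = """
SELECT A.ID AS A_id, B.ID AS B_id, C.ID AS C_id, NULL AS D_id
FROM A
LEFT JOIN B ON B.A_id = A.ID
LEFT JOIN C ON C.B_id = B.ID
UNION ALL
SELECT A.ID AS A_id, B.ID AS B_id, NULL AS C_id, D.ID AS D_id
FROM A
LEFT JOIN B ON B.A_id = A.ID
LEFT JOIN D ON D.B_id = B.ID
"""

on_rows = conn.execute(ON_QUERY).fetchall()
union_rows = conn.execute(UNION_QUERY).fetchall()
print(on_rows)
print(union_rows)
```

With this data the first variant returns one row per B, while the UNION ALL variant returns two rows per B, including two all-NULL rows for B 30.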

Count distinct values while left joining

You can group by people.id and count the distinct companies:

SELECT p.id, p.name, 
COUNT(DISTINCT r.company) companies
FROM people p LEFT JOIN reports r
ON p.id = r.people_id
GROUP BY p.id;

I assume that id is the primary key of the table people, so that p.name is determined by the grouped column.
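A quick way to check the behaviour is an in-memory SQLite database driven from Python. The sample names and companies below are made up for the sketch; the query itself is the one above with an added ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reports (people_id INTEGER, company TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO reports VALUES
        (1, 'Acme'), (1, 'Acme'), (1, 'Globex'),  -- two distinct companies
        (2, 'Acme');                               -- Cy has no reports
    """
)
counts = conn.execute(
    """
    SELECT p.id, p.name, COUNT(DISTINCT r.company) AS companies
    FROM people p LEFT JOIN reports r ON p.id = r.people_id
    GROUP BY p.id
    ORDER BY p.id
    """
).fetchall()
print(counts)
```

Because COUNT ignores NULLs, a person without any reports still appears, with a count of 0.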

Group_Concat with left join is slowing down. I need all rows in left table as well

This is your query:

SELECT a.id, a.firstname, a.lastname, a.state, a.mobile, a.lead_description,
GROUP_CONCAT(n.note SEPARATOR ', ') as notes
FROM applicant a LEFT JOIN
note n
ON a.id = n.applicant_id
WHERE delete_type = 0 AND a.type = 0
GROUP BY a.id
ORDER BY a.id DESC

You should start with indexes. I would recommend applicant(type, id) and note(applicant_id). Also include delete_type in one of them (I don't know which table it comes from).

Second, this query may be faster using a correlated subquery. This would look like:

SELECT a.id, a.firstname, a.lastname, a.state, a.mobile, a.lead_description,
(select GROUP_CONCAT(n.note SEPARATOR ', ')
from note n
where a.id = n.applicant_id
) as notes
FROM applicant a
WHERE delete_type = 0 AND a.type = 0
ORDER BY a.id DESC

The condition on delete_type either goes in the outer query or the subquery -- it is not clear which table this comes from.

This avoids the large GROUP BY over the whole joined result, which can be a performance boost.
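The two forms can be compared on a toy data set with Python's sqlite3 module. This is only a sketch under several assumptions: delete_type is placed on applicant, the index names are invented, most applicant columns are left out, and SQLite's group_concat(x, sep) is used in place of MySQL's SEPARATOR syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE applicant (id INTEGER PRIMARY KEY, firstname TEXT,
                            type INTEGER, delete_type INTEGER);
    CREATE TABLE note (applicant_id INTEGER, note TEXT);
    -- Hypothetical index names; delete_type is assumed to be on applicant.
    CREATE INDEX idx_applicant_type ON applicant (type, delete_type, id);
    CREATE INDEX idx_note_applicant ON note (applicant_id);
    INSERT INTO applicant VALUES (1, 'Ann', 0, 0), (2, 'Bob', 0, 0), (3, 'Cy', 1, 0);
    INSERT INTO note VALUES (1, 'called'), (1, 'emailed');
    """
)

JOIN_QUERY = """
SELECT a.id, a.firstname, group_concat(n.note, ', ') AS notes
FROM applicant a LEFT JOIN note n ON a.id = n.applicant_id
WHERE a.delete_type = 0 AND a.type = 0
GROUP BY a.id
ORDER BY a.id DESC
"""

SUBQUERY_QUERY = """
SELECT a.id, a.firstname,
       (SELECT group_concat(n.note, ', ')
        FROM note n
        WHERE a.id = n.applicant_id) AS notes
FROM applicant a
WHERE a.delete_type = 0 AND a.type = 0
ORDER BY a.id DESC
"""

def normalize(rows):
    """Sort the concatenated notes, since their order is not guaranteed."""
    return [(i, name, sorted(notes.split(", ")) if notes else None)
            for i, name, notes in rows]

join_rows = normalize(conn.execute(JOIN_QUERY).fetchall())
subquery_rows = normalize(conn.execute(SUBQUERY_QUERY).fetchall())
print(join_rows == subquery_rows, subquery_rows)
```

Both queries keep applicants that have no notes; whether the subquery form is actually faster depends on the data and the database, so it is worth measuring on the real tables.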

INNER JOIN vs LEFT JOIN performance in SQL Server

A LEFT JOIN is absolutely not faster than an INNER JOIN. In fact, it's slower; by definition, an outer join (LEFT JOIN or RIGHT JOIN) has to do all the work of an INNER JOIN plus the extra work of null-extending the results. It would also be expected to return more rows, further increasing the total execution time simply due to the larger size of the result set.

(And even if a LEFT JOIN were faster in specific situations due to some difficult-to-imagine confluence of factors, it is not functionally equivalent to an INNER JOIN, so you cannot simply go replacing all instances of one with the other!)

Most likely your performance problems lie elsewhere, such as not having a candidate key or foreign key indexed properly. Nine tables is quite a lot to join, so the slowdown could be almost anywhere. If you post your schema, we might be able to provide more details.


Edit:

Reflecting further on this, I could think of one circumstance under which a LEFT JOIN might be faster than an INNER JOIN, and that is when:

  • Some of the tables are very small (say, under 10 rows);
  • The tables do not have sufficient indexes to cover the query.

Consider this example:

CREATE TABLE #Test1
(
ID int NOT NULL PRIMARY KEY,
Name varchar(50) NOT NULL
)
INSERT #Test1 (ID, Name) VALUES (1, 'One')
INSERT #Test1 (ID, Name) VALUES (2, 'Two')
INSERT #Test1 (ID, Name) VALUES (3, 'Three')
INSERT #Test1 (ID, Name) VALUES (4, 'Four')
INSERT #Test1 (ID, Name) VALUES (5, 'Five')

CREATE TABLE #Test2
(
ID int NOT NULL PRIMARY KEY,
Name varchar(50) NOT NULL
)
INSERT #Test2 (ID, Name) VALUES (1, 'One')
INSERT #Test2 (ID, Name) VALUES (2, 'Two')
INSERT #Test2 (ID, Name) VALUES (3, 'Three')
INSERT #Test2 (ID, Name) VALUES (4, 'Four')
INSERT #Test2 (ID, Name) VALUES (5, 'Five')

SELECT *
FROM #Test1 t1
INNER JOIN #Test2 t2
ON t2.Name = t1.Name

SELECT *
FROM #Test1 t1
LEFT JOIN #Test2 t2
ON t2.Name = t1.Name

DROP TABLE #Test1
DROP TABLE #Test2

If you run this and view the execution plan, you'll see that the INNER JOIN query does indeed cost more than the LEFT JOIN, because it satisfies the two criteria above. It's because SQL Server wants to do a hash match for the INNER JOIN, but does nested loops for the LEFT JOIN; the former is normally much faster, but since the number of rows is so tiny and there's no index to use, the hashing operation turns out to be the most expensive part of the query.

You can see the same effect by writing a program in your favourite programming language to perform a large number of lookups on a list with 5 elements, vs. a hash table with 5 elements. Because of the size, the hash table version is actually slower. But increase it to 50 elements, or 5000 elements, and the list version slows to a crawl, because it's O(N) vs. O(1) for the hashtable.
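A minimal sketch of such a program in Python follows. The helper names build, time_lookups and compare are invented for this example, and no timings are claimed: where the crossover between the two structures falls, or whether it shows up at all for very small sizes, depends heavily on the language runtime and the key type.

```python
import timeit

def build(n):
    """Return a list and a hash-based set holding the same n integer keys."""
    keys = list(range(n))
    return keys, set(keys)

def time_lookups(container, probes, repeat):
    """Time `repeat` passes of membership tests for every probe value."""
    return timeit.timeit(lambda: [p in container for p in probes], number=repeat)

def compare(n, repeat=50):
    """Return (list_seconds, set_seconds) for the same probes at size n."""
    keys, hashed = build(n)
    probes = keys[::max(1, n // 100)]  # at most roughly 100 probes per pass
    return time_lookups(keys, probes, repeat), time_lookups(hashed, probes, repeat)

for n in (5, 50, 5000):
    list_time, set_time = compare(n)
    print(f"n={n:5d}  list={list_time:.6f}s  set={set_time:.6f}s")
```

The list lookup scans linearly, so its time should grow with n, while the set lookup should stay roughly flat; the absolute numbers vary by machine.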

But change this query to be on the ID column instead of Name and you'll see a very different story. In that case, it does nested loops for both queries, but the INNER JOIN version is able to replace one of the clustered index scans with a seek - meaning that this will literally be an order of magnitude faster with a large number of rows.

So the conclusion is more or less what I mentioned several paragraphs above; this is almost certainly an indexing or index coverage problem, possibly combined with one or more very small tables. Those are the only circumstances under which SQL Server might sometimes choose a worse execution plan for an INNER JOIN than a LEFT JOIN.

Query to select a row from a column that matches x but not y where y is everything else that's not x

You've already received comments encouraging you to consider a different design for your schema, along with the rationale, so I'll focus only on a suggestion that works with your current schema.

You could use REPLACE to strip the desired ingredients and the separators from the ingredients column; if nothing is left, the recipe has no other ingredients. LIKE is used to check that the recipe contains at least one of the desired ingredients.

Approach 1

SELECT 
recipe,
ingredients
FROM mytable
WHERE (
CONCAT(',',ingredients,',') LIKE '%,salt,%' OR
CONCAT(',',ingredients,',') LIKE '%,pepper,%'
) AND
REPLACE(REPLACE(REPLACE(ingredients,'salt',','),'pepper',','),',','')=''
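Approach 1 can be tried out in SQLite from Python as sketched below. The sample recipes are invented, and the string concatenation is written with the || operator instead of CONCAT, which older SQLite versions do not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable (recipe TEXT, ingredients TEXT);
    INSERT INTO mytable VALUES
        ('r1', 'salt,pepper'),
        ('r2', 'salt'),
        ('r3', 'salt,sugar'),
        ('r4', 'sugar');
    """
)
only_desired = conn.execute(
    """
    SELECT recipe
    FROM mytable
    WHERE ((',' || ingredients || ',') LIKE '%,salt,%'
        OR (',' || ingredients || ',') LIKE '%,pepper,%')
      -- Strip the desired ingredients and the commas; nothing may remain.
      AND REPLACE(REPLACE(REPLACE(ingredients, 'salt', ','), 'pepper', ','), ',', '') = ''
    ORDER BY recipe
    """
).fetchall()
print(only_desired)
```

Only r1 and r2 qualify here: r3 contains an extra ingredient and r4 contains none of the desired ones.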


Approach 2

The desired ingredients are placed in a subquery and matched to each recipe with the LEFT JOIN. The HAVING clause then checks whether the ingredient list contains only these ingredients.

SELECT recipe
FROM (
    SELECT
        recipe,
        ingredients,
        desired
    FROM
        mytable m
    LEFT JOIN (
        SELECT 'salt' as desired UNION ALL
        SELECT 'pepper'
    ) d ON CONCAT(',',ingredients,',') LIKE CONCAT('%,',d.desired,',%')
) t
GROUP BY
    recipe
HAVING
    LEN(
        MAX(
            REPLACE(ingredients,',','')
        )
    ) <= SUM(LEN(desired))
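Here is a sketch of Approach 2 adapted to SQLite and run from Python. The adaptation is an assumption of this example: LENGTH replaces LEN and the || operator replaces CONCAT. The sample recipes are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable (recipe TEXT, ingredients TEXT);
    INSERT INTO mytable VALUES
        ('r1', 'salt,pepper'),
        ('r2', 'salt'),
        ('r3', 'salt,sugar'),
        ('r4', 'sugar');
    """
)
only_desired = conn.execute(
    """
    SELECT recipe
    FROM (
        SELECT m.recipe, m.ingredients, d.desired
        FROM mytable m
        LEFT JOIN (
            SELECT 'salt' AS desired UNION ALL
            SELECT 'pepper'
        ) d ON (',' || m.ingredients || ',') LIKE ('%,' || d.desired || ',%')
    ) t
    GROUP BY recipe
    -- The list without commas may be no longer than the matched names combined.
    HAVING LENGTH(MAX(REPLACE(ingredients, ',', ''))) <= SUM(LENGTH(desired))
    ORDER BY recipe
    """
).fetchall()
print(only_desired)
```

For r4 nothing matches, so SUM(LENGTH(desired)) is NULL and the HAVING condition filters the recipe out.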


What is the difference between INNER JOIN and OUTER JOIN?

Assuming you're joining on columns with no duplicates, which is a very common case:

  • An inner join of A and B gives the result of A intersect B, i.e. the inner part of a Venn diagram intersection.

  • An outer join of A and B gives the results of A union B, i.e. the outer parts of a Venn diagram union.

Examples

Suppose you have two tables, with a single column each, and data as follows:

A    B
-    -
1    3
2    4
3    5
4    6

Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.

Inner join

An inner join using either of the equivalent queries gives the intersection of the two tables, i.e. the two rows they have in common.

select * from a INNER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b;

a | b
--+--
3 | 3
4 | 4

Left outer join

A left outer join will give all rows in A, plus any common rows in B.

select * from a LEFT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b(+);  -- Oracle-specific legacy syntax

a | b
--+-----
1 | null
2 | null
3 | 3
4 | 4

Right outer join

A right outer join will give all rows in B, plus any common rows in A.

select * from a RIGHT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a(+) = b.b;  -- Oracle-specific legacy syntax

a | b
-----+----
3 | 3
4 | 4
null | 5
null | 6

Full outer join

A full outer join will give you the union of A and B, i.e. all the rows in A and all the rows in B. If something in A doesn't have a corresponding datum in B, then the B portion is null, and vice versa.

select * from a FULL OUTER JOIN b on a.a = b.b;

a | b
-----+-----
1 | null
2 | null
3 | 3
4 | 4
null | 6
null | 5
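The four result sets above can be reproduced with Python's sqlite3 module, as in this sketch. Since older SQLite versions lack RIGHT and FULL OUTER JOIN, the example emulates them with LEFT JOIN, which is an assumption of this sketch and not part of the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (b INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (3), (4), (5), (6);
    """
)

def run(sql):
    """Return the result as a set of (a, b) tuples; NULL becomes None."""
    return set(conn.execute(sql).fetchall())

inner = run("SELECT a.a, b.b FROM a INNER JOIN b ON a.a = b.b")
left = run("SELECT a.a, b.b FROM a LEFT OUTER JOIN b ON a.a = b.b")
# Right join emulated by swapping the tables of a left join.
right = run("SELECT a.a, b.b FROM b LEFT OUTER JOIN a ON a.a = b.b")
# Full join emulated as the left join plus the rows found only in b.
full = run(
    """
    SELECT a.a, b.b FROM a LEFT OUTER JOIN b ON a.a = b.b
    UNION ALL
    SELECT a.a, b.b FROM b LEFT OUTER JOIN a ON a.a = b.b WHERE a.a IS NULL
    """
)
print(inner, left, right, full, sep="\n")
```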

Only select first row of repeating value in a column in SQL

You can use an EXISTS semi-join to identify candidates:

Select wanted rows:

SELECT * FROM tbl t
WHERE NOT EXISTS (
SELECT *
FROM tbl
WHERE col1 = t.col1
AND id = t.id - 1
)
ORDER BY id;

Get rid of unwanted rows:

DELETE FROM tbl AS t
-- SELECT * FROM tbl t -- check first?
WHERE EXISTS (
SELECT *
FROM tbl
WHERE col1 = t.col1
AND id = t.id - 1
);

This deletes every row whose preceding row has the same value in col1, thereby arriving at your goal: only the first row of every burst survives.

I left the commented SELECT statement because you should always check what is going to be deleted before you do the deed.
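The check-then-delete sequence might look like this in SQLite, run from Python. The sample rows are invented, and the DELETE is written without a table alias, which is a choice made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 TEXT);
    INSERT INTO tbl VALUES
        (1, 'a'), (2, 'a'), (3, 'a'), (4, 'b'), (5, 'b'), (6, 'a'), (7, 'a');
    """
)

# Step 1: look at the rows that should survive (first row of each burst).
wanted = conn.execute(
    """
    SELECT t.id, t.col1 FROM tbl t
    WHERE NOT EXISTS (
        SELECT 1 FROM tbl p
        WHERE p.col1 = t.col1 AND p.id = t.id - 1
    )
    ORDER BY t.id
    """
).fetchall()

# Step 2: delete every row whose direct predecessor has the same col1.
deleted = conn.execute(
    """
    DELETE FROM tbl
    WHERE EXISTS (
        SELECT 1 FROM tbl p
        WHERE p.col1 = tbl.col1 AND p.id = tbl.id - 1
    )
    """
).rowcount
remaining = conn.execute("SELECT id, col1 FROM tbl ORDER BY id").fetchall()
print(wanted, deleted, remaining)
```

After the delete, the remaining rows are the same ones the SELECT reported beforehand.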

Solution for non-sequential IDs:

If your RDBMS supports CTEs and window functions (like PostgreSQL, Oracle, SQL Server, ... but not SQLite prior to v3.25, MS Access or MySQL prior to v8.0.1), there is an elegant way:

WITH cte AS (
SELECT *, row_number() OVER (ORDER BY id) AS rn
FROM tbl
)
SELECT id, col1
FROM cte c
WHERE NOT EXISTS (
SELECT *
FROM cte
WHERE col1 = c.col1
AND rn = c.rn - 1
)
ORDER BY id;

Another way of doing the job without those niceties (should work for you):

SELECT id, col1
FROM tbl t
WHERE (
SELECT col1 = t.col1
FROM tbl
WHERE id < t.id
ORDER BY id DESC
LIMIT 1) IS NOT TRUE
ORDER BY id;
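Both variants for non-sequential IDs can be checked against each other in SQLite from Python, as sketched here. The sample IDs are invented, and the window-function query is skipped when the linked SQLite library is older than 3.25.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 TEXT);
    INSERT INTO tbl VALUES
        (10, 'a'), (20, 'a'), (25, 'a'), (40, 'b'), (41, 'b'), (70, 'a'), (95, 'a');
    """
)

WINDOW_QUERY = """
WITH cte AS (
    SELECT *, row_number() OVER (ORDER BY id) AS rn
    FROM tbl
)
SELECT id, col1
FROM cte c
WHERE NOT EXISTS (
    SELECT * FROM cte
    WHERE col1 = c.col1 AND rn = c.rn - 1
)
ORDER BY id
"""

LIMIT_QUERY = """
SELECT id, col1
FROM tbl t
WHERE (
    SELECT col1 = t.col1
    FROM tbl
    WHERE id < t.id
    ORDER BY id DESC
    LIMIT 1) IS NOT TRUE
ORDER BY id
"""

limit_rows = conn.execute(LIMIT_QUERY).fetchall()
# Window functions need SQLite 3.25 or newer.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    window_rows = conn.execute(WINDOW_QUERY).fetchall()
else:
    window_rows = None
print(limit_rows, window_rows)
```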

