How do I (or can I) SELECT DISTINCT on multiple columns?
SELECT DISTINCT a,b,c FROM t
is roughly equivalent to:
SELECT a,b,c FROM t GROUP BY a,b,c
It's a good idea to get used to the GROUP BY syntax, as it's more powerful.
For your query, I'd do it like this:
UPDATE sales
SET status='ACTIVE'
WHERE id IN
(
SELECT id
FROM sales S
INNER JOIN
(
SELECT saleprice, saledate
FROM sales
GROUP BY saleprice, saledate
HAVING COUNT(*) = 1
) T
ON S.saleprice=T.saleprice AND s.saledate=T.saledate
)
Select distinct from multiple fields using sql
This should give you all distinct values from the table. I presume you'd want to add where clauses to select only for a particular question. However, this solution requires 5 subqueries and can be slow if your table is huge.
SELECT DISTINCT(ans) FROM (
SELECT right AS ans FROM answers
UNION
SELECT wrong1 AS ans FROM answers
UNION
SELECT wrong2 AS ans FROM answers
UNION
SELECT wrong3 AS ans FROM answers
UNION
SELECT wrong4 AS ans FROM answers
) AS Temp
SELECT DISTINCT on multiple columns along with other columns
SELECT DISTINCT a,b,c FROM t
is roughly equivalent to:
SELECT a,b,c FROM t GROUP BY a,b,c,t
It's a good idea to get used to the GROUP BY syntax, as it's more powerful.
It's a good idea to get used to the GROUP BY syntax,
Please see this post:
Possible duplicate ?
Counting DISTINCT over multiple columns
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
Select distinct on multiple columns simultaneously, and keep one column in PostgreSQL
This should do the trick
CREATE TABLE tblB AS (
SELECT A, B, max(C) AS max_of_C FROM tblA GROUP BY A, B
)
How To Select Distinct Row Based On Multiple Fields
WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Owner
ORDER BY Date DESC) AS RN
FROM tablename
)
SELECT ID, Name, Date, Location, Owner
FROM CTE
WHERE RN = 1;
Select distinct values from multiple columns in same table
It's better to include code in your question, rather than ambiguous text data, so that we are all working with the same data. Here is the sample schema and data I have assumed:
CREATE TABLE tbl_data (
id INT NOT NULL,
code_1 CHAR(2),
code_2 CHAR(2)
);
INSERT INTO tbl_data (
id,
code_1,
code_2
)
VALUES
(1, 'AB', 'BC'),
(2, 'BC', NULL),
(3, 'DE', 'EF'),
(4, NULL, 'BC');
As Blorgbeard commented, the DISTINCT
clause in your solution is unnecessary because the UNION
operator eliminates duplicate rows. There is a UNION ALL
operator that does not elimiate duplicates, but it is not appropriate here.
Rewriting your query without the DISTINCT
clause is a fine solution to this problem:
SELECT code_1
FROM tbl_data
WHERE code_1 IS NOT NULL
UNION
SELECT code_2
FROM tbl_data
WHERE code_2 IS NOT NULL;
It doesn't matter that the two columns are in the same table. The solution would be the same even if the columns were in different tables.
If you don't like the redundancy of specifying the same filter clause twice, you can encapsulate the union query in a virtual table before filtering that:
SELECT code
FROM (
SELECT code_1
FROM tbl_data
UNION
SELECT code_2
FROM tbl_data
) AS DistinctCodes (code)
WHERE code IS NOT NULL;
I find the syntax of the second more ugly, but it is logically neater. But which one performs better?
I created a sqlfiddle that demonstrates that the query optimizer of SQL Server 2005 produces the same execution plan for the two different queries:
If SQL Server generates the same execution plan for two queries, then they are practically as well as logically equivalent.
Compare the above to the execution plan for the query in your question:
The DISTINCT
clause makes SQL Server 2005 perform a redundant sort operation, because the query optimizer does not know that any duplicates filtered out by the DISTINCT
in the first query would be filtered out by the UNION
later anyway.
This query is logically equivalent to the other two, but the redundant operation makes it less efficient. On a large data set, I would expect your query to take longer to return a result set than the two here. Don't take my word for it; experiment in your own environment to be sure!
SQL Query Multiple Columns Using Distinct on One Column Only
select * from tblFruit where
tblFruit_ID in (Select max(tblFruit_ID) FROM tblFruit group by tblFruit_FruitType)
Select distinct on 2 columns, but return all columns from SQL Server
You can use partition by name and type as below
SELECT * from(select
[name]
,[type]
,[col1]
,[col2]
,[col3]
,[etc]
,[dateAdded]
,[ID]
,ROW_NUMBER() OVER(Partition by name, type order by dateAdded DESC) rownumber from [dbo].[Table]) a where rownumber = 1;
Related Topics
Select a Column If Other Column Is Null
SQL Constraint Minvalue/Maxvalue
Oracle SQL How to Remove Time from Date
Concat All Column Values in SQL
Division (/) Not Giving My Answer in Postgresql
Delete All But Top N from Database Table in SQL
Should Every SQL Server Foreign Key Have a Matching Index
Using Alias in When Portion of a Case Statement in Oracle SQL
Access Substitute for Except Clause
SQL . the Sp or Function Should Calculate the Next Date for Friday
Return Rows in the Exact Order They Were Inserted
Difference Between Timestamps in Milliseconds in Oracle
Spark Replacement for Exists and In