How to do a case sensitive GROUP BY?
You can use an case sensitive collation:
with temp as
(
select 'Test' COLLATE Latin1_General_CS_AS as name
UNION ALL
select 'TEST'
UNION ALL
select 'test'
UNION ALL
select 'tester'
UNION ALL
select 'tester'
)
SELECT name, COUNT(name)
FROM temp
group by name
SQL Case Sensitive Group By
Assign the COLLATE also on the GROUP BY clause
select accountCode COLLATE Latin1_General_CS_AS, count(OrderId)
from <<TableName>>
group by accountCode COLLATE Latin1_General_CS_AS
Make GroupBy query case-sensitive
You need to change the collation of the column to one that's case sensitive. For example:
SELECT [name] COLLATE Latin1_General_100_CS_AS_SC;
FROM [Example].[dbo].[User]
GROUP BY [Example].[dbo].[User].[name] COLLATE Latin1_General_100_CS_AS_SC;
Case sensitive collations usually have a CS
in their name.
To choose which collation to use, you can find the collation of the column with:
SELECT c.collation_name
FROM sys.columns c
INNER JOIN sys.objects o
ON o.object_id = c.object_id
AND o.type = 'U'
INNER JOIN sys.schemas s
ON s.schema_id = o.schema_id
WHERE lower(s.name) = 'dbo'
AND lower(o.name) = 'user'
AND lower(c.name) = 'name';
A good guess is to take that collation and replace the CI
(for case insensitive) for a CS
and use it as demonstrated above.
grouping the same value in upper case and lower case with select query
Use lower()
or upper()
:
SELECT MIN(InputParameterName) as InputParameterName, Period,
SUM(CAST(PeriodInput AS FLOAT)) PeriodInput
FROM Table1
GROUP BY LOWER(InputParameterName), Period;
The use LOWER()
in the GROUP BY
defines the columns. It is not used in the SELECT
so you always get a value in the data.
If you don't care about the case, then you should look into setting the collation of the column, table, or database so case is ignored in expressions involving strings.
EDIT:
If the above result returns two rows, then case is not the problem. You probably have hidden characters. the most likely are spaces, so you can try:
GROUP BY LOWER(REPLACE(InputParameterName, ' ', '')), Period;
Hopefully, the hidden spaces are no more complicated than spaces.
EDIT II:
I get it now. You want the two rows, you just want the values to be the same. Then window functions can do what you want:
SELECT MIN(InputParameterName) OVER (PARTITION BY LOWER(InputParameterName)),
Period, SUM(CAST(PeriodInput AS FLOAT)) as PeriodInput
FROM Table1
GROUP BY InputParameterName, Period;
Case-sensitive GROUP BY in an Access query
As explained in the Microsoft support article here:
The Microsoft Jet database engine is inherently case-insensitive. When joining tables, it matches lowercase "abc" to uppercase "ABC," which in most cases is desirable.
(The same is true for the newer "Access Database Engine", a.k.a. "ACE", which is used by Access 2010.)
The article goes on to describe several strategies for performing case-sensitive JOINs. Of those, the one that is most likely to work for you (where you need both your JOINs and your GROUP BY clauses to be case-sensitive) is the one they call "Hexadecimal Expansion" where you create another column in each table that will store a hexadecimal representation of the ID value. The function they suggest is...
Public Function StrToHex(X As Variant) As Variant
Dim I As Long, Temp As String
If IsNull(X) Then Exit Function
Temp = Space$(Len(X) * 2)
For I = 1 To Len(X)
Mid$(Temp, I * 2 - 1, 2) = Right$("0" & Hex$(Asc(Mid$(X, I, 1))), 2)
Next I
StrToHex = Temp
End Function
...which will work provided that your ID strings do not contain Unicode characters. That function will convert a string into the hex representation of each character, e.g.,
StrToHex("GORD") --> "474F5244"
StrToHex("gord") --> "676F7264"
Once you've populated your new "surrogate ID" fields with the corresponding hex values you can use those fields in your JOINs and GROUP BY clauses.
For more information see the above-mentioned support article:
How To Perform a Case-Sensitive JOIN Through Microsoft Jet
Group by is case sensitive in T-SQL even though db and server collations are CI
This is more of extended comment that real answer.
I believe this issue is coming from how SQL Server is attempting to evaluate case
statement expression.
To prove that server is case insensetive you can run the following two statements
SELECT CASE WHEN 'Bertrand' = 'bertrand' THEN 'true' ELSE 'false' end
-
DECLARE @base TABLE
(NAME VARCHAR(1)
,value INT
)
INSERT INTO @base Values('a',0),('b',0),('B',0)
SELECT * FROM @base
SELECT name, COUNT(value) AS Cnt
FROM @base
GROUP BY NAME
results:
as you can see here even though letter in second row is lower case and in third row is upper case, group by clause ignores the case. Looking at execution plan there are two expression for
Expr 1007 COUNT([value])
Expr 1004 CONVERT_IMPLICIT(int,[Expr1007],0)
now when we change it to case
SELECT CASE WHEN name = 'a' THEN 'adam' ELSE 'bertrand' END AS name, COUNT(value) AS Cnt
FROM @base
GROUP BY CASE WHEN name = 'a' THEN 'adam' ELSE 'bertrand' END
execution plan shows 3 expressions. 2 from above and new one
Expr 1004 CASE WHEN [NAME]='a' THEN 'adam' ELSE 'bertrand' END
so at this point aggregate function is no longer evaluating value of the column name
but now it evaluating value of the expression.
What i think is happening is, could be incorrect. When SQL server converts both CASE
statement in SELECT
and GROUP BY
clause to a expression it comes up with different expression value. In this case you might as well do 'bertrand'
in select
and 'charlie'
in group by
clause because if CASE
expression is not 100% match between select and group by clause SQL Server will consider them as different Expr
aka (columns) that no longer match.
Update:
To take this one step further, the following statement will also fail.
SELECT CASE WHEN name = 'a' THEN 'adam' ELSE UPPER('bertrand') END AS name
,COUNT(value) AS Cnt
FROM @base
GROUP BY CASE WHEN name = 'a' THEN 'adam' ELSE UPPER('Bertrand') END
Even wrapping the different case strings in UPPER()
function, SQL Server is still unable to process it.
SQLite using GROUP BY without being case sensitive
You could try the LOWER() function to make them all lower case.
You might also want to use TRIM() to address whitespace at either end.
SELECT count(Email) as Number, Email
FROM tbl_SAP_Users
GROUP BY LOWER(TRIM(Email))
HAVING Number > 1
ORDER BY Number DESC;
pandas groupby - case sensitive issues in groups
Just as a followup to this you don't actually need to overwrite the keyword when you do the grouping. You can instead do the whole transformation in the call to groupby
grouped = df.groupby(df['Keyword'].str.lower())
So as an example you could then have:
df = pandas.DataFrame({'Keyword': ['Attorney', 'ATTORNEY', 'foo'], 'x' : [1, 2, 42]})
df.groupby(df['Keyword'].str.lower()).sum()
Which outputs:
x
Keyword
attorney 3
foo 42
as you would expect
Related Topics
Ora-00972 Identifier Is Too Long Alias Column Name
Optional Arguments in Where Clause
Pseudo_Encrypt() Function in Plpgsql That Takes Bigint
Converting a String to Hex in SQL
For JSON Path Returns Less Number of Rows on Azure SQL
How to Represent a Data Tree in SQL
How to Find the Data Directory for a SQL Server Instance
Why & When Should I Use Sparse Column? (SQL Server 2008)
Mysql: Count Records from One Table and Then Update Another
Order by Items Must Appear in the Select List If Select Distinct Is Specified
"Order By" Using a Parameter for the Column Name
Spark Replacement for Exists and In
Generate Default Values in a Cte Upsert Using Postgresql 9.3