How to Use 'Like' Statement with Unicode Strings

How to use 'LIKE' statement with Unicode strings?

Make sure the column is a Unicode type (NCHAR or NVARCHAR) and that the collation on your table handles the characters you are matching.

SQL Server LIKE query for Unicode

Maybe something like the following will help:

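DECLARE @Tab TABLE (ID INT, Salary NVARCHAR(MAX))
INSERT @Tab VALUES (1, N'₹10,000'),(2,N'20,000'),(3,N'30,000'),(4,N'₹50,000')

-- UNICODE() returns the code point of the first character only,
-- so this keeps the rows whose Salary starts with ₹.
SELECT ID, CAST(Salary AS NVARCHAR(30)) AS Salary
FROM @Tab
WHERE UNICODE(Salary) = UNICODE(N'₹')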
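
The query above only inspects the first character, because UNICODE() returns the code point of the leftmost character. To stay with LIKE and match the symbol anywhere in the string, one option is to force a binary collation on the comparison. The following is a minimal sketch that reuses the same table variable. Latin1_General_BIN2 is my choice for illustration, and the sketch assumes that plain LIKE misbehaves because the column's collation gives ₹ no sort weight, which may not be your situation.

```sql
DECLARE @Tab TABLE (ID INT, Salary NVARCHAR(MAX));
INSERT @Tab VALUES (1, N'₹10,000'), (2, N'20,000'), (3, N'30,000'), (4, N'₹50,000');

-- The N prefix keeps the pattern as NVARCHAR instead of converting it
-- to the database code page.
-- The explicit binary collation (an assumption, chosen for illustration)
-- compares by code point, so the match does not depend on the linguistic
-- weight the default collation gives to the rupee sign.
SELECT ID, Salary
FROM @Tab
WHERE Salary LIKE N'%₹%' COLLATE Latin1_General_BIN2;
```

A binary collation also makes the match case-sensitive and accent-sensitive, so it fits symbol searches better than free-text searches.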

How can I get rid of having to prefix a WHERE query with 'N' for Unicode strings?

As @sworkalot has mentioned below:

The default for .Net is Unicode, that's why you don't need to specify
it. This is not the case for Sql Manager.

If the prefix is not specified, SQL Server treats the literal as VARCHAR and
converts it to the code page of the collation specified in your DB.

Hence, when writing literals in SQL Server you need to use the N'...' prefix.

https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/

Check out these examples, paying close attention to the data types and the values being assigned:

DECLARE @Varchar VARCHAR(100) = '嗄'
DECLARE @VarcharWithN VARCHAR(100) = N'嗄' -- Has N prefix

DECLARE @NVarchar NVARCHAR(100) = '嗄'
DECLARE @NVarcharWithN NVARCHAR(100) = N'嗄' -- Has N prefix

SELECT
Varchar = @Varchar,
VarcharWithN = @VarcharWithN,
NVarchar = @NVarchar,
NVarcharWithN = @NVarcharWithN

SELECT
Varchar = CONVERT(VARBINARY, @Varchar),
VarcharWithN = CONVERT(VARBINARY, @VarcharWithN),
NVarchar = CONVERT(VARBINARY, @NVarchar),
NVarcharWithN = CONVERT(VARBINARY, @NVarcharWithN)

Results:

Varchar  VarcharWithN  NVarchar  NVarcharWithN
?        ?             ?         嗄

Varchar  VarcharWithN  NVarchar  NVarcharWithN
0x3F     0x3F          0x3F00    0xC455

The NVARCHAR data type stores each character as one or more 2-byte UTF-16 code units, while VARCHAR under a single-byte code page stores only 1 byte per character (you can see this in the VARBINARY conversion in the 2nd SELECT). Since Chinese characters need at least 2 bytes to be represented, you have to use NVARCHAR to store them. If you try to stuff them into a VARCHAR, they are stored as ? and the original character information is lost. This also happens in the 3rd example: the literal doesn't have the N prefix, so it is converted to VARCHAR before the value is actually assigned to the variable.

It's because of this that you need to add the N prefix when typing these characters as literals, so the SQL engine knows that you are typing characters that need a 2-byte representation. So if you are comparing against an NVARCHAR column, always add the N prefix. You can change the database collation, but it's recommended to always use the proper data type regardless of the collation, so you don't run into problems when running the same code on different databases.

If you could explain why you want to omit the N prefix, we might be able to address that, although I believe there is no workaround in this particular case.
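
To make the comparison advice concrete, here is a small sketch of the same search written with and without the prefix. The table variable @Words and its column Word are hypothetical names introduced only for this illustration, and the comments assume a database code page that cannot represent 嗄.

```sql
DECLARE @Words TABLE (Word NVARCHAR(100));
INSERT @Words VALUES (N'嗄');

-- With the N prefix the literal stays NVARCHAR, so the search term
-- is the original character.
SELECT Word FROM @Words WHERE Word = N'嗄';

-- Without the prefix the literal is VARCHAR. Under a code page that
-- lacks this character it has already become '?' before the comparison
-- runs, so this is no longer the same search.
SELECT Word FROM @Words WHERE Word = '嗄';
```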

Unicode characters causing issues in SQL Server 2005 string comparison

I guess the Unicode collation set for your connection/table/database specifies that ss == ß. The latter behavior would be because it's on a faulty fast path, or maybe it does a binary comparison, or maybe you're not passing in the ß in the right encoding (I agree it's stupid).

http://unicode.org/reports/tr10/#Searching mentions that U+00DF is special-cased. Here's an insightful excerpt:

Language-sensitive searching and matching are closely related to collation. Strings that compare as equal at some strength level are those that should be matched when doing language-sensitive matching. For example, at a primary strength, "ß" would match against "ss" according to the UCA, and "aa" would match "å" in a Danish tailoring of the UCA.
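
One way to check whether the collation is responsible is to run the same comparison under two explicit collations. This is only a sketch: the two collation names are my own picks for illustration, and the result of the linguistic comparison depends on the collation and server version, so it is left unstated.

```sql
-- Linguistic (Windows) collation: the comparison follows language rules,
-- which may treat ß and ss as equal.
SELECT CASE WHEN N'ß' = N'ss' COLLATE Latin1_General_CI_AS
            THEN 'equal' ELSE 'different' END AS LinguisticResult;

-- Binary collation: the comparison is by code point, so the two
-- strings are different.
SELECT CASE WHEN N'ß' = N'ss' COLLATE Latin1_General_BIN2
            THEN 'equal' ELSE 'different' END AS BinaryResult;
```

If the two queries disagree, adding an explicit COLLATE clause to the affected comparison is one way to get code-point semantics without changing the column.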

Send Unicode string from MS Access to SQL Server using a DataSet

By using dynamic SQL to construct an SQL statement with string literals like 'this', you are implicitly converting the string from Unicode into the single-byte character set used by the SQL Server database, and any Unicode characters that do not map to that target character set will be replaced by question marks.

So, for example, with my SQL Server ...

cmd.CommandText = "INSERT INTO myTable (textCol) VALUES ('γιορτή')";
cmd.ExecuteNonQuery();

... will be inserted as ...

????t?

... even though [textCol] is defined as an NVARCHAR column.

The correct approach is to use a parameterized query, like so

cmd.CommandText = "INSERT INTO myTable (textCol) VALUES (@word)";
cmd.Parameters.Add("@word", System.Data.SqlDbType.NVarChar).Value = "γιορτή";
cmd.ExecuteNonQuery();

How to find corrupt record with unicode character in SQL and delete the record

charindex does the magic:

select *
from dbo.TestTable
where charindex(nchar(0xDB6D), SomeString) > 0
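
Once the SELECT returns the rows you expect, the same predicate can drive the delete. The sketch below wraps it in a transaction so the row count can be checked first; the transaction wrapper and the DeletedRows alias are my additions, while the table and column names come from the query above.

```sql
BEGIN TRANSACTION;

-- Same predicate as the SELECT, so the delete targets the rows
-- that query matches.
DELETE FROM dbo.TestTable
WHERE CHARINDEX(NCHAR(0xDB6D), SomeString) > 0;

-- Review the affected row count before making the change permanent.
SELECT @@ROWCOUNT AS DeletedRows;

-- COMMIT TRANSACTION;    -- run this if the count is what you expected
-- ROLLBACK TRANSACTION;  -- or run this to undo the delete
```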

