Difference between N'String' vs U'String' literals in Oracle
In this answer I will try to provide information from official resources.
(1) The N'' Text Literal
N'' is used to convert a string to the NCHAR or NVARCHAR2 datatype.
According to this Oracle documentation, Oracle - Literals:
The syntax of text literals is as follows:
(syntax diagram omitted)
where N or n specifies the literal using the national character set (NCHAR or NVARCHAR2 data).
Also, in this second article, Oracle - Datatypes:
The N'String' is used to convert a string to the NCHAR datatype.
From the article listed above:
The following example compares the translated_description column of the pm.product_descriptions table with a national character set string:
SELECT translated_description FROM product_descriptions
WHERE translated_name = N'LCD Monitor 11/PM';
(2) The U'' Literal
U'' is used to handle SQL NCHAR string literals in the Oracle Call Interface (OCI).
Based on this Oracle documentation, Programming with Unicode:
The Oracle Call Interface (OCI) is the lowest level API that the rest of the client-side database access products use. It provides a flexible way for C/C++ programs to access Unicode data stored in SQL CHAR and NCHAR datatypes. Using OCI, you can programmatically specify the character set (UTF-8, UTF-16, and others) for the data to be inserted or retrieved. It accesses the database through Oracle Net.
OCI is the lowest-level API for accessing a database, so it offers the best possible performance.
Handling SQL NCHAR String Literals in OCI
You can switch it on by setting the environment variable ORA_NCHAR_LITERAL_REPLACE to TRUE. You can also achieve this behavior programmatically by using the OCI_NCHAR_LITERAL_REPLACE_ON and OCI_NCHAR_LITERAL_REPLACE_OFF modes in OCIEnvCreate() and OCIEnvNlsCreate(). So, for example, OCIEnvCreate(OCI_NCHAR_LITERAL_REPLACE_ON) turns on NCHAR literal replacement, while OCIEnvCreate(OCI_NCHAR_LITERAL_REPLACE_OFF) turns it off.
[...] Note that, when the NCHAR literal replacement is turned on, OCIStmtPrepare and OCIStmtPrepare2 will transform N' literals with U' literals in the SQL text and store the resulting SQL text in the statement handle. Thus, if the application uses OCI_ATTR_STATEMENT to retrieve the SQL text from the OCI statement handle, the SQL text will return U' instead of N' as specified in the original text.
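To make the N' to U' rewrite more concrete, here is a rough Java sketch of what rendering a string as a U'' literal might look like. It is not Oracle's implementation. It assumes that a U'' literal accepts backslash-plus-four-hex-digit escapes per UTF-16 code unit (as UNISTR does), that a backslash is written doubled, and that a single quote is doubled as in any SQL literal. The class ULiteralSketch and the method toULiteral are hypothetical names used only for illustration.

```java
class ULiteralSketch {
    // Hypothetical helper: renders text as a U'...' literal.
    // Assumption: non-ASCII code units are written as a backslash
    // followed by four hex digits, in the same style as UNISTR.
    static String toULiteral(String text) {
        StringBuilder sb = new StringBuilder("U'");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\'') {
                sb.append("''");              // SQL quote doubling
            } else if (c == '\\') {
                sb.append("\\\\");            // assumed: backslash is doubled
            } else if (c < 0x20 || c > 0x7E) {
                sb.append(String.format("\\%04X", (int) c)); // escape the code unit
            } else {
                sb.append(c);                 // printable ASCII stays as-is
            }
        }
        return sb.append('\'').toString();
    }

    public static void main(String[] args) {
        // The sample string contains U+00FC and U+00DF.
        System.out.println(toULiteral("Gr\u00FC\u00DFe")); // prints U'Gr\00FC\00DFe'
    }
}
```

The point of such a rewrite is that the escaped form contains only ASCII, so the literal can pass through a client character set that cannot represent the original characters.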
(3) Answer to your question
From a datatype perspective, there is no difference between the two queries provided.
Unicode- VARCHAR and NVARCHAR
The column type nvarchar allows you to store Unicode characters, which basically means almost any character from almost any language (including modern languages and some obsolete languages), and a good number of symbols too.
How to convert a string with Unicode encoding to a string of letters
Technically doing:
String myString = "\u0048\u0065\u006C\u006C\u006F World";
automatically converts it to "Hello World", so I assume you are reading in the string from some file. In order to convert it to "Hello" you'll have to parse the text into the separate Unicode escapes (take the \uXXXX and just get XXXX), then do Integer.parseInt(XXXX, 16) to get the numeric value, and then cast that to char to get the actual character.
Edit: Some code to accomplish this:
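// Take the first space-separated token (the escaped form of "Hello")
String str = myString.split(" ")[0];
// Drop the backslashes so only "uXXXX" groups remain
str = str.replace("\\", "");
String[] arr = str.split("u");
StringBuilder text = new StringBuilder();
// arr[0] is the empty string before the first 'u', so start at 1
for (int i = 1; i < arr.length; i++) {
    int hexVal = Integer.parseInt(arr[i], 16);
    text.append((char) hexVal);
}
// text.toString() is now "Hello"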
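The split-based code above only handles the first word and assumes every 'u' belongs to an escape. A somewhat more defensive variant of the same idea can be sketched with a regular expression that decodes every backslash-u escape in place and leaves the rest of the text untouched. The class UnicodeEscapeDecoder and the method decode are hypothetical names, and the sketch assumes the input really contains the six-character escape sequences as text (for example, read from a file).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeEscapeDecoder {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous escape and this one.
            out.append(input, last, m.start());
            // Parse the four hex digits and append the resulting char.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escapes as plain text in the string.
        String raw = "\\u0048\\u0065\\u006C\\u006C\\u006F World";
        System.out.println(decode(raw)); // prints Hello World
    }
}
```

Because each escape is one UTF-16 code unit, characters outside the Basic Multilingual Plane arrive as two consecutive escapes (a surrogate pair), and appending both chars in order reconstructs them.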
String literals and escape characters in postgresql
Partially. The text is inserted, but the warning is still generated.
I found a discussion that indicated the text needed to be preceded with 'E', as such:
insert into EscapeTest (text) values (E'This is the first part \n And this is the second');
This suppressed the warning, but the text was still not being returned correctly. When I added the additional slash as Michael suggested, it worked.
As such:
insert into EscapeTest (text) values (E'This is the first part \\n And this is the second');
How to convert string to unicode using PostgreSQL?
PostgreSQL databases have a native character type, the "server encoding". It is usually utf-8.
All text is in this encoding. Mixed encoding text is not supported, except if stored as bytea (i.e. as opaque byte sequences).
You can't store "unicode" or "non-unicode" strings, and PostgreSQL has no concept of "varchar" vs "nvarchar". With utf-8, characters that fall in the 7-bit ASCII range (and some others) are stored as a single byte, and wider chars require more storage, so it's just automatic. utf-8 requires more storage than ucs-2 or utf-16 for text that is all "wide" characters, but less for text that's a mixture.
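The storage trade-off described above can be checked outside the database. The following is a small sketch that only measures encoded byte lengths in Java; it says nothing about PostgreSQL's on-disk overhead. The class EncodingSizes and its methods are hypothetical names, and UTF-16LE is used so that no byte order mark is counted.

```java
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes needed to encode s as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s as UTF-16 (little endian, no BOM).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String ascii = "Hello";
        String wide = "\u65E5\u672C\u8A9E";   // three CJK characters
        String mixed = ascii + " " + wide;

        System.out.println(utf8Bytes(ascii) + " vs " + utf16Bytes(ascii)); // prints 5 vs 10
        System.out.println(utf8Bytes(wide) + " vs " + utf16Bytes(wide));   // prints 9 vs 6
        System.out.println(utf8Bytes(mixed) + " vs " + utf16Bytes(mixed)); // prints 15 vs 18
    }
}
```

ASCII text takes half the space in UTF-8, the all-CJK sample takes 50% more, and the mixed sample still comes out smaller in UTF-8.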
PostgreSQL automatically converts to/from the client's text encoding, using the client_encoding setting. There is no need to convert explicitly.
If your client is "Unicode" (which Microsoft products tend to say when they mean UCS-2 or UTF-16), then most client drivers take care of any utf-8 <--> utf-16 conversion for you.
So you should not need to care, so long as your client does I/O with correct charset options and sets a correct client_encoding that matches the data it actually sends on the wire. (This is automatic with most client drivers like PgJDBC, Npgsql, or the Unicode psqlODBC driver.)
See:
- character set support