Convert UTF-8 String Classic ASP to SQL Database
Paul's answer isn't wrong, but it is not the only part to consider.
You will need to go through each of these steps to make sure that you get consistent results:
IMPORTANT: These steps have to be performed on each and every page in your web application or you will have problems (emphasized by Paul's comment).
- Each page needs to be saved using UTF-8 encoding. Double-check this, as some IDEs default to Windows-1252 (often misnamed "ANSI").
- Each page needs the following line added as its very first line. To make this easier, I put it, along with some other values, in an include file that I include in each page as I go.
Include File - page_encoding.asp
<%@Language="VBScript" CodePage = 65001 %>
<%
Response.CharSet = "UTF-8"
Response.CodePage = 65001
%>
Usage at the top of an ASP page (I prefer to put the include in a config folder at the root of the web):
<!-- #include virtual="/config/page_encoding.asp" -->
Response.Charset = "UTF-8" is the equivalent of setting the ;charset parameter in the HTTP Content-Type header. Response.CodePage = 65001 tells ASP to process all dynamic strings as UTF-8.
- Include files used by the page must also be saved using UTF-8 encoding (double-check these too).
Follow these steps and your page will work. Your problem at the moment is that some pages are being interpreted as Windows-1252 while others are being treated as UTF-8, so you end up with an encoding mismatch.
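To see which pages disagree, a small diagnostic page can help. The following is only a sketch: the page itself is hypothetical, it assumes the file is saved as UTF-8, and it assumes session state is enabled (otherwise remove the Session.CodePage line). The sample text is borrowed from the question further down.

```asp
<%@Language="VBScript" CodePage = 65001 %>
<%
' Hypothetical diagnostic page: reports the code pages ASP is using for
' this request so that individual pages can be compared.
Response.CodePage = 65001
Response.Charset = "UTF-8"

Response.Write "Response.CodePage: " & Response.CodePage & "<br />"
' Requires session state to be enabled for the application.
Response.Write "Session.CodePage: " & Session.CodePage & "<br />"
' This literal renders correctly only when the file encoding, the code
' page and the charset sent to the browser all agree.
Response.Write "Sample: Мухтарам Мизоч"
%>
```

If the sample line shows up as gibberish while the reported code page is 65001, the file itself was most likely not saved as UTF-8.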
Classic ASP - How to convert a UTF-8 string to UTF-16?
So sick of answering this question, but I feel impelled to as you have made a common assumption that many make when it comes to encoding in ASP, PHP or whatever language you are using.
In web development, encoding is intrinsically linked to the source encoding you use to save the web page.
Just looking at the comments under the iconv reference made me laugh and feel sad at the same time, because there are so many people out there who don't understand this topic.
Take for example your PHP snippet
iconv("utf-8","ucs-2be","Мухтарам Мизоч");
This will work provided the following are true:
- The page author saved the file using UTF-8 encoding (most modern editors have this option in some shape or form).
- The client browser knows it should handle the page as UTF-8, either via a meta tag in the HTML, <meta http-equiv="content-type" content="text/html; charset=utf-8">, or via an HTTP Content-Type header.
In terms of Classic ASP it is the same; you need to:
- Make sure the page is saved using UTF-8 encoding; this includes any #include files that are dependencies.
- Tell IIS that your pages are UTF-8 by specifying this pre-processing instruction at the very top of the page (it must be the first line).
<%@Language="VBScript" CodePage = 65001 %>
- Tell the browser what encoding you are using.
<%
'Tell server to send all strings back to the client as UTF-8
'while also setting the charset in the HTTP Content Type header.
Response.CodePage = 65001
Response.ContentType = "text/html"
Response.Charset = "UTF-8"
%>
UPDATE:
Neither UCS-2 (UTF-16 LE) nor UCS-2BE (UTF-16 BE) is supported by Classic ASP; specifying either CodePage (1200 or 1201) will result in:
ASP 0203 - Invalid CodePage Value
After reading a bit about Kannel, it does appear that you can control the character set you send to the SMS gateway; I would recommend you try sending it using UTF-8.
Links
Sending arabic SMS in kannel (This question is about sending arabic SMS using Java to Kannel but the information is relevant).
Unicode on Windows XP (Although aimed at Windows XP the codepage information is still relevant).
Classic ASP, MySQL or ODBC UTF8 encoding
You have a chance for Slovenian letters according to this mapping and an excerpt from Windows-1252 wiki article:
According to the information on Microsoft's and the Unicode Consortium's websites,
positions 81, 8D, 8F, 90, and 9D are unused; however, the Windows API
MultiByteToWideChar maps these to the corresponding C1 control codes.
The euro character at position 80 was not present in earlier versions of this code page,
nor were the S, s, Z, and z with caron (háček).
Here are the things to do:
Use UTF-8 (without BOM) encoded files, in case they contain hard-coded text. (✔ already done)
Specify UTF-8 for the response charset, with ASP on the server side or with meta tags on the client side. (✔ already done)
Tell the MySQL server that your commands are in charset utf-8 and that you expect utf-8 encoded result sets. Add an initial statement to the connection string:
...;stmt=SET NAMES 'utf8';...
Set the Response.CodePage to 1252.
I've tested the following script and it works like a charm.
DDL: http://sqlfiddle.com/#!8/c2c35/1
ASP:
<%@Language=VBScript%>
<%
Option Explicit
Response.CodePage = 1252
Response.LCID = 1060
Response.Charset = "utf-8"
Const adCmdText = 1, adVarChar = 200, adParamInput = 1, adLockOptimistic = 3
Dim Connection
Set Connection = Server.CreateObject("Adodb.Connection")
Connection.Open "Driver={MySQL ODBC 3.51 Driver};Server=localhost;Database=myDb;User=myUsr;Password=myPwd;stmt=SET NAMES 'utf8';"
If Request.Form("name").Count = 1 And Len(Request.Form("name")) Then 'add new
Dim rsAdd
Set rsAdd = Server.CreateObject("Adodb.Recordset")
rsAdd.Open "names", Connection, ,adLockOptimistic
rsAdd.AddNew
rsAdd("name").Value = Left(Request.Form("name"), 255)
rsAdd.Update
rsAdd.Close
Set rsAdd = Nothing
End If
Dim Command
Set Command = Server.CreateObject("Adodb.Command")
Command.CommandType = adCmdText
Command.CommandText = "Select name From `names` Order By id Desc"
If Request.QueryString("name").Count = 1 And Len(Request.QueryString("name")) Then
Command.CommandText = "Select name From `names` Where name = ? Order By id Desc"
Command.Parameters.Append Command.CreateParameter(, adVarChar, adParamInput, 255, Left(Request.QueryString("name"), 255))
End If
Set Command.ActiveConnection = Connection
With Command.Execute
While Not .Eof
Response.Write "<a href=""?name=" & .Fields("name").Value & """>" & .Fields("name").Value & "</a><br />"
.MoveNext
Wend
.Close
End With
Set Command.ActiveConnection = Nothing
Set Command = Nothing
Connection.Close
%><hr />
<a href="?">SHOW ALL</a><hr />
<form method="post" action="<%=Request.ServerVariables("SCRIPT_NAME")%>">
Name : <input type="text" name="name" maxlength="255" /> <input type="submit" value="Add" />
</form>
As a last remark:
When you need to apply HTML encoding to strings fetched from the database, you shouldn't use Server.HTMLEncode anymore: Response.CodePage is 1252 on the server side, and since Server.HTMLEncode depends on the context code page, it will produce gibberish output.
So you'll need to write your own HTML encoder to handle this case.
Function MyOwnHTMLEncode(ByVal str)
str = Replace(str, "&", "&amp;")
str = Replace(str, "<", "&lt;")
str = Replace(str, ">", "&gt;")
str = Replace(str, """", "&quot;")
MyOwnHTMLEncode = str
End Function
'Response.Write MyOwnHTMLEncode(rs("myfield").value)
How do I convert UTF-8 data from Classic asp Form post to UCS-2 for inserting into SQL Server 2008 r2?
You have to tell SQL Server 2008 that you are sending Unicode data by adding an N in front of your insert value, like this:
strTest = "Служба мгновенных сообщений"
strSQL = "INSERT INTO tblTest (test) VALUES (N'"&strTest&"')"
The N tells SQL Server to treat the contents as Unicode, so the data is not corrupted.
See http://support.microsoft.com/kb/239530 for further info.
Here is test code, run on Classic ASP with IIS 7 and SQL Server 2008 R2:
CREATE TABLE [dbo].[tblTest](
[test] [nvarchar](255) NULL,
[id] [int] IDENTITY(1,1) NOT NULL
)
ASP Page
<%
Response.CodePage = 65001
Response.CharSet = "utf-8"
strTest = Request("Test")
Set cnn = Server.CreateObject("ADODB.Connection")
strConnectionString = Application("DBConnStr")
cnn.Open strConnectionString
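'Double any single quotes so the value cannot break out of the string literal
strSQL = "INSERT INTO tblTest (test) VALUES (N'" & Replace(strTest, "'", "''") & "')"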
Set rsData = cnn.Execute(strSQL)
%>
<html xmlns="http://www.w3.org/1999/xhtml" charset="utf-8">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title></title>
</head>
<body>
<form action="test.asp" method="post" name="form1" >
<br/><br/><br/><center>
<table border="1">
<tr><td><b>Test SQL Write</b> </td></tr>
<tr><td><input type="text" name="Test" style="width: 142px" value="<%=strTest%>" /></td></tr>
<tr><td><input type="submit" value="Submit" name="Submit" /></td></tr></table> </center>
</form>
</body>
</html>
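As an alternative to concatenating the N'...' literal, the value can be sent as a parameter. This is a minimal sketch, not the tested code from the answer above: it assumes the same tblTest table and the Application("DBConnStr") connection string, and the constant values are the standard ADO ones. Declaring the parameter as adVarWChar makes ADO send it as Unicode, and it also avoids problems with quotes in the input.

```asp
<%@Language="VBScript" CodePage = 65001 %>
<%
Response.CodePage = 65001
Response.Charset = "utf-8"

' Standard ADO constants (declared here because adovbs.inc is not included)
Const adCmdText = 1, adVarWChar = 202, adParamInput = 1
Const adExecuteNoRecords = 128

Dim cnn, cmd, strTest
strTest = Request.Form("Test")

Set cnn = Server.CreateObject("ADODB.Connection")
cnn.Open Application("DBConnStr")

Set cmd = Server.CreateObject("ADODB.Command")
Set cmd.ActiveConnection = cnn
cmd.CommandType = adCmdText
cmd.CommandText = "INSERT INTO tblTest (test) VALUES (?)"
' adVarWChar marks the parameter as a Unicode (nvarchar) value;
' 255 matches the column size, so the input is truncated to fit.
cmd.Parameters.Append cmd.CreateParameter("test", adVarWChar, adParamInput, 255, Left(strTest, 255))
' No recordset is needed for an INSERT.
cmd.Execute , , adCmdText Or adExecuteNoRecords

Set cmd = Nothing
cnn.Close
Set cnn = Nothing
%>
```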
Problems with running UTF-8 encoded sql files in Classic ASP
FileSystemObject definitely does not handle UTF-8, only Unicode (UTF-16) and ANSI.
ADODB.Stream can handle a lot of character sets including utf-8 so you can use it instead.
Replace your code up to the first For with the following.
Dim arrSqlLines
With Server.CreateObject("Adodb.Stream")
.Charset = "utf-8"
.Open
.LoadFromFile filePath
If .EOS Then
'an empty array if file is empty
arrSqlLines = Array()
Else
'to obtain an array of lines like you desire
'remove carriage returns (vbCr) if exist
'and split the text by using linefeeds (vbLf) as delimiter
arrSqlLines = Split(Replace(.ReadText, vbCr, ""), vbLf)
End If
.Close
End With
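The original loop is not shown here, so what follows is only a guess at how the resulting array might be consumed. ExecuteSqlLines is a hypothetical helper; it assumes an open ADODB.Connection is passed in and that each line of the file holds one complete SQL statement.

```asp
<%
' Hypothetical helper: runs each non-empty line as a separate statement.
' cnn         - an open ADODB.Connection (assumed)
' arrSqlLines - the array produced by Split(...) above
Sub ExecuteSqlLines(cnn, arrSqlLines)
    Const adCmdText = 1, adExecuteNoRecords = 128
    Dim i, sqlLine
    ' UBound of an empty array is -1, so the loop is skipped for empty files.
    For i = 0 To UBound(arrSqlLines)
        sqlLine = Trim(arrSqlLines(i))
        If Len(sqlLine) > 0 Then
            cnn.Execute sqlLine, , adCmdText Or adExecuteNoRecords
        End If
    Next
End Sub
%>
```

Statements that span several lines would need a different delimiter (for example a semicolon) instead of the line-by-line split.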