Set UTF-8 as default string encoding in Heroku
According to the Heroku support staff, setting the LANG config var does the trick:
heroku config:add LANG=en_US.UTF-8
Although heroku console will keep reporting string encodings as ASCII-8BIT, your actual app will run with the correct encoding, based on the LANG config var.
You can double check that by doing this:
$ heroku run bash
Running bash attached to terminal... up, run.2
u20415@022e95bf-3ab6-4291-97b1-741f95e7fbda:/app$ irb
irb(main):001:0> "a".encoding
=> #<Encoding:UTF-8>
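The same check can be scripted, for example to log the values when the app boots. This is a minimal sketch, assuming Ruby 2.3 or later; the encoding_report helper name is hypothetical and not part of Heroku or Ruby.

```ruby
# Hypothetical helper (not a Heroku or Ruby API): collect the encodings
# Ruby derived from the process environment.
def encoding_report(env = ENV)
  {
    lang: env['LANG'],
    default_external: Encoding.default_external.name,
    # default_internal is nil unless the application sets it
    default_internal: Encoding.default_internal&.name,
    locale: Encoding.find('locale').name
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

With LANG=en_US.UTF-8 in place, default_external and locale should normally report UTF-8; if they do not, the config var is probably not reaching the process.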
Why doesn't Heroku specify a default character encoding on their virtual machines?
Heroku support was extremely helpful in answering my question. Essentially, they just leave the encoding up to the buildpack for each language. Additionally they gave me links to how this is implemented in the buildpacks they maintain for Python and Ruby.
I found the Python script particularly helpful in setting the encoding automatically for Haskell apps (I copied parts of it for the Haskell buildpack). Here's the PR I created to set UTF-8 as a default in Haskell. You should specify the default character encoding in two places in the compile script in the buildpack API. First, it's probably a good idea to export the language before compiling the compiler itself, as they do for Python. Second (and most importantly for your application) you should set it in the .profile.d script so that it gets picked up as the context for your application.
Hope this is helpful info to folks making buildpacks for other languages!
Heroku and Rails: how to set utf-8 as the default encoding
In your config/application.rb:
config.encoding = "utf-8"
In database.yml:
development:
  adapter: mysql2  # or whichever adapter your database uses
  host: localhost
  encoding: utf8
You also have to add this magic comment (including the hash) as the first line of any Ruby source file that contains non-ASCII characters:
# encoding: UTF-8
Source: http://craiccomputing.blogspot.com/2011/02/rails-utf-8-and-heroku.html
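To see what the magic comment affects, here is a small sketch. The variable name greeting is only for illustration. Note that Ruby 2.0 and later already default to UTF-8 as the source encoding, so the comment mainly matters on Ruby 1.9.

```ruby
# encoding: UTF-8
# The magic comment above only takes effect on the first line of a file
# (or the second, when the first is a shebang).

greeting = 'Привет'     # non-ASCII literal; its encoding follows the source encoding
puts __ENCODING__       # source encoding of this file
puts greeting.encoding
puts greeting.length    # counted in characters, not bytes
```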
Heroku app does not display Cyrillic characters with UTF-8 as default encoding
If you have installed the locale buildpack, remove the .locales file you created, and remove the buildpack by going to your Heroku app's settings.
I fixed the problem by applying the following settings in my application.properties file:
spring.http.encoding.charset=UTF-8
spring.http.encoding.enabled=true
spring.http.encoding.force=true
As a side note, the first build after these settings are applied may be extra slow. If you are building in Eclipse and are used to your Heroku app building in around 1 minute, be aware that it may take over 5 minutes.
Default Charset on Heroku (US-ASCII) causing problems
Finally, I got in touch with the friendly staff at Heroku, who suggested overriding the file.encoding property via the JAVA_OPTS environment variable.
I issued the following from the Heroku Toolbelt, and things began working:
heroku config:add JAVA_OPTS='-Xmx384m -Xss512k -XX:+UseCompressedOops -Dfile.encoding=UTF-8'
This way the JVM picks it up, and Charset.defaultCharset() now returns UTF-8, with special characters appearing as they should.
They also said that we could alternatively do the following:
heroku config:add JAVA_TOOL_OPTIONS='-Dfile.encoding=UTF-8'
Also, it would be a good idea to embed this property right into the Procfile of the app, so that our code behaves the same when we push it to a new Heroku app.
Heroku: GET data is not retrieved as UTF-8
Heroku's support team solved my problem. If you are experiencing the same issue, this is the solution.
If you are using the latest version of webapp-runner (7.0.57.2), you can add the option --uri-encoding in your Procfile.
java ... -jar target/dependency/webapp-runner.jar --port $PORT --uri-encoding UTF-8 --expand-war target/*.war
Ruby converting string encoding from ISO-8859-1 to UTF-8 not working
You assign a string, in UTF-8. It contains ä. UTF-8 represents ä with two bytes.
string = 'ä'
string.encoding
# => #<Encoding:UTF-8>
string.length
# 1
string.bytes
# [195, 164]
Then you force the bytes to be interpreted as if they were ISO-8859-1, without actually changing the underlying representation. This does not contain ä any more. It contains two characters, Ã and ¤.
string.force_encoding('iso-8859-1')
# => "\xC3\xA4"
string.length
# 2
string.bytes
# [195, 164]
Then you translate that into UTF-8. Since this is not reinterpretation but translation, you keep the two characters, but now encoded in UTF-8:
string = string.encode('utf-8')
# => "Ã¤"
string.length
# 2
string.bytes
# [195, 131, 194, 164]
What you are missing is that you don't start with an ISO-8859-1 string, as you would get from your web service; you start with gibberish. Fortunately, this is all in your console tests; if you read the response of the website using the proper input encoding, it should all work okay.
For your console test, let's demonstrate that if you start with a proper ISO-8859-1 string, it all works:
string = 'Norrlandsvägen'.encode('iso-8859-1')
# => "Norrlandsv\xE4gen"
string = string.encode('utf-8')
# => "Norrlandsvägen"
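For completeness, the gibberish produced by the earlier force_encoding and encode steps can be undone as long as no bytes were lost along the way. The following is a minimal sketch; repair_double_encoded is a hypothetical helper name, and it only handles the one case described above (UTF-8 bytes misread as ISO-8859-1 and then transcoded to UTF-8).

```ruby
# Hypothetical helper: undo one round of "UTF-8 bytes misread as ISO-8859-1".
# Returns the input unchanged when it does not look double-encoded.
def repair_double_encoded(text)
  # Transcode back to single bytes, then relabel those bytes as UTF-8.
  repaired = text.encode('ISO-8859-1').force_encoding('UTF-8')
  repaired.valid_encoding? ? repaired : text
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  text # not representable in ISO-8859-1, so it was not garbled this way
end

# Reproduce the gibberish: "\u00E4" is the character ä as a UTF-8 string.
garbled = "\u00E4".dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts garbled.bytes.inspect
puts repair_double_encoded(garbled)
```

The valid_encoding? check keeps a correctly encoded string intact: a genuine é converted to ISO-8859-1 is a single byte that is not valid UTF-8 on its own, so the original is returned.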
EDIT: For your specific problem, this should work:
require 'net/https'
uri = URI.parse("https://rusta.easycruit.com/intranet/careerbuilder_se/export/xml/full")
options = {
  :use_ssl => uri.scheme == 'https',
  # VERIFY_NONE disables certificate verification; avoid it in production
  :verify_mode => OpenSSL::SSL::VERIFY_NONE
}
response = Net::HTTP.start(uri.host, uri.port, options) do |https|
  https.request(Net::HTTP::Get.new(uri.path))
end
# Label the raw bytes as ISO-8859-1, then transcode them to UTF-8
body = response.body.force_encoding('ISO-8859-1').encode('UTF-8')
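The last line generalizes to any body whose real encoding is known. Here is a hedged sketch that can be tried without a network call; decode_body and its parameters are hypothetical names, and the charset label is assumed to come from somewhere trustworthy, such as the response's Content-Type header.

```ruby
# Hypothetical helper: turn raw response bytes into a UTF-8 string.
# charset is an optional label (e.g. taken from a Content-Type header);
# unknown or missing labels fall back to ISO-8859-1.
def decode_body(raw, charset = nil, fallback: 'ISO-8859-1')
  source = begin
    Encoding.find(charset || fallback)
  rescue ArgumentError
    Encoding.find(fallback) # unknown encoding name
  end
  # dup so the caller's string is not relabelled in place
  raw.dup.force_encoding(source).encode('UTF-8')
end

raw = "Norrlandsv\xE4gen".b # bytes as an ISO-8859-1 service would send them
decoded = decode_body(raw)
puts decoded
puts decoded.encoding
```

Duplicating the input before force_encoding is a deliberate choice here: force_encoding mutates its receiver, which can surprise callers who still hold the original string.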