Ruby on Rails "Invalid Byte Sequence in UTF-8" Due to Bot

Ruby on Rails invalid byte sequence in UTF-8 due to bot

So you don't have to piece together the comments in my other reply, this is what I'm doing now – I've seen no errors for 24 hours, so it looks very promising:

Add rack-utf8_sanitizer to your Gemfile:

gem 'rack-utf8_sanitizer'

and run

bundle

Put this middleware in app/middleware/handle_invalid_percent_encoding.rb and rename the class to HandleInvalidPercentEncoding (the original name, ExceptionApp, is a bit too general).
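The middleware referred to above is not reproduced here. As a rough idea of its shape, here is a minimal sketch of a Rack middleware that turns percent-encoding failures into a 400 response and re-raises everything else. This is an assumption, not the original code: the response body, the headers, and the check for the message text "invalid %-encoding" (the wording Ruby's URI decoder uses in its ArgumentError) are illustrative choices.

```ruby
# Sketch only, not the middleware the answer links to.
class HandleInvalidPercentEncoding
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue ArgumentError => e
    # Anything other than a percent-encoding failure is not ours to handle.
    raise unless e.message.include?("invalid %-encoding")

    # Assumed response: a plain-text 400. Header names are lowercase so the
    # response is acceptable to both older and newer Rack versions.
    body = "Bad request: invalid percent-encoding in the URL.\n"
    headers = {
      "content-type" => "text/plain; charset=utf-8",
      "content-length" => body.bytesize.to_s
    }
    [400, headers, [body]]
  end
end
```

Because it re-raises errors it does not recognise, other encoding problems still need Rack::UTF8Sanitizer to clean the request first.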

In the config block of config/application.rb do:

require "#{Rails.root}/app/middleware/handle_invalid_percent_encoding.rb"


# NOTE: These must be in this order relative to each other.
# HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover,
# so it must run after (= be inserted before) Rack::UTF8Sanitizer.
config.middleware.insert 0, HandleInvalidPercentEncoding
config.middleware.insert 0, Rack::UTF8Sanitizer # from a gem

Deploy. Done.

(app happens to be the location for middleware in the project I'm working on, but I'd probably prefer lib. Whatever. Either should work.)

Rails send_data throws invalid byte sequence in UTF-8 ... but why?

Rails assumes the string is UTF-8. Telling it explicitly that the data is binary solves the problem. Thanks for your help.

pdfdata.force_encoding('BINARY')
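To see why the one-liner helps, here is a small sketch in plain Ruby (no Rails). The bytes below are a made-up stand-in for real PDF output, and the variable names other than pdfdata are illustrative. A string tagged as UTF-8 that contains arbitrary bytes fails as soon as something runs a regexp over it. Retagging it as binary changes no bytes, only the encoding label.

```ruby
# Stand-in for PDF output: a header followed by raw bytes that are not valid UTF-8.
pdfdata = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".dup.force_encoding("UTF-8")

valid_as_utf8 = pdfdata.valid_encoding?   # false: the bytes are not well-formed UTF-8

match_error = nil
begin
  pdfdata =~ /PDF/                        # regexp match on a broken UTF-8 string
rescue ArgumentError => e
  match_error = e.message                 # the "invalid byte sequence" error
end

bytes_before = pdfdata.bytes
pdfdata.force_encoding("BINARY")          # relabel only; the bytes stay the same
valid_as_binary = pdfdata.valid_encoding? # true: any byte is valid in a binary string
match_index = pdfdata =~ /PDF/            # matching now works
```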

Rails ActiveRecord invalid byte sequence in UTF-8 issue

If the stream is configured as a UTF-8 stream, you can't write compressed binary data to it (such data may contain any byte value).

I think marking the data as binary before the write might help:

data.force_encoding "ASCII-8BIT"
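As a hedged illustration with compressed data, the sketch below uses Zlib from the standard library. The helper names compress_as_binary and decompress_to_utf8 are hypothetical, and how the result is stored (for example in a binary column) is outside what this shows.

```ruby
require "zlib"

# Hypothetical helper: compress text and tag the result explicitly as raw bytes.
def compress_as_binary(text)
  Zlib::Deflate.deflate(text).force_encoding("ASCII-8BIT")
end

# Hypothetical helper: decompress and tag the result as UTF-8 text again.
# This assumes the original text was UTF-8.
def decompress_to_utf8(data)
  Zlib::Inflate.inflate(data).force_encoding("UTF-8")
end

original = "h\u00e9llo w\u00f6rld " * 20
packed   = compress_as_binary(original)
restored = decompress_to_utf8(packed)
```

The compressed string is valid in its binary encoding whatever bytes it contains; only the decompressed text is treated as UTF-8.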

invalid byte sequence in UTF-8 on page request

We created a Rails middleware that filters out all the strange encodings that cannot be handled within our app.

The problem we encounter is that some requests arrive with strange encodings, for example Cp1252 / Windows-1252. When Ruby 1.9 tries to match those strings against UTF-8 regexps, it blows up.

I tried various ways of dealing with this problem using iconv, but it looks like solutions that work on my Mac don't work on the servers. So the simplest approach is probably the best...
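The middleware itself is not shown above, so the following is only a sketch of the core step such a filter could perform, not the author's code. It uses String#encode instead of iconv (which was removed from the standard library in Ruby 2.0). The helper name to_utf8 and the choice to treat anything that is not valid UTF-8 as Windows-1252 are assumptions made for illustration.

```ruby
# Hypothetical helper: return a valid UTF-8 copy of a raw request string.
def to_utf8(raw)
  candidate = raw.dup.force_encoding("UTF-8")
  return candidate if candidate.valid_encoding?

  # Assumption: bytes that are not valid UTF-8 came from a Windows-1252 client.
  # Bytes with no Windows-1252 mapping are replaced with "?".
  raw.dup
     .force_encoding("Windows-1252")
     .encode("UTF-8", invalid: :replace, undef: :replace, replace: "?")
end

from_cp1252 = to_utf8("caf\xE9".b)   # 0xE9 is "é" in Windows-1252
from_utf8   = to_utf8("caf\u00e9")   # already valid UTF-8, returned as a copy
```

A middleware could apply a helper like this to the request values it cares about before the app sees them.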


