Mail gem - how to clean up the body string
If you have a properly formatted email, you can use Mail helper methods:
mail = Mail.new(email_string)
mail.text_part # finds the first text/plain part
mail.html_part # finds the first text/html part
This doesn't always work, for example with single-part (text-only) messages, or with email received from the internet at large, since you can't rely on every client out there formatting messages properly. Believe me, I've learned the hard way.
Ruby Mail gem extract headers and clean up body
Yes, it's because real-world email has all kinds of surprises that don't fit the protocol.
To get the header part and body part:
header_part, body_part = message.body.decoded.split(/\n\s*\n/m, 2)
You may find some useful patterns for your parsing in this file:
lib/mail/patterns.rb
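The blank-line split described above can be tried on a plain string without the gem. This is only a sketch: the sample text is invented, and the header hash it builds ignores folded (multi-line) headers.

```ruby
# Hypothetical body text in which stray headers leaked into the body.
raw = "Content-Type: text/plain; charset=UTF-8\r\n" \
      "Content-Transfer-Encoding: 7bit\r\n" \
      "\r\n" \
      "Hello there,\r\n" \
      "\r\n" \
      "Second paragraph.\r\n"

# Normalize CRLF to LF, then split once at the first blank line.
normalized = raw.gsub("\r\n", "\n")
header_part, body_part = normalized.split(/\n\s*\n/, 2)

# Turn "Name: value" lines into a hash (folded headers are not handled).
headers = header_part.lines.map(&:chomp).each_with_object({}) do |line, acc|
  name, value = line.split(/:\s*/, 2)
  acc[name] = value if value
end

puts headers["Content-Type"]
puts body_part
```

The limit of 2 matters: without it, every later blank line inside the body would produce another split.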
How to clean up a string (email body) with regards to special characters?
You need to use a MIME parser, which should take care of removing the headers and decoding the quoted-printable encoding. Depending on the layout of your email, BODY[TEXT] might get you a lot more than you want. You need to either download the BODYSTRUCTURE and pick out the parts you want, or download the entire message (BODY[]) and use a MIME parser.
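As a small illustration of what the quoted-printable step undoes, Ruby's built-in String#unpack1 with the "M" directive decodes such text. The payload below is invented, and UTF-8 is assumed as the declared charset; a real MIME parser would read the charset from the part's headers.

```ruby
# Hypothetical quoted-printable payload: "=C3=A9" is the UTF-8 byte pair
# for "é", and a trailing "=" marks a soft line break that should vanish.
encoded = "R=C3=A9sum=C3=A9 attached, see you =\nsoon."

# unpack1("M") returns raw bytes tagged as binary, so label them with the
# charset from the Content-Type header (assumed to be UTF-8 here).
decoded = encoded.unpack1("M").force_encoding("UTF-8")

puts decoded
```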
Rails - Mail, getting the body as Plain Text
The code above:
message = Mail.new(params[:message])
will create a new instance of the mail gem from the full message. You can then use any of the methods on that message to get the content. You can therefore get the plain content using:
message.text_part
or the HTML with
message.html_part
These methods simply find the first part in a multipart message with a text/plain or text/html content type. CloudMailin also provides these as convenience parameters via params[:plain] and params[:html]. It's worth remembering that the message is never guaranteed to have a plain or HTML part. It may be worth using something like the following to be sure:
plain_part = message.multipart? ? (message.text_part ? message.text_part.body.decoded : nil) : message.body.decoded
html_part = message.html_part ? message.html_part.body.decoded : nil
As a side note, it's also important to extract the content encoding (charset) from the message when you use these methods and make sure that the output is converted to the encoding you want (such as UTF-8).
Mail gem determine whether plaintext or html
You could look at maildata.content_type:
maildata.content_type
#=> "text/plain; charset=us-ascii"
If it's a multipart e-mail, you could have both plain text and HTML. You could then look at the parts array to see which content types it includes:
maildata.content_type
#=> "multipart/alternative; boundary=\"--==_mimepart_4f848491e618f_7e4b6c1f3849940\"; charset=utf-8"
maildata.parts.collect { |part| part.content_type }
#=> ["text/plain; charset=utf-8", "text/html; charset=utf-8"]
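Since content_type returns the full header value with its parameters, a small helper can reduce it to a simple label. The `body_kind` name and its return symbols are hypothetical; this is just one way to classify the strings shown above.

```ruby
# Hypothetical helper: classify a Content-Type header value.
def body_kind(content_type)
  # Drop parameters such as charset or boundary, keep only "type/subtype".
  mime_type = content_type.to_s.split(";").first.to_s.strip.downcase
  case mime_type
  when "text/plain" then :plain
  when "text/html" then :html
  when %r{\Amultipart/} then :multipart
  else :other
  end
end

puts body_kind("text/plain; charset=us-ascii")
puts body_kind('multipart/alternative; boundary="abc"; charset=utf-8')

# For a multipart message, classify each part's content type in turn.
part_types = ["text/plain; charset=utf-8", "text/html; charset=utf-8"]
puts part_types.map { |type| body_kind(type) }.inspect
```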
Rails gmail gem: Get correct address of sender and message body
A work-around that I found is with Net::IMAP.
Note that the sender's email address from the From header is obtained with mail.from[0].
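require 'net/imap'
require 'mail'

# Connect over SSL. Keyword options replace the old positional arguments,
# and certificate verification is left enabled.
imap = Net::IMAP.new('imap.gmail.com', port: 993, ssl: true)
imap.login(USERNAME, PASSWORD)
imap.select('INBOX')
imap.search(['ALL']).each do |message_id|
  # Fetch the full raw message and parse it with the Mail gem
  raw_message = imap.fetch(message_id, 'RFC822')[0].attr['RFC822']
  mail = Mail.read_from_string(raw_message)
  @email = Email.create(:subject => mail.subject, :message => mail.body.decoded, :sender => mail.from[0], :date => mail.date)
end
imap.logout
imap.disconnect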
Character encoding with Ruby 1.9.3 and the mail gem
After playing a bit, I found this:
body.decoded.force_encoding("ISO-8859-1").encode("UTF-8") # => "This reply has accents: Résumé..."
message.parts.map { |part| part.decoded.force_encoding("ISO-8859-1").encode(part.charset) } # multi-part
You can extract the charset from the message like so.
message.charset #=> for simple, non-multipart
message.parts.map { |part| part.charset } #=> for multipart, each part can have its own charset
Be careful with non-multipart, as the following can cause trouble:
body.charset #=> returns "US-ASCII" which is WRONG!
body.decoded.force_encoding(body.charset).encode("UTF-8") #=> Conversion error...
body.decoded.force_encoding(message.charset).encode("UTF-8") #=> Correct conversion :)
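The effect of picking the right or wrong charset can be reproduced with plain Ruby strings. In this sketch, `to_utf8` is a hypothetical helper and the sample bytes are invented. It replaces undecodable bytes instead of raising, which is a design choice rather than anything the gem does.

```ruby
# Hypothetical helper: label raw bytes with the declared charset, then
# convert to UTF-8, replacing anything that cannot be converted.
def to_utf8(bytes, declared_charset)
  encoding = begin
    Encoding.find(declared_charset.to_s)
  rescue ArgumentError
    Encoding::BINARY # unknown or missing charset name
  end
  bytes.dup.force_encoding(encoding)
       .encode("UTF-8", invalid: :replace, undef: :replace)
end

# "Résumé" as ISO-8859-1 bytes, the way it might arrive on the wire.
raw_bytes = "R\xE9sum\xE9".b

good = to_utf8(raw_bytes, "ISO-8859-1") # correct charset: accents survive
bad  = to_utf8(raw_bytes, "US-ASCII")   # wrong charset: accents are lost

puts good
puts bad
```

Without the replace options, the US-ASCII case would raise a conversion error instead, which matches the trouble described above.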