Extract all email addresses from some .txt documents using Ruby
Depending on the nature of your .txt documents, you don't have to use one of the complicated regexes that attempt to validate email addresses. You're not trying to validate anything. You're just trying to grab what's already there. Generally speaking, a regex to grab what's already there can be much simpler than a regex that needs to validate input.
An important question is whether your .txt documents contain @ signs that are not part of an email address you want to extract.
This regex handles your first two requirements:
\w+@[\w.-]+|\{(?:\w+, *)+\w+\}@[\w.-]+
Or if you want to allow any sequence of non-space characters containing an @ sign, plus your second requirement (which has spaces):
\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+
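As a rough sketch of how the first regex above might be applied in Ruby: the sample text and the method names extract_emails and emails_in_files are made up for illustration, and the file paths are assumed to point at plain-text documents.

```ruby
# Hypothetical sample; real input would come from the .txt documents.
SAMPLE_TEXT = "Write to alice@example.com or {carol, dave}@example.net for details"

# The first regex from the answer: plain addresses, or a {a, b}@host group.
SIMPLE_EMAIL = /\w+@[\w.-]+|\{(?:\w+, *)+\w+\}@[\w.-]+/

# Return every match in a string. The regex has no capture groups,
# so scan yields an array of plain strings.
def extract_emails(text)
  text.scan(SIMPLE_EMAIL)
end

# Collect the matches from several files (paths is an assumed list of file names).
def emails_in_files(paths)
  paths.flat_map { |path| extract_emails(File.read(path)) }.uniq
end

p extract_emails(SAMPLE_TEXT)
```

Because [\w.-]+ also accepts dots, an address followed directly by a full stop would keep that trailing dot, so some post-processing may be needed depending on your documents.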
Extract email addresses from a block of text
How about this for a (slightly) better regular expression:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
You can find this here: Email Regex
Just an FYI, the problem with your regex is that it allows only one type of separator before or after an email address. It would also match a lone "@" if it is separated by spaces.
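A minimal Ruby sketch of using this pattern, assuming the case-insensitive flag is wanted (the character classes list only uppercase letters). The constant and method names and the sample sentence are illustrative only.

```ruby
# The regex from the answer, with /i so lowercase addresses match too.
BLOCK_EMAIL = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

# Return all addresses found in a block of text.
def emails_in_block(text)
  text.scan(BLOCK_EMAIL)
end

p emails_in_block("Mail Jane.Doe@Example.com, or leave a lone @ here.")
```

Unlike a pattern that only looks for separators, this one needs at least one character on each side of the @, so the lone @ is skipped. The {2,4} bound also means an address such as jane@example.museum is not matched, which may or may not be acceptable for your data.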
R gsub to extract emails from text
We can try str_extract() from the stringr package:
str_extract(text, "\\S*@\\S*")
[1] "Saolonm@hotmail.com"
[2] "26.leonard@gmail.com"
[3] "jcdavola31@gmail.com"
[4] "andrescarnederes@headset.cl"
[5] "luciana.chavela.ecuador@gmail.com"
where \\S* matches any number of non-space characters.
Regex to extract method names from file using ruby
str =<<_
@Test(priority =10)
@RunAsClient
public void metodName(){
...
}
...
@Test(priority = 20)
public void otherMethodName(){
...
}
...
_
r = /
\@Test\(\s*priority\s*=\s*\d+\s*\)\s*\n # Match string
(?:\@RunAsClient\n)? # Optionally match string
(?:\w+\s+)+ # Match (word, >= 1 spaces) >= 1 times
\K # Forget everything matched so far
\w+ # match word
(?= # begin positive lookahead
(?:\([^)]*\)\s*\{) # match paren-enclosed expression, >= 0 spaces, {
| # or
(?:\s*\{) # match >= 0 spaces, {
) # end positive lookahead
/x # extended/free-spacing regex definition mode
str.scan r
#=> ["metodName", "otherMethodName"]
Trying to open / access the text in an attachment to an email using ruby gems
The text file attachment will be Base64 encoded. So you should be able to just decode it like this.
current_mail.attachments.each { |a| puts a.decode_body }
=>"can we see this?"
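To illustrate what the decoding step does, here is a small sketch that uses only core Ruby and no mail library. The method names are hypothetical, and the string stands in for the attachment body.

```ruby
# Base64-encode text the way an attachment body might be stored ("m0" = no line breaks).
def encode_attachment_text(text)
  [text].pack("m0")
end

# Decode Base64 back to text. unpack1 returns a binary string,
# so it is tagged as UTF-8 here (an assumption about the attachment's charset).
def decode_attachment_text(encoded)
  encoded.unpack1("m").force_encoding("UTF-8")
end

encoded = encode_attachment_text("can we see this?")
puts decode_attachment_text(encoded)
```

A mail library normally performs this step for you; the sketch only shows that the encoded body has to be decoded before the text is readable.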
Select most common domain from list of email addresses
email_list = 10.times.map { emails }
#=> ["alfred.grass426@gmail.com", "elisa.oak239@icloud.com",
# "daniel.fruit1600@outlook.com", "ana.fruit3761@icloud.com",
# "daniel.grass742@yahoo.com", "elisa.oak3891@outlook.com",
# "alfred.leaf1321@gmail.com", "alfred.grass5295@outlook.com",
# "ramzes.fruit435@gmail.com", "ana.fruit4233@yahoo.com"]
email_list.group_by { |s| s[/@\K.+/] }.max_by { |_,v| v.size }.first
#=> "gmail.com"
\K in the regex means disregard everything matched so far. Alternatively, @\K could be replaced by the positive lookbehind (?<=@).
The steps are as follows.
h = email_list.group_by { |s| s[/@\K.+/] }
#=> {"gmail.com" =>["alfred.grass426@gmail.com", "alfred.leaf1321@gmail.com",
# "ramzes.fruit435@gmail.com"],
# "icloud.com" =>["elisa.oak239@icloud.com", "ana.fruit3761@icloud.com"],
# "outlook.com"=>["daniel.fruit1600@outlook.com", "elisa.oak3891@outlook.com",
# "alfred.grass5295@outlook.com"],
# "yahoo.com" =>["daniel.grass742@yahoo.com", "ana.fruit4233@yahoo.com"]}
a = h.max_by { |_,v| v.size }
#=> ["gmail.com", ["alfred.grass426@gmail.com", "alfred.leaf1321@gmail.com",
# "ramzes.fruit435@gmail.com"]]
a.first
#=> "gmail.com"
If, as here, there is a tie for most frequent, modify the code as follows to get all winners.
h = email_list.group_by { |s| s[/@\K.+/] }
# (same as above)
mx_size = h.map { |_,v| v.size }.max
#=> 3
h.select { |_,v| v.size == mx_size }.keys
#=> ["gmail.com", "outlook.com"]
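For a self-contained variant, the following sketch counts domains with Enumerable#tally (available from Ruby 2.7) instead of group_by. The list is copied from the sample output above, and the method name most_common_domains is made up for illustration.

```ruby
EMAILS = [
  "alfred.grass426@gmail.com", "elisa.oak239@icloud.com",
  "daniel.fruit1600@outlook.com", "ana.fruit3761@icloud.com",
  "daniel.grass742@yahoo.com", "elisa.oak3891@outlook.com",
  "alfred.leaf1321@gmail.com", "alfred.grass5295@outlook.com",
  "ramzes.fruit435@gmail.com", "ana.fruit4233@yahoo.com"
].freeze

# Return every domain that shares the highest count (all winners on a tie).
def most_common_domains(emails)
  counts = emails.map { |s| s[/(?<=@).+/] }.tally  # domain => occurrences
  top = counts.values.max
  counts.select { |_, n| n == top }.keys
end

p most_common_domains(EMAILS)
#=> ["gmail.com", "outlook.com"]
```

Counting with tally avoids building the intermediate arrays of addresses when only the frequencies matter.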
Will this regex for email work for all emails?
Short answer: no. Not all email addresses can be checked by a regex. There's a thread here on SO that explains this much better than I could if I attempted it. I think the only way to check whether an address really exists is to contact the mail server and ask whether the user account exists.
Please have a read here: https://stackoverflow.com/a/1373724/81520
Export entire html table to a text document using Watir
To get HTML of the entire table (if it is the only table on the page):
browser.table.html
You will get something like this:
=> "<table border=\"1\" cellpadding=\"2\">\n<tbody><tr>\n<th> Address </th>\n<th> Council tax band </th>\n<th> Annual council tax </th>\n</tr>\n\n<tr>\n<td> 2, STONELEIGH AVENUE, COVENTRY, CV5 6BZ </td>\n<td align=\"center\"> F </td>\n<td align=\"center\"> £2125 </td>\n</tr>\n\n</tbody></table>"
To get HTML of each row and put it in an array:
browser.table.trs.collect {|tr| tr.html}
=> ["<tr>\n<th> Address </th>\n<th> Council tax band </th>\n<th> Annual council tax </th>\n</tr>",
"<tr>\n<td> 2, STONELEIGH AVENUE, COVENTRY, CV5 6BZ </td>\n<td align=\"center\"> F </td>\n<td align=\"center\"> £2125 </td>\n</tr>"]
To get text of each cell and put it in an array:
browser.table.trs.collect {|tr| [tr[0].text, tr[1].text, tr[2].text]}
=> [["Address", "Council tax band", "Annual council tax"],
["2, STONELEIGH AVENUE, COVENTRY, CV5 6BZ", "F", "£2125"]]
To write text of each cell to file:
content = browser.table.trs.collect {|tr| [tr[0].text, tr[1].text, tr[2].text]}
File.open("table.txt", "w") {|file| file.puts content}
The file will look like this:
Address
Council tax band
Annual council tax
2, STONELEIGH AVENUE, COVENTRY, CV5 6BZ
F
£2125
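The one-cell-per-line layout comes from puts flattening nested arrays. The sketch below reproduces that without a browser, using a literal array in place of the Watir result, and shows one possible alternative that keeps a table row per line. The method names and the tab separator are assumptions made for illustration.

```ruby
require "stringio"

# Literal stand-in for the array the Watir call returned.
TABLE_CELLS = [
  ["Address", "Council tax band", "Annual council tax"],
  ["2, STONELEIGH AVENUE, COVENTRY, CV5 6BZ", "F", "£2125"]
].freeze

# Same behaviour as file.puts content: nested arrays are flattened,
# so every cell lands on its own line.
def one_cell_per_line(rows)
  out = StringIO.new(+"")
  out.puts(rows)
  out.string
end

# Alternative: join each row first so the output keeps one row per line.
def one_row_per_line(rows, separator = "\t")
  out = StringIO.new(+"")
  out.puts(rows.map { |cells| cells.join(separator) })
  out.string
end

puts one_row_per_line(TABLE_CELLS)
```

Swapping StringIO for File.open("table.txt", "w") should give the same text on disk.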
Get names of all files from a folder with Ruby
You also have the shortcut option of
Dir["/path/to/search/*"]
and if you want to find all Ruby files in any folder or sub-folder:
Dir["/path/to/search/**/*.rb"]
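Note that these patterns return directories as well as files, and full paths rather than bare names. A small sketch that filters to regular files and keeps only the names might look like this; file_names is a hypothetical helper, not part of Dir.

```ruby
# List the names of regular files under dir that match pattern.
# Directories matched by the glob are dropped by File.file?.
def file_names(dir, pattern = "*")
  Dir.glob(File.join(dir, pattern))
     .select { |path| File.file?(path) }
     .map { |path| File.basename(path) }
     .sort
end

# Example calls (the path is a placeholder):
#   file_names("/path/to/search")              # files directly in the folder
#   file_names("/path/to/search", "**/*.rb")   # Ruby files in any sub-folder
```

The sort is there only to make the order predictable, since glob results are not guaranteed to be sorted on every Ruby version.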