Is there an efficient way to perform hundreds of text substitutions in Ruby?
An alternative approach, if your input data is already split into separate words, would simply be to build a hash table of {error => correction}.
Hash table lookup is fast, so if you can bend your input data to this format, it will almost certainly be fast enough.
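As a minimal sketch of that idea, assuming the input has already been split into words: the CORRECTIONS table and the correct_words helper below are hypothetical names with made-up entries, not part of the original answer.

```ruby
# Hypothetical correction table; the entries are illustrative only.
CORRECTIONS = {
  'teh'     => 'the',
  'recieve' => 'receive',
  'adress'  => 'address'
}.freeze

# Look each word up in the hash; words without an entry pass through unchanged.
def correct_words(words)
  words.map { |word| CORRECTIONS.fetch(word, word) }
end

corrected = correct_words(%w[please recieve teh parcel at this adress])
# => ["please", "receive", "the", "parcel", "at", "this", "address"]
```

Each word costs one hash lookup, so the number of corrections in the table has little effect on the time per word.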
How to replace words inside template placeholders
In your regex, you have added the \A and \z anchors. These ensure that your regex only matches if the string contains exactly <%= Name %> with nothing before or after.
To match your pattern anywhere in the string, you can simply remove the anchors:
parsed_body = body.gsub(/<%= Name %>/, "Some person")
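To see the difference, here is a small sketch with a made-up body string (the sample text is an assumption, not from the question):

```ruby
body = "Dear <%= Name %>, welcome!"

# With \A and \z the pattern must span the whole string, so nothing is replaced.
anchored = body.gsub(/\A<%= Name %>\z/, "Some person")
# => "Dear <%= Name %>, welcome!"

# Without the anchors the placeholder is replaced wherever it appears.
unanchored = body.gsub(/<%= Name %>/, "Some person")
# => "Dear Some person, welcome!"
```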
Combine conditions in Ruby
Yes, there is:
if %w(new create).include? a
  # code for the "new" and "create" cases
else
  # code for everything else
end
How to match a string in array, regardless of the string size in Ruby
Here's where I'd start with this sort of task. These are great building blocks for human interfaces on the web or in applications:
require 'regexp_trie'
saxophone_section = ["alto 1", "alto 2", "tenor 1", "tenor 2", "bari sax"]
RegexpTrie.union saxophone_section # => /(?:alto\ [12]|tenor\ [12]|bari\ sax)/
The output of RegexpTrie.union is a pattern that will match all of the strings in saxophone_section. The pattern is concise and efficient, and best of all, doesn't have to be generated by hand.
Applying that pattern to the string as it is being built will tell you when you have a hit, but only once enough of the string is present to match.
That's where a regular Trie is very useful. When you're trying to find what possible hits you could have, prior to having a full match, a Trie can find all the possibilities:
require 'trie'
trie = Trie.new
saxophone_section = ["alto 1", "alto 2", "tenor 1", "tenor 2", "bari sax"]
saxophone_section.each { |w| trie.add(w) }
trie.children('a') # => ["alto 1", "alto 2"]
trie.children('alto') # => ["alto 1", "alto 2"]
trie.children('alto 2') # => ["alto 2"]
trie.children('bari') # => ["bari sax"]
Blend those together and see what you come up with.
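One way the two pieces might be blended is sketched below. To keep it self-contained it uses only the standard library: Regexp.union stands in for RegexpTrie.union, and a plain start_with? scan stands in for the trie lookup. The lookup and candidates helpers are hypothetical names, and a real trie would scale better on a large list.

```ruby
SAXOPHONE_SECTION = ["alto 1", "alto 2", "tenor 1", "tenor 2", "bari sax"].freeze

# Standard-library stand-in for RegexpTrie.union: matches any complete name.
FULL_NAME = /\A#{Regexp.union(SAXOPHONE_SECTION)}\z/

# Stand-in for the trie lookup: every name that starts with the typed prefix.
def candidates(prefix)
  SAXOPHONE_SECTION.select { |name| name.start_with?(prefix) }
end

# Hypothetical helper: report an exact hit, or the remaining possibilities.
def lookup(input)
  return { match: input } if FULL_NAME.match?(input)

  { candidates: candidates(input) }
end

lookup('alto')   # => {:candidates=>["alto 1", "alto 2"]}
lookup('alto 2') # => {:match=>"alto 2"}
```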
Remove excess junk words from string or array of strings
Dealing with stopwords is easy, but I'd suggest you do it BEFORE you split the string into the component words.
Building a fairly simple regular expression can make short work of the words:
STOPWORDS = /\b(?:#{ %w[to and or the a].join('|') })\b/i
# => /\b(?:to|and|or|the|a)\b/i
clean_string = 'to into and sandbar or forest the thesis a algebra'.gsub(STOPWORDS, '')
# => " into sandbar forest thesis algebra"
clean_string.split
# => ["into", "sandbar", "forest", "thesis", "algebra"]
How do you handle them if you get them already split? I'd join(' ') the array to turn it back into a string, then run the above code, which returns the array again.
incoming_array = [
"14000",
"Things",
"to",
"Be",
"Happy",
"About",
]
STOPWORDS = /\b(?:#{ %w[to and or the a].join('|') })\b/i
# => /\b(?:to|and|or|the|a)\b/i
incoming_array = incoming_array.join(' ').gsub(STOPWORDS, '').split
# => ["14000", "Things", "Be", "Happy", "About"]
You could try to use Array's set operations, but you'll run afoul of the case sensitivity of the words, forcing you to iterate over the stopwords and the arrays which will run a LOT slower.
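The case-sensitivity problem can be seen in a short sketch. The word list is made up for illustration, and no timing is measured here; the point is only that plain array difference misses capitalized stopwords, so each word has to be normalized individually.

```ruby
require 'set'

stopwords = %w[to and or the a]
words = %w[14000 Things To Be Happy About the end]

# Plain array difference is case-sensitive, so "To" survives.
by_difference = words - stopwords
# => ["14000", "Things", "To", "Be", "Happy", "About", "end"]

# Catching "To" means downcasing and checking every word one at a time.
stopword_set = stopwords.to_set
by_downcase = words.reject { |word| stopword_set.include?(word.downcase) }
# => ["14000", "Things", "Be", "Happy", "About", "end"]
```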
Take a look at these two answers for some added tips on how you can build very powerful patterns making it easy to match thousands of strings:
- "How do I ignore file types in a web crawler?"
- "Is there an efficient way to perform hundreds of text substitutions in Ruby?"
How do I write a regular expression that will match characters in any order?
Here is your solution
^(?:([act])(?!.*\1)){3}$
See it here on Regexr
^            # match the start of the string
(?:          # open a non-capturing group
  ([act])    # capture one of the allowed characters
  (?!.*\1)   # negative lookahead: that character must not occur again later
){3}         # repeat the group once for each required character
$            # match the end of the string
The only special thing is the lookahead assertion, which ensures that the character is not repeated. ^ and $ are anchors to match the start and the end of the string.
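A quick check of the pattern in Ruby might look like this. One assumption to note: in Ruby, ^ and $ match at line boundaries, so this sketch swaps in \A and \z to anchor to the whole string. The constant name ANY_ORDER is hypothetical.

```ruby
# \A and \z anchor to the whole string; Ruby's ^ and $ would anchor to lines.
ANY_ORDER = /\A(?:([act])(?!.*\1)){3}\z/

# Every permutation of the three letters matches.
permutations_ok = %w[act atc cat cta tac tca].all? { |s| ANY_ORDER.match?(s) }

# Repeated letters, wrong lengths, and other letters are rejected.
others_rejected = %w[aca cc catt cab].none? { |s| ANY_ORDER.match?(s) }
```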