Ruby String Split into Words Ignoring All Special Characters: Simpler Query

Remove words from string which are present in some set

You can build a union of several patterns with Regexp::union:

words = ["Hotel" , "Resorts"]
re = Regexp.union(words)
#=> /Hotel|Resorts/

"Hotel Silver Stone Resorts".gsub(re, "")
#=> " Silver Stone "

Note that you might have to escape your words.

Deleting all special characters from a string - ruby

You can do this

a.gsub!(/[^0-9A-Za-z]/, '')

How to check whether a string contains a substring in Ruby

You can use the include? method:

my_string = "abcdefg"
if my_string.include? "cde"
puts "String includes 'cde'"
end

Symbols in query-string for elasticsearch

You can sanitize your query string. Here is a sanitizer that works for everything that I've tried throwing at it:

def sanitize_string_for_elasticsearch_string_query(str)
# Escape special characters
# http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
escaped_characters = Regexp.escape('\\/+-&|!(){}[]^~*?:')
str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

# AND, OR and NOT are used by lucene as logical operators. We need
# to escape them
['AND', 'OR', 'NOT'].each do |word|
escaped_word = word.split('').map {|char| "\\#{char}" }.join('')
str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
end

# Escape odd quotes
quote_count = str.count '"'
str = str.gsub(/(.*)"(.*)/, '\1\"\3') if quote_count % 2 == 1

str
end

params[:query] = sanitize_string_for_elasticsearch_string_query(params[:query])

Split a string by commas but ignore commas within double-quotes using Javascript

Here's what I would do.

var str = 'a, b, c, "d, e, f", g, h';
var arr = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);

Sample Image
/* will match:

    (
".*?" double quotes + anything but double quotes + double quotes
| OR
[^",\s]+ 1 or more characters excl. double quotes, comma or spaces of any kind
)
(?= FOLLOWED BY
\s*, 0 or more empty spaces and a comma
| OR
\s*$ 0 or more empty spaces and nothing else (end of string)
)

*/
arr = arr || [];
// this will prevent JS from throwing an error in
// the below loop when there are no matches
for (var i = 0; i < arr.length; i++) console.log('arr['+i+'] =',arr[i]);

How do I replace accented Latin characters in Ruby?

Rails has already a builtin for normalizing, you just have to use this to normalize your string to form KD and then remove the other chars (i.e. accent marks) like this:

>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"


Related Topics



Leave a reply



Submit