Detect Similar Sounding Words in Ruby

I think you're describing Levenshtein distance. And yes, there are gems for that. If you want pure Ruby, go for the text gem.

$ gem install text

The docs have more details, but here's the crux of it:

Text::Levenshtein.distance('test', 'test') # => 0
Text::Levenshtein.distance('test', 'tent') # => 1

If you're ok with native extensions...

$ gem install levenshtein

Its usage is similar. Its performance is very good. (It handles ~1000 spelling corrections per minute on my systems.)

If you need to know how similar two words are, divide the distance by the word length.
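
One way to read that ratio is as a normalized score. The sketch below assumes the longer of the two words is the denominator (the answer does not say which length to use), and it carries its own small pure-Ruby helper so it runs without any gem. The names levenshtein and similarity are illustrative only; Text::Levenshtein.distance could be swapped in for the helper.

```ruby
# Illustrative helper: classic row-by-row Levenshtein edit distance.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

# Assumption: normalize by the longer word, so 1.0 means identical
# and 0.0 means every position needs an edit.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b) / longest.to_f
end

puts similarity('test', 'tent') # 0.75 (one edit across four characters)
```

A score like this makes it possible to compare pairs of different lengths, which a raw distance threshold cannot do.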

If you want a simple similarity test, consider something like this (untested, but straightforward):

require 'text'

String.module_eval do
  # True when the edit distance to `other` is at most `threshold`.
  def similar?(other, threshold = 2)
    distance = Text::Levenshtein.distance(self, other)
    distance <= threshold
  end
end

Can Ruby recognise different spellings of the same word?

This can be achieved using a regular expression, though you probably wouldn't want to match just any character: it might work in this particular instance, but it could completely change the meaning of other words.

For example, /gr(a|e)y/ would match both "gray" and "grey".

If you did want to match any single character, you could use a range, like /gr[a-zA-Z]y/.

Here's a working example on Rubular.
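
The two patterns can also be compared directly in Ruby. This is a minimal sketch: the \A and \z anchors are an addition so that only whole words match, and the constant names and the word list are made up for illustration.

```ruby
# Explicit alternation: only the two known spellings match.
EITHER_SPELLING = /\Agr(a|e)y\z/
# Any single letter in the middle: much looser.
ANY_LETTER = /\Agr[a-zA-Z]y\z/

words = %w[gray grey groy]

strict = words.select { |word| word.match?(EITHER_SPELLING) }
loose = words.select { |word| word.match?(ANY_LETTER) }

p strict # ["gray", "grey"]
p loose  # ["gray", "grey", "groy"]
```

The looser pattern accepts "groy" as well, which is the kind of unintended match the answer warns about.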

There's also probably a gem which wraps up all of these common spellings. I'd suggest searching on rubygems.org and ruby-toolbox.com.

Detect similar sounding words in Java

There are several algorithms developed to compare words by how they sound. The most basic one is Soundex, and there is an Apache implementation of it here:

http://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/Soundex.html

There are also other phonetic algorithms such as Metaphone, as well as string-distance measures like Hamming distance and Levenshtein distance, which compare spelling rather than sound.
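
To give a feel for how a phonetic code works, here is a rough sketch of the classic American Soundex rules, written in Ruby to match the rest of this page. It is not a port of the Apache Commons Codec class, and soundex and SOUNDEX_CODES are illustrative names.

```ruby
# Consonant groups that share a Soundex digit.
SOUNDEX_CODES = {
  'bfpv' => '1', 'cgjkqsxz' => '2', 'dt' => '3',
  'l' => '4', 'mn' => '5', 'r' => '6'
}.flat_map { |letters, digit| letters.chars.map { |c| [c, digit] } }.to_h.freeze

def soundex(word)
  letters = word.downcase.gsub(/[^a-z]/, '')
  return '' if letters.empty?

  code = letters[0].upcase
  previous = SOUNDEX_CODES[letters[0]]

  letters[1..-1].each_char do |char|
    # h and w are skipped without breaking a run of equal digits.
    next if char == 'h' || char == 'w'

    digit = SOUNDEX_CODES[char]
    code << digit if digit && digit != previous
    # Vowels have no digit, so they reset the run (previous becomes nil).
    previous = digit
  end

  # Keep the first letter plus three digits, padding with zeros.
  code[0, 4].ljust(4, '0')
end

puts soundex('Robert') # R163
puts soundex('Rupert') # R163
```

Two words that produce the same code are treated as sounding alike, so comparing codes with == is enough for a first pass.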

Ruby compare two strings similarity percentage

I think your question could do with some clarifications, but here's something quick and dirty (calculating as percentage of the longer string as per your clarification above):

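def string_difference_percent(a, b)
  longer = [a.size, b.size].max
  # Count the positions where both strings have the same character.
  same = a.each_char.zip(b.each_char).count { |x, y| x == y }
  # Fraction of the longer string that differs (multiply by 100 for a percentage).
  (longer - same) / longer.to_f
end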

I'm still not sure how much sense this percent difference you are looking for makes, but this should get you started at least.

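It's a bit like Levenshtein distance, in that it compares the strings character by character, but it only compares characters at the same position. So if two names differ only by the middle name, everything after the insertion is shifted and they'll actually come out as very different.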
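
To see the middle-name effect concretely, here is a minimal sketch of a position-by-position comparison. The name positional_difference is hypothetical, the example names are made up, and dividing by the longer string's length is an assumption taken from the description above.

```ruby
# Fraction of positions (relative to the longer string) that do not match.
def positional_difference(a, b)
  longer = [a.size, b.size].max
  return 0.0 if longer.zero?

  same = a.each_char.zip(b.each_char).count { |x, y| x == y }
  (longer - same) / longer.to_f
end

short_name = 'John Smith'
long_name = 'John Q Smith'

difference = positional_difference(short_name, long_name)
puts difference.round(2) # 0.58: only the first five characters line up
```

Inserting two characters makes more than half of the string count as different, whereas an edit distance would report only two insertions.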

Ruby: Scanning strings for matching adjacent vowel groups

def match(first, second)
  end_of_first = first[/[aeiou]+$|[^aeiou]+$/]
  start_of_second = second[/^[aeiou]+|^[^aeiou]+/]
  end_of_first == start_of_second
end

match("utua", "uailo")
# => false
match("inia", "iatu")
# => true

EDIT: I apparently can't read. I thought you just wanted to match the group (whether vowel or consonant). If you restrict it to vowel groups, it's simpler:

  end_of_first = first[/[aeiou]+$/]
  start_of_second = second[/^[aeiou]+/]
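
Put together as a full method, the vowel-only version might look like the sketch below. The name vowel_groups_match? is hypothetical, and \A and \z are used in place of ^ and $ so that only the very start and end of the string count. The nil check is also an addition: without it, two words with no vowel group at the boundary would compare nil == nil and count as a match.

```ruby
def vowel_groups_match?(first, second)
  end_of_first = first[/[aeiou]+\z/]
  start_of_second = second[/\A[aeiou]+/]

  # String#[] returns nil when the pattern does not match,
  # so require an actual vowel group before comparing.
  !end_of_first.nil? && end_of_first == start_of_second
end

p vowel_groups_match?('utua', 'uailo') # false ("ua" vs "uai")
p vowel_groups_match?('inia', 'iatu')  # true  ("ia" vs "ia")
p vowel_groups_match?('tat', 'tat')    # false (no vowel groups at the boundary)
```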

How to check whether a string contains a substring in Ruby

You can use the include? method:

my_string = "abcdefg"
if my_string.include? "cde"
  puts "String includes 'cde'"
end

Ruby Anagram Comparison Module

I think the problem here is that you're displaying a message but not returning a true or false value, which is what is expected.

After each puts, add the appropriate return value (true or false). That way your method will return something useful. Right now I'm presuming it returns nil in all cases, since that's what puts returns.
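
Since the original method isn't shown here, the following is only a guess at its shape. The method name anagram?, the messages, and the sorted-letters comparison are all assumptions; the point is the explicit true or false after each puts.

```ruby
def anagram?(first, second)
  if first.downcase.chars.sort == second.downcase.chars.sort
    puts "#{first} and #{second} are anagrams"
    true  # the last expression in the branch becomes the return value
  else
    puts "#{first} and #{second} are not anagrams"
    false
  end
end

result = anagram?('listen', 'silent')
p result # true
```

Without the trailing true and false, each branch would end on puts and the method would return nil either way.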


