Fastest algorithm to find a string in an array of strings?
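You could convert the whole array of strings into a finite state machine (in effect, a trie) whose transitions are the characters of the strings, and store in each state the smallest index of the strings that produced it. Building the machine takes a lot of time up front, so it is really a form of indexing; once built, a lookup costs time proportional to the length of the search string rather than the size of the array.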
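A minimal sketch of that idea in JavaScript, assuming a plain trie is an acceptable stand-in for the state machine. The names buildTrie and trieIndexOf are made up for illustration, and this variant records an index only in states where a whole string ends, which is enough for exact-match lookups.

```javascript
// Build a trie (a simple finite state machine) over an array of strings.
// Each node records the smallest index of any string that ends there.
function buildTrie(strings) {
  const root = { next: new Map(), index: -1 };
  strings.forEach((s, i) => {
    let node = root;
    for (const ch of s) {
      if (!node.next.has(ch)) {
        node.next.set(ch, { next: new Map(), index: -1 });
      }
      node = node.next.get(ch);
    }
    // Keep the first (smallest) index if the string appears more than once.
    if (node.index === -1) node.index = i;
  });
  return root;
}

// Follow one transition per character; return -1 if no string matches.
function trieIndexOf(root, s) {
  let node = root;
  for (const ch of s) {
    node = node.next.get(ch);
    if (node === undefined) return -1;
  }
  return node.index;
}

const trie = buildTrie(['qwe', 'rty', 'uio p', 'a', 'rty']);
console.log(trieIndexOf(trie, 'rty')); // 1
console.log(trieIndexOf(trie, 'rt'));  // -1 (a prefix only, not a stored string)
```

The build step touches every character of every string, so this only pays off when the same array is searched many times.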
Fastest way to find string in the array of strings
You can use the indexOf() function for that.
var arr = ['qwe', 'rty', 'uio p', 'a', 's df'];
var str= 'rty';
var isPresent = (arr.indexOf(str) > -1);
To explain:
indexOf() returns the index of the string found in the array. If the string is not found, it returns -1.
So arr.indexOf('qwe') returns 0, arr.indexOf('rty') returns 1, etc. But arr.indexOf('foo') returns -1.
Fast way to find if a string is in an array
First of all, there is some fundamental confusion here about what data structures are available in JavaScript.
TL;DR
If you want the fastest key lookup for objects with a short prototype inheritance chain, use in.
If you want the same, but for objects with an extensive inheritance chain, use Object.prototype.hasOwnProperty.
If you want the fastest value lookup, use Array.prototype.indexOf for Array.
There isn't a built-in function for value lookup in hash-tables. You can, of course, roll your own, but there are many libraries that provide one already. For example, Underscore provides one (it calls it indexOf).
JavaScript has no arrays
Fundamentally, JavaScript has only hash-tables. The standard Array function constructs hash-tables (I will call these integer hash-tables, or int-hash-tables) where the keys are integers in addition to string keys. These perform similarly to arrays, but they differ in certain ways, and there are pros and cons. For example, deleting an element from an int-hash-table is an O(1) operation, while deleting an element from an array is an O(n) operation (because you need to copy the rest of the elements into the new array). This is why the Array.prototype.splice function in JavaScript is very fast. The downside is the complexity of implementation.
So, when you say Array in a JavaScript context, it is understood as an int-hash-table, with all the asymptotic complexity associated with it. This means that finding a string value inside an int-hash-table is an O(n) operation. There is a standard function for doing it: Array.prototype.indexOf. However, if you want to look up a key, there are two options: the in operator and Object.prototype.hasOwnProperty.
Somewhat counterintuitively:
[1, 2, 3].hasOwnProperty(0); // true
0 in [1, 2, 3]; // true
The difference between the two needs further explanation. It is related to the fact that all things in JavaScript are objects, and thus they have some object-y features. One such feature is the prototype, the link between an object and its prototype. It forms a hierarchical structure of hash-tables, each containing properties of objects.
in looks up the immediate hash-table of the object and then recursively searches the hash-tables of this object's prototypes, whereas Object.prototype.hasOwnProperty only looks into the immediate hash-table. You might think the latter should be faster, but wait before jumping to conclusions.
Due to the dynamic nature of JavaScript, all function calls are dynamic, and the environment must take a lot of care to ensure fail-safe code execution. This means that function calls in JavaScript are very expensive. So, going through Object.prototype.hasOwnProperty may be a lot more expensive than going through in, even though theoretically it should be the opposite. However, given a tall enough inheritance tree and enough inherited properties, Object.prototype.hasOwnProperty will eventually win out.
Some examples to get a better intuition:
>>> var array = [1, 2, 3];
undefined
>>> 3 in array;
false
>>> array.hasOwnProperty(3);
false
>>> 3 in array;
false
>>> array.__proto__ = [1, 2, 3, 4];
[1, 2, 3, 4]
>>> 3 in array;
true
>>> array.hasOwnProperty(3);
false
Fastest way to convert string to string array in Java
If you don't want to use regex, you can use substring, which should be much quicker than string concatenation:
public static String[] mySplit(String input) {
    int len = input.length(), index = 0;
    // Arrays are faster than lists
    String[] array = new String[len / 2 + len % 2];
    for (int i = 0; i < len - 1; i += 2, index++) {
        array[index] = input.substring(i, i + 2);
    }
    // To handle strings with an odd number of characters
    if (len % 2 == 1) {
        array[index] = input.substring(len - 1);
    }
    return array;
}
Fastest way to compare a string with an array of strings
Based on this sentence, from the question:
What is [a] way to check if my String has any of the words in myArray?
(Emphasis mine.)
I'd suggest the following, which will test if "some" of the words in the supplied string are present in the supplied array. This – theoretically – stops comparing once there is a match of any of the words from the string present in the array:
var myArray = ["ibira", "garmin", "hide", "park", "parque", "corrida", "trote", "personal", "sports", "esportes", "health", "saúde", "academia"],
myString = "I went to the park with my garmin watch";
function anyInArray(needles, haystack) {
    // we split the supplied string ("needles") into words by splitting
    // the string at the occurrence of a word-boundary ('\b') followed by
    // one or more ('+') occurrences of white-space ('\s') followed by
    // another word-boundary:
    return needles.split(/\b\s+\b/)
        // we then use Array.prototype.some() to work on the array of
        // words, to assess whether any/some of the words ('needle')
        // - using an Arrow function - are present in the supplied
        // array ('haystack'), in which case Array.prototype.indexOf()
        // would return the index of the found-word, or -1 if that word
        // is not found:
        .some(needle => haystack.indexOf(needle) > -1);
    // at which point we return the Boolean, true if some of the
    // words were found, false if none of the words were found.
}
console.log(anyInArray(myString, myArray));
Fastest way to find a String in an array of strings
You can use Set. It is implemented on top of Hash, so membership checks are O(1) on average, which makes it faster than scanning an array for big datasets.
require 'set'
s = Set.new ['1.1.1.1', '1.2.3.4']
# => #<Set: {"1.1.1.1", "1.2.3.4"}>
s.include? '1.1.1.1'
# => true
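For readers following the JavaScript answers on this page, the same idea can be sketched with the built-in Set, which also offers average constant-time membership checks. The variable name known is only illustrative.

```javascript
// Build the set once; each has() call is then a hash lookup,
// not a scan over every element.
const known = new Set(['1.1.1.1', '1.2.3.4']);

console.log(known.has('1.1.1.1')); // true
console.log(known.has('8.8.8.8')); // false
```

As with the Ruby version, building the set costs one pass over the data, so it is worthwhile mainly when you perform many lookups.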
What is the fastest way to check if a string is present in a string array?
Presuming you use the standard library (STL) classes, there are a few mechanisms you can use, depending on the domain of your problem.
For example, if the array is unsorted, then it doesn't really matter: there are standard library algorithms which will better convey intent and shrink the code, but they'll be performance-wise equivalent to a simple for-loop:
std::vector<std::string> strings = /*...*/;
//This will find the first string that matches the provided value and return its iterator
auto found_string_iterator = std::find(strings.begin(), strings.end(), "Desired String");
if(found_string_iterator != strings.end()) //found it
    std::cout << *found_string_iterator << std::endl;
else //Did not find it
    std::cout << "No such string found." << std::endl;
If the collection is sorted, you can use a binary search, which dramatically improves performance:
std::vector<std::string> sorted_strings = /*...*/;
//In a sorted collection, this returns the range of all strings matching the provided value
auto string_range_iterators = std::equal_range(sorted_strings.begin(), sorted_strings.end(), "Desired String");
//The range is empty (first == second) when no string matches
if(string_range_iterators.first != string_range_iterators.second) {
    for ( auto i = string_range_iterators.first; i != string_range_iterators.second; ++i )
        std::cout << *i << std::endl;
} else {
    std::cout << "No Strings found." << std::endl;
}
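JavaScript, used by several answers on this page, has no built-in binary search for arrays, so the same technique has to be written by hand. The sketch below assumes the array was sorted with the default Array.prototype.sort() ordering; sortedIncludes is a hypothetical helper name.

```javascript
// Binary search for an exact match in an array sorted with the
// default sort() ordering (comparison by UTF-16 code units).
function sortedIncludes(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  // Narrow [lo, hi) down to the first position whose value is >= target.
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo < sorted.length && sorted[lo] === target;
}

const sortedWords = ['qwe', 'rty', 'uio p', 'a', 's df'].sort();
console.log(sortedIncludes(sortedWords, 'rty')); // true
console.log(sortedIncludes(sortedWords, 'foo')); // false
```

Each lookup takes O(log n) comparisons, but the array must stay sorted, so this suits data that changes rarely.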
If you don't need duplicate strings in your collection, you can use a set or unordered_set to collect the strings. A set guarantees O(log n) lookups, the same as a binary search, and an unordered_set offers O(1) lookups on average, so it could be faster.
std::set<std::string> collected_strings = /*...*/;
auto found_string_iterator = collected_strings.find("Desired String");
if(found_string_iterator != collected_strings.end()) //found it
    std::cout << *found_string_iterator << std::endl;
else //Did not find it
    std::cout << "No such string found." << std::endl;
The most efficient way to search for an array of strings in another string
I think you're looking for an algorithm like Rabin-Karp or Aho–Corasick, which are designed to search a text for a large number of substrings simultaneously.
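To give a feel for the second of these, here is a compact sketch of Aho–Corasick in JavaScript. It assumes all patterns are non-empty; buildAutomaton and findAll are illustrative names, and the code is unoptimized.

```javascript
// Build an Aho-Corasick automaton for a list of non-empty patterns.
function buildAutomaton(patterns) {
  // Node 0 is the root. Each node has outgoing edges, a failure link,
  // and the indices of all patterns that end at this node.
  const nodes = [{ next: new Map(), fail: 0, out: [] }];
  patterns.forEach((p, i) => {
    let state = 0;
    for (const ch of p) {
      if (!nodes[state].next.has(ch)) {
        nodes.push({ next: new Map(), fail: 0, out: [] });
        nodes[state].next.set(ch, nodes.length - 1);
      }
      state = nodes[state].next.get(ch);
    }
    nodes[state].out.push(i);
  });

  // Breadth-first pass: a node's failure link points to the longest
  // proper suffix of its path that is also a path from the root.
  const queue = [...nodes[0].next.values()];
  for (let head = 0; head < queue.length; head++) {
    const state = queue[head];
    for (const [ch, child] of nodes[state].next) {
      let f = nodes[state].fail;
      while (f !== 0 && !nodes[f].next.has(ch)) f = nodes[f].fail;
      const target = nodes[f].next.get(ch);
      nodes[child].fail = target !== undefined ? target : 0;
      // Inherit matches that end at the suffix state.
      nodes[child].out.push(...nodes[nodes[child].fail].out);
      queue.push(child);
    }
  }
  return nodes;
}

// Scan the text once and report every occurrence of every pattern.
function findAll(nodes, patterns, text) {
  const matches = [];
  let state = 0;
  let pos = 0;
  for (const ch of text) {
    while (state !== 0 && !nodes[state].next.has(ch)) state = nodes[state].fail;
    state = nodes[state].next.has(ch) ? nodes[state].next.get(ch) : 0;
    pos += ch.length;
    for (const i of nodes[state].out) {
      matches.push({ pattern: patterns[i], start: pos - patterns[i].length });
    }
  }
  return matches;
}

const acPatterns = ['he', 'she', 'his', 'hers'];
const acNodes = buildAutomaton(acPatterns);
console.log(findAll(acNodes, acPatterns, 'ushers'));
// [ { pattern: 'she', start: 1 }, { pattern: 'he', start: 2 }, { pattern: 'hers', start: 2 } ]
```

The text is read once regardless of how many patterns there are, which is the property that makes this family of algorithms attractive for large pattern sets.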