Fastest Way to Find a String in an Array of Strings

Fastest algorithm to find a string in an array of strings?

You could convert the whole array of strings into a finite state machine in which the transitions are the characters of the strings, and store in each state the smallest index of the strings that produced that state. Building the machine takes a lot of time, and may be considered indexing, but each lookup afterwards only costs one transition per character of the search string.
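As a rough illustration of the idea above, here is a minimal sketch of such a state machine built as a trie. The names buildTrie and trieIndexOf are made up for this example, and, as a simplifying assumption, the index is stored only in the state where a complete string ends rather than in every state along the way.

```javascript
// Build a trie (a simple finite state machine) from an array of strings.
// Each node maps a character to the next node; a node where a complete
// string ends stores the smallest index of that string in the array.
function buildTrie(strings) {
  const root = { next: new Map(), index: -1 };
  strings.forEach((s, i) => {
    let node = root;
    for (const ch of s) {
      if (!node.next.has(ch)) {
        node.next.set(ch, { next: new Map(), index: -1 });
      }
      node = node.next.get(ch);
    }
    // Keep only the first (smallest) index when a string is duplicated.
    if (node.index === -1) node.index = i;
  });
  return root;
}

// Follow one transition per character; returns the stored index or -1.
function trieIndexOf(root, s) {
  let node = root;
  for (const ch of s) {
    node = node.next.get(ch);
    if (node === undefined) return -1;
  }
  return node.index;
}

const trie = buildTrie(['qwe', 'rty', 'uio p', 'rty']);
console.log(trieIndexOf(trie, 'rty')); // 1
console.log(trieIndexOf(trie, 'rt'));  // -1 (only a prefix, not a stored string)
```

The build step touches every character of every string once, which is the indexing cost mentioned above; it only pays off when the same array is searched many times.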

Fastest way to find string in the array of strings

You can use the indexOf() function for that.

var arr = ['qwe', 'rty', 'uio p', 'a', 's df'];
var str= 'rty';

var isPresent = (arr.indexOf(str) > -1);

To explain:
indexOf() returns the index of the string found in the array. If the string is not found, it returns -1.
So arr.indexOf('qwe') returns 0, arr.indexOf('rty') returns 1, etc. But arr.indexOf('foo') returns -1.

Fast way to find if a string is in an array

First of all, there is some fundamental confusion here about what data structures are available in JavaScript.

TL;DR

  • If you want the fastest key lookup for objects with a short prototype inheritance chain, use the in operator.

  • If you want the same, but for objects with an extensive inheritance chain, use Object.prototype.hasOwnProperty.

  • If you want the fastest value lookup, use Array.prototype.indexOf for Array.

  • There isn't a built-in function for value lookup in hash-tables. You can, of course, roll your own, but there are many libraries that provide one already. For example, Underscore provides one (it calls it indexOf).
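The options in the list above can be put side by side in a short sketch. The objects base and obj and the helper keyOfValue are invented for this example; keyOfValue is just one possible way to "roll your own" value lookup for a plain object, not a library function.

```javascript
const base = { inherited: 1 };
const obj = Object.create(base); // obj's prototype is base
obj.own = 2;

// Key lookup: `in` also sees inherited keys, hasOwnProperty does not.
console.log('own' in obj);       // true
console.log('inherited' in obj); // true (found on the prototype)
console.log(Object.prototype.hasOwnProperty.call(obj, 'inherited')); // false

// Value lookup in an array.
console.log(['qwe', 'rty'].indexOf('rty')); // 1

// Hypothetical helper: linear value lookup over an object's own keys.
function keyOfValue(object, value) {
  for (const key of Object.keys(object)) {
    if (object[key] === value) return key;
  }
  return undefined;
}

console.log(keyOfValue({ a: 'qwe', b: 'rty' }, 'rty')); // 'b'
```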

JavaScript has no arrays

Conceptually, JavaScript has only hash-tables. The standard Array function constructs hash-tables (I will call these integer hash-tables, or int-hash-tables) whose keys are integers in addition to string keys. These perform similarly to arrays, but they differ in certain ways, and there are pros and cons. For example, removing an element from an int-hash-table with the delete operator is an O(1) operation because it simply leaves a hole, while deleting an element from a true array is an O(n) operation (because you need to copy the rest of the elements into a new array). Note that Array.prototype.splice closes the gap and re-indexes the following elements, so its cost still grows with the number of elements it has to move. The downside is the complexity of the implementation.

So, when you say Array in a JavaScript context, it is understood to mean an int-hash-table, with all the asymptotic complexity associated with it. This means that finding a string value inside an int-hash-table is an O(n) operation. There is a standard function for doing it: Array.prototype.indexOf. However, if you want to look for a key, there are two options: the in operator and Object.prototype.hasOwnProperty.

Somewhat counterintuitively:

[1, 2, 3].hasOwnProperty(0); // true
0 in [1, 2, 3]; // true

The difference between the two needs further explaining. It is related to the fact that all things in JavaScript are objects, and thus they have some object-y features. One such feature is the prototype: the link between an object and its prototype. The result is a hierarchical structure of hash-tables, each containing the properties of an object.

  • in looks up the immediate hash-table of the object and then recursively searches the hash-tables of this object's prototypes.

  • Whereas Object.prototype.hasOwnProperty only looks into the immediate hash-table. You might think it should be faster, but wait before jumping to conclusions.

Due to the dynamic nature of JavaScript, all function calls are dynamic, and the environment must take a lot of care to ensure fail-safe code execution. This means that function calls in JavaScript are very expensive. So going through Object.prototype.hasOwnProperty may be a lot more expensive than going through in, even though theoretically it should be the opposite. However, given a tall enough inheritance tree and enough inherited properties, Object.prototype.hasOwnProperty will eventually win.

Some examples to get a better intuition:

>>> var array = [1, 2, 3];
undefined
>>> 3 in array;
false
>>> array.hasOwnProperty(3);
false
>>> 3 in array;
false
>>> array.__proto__ = [1, 2, 3, 4];
[1, 2, 3, 4]
>>> 3 in array;
true
>>> array.hasOwnProperty(3);
false

Fastest way to convert string to string array in Java

If you don't want to use regex, you can use substring, which should be much quicker than string concatenation:

public static String[] mySplit(String input) {
    int len = input.length(), index = 0;

    // Arrays are faster than lists
    String[] array = new String[len / 2 + len % 2];
    for (int i = 0; i < len - 1; i += 2, index++) {
        array[index] = input.substring(i, i + 2);
    }
    // To handle strings with an odd number of characters
    if (len % 2 == 1) {
        array[index] = input.substring(len - 1);
    }
    return array;
}


Fastest way to compare a string with an array of strings

Based on this sentence, from the question:

What is [a] way to check if my String has any of the words in myArray?

(Emphasis mine.)

I'd suggest the following, which will test if "some" of the words in the supplied string are present in the supplied array. This – theoretically – stops comparing once there is a match of any of the words from the string present in the array:

var myArray = ["ibira", "garmin", "hide", "park", "parque", "corrida", "trote", "personal", "sports", "esportes", "health", "saúde", "academia"],
myString = "I went to the park with my garmin watch";

function anyInArray(needles, haystack) {

    // we split the supplied string ("needles") into words by splitting
    // the string at the occurrence of a word-boundary ('\b') followed
    // by one or more ('+') occurrences of white-space ('\s') followed by
    // another word-boundary:
    return needles.split(/\b\s+\b/)
        // we then use Array.prototype.some() to work on the array of
        // words, to assess whether any/some of the words ('needle')
        // - using an Arrow function - are present in the supplied
        // array ('haystack'), in which case Array.prototype.indexOf()
        // would return the index of the found-word, or -1 if that word
        // is not found:
        .some(needle => haystack.indexOf(needle) > -1);
    // at which point we return the Boolean, true if some of the
    // words were found, false if none of the words were found.
}

console.log(anyInArray(myString, myArray));


Fastest way to find a String in an array of strings

You can use Set. It is implemented on top of Hash, so a membership check is O(1) on average and will be faster than scanning an array for big datasets.

require 'set'
s = Set.new ['1.1.1.1', '1.2.3.4']
# => #<Set: {"1.1.1.1", "1.2.3.4"}>
s.include? '1.1.1.1'
# => true
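For comparison with the JavaScript answers on this page, the same approach can be sketched with the built-in JavaScript Set. The variable name addresses is made up for this example; building the set costs one pass over the data, so it is mainly worthwhile when many lookups follow.

```javascript
// Build the set once; each membership check afterwards is a hash lookup
// instead of a scan over the whole array.
const addresses = new Set(['1.1.1.1', '1.2.3.4']);

console.log(addresses.has('1.1.1.1')); // true
console.log(addresses.has('8.8.8.8')); // false
```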

What is the fastest way to check if a string is present in a string array?

Presuming you use STL classes, there are a few mechanisms you can use, depending on the domain of your problem.

For example, if the array is unsorted, then it doesn't really matter: there are standard library algorithms that better convey intent and shrink the code, but performance-wise they are equivalent to a simple for-loop, as in the following code.

std::vector<std::string> strings = /*...*/;
// This will find the first string that matches the provided value and return its iterator
auto found_string_iterator = std::find(strings.begin(), strings.end(), "Desired String");
if (found_string_iterator != strings.end()) // found it
    std::cout << *found_string_iterator << std::endl;
else // did not find it
    std::cout << "No such string found." << std::endl;

If the collection is sorted, you can use a Binary Search, which dramatically improves performance:

std::vector<std::string> sorted_strings = /*...*/;
// In a sorted collection, this returns an iterator range covering all strings equal to the provided value
auto string_range_iterators = std::equal_range(sorted_strings.begin(), sorted_strings.end(), "Desired String");
// The range is empty (first == second) when no string matches
if (string_range_iterators.first != string_range_iterators.second) {
    for (auto i = string_range_iterators.first; i != string_range_iterators.second; ++i)
        std::cout << *i << std::endl;
} else {
    std::cout << "No Strings found." << std::endl;
}
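To show what a binary search does under the hood, here is a minimal hand-written sketch in JavaScript, the language used by most other answers on this page. The function name binarySearch is made up for this example, and it assumes the array was sorted with the default Array.prototype.sort() ordering, which agrees with the < comparison on strings.

```javascript
// Returns the index of target in sortedStrings, or -1 if it is absent.
// Each step halves the remaining range, so the search takes O(log n)
// comparisons instead of the O(n) of a linear scan.
function binarySearch(sortedStrings, target) {
  let lo = 0;
  let hi = sortedStrings.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1; // midpoint of the current range
    if (sortedStrings[mid] === target) return mid;
    if (sortedStrings[mid] < target) lo = mid + 1; // target is in the upper half
    else hi = mid - 1;                             // target is in the lower half
  }
  return -1;
}

const sorted = ['qwe', 'rty', 'a', 's df'].sort(); // ['a', 'qwe', 'rty', 's df']
console.log(binarySearch(sorted, 'rty')); // 2
console.log(binarySearch(sorted, 'foo')); // -1
```

Sorting costs O(n log n) up front, so this only beats a plain scan when the sorted array is searched repeatedly.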

If you don't need duplicate strings in your collection, you can use a set or unordered_set to collect the strings, which will guarantee at least the performance of a binary-search, and if you use unordered_set instead, could be faster.

std::set<std::string> collected_strings = /*...*/;
auto found_string_iterator = collected_strings.find("Desired String");
if (found_string_iterator != collected_strings.end()) // found it
    std::cout << *found_string_iterator << std::endl;
else // did not find it
    std::cout << "No such string found." << std::endl;

The most efficient way to search for an array of strings in another string

I think you're looking for an algorithm like Rabin-Karp or Aho–Corasick, which are designed to search a text for a large number of substrings in parallel.
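To give a feel for the second of these, here is a compact, unoptimized sketch of Aho-Corasick in JavaScript. The names buildAutomaton and findAll are made up for this example, and it assumes every pattern is a non-empty string. It builds a trie of the patterns, adds failure links with a breadth-first pass, and then scans the text once.

```javascript
// Build the automaton: a trie of the patterns plus failure links.
function buildAutomaton(patterns) {
  // Each node holds its transitions, its failure link, and the patterns
  // that end at this node (directly or through failure links).
  const nodes = [{ next: new Map(), fail: 0, out: [] }];
  for (const p of patterns) {
    let state = 0;
    for (const ch of p) {
      if (!nodes[state].next.has(ch)) {
        nodes.push({ next: new Map(), fail: 0, out: [] });
        nodes[state].next.set(ch, nodes.length - 1);
      }
      state = nodes[state].next.get(ch);
    }
    nodes[state].out.push(p);
  }
  // Breadth-first pass; nodes directly under the root fail back to the root.
  const queue = [...nodes[0].next.values()];
  for (let head = 0; head < queue.length; head++) {
    const state = queue[head];
    for (const [ch, child] of nodes[state].next) {
      // Walk the parent's failure chain until a state can consume ch.
      let f = nodes[state].fail;
      while (f !== 0 && !nodes[f].next.has(ch)) f = nodes[f].fail;
      const target = nodes[f].next.get(ch);
      nodes[child].fail = target !== undefined ? target : 0;
      // Inherit patterns that end at the failure target (proper suffixes).
      nodes[child].out.push(...nodes[nodes[child].fail].out);
      queue.push(child);
    }
  }
  return nodes;
}

// Scan the text once and collect every pattern occurrence, in match order.
function findAll(nodes, text) {
  const found = [];
  let state = 0;
  for (const ch of text) {
    while (state !== 0 && !nodes[state].next.has(ch)) state = nodes[state].fail;
    const target = nodes[state].next.get(ch);
    state = target !== undefined ? target : 0;
    found.push(...nodes[state].out);
  }
  return found;
}

const automaton = buildAutomaton(['park', 'garmin', 'ark']);
console.log(findAll(automaton, 'I went to the park with my garmin watch'));
// [ 'park', 'ark', 'garmin' ]
```

The text is read only once regardless of how many patterns there are, which is the main attraction when the pattern list is large.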


