Determine prefix from a set of (similar) strings
Never rewrite what is provided to you: os.path.commonprefix
does exactly this:
For comparison to the other answers, here's the code:Return the longest path prefix (taken
character-by-character) that is a prefix of all paths in list. If list
is empty, return the empty string (''
). Note that this may return
invalid paths because it works a character at a time.
# Return the longest prefix of all list elements.
def commonprefix(m):
"Given a list of pathnames, returns the longest common leading component"
if not m: return ''
s1 = min(m)
s2 = max(m)
for i, c in enumerate(s1):
if c != s2[i]:
return s1[:i]
return s1
How to get common prefix of strings in a list
Use os.path.commonprefix it will do exactly what you want.
In [1]: list = ['nomad', 'normal', 'nonstop', 'noob']
In [2]: import os.path as p
In [3]: p.commonprefix(list)
Out[3]: 'no'
As an aside, naming a list "list" will make it impossible to access the list
class, so I would recommend using a different variable name. Determine if one string is a prefix of another
Looks like you've got two different problems.
One is to determine if a string is contained as a prefix in another string. For this I would suggest using a function already implemented in the language's string library. In JavaScript you could do this
if (str2.indexOf(str1) === 0) {
// string str1 is a prefix of str2
}
See documentation for String.indexOf here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/indexOfFor the other problem, in a bunch of strings, find out which ones have a given string as a prefix, building a data structure like a Trie or the one you mention seems like the way to go, if you want fast look-ups.
String Match with prefixes
You can use a trie.
Here is an implementation:
class Trie(dict):
def add(self, s):
node = self
for ch in s:
if ch not in node:
node[ch] = Trie()
node = node[ch]
node["end"] = True
def findprefix(self, s):
node = self
len = 0
for i, ch in enumerate(s):
if "end" in node:
len = i
if ch not in node:
break
node = node[ch]
return s[:len]
trie = Trie()
for s in ["good", "goo", "go", "goodbyeparty"]:
trie.add(s)
print(trie.findprefix("goodbye")) # "good"```
Improve string - prefix matching performance
If the prefixes are known in advance and can be preprocessed, you might try a trie. Especially if they are going to be as short as 10 characters. That would mean each check is on the order of 10 comparisons. Not sure how much better one could do.
function buildTrie(trie, words){
for (let word of words){
let _trie = trie;
for (let i=0; i<word.length; i++){
const letter = word[i];
_trie[letter] = _trie[letter] || {};
if (i == word.length - 1)
_trie[letter]['leaf'] = true;
_trie = _trie[letter];
}
}
return trie;
}
function find(trie, str, i=0){
const letter = str[i];
if (!trie[letter])
return false;
if (trie[letter]['leaf'])
return true;
return find(trie[letter], str, i + 1);
}
const prefixes = [ "Hey", "Heya", "Hi", "Hola"];
const trie = buildTrie({}, prefixes)
console.log(trie)
console.log(find(trie, "Hey, I'm Michael"));
console.log(find(trie, "Heuy, I'm Michael"));
Longest Common Prefix from list elements in Python
No matter what, you need to look at each character from each string in turn (until you find a set of corresponding characters that doesn't match), so there's no benefit to splitting the list up. Just iterate through and break when the common prefix stops being common:
def common_prefix(strs) -> str:
prefix = ""
for chars in zip(*strs):
if len(set(chars)) > 1:
break
prefix += chars[0]
return prefix
print(common_prefix(["flowers", "flow", "flight"])) # fl
Check if one string is a prefix of another
Use std::mismatch
. Pass in the shorter string as the first iterator range and the longer as the second iterator range. The return is a pair of iterators, the first is the iterator in the first range and the second, in the second rage. If the first is end of the first range, then you know the the short string is the prefix of the longer string e.g.
std::string foo("foo");
std::string foobar("foobar");
auto res = std::mismatch(foo.begin(), foo.end(), foobar.begin());
if (res.first == foo.end())
{
// foo is a prefix of foobar.
}
Get the longest common string prefix of all words in a list
>>> import os
>>> os.path.commonprefix(["flexible","flexile","flexion","flexor"])
'flex'
Related Topics
Return List of Items in List Greater Than Some Value
Why Is the Apt-Get Function Not Working in the Terminal on MAC Os X V10.9 (Mavericks)
Disable Console Messages in Flask Server
Row-Wise Average for a Subset of Columns with Missing Values
How to Import a Module in Python with Importlib.Import_Module
Errors While Building/Installing C Module for Python 2.7
Typeerror: Expected String or Buffer
Start a Flask Application in Separate Thread
Python, Default Keyword Arguments After Variable Length Positional Arguments
How to Check Task Status in Celery
Getting Data from Ctypes Array into Numpy
String with 'F' Prefix in Python-3.6