Finding Common Prefix of Array of Strings

Longest Common Prefix in Javascript

As the longest common prefix must occur in every string of the array you can jus iterate over the length and check if all words have the same char at that index until you find a difference

function prefix(words){
// check border cases size 1 array and empty first word)
if (!words[0] || words.length == 1) return words[0] || "";
let i = 0;
// while all words have the same character at position i, increment i
while(words[0][i] && words.every(w => w[i] === words[0][i]))
i++;

// prefix is the substring from the beginning to the last successfully checked i
return words[0].substr(0, i);
}

console.log(1, prefix([]));
console.log(2, prefix([""]));
console.log(3, prefix(["abc"]));
console.log(4, prefix(["abcdefgh", "abcde", "abe"]));
console.log(5, prefix(["abc", "abc", "abc"]));
console.log(6, prefix(["abc", "abcde", "xyz"]));

Find the longest common prefix from an array of strings

Here's a possible solution, needs to cover cases with an empty array but you get an idea.

While all letters at index c are the same I keep looping, otherwise I exit.
At the end I take c characters from the first word as it's the length of the common prefix.

function longestCommonPrefix(strs) {
var splits = strs.map((item) => item.split(''));
var c = 0;
var matches = true;
while (matches) {
var letter = splits[0][c];
if (letter != null) {
matches = splits.every(s => s[c] == letter);
if (matches)
c++;
} else matches = false;
}

var prefix = strs[0].slice(0, c);
console.log("The prefix is: " + prefix);
};

longestCommonPrefix(["flower", "flow", "flight"])

finding common prefix of array of strings

I implemented @diogoriba algorithm into code, with this result:

  • Finding the common prefix of the first two strings, and then comparing this with all following strings starting from the 3rd, and trim the common string if nothing common is found, wins in situations where there is more in common in the prefixes than different.
  • But bumperbox's original algorithm (except the bugfixes) wins where the strings have less in common in their prefix than different. Details in the code comments!

Another idea I implemented:

First check for the shortest string in the array, and use this for comparison rather than simply the first string. In the code, this is implemented with the custom written function arrayStrLenMin().

  • Can bring down iterations dramatically, but the function arrayStrLenMin() may itself cause ( more or less) iterations.
  • Simply starting with the length of first string in array seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.

Get the maximum common prefix of strings in an array with as little iterations as possible (PHP)

Code + Extensive Testing + Remarks:

function arrayStrLenMin ($arr, $strictMode = false, $forLoop = false) {
$errArrZeroLength = -1; // Return value for error: Array is empty
$errOtherType = -2; // Return value for error: Found other type (than string in array)
$errStrNone = -3; // Return value for error: No strings found (in array)

$arrLength = count($arr);
if ($arrLength <= 0 ) { return $errArrZeroLength; }
$cur = 0;

foreach ($arr as $key => $val) {
if (is_string($val)) {
$min = strlen($val);
$strFirstFound = $key;
// echo("Key\tLength / Notification / Error\n");
// echo("$key\tFound first string member at key with length: $min!\n");
break;
}
else if ($strictMode) { return $errOtherType; } // At least 1 type other than string was found.
}
if (! isset($min)) { return $errStrNone; } // No string was found in array.

// SpeedRatio of foreach/for is approximately 2/1 as dicussed at:
// http://juliusbeckmann.de/blog/php-foreach-vs-while-vs-for-the-loop-battle.html

// If $strFirstFound is found within the first 1/SpeedRatio (=0.5) of the array, "foreach" is faster!

if (! $forLoop) {
foreach ($arr as $key => $val) {
if (is_string($val)) {
$cur = strlen($val);
// echo("$key\t$cur\n");
if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
if ($cur < $min) { $min = $cur; }
}
// else { echo("$key\tNo string!\n"); }
}
}

// If $strFirstFound is found after the first 1/SpeedRatio (=0.5) of the array, "for" is faster!

else {
for ($i = $strFirstFound + 1; $i < $arrLength; $i++) {
if (is_string($arr[$i])) {
$cur = strlen($arr[$i]);
// echo("$i\t$cur\n");
if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
if ($cur < $min) { $min = $cur; }
}
// else { echo("$i\tNo string!\n"); }
}
}

return $min;
}

function strCommonPrefixByStr($arr, $strFindShortestFirst = false) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }

// Determine loop length
/// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
/// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
else { $end = strlen($arr[0]); }

for ($i = 1; $i <= $end + 1; $i++) {
// Grab the part from 0 up to $i
$commonStrMax = substr($arr[0], 0, $i);
echo("Match: $i\t$commonStrMax\n");
// Loop through all the values in array, and compare if they match
foreach ($arr as $key => $str) {
echo(" Str: $key\t$str\n");
// Didn't match, return the part that did match
if ($commonStrMax != substr($str, 0, $i)) {
return substr($commonStrMax, 0, $i-1);
}
}
}
// Special case: No mismatch (hence no return) happened until loop end!
return $commonStrMax; // Thus entire first common string is the common prefix!
}

function strCommonPrefixByChar($arr, $strFindShortestFirst = false) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }

// Determine loop length
/// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
/// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
else { $end = strlen($arr[0]); }

for ($i = 0 ; $i <= $end + 1; $i++) {
// Grab char $i
$char = substr($arr[0], $i, 1);
echo("Match: $i\t"); echo(str_pad($char, $i+1, " ", STR_PAD_LEFT)); echo("\n");
// Loop through all the values in array, and compare if they match
foreach ($arr as $key => $str) {
echo(" Str: $key\t$str\n");
// Didn't match, return the part that did match
if ($char != $str[$i]) { // Same functionality as ($char != substr($str, $i, 1)). Same efficiency?
return substr($arr[0], 0, $i);
}
}
}
// Special case: No mismatch (hence no return) happened until loop end!
return substr($arr[0], 0, $end); // Thus entire first common string is the common prefix!
}

function strCommonPrefixByNeighbour($arr) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }

/// Get the common string prefix of the first 2 strings
$strCommonMax = strCommonPrefixByChar(array($arr[0], $arr[1]));
if ($strCommonMax === false) { return false; }
if ($strCommonMax == "") { return ""; }
$strCommonMaxLength = strlen($strCommonMax);

/// Now start looping from the 3rd string
echo("-----\n");
for ($i = 2; ($i < $arrLength) && ($strCommonMaxLength >= 1); $i++ ) {
echo(" STR: $i\t{$arr[$i]}\n");

/// Compare the maximum common string with the next neighbour

/*
//// Compare by char: Method unsuitable!

// Iterate from string end to string beginning
for ($ii = $strCommonMaxLength - 1; $ii >= 0; $ii--) {
echo("Match: $ii\t"); echo(str_pad($arr[$i][$ii], $ii+1, " ", STR_PAD_LEFT)); echo("\n");
// If you find the first mismatch from the end, break.
if ($arr[$i][$ii] != $strCommonMax[$ii]) {
$strCommonMaxLength = $ii - 1; break;
// BUT!!! We may falsely assume that the string from the first mismatch until the begining match! This new string neighbour string is completely "unexplored land", there might be differing chars closer to the beginning. This method is not suitable. Better use string comparison than char comparison.
}
}
*/

//// Compare by string

for ($ii = $strCommonMaxLength; $ii > 0; $ii--) {
echo("MATCH: $ii\t$strCommonMax\n");
if (substr($arr[$i],0,$ii) == $strCommonMax) {
break;
}
else {
$strCommonMax = substr($strCommonMax,0,$ii - 1);
$strCommonMaxLength--;
}
}
}
return substr($arr[0], 0, $strCommonMaxLength);
}

// Tests for finding the common prefix

/// Scenarios

$filesLeastInCommon = array (
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/1",
"/Vol/2/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/2",
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/1",
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/2",
"/Vol/2/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/c/1",
"/Vol/2/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/1",
);

$filesLessInCommon = array (
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/1",
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/2",
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/1",
"/Vol/1/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/2",
"/Vol/2/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/b/c/1",
"/Vol/2/Finding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of StringsFinding Common Prefix of Array of Stringsa/a/1",
);

$filesMoreInCommon = array (
"/Voluuuuuuuuuuuuuumes/1/a/a/1",
"/Voluuuuuuuuuuuuuumes/1/a/a/2",
"/Voluuuuuuuuuuuuuumes/1/a/b/1",
"/Voluuuuuuuuuuuuuumes/1/a/b/2",
"/Voluuuuuuuuuuuuuumes/2/a/b/c/1",
"/Voluuuuuuuuuuuuuumes/2/a/a/1",
);

$sameDir = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
);

$sameFile = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/1",
);

$noCommonPrefix = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
"Net/1/a/a/aaaaa/2",
);

$longestLast = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/aaaaa/2",
);

$longestFirst = array (
"/Volumes/1/a/a/aaaaa/1",
"/Volumes/1/a/a/2",
);

$one = array ("/Volumes/1/a/a/aaaaa/1");

$empty = array ( );

// Test Results for finding the common prefix

/*

I tested my functions in many possible scenarios.
The results, the common prefixes, were always correct in all scenarios!
Just try a function call with your individual array!

Considering iteration efficiency, I also performed tests:

I put echo functions into the functions where iterations occur, and measured the number of CLI line output via:
php <script with strCommonPrefixByStr or strCommonPrefixByChar> | egrep "^ Str:" | wc -l GIVES TOTAL ITERATION SUM.
php <Script with strCommonPrefixByNeighbour> | egrep "^ Str:" | wc -l PLUS | egrep "^MATCH:" | wc -l GIVES TOTAL ITERATION SUM.

My hypothesis was proven:
strCommonPrefixByChar wins in situations where the strings have less in common in their beginning (=prefix).
strCommonPrefixByNeighbour wins where there is more in common in the prefixes.

*/

// Test Results Table
// Used Functions | Iteration amount | Remarks

// $result = (strCommonPrefixByStr($filesLessInCommon)); // 35
// $result = (strCommonPrefixByChar($filesLessInCommon)); // 35 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLessInCommon)); // 88 + 42 = 130 // Loses in this category!

// $result = (strCommonPrefixByStr($filesMoreInCommon)); // 137
// $result = (strCommonPrefixByChar($filesMoreInCommon)); // 137 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLeastInCommon)); // 12 + 4 = 16 // Far the winner in this category!

echo("Common prefix of all members:\n");
var_dump($result);

// Tests for finding the shortest string in array

/// Arrays

// $empty = array ();
// $noStrings = array (0,1,2,3.0001,4,false,true,77);
// $stringsOnly = array ("one","two","three","four");
// $mixed = array (0,1,2,3.0001,"four",false,true,"seven", 8888);

/// Scenarios

// I list them from fewest to most iterations, which is not necessarily equivalent to slowest to fastest!
// For speed consider the remarks in the code considering the Speed ratio of foreach/for!

//// Fewest iterations (immediate abort on "Found other type", use "for" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, true, true) . "\n\n");
// }

/* Results:

NEW ANALYSIS:
Result: Array is empty!

NEW ANALYSIS:
Result: Found other type!

NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
1 3
2 5
3 4
Result: 3

NEW ANALYSIS:
Result: Found other type!

*/

//// Fewer iterations (immediate abort on "Found other type", use "foreach" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, true, false) . "\n\n");
// }

/* Results:

NEW ANALYSIS:
Result: Array is empty!

NEW ANALYSIS:
Result: Found other type!

NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
0 3
1 3
2 5
3 4
Result: 3

NEW ANALYSIS:
Result: Found other type!

*/

//// More iterations (No immediate abort on "Found other type", use "for" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, false, true) . "\n\n");
// }

/* Results:

NEW ANALYSIS:
Result: Array is empty!

NEW ANALYSIS:
Result: No strings found!

NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
1 3
2 5
3 4
Result: 3

NEW ANALYSIS:
Key Length / Notification / Error
4 Found first string member at key with length: 4!
5 No string!
6 No string!
7 5
8 No string!
Result: 4

*/

//// Most iterations (No immediate abort on "Found other type", use "foreach" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, false, false) . "\n\n");
// }

/* Results:

NEW ANALYSIS:
Result: Array is empty!

NEW ANALYSIS:
Result: No strings found!

NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
0 3
1 3
2 5
3 4
Result: 3

NEW ANALYSIS:
Key Length / Notification / Error
4 Found first string member at key with length: 4!
0 No string!
1 No string!
2 No string!
3 No string!
4 4
5 No string!
6 No string!
7 5
8 No string!
Result: 4

*/

Longest Common Prefix in String Array in Java

The error in your code is in the part you used without checking that the variable (K) might be larger than the string length (j) of the array.
To solve this problem, it is enough to add a conditional statement before using the variable (K).
Good luck

How to determine longest common prefix and suffix for array of strings?

extension Collection where Element: StringProtocol {

func longestCommonPrefix() -> String {
guard var prefix = first.map({ String($0) }) else { return "" }
for string in dropFirst() {
while !string.hasPrefix(prefix) {
prefix.removeLast()
}
}
return prefix
}

func longestCommonSuffix() -> String {
guard var suffix = first.map({ String($0) }) else { return "" }
for string in dropFirst() {
while !string.hasSuffix(suffix) {
suffix.removeFirst()
}
}
return suffix
}

}

print(["A12[1]", "A13[1]", "A14[1]"].longestCommonPrefix()) // "A1"
print(["A12[1]", "A13[1]", "A14[1]"].longestCommonSuffix()) // "[1]"
print(["9-b", "10-b", "11-b"].longestCommonPrefix()) // ""
print(["9-b", "10-b", "11-b"].longestCommonSuffix()) // "-b"
print(["A12", "A14", "A6"].longestCommonPrefix()) // "A"
print(["A12", "A14", "A6"].longestCommonSuffix()) // ""

If you're importing Foundation, you can use its String.commonPrefix(with:) extension method to write an even shorter version:

import Foundation

extension Collection where Element: StringProtocol {
func longestCommonPrefix() -> String {
guard let first = self.first.map({ String($0) }) else { return "" }
return dropFirst().reduce(first, { $0.commonPrefix(with: $1) })
}

func longestCommonSuffix() -> String {
return String(self.lazy.map({ String($0.reversed()) }).longestCommonPrefix().reversed())
}
}

I learned about commonPrefix(with:) from Martin R's answer.

C#:How to get the longest common prefix from a list of strings LINQ expression

It was an interesting challenge and this is my solution:

var array = new []{"a","abC","abcD"};

var longestCommonPrefix = Enumerable.Range(1, array.Max(_ => _)!.Length)
.Select(i =>
{
var grouped = array.Where(x => x.Length >= i)
.GroupBy(x => x[..i])
.Where(x => x.Count() > 1)
.OrderByDescending(x => x.Count())
.Select(x => new { LongestCommonPrefix = x.Key })
.FirstOrDefault();

return grouped?.LongestCommonPrefix ?? string.Empty;
}).Max();

C: print the longest common prefix

This declaration

char found[10] = { '\0' }; 

is redundant and does not make a sense.

Also the function findprefix should return the length of the common prefix.

The function should be declared and defined the following way

size_t findprefix( const char *str1, const char *str2 )
{
size_t n = 0;

for ( ; *str1 && *str1 == *str2; ++str1, ++str2 )
{
++n;
}

return n;
}

And in main you can write

size_t n = findprefix( str1, str2 );

if ( n != 0 ) printf( "%.*s\n", ( int )n, str1 );

Here is a demonstration progarn.

#include <stdio.h>

size_t findprefix( const char *str1, const char *str2 )
{
size_t n = 0;

for ( ; *str1 && *str1 == *str2; ++str1, ++str2 )
{
++n;
}

return n;
}

int main( void )
{
const char *str1 = "Hello Word!";
const char *str2 = "Hello Kallum Smith";

size_t n = findprefix( str1, str2 );

if ( n != 0 ) printf( "\"%.*s\"\n", ( int )n, str1 );

return 0;
}

The program output is

"Hello "

Using the return value of the function you also can dynamically allocate an array or declare a variable length array where you can copy the prefix if it is required.



Related Topics



Leave a reply



Submit