Find first character that is different between two strings
You can use a nice property of bitwise XOR (^) to achieve this. When you XOR two strings together, the characters that are the same become null bytes ("\0"). So if we XOR the two strings, we just need to find the position of the first non-null byte using strspn:
$position = strspn($string1 ^ $string2, "\0");
That's all there is to it. So let's look at an example:
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$pos = strspn($string1 ^ $string2, "\0");
printf(
'First difference at position %d: "%s" vs "%s"',
$pos, $string1[$pos], $string2[$pos]
);
That will output:
First difference at position 7: "a" vs "i"
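So that should do it. It's very efficient since it only uses built-in functions implemented in C, and it needs only one extra string in memory (the XOR result). Note that the XOR result is only as long as the shorter of the two strings, so if the strings are identical or one is a prefix of the other, the position returned equals the length of the shorter string.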
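For readers who want to try the same idea outside PHP, here is a minimal Python sketch of the XOR approach. The function name first_diff_xor is hypothetical, and it works on bytes rather than str because Python has no XOR operator for strings. Like the PHP version, it only looks at the common length of the two inputs.

```python
def first_diff_xor(a: bytes, b: bytes) -> int:
    """Return the index of the first differing byte.

    If no byte differs within the common length, the length of the
    shorter input is returned (mirroring the PHP strspn behaviour).
    """
    # XOR byte by byte; zip stops at the shorter input, and equal bytes
    # become zero bytes.
    xored = bytes(x ^ y for x, y in zip(a, b))
    # Count the leading zero bytes, which is what strspn($xor, "\0") does.
    return len(xored) - len(xored.lstrip(b"\0"))


print(first_diff_xor(b"foobarbaz", b"foobarbiz"))
```

For the two example strings above this reports position 7, the same as the PHP code.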
Edit: A MultiByte Solution Along The Same Lines:
function getCharacterOffsetOfDifference($str1, $str2, $encoding = 'UTF-8') {
    return mb_strlen(
        mb_strcut(
            $str1,
            0, strspn($str1 ^ $str2, "\0"),
            $encoding
        ),
        $encoding
    );
}
First the difference at the byte level is found using the above method, and then the offset is mapped to the character level. This is done using the mb_strcut function, which is basically substr but honoring multibyte character boundaries.
var_dump(getCharacterOffsetOfDifference('foo', 'foa')); // 2
var_dump(getCharacterOffsetOfDifference('©oo', 'foa')); // 0
var_dump(getCharacterOffsetOfDifference('f©o', 'fªa')); // 1
It's not as elegant as the first solution, but it's still a one-liner (and a little simpler if you rely on the default encoding):
return mb_strlen(mb_strcut($str1, 0, strspn($str1 ^ $str2, "\0")));
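The byte-to-character mapping can be sketched in Python as well. This is only an illustration under the assumption that both strings are encoded as UTF-8; the name char_offset_of_difference is hypothetical. Decoding the byte prefix with errors="ignore" plays the role of mb_strcut here, dropping a trailing partial character.

```python
def char_offset_of_difference(s1: str, s2: str) -> int:
    """Return the character offset of the first difference (UTF-8 assumed)."""
    b1, b2 = s1.encode("utf-8"), s2.encode("utf-8")
    # Byte offset of the first differing byte, or the shorter length
    # if no byte differs within the common length.
    byte_pos = next(
        (i for i, (x, y) in enumerate(zip(b1, b2)) if x != y),
        min(len(b1), len(b2)),
    )
    # The byte prefix may end in the middle of a multibyte character;
    # errors="ignore" drops that incomplete tail before counting characters.
    return len(b1[:byte_pos].decode("utf-8", errors="ignore"))


print(char_offset_of_difference("foo", "foa"))
print(char_offset_of_difference("©oo", "foa"))
print(char_offset_of_difference("f©o", "fªa"))
```

The three calls reproduce the 2, 0 and 1 results of the var_dump examples above.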
Finding first different character in two strings
You may get the index with:
index = next((i for i in range(min(len(str1), len(str2))) if str1[i]!=str2[i]), None)
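It will return None if no difference is found within the length of the shorter string, that is, if the strings are the same or one is a prefix of the other.
if index is not None:
    print(str1[index:], str2[index:])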
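If a length difference should also count as a difference, the one-liner can be wrapped in a small function. This is a sketch with a hypothetical name, first_diff_index, that returns the length of the shorter string when one string is a prefix of the other and None only when the strings are equal.

```python
def first_diff_index(str1, str2):
    """Return the index of the first difference, or None if the strings are equal."""
    # Compare position by position over the common length.
    for i, (c1, c2) in enumerate(zip(str1, str2)):
        if c1 != c2:
            return i
    # No mismatch in the common part: the strings differ only if their
    # lengths differ, and then the difference starts where the shorter ends.
    if len(str1) != len(str2):
        return min(len(str1), len(str2))
    return None


print(first_diff_index("in he", "in the"))   # 3
print(first_diff_index("abcd", "abcde"))     # 4
print(first_diff_index("zxc", "zxc"))        # None
```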
Detect position of first difference in 2 strings
You can simply iterate through your strings and check it character-by-character.
document.body.innerHTML += findFirstDiffPos("in he", "in the") + "<br/>";
document.body.innerHTML += findFirstDiffPos("abcd", "abcde") + "<br/>";
document.body.innerHTML += findFirstDiffPos("zxc", "zxc");
function findFirstDiffPos(a, b) {
  var shorterLength = Math.min(a.length, b.length);
  for (var i = 0; i < shorterLength; i++) {
    if (a[i] !== b[i]) return i;
  }
  if (a.length !== b.length) return shorterLength;
  return -1;
}
Find the first differing character between two Strings in Ruby
Something like this ought to work:
str_a.each_char.with_index
.find_index {|char, idx| char != str_b[idx] } || str_a.size
Edit: It works: http://ideone.com/Ttwu1x
Edit 2: My original code returned nil if str_a was shorter than str_b. I've updated it to work correctly (it will return str_a.size, so if e.g. the last index in str_a is 3, it will return 4).
Here's another method that may strike some as slightly simpler:
(0...str_a.size).find {|i| str_a[i] != str_b[i] } || str_a.size
http://ideone.com/275cEU
Find difference between two strings in JavaScript
Another option, for more sophisticated difference checking, is to make use of the PatienceDiff algorithm. I ported this algorithm to JavaScript at...
https://github.com/jonTrent/PatienceDiff
...and although the algorithm is typically used for line-by-line comparison of text (such as computer programs), it can also be used for character-by-character comparison. E.g., to compare two strings, you can do the following...
let a = "thelebronnjamist";
let b = "the lebron james";
let difference = patienceDiff( a.split(""), b.split("") );
...with difference.lines being set to an array with the results of the comparison...
difference.lines: Array(19)
0: {line: "t", aIndex: 0, bIndex: 0}
1: {line: "h", aIndex: 1, bIndex: 1}
2: {line: "e", aIndex: 2, bIndex: 2}
3: {line: " ", aIndex: -1, bIndex: 3}
4: {line: "l", aIndex: 3, bIndex: 4}
5: {line: "e", aIndex: 4, bIndex: 5}
6: {line: "b", aIndex: 5, bIndex: 6}
7: {line: "r", aIndex: 6, bIndex: 7}
8: {line: "o", aIndex: 7, bIndex: 8}
9: {line: "n", aIndex: 8, bIndex: 9}
10: {line: "n", aIndex: 9, bIndex: -1}
11: {line: " ", aIndex: -1, bIndex: 10}
12: {line: "j", aIndex: 10, bIndex: 11}
13: {line: "a", aIndex: 11, bIndex: 12}
14: {line: "m", aIndex: 12, bIndex: 13}
15: {line: "i", aIndex: 13, bIndex: -1}
16: {line: "e", aIndex: -1, bIndex: 14}
17: {line: "s", aIndex: 14, bIndex: 15}
18: {line: "t", aIndex: 15, bIndex: -1}
Any element where aIndex === -1 or bIndex === -1 indicates a difference between the two strings. Specifically...
- Element 3 indicates that character " " was found in b in position 3.
- Element 10 indicates that character "n" was found in a in position 9.
- Element 11 indicates that character " " was found in b in position 10.
- Element 15 indicates that character "i" was found in a in position 13.
- Element 16 indicates that character "e" was found in b in position 14.
- Element 18 indicates that character "t" was found in a in position 15.
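A comparable per-character view can be sketched in Python with the standard library's difflib.SequenceMatcher. Note that this is not the Patience algorithm and not the PatienceDiff port, so the exact alignment it picks may differ from the listing above; the function name char_diff is hypothetical. It collects the characters found only in a and only in b, which corresponds to the elements where bIndex or aIndex is -1.

```python
from difflib import SequenceMatcher


def char_diff(a: str, b: str):
    """Return (only_in_a, only_in_b) as lists of (index, character) pairs."""
    only_in_a, only_in_b = [], []
    # autojunk=False disables difflib's popularity heuristic so that every
    # character is treated the same way.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Characters a[i1:i2] have no counterpart in b, and vice versa.
            only_in_a.extend((i, a[i]) for i in range(i1, i2))
            only_in_b.extend((j, b[j]) for j in range(j1, j2))
    return only_in_a, only_in_b


removed, added = char_diff("thelebronnjamist", "the lebron james")
print("only in a:", removed)
print("only in b:", added)
```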
Note that the PatienceDiff algorithm is useful for comparing two similar blocks of text or strings. It will not tell you if basic edits have occurred. Eg, the following...
let a = "james lebron";
let b = "lebron james";
let difference = patienceDiff( a.split(""), b.split("") );
...returns difference.lines containing...
difference.lines: Array(18)
0: {line: "j", aIndex: 0, bIndex: -1}
1: {line: "a", aIndex: 1, bIndex: -1}
2: {line: "m", aIndex: 2, bIndex: -1}
3: {line: "e", aIndex: 3, bIndex: -1}
4: {line: "s", aIndex: 4, bIndex: -1}
5: {line: " ", aIndex: 5, bIndex: -1}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: -1, bIndex: 6}
13: {line: "j", aIndex: -1, bIndex: 7}
14: {line: "a", aIndex: -1, bIndex: 8}
15: {line: "m", aIndex: -1, bIndex: 9}
16: {line: "e", aIndex: -1, bIndex: 10}
17: {line: "s", aIndex: -1, bIndex: 11}
Notice that the PatienceDiff does not report the swap of the first and last name, but rather, provides a result showing what characters were removed from a and what characters were added to b to end up with the result of b.
EDIT: Added new algorithm dubbed patienceDiffPlus.
After mulling over the last example provided above that showed a limitation of the PatienceDiff in identifying lines that likely moved, it dawned on me that there was an elegant way of using the PatienceDiff algorithm to determine if any lines had indeed likely moved rather than just showing deletions and additions.
In short, I added the patienceDiffPlus algorithm (to the GitHub repo identified above) to the bottom of the PatienceDiff.js file. The patienceDiffPlus algorithm takes the deleted aLines[] and added bLines[] from the initial patienceDiff algorithm, and runs them through the patienceDiff algorithm again. I.e., patienceDiffPlus is seeking the Longest Common Subsequence of lines that likely moved, whereupon it records this in the original patienceDiff results. The patienceDiffPlus algorithm continues this until no more moved lines are found.
Now, using patienceDiffPlus, the following comparison...
let a = "james lebron";
let b = "lebron james";
let difference = patienceDiffPlus( a.split(""), b.split("") );
...returns difference.lines containing...
difference.lines: Array(18)
0: {line: "j", aIndex: 0, bIndex: -1, moved: true}
1: {line: "a", aIndex: 1, bIndex: -1, moved: true}
2: {line: "m", aIndex: 2, bIndex: -1, moved: true}
3: {line: "e", aIndex: 3, bIndex: -1, moved: true}
4: {line: "s", aIndex: 4, bIndex: -1, moved: true}
5: {line: " ", aIndex: 5, bIndex: -1, moved: true}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: 5, bIndex: 6, moved: true}
13: {line: "j", aIndex: 0, bIndex: 7, moved: true}
14: {line: "a", aIndex: 1, bIndex: 8, moved: true}
15: {line: "m", aIndex: 2, bIndex: 9, moved: true}
16: {line: "e", aIndex: 3, bIndex: 10, moved: true}
17: {line: "s", aIndex: 4, bIndex: 11, moved: true}
Notice the addition of the moved attribute, which identifies whether a line (or character in this case) was likely moved. Again, patienceDiffPlus simply matches the deleted aLines[] and added bLines[], so there is no guarantee that the lines were actually moved, but there is a strong likelihood that they were indeed moved.
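The re-diffing idea can be illustrated with a short Python sketch. It is not the patienceDiffPlus implementation: it swaps in difflib.SequenceMatcher for the Patience algorithm, and the name likely_moved is hypothetical. It diffs once, then repeatedly diffs the leftover deleted and added characters against each other, pairing up whatever matches, until a pass finds nothing more.

```python
from difflib import SequenceMatcher


def likely_moved(a: str, b: str):
    """Return sorted (a_index, b_index) pairs of characters that likely moved."""
    deleted, added = [], []
    first = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in first.get_opcodes():
        if tag != "equal":
            deleted.extend(range(i1, i2))   # indices present only in a
            added.extend(range(j1, j2))     # indices present only in b

    moves = []
    while deleted and added:
        # Diff the leftovers against each other; matches are likely moves.
        matcher = SequenceMatcher(
            None, [a[i] for i in deleted], [b[j] for j in added], autojunk=False
        )
        pairs = [
            (deleted[m.a + k], added[m.b + k])
            for m in matcher.get_matching_blocks()
            for k in range(m.size)
        ]
        if not pairs:
            break
        moves.extend(pairs)
        used_a = {i for i, _ in pairs}
        used_b = {j for _, j in pairs}
        deleted = [i for i in deleted if i not in used_a]
        added = [j for j in added if j not in used_b]
    return sorted(moves)


print(likely_moved("james lebron", "lebron james"))
```

For "james lebron" and "lebron james" this pairs a indices 0 to 4 with b indices 7 to 11, and the space at a index 5 with b index 6, which agrees with the moved entries in the listing above. As with patienceDiffPlus, a pairing only means the characters match, not that they were actually moved.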
Using R to find the start difference of two strings
We can split the string and compare character by character and get the first mismatch using which.min
which.min(strsplit(string1, "")[[1]] == strsplit(string2, "")[[1]])
#[1] 18
The above method returns a warning message when nchar(string1) is not equal to nchar(string2)
Warning message:
In strsplit(string1, "")[[1]] == strsplit(string2, "")[[1]] :
longer object length is not a multiple of shorter object length
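In most cases it is fine to ignore this message, and the result will usually still be correct. Note, however, that which.min returns 1 when the two strings are identical, because every comparison is TRUE.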
However, to make it complete and reliable we can write a function
location <- function(string1, string2) {
  n = pmin(nchar(string1), nchar(string2))
  i = 1
  while (i <= n) {
    if (substr(string1, i, i) != substr(string2, i, i))
      return(i)
    i = i + 1
  }
  cat("There is no difference between two strings")
}
location(string1, string2)
#[1] 18
location("Ronak", "Shah")
#[1] 1
location("Ronak", "Ronak")
#There is no difference between two strings
Java: Search the first common character between two strings
Make the following two changes:
In your cercaCarattere() method, you can return early once you find the first occurrence. The method can also be simplified to:
public static char cercaCarattere(String str1, String str2) {
    char letter = '*';
    for (int i = 0; i < str1.length() && i < str2.length(); i++) {
        if (str1.charAt(i) == str2.charAt(i)) {
            return str1.charAt(i);
        }
    }
    return letter;
}
And use the value returned by the method to print the result:
System.out.println(cercaCarattere(str1, str2));