How to Map the Differences Between Two Strings

How to map the differences between two strings?

One solution is to replace $(name) with (?P<name>.*) and use that as a regex:

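import re

def make_regex(text):
    # Turn each $(name) placeholder into a named capture group.
    replaced = re.sub(r'\$\((\w+)\)', r'(?P<\1>.*)', text)
    return re.compile(replaced)

def find_mappings(mapper, text):
    # Raises AttributeError if text does not match the template.
    return make_regex(mapper).match(text).groupdict()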

Sample usage:

>>> string1 = "I have $(food1), $(food2), $(food3) for lunch"
>>> string2 = "I have rice, soup, vegetable for lunch"
>>> string3 = "I have rice rice rice, soup, vegetable for lunch"
>>> make_regex(string1).pattern
'I have (?P<food1>.*), (?P<food2>.*), (?P<food3>.*) for lunch'
>>> find_mappings(string1, string2)
{'food1': 'rice', 'food3': 'vegetable', 'food2': 'soup'}
>>> find_mappings(string1, string3)
{'food1': 'rice rice rice', 'food3': 'vegetable', 'food2': 'soup'}

Note that this can handle tokens that are not purely alphanumeric (see food1, which captures "rice rice rice"). Because every group is a greedy .*, the regex may backtrack heavily and can be slow on long inputs. You can replace .* with a narrower pattern to speed it up, depending on what you expect a token to look like.
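
As a rough sketch of that tweak, the group pattern can be made configurable. The token_pattern parameter below is hypothetical (it is not part of the original answer), and the example assumes that tokens never contain a comma:

```python
import re

def make_regex(text, token_pattern=r'.*'):
    # token_pattern is a hypothetical parameter, not part of the original answer.
    # A function is used as the replacement so that backslashes in
    # token_pattern are not interpreted as replacement escapes.
    def to_group(match):
        return '(?P<%s>%s)' % (match.group(1), token_pattern)
    return re.compile(re.sub(r'\$\((\w+)\)', to_group, text))

template = "I have $(food1), $(food2), $(food3) for lunch"
# Assumption: a token is any run of characters that contains no comma.
regex = make_regex(template, token_pattern=r'[^,]+')
mappings = regex.match("I have rice rice rice, soup, vegetable for lunch").groupdict()
print(mappings)
```

With [^,]+ each group stops at the next comma, so the engine has far fewer alternatives to try than with .*.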


For production-ready code you'd want to re.escape the parts outside the (?P<name>.*) groups. This is a bit tedious, because you have to split the string, call re.escape on each literal piece, join the pieces back together, and then call re.compile.


Since my answer got accepted I wanted to include a more robust version of the regex:

def make_regex(text):
    regex = ''.join(map(extract_and_escape, re.split(r'\$\(', text)))
    return re.compile(regex)

def extract_and_escape(partial_text):
    m = re.match(r'(\w+)\)', partial_text)
    if m:
        group_name = m.group(1)
        return ('(?P<%s>.*)' % group_name) + re.escape(partial_text[len(group_name)+1:])
    return re.escape(partial_text)

This avoids problems when the text contains special regex characters, e.g. I have $(food1) and it costs $$$. The first solution would treat $$$ as three $ end-of-string anchors, so the match would fail; the robust version escapes them.
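
For comparison, here is a minimal alternative sketch of the same escaping idea. It is a different technique from the code above: it relies on re.split returning the captured placeholder names between the literal pieces, and it uses fullmatch and returns None when the text does not fit the template. Both of those choices are assumptions about the desired behavior, not something the original answer specifies:

```python
import re

def make_regex(template):
    # With a capturing group, re.split alternates literal text (even indexes)
    # and placeholder names (odd indexes).
    parts = re.split(r'\$\((\w+)\)', template)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2:
            pieces.append('(?P<%s>.*)' % part)
        else:
            pieces.append(re.escape(part))
    return re.compile(''.join(pieces))

def find_mappings(template, text):
    # Assumption: the whole text must match, and a mismatch yields None.
    match = make_regex(template).fullmatch(text)
    return match.groupdict() if match else None

print(find_mappings("I have $(food1) and it costs $$$",
                    "I have rice and it costs $$$"))
```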

Returning difference between two strings (irrespective of their type)

You can just use loops: check whether each character of one string is present in the other, and collect the characters that are not.

Here's a way to do it in Python:

x = 'abcd'
y = 'cdefg'

s = ''
t = ''

for i in x:  # checking x with y
    if i not in y:
        s += i

for i in y:  # checking y with x
    if i not in x:
        t += i

print(s) # ab
print(t) # efg
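
The same loop can be sketched as a small reusable function. The name chars_missing_from is hypothetical; converting the second string to a set is an optional tweak that keeps each membership test cheap for long inputs:

```python
def chars_missing_from(source, other):
    # Keep the characters of source, in order, that never occur in other.
    other_chars = set(other)
    return ''.join(ch for ch in source if ch not in other_chars)

print(chars_missing_from('abcd', 'cdefg'))  # characters only in the first string
print(chars_missing_from('cdefg', 'abcd'))  # characters only in the second string
```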

Edit:

I guess you are working with a pandas column, so here's code that should help:

# importing pandas as pd
import pandas as pd

# Creating the DataFrame
df = pd.DataFrame({'PN': [555, 444, 333, 222, 111],
                   'whatever': ['555A', 444, '333B', 222, '111C']})

A = list(df['PN'])        # Converting the column to a list
B = list(df['whatever'])  # Converting the column to a list

def convert_str(a):  # Function to convert a list element to a string
    return str(a)

C = [convert_str(i) for i in A]  # Converting the elements of list A to strings
D = [convert_str(i) for i in B]  # Converting the elements of list B to strings
E = "".join(C)  # Joining list C
F = "".join(D)  # Joining list D

difference = [i for i in F if i not in E]  # Characters of F that do not appear in E
print(difference)

# Output: ['A', 'B', 'C']
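
The code above joins each whole column into one string before comparing. If a per-row comparison is what is wanted (an assumption; the answer does not say), a minimal sketch on plain lists that mirror the two columns could look like this. The helper name row_difference is hypothetical:

```python
# Plain lists standing in for the 'PN' and 'whatever' columns above.
pn = [555, 444, 333, 222, 111]
whatever = ['555A', 444, '333B', 222, '111C']

def row_difference(left, right):
    # Compare the two values of one row as strings.
    left_text, right_text = str(left), str(right)
    return ''.join(ch for ch in right_text if ch not in left_text)

row_diffs = [row_difference(a, b) for a, b in zip(pn, whatever)]
print(row_diffs)  # ['A', '', 'B', '', 'C']
```

An empty string marks a row where the second value adds no new characters.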

Calculating the difference between two strings

What about doing it recursively? If two elements are the same, the first element of the resulting tuple is incremented; otherwise, the mismatched element from the second list is added to the second element of the resulting tuple:

calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys)
  | x == y = increment (calcP xs ys)
  | otherwise = append y (calcP xs ys)
  where
    increment (count, results) = (count + 1, results)
    append y (count, results) = (count, y:results)

calcP [] x = (0, x)
calcP x [] = (0, [])

a = ["A1","A2","B3","C3"]
b = ["A1","B2","B3","D5"]

main = print $ calcP a b

The printed result is (2,["B2","D5"])

Note that

calcP [] x = (0, x)
calcP x [] = (0, [])

are needed to make the pattern matching exhaustive. In other words, you need to handle the case where one of the lists is empty. These clauses also give the following behavior:

If the first list is longer than the second by n elements, those last n elements are ignored.

If the second list is longer than the first by n elements, those last n elements are appended to the second element of the resulting tuple.
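
For readers more comfortable with Python, the behavior described above can be sketched as an iterative translation. This is not the recursive Haskell original, and the name calc_p is hypothetical:

```python
def calc_p(xs, ys):
    matches = 0
    mismatched = []
    # zip stops at the shorter list, so extra elements of xs are ignored.
    for x, y in zip(xs, ys):
        if x == y:
            matches += 1
        else:
            mismatched.append(y)
    # Extra elements of ys are appended, as in the clause calcP [] x = (0, x).
    mismatched.extend(ys[len(xs):])
    return matches, mismatched

a = ["A1", "A2", "B3", "C3"]
b = ["A1", "B2", "B3", "D5"]
print(calc_p(a, b))  # (2, ['B2', 'D5'])
```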

How to map two String[] to each other

Based on how you're retrieving your value, you want to do something more like this when populating your HashMap:

Map<String, String> myMap = new HashMap<String, String>();
for (int i = 0; i < Yellow_Li.length; i++) { // arrays expose length as a field, not a method
    myMap.put(Yellow_Li[i], Yellow_ID[i]);
}
String value = myMap.get(selectedValue);

I assume the String arrays are the same size; if they are not, you should add logic to handle that.

Also, if the arrays are the same size you could do something like this, so you don't have to build a HashMap:

int index = -1;
for (int i = 0; i < Yellow_Li.length; i++) {
    if (Yellow_Li[i].equals(selectedValue)) {
        index = i;
        break;
    }
}

String value = Yellow_ID[index]; // check that index is not -1 before assigning "value"
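
For comparison with the Python snippets elsewhere on this page, the same pairing idea can be sketched in Python. The sample data and the build_lookup helper are hypothetical, since the contents of the original arrays are not shown. The sketch includes the length check and the missing-key case mentioned above:

```python
# Hypothetical sample data; the original arrays are not shown in the answer.
labels = ["Apple", "Banana", "Cherry"]
ids = ["id-1", "id-2", "id-3"]

def build_lookup(keys, values):
    # Refuse to pair up sequences of different lengths.
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return dict(zip(keys, values))

lookup = build_lookup(labels, ids)
value = lookup.get("Banana")  # .get returns None when the key is absent
print(value)
```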

How to find words that are different between two Strings? [closed]

If it's only words, you can split each string into a String[] so that each cell contains exactly one word, and then compare the words position by position. It goes as follows:

String one = "this is first text example";
String two = "this is next text example";
String[] oneVals = one.split(" ");
String[] twoVals = two.split(" ");
int i = oneVals.length;
if (oneVals.length != twoVals.length)
{
    // determine what to do
}
String wordsNotMatching = "";
for (int j = 0; j < i; j++)
{
    if (!oneVals[j].equals(twoVals[j]))
        wordsNotMatching += oneVals[j] + " ";
}
// wordsNotMatching will contain every word of "one" that differs from the word at the same position in "two".
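
The position-by-position comparison can also be sketched in Python. This version makes one possible choice for the unequal-length case that the code above leaves open: it pads the shorter side with None using itertools.zip_longest. That choice, and the name differing_words, are assumptions rather than part of the original answer:

```python
from itertools import zip_longest

def differing_words(one, two):
    # Pair up words by position; a missing word on either side becomes None.
    pairs = zip_longest(one.split(), two.split())
    return [(a, b) for a, b in pairs if a != b]

print(differing_words("this is first text example",
                      "this is next text example"))  # [('first', 'next')]
```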

What is the best effective way to get symmetric difference between two strings(in python)?

You can convert each string to a set, use symmetric_difference, and finally str.join the result back into a single string. Set iteration order is arbitrary, so the order of the characters in the output can vary:

>>> ''.join(set(a).symmetric_difference(b))
'daeb'
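
If a reproducible result matters, one option is to sort the characters before joining. This is a minimal sketch with hypothetical inputs, since the values of a and b are not shown above:

```python
def symmetric_difference(a, b):
    # Characters that occur in exactly one of the two strings, in sorted order.
    return ''.join(sorted(set(a) ^ set(b)))

print(symmetric_difference('abcd', 'cdefg'))  # abefg
```

The ^ operator on sets is equivalent to calling symmetric_difference on the first set.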

