Intersecting Two Dictionaries

Intersecting two dictionaries

In general, to construct the intersection of dictionaries in Python, you can first use the & operator to calculate the intersection of sets of the dictionary keys (dictionary keys are set-like objects in Python 3):

dict_a = {"a": 1, "b": 2}
dict_b = {"a": 2, "c": 3}

intersection = dict_a.keys() & dict_b.keys() # {'a'}

On Python 2 you have to convert the dictionary keys to sets yourself:

keys_a = set(dict_a.keys())
keys_b = set(dict_b.keys())
intersection = keys_a & keys_b

Given the intersection of the keys, you can then build the intersection of the values however you like. You have to make a choice here, since the concept of set intersection doesn't tell you what to do if the associated values differ. (This is presumably why the & intersection operator is not defined directly for dictionaries in Python.)

In this case it sounds like your values for the same key would be equal, so you can just choose the value from one of the dictionaries:

dict_of_dicts_a = {"a": {"x":1}, "b": {"y":3}}
dict_of_dicts_b = {"a": {"x":1}, "c": {"z":4}}

shared_keys = dict_of_dicts_a.keys() & dict_of_dicts_b.keys()

# values equal so choose values from a:
dict_intersection = {k: dict_of_dicts_a[k] for k in shared_keys } # {"a":{"x":1}}

Other reasonable methods of combining values depend on the types of the values in your dictionaries and on what they represent. For example, for dictionaries of dictionaries you might want the union of the values for each shared key. The union of two dictionaries is well defined (when both contain the same key, the value from the right-hand operand wins), and in Python 3.9 or later you can get it using the | operator:

# union of values for each key in the intersection:
dict_intersection_2 = { k: dict_of_dicts_a[k] | dict_of_dicts_b[k] for k in shared_keys }

In this case, with identical dictionary values for key "a" in both, this gives the same result.
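The choice of how to combine values for shared keys can be made explicit by passing the combining rule in as a function. The following is a minimal sketch, not a standard-library feature: `intersect_dicts` and its `combine` parameter are hypothetical names introduced here for illustration.

```python
def intersect_dicts(a, b, combine=lambda va, vb: va):
    """Return a dict over the keys shared by a and b.

    combine(va, vb) decides the value stored for each shared key;
    the default simply keeps the value from a.
    (Hypothetical helper, named here for illustration only.)
    """
    return {k: combine(a[k], b[k]) for k in a.keys() & b.keys()}


dict_a = {"a": 1, "b": 2}
dict_b = {"a": 2, "c": 3}

# Keep the value from the first dictionary.
keep_first = intersect_dicts(dict_a, dict_b)
# Add the two values together.
summed = intersect_dicts(dict_a, dict_b, lambda va, vb: va + vb)
# Keep both values side by side.
as_pairs = intersect_dicts(dict_a, dict_b, lambda va, vb: (va, vb))

print(keep_first, summed, as_pairs)
```

Keeping the rule as a parameter means the key intersection is written once, while each caller states what should happen when the values differ.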

Finding the intersection of two dictionaries

Here you are:

var dic1 = new Dictionary<int, string> { { 12, "hi" }, { 14, "bye" } };
var dic2 = new Dictionary<int, string> { { 12, "hello" }, { 18, "bye" } };
HashSet<int> commonKeys = new HashSet<int>(dic1.Keys);
commonKeys.IntersectWith(dic2.Keys);
var result =
    dic1
        .Where(x => commonKeys.Contains(x.Key))
        .Concat(dic2.Where(x => commonKeys.Contains(x.Key)))
        // .Select(x => x.Value) // With this additional select you'll get only the values.
        .ToList();

The result list contains { 12, "hi" } and { 12, "hello" }.

The HashSet is very useful for intersections.


Just out of curiosity I compared all six solutions (hopefully I didn't miss any); the times are as follows:

Author      Method       Technique        Time
@EZI        Intersect2   GroupBy          ~149ms
@Selman22   Intersect3   Keys.Intersect   ~41ms
@dbc        Intersect4   Where1           ~22ms
@dbc        Intersect5   Where2           ~18ms
@dbc        Intersect7   Classic          ~11ms
@t3chb0t    Intersect1   HashSet          ~66ms

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var dic1 = new Dictionary<int, string>();
        var dic2 = new Dictionary<int, string>();

        // Fill both dictionaries with 100,000 distinct random keys each.
        Random rnd = new Random(DateTime.Now.Millisecond);
        for (int i = 0; i < 100000; i++)
        {
            int id = 0;

            do { id = rnd.Next(0, 1000000); } while (dic1.ContainsKey(id));
            dic1.Add(id, "hi");

            do { id = rnd.Next(0, 1000000); } while (dic2.ContainsKey(id));
            dic2.Add(id, "hello");
        }

        List<List<string>> results = new List<List<string>>();

        using (new AutoStopwatch(true)) { results.Add(Intersect1(dic1, dic2)); }
        Console.WriteLine("Intersect1 elapsed in {0}ms (HashSet)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        using (new AutoStopwatch(true)) { results.Add(Intersect2(dic1, dic2)); }
        Console.WriteLine("Intersect2 elapsed in {0}ms (GroupBy)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        using (new AutoStopwatch(true)) { results.Add(Intersect3(dic1, dic2)); }
        Console.WriteLine("Intersect3 elapsed in {0}ms (Keys.Intersect)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        using (new AutoStopwatch(true)) { results.Add(Intersect4(dic1, dic2)); }
        Console.WriteLine("Intersect4 elapsed in {0}ms (Where1)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        using (new AutoStopwatch(true)) { results.Add(Intersect5(dic1, dic2)); }
        Console.WriteLine("Intersect5 elapsed in {0}ms (Where2)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        using (new AutoStopwatch(true)) { results.Add(Intersect7(dic1, dic2)); }
        Console.WriteLine("Intersect7 elapsed in {0}ms (Old style :-)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);

        Console.ReadKey();
    }

    static List<string> Intersect1(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        HashSet<int> commonKeys = new HashSet<int>(dic1.Keys);
        commonKeys.IntersectWith(dic2.Keys);
        var result =
            dic1
                .Where(x => commonKeys.Contains(x.Key))
                .Concat(dic2.Where(x => commonKeys.Contains(x.Key)))
                .Select(x => x.Value)
                .ToList();
        return result;
    }

    static List<string> Intersect2(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result = dic1.Concat(dic2)
            .GroupBy(x => x.Key)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.Select(x => x.Value))
            .ToList();
        return result;
    }

    static List<string> Intersect3(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Keys
                .Intersect(dic2.Keys)
                .SelectMany(key => new[] { dic1[key], dic2[key] })
                .ToList();
        return result;
    }

    static List<string> Intersect4(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Where(pair => dic2.ContainsKey(pair.Key))
                .SelectMany(pair => new[] { dic2[pair.Key], pair.Value })
                .ToList();
        return result;
    }

    static List<string> Intersect5(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Keys
                .Where(dic2.ContainsKey)
                .SelectMany(k => new[] { dic1[k], dic2[k] })
                .ToList();
        return result;
    }

    static List<string> Intersect7(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var list = new List<string>();
        foreach (var key in dic1.Keys)
        {
            if (dic2.ContainsKey(key))
            {
                list.Add(dic1[key]);
                list.Add(dic2[key]);
            }
        }
        return list;
    }
}

class AutoStopwatch : IDisposable
{
    public static readonly Stopwatch Stopwatch = new Stopwatch();

    public AutoStopwatch(bool start)
    {
        Stopwatch.Reset();
        if (start) Stopwatch.Start();
    }

    public void Dispose()
    {
        Stopwatch.Stop();
    }
}

Intersection and Difference of two dictionaries

Here's one way of doing it, though there may be a more efficient method.

d1 = {1:30, 2:20, 3:30, 5:80}
d2 = {1:40, 2:50, 3:60, 4:70, 6:90}

d_intersect = {} # Keys that appear in both dictionaries.
d_difference = {} # Unique keys that appear in only one dictionary.

# Get all keys from both dictionaries.
# Build a set so that we don't loop through duplicate keys.
all_keys = set(d1) | set(d2) # Works on both Python 2.7 and Python 3

for key in all_keys:
    if key in d1 and key in d2:
        # If the key appears in both dictionaries, add both values
        # together and place it in intersect.
        d_intersect[key] = d1[key] + d2[key]
    else:
        # Otherwise find out the dictionary it comes from and place
        # it in difference.
        if key in d1:
            d_difference[key] = d1[key]
        else:
            d_difference[key] = d2[key]

print(d_intersect)
print(d_difference)

Output:

{1: 70, 2: 70, 3: 90}

{4: 70, 5: 80, 6: 90}
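On Python 3 the same result can be reached without looping over every key, because dictionary key views support the set operators directly. This is a sketch of that alternative, not the answer's original method: `&` gives the keys present in both dictionaries and `^` gives the keys present in exactly one.

```python
d1 = {1: 30, 2: 20, 3: 30, 5: 80}
d2 = {1: 40, 2: 50, 3: 60, 4: 70, 6: 90}

# Keys present in both dictionaries: store the sum of the two values.
d_intersect = {k: d1[k] + d2[k] for k in d1.keys() & d2.keys()}

# Keys present in exactly one dictionary (symmetric difference):
# take the value from whichever dictionary holds the key.
d_difference = {k: d1[k] if k in d1 else d2[k] for k in d1.keys() ^ d2.keys()}

print(d_intersect == {1: 70, 2: 70, 3: 90})   # True
print(d_difference == {4: 70, 5: 80, 6: 90})  # True
```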

Intersecting two Dictionaries and getting average scores

You can create normalized dicts where the keys used for matching are extracted from the original keys. Since names both inside and outside parentheses in the keys of the input dicts can be used for matching, create redundant keys for both names in the normalized dict:

import re

# d1 and d2 are the two input dicts from the question.
n1, n2 = (
    {t.lower(): v for k, v in d.items() for t in re.findall(r'[^()]+', k)}
    for d in (d1, d2)
)
print({k: (n1[k] + n2[k]) / 2 for k in n1.keys() & n2.keys()})

This outputs:

{'gurgaon': 17.5, 'jaipur': 40.0}
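To make the normalization step easier to follow, here is a self-contained sketch with made-up inputs. The dictionaries `d1` and `d2` below are hypothetical sample data, not the data from the question, and `normalize` is a name introduced here. The `.strip()` call is also an addition that is not in the code above; it is there only so that keys written with a space before the parenthesis still match.

```python
import re


def normalize(d):
    """Map every name found inside or outside parentheses to the value.

    Names are lower-cased and stripped of surrounding whitespace
    (the stripping is an assumption added for this sketch).
    """
    return {
        name.strip().lower(): value
        for key, value in d.items()
        for name in re.findall(r"[^()]+", key)
        if name.strip()
    }


# Hypothetical sample inputs: a key may carry an alternative name in parentheses.
d1 = {"Gurgaon": 15, "Jaipur (Pink City)": 30, "Delhi": 50}
d2 = {"Gurugram (Gurgaon)": 20, "Jaipur": 50, "Mumbai": 70}

n1, n2 = normalize(d1), normalize(d2)

# Average the scores for every normalized name found in both dicts.
averages = {k: (n1[k] + n2[k]) / 2 for k in n1.keys() & n2.keys()}
print(averages == {"gurgaon": 17.5, "jaipur": 40.0})  # True
```

Because each original key contributes one normalized entry per name, "Gurugram (Gurgaon)" can match a plain "Gurgaon" key in the other dictionary.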

Intersection of two list of dictionaries based on a key

You can do this with a list comprehension. First build a set of all the counts in list2, then keep only the dictionaries from list1 whose count is in that set; the set makes each membership check constant time.

counts = {d2['count'] for d2 in list2}
list3 = [d for d in list1 if d['count'] in counts]

print(list3)
# [{'count': 351, 'att_value': 'one', 'person_id': 12},
# {'count': 359, 'att_value': 'nine', 'person_id': 4}]

(Re: edit) To handle the other keys as well (not just "att_value"), giving each of them a default value of '-', you can use:

keys = list1[0].keys() - {'count'}
idx = {d['count'] : d for d in list1}
list3 = []
for d in list2:
    # Copy the matching dict so that the dicts in list1 are not mutated.
    d2 = dict(idx.get(d['count'], dict.fromkeys(keys, '-')))
    d2.update(d)
    list3.append(d2)

print(list3)
# [{'count': 359, 'att_value': 'nine', 'person_id': 4},
# {'count': 351, 'att_value': 'one', 'person_id': 12},
# {'person_id': 8, 'att_value': '-', 'count': 381}]
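The same join can be wrapped in a function so that the matching field is not hard-coded to 'count'. This is only a sketch: `left_join_on` is a hypothetical helper name, and `list1` and `list2` below are sample inputs made up to be consistent with the output shown above, since the question's actual data is not reproduced here.

```python
def left_join_on(key, left, right, default="-"):
    """For each dict in `right`, merge in the matching dict from `left`.

    Dicts are matched on `key`. When `left` has no match, its other
    fields are filled with `default`. (Hypothetical helper.)
    """
    other_fields = set().union(*(d.keys() for d in left)) - {key}
    index = {d[key]: d for d in left}
    joined = []
    for d in right:
        # Copy so that the dicts in `left` are never mutated.
        row = dict(index.get(d[key], dict.fromkeys(other_fields, default)))
        row.update(d)
        joined.append(row)
    return joined


# Assumed sample data.
list1 = [
    {"count": 351, "att_value": "one", "person_id": 12},
    {"count": 359, "att_value": "nine", "person_id": 4},
]
list2 = [{"count": 359}, {"count": 351}, {"count": 381, "person_id": 8}]

list3 = left_join_on("count", list1, list2)
print(list3)
```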

Python intersection of 2 lists of dictionaries

Use a list comprehension:

[x for x in list1 if x in list2]

For your data, this returns:

[{'count': 351, 'evt_datetime': datetime.datetime(2015, 10, 23, 8, 45), 'att_value': 'red'}, {'count': 359, 'evt_datetime': datetime.datetime(2015, 10, 23, 8, 45), 'att_value': 'red'}]
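The `x in list2` test scans list2 once for every element of list1, which can get slow for long lists. A possible alternative, sketched below, is to freeze each dictionary into a hashable `frozenset` of its items so that the lookup uses a set. This assumes every value in the dictionaries is hashable (true for ints, strings and datetimes, not for lists or nested dicts). The sample lists are made up for illustration.

```python
import datetime

# Assumed sample data, shaped like the output above.
when = datetime.datetime(2015, 10, 23, 8, 45)
list1 = [
    {"count": 351, "evt_datetime": when, "att_value": "red"},
    {"count": 352, "evt_datetime": when, "att_value": "blue"},
]
list2 = [
    {"att_value": "red", "count": 351, "evt_datetime": when},
]

# Dicts are unhashable, so represent each one as a frozenset of
# (key, value) pairs; equal dicts give equal frozensets.
seen = {frozenset(d.items()) for d in list2}
common = [d for d in list1 if frozenset(d.items()) in seen]

print(common)
```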

c# dictionaries intersect

You could do it this way:

resultDict = primaryDict.Keys.Intersect(secondaryDict.Keys)
    .ToDictionary(t => t, t => primaryDict[t]);

or, alternatively:

resultDict = primaryDict.Where(x => secondaryDict.ContainsKey(x.Key))
    .ToDictionary(x => x.Key, x => x.Value);

The latter may be slightly more efficient, because it avoids creating a throw-away collection (the one generated by the Intersect method) and does not require a second access by key to primaryDict.

EDIT (as per comment):

resultDict =
    primaryDict.Where(x => secondaryDict.ContainsKey(x.Key))
        .ToDictionary(x => x.Key, x => x.Value + secondaryDict[x.Key]);

