Finding Unique Combinations Irrespective of Position

Finding unique combinations irrespective of position

One option is to sort the values within each row and then drop the rows that become duplicates:

indx <- !duplicated(t(apply(df, 1, sort))) # finds non-duplicates in sorted rows
df[indx, ] # selects only the non-duplicates according to that index
# a b c
# 1 1 2 3
# 3 3 1 4
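
The same idea carries over to other languages. As a minimal Python sketch (standard library only; the `rows` data and the `unique_unordered_rows` helper are hypothetical and not part of the original answer), each row is sorted to get an order-independent key, and only the first row seen for each key is kept:

```python
def unique_unordered_rows(rows):
    """Keep the first row for each distinct multiset of values."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(sorted(row))  # order-independent key for the row
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical data: rows 2 and 4 repeat the values of rows 1 and 3 in another order.
rows = [(1, 2, 3), (3, 2, 1), (3, 1, 4), (4, 3, 1)]
print(unique_unordered_rows(rows))
```

Like the R version, this keeps the original (unsorted) row for the first occurrence rather than the sorted key.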

Get Unique List of Combinations of Strings, regardless of Order

df$grp <- interaction(do.call(pmin, df[1:2]), do.call(pmax, df[1:2]))

df
# col1 col2 grp
# 1 a b a.b
# 2 c d c.d
# 3 g h g.h
# 4 d c c.d
# 5 e f e.f
# 6 b a a.b
# 7 f e e.f
# 8 h g g.h

If you want numbers, you can then do

df$grp <- as.integer(df$grp)

df
# col1 col2 grp
# 1 a b 1
# 2 c d 6
# 3 g h 16
# 4 d c 6
# 5 e f 11
# 6 b a 1
# 7 f e 11
# 8 h g 16
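
For comparison, here is a rough Python sketch of the same grouping (the `pair_groups` helper is a hypothetical name, and the pairs are copied from the output above). It labels each pair by its sorted members, which plays the role of pmin/pmax, and then numbers the labels. Unlike the R output, where the integers appear to be factor level codes, the ids here are consecutive in order of first appearance:

```python
def pair_groups(pairs):
    """Label each pair by its sorted members and give each label an integer id."""
    labels = [".".join(sorted(pair)) for pair in pairs]
    ids = {}
    for label in labels:
        ids.setdefault(label, len(ids) + 1)  # ids follow first appearance: 1, 2, 3, ...
    return [(label, ids[label]) for label in labels]


pairs = [("a", "b"), ("c", "d"), ("g", "h"), ("d", "c"),
         ("e", "f"), ("b", "a"), ("f", "e"), ("h", "g")]
for pair, (label, group_id) in zip(pairs, pair_groups(pairs)):
    print(pair, label, group_id)
```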

Count unique combinations regardless of column order

Another solution, using .groupby:

x = (
    df1.groupby(df1.apply(lambda x: tuple(sorted(x)), axis=1))
    .agg(A=("A", "first"), B=("B", "first"), count=("B", "size"))
    .reset_index(drop=True)
)
print(x)

Prints:

       A      B  count
0    cat  bunny      1
1  bunny  mouse      2
2    dog    cat      3
3  mouse    dog      1
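
If pandas is not available, the counting step can be sketched with the standard library alone. The `rows` list below is hypothetical data standing in for the A and B columns of df1 (the original frame is not shown), chosen so that the counts match the output above:

```python
from collections import Counter

# Hypothetical rows standing in for the A and B columns of df1.
rows = [("cat", "bunny"), ("bunny", "mouse"), ("dog", "cat"), ("mouse", "dog"),
        ("mouse", "bunny"), ("cat", "dog"), ("dog", "cat")]

# Sorting each row makes ("dog", "cat") and ("cat", "dog") the same key.
counts = Counter(tuple(sorted(row)) for row in rows)
for combo, count in sorted(counts.items()):
    print(combo, count)
```

This reports the sorted key for each combination, whereas the groupby version reports the first A and B values seen in each group.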

Counting unique combinations of values across multiple columns regardless of order?

Assuming the character / doesn't show up in any of the offer names, you can do:

select count(distinct offer_combo) as distinct_offers
from (
  select listagg(offer, '/') within group (order by offer) as offer_combo
  from (
    select customer_id, offer_1 as offer from t
    union all select customer_id, offer_2 from t
    union all select customer_id, offer_3 from t
  ) x
  group by customer_id
) y

Result:

DISTINCT_OFFERS
---------------
2

See running example at db<>fiddle.
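
To make the logic of the query easier to follow, here is a small Python sketch of the same steps. The table contents are hypothetical (the original data is not shown), and the same assumption applies: no offer name contains the / character.

```python
# Hypothetical table t: customer_id -> (offer_1, offer_2, offer_3).
t = {
    1: ("A", "B", "C"),
    2: ("C", "A", "B"),
    3: ("A", "B", "D"),
}

# Sort each customer's offers and join them with "/", mirroring
# listagg(offer, '/') within group (order by offer).
offer_combos = {cid: "/".join(sorted(offers)) for cid, offers in t.items()}

# Counting the distinct joined strings mirrors count(distinct offer_combo).
distinct_offers = len(set(offer_combos.values()))
print(distinct_offers)
```

Customers 1 and 2 hold the same offers in a different order, so they collapse into one combination.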

Get unique combinations of elements from a python list

You need itertools.combinations:

>>> from itertools import combinations
>>> L = [1, 2, 3, 4]
>>> [",".join(map(str, comb)) for comb in combinations(L, 3)]
['1,2,3', '1,2,4', '1,3,4', '2,3,4']
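
One caveat worth illustrating: combinations treats elements as distinct by position, not by value, so a list with repeated values yields repeated combinations. A minimal sketch of one way to handle that (the input list here is a hypothetical example) is to sort the input and collect the results in a set:

```python
from itertools import combinations

L = [1, 2, 2, 3]  # hypothetical input with a repeated value

# combinations() would emit (1, 2, 3) twice here, once per copy of 2;
# sorting the input and collecting into a set removes the repeats.
unique_combos = sorted(set(combinations(sorted(L), 3)))
print(unique_combos)
```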

How to get every unique digit combination regardless of digit placement

Here's some code I wrote once:

function kPn(k, values, repetition) {
  // n is the number of items: the array length, or `values` itself if a count was passed
  var n = Array.isArray(values) ? values.length : values;
  var list = [];
  var retVal = [];
  for (var i = 0; i < n; i++) {
    list.push(i);
    retVal.push([i]);
  }
  // extend every partial arrangement by one index until it has k entries
  for (var len = 2; len <= k; len++) {
    var tempRetVal = [];
    for (var rv = 0; rv < retVal.length; rv++) {
      for (var l = 0; l < list.length; l++) {
        if (repetition || !retVal[rv].includes(list[l])) {
          var retValItem = retVal[rv].slice();
          retValItem.push(list[l]);
          tempRetVal.push(retValItem);
        }
      }
    }
    retVal = tempRetVal;
  }
  if (!Array.isArray(values)) values = list;
  // map the index arrangements back to the actual values
  var result = [];
  for (var p = 0; p < retVal.length; p++) {
    var tempSet = [];
    for (var j = 0; j < retVal[p].length; j++) {
      tempSet.push(values[retVal[p][j]]);
    }
    result.push(tempSet);
  }
  return result;
}

k: how many values you want,

values: array of values, and

repetition: true|false (whether a value may be used more than once).

example:

kPn(3, ["a","b","c"], false);

returns:

[
  ["a", "b", "c"],
  ["a", "c", "b"],
  ["b", "a", "c"],
  ["b", "c", "a"],
  ["c", "a", "b"],
  ["c", "b", "a"]
]
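
Note that the result above lists ordered arrangements, so the same three values appear six times in different positions. If arrangements that differ only in placement should count once, one option is to generate unordered selections directly. The following is a Python sketch of that idea, not a port of kPn; the `kCn` name is hypothetical, and it relies on itertools from the standard library:

```python
from itertools import combinations, combinations_with_replacement


def kCn(k, values, repetition):
    """Return the k-element selections of values, ignoring order."""
    pick = combinations_with_replacement if repetition else combinations
    return [list(c) for c in pick(values, k)]


print(kCn(3, ["a", "b", "c"], False))
print(kCn(2, ["a", "b", "c"], True))
```

With three values and k = 3 without repetition there is only one selection, since all six arrangements above contain the same values.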

Creating a df of unique combinations of columns in R where order doesn't matter

A base R method is to create all the combinations of political_spectrum_values, taking 3 at a time, using expand.grid, then sort the values within each row and select the unique rows.

df <- expand.grid(first_person = political_spectrum_values,
                  second_person = political_spectrum_values,
                  third_person = political_spectrum_values)

df[] <- t(apply(df, 1, sort))
unique(df)

If needed as a single string

unique(apply(df, 1, function(x) paste0(sort(x), collapse = "_")))
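
As a cross-check of the approach, here is a Python sketch that does the same thing two ways. The three values are a hypothetical stand-in for political_spectrum_values, which is not shown in the original. The first route mirrors expand.grid, sort and unique; the second generates the unordered triples directly with itertools.combinations_with_replacement:

```python
from itertools import combinations_with_replacement, product

# Hypothetical stand-in for political_spectrum_values.
political_spectrum_values = ["left", "center", "right"]

# Route 1 (mirrors expand.grid + sort + unique): build every ordered triple,
# sort each one, and keep the distinct results.
via_product = {tuple(sorted(t)) for t in product(political_spectrum_values, repeat=3)}

# Route 2: generate the unordered triples directly from the sorted values.
direct = set(combinations_with_replacement(sorted(political_spectrum_values), 3))

print(len(via_product), via_product == direct)

# Single-string form, as in the paste0(..., collapse = "_") variant.
as_strings = sorted("_".join(t) for t in direct)
print(as_strings[0])
```

The direct route avoids building all n^3 ordered rows first, which may matter when the list of values is long.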

Create unique combinations regardless of subset size

You could use a recursive function to "brute force" the packing combinations and get the best fit out of those:

def pack(sizes,bound,subset=[]):
    if not sizes:                         # all sizes used
        yield [subset]                    # return current subset
        return
    if sizes and not subset:              # start new subset
        i,m = max(enumerate(sizes),key=lambda s:s[1])
        subset = [m]                      # using largest size
        sizes = sizes[:i]+sizes[i+1:]     # (to avoid repeats)
    used = sum(subset)
    for i,size in enumerate(sizes):       # add to current subset
        if subset and size>subset[-1]:    # non-increasing order
            continue                      # (to avoid repeats)
        if used + size <= bound:
            yield from pack(sizes[:i]+sizes[i+1:],bound,subset+[size])
    if sizes:
        for p in pack(sizes,bound):       # add more subsets
            yield [subset,*p]

def bestFit(sizes,bound):
    packs = pack(sizes,bound)
    return min(packs,key = lambda p : bound*len(p)-sum(sizes))

output:

for p in pack([1,2,3,4,5],8):
    print(p,8*len(p)-sum(map(sum,p)))

[[5, 1], [4], [3, 2]] 9
[[5, 2, 1], [4, 3]] 1
[[5, 2], [4, 3, 1]] 1
[[5, 2], [4], [3, 1]] 9
[[5, 3], [4, 2, 1]] 1
[[5, 3], [4], [2, 1]] 9
[[5], [4, 1], [3, 2]] 9
[[5], [4, 2], [3, 1]] 9
[[5], [4, 3], [2, 1]] 9
[[5], [4], [3, 2, 1]] 9
[[5], [4], [3], [2, 1]] 17

print(*bestFit([1,2,3,4,5],8))
# [5, 2, 1] [4, 3]

print(*bestFit([1,2,3,4,5,6,7,8,9],18))
# [9, 1] [8, 4, 3, 2] [7, 6, 5]

This will take exponentially longer as your list of sizes gets larger, but it may be enough if you only have very small inputs.
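
For larger inputs, a greedy heuristic may be a workable substitute. The sketch below uses first-fit decreasing, a well-known bin-packing heuristic; it is not the method from the answer above, and it does not guarantee the best fit, only a quick and usually reasonable one. The `first_fit_decreasing` name is an assumption for illustration:

```python
def first_fit_decreasing(sizes, bound):
    """Greedy packing: place each size, largest first, into the first subset with room."""
    subsets = []
    for size in sorted(sizes, reverse=True):
        if size > bound:
            raise ValueError(f"size {size} exceeds bound {bound}")
        for subset in subsets:
            if sum(subset) + size <= bound:
                subset.append(size)
                break
        else:
            subsets.append([size])  # no existing subset had room
    return subsets


print(first_fit_decreasing([1, 2, 3, 4, 5], 8))
```

For the small example this produces two subsets, the same count as the brute-force search, but on other inputs the greedy result can use more subsets than necessary.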


