Sum for unique combinations of variables in a data table
Use pmin
and pmax
..
require(data.table) # v1.9.6
dt = fread("Country1 Country2 Value Category
A A 4 1
A B 2 1
A C 9 1
B A 3 2
B D 4 1
C A 2 2
D C 7 2")
dt[, .(total = sum(Value)),
by=.(Country1 = pmin(Country1, Country2),
Country2 = pmax(Country1, Country2))]
# Country1 Country2 total
# 1: A A 4
# 2: A B 5
# 3: A C 11
# 4: B D 4
# 5: C D 7
If you want this within Category
, just add it as well to by
.
Generate All ID Pairs, by group with data.table in R
You need to convert the result from t(combn())
which is a matrix to a data.table
or data.frame
, so this should work:
library(data.table)
set.seed(10)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
dt <- dat[, as.data.table(t(combn(ids, 2))), .(groups)]
head(dt)
groups V1 V2
1: C 1 3
2: C 1 5
3: C 1 7
4: C 1 10
5: C 1 13
6: C 1 14
Sql Query : Sum , all the possible combination of rows in a table
This is a "all possible permutations" problem can be solved with a self join and a recursive CTE.
With selfrec as (
Select t.Value, t.ID, 0 as Level From TempTable t
UNION ALL
Select t2.Value + t1.Value as Value, t1.ID, Level + 1 From TempTable t1
Inner Join selfrec t2 on t1.ID < t2.ID
Where Level < 4 -- limit the number of recursions
)
Select * From selfrec
Order By Level
This criteria eliminates items matching themselves, as well as having duplicate reversals of permutation combos:t1.ID < t2.ID
Limits the depth of recursion since we have no other breaking criteria and will surpass the number of allowed recursions:Where Level < 4
Same j as by in datatable leads to strange outcome
You shouldn't sum
and group
with the same column b
, otherwise you will always get a new column V1
, whose value is same with the group column b
, because each value in group column b
is unique, which means you will get it self if you sum
by an unique value in b
(V1 = b
) . For your purpose,
you can group with a new variable rleid(b)
, which marks different groups.
dt = data.table(a=c(1,2,2,1,3,3,3,4,4,4),b=c(1:9,9)))
dt[a == 4, .(a = unique(a), b = unique(b), sumb = sum(b)), by = rleid(b)]
# rleid a b sumb
# <int> <num> <num> <num>
#1: 1 4 8 8
#2: 2 4 9 18
P.S. You can also have a better understanding if you create a copy of column b
with dt[,c:=b]
and then try dt[a == 4,sum(b),c]
dt[,c := b][a == 4,sum(b),c]
# c V1
# <num> <num>
#1: 8 8
#2: 9 18
Oracle PL/sql : Calculate sum of all possible number combinations
There might be easier ways, but I would use a recursive CTE. Here's a plain SQL example, with extra sample rows:
create table boxes (id varchar2(4), qty number);
insert into boxes values ('B101',5);
insert into boxes values ('B102',5);
insert into boxes values ('B103',4);
insert into boxes values ('B104',9);
insert into boxes values ('B105',11);
insert into boxes values ('B106',2);
insert into boxes values ('B107',1);
with c (r, id, qty, lvl) as (
-- anchor query
select id as r, id, qty, 1 as lvl
from boxes
where qty + 2 < 15
union all
-- recursive query
select c.r || ',' || b.id, b.id, b.qty+c.qty, c.lvl+1
from boxes b
join c on c.id < b.id
where b.qty + c.qty + 2 < 15
)
select r, lvl, qty
from c
order by qty desc, lvl asc
;
This will show all the combinations, with the best fit at the top. I added a secondary sort on the level, assuming that in the case of a tie, you'd prefer the fewest number of boxes per container. But you might prefer maximum boxes per container instead.
I also used join on c.id < b.id
instead of a cross join because I think you don't really want ALL combinations, you want all UNIQUE combinations, so it's more of a tree search.
And an example as a PL/SQL function:
create or replace function fit_boxes(existing_qty in number, capacity in number)
return varchar2
is
box_list varchar2(4000);
begin
with c (r, id, qty, lvl) as (
select id as r, id, qty, 1 as lvl
from boxes
where qty + existing_qty < capacity
union all
select c.r || ',' || b.id, b.id, b.qty+c.qty, c.lvl+1
from boxes b
join c on c.id < b.id
where b.qty + c.qty + existing_qty < capacity
)
select r into box_list
from c
order by qty desc, lvl asc
fetch first 1 row only
;
return box_list;
exception when NO_DATA_FOUND then
return 'No boxes fit'
end;
/
select fit_boxes(2,15) from dual;
Finally, I would have guessed that you wanted sum <= capacity, but your question definitely says "sum < capacity", so that's how I wrote it. Just test it out with your data and make sure it works as expected.
Edit: Sure, to explain the query logic - for a recursive CTE, you start with an anchor query and union it to a recursive query (which keeps iteratively selecting from the CTE itself).
For the anchor, we start by selecting all the single boxes which can fit in the container (where qty + 2 < 15). In our example, they all fit, so we have 7 rows with a lvl of 1.
For the recursive part, we already have 1 or more boxes in the container c
, and we want to see which remaining boxes from b
will fit. So we join them, using c.id < b.id
to make sure that we're only looking at the boxes in b
which aren't already in c
. And once we've looked at all the boxes, the join will return 0 rows and the recursion will stop.
For the 4 columns in the CTE - r
displays all the boxes that we've added to the container so far, id
shows the id of the most recently added box (important so we can keep track of which boxes we've considered), qty
sums up the size of all the boxes currently in the container, and lvl
shows how many boxes are in the container.
Related Topics
Error with H2O in R - Can't Connect to Local Host
Ggplot2: Geom_Smooth Confidence Band Does Not Extend to Edge of Graph, Even with Fullrange=True
How to Format the X-Axis of the Hard Coded Plotting Function of Spei Package in R
How to Store Filter Expressions as Strings
Rvest Not Recognizing CSS Selector
Parallel R on a Windows Cluster
Error: Could Not Find Build Tools Necessary to Build Dplyr
Lm and Predict - Agreement of Data.Frame Names
Filter Group of Rows Based on Sum of Values from Different Column
Conditionally Remove Leading or Trailing '.' Character in R
Why Does Withcallinghandlers Still Stops Execution
Sum Columns Row-Wise with Similar Names
Map Array of Strings to an Array of Integers
Changing Line Color in Ggplot Based on Slope
Why Isn't the R Function Sink() Writing a Summary Output to My Results File
Choose Specific Number with Probability
How to Automate Nested Sections in Rmds Which Include Text, Maps and Tables