How to Assign a Unique Id Number to Each Group of Identical Values in a Column

How to assign a unique ID number to each group of identical values in a column

How about

df2 <- transform(df,id=as.numeric(factor(sample)))

I think this (cribbed from Add ID column by group) should be slightly more efficient, although perhaps a little harder to remember:

df3 <- transform(df, id=match(sample, unique(sample)))
all.equal(df2,df3)  ## TRUE

If you want to do this in tidyverse:

library(dplyr)
df %>% group_by(sample) %>% mutate(id=cur_group_id())

assign a unique ID number for every repeated value in a column R

It can be done using rowid

library(data.table)
library(dplyr)
weighted_df %>% 
   mutate(ID = rowid(Name))

-output

#     Name        Room1       Room2        Room3 ID
#1    H001  0.579649851  0.84602529  0.620850211  1
#2    H001  0.579649851  0.84602529  0.620850211  2
#3    H001  0.579649851  0.84602529  0.620850211  3
#4    H001  0.579649851  0.84602529  0.620850211  4
#5    H001  0.579649851  0.84602529  0.620850211  5
#6    H001  0.579649851  0.84602529  0.620850211  6
#7    H001  0.579649851  0.84602529  0.620850211  7
#8    H001  0.579649851  0.84602529  0.620850211  8
#9    H001  0.579649851  0.84602529  0.620850211  9
#10   H001  0.579649851  0.84602529  0.620850211 10
#11   H001  0.579649851  0.84602529  0.620850211 11
#12   H001  0.579649851  0.84602529  0.620850211 12
#13   H001  0.579649851  0.84602529  0.620850211 13
#14   H001  0.579649851  0.84602529  0.620850211 14
#15   H001  0.579649851  0.84602529  0.620850211 15
#16   H001  0.579649851  0.84602529  0.620850211 16
#17   H001  0.579649851  0.84602529  0.620850211 17
#18   H002  1.457267473 -1.18612874  0.553957293  1
#19   H002  1.457267473 -1.18612874  0.553957293  2
# ...

Create a new column with unique identifier for each group

Try with groupby ngroup + 1, use sort=False to ensure groups are enumerated in the order they appear in the DataFrame:

df['idx'] = df.groupby(['ID', 'phase'], sort=False).ngroup() + 1

df:

   ID phase side  values  idx
0  r1   ph1    l      12    1
1  r1   ph1    r      34    1
2  r1   ph2    l      93    2
3  s4   ph3    l      21    3
4  s3   ph2    l      88    4
5  s3   ph2    r      54    4

How to assign a unique ID to each corresponding unique column value?

Use window functions.

scala> df.show(false)
+-------+--------+---------+---------+--------+-----------------+
|address|lastname|firstname|patientid|policyno|visitid          |
+-------+--------+---------+---------+--------+-----------------+
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_1|
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_2|
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_3|
|addr2  |Dobs    |Aliya    |patid2   |policy2 |visituid_aliya_1 |
|addr2  |Dobs    |Aliya    |patid2   |policy2 |visituid_aliya_2 |
|addr2  |Dobs    |OP       |patid3   |policy3 |visituid_OP_1    |
|addr4  |AN      |OTHER    |patid4   |policy4 |visituid_OP_1    |
|addr2  |ANO     |THER     |patid5   |policy5 |visituid_OP_1    |
+-------+--------+---------+---------+--------+-----------------+

val expr = Seq("lastname","firstname")
            .map(c => hash(col(c)).as(c))
            .reduce(concat(_,_).asc)

val winSpec = dense_rank()
                .over(Window.orderBy(expr))

Output

scala> df.withColumn("id",winSpec).show(false)

+-------+--------+---------+---------+--------+-----------------+---+
|address|lastname|firstname|patientid|policyno|visitid          |id |
+-------+--------+---------+---------+--------+-----------------+---+
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_1|1  |
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_2|1  |
|addr1  |Pits    |Rodney   |patid1   |policy1 |visituid_rodney_3|1  |
|addr4  |AN      |OTHER    |patid4   |policy4 |visituid_OP_1    |2  |
|addr2  |Dobs    |OP       |patid3   |policy3 |visituid_OP_1    |3  |
|addr2  |Dobs    |Aliya    |patid2   |policy2 |visituid_aliya_1 |4  |
|addr2  |Dobs    |Aliya    |patid2   |policy2 |visituid_aliya_2 |4  |
|addr2  |ANO     |THER     |patid5   |policy5 |visituid_OP_1    |5  |
+-------+--------+---------+---------+--------+-----------------+---+

PostgreSQL -Assign Unique ID to each groups of records, based on a condition

We can use LAG here along with SUM as an analytic function:

WITH cte AS (
    SELECT *, CASE WHEN TELEGRAMMZAEHLER =
                        LAG(TELEGRAMMZAEHLER) OVER (ORDER BY TMSTP DESC) - 1
                   THEN 0 ELSE 1 END AS idx
    FROM yourTable
)

SELECT ID, TMSTP, CAT_TYPE, TELEGRAMMZAEHLER,
       SUM(idx) OVER (ORDER BY TMSTP DESC) AS GRUPPE
FROM cte
ORDER BY TMSTP DESC;

Demo

Assign unique id each time two fields/columns are an exact match

Using DENSE_RANK() should do the trick:

SELECT *, 
   DENSE_RANK() OVER(ORDER by person_id, datetime) as unique_combination_id
FROM tbl;

DEMO

How to generate a unique ID based on duplicate values in either of two columns?

Further to my comment above, to reproduce the same ID order you need to ensure the proper factor level ordering

library(dplyr)
df %>% mutate(ID = as.integer(factor(Name, levels = unique(Name))))
#        E.mail Phone        Name ID
#1      mortena  3076      morten  1
#2      kaspera  2688      kasper  2
#3 christoffera  1212 christoffer  3
#4      mortenb  3076      morten  1
#5      mortena  3075      morten  1
#6      kasperb  2688      kasper  2
#7 christoffera  1213 christoffer  3

or with group_indices

df %>% mutate(ID = group_indices(., factor(Name, levels = unique(Name))))

If the order of IDs doesn't matter you can do

df %>% mutate(ID = as.integer(as.factor(Name)))

df %>% mutate(ID = group_indices(., Name))

In these cases, the order of ID is determined by the alphabetical order of unique Names (rather than by their occurrence).

Sample data

df <- read.table(text = "E-mail         Phone      Name
mortena        3076       morten
kaspera        2688       kasper
christoffera   1212       christoffer
mortenb        3076       morten
mortena        3075       morten
kasperb        2688       kasper
christoffera   1213       christoffer", header = T)

Assign Unique Group Id To Sets of Rows with Same Column Value Separated by Other value

Try this:

with t as
(
    select
        uid,
        radius,
        lag(radius,1) over (order by uid) as prev_rad
    from
        radtable
)

select
    uid,
    radius,
    sum
    (
        case when radius = coalesce(prev_rad,radius) then 0 else 1 end
    )  
    over 
    (
        order by uid
    ) + 1 as GroupID
from 
    t

How to Assign a Unique Id Number to Each Group of Identical Values in a Column