Getting Unique Rows of a Table and Their Numbers

How to select unique records by SQL

Use the DISTINCT keyword with one or more column names to return only the distinct records:

SELECT DISTINCT column1, column2, ...
FROM table_name;
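
As a minimal sketch of what DISTINCT does, the snippet below runs the query from Python against an in-memory SQLite database. The table name `people`, its columns, and its rows are made up for illustration and are not part of the original answer.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate DISTINCT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Paris", "FR"), ("Paris", "FR"), ("Lyon", "FR"), ("Paris", "US")],
)

# One column: each distinct city appears once.
cities = conn.execute(
    "SELECT DISTINCT city FROM people ORDER BY city"
).fetchall()

# Several columns: DISTINCT applies to the combination of values.
pairs = conn.execute(
    "SELECT DISTINCT city, country FROM people ORDER BY city, country"
).fetchall()

print(cities)
print(pairs)
```

With more than one column, a row is dropped only when every listed column matches an earlier row, so ("Paris", "FR") and ("Paris", "US") both survive.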

How to get a list of all unique elements by row in a table?

We can use apply to loop over the rows and get the unique elements of each one

apply(df1[-1], 1, unique)

If there are leading or trailing spaces, use trimws to remove them

apply(df1[-1], 1, function(x) unique(trimws(x)))
#[[1]]
#[1] "F"  "M2" "E"  "H"  "M"  "21" "L" 

#[[2]]
#[1] "A3" "V"  "R"  "W"  "12" "N"  "21" ""  

#[[3]]
# [1] "I"  "M"  "A1" "H"  "D2" "L"  "M2" "G"  "R"  "K"  "E"  ""  

#[[4]]
# [1] "H"  "A1" "M"  "W"  "N"  "21" "Q"  "L"  "F"  "D"  ""

If you need the frequency of each element, use table

apply(df1[-1], 1, table)

data

df1 <- read.csv("Data_exmpl.txt", fill = TRUE, header = FALSE)
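
For readers who want the same idea outside R, here is a rough Python sketch of the per-row approach: trim each value, keep the distinct values of each row, and count frequencies. The `rows` data and the helper name `unique_in_order` are hypothetical stand-ins, not the contents of Data_exmpl.txt.

```python
from collections import Counter

# Hypothetical rows standing in for df1[-1] (first column already dropped).
rows = [
    ["F", " M2", "E", "F ", "M2"],
    ["A3", "V", "", "V", ""],
]


def unique_in_order(values):
    """Return the distinct values in first-seen order, like R's unique()."""
    return list(dict.fromkeys(values))


# Strip surrounding whitespace first, as trimws() does, then deduplicate.
unique_per_row = [unique_in_order(v.strip() for v in row) for row in rows]

# Frequency of each value per row, roughly comparable to table().
counts_per_row = [Counter(v.strip() for v in row) for row in rows]

print(unique_per_row)
print(counts_per_row)
```

Note that empty strings are kept as values here, just as "" shows up in the R output above.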

How to select distinct rows in a datatable and store into an array

DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "Column1", "Column2" ...);

Get unique row with distinct multiple column in oracle table

Thanks to @marmite-bomber, the following query worked for me:

SELECT comp_key, loc_id, org_id, max_id, 1, sysdate, sysdate
FROM (
  SELECT comp_key, max(id) max_id
  FROM (
    WITH t1 AS (
      SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION 
      SELECT t.id, t.loc_id, t.org_id, nvl(comp2,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
      SELECT t.id, t.loc_id, t.org_id, nvl(comp3,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
      SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL
    )
    SELECT t1.id, t1.loc_id loc_id, t1.org_id org_id, listagg(comp,',') within group (ORDER BY comp) AS comp_key
    FROM t1
    GROUP BY t1.id, t1.loc_id, t1.org_id
  )
  GROUP BY comp_key, loc_id, org_id
) t2, tags tg
WHERE tg.id = t2.max_id;
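
The logic of the query can be hard to see through the nesting, so here is a small Python sketch of what it appears to compute: build a sorted, comma-joined key from the distinct comp values of each row (NULL treated as 0), then keep the highest id per (comp_key, loc_id, org_id). The sample rows are invented, and the paper_id filter is left out for brevity.

```python
# Hypothetical rows from a table shaped like `tags`; None stands in for NULL.
tags = [
    {"id": 1, "loc_id": 10, "org_id": 100, "comp1": 5, "comp2": 7, "comp3": None},
    {"id": 2, "loc_id": 10, "org_id": 100, "comp1": 7, "comp2": None, "comp3": 5},
    {"id": 3, "loc_id": 10, "org_id": 100, "comp1": 5, "comp2": 5, "comp3": 8},
    {"id": 4, "loc_id": 20, "org_id": 100, "comp1": 5, "comp2": 7, "comp3": None},
]


def comp_key(row):
    """Distinct comp values (NULL -> 0), sorted and comma-joined, like the LISTAGG step."""
    values = {
        row[c] if row[c] is not None else 0
        for c in ("comp1", "comp2", "comp3")
    }
    return ",".join(str(v) for v in sorted(values))


# Keep the highest id for each (comp_key, loc_id, org_id) combination.
max_id = {}
for row in tags:
    key = (comp_key(row), row["loc_id"], row["org_id"])
    max_id[key] = max(row["id"], max_id.get(key, row["id"]))

print(max_id)
```

Rows 1 and 2 hold the same components in a different column order, so they share a key and only id 2 is kept for that location.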

Extracting unique rows from a data table in R

Before data.table v1.9.8, the default behavior of the unique.data.table method was to use the key columns to determine which unique combinations to return. If the key was NULL (the default), you would get the original data set back (as in the OP's situation).

As of data.table v1.9.8, the unique.data.table method uses all columns by default, which is consistent with unique.data.frame in base R. To have it use the key columns, explicitly pass by = key(DT) to unique (replacing DT in the call to key with the name of the data.table).

Hence, the old behavior would be something like

library(data.table) # v1.9.7 or earlier
set.seed(123)
a <- as.data.frame(matrix(sample(2, 120, replace = TRUE), ncol = 3))
b <- data.table(a, key = names(a))
## key(b)
## [1] "V1" "V2" "V3"
dim(unique(b)) 
## [1] 8 3

While for data.table v1.9.8+, just

b <- data.table(a) 
dim(unique(b)) 
## [1] 8 3
## or dim(unique(b, by = key(b))) # if you have keys and want to use them

Or without a copy

setDT(a)
dim(unique(a))
## [1] 8 3
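
The difference between "all columns" and "key columns only" can be illustrated with a small Python sketch. The function `unique_rows` and the sample data are hypothetical; this is not how data.table is implemented, only a model of the two behaviors described above.

```python
def unique_rows(rows, by=None):
    """Return rows whose values in the `by` columns have not been seen before.

    With by=None every column is compared, mirroring the 1.9.8+ default.
    """
    seen = set()
    result = []
    for row in rows:
        columns = by if by is not None else list(row)
        signature = tuple(row[c] for c in columns)
        if signature not in seen:
            seen.add(signature)
            result.append(row)
    return result


# Hypothetical data: three columns, with V1 and V2 acting as the "key".
data = [
    {"V1": 1, "V2": 1, "V3": 1},
    {"V1": 1, "V2": 1, "V3": 2},
    {"V1": 1, "V2": 1, "V3": 1},
    {"V1": 2, "V2": 1, "V3": 1},
]

all_columns = unique_rows(data)                   # compares V1, V2, V3
key_columns = unique_rows(data, by=["V1", "V2"])  # compares only the key

print(len(all_columns), len(key_columns))
```

Comparing on fewer columns can only return the same number of rows or fewer, and the first row of each group is the one that is kept.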

R data.table get unique rows dropping some columns as well

How about this:

R> unique(tbl, by=c("reader_id", "book_id"))[,-4]
#    reader_id book_id date
# 1:        10       1   d1
# 2:        20       2   d2
# 3:        30       4   d4
# 4:        50       5   d5

Or if you prefer to drop by name,

unique(tbl,by=c("reader_id", "book_id"))[,!"inf"]

KDB: how to get distinct values of rows in a table?

You could try this:

q)tbl:([] columnA:("AZ;B;C";"AT;B;C";"A;B;D";"E;F";"C;D";"A;D";enlist"A"))
q)tbl
columnA
--------
"AZ;B;C"
"AT;B;C"
"A;B;D"
"E;F"
"C;D"
"A;D"
,"A"
q)";"sv asc distinct exec";"vs";"sv columnA from tbl
"A;AT;AZ;B;C;D;E;F"

If the last row of your table is an atom, then you could try this:

q)tbl:([] columnA:("AZ;B;C";"AT;B;C";"A;B;D";"E;F";"C;D";"A;D";"A"))
q)tbl
columnA
--------
"AZ;B;C"
"AT;B;C"
"A;B;D"
"E;F"
"C;D"
"A;D"
"A"
q)exec ";"sv asc distinct ";"vs -1_raze{x,";"}each columnA from tbl
"A;AT;AZ;B;C;D;E;F"

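For comparison, the same split, deduplicate, sort, and rejoin pipeline can be sketched in Python. The input strings are the ones from the q session above; the variable names are illustrative only.

```python
# The same strings as in the q example.
column_a = ["AZ;B;C", "AT;B;C", "A;B;D", "E;F", "C;D", "A;D", "A"]

# Split every row on ";", pool the tokens, deduplicate, sort, and rejoin.
tokens = {token for row in column_a for token in row.split(";")}
result = ";".join(sorted(tokens))

print(result)  # A;AT;AZ;B;C;D;E;F
```
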
select unique rows based on single distinct column

A quick one in T-SQL:

SELECT a.*
FROM emails a
INNER JOIN 
  (SELECT email,
    MIN(id) as id
  FROM emails 
  GROUP BY email 
) AS b
  ON a.email = b.email 
  AND a.id = b.id;
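
To see the effect of this join, the query can be tried from Python against an in-memory SQLite database, which accepts the same syntax for this statement. The `name` column and the sample rows are assumptions added for the demonstration; an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Hypothetical emails table with duplicate addresses.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
db.executemany(
    "INSERT INTO emails VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "Ann"),
        (2, "b@example.com", "Bob"),
        (3, "a@example.com", "Ann B."),
        (4, "b@example.com", "Bobby"),
    ],
)

# Keep, for each email, the full row with the smallest id.
first_rows = db.execute(
    """
    SELECT a.*
    FROM emails a
    INNER JOIN (
        SELECT email, MIN(id) AS id
        FROM emails
        GROUP BY email
    ) AS b
      ON a.email = b.email
     AND a.id = b.id
    ORDER BY a.id
    """
).fetchall()

print(first_rows)
```

The subquery picks one id per email, and the join back to the table recovers the remaining columns of that row, which a plain GROUP BY email could not return.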