How to select unique records by SQL
Use the DISTINCT keyword with one or more column names to get distinct records:
SELECT DISTINCT column1, column2, ...
FROM table_name;
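As a minimal sketch of the statement above, the following Python snippet runs SELECT DISTINCT against an in-memory SQLite database. The table name people and its columns are made up for illustration and are not from the original answer; other database engines should treat this query the same way.

```python
import sqlite3

# In-memory database with a hypothetical table; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "Oslo"), ("Ann", "Oslo"), ("Ann", "Rome"), ("Bob", "Oslo")],
)

# One column: each distinct first_name appears once.
single = conn.execute(
    "SELECT DISTINCT first_name FROM people ORDER BY first_name"
).fetchall()

# Two columns: each distinct (first_name, city) combination appears once.
pairs = conn.execute(
    "SELECT DISTINCT first_name, city FROM people ORDER BY first_name, city"
).fetchall()

print(single)
print(pairs)
conn.close()
```

Note that with several columns, DISTINCT applies to the combination of values, not to each column separately.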
How to get a list of all unique elements by row in a table?
We can use apply to loop over the rows and get the unique elements of each one:
apply(df1[-1], 1, unique)
If there are leading or trailing spaces, use trimws to remove them:
apply(df1[-1], 1, function(x) unique(trimws(x)))
#[[1]]
#[1] "F" "M2" "E" "H" "M" "21" "L"
#[[2]]
#[1] "A3" "V" "R" "W" "12" "N" "21" ""
#[[3]]
# [1] "I" "M" "A1" "H" "D2" "L" "M2" "G" "R" "K" "E" ""
#[[4]]
# [1] "H" "A1" "M" "W" "N" "21" "Q" "L" "F" "D" ""
If you need the frequency of each element, use table:
apply(df1[-1], 1, table)
data
df1 <- read.csv("Data_exmpl.txt", fill = TRUE, header = FALSE)
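For readers who want to see the row-wise logic spelled out, here is a rough Python sketch of the same idea: trim each cell, keep the first occurrence of every value in a row, and optionally count occurrences. The sample rows are invented stand-ins for df1[-1], not the original data file.

```python
from collections import Counter

# Hypothetical rows standing in for df1[-1] (first column already dropped).
rows = [
    ["F", " M2", "E", "F ", "M2"],
    ["A3", "V", "A3", "", ""],
]

def unique_in_row(row):
    """Strip surrounding whitespace, then keep first occurrences in order."""
    return list(dict.fromkeys(cell.strip() for cell in row))

def counts_in_row(row):
    """Count how often each stripped value occurs in the row."""
    return Counter(cell.strip() for cell in row)

unique_rows = [unique_in_row(r) for r in rows]
row_counts = [counts_in_row(r) for r in rows]

print(unique_rows)
print(row_counts)
```

As in the R output above, empty strings produced by short rows are kept as a value; filter them out first if they are not wanted.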
How to select distinct rows in a datatable and store into an array
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "Column1", "Column2" ...);
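The call above keeps only the named columns and drops repeated combinations. A small Python sketch of that behavior, storing the result in a list, might look like this; the table contents and the helper name distinct_values are hypothetical and only illustrate the idea, they are not part of the .NET API.

```python
# Hypothetical table: each row maps a column name to a value.
table = [
    {"Column1": "a", "Column2": 1, "Column3": "x"},
    {"Column1": "a", "Column2": 1, "Column3": "y"},
    {"Column1": "b", "Column2": 2, "Column3": "z"},
]

def distinct_values(rows, *columns):
    """Project each row onto `columns` and drop repeated combinations."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result

distinct = distinct_values(table, "Column1", "Column2")
print(distinct)
```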
Get unique row with distinct multiple column in oracle table
Thanks to @marmite-bomber, the following query worked for me:
SELECT comp_key, loc_id, org_id, max_id, 1, sysdate, sysdate
FROM (
    SELECT comp_key, max(id) max_id
    FROM (
        WITH t1 AS (
            SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
            SELECT t.id, t.loc_id, t.org_id, nvl(comp2,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
            SELECT t.id, t.loc_id, t.org_id, nvl(comp3,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
            SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL
        )
        SELECT t1.id, t1.loc_id loc_id, t1.org_id org_id, listagg(comp,',') within group (ORDER BY comp) AS comp_key
        FROM t1
        GROUP BY t1.id, t1.loc_id, t1.org_id
    )
    GROUP BY comp_key, loc_id, org_id
) t2, tags tg
WHERE tg.id = t2.max_id;
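The query is dense, so here is a rough Python sketch of what it appears to compute: for each row, build an order-independent key from the comp columns (NULL treated as 0, duplicates removed, values sorted), then keep the largest id per (comp_key, loc_id, org_id) group. The sample rows are invented, and the paper_id filter is left out for brevity.

```python
# Hypothetical rows of the tags table: (id, loc_id, org_id, comp1, comp2, comp3).
tags = [
    (1, 10, 100, 5, 7, None),
    (2, 10, 100, 7, 5, None),
    (3, 10, 100, 5, 5, 9),
    (4, 20, 100, 5, 7, None),
]

def comp_key(comps):
    """Mimic NVL(comp, 0), UNION de-duplication, and sorted LISTAGG."""
    values = {0 if c is None else c for c in comps}
    return ",".join(str(v) for v in sorted(values))

# For each (comp_key, loc_id, org_id) group, keep the largest id.
max_id = {}
for row_id, loc_id, org_id, *comps in tags:
    group = (comp_key(comps), loc_id, org_id)
    max_id[group] = max(row_id, max_id.get(group, row_id))

for group, row_id in sorted(max_id.items(), key=lambda item: item[1]):
    print(group, row_id)
```

Rows 1 and 2 hold the same values in a different column order, so they share a key and only the higher id survives.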
Extracting unique rows from a data table in R
Before data.table v1.9.8, the default behavior of the unique.data.table method was to use the key to determine the columns by which the unique combinations should be returned. If the key was NULL (the default), one would get the original data set back (as in the OP's situation).
As of data.table v1.9.8, the unique.data.table method uses all columns by default, which is consistent with unique.data.frame in base R. To have it use the key columns, explicitly pass by = key(DT) to unique (replacing DT in the call to key with the name of the data.table).
Hence, the old behavior would be something like
library(data.table) # v1.9.7-
set.seed(123)
a <- as.data.frame(matrix(sample(2, 120, replace = TRUE), ncol = 3))
b <- data.table(a, key = names(a))
## key(b)
## [1] "V1" "V2" "V3"
dim(unique(b))
## [1] 8 3
While for data.table v1.9.8+, just
b <- data.table(a)
dim(unique(b))
## [1] 8 3
## or dim(unique(b, by = key(b))) # if the table has a key and you want to use it
Or, without making a copy:
setDT(a)
dim(unique(a))
## [1] 8 3
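To make the difference between the two modes concrete, the following Python sketch compares whole-row uniqueness with uniqueness on a subset of columns, keeping the first row of each group. The function unique_rows and its by argument are illustrative names chosen to mirror the R call; they are not part of data.table.

```python
def unique_rows(rows, by=None):
    """Keep the first row for each distinct combination of the `by` columns.

    by=None compares whole rows, like the 1.9.8+ default described above.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row) if by is None else tuple(row[i] for i in by)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical three-column table.
rows = [(1, 1, "x"), (1, 1, "y"), (1, 2, "x"), (1, 1, "x")]

all_cols = unique_rows(rows)              # compare every column
first_two = unique_rows(rows, by=(0, 1))  # compare columns 0 and 1 only

print(all_cols)
print(first_two)
```

With by restricted to the first two columns, the row (1, 1, "y") is dropped because an earlier row already has the same values in those columns.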
R data.table get unique rows dropping some columns as well
How about this:
R> unique(tbl, by=c("reader_id", "book_id"))[,-4]
# reader_id book_id date
# 1: 10 1 d1
# 2: 20 2 d2
# 3: 30 4 d4
# 4: 50 5 d5
Or if you prefer to drop by name,
unique(tbl,by=c("reader_id", "book_id"))[,!"inf"]
KDB: how to get distinct values of rows in a table?
You could try this:
q)tbl:([] columnA:("AZ;B;C";"AT;B;C";"A;B;D";"E;F";"C;D";"A;D";enlist"A"))
q)tbl
columnA
--------
"AZ;B;C"
"AT;B;C"
"A;B;D"
"E;F"
"C;D"
"A;D"
,"A"
q)";"sv asc distinct exec";"vs";"sv columnA from tbl
"A;AT;AZ;B;C;D;E;F"
If the last row of your table is an atom, then you could try this:
q)tbl:([] columnA:("AZ;B;C";"AT;B;C";"A;B;D";"E;F";"C;D";"A;D";"A"))
q)tbl
columnA
--------
"AZ;B;C"
"AT;B;C"
"A;B;D"
"E;F"
"C;D"
"A;D"
"A"
q)exec ";"sv asc distinct ";"vs -1_raze{x,";"}each columnA from tbl
"A;AT;AZ;B;C;D;E;F"
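For comparison, the same split, de-duplicate, sort, and join steps can be sketched in Python using the sample column from above. The variable names are arbitrary.

```python
# Same sample values as columnA in the q example.
column_a = ["AZ;B;C", "AT;B;C", "A;B;D", "E;F", "C;D", "A;D", "A"]

# Split every cell on ";", collect the distinct tokens, sort, and re-join.
tokens = {token for cell in column_a for token in cell.split(";")}
result = ";".join(sorted(tokens))

print(result)  # A;AT;AZ;B;C;D;E;F
```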
select unique rows based on single distinct column
Quick one in TSQL
SELECT a.*
FROM emails a
INNER JOIN
    (SELECT email,
            MIN(id) AS id
     FROM emails
     GROUP BY email
    ) AS b
    ON a.email = b.email
    AND a.id = b.id;
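The query keeps, for each email, the row with the smallest id. A quick way to try it is with an in-memory SQLite database from Python, as sketched below. The name column and the sample rows are assumptions added for illustration; the answer itself targets T-SQL, but this particular query uses only standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only id and email appear in the original query.
conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO emails VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "Ann"),
        (2, "b@example.com", "Bob"),
        (3, "a@example.com", "Ann B."),
        (4, "b@example.com", "Robert"),
    ],
)

# Join each row to the smallest id recorded for its email address.
rows = conn.execute(
    """
    SELECT a.*
    FROM emails a
    INNER JOIN (SELECT email, MIN(id) AS id FROM emails GROUP BY email) AS b
        ON a.email = b.email AND a.id = b.id
    ORDER BY a.id
    """
).fetchall()

print(rows)
conn.close()
```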