Delete 'Top-N' Rows from a Table with Some Sorting (Order by 'Column')

In SQL Server, you can use a CTE to do a faster ordered delete without the need for a separate subquery to retrieve the top 3 ids.

WITH T
AS (SELECT TOP 3 *
FROM Table1
ORDER BY id DESC)
DELETE FROM T

How do I delete a fixed number of rows with sorting in PostgreSQL?

You could try using the ctid:

DELETE FROM logtable
WHERE ctid IN (
SELECT ctid
FROM logtable
ORDER BY timestamp
LIMIT 10
)

The ctid is:

The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier.

There's also oid, but that only exists if you specifically ask for it when you create the table, and PostgreSQL 12 removed OID columns from user tables altogether.
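
The same "select the row locators in order, then delete them" pattern can be tried without a PostgreSQL server. The sketch below is only an analogue, not the PostgreSQL query itself: it uses Python's built-in sqlite3 module and SQLite's implicit rowid, which plays a loosely similar role to ctid here. The helper name delete_oldest and the column name ts are illustrative assumptions.

```python
import sqlite3


def delete_oldest(conn, n):
    """Delete the n rows with the smallest ts and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM logtable "
        "WHERE rowid IN ("
        "  SELECT rowid FROM logtable ORDER BY ts LIMIT ?"
        ")",
        (n,),
    )
    conn.commit()
    return cur.rowcount


# In-memory table with 20 made-up log entries (ts = 1..20).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logtable (ts INTEGER, msg TEXT)")
conn.executemany(
    "INSERT INTO logtable (ts, msg) VALUES (?, ?)",
    [(t, f"event {t}") for t in range(1, 21)],
)

removed = delete_oldest(conn, 10)
remaining = [row[0] for row in conn.execute("SELECT ts FROM logtable ORDER BY ts")]
print(removed, remaining)
```

The subquery does the sorting and limiting; the outer DELETE only matches on the row locator, so no primary key column is needed.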

Order by date column ASC and delete top row for particular name

Quotes around _rowid_ make it a string literal. In other words, the query would try to delete records where the string '_rowid_' equals the value 1, which is never true, so nothing is deleted.

Omit the quotes to let it be recognized as a column name:

myid = 1
cursor.execute("DELETE FROM "+group+" WHERE _rowid_ = ?;", (myid, ))

Also, generally speaking, you should not use string interpolation or string formatting to build SQL queries. In this case, however, the variable part is a table name, which a database driver cannot insert into the query through query arguments, so make sure the name comes from a trusted source.
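
One way to reduce the risk when a table name has to be concatenated is to check it against the tables that actually exist before building the statement. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the helper name delete_by_rowid and the table mygroup are hypothetical and not part of the original question.

```python
import sqlite3


def delete_by_rowid(conn, table, row_id):
    """Delete one row by _rowid_ after validating the table name."""
    # Table names cannot be bound as parameters, so only accept names
    # that are present in the schema catalogue.
    known = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")
    # Quote the identifier; the row id itself is still passed as a parameter.
    quoted = '"' + table.replace('"', '""') + '"'
    cur = conn.execute(f"DELETE FROM {quoted} WHERE _rowid_ = ?", (row_id,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mygroup (name TEXT)")
conn.executemany("INSERT INTO mygroup (name) VALUES (?)", [("a",), ("b",)])

deleted = delete_by_rowid(conn, "mygroup", 1)
left = conn.execute("SELECT name FROM mygroup").fetchall()
print(deleted, left)
```

A name that is not an existing table raises ValueError before any SQL is built from it.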

R dataframe - Top n values in row with column names

You could pivot to long, group by the corresponding original row, use slice_max to get the top values, then pivot back to wide and bind that output to the original table.

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

iris %>%
group_by(rn = row_number()) %>%
pivot_longer(-c(Species, rn), 'col', values_to = 'high') %>%
slice_max(high, n = 2) %>%
mutate(nm = row_number()) %>%
pivot_wider(values_from = c(high, col),
names_from = nm) %>%
ungroup() %>%
select(-c(Species, rn)) %>%
bind_cols(iris)
#> # A tibble: 150 × 9
#> high_1 high_2 col_1 col_2 Sepal.Length Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 Sepal.… Sepa… 5.1 3.5 1.4 0.2
#> 2 4.9 3 Sepal.… Sepa… 4.9 3 1.4 0.2
#> 3 4.7 3.2 Sepal.… Sepa… 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 Sepal.… Sepa… 4.6 3.1 1.5 0.2
#> 5 5 3.6 Sepal.… Sepa… 5 3.6 1.4 0.2
#> 6 5.4 3.9 Sepal.… Sepa… 5.4 3.9 1.7 0.4
#> 7 4.6 3.4 Sepal.… Sepa… 4.6 3.4 1.4 0.3
#> 8 5 3.4 Sepal.… Sepa… 5 3.4 1.5 0.2
#> 9 4.4 2.9 Sepal.… Sepa… 4.4 2.9 1.4 0.2
#> 10 4.9 3.1 Sepal.… Sepa… 4.9 3.1 1.5 0.1
#> # … with 140 more rows, and 1 more variable: Species <fct>

Created on 2022-02-16 by the reprex package (v2.0.1)

Edited to remove the unnecessary rename and mutate, thanks to tip from @Onyambu!
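
For readers working in Python, the same idea (top n values per row together with the names of the columns they came from) can be sketched with pandas and NumPy. This is an analogue of the R pipeline above, not a translation of it: it sorts column positions per row with argsort instead of pivoting. The two-row frame is made-up sample data with iris-style column names.

```python
import numpy as np
import pandas as pd

# Small illustrative frame standing in for the numeric columns of iris.
df = pd.DataFrame(
    {
        "Sepal.Length": [5.1, 7.0],
        "Sepal.Width": [3.5, 3.2],
        "Petal.Length": [1.4, 4.7],
        "Petal.Width": [0.2, 1.4],
    }
)

n = 2
values = df.to_numpy()
names = df.columns.to_numpy()
rows = np.arange(len(df))

# Column positions of the n largest values in each row, largest first.
top_idx = np.argsort(-values, axis=1, kind="stable")[:, :n]

top = pd.DataFrame(
    {
        **{f"high_{k + 1}": values[rows, top_idx[:, k]] for k in range(n)},
        **{f"col_{k + 1}": names[top_idx[:, k]] for k in range(n)},
    },
    index=df.index,
)

# Bind the new columns to the original table, as the R version does.
result = pd.concat([top, df], axis=1)
print(result)
```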

How to delete by row number in SQL

This is not really an answer. There were a few issues with the data that made the answers above (while excellent) unrelated. I simply deleted the table and then re-imported it from fixed width. This time, I was more careful and did not have the duplication.

Is there a way to SELECT the TOP N rows from a table and delete them afterwards?

Top N option is not supported in subquery

It's just not allowed in a subquery, but you can wrap it in a Derived Table:

DELETE FROM table 
WHERE idx IN
( SELECT *
FROM
( SELECT TOP 100 idx FROM table ORDER BY idx
) AS dt
)

A subquery might be Correlated, but not a Derived Table :-)

But, why do you actually need this?
Hopefully not in a loop to get smaller transactions.

delete the last row in a table using sql query?

If id is auto-increment, you can use the following (MySQL syntax):

delete from marks
order by id desc limit 1

How to delete the top 1000 rows from a table using Sql Server 2008?

The code you tried is in fact two statements: a DELETE followed by a SELECT.

You also don't specify what TOP should be ordered by.

For a specific ordering criterion, deleting from a CTE or a similar table expression is the most efficient way.

;WITH CTE AS
(
SELECT TOP 1000 *
FROM [mytab]
ORDER BY a1
)
DELETE FROM CTE

Sorting columns and selecting top n rows in each group pandas dataframe

There are 2 solutions:

1. sort_values and aggregate head:

df1 = df.sort_values('score',ascending = False).groupby('pidx').head(2)
print (df1)

mainid pidx pidy score
8 2 x w 12
4 1 a e 8
2 1 c a 7
10 2 y x 6
1 1 a c 5
7 2 z y 5
6 2 y z 3
3 1 c b 2
5 2 x y 1

2. set_index and aggregate nlargest:

df = df.set_index(['mainid','pidy']).groupby('pidx')['score'].nlargest(2).reset_index() 
print (df)
pidx mainid pidy score
0 a 1 e 8
1 a 1 c 5
2 c 1 a 7
3 c 1 b 2
4 x 2 w 12
5 x 2 y 1
6 y 2 x 6
7 y 2 z 3
8 z 2 y 5

Timings:

import numpy as np
import pandas as pd

np.random.seed(123)
N = 1000000

L1 = list('abcdefghijklmnopqrstu')
L2 = list('efghijklmnopqrstuvwxyz')
df = pd.DataFrame({'mainid':np.random.randint(1000, size=N),
                   'pidx': np.random.randint(10000, size=N),
                   'pidy': np.random.choice(L2, N),
                   'score':np.random.randint(1000, size=N)})
#print (df)

# Loop-based approach, timed below for comparison.
def epat(df):
    grouped = df.groupby('pidx')
    new_df = pd.DataFrame([], columns = df.columns)
    for key, values in grouped:
        # axis is passed by keyword; newer pandas rejects it as a positional argument
        new_df = pd.concat([new_df, grouped.get_group(key).sort_values('score', ascending=True)[:2]], axis=0)
    return (new_df)

print (epat(df))

In [133]: %timeit (df.sort_values('score',ascending = False).groupby('pidx').head(2))
1 loop, best of 3: 309 ms per loop

In [134]: %timeit (df.set_index(['mainid','pidy']).groupby('pidx')['score'].nlargest(2).reset_index())
1 loop, best of 3: 7.11 s per loop

In [147]: %timeit (epat(df))
1 loop, best of 3: 22 s per loop
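
To check that the two solutions above pick the same rows, they can be run side by side on a tiny frame. The data below is hypothetical; only the column names follow the question.

```python
import pandas as pd

# Hypothetical data with two groups of three rows each.
df = pd.DataFrame(
    {
        "mainid": [1, 1, 1, 2, 2, 2],
        "pidx": ["a", "a", "a", "x", "x", "x"],
        "pidy": ["c", "d", "e", "w", "y", "z"],
        "score": [5, 3, 8, 12, 1, 4],
    }
)

# Solution 1: sort once, then keep the first two rows of every group.
# The extra sort only puts the rows in the same order as solution 2.
by_head = (
    df.sort_values("score", ascending=False)
    .groupby("pidx")
    .head(2)
    .sort_values(["pidx", "score"], ascending=[True, False])
    .reset_index(drop=True)
)

# Solution 2: let each group pick its two largest scores.
by_nlargest = (
    df.set_index(["mainid", "pidy"])
    .groupby("pidx")["score"]
    .nlargest(2)
    .reset_index()
)

# Compare row contents with a common column order.
cols = ["pidx", "mainid", "pidy", "score"]
same = by_head[cols].values.tolist() == by_nlargest[cols].values.tolist()
print(by_head[cols])
print(same)
```

Note that ties in score could make the two solutions keep different rows, so the comparison is only meaningful when scores within a group are distinct, as they are here.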

Html to JS Datatable : Still show a removed row after sorting or page size change?

As @NTR and @J E Carter II stated, re-reading the data solves the issue perfectly. However, my goal was to avoid re-reading the data from the DB, so I looked for a solution that allows that and found one. All I did was change the line btn.parentNode.parentNode.parentNode.removeChild(row); to $('#myTable').DataTable().rows(row).remove().draw(false); and it works perfectly: after the table is redrawn, the removed row no longer appears on the screen. Here is the complete ajax function:

var row = btn.parentNode.parentNode;
$.ajax({
    type: 'POST',
    url: 'DataManagementPage.aspx/DeleteRowFromMyDatabase',
    data: JSON.stringify({ id: id, grId: grID, city: city }),
    contentType: 'application/json; charset=utf-8',
    dataType: 'json',
    success: function (msg)
    {
        $('#myTable').DataTable().rows(row).remove().draw(false);
    }
});

The topic @NTR provided also offers a solution on the server side: Deleting specific rows from DataTable

EDIT: Please note that DeleteRowFromMyDatabase returns true after the removal in DB.


