Select first and last row for each group and take the column value difference in MySQL?
Using a window function (MySQL 8.0+ or MariaDB 10.2+):
SELECT symbol,
       LAST - FIRST AS price_change
FROM
  (SELECT DISTINCT symbol,
          first_value(price) OVER w AS FIRST,
          last_value(price) OVER w AS LAST
   FROM ticks
   WINDOW w AS (PARTITION BY symbol
                ORDER BY date
                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
  ) AS p
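If you want to try the idea without a MySQL server, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. Assumptions: SQLite 3.25 or newer (needed for window functions), a made-up ticks table, and the aliases first_price and last_price, which are illustrative names rather than anything from the question.

```python
import sqlite3

# Hypothetical sample data: (symbol, date, price)
ROWS = [
    ("AAA", "2024-01-01", 10.0),
    ("AAA", "2024-01-02", 12.5),
    ("AAA", "2024-01-03", 11.0),
    ("BBB", "2024-01-01", 50.0),
    ("BBB", "2024-01-02", 45.0),
]

# The explicit frame makes last_value() look at the whole partition,
# not only at the rows up to the current one.
QUERY = """
SELECT symbol, last_price - first_price AS price_change
FROM (
    SELECT DISTINCT symbol,
           first_value(price) OVER w AS first_price,
           last_value(price)  OVER w AS last_price
    FROM ticks
    WINDOW w AS (PARTITION BY symbol
                 ORDER BY date
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
) AS p
ORDER BY symbol
"""


def price_changes(rows):
    """Load rows into an in-memory table and return (symbol, change) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE ticks (symbol TEXT, date TEXT, price REAL)")
        conn.executemany("INSERT INTO ticks VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for symbol, change in price_changes(ROWS):
        print(symbol, change)
```

With this sample data, AAA goes from 10.0 to 11.0 and BBB from 50.0 to 45.0, so the changes are 1.0 and -5.0.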
How to get the first and the last record per group in SQL Server 2008?
How about using ROW_NUMBER:
WITH Cte AS (
    SELECT *,
        RnAsc = ROW_NUMBER() OVER(PARTITION BY [group] ORDER BY val),
        RnDesc = ROW_NUMBER() OVER(PARTITION BY [group] ORDER BY val DESC)
    FROM tbl
)
SELECT
    id, [group], val, start, [end]
FROM Cte
WHERE
    RnAsc = 1 OR RnDesc = 1
ORDER BY [group], val
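The same two-counter trick can be checked quickly from Python with the built-in sqlite3 module. This is only a sketch under assumptions: SQLite 3.25+ for ROW_NUMBER(), a hypothetical table tbl with columns id, grp and val (grp stands in for [group]), and standard "expr AS alias" syntax instead of the T-SQL "alias = expr" form.

```python
import sqlite3

# Hypothetical sample data: (id, grp, val)
ROWS = [
    (1, "a", 5),
    (2, "a", 1),
    (3, "a", 9),
    (4, "b", 7),  # group with a single row
    (5, "c", 2),
    (6, "c", 3),
]

# Number each group's rows from both ends; a row is first or last
# when either counter equals 1.
QUERY = """
WITH cte AS (
    SELECT id, grp, val,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val)      AS rn_asc,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val DESC) AS rn_desc
    FROM tbl
)
SELECT id, grp, val
FROM cte
WHERE rn_asc = 1 OR rn_desc = 1
ORDER BY grp, val
"""


def first_and_last(rows):
    """Return the lowest-val and highest-val row of every group."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tbl (id INTEGER, grp TEXT, val INTEGER)")
        conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_and_last(ROWS):
        print(row)
```

Note that a group with only one row (group "b" here) comes back once, not twice, because the same row satisfies both conditions of the OR.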
Select first and last row from grouped data
There is probably a faster way:
df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number() == 1 | row_number() == n())
Window function get first and last row from each group
You need a subquery:
SELECT Name, Price, Percent, Volume, time, date
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Percent) AS seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Percent DESC) AS seqnum_desc
      FROM TABLE_NAME t
     ) t
WHERE 1 IN (seqnum_asc, seqnum_desc)
ORDER BY Name asc;
If you want time as a separate query, just adjust the ORDER BY in the windowing clause. If you want all in one query, then add two new "seqnum"s based on time.
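The "all in one query" variant might look roughly like the sketch below, run through Python's sqlite3 module so it can be tried locally. Assumptions: SQLite 3.25+, a hypothetical quotes table reduced to three columns, and the names seqnum_time_asc and seqnum_time_desc, which are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (name, percent, time)
ROWS = [
    ("X", 1.0, "09:00"),  # lowest percent and earliest time
    ("X", 3.0, "10:00"),  # highest percent
    ("X", 2.5, "10:30"),  # neither an extreme percent nor an extreme time
    ("X", 2.0, "11:00"),  # latest time
    ("Y", 5.0, "09:00"),  # single-row group
]

# Four counters: two ordered by percent, two ordered by time.
QUERY = """
SELECT name, percent, time
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY name ORDER BY percent)      AS seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY name ORDER BY percent DESC) AS seqnum_desc,
             ROW_NUMBER() OVER (PARTITION BY name ORDER BY time)         AS seqnum_time_asc,
             ROW_NUMBER() OVER (PARTITION BY name ORDER BY time DESC)    AS seqnum_time_desc
      FROM quotes t
     ) t
WHERE 1 IN (seqnum_asc, seqnum_desc, seqnum_time_asc, seqnum_time_desc)
ORDER BY name, time
"""


def extremes(rows):
    """Return rows that are first or last by percent or by time."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE quotes (name TEXT, percent REAL, time TEXT)")
        conn.executemany("INSERT INTO quotes VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in extremes(ROWS):
        print(row)
```

In this sample only the 10:30 row for X is filtered out, since it is neither the minimum nor the maximum on either ordering.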
Get only the first and last rows of each group with pandas
Use groupby, find the head and tail for each group, and concat the two.
g = df.groupby('ID')
(pd.concat([g.head(1), g.tail(1)])
.drop_duplicates()
.sort_values('ID')
.reset_index(drop=True))
Time ID X Y
0 8:00 A 23 100
1 20:00 A 35 220
2 9:00 B 24 110
3 23:00 B 38 250
4 11:00 C 26 130
5 22:00 C 37 240
6 15:00 D 30 170
If you can guarantee each ID group has at least two rows, the drop_duplicates call is not needed.
Details
g.head(1)
Time ID X Y
0 8:00 A 23 100
1 9:00 B 24 110
3 11:00 C 26 130
7 15:00 D 30 170
g.tail(1)
Time ID X Y
7 15:00 D 30 170
12 20:00 A 35 220
14 22:00 C 37 240
15 23:00 B 38 250
pd.concat([g.head(1), g.tail(1)])
Time ID X Y
0 8:00 A 23 100
1 9:00 B 24 110
3 11:00 C 26 130
7 15:00 D 30 170
7 15:00 D 30 170
12 20:00 A 35 220
14 22:00 C 37 240
15 23:00 B 38 250
Get values from first and last row per group
This is a bit of a pain, because Postgres has the nice window functions first_value() and last_value(), but these are not aggregation functions. So, here is one way:
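select t.name, min(t.week) as minWeek, max(firstvalue) as firstValue,
       max(t.week) as maxWeek, max(lastvalue) as lastValue
from (select t.*,
             first_value(value) over (partition by name order by week) as firstvalue,
             -- explicit frame: the default frame ends at the current row,
             -- so last_value() would otherwise return each row's own value
             last_value(value) over (partition by name order by week
                                     rows between unbounded preceding and unbounded following) as lastvalue
      from table t  -- "table" stands in for the real table name
     ) t
group by t.name;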
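One thing worth checking with last_value(): when the window has an ORDER BY but no explicit frame, the default frame runs only up to the current row, so last_value() returns the current row's value and max() over it gives the largest value, not the last one. The sketch below shows the difference using Python's sqlite3 module. Assumptions: SQLite 3.25+, whose default frame follows the same standard rule as Postgres, and a hypothetical table named weekly.

```python
import sqlite3

# Hypothetical sample data: (name, week, value).
# The last week's value (20) is deliberately not the largest value (30).
ROWS = [("a", 1, 30), ("a", 2, 10), ("a", 3, 20)]

FULL_FRAME = "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"

QUERY_TEMPLATE = """
SELECT name, MAX(lastvalue)
FROM (SELECT name,
             last_value(value) OVER (PARTITION BY name ORDER BY week {frame}) AS lastvalue
      FROM weekly)
GROUP BY name
"""


def last_values(frame):
    """Run the query with the given frame clause ('' means the default frame)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE weekly (name TEXT, week INTEGER, value INTEGER)")
        conn.executemany("INSERT INTO weekly VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY_TEMPLATE.format(frame=frame)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print("default frame:", last_values(""))          # picks 30, the maximum
    print("full frame:   ", last_values(FULL_FRAME))  # picks 20, the true last value
```

An alternative that avoids the frame clause altogether is first_value(value) with ORDER BY week DESC, since first_value() is not affected by where the default frame ends.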
drop first and last row from within each group
I'd apply a similar technique to what I did for the other question:
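def first_last(df):
    # .ix was removed in pandas 1.0; .iloc does the positional slice
    # that drops the first and last row of each group
    return df.iloc[1:-1]

df.groupby(level=0, group_keys=False).apply(first_last)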
Get last row of each group in R
You might try:
a %>%
  group_by(ID) %>%
  arrange(NUM) %>%
  slice(n())