Count max. number of concurrent user sessions per day
I would serialize logins and logouts with UNION ALL
, "in" counts as 1, "out" counts as -1. Then compute a running count with a simple window function and get the max per day.
Since it has not been specified, assuming that:
- "Concurrent" means at the same point in time (not just on the same day).
- Sessions can span any range of time (i.e. multiple days, too).
- Each user can only be online once at one point in time. So no grouping per user is needed in my solution.
- Logout trumps login. If both occur at the same time, logout is counted first (leading to a lower concurrent number in corner cases).
WITH range AS (SELECT '2014-03-01'::date AS start_date -- time range
, '2014-03-31'::date AS end_date) -- inclusive bounds
, cte AS (
SELECT *
FROM tbl, range r
WHERE login_date <= r.end_date
AND logout_date >= r.start_date
)
, ct AS (
SELECT log_date, sum(ct) OVER (ORDER BY log_date, log_time, ct) AS session_ct
FROM (
SELECT logout_date AS log_date, logout_time AS log_time, -1 AS ct FROM cte
UNION ALL
SELECT login_date, login_time, 1 FROM cte
) sub
)
SELECT log_date, max(session_ct) AS max_sessions
FROM ct, range r
WHERE log_date BETWEEN r.start_date AND r.end_date -- crop actual time range
GROUP BY 1
ORDER BY 1;
You might use the OVERLAPS
operator in cte
:
AND (login_date, logout_date) OVERLAPS (r.start_date, r.end_date)
Details:
- Find overlapping date ranges in PostgreSQL
But that might not be a good idea because (per documentation):
Each time period is considered to represent the half-open
intervalstart
<=
time
<
end
, unlessstart
andend
are equal in which
case it represents that single time instant. This means for instance
that two time periods with only an endpoint in common do not overlap.
Bold emphasis mine. The upper bound of your range would have to be the day after your desired time frame.
Explain
CTE are available since Postgres 8.4.
The 1st CTE
range
is just for convenience of providing the time range once.The 2nd CTE
cte
selects only relevant rows: those that ...- start before or in the range
- and end in or after the range
The 3rd CTE
ct
serializes "in" and "out" points with values of +/-1 and computes a running count with the aggregate functionsum()
used as window function. Those are available since Postgres 8.4.In the final
SELECT
trim leading and trailing days and aggregate the maximum per day. Voilá.
SQL Fiddle for Postgres 9.6.
Postgres 8.4 is too old and not available any more, but should work the same. I added a rows to the test case - one spanning multiple days. Should make it more useful.
Notes
I would generally use timestamp
instead of date
and time
. Same size, easier to handle. Or timestamptz
if multiple time zones can be involved.
An index on (login_date, logout_date DESC)
is instrumental for performance as a bare minimum.
PostgreSQL count max number of concurrent user sessions per hour
I would decompose this into two problems:
- Find the number of overlaps and when they begin and end.
- Find the hours.
Note two things:
- I am assuming that
'2014-04-03 17:59:00'
is a typo. - The following goes by the beginning of the hour and puts the date/hour in a single column.
First, calculate the overlaps. For this, unpivot the logins and logout. Put in a counter of +1
for logins and -1
for logouts and do a cumulative sum. This looks like:
with overlap as (
select v.ts, sum(v.inc) as inc,
sum(sum(v.inc)) over (order by v.ts) as num_overlaps,
lead(v.ts) over (order by v.ts) as next_ts
from sessions s cross join lateral
(values (login_ts, 1), (logout_ts, -1)) v(ts, inc)
group by v.ts
)
select *
from overlap
order by ts;
For the next step, use generate_series()
to generate timestamps one hour apart. Look for the maximum value during that period using left join
and group by
:
with overlap as (
select v.ts, sum(v.inc) as inc,
sum(sum(v.inc)) over (order by v.ts) as num_overlaps,
lead(v.ts) over (order by v.ts) as next_ts
from sessions s cross join lateral
(values (login_ts, 1), (logout_ts, -1)) v(ts, inc)
group by v.ts
)
select gs.hh, coalesce(max(o.num_overlaps), 0) as num_overlaps
from generate_series('2021-02-03'::date, '2021-02-05'::date, interval '1 hour') gs(hh) left join
overlap o
on o.ts < gs.hh + interval '1 hour' and
o.next_ts > gs.hh
group by gs.hh
order by gs.hh;
Here is a db<>fiddle using your data fixed with the a reasonable logout time for the last record.
Calculate maximum number of concurrent user sessions
Deduct 30 minutes from the end (or start) of each time range. Then basically proceed as outlined in my referenced "simple" answer (adjusting for the 30 min in the right direction everywhere). Ranges shorter than 30 minutes are eliminated a priori - which makes sense as those can never be part of a 30 minutes period of continuous overlap. Also makes the query faster.
Calculating for all days in Oct 2019 (example range):
WITH range AS (SELECT timestamp '2019-10-01' AS start_ts -- incl. lower bound
, timestamp '2019-11-01' AS end_ts) -- excl. upper bound
, cte AS (
SELECT userid, starttime
-- default to current timestamp if NULL
, COALESCE(endtime, localtimestamp) - interval '30 min' AS endtime
FROM usersessions, range r
WHERE starttime < r.end_ts -- count overlaps *starting* in outer time range
AND (endtime >= r.start_ts + interval '30 min' OR endtime IS NULL)
)
, ct AS (
SELECT ts, sum(ct) OVER (ORDER BY ts, ct) AS session_ct
FROM (
SELECT endtime AS ts, -1 AS ct FROM cte
UNION ALL
SELECT starttime , +1 FROM cte
) sub
)
SELECT ts::date, max(session_ct) AS max_concurrent_sessions
FROM ct, range r
WHERE ts >= r.start_ts
AND ts < r.end_ts -- crop outer time range
GROUP BY ts::date
ORDER BY 1;
db<>fiddle here
Be aware that LOCALTIMESTAMP
depends on the time zone of the current session. Consider using timestamptz in your table and CURRENT_TIMESTAMP
instead. See:
- Ignoring time zones altogether in Rails and PostgreSQL
How to use ms sql query to determine max number of concurrent sessions over a timespan
You need a tally table for that. I create it here on the fly but it preferably may be instantiated as a regular table as well.
declare @t table(
id int,
dtmStarted datetime,
dtmEnded datetime
);
insert @t values
(1, '2020-01-01 10:00', '2020-01-01 10:05'),
(2, '2020-01-01 10:00', '2020-01-01 10:05'),
(3, '2020-01-01 10:00', '2020-01-01 10:05'),
(4, '2020-01-01 10:06', '2020-01-01 10:09'),
(5, '2020-01-01 10:07', '2020-01-01 10:08'),
(6, '2020-01-01 10:10', '2020-01-01 10:11');
declare @s datetime = '2020-01-01 10:00';
declare @e datetime = '2020-01-01 11:00';
-- table of 1000 numbers starting 0
with t0(n) as (
select n
from (
values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
) t(n)
),nmbs as(
select row_number() over(order by t1.n) - 1 n
from t0 t1, t0 t2, t0 t3
)
select dateadd(minute, n, @s) start, count(id) cnt
from nmbs
left join @t on dtmStarted <= dateadd(minute, n+1, @s) and dateadd(minute, n, @s)<= dtmEnded
where dateadd(minute, n+1, @s) <= @e
group by dateadd(minute, n, @s)
order by dateadd(minute, n, @s);
Returns
2020-01-01 10:01:00.000 3
2020-01-01 10:02:00.000 3
2020-01-01 10:03:00.000 3
2020-01-01 10:04:00.000 3
2020-01-01 10:05:00.000 4
2020-01-01 10:06:00.000 2
2020-01-01 10:07:00.000 2
2020-01-01 10:08:00.000 2
2020-01-01 10:09:00.000 2
2020-01-01 10:10:00.000 1
2020-01-01 10:11:00.000 1
2020-01-01 10:12:00.000 0
2020-01-01 10:13:00.000 0
2020-01-01 10:14:00.000 0
2020-01-01 10:15:00.000 0
...
2020-01-01 10:59:00.000 0
Probably you'll need to change both or one of predicates dtmStarted <= dateadd(minute, n+1, @s) and dateadd(minute, n, @s) <= dtmEnded
from <=
to strict <
to get the expected result.
What is the fastest way to count user logins in a sessions table?
Create a materialized view for your query that contains the already aggregated data. That is fast to query, and if you refresh the materialized view often enough, the data will be close enough to reality.
count
itself is never fast.
Related Topics
Retrieve The Most Recent Record for Each Customer
How to Count Most Consecutive Occurrences of a Value in a Column in SQL Server
Why Google's Bigtable Referred as a Nosql Database
Query to Check Overlapping Ranges in SQL Server
How to Use Oracle Outer Join with a Filter Where Clause
What Is The Query to Get "Related Tags" Like in
Wordpress: Automatically Delete Posts That Are X Days Old
Sql 2005 How to Use Keyword Like in a Case Statement
Sql Server Login Disable Windows Authentication
How to Concat Multiple Rows into One Column in SQL Server
Passing C# Datatable as a Parameter to Stored Procedure in Ms SQL Server 2008
Postgresql Multiple Nullable Columns in Unique Constraint
Date Split-Up Based on Fiscal Year
Grouping by Date, Return Row Even If No Records Found
Delete ... from ... Where ... In
Difference Between Numeric and Float in Bigquery
Display Multiple Rows and Column Values into a Single Row, Multiple Column Values