Count Max. Number of Concurrent User Sessions Per Day

I would serialize logins and logouts with UNION ALL: a login counts as +1, a logout as -1. Then compute a running count with a simple window function and take the maximum per day.

Since it has not been specified, I assume that:

  • "Concurrent" means at the same point in time (not just on the same day).
  • Sessions can span any range of time (i.e. multiple days, too).
  • Each user can only be online once at any given point in time. So no grouping per user is needed in my solution.
  • Logout trumps login. If both occur at the same time, logout is counted first (leading to a lower concurrent number in corner cases).
WITH range AS (
   SELECT '2014-03-01'::date AS start_date  -- time range
        , '2014-03-31'::date AS end_date    -- inclusive bounds
   )
, cte AS (
   SELECT *
   FROM   tbl, range r
   WHERE  login_date  <= r.end_date
   AND    logout_date >= r.start_date
   )
, ct AS (
   SELECT log_date, sum(ct) OVER (ORDER BY log_date, log_time, ct) AS session_ct
   FROM  (
      SELECT logout_date AS log_date, logout_time AS log_time, -1 AS ct FROM cte
      UNION ALL
      SELECT login_date, login_time, 1 FROM cte
      ) sub
   )
SELECT log_date, max(session_ct) AS max_sessions
FROM   ct, range r
WHERE  log_date BETWEEN r.start_date AND r.end_date  -- crop actual time range
GROUP  BY 1
ORDER  BY 1;

You might use the OVERLAPS operator in cte:

AND   (login_date, logout_date) OVERLAPS (r.start_date, r.end_date)

Details:

  • Find overlapping date ranges in PostgreSQL

But that might not be a good idea because (per documentation):

Each time period is considered to represent the half-open
interval start <= time < end, unless start and end are equal in which
case it represents that single time instant. This means for instance
that two time periods with only an endpoint in common do not overlap.

That is, two time periods with only an endpoint in common do not overlap. The upper bound of your range would have to be the day after your desired time frame.
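The half-open semantics can be illustrated with a small Python sketch (not Postgres itself; the date values are made up for the example):

```python
from datetime import date

def overlaps(start1, end1, start2, end2):
    # Half-open intervals [start, end): periods that only touch
    # at an endpoint do not overlap, mirroring the OVERLAPS docs.
    return start1 < end2 and start2 < end1

# Two periods sharing only the endpoint 2014-03-31:
a = (date(2014, 3, 1), date(2014, 3, 31))
b = (date(2014, 3, 31), date(2014, 4, 30))
print(overlaps(*a, *b))  # False - endpoint-only contact is no overlap

# Pushing the upper bound one day out covers March 31 inclusively:
print(overlaps(date(2014, 3, 1), date(2014, 4, 1), *b))  # True
```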

Explanation

  • CTEs are available since Postgres 8.4.

  • The first CTE, range, is just for the convenience of providing the time range once.

  • The 2nd CTE cte selects only relevant rows: those that ...

    • start before or in the range
    • and end in or after the range
  • The 3rd CTE ct serializes "in" and "out" points with values of +/-1 and computes a running count with the aggregate function sum() used as window function. Those are available since Postgres 8.4.

  • In the final SELECT, trim leading and trailing days and aggregate the maximum per day. Voilà.

SQL Fiddle for Postgres 9.6.

Postgres 8.4 is too old and no longer available there, but the query should work the same. I added a row to the test case - one spanning multiple days - which should make it more useful.
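For a quick self-check outside Postgres, the same serialize-and-running-sum idea can be sketched in Python with SQLite (window functions require SQLite 3.25+; table layout and sample sessions are invented for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (login_ts TEXT, logout_ts TEXT)")
con.executemany("INSERT INTO tbl VALUES (?, ?)", [
    ("2014-03-03 08:00", "2014-03-03 12:00"),
    ("2014-03-03 09:00", "2014-03-04 11:00"),  # spans two days
    ("2014-03-03 10:00", "2014-03-03 10:30"),
])

rows = con.execute("""
    WITH ev AS (             -- serialize: login = +1, logout = -1
        SELECT login_ts  AS ts, +1 AS ct FROM tbl
        UNION ALL
        SELECT logout_ts AS ts, -1 AS ct FROM tbl
    ),
    run AS (                 -- running count; logout sorts before login on ties
        SELECT date(ts) AS log_date,
               sum(ct) OVER (ORDER BY ts, ct) AS session_ct
        FROM ev
    )
    SELECT log_date, max(session_ct)
    FROM run GROUP BY log_date ORDER BY log_date
""").fetchall()
print(rows)  # [('2014-03-03', 3), ('2014-03-04', 0)]
```

The running count peaks at 3 while all three sample sessions overlap, and drops back to 0 on the second day when the multi-day session ends.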

Notes

I would generally use timestamp instead of date and time. Same size, easier to handle. Or timestamptz if multiple time zones can be involved.

An index on (login_date, logout_date DESC) is instrumental for performance as a bare minimum.

PostgreSQL count max number of concurrent user sessions per hour

I would decompose this into two problems:

  1. Find the number of overlaps and when they begin and end.
  2. Find the hours.

Note two things:

  • I am assuming that '2014-04-03 17:59:00' is a typo.
  • The following goes by the beginning of the hour and puts the date/hour in a single column.

First, calculate the overlaps. For this, unpivot the logins and logouts, assign a counter of +1 for logins and -1 for logouts, and compute a cumulative sum. This looks like:

with overlap as (
    select v.ts, sum(v.inc) as inc,
           sum(sum(v.inc)) over (order by v.ts) as num_overlaps,
           lead(v.ts) over (order by v.ts) as next_ts
    from sessions s cross join lateral
         (values (login_ts, 1), (logout_ts, -1)) v(ts, inc)
    group by v.ts
)
select *
from overlap
order by ts;

For the next step, use generate_series() to generate timestamps one hour apart. Look for the maximum value during that period using left join and group by:

with overlap as (
    select v.ts, sum(v.inc) as inc,
           sum(sum(v.inc)) over (order by v.ts) as num_overlaps,
           lead(v.ts) over (order by v.ts) as next_ts
    from sessions s cross join lateral
         (values (login_ts, 1), (logout_ts, -1)) v(ts, inc)
    group by v.ts
)
select gs.hh, coalesce(max(o.num_overlaps), 0) as num_overlaps
from generate_series('2021-02-03'::date, '2021-02-05'::date, interval '1 hour') gs(hh)
left join overlap o
       on o.ts < gs.hh + interval '1 hour'
      and o.next_ts > gs.hh
group by gs.hh
order by gs.hh;
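To sanity-check the two steps outside the database, here is a plain-Python sketch of the same logic (session timestamps are invented; the segment list mirrors the overlap CTE, the hour loop mirrors the generate_series join):

```python
from datetime import datetime, timedelta

sessions = [
    (datetime(2021, 2, 3, 9, 15), datetime(2021, 2, 3, 11, 5)),
    (datetime(2021, 2, 3, 9, 45), datetime(2021, 2, 3, 10, 10)),
]

# Step 1: unpivot to +1/-1 events and build [ts, next_ts) segments
# carrying the cumulative overlap count.
events = sorted([(s, +1) for s, _ in sessions] + [(e, -1) for _, e in sessions])
segments = []  # (start, next_ts, num_overlaps)
running = 0
for (ts, inc), nxt in zip(events, events[1:] + [(None, 0)]):
    running += inc
    segments.append((ts, nxt[0], running))

# Step 2: for each hour bucket, take the max count of intersecting segments.
hour = datetime(2021, 2, 3, 9, 0)
counts = {}
for _ in range(3):
    hits = [n for s, e, n in segments
            if s < hour + timedelta(hours=1) and (e is None or e > hour)]
    counts[hour] = max(hits, default=0)
    hour += timedelta(hours=1)
print(counts)  # peaks at 2 while both sample sessions overlap
```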

Here is a db<>fiddle using your data, fixed with a reasonable logout time for the last record.

Calculate maximum number of concurrent user sessions

Deduct 30 minutes from the end (or start) of each time range, then proceed basically as outlined in my referenced "simple" answer (adjusting for the 30 minutes in the right direction everywhere). Ranges shorter than 30 minutes are eliminated a priori, which makes sense: those can never be part of a 30-minute period of continuous overlap. It also makes the query faster.
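The shrink-by-30-minutes trick can be illustrated with a small Python sketch (interval values are made up): after subtracting 30 minutes from each end, any instant where the shrunken intervals still overlap corresponds to at least 30 minutes of continuous overlap in the originals.

```python
from datetime import datetime, timedelta

MIN_OVERLAP = timedelta(minutes=30)

def shrink(sessions):
    # Subtract 30 min from each end; ranges that become empty are
    # dropped - sessions shorter than 30 min can never contribute.
    out = []
    for start, end in sessions:
        end = end - MIN_OVERLAP
        if end >= start:
            out.append((start, end))
    return out

sessions = [
    (datetime(2019, 10, 1, 10, 0), datetime(2019, 10, 1, 11, 30)),
    (datetime(2019, 10, 1, 10, 30), datetime(2019, 10, 1, 12, 0)),
    (datetime(2019, 10, 1, 10, 0), datetime(2019, 10, 1, 10, 20)),  # too short
]
shrunk = shrink(sessions)
print(len(shrunk))  # 2 - the 20-minute session is eliminated a priori

# The shrunken ranges still overlapping means the original sessions
# shared at least 30 minutes of continuous overlap:
(s1, e1), (s2, e2) = shrunk
print(s1 < e2 and s2 < e1)  # True
```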

Calculating for all days in Oct 2019 (example range):

WITH range AS (
   SELECT timestamp '2019-10-01' AS start_ts  -- incl. lower bound
        , timestamp '2019-11-01' AS end_ts    -- excl. upper bound
   )
, cte AS (
   SELECT userid, starttime
        -- default to current timestamp if NULL
        , COALESCE(endtime, localtimestamp) - interval '30 min' AS endtime
   FROM   usersessions, range r
   WHERE  starttime < r.end_ts  -- count overlaps *starting* in outer time range
   AND   (endtime >= r.start_ts + interval '30 min' OR endtime IS NULL)
   )
, ct AS (
   SELECT ts, sum(ct) OVER (ORDER BY ts, ct) AS session_ct
   FROM  (
      SELECT endtime AS ts, -1 AS ct FROM cte
      UNION ALL
      SELECT starttime, +1 FROM cte
      ) sub
   )
SELECT ts::date, max(session_ct) AS max_concurrent_sessions
FROM   ct, range r
WHERE  ts >= r.start_ts
AND    ts <  r.end_ts  -- crop outer time range
GROUP  BY ts::date
ORDER  BY 1;

db<>fiddle here

Be aware that LOCALTIMESTAMP depends on the time zone of the current session. Consider using timestamptz in your table and CURRENT_TIMESTAMP instead. See:

  • Ignoring time zones altogether in Rails and PostgreSQL

How to use ms sql query to determine max number of concurrent sessions over a timespan

You need a tally table for that. I create it on the fly here, but it may preferably be instantiated as a regular table as well.

declare @t table(
    id int,
    dtmStarted datetime,
    dtmEnded datetime
);
insert @t values
    (1, '2020-01-01 10:00', '2020-01-01 10:05'),
    (2, '2020-01-01 10:00', '2020-01-01 10:05'),
    (3, '2020-01-01 10:00', '2020-01-01 10:05'),
    (4, '2020-01-01 10:06', '2020-01-01 10:09'),
    (5, '2020-01-01 10:07', '2020-01-01 10:08'),
    (6, '2020-01-01 10:10', '2020-01-01 10:11');

declare @s datetime = '2020-01-01 10:00';
declare @e datetime = '2020-01-01 11:00';

-- table of 1000 numbers starting at 0
with t0(n) as (
    select n
    from (
        values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
    ) t(n)
), nmbs as (
    select row_number() over(order by t1.n) - 1 n
    from t0 t1, t0 t2, t0 t3
)
select dateadd(minute, n, @s) start, count(id) cnt
from nmbs
left join @t on dtmStarted <= dateadd(minute, n+1, @s)
            and dateadd(minute, n, @s) <= dtmEnded
where dateadd(minute, n+1, @s) <= @e
group by dateadd(minute, n, @s)
order by dateadd(minute, n, @s);

Returns

2020-01-01 10:01:00.000 3
2020-01-01 10:02:00.000 3
2020-01-01 10:03:00.000 3
2020-01-01 10:04:00.000 3
2020-01-01 10:05:00.000 4
2020-01-01 10:06:00.000 2
2020-01-01 10:07:00.000 2
2020-01-01 10:08:00.000 2
2020-01-01 10:09:00.000 2
2020-01-01 10:10:00.000 1
2020-01-01 10:11:00.000 1
2020-01-01 10:12:00.000 0
2020-01-01 10:13:00.000 0
2020-01-01 10:14:00.000 0
2020-01-01 10:15:00.000 0
...
2020-01-01 10:59:00.000 0

Probably you'll need to change one or both of the predicates dtmStarted <= dateadd(minute, n+1, @s) and dateadd(minute, n, @s) <= dtmEnded from <= to a strict < to get the expected result.
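To see why the boundary choice matters, here is a small Python sketch of the minute-bucket predicate with closed versus strict bounds (using two of the sample sessions from above):

```python
from datetime import datetime, timedelta

fmt = "%Y-%m-%d %H:%M"
sessions = [
    ("2020-01-01 10:00", "2020-01-01 10:05"),
    ("2020-01-01 10:06", "2020-01-01 10:09"),
]
sessions = [(datetime.strptime(a, fmt), datetime.strptime(b, fmt))
            for a, b in sessions]

def count_at(bucket_start, strict=False):
    # Count sessions intersecting the one-minute bucket, mirroring the
    # T-SQL join: start <= bucket_end and bucket_start <= end.
    bucket_end = bucket_start + timedelta(minutes=1)
    if strict:
        return sum(s < bucket_end and bucket_start < e for s, e in sessions)
    return sum(s <= bucket_end and bucket_start <= e for s, e in sessions)

t = datetime(2020, 1, 1, 10, 5)
print(count_at(t))               # 2 - closed bounds count both edge touches
print(count_at(t, strict=True))  # 0 - strict bounds exclude both
```

Sessions that merely touch a bucket boundary are counted under closed bounds but dropped under strict ones, which is exactly the difference the note above is about.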

What is the fastest way to count user logins in a sessions table?

Create a materialized view for your query that contains the already aggregated data. That is fast to query, and if you refresh the materialized view often enough, the data will be close enough to reality.

count() itself is never fast in Postgres: it has to scan all qualifying rows.


