Finding The Decade with Largest Records, SQL Server

You can use the LEFT function in SQL Server to get the decade from the year: for a four-digit year, the first three digits identify the decade. Group by the decade and count the movies, then order the results by that count in descending order so that the decade with the most movies comes first. For example:

select
count(id) as number_of_movies,
left(cast([year] as varchar(4)), 3) + '0s' as decade
from movies
group by left(cast([year] as varchar(4)), 3)
order by number_of_movies desc
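
The decade can also be computed arithmetically instead of by string slicing. Here is a minimal sketch of that idea, run through Python's standard-library sqlite3 module rather than SQL Server. The sample rows are made up for illustration. Integer division by 10 followed by multiplication by 10 gives the first year of the decade.

```python
import sqlite3

# Hypothetical sample data (id, year); not taken from the original question.
rows = [(1, 1994), (2, 1999), (3, 2001), (4, 2008), (5, 2009), (6, 2010)]

conn = sqlite3.connect(":memory:")
conn.execute("create table movies (id integer primary key, year integer)")
conn.executemany("insert into movies (id, year) values (?, ?)", rows)

# Integer division drops the last digit; multiplying by 10 restores the decade start.
decade_counts = conn.execute(
    """
    select (year / 10) * 10 as decade, count(id) as number_of_movies
    from movies
    group by (year / 10) * 10
    order by number_of_movies desc, decade
    """
).fetchall()

# The first row is the decade with the most movies.
print(decade_counts)
```

In SQL Server the same expression works as long as [year] is an integer column, since dividing one integer by another truncates there as well.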

Query to find the decade with the largest number of records

I would do this by generating the years, joining in the movies, and then aggregating:

select y.year as decade_start, y.year + 9 as decade_end,
       count(*) as num_movies
from (select distinct year from movies) y join
     movies m
     on m.year >= y.year and m.year < y.year + 10
group by y.year
order by count(*) desc
limit 1;
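
Note that the query above treats every distinct year in the data as a possible decade start, so the winning ten-year window need not line up with a calendar decade. The following sketch, using Python's sqlite3 module and made-up years, runs essentially the same query and compares it with a calendar-decade grouping. The extra tie-breaker in the order by is an addition for deterministic output.

```python
import sqlite3

# Hypothetical release years; chosen so the two approaches disagree.
years = [1995, 1996, 1997, 1998, 2001, 2002, 2003]

conn = sqlite3.connect(":memory:")
conn.execute("create table movies (year integer)")
conn.executemany("insert into movies (year) values (?)", [(y,) for y in years])

# Ten-year windows anchored at each distinct year present in the data.
best_window = conn.execute(
    """
    select y.year as decade_start, y.year + 9 as decade_end, count(*) as num_movies
    from (select distinct year from movies) y
    join movies m on m.year >= y.year and m.year < y.year + 10
    group by y.year
    order by count(*) desc, y.year
    limit 1
    """
).fetchone()

# Calendar decades (1990-1999, 2000-2009, ...) for comparison.
best_calendar = conn.execute(
    """
    select (year / 10) * 10 as decade_start, count(*) as num_movies
    from movies
    group by (year / 10) * 10
    order by count(*) desc, decade_start
    limit 1
    """
).fetchone()

print(best_window, best_calendar)
```

With this data the best window is 1995 to 2004 with 7 movies, while the best calendar decade is the 1990s with 4, so it is worth deciding which definition of "decade" you actually need.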

SQL to Count Records that Existed by Decade

You can also use Oracle's CONNECT BY clause to do that.
I assume your real data has no duplicate rows, i.e. the combination of FILE_NUM, START_DATE and END_DATE is unique.

SELECT dec_start, dec_end, COUNT(*) nb
FROM (
    SELECT t.*, level
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 10 dec_start
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 1 dec_end
    FROM YourTable T
    CONNECT BY 
        10 * TRUNC( EXTRACT ( YEAR FROM (START_DATE) ) / 10 ) + 10 * LEVEL - 10 
            <=  10 * TRUNC( EXTRACT ( YEAR FROM (END_DATE) ) / 10 )  -- keeps the last decade even when END_DATE's year is a multiple of 10
    AND PRIOR FILE_NUM = FILE_NUM
    AND PRIOR START_DATE = START_DATE
    AND PRIOR END_DATE = END_DATE
    AND PRIOR SYS_GUID() IS NOT NULL
)
group by dec_start, dec_end 
order by dec_start, dec_end
;
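
To make the expansion easier to follow, here is a minimal Python sketch of the same idea outside the database: each record is expanded into one entry per decade it touches, and the entries are then counted. The sample rows are made up, and the sketch assumes the decade containing END_DATE counts as covered.

```python
from collections import Counter
from datetime import date

# Hypothetical rows: (file_num, start_date, end_date).
records = [
    (1, date(1995, 3, 1), date(2003, 6, 30)),
    (2, date(2001, 1, 15), date(2012, 12, 31)),
    (3, date(1988, 7, 4), date(1989, 2, 1)),
]

def decades_covered(start, end):
    """Return the first year of every decade between start and end, inclusive."""
    first = start.year // 10 * 10
    last = end.year // 10 * 10
    return range(first, last + 10, 10)

counts = Counter()
for _file_num, start, end in records:
    for dec_start in decades_covered(start, end):
        counts[(dec_start, dec_start + 9)] += 1

# Same shape as the SQL result: dec_start, dec_end, nb.
for (dec_start, dec_end), nb in sorted(counts.items()):
    print(dec_start, dec_end, nb)
```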


Top-N By Decade for successive decades (in SQL Server)

If I followed you correctly, you want the top 5 per decade. If so:

- you would need to group by decade rather than by calendar year to get the proper counts; it is easier to compute the decade in a subquery so you don't have to repeat the case expression
- the rank should be computed over decade partitions rather than per year
- you can then use that column to filter in an outer query

Consider:

select *
from (
    select
        dtd.documenttitle as title,
        rank() over (partition by dd.decade order by count(*) desc) as rnk,
        count(*) as number_of_occurrences,
        dd.decade
    from tbldocumentTitleDimension dtd
    inner join tblDocumentFact df on dtd.documenttitleid   = df.documenttitleid
    inner join  (
        select 
            dateid,
            case
                when calendarYear between 2010 and 2019 then '2010 - 2019'
                when calendarYear between 2000 and 2009 then '2000 - 2009'
                when calendarYear between 1990 and 1999 then '1990 - 1999'
                when calendarYear between 1980 and 1989 then '1980 - 1989'
                when calendarYear between 1970 and 1979 then '1970 - 1979'
                when calendarYear between 1960 and 1969 then '1960 - 1969'
                else 'all others'
            end AS decade
            from tblDateDimension
    ) dd on df.publicationdateid  = dd.dateid
    group by dtd.documenttitle, dd.decade
) t
where rnk <= 5
order by decade, number_of_occurrences desc

Side notes:

- don't use single quotes for identifiers (although SQL Server allows that, single quotes should be reserved for literal strings, as defined in the SQL standard); better yet, use identifiers that do not require quoting
- in a multi-table query, always qualify all column names with the table they belong to; I made a few assumptions here
- unless you have null values in column documentTitle that you don't want to count, you can use count(*) instead of count(documentTitle); this is more straightforward, and more efficient
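
The group-then-rank pattern can be tried on a small scale with the sketch below. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later), a single made-up docs table instead of the dimension tables, an arithmetic decade instead of the case expression, and a top 2 instead of a top 5 to keep the data small.

```python
import sqlite3

# Hypothetical (title, publication year) pairs; one row per occurrence.
docs = [
    ("A", 1991), ("A", 1995), ("A", 1998),
    ("B", 1992), ("B", 1999),
    ("C", 1993),
    ("A", 2001),
    ("C", 2002), ("C", 2005),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table docs (title text, pub_year integer)")
conn.executemany("insert into docs (title, pub_year) values (?, ?)", docs)

top_per_decade = conn.execute(
    """
    select decade, title, number_of_occurrences, rnk
    from (
        select
            d.decade,
            d.title,
            count(*) as number_of_occurrences,
            -- rank titles within each decade by how often they occur
            rank() over (partition by d.decade order by count(*) desc) as rnk
        from (select title, (pub_year / 10) * 10 as decade from docs) d
        group by d.decade, d.title
    ) t
    where rnk <= 2
    order by decade, number_of_occurrences desc, title
    """
).fetchall()

for row in top_per_decade:
    print(row)
```

Because rank() gives tied titles the same rank, a decade can return more rows than the cut-off when there are ties at the boundary.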

SQL Query to return maximums over decades

SELECT
  lookup.decadeID,
  data.*
FROM
(
  SELECT
    truncate(yearid/10,0) as decadeID,
    MAX(HR) as Homers
  FROM
    masterplusbatting
  GROUP BY
    truncate(yearid/10,0)
)
  AS lookup
INNER JOIN
  masterplusbatting AS data
    ON  data.yearid >= lookup.decadeID * 10
    AND data.yearid <  lookup.decadeID * 10 + 10
    AND data.HR     =  lookup.homers

Edited for MySQL
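
For anyone who wants to experiment with the pattern, here is a small sketch using Python's sqlite3 module. It relies on SQLite's integer division in place of MySQL's truncate(), and the playerid column and sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical batting rows: (playerid, yearid, HR).
batting = [
    ("p1", 1961, 61),
    ("p2", 1965, 52),
    ("p5", 1991, 44),
    ("p3", 1998, 70),
    ("p4", 1998, 66),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table masterplusbatting (playerid text, yearid integer, HR integer)"
)
conn.executemany("insert into masterplusbatting values (?, ?, ?)", batting)

# Step 1 (subquery): maximum HR per decade.
# Step 2 (join): pull back the full rows that reach that maximum.
decade_leaders = conn.execute(
    """
    select lookup.decadeID, data.playerid, data.yearid, data.HR
    from (
        select yearid / 10 as decadeID, max(HR) as homers
        from masterplusbatting
        group by yearid / 10
    ) as lookup
    inner join masterplusbatting as data
        on  data.yearid >= lookup.decadeID * 10
        and data.yearid <  lookup.decadeID * 10 + 10
        and data.HR     =  lookup.homers
    order by lookup.decadeID
    """
).fetchall()

print(decade_leaders)
```

If two rows share the maximum within a decade, the join returns both of them.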

SQL query to find dates where more records were active

Test table and data:

create table startend ( prod, startdate, enddate )
as
select 'a', date'1789-04-01', date'1799-12-14' from dual union all
select 'b', date'1797-03-04', date'1826-07-04' from dual union all
select 'c', date'1801-03-04', date'1826-07-04' from dual union all
select 'd', date'1809-03-04', date'1836-06-28' from dual union all
select 'e', date'1817-03-04', date'1831-07-04' from dual ; 

SQL> select * from startend;
PROD  STARTDATE  ENDDATE    
a     01-APR-89  14-DEC-99  
b     04-MAR-97  04-JUL-26  
c     04-MAR-01  04-JUL-26  
d     04-MAR-09  28-JUN-36  
e     04-MAR-17  04-JUL-31

Let's assume that we need to examine every possible combination of STARTDATE and ENDDATE in which the start precedes the end. We could use a join like the one in the inline view below. In this query, the ROWNUM values are aliased as ERA, which will be used for GROUP BY at a later stage.

  select 
    to_char( startdate, 'YYYY-MM-DD') start_
  , to_char( enddate, 'YYYY-MM-DD')   end_
  , enddate - startdate as duration
  , rownum as era
  from ( 
    select distinct
      S1.startdate
    , S2.enddate
    from startend S1 
      join startend S2 on S1.startdate < S2.enddate
  ) 
;

-- result
START_     END_         DURATION        ERA
---------- ---------- ---------- ----------
1789-04-01 1836-06-28      17254          1
1789-04-01 1826-07-04      13607          2
1801-03-04 1831-07-04      11079          3
1809-03-04 1836-06-28       9978          4
1817-03-04 1836-06-28       7056          5
1817-03-04 1831-07-04       5235          6
1801-03-04 1826-07-04       9253          7
1809-03-04 1826-07-04       6331          8
1789-04-01 1831-07-04      15433          9
1797-03-04 1799-12-14       1015         10
1797-03-04 1826-07-04      10713         11
1797-03-04 1831-07-04      12539         12
1817-03-04 1826-07-04       3409         13
1789-04-01 1799-12-14       3909         14
1797-03-04 1836-06-28      14360         15
1801-03-04 1836-06-28      12900         16
1809-03-04 1831-07-04       8157         17

17 rows selected.

The conditions you need seem to be as follows (see the WHERE clause):

-- test dates: from your question
select prod
from startend
where startdate <= date'1817-03-04' and startdate < date'1826-07-04'
  and enddate   > date'1817-03-04' and enddate   >= date'1826-07-04'
;

-- result
b
c
d
e
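
The condition can be checked outside the database with a short Python sketch. It uses the test data from above. Since the era start always precedes the era end, the four comparisons in the WHERE clause reduce to two: the product must start on or before the era start and end on or after the era end. The function name covering is made up for this illustration.

```python
from datetime import date

# Same rows as the startend test table: (prod, startdate, enddate).
startend = [
    ("a", date(1789, 4, 1), date(1799, 12, 14)),
    ("b", date(1797, 3, 4), date(1826, 7, 4)),
    ("c", date(1801, 3, 4), date(1826, 7, 4)),
    ("d", date(1809, 3, 4), date(1836, 6, 28)),
    ("e", date(1817, 3, 4), date(1831, 7, 4)),
]

def covering(era_start, era_end):
    """Return the products that were active for the whole era."""
    return [
        prod
        for prod, startdate, enddate in startend
        if startdate <= era_start and enddate >= era_end
    ]

active = covering(date(1817, 3, 4), date(1826, 7, 4))
print(active)
```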

Final step: combine the ideas behind the first 2 queries, something like (Oracle 11g):

select count(*)                        as "prod_count"
, to_char( E.startdate, 'YYYY-MM-DD' ) as "StartDate"
, to_char( E.enddate, 'YYYY-MM-DD' )   as "EndDate"
from 
(
    select startdate, enddate, rownum as era
    from 
    (
      select distinct
        S1.startdate
      , S2.enddate
      from startend S1 join startend S2 on S1.startdate < S2.enddate
    )
) E 
join 
(
    select distinct prod, startdate, enddate from startend
) P  
on    
      ( P.startdate <= E.startdate and P.startdate < E.enddate )
  and ( P.enddate   >  E.startdate and P.enddate   >= E.enddate )
--
group by era, E.startdate, E.enddate
order by 2, 3
;

Result

prod_count StartDate  EndDate   
---------- ---------- ----------
         1 1789-04-01 1799-12-14
         2 1797-03-04 1799-12-14
         1 1797-03-04 1826-07-04
         2 1801-03-04 1826-07-04
         3 1809-03-04 1826-07-04
         1 1809-03-04 1831-07-04
         1 1809-03-04 1836-06-28
         4 1817-03-04 1826-07-04
         2 1817-03-04 1831-07-04
         1 1817-03-04 1836-06-28

10 rows selected.

When working with Oracle 12c (or 18c), you could use CROSS APPLY instead of JOIN ... ON ...

query to select count of records for each year

A simple method to get all years in the data -- even when they don't meet the conditions of the where clause -- is to use conditional aggregation:

select year(fact_date) as yyyy,
       sum(case when stat = 1 and id = 16 then 1 else 0 end) as cnt_16
from tbl_fact
group by year(fact_date)
order by yyyy;
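
To see the difference from filtering in the where clause, here is a small sketch using Python's sqlite3 module. The rows are made up, and strftime stands in for the year() function, which SQLite does not have.

```python
import sqlite3

# Hypothetical fact rows: (fact_date, stat, id).
facts = [
    ("2019-03-01", 1, 16),
    ("2019-07-09", 0, 16),
    ("2020-01-05", 1, 7),
    ("2021-02-02", 1, 16),
    ("2021-11-30", 1, 16),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl_fact (fact_date text, stat integer, id integer)")
conn.executemany("insert into tbl_fact values (?, ?, ?)", facts)

# Conditional aggregation: every year appears, with 0 where nothing matches.
conditional = conn.execute(
    """
    select cast(strftime('%Y', fact_date) as integer) as yyyy,
           sum(case when stat = 1 and id = 16 then 1 else 0 end) as cnt_16
    from tbl_fact
    group by cast(strftime('%Y', fact_date) as integer)
    order by yyyy
    """
).fetchall()

# Filtering in the where clause: years without a matching row disappear.
filtered = conn.execute(
    """
    select cast(strftime('%Y', fact_date) as integer) as yyyy, count(*) as cnt_16
    from tbl_fact
    where stat = 1 and id = 16
    group by cast(strftime('%Y', fact_date) as integer)
    order by yyyy
    """
).fetchall()

print(conditional, filtered)
```

The year 2020 shows up with a count of 0 in the first result and is missing from the second.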
