SQL Set-Based Range

I think the very short answer to your question is to use WITH clauses to generate your own.

Unfortunately, most of the big names in databases have historically lacked built-in queryable number-range pseudo-tables or, more generally, easy pure-SQL data generation features. (PostgreSQL's generate_series is a notable exception, and SQL Server 2022 added GENERATE_SERIES.) Personally, I think this is a huge failing, because with such features it would be possible to move a lot of code that is currently locked up in procedural scripts (T-SQL, PL/SQL, etc.) into pure SQL, which benefits both performance and code complexity.

So anyway, it sounds like what you need in a general sense is the ability to generate data on the fly.

Oracle and T-SQL both support a WITH clause that can be used to do this. They work a little differently in the different DBMSs, and Microsoft calls them "common table expressions", but they are very similar in form. Using these with recursion, you can generate a sequence of numbers or text values fairly easily. Here is what it might look like...

In Oracle SQL:

WITH
  digits AS -- Limit recursion by just using it for digits.
    (SELECT
       LEVEL - 1 AS num -- LEVEL runs 1..10, so num runs 0..9
     FROM
       DUAL
     CONNECT BY
       LEVEL <= 10),
  numrange AS
    (SELECT
       ones.num
         + (tens.num * 10)
         + (hundreds.num * 100)
         AS num
     FROM
       digits ones
     CROSS JOIN
       digits tens
     CROSS JOIN
       digits hundreds
     WHERE
       hundreds.num IN (1, 2)) -- Use the WHERE clause to restrict each digit as needed.
SELECT
  num -- Plus any other columns and operations
FROM
  numrange
-- Join to other data if needed

This is admittedly quite verbose. Oracle's recursion functionality is limited. The syntax is clunky, it's not performant, and it is limited to 500 (I think) nested levels. This is why I chose to use recursion only for the first 10 digits, and then cross (cartesian) joins to combine them into actual numbers.

I haven't used SQL Server's Common Table Expressions myself, but since they allow self-reference, recursion is MUCH simpler than it is in Oracle. Whether performance is comparable, and what the nesting limits are, I don't know.
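For comparison, here is a minimal sketch of the same 100 to 299 range as a self-referencing CTE in SQL Server. The CTE name numrange is simply reused from the Oracle example, and the bounds are illustrative. By default SQL Server stops a recursive CTE after 100 levels; the MAXRECURSION hint raises that limit (up to 32,767), and a value of 0 removes it.

```sql
-- Sketch only: generate the integers 100..299 with a recursive CTE (SQL Server).
WITH numrange AS
(
    SELECT 100 AS num          -- anchor member: first value of the range
    UNION ALL
    SELECT num + 1             -- recursive member: the CTE references itself
    FROM numrange
    WHERE num < 299            -- stop condition
)
SELECT num
FROM numrange
OPTION (MAXRECURSION 200);     -- 199 recursive steps are needed; the default limit is 100
```

For large ranges, the cross-join approach shown for Oracle translates to T-SQL as well and avoids deep recursion altogether.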

At any rate, recursion and the WITH clause are very useful tools in creating queries that require on-the-fly generated data sets. Then by querying this data set, doing operations on the values, you can get all sorts of different types of generated data. Aggregations, duplications, combinations, permutations, and so on. You can even use such generated data to aid in rolling up or drilling down into other data.

UPDATE: I just want to add that, once you start working with data in this way, it opens your mind to new ways of thinking about SQL. It's not just a scripting language. It's a fairly robust data-driven declarative language. Sometimes it's a pain to use because for years it has suffered a dearth of enhancements to aid in reducing the redundancy needed for complex operations. But nonetheless it is very powerful, and a fairly intuitive way to work with data sets as both the target and the driver of your algorithms.

How to query number based SQL Sets with Ranges in SQL

One fairly easy way to do this would be to load a temp table with your values/ranges:

CREATE TABLE #Ranges (ValA int, ValB int)

INSERT INTO #Ranges
VALUES
(1, 10)
,(13, NULL)
,(24, NULL)
,(51,60)

SELECT *
FROM [Table] t -- placeholder name; TABLE is a reserved word, so it needs brackets
JOIN #Ranges R
ON (t.Field = R.ValA AND R.ValB IS NULL)
OR (t.Field BETWEEN R.ValA AND R.ValB) -- never true when ValB is NULL

The BETWEEN won't scale that well, though, so you may want to consider expanding this to include all values and eliminating ranges.
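One way that expansion might look is sketched below. It assumes the #Ranges table from above still exists in the session; #RangeValues and IX_RangeValues are names made up for this illustration, and [Table] / Field are the same placeholders used in the query above.

```sql
-- Sketch: expand every range (and single value) into one row per value.
WITH Expanded AS
(
    SELECT ValA AS Val,
           ISNULL(ValB, ValA) AS LastVal   -- a single value becomes a one-element range
    FROM #Ranges
    UNION ALL
    SELECT Val + 1, LastVal                -- walk each range one value at a time
    FROM Expanded
    WHERE Val < LastVal
)
SELECT DISTINCT Val                        -- DISTINCT guards against overlapping ranges
INTO #RangeValues
FROM Expanded
OPTION (MAXRECURSION 0);                   -- a range may be longer than the default 100 levels

CREATE UNIQUE CLUSTERED INDEX IX_RangeValues ON #RangeValues (Val);

SELECT t.*
FROM [Table] t
JOIN #RangeValues V
    ON t.Field = V.Val;                    -- plain equality join, no BETWEEN or OR
```

The trade-off is more rows in the lookup table in exchange for a join predicate the optimizer can satisfy with a simple index seek.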

How to update rows in a table based on a range from two rows in another table

Using simple MAX:

/* Preparing data */
CREATE TABLE Review(id INT IDENTITY(1,1), val INT, discount INT NULL);

INSERT INTO Review(val)
VALUES (12), (400), (600), (1100), (1550);

CREATE TABLE Discount(Id INT IDENTITY(1,1), Quantity INT, DiscountAmount INT);

INSERT INTO Discount(Quantity, DiscountAmount)
VALUES (500, 6), (1000, 8), (1500, 10);

/* Main */
UPDATE rev
SET Discount = (SELECT ISNULL(MAX(d.DiscountAmount), 0)
FROM Discount d
WHERE rev.val >= d.Quantity)
FROM Review rev;
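Note that MAX works here because DiscountAmount rises together with Quantity in the sample data. If that is not guaranteed, a variant that picks the discount of the highest qualifying Quantity might look like the sketch below. It reuses the Review and Discount tables defined above and is an alternative, not part of the original answer.

```sql
-- Sketch: take the discount from the largest Quantity threshold that val reaches.
UPDATE rev
SET discount = ISNULL((SELECT TOP (1) d.DiscountAmount
                       FROM Discount d
                       WHERE d.Quantity <= rev.val
                       ORDER BY d.Quantity DESC), 0)   -- 0 when no threshold is reached
FROM Review rev;

SELECT val, discount
FROM Review
ORDER BY val;
-- With the sample data above:
-- val    discount
-- 12     0
-- 400    0
-- 600    6
-- 1100   8
-- 1550   10
```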

How to set in SQL a Flag based on days range

First, here's a query that I think gives you what you're looking for (let me know):

select RecId
, createdDate
, UserId
, row_number() over (partition by UserId order by createdDate desc) as ROWNUMBER
, case
when datediff(day,lag(createdDate) over (partition by UserId order by createdDate),createdDate) <= 50 then 'false'
else 'true'
end as toCount
from customer
order by RecId;

A couple of observations:

row_number() over (partition by UserID order by UserID) as "ROWNUMBER"

UserID is not distinct within its own partition, so it doesn't make a good candidate for the order by in this row_number function; the numbering comes out arbitrary. It's good for partition by, not for order by.

lag(createdDate,50,createdDate)

That 50 in there is a row offset, so you're asking to look back fifty rows, not 50 days.
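To make the offset behaviour concrete, here is a small sketch with made-up sample rows (the @customer table variable and its three rows are hypothetical, not taken from the question). It uses an offset of 2 rather than 50 so the effect is visible on three rows.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @customer TABLE (RecId int, createdDate date, UserId int);
INSERT INTO @customer (RecId, createdDate, UserId) VALUES
    (1, '20200101', 1),
    (2, '20200120', 1),
    (3, '20200601', 1);

SELECT  RecId,
        createdDate,
        -- previous row (offset defaults to 1, default value NULL)
        LAG(createdDate) OVER (PARTITION BY UserId ORDER BY createdDate) AS prevDate,
        -- two ROWS back; falls back to the row's own createdDate when there is no such row
        LAG(createdDate, 2, createdDate) OVER (PARTITION BY UserId ORDER BY createdDate) AS twoRowsBack,
        -- the day-based comparison has to be done with DATEDIFF
        DATEDIFF(day, LAG(createdDate) OVER (PARTITION BY UserId ORDER BY createdDate), createdDate) AS daysSincePrev
FROM @customer
ORDER BY RecId;
-- RecId  createdDate  prevDate    twoRowsBack  daysSincePrev
-- 1      2020-01-01   NULL        2020-01-01   NULL
-- 2      2020-01-20   2020-01-01  2020-01-20   19
-- 3      2020-06-01   2020-01-20  2020-01-01   133
```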

In SQL, how can you group by in ranges?

Neither of the highest-voted answers is correct on SQL Server 2000. Perhaps they were using a different version.

Here are the correct versions of both of them on SQL Server 2000.

select t.range as [score range], count(*) as [number of occurrences]
from (
select case
when score between 0 and 9 then ' 0- 9'
when score between 10 and 19 then '10-19'
else '20-99' end as range
from scores) t
group by t.range

or

select t.range as [score range], count(*) as [number of occurrences]
from (
select user_id,
case when score >= 0 and score< 10 then '0-9'
when score >= 10 and score< 20 then '10-19'
else '20-99' end as range
from scores) t
group by t.range
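If all the buckets have the same width, the CASE expression can be replaced by integer division. This is a different technique from the two queries above, sketched under the assumption that score is a non-negative integer column; unlike the CASE versions, it gives every decade (20-29, 30-39, ...) its own row instead of lumping everything into '20-99'.

```sql
-- Sketch: bucket scores into ranges of width 10 using integer division.
select t.bucket * 10     as [range start],
       t.bucket * 10 + 9 as [range end],
       count(*)          as [number of occurrences]
from (
      select score / 10 as bucket   -- integer division: 0-9 -> 0, 10-19 -> 1, ...
      from scores) t
group by t.bucket
order by t.bucket
```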

Date Range for set of same data

Non-relational Solution

I don't think any of the other answers are correct.

  • GROUP BY won't work

  • Using ROW_NUMBER() forces the data into a Record Filing System structure, which is physical, and then processes it as physical records. At a massive performance cost. Of course, in order to write such code, it forces you to think in terms of RFS instead of thinking in Relational terms.

  • Using CTEs is the same. Iterating through the data, especially data that does not change. At a slightly different massive cost.

  • Cursors are definitely the wrong thing for a different set of reasons. (a) Cursors require code, and you have requested a View (b) Cursors abandon the set-processing engine, and revert to row-by-row processing. Again, not required. If a developer on any of my teams uses cursors or temp tables on a Relational Database (ie. not a Record Filing System), I shoot them.

Relational Solution

  1. Your data is Relational and logical; the two given data columns are all that is necessary.

  2. Sure, we have to form a View (derived Relation), to obtain the desired report, but that consists of pure SELECTs, which is quite different to processing (converting it to a file, which is physical, and then processing the file; or temp tables; or worktables; or CTEs; or ROW_Number(); etc).

  3. Contrary to the lamentations of "theoreticians", who have an agenda, SQL handles Relational data perfectly well. And your data is Relational.

Therefore, maintain a Relational mindset, a Relational view of the data, and a set-processing mentality. Every report requirement over a Relational Database can be fulfilled using a single SELECT. There is no need to regress to pre-1970 ISAM File handling methods.

I will assume the Primary Key (the set of columns that give a Relational row uniqueness) is Date, and based on the example data given, the Datatype is DATE.

Try this:

CREATE VIEW MyTable_Base_V          -- Foundation View
AS
    SELECT  Date,
            Date_Next,
            Price
        FROM (
        -- Derived Table: project rows with what we need
        SELECT  Date,
                [Date_Next]  = DATEADD( DD, 1, MT.Date ),
                Price,
                [Price_Next] = (
                    SELECT Price            -- NULL if not exists
                        FROM MyTable
                        WHERE Date = DATEADD( DD, 1, MT.Date )
                    )
            FROM MyTable MT
        ) AS X
        WHERE Price != Price_Next           -- exclude unchanging rows
           OR Price_Next IS NULL            -- keep the final row, which has no successor
GO

CREATE VIEW MyTable_V               -- Requested View
AS
    SELECT  [Date_From] = ISNULL( (
                -- Day after the previous range ended
                SELECT MAX( Date_Next )
                    FROM MyTable_Base_V
                    WHERE Date_Next <= MT.Date   -- <= so that one-day ranges are handled
                ),
                -- First range: there is no previous range, so start at the first Date
                ( SELECT MIN( Date ) FROM MyTable )
                ),
            [Date_To]   = Date,     -- this row
            Price
        FROM MyTable_Base_V MT
GO

SELECT  *
    FROM MyTable_V
GO

Method, Generic

Of course this is a method, and therefore it is generic: it can be used to determine the From_ and To_ of any data range (here, a Date range), based on any data change (here, a change in Price).

Here, your Dates are consecutive, so the determination of Date_Next is simple: increment the Date by 1 day. If the PK is increasing but not consecutive (eg. DateTime or TimeStamp or some other Key), change the Derived Table X to:

-- Derived Table: project rows with what we need
SELECT  DateTime,
        [DateTime_Next] = (
            -- first row > this row
            SELECT TOP 1
                    DateTime            -- NULL if not exists
                FROM MyTable
                WHERE DateTime > MT.DateTime
                ORDER BY DateTime       -- without this, TOP 1 returns an arbitrary later row
            ),
        Price,
        [Price_Next] = (
            -- first row > this row
            SELECT TOP 1
                    Price               -- NULL if not exists
                FROM MyTable
                WHERE DateTime > MT.DateTime
                ORDER BY DateTime
            )
    FROM MyTable MT

Enjoy.

Please feel free to comment, ask questions, etc.

Repeat the rows based on the Range of two columns

In simple terms, a Tally looks like this:

WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (200)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3)
SELECT *
FROM Tally;

Then I suspect what you want to do, to replace the rCTE (and some of the CROSS APPLYs), would be something like this:

WITH N AS
(SELECT N
FROM (VALUES (NULL),
(NULL),
(NULL),
(NULL),
(NULL),
(NULL),
(NULL),
(NULL),
(NULL),
(NULL)) N (N) ),
Tally AS
(SELECT TOP (200)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1,
N N2,
N N3),
cte AS
(SELECT D.Materialno_start,
D.Materialno_end,
D.Name,
D.MType,
D.Noofstock,
CONCAT(LEFT(D.Materialno_start,CHARINDEX('-',D.Materialno_start)),V.NoStart + ISNULL(T.I,0)) AS NewID
FROM dbo.[data] D
CROSS APPLY (VALUES(TRY_CONVERT(int,STUFF(D.Materialno_start,1,CHARINDEX('-',D.Materialno_start),'')),TRY_CONVERT(int,STUFF(D.Materialno_end,1,CHARINDEX('-',D.Materialno_end),'')))) V(NoStart,NoEnd)
LEFT JOIN Tally T ON T.I <= V.NoEnd - V.NoStart)
SELECT Materialno_start,
Materialno_end,
Materialno_start AS MaterialNo,
Name,
mtype,
noofstock,
NewID
FROM cte
ORDER BY cte.Materialno_start;

Create Set of IDs based on Varying ID Ranges

I use a UDF to generate ranges, but a numbers table or tally table would do the trick as well.

Declare @Table table (StartRange int,EndRange int,Date Date)
Insert into @Table values
(184,186,'1979-01-09'),
(204,207,'1979-01-09')

Select B.ID
,A.Date
From @Table A
Join (Select ID=cast(RetVal as int) from [dbo].[udf-Create-Range-Number](1,9999,1)) B
on B.ID between A.StartRange and A.EndRange
Order by B.ID,Date

Returns

ID  Date
184 1979-01-09
185 1979-01-09
186 1979-01-09
204 1979-01-09
205 1979-01-09
206 1979-01-09
207 1979-01-09

The UDF

CREATE FUNCTION [dbo].[udf-Create-Range-Number] (@R1 money,@R2 money,@Incr money)
-- Syntax: Select * from [dbo].[udf-Create-Range-Number](0,100,2)
Returns @ReturnVal Table (RetVal money)
As
Begin
    With NumbTable as (
        Select NumbFrom = @R1
        union all
        Select nf.NumbFrom + @Incr
        From NumbTable nf
        Where nf.NumbFrom + @Incr <= @R2   -- stop before overshooting @R2
    )
    Insert into @ReturnVal(RetVal)
    Select NumbFrom from NumbTable Option (maxrecursion 32767)

    Return
End
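As mentioned at the top of this answer, a tally table would do the trick as well. A rough sketch of that variant, using the same sample rows, is below; the CTE names N and Tally are illustrative, and the tally is sized at 10,000 rows to cover roughly the same 1 to 9999 span as the UDF call.

```sql
DECLARE @Table TABLE (StartRange int, EndRange int, [Date] date);
INSERT INTO @Table VALUES
    (184, 186, '1979-01-09'),
    (204, 207, '1979-01-09');

WITH N AS
(
    -- ten placeholder rows
    SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) V(N)
),
Tally AS
(
    -- 10 * 10 * 10 * 10 = 10,000 rows, numbered 1..10000
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID
    FROM N N1 CROSS JOIN N N2 CROSS JOIN N N3 CROSS JOIN N N4
)
SELECT T.ID,
       A.[Date]
FROM @Table A
JOIN Tally T
    ON T.ID BETWEEN A.StartRange AND A.EndRange
ORDER BY T.ID, A.[Date];
```

This should return the same seven rows as the UDF version, without recursion and without the MAXRECURSION ceiling.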

How do I define date range sets based on the status of a user

I haven't tested this against the full dataset, as I wasn't going to format all those dates; however, this should work:

CREATE TABLE #User_Activity (
UserID INT
,[Start] DATE
,[End] DATE
,[Status] VARCHAR(10) NULL)

INSERT INTO #User_Activity (UserID, [Start], [End], [Status])
SELECT 1, '20050101', '20060905', 'Active' UNION ALL
SELECT 1, '20060906', '20070402', 'Active' UNION ALL
SELECT 1, '20070403', '20071231', 'Inactive';
GO

WITH Grp AS (
SELECT UserID,
[Start],
ISNULL([End], CONVERT(date, GETDATE())) AS [End],
[Status],
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [Start]) -
ROW_NUMBER() OVER (PARTITION BY UserID, [Status] ORDER BY [Start]) AS Grp
FROM #User_Activity UA)
SELECT UserID,
MIN([Start]) AS MinStart,
MAX([End]) AS MaxEnd,
[Status]
FROM Grp
GROUP BY UserID, [Status], Grp;

GO
DROP TABLE #User_Activity;
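The grouping key in the query above is the difference between two row numbers, which stays constant for as long as the status does not change. The sketch below exposes the intermediate values to show why. The table variable @UA is hypothetical, and its fourth row (a second Active period) is added purely for illustration so that two separate Active "islands" appear.

```sql
-- Hypothetical data: the three rows from above plus one extra Active row.
DECLARE @UA TABLE (UserID int, [Start] date, [Status] varchar(10));
INSERT INTO @UA (UserID, [Start], [Status]) VALUES
    (1, '20050101', 'Active'),
    (1, '20060906', 'Active'),
    (1, '20070403', 'Inactive'),
    (1, '20080101', 'Active');

SELECT  [Start],
        [Status],
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [Start])           AS rn_all,
        ROW_NUMBER() OVER (PARTITION BY UserID, [Status] ORDER BY [Start]) AS rn_status,
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [Start]) -
        ROW_NUMBER() OVER (PARTITION BY UserID, [Status] ORDER BY [Start]) AS Grp
FROM @UA
ORDER BY [Start];
-- Start       Status    rn_all  rn_status  Grp
-- 2005-01-01  Active    1       1          0
-- 2006-09-06  Active    2       2          0
-- 2007-04-03  Inactive  3       1          2
-- 2008-01-01  Active    4       3          1
```

The two consecutive Active rows share Grp = 0, while the later Active row gets Grp = 1, so grouping by UserID, Status and Grp keeps the two Active periods apart.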

