Time Based Priority in Active Record Query

Unlike some other databases (such as Oracle), PostgreSQL has a fully functional boolean type. You can use it directly in an ORDER BY clause without wrapping it in a CASE expression - CASE is better saved for more complex situations.

Sort order for boolean values is:

FALSE -> TRUE -> NULL
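
You can see this with a throwaway VALUES list (a minimal sketch, no table required):

SELECT b
FROM (VALUES (true), (false), (null::boolean)) t(b)
ORDER BY b;
-- returns: false, true, null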

If you ORDER BY bool_expression DESC, you invert the order to:

NULL -> TRUE -> FALSE

If you want TRUE first and NULL last, use the NULLS LAST clause of ORDER BY:

ORDER BY (featured AND created_at > now() - interval '11 days') DESC NULLS LAST  
, created_at DESC

Of course, NULLS LAST is only relevant if featured or created_at can be NULL. If the columns are defined NOT NULL, then don't bother.

Also, FALSE would be sorted before NULL. If you don't want to distinguish between these two, you are either back to a CASE statement, or you can throw in NULLIF() or COALESCE().

ORDER BY NULLIF(featured AND created_at > now() - interval '11 days', FALSE) DESC NULLS LAST
       , created_at DESC
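
The COALESCE() route does the same job by folding NULL into FALSE, so the two sort together at the end without needing NULLS LAST (a sketch over the same assumed columns):

ORDER BY COALESCE(featured AND created_at > now() - interval '11 days', FALSE) DESC
       , created_at DESC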

Performance

Note how I used:

created_at > now() - interval '11 days'

and not:

now() - created_at < interval '11 days'

In the first form, the expression to the right of the operator is a constant that is calculated once. An index can then be used to look up matching rows. Very efficient.

The latter form usually cannot use an index: a value has to be computed from every single row before it can be checked against the constant expression to the right. Don't do this if you can avoid it. Ever!
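
As a sketch, assuming a news table with an index on created_at:

CREATE INDEX news_created_at_idx ON news (created_at);

-- sargable: the right side is computed once, the index can be scanned
SELECT * FROM news WHERE created_at > now() - interval '11 days';

-- not sargable: the left side must be computed for every row
SELECT * FROM news WHERE now() - created_at < interval '11 days';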

How to sort an ActiveRecord query by specific priority

The argument to order can be any SQL snippet. You can use a CASE expression to map your values to values that sort naturally in the correct order.

Assignment.order("
  CASE
    WHEN priority = 'best' THEN '1'
    WHEN priority = 'good' THEN '2'
    WHEN priority = 'bad'  THEN '3'
  END")

Even better, you could move this logic to the model so that it's easier to call from controllers:

class Assignment < ActiveRecord::Base
  ...
  def self.priority_order
    order("
      CASE
        WHEN priority = 'best' THEN '1'
        WHEN priority = 'good' THEN '2'
        WHEN priority = 'bad'  THEN '3'
      END")
  end
end

Then you can just call Assignment.priority_order to get your sorted records.

If this column is sortable in the view, add a parameter to the method for direction:

def self.priority_order(direction = "ASC")
  # Prevent injection by making sure the direction is either ASC or DESC
  # (to_s guards against a nil param from the controller)
  direction = "ASC" unless direction.to_s.upcase.match(/\ADESC\z/)
  order("
    CASE
      WHEN priority = 'best' THEN '1'
      WHEN priority = 'good' THEN '2'
      WHEN priority = 'bad'  THEN '3'
    END #{direction}")
end

Then, you would call Assignment.priority_order(params[:direction]) to pass in the sorting from the controller.

Select records based on column priority

Your query is nearly correct. Just use PRODUCTID and not ID.

SELECT *
FROM xx
WHERE f_COLOUR = 'GREEN'
UNION
SELECT *
FROM xx
WHERE PRODUCTID NOT IN
      (SELECT PRODUCTID
       FROM xx
       WHERE f_COLOUR = 'GREEN');
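
A quick sanity check with hypothetical data:

CREATE TABLE xx (PRODUCTID int, f_COLOUR varchar(10));
INSERT INTO xx VALUES (1, 'GREEN'), (1, 'RED'), (2, 'BLUE');

-- returns (1, 'GREEN') from the first branch, plus (2, 'BLUE') from the
-- second branch, because product 2 has no GREEN row at all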

sql server order by date and priority

You need to cast LastModifiedDate to a date in your ORDER BY to strip the time element from the ordering.

Updated to handle the case where you have multiple records with the same priority on the same day, and want those ordered by the time.

SELECT
     [Id]
    ,[Priority]
    ,[LastModifiedDate]
FROM [News]
ORDER BY CAST(LastModifiedDate AS DATE) DESC, Priority DESC, LastModifiedDate
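
To illustrate with made-up rows (hypothetical data, not from the original question):

-- Id  Priority  LastModifiedDate
-- 1   5         2015-06-01 09:00
-- 2   9         2015-06-01 17:00
-- 3   9         2015-05-30 12:00
--
-- Result order: 2, 1, 3 - rows from the same calendar day sort by
-- Priority first, and the raw LastModifiedDate only breaks remaining ties.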

Rails - order on column's values (priority column)

A simple CASE expression can do the trick (PostgreSQL syntax used here):

scope :urgency_ordered, -> {
  order(<<-SQL)
    CASE tasks.urgency
      WHEN 'HIGH'   THEN 'a'
      WHEN 'MEDIUM' THEN 'b'
      WHEN 'LOW'    THEN 'c'
      ELSE 'z'
    END ASC,
    id ASC
  SQL
}

Call it this way:

Task.urgency_ordered

query linear timeline result based on itemevents date ranges and highest priority, help improve existing query

A rather late answer that I came up with on my own a while ago - better late than never. After upgrading PostgreSQL to version 9.3, I had the opportunity and the need to try out range types. I came up with some ideas and revisited an old problem with a (later improved, but still) slow query.

I wrote a function that takes two arrays of equal element count as input. The first is a bigint[] of event_id values; the second is a daterange[] holding each event's active period. Event priority is implemented through the order of the aggregated data: the first event_id in the array "takes" - reserves - its active daterange from the timeline, the second event_id reserves the range or ranges not already occupied by the first, the third takes what is left over from the previous two, and so on.

The function returns a bigint event_id and the array of ranges each event could acquire. In the end, this approach made the query more than 11 times faster on large datasets, reducing execution time from a couple of minutes to several seconds.

Example Function:

CREATE OR REPLACE FUNCTION a2_daterange_timeline(IN i_id bigint[], IN i_ranges daterange[])
  RETURNS TABLE(id bigint, ranges daterange[]) AS
$BODY$
declare
    r record;
    base_range daterange;

    unoccupied_ranges      daterange[];  -- free slices of the timeline
    unoccupied_ranges_temp daterange[];

    target_range daterange;

    overlap  boolean;
    to_right boolean;
    to_left  boolean;

    uno daterange[];  -- what stays free after the current event claims its slice
    ocu daterange[];  -- what the current event actually occupies

    iii integer;
begin
    overlap  := false;
    to_right := false;
    to_left  := false;

    -- the whole timeline is initially unoccupied
    base_range := '[2000-01-01,3000-01-01]'::daterange;
    unoccupied_ranges := array[base_range];

    -- events arrive in priority order; each claims whatever is still free
    FOR r IN SELECT unnest(i_id) id, unnest(i_ranges) zz
    LOOP
        -- stop early once the whole timeline is occupied
        exit when unoccupied_ranges = array[]::daterange[];

        unoccupied_ranges_temp := array[]::daterange[];

        FOR iii IN 1..array_upper(unoccupied_ranges, 1) LOOP
            target_range := r.zz::daterange;
            overlap  := target_range && unoccupied_ranges[iii];
            to_right := not target_range &< unoccupied_ranges[iii];  -- extends past the right edge
            to_left  := not target_range &> unoccupied_ranges[iii];  -- extends past the left edge

            uno := case
                when not overlap
                    then array[unoccupied_ranges[iii]]
                when overlap and not to_right and not to_left
                    then array[daterange(lower(unoccupied_ranges[iii]), lower(target_range), '[)'),
                               daterange(upper(target_range), upper(unoccupied_ranges[iii]), '[)')]
                when overlap and to_right and not to_left
                    then array[daterange(lower(unoccupied_ranges[iii]), lower(target_range), '[)')]
                when overlap and not to_right and to_left
                    then array[daterange(upper(target_range), upper(unoccupied_ranges[iii]), '[)')]
                when overlap and to_right and to_left
                    then array[]::daterange[]
            end;

            unoccupied_ranges_temp := array_cat(unoccupied_ranges_temp, uno);

            ocu := case
                when not overlap
                    then array[]::daterange[]
                when overlap and not to_right and not to_left
                    then array[target_range]
                when overlap and to_right and not to_left
                    then array[daterange(lower(target_range), upper(unoccupied_ranges[iii]), '[)')]
                when overlap and not to_right and to_left
                    then array[daterange(lower(unoccupied_ranges[iii]), upper(target_range), '[)')]
                when overlap and to_right and to_left
                    then array[unoccupied_ranges[iii]]
            end;

            ranges := ocu;

            if not ranges = array[]::daterange[] then
                id := r.id;
                return next;
            end if;
        END LOOP;

        unoccupied_ranges := unoccupied_ranges_temp;
    END LOOP;
    RETURN;
end;
$BODY$
  LANGUAGE plpgsql IMMUTABLE
  COST 100
  ROWS 20;
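
For a standalone test, the function can be called directly (a minimal sketch with two made-up events; the first array element has the higher priority):

SELECT * FROM a2_daterange_timeline(
    array[1, 2]::bigint[],
    array['[2015-01-01,2015-01-10]', '[2015-01-05,2015-01-20]']::daterange[]
);
-- event 1 (listed first) claims its full range;
-- event 2 receives only the part that event 1 left free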

Resulting query:

drop table if exists temp_box;
create temp table temp_box(person_id integer, event_id integer, event_description text, priority integer, date_from date, date_to date);
insert into temp_box values(333,   1, 'white shirt', 10, '2015-01-01', '3000-01-01');
insert into temp_box values(333,  22, 'green shirt',  8, '2015-01-05', '2015-01-20');
insert into temp_box values(333,  13, 'red shirt',    7, '2015-02-03', '2015-05-10');
insert into temp_box values(333,   2, 'grey shirt',   6, '2015-02-11', '2015-04-01');
insert into temp_box values(333, 104, 'blue blouse',  4, '2015-03-01', '2015-03-11');
insert into temp_box values(333,   6, 'nothing',      2, '2015-04-10', '2015-04-12');

with
a as (select * from temp_box order by person_id, priority)
-- order by person and event priority
, b as (select person_id
             , array_agg(event_id order by priority) a  -- array position encodes priority
             , array_agg(daterange(coalesce(date_from, '2000-01-01'), coalesce(date_to, '3000-01-01'), '[]') order by priority) b
        from a
        group by person_id)
-- aggregate events into arrays to pass to the function
, c as (select (a2_daterange_timeline(a, b)).* from b)
-- compute the available dateranges for each event
, d as (select id as r_id, unnest(ranges) as ranges from c)
-- unnest the function result into individual daterange slices
, e as (select *, row_number() over (partition by temp_box.person_id order by ranges) zz
        from temp_box
        left join d on temp_box.event_id = d.r_id
        where upper(ranges) - 1 >= '2015-01-01' and lower(ranges) < '2015-01-01'::date + 365
        order by ranges)
-- join the computed data back to the initial dataset and filter the desired period
select
    person_id,
    event_id,
    event_description || ' (from ' || to_char(lower(ranges), 'yyyy.MM.dd') || ' to ' || to_char(upper(ranges) - 1, 'yyyy.MM.dd') || ')' as info
from e;

With some modifications, this approach can be used with other range types as well. I hope this answer is helpful to someone else, too.

How to give priority to certain queries?

Once a query has begun execution, it cannot be paused or interrupted. The only exception is at the DB administration level, where you can essentially force the query to stop (think of it as killing a running process in Windows, if you will). However, you don't want to do that, so forget it.

Your best option would be a LOW_PRIORITY chunked operation. If the LOW_PRIORITY query takes too long to execute, think about ways to split it up so each piece runs quickly, without creating orphaned or invalid data in the database.

A very basic use case: imagine an insert of 10,000 new rows. By "chunking" the insert so that it runs multiple times with smaller data sets (e.g. 500 rows at a time), each run completes more quickly, allowing any non-LOW_PRIORITY operations to be executed in a more timely manner.
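
For example, a chunked low-priority delete (a sketch, assuming a hypothetical big_table; the application repeats the statement until zero rows are affected):

DELETE LOW_PRIORITY FROM big_table
WHERE created_at < '2020-01-01'
LIMIT 500;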

How To

Setting something as low priority is as simple as adding in the LOW_PRIORITY flag.

INSERT LOW_PRIORITY INTO xxx (a, b, c) VALUES (1, 2, 3);

UPDATE LOW_PRIORITY xxx SET a = b;

DELETE LOW_PRIORITY FROM xxx WHERE a = 'value';

Only update one row per group of peers

Is there a way to re-evaluate a sub-query once per row?

Yes, you can do that with a subquery expression in the assignment - a "correlated subquery". Like:

UPDATE t SET col = (SELECT col FROM t2 WHERE t2.id = t.id);

But that's not going to solve your problem at all, as every re-evaluation still sees the same snapshot of the underlying tables from the start of the UPDATE command. You would have to update one row at a time, which is comparatively expensive.
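
To see the snapshot behavior concretely (a sketch with made-up data):

CREATE TEMP TABLE t (id int, col int);
INSERT INTO t VALUES (1, 1), (2, 2);

UPDATE t SET col = (SELECT sum(col) FROM t);
-- every row gets 3, the sum as of the start of the UPDATE -
-- not 3 for the first row and then 5 for the second,
-- as a true row-by-row re-evaluation would yield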

This does something like you ask for:

UPDATE newtable t
SET    name = u.new_name
FROM  (
   SELECT DISTINCT ON (new_name) *  -- one per group, unchanged first
   FROM  (
      SELECT id, name
           , translate(regexp_replace(trim(name), '\s\s+', ' ', 'g')
                     , 'ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
                     , 'acelnoszzACELNOSZZ') AS new_name
      FROM   newtable
      ) sub
   ORDER  BY new_name, name <> new_name, id
   ) u
WHERE  u.id = t.id
AND    u.name <> u.new_name;  -- only if actual change

The subquery u picks one row from every set that would end up with the same name, using DISTINCT ON. See:

  • Select first row in each GROUP BY group?

Suspecting a UNIQUE constraint on name, I prioritize rows that don't change at all. That's done by name <> new_name, where false sorts before true. See:

  • Time based priority in Active Record Query

Only the one row per group is updated - and only if it actually changes. This way, name stays UNIQUE - as long as there are no concurrent writes that might interfere.

It's unclear whether that's what you actually need; I filled in gaps in the task description with educated guesses.

While at it, I made the string expression a lot cheaper.


