Clean Way to Use PostgreSQL Window Functions in Django ORM

Clean way to use postgresql window functions in django ORM?

Since Django 2.0, window functions are built into the ORM. See the window functions section of the Django documentation.

# models.py
from django.db import models

class GameScore(models.Model):
    user_id = models.IntegerField()
    score = models.IntegerField()

# window function usage
from django.db.models import F
from django.db.models.expressions import Window
from django.db.models.functions import Rank

# rank each score within its user's partition, highest score first
GameScore.objects.annotate(rank=Window(
    expression=Rank(),
    order_by=F('score').desc(),
    partition_by=[F('user_id')]))

# generated sql
SELECT "myapp_gamescore"."id",
       "myapp_gamescore"."user_id",
       "myapp_gamescore"."score",
       RANK() OVER (
           PARTITION BY "myapp_gamescore"."user_id"
           ORDER BY "myapp_gamescore"."score" DESC
       ) AS "rank"
FROM "myapp_gamescore"
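
To make the semantics of that query concrete, here is a minimal plain-Python sketch of what RANK() OVER (PARTITION BY user_id ORDER BY score DESC) computes. It is only an illustration and does not use Django; the function name rank_within_partitions and the sample rows are made up for this example.

```python
from collections import defaultdict


def rank_within_partitions(rows):
    """Return {row id: rank}, ranking by score (descending) inside each user_id group.

    Ties share a rank and the following rank is skipped, which is how
    SQL RANK() behaves (DENSE_RANK() would not skip).
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["user_id"]].append(row)

    ranks = {}
    for partition in partitions.values():
        ordered = sorted(partition, key=lambda r: r["score"], reverse=True)
        previous_score = None
        current_rank = 0
        for position, row in enumerate(ordered, start=1):
            # A new score value takes its 1-based position as rank;
            # an equal score reuses the previous rank.
            if row["score"] != previous_score:
                current_rank = position
                previous_score = row["score"]
            ranks[row["id"]] = current_rank
    return ranks


# Hypothetical sample data standing in for GameScore rows.
scores = [
    {"id": 1, "user_id": 1, "score": 50},
    {"id": 2, "user_id": 1, "score": 70},
    {"id": 3, "user_id": 1, "score": 70},
    {"id": 4, "user_id": 2, "score": 10},
]
ranks = rank_within_partitions(scores)
```

The two tied scores of user 1 both get rank 1 and the next score gets rank 3, while user 2 starts again at rank 1 because the partition resets the ranking.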

Ranking players using Django ORM (Postgres - RANK)

Django 1.10 and earlier cannot express window functions in pure Python.

However, you can combine Python code and raw SQL to get the result you need.

Using window functions for this use case is already explained on Stack Overflow, please take a look: Clean way to use postgresql window functions in django ORM?

Using window functions in an update statement

The error comes from PostgreSQL, not Django. You can rewrite the statement as:

WITH v_table_name AS
(
    SELECT row_number() OVER (PARTITION BY col2 ORDER BY col3) AS rn, primary_key
    FROM table_name
)
-- the target column in SET must not be qualified with the table name
UPDATE table_name SET col1 = v_table_name.rn
FROM v_table_name
WHERE table_name.primary_key = v_table_name.primary_key;

Or alternatively:

-- the target column in SET must not be qualified with the table name
UPDATE table_name SET col1 = v_table_name.rn
FROM
(
    SELECT row_number() OVER (PARTITION BY col2 ORDER BY col3) AS rn, primary_key
    FROM table_name
) AS v_table_name
WHERE table_name.primary_key = v_table_name.primary_key;

This works; I just tested it on PostgreSQL 9.6. See the PostgreSQL documentation for the UPDATE syntax (the optional FROM clause).
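
If you want to try the idea without a PostgreSQL server, the sketch below reproduces it with Python's built-in sqlite3 module. Treat it as an approximation under stated assumptions: SQLite stands in for PostgreSQL, the sample rows are invented, and a correlated subquery replaces the FROM clause because older SQLite builds do not support UPDATE ... FROM. It needs an SQLite build with window functions (3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name ("
    "primary_key INTEGER PRIMARY KEY, col1 INTEGER, col2 TEXT, col3 INTEGER)"
)
# Invented sample rows: (primary_key, col2, col3); col1 starts out NULL.
conn.executemany(
    "INSERT INTO table_name (primary_key, col2, col3) VALUES (?, ?, ?)",
    [(1, "a", 30), (2, "a", 10), (3, "b", 5), (4, "a", 20)],
)

# The window function lives in the CTE, not in the SET clause itself.
# Each row then looks up its own row number by primary key.
conn.execute(
    """
    WITH v_table_name AS (
        SELECT row_number() OVER (PARTITION BY col2 ORDER BY col3) AS rn,
               primary_key
        FROM table_name
    )
    UPDATE table_name
    SET col1 = (
        SELECT rn FROM v_table_name
        WHERE v_table_name.primary_key = table_name.primary_key
    )
    """
)

# Map primary_key -> assigned row number.
numbered = dict(conn.execute("SELECT primary_key, col1 FROM table_name"))
```

Within partition "a" the rows are numbered by ascending col3, and partition "b" starts again at 1.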

Hope this helps.

Fastest way to run many startswith queries in django

A more efficient way is to include all of the filters in a single database query.

To keep your code portable and easy to understand, I'd recommend avoiding raw SQL if possible.

In the Django ORM, you can combine your filters with a series of Q objects.

Refactoring your loop to use Q objects "OR"ed together would look like this:

from django.db.models import Q

# build one filter: prefix1 OR prefix2 OR ...
query_filters = Q()
for item_number_prefix in item_number_prefixes:
    query_filters |= Q(item_number__startswith=item_number_prefix)

queryset = Item.objects.filter(query_filters).values('item_number', 'manufacturer')
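
For intuition, the single query this produces is roughly a chain of LIKE conditions joined with OR. The sketch below builds such a query by hand with the standard sqlite3 module; it is not what Django emits verbatim, and the table name item, the helper names and the sample rows are assumptions made for this example. One design difference is deliberate: with an empty prefix list the Q() in the loop above stays empty and does not restrict the queryset at all, whereas this helper returns no rows.

```python
import sqlite3


def escape_like(value):
    """Escape LIKE wildcards so a prefix is matched literally."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_prefix_query(prefixes):
    """Return (sql, params) matching rows whose item_number starts with any prefix."""
    if not prefixes:
        # No prefixes: match nothing instead of everything.
        return "SELECT item_number, manufacturer FROM item WHERE 0", []
    clause = " OR ".join(["item_number LIKE ? ESCAPE '\\'"] * len(prefixes))
    params = [escape_like(prefix) + "%" for prefix in prefixes]
    sql = (
        "SELECT item_number, manufacturer FROM item WHERE "
        + clause
        + " ORDER BY item_number"
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_number TEXT, manufacturer TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [("AB100", "Acme"), ("AB200", "Acme"), ("CD300", "Bolt"),
     ("EF400", "Cog"), ("A_1", "Dyn"), ("AX1", "Dyn")],
)

# One round trip to the database for all prefixes.
sql, params = build_prefix_query(["AB", "CD"])
matches = conn.execute(sql, params).fetchall()

# The underscore is escaped, so "A_" does not match "AX1".
literal_matches = conn.execute(*build_prefix_query(["A_"])).fetchall()
no_matches = conn.execute(*build_prefix_query([])).fetchall()
```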

Annotate QuerySet with first value of ordered related model

Perhaps using .raw isn't such a bad idea. Looking at the code of the Window class, we can see that it essentially composes an SQL query to achieve the "windowing".

An easy way out may be the architect module, which, according to its documentation, adds partitioning functionality for PostgreSQL.

Another module that claims to bring window functionality to Django < 2.0 is django-query-builder, which adds a partition_by() queryset method that can be used together with order_by:

query = Query().from_table(
    Order,
    ['*', RowNumberField(
        'revenue',
        over=QueryWindow().order_by('margin')
                          .partition_by('account_id')
    )]
)
query.get_sql()
# SELECT tests_order.*, ROW_NUMBER() OVER (PARTITION BY account_id ORDER BY margin ASC) AS revenue_row_number
# FROM tests_order

Finally, you can always copy the Window class source code into your project or use this alternate Window class code.

Efficient way to update multiple fields of Django model object

You can update a row in the database without fetching and deserializing it; update() can do it. E.g.:

User.objects.filter(id=data['id']).update(email=data['email'], phone=data['phone'])

This will issue one SQL update statement, and is much faster than the code in your post. It will never fetch the data or waste time creating a User object.

You cannot, though, send a whole batch of update data to the SQL database and ask it to map it to different rows in one go. If you need a massive update like that done very quickly, your best bet is probably to insert the data into a separate table and then update the target table from a select on that table. The Django ORM does not support this, as far as I can tell.
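
The "separate table" idea can be sketched outside the ORM like this. It is a rough illustration with the standard sqlite3 module, not Django code: the table names users and user_updates, the columns and the sample data are all assumptions, and on PostgreSQL you would more likely write the final statement as UPDATE ... FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "a@old.example", "111"),
     (2, "b@old.example", "222"),
     (3, "c@old.example", "333")],
)

# Step 1: load all new values into a staging table with one bulk insert.
new_data = [(1, "a@new.example", "911"), (3, "c@new.example", "933")]
conn.execute(
    "CREATE TEMP TABLE user_updates (id INTEGER PRIMARY KEY, email TEXT, phone TEXT)"
)
conn.executemany("INSERT INTO user_updates VALUES (?, ?, ?)", new_data)

# Step 2: a single UPDATE maps each staged row onto its target row.
# The WHERE clause leaves users without a staged row untouched.
conn.execute(
    """
    UPDATE users
    SET email = (SELECT u.email FROM user_updates AS u WHERE u.id = users.id),
        phone = (SELECT u.phone FROM user_updates AS u WHERE u.id = users.id)
    WHERE id IN (SELECT id FROM user_updates)
    """
)

rows = conn.execute("SELECT id, email, phone FROM users ORDER BY id").fetchall()
```

Users 1 and 3 receive their new values in one statement, and user 2 keeps the old ones.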

Dependant subqueries in Django

If the dataset is small, you could make multiple passes, but that is not a very good idea.

Here are a couple of options, depending on the database you are using.

1) You can change the SQL to use a window function such as rank or dense_rank to make the query much simpler.

select decision.id, decision.content_type_id, decision.object_id,
       first_value(last_decision.date_taken)
           over (partition by decision.id, decision.content_type_id, decision.object_id
                 order by last_decision.date_taken desc
           )
from canvasblocks_decision AS decision
...

2) You could put the same logic in an annotation to get the rank. That way you keep everything your Django object gives you and get this extra column.

from django.db.models.expressions import RawSQL

# column names follow the SQL above; adjust them to your schema
Decision.objects.annotate(rank=RawSQL(
    "RANK() OVER (PARTITION BY id, content_type_id, object_id "
    "ORDER BY date_taken DESC)", []))

..

This might help: https://stackoverflow.com/a/35948419/237939


