Optimizing Database Queries in Django Rest Framework

Optimizing database queries in Django REST framework

Django REST Framework cannot automatically optimize queries for you, in the same way that Django itself won't. There are places you can look at for tips, including the Django documentation. It has been mentioned that Django REST Framework should automatically, though there are some challenges associated with that.

This question is very specific to your case, where you are using a custom SerializerMethodField that makes a request for each object that is returned. Because you are making a new request (using the Friends.objects manager), it is very difficult to optimize the query.

You can make the problem better though, by not creating a new queryset and instead getting the friend count from other places. This will require a backwards relation to be created on the Friendship model, most likely through the related_name parameter on the field, so you can prefetch all of the Friendship objects. But this is only useful if you need the full objects, and not just a count of the objects.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        return User.objects.all().prefetch_related("friends")

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        friends = set(friend.from_friend_id for friend in obj.friends)

        if request.user != obj and request.user.id in friends:
            return True
        else:
            return False

If you just need a count of the objects (similar to using queryset.count() or queryset.exists()), you can include annotate the rows in the queryset with the counts of reverse relationships. This would be done in your get_queryset method, by adding .annotate(friends_count=Count("friends")) to the end (if the related_name was friends), which will set the friends_count attribute on each object to the number of friends.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        from django.db.models import Count

        return User.objects.all().annotate(friends_count=Count("friends"))

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        if request.user != obj and obj.friends_count > 0:
            return True
        else:
            return False

Both of these solutions will avoid N+1 queries, but the one you pick depends on what you are trying to achieve.

Optimize Django Rest ORM queries

There's no silver bullet without looking at the code (and the profiling results) in detail.

The only thing that is a no-brainer is enforcing relationships in the models and in the database. This prevents a whole host of bugs, encourages the use of standardized, performant access (rather than concocting SQL on the spot which more often than not is likely to be buggy and slow) and makes your code both shorter and a lot more readable.

Other than that, 50-60 queries can be a lot (if you could do the same job with one or two) or it can be just right - it depends on what you achieve with them.

The use of prefetch_related and select_related is important, yes – but only if used correctly; otherwise it can slow you down instead of speeding you up.

Nested serializers are the correct approach if you need the data – but you need to set up your querysets properly in your viewset if you want them to be fast.

Time the main parts of slow views, inspect the SQL queries sent and check if you really need all data that is returned.

Then you can look at the sore spots and gain time where it matters. Asking specific questions on SO with complete code examples can also get you far fast.

If you have just one top-level object, you can refine the approach offered by @jensmtg, doing all the prefetches that you need at that level and then for the lower levels just using ModelSerializers (not SerializerMethodFields) that access the prefetched objects. Look into the Prefetch object that allows nested prefetching.

But be aware that prefetch_related is not for free, it involves quite some processing in Python; you may be better off using flat (db-view-like) joined queries with values() and values_list.

Optimizing Database Queries in Django Rest Framework

Optimizing database queries in Django REST framework

Optimize Django Rest ORM queries

Related Topics

Leave a reply