How to Avoid the Warning "Firstresult/Maxresults Specified with Collection Fetch; Applying in Memory!" When Using Hibernate

How can I avoid the Warning firstResult/maxResults specified with collection fetch; applying in memory! when using Hibernate?

Reason for this warning is that when fetch join is used, order in result sets is defined only by ID of selected entity (and not by join fetched).

If this sorting in memory is causing problems, do not use firsResult/maxResults with JOIN FETCH.

org.hibernate.hql.ast.QueryTranslatorImpl list ATTENTION: firstResult/maxResults specified with collection fetch; applying in memory

TL;DR Hibernate doesn't know how many rows of the flattened, joined query it needs to get the specified number of the Order objects, so it has to load the whole query in memory. See below for an explanation.

To understand why Hibernate does this, you need to understand how Hibernate does the ORM (Object-Relational Mapping) involved for JPA Entities.

Consider a simplified set of Entities for your order. Class Order contains 2 fields: number and customerId and a list of order lines. Class OrderLine contains productCode and quantity fields, as well as a uid key and a reference to the parent Order.

These classes may be defined thus:

@Entity
@Table(name = "ORDER")
public class Order {
@ID
@Column(name = "NUMBER")
private Integer number;
@Column(name = "CUSTOMER_ID")
private Integer customerId;
@OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
@OrderBy
private List<OrderLine> orderLineList;

.... // Rest of the class
}

@Entity
@Table(name = "ORDER_LINE")
public class OrderLine
{
@ID
@Column(name = "UID")
private Integer uid;
@Column(name = "PRODUCT_CODE")
private Integer productCode;
@Column(name = "QUANTITY")
private Integer quantity;
@Column(name = "ORDER_NUMBER")
private Integer orderNumber;
@ManyToOne(fetch = FetchType.LAZY, optional = false)
@JoinColumn(name = "ORDER_NUMBER", referencedColumnName = "NUMBER", insertable = false, updatable = false)
private Order order;

.... // Rest of the class
}

Now, if you performed the following JPQL query on these Entities:

SELECT o FROM Order o LEFT JOIN FETCH o.orderLineList

then Hibernate performs this query as a 'flattened' SQL query similar to the following:

SELECT o.number, o.customer_id, ol.uid, ol.product_code, ol.quantity, ol.order_number
FROM order o LEFT JOIN order_line ol ON order_line.order_number = order.number

which would give a result like this:

| o.number | o.customer_id | ol.uid | ol.product_code | ol.quantity |
|==========|===============|========|=================|=============|
| 1 | 123 | 1 | 1111 | 5 |
| 1 | 123 | 2 | 1112 | 6 |
| 1 | 123 | 3 | 1113 | 1 |
| 2 | 123 | 4 | 1111 | 2 |
| 2 | 123 | 5 | 1112 | 7 |
| 3 | 123 | 6 | 1111 | 6 |
| 3 | 123 | 7 | 1112 | 5 |
| 3 | 123 | 8 | 1113 | 3 |
| 3 | 123 | 9 | 1114 | 2 |
| 3 | 123 | 10 | 1115 | 9 |
...etc

which Hibernate would use to 'reconstruct' Order objects with attached lists of OrderLine sub-objects.

However, since the number of order lines per order is random, there is no way for Hibernate to know how many rows of this query to take to get the specified maximum number of Order objects required. So it has to take the whole query and build up the objects in memory until it has the right amount, before discarding the rest of the result set. The log warning it produces alludes to this:

ATTENTION: firstResult/maxResults specified with collection fetch; applying in memory!

We're only just discovering now that these queries can have a significant impact on server memory use, and we've had issues with our server falling over with out of memory errors when these queries are attempted.

By the way, I will say now that this is mostly just theory on my part and I have no idea how the actual Hibernate code works. Most of this you can glean from the logs when you have Hibernate logging the SQL statements it generates.


UPDATE:
Recently I have discovered a little 'gotcha' with the above.

Consider a third Entity called Shipment which is for one or more Lines of an Order.

The Shipment entity would have a @ManyToOne association to the Order entity.

Let's say you have 2 Shipments for the same Order which has 4 Lines.

If you perform a JPQL query for the following:

SELECT s FROM Shipment s LEFT JOIN s.order o LEFT JOIN FETCH o.orderLineList

You would expect (or at least I did) to get 2 shipment objects back, each with a reference to the same Order object, which itself would contain the 4 Lines.

Nope, wrong again! In fact, you get 2 Shipment objects, each referring to the same Order object, which contains 8 Lines! Yes, the Lines get duplicated in the Order! And yes, that is even if you specify the DISTINCT clause.

If you research this issue here on SO or elsewhere (most notably the Hibernate forums), you'll find that this is actually a feature not a bug, according to the Hibernate powers that be. Some people actually want this behaviour!

Go figure.

Why does Hibernate ignore setMaxResults?

You have to understand that the first/max results apply on entity level if you select an entity. If you fetch join collection attributes or use an entity graph for collection attributes you change the cardinality of the rows returned by JDBC for each entity i.e. every row for the main entity row is duplicated for every collection element. The effect of that is, that Hibernate can't do pagination with ROWNUM anymore which is why you are not seeing it in the query. If you remove the fetch join you will see the use of ROWNUM.

Having said that, this is a perfect use case for Blaze-Persistence.

Blaze-Persistence is a query builder on top of JPA which supports many of the advanced DBMS features on top of the JPA model. The pagination support it comes with handles all of the issues you might encounter.

It also has a Spring Data integration, so you can use the same code like you do now, you only have to add the dependency and do the setup: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-setup

Blaze-Persistence has many different strategies for pagination which you can configure. The default strategy is to inline the query for ids into the main query. Something like this:

select u 
from User u
left join fetch u.notes
where u.id IN (
select u2.id
from User u2
order by ...
limit ...
)
order by ...


Related Topics



Leave a reply



Submit