How to Avoid Unnecessary Firestore Reads with Cache

How to avoid unnecessary Firestore reads with Cache

Firestore get() always tries to get the data from the SERVER first, even if I add CACHE to my query. How can I sync only the updated documents?

According to the official documentation regarding enabling offline data:

For Android and iOS, offline persistence is enabled by default.

So to use offline persistence, you don't need to make ("add") any changes to your code in order to be able to use and access Cloud Firestore data.

Also according to the official documentation regarding Query's get() method:

By default, get() attempts to provide up-to-date data when possible by waiting for data from the server, but it may return cached data or fail if you are offline and the server cannot be reached. This behavior can be altered via the Source parameter.

So this is normal behavior. Starting with the 16.0.0 SDK (released 2018-06-13), it is possible to specify the source from which we want to get data. We can achieve this with the help of the DocumentReference.get(Source source) and Query.get(Source source) methods.

As you can see, we can now pass the source as an argument to get() on a DocumentReference or a Query, so we can force the retrieval of data from the server only, from the cache only, or attempt the server and fall back to the cache.

So something like this is now possible:

yourDocRef.get(Source.SERVER).addOnSuccessListener(new OnSuccessListener<DocumentSnapshot>() {
    @Override
    public void onSuccess(DocumentSnapshot documentSnapshot) {
        //Get data from the documentSnapshot object
    }
});

In this case, we force the data to be retrieved from the server only. If you want to force the data to be retrieved from the cache only, pass Source.CACHE as the argument to the get() method. I also wrote an article that can help you understand the concept more clearly:

How to drastically reduce the number of reads when no documents are changed in Firestore?
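
To make the idea of choosing a source concrete, here is a minimal, self-contained sketch of a "cache first, server only on a miss" read. It is only a model: ReadSource, CacheFirstReader and the two maps are hypothetical stand-ins I made up for illustration, not Firestore classes, and the read counter only approximates how billed reads would behave.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins for illustration; these are NOT Firestore classes.
enum ReadSource { CACHE, SERVER }

class CacheFirstReader {
    private final Map<String, String> cache = new HashMap<>(); // local cache
    private final Map<String, String> server;                  // pretend backend
    private int serverReads = 0;                               // "billed" reads in this model

    CacheFirstReader(Map<String, String> server) {
        this.server = server;
    }

    // Reads from exactly one source, mirroring the idea of get(Source).
    Optional<String> get(String id, ReadSource source) {
        if (source == ReadSource.CACHE) {
            return Optional.ofNullable(cache.get(id));
        }
        serverReads++;
        String value = server.get(id);
        if (value != null) {
            cache.put(id, value); // a server read refreshes the cache
        }
        return Optional.ofNullable(value);
    }

    // Tries the cache first and falls back to the server only on a miss.
    Optional<String> getCacheFirst(String id) {
        Optional<String> cached = get(id, ReadSource.CACHE);
        return cached.isPresent() ? cached : get(id, ReadSource.SERVER);
    }

    int getServerReads() {
        return serverReads;
    }
}
```

In this model, a second getCacheFirst() call for the same id does not touch the server at all. The price is that a cached value is never refreshed, so this only makes sense for data that rarely changes or when combined with some way of detecting changes.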

If you want to get only updated documents, you can view changes between snapshots. An example from the official documentation would be:

db.collection("cities")
    .whereEqualTo("state", "CA")
    .addSnapshotListener(new EventListener<QuerySnapshot>() {
        @Override
        public void onEvent(@Nullable QuerySnapshot snapshots,
                            @Nullable FirebaseFirestoreException e) {
            if (e != null) {
                Log.w(TAG, "listen:error", e);
                return;
            }

            for (DocumentChange dc : snapshots.getDocumentChanges()) {
                switch (dc.getType()) {
                    case ADDED:
                        Log.d(TAG, "New city: " + dc.getDocument().getData());
                        break;
                    case MODIFIED:
                        Log.d(TAG, "Modified city: " + dc.getDocument().getData());
                        break;
                    case REMOVED:
                        Log.d(TAG, "Removed city: " + dc.getDocument().getData());
                        break;
                }
            }
        }
    });

See the switch statement for every particular case? The second case (MODIFIED) will help you get only the updated data.
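
To show what you would typically do with those three cases, here is a small self-contained sketch that applies a batch of changes to an in-memory map. ChangeType, Change and LocalCities are hypothetical names used only for this illustration; in the real SDK their roles are played by DocumentChange.Type and DocumentChange.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins; not part of the Firestore SDK.
enum ChangeType { ADDED, MODIFIED, REMOVED }

class Change {
    final ChangeType type;
    final String id;
    final String data;

    Change(ChangeType type, String id, String data) {
        this.type = type;
        this.id = id;
        this.data = data;
    }
}

class LocalCities {
    private final Map<String, String> byId = new LinkedHashMap<>();

    // Applies one batch of changes and returns how many of them were MODIFIED.
    int apply(Iterable<Change> changes) {
        int modified = 0;
        for (Change c : changes) {
            switch (c.type) {
                case ADDED:
                    byId.put(c.id, c.data);
                    break;
                case MODIFIED:
                    byId.put(c.id, c.data); // overwrite only the changed entry
                    modified++;
                    break;
                case REMOVED:
                    byId.remove(c.id);
                    break;
            }
        }
        return modified;
    }

    // Returns a copy of the current local state.
    Map<String, String> snapshot() {
        return new LinkedHashMap<>(byId);
    }
}
```

The point of the sketch is that only the entries named in the change list are touched; everything else in the local state stays as it was.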

In my RecyclerView I want the data to be ordered by a date field, not by lut (last-modified-time)

In this case, you should create a query that allows you to order the results by a specific date property. Assuming you have a database structure that looks like this:

Firestore-root
   |
   --- items (collection)
        |
        --- itemId (document)
             |
             --- date: Oct 08, 2018 at 6:16:58 PM UTC+3
             |
             --- //other properties

A query like this will help you achieve that:

FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
CollectionReference itemsRef = rootRef.collection("items");
Query query = itemsRef.orderBy("date", Query.Direction.ASCENDING);

This is also a recommended way to add a timestamp to a Cloud Firestore database.

With a Firestore snapshot listener I can't query such a large record set.

Oh, yes you can. According to this answer, Firebase can effectively loop through billions of items. That answer is for the Firebase Realtime Database, but the same principles apply to Cloud Firestore.

How to prevent unnecessary Document read in Firestore?

So finally, after looking high and low, there is apparently no solution at the moment other than persisting the data in a local DB, because:

All documents that have been read will be re-read after 30 minutes of inactivity.
Simulating offline and online is very troublesome.

So my solution is:

Use Sembast as the local DB (so far so good) to persist data.
Create one document in Firestore to store the latest update information, and listen only to that one document. If there is a change, update the local DB with the new data, starting from the latest entry in the local DB.
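
The two steps above can be sketched roughly as follows. This is a simplified model under my own assumptions, not working Firestore or Sembast code: Item, DeltaSync and the updatedAt field are hypothetical, the remoteLatest argument plays the role of the single "latest update" document, and the loop stands in for a server query that filters on the timestamp (for example with whereGreaterThan on Android), so that only changed documents would be read.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; none of these types come from Firestore or Sembast.
class Item {
    final String id;
    final String data;
    final long updatedAt; // e.g. epoch millis of the last modification

    Item(String id, String data, long updatedAt) {
        this.id = id;
        this.data = data;
        this.updatedAt = updatedAt;
    }
}

class DeltaSync {
    private final Map<String, Item> localDb = new HashMap<>(); // stands in for the local DB
    private long localLatest = 0L;                             // newest update stored locally
    private int remoteReads = 0;                               // documents fetched so far

    // Returns the number of items fetched from the remote collection.
    int sync(long remoteLatest, List<Item> remoteItems) {
        if (remoteLatest <= localLatest) {
            return 0; // the marker did not move, so nothing needs to be read
        }
        int fetched = 0;
        for (Item item : remoteItems) {
            if (item.updatedAt > localLatest) { // models the timestamp filter
                localDb.put(item.id, item);
                fetched++;
            }
        }
        remoteReads += fetched;
        localLatest = remoteLatest;
        return fetched;
    }

    Item get(String id) {
        return localDb.get(id);
    }

    int getRemoteReads() {
        return remoteReads;
    }
}
```

In this model, a sync with an unchanged marker costs nothing beyond listening to the marker itself, and a sync after one edit fetches only that one item. It relies on every write also updating the marker document, which is an assumption of the sketch.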

If anyone has a better answer, please leave a comment here. Thank you, community!

***** UPDATE:
There is a method to get the data only from cache:

FirebaseFirestore.instance
    .collection(kCollection).get(GetOptions(source: Source.cache));

So if you add GetOptions(source: Source.cache), you always fetch the data from the cache. Let me know if this helps.


How can I reduce my reads on Firestore with Flutter?

There are two approaches to achieve this:

First, use the source options. get() calls on DocumentReference
and Query accept a source option. By providing the source
value, these methods can be configured to fetch results only from
the server, only from the local cache, or to attempt to fetch results
from the server and fall back to the cache (which is the default).

You can use the source options to control whether the data is fetched from the cache, from the server, or with the default behavior.
You can also write code to query the cache first, then fall back to the server if the document is not present, like below:

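let snapshot = null
try {
   // Note: in the web SDK, a cache-only get() rejects if the document is not cached.
   snapshot = await documentRef.get({ source: 'cache' })
} catch (e) { /* cache miss: fall through to the server */ }
if (!snapshot || !snapshot.exists) {
   snapshot = await documentRef.get({ source: 'server' })
}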

This could be a good way of saving on the cost of read operations, but if the source is always set to cache, the app never reads the document from the server, only from the cache, so it might never see future changes to the document's data. For that reason, the second option, SnapshotMetadata, is more suitable.

Second, use the fromCache property in the SnapshotMetadata in your
snapshot event. If fromCache is true, the data came from the cache
and might be missing recent changes or updates. If fromCache is false, the
data is complete and current with the latest updates on the server.

In a snapshot listener, you can check SnapshotMetadata to know whether the data was fetched from the cache or from the server.

How can I avoid reads for documents that I'm not interested in Firestore?

It sounds like what you're trying to do isn't possible with your data. You can't have a query that's ordered by a different field than your range query. So, if you have a range query on lastModified, then you can't also order by name. And if you order by name, you can't use a range filter on lastModified.
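
One workaround that is not part of the answer above, and that I am only suggesting as an assumption, is to let the query do the range filter on lastModified and then sort the returned documents by name on the client. The sketch below models that with plain Java; Doc and ClientSideSort are hypothetical names, and the filtering loop stands in for what the range query would do on the server.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical document model for illustration only.
class Doc {
    final String name;
    final long lastModified;

    Doc(String name, long lastModified) {
        this.name = name;
        this.lastModified = lastModified;
    }
}

class ClientSideSort {
    // Keeps docs modified after `since` (the part a range query would do),
    // then sorts the remaining docs by name in memory.
    static List<Doc> modifiedSinceSortedByName(List<Doc> docs, long since) {
        List<Doc> result = new ArrayList<>();
        for (Doc d : docs) {
            if (d.lastModified > since) {
                result.add(d);
            }
        }
        result.sort(Comparator.comparing((Doc d) -> d.name));
        return result;
    }
}
```

This only works well when the filtered result set is small enough to sort in memory, and it does not combine with server-side pagination by name.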

How to disable firestore cache for specific document?

So I want to disable offline cache for the specific documents

That's currently not possible. You cannot disable the offline persistence only for some documents. It's the entire database (up to the configured cache size) or nothing.
