How to Delete Firebase Data After "N" Days

How to delete firebase data after n days

Say that you have a data structure with nodes like this:

-KItqNxLqzQoLnUCb9sJ
    time: "Thu Apr 28 17:12:05 PDT 2016"
    timestamp: 1461888725444

Each such node has a timestamp property that indicates when it was created. Preferably you'd set this property using a server timestamp (ServerValue.TIMESTAMP in the Android SDK), so that it does not depend on the client's clock.

With this data structure, you can easily build a query that returns only the items older than 30 days and removes them:

// Anything stamped at or before this moment is at least 30 days old.
long cutoff = new Date().getTime() - TimeUnit.MILLISECONDS.convert(30, TimeUnit.DAYS);
Query oldItems = ttlRef.orderByChild("timestamp").endAt(cutoff);
oldItems.addListenerForSingleValueEvent(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot snapshot) {
        // Remove every child the query matched.
        for (DataSnapshot itemSnapshot : snapshot.getChildren()) {
            itemSnapshot.getRef().removeValue();
        }
    }

    @Override
    public void onCancelled(DatabaseError databaseError) {
        throw databaseError.toException();
    }
});
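
To make the cutoff arithmetic concrete, here is a minimal sketch in plain Java with no Firebase dependency. The class TtlCutoff and its methods are hypothetical names made up for illustration; isExpired only imitates the inclusive upper bound that endAt(cutoff) applies to numeric timestamps.

```java
import java.util.concurrent.TimeUnit;

public class TtlCutoff {
    /** Returns the cutoff in epoch milliseconds: nowMillis minus maxAgeDays days. */
    public static long cutoffMillis(long nowMillis, int maxAgeDays) {
        return nowMillis - TimeUnit.DAYS.toMillis(maxAgeDays);
    }

    /** Imitates endAt(cutoff): a timestamp at or before the cutoff counts as expired. */
    public static boolean isExpired(long timestampMillis, long cutoffMillis) {
        return timestampMillis <= cutoffMillis;
    }

    public static void main(String[] args) {
        long cutoff = cutoffMillis(System.currentTimeMillis(), 30);
        long sample = 1461888725444L; // the timestamp from the example node above
        System.out.println("cutoff = " + cutoff);
        System.out.println("sample expired? " + isExpired(sample, cutoff));
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the calculation easy to check against known values.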

Deleting Firebase data after a certain time

The orderByChild() sorting method is very forgiving. The children being sorted are not required to have a member with the specified field name. The documentation explains that those children are assigned a null value and appear first in the sort. Thus, if the reference used to create a query is incorrectly located, the query doesn't fail and instead will typically return all the children of that location.

You created your oldBug query using mDatabase where:

DatabaseReference mDatabase = FirebaseDatabase.getInstance().getReference();

This is one level too high. It should be:

 Query oldBug = mDatabase.child("Users").orderByChild("timeStamp").endAt(cutoff);
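
The effect of the null ordering can be imitated in plain Java. The sketch below is not the Firebase SDK: NullOrderingDemo and keysUpTo are hypothetical names, and a null map value stands in for a child that lacks the ordered field. It shows why a query built one level too high matches every child.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class NullOrderingDemo {
    /**
     * Imitates orderByChild(field).endAt(cutoff) on a map of key -> field value.
     * Children without the field (null) sort first and fall inside the upper bound.
     */
    public static List<String> keysUpTo(Map<String, Long> children, long cutoff) {
        Comparator<Long> nullsFirst = Comparator.nullsFirst(Comparator.naturalOrder());
        List<String> keys = new ArrayList<>(children.keySet());
        // Order by value with nulls first, then by key to break ties.
        keys.sort((a, b) -> {
            int byValue = nullsFirst.compare(children.get(a), children.get(b));
            return byValue != 0 ? byValue : a.compareTo(b);
        });
        List<String> matched = new ArrayList<>();
        for (String key : keys) {
            Long value = children.get(key);
            if (value == null || value <= cutoff) {
                matched.add(key);
            }
        }
        return matched;
    }
}
```

If the map holds the children of the root, none of which has a timeStamp field, every key comes back, which is the "returns all the children" behavior described above.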

How to delete a large amount of data in Realtime Database

Support for deleting large nodes is now built into the Firebase CLI, as explained in the blog post "How to Perform Large Deletes in the Realtime Database":

If you want to delete a large node, the new recommended approach is to use the Firebase CLI (> v6.4.0). The CLI automatically detects a large node and performs a chunked delete efficiently.

$ firebase database:remove /path/to/delete


My initial write-up is below. I'm pretty sure the CLI mentioned above implements precisely this approach, so that's likely a faster way to accomplish this, but I'm still leaving this explanation as it may be useful as background.

Deleting data is a write operation, so it's by definition going to put load on the database. Deleting a lot of data causes a lot of load, either as a spike in a short period or (if you spread it out) as an elevated load over a longer period. Spreading the load out is the best way to minimize the impact on your regular users.

The best way to delete a long, flat list of keys (as you seem to have) is to:

  1. Get a list of the keys, either from a backup of your database (which happens out of band) or by using the shallow parameter on the REST API.
  2. Delete the data in reasonable batches, where "reasonable" depends on the amount of data you store per key. If each key holds just a few properties, you could start by deleting 100 keys per batch, then check how that affects the load to decide whether you can ramp up to more keys per batch.
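
The batching in step 2 can be sketched in plain Java as follows. BatchDeletePlanner and its methods are hypothetical names, and the actual database call is left as a comment: the assumption is that each map would be passed to updateChildren() on the list's reference, where a null value deletes the corresponding key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchDeletePlanner {
    /** Splits keys into consecutive batches of at most batchSize keys each. */
    public static List<List<String>> partition(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }

    /** Builds a key -> null map; writing null to a path in the Realtime Database removes it. */
    public static Map<String, Object> toDeleteUpdate(List<String> batch) {
        Map<String, Object> update = new HashMap<>();
        for (String key : batch) {
            update.put(key, null);
        }
        return update;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("key" + i);
        }
        for (List<String> batch : partition(keys, 100)) {
            Map<String, Object> update = toDeleteUpdate(batch);
            // Assumption: with the Firebase SDK, pass `update` to
            // listRef.updateChildren(update) here and pause between batches.
            System.out.println("would delete " + update.size() + " keys");
        }
    }
}
```

Starting at 100 keys per batch follows the suggestion above; the batch size is a parameter so it can be raised once the load looks acceptable.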

How to create an auto delete mechanism for firestore? (deleting data after time period)

When this answer was written, there was no built-in time-to-live mechanism in Cloud Firestore (TTL policies were added to Firestore later). The common approach is to run a piece of code at an interval, e.g. a Cloud Function triggered by something like cron-job.org.

Have a look at these questions for samples:

  • Delete firebase data older than 2 hours
  • How to delete firebase data after "n" days
  • Implementing aging in a Firebase real time database
  • How to schedule a Cloud Functions to run in the future in order to build a Firestore document TTL

While these are for the Firebase Realtime Database, the same approach applies to Cloud Firestore.

How to automatically remove data from firestore after a specific time?

Before displaying an item, you can check whether it was posted within the last 24 hours and, if not, skip it.

To delete the items from Firestore, you can set up a scheduled job in the backend, for example a scheduled Cloud Function (which is triggered through Pub/Sub). Write a function along the lines of "every hour, find and delete the items created more than 24 hours ago". See "Schedule functions" in the Firebase documentation.
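
Both the display check and the cleanup job rest on the same age test. Here is a minimal sketch in plain Java; ExpiryFilter, isFresh and expiredIds are hypothetical names, and the Firestore reads and deletes are left out. The scheduled function would run the equivalent of expiredIds and delete what it returns.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ExpiryFilter {
    /** Items older than this are hidden from display and deleted by the cleanup job. */
    public static final Duration MAX_AGE = Duration.ofHours(24);

    /** True if the item was created less than 24 hours before now. */
    public static boolean isFresh(Instant createdAt, Instant now) {
        return createdAt.isAfter(now.minus(MAX_AGE));
    }

    /** Returns the ids of all items that are no longer fresh, in map iteration order. */
    public static List<String> expiredIds(Map<String, Instant> createdAtById, Instant now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : createdAtById.entrySet()) {
            if (!isFresh(entry.getValue(), now)) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

Because the client-side check does not depend on the job having run, an item stops being displayed at the 24-hour mark even if it is only deleted up to an hour later.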


