Set a Maximum Number of Children in Firebase

Maximum recommended children in a Firebase Database node

As a rough estimate, I'm expecting a maximum of about 30,000 children all-in.

That's not really a very large number of child nodes.

As far as I know, Firebase will retrieve the entire node before filtering out any results.

If you query the database using a field with an index, the nodes will be filtered on the server. You can create an index to avoid performance problems for larger numbers of child nodes.
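As a minimal sketch of what "creating an index" means here: in the Realtime Database, an index is declared in the security rules with the ".indexOn" key. The helper below only builds such a rules fragment as a plain object. The function name, the "items" list, and the "category" field are made up for illustration, and the helper handles only a single top-level path segment.

```typescript
// Builds a Realtime Database rules fragment that declares an index on
// `field` for the children of the top-level node `listPath`.
// Both arguments are placeholders chosen for this example.
function indexRulesFor(listPath: string, field: string): Record<string, unknown> {
  return { rules: { [listPath]: { ".indexOn": [field] } } };
}

// Hypothetical list "items" whose children each have a "category" field.
const rulesJson = JSON.stringify(indexRulesFor("items", "category"), null, 2);
```

With a rule like this in place, a query that orders by that same child field can be filtered on the server instead of on the client.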

Limit number of children with storage rules

The short answer is no: according to the documentation for Storage security rules, there isn't a way to achieve this with Storage security rules.

The related question Frank linked in the comments has a good answer. Another way you could make this happen is a Realtime Database trigger with Cloud Functions for Firebase. If you already write the download URLs to the database, use Cloud Functions for Firebase to count the number of children and limit them, as shown in this example.

How can I enforce a maximum number of children in a node?

It might seem like you could use a numeric ID for the ordered drinks and then try something like this, but it will fail because the ID is a string:

"$customer_id": {
    "drinks_ordered": {
       "$drink_id": {
          ".validate": "$drink_id > 0 && $drink_id < 16" // error
       }
    }
}

Instead, you can use a counter and validate the counter to be 1-15, and then validate the drink ID matches the counter.

"$customer_id": {
    "counter": {
       // counter can only be incremented by 1 each time, must be a number
       // and must be <= 15
       ".validate": "newData.isNumber() && newData.val() > 0 && newData.val() <= 15 && ((!data.exists() && newData.val() === 1) || (newData.val() === data.val()+1))"
    },
    "drinks_ordered": {
       // new record's ID must match the incremented counter
       "$drink_id": {
          // use .val()+'' because $drink_id is a string and Firebase always uses ===!
          ".validate": "root.child('bar/customers/'+$customer_id+'/counter').val()+'' == $drink_id"
       }
    }
}

Naturally, your drinks will look something like this:

 /bar/customers/george_thorogood/counter/3
 /bar/customers/george_thorogood/drinks_ordered/1/bourbon
 /bar/customers/george_thorogood/drinks_ordered/2/scotch
 /bar/customers/george_thorogood/drinks_ordered/3/beer

Now, before a client can add another drink, they have to set the counter to 4 (the only value it can be set to) and then add the drink with that same ID.

A little roundabout, but it does do the job : )
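To see what the two rules accept and reject, here is a minimal TypeScript sketch that mirrors their logic as plain functions. It is only a model for reasoning about the rules, not code that Firebase runs, and both function names are made up for illustration.

```typescript
// Model of the ".validate" rule on "counter".
// `existing` is the stored value (null if the node does not exist yet),
// `next` is the value the client is trying to write.
function isValidCounterWrite(existing: number | null, next: unknown): boolean {
  if (typeof next !== "number") return false; // newData.isNumber()
  if (next <= 0 || next > 15) return false;   // only 1..15 is allowed
  if (existing === null) return next === 1;   // the first write must be 1
  return next === existing + 1;               // afterwards: exactly +1
}

// Model of the ".validate" rule on "$drink_id".
// The rule reads the counter through `root`, i.e. the value already
// stored, so the counter has to be written before the drink.
function isValidDrinkId(storedCounter: number | null, drinkId: string): boolean {
  return String(storedCounter) === drinkId;
}
```

With a stored counter of 3, isValidCounterWrite(3, 4) passes while 5 or 16 are rejected. Once the counter is 4, only the drink ID "4" passes isValidDrinkId.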

Firebase Performance: How many children per node?

If a node has that many children, accessing the node in any way is a recipe for problems. Accessing an individual child is never a problem.

Querying the node for a subset of its children still requires that the database consider each of those children. If you request the last 10 out of 100 million items, you're asking the database to consider 99,999,990 items that you're apparently not interested in.

It is impossible to say what the maximum is without a way more concrete description of the data size, ordering criteria, etc. But to be honest, even then the best you're likely to get is a value with a huge variance that is likely to change over time.

Your best approach in Firebase (and most NoSQL solutions) is to model the data in a way that fits with how your app uses that data. So for example: if you need to show the latest 10 items to your users, store the (keys of) those latest 10 items in a separate list.

items
    -K........0
        title: "Firebase Performance: How many children per node?"
        body: "If a node has 100 million children, will there be a performance impact if I:..."
    -K........1
        title: "Firebase 3x method won't working in real device but worked in simulator swift 3.0"
        body: "Hi we are working with google firebase 3x version and we faced..."
    .
    .
    .
    -K999999998
    -K999999999
recent
    -K999999990: true
    -K999999991: true
    -K999999992: true
    -K999999993: true
    -K999999994: true
    -K999999995: true
    -K999999996: true
    -K999999997: true
    -K999999998: true
    -K999999999: true

I'm not sure if I got the right number of nines in there, but I hope you get the idea.
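One way to keep such a "recent" list in sync is to compute a small update whenever a new item is added. The sketch below is an assumption about how that could look, not something from the answer itself: the function name is hypothetical, and it assumes keys sort chronologically as plain strings (as push IDs do), so the smallest keys are the oldest.

```typescript
// Shape of the "recent" node: item key -> true.
type RecentIndex = Record<string, true>;

// Returns the update needed to add `newKey` to the recent index while
// keeping at most `limit` entries. A null value means "delete this key".
function recentIndexUpdate(
  recent: RecentIndex,
  newKey: string,
  limit: number = 10
): Record<string, true | null> {
  const update: Record<string, true | null> = { [newKey]: true };
  // All keys that would exist after the insert, oldest first.
  const keys = Object.keys(recent)
    .filter((k) => k !== newKey)
    .concat(newKey)
    .sort();
  const excess = Math.max(0, keys.length - limit);
  for (const key of keys.slice(0, excess)) {
    update[key] = null; // drop the oldest entries
  }
  return update;
}
```

The returned object has the shape that update() on the "recent" node expects, where null removes a child. The app then reads only "recent" and looks up those few items by key, which is always cheap.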

Q: Firebase Extension Limit Child Nodes

All Firebase Extensions are open-source, so if you want to learn how the Limit Child Nodes extension works, you can check the source code link from its installation page.

The code from there:

export const rtdblimit = functions.handler.database.ref.onCreate(
  async (snapshot): Promise<void> => {
    logs.start();

    try {
      const parentRef = snapshot.ref.parent;
      const parentSnapshot = await parentRef.once("value");

      logs.childCount(parentRef.path, parentSnapshot.numChildren());

      if (parentSnapshot.numChildren() > config.maxCount) {
        let childCount = 0;
        const updates = {};
        parentSnapshot.forEach((child) => {
          if (++childCount <= parentSnapshot.numChildren() - config.maxCount) {
            updates[child.key] = null;
          }
        });

        logs.pathTruncating(parentRef.path, config.maxCount);
        await parentRef.update(updates);
        logs.pathTruncated(parentRef.path, config.maxCount);
      } else {
        logs.pathSkipped(parentRef.path);
      }

      logs.complete();
    } catch (err) {
      logs.error(err);
    }
  }
);

From that it looks like this extension:

1. Gets triggered when a new child node is created in the list.
2. Then reads the entire parent node to determine how many children there are.
3. Then deletes any number of children that are over its maximum with a single write operation.
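The truncation step can be modeled without any Firebase dependency. The sketch below is a simplified stand-in for the forEach loop in the extension code, not the extension's own API; the function name is made up, and it assumes the keys are passed in the order the database returned them.

```typescript
// Given the child keys in database order and the maximum allowed count,
// build the update object that deletes the excess children from the
// start of the list (null means "delete" in a multi-location update).
function truncationUpdates(
  orderedKeys: string[],
  maxCount: number
): Record<string, null> {
  const updates: Record<string, null> = {};
  const excess = Math.max(0, orderedKeys.length - maxCount);
  for (const key of orderedKeys.slice(0, excess)) {
    updates[key] = null;
  }
  return updates;
}
```

For five keys and a maximum of three, the result marks the first two keys for deletion, and all deletions go out as one update. When the list is at or under the maximum, the result is empty, which corresponds to the "skipped" branch in the extension.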

This matches with my recollection of the code, since I wrote the first version of it. :)

To your questions:

Q1: Would every trigger of this extension count as a "write" to my database?

The Cloud Function is triggered when you write a child node to the database. It's that write that counts as a write; the trigger itself happens out-of-band. If you read the rest of the code I linked, you'll see that it deletes all child nodes over the maximum as a single write.

Q2: In the YouTube information video it says it "works best" with auto-generated IDs, but is it needed?

The nodes are read without an ordering condition (const parentSnapshot = await parentRef.once("value")) and then it deletes nodes from the start of that list. So it assumes the nodes to delete are at the start of the list, when this is ordered by their keys. As long as that applies to your data too, it doesn't matter where the keys come from.