Does SQLite Support Replication?

Does SQLite support replication?

Brute-force approach: send it the ".dump" command to create a text representation of the data, then read that dump into the second database. I'm not sure whether that is usable in your case.
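
As a minimal sketch of that dump-and-restore idea using Python's built-in sqlite3 module (the file names are made up, and the replica is assumed to start out empty):

```python
# Dump-and-restore sketch with Python's built-in sqlite3 module.
# "source.db" and "replica.db" are hypothetical file names; the replica is
# assumed to start out empty, otherwise the CREATE TABLE statements will fail.
import sqlite3

source = sqlite3.connect("source.db")
replica = sqlite3.connect("replica.db")

# iterdump() yields the same SQL text the shell's ".dump" command produces.
dump_sql = "\n".join(source.iterdump())

# Replaying that SQL rebuilds the schema and the data in the second database.
replica.executescript(dump_sql)

source.close()
replica.close()
```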

If you need fine-grained updates (sending a copy of each change to the other copy), have a look at sqlite3_update_hook.
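
The hook itself is part of SQLite's C API; Python's standard sqlite3 module doesn't expose it, but the third-party apsw wrapper does, so purely as a sketch (and assuming apsw's setupdatehook API):

```python
# Rough sketch of capturing changes via the update hook, assuming the
# third-party apsw wrapper, which exposes the C-level sqlite3_update_hook
# as Connection.setupdatehook().
import apsw

changes = []  # hypothetical outbound queue of changes for the other copy

def on_change(op, db_name, table, rowid):
    # op is one of apsw.SQLITE_INSERT, apsw.SQLITE_UPDATE, apsw.SQLITE_DELETE.
    # The hook only reports the rowid, so you still have to read the row
    # yourself before shipping it to the other copy.
    changes.append((op, db_name, table, rowid))

conn = apsw.Connection("app1.db")  # hypothetical file name
conn.setupdatehook(on_change)

conn.cursor().execute("CREATE TABLE IF NOT EXISTS t(x)")
conn.cursor().execute("INSERT INTO t VALUES (1)")  # on_change fires here
print(changes)
```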

But how do you plan to handle errors? For example, what happens when the copy of the DB in app2 can't make an update for some reason?

To solve this, move the database to a server process and have the two apps talk to it.
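
A minimal sketch of that server-process idea in Python (the line-based JSON protocol, port, and file name are all made up): one process owns the SQLite file, both apps send their statements to it, every write is serialized in one place, and failures come back to the caller instead of the two copies silently diverging.

```python
# Minimal sketch of "one server process owns the database": both apps connect
# over TCP and send SQL as JSON lines. The protocol and file name are made up.
import json
import socketserver
import sqlite3
import threading

DB_PATH = "shared.db"      # hypothetical path
WRITE_LOCK = threading.Lock()

class DBHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:
            request = json.loads(line)
            try:
                with WRITE_LOCK:  # serialize access to the shared connection
                    cur = self.server.conn.execute(
                        request["sql"], request.get("params", []))
                    rows = cur.fetchall()
                    self.server.conn.commit()
                reply = {"ok": True, "rows": rows}
            except sqlite3.Error as exc:
                # The error goes back to the caller, which decides how to react.
                reply = {"ok": False, "error": str(exc)}
            self.wfile.write((json.dumps(reply) + "\n").encode())

class DBServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True

    def __init__(self, address):
        super().__init__(address, DBHandler)
        # One connection, owned by this process; check_same_thread=False
        # because handler threads share it (the lock above keeps use serialized).
        self.conn = sqlite3.connect(DB_PATH, check_same_thread=False)

if __name__ == "__main__":
    DBServer(("127.0.0.1", 7777)).serve_forever()
```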

Method to replicate sqlite database across multiple servers

I used the Raft consensus protocol to replicate my SQLite database. You can find the system here:

https://github.com/rqlite/rqlite
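
rqlite exposes the database over an HTTP API rather than as a shared file; as a quick sketch of talking to it from Python (the /db/execute and /db/query endpoints and default port 4001 are taken from the project's README and may differ between versions):

```python
# Quick sketch of using rqlite's HTTP API from Python. Endpoint paths and JSON
# shapes are assumed from the project's README.
import requests

BASE = "http://localhost:4001"  # rqlite's default HTTP port

# Writes go to /db/execute as a JSON array of statements; rqlite replicates
# them to the other nodes via Raft before acknowledging.
requests.post(
    f"{BASE}/db/execute",
    json=["CREATE TABLE IF NOT EXISTS foo (id INTEGER PRIMARY KEY, name TEXT)",
          "INSERT INTO foo(name) VALUES('fiona')"],
)

# Reads go to /db/query.
resp = requests.get(f"{BASE}/db/query", params={"q": "SELECT * FROM foo"})
print(resp.json())
```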

Is SQLite appropriate for off-line storage before replication to a server?

You can use SQLite for your scenario, but while implementing it, you can follow either one of these approaches.

Approach #1: Use an Abstract Factory to Instantiate the SQLiteOpenHelper.

Approach #2: Wrap the SQLiteDatabase in a ContentProvider

Refer to this link for how to implement these two approaches: http://www.androiddesignpatterns.com/2012/05/correctly-managing-your-sqlite-database.html

Key points to note while using SQLite

  • SQLite takes care of file-level locking.
  • Many threads can read, but only one can write at a time; the locks
    prevent more than one writer.
  • Android implements some Java-level locking in SQLiteDatabase to help
    keep things straight.
  • If you handle the database incorrectly from many threads and mess up
    the code, your database will not be corrupted; only a few updates will
    be lost.

How "Multiple Threads - DB access" can be used for your scenario

The SQLiteOpenHelper object holds on to one database connection.

If you try to write to the database from actual distinct connections (multiple threads) at the same time, one will fail. It will not wait till the first is done and then write. It will simply not write your change. Worse, if you don’t call the right version of insert/update on the SQLiteDatabase, you won’t get an exception. You’ll just get a message in your LogCat, and that will be it.

So it is recommended to write from a single thread, and to read from multiple threads if necessary for faster access.
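
To make that concrete, here is the same single-writer idea sketched in plain Python with sqlite3 and a queue; this is only an illustration of the pattern, not the Android API, and the table and file names are made up.

```python
# Illustration of "one writer thread, many reader threads" using Python's
# sqlite3 and a queue; the "notes" table and "app.db" file name are made up.
import queue
import sqlite3
import threading

DB_PATH = "app.db"
write_queue = queue.Queue()

def writer():
    # The only thread that ever writes; everyone else just enqueues work.
    conn = sqlite3.connect(DB_PATH)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
    conn.commit()
    while True:
        sql, params = write_queue.get()
        try:
            conn.execute(sql, params)
            conn.commit()
        except sqlite3.Error as exc:
            print("write failed:", exc)  # surface the error instead of losing it
        finally:
            write_queue.task_done()

threading.Thread(target=writer, daemon=True).start()

# Writes always go through the queue...
write_queue.put(("INSERT INTO notes(body) VALUES (?)", ("hello",)))
write_queue.join()

# ...while any thread may read using its own connection.
with sqlite3.connect(DB_PATH) as conn:
    print(conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0])
```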

Stateful Service Fabric service - replication of files changed on disk

Will Azure try to copy the whole file when it detects changes, or only the changed disk sectors?

No. Stateful services only replicate data placed in reliable collections, so you should not expect changed files to be replicated across nodes. Whenever a new instance of your service is created, the original file is copied to the new node, and you won't have access to the modified files. You shouldn't persist files on the node disk; for that, use persistent storage such as Azure Blob Storage, or attach a File Share to the nodes.

How does the replication work in the background?

I think you have misunderstood the concept of reliable collections and stateful services. I recommend taking a look at these docs to get a clear view of how it works:
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-reliable-collections

Suggestion:

If you want to rely on SF data management and replication features, I would recommend storing your data in reliable collections instead of using SQLite, if that fits your requirements.

Update:

At Build 2018, the SF team announced plans to support two versions of 'reliable' volumes: one will be based on Azure Files, the same as described here, and the other will be based on reliable collections and will replicate changed files; the latter is not released yet. If you can wait, I think this feature will be suitable for your needs.

What are the performance characteristics of sqlite with very large database files?

So I did some tests with sqlite for very large files, and came to some conclusions (at least for my specific application).

The tests involved a single sqlite file with either a single table or multiple tables. Each table had about 8 columns, almost all integers, and 4 indices.

The idea was to insert enough data until sqlite files were about 50GB.

Single Table

I tried to insert multiple rows into a sqlite file with just one table. When the file was about 7GB (sorry I can't be specific about row counts) insertions were taking far too long. I had estimated that my test to insert all my data would take 24 hours or so, but it did not complete even after 48 hours.

This leads me to conclude that a single, very large sqlite table will have issues with insertions, and probably other operations as well.

I guess this is no surprise; as the table gets larger, inserting and updating all the indices takes longer.

Multiple Tables

I then tried splitting the data by time over several tables, one table per day. The data from the original single table was split into ~700 tables.

This setup had no problems with insertion; it did not take longer as time progressed, since a new table was created for every day.
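
A rough sketch of that per-day split in Python's sqlite3 (table and column names invented; the original tests used about 8 integer columns and 4 indices per table):

```python
# Sketch of the "one table per day" split: the table name is derived from the
# row's date, so each day's inserts land in a small, freshly created table.
# Table and column names are invented for illustration.
import sqlite3
from datetime import date

conn = sqlite3.connect("big.db")  # hypothetical file

def table_for(day: date) -> str:
    return f"events_{day:%Y%m%d}"

def insert_event(day: date, a: int, b: int) -> None:
    table = table_for(day)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (a INTEGER, b INTEGER)")
    conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{table}_a ON {table}(a)")
    conn.execute(f"INSERT INTO {table}(a, b) VALUES (?, ?)", (a, b))

insert_event(date(2024, 1, 15), 1, 2)
conn.commit()
```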

Vacuum Issues

As pointed out by i_like_caffeine, the VACUUM command becomes more of a problem the larger the sqlite file is. As more inserts/deletes are done, the fragmentation of the file on disk will get worse, so the goal is to periodically VACUUM to optimize the file and recover file space.

However, as the documentation points out, a full copy of the database is made to do a vacuum, which takes a very long time to complete. So, the smaller the database, the faster this operation will finish.
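
For reference, running it from Python's sqlite3 is a one-liner; the cost is that SQLite rewrites the whole file into a temporary copy (the file name below is made up):

```python
# VACUUM rebuilds the whole database into a new file and swaps it in, which is
# why its cost grows with database size. It cannot run inside a transaction.
import sqlite3

conn = sqlite3.connect("big.db")   # hypothetical file
conn.isolation_level = None        # make sure no implicit transaction is open
print("pages on the freelist before:",
      conn.execute("PRAGMA freelist_count").fetchone()[0])
conn.execute("VACUUM")
print("pages on the freelist after:",
      conn.execute("PRAGMA freelist_count").fetchone()[0])
conn.close()
```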

Conclusions

For my specific application, I'll probably be splitting out data over several db files, one per day, to get the best of both vacuum performance and insertion/delete speed.

This complicates queries, but for me, it's a worthwhile tradeoff to be able to index this much data. An additional advantage is that I can just delete a whole db file to drop a day's worth of data (a common operation for my application).

I'd probably have to monitor table size per file as well to see when the speed will become a problem.

It's too bad that there doesn't seem to be an incremental vacuum method other than auto vacuum. I can't use it because my goal for vacuum is to defragment the file (file space isn't a big deal), which auto vacuum does not do. In fact, the documentation states it may make fragmentation worse, so I have to resort to periodically doing a full vacuum on the file.
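
A minimal sketch of that one-file-per-day layout in Python (file and table names invented): ATTACH stitches files together when a query has to span days, and dropping a day's data is just deleting its file.

```python
# Sketch of the "one database file per day" layout: each day's data lives in
# its own file, ATTACH stitches files together for cross-day queries, and
# deleting a day's data is just removing its file. Names are invented.
import os
import sqlite3
from datetime import date

def path_for(day: date) -> str:
    return f"events_{day:%Y%m%d}.db"

def open_day(day: date) -> sqlite3.Connection:
    conn = sqlite3.connect(path_for(day))
    conn.execute("CREATE TABLE IF NOT EXISTS events (a INTEGER, b INTEGER)")
    return conn

# Querying across two days: attach the second file to the first connection.
d1, d2 = date(2024, 1, 14), date(2024, 1, 15)
open_day(d2).close()                      # make sure the second file exists
conn = open_day(d1)
conn.execute("ATTACH DATABASE ? AS other", (path_for(d2),))
total = conn.execute(
    "SELECT (SELECT COUNT(*) FROM main.events) + "
    "(SELECT COUNT(*) FROM other.events)").fetchone()[0]
conn.close()

# Dropping a whole day's worth of data:
os.remove(path_for(d1))
```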

An in-memory database solution with the quickest real-time replication

Check out Redis. Here's the Replication Howto.
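
As a sketch, a replica can be pointed at a primary from Python with redis-py (the hosts and ports are made up; normally you would set "replicaof" in the replica's config file instead):

```python
# Sketch of Redis primary/replica replication driven from redis-py.
# Hosts and ports are hypothetical.
import redis

replica = redis.Redis(host="127.0.0.1", port=6380)
replica.execute_command("REPLICAOF", "127.0.0.1", "6379")

# Writes to the primary become visible on the replica asynchronously.
primary = redis.Redis(host="127.0.0.1", port=6379)
primary.set("greeting", "hello")
print(replica.get("greeting"))  # b'hello' once replication has caught up
```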

Also, if you decide that the DB doesn't absolutely need to be in-memory, it just needs to be fast, you might want to consider CouchDB. It can do continuous replication, which is essentially instant, and all nodes are masters. It has a well-thought-out conflict detection and resolution mechanism. This blog post is a great introduction to the latest and greatest CouchDB replication capabilities.
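
As a sketch, continuous replication can be started through CouchDB's HTTP _replicate endpoint; the host names, credentials, and database names below are made up:

```python
# Sketch of kicking off CouchDB continuous replication over its HTTP API.
# Hosts, credentials, and database names are hypothetical; /_replicate and the
# "continuous" flag are standard CouchDB.
import requests

resp = requests.post(
    "http://admin:secret@127.0.0.1:5984/_replicate",
    json={
        "source": "appdata",
        "target": "http://admin:secret@other-node:5984/appdata",
        "continuous": True,   # keep replicating new changes as they arrive
    },
)
print(resp.json())
```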


