What is the best way to sync 2 sqlite tables over http and json?

问题

I have a fairly simple sync problem. I have a table with about 10 columns that I want to keep in sync between a sqlite file on 3 different clients: an Iphone client, a browser client, and a Ruby on Rails client. So I need a simple sycing solution that will work for all 3, i.e. I can easily implement it in Javascript, Objective C, and Ruby and it works with JSON over HTTP. I have looked at various components of other syncing solutions like the one in git, some of the tutorials that have come out of the Google gears community, and a rails plugin called acts_as_replica. My naive approach would be to simply create a last synced timestamp in the database and then create a changelog of all deletes as they are made. (I don't allow updates to entries in the table). I can then retrieve all the new entries since the last timestamp, combine then with the deletes, and send a changelog as json over http between the 3 solutions.

Should I consider the use of SHA1 hash or a UUID of each entry or is a last synced timestamp sufficient? How do I make sure there are no duplicate entries? Is there a simpler algorithm I could follow?

回答1:

I am assuming changes are likely to be at the end. I don't know the character of insert and updates but here is my idea;

I would SHA1 (or MD5, it doesn't matter in this case) days of the current month and months before. Comparing against these fingerprints is a fast way to see were the differences are. (I am leaving today unhashed)
If previous months have differences;
- If volume for a month is too big we can then split the month and simply generate daily fingerprint on the fly instead of comparing the whole month.
- Otherwise we can treat a monthly change the same way we treat a daily change.
After finding out where the changes occur, master copy would send a list of all unique id's for that period. (Always sending today's info)
The slave then deletes what has to be deleted and compiles a list of id's to be inserted.
The master sends only those records (in full).

The time categories (day, month) can be adjusted according to the data volume.

Of course this is a naive and simple algorithm. If I was processing sensitive/critical data I would look for a transactional algorithm.

来源：https://stackoverflow.com/questions/327111/what-is-the-best-way-to-sync-2-sqlite-tables-over-http-and-json

标签

database

json

sqlite

synchronization