Comments (3)
Original but defunct Lsyncd author here. I skimmed the code, and it seems to me as well that there is a race condition in here. You need to set the last_rsynced time before you run rsync, because for changes that happen during the rsync command you just can't know whether rsync got them over the line or not. You need to call rsync another time for everything that happened in the meantime. Unless I'm mistaken, this should be easily fixable by simply swapping lines 45 and 46 of main.py, but I haven't tested that.
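To illustrate the ordering, here is a minimal sketch. It is not the project's actual main.py: `SyncState`, `sync_if_needed`, and the injected `run_rsync` and `now` callables are hypothetical names, and the comparison against the last event time is an assumption about how the loop decides to sync.

```python
import time
from typing import Callable


class SyncState:
    """Holds the time at which the most recent sync was *started*."""

    def __init__(self) -> None:
        self.last_rsynced = 0.0


def sync_if_needed(
    state: SyncState,
    last_event_time: float,
    run_rsync: Callable[[], None],
    now: Callable[[], float] = time.time,
) -> bool:
    """Run rsync unless the newest event predates the start of the last sync."""
    if last_event_time < state.last_rsynced:
        return False  # everything seen so far was covered by the last run
    # Take the timestamp BEFORE starting rsync. Any change that lands while
    # rsync is running then has a later timestamp and triggers another run.
    state.last_rsynced = now()
    run_rsync()
    return True
```

If the timestamp were taken after `run_rsync()` returned, an event that arrived mid-transfer would look already covered and could be lost.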
I still suggest writing a heavy-duty unit test that randomly creates, changes, deletes, and moves a ton of files and directories in rapid succession and then runs a diff to check whether source and target are the same. Like churn-rsync.lua, but that's Lua and I guess you want to redo it in Python. It took me a while to remove all race conditions (AFAIK). Funnily enough, when I first published it, multiple people told me that you just can't reliably create an asynchronous live program. It is possible, but there are pitfalls.
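A rough Python sketch of such a churn test could look like the following. It is not a port of churn-rsync.lua; the `churn` and `trees_equal` helpers and the `f`/`d`/`m` file-name scheme are made up for illustration, and a real test would run the sync tool between the churn and the comparison.

```python
import filecmp
import os
import random
import shutil


def churn(root: str, steps: int, rng: random.Random) -> None:
    """Apply random create/mkdir/modify/delete/move operations below root."""
    for i in range(steps):
        dirs = [d for d, _, _ in os.walk(root)]
        files = [os.path.join(d, f) for d, _, fs in os.walk(root) for f in fs]
        op = rng.choice(["create", "mkdir", "modify", "delete", "move"])
        if op == "mkdir":
            os.makedirs(os.path.join(rng.choice(dirs), f"d{i}"), exist_ok=True)
        elif op == "create" or not files:
            # Also the fallback when there is no file yet to modify/delete/move.
            with open(os.path.join(rng.choice(dirs), f"f{i}"), "w") as fh:
                fh.write(str(rng.random()))
        elif op == "modify":
            with open(rng.choice(files), "a") as fh:
                fh.write(str(rng.random()))
        elif op == "delete":
            os.remove(rng.choice(files))
        else:  # move to a fresh name so the destination never exists
            shutil.move(rng.choice(files), os.path.join(rng.choice(dirs), f"m{i}"))


def trees_equal(a: str, b: str) -> bool:
    """Recursively compare two directory trees by name and file content."""
    cmp = filecmp.dircmp(a, b)
    if cmp.left_only or cmp.right_only or cmp.common_funny or cmp.funny_files:
        return False
    # dircmp itself compares shallowly (stat info), so compare contents here.
    _, mismatch, errors = filecmp.cmpfiles(a, b, cmp.common_files, shallow=False)
    if mismatch or errors:
        return False
    return all(
        trees_equal(os.path.join(a, d), os.path.join(b, d)) for d in cmp.common_dirs
    )
```

Seeding the `random.Random` instance makes a failing run reproducible, which helps a lot when chasing a race.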
Some further short notes: IMO you should treat rsync's network error codes as a reason to retry, otherwise you fall out of sync on a network hiccup. And something I learned from a lot of feedback: don't put --delete in the default config. I know Lsyncd has that, but in hindsight it was less than ideal and I would do it differently now.
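A possible shape for the retry, as a sketch only: the exit codes 10, 12, 30, and 35 are the socket, protocol-stream, and timeout errors listed in the rsync man page, while treating 255 (commonly seen when the ssh transport fails) as retryable is my assumption. `run_with_retry` and its parameters are hypothetical names.

```python
import subprocess
import time

# rsync exit values that usually point at a transient network problem:
# 10 socket I/O error, 12 protocol data stream error, 30 send/receive
# timeout, 35 timeout waiting for daemon connection.
# 255 is an assumption: it typically shows up when ssh itself fails.
RETRYABLE = {10, 12, 30, 35, 255}


def run_with_retry(cmd, retries=3, delay=1.0, runner=subprocess.call, sleep=time.sleep):
    """Run cmd and retry with exponential backoff on retryable exit codes.

    Returns the last exit code; the caller decides what to do with it.
    """
    code = runner(cmd)
    attempt = 0
    while code in RETRYABLE and attempt < retries:
        sleep(delay * 2 ** attempt)
        attempt += 1
        code = runner(cmd)
    return code
```

Usage would be something like `run_with_retry(["rsync", "-a", src, dest])`. Non-network failures such as 23 (partial transfer) are returned immediately rather than retried.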
Two minor differences: the moves-via-ssh feature isn't there, but that was not such an important optimization, and Lsyncd always calls rsync with a custom filter setup so that rsync only has to inspect the files that actually changed, which can make a big difference on large file trees. On the flip side, by using fswatch as an established companion program I guess you get fanotify functionality out of the box, something Lsyncd never got. Back in the day fanotify didn't report moves, so it was not usable and Lsyncd had to rely on inotify, which uses kernel memory for every directory watched. That has been fixed since then.
Honestly, if I were to redo everything from scratch, I'd use a database like LevelDB to store the timestamps of the source tree, so the need to always run a full initial rsync on startup would go away.
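As a sketch of that idea, the snippet below persists modification times between runs. It uses Python's built-in sqlite3 as a stand-in for LevelDB, and the `mtimes` table and `changed_since_last_run` function are hypothetical, so take it as an illustration of the approach rather than a design.

```python
import os
import sqlite3


def changed_since_last_run(db_path: str, root: str):
    """Return (changed, removed) relative paths compared with the stored state,
    then replace the stored state with the current one."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS mtimes (path TEXT PRIMARY KEY, mtime_ns INTEGER)"
    )
    known = dict(con.execute("SELECT path, mtime_ns FROM mtimes"))

    # Snapshot the current modification time of every file below root.
    current = {}
    for d, _, files in os.walk(root):
        for name in files:
            full = os.path.join(d, name)
            current[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns

    changed = sorted(p for p, m in current.items() if known.get(p) != m)
    removed = sorted(set(known) - set(current))

    with con:  # commit the new snapshot in one transaction
        con.execute("DELETE FROM mtimes")
        con.executemany("INSERT INTO mtimes VALUES (?, ?)", current.items())
    con.close()
    return changed, removed
```

On startup only the `changed` and `removed` paths would need to go to rsync. A real implementation should store the new snapshot only after the sync has succeeded, and keep the database file outside the watched tree.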
I wish you much success with your project! Back in the day, when I needed live asynchronous synchronization, I was surprised that there was no open-source solution available. We can certainly use more diversity here. A few projects flickered up, but unfortunately most seem to have died off again. There was also one in Python (unfortunately I forget the name) that used multithreading with Python's built-in semaphore mechanics, and the code didn't even seem very complicated.
from workonsh.
Ah no, one thing I forgot, which indeed makes the loop more complicated (as MarSoft suggested): you should first fully exhaust the message queue of your notification mechanism, then set the timestamp (for everything covered), and then call rsync. I don't know how your mechanism works here, but you'd need some peek functionality to see whether another event is incoming before you try to receive it (and thus block if there isn't one).
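Roughly, in Python, and assuming the notifications end up in a `queue.Queue` (for example filled by a thread that reads the watcher's output; that setup is my assumption, and `drain` and `sync_cycle` are hypothetical names):

```python
import queue
import time


def drain(events: queue.Queue) -> list:
    """Return every event that is queued right now, without blocking."""
    drained = []
    while True:
        try:
            drained.append(events.get_nowait())
        except queue.Empty:
            return drained


def sync_cycle(events: queue.Queue, run_rsync, now=time.time):
    """One loop iteration: wait for an event, exhaust the queue, stamp, sync."""
    first = events.get()            # block until there is something to do
    batch = [first] + drain(events) # then take everything else already waiting
    started = now()                 # timestamp covers exactly this batch
    run_rsync()                     # later events stay queued for the next cycle
    return started, batch
```

Here `get_nowait()` plays the role of the peek: it either hands over the next event or raises `queue.Empty` instead of blocking.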
from workonsh.
Thanks @MarSoft @axkibe for your great suggestions. This package was built in a single night because I needed a quick way to synchronize the source code on my local machine to a remote server where I could compile and run it. Many things have not been considered yet.
Unfortunately, I am not using the output of fswatch for rsync. fswatch is just a signal to trigger rsync (rsync still has to run over the entire source tree). Besides, it seems that setting a time interval between two consecutive rsync runs is not an ideal, efficient solution either.
My actual experience with this package is that it's too slow. After changing the source code, I often go straight to the server's console and run the code, but sometimes the new code isn't there yet, which makes for an awful debugging experience. This package, or even lsyncd, is probably more suitable for backup purposes.
I think this package needs deeper thought about its use cases and, especially, its efficiency. Something like mirror would probably be more suitable here.
from workonsh.