Comments (3)

axkibe avatar axkibe commented on July 21, 2024

Original but defunct Lsyncd author here. I skimmed over the code and it seems to me as well that there is a race condition in here. You need to set the last_rsynced time before you run rsync, because for changes made during the rsync command you just don't know whether rsync got them over the line or not. You need to call it another time for everything that happened in the meantime. Unless I'm mistaken, this should be easily fixable by simply exchanging lines 45 and 46 of main.py -- but not tested.
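
To illustrate the ordering, here is a minimal Python sketch. It is not the project's main.py; `SyncState`, `sync_once` and the `run_sync` callable are hypothetical names, and `run_sync` stands in for whatever actually invokes rsync. The only point is that the timestamp is taken before the sync starts, so a change that lands while rsync is running still counts as unsynced.

```python
import time
from typing import Callable, Optional


class SyncState:
    """Remembers when the last sync started and when the last change was seen."""

    def __init__(self) -> None:
        self.last_synced: Optional[float] = None  # start time of the last sync
        self.last_change: Optional[float] = None  # time of the newest change

    def note_change(self, when: float) -> None:
        if self.last_change is None or when > self.last_change:
            self.last_change = when

    def needs_sync(self) -> bool:
        if self.last_synced is None:
            return True  # nothing has been synced yet
        # ">=" on purpose: a change with the same timestamp as the sync
        # start may or may not have been picked up, so sync again.
        return self.last_change is not None and self.last_change >= self.last_synced


def sync_once(state: SyncState, run_sync: Callable[[], None],
              clock: Callable[[], float] = time.monotonic) -> None:
    started = clock()             # take the timestamp BEFORE the sync runs
    run_sync()                    # hypothetical: e.g. a subprocess call to rsync
    state.last_synced = started   # changes after `started` stay "dirty"
```

With the assignment done the other way round (timestamp taken after `run_sync()` returns), a change made during the sync would look older than the sync and never trigger a second run.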

I still suggest writing a heavy-duty unit test that randomly creates, changes, deletes and moves a ton of files and directories in rapid succession and then runs a diff to check whether source and target are the same. Like churn-rsync.lua, but that is Lua and I guess you want to redo it in Python. It took me a while to remove all race conditions (AFAIK). Funnily enough, when I first published it, multiple people told me that you just can't reliably create an asynchronous live program. It is possible, but there are pitfalls.
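
A rough Python sketch of such a churn test follows. It is not a port of churn-rsync.lua; `churn`, `snapshot` and `run_churn_check` are made-up names, it only moves files (not directories), and `shutil.copytree` is just a stand-in for the sync tool under test. A real test would run the churn while the tool is live and compare once it has settled.

```python
import random
import shutil
import tempfile
from pathlib import Path


def churn(root: Path, steps: int, rng: random.Random) -> None:
    """Apply `steps` random create/mkdir/modify/delete/move operations under root."""
    for i in range(steps):
        entries = sorted(root.rglob("*"))
        files = [p for p in entries if p.is_file()]
        dirs = [root] + [p for p in entries if p.is_dir()]
        op = rng.choice(["create", "mkdir", "modify", "delete", "move"])
        if op in ("modify", "delete", "move") and not files:
            op = "create"  # nothing to operate on yet
        if op == "create":
            (rng.choice(dirs) / f"f{i}.txt").write_text(str(rng.random()))
        elif op == "mkdir":
            (rng.choice(dirs) / f"d{i}").mkdir()
        elif op == "modify":
            rng.choice(files).write_text(str(rng.random()))
        elif op == "delete":
            rng.choice(files).unlink()
        else:  # move a file to a fresh, unique name in a random directory
            rng.choice(files).rename(rng.choice(dirs) / f"m{i}.txt")


def snapshot(root: Path) -> dict:
    """Map each relative path to its content (None for directories)."""
    return {
        str(p.relative_to(root)): (p.read_bytes() if p.is_file() else None)
        for p in sorted(root.rglob("*"))
    }


def run_churn_check(seed: int = 0, steps: int = 200) -> bool:
    """Churn a source tree, 'sync' it, and report whether both trees match."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "src", Path(tmp) / "dst"
        src.mkdir()
        churn(src, steps, random.Random(seed))
        shutil.copytree(src, dst)  # stand-in for the tool under test
        return snapshot(src) == snapshot(dst)
```

Seeding the random generator keeps a failing run reproducible, which helps a lot when hunting a race.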

Further short notes: IMO you should treat rsync's network error exit codes as a reason to retry, otherwise you run out of sync on a network hiccup. And something I learned from a lot of feedback: don't enable --delete in the default config. I know Lsyncd does that, but it was less than ideal in hindsight and I would do it differently now.
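
A small Python sketch of the retry idea, under some assumptions: the set of exit codes below is my own pick of the ones that usually point to transient network trouble (10 socket I/O error, 12 protocol data stream error, 30 and 35 timeouts, 255 as typically passed through from a failed ssh), not Lsyncd's actual table, and `run_with_retry` with its `runner` and `sleep` parameters is a hypothetical helper.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence

# Assumed set of "probably transient" rsync exit codes; tune to taste.
RETRYABLE = {10, 12, 30, 35, 255}


def run_with_retry(cmd: Sequence[str], attempts: int = 5, delay: float = 1.0,
                   runner: Optional[Callable[[Sequence[str]], int]] = None,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Run cmd, retrying with exponential backoff on retryable exit codes.

    Returns the last exit code seen.
    """
    if runner is None:
        runner = lambda c: subprocess.run(list(c)).returncode
    code = 0
    for attempt in range(1, attempts + 1):
        code = runner(cmd)
        if code not in RETRYABLE:
            return code                # success or a non-network failure
        if attempt < attempts:
            sleep(delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    return code                        # still failing after all attempts
```

Usage would be something like `run_with_retry(["rsync", "-a", "src/", "host:dest/"])`, where the paths are placeholders. Non-retryable failures are returned right away so the caller can decide whether to give up.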

Two minor differences: the moves-via-ssh feature isn't there, but that was a requested optimization and not so important. And Lsyncd always calls rsync with a custom filter setup, so rsync only has to inspect the files that actually changed, which can make a big difference on large file trees. On the flip side, by using fswatch as an established peer program I guess you get fanotify functionality out of the box, something Lsyncd never got. Back in the day fanotify in the kernel didn't report moves and thus was not usable, so Lsyncd had to rely on inotify, which uses kernel memory for every directory watched. That has been fixed since then.
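
As a sketch of what such a filter setup could look like from Python (this is my illustration, not Lsyncd's exact rule set): build anchored include rules for each changed path plus its parent directories, feed them to rsync via `--include-from=-` on stdin, and exclude everything else with `--exclude=*`. The helper names `include_rules` and `rsync_command` are hypothetical, and deleted paths would need extra handling that is left out here.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def include_rules(changed: Iterable[str]) -> List[str]:
    """Turn relative changed paths into anchored rsync include rules.

    Parent directories are included too, otherwise rsync would never
    descend far enough to reach the changed file.
    """
    rules: List[str] = []
    seen = set()

    def add(rule: str) -> None:
        if rule not in seen:
            seen.add(rule)
            rules.append(rule)

    for rel in changed:
        path = PurePosixPath(rel)
        # parents come innermost-first, so reverse to get top-down order
        for parent in reversed(path.parents):
            if parent != PurePosixPath("."):
                add(f"/{parent}/")
        add(f"/{path}")
    return rules


def rsync_command(src: str, dst: str) -> List[str]:
    """Command line that reads include rules from stdin and excludes the rest."""
    return ["rsync", "-a", "--include-from=-", "--exclude=*", src, dst]
```

The rules would then be written to the process's stdin, one per line, for example with `subprocess.run(rsync_command(src, dst), input="\n".join(rules), text=True)`.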

Honestly, if I were to redo everything from scratch, I'd use a database like leveldb to store the timestamps of the source tree, so the need to always run a full initial rsync on startup would go away.
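
A minimal sketch of that idea in Python, with sqlite3 from the standard library swapped in for leveldb to keep it dependency-free. The table layout and the names `open_store` and `scan_and_record` are assumptions for illustration only. On startup, the stored modification times are compared with the current tree, and only the differences would need syncing.

```python
import sqlite3
from pathlib import Path
from typing import List, Tuple


def open_store(db_path: str) -> sqlite3.Connection:
    """Open (or create) the timestamp store."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mtimes "
        "(path TEXT PRIMARY KEY, mtime_ns INTEGER NOT NULL)"
    )
    return conn


def scan_and_record(conn: sqlite3.Connection, root: Path) -> Tuple[List[str], List[str]]:
    """Return (changed_or_new, deleted) paths since the last recorded scan.

    For brevity the store is updated right away; a real tool should only
    record the new timestamps after the sync for them has succeeded.
    """
    known = dict(conn.execute("SELECT path, mtime_ns FROM mtimes"))
    current = {
        str(p.relative_to(root)): p.stat().st_mtime_ns
        for p in root.rglob("*") if p.is_file()
    }
    changed = sorted(p for p, m in current.items() if known.get(p) != m)
    deleted = sorted(set(known) - set(current))
    with conn:  # one transaction: replace the stored snapshot
        conn.execute("DELETE FROM mtimes")
        conn.executemany("INSERT INTO mtimes VALUES (?, ?)", list(current.items()))
    return changed, deleted
```

Modification times alone can miss some edits (for example tools that restore the old mtime), so this is a startup shortcut rather than a replacement for an occasional full rsync.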

Wish you much success with your project! Back in the day, when I needed live asynchronous synchronization, I was surprised there was no open-source solution available. We can certainly use more diversity here. A few projects flickered up, but unfortunately most seem to have died off again. There was also one in Python -- unfortunately I forgot the name -- which used multithreading with Python's built-in semaphore mechanics; the code didn't even seem very complicated.

from workonsh.

axkibe avatar axkibe commented on July 21, 2024

Ah no, one thing I forgot, which indeed makes the loop more complicated (as MarSoft suggested): you should first fully exhaust the message queue from your notification mechanism, then set the timestamp (for everything covered) and then call rsync. Dunno how your mechanism works here, but you'd need some peek functionality to see whether another event is incoming before you try to receive it (and thus block if there isn't one).
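
In Python terms, and assuming the fswatch output is pushed into a `queue.Queue` by some reader thread (that part is an assumption, as are the names `drain` and `sync_cycle`), the loop body could look roughly like this. `get_nowait()` plays the role of the peek: it returns an event if one is waiting and raises `queue.Empty` instead of blocking if not.

```python
import queue
import time
from typing import Callable, List, Tuple


def drain(events: "queue.Queue[str]") -> List[str]:
    """Collect every event that is already queued, without blocking."""
    drained: List[str] = []
    while True:
        try:
            drained.append(events.get_nowait())
        except queue.Empty:
            return drained


def sync_cycle(events: "queue.Queue[str]", run_sync: Callable[[], None],
               clock: Callable[[], float] = time.monotonic) -> Tuple[List[str], float]:
    """One loop iteration: wait, exhaust the queue, timestamp, then sync."""
    first = events.get()             # block until at least one change arrives
    batch = [first] + drain(events)  # then take everything else that is pending
    covered_until = clock()          # timestamp covers the drained batch only
    run_sync()                       # hypothetical: invokes rsync
    return batch, covered_until
```

Events that arrive while `run_sync()` is running simply stay in the queue and start the next cycle, so nothing that happened after `covered_until` is assumed to be synced.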

from workonsh.

ngocbh avatar ngocbh commented on July 21, 2024

Thanks @MarSoft @axkibe for your great suggestions. This package was built in a single night, since I needed a quick solution to synchronize the source code on my local machine to a remote server where I could compile and run my code. Many things have not been considered yet.

Unfortunately, I am not using the output of fswatch for rsync. fswatch is just a signal to trigger rsync (rsync still has to run on the entire source tree). Besides, it seems that setting a time interval between two consecutive rsync runs is not an ideal or efficient solution either.

My actual experience with this package is that it's too slow. After changing the source code, I often go straight to the server's console and run the code, but sometimes the new code isn't there yet, which makes for an awful debugging experience. This package, or even lsyncd, would probably be more suitable for backup purposes.

I think this package needs deeper thought about its use cases and especially its efficiency. Something like mirror would probably be more suitable here.

from workonsh.
