Comments (3)
One proposal
What if we referred to every type of object the same way we reference files, using the URL? The general idea is that the types are referred to hierarchically. I think this maps fairly well to the concepts we have. For example, to access the filesystem at a specific commit we use a file subroute of commit:
# access <file> as it was in <commit>
curl pfs/commit/<commit>/file/<file>
# the default value for <commit> is still `"master"`:
curl pfs/file/<file> # reads from master
# so that use case doesn't get longer
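To make the defaulting rule concrete, here is a minimal sketch of how a server might resolve the two forms above into a commit and a file. The function name `resolve_file_path` is hypothetical and not part of pfs; the only behavior taken from the proposal is that a path without a commit reads from master.

```shell
# Hypothetical helper (not part of pfs): resolve a proposed URL path
# into "<commit> <file>". Paths without a commit default to master.
resolve_file_path() {
    path=${1#pfs/}
    case $path in
        commit/*/file/*)
            rest=${path#commit/}
            commit=${rest%%/file/*}   # everything before the first /file/
            file=${rest#*/file/}      # everything after the first /file/
            ;;
        file/*)
            commit=master             # default from the proposal
            file=${path#file/}
            ;;
        *)
            echo "unrecognized path: $1" >&2
            return 1
            ;;
    esac
    printf '%s %s\n' "$commit" "$file"
}

resolve_file_path pfs/file/data.csv                 # prints: master data.csv
resolve_file_path pfs/commit/abc123/file/data.csv   # prints: abc123 data.csv
```

The short form stays short because the master default is applied at resolution time rather than spelled out in the URL.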
Here's what it would look like:
/file
# Access to master doesn't change
curl -XPUT pfs/file/<file> -d @<local_file>
curl -XGET pfs/file/<file>
# Accessing outdated data does
curl -XGET pfs/commit/<commit>/file/<file>
# And writing to other branches does as well
curl -XPUT pfs/branch/<branch>/file/<file> -d @<local_file>
/commit
# committing a branch is now
curl -XPOST pfs/branch/foo/commit
# committing master remains unchanged
curl -XPOST pfs/commit
# Listing commits remains unchanged
curl -XGET pfs/commit
# Getting commits from <branch>
curl -XGET pfs/branch/<branch>/commit
/branch
# creating <branch> from <commit>
curl -XPOST pfs/commit/<commit>/branch
# writing <file> to <branch>
curl -XPUT pfs/branch/<branch>/file/<file> -d @<local_file>
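All of the routes above share one shape: an optional scope (a branch or a commit), then a resource type, then an optional name. A client-side sketch of that composition might look like the following. `pfs_route` is a hypothetical helper introduced only for illustration, and an empty scope stands for master.

```shell
# Hypothetical helper (not part of pfs): compose a proposed route from
# an optional scope, a resource type, and an optional resource name.
#   scope:    "" for master, "branch/<branch>", or "commit/<commit>"
#   resource: file, commit, or branch
pfs_route() {
    scope=$1
    resource=$2
    name=${3:-}
    url=pfs
    if [ -n "$scope" ]; then
        url=$url/$scope
    fi
    url=$url/$resource
    if [ -n "$name" ]; then
        url=$url/$name
    fi
    printf '%s\n' "$url"
}

pfs_route "" file data.csv            # prints: pfs/file/data.csv
pfs_route branch/dev commit           # prints: pfs/branch/dev/commit
pfs_route commit/abc123 branch        # prints: pfs/commit/abc123/branch
```

The output can be passed straight to curl, for example `curl -XPOST "$(pfs_route branch/dev commit)"`.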
from pachyderm.
This has a very natural extension to the next primitive we're going to add, which is jobs. Here's how I think the /job route will look in this scheme:
# scheduling <job> on master
curl -XPUT pfs/job/<job> -d @<local_job>
# scheduling <job> on <branch>
curl -XPUT pfs/branch/<branch>/job/<job> -d @<local_job>
# accessing <file> from the output of <job>
curl -XGET pfs/job/<job>/file/<file>
# this filesystem is read-only
curl -XPUT pfs/job/<job>/file/<file> -d @<local_file>
Method not allowed.
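The read-only rule for job output can be sketched as a method check on the path. This is an illustration only: `job_file_status` is a hypothetical name, and the 200 result assumes the requested file exists. The one behavior taken from the proposal is that writes under a job's output are rejected with "Method not allowed" (HTTP status 405).

```shell
# Hypothetical helper (not part of pfs): print the HTTP status a server
# might return for METHOD on a job output path. Reads succeed, anything
# that would modify the read-only filesystem gets 405.
job_file_status() {
    method=$1
    path=${2#pfs/}
    case $path in
        job/*/file/*)
            case $method in
                GET|HEAD) echo 200 ;;   # assumes the file exists
                *)        echo 405 ;;   # Method Not Allowed
            esac
            ;;
        *)
            echo "not a job output path: $2" >&2
            return 1
            ;;
    esac
}

job_file_status GET pfs/job/wordcount/file/out.txt   # prints: 200
job_file_status PUT pfs/job/wordcount/file/out.txt   # prints: 405
```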
Looks interesting. It would be great if compatibility were kept between git's HTTP access syntax (as seen on GitHub) and the proposed file access syntax. For example, the way to access README.md at master and at some old version is:
https://github.com/pachyderm/pfs/blob/master/README.md
https://github.com/pachyderm/pfs/blob/90fcb0f0d8ab233daf2449c1125c5e6f7bee0d2d/README.md
Essentially it follows the <root>/<branch>/<filename> pattern, where <branch> may also be a commit hash.
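One way to keep that compatibility would be a thin translation layer from the `<branch>/<filename>` form to the proposed routes. The sketch below is an assumption-heavy illustration: `to_pfs_path` and `is_commit_id` are hypothetical names, and treating a 40-character lowercase hex string as a commit is borrowed from git's SHA-1 hashes. pfs commit IDs may look different, and branch names containing slashes are not handled.

```shell
# Hypothetical helpers (not part of pfs or git).
# is_commit_id: true if the argument looks like a 40-character hex hash.
is_commit_id() {
    case $1 in
        ''|*[!0-9a-f]*) return 1 ;;
    esac
    [ "${#1}" -eq 40 ]
}

# to_pfs_path: map "<ref>/<filename>" onto the proposed route.
to_pfs_path() {
    ref=${1%%/*}     # text before the first slash
    file=${1#*/}     # text after the first slash
    if [ "$ref" = master ]; then
        printf 'pfs/file/%s\n' "$file"
    elif is_commit_id "$ref"; then
        printf 'pfs/commit/%s/file/%s\n' "$ref" "$file"
    else
        printf 'pfs/branch/%s/file/%s\n' "$ref" "$file"
    fi
}

to_pfs_path master/README.md   # prints: pfs/file/README.md
to_pfs_path dev/README.md      # prints: pfs/branch/dev/file/README.md
```

The ambiguity between a branch name and a commit hash is exactly what the explicit `branch/` and `commit/` segments in the proposal avoid.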
Since the syntax we select for the access API is going to affect performance and other factors, it may be better to evaluate all the options with respect to some of the factors below.
A few important factors to consider for evaluation are:
- subscribers or listeners for any modifications at any level of the file system tree
- security / access control
- throttling / rate limiting
- caching / invalidating
Please add more if I missed something.
On the other hand, most of the above have been common to all file systems for ages. What is needed now, and badly wanted from new file systems, is:
- the ability to join and remove additional file systems on the fly (e.g. files on a smartphone's SD card joining and leaving as the user walks across wifi zones). The key is being able to serve files from a nearby smartphone's SD card over wifi rather than going through a central server at a remote location (similar to BitTorrent), and falling back to the central server when the phone is not available.
It would be great if new systems such as pfs could take at least one step in that direction, even if they do not fully solve it.
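From the client's side, the fallback behavior described above amounts to trying sources in order and stopping at the first one that answers. The following is a rough sketch under stated assumptions: `fetch_with_fallback` and `fetch_one` are hypothetical names, the source URLs in the usage note are made up, and every source is assumed to expose the same `/file/<file>` route. Peer discovery is out of scope.

```shell
# Hypothetical helpers (not part of pfs).
# fetch_one: fetch a single URL, failing fast if the host is unreachable.
fetch_one() {
    curl --fail --silent --show-error --connect-timeout 2 "$1"
}

# fetch_with_fallback FILE SOURCE...: try each source in order (nearby
# peer first, central server last) and stop at the first success.
fetch_with_fallback() {
    file=$1
    shift
    for src in "$@"; do
        if fetch_one "$src/file/$file"; then
            return 0
        fi
    done
    echo "no source could serve $file" >&2
    return 1
}
```

A call might look like `fetch_with_fallback data.csv http://phone.local/pfs http://central.example/pfs`, with the nearby device listed before the central server.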