Comments (12)
If you have a build rule like:
par_binary(
    name = "alpha",
    data = ["data.txt"],
    ...
)
# Then the following commands are available:
bazel run //path/to/alpha # This is the same as py_binary.
bazel run //path/to/alpha.par # This is subpar in action
However, it turns out only one of them can work:
- Bazel recommends using its own runfiles library to access data files.
- subpar recommends using pkgutil!
Can we have an ultimate example of a Python project which uses a simple data file, and which works for both par_binary and py_binary?
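One possible direction, offered as a sketch rather than a tested Bazel setup: pkgutil.get_data goes through the package's loader, so the same call can read a file from a package on disk and from a package inside a zip archive. The package name alpha_demo_pkg, the file contents, and the use of a plain zip as a stand-in for a .par file are all assumptions made for illustration; how py_binary lays out runfiles is not modeled here.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

PACKAGE = "alpha_demo_pkg"          # hypothetical package name
PAYLOAD = b"hello from data.txt\n"  # hypothetical file contents


def _read_via_pkgutil(path_entry):
    """Put path_entry on sys.path and read PACKAGE/data.txt through pkgutil."""
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        return pkgutil.get_data(PACKAGE, "data.txt")
    finally:
        # Undo the import so the next call resolves the package afresh.
        sys.path.remove(path_entry)
        sys.modules.pop(PACKAGE, None)


def read_both_ways():
    """Return the data as read from a directory and from a zip archive."""
    with tempfile.TemporaryDirectory() as tmp:
        # Layout 1: ordinary files on disk (assumed to resemble py_binary).
        src_root = os.path.join(tmp, "src")
        pkg_dir = os.path.join(src_root, PACKAGE)
        os.makedirs(pkg_dir)
        init_py = os.path.join(pkg_dir, "__init__.py")
        data_txt = os.path.join(pkg_dir, "data.txt")
        with open(init_py, "w"):
            pass
        with open(data_txt, "wb") as f:
            f.write(PAYLOAD)

        # Layout 2: the same package inside a zip (assumed stand-in for .par).
        archive = os.path.join(tmp, "alpha.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.write(init_py, PACKAGE + "/__init__.py")
            zf.write(data_txt, PACKAGE + "/data.txt")

        return _read_via_pkgutil(src_root), _read_via_pkgutil(archive)
```

Calling read_both_ways() returns a pair of byte strings, one per layout. The point of the design is that the application code never builds a filesystem path itself; it only names a package and a resource.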
from subpar.
Hi,
I'm having a similar issue. I have this project tree:
project/BUILD
project/experiments/scripts/BUILD
In project/BUILD, I have a cc_binary called agent. In project/experiments/scripts/BUILD, I have a py_library with a data dependency (data = ["//:agent"]) and a par_binary that depends on that py_library.
I've been trying for two hours and I can't figure out how to access the agent binary from my Python code. Does anyone know what I should put in place of the question mark below? Also, is there a constraint that a par_binary can only have a data dependency on a target that is in its own directory? I was having issues creating a par_binary from Python files in a subdirectory, so maybe that is part of the issue here.
pkgutil.get_data("?", "agent")
Attempts:
pkgutil.get_data("", "agent") # Returns None
pkgutil.get_data("experiments", "agent") # Returns None
pkgutil.get_data("experiments.scripts", "agent") # Returns None
from subpar.
Can you use pkg_resources or pkgutil to access the file?
Here's an example where we use pkgutil to access a file within the PAR and extract it onto the filesystem so that things work as the bundled library expects.
from subpar.
Actually maybe it's pkg_resources that doesn't work well with PAR, so try pkgutil first :)
from subpar.
pkgutil works, thanks.
Is this in the documentation somewhere?
from subpar.
@duggelz is the authority on this repo. If not, I think we should track adding it with this issue.
from subpar.
.par files are not intended (by me, at least) to extract all of their files to disk; that kind of defeats the point.
However, the waters are seriously muddied by the Bazel .zip for Windows which does always extract by default, and the various ways to create .par files inside Google that use magic command lines or environment variables to autoextract.
So, point 1:
- Document how to access data files
Yes, I should do this. For reference it's like:
import pkgutil
dat = pkgutil.get_data('my.package.name', 'filename.ext')
This provides a file-like object, which is often good enough. When you really need an actual file, you need an API to materialize that file to disk. The internal Google API is terrible (I can say that because I wrote it) so we don't plan to open-source it. The pkg_resources module should be the preferred API, at least it's better, but there's some issues with proper metadata handling at present, and also there are some logistical issues with pkg_resources being part of setuptools rather than part of the Python standard library, the way pkgutil is.
- "Feature Request: .par files should autoextract when you run them".
This is a valid feature request, but I'm biased by the fact that we're actively trying to move away from this inside Google, because the performance and disk usage implications have become quite severe. It's a balance between programmer ease of use, and performance/resource usage, and Google's position on that line is probably quite different than almost everyone else.
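For the case where an actual file is needed, one minimal approach is to read the bytes with pkgutil.get_data and write them to a temporary file. This is only a sketch under stated assumptions: the helper name materialize is hypothetical, it is not the internal Google API mentioned above, and the caller is responsible for deleting the file afterwards.

```python
import os
import pkgutil
import tempfile


def materialize(package, resource):
    """Copy a packaged resource to a temporary file and return its path.

    `materialize` is a hypothetical helper name, not part of subpar.
    """
    # On Python 3, get_data returns bytes; it returns None when the
    # package cannot be found or its loader cannot supply data.
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise FileNotFoundError(
            "cannot load %r from package %r" % (resource, package))

    # Keep the original extension so tools that sniff by suffix still work.
    suffix = os.path.splitext(resource)[1]
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    return path
```

A call such as materialize('my.package.name', 'filename.ext') would then hand back a path that can be passed to libraries that insist on a filename. Extracting on demand like this keeps the cost proportional to the files that really need to be on disk.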
from subpar.
Slight correction: pkgutil.get_data returns a string (on Python 3 I believe it's actually bytes), not a file-like object.
I personally have less interest in 2, but see how others might.
from subpar.
A resource API is finally coming to the standard library in Python 3.7, and will be backported to 2.7 and 3.4-3.6. Hallelujah!
https://gitlab.com/python-devs/importlib_resources
Also, I'm leaning toward a "just extract everything to disk all the time" strategy for this tool, instead of the much more complicated heuristics used inside Google for their performance benefits. At the same time, we're investigating open-sourcing the real PAR file implementation used inside Google.
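For readers on a newer interpreter, the standard-library resource API mentioned above can be used roughly as follows. This sketch assumes Python 3.9 or later, where importlib.resources provides files() and as_file(); the two helper names are hypothetical, and whether a given .par loader supports this API is not confirmed by anything in this thread.

```python
from importlib import resources  # files() and as_file() assume Python 3.9+


def read_resource_bytes(package, name):
    """Return the raw bytes of a resource that ships inside `package`."""
    return resources.files(package).joinpath(name).read_bytes()


def resource_size_on_disk(package, name):
    """Get a real filesystem path for the resource and report its size."""
    with resources.as_file(resources.files(package).joinpath(name)) as path:
        # Inside this block `path` is a pathlib.Path; for zipped packages it
        # may point at a temporary copy that is removed on exit.
        return path.stat().st_size
```

The as_file() context manager covers the "I need an actual file" case without the caller having to manage temporary files by hand.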
from subpar.
@duggelz If it's coming in Python 3.7, that means we'll only have to wait 3-4 years before it makes it into the distroless base images which rules_docker uses. :)
from subpar.
Could someone suggest how to deal with data deps provided by WORKSPACE? Basically, I'd like to embed a deep learning model and then read it with TFLite from inside. TFLite needs either a file, or byte representation of the model. The model does get embedded into PAR, but it's at the root level (if I unzip it), and therefore, it seems, pkgutil can't get to it.
The layout of the unpacked par is as follows:
tflite_models __main__.py subpar <namespace name>
The models are inside tflite_models.
from subpar.
Answering my own question after digging through the code some more:
pkgutil.get_data("__main__", "tflite_models/detect_float.tflite")
gets the data
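To see why naming __main__ as the package reaches files at the archive root, here is a small self-contained sketch. It assumes a plain zip archive behaves like a .par for this purpose, and model.bin with its contents is a made-up stand-in for the real model file.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

# Entry point stored in the archive. It reads a file from a top-level
# directory that sits next to __main__.py (file name is hypothetical).
MAIN_SOURCE = (
    "import pkgutil\n"
    "data = pkgutil.get_data('__main__', 'tflite_models/model.bin')\n"
    "print(data.decode('ascii'))\n"
)


def run_archive_demo():
    """Build a throwaway zip, run it with Python, return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "demo.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("__main__.py", MAIN_SOURCE)
            zf.writestr("tflite_models/model.bin", "fake model bytes")
        result = subprocess.run(
            [sys.executable, archive],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_archive_demo())
```

When Python runs an archive, the __main__ module is loaded from the archive root, so resource paths passed to pkgutil.get_data are resolved relative to that root. The returned bytes can then be handed to a consumer that accepts an in-memory model.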
from subpar.