Comments (7)

jedbrown commented on June 1, 2024

Thanks for starting this discussion. One thing that comes to mind is that if Cargo.lock is held constant, we should be able to reproduce builds from months or years prior. This is somewhat in conflict with having a common install of spack that may be upgraded (and with spack external find, though that is a loophole with any external dependencies, and people who are pedantic about this are likely building from carefully managed containers).

One thought on the distribution model: as an alternative to having each libxyz-sys crate maintain the logic of when and how to use spack, I wonder if we could automate (e.g., with spack CI) publishing libxyz-src crates any time the corresponding spack package is updated (or as part of spack's release process, though that's relatively slow). The entirely automated deployment could expose features that spack knows about. Then the libxyz-sys crate would use a feature to enable depending on libxyz-src, but would not need to know about spack directly (and nothing from spack would be pulled in when that feature is not enabled).
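
A rough sketch of what that automation could look like; `spack pkg changed` is a real subcommand for listing recipes modified between git revisions (check `spack pkg changed --help` for exact usage), while the libxyz-src naming and the generation step are hypothetical:

```python
# Hypothetical sketch of the proposed CI automation. `spack pkg changed` is a
# real spack subcommand; the "-src" naming convention and the crate-generation
# step are assumptions for illustration.
import subprocess

def changed_packages(base_rev: str, rev: str = "HEAD") -> list[str]:
    out = subprocess.run(
        ["spack", "pkg", "changed", base_rev, rev],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()

def generate_src_crate(pkg: str) -> None:
    """Placeholder: render the libxyz-src crate from the spack recipe."""

def publish_src_crates(base_rev: str) -> None:
    for pkg in changed_packages(base_rev):
        generate_src_crate(pkg)
        subprocess.run(["cargo", "publish", "-p", f"{pkg}-src"], check=True)
```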

cosmicexplorer commented on June 1, 2024

I wasn't sure whether to introduce this yet, but I was speaking with @jedbrown over Mastodon and he mentioned he was looking to produce cargo packages for libCEED, PETSc, and MPI, for example. While my initial proposal above was just about how spack-rs is useful even to pure Rust developers who don't otherwise use spack, I believe I have done enough work on this for it to be useful for the separate-but-related goal of exposing spack packages to cargo (which I understand is obviously much more relevant to the spack project). He advised me to ping him on GitHub so we can hash out the remaining requirements. From our discussion, I recall:

  • absolutely need support for spack external find (need to make an API for toggling this)
  • remove the m4 bootstrap build that's just used to ensure clingo-bootstrap exists
    • this build also fails on @jedbrown's Linux machine due to an issue with gcc-runtime for some reason -- need to repro
  • describe the caching mechanisms (spack is downloaded not as a git repo, but as a release tarball, into ~/.spack/summonings, so every distinct spack package build variant should be cached after the first build; see the sketch after this list)
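
To make that layout concrete, a minimal sketch; the per-version directory naming is an assumption based on the description above, not spack-rs's actual scheme:

```python
# Illustrative sketch of the caching layout described above; the exact
# per-version directory name is an assumption, not spack-rs's real scheme.
from pathlib import Path

def spack_workspace(spack_version: str) -> Path:
    """Each spack-rs release pins one spack release tarball, unpacked once."""
    return Path.home() / ".spack" / "summonings" / f"spack-{spack_version}"

# Within that workspace, spack keys every installed package by its full
# concretized spec hash, so each distinct build variant gets its own prefix
# and is built at most once.
print(spack_workspace("0.21.2"))
```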

@jedbrown feel free to add a list of any additional requirements you can think of for the PETSc/libCEED/MPI crates. I'll be using this issue to track progress, but we can also split out PETSc/MPI into their own issue(s) as needed.

cosmicexplorer commented on June 1, 2024

Hm, one more thing. Currently, there is only a single canonical spack installation, which is downloaded from a release tarball and not updated (that spack workspace is reused for all cargo invocations using the same version of spack-rs). This was done to avoid breakages that might occur if a package definition is updated in a breaking way without coordinating with the cargo package. However, it also means that packages like PETSc would need to ask for a spack-rs version bump every time they wanted to update their package definition, which isn't great.

The way I initially approached this was to allow cargo packages to define their own spack packages in a local spack repo defined in the crate: see https://github.com/cosmicexplorer/spack-rs/blob/main/vectorscan/sys/local-repo/packages/vectorscan/package.py. If, e.g., PETSc were to use this approach, they could copy their spack package definition into the cargo crate and thereby avoid needing to bump spack-rs just to update their package definition. However, they would still need to bump their spack-rs dep whenever they wanted to make use of new spack features, and they would need to mirror every change they make to their upstream spack package definition.
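
For anyone unfamiliar with the layout, a local-repo recipe is just an ordinary spack package.py. Here is a minimal sketch in that style; the version, checksum, and build options below are placeholders rather than the real values from the linked vectorscan recipe:

```python
# packages/vectorscan/package.py -- minimal sketch of a vendored recipe.
# The version, sha256, and options here are placeholders, not real values.
from spack.package import *

class Vectorscan(CMakePackage):
    """Portable fork of the hyperscan regex engine."""
    homepage = "https://www.vectorcamp.gr/vectorscan/"
    url = "https://github.com/VectorCamp/vectorscan/archive/refs/tags/vectorscan-5.4.11.tar.gz"

    version("5.4.11", sha256="0000000000000000000000000000000000000000000000000000000000000000")

    variant("shared", default=False, description="Build shared libraries")

    depends_on("boost", type="build")

    def cmake_args(self):
        return [self.define_from_variant("BUILD_SHARED_LIBS", "shared")]
```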

I think this is a reasonable-enough tradeoff, but I'm not especially familiar with how spack's git repo is currently deployed, and if there are processes that assume the existence of the git repo, then I would probably want to look into changing this process. For now, a static spack release tarball that only gets updated when the spack-rs build dependency is explicitly bumped seems like the most reliable way to avoid confusing stateful bugs when making spack (a stateful CLI) interact with cargo (a stateless CLI).

cosmicexplorer commented on June 1, 2024

@jedbrown:

if Cargo.lock is held constant, we should be able to reproduce builds from months or years prior

Ok, this sounds like a strong argument in support of tying spack-rs package versions to specific spack release versions (which is the current approach), and it also sounds like you're not currently relying on spack being deployed specifically as a mutable git repo (which is what I expected, but wanted to confirm). This is very convenient.

an alternative to having each libxyz-sys crate maintain the logic of when and how to use spack, I wonder if we could automate (e.g., with spack CI) publishing libxyz-src crates any time the corresponding spack package is updated

This is extremely interesting and I absolutely hadn't considered this! In fact, I was initially looking to provide an "automagical" interface which e.g. executed bindgen without having to reimplement that logic in each -sys crate build script, but ended up reworking that approach because I couldn't figure out how to make it general enough. If we invert the control here and instead generate cargo package definitions from within spack as you propose, I think we would have a lot more freedom to autogenerate cargo features and build script logic. I can also easily see this being applied to other package registries like pip in the future.

I think my previous approach is still good and useful, especially when paired with the local repo approach to use a custom package recipe, but I can easily see the benefit of your approach for spack packages like petsc, and I would prefer to focus on that now since it seems well-scoped and highly feasible. In doing so we will also be able to address things like spack external find, which will be usable for the local repo use case as well.

We already have a spack containerize command; perhaps we can consider spack publish --cargo package@version? I would need to play around with this but I absolutely think we should be able to literally generate an entire cargo project from a template that maps e.g. variants to features (we'll need to get creative with this since spack is more expressive than cargo), executes bindgen, and links any static or dynamic libraries. I assume we can introduce additional spack directives or package methods as needed to define the cargo translation layer.
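
To make the variant mapping concrete, here is a minimal sketch of the translation such a command might perform; the command itself, the simplified variant metadata, and the petsc feature names below are all hypothetical:

```python
# Hypothetical sketch of the translation a `spack publish --cargo` command
# might perform. `variants` is simplified here to map names to objects with a
# boolean `.default`; spack's real variant metadata is richer, and
# multi-valued variants would need a convention (e.g. one feature per value).
from collections import namedtuple

Variant = namedtuple("Variant", "default")  # stand-in for spack's variant type

def features_from_variants(variants: dict) -> dict[str, bool]:
    # Only boolean variants map cleanly, since cargo features carry no values.
    return {
        name: v.default
        for name, v in variants.items()
        if isinstance(v.default, bool)
    }

def render_cargo_toml(pkg: str, version: str, features: dict[str, bool]) -> str:
    default = [name for name, on in features.items() if on]
    lines = [
        "[package]",
        f'name = "{pkg}-src"',
        f'version = "{version}"',
        "",
        "[features]",
        "default = [" + ", ".join(f'"{f}"' for f in default) + "]",
    ]
    lines += [f"{name} = []" for name in features]
    return "\n".join(lines)

# Placeholder variants, not petsc's real ones:
print(render_cargo_toml(
    "petsc", "3.20.0",
    features_from_variants({"mpi": Variant(True), "shared": Variant(False)}),
))
```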

cosmicexplorer commented on June 1, 2024

Hm, the spack publish --cargo approach could probably still rely on the spack-rs crate to make it reproducible, but we could instead simply vendor the relevant parts of the spack checkout used to invoke spack publish --cargo. However, that approach loses the caching benefit of sharing built artifacts across a common spack version. One solution then might be:

  1. generate cargo crates that depend on a particular version of spack-rs
  2. vendor just the current package recipe into the generated crate definition as a local spack repo (using the existing local repo functionality of the spack-rs crate)

This actually solves the conundrum of updating the package recipe in two places at once: just vendor the package recipe into the generated crate upon each publish. And this also maximizes our ability to cache spack builds across crates while still ensuring separate crate versions always produce distinct cache results (because they will build distinct versions of the source package).
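
A minimal sketch of that vendoring step (the directory layout follows spack's repo format; the namespace is a placeholder):

```python
# Sketch of the vendoring step: copy the target recipe (and, in a fuller
# version, its transitive dependencies) into the generated crate as a local
# spack repo. The repo.yaml/packages/ layout matches spack's repo format; the
# `vendored` namespace is a placeholder.
import shutil
from pathlib import Path

def vendor_recipe(spack_repo: Path, pkg_name: str, crate_dir: Path) -> None:
    dest = crate_dir / "local-repo"
    (dest / "packages").mkdir(parents=True, exist_ok=True)
    # Every spack repo needs a repo.yaml declaring its namespace.
    (dest / "repo.yaml").write_text("repo:\n  namespace: vendored\n")
    shutil.copytree(
        spack_repo / "packages" / pkg_name,
        dest / "packages" / pkg_name,
        dirs_exist_ok=True,
    )
```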

cosmicexplorer commented on June 1, 2024

would not need to know about spack directly (and nothing from spack would be pulled in when that feature is not enabled).

I read your comment again and wanted to clarify that a literal interpretation of "would not need to know about spack directly" is probably not possible with the current general architecture of spack. At the very least, we need to be able to execute the python code contained in spack package recipes every time a spack package is built, which would need to be interpreted and executed by the build.rs script of any wrapper cargo package. This requires a significant portion of the spack source code to be available in order to provide e.g. the package and spec API at "runtime" (which is build time for the spack/cargo package).
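
Put differently, the minimum a wrapper build script can slim down to is still "unpack the pinned spack tarball and drive it." A rough sketch of that floor, using only the stock spack CLI:

```python
# Rough sketch of the irreducible build-time step: some spack checkout must be
# present to evaluate the Python recipes. Here the pinned tarball's bin/spack
# is driven directly via the stock CLI.
import subprocess
from pathlib import Path

def install_and_locate(spec: str, spack_root: Path) -> str:
    spack = str(spack_root / "bin" / "spack")
    subprocess.run([spack, "install", "--fail-fast", spec], check=True)
    out = subprocess.run(
        [spack, "find", "--format", "{prefix}", spec],
        capture_output=True, text=True, check=True,
    )
    # The install prefix is what a build.rs would hand to the linker.
    return out.stdout.strip()
```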

When I say "the general architecture of spack", I am referring to package recipes making use of a mutable python API. This is not in general a "problem", but I have previously worked on the pants build tool which completely rewrote its build task API to avoid stateful python logic in order to enable parallelism and especially interop with rust code to avoid the GIL; see e.g. https://www.pantsbuild.org/2.18/docs/writing-plugins/the-rules-api/concepts. I'm not advocating for that in this issue, but I will note that pants also configures the version of itself like a pinned dependency (see install guide), which allows it to precompile its Rust code and apply several significant optimizations to the python code in the released binary using pyoxidizer.

However, I totally agree that we can and should remove all spack-specific logic from e.g. the Cargo.toml file if we take your approach (and I believe we should) of generating cargo wrapper crates from spack instead. So if that's what you meant, I absolutely agree with you, and we should definitely be able to gate spack external find invocation behind a cargo feature flag as well instead of needing to bubble that up to cargo.

The complex Cargo.toml interface was largely motivated by the need to ensure that spack builds from independent cargo packages in the same dependency graph (which would therefore be linked into the same binary) are always performed within the same spack environment (ephemeral spack environments with checksummed names were used to accomplish this). I believe this is still necessary to some extent, but the required metadata can simply be autogenerated by spack when it generates the cargo wrapper package, so spack's presence can still be made effectively invisible.
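
For illustration, a minimal sketch of the checksummed-name scheme; the hash inputs and name format are assumptions rather than spack-rs's actual implementation:

```python
# Illustrative sketch of the checksummed-environment scheme; the hash inputs
# and name format are assumptions, not spack-rs's actual implementation.
import hashlib

def env_name(specs: list[str]) -> str:
    # Sort so every crate requesting the same spec set derives the same name,
    # regardless of dependency-graph traversal order.
    digest = hashlib.sha256("\n".join(sorted(specs)).encode()).hexdigest()
    return f"spack-rs-{digest[:16]}"

# Two crates requesting the same specs agree on the environment name:
assert env_name(["zlib@1.3", "petsc+mpi"]) == env_name(["petsc+mpi", "zlib@1.3"])
```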

@jedbrown, does the above address the goal of making spack more of an implementation detail from the Rust/cargo perspective, even though we can't avoid depending on spack's python code?

cosmicexplorer commented on June 1, 2024

What I'm proposing is:

  • A spack publish --cargo (or whatever) command to generate a cargo package libxyz-src which maps spack variants to cargo feature flags
    • This should probably involve an additional set of methods for package.py files to explicitly "export" certain variants as features (analogous to calling self.define_from_variant() in the cmake_args() method); see the sketch after this list.
    • The use of spack external find to locate certain dependencies could similarly be configured by an overridable package method.
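
A sketch of what those package.py hooks could look like; every cargo_* method name here is hypothetical and does not exist in spack today:

```python
# Hypothetical package.py hooks sketching the proposed "export" interface;
# none of these method names exist in spack today.
from spack.package import *

class Libxyz(CMakePackage):
    variant("shared", default=False, description="Build shared libraries")
    variant("jit", default=True, description="Enable the JIT compiler")

    def cargo_features(self):
        # Analogous to define_from_variant(): choose which variants become
        # cargo features in the generated libxyz-src crate.
        return ["shared", "jit"]

    def cargo_allow_external(self):
        # Opt in to `spack external find` when the generated crate enables a
        # (hypothetical) `external` feature.
        return True
```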

The generated cargo package would:

  • Depend on the spack-rs crate (as a build dependency), which also fixes the version of spack used to a particular release tarball.
    • Note that the spack-rs crate requires this in order to ensure the same spack version is used across cargo crates in the same dependency graph, which also lets us share the local spack build cache across all cargo builds on the same machine.
    • The spack-rs crate essentially allows us to lean on cargo's dependency resolution logic to resolve the appropriate version of spack's implementation code (the stuff in lib/).
  • Vendor the target package's package.py file, as well as any of its transitive dependencies, into the generated crate in the format of a spack repo (like the vectorscan local-repo linked above).
    • This enables generated cargo packages to correspond to the package recipes at the time of executing spack publish --cargo, even though the actual spack source checkout is downloaded and versioned separately.
    • In the case where a package needs a new spack feature (e.g. new code in lib/), its maintainers can update their package.py to constrain the generated spack-rs dependency in the generated Rust crate to require a spack release with the appropriate version.
  • Generate some form of Cargo.toml metadata like the format in the OP, but automatically.
    • Some additional cargo metadata (using the officially supported interface for adding arbitrary structured data to crate definitions) is necessary to propagate spack-specific info across the cargo dependency graph (this is necessary for build correctness as well as caching), but this can and should be entirely an implementation detail. Furthermore, this metadata format can be considered explicitly unstable and only for consumption by the spack-rs crate; see the sketch below.
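
A minimal sketch of the generator emitting that metadata; the [package.metadata.spack] table layout is invented for illustration:

```python
# Sketch of the generator emitting the propagated metadata through cargo's
# official `[package.metadata]` escape hatch. The `spack` table layout is
# invented for illustration and, per the above, explicitly unstable.
def render_metadata(spec: str, spack_version: str, repo_path: str) -> str:
    return "\n".join([
        "[package.metadata.spack]",
        f'spec = "{spec}"',
        f'spack-version = "{spack_version}"',
        f'local-repo = "{repo_path}"',
    ])

print(render_metadata("petsc+mpi", "0.21.2", "local-repo"))
```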
