julialang / juleps

Julia Enhancement Proposals

License: Other

juleps's Introduction

Juleps: Julia Enhancement Proposals

This repository contains proposals to enhance the Julia language and ecosystem. It contains the following "Juleps" (Julia Enhancement Proposals):

Pkg3 – the next generation of Julia package management
RTLIB – a runtime-library for Julia.
Find – Reorganize search and find API
Logging – A general logging interface

juleps's Issues

Find Julep: issue with sentinel values

Originally posted as comment on commit, suggested to open issue instead.
cc @nalimilan

Find Julep

Extract from section Issues Beyond the Scope of This Julep

Sentinel values in a world where array indices do not necessarily start with 1:
- findfirst(x, v) returns 0 if no value matching v is found; however, if x allows 0 as an index, the meaning of 0 is ambiguous. One could return typemin(Int) or minimum(linearindices(x))-1, but what if x starts indexing at typemin(Int)?
- No matter [what] sentinel value gets returned, the deprecation strategy here is delicate. There may be a lot of code that checks the return value and compares it to 0.

It would be nice if supporting non-1-based indices was in the scope of this Julep, and it might be better to address non-1-based arrays before their use becomes more widespread.

Note that using first(linearindices(A))-1 as a sentinel value would be non-breaking for standard 1-based arrays.
If A starts indexing at typemin(Int), then the returned sentinel value would wrap around to typemax(Int) (i.e. no error).

The calling code could then test for the sentinel value in the same way:

i = findfirst(!iszero, A)
if i != first(linearindices(A))-1
  ...
end

To simplify, one could introduce a new function, sentinelindex, and use it consistently instead:

sentinelindex(A) = first(linearindices(A))-1

i = findfirst(!iszero, A)
if i != sentinelindex(A)
  ...
end
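
The same idea can be sketched outside Julia. The Python below is only an illustration under stated assumptions: OffsetList is a hypothetical container with an arbitrary first index, and sentinelindex and findfirst are stand-ins that mirror the names used above, not real Base functions. Python integers do not overflow, so the wrap-around case at typemin(Int) is not modeled here.

```python
class OffsetList:
    """Hypothetical container whose valid indices are first .. first + len - 1."""

    def __init__(self, items, first):
        self.items = list(items)
        self.first = first


def sentinelindex(a):
    # One below the first valid index, so it can never be a valid index.
    # For a 1-based container this is 0, matching the current behaviour.
    return a.first - 1


def findfirst(pred, a):
    # Return the index of the first item satisfying pred, or the sentinel.
    for offset, item in enumerate(a.items):
        if pred(item):
            return a.first + offset
    return sentinelindex(a)


def nonzero(x):
    return x != 0


zero_based = OffsetList([7, 0], first=0)
i = findfirst(nonzero, zero_based)
if i != sentinelindex(zero_based):
    print("found at index", i)
```

With a 0-based container, a match at index 0 is no longer confused with "not found", because the sentinel there is -1.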

Pkg3: defaults of open vs closed

Specifically with respect to

When started in interactive mode, Julia defaults to open; when started non-interactively, it defaults to closed.

So for code that depends on modules available via "open" mechanisms like LOAD_PATH, starting julia interactively and running include("script.jl") will work but julia script.jl will not?

This seems like it would be a major gotcha. I don't think interactivity is a good proxy for whether people want to be permissive about code loading or have a locked-down reproducible environment. (It is a good criterion for whether or not to enable user-prompting behavior, though.)

Package options

It would be nice to have a uniform way to specify package configuration options. Currently, I'm using environment variables for this (e.g. PYTHON in PyCall), but that is somewhat non-Julian. More importantly, it requires some manual effort for me to save the option in a file so that it is "remembered" when you do Pkg.update etcetera, even if the environment variable is no longer set. Many packages will get this wrong, and even if they get it right it is a lot of duplication of effort.

My suggestion would be that the package TOML file should have something like:

[option.python]
value = "python3" # default

The user interface would be something like:

Pkg.option("PyCall", :python) — return the current choice (saved in a file somewhere)
Pkg.option("PyCall", python="python2") — change the option, maybe rebuilding automatically unless this is called by the package during the build
Pkg.add and Pkg.build will accept python="python2" keyword arguments to set options when installing/building.

The value of all options would be saved in an options.toml file somewhere.

A related mechanism would be used for package alternatives, see #37.
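
A minimal sketch of the lookup this implies, written in Python purely for illustration: the declared default comes from the package's own file, and a saved user choice overrides it. The function names get_option and set_option and the dictionary layout are assumptions made for this example, not a proposed Pkg API, and reading or writing the actual options.toml file is left out.

```python
def get_option(defaults, saved, package, name):
    # The package must have declared the option; otherwise this raises KeyError.
    default = defaults[package][name]
    # A saved user choice, if any, takes precedence over the declared default.
    return saved.get(package, {}).get(name, default)


def set_option(saved, package, **choices):
    # Record the user's choices so they persist across later updates.
    saved.setdefault(package, {}).update(choices)
    return saved


# Defaults as they might be read from each package's TOML file (assumed layout).
defaults = {"PyCall": {"python": "python3"}}
# Saved choices as they might be read from a shared options file (assumed layout).
saved = {}

print(get_option(defaults, saved, "PyCall", "python"))
set_option(saved, "PyCall", python="python2")
print(get_option(defaults, saved, "PyCall", "python"))
```

Keeping the saved choices separate from the declared defaults is what lets an option survive after the environment variable that originally set it is gone.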

Pkg3: Runtime Configuration - Specification of Installed Packages?

First of all, really nice work!

As noted in the DifferentialEquations plan for modularization (SciML/DifferentialEquations.jl#59), the only form of "conditional dependencies" that I really need is the ability to add/remove functionality at compile-time by knowing which packages are installed on the user's system. It seems like this can be done by the user specifying a flag in the Runtime Configuration file, but is there any way that this can be computed? That is, can this flag instead be determined by code which checks whether packages X, Y, Z are installed and, if so, allows me to add functionality?

An example is that, at the top of the module, I'd want to:

if SUNDIALS_INSTALLED
  using Sundials
end

and then inside some higher level API functions have a branch which is allowed if SUNDIALS_INSTALLED (and error if not). If this is a possibility, then nothing would actually need to be changing at runtime, and thus the current conditional module problems that I have (which are not too difficult! I know there are more difficult conditional module problems which don't apply here) would be solved seamlessly to the user.

Pkg3: Minor-Upgrade

Right now, as I understand it, there are two levels of package updating:

Update: get the latest patch release (but keeping same minor version) i.e. X.Y.a -> X.Y.b
- Nonbreaking
Upgrade: get the latest release, i.e. A.α.a -> B.β.b
- Breaking

I suggest perhaps it is worth going all out, and having one update/upgrade command for each level in the Semantic Versioning.
Perhaps:

Patch: get any patch releases on the same major and minor version number, i.e. X.Y.a -> X.Y.b
- Bug-fix; Nonbreaking
Update: get any minor or patch releases on the same major version number, i.e. X.α.a -> X.β.b
- Feature Addition; Nonbreaking
Upgrade: get any releases, including major, minor, or patch, i.e. A.α.a -> B.β.b
- Breaking

So this is the addition of the middle stage.
Where you upgrade minor, without upgrading major.

I think if SemVer is being followed correctly then this is useful.
Since a Minor change, is supposed to be nonbreaking.
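
The three levels can be sketched as a filter over the available versions. This is an illustration in Python, not Pkg3 code: parse_semver and pick_version are hypothetical names, pre-release suffixes are ignored, and the special treatment of 0.x.y packages is not modeled.

```python
def parse_semver(version):
    # "1.4.0" -> (1, 4, 0); pre-release and build suffixes are not handled.
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


# How many leading components must stay fixed at each level.
FIXED_COMPONENTS = {"patch": 2, "update": 1, "upgrade": 0}


def pick_version(current, available, level):
    # Choose the newest available version permitted at the given level.
    keep = FIXED_COMPONENTS[level]
    cur = parse_semver(current)
    allowed = [
        v
        for v in (parse_semver(a) for a in available)
        if v[:keep] == cur[:keep] and v >= cur
    ]
    best = max(allowed, default=cur)
    return ".".join(str(n) for n in best)


available = ["1.2.3", "1.2.7", "1.4.0", "2.0.1"]
for level in ("patch", "update", "upgrade"):
    print(level, "->", pick_version("1.2.3", available, level))
```

Starting from 1.2.3, this selects 1.2.7 for patch, 1.4.0 for update, and 2.0.1 for upgrade.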

People don't really follow SemVer that closely, but perhaps they should.
In part, the lack of following SemVer closely is because of a fear of declaring a package v1.0.0,
and packages in the 0.x.y stage do not have to obey most of SemVer and can have breaking changes in their minor (or even patch) versions.

This itself suggests that perhaps there should be different behaviors for the update and upgrade commands depending on the package's version, e.g. that any change to a 0.x.y package be considered a potentially breaking change, and thus require upgrade.

More sanely, perhaps, it suggests that no package that is not tagged 1.0.0 or higher should be allowed in the official repo.
Since at the point one is asking to be added to the official repo, one really is almost by definition at least at 1.0.0. And so one should now be following SemVer properly.

Pkg3: package namespaces

The current proposal doesn't mention namespaces anywhere. This issue is to discuss how namespaces could fit into the proposal and Julia's own namespacing in general.

Pkg3: running a package

Over in discourse it was discussed to distinguish between "runnable packages" (called projects in that thread) and "library packages" (called packages). The suggestion which gathered the most likes was not to distinguish between projects and packages, but instead to "standardize where to put runnable scripts into packages as we know them now. Say a folder run/ or scripts/ and the main program would be run/main.jl. Pure "Projects" would have an empty src/ folder and full run/ folder and vice versa (most would have a bit of both). Similar to Pkg.test("SomePkg") we could have a Pkg.run("SomePkg") to run run/main.jl." Also a command-line option could be good, say julia --run SomePkg.

(I haven't followed this Julep too closely, please close this issue if this is in it already. Or let me know if this should be posted over in Julia itself.)

Pkg3: Unregistered Packages?

I'm unclear as to how Pkg3 will handle unregistered packages. Is the expectation that all packages must be registered with at least a personal/private registry? Currently, it's very useful to run Pkg.clone to try out someone's changes. Similarly, declaring dependencies with just a git url in a DECLARE file provides a convenient way of developing a collection of inter-dependent packages prior to registering them.

Pkg3: "Binary" or "non-source" packages

Taken from JuliaLang/julia#16330:

It would be nice if it would be possible to have packages where the functionality is only available in precompiled form.

Pkg3: Separated Test Environments

Will one be able to set a separate environment for testing? I don't think that the current REQUIRE setup is sufficient for the current organization-by-organizations setup. Sometimes there are large changes that are happening in tandem, and it would be helpful to have a designated way to say "this package needs to be on master for tests".

One case where this shows up is in organizations where there's a change to a main package which causes changes in all dependents. Another case where this shows up is where you create features which are dependent on a large change in some other package, like getting your package ready for DataFrames' nullable-array interface and wanting the proper CI. Currently this is done with little hacks around it (calling an extra script to checkout packages only when the test is run on CI), but I think making it standard and explicit would be beneficial for both developers and those who are reviewing packages (@tkelman).

Branching process Julep

We need to figure out how we're going to handle branching going forward. For the immediate future we plan to keep master as 1.1, but we will need to figure out how we want to handle breaking changes in the long term.

The key questions

How to structure our branches?
What tooling can we use to minimise the need for manual intervention with handling backports?
What should get backported to minor and patch releases?

Pkg3: default pkg/depot precedence clarification

Hi folks,

The Pkg3 julep looks really good so far! I was a little confused about the language describing pkg precedence across the union of depots.

Some environment and/or Julia variable – DEPOT_PATH maybe? – will control the set of depots visible to a Julia process. The registries, libraries, packages, and environments visible to Julia are the union across all depots in the depot path
...
The set of installed library versions is the union across depots. If the same library version occurs multiple times in the depot path, the first occurance is used – different instances of the same library version may be different depending on how they are configured and installed. The set of installed package versions is the union across depots. If the same package version occurs multiple times in the depot path, the first occurance is used

Does this mean that the ordering of the depot paths in the DEPOT_PATH environment variable determines pkg precedence? For example, if we have something like JULIA_DEPOT_PATH=/home/Julia/.julia:/usr/local/share/julia/system:/usr/local/share/julia/standard then user packages would take precedence over system packages and so on?
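
If the answer is yes, the "first occurrence wins" rule from the quoted text would behave like the sketch below. It is written in Python for illustration only; visible_packages and the depot_contents mapping are assumptions made for this example, and the package names and versions are invented.

```python
def visible_packages(depot_path, depot_contents):
    # depot_contents maps a depot directory to {(name, version): install info}.
    resolved = {}
    for depot in depot_path.split(":"):
        for key, info in depot_contents.get(depot, {}).items():
            # setdefault keeps the entry from the earliest depot in the path.
            resolved.setdefault(key, (depot, info))
    return resolved


user = "/home/Julia/.julia"
system = "/usr/local/share/julia/system"
standard = "/usr/local/share/julia/standard"
depot_path = ":".join([user, system, standard])

# Hypothetical depot contents, invented for the example.
depot_contents = {
    user: {("Example", "1.0.0"): "user build"},
    system: {("Example", "1.0.0"): "system build", ("Other", "0.3.0"): "system build"},
}

for key, (depot, info) in visible_packages(depot_path, depot_contents).items():
    print(key, "from", depot, "-", info)
```

Under this reading, the user's copy of Example 1.0.0 shadows the system copy, while Other 0.3.0 is still visible because the visible set is the union across depots.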

Release process Julep

How often should we make future releases?
What should be the timelines and requirements for alpha, beta, RC, etc.?
When should we upgrade LLVM?

We should also fully document all the steps in making a release (e.g. https://github.com/JuliaLang/julia/blob/883c8a38920985e0c02df169ff8c379731d88fc6/Makefile#L112-L129)

Pkg3: telemetry

I'd like to see Pkg3 support some kind of anonymized opt-out telemetry.

For example, it could keep track of statistics like how many times Pkg.add has been called for a given package, with a heuristic to not count CI testing. This would give us a more reliable guide of package popularity than GitHub stars.

Pkg3: naming of project filenames

Hi all

I saw the talk on Pkg3 and was a bit confused with the naming.
My personal expectation was something like this:

Manifest.toml for the actual configuration
Manifest.lock for the latest working set
Manifest.journal for the log. This one I would expect to be like Firebird or another single-file DB with a lock.

Of course Config.* would be fine too. The mix of Config.toml and Manifest.toml confused me, since many development environments also use these names,
like Android with AndroidManifest.xml, npm with package.json and package-lock.json, or cargo with Cargo.toml and Cargo.lock.

Regards

Pkg3: comparison with Guix

Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, [...] and takes a purely functional approach to package management

https://www.gnu.org/software/guix/
https://www.gnu.org/software/guix/manual/html_node/Package-Management.html

Guix is implemented in Scheme, and Julia's parser is written in a Scheme dialect. All the ideas of Guix mentioned in the above quote appear in line with those of Julia.

I don't know if Pkg3 should be "like Guix", but I happen to be using, and interested in the development of, both Julia and Guix. I wanted to be sure that all Pkg3 designers are well aware of Guix/Nix and possibly can use ideas/features (and maybe code) from the Guix project, or even collaborate.

Pkg3: system depot can't live under /usr/julia

The /usr directory cannot contain application-specific directories. Also, things under /usr should generally be managed exclusively by distro package managers. So a better place (according to the FHS) would be /usr/local/share/julia for architecture-independent files (like sources), and /usr/local/lib/julia for binaries.

Pkg3: compatibility constraint granularity

Continuing the other half of the discussion on #3.

Pkg3: developing packages

Currently I'm organizing the packages which I develop by copying them to a folder with sym-links to .julia; see https://discourse.julialang.org/t/recommended-setup-for-developing-packages/5725 for a discussion. I then started wondering how this would work with Pkg3, but did not find anything in the document and got a bit confused about how package immutability and development would work together.

Maybe a section "Package development" should be added to:
https://github.com/JuliaLang/Juleps/blob/master/Pkg3.md#operations

Python to Julia transpiler

It would be nice to have a transpiler from a subset of Python (plus a special Python-to-Julia module, named something like "Juthon") to complete Julia.

The dream is that it should be both:

1. A standalone transpiler, so that the user can write a "Juthon" package and contribute it to the Julia ecosystem by transpiling.
2. A runtime, decorator-style transpiler (with an interface like Numba's) that calls Julia (or, in the future, a compiled Python library) from Python, so that a small piece of code can be added right inside an app written in Python.

A good example of a transpiler from Python is Transcrypt, whose source is valid Python prior to transpiling. But to achieve 1., this should be more like Julia in Python syntax (via IDE-friendly mapping and a juthon module with well-documented stubs) than an attempt to implement Python's standard library.

The idea came from this discussion: Why Julia? Will Python/Numba and Python/Cython lose to Julia?

Pkg3: TOML Compatibility

Hi,

I saw that you are using TOML.

So are you using the TOML.jl package? It seems that package is not maintained and doesn't support TOML versions above 0.2.

thanks

Deprecation Julep

We need an explicit policy for how deprecations will be handled post-1.0. Some options are:

Allow them in minor releases, removed next minor release
Allow them in minor releases, removed next major release
Allowed only in major release, removed next minor release
- Could be added in minor releases, but hidden by having --depwarn=no as the default.

We can use this issue for preliminary discussion until the appropriate Julep is written.

Pkg3: allow upgrading and removing packages from REPL

When the user wants to remove package or update packages from an environment, they will instead invoke an external package management mode (julia --pkg?), which makes it clear that changes will not affect any currently running Julia sessions. The impact on usability is a strict improvement:

Adding packages and loading them is easier since one simply does using XYZ and answers interactive prompts.

Removing and upgrading packages is no less difficult since it previously required restarting the current Julia process anyway, and is less confusing since the requirement to restart is explicit since running a separate process clearly doesn't affect the current one.

Using the command line to manage packages sounds like a usability loss to me. This will particularly be the case for Windows users.

I think we should keep the current Pkg.* functions. They should simply print a warning or ask to restart when modified packages are currently loaded.

Pkg3: Quality Tags (unstable/testing/candidate/stable)

It would be useful to have a facility for applying mutable quality tags to a release.
e.g.

Make a quick-fix for a user-reported bug, push a release to the registry, tagged "testing".
Ask the user to try out the fix.
Wait for the CI results for the release and all the dependent packages.
If all is well, a few days later tag the release as "stable".
or:
Do some kind of major refactoring and push out a release tagged "unstable".
After a period of beta testing, push out a new release tagged "testing" (or "candidate")
etc...

Maybe users could push releases into the registry without waiting for human review, but only official reviewers would be allowed to apply the "released" tag?

End users could configure their production environment to only run "stable" packages and their playing-with-new-stuff environment to run anything.
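
One way to picture such a per-environment setting is a minimum quality level applied to the releases in the registry. The Python sketch below is illustrative only: the ordering of the four tags is taken from the issue title, while allowed_releases and the (version, tag) pairs are assumptions made for this example.

```python
# Lowest to highest quality, following the order in the issue title.
QUALITY_ORDER = ["unstable", "testing", "candidate", "stable"]


def allowed_releases(releases, minimum):
    # Keep only the versions whose tag is at least the environment's minimum.
    floor = QUALITY_ORDER.index(minimum)
    return [version for version, tag in releases if QUALITY_ORDER.index(tag) >= floor]


# Hypothetical registry entries: (version, current quality tag).
releases = [("1.0.0", "stable"), ("1.0.1", "testing"), ("1.1.0", "unstable")]

print("production:", allowed_releases(releases, "stable"))
print("playground:", allowed_releases(releases, "unstable"))
```

Because the tags are mutable, promoting 1.0.1 from "testing" to "stable" would make it visible to the production environment without publishing a new release.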

REQUIRE files could include quality tag qualifiers.

Perhaps in the future there could be a "certified" tag that indicates a quality review has been done by an approved body.

github private with git credentials

I have a private package hosted on GitHub, using their paid private-repository setting. To use it locally, first I clone it and thereafter I use checkout. Checkout prompts me for the username and password. My username and password are stored using git's credentials manager.

Pkg3's clone and checkout should find them.

Progress display?

This is a really cool project, addressing one of the big issues I have with using Julia. I'm curious about when it'll be ready for testing. Would you consider opening a meta-issue with a few checkboxes that describe the current state of things, i.e. which pieces are missing and which are already working?

Apologies if it's too early for that.

package alternatives in Pkg3?

I wonder whether we want Pkg3 to contain something like Debian alternatives: a package A can depend on package B or package C or package D.

For example, the Plots.jl package might want to depend on having at least one plotting package installed (cc @tbreloff).

Or, now that FFTW.jl is being split off into its own package, and that package in turn will probably depend on an AbstractFFTs package (JuliaMath/FFTW.jl#2), other packages may want to depend on having some package installed that implements FFTs following the AbstractFFT interface (e.g. FFTW, or a native Julia FFT, or FFTPACK, or MKL, etcetera).
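
A Debian-style alternative amounts to a dependency that is satisfied when any one member of a set is installed. The Python below is a rough sketch of that check, not a proposed resolver: unmet_dependencies is a hypothetical name, and the requirement sets reuse the FFT package names above only as an example.

```python
def unmet_dependencies(requirements, installed):
    # Each requirement is a set of alternatives; it is met if any one is installed.
    return [alternatives for alternatives in requirements if not (alternatives & installed)]


# A package that needs the interface package plus at least one FFT implementation.
requirements = [{"AbstractFFTs"}, {"FFTW", "FFTPACK", "MKL"}]

print(unmet_dependencies(requirements, {"AbstractFFTs", "FFTW"}))
print(unmet_dependencies(requirements, {"AbstractFFTs"}))
```

An ordinary dependency is just the special case of a set with one member, so both kinds could share the same representation.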

Pkg.add(pkg) when pkg is installed

Currently:

Pkg.rm("MyPackage"); Pkg.rm("MyPackage");
Pkg.add("MyPackage")
# takes some time
Pkg.add("MyPackage")
# takes some time -- which is silly

Pkg.add(pkg) should return immediately with e.g. "$(pkg) already added" when pkg was add()ed already.

and

at least one of Pkg.add(xs) or Pkg.add([xs...]) should work (preferably both).
the same for some other Pkg.cmds where it makes sense

Pkg3: enforcement of immutability?

Questions to answer in further revisions:

Will installed packages be read-only (at least until deletion is called for), or strictly checksummed, or immutable by convention only? What consequences, if any, would modifications have? Using code that doesn't match what the manifest specifies was installed isn't good for reproducibility, but what's the intended level of tracking and granularity of this? If packages aren't always git repositories any more, how are generated files and downloaded resources, which would be ignored from a version control perspective, dealt with?

This ties into the bigger separate question of where and how development happens if everything Pkg3 touches is immutable (by convention only, or strictly enforced). If you want to make a local change, does that require a separate installation mechanism and modifiable copy that lives outside of Pkg3 somewhere? Or if you make it locally do you then have untracked local modifications that never get recorded anywhere? (People will forget they've made this kind of change if packages aren't git repos.) Most other package systems work this way, but most other package systems have an unfriendly distinction between the way users work with packages and the way developers/contributors do. The low barrier to entry of contributing to Julia packages is a huge benefit to our ability to get users to become developers.

Pkg3: conservative compatibility will make it harder to access new features

[sorry for wall of text - leaving town this weekend, want to write initial reactions down while fresh]

A package should not declare compatibility with a minor version series unless some version in that series has actually been published – this guarantees that compatibility can (and should) be tested. If a new compatible major or minor version of a package is released, this should be reflected by publishing a new patch that expands the compatibility claims.

So this is requiring upper bounds, and in a strict way where the bound must already exist (though unlike current upper bounds, these would be inclusive). While I like that idea in theory - only allowing package dependency versions to be installed that are known to work - it trades the "the new release of package A broke package B" problem for a smaller feasible set of allowed versions. Users of widely depended-on packages will be held back to old versions if they also want to use any dependent packages that update slowly. If both package B and package C depend on package A, package B hasn't tagged since A released version 1.4, and package C depends on a new feature in package A that was first released in version 1.8, you won't be able to use both package B and package C in the same environment until package B has tested and tagged a 1.8-compatible version.
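
The B and C scenario can be made concrete by intersecting the minor series of A that each dependent has declared compatible. This Python sketch is illustrative only: feasible_minor_series is a hypothetical helper, and it assumes, purely for the example, that A published every minor series from 1.0 through 1.8.

```python
def feasible_minor_series(*declared):
    # Intersect the minor series of A that every dependent claims to support.
    common = set(declared[0])
    for series in declared[1:]:
        common &= set(series)
    return sorted(common)


# Assumed for illustration: A has published minor series 1.0 through 1.8.
published = [(1, minor) for minor in range(9)]

# B last tagged when A was at 1.4, so it can only claim series up to 1.4.
b_compat = [s for s in published if s <= (1, 4)]
# C needs a feature first released in A 1.8.
c_compat = [s for s in published if s >= (1, 8)]

print("before B re-tags:", feasible_minor_series(b_compat, c_compat))

# After B tests against 1.8 and publishes a patch widening its claims.
b_compat_new = list(published)
print("after B re-tags:", feasible_minor_series(b_compat_new, c_compat))
```

The empty intersection in the first case is the "solver failure" outcome: no version of A satisfies both packages until B publishes a new patch.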

Automatic testing of reverse dependencies and making an automatic set of new downstream tags with wider bounds if tests pass could help here, but we don't have that infrastructure yet (and requiring such infrastructure in order for a set of packages to progress cohesively may be a burden to place on organizations that want to run their own registry). For packages that can't be tested on CI, or start failing in the automatic test results, then you start needing to involve the authors of even sporadically-developed packages any time their dependencies put out new feature releases that people want to use.

We'd also need much better error messages and suggested fixes when dependency resolution fails to find a feasible set of versions. Resolution failure is luckily pretty rare right now, but can be very confusing when it happens. Downgrading or being held back to old versions does happen now with upper bounds, and being stricter about them would make that more common. If a set of versions that are known to work can be installed we should do that, but I fear the choice will often be between allowing untested versions to be installed or erroring when the user tries to do so. When those are the only choices, I think the former has a higher chance of allowing the user to get things done (or figure out how to fix the problems).

https://www.well-typed.com/blog/2014/09/how-we-might-abolish-cabal-hell-part-1/ is worth at least skimming, as Haskell's ecosystem has gone through many similar issues. By going from the current scheme of only applying upper bounds when problems are already known or the package author is choosing to be conservative, to a scheme where strict tested bounds are the only thing that's allowed to be released, we're moving from the Julia equivalent of "Type errors" or "Compile errors" to the equivalent of "solver failure."

Lastly, this makes package authors' adherence to semver (or lack thereof) much more consequential. We do want to encourage these processes and get people thinking more about them, but I think social expectations and more thorough documentation are safer ways to get there right now than baking them into the behavior of the package manager.

Pkg3: Environment manipulation in the language instead of with command line arguments

If environment manipulation gets moved to a command-line invocation, people are probably going to want to run it most often from shell mode or via run(), which isn't any better than what we have now. Otherwise many Windows users aren't going to find it at all because the usual workflow there is often not via the command line. (IDE features can help, but shouldn't be required for this to be usable.)

As a thought, what if we make this tie into JuliaLang/julia#17997 instead? Instead of loading packages into Main, load them into an environment-specific namespace. Enter and leave environments via Pkg3.enterenv("newenv"), Pkg3.leaveenv() etc. Make Pkg3.add, Pkg3.rm, and Pkg3.update take an environment argument. If the user attempts to update or remove a loaded package from an environment, that environment needs to be reset via something similar to workspace().

Pkg3: relative paths for adding local packages

So you can do,

Pkg.add("UTF64", "~/code/UTF64.jl")

edit:

I know you can just keep monkeying around with ~/.julia/v0.5/UTF64; it just feels a little dirty.

Don't use semantic versioning for Pkg3

I know this sounds like heresy but please hear out my argument.

Semantic versioning basically just tells me whether my code will explode or not. For example, if a library I'm depending on changes from 1.0.3 to 1.0.4, I don't care: my code will work. If it changes to 1.1.4, I don't care, because my code will work. If it changes to 2.0.0, I care, because my code will explode. This is literally all the information I get out of semantic versioning.

Now if pkg3 were to use a system like the following: <number>:<timestamp> for package versioning it would be much easier to implement and reason about than semantic versioning and it would give me, the end user, more information than semantic versioning.

Here's how this versioning system would work:

I create a package called foo with a version 1:1494423107554, where the number 14944... is just a timestamp. Now every time I want to release a non-breaking change of my package, I just update the timestamp; for instance, I change some internal crap and then publish 1:1494423232649. Any consumer of my library now knows that they can use it with no problem, and it's easy to always stay updated with the latest non-breaking change: you just pull the latest timestamp.

Finally, anytime I want to release a breaking change I just increment the initial number. For example, now I want to change some public api function so I do it and release a new version of my lib as 2:1494423377157.

This seems a lot simpler to me (especially from an implementation and usage perspective) and it also increases the amount of information conveyed to the end user (now every user knows when a package was published which is something I do not know when looking at a version like 1.0.3).

Lastly, in case anyone thinks it would be annoying to update the timestamp we all know that could be easily automated by a simple scripting command.
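
To make the scheme concrete, here is a small Python sketch of how a client might compare such versions. It is an illustration under assumptions, not a proposed implementation: parse_stamped, is_breaking, and latest_compatible are hypothetical names, and the version strings are the ones used in the example above.

```python
def parse_stamped(version):
    # "1:1494423107554" -> (1, 1494423107554)
    number, timestamp = version.split(":")
    return int(number), int(timestamp)


def is_breaking(old, new):
    # Only a change in the leading number signals a breaking release.
    return parse_stamped(old)[0] != parse_stamped(new)[0]


def latest_compatible(current, published):
    # Newest timestamp among releases that share the current leading number.
    number, stamp = parse_stamped(current)
    candidates = [parse_stamped(v) for v in published]
    same_number = [c for c in candidates if c[0] == number]
    best = max(same_number + [(number, stamp)])
    return f"{best[0]}:{best[1]}"


published = ["1:1494423107554", "1:1494423232649", "2:1494423377157"]
print(latest_compatible("1:1494423107554", published))
print(is_breaking("1:1494423232649", "2:1494423377157"))
```

Staying up to date within a compatible series reduces to taking the maximum timestamp, which is the simplicity the proposal is after.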

Thoughts?

Pkg3: immutability of compatibility

Continuing half of the discussion on #3.
