
Comments (6)

jakelishman commented on June 2, 2024

Are you having problems with reproducibility? It should already be possible for scripts to be completely reproducible down to the generated circuits - we do it in our test suites (and we have several assertions to ensure that the seeding functions correctly).

This type of design opens the door wide to race conditions in accessing the global random state, which Qiskit would never be able to control. There's always going to have to be co-operation from user scripts to ensure safe random-state access, at which point it's (imo) rather easier for everybody if that's just done by explicit seeding: if users get in the habit of thinking things will be magically reproducible for them, it's all the more surprising and difficult to debug when they aren't. Global state is typically at high risk of being thread-unsafe, and heavily accessed mutable global state like a pRNG multiply so.

This is among the same concerns that led NumPy to change up its random-number generation in 1.17+, to move people to using explicit np.random.Generator instances, and to push people away from using the magic shared global state (which is still currently maintained as the legacy np.random.rand* functions, but discouraged).
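The interference described above can be sketched with Python's standard `random` module standing in for any shared global generator. This is only an illustration: the function names are made up for the example, and nothing here is Qiskit or NumPy code. It compares a routine that seeds the module-level generator with one that owns a private `random.Random` instance, when an unrelated helper also draws a number.

```python
import random


def unrelated_helper():
    # Hypothetical stand-in for any other code (a dependency, another thread)
    # that draws from the shared module-level generator.
    random.random()


def sample_with_global_state(interfere):
    random.seed(1234)          # seed the shared global generator
    if interfere:
        unrelated_helper()     # the extra draw shifts the shared stream
    return [random.random() for _ in range(3)]


def sample_with_explicit_rng(interfere):
    rng = random.Random(1234)  # private generator owned by this call
    if interfere:
        unrelated_helper()     # touches only the global generator, not `rng`
    return [rng.random() for _ in range(3)]


# Does the unrelated draw change what each routine returns?
global_changed = sample_with_global_state(False) != sample_with_global_state(True)
explicit_changed = sample_with_explicit_rng(False) != sample_with_explicit_rng(True)
print(global_changed, explicit_changed)  # True False
```

The globally seeded routine changes its output as soon as something else consumes a value from the shared stream, while the routine holding its own generator does not.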

from qiskit.

jakelishman commented on June 2, 2024

Any library package using randomness from another library should also be re-exposing the seeding options as arguments, or they themselves are not reproducible - that would be a problem in their own design, rather than Qiskit's. That means that a user should only ever need to interact with the seeding components of the libraries that they're using, which would be the case even with a global state - even if there were global state, you'd still need to have an individual one per library (so a user would still need to be aware of seeding for each component they are using). As soon as randomness gets involved anywhere, reproducibility can't really happen without some amount of user intervention, unless we're prepared to have the default state being that the random routines always start from exactly the same seed (which almost invariably defeats the purpose of them being random).

Using global state to skip a few levels into the inner dependency of a package is problematic itself - how do we choose who gets to do that? If more than one part of the code tries to use those global-state seeders to fix seeds for its own internal components, that can break the inner reproducibility expectations on the inner routines, because the seeding may change at unexpected times. This is a core reason that even if Qiskit exposed global state, downstream libraries couldn't use it; they'd still have to use the exposed seeding tooling, and re-expose it to users in their own interfaces.
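As a rough sketch of what "re-exposing the seed" can look like, the example below uses two hypothetical routines, `inner_random_angles` (standing in for an inner dependency) and `outer_experiment` (standing in for a downstream library). Both names, and the `_as_rng` helper, are invented for illustration and use only the standard `random` module. Each routine accepts `None`, an integer, or an existing generator, loosely mirroring the int-or-generator style of seed argument.

```python
import random


def _as_rng(seed):
    """Accept None, an int seed, or an existing random.Random instance."""
    return seed if isinstance(seed, random.Random) else random.Random(seed)


def inner_random_angles(count, seed=None):
    """Hypothetical inner-library routine with its own seed argument."""
    rng = _as_rng(seed)
    return [rng.uniform(0.0, 360.0) for _ in range(count)]


def outer_experiment(count, seed=None):
    """Hypothetical downstream routine that re-exposes the seed to its caller."""
    rng = _as_rng(seed)
    # Thread the same generator through, so the caller controls every draw.
    angles = inner_random_angles(count, seed=rng)
    noise = [rng.gauss(0.0, 0.01) for _ in range(count)]
    return [angle + jitter for angle, jitter in zip(angles, noise)]


first = outer_experiment(4, seed=2024)
second = outer_experiment(4, seed=2024)
reproducible = first == second
print(reproducible)  # True
```

With this shape the user only touches the seed of the outermost routine they call, and no code has to reach past an interface to fix a seed somewhere deeper.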

If you wanted to read a little more about this, NumPy wrote a little bit about it when switching over their random-number routines (https://numpy.org/neps/nep-0019-rng-policy.html#numpy-random), though it's not super technically detailed on this component.


jakelishman commented on June 2, 2024

Sorry, forgot to include examples: if you're not already aware, everything that uses randomness should have some sort of seed argument already. For example, transpile has seed_transpiler (as do the preset-passmanager generators) that takes an integer, and the quantum_info random functions and circuit.library random generators should have a seed argument that can take either an integer or a numpy.random.Generator instance iirc.


jakelishman commented on June 2, 2024

Depending on your application, you may also need to take a bit of care when interacting with Python's hash-based containers (especially set, though you can occasionally see tricks with dict too), because Python hashing is salted based on the content of the PYTHONHASHSEED environment variable at the instantiation time of the Python interpreter (it's randomised if that variable isn't set). One notable impact is that iteration order through a given set is non-deterministic between different Python processes in general, which can quite easily cause non-reproducibility in test examples if it's not explicitly watched for.
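A small sketch of the effect, using only the standard library: it starts fresh interpreters with a chosen PYTHONHASHSEED and prints a set of strings, once in raw iteration order and once sorted. The helper name `run_in_fresh_interpreter` and the example strings are made up for illustration.

```python
import os
import subprocess
import sys

RAW_ORDER = "print(list({'alpha', 'beta', 'gamma', 'delta'}))"
SORTED_ORDER = "print(sorted({'alpha', 'beta', 'gamma', 'delta'}))"


def run_in_fresh_interpreter(snippet, hash_seed):
    """Run `snippet` in a new Python process with PYTHONHASHSEED fixed."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The same hash seed gives the same iteration order in separate processes.
same_seed_matches = (
    run_in_fresh_interpreter(RAW_ORDER, 1) == run_in_fresh_interpreter(RAW_ORDER, 1)
)

# Different hash seeds may reorder the raw set, but sorting removes the dependence.
sorted_is_stable = (
    run_in_fresh_interpreter(SORTED_ORDER, 1) == run_in_fresh_interpreter(SORTED_ORDER, 2)
)
print(same_seed_matches, sorted_is_stable)  # True True
```

Either fixing PYTHONHASHSEED for the process or sorting before iterating avoids depending on the salted order; which one is appropriate depends on the application.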


RoyElkabetz commented on June 2, 2024

Thank you for the quick response. You are raising some very good points regarding race conditions and user education. Also, I very much agree that accessing the seeding interface of each one of qiskit's components (such as transpiler, quantum_info, etc.) is fairly simple, so reproducibility at the qiskit level (without getting into set iterators, which is a very good point btw) is not a problem.

Alternatively, imagine the case of trying to compose a reproducible experiment using qiskit-experiments and qiskit-aer (or qiskit-dynamics), where one would have to be familiar with the seeding hierarchy of the specific components and how they relate to one another in order to set up a deterministically reproducible experiment. It is surely possible, but sounds (imo) like something that could be managed at the qiskit ecosystem level with a unified interface for the user.

That being said, I must say I hadn't thought about the consequences of managing a global state.


jakelishman commented on June 2, 2024

I'll close this issue as "complete" / "won't fix" for now, but please feel free to re-open if there's more to discuss.

