Comments (10)

yannickperrenet commented on May 20, 2024

Thanks a bunch for your input! I think I get it now.

The recommended way is to create a package or git repository which you can install with git or pip using environment builds. When it comes to local development this might not be feasible since you lose the hot reloading.

The current approach would be to use the proposed:

spec = util.spec_from_file_location("mymodule", location="/data/module.py")

Another possibility indeed would be to make the /data/shared-code/ directory "special" by mounting it directly into your project-dir. This way you can import it as if it were a local package (without the need to install it through pip or make it a git repo).

from orchest.

howie6879 commented on May 20, 2024

That’s great. I think we’re on the same page.

howie6879 commented on May 20, 2024

@yannickperrenet Hi:

Great! Thanks for your solution. I agree with this implementation plan.

I don’t see a problem at the moment, but it’s probably more a matter of details, such as interaction and how to make it easier for users to use.

As a user, I would probably prefer that the interaction of shared modules be orchest-based. I have a question: how do I share without Git or pip?

yannickperrenet commented on May 20, 2024

it’s probably more a matter of details, such as interaction and how to make it easier for users to use.

Agreed!

As a user, I would probably prefer that the interaction of shared modules be orchest-based.

I agree here as well, especially when talking about developing those shared modules and wanting some kind of hot reloading.
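For modules loaded from a file path, one way to approximate hot reloading is to re-execute the module whenever its source file changes. The sketch below is only an illustration of that idea, not an Orchest feature: the name `load_with_reload` is hypothetical, and it relies on the file's modification time to detect edits. Note that `importlib.reload` is not used, since it only works for modules registered in `sys.modules`.

```python
import os
from importlib import util


def load_with_reload(name, path):
    """Load a module from a file and return it with a reload function.

    Hypothetical helper for illustration; not part of Orchest.
    """
    spec = util.spec_from_file_location(name, location=path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {path}")

    module = util.module_from_spec(spec)
    spec.loader.exec_module(module)
    state = {"mtime": os.stat(path).st_mtime_ns}

    def reload_if_changed():
        """Re-execute the module in place if the file was modified."""
        mtime = os.stat(path).st_mtime_ns
        if mtime == state["mtime"]:
            return False
        # Running the source again updates the existing module object,
        # so references held elsewhere see the new definitions.
        spec.loader.exec_module(module)
        state["mtime"] = mtime
        return True

    return module, reload_if_changed
```

Calling `reload_if_changed()` at the start of a notebook cell would pick up edits made to the shared file since the last call. Names removed from the file stay defined on the module object until the kernel restarts.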

How do I share without Git or pip?

Currently, this is possible by making use of the /data directory.

Create a module.py in userdir/data (either through your filesystem or by adding it through the JupyterLab terminal at /data), and give it the following content:

def foo():
    print("Hello from /data")

Next, you can add the following code (a snippet from Stack Overflow that I adapted to fit our example) to one of your pipeline steps:

from importlib import util

spec = util.spec_from_file_location("mymodule", location="/data/module.py")

mymodule = util.module_from_spec(spec)
spec.loader.exec_module(mymodule)

# This will print: "Hello from /data"
mymodule.foo()

Keep in mind that this is considered more of a hack than a recommended solution.
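If the snippet is needed in several steps, it could be wrapped in a small helper that also reports a missing file clearly. This is a minimal sketch under stated assumptions: `load_module_from_path` is a hypothetical name, not an Orchest API, and the usage comment assumes the /data/module.py file from the example above.

```python
from importlib import util
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: str) -> ModuleType:
    """Load a Python source file as a module without installing it.

    Hypothetical helper for illustration; not part of Orchest.
    """
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"No module file at {file_path}")

    spec = util.spec_from_file_location(name, location=str(file_path))
    # The spec is None when the file extension is not an importable type.
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {file_path}")

    module = util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Usage inside a pipeline step, assuming /data/module.py exists:
# mymodule = load_module_from_path("mymodule", "/data/module.py")
# mymodule.foo()
```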

howie6879 commented on May 20, 2024

spec = util.spec_from_file_location("mymodule", location="/data/module.py")

This approach can solve the problem initially, but there are details to think about, such as:

  • Other users may not know that a shared module named module.py exists under the data directory, or what the module is for. How can they see this information?
  • A user who imports a module shared by another user may need not only to import and use it, but also to modify the script.

Can we consider installing the shared modules in /data/shared-code/? Regardless of whether the modules in this folder come from git, pip, or were created by the users themselves, they would all need to follow the orchest module standard and provide basic descriptions of their input- and output-related functions. This shared module folder could also be displayed somewhere.

Then any project can select the modules it wants to import. When the user selects the common modules needed for the current project, they will see the generated folder /project/shared-code/target_module/ under that project.

yannickperrenet commented on May 20, 2024

Those are valid points. However, I think it is good to keep in mind that Orchest does not have native multi-user support, so solving them is out of scope (for now).

Can we consider installing the shared modules in /data/shared-code/

Please correct me if I am wrong, but are you proposing we create a special directory inside /data specifically for sharing code between projects? I would like to better understand what the use case is that sharing code through existing solutions (such as git and/or pip) is not feasible.

You can of course have the /data/shared-code/ directory as a convention for your team internally, without it being directly supported by Orchest.
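As a rough illustration of such a team convention, a step could put the shared directory on `sys.path` so that modules inside it import by name. This is a sketch, not something Orchest provides: the helper name `add_shared_code_dir` is hypothetical, and the default path is simply the directory discussed in this thread.

```python
import sys
from pathlib import Path


def add_shared_code_dir(shared_dir: str = "/data/shared-code") -> bool:
    """Put a shared directory on sys.path so its modules import normally.

    Returns True if the directory exists and is on sys.path afterwards.
    The default path is a team convention, not an Orchest feature.
    """
    path = Path(shared_dir)
    if not path.is_dir():
        return False

    resolved = str(path.resolve())
    if resolved not in sys.path:
        # Append rather than prepend so shared code cannot shadow
        # the standard library or installed packages.
        sys.path.append(resolved)
    return True


# Usage at the top of a pipeline step, assuming a hypothetical
# /data/shared-code/quickstart.py (or quickstart/ package) exists:
# if add_shared_code_dir():
#     import quickstart
```

Every step that needs the shared code has to make this call itself, which is one reason an installed package remains the more robust option.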

howie6879 commented on May 20, 2024

Please correct me if I am wrong, but are you proposing we create a special directory inside /data specifically for sharing code between projects?

Yep. Suppose I have three projects and I want to use quickstart as a shared script. Do I need to do this?

  • In project A: git clone quickstart
  • In project B: git clone quickstart
  • In project C: git clone quickstart

Do we have to do three import operations?

Please correct me if I am wrong.

If we have a shared directory, we can import quickstart to /data/shared-code/, and any project can use quickstart so that any shared script only needs to be imported once.

I would like to better understand what the use case is that sharing code through existing solutions (such as git and/or pip) is not feasible.

I strongly agree with the previously mentioned implementations, and I also think pip and git are perfectly feasible.

What I mean is that we can put shared modules in the /data/shared-code/ directory in any way we want, whether it's through git, pip, or even moving files manually.

yannickperrenet commented on May 20, 2024

Another possibility indeed would be to make the /data/shared-code/ directory "special" by mounting it directly into your project-dir.

@ricklamers This would actually not work, would it? If I remember correctly you tested whether Docker supports nested path mounting, which it did not.

yannickperrenet commented on May 20, 2024

@cricksmaidiene Mentioning you here regarding the "template" steps ;)

ricklamers commented on May 20, 2024

Currently, the approaches to sharing code between projects are a centralized/private pip package that can be installed through Orchest Environments, or putting files/modules in /data (e.g. /data/shared-scripts/mymodule.py) and pointing a step to them.
