prism's People

Contributors

mtrivedi50, prism-admin

prism's Issues

Have `tasks.ref(...)` automatically open a target

Is your feature request related to a problem? Please describe.

  • For tasks that output a target, a downstream tasks.ref(...) call returns the path to the target, not the content / object associated with it.
    • Suppose Task1 writes a PandasCsv target to /some_path.csv and Task2 calls tasks.ref(Task1) during its execution. This tasks.ref call returns /some_path.csv rather than the DataFrame itself.
  • This requires the user to write custom logic to open their target every time they use a target.
  • In addition, if a target changes in the previous task (e.g., if the user decides to no longer use a target), then all tasks.ref(...) calls downstream will have to change too.

Describe the solution you'd like

  • Require targets to have an open method. Have tasks.ref(...) call this open method.
  • The method could look something like:
class PandasCsv(PrismTarget):
    
    def open(self, **kwargs):
        return pd.read_csv(self.loc, **kwargs)
  • If possible, it would be nice to be able to control how the open method runs via open_kwargs in the tasks.ref(...) function:
# some_task.py

@task()
def example(tasks, hooks):
    # open_kwargs are forwarded to the target's open method (pd.read_csv for a PandasCsv)
    df = tasks.ref("other_task", local=True, open_kwargs={"index_col": False, ...})
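
A minimal sketch of how the pieces above could fit together. This is not Prism's actual implementation: `Target`, `CsvRows`, and `TaskRegistry` are hypothetical stand-ins for `PrismTarget`, `PandasCsv`, and the `tasks` object, and the standard-library `csv` module replaces pandas so the example runs without extra dependencies.

```python
import csv
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Optional


class Target(ABC):
    """Hypothetical stand-in for PrismTarget: every target must define open()."""

    def __init__(self, loc: Path) -> None:
        self.loc = Path(loc)

    @abstractmethod
    def open(self, **kwargs: Any) -> Any:
        ...


class CsvRows(Target):
    """Hypothetical CSV target; plays the role of PandasCsv without pandas."""

    def open(self, **kwargs: Any) -> Any:
        # kwargs are forwarded to the reader, like open_kwargs -> pd.read_csv
        with self.loc.open(newline="") as f:
            return list(csv.DictReader(f, **kwargs))


class TaskRegistry:
    """Hypothetical stand-in for the `tasks` object passed to each task."""

    def __init__(self) -> None:
        self._targets: Dict[str, Target] = {}

    def register(self, name: str, target: Target) -> None:
        self._targets[name] = target

    def ref(self, name: str, open_kwargs: Optional[Dict[str, Any]] = None) -> Any:
        # Return the opened object instead of the path to the target
        return self._targets[name].open(**(open_kwargs or {}))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "some_path.csv"
    path.write_text("a;b\n1;2\n")

    tasks = TaskRegistry()
    tasks.register("other_task", CsvRows(path))
    rows = tasks.ref("other_task", open_kwargs={"delimiter": ";"})
```

With this shape, a downstream task never sees the path: if the upstream task switches target types, only the target's open method changes and the tasks.ref(...) calls stay the same.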

Describe alternatives you've considered
N/A

Additional context
N/A

MySQL adapter

Is your feature request related to a problem? Please describe.
I can't connect my Prism project to a MySQL database using the PrismHooks API.

Describe the solution you'd like

  • Add an option to create and configure a MySQL adapter in profile.yml.
  • I'd also like the prism connect command to support --type mysql.

Describe alternatives you've considered

  • Instantiating a MySQL connection in the prism_project.py file. This works, but it leads to repeated code (e.g., the same connection boilerplate across different projects).

Additional context
N/A

`done` method in PrismTask class

Is your feature request related to a problem? Please describe.

When running a Prism project, I want to be able to skip tasks that are already "Done". I can do this while running individual tasks (i.e., if I run a specific Prism task, then the task will skip upstream tasks that have a target). However, I want to be able to do this when running the project as a whole.

Describe the solution you'd like

  • Add a done method to the PrismTask class. This method must return either True or False.
class PrismTask:
    ...
    def done(self, hooks):
        # do something here
        return True
  • Users can override this method in their own task definitions.
  • When running each task in a project, call done() before calling exec(), and skip the task if done() returns True.
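
A rough sketch of the proposed control flow, under the assumption that the project runner iterates over tasks in order. `Task`, `AlreadyBuilt`, `NeedsWork`, and `run_project` are hypothetical names for illustration, not part of Prism:

```python
from typing import Any, List


class Task:
    """Hypothetical stand-in for PrismTask."""

    def done(self, hooks: Any) -> bool:
        # Default: never considered done, so the task always runs
        return False

    def exec(self, hooks: Any) -> Any:
        raise NotImplementedError


class AlreadyBuilt(Task):
    def done(self, hooks: Any) -> bool:
        # A real task might check that its target already exists
        return True

    def exec(self, hooks: Any) -> Any:
        return "rebuilt"


class NeedsWork(Task):
    def exec(self, hooks: Any) -> Any:
        return "built"


def run_project(tasks: List[Task], hooks: Any = None) -> List[str]:
    """Run each task unless its done() method reports True."""
    executed = []
    for task in tasks:
        if task.done(hooks):
            continue  # skip tasks that are already "Done"
        task.exec(hooks)
        executed.append(type(task).__name__)
    return executed


executed = run_project([AlreadyBuilt(), NeedsWork()])
```

Defaulting done() to False keeps the current behavior for tasks that do not override it.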

Describe alternatives you've considered
N/A

Additional context
N/A

Can't connect to EC2 agents on an IPv6 address

Describe the bug

  • Prism cannot connect to EC2 instances using an IPv6 address. When running prism agent [apply | build | run], Prism gets stuck on the following step:
...
local-ec2-agent2[build] | ssh: connect to host ec2-52-90-51-71.compute-1.amazonaws.com port 22: Operation timed out
local-ec2-agent2[build] | SSH connection failed. Retrying in 5 seconds...
local-ec2-agent2[build] | ssh: connect to host ec2-52-90-51-71.compute-1.amazonaws.com port 22: Operation timed out
local-ec2-agent2[build] | SSH connection failed. Retrying in 5 seconds...
...
  • Here is the agent configuration:
# local-ec2-agent.yml

agent:
  type: ec2
  instance_type: t2.micro
  requirements: requirements.txt
  env:
    SNOWFLAKE_ACCOUNT: '{{ env("SNOWFLAKE_ACCOUNT") }}'
    SNOWFLAKE_DATABASE: '{{ env("SNOWFLAKE_DATABASE") }}'
    SNOWFLAKE_PASSWORD: '{{ env("SNOWFLAKE_PASSWORD") }}'
    SNOWFLAKE_ROLE: '{{ env("SNOWFLAKE_ROLE") }}'
    SNOWFLAKE_SCHEMA: '{{ env("SNOWFLAKE_SCHEMA") }}'
    SNOWFLAKE_USER: '{{ env("SNOWFLAKE_USER") }}'
    SNOWFLAKE_WAREHOUSE: '{{ env("SNOWFLAKE_WAREHOUSE") }}'
    GOOGLE_APPLICATION_CREDENTIALS: '{{ env("GOOGLE_APPLICATION_CREDENTIALS") }}'

CLI Arguments
prism agent apply -f local_ec2_agent.yml

Traceback
See the logs pasted above.

Expected behavior
A clear and concise description of what you expected to happen.

Desktop (please complete the following information):

  • macOS 12.5
  • Python version: 3.10.12

Additional context

  • Note that Prism creates all the necessary resources: the PEM key pair, the security group, and the instance itself. However, Prism cannot SSH into the instance when the host machine is using an IPv6 address.
  • I confirmed that the security group has the appropriate inbound SSH permissions.
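
One thing that may be worth ruling out, offered as an assumption rather than a confirmed cause: EC2 security group rules keep IPv4 and IPv6 sources in separate fields (`IpRanges` / `CidrIp` versus `Ipv6Ranges` / `CidrIpv6`), so a rule built from the host's address has to pick the right one. The sketch below only builds the permission dictionary; `ssh_ingress_permission` is a hypothetical helper, not a Prism function, and nothing here calls AWS.

```python
import ipaddress
from typing import Any, Dict


def ssh_ingress_permission(host_ip: str) -> Dict[str, Any]:
    """Build an SSH ingress permission for a single host address."""
    ip = ipaddress.ip_address(host_ip)
    permission: Dict[str, Any] = {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
    }
    if ip.version == 4:
        # Single IPv4 host: /32 prefix in the IPv4 field
        permission["IpRanges"] = [{"CidrIp": f"{ip}/32"}]
    else:
        # Single IPv6 host: /128 prefix in the IPv6 field
        permission["Ipv6Ranges"] = [{"CidrIpv6": f"{ip}/128"}]
    return permission


ipv4_rule = ssh_ingress_permission("203.0.113.7")
ipv6_rule = ssh_ingress_permission("2001:db8::1")
```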

Compatibility with dbt 1.6.0

Describe the bug
Prism is only compatible with dbt versions up to 1.5.6. Update the Dbt adapter to be compatible with dbt-core==1.6.0.

Initialize hooks in Jupyter notebooks

Is your feature request related to a problem? Please describe.
When developing task code, users may want to experiment and test the code in a Jupyter notebook prior to running it in the project. This enables a faster development feedback loop (i.e., users can rapidly test iterations of code in a notebook rather than modifying and re-running tasks after every change).

Describe the solution you'd like
I'd like to add a function initialize_hooks() that works as follows:

  • When inside a Prism project directory, the initialize_hooks function parses the prism_project.py file for a profile. If a profile doesn't exist, then it throws an error. If a profile does exist, then it parses the profile.
  • After parsing the profile, the function instantiates all the adapters contained therein.
  • The initialize_hooks function should return an instance of the prism.infra.hooks.PrismHooks class. Users can use the methods of this class in their Jupyter notebook.
  • Here's what the function definition should look like:
from pathlib import Path
from typing import Optional


def initialize_hooks(project_dir: Optional[Path] = None):
    """
    Initialize hooks within a Prism project

    args:
        project_dir: Prism project directory. If `None`, then assume the current working
            directory is inside a Prism project. Default is `None`
    returns:
        hooks: instance of prism.infra.hooks.PrismHooks class
    """
  • Here's what the output should look like in a notebook:
[image: screenshot of the expected notebook output]
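
A sketch of the first step only, locating the project when `project_dir` is `None`. Searching the current directory and then its parents for prism_project.py is an assumption about what "inside a Prism project" should mean, and `find_project_dir` is a hypothetical helper; profile parsing and adapter instantiation are left out because they depend on Prism's existing classes.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_project_dir(start: Optional[Path] = None) -> Path:
    """Return the nearest directory at or above `start` with prism_project.py."""
    current = (start or Path.cwd()).resolve()
    for candidate in [current, *current.parents]:
        if (candidate / "prism_project.py").is_file():
            return candidate
    raise FileNotFoundError(
        f"no prism_project.py found in {current} or any parent directory"
    )


# Usage: a notebook folder nested inside a throwaway project
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "proj"
    notebooks = project / "notebooks"
    notebooks.mkdir(parents=True)
    (project / "prism_project.py").write_text("# project settings\n")

    found = find_project_dir(notebooks)
    found_name = found.name
    found_has_project_file = (found / "prism_project.py").is_file()
```

Raising early when no project file is found matches the request that a missing profile should produce an error rather than a half-initialized hooks object.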

Describe alternatives you've considered
N/A - open to hearing suggestions!

Additional context
This shouldn't require a ton of net-new code. The assignee should be able to accomplish this with a lot of the existing classes and structures.
