runprism / prism
Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python.
Home Page: https://runprism.com
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
Currently, a `tasks.ref(...)` call outputs the path to the target, not the content / object associated with it. For example, suppose `Task1` outputted a `PandasCsv` to `/some_path.csv` and `Task2` had a `tasks.ref(Task1)` call in its execution. This `tasks.ref` call would return `/some_path.csv` rather than the DataFrame itself. [...] `tasks.ref(...)` calls downstream will have to change too.
Describe the solution you'd like
Add an `open` method to each target. Have `tasks.ref(...)` call this `open` method.

    class PandasCsv(PrismTarget):
        def open(self, **kwargs):
            return pd.read_csv(self.loc, **kwargs)

Users can control how the `open` function runs via `open_kwargs` in the `tasks.ref(...)` function:

    # some_task.py
    @task()
    def example(tasks, hooks):
        df = tasks.ref("other_task", local=True, open_kwargs={"index": False, ...})

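The proposal above can be sketched in a self-contained way. This is only an illustration under stated assumptions, not Prism's implementation: `Target`, `CsvTarget`, and `TaskRegistry` are hypothetical stand-ins for `PrismTarget`, `PandasCsv`, and the `tasks` object, and the standard-library `csv` module is used in place of pandas so the example runs without extra dependencies.

```python
import csv
from pathlib import Path
from typing import Any, Dict, List, Optional


class Target:
    """Hypothetical stand-in for PrismTarget: remembers where output was saved."""

    def __init__(self, loc: Path) -> None:
        self.loc = Path(loc)

    def open(self, **kwargs: Any) -> Any:
        raise NotImplementedError


class CsvTarget(Target):
    """Stand-in for PandasCsv that reads rows with the stdlib csv module."""

    def open(self, **kwargs: Any) -> List[dict]:
        # kwargs are forwarded to the reader, mirroring **kwargs in the proposal.
        with self.loc.open(newline="") as f:
            return list(csv.DictReader(f, **kwargs))


class TaskRegistry:
    """Hypothetical registry mapping task names to their targets."""

    def __init__(self) -> None:
        self._targets: Dict[str, Target] = {}

    def register(self, name: str, target: Target) -> None:
        self._targets[name] = target

    def ref(self, name: str, open_kwargs: Optional[dict] = None) -> Any:
        # Return the loaded object rather than the path to the target.
        return self._targets[name].open(**(open_kwargs or {}))
```

With this shape, a downstream task that calls `ref("other_task", open_kwargs={"delimiter": ";"})` receives the parsed rows, and the target class alone decides how the file is read.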
Describe alternatives you've considered
N/A
Additional context
N/A
Is your feature request related to a problem? Please describe.
I can't connect my Prism project to a MySQL database using the PrismHooks API.
Describe the solution you'd like
Add MySQL as a supported connection type in `profile.yml`, and update the `prism connect` command to support `--type mysql`.
Describe alternatives you've considered
Creating the connection directly in the `prism_project.py` file. While this works fine, it leads to repeated code (e.g., the same connection boilerplate across different projects).
Additional context
N/A
Is your feature request related to a problem? Please describe.
When running a Prism project, I want to be able to skip tasks that are already "Done". I can do this while running individual tasks (i.e., if I run a specific Prism task, then the task will skip upstream tasks that have a target). However, I want to be able to do this when running the project as a whole.
Describe the solution you'd like
Add a `done` method to the `PrismTask` class. This function must return either `True` or `False`.

    class PrismTask:
        ...
        def done(self, hooks):
            # do something here
            return True

Call `done()` before calling `exec()`.
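A minimal sketch of how a project-level run could use such a method, assuming a simplified runner. `Task`, `FileTask`, and `run_project` are hypothetical names for illustration and are not part of Prism's API; the only idea taken from the request is that `done()` is checked before `exec()`.

```python
from pathlib import Path
from typing import Any, Dict, List


class Task:
    """Hypothetical stand-in for PrismTask."""

    def __init__(self, name: str) -> None:
        self.name = name

    def done(self, hooks: Any) -> bool:
        # Default: never considered done, so the task always runs.
        return False

    def exec(self, hooks: Any) -> None:
        raise NotImplementedError


class FileTask(Task):
    """Example task that counts as done once its output file exists."""

    def __init__(self, name: str, output: Path) -> None:
        super().__init__(name)
        self.output = Path(output)

    def done(self, hooks: Any) -> bool:
        return self.output.exists()

    def exec(self, hooks: Any) -> None:
        self.output.write_text("result")


def run_project(task_list: List[Task], hooks: Any = None) -> Dict[str, str]:
    """Run tasks in order, skipping any task whose done() returns True."""
    statuses: Dict[str, str] = {}
    for task in task_list:
        if task.done(hooks):
            statuses[task.name] = "skipped"
            continue
        task.exec(hooks)
        statuses[task.name] = "ran"
    return statuses
```

Running the same project twice would execute `FileTask` the first time and skip it the second time, because its output file already exists.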
Describe alternatives you've considered
N/A
Additional context
N/A
Describe the bug
When running `prism agent [apply | build | run]`, Prism gets stuck on the following step:

    ...
    local-ec2-agent2[build] | ssh: connect to host ec2-52-90-51-71.compute-1.amazonaws.com port 22: Operation timed out
    local-ec2-agent2[build] | SSH connection failed. Retrying in 5 seconds...
    local-ec2-agent2[build] | ssh: connect to host ec2-52-90-51-71.compute-1.amazonaws.com port 22: Operation timed out
    local-ec2-agent2[build] | SSH connection failed. Retrying in 5 seconds...
    ...

    # local-ec2-agent.yml
    agent:
      type: ec2
      instance_type: t2.micro
      requirements: requirements.txt
      env:
        SNOWFLAKE_ACCOUNT: '{{ env("SNOWFLAKE_ACCOUNT") }}'
        SNOWFLAKE_DATABASE: '{{ env("SNOWFLAKE_DATABASE") }}'
        SNOWFLAKE_PASSWORD: '{{ env("SNOWFLAKE_PASSWORD") }}'
        SNOWFLAKE_ROLE: '{{ env("SNOWFLAKE_ROLE") }}'
        SNOWFLAKE_SCHEMA: '{{ env("SNOWFLAKE_SCHEMA") }}'
        SNOWFLAKE_USER: '{{ env("SNOWFLAKE_USER") }}'
        SNOWFLAKE_WAREHOUSE: '{{ env("SNOWFLAKE_WAREHOUSE") }}'
        GOOGLE_APPLICATION_CREDENTIALS: '{{ env("GOOGLE_APPLICATION_CREDENTIALS") }}'

CLI Arguments
prism agent apply -f local_ec2_agent.yml
Traceback
See the logs pasted above.
Expected behavior
A clear and concise description of what you expected to happen.
Desktop (please complete the following information):
Additional context
Describe the bug
Prism is only compatible with `dbt` versions up to 1.5.6. Update the `Dbt` adapter to be compatible with `dbt-core==1.6.0`.
Is your feature request related to a problem? Please describe.
When developing task code, users may want to experiment and test the code in a Jupyter notebook prior to running it in the project. This enables a faster development feedback loop (i.e., users can rapidly test iterations of code in a notebook rather than modifying and re-running tasks after every change).
Describe the solution you'd like
I'd like to add a function initialize_hooks()
that works as follows:
The `initialize_hooks` function parses the `prism_project.py` file for a profile. If a profile doesn't exist, then it throws an error. If a profile does exist, then it parses the profile and returns an instance of the `prism.infra.hooks.PrismHooks` class. Users can use the methods of this class in their Jupyter notebook.

    from pathlib import Path
    from typing import Optional

    def initialize_hooks(project_dir: Optional[Path] = None):
        """
        Initialize hooks within a Prism project
        args:
            project_dir: Prism project directory. If `None`, then assume the current working
                directory is inside a Prism project. Default is `None`
        returns:
            hooks: instance of prism.infra.hooks.PrismHooks class
        """

Describe alternatives you've considered
N/A - open to hearing suggestions!
Additional context
This shouldn't require a ton of net-new code. The assignee should be able to accomplish this with a lot of the existing classes and structures.
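The first two steps of the requested `initialize_hooks()` behavior (locating the project and checking that a profile exists) could look roughly like the sketch below. It rests on assumptions: the helper names `find_project_dir` and `read_profile_name` are hypothetical, and the profile is assumed to be a string assigned to a module-level `PROFILE` variable in `prism_project.py`. Building the actual `PrismHooks` instance is left to Prism's existing classes and is not shown.

```python
import ast
from pathlib import Path
from typing import Optional


def find_project_dir(start: Optional[Path] = None) -> Path:
    """Walk up from `start` (default: cwd) until a prism_project.py is found."""
    current = (start or Path.cwd()).resolve()
    for candidate in [current, *current.parents]:
        if (candidate / "prism_project.py").is_file():
            return candidate
    raise FileNotFoundError("no prism_project.py in this directory or its parents")


def read_profile_name(project_dir: Path) -> str:
    """Return the string assigned to PROFILE (assumed variable name)."""
    source = (project_dir / "prism_project.py").read_text()
    # Parse instead of importing so the notebook does not execute project code.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "PROFILE":
                value = ast.literal_eval(node.value)
                if isinstance(value, str) and value:
                    return value
    raise ValueError("prism_project.py does not define a profile")
```

In a notebook, `read_profile_name(find_project_dir())` would either return the profile name or raise, matching the "throws an error if a profile doesn't exist" requirement.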