
meltano / meltano


Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

Home Page: https://meltano.com/

License: MIT License

Languages: Shell 0.94%, Dockerfile 0.05%, Python 98.99%, Mako 0.03%
Topics: dataops, dataops-platform, elt, open-source, opensource, data, pipelines, extract-data, meltano, meltano-sdk

meltano's Introduction

Meltano Logo

The declarative code-first data integration engine

Say goodbye to writing, maintaining, and scaling your own API integrations.
Unlock 600+ APIs and DBs and realize your wildest data and ML-powered product ideas.


Integrations

Meltano Hub is the single source of truth for finding Meltano plugins, as well as Singer taps and targets. Users can also add new plugins to the Hub and have them immediately discoverable and usable within Meltano. The Hub is lovingly curated by Meltano and the wider Meltano community.

Installation

If you're ready to build your ideal data platform and start running data workflows across multiple tools, start by following the Installation guide to get Meltano up and running on your device.

Documentation

Check out the "Getting Started" guide or find the full documentation at https://docs.meltano.com.

Contributing

Meltano is a truly open-source project, built for and by its community. We happily welcome and encourage your contributions. Start by browsing through our issue tracker to add your ideas to the roadmap. If you're still unsure what to contribute, you can always check out the list of open issues labeled "Accepting Merge Requests".

For more information on how to contribute to Meltano, refer to our contribution guidelines.

Community

We host weekly online events where you can engage with us directly. Find more information on our Community page.

If you have any questions, want sneak peeks of features, or would just like to say hello and network, join our community of more than 2,500 data professionals!

👋 Join us on Slack!

Responsible Disclosure Policy

Please refer to the responsible disclosure policy on our website.

License

This code is distributed under the MIT license; see the LICENSE file.

meltano's People

Contributors

aaronsteers, afolson, alexmarple, austinpray, bencodezen, braedonleonard, buzzcutnorman, cjohnhanson, dependabot[bot], dosire, douwem, edgarrmondragon, emilieschario, github-actions[bot], gtsiolis, magreenbaum, meltybot, mjsqu, niallrees, nkclemson, pnadolny13, pre-commit-ci[bot], ra-one, rabidaudio, reubenfrankel, sbalnojan, tayloramurphy, visch, willdasilva, zamai


meltano's Issues

Product Vision for Data Engineering/Analytics/Science (DataOps)

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/54

Originally created by @tayloramurphy on 2018-06-20 21:00:46


I was rewatching the 2018 Product Vision video https://www.youtube.com/watch?v=RmSTLGnEmpQ and some thoughts came to mind around Meltano.

  • What if there was a separate tab within GitLab for Data Operations? Basically, a souped-up, better version of the Airflow UI for managing batch (maybe streaming) jobs, viewing logs, errors, etc.

    • It'd enable you to surface alerts like "hey, there's a new field on this Salesforce object. We've automatically mapped it to xyz, but you can override here"

    • You could have aggregate stats on specific jobs and highlight areas for improvement (Job Y has 4.3% failure rate - click to see logs)

    • Secret var management could be integrated and tied to specific jobs.

    • Schema manifests could be read and interacted with via the UI.

    • API limits could be declared and managed in the UI and the DataOps tab would keep track of calls. You could even declare the API test harness for each source.

  • CI isn't the right place for moving data

    • CI/CD is about testing, not actually moving data around. We could have default, recommended tests for each pipeline that's integrated into GitLab. The tests could be minimal around data integrity (like what we're doing with dbt), or it could be large-scale where ~10% of every table is used as the basis for a new data warehouse and the pipelines are run on that.

    • The DataOps tab then becomes the management center of actually moving the data around. Pipelines are continuously (every ~10 min.) kicked off and once tests pass on a new version of the pipeline, the next pipeline run picks up the new version

    • Keeping the focus of CI on actually testing everything about pipelines and data movement relieves the pressure of having to keep running everything all the time.

    • We could have a tight integration with dbt and show the transformation DAG that's generated

  • This could then translate into versioning ML models and monitoring their performance in production. So similar to how we can have "gitlab bot" auto deploy and auto revert, we could do the same thing with new versions of ML models if they pass or fail certain thresholds.

    • Then you can integrate things like lore so that in addition to a git clone of the project, you can meltano clone and get the harness required to do Machine Learning and to update any pipelines.

I'm a little all over the place with this, but that video got my brain juices flowing. The code of a project declares what the application should be doing, CI does the testing so that changes don't break the application, CD deploys new changes to the application. In this case, the application is moving data around constantly but we could make smart abstractions for that app to make it easier and integrated!

cc @joshlambert @jschatz1 @tlapiana @emilielimaburke @iroussos @mbergeron @zamai

Onboarding

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/27

Originally created by @mbergeron on 2018-04-03 14:18:24


We need an onboarding process for new developers on the BizOps project; off the top of my head, here are the tasks that are needed.

  • Add to the BizOps group
  • Add to the BizOps 1Password Vault
  • Give access to the gitlab-analysis GCP project

Improved secret management

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/21

Originally created by @joshlambert on 2018-06-05 22:01:41


We have locked down access to the protected secrets, which prevents users from having direct access to them. However, we still make them available for review apps, which means any developer can alter the .gitlab-ci.yml, print the secrets, and then view the build log to retrieve them. This means that any user with developer rights has access to all of the secrets for all of the data sources, which is a concern, especially as we move into more sensitive data sources.

Some possible solutions:

  1. Test harness (https://gitlab.com/meltano/meltano/issues/86): Utilize something like vcr to provide an automated API mock for review branches. This way the real secrets could only be available on protected branches, and we'd also not consume API quotas on review branches.
  2. Some type of forward proxy, which holds the secrets and performs the authentication. This seems unrealistic; I'm not sure if something like this even exists.

Something like a KMS won't really help address these issues, because of the review app problem noted above, but could help to further secure the secrets themselves.
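
A minimal sketch of option 1, assuming the Python vcrpy package; the endpoint, cassette names, and test shape are hypothetical. Real credentials would only be needed once on a protected branch to record the cassette, and review branches would replay it:

```python
import vcr
import requests

# Record the interaction once (on a protected branch with real credentials);
# afterwards, review branches replay the stored cassette instead of calling the API.
salesforce_vcr = vcr.VCR(
    cassette_library_dir="tests/cassettes",
    filter_headers=["Authorization"],  # keep tokens out of the recorded cassette
    record_mode="once",
)

def test_fetch_opportunities():
    with salesforce_vcr.use_cassette("salesforce_opportunities.yaml"):
        # Hypothetical endpoint and query, purely for illustration.
        response = requests.get(
            "https://example.my.salesforce.com/services/data/v52.0/query",
            params={"q": "SELECT Id FROM Opportunity LIMIT 10"},
            headers={"Authorization": "Bearer dummy-token"},
        )
        assert response.status_code == 200
```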

Mono Repo of Meltano Toolset

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/13

Originally created by @jschatz1 on 2018-07-23 17:33:40


During the 'Reference to deleted milestone 0.5.0' milestone, we saw a need to split the current meltano extractors and loaders into their own packages for discoverability purposes. We also saw that as a separation of concerns. To this end, the following projects were created:

  • meltano-common (shared module)
  • meltano-cli (cli interface)
  • meltano-load-postgresql
  • meltano-extract-fastly

As per @sytses, this move was detrimental to the contribution value of the project (which the team also agreed on). So we are moving things back to a monorepo.

This discussion spurred another separation of concerns: where should the data & analytics content sit in this structure? We decided on splitting the analytics project:

  • dbt transforms
  • python transforms
  • looker models
  • meltano manifests
  • ELT pipeline definition (gitlab-ci.yml)

From the meltano project:

  • CLI tool
  • Extractors
  • Loaders
  • CI/CD pipeline definition

After another round of feedback, the approach was reverted to a single repository that would host all of these together. This issue will track this merge.

cc @tayloramurphy @mbergeron

Add ELT job for packagecloud download stats

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/46

Originally created by @joshlambert on 2018-06-05 00:12:36


We use packagecloud to serve downloads of our packages. An important measure of installs/upgrades is to track downloads. Presently we have a manual process to get this information, but we should start collecting it automatically and incorporating it into the data warehouse.

See https://gitlab.com/gitlab-cookbooks/gitlab-packagecloud for more information on the current manual process.

Automatically adjust to SFDC schema changes

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/36

Originally created by @joshlambert on 2018-02-19 19:16:13


Fields in SFDC can change frequently. Right now we do not attempt to handle these changes automatically, which means manual effort whenever anything we currently use changes.

We should explore, in increasing complexity, the ability to handle these changes automatically.

For example, the easiest would be detecting that an object was simply renamed and adjusting accordingly.

In the event an object we were utilizing was deleted, we could potentially explore more descriptive alerts or errors to help flag the issue.
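
As a rough illustration of that simplest case, here is a hedged sketch of flagging a likely rename by diffing two schema snapshots; the field names and types are hypothetical:

```python
def detect_renames(old_fields, new_fields):
    """Guess renames: a field that disappeared plus a new field with the same type."""
    removed = {name: typ for name, typ in old_fields.items() if name not in new_fields}
    added = {name: typ for name, typ in new_fields.items() if name not in old_fields}
    renames = {}
    for old_name, old_type in removed.items():
        candidates = [name for name, typ in added.items() if typ == old_type]
        if len(candidates) == 1:  # only accept an unambiguous match
            renames[old_name] = candidates[0]
    return renames

# Example with hypothetical SFDC fields:
old = {"Amount__c": "currency", "Region__c": "picklist"}
new = {"Amount__c": "currency", "Sales_Region__c": "picklist"}
print(detect_renames(old, new))  # {'Region__c': 'Sales_Region__c'}
```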

[meta] Allow customer fields to be mapped to BizOps common data model fields

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/39

Originally created by @joshlambert on 2018-04-25 21:30:55


We are working to establish a common data model (https://gitlab.com/bizops/bizops/issues/9), to address some of the challenges in the sales and marketing analytics space. Namely, many of the fields in SFDC, Marketo, and Zuora are custom and therefore differ between customers. This makes any sort of common tool or pipeline difficult to create, as everyone's fields are different.

While the common data model is great, it will take time and effort for it to be embraced. In the interim, we need a practical solution to leverage as much of BizOps as possible, when your fields don't match the common model.

A solution to this is to build a mapping stage into our data pipeline.

  1. Extract: Extract and Load data as-is from source into staging table
  2. Mapping: Analyze staging table schema and map fields to common data model
  3. Transform: Using the map as input, transform from staging table to production table

We can work to improve the mapping process over time:

  1. As an MVC, it is a flat file which simply provides a 1:1 mapping from customer field to data model field
  2. As a next step, we can build a tool which takes this file as input, and outputs any missing data model fields
  3. From here, we can start to build intelligence where we attempt to auto-detect the fields and map them without user interaction. This will likely take several iterations.
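
A minimal sketch of the step-1 flat file in action, assuming a CSV with customer_field and model_field columns (the file name, columns, and field names are hypothetical):

```python
import csv

def load_mapping(path="field_mapping.csv"):
    """Read a 1:1 customer-field -> data-model-field mapping from a flat file."""
    with open(path, newline="") as f:
        return {row["customer_field"]: row["model_field"] for row in csv.DictReader(f)}

def apply_mapping(record, mapping):
    """Rename the keys of a staging-table record to common data model fields."""
    return {mapping.get(field, field): value for field, value in record.items()}

mapping = load_mapping()
staging_row = {"Opp_Amount__c": 1200, "Close_Dt__c": "2018-06-01"}
print(apply_mapping(staging_row, mapping))
```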

Access to full GitLab.com database for analytics

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/55

Originally created by @joshlambert on 2018-06-06 01:33:57


The product team (@JobV) has stated it is a critical short-term need to have access to the full GitLab.com database to run analytics.

We are working on an ELT to solve this, but proper pseudonymization is hard and we'd like to iterate on low-risk groups prior to pulling larger data sets, due to the high sensitivity of the content. The work and general plan is outlined here: https://gitlab.com/meltano/meltano/issues/80

This issue is to explore alternative methods to solve the immediate need, while we gain confidence and maturity with the pseudonymization process.

Proposal

  1. Set up a new GCP project, GCS bucket, bastion host and Cloud SQL instance.
  2. Enable full statement logging on Cloud SQL, and audit logs on the bastion host as well, to stackdriver
  3. Enable SSH access to the bastion host, for specific whitelisted users (no admin access)
  4. Set up a nightly full GitLab.com ELT dump, written into the GCS bucket
  5. Create a cron job on the bastion host to import from the GCS bucket to the Cloud SQL instance
  6. Access to Cloud SQL could be through a simple console SQL client
  7. If approved by security and as a further iteration, we could enable SSH port forwarding and run a VNC server, to allow a graphical SQL client to improve ease of use.

Test harness for ELT sources

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/25

Originally created by @joshlambert on 2018-04-02 18:14:44


We need a way to test changes to ELT sources, without hitting the real endpoints every time we run the test suite.
Many sources throttle the number of API requests or data you can pull, which can break not just the test CI jobs but also the production ELT pipeline.

We need a way to be able to run these without consuming significant amounts of real requests. A staging/sandbox account is not an option, as not all sources allow these, nor do we want to require one to get going with BizOps.

One option is to utilize an API play/record-style service (such as vcr, as mentioned in the secret management issue above).

The benefit is that the effort to create a mocked API endpoint is significantly reduced, which is critical because:

  • A customer's data source schema may change, and frequently does
  • New APIs may be implemented, which would require new APIs to be mocked

For these reasons, it would be nice for this to be relatively adaptive, especially in a customer situation.

Manage grants for automatically created tables

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/19

Originally created by @mbergeron on 2018-05-09 16:44:52


As we can now create tables automatically using the schema_apply action in our custom extraction jobs, how should we ensure the correct grants are also applied?

Right now the mkto.* and zendesk.* schemas both have tables being populated but only the gitlab user can read the data.

From analytics#43 I see that the analytics role should have SELECT on these tables, but right now we don't do it automatically.

I think that should be specified in the catalog that we will create.
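
Until that catalog exists, here is a hedged sketch of applying the grants after schema_apply, assuming psycopg2 and the schema/role names mentioned above (connection string is a placeholder):

```python
import psycopg2

SCHEMAS = ("mkto", "zendesk")  # schemas populated by the extraction jobs
ROLE = "analytics"             # role that should be able to read the data

conn = psycopg2.connect("dbname=warehouse user=gitlab")
with conn, conn.cursor() as cur:
    for schema in SCHEMAS:
        cur.execute(f"GRANT USAGE ON SCHEMA {schema} TO {ROLE}")
        cur.execute(f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {ROLE}")
        # Also cover tables created by future schema_apply runs.
        cur.execute(
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO {ROLE}"
        )
conn.close()
```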

/cc @tayloramurphy @iroussos

Extractor for Digital Ocean Billing Info

Make GitLab ELT Incremental

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/5

Originally created by @jschatz1 on 2018-07-25 21:59:41


The current GitLab ELT is implemented as follows:

  • Download CSV files from a GCS bucket
  • Decompress the files
  • Integrate the CSVs using the corresponding strategy (upsert or overwrite)

The bulk of this work is in the Pseudonymizer component of GitLab: we need to make it export only the new data since the last export. One way to do this would be to persist some kind of cursor (MAX(id) is a natural one for numeric ids; MAX(created_date) can also work) and, instead of walking through the whole data set, start the extraction from this cursor.

We already output metadata files in the pseudonymizer run, we could either add this to the metadata, or create a cursor.yml that tracks this.

The pseudonymizer would then:

  • Read the provided cursor file (either provided at invocation or fetched from the latest run or default cursors)
  • Extract starting at the cursor
  • Export the updated cursor
  • Upload the cursors along the data
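
A minimal sketch of the cursor round-trip described above, assuming a cursor.yml with a single max_id key (the exact file layout is hypothetical):

```python
import yaml

CURSOR_FILE = "cursor.yml"  # hypothetical name; could equally live in the run metadata

def read_cursor(path=CURSOR_FILE):
    """Return the last exported id, or 0 if no cursor exists yet."""
    try:
        with open(path) as f:
            data = yaml.safe_load(f) or {}
            return data.get("max_id", 0)
    except FileNotFoundError:
        return 0

def write_cursor(max_id, path=CURSOR_FILE):
    """Persist the updated cursor so the next run can resume from it."""
    with open(path, "w") as f:
        yaml.safe_dump({"max_id": max_id}, f)

# The extraction itself would then filter on the cursor, e.g.:
#   SELECT * FROM events WHERE id > :cursor ORDER BY id
```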

There should be a way to invalidate this cursor for any of these cases (this might be a follow-up MR; for now you can manually delete the cursor file):

  • An entity has changed:
    • New entity
    • New attribute
    • Changed transformation

cc @tayloramurphy @mbergeron

Extract SFDC data that's only accessible as a child query

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/57

Originally created by @tayloramurphy on 2018-07-13 21:30:49


The ActivityHistory Object isn't queryable directly. You have to access it with a specific opportunity in mind. So we'd have to iterate through every opportunity and query the table for each one.

Is there a way to do this with meltano components?

This is relatively low-priority now, but something to think about.

See:
https://stackoverflow.com/questions/35122751/querying-salesforce-activity-history-using-power-query-raises-datasource-error
https://salesforce.stackexchange.com/a/50149
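
A hedged sketch of that iteration, assuming the simple_salesforce client (credentials and field list are placeholders); ActivityHistory is reached through the ActivityHistories child relationship on Opportunity, as described in the links above:

```python
from simple_salesforce import Salesforce  # third-party client, used here for illustration

sf = Salesforce(username="...", password="...", security_token="...")

# Iterate opportunities, then pull ActivityHistory through the parent relationship,
# since the object cannot be queried directly.
opportunities = sf.query_all("SELECT Id FROM Opportunity")["records"]
for opp in opportunities:
    soql = (
        "SELECT Id, (SELECT Subject, ActivityDate FROM ActivityHistories) "
        f"FROM Opportunity WHERE Id = '{opp['Id']}'"
    )
    result = sf.query(soql)
    for record in result["records"]:
        activities = (record.get("ActivityHistories") or {}).get("records", [])
        # hand the activity rows to the loader here
```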

cc: @mbergeron @jschatz1 @tlapiana @zamai @iroussos

Data dictionary and Entity Relationship Diagram (ERD) generator

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/48

Originally created by @tayloramurphy on 2018-06-07 19:57:06


This is related to https://gitlab.com/meltano/meltano/issues/144

Most of the tools for managing data dictionaries and entity-relationship diagrams are suboptimal and not usually version-controlled. Long-term having some visual tooling around this would be cool, but having data models in version control along with human-readable descriptions of what they represent would be a huge win.

I see this as an extension of analytics#144 because it would allow us to define in human-readable terms what each field represents, not just its name.

Make pgbedrock work with CloudSQL

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/29

Originally created by @tayloramurphy on 2018-06-07 19:43:45


I have an issue open on the project (Squarespace/pgbedrock#12) about getting this working. Doesn't seem like a terribly heavy lift.

The project is Apache 2, so we should be good to go there.

Features I'd like to see:

  • Runs on every pipeline to validate permissions
  • Execute changes if there are discrepancies
    • Nice to have would be to log the change as well

Reverse data modeling step to create branch datasets

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/44

Originally created by @mbergeron on 2018-05-15 12:39:04


This is a brain dump from thoughts I had this weekend.

One of the goals of Meltano is to bring the software development workflow to data science. To me this means being able to tinker around as freely and friction-free as possible.

The first pain point I can identify is the need for coherent data sets. Our current solution is to clone the production database on each branch, so it is available and coherent. The branch's code can run, model, and analyse it (in fact do whatever with it). This seems ideal, but in fact has some caveats:

  • cloning the instance takes time, scaling with database size
  • cloning the instance can become costly for large teams (lots of Cloud SQL instances running)

I think the main perk of using the production data is having a coherent dataset that you can test your models/analyses on and expect results from.

Reverse the stats

Alright that might be a long shot, but bear with me.

We already have models around the production database, yielding some statistical metrics and other modeling layers (facts, measures, etc.). I'm not familiar with the data science lingo, but let's call this the analysis output.

Can we think of a way to build a dataset, deterministically, that would comply with the analysis output (within an error margin) but with a very small sample size? I understand that it is impossible to have all the analysis output right for this dataset, but we could maybe mock some of the stats if need be.

Think of it this way, your test could define what analysis output it needs to run, then the dataset would be created to comply with the current production's analysis output, but with a dataset of a smaller magnitude (in fact, the smallest possible).

Example

You have a model on a source of N=10e6 that aggregates the price -> average_price, max_price, q1_price, q2_price, q3_price, q4_price

| Instance   | N    | average_price | ... |
|------------|------|---------------|-----|
| Production | 10e6 | 100.25        | ... |
| Branch     | 100  | 100.25        | ... |
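
As a very rough sketch of the idea, one could search seeded (hence deterministic) random samples for the smallest one whose aggregate stays within the error margin; this is pure illustration, not a real implementation:

```python
import random

def small_matching_sample(prices, target_mean, tolerance=0.5, seed=42):
    """Find a small, deterministic sample whose mean is close to the production mean."""
    rng = random.Random(seed)  # fixed seed keeps the branch dataset reproducible
    for size in (10, 50, 100, 500):   # try increasingly large sample sizes
        for _ in range(1000):          # bounded number of attempts per size
            sample = rng.sample(prices, size)
            if abs(sum(sample) / size - target_mean) <= tolerance:
                return sample
    return None  # fall back to a full clone if nothing small enough matches
```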

/cc @tayloramurphy @joshlambert @iroussos

Website Content

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/11

Originally created by @jschatz1 on 2018-07-17 16:19:55


Just putting this out there. I am not saying this is the content, but just putting something out there.

Content

Header

Meltano
Tool for data scientists

The Tools

Extractors

Extract data from its source

  • Lever
  • Netsuite
  • Salesforce
  • BambooHR
  • Many more (link)

Loaders

Load that data into your data warehouse
Multiple dialects including:

  • Postgresql
  • BigQuery (coming soon)
  • MySQL (coming soon)

Transformers

Transform that data to get the answers you need using DBT.

Visualize

Using the lookml file format, describe your visualizations, and view them in Melt, our complete visualization tool.

Find out more on our README (link)


This is just a quick write up. Purposefully inaccurate information, to fill the void. Help me by responding with comments of what the right information is and I will update this description.

cc @mbergeron @iroussos @zamai @emilielimaburke @tlapiana

Add optional support for Object Storage to GitLab.com ELT

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/40

Originally created by @joshlambert on 2018-05-02 21:46:55


Currently our GitLab.com ELT writes the data into CSV files locally on disk, and then we have a CI job which picks them up and writes them into the data warehouse.

This works fine if the BizOps CI jobs and rake task are running in the same segment, but if you have these separated for security reasons, you will need to figure out how to move the files yourself.

It would be nice to add direct support for moving these up to and down from object storage, to reduce the burden on end users.
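
A hedged sketch of that direct support using the google-cloud-storage client; the bucket and path names are hypothetical:

```python
from google.cloud import storage

def upload_extract(local_path, bucket_name="bizops-elt", prefix="gitlab-dot-com"):
    """Push a locally written CSV to object storage so a separate job can load it."""
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(f"{prefix}/{local_path}")
    blob.upload_from_filename(local_path)

def download_extract(object_name, local_path, bucket_name="bizops-elt"):
    """Fetch a CSV from object storage before loading it into the warehouse."""
    client = storage.Client()
    client.bucket(bucket_name).blob(object_name).download_to_filename(local_path)
```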

Monorepo - [merged]

Merges monorepo -> master

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/merge_requests/1


Integrate the meltano-cli, meltano-common, meltano-load-postgresql, meltano-extract-fastly, melt projects into a single repo.

This is the first part of the integration; we have yet to come to a consensus about the analytics project and how it should be handled.

Ensure access and audit logs are generated for the data warehouse

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/28

Originally created by @joshlambert on 2018-06-07 19:20:36


We need to ensure we can audit the access and query logs for the data warehouse. This is important so that we can have visibility into the actions of a user, in the event we need to.

Right now, looking at Stackdriver for Cloud SQL, I can see connections and queries, but I don't see a way to attribute a specific query to a specific user.

dbt schema generation

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/45

Originally created by @joshlambert on 2018-05-09 20:08:50


As part of the larger goal to have a schema library, dbt should also output both what it expects as input and what it eventually outputs.

This will help us achieve two goals:

  1. A catalog of the schema for each ELT job, as well as what is expected and output by dbt: https://gitlab.com/bizops/looker/issues/46
  2. A common data model and mapping tool, to map a user's custom fields to what is ultimately expected by dbt: https://gitlab.com/bizops/bizops/issues/9

[meta] Establish common data model for analytics

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/34

Originally created by @joshlambert on 2017-11-09 23:52:16


Today the vast majority of the fields in SFDC, Zuora, Marketo, and other SaaS services are custom. While there are some default fields, these are the exception rather than the rule.

This means that every company has a different data schema, but is calculating largely similar types of metrics. (For example many SaaS companies utilize common metrics for establishing business performance, etc.)

This presents both a problem and an opportunity:

  • Setting up the integration between these services, and then the analytics to make use of the data, is time-consuming and expensive; it often involves consultants or dedicated employees.
  • We have an opportunity to try to establish a common "best practices" data model, where more of these types of analytics could "just work" if you followed the conventions. This would dramatically ease downstream analytics and more tools/config/samples could be shared and applied.

To that end, we should do a few things:

  1. Iterate ourselves towards the common "best practice" data model and schema.
  2. Implement a "mapping stage", to map a customer's custom fields to the fields in the common data model. This could be manual at first, and more automated/intelligent later.
  3. Evangelize the common data model, its benefits, and the interim bridge step of the mapping service.

Automatic ELT schema generation

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/18

Originally created by @joshlambert on 2018-05-09 20:02:31


We should have a schema file that is output by each of the ELT jobs, so it is easy to understand what data is being extracted, where it is coming from, and the general structure. This will be much easier to consume than trying to look at the code, or running the job and looking at the database.

This will also help us drive towards two other goals:

  1. A catalog of the schema for each ELT job, as well as what is expected and output by dbt: https://gitlab.com/meltano/looker/issues/46
  2. A common data model and mapping tool, to map a user's custom fields to what is ultimately expected by dbt: https://gitlab.com/meltano/meltano/issues/9
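
A minimal sketch of emitting such a schema file by introspecting the loaded tables, assuming psycopg2 and PyYAML; the connection string and output layout are hypothetical:

```python
import psycopg2
import yaml

def dump_schema(schema_name, out_path="schema.yml"):
    """Write the column names and types of every table in a schema to a YAML file."""
    conn = psycopg2.connect("dbname=warehouse user=gitlab")
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT table_name, column_name, data_type
            FROM information_schema.columns
            WHERE table_schema = %s
            ORDER BY table_name, ordinal_position
            """,
            (schema_name,),
        )
        tables = {}
        for table, column, data_type in cur.fetchall():
            tables.setdefault(table, {})[column] = data_type
    conn.close()
    with open(out_path, "w") as f:
        yaml.safe_dump({schema_name: tables}, f, sort_keys=False)
```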

Make extractors usable on their own

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/20

Originally created by @joshlambert on 2018-06-25 16:54:40


We discussed this in the past, but opted not to do it due to the increased complexity of adding capabilities which we do not immediately need: databases other than Postgres, splitting the extractors into separate projects, etc.

Now that we have more engineering resources on board, and have addressed the near-term needs of our internal data team, I think we should revisit this topic for a few reasons:

  1. Each of these extractors has their own value, and could generate interest on their own. For example, a good SFDC, Zuora, or Netsuite extractor would be useful for the broader community.
  2. It will take some time for us to really make the full meltano experience great, end to end.
  3. While that work is being done, we could start generating interest and critical mass with just the extractors themselves.
  4. Right now, however, there are a few major hurdles in driving usage of these extractors:
  • Our extractors only output to Postgres. There is no support for exporting to a file, or any other database type. If your EDW runs on BigQuery, we can't help you.
  • There is no SEO for the individual extractors. If you google for "sfdc extract", you aren't going to get a good hit based on the full Meltano readme.
  • Further, the extractors aren't easily usable on their own. It's expected that they are used in the context of the full project. For example, there is no canned image; instead they are pulled down with a git checkout.
  • We currently operate as a monorepo, and it is not user friendly to work on these in isolation. Our issues, MRs, READMEs, etc. all cover a broader scope than the simple sharp tool of extracting from a source.

There are some downsides:

  1. There will be some work to really "productize" these individually, if we are going with our own system.
  • We should accelerate the output to an intermediate format for the extractors, so we can support multiple storage engines. (PG, MySQL, Bigquery, Redshift, Snowflake, etc.) We can then build individual loaders for these.
  • We will need to rework the pipelines, to build a final image for each extractor. Then update the main CI pipeline.
  2. This work may delay the effort to productize the full meltano project, for example building the data mapping feature.

Rename Meltano Extract components

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/31

Originally created by @mbergeron on 2018-06-12 14:23:21


We should change the package names so we can start publishing them.

I suggest:

  • meltano-extract-common for the shared modules
  • meltano-extract-<source> for a specific data source

We shall start versioning at 0.1.0-dev0 for all components, or we could map our milestone in there (0.4.0-dev0) for the current version.

/cc @jschatz1 @iroussos @zamai
