
Azure Data Factory Validate Action

GitHub Action that validates all of the Azure Data Factory resources in your Git repository using the Azure Data Factory utilities package.

When to use

The action is particularly useful in Continuous Integration (CI) workflows, where a step can check that all Data Factory resources (e.g., pipelines, activities, linked services, datasets) in the target Git branch are valid before the changes are applied during the Continuous Deployment (CD) phase.

Getting Started

Prerequisites

Example Usage

steps:
  - name: Validate Data Factory resources
    uses: Azure/data-factory-validate-action@v1
    # with:
    #   path: ./mydir [optional]
    #   id: <data factory resource ID> [optional]

Inputs

  • path (optional): Directory that contains all Data Factory resources. Defaults to the ./ directory.

  • id (optional): Data Factory resource ID. Defaults to /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/resourceGroup/providers/Microsoft.DataFactory/factories/dataFactory.
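Putting the inputs together, a minimal CI job might look like the following sketch. The trigger, the adf folder, and the resource group/factory names are placeholders to adapt, and the v1 tag is illustrative (pin whatever version of the action you actually use):

```yaml
name: CI

on:
  pull_request:
    branches: [main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Validate Data Factory resources
        uses: Azure/data-factory-validate-action@v1
        with:
          # Placeholder values: adjust to your repository layout and factory.
          path: ./adf
          id: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.DataFactory/factories/myDataFactory
```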

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.


data-factory-validate-action's Issues

Getting timeout when using Azure/data-factory-validate-action in my Action with self-hosted runner behind a proxy

Below is my workflow:

runs-on: [ self-hosted, kubernetes, on-prem ]
steps:
  - name: Checkout branch
    uses: actions/checkout@v3
  - name: Validate ARM template
    uses: Azure/data-factory-validate-action@v1
    with:
      id: /subscriptions/xxxxxxxxxx/resourceGroups/rgname/providers/Microsoft.DataFactory/factories/myadf

It uses self-hosted runners that are behind the company proxy. I'm wondering whether I need to pass our proxy URI somehow, and how to do that if it would solve the issue. The ADF resource in Azure is set to public networking, and the proxy allows traffic from the self-hosted runner out to the ADF.

The workflow takes ~15 minutes to complete and fails with the read ETIMEDOUT error below:
Run Azure/data-factory-validate-action@v1
with:
id: /subscriptions//resourceGroups/rg-sample-dbx-dev/providers/Microsoft.DataFactory/factories/adf-daas-dev
path: ./
env:
ARM_SUBSCRIPTION_ID: ***
ADF_DEV_RG: ***
ADF_DEV_RESX_NAME: ***
ADF_TARGET_RG: ***
ADF_TARGET_RESX_NAME: ***
/usr/local/bin/docker run --name e2269cec0d69a52540c5bbc393ffff3f49bb_ae6300 --label 60e226 --workdir /github/workspace --rm -e "HTTP_PROXY" -e "http_proxy" -e "HTTPS_PROXY" -e "https_proxy" -e "NO_PROXY" -e "no_proxy" -e "ARM_SUBSCRIPTION_ID" -e "ADF_DEV_RG" -e "ADF_DEV_RESX_NAME" -e "ADF_TARGET_RG" -e "ADF_TARGET_RESX_NAME" -e "INPUT_ID" -e "INPUT_PATH" -e "HOME" -e "GITHUB_JOB" -e "GITHUB_REF" -e "GITHUB_SHA" -e "GITHUB_REPOSITORY" -e "GITHUB_REPOSITORY_OWNER" -e "GITHUB_REPOSITORY_OWNER_ID" -e "GITHUB_RUN_ID" -e "GITHUB_RUN_NUMBER" -e "GITHUB_RETENTION_DAYS" -e "GITHUB_RUN_ATTEMPT" -e "GITHUB_REPOSITORY_ID" -e "GITHUB_ACTOR_ID" -e "GITHUB_ACTOR" -e "GITHUB_TRIGGERING_ACTOR" -e "GITHUB_WORKFLOW" -e "GITHUB_HEAD_REF" -e "GITHUB_BASE_REF" -e "GITHUB_EVENT_NAME" -e "GITHUB_SERVER_URL" -e "GITHUB_API_URL" -e "GITHUB_GRAPHQL_URL" -e "GITHUB_REF_NAME" -e "GITHUB_REF_PROTECTED" -e "GITHUB_REF_TYPE" -e "GITHUB_WORKFLOW_REF" -e "GITHUB_WORKFLOW_SHA" -e "GITHUB_WORKSPACE" -e "GITHUB_EVENT_PATH" -e "GITHUB_PATH" -e "GITHUB_ENV" -e "GITHUB_STEP_SUMMARY" -e "GITHUB_STATE" -e "GITHUB_OUTPUT" -e "GITHUB_ACTION" -e "GITHUB_ACTION_REPOSITORY" -e "GITHUB_ACTION_REF" -e "RUNNER_OS" -e "RUNNER_ARCH" -e "RUNNER_NAME" -e "RUNNER_TOOL_CACHE" -e "RUNNER_TEMP" -e "RUNNER_WORKSPACE" -e "ACTIONS_RUNTIME_URL" -e "ACTIONS_RUNTIME_TOKEN" -e "ACTIONS_CACHE_URL" -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/runner/_work/_temp/_github_home":"/github/home" -v "/runner/_work/_temp/_github_workflow":"/github/workflow" -v "/runner/_work/_temp/_runner_file_commands":"/github/file_commands" -v "/runner/_work/daas-terraform-azurerm-azure-data-factory/daas-terraform-azurerm-azure-data-factory":"/github/workspace" 60e226:9cec0d69a52540c5bbc393ffff3f49bb "./" "/subscriptions/
/resourceGroups/rg-sample-dbx-dev/providers/Microsoft.DataFactory/factories/adf-daas-dev"
total 68
-rw-r--r-- 1 node node 1116 Apr 13 15:19 CHANGELOG.md
-rw-r--r-- 1 node node 4089 Apr 13 15:19 README.md
drwxr-xr-x 3 node node 4096 Apr 13 15:19 Terraform
drwxr-xr-x 3 node node 4096 Apr 13 15:19 adf_artifacts
drwxr-xr-x 2 node node 4096 Apr 13 15:19 build
drwxr-xr-x 2 node node 4096 Apr 13 15:19 dataflow
drwxr-xr-x 2 node node 4096 Apr 13 15:19 dataset
drwxr-xr-x 2 node node 4096 Apr 13 15:19 deploy
drwxr-xr-x 2 node node 4096 Apr 13 15:19 docs
drwxr-xr-x 2 node node 4096 Apr 13 15:19 example
drwxr-xr-x 2 node node 4096 Apr 13 15:19 factory
drwxr-xr-x 2 node node 4096 Apr 13 15:19 integrationRuntime
drwxr-xr-x 2 node node 4096 Apr 13 15:19 linkedService
drwxr-xr-x 2 node node 4096 Apr 13 15:19 pipeline
-rw-r--r-- 1 node node 66 Apr 13 15:19 publish_config.json
drwxr-xr-x 3 node node 4096 Apr 13 15:19 sample
drwxr-xr-x 2 node node 4096 Apr 13 15:19 scripts
Installing Azure Data Factory Utilities package...

added 2 packages, and audited 3 packages in 16s

found 0 vulnerabilities
Installation completed.
Validating /subscriptions//resourceGroups//providers/Microsoft.DataFactory/factories/* at /github/workspace...
Downloading bundle from: https://adf.azure.com/assets/cmd-api/main.js
Process cwd: /github/workspace
node:events:371
throw er; // Unhandled 'error' event
^

Error: read ETIMEDOUT
at TLSWrap.onStreamRead (node:internal/stream_base_commons:211:20)
Emitted 'error' event on ClientRequest instance at:
at TLSSocket.socketErrorListener (node:_http_client:447:9)
at TLSSocket.emit (node:events:394:28)
at emitErrorNT (node:internal/streams/destroy:157:8)
at emitErrorCloseNT (node:internal/streams/destroy:122:3)
at processTicksAndRejections (node:internal/process/task_queues:83:21) {
errno: -110,
code: 'ETIMEDOUT',
syscall: 'read'
}
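Not a confirmed fix, but the docker run line above shows the container already forwards HTTP_PROXY/HTTPS_PROXY/NO_PROXY (the -e "HTTP_PROXY" flags), so one thing to try is setting those variables explicitly on the step so the bundle download from https://adf.azure.com/assets/cmd-api/main.js goes through the corporate proxy. The proxy address below is a placeholder:

```yaml
  - name: Validate ARM template
    uses: Azure/data-factory-validate-action@v1
    env:
      # Placeholder proxy address: replace with your corporate proxy.
      HTTP_PROXY: http://proxy.example.com:8080
      HTTPS_PROXY: http://proxy.example.com:8080
      NO_PROXY: localhost,127.0.0.1
    with:
      id: /subscriptions/xxxxxxxxxx/resourceGroups/rgname/providers/Microsoft.DataFactory/factories/myadf
```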

How to fail the step whenever there is a validation error

Hi there,
I'm making a CI workflow where there is an ADF validation involved. So, suppose that there is a validation error like so:

Execution failed with exit code: 255
(Use `node --trace-uncaught ...` to show where the exception was thrown)
 ERROR === CmdApiApp: Resource 'A' has the following validation error: Execute Pipeline activity 'X' Pipeline is required.
 ERROR === CmdApiApp: Resource 'B' has the following validation error: Execute Pipeline activity 'Y' Pipeline is required.

Go to:
https://adf.azure.com/authoring?factory=/subscriptions/XXX/resourcegroups/YYY/providers/Microsoft.DataFactory/factories/ZZZ
to fix the validation errors.

=====ERROR=====
Error: Command failed: node  /github/workspace/adf/downloads/main.js validate /github/workspace/adf /subscriptions/XXX/resourceGroups/YYY/Microsoft.DataFactory/factories/ZZZ

Execution finished....
Validation completed.

How can we make the step fail whenever there is a validation error? So far, when I get an error message like the one above, the step still succeeds: printing the outcome of the step with echo ${{ steps.validate_adf.outcome }} prints 'success'.

I'd like to know whether this is expected behavior and, if so, how I can achieve this objective with this Action.

Thank you!

cc @fedeoliv @microsoftopensource
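A workaround sketch, not a documented feature of this action: run the @microsoft/azure-data-factory-utilities CLI directly in a run: step (this assumes a package.json whose build script points at the package's lib/index, as in the ADF CI/CD docs) and fail the step explicitly when the output contains the error marker shown above. The step id, path, and resource ID are placeholders:

```yaml
      - name: Validate Data Factory resources
        id: validate_adf
        shell: bash
        env:
          # Placeholder resource ID: replace with your factory's ID.
          ADF_ID: /subscriptions/XXX/resourceGroups/YYY/providers/Microsoft.DataFactory/factories/ZZZ
        run: |
          set -o pipefail
          # Validate and keep a copy of the output for inspection.
          npm run build validate ./ "$ADF_ID" 2>&1 | tee validate.log
          # The CLI can exit 0 even when it reports validation errors,
          # so grep for the error marker and fail the step explicitly.
          if grep -q "ERROR === CmdApiApp" validate.log; then
            echo "Data Factory validation reported errors" >&2
            exit 1
          fi
```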

The validation succeeds but with DynamicConnectorService Error

After running the validation step, it completes successfully, but gives this error:
Resource: /subscriptions/****************/resourceGroups/**********/providers/Microsoft.DataFactory/factories/***** RootFolder: /github/workspace/***/***
ModelService: synchronize - start
ModelService: Dynamic connector - Start registering dynamic connectors
ERROR === DynamicConnectorService: Dynamic connector - Error happens when loading dynamic connectors from adf resource: Cannot find module './550.js'

Can you please let me know the reason and suggest corrections?

DataFlow Validation not working with Data Explorer

Hello everyone,

Sorry if this is the wrong place; I couldn't find anywhere else to open an issue/bug regarding the package @microsoft/azure-data-factory-utilities. I noticed it is used here too, so I figured this might be the best place to post.

In my company we've been using the azure-data-factory-utilities package to validate our data factory on Azure before building it and then automatically deploying the ARM templates; more specifically, we use npm run build validate <datafactory> to validate it.
We recently ran into an issue with this package. We tried to deploy a new pipeline that has a DataFlow activity using a few Data Explorer linked services (both as source and sink). While everything runs and validates fine in ADF, the package doesn't correctly validate the new pipeline and returns the following error:

 CmdApiApp: Starting to validate all resources
 ModelService: _backgroundFetch - finished, duration: 2225.322690999994, total nameds: 236
 ERROR === CmdApiApp: 
Data Factory validation failed. Found 2 validation errors.

 ERROR === CmdApiApp: Resource 'ReconciliationDataIngestion_DataFlow' has the following validation error: Dataset is using 'AzureDataExplorer' linked service type, which is not supported in data flow.
 ERROR === CmdApiApp: Resource 'ReconciliationDataIngestion_DataFlow' has the following validation error: Dataset is using 'AzureDataExplorer' linked service type, which is not supported in data flow.
 ERROR === CmdApiApp: 

Since the package hasn't been updated in a few months, we suspect this is due to a misalignment between the package and the new features introduced in ADF in October 2021.
