Comments (4)
Hi, @EdoQuasso - The guid you pass in should simply be a negative number. Azure Purview will automatically create a new guid if you're creating a brand new entity (it knows if it's brand new based on the qualified name) or it will automatically find the existing entity and update it (again based on the qualified name).
So your AtlasProcess will look like...
    process = AtlasProcess(
        name="sample process",
        typeName="Process",
        qualified_name="pyapacheatlas://democustomprocess",
        inputs=[input01],
        outputs=[output01],
        guid=-102
    )
See the create entity and lineage sample for more examples.
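To make the negative guid idea concrete, here is a minimal sketch of a placeholder guid counter. `NegativeGuidTracker` is a hypothetical stand-in written only for illustration; pyapacheatlas has its own helper for this (the `gt.get_guid()` call that appears elsewhere in this thread), so treat the class name and the starting value as assumptions rather than the library's API. The only point is that every new entity in one upload batch gets its own distinct negative number.

```python
class NegativeGuidTracker:
    """Hypothetical stand-in: hands out unique negative placeholder guids."""

    def __init__(self, starting=-1000):
        # The starting value is arbitrary; it only has to be negative.
        self._next = starting

    def get_guid(self):
        # Return the current placeholder, then step further below zero
        # so the next caller receives a different value.
        guid = self._next
        self._next -= 1
        return guid


gt = NegativeGuidTracker()
input_guid = gt.get_guid()
output_guid = gt.get_guid()
process_guid = gt.get_guid()
print(input_guid, output_guid, process_guid)  # -1000 -1001 -1002
```

Each placeholder only needs to be unique within the batch you upload; the real guid is assigned (or looked up by qualified name) on the server side, as described above.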
from pyapacheatlas.
Hi @wjohnson, thank you very much for the quick reply.
May I ask you another question? Is it possible to attach an Atlas process to an Azure Data Factory Copy Activity?
Because when I try, it raises this error:
AtlasException: {"requestId":"96b9b506-7499-4455-859d-64b133c62aac","errorCode":"ATLAS-400-00-036","errorMessage":"invalid relationshipDef: dataset_process_inputs: end type 1: DataSet, end type 2: Process"}
The code is the following:
    output01 = AtlasEntity(
        name="output01",
        typeName="azure_datalake_gen2_path",
        qualified_name="apacheatlas://fake_output",
        guid=-110
    )
    newLineage = AtlasProcess(
        name="Databricks Processing",
        typeName="custom_databricks_notebook_process",
        qualified_name=process_qn,
        inputs=[entity1["entities"][0]],
        outputs=[output01],
        guid=gt.get_guid()
    )
    results = client.upload_entities(
        batch=[output01, newLineage]
    )
The entity1 is an Azure Data Factory Copy Activity as you can see:
{'referredEntities': {'a56acc20-481e-43c7-b62f-192a1daffe8b': {'typeName': 'adf_copy_operation', 'attributes': {'owner': None, 'outputs': [{'guid': 'c44e46d4-eef5-4139-8eb8-1deb6cfec0a7', 'typeName': 'azure_datalake_gen2_path', 'uniqueAttributes': {'qualifiedName': 'https://pvdemo.dfs.core.windows.net/demo/raw/Customer.csv'}}], 'replicatedTo': None, 'replicatedFrom': None, 'qualifiedName': '/subscriptions/b8e41e61-5fb3-4590-aefb-714df2bdb6ec/resourceGroups/Purview_demo/providers/Microsoft.DataFactory/factories/pv-demo/pipelines/P_Customer/activities/Source to Raw#https://pvdemo.dfs.core.windows.net/demo/raw/Customer.csv#azure_datalake_gen2_path', 'inputs': [{'guid': '23459b6a-8d72-4b6e-9a31-3ff6f6f60000', 'typeName': 'azure_sql_table', 'uniqueAttributes': {'qualifiedName': 'mssql://pv-demo.database.windows.net/BI/src/Customer'}}], 'name': 'Source to Raw', 'errorMessage': None, 'description': None, 'status': None, 'columnMapping': '[{"DatasetMapping":{"Source":"*","Sink":"https://pvdemo.dfs.core.windows.net/demo/raw/Customer.csv"},"ColumnMapping":[{"Source":"CustomerID","Sink":"Prop_0"},{"Source":"NameStyle","Sink":"Prop_1"},{"Source":"Title","Sink":"Prop_2"},{"Source":"FirstName","Sink":"Prop_3"},{"Source":"MiddleName","Sink":"Prop_4"},{"Source":"LastName","Sink":"Prop_5"},{"Source":"Suffix","Sink":"Prop_6"},{"Source":"CompanyName","Sink":"Prop_7"},{"Source":"SalesPerson","Sink":"Prop_8"},{"Source":"EmailAddress","Sink":"Prop_9"},{"Source":"Phone","Sink":"Prop_10"},{"Source":"PasswordHash","Sink":"Prop_11"},{"Source":"PasswordSalt","Sink":"Prop_12"},{"Source":"rowguid","Sink":"Prop_13"},{"Source":"ModifiedDate","Sink":"Prop_14"}]}]'}, 'lastModifiedTS': '1', 'guid': 'a56acc20-481e-43c7-b62f-192a1daffe8b', 'status': 'ACTIVE', 'createdBy': 'ServiceAdmin', 'updatedBy': 'ServiceAdmin', 'source': 'DataFactory', 'createTime': 1614614786093, 'updateTime': 1614614786093, 'version': 0, 'relationshipAttributes': {'outputs': [{'guid': 'c44e46d4-eef5-4139-8eb8-1deb6cfec0a7', 
'typeName': 'azure_datalake_gen2_path', 'entityStatus': 'ACTIVE', 'displayText': 'Customer.csv', 'relationshipType': 'process_dataset_outputs', 'relationshipGuid': 'b752a816-58a0-4a5b-9cae-c1fb6f812398', 'relationshipStatus': 'ACTIVE', 'relationshipAttributes': {'typeName': 'process_dataset_outputs'}}], 'parent': {'guid': 'c890633e-fd69-42e2-b5c5-64753d2f976a', 'typeName': 'adf_copy_activity', 'entityStatus': 'ACTIVE', 'displayText': 'Source to Raw', 'relationshipType': 'process_parent', 'relationshipGuid': '8d79ac6a-642c-4758-94ea-a15919457b52', 'relationshipStatus': 'ACTIVE', 'relationshipAttributes': {'typeName': 'process_parent'}}, 'inputs': [{'guid': '23459b6a-8d72-4b6e-9a31-3ff6f6f60000', 'typeName': 'azure_sql_table', 'entityStatus': 'ACTIVE', 'displayText': 'Customer', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': '36442682-9fc7-4b07-b8f0-72e08bd40a68', 'relationshipStatus': 'ACTIVE', 'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}], 'subProcesses': [], 'meanings': []}}}, 'entities': [{'typeName': 'adf_copy_activity', 'attributes': {'owner': None, 'outputs': [], 'replicatedTo': None, 'replicatedFrom': None, 'qualifiedName': '/subscriptions/b8e41e61-5fb3-4590-aefb-714df2bdb6ec/resourceGroups/Purview_demo/providers/Microsoft.DataFactory/factories/pv-demo/pipelines/P_Customer/activities/Source to Raw', 'inputs': [], 'errorMessage': '', 'description': None, 'dataSize': 217829, 'lastRunTime': 1614614763078, 'name': 'Source to Raw', 'rowCount': 847, 'status': 'Completed', 'columnMapping': None}, 'lastModifiedTS': '2', 'guid': 'c890633e-fd69-42e2-b5c5-64753d2f976a', 'status': 'ACTIVE', 'createdBy': 'ServiceAdmin', 'updatedBy': 'ServiceAdmin', 'source': 'DataFactory', 'createTime': 1614614784352, 'updateTime': 1614614787018, 'version': 0, 'relationshipAttributes': {'outputs': [], 'parent': {'guid': '1d606f7b-ed4d-4387-8d34-b4fc506eded5', 'typeName': 'adf_pipeline', 'entityStatus': 'ACTIVE', 'displayText': 'P_Customer', 
'relationshipType': 'process_parent', 'relationshipGuid': 'a3bc8e61-099a-45e2-9c76-a6a31aba9b3a', 'relationshipStatus': 'ACTIVE', 'relationshipAttributes': {'typeName': 'process_parent'}}, 'inputs': [], 'subProcesses': [{'guid': 'a56acc20-481e-43c7-b62f-192a1daffe8b', 'typeName': 'adf_copy_operation', 'entityStatus': 'ACTIVE', 'displayText': 'Source to Raw', 'relationshipType': 'process_parent', 'relationshipGuid': '8d79ac6a-642c-4758-94ea-a15919457b52', 'relationshipStatus': 'ACTIVE', 'relationshipAttributes': {'typeName': 'process_parent'}}], 'runInstances': [], 'meanings': []}}]}
Thank you very much in advance.
Edoardo
@EdoQuasso - The adf_copy_operation type is a Process type. A Process entity cannot be an input or output to another Process entity. You have to have an intermediate dataset. Presumably, the copy activity creates a dataset, right? I'd take the output of the copy activity and then THAT entity's guid would be the input to your new lineage process with Databricks.
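As a rough sketch of that suggestion, the snippet below pulls the output dataset references out of a response shaped like the `entity1` dump earlier in the thread. `collect_output_datasets` is a hypothetical helper name, and the `entity1` dict here is a trimmed copy of that dump, kept only to the keys the function reads. It uses plain dictionaries and no pyapacheatlas calls.

```python
def collect_output_datasets(response):
    """Return the unique output references of every process in the response."""
    # Look at both the top-level entities and the referred entities,
    # since the copy operation sits under 'referredEntities' in the dump.
    processes = list(response.get("entities", []))
    processes += list(response.get("referredEntities", {}).values())

    seen, outputs = set(), []
    for proc in processes:
        # 'outputs' may be missing, None, or an empty list.
        for ref in proc.get("attributes", {}).get("outputs") or []:
            if ref["guid"] not in seen:
                seen.add(ref["guid"])
                outputs.append(ref)
    return outputs


# Trimmed copy of the response shown earlier in the thread.
entity1 = {
    "referredEntities": {
        "a56acc20-481e-43c7-b62f-192a1daffe8b": {
            "typeName": "adf_copy_operation",
            "attributes": {
                "outputs": [
                    {
                        "guid": "c44e46d4-eef5-4139-8eb8-1deb6cfec0a7",
                        "typeName": "azure_datalake_gen2_path",
                        "uniqueAttributes": {
                            "qualifiedName": "https://pvdemo.dfs.core.windows.net/demo/raw/Customer.csv"
                        },
                    }
                ],
            },
        }
    },
    "entities": [
        {"typeName": "adf_copy_activity", "attributes": {"outputs": []}}
    ],
}

copy_outputs = collect_output_datasets(entity1)
for ref in copy_outputs:
    print(ref["typeName"], ref["guid"])
```

The dataset found this way (the `azure_datalake_gen2_path` entity), rather than `entity1["entities"][0]`, is what would go into `inputs` of the new Databricks process. Whether `AtlasProcess` accepts the raw reference dict as-is or needs it wrapped in an `AtlasEntity` is worth checking against the pyapacheatlas docs.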
Thank you very much again @wjohnson for the response. Yes, that's what I thought, and it's how I managed the lineage. I asked because I hoped there was another way.
Thank you very much for your time.
Edoardo