Comments (6)

zsponaugle commented on July 17, 2024

Dang. I should also note that I made two minor code additions, although I don't see how they would have any effect on this issue. Please see the code comments below.

```python
atlas_type_defs = client.get_all_typedefs()

excel_results = excel_reader.parse_lineages(
    file_path,
    atlas_type_defs,
    use_column_mapping=True
)

for row in excel_results:

    # Set the name attribute. The default code overwrites the name
    # with the qualified name.
    if 'table' in row['typeName']:
        name = row['attributes']['name']
        name = name[name.rindex('/')+1:]
        row['attributes']['name'] = name

    # Look up the data type for the column and add it to the payload.
    # Otherwise you will get an error saying you need to include the
    # data type when uploading a column.
    elif row['typeName'].endswith('_column'):
        results = client.get_entity(typeName=row['typeName'], qualifiedName=row['attributes']['qualifiedName'])
        for entity in results['entities']:
            row['attributes']['data_type'] = entity['attributes']['data_type']

# Upload the Excel file's content to Atlas and view the guid
# assignments to confirm a successful upload.
uploaded_entities = client.upload_entities(excel_results)
```
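For anyone wanting to rule out the name-trimming addition as a factor, it can be pulled into a small helper and checked without a live client. This is only a sketch: `short_name` and `fix_table_names` are hypothetical names, the rows are made-up dicts shaped like the ones indexed in the loop above, and `rsplit` is swapped in for `rindex` so a name without a `/` is left unchanged rather than raising.

```python
def short_name(qualified_name: str) -> str:
    """Return the text after the last '/', or the input unchanged if there is none."""
    return qualified_name.rsplit("/", 1)[-1]


def fix_table_names(rows):
    """Trim the 'name' attribute of table rows in place and return the rows."""
    for row in rows:
        if "table" in row["typeName"]:
            attrs = row["attributes"]
            attrs["name"] = short_name(attrs["name"])
    return rows


# Made-up rows for illustration; real rows come from parse_lineages.
rows = [
    {"typeName": "mssql_table",
     "attributes": {"name": "mssql://my_db_server.com/my_database/dbo/my_target_table"}},
    {"typeName": "mssql_process", "attributes": {"name": "some_adf_job"}},
]
fix_table_names(rows)
print([row["attributes"]["name"] for row in rows])
```

Only rows whose type name contains `table` are touched, which matches the condition in the original loop.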

Anyway, I've deleted all the process-related entities several times and retried, but no luck. Thanks for looking into things on your end. I will let you know if I come across a fix.

from pyapacheatlas.

wjohnson commented on July 17, 2024

@zsponaugle Thank you again for continuing to use the package!

Would you be able to share the Excel file you used for the upload AND run the get_entity_lineage command on the Purview Client for your SQL table?
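A rough sketch of what that request might look like in code, assuming `client` is the same authenticated client used for the upload and that `get_entity_lineage` accepts the entity's guid as its first argument. The helper name `lineage_for_table` is hypothetical, and taking the first match from `get_entity` is an assumption.

```python
def lineage_for_table(client, type_name, qualified_name):
    """Look up an entity by type and qualified name, then return its lineage response."""
    found = client.get_entity(typeName=type_name, qualifiedName=qualified_name)
    entities = found.get("entities", [])
    if not entities:
        raise LookupError(f"No {type_name} found for {qualified_name}")
    # Assumption: the first match is the table of interest.
    return client.get_entity_lineage(entities[0]["guid"])


# Usage against a live client (not run here):
# lineage = lineage_for_table(client, "mssql_table", "<qualified name of the table>")
# print(lineage)
```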

Please be sure to sanitize the data like you did for the screenshot :)

zsponaugle commented on July 17, 2024

Hi Will, yes here you go.

censored_atlas_upload_table_column_lineage.xlsx

Note there are some unrelated classification and glossary terms belonging to this table.

get_entity_lineage results for the target table (my_target_table):

{'baseEntityGuid': 'db71092e-6e6e-42c4-8322-b9f6f6f60000', 'lineageDirection': 'BOTH', 'lineageDepth': 3, 'lineageWidth': 10, 'childrenCount': -1, 'guidEntityMap': {'db71092e-6e6e-42c4-8322-b9f6f6f60015': {'typeName': 'mssql_column', 'attributes': {'userTypeId': 175, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table#my_target_column_3', 'precision': 0, 'length': 5, 'encryptionType': 0, 'columnEncryptionKeyId': 0, 'description': 'my_description', 'scale': 0, 'isXmlDocument': 'false', 'isMasked': 'false', 'xmlCollectionId': 0, 'isHidden': 'false', 'name': 'my_target_column_3', 'data_type': 'char', 'systemTypeId': 175}, 'guid': 'db71092e-6e6e-42c4-8322-b9f6f6f60015', 'status': 'ACTIVE', 'displayText': 'my_target_column_3', 'classificationNames': ['my_classification_1'], 'meaningNames': ['my_glossary_term'], 'meanings': [{'termGuid': 'c0ab4819-409c-4095-8ed2-b320268a1c98', 'relationGuid': 'b99e3343-d4f2-4e80-98d8-84e53f333134', 'displayText': 'my_glossary_term', 'confidence': 0}]}, 'db71092e-6e6e-42c4-8322-b9f6f6f6000a': {'typeName': 'mssql_column', 'attributes': {'userTypeId': 59, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table#my_target_column_2', 'precision': 24, 'length': 4, 'encryptionType': 0, 'columnEncryptionKeyId': 0, 'description': 'my_description', 'scale': 0, 'isXmlDocument': 'false', 'isMasked': 'false', 'xmlCollectionId': 0, 'isHidden': 'false', 'name': 'my_target_column_2', 'data_type': 'real', 'systemTypeId': 59}, 'guid': 'db71092e-6e6e-42c4-8322-b9f6f6f6000a', 'status': 'ACTIVE', 'displayText': 'my_target_column_2', 'classificationNames': ['my_classification_1'], 'meaningNames': ['my_glossary_term'], 'meanings': [{'termGuid': 'def1eedc-98a0-4044-9f50-07796ac06a15', 'relationGuid': 'fb013e80-e4a2-410b-9906-37a6cf5b95c1', 'displayText': 'my_glossary_term', 'confidence': 0}]}, 'db71092e-6e6e-42c4-8322-b9f6f6f60000': {'typeName': 'mssql_table', 'attributes': 
{'modifiedTime': 1585446528000, 'createTime': 1522933267000, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table', 'name': 'my_target_table', 'principalId': 0, 'objectType': 'U '}, 'lastModifiedTS': '192', 'guid': 'db71092e-6e6e-42c4-8322-b9f6f6f60000', 'status': 'ACTIVE', 'displayText': 'my_target_table', 'classificationNames': ['my_classification_1'], 'meaningNames': ['my_glossary_term'], 'meanings': [{'termGuid': 'a0dfd070-df24-4fe6-ab87-487ebd266ba3', 'relationGuid': '05d2132c-0534-46b0-b9eb-09bca07cdc12', 'displayText': 'my_glossary_term', 'confidence': 0}]}, 'e5ba7f84-33fc-441d-b6b3-0b44c11f3e6d': {'typeName': 'mssql_column_lineage', 'attributes': {'dependencyType': 'SIMPLE', 'qualifiedName': 'some_adf_job@derived_column:my_target_column_2', 'name': 'some_adf_job'}, 'lastModifiedTS': '1', 'guid': 'e5ba7f84-33fc-441d-b6b3-0b44c11f3e6d', 'status': 'ACTIVE', 'displayText': 'some_adf_job', 'classificationNames': [], 'meaningNames': [], 'meanings': []}, '187c7015-8695-4bb3-87f6-63f6f6f60000': {'typeName': 'mssql_table', 'attributes': {'modifiedTime': 1608426020000, 'createTime': 1522314481000, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_source_table_1', 'name': 'my_source_table_1', 'principalId': 0, 'objectType': 'U '}, 'lastModifiedTS': '4', 'guid': '187c7015-8695-4bb3-87f6-63f6f6f60000', 'status': 'ACTIVE', 'displayText': 'my_source_table_1', 'classificationNames': [], 'meaningNames': [], 'meanings': []}, 'f803577a-2205-43b8-a2c9-70f6f6f60000': {'typeName': 'mssql_table', 'attributes': {'modifiedTime': 1612054805000, 'createTime': 1522314480000, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_source_table_2', 'name': 'my_source_table_2', 'principalId': 0, 'objectType': 'U '}, 'lastModifiedTS': '1', 'guid': 'f803577a-2205-43b8-a2c9-70f6f6f60000', 'status': 'ACTIVE', 'displayText': 'my_source_table_2', 'classificationNames': [], 'meaningNames': [], 
'meanings': []}, 'db71092e-6e6e-42c4-8322-b9f6f6f60030': {'typeName': 'mssql_column', 'attributes': {'userTypeId': 108, 'qualifiedName': 'mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table#my_target_column_1', 'precision': 12, 'length': 9, 'encryptionType': 0, 'columnEncryptionKeyId': 0, 'description': 'my_description', 'scale': 2, 'isXmlDocument': 'false', 'isMasked': 'false', 'xmlCollectionId': 0, 'isHidden': 'false', 'name': 'my_target_column_1', 'data_type': 'decimal', 'systemTypeId': 108}, 'guid': 'db71092e-6e6e-42c4-8322-b9f6f6f60030', 'status': 'ACTIVE', 'displayText': 'my_target_column_1', 'classificationNames': ['my_classification_1'], 'meaningNames': ['my_target_column_1 '], 'meanings': [{'termGuid': 'c4a143e8-ae7a-4f8b-8412-63e4331b1aff', 'relationGuid': 'e1683e5f-ff2e-4132-8411-44db753f0531', 'displayText': 'my_target_column_1 ', 'confidence': 0}]}, '6aa293e6-6b6b-4ddf-a6a9-f3c306071a15': {'typeName': 'mssql_column_lineage', 'attributes': {'expression': 'my_source_column_3 + my_source_column_4', 'dependencyType': 'EXPRESSION', 'qualifiedName': 'some_adf_job@derived_column:my_target_column_3', 'name': 'some_adf_job'}, 'lastModifiedTS': '1', 'guid': '6aa293e6-6b6b-4ddf-a6a9-f3c306071a15', 'status': 'ACTIVE', 'displayText': 'some_adf_job', 'classificationNames': [], 'meaningNames': [], 'meanings': []}, 'c55e9c2f-6c2b-48a8-9fea-eae7e904f876': {'typeName': 'mssql_column_lineage', 'attributes': {'dependencyType': 'SIMPLE', 'qualifiedName': 'some_adf_job@derived_column:my_target_column_1', 'name': 'some_adf_job'}, 'lastModifiedTS': '1', 'guid': 'c55e9c2f-6c2b-48a8-9fea-eae7e904f876', 'status': 'ACTIVE', 'displayText': 'some_adf_job', 'classificationNames': [], 'meaningNames': [], 'meanings': []}, '3541e013-c56b-4e05-b16f-e4b751efc034': {'typeName': 'mssql_process', 'attributes': {'qualifiedName': 'some_adf_job', 'name': 'some_adf_job', 'columnMapping': '[{"ColumnMapping": [{"Source": "my_source_column_1", "Sink": 
"my_target_column_1"}, {"Source": "my_source_column_2", "Sink": "my_target_column_2"}, {"Source": "my_source_column_3", "Sink": "my_target_column_3"}], "DatasetMapping": {"Source": "mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_source_table_1", "Sink": "mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table"}}, {"ColumnMapping": [{"Source": "my_source_column_4", "Sink": "my_target_column_3"}], "DatasetMapping": {"Source": "mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_source_table_2", "Sink": "mssql://my_db_server.com:my_port/MSSQLSERVER/my_database/dbo/my_target_table"}}]'}, 'lastModifiedTS': '1', 'guid': '3541e013-c56b-4e05-b16f-e4b751efc034', 'status': 'ACTIVE', 'displayText': 'some_adf_job', 'classificationNames': [], 'meaningNames': [], 'meanings': []}}, 'includeParent': False, 'relations': [{'fromEntityId': '187c7015-8695-4bb3-87f6-63f6f6f60000', 'toEntityId': '3541e013-c56b-4e05-b16f-e4b751efc034', 'relationshipId': '807d81cc-ce6a-4eae-9087-558144716d5a'}, {'fromEntityId': 'f803577a-2205-43b8-a2c9-70f6f6f60000', 'toEntityId': '3541e013-c56b-4e05-b16f-e4b751efc034', 'relationshipId': 'd7c0e295-b569-4b6f-a879-9ddc5fcf33e6'}, {'fromEntityId': 'c55e9c2f-6c2b-48a8-9fea-eae7e904f876', 'toEntityId': 'db71092e-6e6e-42c4-8322-b9f6f6f60030', 'relationshipId': '3a714c76-d38e-4847-8adc-623bc8412bed'}, {'fromEntityId': '187c7015-8695-4bb3-87f6-63f6f6f60000', 'toEntityId': 'c55e9c2f-6c2b-48a8-9fea-eae7e904f876', 'relationshipId': '965c64f6-818c-4a7e-b890-633be87322c7'}, {'fromEntityId': '3541e013-c56b-4e05-b16f-e4b751efc034', 'toEntityId': 'db71092e-6e6e-42c4-8322-b9f6f6f60000', 'relationshipId': '74257f36-9369-401a-b648-11fe6552e37f'}, {'fromEntityId': 'e5ba7f84-33fc-441d-b6b3-0b44c11f3e6d', 'toEntityId': 'db71092e-6e6e-42c4-8322-b9f6f6f6000a', 'relationshipId': '6fe73c27-f6fd-438b-9a34-cf9d7d60840c'}, {'fromEntityId': '187c7015-8695-4bb3-87f6-63f6f6f60000', 'toEntityId': '6aa293e6-6b6b-4ddf-a6a9-f3c306071a15', 
'relationshipId': 'ac47544d-8190-4b09-a3d1-3525273aca0d'}, {'fromEntityId': '187c7015-8695-4bb3-87f6-63f6f6f60000', 'toEntityId': 'e5ba7f84-33fc-441d-b6b3-0b44c11f3e6d', 'relationshipId': '9242c437-6987-4b12-8bca-0850da9f54fd'}, {'fromEntityId': '6aa293e6-6b6b-4ddf-a6a9-f3c306071a15', 'toEntityId': 'db71092e-6e6e-42c4-8322-b9f6f6f60015', 'relationshipId': '6040df6c-0770-45a6-817e-71a0aed16a78'}, {'fromEntityId': 'f803577a-2205-43b8-a2c9-70f6f6f60000', 'toEntityId': '6aa293e6-6b6b-4ddf-a6a9-f3c306071a15', 'relationshipId': 'e6df5a95-b020-47bf-90d1-2456a4bf205b'}], 'parentRelations': [], 'widthCounts': {'INPUT': {}, 'OUTPUT': {}}}
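A response like this is easier to read as a list of edges. The sketch below is one way to do that, relying only on the `guidEntityMap` and `relations` keys visible in the output above; `lineage_edges` is a hypothetical helper name, and the sample uses made-up one-letter guids in place of the real ones.

```python
def lineage_edges(lineage):
    """Return (from_label, to_label) pairs for every relation in a lineage response."""
    entities = lineage.get("guidEntityMap", {})

    def label(guid):
        # Fall back to the raw guid if the entity is missing from the map.
        entity = entities.get(guid, {})
        return f"{entity.get('displayText', guid)} ({entity.get('typeName', 'unknown')})"

    return [
        (label(rel["fromEntityId"]), label(rel["toEntityId"]))
        for rel in lineage.get("relations", [])
    ]


# Trimmed-down sample shaped like the response above; guids are placeholders.
sample = {
    "guidEntityMap": {
        "a": {"typeName": "mssql_table", "displayText": "my_source_table_1"},
        "b": {"typeName": "mssql_process", "displayText": "some_adf_job"},
        "c": {"typeName": "mssql_table", "displayText": "my_target_table"},
    },
    "relations": [
        {"fromEntityId": "a", "toEntityId": "b"},
        {"fromEntityId": "b", "toEntityId": "c"},
    ],
}
for src, dst in lineage_edges(sample):
    print(f"{src} -> {dst}")
```

Passing the full response instead of `sample` should list the table-to-process edges alongside the column-lineage edges, which makes it quicker to see which source tables feed which targets.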

zsponaugle commented on July 17, 2024

Also, the column search sidebar on the left appears when I am viewing my_target_table and my_source_table_1, but not my_source_table_2. This is for the MS SQL examples; it works fine in the demo sample.
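One thing that may help narrow this down is checking which source datasets the process entity's `columnMapping` attribute covers. In the lineage response earlier in the thread, that attribute is a JSON string holding a list of blocks, each with a `ColumnMapping` list and a `DatasetMapping`. The sketch below groups column pairs by source dataset; `mapped_columns` is a hypothetical helper, and the sample uses shortened table names as stand-ins for the full qualified names.

```python
import json


def mapped_columns(column_mapping_json):
    """Group (source_column, sink_column) pairs by source dataset."""
    by_source = {}
    for block in json.loads(column_mapping_json):
        source = block["DatasetMapping"]["Source"]
        pairs = [(m["Source"], m["Sink"]) for m in block["ColumnMapping"]]
        by_source.setdefault(source, []).extend(pairs)
    return by_source


# Shortened stand-in for the columnMapping string on the mssql_process entity.
sample = json.dumps([
    {"ColumnMapping": [{"Source": "my_source_column_1", "Sink": "my_target_column_1"}],
     "DatasetMapping": {"Source": "my_source_table_1", "Sink": "my_target_table"}},
    {"ColumnMapping": [{"Source": "my_source_column_4", "Sink": "my_target_column_3"}],
     "DatasetMapping": {"Source": "my_source_table_2", "Sink": "my_target_table"}},
])
for source, pairs in mapped_columns(sample).items():
    print(source, pairs)
```

If a source table is missing from the result, or its qualified name differs from the one on the table entity, that would be worth comparing against the spreadsheet. Whether that explains the missing sidebar is not something this thread confirms.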

wjohnson commented on July 17, 2024

Thanks for this @zsponaugle! I'm having trouble replicating this :( Using your sample Excel file, I'm seeing tables with table-level lineage and columns with column-level lineage.

Would you try deleting the process entities you see in the lineage graph and trying again from your original sheet? I'm wondering whether your assets would line back up correctly if you re-ran your spreadsheet using client.get_all_typedefs() instead of the scaffolding (since you already defined the custom type defs).

wjohnson commented on July 17, 2024

Closing for now but please feel free to re-open if you want a second pair of eyes on a reproducible snippet of code.
