wjohnson / pyapacheatlas
A Python package to help work with the Apache Atlas REST APIs
Home Page: https://wjohnson.github.io/pyapacheatlas-docs/latest/
License: MIT License
Enable a rough "discovery" of possible candidates for assets that need to be tagged with a glossary term.
This should be possible through the advanced search and by parsing the glossary terms.
pyapacheatlas lets you upload entities with classifications specified. It would be useful to be able to define new classifications directly from the Excel template.
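For context, a minimal sketch of how a new classification can be defined programmatically today, assuming ClassificationTypeDef from pyapacheatlas.core.typedef and an already-authenticated client; the Excel template would generate an equivalent upload:

```python
# Sketch: define and upload a classification type def before tagging entities.
# ClassificationTypeDef usage is an assumption based on the library's typedef module.
from pyapacheatlas.core.typedef import ClassificationTypeDef

manual_import = ClassificationTypeDef(
    name="manual_import",
    description="Entities imported by hand via pyapacheatlas",
)

# upload_typedefs takes a dict keyed by def category.
client.upload_typedefs({"classificationDefs": [manual_import.to_json()]})
```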
After #47 it should include the ability to handle multiple inputs or outputs in the spreadsheet.
If there's an N/A in one row and it's being defined in another row, assume the N/A and WARN on the output.
PEP 479 indicates that raising StopIteration inside a generator is bad behavior, and Python 3.7+ converts it into a RuntimeError.
Need to replace this StopIteration with a plain return when the inner function completes in AtlasClient.search_entities.
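A minimal sketch of the fix, assuming a paging generator shaped like AtlasClient.search_entities (fetch_page and the loop structure are illustrative):

```python
# Sketch: end a paging generator with `return`, never `raise StopIteration`.
def search_entities(query, fetch_page):
    offset = 0
    while True:
        page = fetch_page(query, offset)
        if not page:
            # Raising StopIteration here would surface as a RuntimeError
            # under PEP 479 (Python 3.7+); a bare return ends the generator.
            return
        for result in page:
            yield result
        offset += len(page)
```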
Hi, I have a customer who also wants to be able to manage the owner and expert through the API, and to assign them during the creation of custom ones.
The Azure Data Catalog provides a CSV glossary term upload with the following fields. The goal of this issue would be to develop a similar offering via the excel template and replicate the features.
Columns of CSV / Excel File:
The dynamic attribute should be attached to an attributes property:
{
    "attributes": {
        "termTemplateName": {
            "extraAttributeName": ""
        }
    }
}
This is accomplished through the search API and requires paging through the results.
The goal would be to extract every entity and enable users to essentially "back up" their data catalog, but also potentially relocate their data catalog by uploading the results of this extraction.
Need to consider the upload process as well. Presumably you have to replace the guids when pushing to the new catalog, since entity upload requires a negative number as the guid.
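A rough sketch of the extraction loop, assuming an existing client; the search-hit field name ("id") is an assumption and may need adjusting to the actual search payload:

```python
# Sketch: page through every search result and pull the full entity bodies.
backup = []
for hit in client.search_entities("*"):
    detail = client.get_entity(guid=hit["id"])
    backup.extend(detail.get("entities", []))

# On re-upload, swap real guids for negative placeholders, since entity
# upload expects negative guids for entities that do not exist yet.
# (Guid references inside relationship attributes would also need remapping.)
for i, entity in enumerate(backup):
    entity["guid"] = str(-(i + 1))
```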
Hello,
Would your samples/excel/excel_bulk_entities_upload.py work for updating existing columns? I am trying to find a way to bulk update columns that have already been scanned in via the GUI. We want to add additional information to the columns, mainly descriptions and glossary links.
I am trying to test it by updating a single column (adding a description to it). Below is what I have in the spreadsheet.
typeName | name | qualifiedName | classifications | [Relationship] table | type | description
---|---|---|---|---|---|---
mssql_column | my_column | mssql://XXXXXXXXXXX:XXXXXXX/MSSQLSERVER/XXXXX/XXXXX/my_table#my_column | | pyapacheatlas://my_table | smallint | testing
Running this gives me the following error --
KeyError: 'The entity pyapacheatlas://my_table should be listed before mssql://XXXXXXXXXXX:XXXXXXX/MSSQLSERVER/XXXXX/XXXXX/my_table#my_column.'
I am not sure how to interpret this. Any help is greatly appreciated. Thank you.
Hi,
I am trying to run the code and create a sample entity, but I am getting the following error. I have checked the credentials and everything seems fine.
Traceback (most recent call last):
File "c:\Users\shkh\Purview.py", line 105, in
batch=[output01, input01, process]
File "C:\Users\shkh\AppData\Roaming\Python\Python36\site-packages\pyapacheatlas\core\client.py", line 927, in upload_entities
headers=self.authentication.get_authentication_headers()
File "C:\Users\shkh\AppData\Roaming\Python\Python36\site-packages\pyapacheatlas\auth\serviceprincipal.py", line 58, in get_authentication_headers
self._set_access_token()
File "C:\Users\shkh\AppData\Roaming\Python\Python36\site-packages\pyapacheatlas\auth\serviceprincipal.py", line 48, in _set_access_token
self.expiration = datetime.fromtimestamp(int(authJson["expires_in"]))
OSError: [Errno 22] Invalid argument
Create an excel reader function that supports upload of entities without needing column or table level lineage.
Currently, the Columns and Tables tabs expect you to be creating source and targets.
A new tab should be added to the template to support BulkEntities and the Columns and Tables tabs should be renamed to ColumnsLineage and TablesLineage as defaults.
BulkEntities should be able to automatically take column headers as the attributes. If a cell is empty, it will not add that attribute to the entity.
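For illustration, the proposed tab might be consumed like this; the sheet name and parse_bulk_entities method follow this issue's proposal, so treat the exact names as assumptions:

```python
# Sketch: read a BulkEntities tab from the Excel template into entity dicts.
from pyapacheatlas.readers import ExcelConfiguration, ExcelReader

config = ExcelConfiguration(bulk_entity_sheet="BulkEntities")
reader = ExcelReader(config)
entities = reader.parse_bulk_entities("./template.xlsx")
# Empty cells would simply be skipped rather than written as empty attributes.
```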
Currently, pyapacheatlas uploads entity classifications with the propagation attribute activated.
This is not convenient for all use cases. For instance, one might like to add a classification such as "manual_import" to differentiate, when browsing the catalog, the entities imported with pyapacheatlas from those populated automatically. Currently, when uploading related entities with this classification, one ends up with a series of propagated classifications reading "manual_import manual_import manual_import..." as many times as there are relationships (which can be more than 10 in my case).
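A minimal sketch of the behavior being asked for, assuming AtlasClassification exposes a propagate flag mirroring the Atlas classification model (treat the exact signature as an assumption):

```python
# Sketch: attach a classification with propagation turned off.
from pyapacheatlas.core import AtlasClassification, AtlasEntity

entity = AtlasEntity(
    name="exampledataset",
    typeName="DataSet",
    qualified_name="pyapacheatlas://dataset",
    guid=-1,
    classifications=[
        AtlasClassification("manual_import", propagate=False).to_json()
    ],
)
```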
ADC Gen 1 glossary terms should be importable into Purview!
Add a migration sample for ADC Gen1 that:
When the client is created with PurviewClient(), the is_purview client attribute is incorrectly set to False.
This causes search_entities() to throw RuntimeWarning: You're using a Purview only feature on a non-purview endpoint:
from pyapacheatlas.auth import ServicePrincipalAuthentication
from pyapacheatlas.core import PurviewClient

auth = ServicePrincipalAuthentication(
    tenant_id="...",
    client_id="...",
    client_secret="..."
)

client = PurviewClient(
    account_name="my-purview-account-name",
    authentication=auth
)

print('client.is_purview:', client.is_purview)
# >> False

for i in client.search_entities('totemove'):
    print(i)
# >> ...python3.9/site-packages/pyapacheatlas/core/util.py:18:
# >> RuntimeWarning: You're using a Purview only feature on a non-purview endpoint.
# >> warnings.warn(
The output is ok, despite this warning.
Workaround: Set client.is_purview = True after client creation.
Any of the additional "required" attributes should be okay if they're not part of the type since they are ignored.
When you create a column lineage entity, you have a dependencyType attribute that is either SIMPLE or EXPRESSION. If you have an EXPRESSION value then you would also see an expression attribute. That expression attribute would contain the code used to create that field.
If you go to re-run the parse_lineages method with existing entities (based on type and qualified name) and remove the transformation value for a given column lineage, you end up with a SIMPLE dependencyType but still have a value in the expression attribute.
Instead, the default for expression should be set to null. However, this may break other scenarios where we want to omit null values. There may have to be a compromise: an empty string value or an NA value instead?
Congratulations and many thanks for this great tool!
The samples provided are very useful, but I cannot figure out how the custom attributes and relationships are passed to the Atlas API.
For instance, the script samples/excel/excel_bulk_entities_upload.py produces an excel BulkEntities sheet with two additional columns: "[Relationship] table", and "type".
The corresponding information is visible in the dict output by excel_reader.parse_bulk_entities(), but I cannot find it in the result of client.upload_entities() that also gets printed to the console (see below). How are the "[Relationship] table" and "type" attributes passed to Apache Atlas in this case?
I would really need to understand this to grasp exactly what kind of related objects I can pass to the catalog API with pyapacheatlas.
runfile('C:/Users/FBEDECARRA/Desktop/Tests Apache Atlas/sample_bulk_upload.py', wdir='C:/Users/FBEDECARRA/Desktop/Tests Apache Atlas')
{
"mutatedEntities": {
"CREATE": [
{
"typeName": "DataSet",
"attributes": {
"qualifiedName": "pyapacheatlas://dataset",
"name": "exampledataset"
},
"guid": "f24c4f22-c5e3-4776-a630-41e533b47099",
"status": "ACTIVE",
"displayText": "exampledataset",
"classificationNames": [],
"classifications": [],
"meaningNames": [],
"meanings": [],
"isIncomplete": false,
"labels": []
},
{
"typeName": "hive_table",
"attributes": {
"createTime": 0,
"qualifiedName": "pyapacheatlas://hivetable01",
"name": "hivetable01"
},
"guid": "46efb945-281d-497b-8334-92c668fb8d5b",
"status": "ACTIVE",
"displayText": "hivetable01",
"classificationNames": [],
"classifications": [],
"meaningNames": [],
"meanings": [],
"isIncomplete": false,
"labels": []
},
{
"typeName": "hive_column",
"attributes": {
"qualifiedName": "pyapacheatlas://hivetable01#colA",
"name": "columnA"
},
"guid": "195d4775-69f0-48fe-b63c-88c0e30066fa",
"status": "ACTIVE",
"displayText": "columnA",
"classificationNames": [],
"classifications": [],
"meaningNames": [],
"meanings": [],
"isIncomplete": false,
"labels": []
},
{
"typeName": "hive_column",
"attributes": {
"qualifiedName": "pyapacheatlas://hivetable01#colB",
"name": "columnB"
},
"guid": "f43b8f63-63da-4c82-b5f5-2b09c0418e67",
"status": "ACTIVE",
"displayText": "columnB",
"classificationNames": [],
"classifications": [],
"meaningNames": [],
"meanings": [],
"isIncomplete": false,
"labels": []
},
{
"typeName": "hive_column",
"attributes": {
"qualifiedName": "pyapacheatlas://hivetable01#colC",
"name": "columnC"
},
"guid": "f1650ead-6b7e-4dce-aa2b-03ddb18ebca3",
"status": "ACTIVE",
"displayText": "columnC",
"classificationNames": [],
"classifications": [],
"meaningNames": [],
"meanings": [],
"isIncomplete": false,
"labels": []
}
]
},
"guidAssignments": {
"-1005": "f1650ead-6b7e-4dce-aa2b-03ddb18ebca3",
"-1004": "f43b8f63-63da-4c82-b5f5-2b09c0418e67",
"-1001": "f24c4f22-c5e3-4776-a630-41e533b47099",
"-1003": "195d4775-69f0-48fe-b63c-88c0e30066fa",
"-1002": "46efb945-281d-497b-8334-92c668fb8d5b"
}
}
Completed bulk upload successfully!
Search for hivetable01 to see your results.
Related to #31 in the sense that a table has a columns relationship attribute.
This is the appropriate syntax and needs to be fixed in the column lineage reader.
column_mapping = [
    {
        "ColumnMapping": [
            {"Source": "AddressType", "Sink": "address"},
            {"Source": "CustomerId", "Sink": "cust_id"}
        ],
        "DatasetMapping": {
            "Source": custAddr.qualifiedName, "Sink": customer.qualifiedName
        }
    },
    {
        "ColumnMapping": [
            {"Source": "total_emp", "Sink": "cust_id"},
            {"Source": "description", "Sink": "username"}
        ],
        "DatasetMapping": {
            "Source": sample.qualifiedName, "Sink": customer.qualifiedName
        }
    }
]
After completing #29 and merging #32 , there is a potential need to connect relationship attributes to an uploaded entity. For example, you might upload several tables and columns. However, those columns would be unattached entities and have no relationships.
There needs to be something like (Relationship) attributeX in the BulkEntities tab or Target (Relationship) attributeY in the Lineages tabs.
Others may want to implement readers for different formats.
For example, you may want to create a JSON reader or a DelimitedFile reader that implements the same standard methods to parse the results.
This will result in merging:
This would be a breaking change for the samples.
There should be a PurviewClient that accepts an account_name attribute and fills in the endpoint_url for you.
The PurviewClient should be used as a test to warn when...
Given an excel spreadsheet with column headers, generate an entity based on the column headers as attributes.
The goal would be to quickly generate the type and have it be hand edited to modify the results.
Stretch goal should be to allow for entities (the rows of the spreadsheet) to be created for that entity type.
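A minimal sketch of the idea, assuming EntityTypeDef and AtlasAttributeDef from pyapacheatlas.core.typedef; the header list and type name are illustrative:

```python
# Sketch: turn spreadsheet column headers into attribute defs on a new type.
from pyapacheatlas.core.typedef import AtlasAttributeDef, EntityTypeDef

headers = ["name", "qualifiedName", "costCenter", "steward"]
generated_type = EntityTypeDef(
    name="generated_spreadsheet_type",
    superTypes=["DataSet"],
    attributeDefs=[AtlasAttributeDef(name=h).to_json() for h in headers],
)
# The resulting def could then be hand edited before uploading via upload_typedefs.
```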
Right now, the Excel files are only smart enough to include classifications (which might need to be made into an optional field).
By including glossary terms, this would support bulk updates to entities that can't be currently done in Purview.
Implementation should look at adding a meanings special header that supports multiple semi-colon delimited terms that get mapped as relationship attributes.
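A sketch of parsing that proposed header (the cell value is illustrative):

```python
# Sketch: split the proposed semi-colon delimited "meanings" cell into terms.
cell = "PII;Finance;Customer Data"
terms = [t.strip() for t in cell.split(";") if t.strip()]
# Each term would then be attached via the "meanings" relationship attribute.
```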
Hi,
I've run the excel_custom_table_column_lineage.py sample and it works fine. Below is a picture of the lineage tab from the perspective of DestTable01.
However when I try the same exact type of lineage setup using some of my MSSQL tables, I get the following --
Three of the four some_adf_job entities are of type MS SQL Column Lineage. I don't want those showing. I only want the process entity showing like how it is in the demo. Also in the demo you can search for the columns on the left, but I can't do that here.
Any idea on what I could be doing wrong? I uploaded the missing MSSQL typedefs using the column_lineage_scaffold template beforehand.
Support the following REST endpoints with AtlasClient methods to round out the supported features:
/v2/entity/bulk/classification (POST)
/v2/entity/guid/{guid}/classifications (GET | POST | PUT)
/v2/entity/guid/{guid}/classification/{classificationName} (DELETE | GET)
/v2/types/classificationdef/guid/{guid} (GET) (Already supported)
/v2/types/classificationdef/name/{name} (GET) (Already supported)
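For the first endpoint, a rough sketch of what a helper could look like; the payload follows the Atlas ClassificationAssociateRequest shape, but the helper itself is an assumption, not the library's current API:

```python
# Sketch: associate one classification with many entities in a single call.
import requests

def classify_bulk(endpoint_url, headers, classification_name, guids):
    payload = {
        "classification": {"typeName": classification_name},
        "entityGuids": guids,
    }
    resp = requests.post(
        f"{endpoint_url}/v2/entity/bulk/classification",
        json=payload,
        headers=headers,
    )
    resp.raise_for_status()
    return resp.status_code
```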
Add a switch to the import process (or the PurviewClient's authentication?) so that the user can signal "My SP has admin-granted permissions to call the Graph". In that case, the package will know it doesn't have to ask for interactive login. It can use the SP to call the Graph straight away. This would enable a scenario where the package can be used in a fully automated environment.
A CLI would help with using PyApacheAtlas as part of a tool chain and would handle simple, recurring tasks such as:
Hello,
I'm seeing the following error when running databricks_catalog_dataframe.py in Databricks:
TypeError: 'EntityTypeDef' object is not subscriptable
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<command-1278818675318490> in <module>
78 "relationshipDefs":[spark_column_to_df_relationship]
79 },
---> 80 force_update=True)
81 print(typedef_results)
82
/databricks/python/lib/python3.7/site-packages/pyapacheatlas/core/client.py in upload_typedefs(self, typedefs, force_update, **kwargs)
840 new_types[cat] = []
841 for t in typelist:
--> 842 if t["name"] in types_from_client[cat]:
843 existing_types[cat].append(t)
844 else:
TypeError: 'EntityTypeDef' object is not subscriptable
Hello,
I am trying to retrieve data about the synapse (azure_sql_dw) instance I have set up in our Purview catalog. However, when I try to get that information via get_entity, it returns an error.
Here is the code --
azure_sql_dw = client.get_entity(guid='c39966cb-fbe1-4394-9b44-1d3bbafeb38e')
Here is the error --
HTTPError: 500 Server Error: Internal Server Error for url: https://XXXXXXXXXXXX.catalog.purview.azure.com/api/atlas/v2/entity/bulk?guid=c39966cb-fbe1-4394-9b44-1d3bbafeb38e
Any idea on what could be causing this? Calls I make to table or column objects work fine.
Thanks,
Zack
Change the package so that it looks at the experts and owners input. If the values look like guids, then proceed as before. If they look like email addresses, force the user to login interactively and then the package will use the Graph API to translate the email addresses to guids on the user's behalf.
This should only occur in the PurviewClient and only applies to entity uploads and glossary term uploads. This is already handled in the terms/import CSV route developed in #77.
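A sketch of the input sniffing described above; the regex and helper names are illustrative, and graph_lookup stands in for the Graph API call:

```python
# Sketch: pass guids through unchanged, translate email-shaped values via Graph.
import re

GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")

def resolve_contacts(values, graph_lookup):
    # graph_lookup is an assumed callable mapping an email to an AAD object id.
    return [v if GUID_RE.match(v) else graph_lookup(v) for v in values]
```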
Currently, upload_entities only supports a dictionary or a list of dictionaries. It should also handle a single AtlasEntity or a list of AtlasEntities. If the batch is a dictionary with an "entities": [] key, then assume the caller is passing in a list of dicts already, since they know the format.
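A sketch of the proposed normalization; the helper name is illustrative, not part of the library:

```python
# Sketch: accept AtlasEntity objects, lists, or the raw dict format.
from pyapacheatlas.core import AtlasEntity

def normalize_batch(batch):
    if isinstance(batch, dict) and "entities" in batch:
        return batch["entities"]  # caller already knows the format
    if isinstance(batch, AtlasEntity):
        batch = [batch]
    return [
        e.to_json() if isinstance(e, AtlasEntity) else e
        for e in batch
    ]
```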
Knock out the LineageREST section!
Purview ONLY Support
GET /atlas/v2/lineage/{guid}/next/
GET /atlas/v2/lineage/{guid}
Purview Limitation
GET /v2/lineage/uniqueAttribute/type/{typeName}
Hi,
I'm trying to build a function that takes two entities and creates a process between them using the AtlasProcess class. My problem is that I need to create a new guid and be sure that it is not already assigned to one of my assets in Purview. Is there a function that creates a new guid, given the ones that already exist in Purview?
Thank you,
Edoardo
As major methods like AtlasClient.upload_entities take on the role of converting objects into JSON, so should AtlasProcess.
Three areas require changes:
__init__ should handle the inputs and outputs attributes.
set_outputs ...
set_inputs ...
In each case, it should allow an AtlasEntity and execute the to_json(minimum=True) method for you.
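A sketch of the proposed ergonomics (this is the issue's proposal, not current behavior): AtlasEntity objects are handed straight to AtlasProcess, which would call to_json(minimum=True) internally.

```python
# Sketch: pass AtlasEntity objects directly instead of pre-converting them.
from pyapacheatlas.core import AtlasEntity, AtlasProcess

input01 = AtlasEntity("input01", "DataSet", "pyapacheatlas://input01", guid=-100)
output01 = AtlasEntity("output01", "DataSet", "pyapacheatlas://output01", guid=-101)

process = AtlasProcess(
    name="sample_process",
    typeName="Process",
    qualified_name="pyapacheatlas://sample_process",
    guid=-102,
    inputs=[input01],    # converted internally instead of input01.to_json(minimum=True)
    outputs=[output01],
)
```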
upload_typedefs currently accepts a typedef parameter that can take in different values.
I think it would be better if it had arguments for the required keys: "classificationDefs", "entityDefs", "enumDefs", "relationshipDefs", "structDefs". That way you don't have to construct the dict yourself.
The arguments should accept a list of either AtlasTypeDefs (and converts them into dicts) or dicts.
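A sketch of the proposed keyword-argument form; the signature is this issue's proposal, not the current API, and the typedef variables are illustrative:

```python
# Sketch: pass typedefs by category instead of hand-building the dict.
results = client.upload_typedefs(
    entityDefs=[column_lineage_type],              # illustrative AtlasTypeDef or dict
    relationshipDefs=[column_to_table_relationship],
    force_update=False,
)
```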
@properties to get and set attribute defs in whole
@properties to get and set relationship attribute defs in whole
@properties to get and set input/output in whole
@properties to get and set relationship attributes in whole

There are two sorts of headers that work!
The currently supported version looks like this:
{
    "guid": -1,
    "typeName": "",
    "qualifiedName": ""
}
However, if you don't provide a guid, the to_json(minimum=True) should specify:
{
"typeName": "type",
"uniqueAttributes": {
"qualifiedName": "qualified name"
}
}
This could help avoid having to upload the entity as part of the batch.
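A sketch of the proposed behavior; the guid check is this issue's proposal, not current library behavior:

```python
# Sketch: emit the uniqueAttributes header form when no guid was provided.
def to_json_minimum(entity):
    if entity.guid is not None:
        return {
            "guid": entity.guid,
            "typeName": entity.typeName,
            "qualifiedName": entity.attributes["qualifiedName"],
        }
    return {
        "typeName": entity.typeName,
        "uniqueAttributes": {"qualifiedName": entity.attributes["qualifiedName"]},
    }
```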
As far as I understand, Purview offers a very limited API for searches when compared to the original Apache Atlas. One example: there is no v2/search/basic in Purview, but there is in Atlas.
In light of this, did you mean this instead in the README.md?
Search (the only search available for Azure Purview advanced search)
And as a side question, do you know if the original Atlas API is still accessible somehow?
Allow experts and owners to be imported by putting the object IDs into the Excel sheet. This is enough to get the ball rolling. It's the easiest solution, and it gives a path for users who are desperate for a solution. It also separates the basic API and import parsing problem from the more complicated "Graph authentication" problem.
In order to make working with uploads easier, the client.upload_typedefs force_update parameter should be smarter.
Currently, it simply does a POST request if False and a PUT request if True. However, doing a PUT request on a type def that does not exist breaks the entire upload, and doing a POST request for a type def that already exists breaks the upload as well.
A better solution is to look up each type def by name and category (entity, relationship, classification) and see if it exists. If it exists, then use the PUT request.
However, there may be dependencies between types and there may be issues in updating a type that will conflict against the existing entities.
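A rough sketch of the name-lookup approach, using the raw Atlas types endpoints (paths per the Atlas REST docs); the helpers are illustrative and ignore the dependency-ordering concern above:

```python
# Sketch: bucket type defs into create (POST) vs update (PUT) by existence.
import requests

def typedef_exists(endpoint_url, headers, name):
    # GET /v2/types/typedef/name/{name} returns 200 when the def exists.
    resp = requests.get(f"{endpoint_url}/v2/types/typedef/name/{name}", headers=headers)
    return resp.status_code == 200

def upsert_typedefs(endpoint_url, headers, typedefs):
    new, existing = {}, {}
    for category, defs in typedefs.items():
        for d in defs:
            bucket = existing if typedef_exists(endpoint_url, headers, d["name"]) else new
            bucket.setdefault(category, []).append(d)
    results = []
    if new:       # POST creates defs that do not exist yet
        results.append(requests.post(f"{endpoint_url}/v2/types/typedefs", json=new, headers=headers))
    if existing:  # PUT updates defs that already exist
        results.append(requests.put(f"{endpoint_url}/v2/types/typedefs", json=existing, headers=headers))
    return results
```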
Need to test:
Hi again. Is it possible to add a classification and/or glossary term to columns using the excel_bulk_entities_upload method? I see the sample has a classifications column. I have tried populating it with an existing classification, and it runs without error, but nothing shows up in the interface for the column. Other fields like description/data_type update fine. Thanks.
Hi,
I am trying to figure out how to add column-specific lineage. I have run the excel_custom_table_column_lineage sample but am not seeing any lineage in the interface. The demo tables and columns are uploaded, but I do not see a lineage tab. Are there any changes I need to make to the sample code besides entering the authentication information?
The excel_update_lineage_upload sample works fine for me but this only shows table lineage.
Thank you,
Zack