azure-digital-twin-cdf-sync's Introduction

Cognite Data Fusion and Azure Digital Twin Plug-in

The purpose of the present plug-in is to synchronize the industrial knowledge graph between the Cognite Data Fusion (CDF) and Azure Digital Twin (ADT) platforms, using Azure Functions written in Python.

Development used Python 3.9.7, which was the latest version supported by Azure Functions at the time. For the various ways of deployment, check this link.


Ground Rules

The user must respect a few ground rules (guidelines) when making changes to the asset hierarchy, because of issues and limitations that cannot be handled unambiguously by the current solution.

  1. Do not change the external ID of resources in CDF. Instead, delete the resource first and create it again with the new external ID.

  2. Do not use the same external ID for different types of resources in CDF.

  3. Do not edit the “externalId” and “id” properties of resources (assets and timeseries for now) in ADT. Even if they are blank (i.e., not set), leave them as they are; the CDF→ADT sync will fill them in.

  4. In ADT, do not create different relationships with the same ID, because relationship IDs are unique in CDF.

  5. For timeseries in ADT, update both the "latestValue" and the "timestamp" properties at the same time. Otherwise, the new datapoint will not be inserted in CDF. Also, do not insert string values into numeric timeseries, and vice versa.
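
Rule 5 can be illustrated with a small helper that builds a single JSON Patch document touching both properties, so they are always updated together. This is only a sketch: the function name `build_datapoint_patch` and the `numeric` flag are hypothetical, and the "replace" operations assume that both properties are already set on the twin.

```python
from numbers import Real


def build_datapoint_patch(latest_value, timestamp, numeric=True):
    """Build one JSON Patch document that sets both properties together.

    `numeric` states whether the target timeseries is numeric (assumption:
    the caller knows this from the timeseries definition).
    """
    is_number = isinstance(latest_value, Real) and not isinstance(latest_value, bool)
    if numeric and not is_number:
        raise TypeError("a numeric timeseries needs a numeric value")
    if not numeric and not isinstance(latest_value, str):
        raise TypeError("a string timeseries needs a string value")
    # "replace" assumes both properties are already set on the twin;
    # a property that has never been set would need "add" instead.
    return [
        {"op": "replace", "path": "/latestValue", "value": latest_value},
        {"op": "replace", "path": "/timestamp", "value": timestamp},
    ]
```

The resulting list is the kind of patch document that `DigitalTwinsClient.update_digital_twin(twin_id, patch)` from azure-digitaltwins-core accepts. Sending both operations in one call avoids a state where only one of the two properties has changed.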


Description

The solution translates a CDF asset hierarchy together with contextualized operational and engineering data into Digital Twin Definition Language (DTDL) ontologies, pushes the results to Azure, and synchronizes changes in the graph in both directions.

Currently, the following CDF resource types are mapped:

  • Assets
  • Asset-to-asset relationships
  • Timeseries with the value of the latest datapoint

The project contains two main features:

  1. a timer-triggered Azure function to create/update the knowledge graph in the CDF→ADT direction,
  2. an event-triggered Azure function to update changes in the ADT→CDF direction.

The DTDL models used to represent resources in ADT are stored in the Models folder in this repository.

For more details about the Azure functions check the CDF→ADT Readme and ADT→CDF Readme files.

Assets

CDF Assets are translated into the Asset DTDL model together with all properties:

  • CDF external ID and internal ID (remember not to edit these)
  • name (which is mandatory in CDF)
  • description
  • metadata - represented by the tags/values map property in ADT
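
As a rough illustration of this mapping, the sketch below turns a CDF asset, given as a plain dict, into a twin payload. The model ID `dtmi:example:Asset;1`, the function name `asset_to_twin`, and the property name `tags` are assumptions made for the example; the actual identifiers are defined by the Asset model in the Models folder.

```python
ASSET_MODEL_ID = "dtmi:example:Asset;1"  # hypothetical model ID


def asset_to_twin(asset):
    """Translate a CDF asset (as a plain dict) into an ADT twin payload."""
    twin = {
        "$metadata": {"$model": ASSET_MODEL_ID},
        "externalId": asset["external_id"],
        "id": asset["id"],
        "name": asset["name"],  # mandatory in CDF
        # CDF metadata is a string-to-string map; "tags" is an assumed name
        "tags": dict(asset.get("metadata") or {}),
    }
    # description is optional in CDF, so it is only set when present
    if asset.get("description"):
        twin["description"] = asset["description"]
    return twin
```

A payload of this shape could then be passed to `DigitalTwinsClient.upsert_digital_twin(twin_id, twin)`. Which value serves as the twin ID is not shown here.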

Relationships

The current solution models only asset-to-asset CDF relationships. On the ADT side, however, two types of relationships should be distinguished:

  1. Explicit relationships: in CDF they are the actual Relationship resources and are represented by the relatesTo ADT relationship. IMPORTANT NOTE: these relationships can have multiple labels in CDF – check the limitations on how this is handled.

  2. Implicit relationships: in CDF they are not separate resources but are stored as properties. In ADT they must still be represented as real relationships. There are currently two of them:
    1. Parent-child relationship: stored in the “parent_external_id” field of a CDF asset, and represented in ADT by the parent relationship between Asset twins.
    2. Timeseries – belongs to – Asset relationship: stored in the “asset_id” field of a CDF Timeseries, and represented in ADT by the contains relationship between Asset and Timeseries twins.

To summarize, currently there are 3 types of ADT relationships (relatesTo, parent, contains), all defined in the Asset model.
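
The two implicit relationships could be derived along the following lines. This is a sketch under several assumptions that the text above does not confirm: twin IDs are taken to be the CDF external IDs, the parent relationship is taken to point from the child to its parent, and the relationship ID scheme is invented for the example. The field names `$relationshipId`, `$sourceId`, `$relationshipName` and `$targetId` are the standard ADT relationship fields.

```python
def _relationship(source_id, name, target_id):
    """Build an ADT relationship payload."""
    return {
        "$relationshipId": f"{source_id}-{name}-{target_id}",  # hypothetical ID scheme
        "$sourceId": source_id,
        "$relationshipName": name,
        "$targetId": target_id,
    }


def implicit_relationships(assets, timeseries):
    """Derive ADT `parent` and `contains` relationships from CDF fields."""
    # asset_id on a timeseries is the internal CDF ID, so map it to the external ID
    external_id_by_id = {a["id"]: a["external_id"] for a in assets}
    relationships = []
    for asset in assets:
        parent = asset.get("parent_external_id")
        if parent:  # the root asset has no parent
            # assumed direction: child twin -> parent twin
            relationships.append(_relationship(asset["external_id"], "parent", parent))
    for ts in timeseries:
        owner = external_id_by_id.get(ts.get("asset_id"))
        if owner:  # skip timeseries that are not linked to a known asset
            relationships.append(_relationship(owner, "contains", ts["external_id"]))
    return relationships
```

Explicit relatesTo relationships are left out here because they come from actual CDF Relationship resources rather than from fields on assets or timeseries.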

Timeseries

CDF Timeseries are translated into the Timeseries DTDL model. They are represented like assets, with two additional properties that hold the value and the timestamp of the latest datapoint.
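
A minimal sketch of how the two extra properties might be filled from a list of CDF datapoints is shown below. CDF timestamps are milliseconds since the Unix epoch. Converting them to an ISO 8601 string is an assumption about how the Timeseries model types its timestamp property, and the function name `latest_datapoint_properties` is hypothetical.

```python
from datetime import datetime, timezone


def latest_datapoint_properties(datapoints):
    """Return the two extra Timeseries twin properties, or {} without datapoints."""
    if not datapoints:
        return {}
    latest = max(datapoints, key=lambda dp: dp["timestamp"])
    # CDF timestamps are milliseconds since the Unix epoch
    moment = datetime.fromtimestamp(latest["timestamp"] / 1000, tz=timezone.utc)
    # assumption: the twin stores the timestamp as an ISO 8601 string
    return {"latestValue": latest["value"], "timestamp": moment.isoformat()}
```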


Dependencies

In order to deploy and run the plugin, the following resources are required:

  • CDF tenant, which contains the initial industrial knowledge graph(s) to be mapped

  • Microsoft Azure tenant, where the Azure functions will be deployed to replicate and synchronize the graph(s). The Azure resources below need to be created beforehand:

    • 2 function apps (one timer-triggered and one event-triggered),
    • 2 blob storage accounts (one for each function),
    • Key vault,
    • Azure Digital Twins,
    • Event Hub.

Library Versions

The Python libraries used during the development of the two Azure functions are listed in the table below (last update on May 20, 2022).

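| Python Library | Version (CDF→ADT) | Version (ADT→CDF) |
|---|---|---|
| azure-core | 1.24.0 | 1.24.0 |
| azure-digitaltwins-core | 1.1.0 | 1.1.0 |
| azure-eventhub | - | 5.9.0 |
| azure-functions | 1.11.2 | 1.11.2 |
| azure-identity | 1.10.0 | 1.10.0 |
| azure-storage-blob | 12.12.0 | - |
| cognite-sdk | 2.49.1 | 2.49.1 |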

Environment Variables

Besides the knowledge graph itself, all inputs for the functions must be defined as environment variables in the Azure function configuration settings. The table below lists the keys and shows which function requires each of them.

| Variable Key | Description | CDF→ADT | ADT→CDF |
|---|---|---|---|
| ADT_URL | URL of the ADT resource (with "https://") | YES | YES |
| adtevents_RootManageSharedAccessKey_EVENTHUB | endpoint of the Event Hub | NO | YES |
| AzureWebJobsStorage | connection string to the blob storage linked to this Azure function | YES | YES |
| CDF_CLIENT_SECRET | client secret of the Cognite tenant | YES | YES |
| CDF_CLIENTID | the client ID of the Cognite tenant | YES | YES |
| CDF_CLUSTER | cluster of the Cognite tenant | YES | YES |
| CDF_TENANTID | ID of the Cognite tenant | YES | YES |
| CDF_PROJECT | Cognite project inside the Cognite tenant | YES | YES |
| FUNCTIONS_EXTENSION_VERSION | | "~4" | "~4" |
| FUNCTIONS_WORKER_RUNTIME | defaults to "python" in both cases | "python" | "python" |
| ROOT_ASSET_EXTERNAL_ID | the external ID of the root asset node of the knowledge graph to be instantiated and synchronized | YES | YES |
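
A function could check its configuration at start-up along these lines. This is a sketch, not the project's actual code: the names `load_settings`, `COMMON_KEYS` and `ADT_TO_CDF_ONLY` and the direction strings are hypothetical, while the keys are the ones listed in the table. The two FUNCTIONS_* settings are left out because they are read by the Azure Functions host rather than by the function code.

```python
import os

# Keys required by both functions, as listed in the table above
COMMON_KEYS = [
    "ADT_URL",
    "AzureWebJobsStorage",
    "CDF_CLIENT_SECRET",
    "CDF_CLIENTID",
    "CDF_CLUSTER",
    "CDF_TENANTID",
    "CDF_PROJECT",
    "ROOT_ASSET_EXTERNAL_ID",
]
# Key required only by the event-triggered ADT→CDF function
ADT_TO_CDF_ONLY = ["adtevents_RootManageSharedAccessKey_EVENTHUB"]


def load_settings(direction, environ=None):
    """Read the required settings and fail early if any of them is missing.

    `direction` is "cdf_to_adt" or "adt_to_cdf" (hypothetical labels).
    """
    environ = os.environ if environ is None else environ
    keys = COMMON_KEYS + (ADT_TO_CDF_ONLY if direction == "adt_to_cdf" else [])
    missing = [key for key in keys if not environ.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: environ[key] for key in keys}
```

Failing at start-up with the full list of missing keys tends to be easier to diagnose than a lookup error in the middle of a sync run.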

To run the Azure functions on your local computer, you may need to add additional environment variables in your local.settings.json file. Check this documentation for more information.


Authors

Contributors' names and contact info:
