Git Product home page Git Product logo

mstrerd's Introduction

Pretty schema diagrams for MicroStrategy

This notebook demonstrates how to convert the generally unhelpful MicroStrategy schema into human-readable JSON. We can then use a diagramming library such as JointJS or GoJS to draw aesthetically pleasing entity relationship diagrams.

Export MicroStrategy project schema

We will use the schema from the MicroStrategy Tutorial. The schema can be obtained in Developer from 'Schema' > 'Export Project Schema'. Export the Table Catalog - Logical View which contains a list of all warehouse tables and their attributes and facts.

from pyexcel_xls import get_data
import json

data = get_data("schema.xls")
logical_view = data["Logical View"]
temp = [[item[0], item[1], item[2]] for item in logical_view]
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
import pandas as pd
df = pd.DataFrame(temp[1:], columns = logical_view[0])
df.head(10)
Table Name Object Type
0 CITY_CTR_SLS Call Center Attribute (K)
1 CITY_CTR_SLS Customer City Attribute (K)
2 CITY_CTR_SLS Cost Fact
3 CITY_CTR_SLS Profit Fact
4 CITY_CTR_SLS Revenue Fact
5 CITY_CTR_SLS Units Sold Fact
6 CITY_CTR_SLS Gross Revenue Fact
7 CITY_MNTH_SLS Month Attribute (K)
8 CITY_MNTH_SLS Customer City Attribute (K)
9 CITY_MNTH_SLS Cost Fact

Node list

# form node list

graph_spec = {"nodes": [], "links": []}

df2 = df["Table Name"].drop_duplicates()

for table_id in df2:
    props_list = []
    temp = pd.DataFrame()
    temp = df[df["Table Name"] == table_id]
    for index, row in temp.iterrows():
        props_list.append(row["Object"])
    graph_spec["nodes"].append({"id": table_id, "props": props_list})

Edge List

# cartesian product to find possible edges
df3 = pd.merge(df, df, how='left', on='Object')

# drop rows if Type eq Fact because can't join on Facts
df3 = df3[df3.Type_x != "Fact"]

# drop rows if same Table Name because it is a duplicate record
df3 = df3[df3["Table Name_x"] != df3["Table Name_y"]]

df3.head(10)
Table Name_x Object Type_x Table Name_y Type_y
1 CITY_CTR_SLS Call Center Attribute (K) DAY_CTR_SLS Attribute (K)
2 CITY_CTR_SLS Call Center Attribute (K) F_TUTORIAL_TARGETS Attribute (K)
3 CITY_CTR_SLS Call Center Attribute (K) ITEM_CCTR_MNTH_SLS Attribute (K)
4 CITY_CTR_SLS Call Center Attribute (K) LU_CALL_CTR Attribute (K)
5 CITY_CTR_SLS Call Center Attribute (K) LU_EMPLOYEE Attribute
6 CITY_CTR_SLS Call Center Attribute (K) SUBCATEG_MNTH_CTR_SLS Attribute (K)
8 CITY_CTR_SLS Customer City Attribute (K) CITY_MNTH_SLS Attribute (K)
9 CITY_CTR_SLS Customer City Attribute (K) CITY_SUBCATEG_SLS Attribute (K)
10 CITY_CTR_SLS Customer City Attribute (K) LU_CUST_CITY Attribute (K)
11 CITY_CTR_SLS Customer City Attribute (K) LU_CUSTOMER Attribute
# only keep unique pairs of nodes

unique_pairs = set()
temp = []
for index, row in df3.iterrows():
    if ((row[0], row[3]) in unique_pairs):
        pass
    else:
        unique_pairs.add((row[0], row[3]))
        temp.append(row)
        
df4 = pd.DataFrame(temp)

for index, row in df4.iterrows():
    graph_spec["links"].append({"source": row[0], "target": row[3]})

Export JSON

import json

with open('data.json', 'w') as f:
    json.dump(graph_spec, f)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.