
thicket's Introduction

Thicket

A Python-based toolkit for analyzing ensemble performance data. Detailed documentation and tutorials for Thicket are available on Read the Docs.

Installation

To use thicket, install it with pip:

$ pip install llnl-thicket

Or, if you want to develop with this repo directly, run the install script from the root directory, which will build the package and add the cloned directory to your PYTHONPATH:

$ source install.sh

Contact Us

You can direct any feature requests or questions to the Lawrence Livermore National Lab's Thicket development team by emailing either Stephanie Brink ([email protected]) or Olga Pearce ([email protected]).

Contributing

Thicket is an open-source project. We welcome contributions via pull requests, and questions, feature requests, or bug reports via issues.

License

Thicket is distributed under the terms of the MIT license.

All contributions must be made under the MIT license. Copyrights in the Thicket project are retained by contributors. No copyright assignment is required to contribute to Thicket.

See LICENSE and NOTICE for details.

SPDX-License-Identifier: MIT

LLNL-CODE-834749

thicket's People

Contributors

andym1098, cscully-allison, ilumsden, julius-plehn, michaelmckinsey1, slabasan, treece-burgess, vanessalama09

thicket's Issues

Editable installs break due to setuptools

As reported here, a bug in newer versions of setuptools can cause editable installs to break because of changes that setuptools made to support PEP 660.

There are two ways to address this. In the short term, users can add --config-settings editable_mode=strict to their pip install -e command. This forces setuptools to use the older style of editable installs. Longer term, the best way to fix this for Hatchet and Thicket is to move away from setuptools.

Bug: TypeError in verify_thicket_structures

A TypeError occurs in verify_thicket_structures when a specified column does not exist.

File /usr/gapps/spot/dev/thicket-playground-dev/thicket/utils.py:58, in verify_thicket_structures(thicket_component, columns, index)
     49     raise RuntimeError(
     50         "\n Missing column(s): "
     51         + missing_columns
   (...)
     54         + " required for the function"
     55     )
     56 elif not column_result and index_result:
     57     raise RuntimeError(
---> 58         "\n Missing column(s): " + missing_columns + " required for the function"
     59     )
     60 elif column_result and not index_result:
     61     raise RuntimeError(
     62         "\n Missing index level(s): " + missing_index + " required for the function"
     63     )

TypeError: can only concatenate str (not "list") to str
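
The traceback suggests that missing_columns is a list being concatenated directly onto a string. One possible fix, sketched below, is to join the list into a string before building the message. The helper name missing_columns_message is hypothetical and is not part of Thicket; only the message text is taken from the traceback.

```python
def missing_columns_message(missing_columns):
    """Build the error text from a list of missing column names.

    Hypothetical helper, not Thicket's code. Each entry is passed through
    str() so that non-string entries (e.g. tuples) do not raise either.
    """
    return (
        "\n Missing column(s): "
        + ", ".join(str(col) for col in missing_columns)
        + " required for the function"
    )


try:
    raise RuntimeError(missing_columns_message(["time", "nid"]))
except RuntimeError as err:
    # The intended RuntimeError is raised instead of a TypeError.
    message = str(err)
```

The same treatment would presumably apply to missing_index in the neighboring branches.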

RADIUSS Tutorial Reflection

  • Update rendered tutorial notebooks in thicket
  • Update tutorials page with materials
  • Rendered basic tutorial is using new RAJAPerf data, this shouldn't be the case yet because we want to update all
  • Use display instead of print for dataframes. Will all columns of the dataframe render?

Columnar Join and Sync Nodes Consequence

The columnar join function requires Thicket._sync_nodes_frame in order to match up the nodes in the dataframe after union operations have been performed on the graph. The function assumes that if a dataframe node's frame matches a graph node's frame, the dataframe node should be set to the graph node, making their nids equal. As a result, some nodes are mapped to the same nid when their frame already exists in the dataframe.

image

I ran into this issue when traversing a graph's children and then trying to match those nodes in the dataframe. The problem appears for nodes like the one above because the same node frame appears under every kernel node that exists (the kernel nodes themselves obviously have different frames, "apps_vol" vs. "apps_hydro", etc.).
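
The collision can be reproduced with a toy model. This is only a sketch: the Node tuple, the child frame name "block_128", and both matching functions are hypothetical and do not use Hatchet's or Thicket's classes. It contrasts matching on the frame alone with matching on the frame plus its parent's frame.

```python
from collections import namedtuple

# Toy stand-in for a graph node: an id, its frame, and its parent's frame.
Node = namedtuple("Node", ["nid", "frame", "parent_frame"])

graph_nodes = [
    Node(1, "apps_vol", None),
    Node(2, "apps_hydro", None),
    Node(3, "block_128", "apps_vol"),    # same frame under two kernels
    Node(4, "block_128", "apps_hydro"),
]

# Dataframe rows to be matched, given as (frame, parent_frame).
df_rows = [("block_128", "apps_vol"), ("block_128", "apps_hydro")]


def match_by_frame(rows, nodes):
    """Match on the frame only: the first graph node with that frame wins."""
    lookup = {}
    for node in nodes:
        lookup.setdefault(node.frame, node.nid)
    return [lookup[frame] for frame, _parent in rows]


def match_by_frame_and_parent(rows, nodes):
    """Match on (frame, parent frame), which keeps the two children apart."""
    lookup = {(node.frame, node.parent_frame): node.nid for node in nodes}
    return [lookup[row] for row in rows]


collided = match_by_frame(df_rows, graph_nodes)
distinct = match_by_frame_and_parent(df_rows, graph_nodes)
```

Here collided holds the same nid twice while distinct holds two different nids. Whether the parent frame is enough context for real call trees is an open question; deeper trees may need the full path from the root.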

Bug: loading PCP code looks for NPM

%load_ext thicket.vis.visualizations

     41 if not os.path.isfile(pkg_lock_file):
     42     if not check_npm():
---> 43         raise FileNotFoundError(_filenotfound_errmsg)
     44     else:
     45         npm_build(curr_dir)

FileNotFoundError: 
Cannot find NPM!
This is required to use thicket.vis!
Please install NPM and try again to import thicket.vis!
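
Until this is resolved, a notebook could test for NPM before loading the extension. The sketch below uses only the standard library. npm_available is a hypothetical helper and may not match how check_npm is implemented.

```python
import shutil


def npm_available(executable="npm"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


# Decide whether it is worth attempting to load thicket.vis.visualizations.
can_load_vis = npm_available()
```

In a notebook, the %load_ext line would then only be run when can_load_vis is True.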

Use metadata_column_to_perfdata in ExtraP

Currently, Extra-P uses a manual lookup in the metadataframe to get the "p" (x-values) for its modeling. We could optimize this by calling the metadata_column_to_perfdata function beforehand, so that all of the data we need is already in the ensembleframe.
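
The idea can be illustrated with plain pandas. This is a sketch rather than Thicket's implementation: the frames below are toy data, and the metadata column name "jobsize" is a hypothetical stand-in for "p". A single join replaces the per-profile lookup.

```python
import pandas as pd

# Toy performance data indexed by (node, profile), like the ensembleframe.
perf = pd.DataFrame(
    {"time": [1.0, 0.6, 2.0, 1.1]},
    index=pd.MultiIndex.from_tuples(
        [
            ("main", "prof_a"),
            ("main", "prof_b"),
            ("solve", "prof_a"),
            ("solve", "prof_b"),
        ],
        names=["node", "profile"],
    ),
)

# Toy metadata indexed by profile; "jobsize" is a hypothetical column.
meta = pd.DataFrame(
    {"jobsize": [64, 128]},
    index=pd.Index(["prof_a", "prof_b"], name="profile"),
)

# One join on the "profile" index level copies the value onto every row.
perf_with_p = perf.join(meta[["jobsize"]], on="profile")

# The x and y values for one node now sit side by side.
main_points = perf_with_p.loc["main", ["jobsize", "time"]]
```

With the column in place, the modeling code can read x and y from the same frame instead of consulting the metadata for each profile.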

Pandas 1.3.5 Shallow Copy Discrepancy

Problem

When running the unit test test_copy:test_copy() on GitHub with python3.7 and pandas==1.3.5, the following error occurs:

        # Shallow copy of data
        node = other.dataframe.index.get_level_values("node")[0]
        profile = other.dataframe.index.get_level_values("profile")[0]
        other.dataframe.loc[(node, profile), "nid"] = -1
>       assert (
            other.dataframe.loc[(node, profile), "nid"]
            == self.dataframe.loc[(node, profile), "nid"]
        )
E       assert -1.0 == 26.0

This error does not happen when running locally, tested with python3.7.2 and pandas==1.3.5.

Current Fix

Our current fix was to change the pandas version on GitHub to pandas==1.2.5, with which the error does not occur. For reference, the error also does not occur for pandas==1.4 and pandas==2.0, so we suspect that it only occurs for pandas==1.3.5.

This is somewhat known by pandas and they fixed the inconsistency in pandas==1.5. See:

Docs: update images

  • fix fig 1 in user guide
  • make the call tree and performance data in the user guide match

CI fails because it can't fetch NodeJS

Our CI is currently failing because the setup-node action fails to fetch NodeJS. This is due to a recently discovered bug in the setup-node action: actions/setup-node#873

There is currently no known fix for this bug, but I'll update this issue when one is found.

Use MultiIndex for ExtraP Functions

The componentize_statsframe and _add_extrap_statistics extrap model functions both add a lot of columns to the statsframe.

Statsframe

avg#inclusive#sum#time.duration_extrap-model avg#inclusive#sum#time.duration_RSS_extrap-model avg#inclusive#sum#time.duration_rRSS_extrap-model ...
cell cell cell cell
cell cell cell cell

Currently, we append strings to the column names to distinguish the columns from each other. This quickly becomes messy and is not user friendly. I propose pulling the string the columns have in common, like avg#inclusive#sum#time.duration, out into a higher-level column index, and letting the remaining strings, like model, RSS, and rRSS, distinguish the individual columns.

And what if our statsframe already had other statistics and/or models in it? Our problem would only become worse. So I believe we can leverage multi-indexing to help us organize the statsframe.
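
A rough sketch of the proposal in plain pandas follows. The data values, the split_column helper, and the level names "metric" and "statistic" are hypothetical; only the column names come from the statsframe shown above.

```python
import pandas as pd

METRIC = "avg#inclusive#sum#time.duration"
SUFFIX = "_extrap-model"

# Flat statsframe-like table with the current string-suffixed column names.
flat = pd.DataFrame(
    {
        METRIC + SUFFIX: ["model_a", "model_b"],
        METRIC + "_RSS" + SUFFIX: [0.10, 0.20],
        METRIC + "_rRSS" + SUFFIX: [0.01, 0.02],
    }
)


def split_column(name):
    """Split a flat column name into a (metric, statistic) pair."""
    if not name.endswith(SUFFIX):
        return (name, "")  # leave unrelated columns under their own name
    stem = name[: -len(SUFFIX)]
    for stat in ("rRSS", "RSS"):
        if stem.endswith("_" + stat):
            return (stem[: -len(stat) - 1], stat)
    return (stem, "model")


nested = flat.copy()
nested.columns = pd.MultiIndex.from_tuples(
    [split_column(col) for col in flat.columns],
    names=["metric", "statistic"],
)

# Everything for one metric, or a single statistic, is now one lookup away.
per_metric = nested[METRIC]
rss = nested[(METRIC, "RSS")]
```

Selecting nested[METRIC] returns a frame whose columns are just model, RSS, and rRSS, which is the grouping proposed above. Statistics for other metrics would get their own top-level entry instead of more suffixes.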

Add a Pipfile to enable pipenv-based environments

To help users install Thicket and to standardize development environments, we can add a Pipfile to GitHub with both development and non-development dependencies. With this, we could use pipenv to have a standardized way to create environments for both end users and developers.

Tree Bug

#76 introduced a bug into tree where it cannot find the columns in the dataframe, even if they exist. This error occurs for any column, and I have not found a configuration that works. The error looks like the following:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/workspace/ritter5/thicket/graphs.ipynb Cell 10 in
----> 1 print(tk_full.tree(annotation_column="Avg time/rank"))
File /usr/WS2/ritter5/thicket/thicket/thicket.py:735, in Thicket.tree(self, metric_column, annotation_column, precision, name_column, expand_name, context_column, rank, thread, depth, highlight_name, colormap, invert_colormap, colormap_annotations, render_header, min_value, max_value)
    732 elif sys.version_info.major == 3:
    733     unicode = True
--> 735 return ThicketRenderer(unicode=unicode, color=color).render(
    736     self.graph.roots,
    737     self.statsframe.dataframe,
    738     metric_column=metric_column,
    739     annotation_column=annotation_column,
    740     precision=precision,
    741     name_column=name_column,
    742     expand_name=expand_name,
    743     context_column=context_column,
    744     rank=rank,
    745     thread=thread,
    746     depth=depth,
    747     highlight_name=highlight_name,
    748     colormap=colormap,
    749     invert_colormap=invert_colormap,
    750     colormap_annotations=colormap_annotations,
    751     render_header=render_header,
...
    108             self.second_metric
    109         )
    110     )
KeyError: 'metric_column=time does not exist in the dataframe, please select a valid column. See a list of the available metrics with GraphFrame.show_metric_columns().'
