
morpheus's Introduction

NVIDIA Morpheus


NVIDIA Morpheus is an open AI application framework that provides cybersecurity developers with a highly optimized AI framework and pre-trained AI capabilities that allow them to instantaneously inspect all IP traffic across their data center fabric. The Morpheus developer framework allows teams to build their own optimized pipelines that address cybersecurity and information security use cases. Bringing a new level of security to data centers, Morpheus provides development capabilities around dynamic protection, real-time telemetry, adaptive policies, and cyber defenses for detecting and remediating cybersecurity threats.

Documentation

Using Morpheus

Modifying Morpheus

Deploying Morpheus

Full documentation for the latest official release is available at https://docs.nvidia.com/morpheus/.

morpheus's People

Contributors

ajschmidt8, anuradhakaruppiah, asergarcia, ayodeawe, bartleyr, bsuryadevara, cwharris, dagardner-nv, dependabot[bot], drobison00, e-ago, edknv, efajardo-nv, exactlyallan, gbatmaz, gputester, hesaneasycoder, hsin-c, ifengw-nv, jadu-nv, jameslamb, jarmak-nv, jjacobelli, lobotmcj, mdemoret-nv, pdmack, raykallen, shawn-davis, tzemicheal, yczhang-nv


morpheus's Issues

Docker release image build failure

docker buildx build -t nvcr.io/nvidia/morpheus/morpheus:latest --target runtime --build-arg FROM_IMAGE=gpuci/miniforge-cuda --build-arg CUDA_VER=11.4 --build-arg LINUX_DISTRO=ubuntu --build-arg LINUX_VER=20.04 --build-arg RAPIDS_VER=21.10 --build-arg PYTHON_VER=3.8 --build-arg TENSORRT_VERSION=8.2.1.3 --build-arg NEO_GIT_URL=REDACTED --network=host --ssh default --load  -f docker/Dockerfile .
[+] Building 4.5s (23/31)
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 32B                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 35B                                                                                                                                        0.0s
 => resolve image config for docker.io/docker/dockerfile:1.3                                                                                                            0.7s
 => CACHED docker-image://docker.io/docker/dockerfile:1.3@sha256:42399d4635eddd7a9b8a24be879d2f9a930d0ed040a61324cfdf59ef1357b3b2                                       0.0s
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/gpuci/miniforge-cuda:11.4-devel-ubuntu20.04                                                                                  0.0s
 => [base 1/4] FROM docker.io/gpuci/miniforge-cuda:11.4-devel-ubuntu20.04                                                                                               0.0s
 => [internal] load build context                                                                                                                                       0.1s
 => => transferring context: 73.92kB                                                                                                                                    0.1s
 => CACHED [base 2/4] RUN apt-get update &&    apt-get upgrade -y &&    curl -sL https://deb.nodesource.com/setup_12.x | bash - &&    apt-get install --no-install-rec  0.0s
 => CACHED [base 3/4] WORKDIR /workspace                                                                                                                                0.0s
 => CACHED [base 4/4] RUN conda config --set ssl_verify false &&    conda config --add pkgs_dirs /opt/conda/pkgs &&    conda config --env --add channels conda-forge &  0.0s
 => CACHED [conda_bld_deps 1/4] COPY ci/conda/recipes/cudf/ ./ci/conda/recipes/cudf/                                                                                    0.0s
 => CACHED [conda_bld_deps 2/4] COPY ci/conda/recipes/libcudf/ ./ci/conda/recipes/libcudf/                                                                              0.0s
 => CACHED [conda_bld_deps 3/4] COPY ci/conda/recipes/run_conda_build.sh ./ci/conda/recipes/run_conda_build.sh                                                          0.0s
 => CACHED [conda_bld_deps 4/4] RUN --mount=type=ssh     --mount=type=cache,id=workspace_cache,target=/workspace/.cache,sharing=locked     --mount=type=cache,id=conda  0.0s
 => CACHED [conda_env 1/3] RUN --mount=type=bind,from=conda_bld_deps,source=/opt/conda/conda-bld,target=/opt/conda/conda-bld     --mount=type=cache,id=conda_pkgs,targ  0.0s
 => CACHED [conda_env 2/3] RUN source activate morpheus &&    conda config --env --add channels conda-forge &&    conda config --env --add channels nvidia &&    conda  0.0s
 => CACHED [conda_env 3/3] COPY docker/entrypoint.sh ./docker/                                                                                                          0.0s
 => CACHED [runtime  1/10] COPY docker/conda/environments/requirements.txt ./docker/conda/environments/                                                                 0.0s
 => CACHED [runtime  2/10] COPY docker/conda/environments/cuda11.4_runtime.yml ./docker/conda/environments/                                                             0.0s
 => CACHED [conda_bld_morpheus 1/2] COPY . ./                                                                                                                           0.0s
 => ERROR [conda_bld_morpheus 2/2] RUN --mount=type=ssh     --mount=type=cache,id=workspace_cache,target=/workspace/.cache,sharing=locked     --mount=type=cache,id=co  3.3s
------
 > [conda_bld_morpheus 2/2] RUN --mount=type=ssh     --mount=type=cache,id=workspace_cache,target=/workspace/.cache,sharing=locked     --mount=type=cache,id=conda_pkgs,target=/opt/conda/pkgs,sharing=locked     source activate base &&    MORPHEUS_ROOT=/workspace CONDA_BLD_DIR=/opt/conda/conda-bld CONDA_ARGS="--no-test" ./ci/conda/recipes/run_conda_build.sh morpheus:
#23 0.969 CUDA        : 11.4.1
#23 0.969 PYTHON_VER  : 3.8
#23 0.969 NEO_GIT_TAG : 5b55e37c6320c1a5747311a1e29e7ebb049d12bc
#23 0.969
#23 0.999 fatal: No names found, cannot describe anything.
#23 1.001 Running conda-build for morpheus...
#23 1.001 ++ conda mambabuild --use-local --build-id-pat '{n}-{v}' --variants '{python: 3.8}' -c rapidsai -c nvidia -c nvidia/label/dev -c conda-forge --no-test ci/conda/recipes/morpheus
#23 1.717 No numpy version specified in conda_build_config.yaml.  Falling back to default numpy value of 1.16
#23 1.717 WARNING:conda_build.metadata:No numpy version specified in conda_build_config.yaml.  Falling back to default numpy value of 1.16
#23 1.764 Cloning into bare repository '/opt/conda/conda-bld/git_cache/workspace'...
#23 1.778 done.
#23 1.785 Cloning into '/opt/conda/conda-bld/morpheus-split-None/work'...
#23 1.792 done.
#23 1.890 Your branch is up to date with 'origin/branch-22.04'.
#23 1.990 fatal: No names found, cannot describe anything.
#23 2.043 commit 09eb17f105dd0d719da26e8817892d89c0f33334
#23 2.043 Author: Michael Demoret <[email protected]>
#23 2.043 Date:   Thu Apr 21 17:20:55 2022 -0700
#23 2.043
#23 2.043     Initial Commit
#23 2.043
#23 2.043 commit 09eb17f105dd0d719da26e8817892d89c0f33334
#23 2.043 Author: Michael Demoret <[email protected]>
#23 2.043 Date:   Thu Apr 21 17:20:55 2022 -0700
#23 2.043
#23 2.043     Initial Commit
#23 2.043
#23 2.043 On branch branch-22.04
#23 2.043 Your branch is up to date with 'origin/branch-22.04'.
#23 2.043
#23 2.043 nothing to commit, working tree clean
#23 2.043
#23 2.043 Updating build index: /opt/conda/conda-bld
#23 2.043
#23 2.043 checkout: 'HEAD'
#23 2.043 ==> /usr/bin/git log -n1 <==
#23 2.043
#23 2.043 ==> /usr/bin/git describe --tags --dirty <==
#23 2.043
#23 2.043 ==> /usr/bin/git status <==
#23 2.043
#23 2.043 Adding in variants from internal_defaults
#23 2.043 INFO:conda_build.variants:Adding in variants from internal_defaults
#23 2.043 Adding in variants from /workspace/ci/conda/recipes/morpheus/conda_build_config.yaml
#23 2.043 INFO:conda_build.variants:Adding in variants from /workspace/ci/conda/recipes/morpheus/conda_build_config.yaml
#23 2.043 Adding in variants from argument_variants
#23 2.043 INFO:conda_build.variants:Adding in variants from argument_variants
#23 3.031 Attempting to finalize metadata for morpheus
#23 3.031 /opt/conda/lib/python3.8/site-packages/conda_build/environ.py:444: UserWarning: The environment variable 'CMAKE_CUDA_ARCHITECTURES' is being passed through with value 'ALL'.  If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
#23 3.031   warnings.warn(
#23 3.031 /opt/conda/lib/python3.8/site-packages/conda_build/environ.py:444: UserWarning: The environment variable 'MORPHEUS_CACHE_DIR' is being passed through with value '/workspace/.cache'.  If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
#23 3.031   warnings.warn(
#23 3.031 /opt/conda/lib/python3.8/site-packages/conda_build/environ.py:444: UserWarning: The environment variable 'PARALLEL_LEVEL' is being passed through with value '80'.  If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
#23 3.031   warnings.warn(
#23 3.031 INFO:conda_build.metadata:Attempting to finalize metadata for morpheus
#23 3.108 Error: Failed to render jinja template in /workspace/ci/conda/recipes/morpheus/meta.yaml:
#23 3.108 list object has no element 1
------
error: failed to solve: executor failed running [/bin/bash -c source activate base &&    MORPHEUS_ROOT=/workspace CONDA_BLD_DIR=/opt/conda/conda-bld CONDA_ARGS="--no-test" ./ci/conda/recipes/run_conda_build.sh morpheus]: exit code: 1
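The repeated `fatal: No names found, cannot describe anything.` lines indicate that `git describe --tags` found no tags in the cloned workspace, so conda-build exposes an empty describe string to the recipe's jinja template. A plausible (assumed, not confirmed) failure mode is that `meta.yaml` indexes element 1 of a split of that string, which is exactly what jinja reports as "list object has no element 1". A minimal reproduction of that mechanism:

```python
# What an untagged clone yields for `git describe --tags`: an empty string.
describe_output = ""
parts = describe_output.split("-")
print(parts)  # [''] -- only element 0 exists

# Indexing element 1 fails; jinja2 renders the same out-of-range access as
# "list object has no element 1" when it happens inside a template.
try:
    parts[1]
except IndexError:
    print("no element 1")
```

If this is the cause, tagging the repo (or giving the recipe a fallback version when `GIT_DESCRIBE_TAG` is empty) would unblock the build.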

[FEA] Setup CI to use new Jenkins Build

Is your feature request related to a problem? Please describe.
In transitioning to Github, much of our CI system was lost. The style checks, build and tests need to be re-enabled using the new Jenkins build from RAPIDS.

Describe the solution you'd like
Convert the previous CI system to use the new Jenkins build from RAPIDS.

[FEA] Separate Pipeline Type Inference from Node Creation

Is your feature request related to a problem? Please describe.
Currently, Morpheus determines the input/output types of each stage at the same time the nodes are created. This is a relic of the streamz days and requires the _build() methods to accept and return tuples of StreamPair. This is an ugly implementation that delays type checking until the pipeline is actually being created.

Describe the solution you'd like
Before the pipeline object is created, Morpheus should run type inference on all stages to determine their input/output types. This will separate the two pieces and allow for invalid pipelines to be checked before any nodes are created.
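The proposed two-pass approach can be sketched as follows. The classes and method names here are hypothetical illustrations, not Morpheus's actual API:

```python
class Stage:
    """Hypothetical stage that declares its accepted/emitted types up front."""

    def __init__(self, accepts: type, emits: type):
        self.accepts = accepts
        self.emits = emits

    def output_type(self, input_type: type) -> type:
        # Reject mismatched connections during type inference.
        if not issubclass(input_type, self.accepts):
            raise TypeError(f"stage accepts {self.accepts.__name__}, "
                            f"got {input_type.__name__}")
        return self.emits


def infer_pipeline_types(source_type: type, stages: list) -> type:
    # Pass 1: propagate types through the stage list. Invalid pipelines are
    # rejected here, before any nodes are created in pass 2.
    current = source_type
    for stage in stages:
        current = stage.output_type(current)
    return current
```

With this split, node creation only ever runs on a pipeline that has already passed type inference.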

[FEA] Make the Morpheus CLI Extendable with the Ability to Add New Commands

Is your feature request related to a problem? Please describe.
All pipelines that define at least one custom Stage must use the Python interface and cannot use the CLI. This is because there is no way to register new stages with the CLI unless they are built as first-party stages. This forces all of our built-in examples to use the Python interface, duplicating a lot of click commands.

Describe the solution you'd like
Custom stages could be registered as commands with the CLI using various methods (such as a decorator). Once registered, CLI commands simply forward all options as kwargs to the constructor of each stage.

Additional context
Custom stages could be automatically discovered using the entry_points functionality in Python. This is used by tools such as pytest to build extensions into the pytest CLI.
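The decorator-plus-kwargs-forwarding idea could look like the sketch below. All names here (`register_stage`, `RecipientFilterStage`, `run_stage_command`) are illustrative assumptions, not existing Morpheus APIs:

```python
_STAGE_REGISTRY = {}


def register_stage(name):
    """Hypothetical decorator: makes a custom stage discoverable by the CLI."""
    def wrap(cls):
        _STAGE_REGISTRY[name] = cls
        return cls
    return wrap


@register_stage("recipient-filter")
class RecipientFilterStage:
    """Illustrative custom stage with constructor options."""
    def __init__(self, column="to", threshold=0.5):
        self.column = column
        self.threshold = threshold


def run_stage_command(name, **cli_options):
    # The CLI command simply forwards all parsed options as kwargs to the
    # registered stage's constructor, as described above.
    return _STAGE_REGISTRY[name](**cli_options)
```

An entry_points-based variant would populate `_STAGE_REGISTRY` by iterating installed packages' entry points instead of requiring an explicit import.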

[BUG] Intermittent ValueError: I/O operation on closed file

Describe the bug
I'm occasionally seeing this at the end of running pytest. I'm not sure if this is reproducible outside of the unit tests or if the unit tests are simply doing something weird.

Traceback (most recent call last):
  File "/home/dagardner/work/morpheus/morpheus/utils/logging.py", line 47, in emit
    click.echo(click.style(msg, **color_kwargs), file=file, err=is_error)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/site-packages/click/utils.py", line 299, in echo
    file.write(out)  # type: ignore
Traceback (most recent call last):
ValueError: I/O operation on closed file.
  File "/home/dagardner/work/morpheus/morpheus/utils/logging.py", line 47, in emit
    click.echo(click.style(msg, **color_kwargs), file=file, err=is_error)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/site-packages/click/utils.py", line 299, in echo
    file.write(out)  # type: ignore
Call stack:
ValueError: I/O operation on closed file.
Call stack:
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/handlers.py", line 1487, in _monitor
    self.handle(record)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/handlers.py", line 1468, in handle
    handler.handle(record)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/home/dagardner/work/morpheus/morpheus/utils/logging.py", line 53, in emit
    self.handleError(record)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/handlers.py", line 1487, in _monitor
    self.handle(record)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/handlers.py", line 1468, in handle
    handler.handle(record)
  File "/home/dagardner/work/conda/envs/morpheus/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/home/dagardner/work/morpheus/morpheus/utils/logging.py", line 53, in emit
    self.handleError(record)
Message: '====Pipeline Complete===='
Arguments: None
Message: '====Pipeline Complete===='
Arguments: None

Steps/Code to reproduce bug
Run pytest about 10 times.
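The traceback shows the handler's emit() writing to a stream that pytest (or interpreter teardown) has already closed. One possible mitigation, sketched here with a stdlib StreamHandler as a stand-in for the click-based handler in morpheus/utils/logging.py, is to drop records whose target stream is closed:

```python
import io
import logging


class SafeStreamHandler(logging.StreamHandler):
    """Sketch: skip records whose target stream is already closed, instead of
    letting write() fail with "ValueError: I/O operation on closed file"."""

    def emit(self, record):
        if getattr(self.stream, "closed", False):
            return  # stream gone (e.g. pytest capture teardown); drop record
        super().emit(record)


buf = io.StringIO()
logger = logging.getLogger("safe-demo")
logger.propagate = False
logger.addHandler(SafeStreamHandler(buf))

logger.warning("====Pipeline Complete====")
captured = buf.getvalue()

buf.close()
logger.warning("emitted after close")  # silently dropped, no logging error
```

This does not address why the stream is closed before logging finishes, but it makes late messages harmless.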

[BUG] Broken cudf builds

Describe the bug
Performing a clean build results in a broken build of cudf

ImportError: cannot import name 'builder' from 'google.protobuf.internal' 

Steps/Code to reproduce bug
Reproducible in both Conda & Docker builds

Additional context
The issue appears to be caused by a new version of protobuf published on conda-forge three days ago. Morpheus pins protobuf 3.19; however, the cudf builds don't specify a version.

[DOC] Typos in README.md; one more GitLab-style term in CONTRIBUTING.md

Report incorrect documentation

Location of incorrect documentation
See Pull Request #74 .

Describe the problems or issues found in the documentation
Some typos in README.md and one lingering gitlab-style reference in CONTRIBUTING.md.

Steps taken to verify documentation is incorrect
N/A

Suggested fix for documentation
See Pull Request #74 .

[BUG] cli default args for num_threads should be based on process thread affinity.

We are using os.cpu_count() (or psutil.cpu_count()) as the default argument for num_threads in both the CLI and example code. That value reflects the number of CPUs physically present on the machine, not the number the process is actually allowed to use.

Instead we should probably use len(os.sched_getaffinity(0)), which looks to be the appropriate default because it accounts for how many CPUs the current process has access to.

https://github.com/NVIDIA/Morpheus/blob/02bfbfbeb6e9e62d5fd8fe47a2bf2a92a85d1adc/examples/gnn_fraud_detection_pipeline/run.py#L36-L41
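A minimal sketch of the suggested default, with a fallback for platforms where os.sched_getaffinity is unavailable (the helper name is illustrative):

```python
import os


def default_num_threads() -> int:
    # Prefer the CPUs this process is actually allowed to run on (respects
    # taskset, cgroups, and container CPU sets); fall back to the machine-wide
    # count on platforms without sched_getaffinity (e.g. macOS, Windows).
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1
```

Used as `@click.option("--num_threads", default=default_num_threads())`, this keeps the CLI default aligned with the process's real CPU budget.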

[FEA] Add ability to build containers that default to a user other than root.

Is your feature request related to a problem? Please describe.
Yes. The primary issue is that, by default, containers run as the 'root' user. In addition to the baseline security concerns this can present, it also induces some inconvenient behavior and side effects. Specifically, if I run a container as myuser with a user ID of 1234 and mount my working directory into the container, then all files created by the container in the mounted directory will have root ownership. This can cause a variety of subtle problems and a degraded user experience.

Describe the solution you'd like
When running a Morpheus container an end user should be able to specify the UID they want the tasks in the container to run under.

Describe alternatives you've considered
No other alternatives considered

Additional context

There are some complicating issues with user permissions on base conda installation items, packages, etc., which make this more complicated than I'd originally hoped and is why it's on the backlog. I don't think there are any insurmountable problems, but it might require some trial and error to work out all the issues.

[BUG] Remove `no_args_is_help` from CLI

Describe the bug
If you use a Morpheus CLI command without any arguments or options, it just prints the help text instead of adding that stage to the pipeline.

Steps/Code to reproduce bug
For example, this launch command will just print the help:

(morpheus)$ morpheus run pipeline-nlp \
                from-file --filename=data/pcap_dump.jsonlines \
                deserialize
Configuring Pipeline via CLI
Usage: morpheus run pipeline-nlp deserialize [OPTIONS]

Options:
  --help  Show this message and exit.

Expected behavior
The pipeline should run with the specified commands
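The behavior described above can be demonstrated in isolation with click. With no_args_is_help=True, invoking a command with zero arguments prints its help and never runs the callback; the command below is illustrative, not the real Morpheus CLI:

```python
import click
from click.testing import CliRunner


@click.command(no_args_is_help=True)
@click.option("--filename", default=None)
def deserialize(filename):
    click.echo(f"adding deserialize stage (filename={filename})")


runner = CliRunner()
no_args = runner.invoke(deserialize, [])          # prints help, callback skipped
with_args = runner.invoke(deserialize, ["--filename", "pcap_dump.jsonlines"])
```

Removing `no_args_is_help=True` (or defaulting it to False for stage subcommands) would let `deserialize` run with its option defaults, which is the expected behavior.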

[FEA] AppShieldSource Stage

Messages from AppShield plugins should be read into the Morpheus pipeline by an AppShield source stage as a single data frame per source, ready for further processing such as feature creation.

[FEA] Improve LFS with Scripts and Optional Download

Is your feature request related to a problem? Please describe.
Currently, Git LFS is used to store large and binary files in the repo. However, some users have not had success with Git LFS. By default, Git LFS will download all files for a particular commit, even if the user is unlikely to need every file. This adds about 2 GB to the repo checkout.

Describe the solution you'd like
Ideally, the LFS configuration would be improved to minimize the download size for the average user. Utility scripts could be added to download the necessary files for a particular workload/example/model. This would improve the user experience by minimizing the repo checkout size and would not require anyone to learn the necessary Git LFS commands to download ignored or skipped files.

Describe alternatives you've considered
Other options include storing large files external to Git (i.e. S3) and building a set of scripts or tools to download these files as necessary. This option (while preferred by some) would just trade one problem for another. We would need to develop our own versioning system for storing these large files, create the tools to download them, and maintain a system which functions very similarly to Git LFS but is slightly different. While Git LFS is far from perfect, we can mitigate some of the issues by correctly configuring our repo and providing scripts to shield users from needing to deal with Git LFS.

Additional context
The recommended solution from Rapids Ops is outlined here: https://docs.rapids.ai/resources/git/
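A utility script wrapping Git LFS could be as thin as the sketch below. `git lfs pull --include <pattern>` is a standard Git LFS invocation; the wrapper names and the `examples/...` path pattern are assumptions for illustration:

```python
import subprocess


def lfs_pull_command(include_pattern: str) -> list:
    # `--include` limits the LFS download to paths matching the pattern,
    # so users only fetch the files their workload actually needs.
    return ["git", "lfs", "pull", "--include", include_pattern]


def fetch_files_for(example: str) -> None:
    # Hypothetical per-example wrapper so users never have to learn the
    # underlying Git LFS commands themselves.
    subprocess.run(lfs_pull_command(f"examples/{example}/**"), check=True)
```

Combined with an LFS config that skips large files by default, this keeps the initial checkout small while making the full download one command away.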

[DOC] Suggested updates to developer guide for clarity

In reading through the developer guide I had some changes to suggest.
Some suggestions are typos or copy/paste errors; others are an attempt to increase clarity.
Please edit/disregard these suggestions at will.

Files with suggested updates:

  • docs/source/developer_guide/architecture.md
  • docs/source/developer_guide/guides/1_simple_python_stage.md
  • docs/source/developer_guide/guides/2_real_world_phishing.md
  • docs/source/developer_guide/guides/3_simple_cpp_stage.md
  • docs/source/developer_guide/guides/4_source_cpp_stage.md

Suggested fix for documentation
See Pull Request #96 .

to-file stage failure

Traceback (most recent call last):
  File "/opt/conda/envs/morpheus/bin/morpheus", line 11, in <module>
    sys.exit(run_cli())
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/cli.py", line 1381, in run_cli
    cli(obj={}, auto_envvar_prefix='CLX', show_default=True, prog_name="morpheus")
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1689, in invoke
    return _process_result(rv)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1626, in _process_result
    value = ctx.invoke(self._result_callback, value, **ctx.params)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/cli.py", line 543, in post_pipeline
    pipeline.run()
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 1239, in run
    asyncio.run(self._do_run())
  File "/opt/conda/envs/morpheus/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/envs/morpheus/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 1214, in _do_run
    self.build_and_start()
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 1038, in build_and_start
    self.start()
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 998, in start
    self._neo_executor.start()
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 960, in inner_build
    s.build(seg)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 526, in build
    dep.build(seg, do_propagate=do_propagate)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 526, in build
    dep.build(seg, do_propagate=do_propagate)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 526, in build
    dep.build(seg, do_propagate=do_propagate)
  [Previous line repeated 10 more times]
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 504, in build
    out_ports_pair = self._build(seg, in_ports_pairs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/pipeline.py", line 783, in _build
    return [self._build_single(seg, in_ports_streams[0])]
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/output/to_file.py", line 113, in _build_single
    to_file = neos.WriteToFileStage(seg,
ValueError: Read/Write mode ('+') is not supported by WriteToFileStage. Mode: w+

[DOC] ssh-agent must be set up to build as instructed in CONTRIBUTING.md

Report incorrect documentation

Location of incorrect documentation
CONTRIBUTING.MD in the "Build in Docker Container" section.

Describe the problems or issues found in the documentation
If ssh-agent is not set up as described in step 2, then step 1 may fail with the following error:
error: invalid empty ssh agent socket: make sure SSH_AUTH_SOCK is set.
If step 2 is executed first, then step 1 works without error.

Steps taken to verify documentation is incorrect
I encountered this error on:

  • Operating system: Ubuntu 18.04
  • Docker runtime version: 20.10.15

I suspect the offending lines are 44-47 of docker/build_container.sh here.

Suggested fix for documentation
Either switch steps 2 and 1 or (preferably) modify lines 44-47 of docker/build_container.sh to work even if ssh-agent has not been configured. Perhaps use a new, alternate environment variable here such as DOCKER_PRIVATE_REPOS instead of DOCKER_BUILDKIT?

[BUG] bert vocabulary file has incorrectly encoded special characters.

This file has incorrectly encoded special characters.

https://github.com/NVIDIA/Morpheus/blob/branch-22.04/models/training-tuning-scripts/log-parsing-models/resources/bert-base-cased-vocab.txt

A copy of the file which is correctly encoded is located here:

https://github.com/NVIDIA/Morpheus/blob/branch-22.04/models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt

We should remove the incorrectly encoded file as well as the associated hash file, and update references to use the correctly encoded file and hash file.

cc @raykallen @efajardo-nv

[FEA] Create Utility Repo to Store Common Functionality

Is your feature request related to a problem? Please describe.
We currently have several repos that all require similar functionality related to CMake, CI, Conda, Docker, and testing. This functionality is duplicated, making it hard to maintain.

Describe the solution you'd like
Create a new repo containing the above common functionality that can be added as a submodule to all repos that need it. This will allow for versioning the common tools and keeping everything in sync. A partial list of items that could be included is:

#87 (comment)

#87 (comment)

#87 (comment)

Allow Building with `sccache` instead of `ccache`

In the PR to add a CI system, it was discovered that ccache doesn't work well between build stages. Instead, Morpheus should have the option to build using sccache.

Original comment from the PR review is below:

I think there might be a better way to configure using sccache. The script cmake/setup_cache.cmake does some templating to ensure ccache can figure out the compiler type during a conda-build run (look at the comments in that file for more info).

We could also just set the variable CCACHE_PROGRAM_PATH to sccache if they are drop-in replacements for each other.

Either way, we should make a new issue to explore this.

Originally posted by @mdemoret-nv in #80 (comment)

[BUG] CI scripts broken

Describe the bug
Both ci/scripts/python_checks.sh and ci/scripts/fix_all.sh are currently broken.
I believe python_checks might work when invoked as a git hook, and I have a pull request open which fixes it in CI, but these scripts should also work when invoked directly.

Steps/Code to reproduce bug
ci/scripts/python_checks.sh

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of Morpheus install: from source

[FEA] benchmark the example workflows

Let's benchmark the example workflows for end-to-end timing with fixed message counts and the ability to compare the Python and C++ stage implementations.

Goals:

  • run all example workflows as benchmarks
  • ability to run pure Python implementations or include C++ accelerated implementations
  • time-based benchmarks, not messages-per-second benchmarks
  • answer the question "on average, how fast are the example workflows in both pure Python and C++ accelerated modes?"

Non-Goals

  • per-stage metrics
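An end-to-end timing harness matching these goals could be as simple as the sketch below (the `benchmark` helper is an assumption; `run_pipeline` stands in for any callable that runs a whole example workflow over a fixed message count):

```python
import time


def benchmark(run_pipeline, runs: int = 5) -> float:
    """Sketch: average end-to-end wall-clock time for a workflow, so the
    pure Python and C++ accelerated builds can be compared on the same
    fixed input. No per-stage metrics are collected (a stated non-goal)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_pipeline()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)
```

Running the same harness twice, once with C++ stages disabled and once enabled, answers the "how fast on average" question directly.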

Convert GitLab Config to GitHub

With the transition to GitHub, we need to convert the GitLab-specific config files to their GitHub equivalents. Here is the task list:

  • Add the ops-bot config file into the repo to enable the ops-bot (Links: Ops-bot, Config file, and Example)
  • Convert the issue templates
  • Make a new CODEOWNERS file

For now, let's keep the ./.gitlab file around until after release.

[DOC] Add NGC references back to README

Report needed documentation
We removed the NGC references from the getting started section temporarily to avoid confusion about the multiple paths to get started. The removed link directed users here: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/collections/morpheus_. This is where a user would go to use Helm charts to run Morpheus in a K8s environment.

Describe the documentation you'd like
This link and wording around it should be added back into the README.md file once we can decide on the wording to avoid potential confusion with users mixing and matching paths to get started.

[FEA] Documentation build stage shouldn't require a gpu

Is your feature request related to a problem? Please describe.
If we try to build the documentation without a GPU, we get this error:

F20220518 00:39:57.558239   679 device_info.cpp:39] Check failed: nvmlInit_v2() == NVML_SUCCESS (9 vs. 0) Failed to initialize NVML

Describe the solution you'd like
Sphinx provides the ability to mock modules during the build phase, likely we just need to identify these and add them to our config.
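A minimal conf.py change could look like this. autodoc_mock_imports is a standard Sphinx autodoc option that substitutes mocks for the listed modules at doc-build time; the specific module names below are assumptions about which imports touch the GPU (and hence NVML) at import time:

```python
# docs/source/conf.py sketch -- mock out GPU-touching imports so the docs
# build does not need NVML/a physical GPU. Module names are illustrative.
autodoc_mock_imports = ["neo", "cudf", "morpheus._lib"]
```

With the right modules mocked, the docs stage can run on a CPU-only CI node.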

[BUG] WriteToFileStage fails when given a filename without a path

Describe the bug
Initializing WriteToFileStage with:
WriteToFileStage(config, filename='test.json')

Will cause a failure in _build_single:
os.makedirs(os.path.dirname(self._output_file), exist_ok=True)
The call to os.path.dirname will return '', causing os.makedirs to raise an exception.
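A guard like the following would fix the failure mode (the helper name is illustrative; the real fix would go inside _build_single):

```python
import os


def ensure_parent_dir(output_file: str) -> None:
    # os.path.dirname returns '' for a bare filename like 'test.json', and
    # os.makedirs('') raises FileNotFoundError -- so only create the parent
    # directory when the path actually contains one.
    parent = os.path.dirname(output_file)
    if parent:
        os.makedirs(parent, exist_ok=True)
```

`ensure_parent_dir('test.json')` is then a no-op, while `ensure_parent_dir('out/test.json')` still creates `out/`.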

[DOC] Morpheus source build instructions in CONTRIBUTING.md do not work as written.

Report incorrect documentation

Location of incorrect documentation
CONTRIBUTING.md

export CUDAToolkit_ROOT=/usr/local/cuda-{CUDA_VER}
mamba env create -n morpheus -f ./docker/conda/environments/cuda${CUDA_VER}_dev.yml
conda activate morpheus

Describe the problems or issues found in the documentation

  • This does not work, because the cuda_xxx_dev.yml specifies cudf 21.10.* *morpheus as a dependency.
  • The specified build path for cuDF also will not work, because the build.sh approach does not add the correct morpheus tag to the installed package.

Suggested fix for documentation
Steps 1 and 2 need to be swapped, and the cuDF build description needs to be updated to reflect the new conda packaging process.

[FEA] Improve What Is Stored In CI Artifacts to Reduce Size and Allow for In-Place Builds

Is your feature request related to a problem? Please describe.
The original implementation of the CI system in PR #80 saves the entire conda environment and workspace as artifacts so that the built Morpheus packages can be used in the docs and testing stages. This seems to be necessary because the cuDF packages are installed into the conda environment and the Morpheus packages are installed as in-place developer builds, which requires both the workspace and the conda environment to be zipped up and moved to the next build stage.

Describe the solution you'd like
Tarring up the entire conda environment is unnecessary and requires 4+ GB of storage per build, even though the majority of the files are identical between builds. Instead, only a few packages need to be transferred between the build and the test/docs phases:

  1. The built cudf conda package
  2. The built morpheus conda package
  3. The built morpheus wheel package

[FEA] Improve Monitor Stage

Is your feature request related to a problem? Please describe.
The MonitorStage currently has several flaws that reduce its usefulness:

  1. It has its own buffered queue, meaning messages are not tracked as they are processed but only when back pressure forces them out of the queue. This delays and distorts the measured results
  2. It requires a separate python thread to periodically update the screen with the latest results
  3. The throughput numbers cannot be queried or displayed anywhere besides the console
  4. It does not scale well. If multiple monitors are added, they all redraw and refresh at separate times.

Describe the solution you'd like
Ideally, the MonitorStage would:

  1. Immediately track the throughput of the previous stage without any buffering. This would require switching to a node type that does not have its own progress engine.
  2. Be implemented in C++ to avoid grabbing the GIL when monitoring throughput
  3. Utilize a common library such as Prometheus to allow extracting the measured values or displaying outside of the console
  4. Synchronize updates between multiple instances to avoid scaling issues.
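As a rough illustration of items 1 and 4, a single shared registry that stages increment inline (no buffered queue) and one renderer periodically reads avoids per-monitor redraw storms. ThroughputRegistry is a hypothetical name for this sketch, not a Morpheus API:

```python
import threading
from collections import defaultdict

class ThroughputRegistry:
    """Shared counters updated inline by each stage; a single reader renders them all."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = defaultdict(int)

    def record(self, stage: str, n: int) -> None:
        # Called inline as messages pass through -- no buffered queue delaying counts.
        with self._lock:
            self._counts[stage] += n

    def snapshot(self) -> dict:
        # One periodic reader (console, Prometheus exporter, ...) pulls all values
        # at once, so multiple monitors never redraw at separate times.
        with self._lock:
            return dict(self._counts)
```

A Prometheus exporter would fit naturally behind snapshot(), satisfying item 3 as well.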

[FEA] Ensure all message classes have a C++ impl

Is your feature request related to a problem? Please describe.
This would allow pipelines like Hammah, where some stages have C++ impls, to run in a hybrid mode.

Describe the solution you'd like
There are a few message classes which don't have C++ impls but inherit from a message class that has a C++ impl. Typically these only add a few attributes.

Describe alternatives you've considered
Options would be:

  1. Just add the new C++ message classes (I have some of them in an old branch).
  2. Investigate pybind11 trampoline classes

[FEA] Update Morpheus to Use Latest cuDF

Is your feature request related to a problem? Please describe.
As of the GA release, Morpheus is stuck on cuDF version 21.10 because several modifications to the cuDF source code were necessary, and these modifications cannot be applied to later releases. The modified files are contained in a patch file located here and can be summarized as follows:

  1. Addition of ${CUDF_SOURCE_DIR}/include/nvtext to the installed headers to allow Morpheus to use the C++ SubwordTokenizer
  2. Public export of the cython Column class (this could possibly be removed)
  3. Public export of the cython Table class

All of these modifications revolve around exposing public aspects of the cuDF library to external projects and were originally intended to be upstreamed after the 0.2 EA release. However, in the 21.12 release of cuDF, the Table class (item #3 above) was removed completely, preventing the patch from applying to any release after 21.10.

With the removal of the Table cython class, Morpheus will need to refactor the morpheus.TableInfo class and the cudf_helpers.pyx code.

How the Table class is used in Morpheus
Morpheus stores a message's DataFrame object in C++ using a pybind11::object wrapped in the class IDataTable, which is then referenced by a TableInfo struct. To operate on the data in a DataFrame using C++, we need to convert the Python object referenced in a TableInfo into a cudf::table_view. The Table cython class facilitates this conversion in C++ even though it is not used directly (we cast a pybind11 object into cython via (PyTable*)data_frame.ptr()), and this is the main reason the Table class needed to be exported in patch item #3 above.

  • To convert from a pybind11::object -> cudf::table_view, we use the cudf cython method cudf._lib.table.table_view_from_table()
  • To convert from a cudf::table_view -> pybind11::object, we use the cudf cython method cudf._lib.utils.data_from_table_view()

Describe the solution you'd like
Ultimately, the solution to this problem should allow Morpheus to utilize the most recent version of cuDF without needing a patch file or custom cuDF build while maintaining the C++ performance currently available in GA. The best solution will likely require working with the cuDF team to find alternatives and upstream any changes they see fit. Below is a rough outline of some of the expected changes:

  1. Removal of a cuDF patch file and custom cuDF conda build from the Docker components
  2. Update to the IDataTable class to store C++ objects instead of the python DataFrame
  3. Update to the TableInfo class to handle slicing into the IDataTable using C++ functions instead of python
  4. Update cudf_helpers to work with the latest cuDF classes and utility functions

Describe alternatives you've considered
While updating the IDataTable and TableInfo classes to store data in C++ instead of python would be preferred, it may not be necessary. Storing the data in C++ could significantly improve performance but may complicate the implementation. If this proves to be too difficult, alternative options that find workarounds to the removed classes while preserving the functionality would also satisfy the main goal of using the latest cuDF without modifications.

[DOC] abp_nvsmi_detection example is out of date

Report incorrect documentation

Location of incorrect documentation
examples/abp_nvsmi_detection/README.md

Describe the problems or issues found in the documentation

  • Uses the deprecated buffer stage
  • Uses the visualization flag, which appears to be broken
  • Uses a Triton fork that appears to be no longer relevant and doesn't build properly

Steps taken to verify documentation is incorrect
I tried to follow the steps

[BUG] Stacktrace when interrupting (SIGINT) a from-kafka pipeline

Describe the bug
Ctrl-C of a from-kafka pipeline yields an abort and stacktrace.

====Pipeline Started====
Stopping pipeline. Please wait... Press Ctrl+C again to kill.C
From Kafka rate: 93085messages [03:45, 413.64messages/s]
Deserialization rate: 93085messages [03:45, 413.64messages/s]
Preprocessing rate: 93085messages [03:45, 413.65messages/s]
Inference rate: 93085inf [03:45, 413.65inf/s]
Serialization rate: 93085messages [03:45, 413.65messages/s]
To Kafka rate: 93085messages [03:45, 413.65messages/s]
====Stopping Pipeline====
%3|1651867570.038|ERROR|rdkafka#consumer-1| [thrd:GroupCoordinator]: 1/1 brokers are down
F20220506 20:06:10.052778  4511 executor_base.cpp:395] Check failed: search != m_segments.end() [segment id: 57864; rank: 1]: not found
*** Check failure stack trace: ***
    @     0x7fb4ed097c0d  google::LogMessage::Fail()
    @     0x7fb4ed09a7a6  google::LogMessage::SendToLog()
    @     0x7fb4ed097705  google::LogMessage::Flush()
    @     0x7fb4ed09ad6a  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fb4e4d834bd  neo::ExecutorBase::segment()
    @     0x7fb4e4d84602  neo::ExecutorBase::remove_segment()
    @     0x7fb4e4d87fae  neo::ExecutorBase::do_update()
    @     0x7fb4e4d86f01  neo::ExecutorBase::update()
    @     0x7fb4e4d8f93d  neo::StandaloneExecutor::stop()
    @     0x7fb4caf2ea6c  neo::pyneo::Executor::stop()
    @     0x7fb4caea3daf  (unknown)
    @     0x7fb4caea88aa  (unknown)
    @     0x561d62c03e7e  PyCFunction_Call
    @     0x561d62bec631  _PyObject_MakeTpCall
    @     0x561d62c03bfd  method_vectorcall
    @     0x561d62be7ffc  _PyEval_EvalFrameDefault
    @     0x561d62bf42a6  _PyFunction_Vectorcall
    @     0x561d62be3d9d  _PyEval_EvalFrameDefault
    @     0x561d62be2b76  _PyEval_EvalCodeWithName
    @     0x561d62bf433c  _PyFunction_Vectorcall
    @     0x561d62d0cc6a  _PyObject_Vectorcall.lto_priv.10
    @     0x561d62ba2441  context_run
    @     0x561d62beb0c6  cfunction_vectorcall_FASTCALL_KEYWORDS
    @     0x561d62c040bf  PyVectorcall_Call
    @     0x561d62be8d2e  _PyEval_EvalFrameDefault
    @     0x561d62bf42a6  _PyFunction_Vectorcall
    @     0x561d62be3d9d  _PyEval_EvalFrameDefault
    @     0x561d62bf42a6  _PyFunction_Vectorcall
    @     0x561d62be3d9d  _PyEval_EvalFrameDefault
    @     0x561d62bf42a6  _PyFunction_Vectorcall
    @     0x561d62be3d9d  _PyEval_EvalFrameDefault
    @     0x561d62bf42a6  _PyFunction_Vectorcall
Aborted (core dumped)

Steps/Code to reproduce bug
Ctrl-C in the shell of a from-kafka pipeline

Expected behavior
No stacktrace. Graceful exit from pipeline.

Environment overview (please complete the following information)

  • Environment location: [Bare-metal]
  • Method of Morpheus install: [Docker/k8s]

Environment details
https://gist.github.com/pdmack/2f1924031e20995e99985cfd735fbbc3


[FEA] Python packages should be re-organized

Is your feature request related to a problem? Please describe.
Currently, Morpheus stages are not easily discoverable because they are intermixed with pipeline, utility, and message modules. Reorganizing the packages would allow users to quickly get a list of the available stages and messages.

Describe the solution you'd like
Ideally, the Python module layout should match the directory structure of our C++ headers:

_lib/include/morpheus
    io/
    messages/
    objects/
    stages/
    utilities/

Such that:

from morpheus.pipeline import LinearPipeline
from morpheus.pipeline.input.from_cloudtrail import CloudTrailSourceStage
from morpheus.pipeline.general_stages import FilterDetectionsStage
from morpheus.pipeline.messages import MessageMeta

Would become:

from morpheus.pipeline import LinearPipeline
from morpheus.stages.cloudtrail_source import CloudTrailSourceStage
from morpheus.stages.filter_detection import FilterDetectionsStage
from morpheus.messages.meta import MessageMeta

This would make it easier for users who work with both the C++ and Python APIs.

[DOC] nlp_si_detection example needs to be updated

Report incorrect documentation

  • Old Triton version
  • Reference to a git submodule
  • Remove the visualization flag
  • In C++ mode it fails an assert early: Check failed: this->mess_count == this->count At this time, mess_count and count must be the same for slicing
  • In Python mode it fails later with: ValueError: Cannot align indices with non-unique values

Location of incorrect documentation
examples/nlp_si_detection

[FEA] Add Python Stubs for Cython/PyBind11 Modules

Is your feature request related to a problem? Please describe.
Python modules created with cython/pybind11 do not show up in IDEs and fail all linting, which is a poor developer experience.

Describe the solution you'd like
Utilize mypy or pybind11-stubgen to create .pyi stubs for all C++ generated code. This allows linters and IDEs to pick up and check Python code that uses these modules.

[BUG] FilterDetectionsStage can spam the output edge with thousands of small sliced messages

Describe the bug
Degraded pipeline performance when using the filter stage with any threshold.
In this case pipeline_batch_size=8192, so messages entering the FilterDetectionsStage generally contain 8192 rows, and the FilterDetectionsStage emits roughly ~1,800 slices of each incoming message, which quickly causes the output edge to block on writes.
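To illustrate why this happens: each contiguous run of rows passing the threshold becomes one sliced output message, so a scattered boolean mask over an 8192-row batch can easily produce thousands of slices. A sketch of the counting logic (not the actual stage code):

```python
import numpy as np

def count_output_slices(mask: np.ndarray) -> int:
    """Count contiguous runs of True in `mask`; each run becomes one sliced message."""
    padded = np.concatenate(([False], mask, [False]))
    # A slice starts at every False -> True transition.
    return int(np.count_nonzero(~padded[:-1] & padded[1:]))
```

With detections scattered at random through a large batch, the number of runs approaches the number of flagged rows, which matches the ~1,800 slices observed above.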

Steps/Code to reproduce bug

morpheus --log_level=DEBUG run --use_cpp=True --num_threads=8 --pipeline_batch_size=8192 --model_max_batch_size=32 --edge_buffer_size=32 \
  pipeline-nlp --model_seq_length=256 \
  from-file --filename=./examples/data/pcap_dump.jsonlines dropna \
  monitor --description='Drop Null Attributes rate' \
  deserialize \
  monitor --description='Deserialization rate' \
  preprocess --vocab_hash_file=./morpheus/data/bert-base-uncased-hash.txt --truncation=True --do_lower_case=True --add_special_tokens=False \
  monitor --description='Preprocessing rate' \
  inf-triton --force_convert_inputs=True --model_name=sid-minibert-onnx --server_url=localhost:8001 --use_shared_memory=True \
  monitor --description='Inference rate' --smoothing=0.001 --unit inf \
  add-class \
  monitor --description='Classification rate' \
  filter \
  monitor --description='Filter rate' \
  serialize --exclude '^ts_' \
  monitor --description='Serialization rate' \
  to-file --filename=/tmp/sid-minibert-onnx-output.jsonlines --overwrite

Expected behavior
Performance in line with other stages.

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of Morpheus install: Docker

Environment details
https://gist.github.com/pdmack/2f1924031e20995e99985cfd735fbbc3


[DOC] Add Quickstart Guide

To get started with Morpheus workflows, add a Morpheus Quickstart Guide doc for running precompiled workflows. The guide should describe how to install the Morpheus SDK and its components, as well as how to run a workflow end to end.

[BUG] tensorrt package missing leading to onnx-to-trt failure

# morpheus tools onnx-to-trt \
>             --input_model /common/models/sid-minibert-onnx/1/model.onnx \
>             --output_model /common/models/sid-minibert-trt_b1-8_b1-16_b1-32.engine \
>             --batches 1 8 \
>             --batches 1 16 \
>             --batches 1 32 \
>             --seq_length 256 \
>             --max_workspace_size 4000
Traceback (most recent call last):
  File "/opt/conda/envs/morpheus/bin/morpheus", line 11, in <module>
    sys.exit(run_cli())
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/cli.py", line 1381, in run_cli
    cli(obj={}, auto_envvar_prefix='CLX', show_default=True, prog_name="morpheus")
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/cli.py", line 136, in new_func
    return f(ctx, *args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/cli.py", line 233, in onnx_to_trt
    gen_engine(c)
  File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/utils/onnx_to_trt.py", line 30, in gen_engine
    import tensorrt as trt
ModuleNotFoundError: No module named 'tensorrt'
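The immediate workaround is to install the TensorRT Python bindings into the Morpheus environment. A defensive sketch of how the import in gen_engine could surface a clearer error (require_tensorrt is a hypothetical helper, not the actual Morpheus code):

```python
def require_tensorrt():
    """Import tensorrt lazily, turning a missing package into an actionable error."""
    try:
        import tensorrt as trt  # provided by the TensorRT Python bindings
    except ImportError as exc:
        raise RuntimeError(
            "onnx-to-trt requires the 'tensorrt' package; install the TensorRT "
            "Python bindings into the Morpheus environment and retry."
        ) from exc
    return trt
```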
