
cyclonedx-editor-validator's People

Contributors

andife, cbeck-96, cedricwritescode, dependabot[bot], italvi, mmarseu


cyclonedx-editor-validator's Issues

Generated filename is invalid if component has hashes

Some of the commands generate a new filename for the SBOM when used with the --output option pointing to a folder.

This filename currently fails to validate if metadata.component.hashes exists and is not empty.

That's because the filename generation algorithm in output.py doesn't take hashes into consideration at all.

On the other hand, the validator requires a hash to be present in the filename if the metadata component has at least one hash.

I'd be willing to fix this if you let me know which of the two sides of the problem you'd like to change: Either the hashes are added to the filename generator, or the validator accepts filenames without hashes even if the metadata component has one.

`build-public` messes up compositions

The build-public command does something surprising to the .compositions array. I haven't checked in the code what exactly it does but it definitely deletes some entries that shouldn't be deleted.

For example, use build-public with a dummy schema that doesn't match anything (see #153) on

"compositions": [
        {
            "aggregate": "complete",
            "assemblies": [
                "some-vendor/[email protected]",
                "some-vendor/[email protected]:physics/[email protected]",
                "some-vendor/[email protected]:physics/[email protected]",
                "some-vendor/[email protected]:physics/[email protected]:[email protected]"
            ]
        },
        {
            "aggregate": "incomplete",
            "assemblies": ["com.company.unit/[email protected]"]
        }
]

The command shouldn't delete anything and therefore the SBOM should simply remain untouched. However, the result is

"compositions": [
        {
            "aggregate": "incomplete",
            "assemblies": [
                "com.company.unit/[email protected]",
                "some-vendor/[email protected]",
                "[email protected]"
            ]
        }
],

Add `amend` operator to delete ambiguous licenses

Add optional amend operation to remove ambiguous licenses from components.

Some SBOM generators (syft in particular) generate license entries with only a name property but no further context such as url or text. This serves very little purpose and we prefer deleting these entries over keeping them in the SBOM.

Therefore, a new operator for amend should be implemented, which deletes license entries such as:

"license": {
  "name": "something"
}

This operator must be disabled by default and only run when specifically enabled through a command-line switch.
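
A minimal sketch of what such an operator could do, assuming it works on a single component dictionary. The function name remove_name_only_licenses is hypothetical and not part of cdxev, and dropping an emptied licenses array is an assumption about the desired behavior:

```python
from typing import Any


def remove_name_only_licenses(component: dict[str, Any]) -> int:
    """Drop license entries whose only property is "name".

    Hypothetical helper, not part of cdxev. Returns the number of removed entries.
    """
    licenses = component.get("licenses", [])
    # Keep everything except entries of the form {"license": {"name": ...}}.
    # Entries using "expression" have no "license" key and are always kept.
    kept = [
        entry
        for entry in licenses
        if set(entry.get("license", {}).keys()) != {"name"}
    ]
    removed = len(licenses) - len(kept)
    if kept:
        component["licenses"] = kept
    else:
        # Assumption: an emptied "licenses" array is removed entirely.
        component.pop("licenses", None)
    return removed
```

Entries that carry an id, url or text alongside the name are left untouched, so only the context-free entries described above are affected.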

Merge cannot read files with uppercase characters in filename when using --from-folder option on Linux

I've encountered an error when trying to merge multiple SBOM files using the --from-folder option:
ERROR: Failed to load input file (at line 64) - File not found.
The error references the read_sbom method. This happened in a Docker image. It seems to happen only on Unix systems.

Reproduce

In order to reproduce this error, use these files, with the given command:
cdx-ev merge --from-folder "components/" --output "out.cdx.json" "main.cdx.json"
error_reproduction.zip

Whereas on Windows this works flawlessly because of the case-insensitivity towards filenames, the case-sensitive filesystem on Linux doesn't find the requested file.

Cause of error

The error is caused in the invoke_merge() function in __main__.py, in which the files from the specified directory are being globbed. The Python file cdxev-error.py inside error_reproduction.zip shows that lowercase versions are being passed to read_sbom.

Python 3.11.2
cdx-ev 0.8.1

Add further options for modifying an SBOM

Allow set, or a similar command, to select components by subschema and/or regex in addition to name+version+group, so that all components that fulfill the provided schema get modified.
This operation would be more complex, but since it is planned as an addition to the regular set schema, it would extend the already implemented function and make it more versatile without any detriment to it.
This would enable more complex operations, which can be necessary depending on the output of the creation tool.

Missing attribution for license data

Ah, irony is beautiful 😆

There we went and copied raw data about licenses from other open-source projects and neglected to make sure we follow their license conditions.

cdxev/amend/license_name_spdx_id_map.json is largely based off of https://github.com/CycloneDX/cyclonedx-core-java/blob/master/src/main/resources/license-mapping.json which is licensed under Apache-2.0.

We also incorporated additional license names and ids from SPDX. Since the raw data that the site is based on - equally ironically - doesn't specify a license, I'd say we refer to the license of the website which is copyrighted by The Linux Foundation and under CC-BY-3.0.

Amend throws error on Windows using licenses from Conan project

I've noticed a bug while trying to amend an SBOM generated with the Conan cyclonedx tool with license data from the Conan _collected_license_files (using conan install --deployer=licenses) on Windows. On Linux it seems to run without any problems (tested on Ubuntu). I've attached a reduced file set that will reproduce this error.

Error on Windows:

WARNING: License text not found - No text for the license (Unrar), in component (PURL[pkg:conan/[email protected]]), was found. An empty string was added as text.

INFO: License text added - The text of the license (CppTest_EULA), in component (PURL[pkg:conan/[email protected]]), was added.
INFO: License text added - The text of the license (License_Jlink), in component (PURL[pkg:conan/[email protected]]), was added.
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Appl\Python\Lib\site-packages\cdxev\__main__.py", line 631, in <module>
    sys.exit(main())
             ^^^^^^
  File "C:\Appl\Python\Lib\site-packages\cdxev\__main__.py", line 40, in main
    return args.cmd_handler(args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Appl\Python\Lib\site-packages\cdxev\__main__.py", line 462, in invoke_amend
    amend(sbom, args.license_path)
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\command.py", line 32, in run
    walk_components(sbom, _do_amend, skip_meta=True)
  File "C:\Appl\Python\Lib\site-packages\cdxev\auxiliary\sbomFunctions.py", line 307, in walk_components
    _recurse(sbom["components"], func, *args, **kwargs)
  File "C:\Appl\Python\Lib\site-packages\cdxev\auxiliary\sbomFunctions.py", line 296, in _recurse
    func(component, *args, **kwargs)
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\command.py", line 55, in _do_amend
    operation.handle_component(component)
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\operations.py", line 230, in handle_component
    process_license(
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\process_license.py", line 90, in process_license
    add_text_from_folder_to_license_with_name(
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\process_license.py", line 152, in add_text_from_folder_to_license_with_name
    license_text = get_license_text_from_folder(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Appl\Python\Lib\site-packages\cdxev\amend\process_license.py", line 211, in get_license_text_from_folder
    license_text = f.read()
                   ^^^^^^^^
  File "C:\Appl\Python\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 219: character maps to <undefined>

Output on Linux:

WARNING: License text not found - No text for the license (Unrar), in component (PURL[pkg:conan/[email protected]]), was found. An empty string was added as text.
INFO: License text added - The text of the license (CppTest_EULA), in component (PURL[pkg:conan/[email protected]]), was added.
INFO: License text added - The text of the license (License_Jlink), in component (PURL[pkg:conan/[email protected]]), was added.
INFO: License text added - The text of the license (SLA0048), in component (PURL[pkg:conan/stm32cubeprog]), was added.
Writing output to: My application_1.0.0_20240423T061253.cdx.json

error_reproduction.zip
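
The traceback ends in cp1252.py, which suggests the license file is opened without an explicit encoding and therefore decoded with the Windows locale default. Byte 0x9d is unmapped in cp1252 but occurs in UTF-8 text, for example as the last byte of a typographic closing quote. A possible fix, sketched here with a hypothetical helper name and under the assumption that the license files are UTF-8, is to pass the encoding explicitly:

```python
from pathlib import Path


def read_license_text(path: Path) -> str:
    """Read a license file independently of the platform's default encoding.

    Hypothetical helper, not the actual cdxev code.
    """
    # Assumption: license files are UTF-8. errors="replace" substitutes U+FFFD
    # for undecodable bytes instead of raising UnicodeDecodeError.
    return path.read_text(encoding="utf-8", errors="replace")
```

With an explicit encoding, the same file yields the same text on Windows and Linux.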

Add support for new tools format in CycloneDX 1.5

CycloneDX 1.5 has deprecated the .metadata.tools array in favor of an object. See here: https://cyclonedx.org/docs/1.5/json/#tab-pane_metadata_tools_oneOf_i0

Example:

{
    "metadata": {
        "tools": {
            "components": [
                {
                    "type": "application",
                    "author": "anchore",
                    "name": "syft",
                    "version": "1.0.1"
                }
            ]
        }
    }
}

Currently, this tool errors out when faced with an SBOM with such an object:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Appl\Python\Scripts\cdx-ev.exe\__main__.py", line 7, in <module>
  File "C:\Appl\Python\Lib\site-packages\cdxev\__main__.py", line 40, in main
    return args.cmd_handler(args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Appl\Python\Lib\site-packages\cdxev\__main__.py", line 463, in invoke_amend
    write_sbom(sbom, args.output)
  File "C:\Appl\Python\Lib\site-packages\cdxev\auxiliary\output.py", line 36, in write_sbom
    update_tools(sbom)
  File "C:\Appl\Python\Lib\site-packages\cdxev\auxiliary\output.py", line 113, in update_tools
    tools.append(this_tool)
    ^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'append'
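
One way to avoid the AttributeError is to branch on the type of .metadata.tools. The following is only a sketch: register_tool is a hypothetical name rather than the real update_tools, and defaulting to the legacy array when no tools entry exists is an assumption:

```python
from typing import Any


def register_tool(sbom: dict[str, Any], name: str, version: str) -> None:
    """Add a tool entry in whichever tools format the SBOM already uses.

    Hypothetical sketch, not the actual cdxev implementation.
    """
    metadata = sbom.setdefault("metadata", {})
    # Assumption: fall back to the legacy array if no tools entry exists yet.
    tools = metadata.setdefault("tools", [])
    if isinstance(tools, dict):
        # CycloneDX 1.5 object form: tools are listed as components.
        tools.setdefault("components", []).append(
            {"type": "application", "name": name, "version": version}
        )
    else:
        # Legacy array form.
        tools.append({"name": name, "version": version})
```

Keeping the existing format avoids silently converting an SBOM from one representation to the other.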

`build-public` incorrectly deletes nested components

The build-public command is meant to delete components marked as internal. It isn't documented what it does with non-internal components nested inside those internal components.

The tool should probably either:

  1. Delete nested components and remove them from the dependency tree the same way as internal components.
    This is likely the more logical choice, as users might expect components bundled inside internal components to also disappear from the SBOM.
  2. Leave nested components in the SBOM and move them up to the parent scope.

Instead, here is what actually happens:

  • Delete any component marked as internal, including nested components.
  • Remove dependencies on the internal component.
  • Do not remove dependencies on the nested components, leaving behind dependencies on components that are no longer part of the SBOM.

We should choose one of the options above, implement it and make it explicit in the documentation.

Enhance "amend" by adding license texts

Feature request:
When a license is set by "name" (not "id"), the field "text" is required. If the "text" is missing, the amend feature shall search the folder "licenses" for a file named <license name>.txt and add the content of this file to the "text" field.

Minimum length for license texts

The "content" of license texts should have a minimum length of 1 in the custom schema:

"content": {
    "type": "string",
    "title": "Attachment Text",
    "minLength": 1,
    "description": "The attachment data"
}
},

feat: add copyright from supplier through amend-command

If a component does not have any licenses or copyright, the amend command should add a copyright using the information provided by publisher, author or supplier.name, with the text Copyright {current year} {name of supplier/publisher/author}.
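
A rough sketch of the requested behavior. The function name add_copyright and the precedence publisher, then author, then supplier.name are assumptions, since the request does not say which field wins when several are present:

```python
from datetime import date
from typing import Any, Optional


def add_copyright(component: dict[str, Any], year: Optional[int] = None) -> bool:
    """Add a copyright to components that have neither licenses nor copyright.

    Hypothetical sketch. Returns True if a copyright was added.
    """
    if component.get("licenses") or component.get("copyright"):
        return False
    # Assumed precedence: publisher, then author, then supplier.name.
    holder = (
        component.get("publisher")
        or component.get("author")
        or component.get("supplier", {}).get("name")
    )
    if not holder:
        return False
    used_year = year if year is not None else date.today().year
    component["copyright"] = f"Copyright {used_year} {holder}"
    return True
```

The optional year parameter exists only to make the function testable; by default the current year is used, as requested.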

Add plausibility check for VEX

Add a check for the plausibility of the VEX file, e.g. that in "analysis" the chosen "state" and "justification" do not contradict each other.
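
To make the idea concrete, here is one rule such a check could apply. It is an assumption, not something stated in the request: a justification only makes sense for the state "not_affected", and that state should come with a justification. The function name is hypothetical:

```python
from typing import Any


def analysis_problems(analysis: dict[str, Any]) -> list[str]:
    """Return plausibility problems found in one vulnerability "analysis" object.

    Hypothetical sketch with a single assumed rule.
    """
    problems: list[str] = []
    state = analysis.get("state")
    justification = analysis.get("justification")
    if state == "not_affected" and not justification:
        problems.append('state "not_affected" has no justification')
    if justification and state != "not_affected":
        problems.append(
            f'justification "{justification}" given for state "{state}"'
        )
    return problems
```

An empty list means that this particular rule found nothing to complain about; further rules could be appended in the same style.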

Add flag ignore missing for set-function

It should be possible to add a flag, e.g. "--ignore-missing", to the "set" command so that, if a file is provided, no error message is shown when a component in the file is not in the SBOM. This is useful when a central database file is used for the command.

Currently an error is shown:
ERROR: Set not performed (at line 230) - The component "COORDINATES[[email protected]]" was not found and could not be updated.

Desired behavior:
When provided the flag "--ignore-missing" the error should not happen.
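
Roughly, the flag could be wired up as follows. This is a standalone sketch: build_parser and apply_updates are hypothetical names, and components are keyed by a plain string here instead of the tool's real coordinates:

```python
import argparse
import logging

logger = logging.getLogger(__name__)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the parser of the "set" command.
    parser = argparse.ArgumentParser(prog="set-sketch")
    parser.add_argument(
        "--ignore-missing",
        action="store_true",
        help="do not fail when a component from the file is not in the SBOM",
    )
    return parser


def apply_updates(
    components: dict[str, dict],
    updates: dict[str, dict],
    ignore_missing: bool = False,
) -> None:
    """Apply updates to known components, optionally skipping unknown ones."""
    for key, changes in updates.items():
        target = components.get(key)
        if target is None:
            if ignore_missing:
                logger.info('Component "%s" not found, skipping.', key)
                continue
            raise KeyError(
                f'The component "{key}" was not found and could not be updated.'
            )
        target.update(changes)
```

Logging the skipped component at INFO level keeps the information available without turning it into an error.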

Filename validation happens with default schema

Is it intentional behavior that the validate command validates the filename even when using the default schema? It seems counterintuitive, since this is a very special requirement that we probably shouldn't impose on every user of the tool.

I know filename validation can be "disabled" manually in a sense by providing a catch-all regex (i.e., .*) to the --filename-pattern option but IMO the default behavior should be either:

  • no validation at all, since the CycloneDX standard doesn't require any particular filename
  • or a warning (no error) if the filename doesn't match what CycloneDX recommends (i.e., bom.json or *.cdx.json), with an option to disable (e.g., --no-filename)
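
The second option could look roughly like this. The function name and the enabled switch (standing in for something like --no-filename) are hypothetical; the two accepted patterns are the ones named above:

```python
import re
from pathlib import PurePath
from typing import Optional

# Filenames recommended by CycloneDX according to this issue: bom.json or *.cdx.json
_RECOMMENDED = re.compile(r"bom\.json|.+\.cdx\.json")


def filename_warning(path: str, enabled: bool = True) -> Optional[str]:
    """Return a warning text for non-recommended filenames, else None.

    Hypothetical sketch: the result is meant to be logged as a warning
    and never to fail validation.
    """
    if not enabled:
        return None
    name = PurePath(path).name
    if _RECOMMENDED.fullmatch(name):
        return None
    return f'filename "{name}" does not match bom.json or *.cdx.json'
```

Returning the message instead of raising leaves it to the caller to decide how loudly to report it.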

Merge is not hierarchical

When writing the integration tests, I noticed a surprising behavior of the merge command. I'm not sure whether that's a bug or by design.

When merging two SBOMs, where the first SBOM contains the meta-component of the second SBOM as one of its sub-components, the components of the second SBOM are added to the first at the top level. They are not grouped under the sub-component.

For reference, the official CycloneDX CLI tool has the same behavior but it provides a command-line switch named --hierarchical to change that.

SBOM 1

flowchart 
  main_component --> sub_component_1
  main_component --> sub_component_2

SBOM 2

flowchart
  sub_component_1 --> dependency_1
  sub_component_1 --> dependency_2

Actual result

flowchart
  main_component --> sub_component_1
  main_component --> sub_component_2
  main_component --> dependency_1
  main_component --> dependency_2

Expected result

flowchart
  main_component --> sub_component_1
  main_component --> sub_component_2
  sub_component_1--> dependency_1
  sub_component_1--> dependency_2
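
For discussion, a minimal sketch of what a hierarchical mode might do with the components arrays. Matching the second SBOM's metadata component to a component of the first SBOM by bom-ref is an assumption, the function names are hypothetical, and dependencies and deduplication are ignored entirely:

```python
from typing import Any, Optional


def find_component(components: list[dict[str, Any]], bom_ref: str) -> Optional[dict[str, Any]]:
    """Search components recursively for the given bom-ref."""
    for component in components:
        if component.get("bom-ref") == bom_ref:
            return component
        found = find_component(component.get("components", []), bom_ref)
        if found is not None:
            return found
    return None


def merge_hierarchical(main: dict[str, Any], other: dict[str, Any]) -> None:
    """Nest other's components under the matching sub-component of main.

    Hypothetical sketch. Falls back to the top level if nothing matches.
    """
    ref = other.get("metadata", {}).get("component", {}).get("bom-ref")
    parent = find_component(main.get("components", []), ref) if ref else None
    target = parent if parent is not None else main
    target.setdefault("components", []).extend(other.get("components", []))
```

Applied to the two SBOMs above, dependency_1 and dependency_2 end up nested under sub_component_1 instead of next to it.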

validate-feature: rework dependencies as required

In some cases, e.g. completely self-written products, there are no third-party components. However, we still require the "dependencies"-field in general. Therefore we should rework the schema to only require the field, if the "components"-field is provided.
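
JSON Schema 2019-09 and later offer the dependentRequired keyword for this kind of conditional requirement. The fragment below is a sketch of how the rule could be expressed, not the project's actual schema; the small helper only mirrors the keyword's semantics so the behavior can be tried without a validator library:

```python
from typing import Any

# Sketch of a schema fragment: "dependencies" is required only when
# "components" is present. Not the project's actual schema.
SCHEMA_FRAGMENT: dict[str, Any] = {
    "type": "object",
    "dependentRequired": {"components": ["dependencies"]},
}


def missing_dependent_fields(
    sbom: dict[str, Any], fragment: dict[str, Any] = SCHEMA_FRAGMENT
) -> list[str]:
    """Mimic dependentRequired: list fields required by present trigger fields."""
    missing: list[str] = []
    for trigger, required in fragment["dependentRequired"].items():
        if trigger in sbom:
            missing.extend(field for field in required if field not in sbom)
    return missing
```

Note that the keyword reacts to the mere presence of "components", so an SBOM with an empty components array would still need a dependencies field.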

FileNotFoundError: cdxev/auxiliary/schema/spdx.schema.json

I did pip install cyclonedx-editor-validator then cdx-ev validate /tmp/sbom.json on Ubuntu 23.10 and get

Traceback (most recent call last):
  File "/home/davidl/.local/bin/cdx-ev", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/davidl/.local/lib/python3.11/site-packages/cdxev/__main__.py", line 40, in main
    return args.cmd_handler(args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidl/.local/lib/python3.11/site-packages/cdxev/__main__.py", line 604, in invoke_validate
    if validate_sbom(
       ^^^^^^^^^^^^^^
  File "/home/davidl/.local/lib/python3.11/site-packages/cdxev/validator/validate.py", line 41, in validate_sbom
    load_spdx_schema(), specification=DRAFT202012
    ^^^^^^^^^^^^^^^^^^
  File "/home/davidl/.local/lib/python3.11/site-packages/cdxev/validator/helper.py", line 49, in load_spdx_schema
    with open("cdxev/auxiliary/schema/spdx.schema.json", "r") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'cdxev/auxiliary/schema/spdx.schema.json'

The spdx.schema.json file is there in the site-packages directory though:

$ python --version
Python 3.11.6
$ pip --version
pip 23.0.1 from /home/davidl/.local/lib/python3.11/site-packages/pip (python 3.11)
$ python --version
Python 3.11.6
$ cdx-ev --version
0.9.1
$ find ~/.local/lib/ | grep spdx.schema
/home/davidl/.local/lib/python3.11/site-packages/cdxev/auxiliary/schema/spdx.schema.json

Workaround is to cd /home/davidl/.local/lib/python3.11/site-packages/ first :)
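
The traceback shows the schema being opened via a path relative to the current working directory, which explains why the workaround helps. A possible fix, sketched under the assumption that the directory layout matches the paths in the traceback (cdxev/validator/helper.py next to cdxev/auxiliary/schema/), is to resolve the path relative to the module file instead. The function name is hypothetical:

```python
from pathlib import Path


def spdx_schema_path(module_file: str) -> Path:
    """Build the schema path relative to a module inside cdxev/validator/.

    Hypothetical sketch: module_file would be __file__ of helper.py.
    """
    package_root = Path(module_file).parent.parent  # .../cdxev
    return package_root / "auxiliary" / "schema" / "spdx.schema.json"
```

Inside helper.py this would be called as spdx_schema_path(__file__). The standard library's importlib.resources would be another way to locate files shipped inside a package.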

Using cyclonedx-editor-validator in python script

I want to use cdx-ev from a python script.
Besides running CLI commands using subprocess.run, is it intended to import this package (i.e. from cyclonedx-editor-validator import *) and calling methods directly? If so, which methods would correspond to the CLI commands?

Add 'add_to_dependencies' option for merge command

When merging multiple SBOMs with a main SBOM, there should be an option that automatically adds the bom-refs of each metadata component from non-main SBOM files.

This is helpful when the other SBOMs are actual dependencies of the main SBOM. Without this option I don't really understand the use case of the merge command.

feature_request.zip
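
A sketch of what the option might do, assuming the bom-refs are meant to be added to the dependsOn list of the main SBOM's metadata component. That target is an interpretation of the request, and the function name is hypothetical:

```python
from typing import Any


def add_to_dependencies(main: dict[str, Any], others: list[dict[str, Any]]) -> None:
    """Make the main component depend on each merged SBOM's metadata component.

    Hypothetical sketch, not part of cdxev.
    """
    main_ref = main["metadata"]["component"]["bom-ref"]
    dependencies = main.setdefault("dependencies", [])
    # Reuse the main component's dependency entry or create one.
    entry = next((d for d in dependencies if d.get("ref") == main_ref), None)
    if entry is None:
        entry = {"ref": main_ref, "dependsOn": []}
        dependencies.append(entry)
    depends_on = entry.setdefault("dependsOn", [])
    for other in others:
        ref = other.get("metadata", {}).get("component", {}).get("bom-ref")
        # Skip SBOMs without a metadata bom-ref and avoid duplicates.
        if ref and ref not in depends_on:
            depends_on.append(ref)
```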

Add integration tests

We're already doing unit testing and there are some higher-level tests which test entire command modules, such as CommandIntegrationTestCase:

class CommandIntegrationTestCase(AmendTestCase):
    def test_compositions(self) -> None:
        run_amend(self.sbom_fixture)

I'd like to add some proper integration tests - or call them acceptance tests if you prefer - which test even higher-level functionality. They would invoke the main() function itself and verify that the program interprets inputs correctly and produces the correct output.

The need for this came to me when I started thinking about #7. The redesign of the tool is so extensive that very little of our code and existing tests will remain untouched. But if we must change the tests, we can't be sure there are no regressions in the tests themselves.
So we need tests that don't change every time the tool's internals get rewritten.

I'm thinking, we should have a set of test SBOMs (we can likely reuse much of what we already have for lower-level testing) and run it through a known good version of the tool to produce the expected output, which we also save to the repo.
Then, our test functions are comparatively simple: Set command-line arguments, run main method, compare actual output against expected output. In practice, it's a little more involved but that's the essence.

It is important that these tests don't need to cover every single command-line option. We don't want to reproduce the entire test coverage that we already have in unit tests on another level of abstraction.
For example, for amend I believe a single test with an SBOM that triggers all operations will do just fine.

What is the point of `merge-vex`?

The documentation isn't clear on this, so I'd like to ask what the merge-vex command is for.

The documentation simply states:

This command requires two input files, a SBOM and a VEX file that shell be merged. The VEX file needs to be compatible with the SBOM.

But what is a "VEX file"? That isn't an established term. Even if you google it, you'll find a VEX document only described in an abstract manner, as a set of requirements but not as a complete data format. At the very least, there are currently two implementations which fulfil these requirements and therefore could constitute a VEX file: CycloneDX and CSAF.

Now, if we wanted to merge a CSAF VEX document into the SBOM that would probably justify its own command. At the moment, the command only works for CycloneDX, though, so why do we need a separate command if we already have merge for merging two CycloneDX files?

The only thing the current implementation apparently does differently from the regular merge is check whether all vulnerabilities reference components contained in the SBOM (see also #155). If they don't the command fails silently and simply outputs the original SBOM.

Could you explain the need for a separate command for this in the documentation?

`build-public` shouldn't require a schema

The build-public command requires a schema-path option to be provided. The schema serves to identify components that should be deleted.

This should be made optional. There is the realistic use-case that someone would only want to delete properties in the internal namespace without deleting any components.

Should `amend` set `compositions` to `unknown` rather than `incomplete`?

As of now, the amend command creates a compositions entry with .aggregate == "incomplete".

The stated goal of this is to explicitly disclaim any completeness of the provided information in the interest of revealing known unknowns. Shouldn't the value then rather be unknown, which expresses exactly that? "Incomplete" means: this SBOM is known to be incomplete, which it might not actually be. "Unknown" only says: we don't guarantee completeness, which seems to be exactly our intent.

`merge-vex` reference check doesn't take nested components into account

The following piece of code doesn't take nested components into account:

def get_refs_from_sbom(sbom: dict) -> list:
    """
    Collects the refs of a sbom file into a list
    Parameters
    ----------
    sbom: dict
        A sbom dictionary
    Returns
    -------
    list:
        List with the bom-refs used in the sbom
    """
    references = [sbom.get("metadata", {}).get("component", {}).get("bom-ref", "")]
    for components in sbom.get("components", []):
        references.append(components.get("bom-ref", ""))
    return references

That means, if your vulnerabilities reference a nested component, the merge will fail.
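
A recursive variant could collect the refs of nested components as well. This is a sketch with hypothetical names, not a drop-in patch. Unlike the function above, it skips components without a bom-ref instead of adding empty strings:

```python
from typing import Any, Iterator


def _walk(components: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every component, descending into nested "components" arrays."""
    for component in components:
        yield component
        yield from _walk(component.get("components", []))


def collect_bom_refs(sbom: dict[str, Any]) -> list[str]:
    """Collect the bom-refs of the metadata component and all components."""
    refs: list[str] = []
    meta_ref = sbom.get("metadata", {}).get("component", {}).get("bom-ref")
    if meta_ref:
        refs.append(meta_ref)
    for component in _walk(sbom.get("components", [])):
        ref = component.get("bom-ref")
        if ref:
            refs.append(ref)
    return refs
```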
