
specification's Introduction

Docs

Specification and other related documents.

  • You can read the current version of the specification here.
  • The latest stable version (1.0) is here.

Generating PDF

Requirements

  1. git
  2. Pandoc with LaTeX (the LaTeX installation must include the enumitem package)
  3. GNU Make

The documentation can be generated into a printable PDF by compiling the markdown file.

make pdf

Examples and Demos

There are a couple of repositories within this organization that you can use to play around and better understand in-toto. Here's a list of them along with a brief description.

  • demo: This is a very basic dummy supply chain example to help you understand the in-toto python toolchain. We recommend getting started here.
  • kubectl-in-toto: Inside this repository, you will find a demo of a kubectl in-toto plugin that scans containers in your Kubernetes deployment against in-toto metadata.
  • demo OpenSUSE: This repository uses the OpenSUSE build toolchain to exemplify how in-toto could be integrated inside of OpenSUSE-based distros.
  • totoify-grafeas: This repository provides an interface that converts standard in-toto links into Grafeas occurrences, and back for use in an in-toto verification workflow.
  • layout-web-tool: The layout-web-tool is a simple Flask-based web app that walks users through creating an in-toto layout.

Other informative repositories

Along with this Docs repository, the in-toto enhancements (ITE) repository contains information about features, recommendations, and other extensions that are not part of the core specification.

specification's People

Contributors

adityasaky, santiagotorres, lukpueh, shibumi, justincappos, sudo-bmitch, titanous, marcelamelara, nmiyake, awwad, chaosinthecrd, bagnaram


specification's Issues

Discuss the overlap and comparison with IETF's RATS framework

Hi folks, I work on confidential VMs and am trying to get an internally built binary that we run in production to have a fully transparent verifiable remote attestation that matches binary digest all the way back to a public source repository at some commit hash. That's what I really like about in-toto and SLSA.

What I'm trying to understand is how to integrate the software attestation of in-toto with a whole target environment in the way that the Internet Engineering Task Force (IETF) Remote ATtestation procedureS (RATS) framework is seeking to do. Where RATS refers to integrity manifests with Reference Values and Endorsements, I see a lot of connections to in-toto, and opportunities for in-toto's richness to improve the attestation framework as a combination of all of these. Furthermore, claims in an entity attestation token (EAT) appear to be synonymous with in-toto's notion of a predicate. The difference is really the IANA number assignment to claims, whereas in-toto has claims defined by URI to the schema.

RATS synthesizes multiple software and hardware integrity manifests with a trusted policy evaluator that issues attestation tokens (EATs); this is what the working group's "reference integrity manifests" draft, CoRIM (https://datatracker.ietf.org/doc/html/draft-ietf-rats-corim-03), is seeking to do.

It seems that where CoRIM has tags for software identity that go so far as file digests and version number claims, in-toto's SLSA predicate has more fine-grained evidence that links Reference Values to software identities where software identity isn't a name and version but the culmination of all of the code, how the code is produced and securely stored and built.

Do you see this standard merging with RATS's favored formats of EAT (JWT or CWT) and CoRIM, or existing separately as an alternative carrier format within the RATS framework?

pattern matching syntax for artifact rules is undefined

[from the X41 specification and source code audit]

Section 4.3.3 of the in-toto specification specifies a "pattern" for the artifact rules, but only describes them as "bash-style wildcards" and does not further define the pattern matching syntax.

The Python implementation of in-toto uses the fnmatch module for pattern matching, while the Go implementation uses a customized version of the filepath.Match function.

The Python and Go functions differ in the way patterns are applied, for example regarding escaping and negated sequence matching.
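The divergence is easy to demonstrate from the Python side. This is a minimal sketch of two behavioral differences between Python's fnmatch (bash-style) and Go's filepath.Match, not an exhaustive list:

```python
import fnmatch

# Negated character classes use "[!...]" in fnmatch,
# whereas Go's filepath.Match uses "[^...]".
assert fnmatch.fnmatchcase("bar.py", "[!f]ar.py")
assert not fnmatch.fnmatchcase("far.py", "[!f]ar.py")

# "*" in fnmatch matches across path separators; in Go's filepath.Match,
# "*" stops at "/", so the same pattern matches different artifact sets.
assert fnmatch.fnmatchcase("dist/foo.tar.gz", "dist*")
```

The same layout pattern can therefore accept an artifact under one implementation and reject it under the other.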

Solution Advice
X41 recommends describing the pattern syntax in the specification, or referring to a specific version of a third-party pattern syntax definition, such as IEEE Std 1003.1-2017, 2.13.1.

The Python and Go implementations should implement the same pattern matching syntax.

Document "artifact rule" rationale

The recent discussion about whether and why we decided to replace the implicit DISALLOW * with an implicit ALLOW * when verifying expected materials and products makes it clear that we need to document these decisions more thoroughly.

in-toto/in-toto#43 shows how the current design of MATCH rules (only) evolved.

@SantiagoTorres suggests creating a wiki that summarizes such on- and offline discussions and provides additional information about the rationale behind certain design decisions that would go beyond the scope of the specification.

"other_headers" field in "signatures" field in reference implementation is not part of the spec

The spec indicates that signed files should have the format:

{
  "signed" : "<ROLE>",
  "signatures" : [
      { "keyid" : "<KEYID>",
        "sig" : "<SIGNATURE>" }, 
  "..."
  ]
}

However, when I use the reference implementation to sign a link file using a GPG key, the signature object also contains an "other_headers" field:

{
 "signatures": [
  {
   "keyid": "aad6ec15d80aca160e1a0e7041fc235573127eb3",
   "other_headers": "04000108001d162104aad6ec15d80aca160e1a0e7041fc235573127eb30502615bdc17",
   "signature": "5be45bc0d63301d487d3eac7605d0440aba0017fcaef987066d1451ed02e696b53ffa6404bb97698becc69ec4cbe2efc58e4148151aa48c2d0d6b3f5544672cf4d596c4ac0bdff6350e4a8c4a034fae3286e123e3e934c2c14bace75126d8b9e2fea08055a21674875dac6284e1c8eba142cab8f0b14b2547f50135745dc5b051b2b7a59c1e5578b68917ca2d3c061235c0a3c93c649df4140e0acda9b779b2fefd0e767e376afb67c7bde86fd904e75d99efef664ed1561471645b3bb642adf1cf2707819e9246cb71410445e00cc130d237e8904710266411d3bc166ffb3c90407f54d52cc3f5e514bfc8823053a681136a230983c3c2bf38371b456cb1d1eae322119e3a697e0cfed27504eb55bcf8bae8cb33c88ef155477aa97b523834a8efe1efc0fb12de6a32b84c4eb4230682c5554d34e6fd54de6928c675a98df633a1ed17abb77958e2da0b5230c57f7e4c817b42aa86d26a6583fdbb2c96e0d6c46e7bafe02b1d600a146476b7434bd511741414e40309d6d99715f75b1644bb04470311520b76114a4cf75abceaf805f3b2d8ce2b0cca56dbba231116b83b0aabf920b3602e06c334de13ceccb8b5ef9c0f3fd1b85a2c07126d513d9c45ccc54d5036d4a31a4f4299f8081811e83fe6d50aeb85c09959e43f46ce8d9053da9dba330f57faacbadeaa5bc1533fc03bcd8d3f1b24f66ad403691d6118a43df7a74"
  }
 ]
}

Verification fails if this field is removed, so it is clearly important/required in some scenarios, but the spec does not provide any details on this. It also doesn't indicate that a signature object can have other arbitrary/opaque fields.

It would be helpful if the spec could be updated/fixed to properly explain this.

PDF Generation Instructions

The documentation PDF should be generatable from documented directions or some sort of PDF creation library (e.g., LaTeX or a markdown-to-PDF tool). The PDF could also be stored as a GitHub artifact rather than in the repository itself, since Git isn't optimized for binaries.

Ambiguous rule set: `ALLOW foo` while `DISALLOW *` is also a rule

The following example is used in the layout creation document (see in-toto/in-toto#182):

step_update.add_product_rule_from_string("ALLOW demo-project/foo.py")
step_update.add_product_rule_from_string("DISALLOW *")

These two rules together seem to contradict each other. I personally find it easier to grok DISALLOW * EXCEPT "demo-project/foo.py" or ALLOW * EXCEPT "foo". However, I'm not familiar with in-toto's rules and why things are designed this way.

missing key management

[based on the X41 specification and source code audit]

in-toto does not deal with key management and relies on additional channels to distribute and verify the generated keys in a secure manner. Therefore, key expiration and revocation are outside of the project's responsibility. However, the keys in the specification do not describe fields that can support this. This could be improved upon.

Solution Advice

X41 recommends using keys and certificates that have key management infrastructure in place instead of generating a new key format. These might include PGP keys and S/MIME certificates.

expected python in-toto to consume 'REQUIRE' layout item.

https://github.com/in-toto/docs/blob/master/in-toto-spec.md#4331-rule-processing

Artifact rules reside in the "expected_products" and "expected_materials" fields of a step and are applied sequentially on a queue of "materials" or "products" from the step's corresponding link metadata. 
They operate in a similar fashion as firewall rules do. This means if an artifact is successfully consumed by a rule, it is removed from the queue and cannot be consumed by subsequent rules. 

is inconsistent with the behavior of the reference implementation:
https://github.com/in-toto/in-toto/blob/develop/in_toto/verifylib.py#L1053

# Initialize empty consumed set as fallback for rules that do not consume
# artifacts. All rules except "disallow" and "require" consume artifacts.

In an 'expected_materials', I had a REQUIRE followed by a DISALLOW * and was confused that the disallow kept triggering. We eventually decided to implement:
["REQUIRE", "product/index.html"], ["ALLOW", "product/index.html"],
because ALLOW consumes the item. The above seems awkward; I think REQUIRE should consume the item. Let me know if I should open an issue with python in-toto, or if this documentation is incorrect.
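The behavior being described can be sketched as a queue, as in the spec's firewall analogy. This is a hypothetical illustration of the semantics, not the reference implementation's verifylib logic: ALLOW consumes matching artifacts, while REQUIRE and DISALLOW only inspect what is still on the queue.

```python
import fnmatch

def apply_rules(rules, artifacts):
    queue = set(artifacts)
    for rule, pattern in rules:
        matched = {a for a in queue if fnmatch.fnmatchcase(a, pattern)}
        if rule == "REQUIRE" and pattern not in queue:
            raise RuntimeError(f"REQUIRE failed: {pattern} not in queue")
        elif rule == "ALLOW":
            queue -= matched  # consumed: later rules never see these
        elif rule == "DISALLOW" and matched:
            raise RuntimeError(f"DISALLOW {pattern} triggered on {sorted(matched)}")
    return queue

# REQUIRE does not consume, so a following DISALLOW * still fires:
try:
    apply_rules([("REQUIRE", "product/index.html"), ("DISALLOW", "*")],
                ["product/index.html"])
except RuntimeError:
    pass
# Inserting ALLOW consumes the artifact before DISALLOW runs:
assert apply_rules([("REQUIRE", "product/index.html"),
                    ("ALLOW", "product/index.html"),
                    ("DISALLOW", "*")],
                   ["product/index.html"]) == set()
```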

Thanks

Differences between the in-toto implementations and the specification

While investigating in-toto/in-toto-golang#124 I think I have found a bug in our specification.

The question is: What happens when the "command" section in the link file is empty? (Vice versa with "expected_command" for the layout file).

In our Golang implementation such case looks like this right now:

{
  "signed": {
    "_type": "link",
    "name": "write-code",
    "materials": {},
    "products": {
      "foo.py": {
        "sha256": "74dc3727c6e89308b39e4dfedf787e37841198b1fa165a27c013544a60502549"
      }
    },
    "byproducts": {},
    "command": [],
    "environment": {}
  },
  "signatures": [ ]
}

Our python implementation has the following link:

 
{
 "signatures": [],
 "signed": {
  "_type": "link",
  "byproducts": {},
  "command": [],
  "environment": {},
  "materials": {},
  "name": "write-code",
  "products": {
   "foo.py": {
    "sha256": "74dc3727c6e89308b39e4dfedf787e37841198b1fa165a27c013544a60502549"
   }
  }
 }
}

These two are pretty much the same (thank god, that's good!).

Our in-toto specification differs from our implementations, though. In the specification we define the field
"command" as a string, not as an array of strings. See: https://github.com/in-toto/docs/blob/master/in-toto-spec.md#44-file-formats-namekeyid-prefixlink

Is it safe to fix this in the specification, or do we need to change our implementations? We have not reached in-toto 1.0 yet.

What's an SCC?

The spec refers to an SCC. Is that intended to be SSC? That doesn't quite fit the sentence either; it seems to refer to a tool or environment, rather than the SSC, which is defined as a series of steps.

Same thing showed up in 0.9 too.

Tag v1.0 of the spec

Given that ITE-5 and ITE-6 intend to significantly change the spec, it is useful to be able to link to the current revision. The v0.9 tag is too old - it does not yet include "environment".

Could you bump up the revision of docs/in-toto-spec.md and create a git tag? I'm guessing this would be called v1.0?

in-toto-spec.md: Missing product in 5.3.2 example

Section 5.3.2 states that Alice writes test.py; however, the root layout is missing the specification that Alice creates test.py.

The step:

       {"_name": "write-code",
        "threshold": 1,
        "expected_materials": [ ],
        "expected_products": [
          ["CREATE", "foo.py"]
        ],
        "pubkeys": [
          "<ALICES_KEYID>"
        ],
        "expected_command": "vi"
       },

should be

       {"_name": "write-code",
        "threshold": 1,
        "expected_materials": [ ],
        "expected_products": [
          ["CREATE", "foo.py"],
          ["CREATE", "test.py"]
        ],
        "pubkeys": [
          "<ALICES_KEYID>"
        ],
        "expected_command": "vi"
       },

Unless I have a misunderstanding (this is my first time reading through the document).

Pseudocode for MATCH rule indicates that source and destination patterns can be distinct

The pseudocode currently reads:

source_artifacts_filtered = filter(rule.source_prefix + rule.source_pattern,
                                   source_materials_or_products_set)

destination_artifacts_filtered = \
    filter(rule.destination_prefix + rule.destination_pattern,
             destination_materials_or_products_set)

This should become:

source_artifacts_filtered = filter(rule.source_prefix + rule.pattern,
                                   source_materials_or_products_set)

destination_artifacts_filtered = \
    filter(rule.destination_prefix + rule.pattern,
             destination_materials_or_products_set)

The in-toto specification currently allows for IN clauses to allow for relocation, but not renaming of artifacts as shown by the statement:

The "IN <prefix>" clauses are optional, and they are used to match products and materials whose path differs from the one presented in the destination step. This is the case for steps that relocate files as part of their tasks. For example "MATCH foo IN lib WITH PRODUCT IN build/lib FROM compilation" will ensure that the file "lib/foo" matches "build/lib/foo" from the compilation step.

"keys" JSON format described in spec does not match reference implementation output

Overview

Section 4.2 of the spec states that:

The KEYID of a key is the hexadecimal encoding of the SHA-256 hash of the canonical JSON form of the key, where the "private" object key is excluded.

The spec also describes that the currently supported key types are "rsa", "ed25519", and "ecdsa", and describes their formats -- for example, for RSA, it states "PUBLIC and PRIVATE are in PEM format and are strings. All RSA keys must be at least 2048 bits" and for ed25519 it states "PUBLIC and PRIVATE are both 32-byte (256-bit) strings."

In order for different implementations of the spec to produce the same output, it's necessary for the JSON form of the keys to match. However, the reference implementation key objects contain fields not specified in the spec and also support key formats not described in the spec (for example, GPG keys).

It would help to clarify which scenario best describes the desired state:

  • The spec is general, and different implementations aren't necessarily expected to interoperate with each other -- the spec describes the concepts of things like "KEYID" and the "KEY" structure, but different implementations may derive/represent them in their own way within their own ecosystem
  • The spec should prescribe a particular set of supported keys and formats such that all in-toto implementations should generate the same output (in this case, spec and reference implementation should be in sync for GPG keys)
  • Some other state?

I believe this is also relevant to the DSSE work, as the "key ID" concept exists there as well and the manner in which that MAY, SHOULD, or MUST line up with the notion of key IDs in the spec is related to this.

Details

In section 4.2 of the spec, it states that all keys have the format:

 { "keytype" : "<KEYTYPE>",
    "scheme" : "<SCHEME>",
    "keyval" : "<KEYVAL>" }

However, keys produced by the reference implementation also include values such as "keyid" and "keyid_hash_algorithms":

  "keys": {
   "2f89b9272acfc8f4a0a0f094d789fdb0ba798b0fe41f2f5f417c12f0085ff498": {
    "keyid": "2f89b9272acfc8f4a0a0f094d789fdb0ba798b0fe41f2f5f417c12f0085ff498",
    "keyid_hash_algorithms": [
     "sha256",
     "sha512"
    ],
    "keytype": "rsa",
    "keyval": {
     "private": "",
     "public": "-----BEGIN PUBLIC KEY-----\nMIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAzgLBsMFSgwBiWTBmVsyW\n5KbJwLFSodAzdUhU2Bq6SdRz/W6UOBGdojZXibxupjRtAaEQW/eXDe+1CbKg6ENZ\nGt2D9HGFCQZgQS8ONgNDQGiNxgApMA0T21AaUhru0vEofzdN1DfEF4CAGv5AkcgK\nsalhTyONervFIjFEdXGelFZ7dVMV3Pp5WkZPG0jFQWjnmDZhUrtSxEtqbVghc3kK\nAUj9Ll/3jyi2wS92Z1j5ueN8X62hWX2xBqQ6nViOMzdujkoiYCRSwuMLRqzW2CbT\nL8hF1+S5KWKFzxl5sCVfpPe7V5HkgEHjwCILXTbCn2fCMKlaSbJ/MG2lW7qSY2Ro\nwVXWkp1wDrsJ6Ii9f2dErv9vJeOVZeO9DsooQ5EuzLCfQLEU5mn7ul7bU7rFsb8J\nxYOeudkNBatnNCgVMAkmDPiNA7E33bmL5ARRwU0iZicsqLQR32pmwdap8PjofxqQ\nk7Gtvz/iYzaLrZv33cFWWTsEOqK1gKqigSqgW9T26wO9AgMBAAE=\n-----END PUBLIC KEY-----"
    },
    "scheme": "rsassa-pss-sha256"
   },
...

The spec also notes:

We define three key types at present: "rsa", "ed25519", and "ecdsa".

However, the reference implementation has support for adding GPG keys to the layout, and when doing so the key type is noted as "rsa" but contains many more fields than documented in the spec:

   "aad6ec15d80aca160e1a0e7041fc235573127eb3": {
    "creation_time": 1633371753,
    "hashes": [
     "pgp+SHA2"
    ],
    "keyid": "aad6ec15d80aca160e1a0e7041fc235573127eb3",
    "keyval": {
     "private": "",
     "public": {
      "e": "010001",
      "n": "dc4bc6109e950fe1e68f69421bacf4786d91703be656a5cbc0d0abfe45763a2a86109dda3aed11da0b3d14d4d01304618d8b11c77ae22f1c52f18f4a637c8564211041338089c249b22b3f20c45ed9f6e0ed780acc2b6adb39f283a18e28bc7cc28a11fcee90a6aad765d6c7bca0d219f51b16fcfa65b922dd6cfdd78f7feff5366d07573a5ad8b6c314bbe9936586d3ee9edf49ce33a71ff26a62bc8d48484c6bdfe803017d9dc73ac4fd3bfce3ec4be6ab49781623e9e158e16a2b3d13701590e8dbb901fa4497233c858c1503c06fd963276f952e6f452536be37f99f3e13910f75fbbbae2a01307512961f76e0786c8877c047d7393db97d8a9175e2b3de90cd368e7907f9201064da1abd95fe766c61c972833f344a0e1d40af5ef14550d6185042d375e1e37ea25e0a042d3fa9f94ad631d4387b0927d7f66a10e464a5742695437e4c57186786559c78f790e0599a7f7bb5ae9466904f56459e0a91b1a66b9deb1c1d4a551c931734be22e850f6e3b628959c588c432dd504865503bdb286b368c9e6c4dfa46ead3edb5dac0675e736806efe9305974c69377ebf7b731763d01ab5d1738964ad3c6a388dd394cd80b318f09fefb3d44f10c06e2823221eee7938c63e247111c533d7a274d282845b9a69e062b6c6dedb87dcdaa10d7e8b8fe0766b3b323bf81497b5563051cf4e338a9d9b923b1cb3fd7d7408b89917"
     }
    },
    "method": "pgp+rsa-pkcsv1.5",
    "subkeys": {
     "236add4ccccd28f775ba491487649e16938531ef": {
      "creation_time": 1633371753,
      "hashes": [
       "pgp+SHA2"
      ],
      "keyid": "236add4ccccd28f775ba491487649e16938531ef",
      "keyval": {
       "private": "",
       "public": {
        "e": "010001",
        "n": "e0f7c91be8b8c0fcfabf322e347219b9866fd49f359f47f58c99b83dc0cd9a77a7c6780326b0dd2be404c2061e6f6e61169b8095bf5c7027209c35ec004e69b97269fdc123720be23772dc830121bbfcab3bc82cbe31bd59c163e1c74ff1f6f8a31b7f137253cb9bd27c3a31f167d26708e1775d641a4989272fcc74e7cc1d7a4d581cd2ae8837a819685026a40e635b0487bc74dfade968f66768b0d87c1677bb737d3642e5908ac643fb61b1687ba7c4ab117e00057ae84b2abca0351cbb55ff25752edb7aa5e955236afbb9350cfc894efccd471450214cefbd4a1a04af06c114f77bd17b4c95b49723aa0dde0bcfa2f9579498f538f7794f6a9a51e5c7640f15e687c16bedcc70767491b10765cfc7eae8b212e010119cef2b0ffee9d1eddd5e8d39d46fbad04941db4f5ae4b1a1a4a14a9e34108cde01ef0502870113aef9c69c7fa47c3080ecb79491466640d8702e3a979f344d55df8ee0613f23db4f51f17ae50f6488fa461125641778f46e273f5e667826043f3636b2c4d4f892a480229098f9df47d500f987f0976baf902234002bb22c6743be335a32076e8aa9b425e9a8fe2ddcc40784ce06baa2cc8374600441d1340675ceda1df03f6837afc0bcb70facf2329ef278a1b7684904411c2140414ae151c8c712df03bdd23862ed5f96507f3dcb9f0cacb8955a520acc703248aa40cdc8825cabc48ebd900ce3"
       }
      },
      "method": "pgp+rsa-pkcsv1.5",
      "type": "rsa",
      "validity_period": 94608000
     },
     "74201f27437bc868067ed70a2443912b98592890": {
      "creation_time": 1633371753,
      "hashes": [
       "pgp+SHA2"
      ],
      "keyid": "74201f27437bc868067ed70a2443912b98592890",
      "keyval": {
       "private": "",
       "public": {
        "e": "010001",
        "n": "ce3feeda9a87e1abe4bdb822e79b4cd5de931f3add989243e711ff807908bc8d7a5653746f38e81747507ea778ac46fb503dc42ad1b2f47744a53ab577df2e6da1c0ff9476c4c28bcf13997aa46e8bf989147fa53b85f8b71dfb24476ecb9b2ae043619831692249cd97704e7a2a50338f422ee7dd7cc9489ebc52c2a3cb014e7d2bb6dc00b028f931799cd6ad6380951b1d6090af5439e43f1bfc8f318f8ba1302437ee22d8696e877332b427d68fa443e50cdc87348217a568c2753ce5c777460ab351bbda21569f47603c273678f2ac59ebffa32b0dee1a21509e0d6983bec0b87c9db93f4031b9f733ec53a44bc8436e8e98ae25cb88149c6e9645496a7df9c606642a4958861dd2a71fe04ff9d7296ada176d08b38d63a601d098d68eed44dad6a736b1839e487e716fb5c1d8f93faafe13002f7128bd003e3dbbf03b69949485b63627f9aa4fd4f8316f77a53af6bf83c52ea1c7f96577c547b3b9de53634187f136f8fca9b8aef31338a0d6ec3417734f80205ed08f5027c47b29c91556be1a0ae2d660b6532adde67034e97eac3f0b00dbe0f86ccef30ba460b4a1c31fc7dccabe8a193c6d55995e5e07562b809992b74f2657cd940d86eceb78c2daf3a7fad3d80b070488c8955380853556ecb16d9aa586198d7bfd0e4090d0c999f01e8eb19ef1731791f97f875316394883ce2ef843d0d54632bfd873b09f4979"
       }
      },
      "method": "pgp+rsa-pkcsv1.5",
      "type": "rsa",
      "validity_period": 94608000
     }
    },
    "type": "rsa",
    "validity_period": 94608000
   }

(The above comes from a portion of the layout file produced by adding the line layout.add_functionary_key_from_gpg_keyid(gpg_keyid="AAD6EC15D80ACA160E1A0E7041FC235573127EB3") to https://github.com/in-toto/demo/blob/master/owner_alice/create_layout.py#L87 and running the layout creation logic)

This is relevant because the spec notes that:

The KEYID of a key is the hexadecimal encoding of the SHA-256 hash of the canonical JSON form of the key, where the "private" object key is excluded.

Thus, in order for "KEYID" to be determined in a canonical manner, the JSON form of the key must also be consistent/canonical.
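The KEYID derivation the spec describes can be sketched as follows. This is an approximation only: canonical JSON is emulated here with sorted keys and compact separators, which may differ from the canonicalization the reference implementation actually uses.

```python
import hashlib
import json

def keyid(key: dict) -> str:
    # Exclude "private" at the top level and inside "keyval",
    # then hash the (approximated) canonical JSON form.
    public = {k: v for k, v in key.items() if k != "private"}
    if isinstance(public.get("keyval"), dict):
        public["keyval"] = {k: v for k, v in public["keyval"].items()
                            if k != "private"}
    canonical = json.dumps(public, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = {"keytype": "ed25519", "scheme": "ed25519",
       "keyval": {"public": "8c93...", "private": "4ced..."}}
# Excluding "private" means the public and private halves of a key pair
# should yield the same KEYID:
assert keyid(key) == keyid({"keytype": "ed25519", "scheme": "ed25519",
                            "keyval": {"public": "8c93..."}})
```

This makes the interoperability concern concrete: any extra fields such as "keyid_hash_algorithms" change the canonical JSON and therefore the resulting KEYID.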

Verify target files of final product without inspection

Description of issue or feature request:
Target files of the final product, i.e. the files whose supply chain is being verified, require a final link in order to associate the actual files on the client with the artifacts from earlier steps in the supply chain.
Currently, it needs an inspection in order to generate that link. However, inspections seem unsuited for such a crucial task, among other reasons because:

  • originally they were conceived as optional plugins for additional supply chain verification
  • they require the client to execute arbitrary commands at installation time, which makes the trust in the project owner more important
  • the commands must be present on the client system (see in-toto/in-toto#109)

Hence, I propose generating that final link without the use of inspections.

Current behavior:

Example (unpack inspection)
The following inspection has been used widely (see examples in specs) to link the target files on the client to the last step of the supply chain, where the last step created a package foo.tar.gz from a file foo.

run: tar xzf foo.tar.gz
expected_materials: MATCH foo.tar.gz WITH PRODUCTS FROM last-step
expected_products: MATCH foo WITH MATERIALS FROM last-step

This inspection is meaningful as it additionally verifies that the last step was performed correctly, i.e. the foo that went into foo.tar.gz is the same that came out of it on the client side. However, those are two different verifications, and if the second is not required, the inspection makes less sense (see next example).

Example (dummy inspection)
In some cases unpacking the target files of the final product might not be possible or desired. In order to still generate the required link, we have to work around by adding an inspection with a command that "does nothing".

run: /usr/bin/true
expected_materials: MATCH bar WITH PRODUCTS FROM last-step

Expected behavior:
Instead we could provide an expected_final_product field that lists artifact rules for the target files on the client system, akin to expected_materials and expected_products on steps and inspections. These rules should be verified after verifying step artifact rules and before running inspection commands.
In order to get the hashes of the target files on the client system, the in_toto_verify command should receive a list of target_files, for which it will record hashes, akin to the materials and products arguments, passed to the in_toto_run command.
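A hypothetical layout fragment for this proposal might look like the following (the expected_final_product field name comes from the proposal above; the surrounding shape is illustrative only):

```json
{
  "_type": "layout",
  "steps": [],
  "inspect": [],
  "expected_final_product": [
    ["MATCH", "bar", "WITH", "PRODUCTS", "FROM", "last-step"]
  ]
}
```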

Update provided PDF

The PDF is way older than the Markdown spec file, please keep them in sync.

Inconsistencies between the in-toto specification and reference implementations: to `_name` or to `name`, that is the question

key format specification is inconsistent

Hi there,
Right now our specification specifies the following key format:

  { "keytype" : "<KEYTYPE>",
    "scheme" : "<SCHEME>",
    "keyval" : "<KEYVAL>" }

Source: https://github.com/in-toto/docs/blob/master/in-toto-spec.md#42-file-formats-general-principles

However, if you generate a new ed25519 key with in-toto-keygen.py, you will end up with the following private and public keys.
private:

{
  "keytype": "ed25519",
  "scheme": "ed25519",
  "keyid": "d7c0baabc90b7bf218aa67461ec0c3c7f13a8a5d8552859c8fafe41588be01cf",
  "keyid_hash_algorithms": ["sha256", "sha512"],
  "keyval": {
    "public": "8c93f633f2378cc64dd7cbb0ed35eac59e1f28065f90cbbddb59878436fec037",
    "private": "4cedf4d3369f8c83af472d0d329aedaa86265b74efb74b708f6a1ed23f290162"}
}

public:

{"keytype": "ed25519",
"scheme": "ed25519",
"keyid_hash_algorithms": ["sha256", "sha512"],
"keyval": {"public": "8c93f633f2378cc64dd7cbb0ed35eac59e1f28065f90cbbddb59878436fec037"}}

The keyid is missing from the public key. Is this intentional?

JSON parsing differences can make the same signature verify for different payloads

[from X41 specification and source code audit]

in-toto uses JSON for many of the security-relevant parts, which poses a risk due to the relatively lax definition of JSON and the factual differences of JSON parsers and serializers.

The Go implementation ignores the case of key names when parsing, but does not accept extraneous keys whereas the Python implementation is case sensitive, but ignores extraneous keys.

In either case, the JSON document can be tampered with while the cryptographic signature is still considered valid.

When the contents are the same, for instance if the uppercased "Run" key also contains a bare whoami command, then the document is currently accepted as valid in both languages. This means that the following JSON documents are all considered valid without re-signing in both languages:

$ cat variant1.json
{
  "run": ["whoami"]
}

$ cat variant2.json
{
  "run": ["whoami"],
  "Run": ["whoami"]
}

$ cat variant3.json
{
  "Run": ["whoami"],
  "ru\u006E": ["whoami"]
}

Each of the three variants verifies successfully using an identical signature in both the Python and Go implementations of in-toto. The Go implementation produces errors when there are unexpected (extraneous) keys, but since it considers differently-cased variants to be identical, the alterations are accepted. The Python implementation does not error out when an extraneous key is added, such as a differently-cased one.

One might imagine a scenario where in-toto verifies the signature of a file and then passes that file to a different program, which then parses different data (e.g., a command to execute) than the one verified by in-toto before.

"signed": {
  "_type": "layout",
  "expires": "2023-03-15T09:18:03Z",
  "inspect":
  [
    {
      "run": [
        "whoami"
      ],
      "_type": "inspection",
      "expected_materials": [],
      "expected_products": [],
      "name": "runcmd",
      "Run": [
        "rm",
        "/"
      ]
    }
  ],
  "keys": {},
  "readme": "",
  "steps": []
}

The payload shown in the snippet above would execute whoami in Python but rm / in Go because the case insensitive handling of keys in Go results in it overwriting the formerly benign command. Both languages have a policy of considering the last key with the same name as the valid one, but Go ignores case differences. In another attack scenario, the signatory might be tricked into accepting the layout file contents because they look good, while in reality a secondary key located further down in the document specifies what is truly being executed, as also demonstrated in the snippet above. The overwriting can be made less obvious by adding more information in between, obfuscating the "Run" key such as by using hex encoding, and obfuscating the malicious command's contents.

ITE-5 proposes the use of the Dead Simple Signing Envelope (DSSE), where data is serialized before it is signed. This allows verifying the signature before deserialization and prevents modifications to the serialized data after signing. However, DSSE uses JSON as a container format and is thus vulnerable to some extent when parsing the container: a DSSE file might contain two pairs of payloads and signatures that are read inconsistently depending on the implementation. If the DSSE payload also consists of serialized JSON (as currently proposed), DSSE would not prevent the second attack scenario described, where an attacker tricks the signatory.

Solution Advice

To minimize the attack vector, X41 recommends using a serialization format that is more strictly defined, has fewer implementation differences, and relies on the data structure being defined on the serializing and deserializing ends rather than within the serialization. Alternatively, X41 recommends implementing stricter JSON document verification in both languages in addition to DSSE, such as rejecting documents with duplicate keys, ill-cased keys, and extraneous keys. However, it is important to point out that such safety measures are outside the usual handling of JSON documents and may not be followed by (or even be reasonably available to) all implementers of the in-toto specification.
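As an illustration of this stricter verification, Python's json module exposes object_pairs_hook, which can reject exact duplicate keys that the default parser would silently collapse. This is a sketch of the mitigation idea, not part of either reference implementation:

```python
import json

# Python's json silently keeps the LAST occurrence of a duplicated key:
doc = '{"run": ["whoami"], "run": ["rm", "/"]}'
assert json.loads(doc) == {"run": ["rm", "/"]}

# Keys that differ only in case are distinct to Python (case sensitive),
# so an extraneous "Run" key survives parsing without any error:
doc2 = '{"run": ["whoami"], "Run": ["rm", "/"]}'
parsed = json.loads(doc2)
assert parsed["run"] == ["whoami"] and parsed["Run"] == ["rm", "/"]

# A stricter parser can reject exact duplicates via object_pairs_hook:
def reject_duplicates(pairs):
    keys = [k for k, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate JSON keys")
    return dict(pairs)

try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    raise AssertionError("should have rejected duplicate keys")
except ValueError:
    pass
```

Note that this hook alone does not catch the case-folding divergence; a check for case-insensitive collisions would need to normalize keys first.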

Support ecdsa key-type for commands

Currently, commands like in-toto-sign, and in-toto-run support a command line option named --key-type:

 -t {rsa,ed25519}, --key-type {rsa,ed25519}

Would it be possible to add support for ECDSA key types?

The reason for asking is that it would be nice to be able to use keys generated by cosign, but currently it looks like cosign only generates ECDSA-P256 keys and uses SHA256 hashes.

I'm not sure if this helps or not, but it looks like securesystemslib has support for KEY_TYPE_ECDSA.

Address remaining review comments from @trishankkarthik

@SantiagoTorres partially addressed an out-of-band specification review from @trishankkarthik in #6. As per the PR description (Oct 2017) the following items remained:

  • There are too many terms upfront
  • Sections 2-3 are particularly redundant
  • More pictures and diagrams are necessary (ASK)
  • POSIX timestamps
  • Point 8 of minor readability issues (ask trishank)
  • Why is there mention of private keys in the metadata?
  • Commands sound dangerous, are clients expected to run arbitrary commands? (ASK/Debate/FAQ)
  • What about rollback attacks? (Elaborate/question)

Change expiration check to match TUF

There was a discussion in the issue tracker of the TUF reference implementation (theupdateframework/python-tuf#1231) about how expiration datetimes should be compared.

Having done some research into how other security providers compare expiration equivalents (e.g. the OpenSSL X.509 certificate checking code and GnuPG expiration checks), and how other TUF implementations perform the same check (rust-tuf, go-tuf), we came to a consensus that the correct way to implement expiration comparisons is:

expiration <= now

I'm filing this issue because the discussion in the thread suggested it is preferable to have in-toto and TUF specifications and reference implementations behave in the same way on this.
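In code, the agreed comparison could look like the following minimal sketch (function name and signature are illustrative, not from either specification):

```python
from datetime import datetime, timezone

def is_expired(expiration, now=None):
    """Metadata is expired when its expiration timestamp is not strictly
    in the future, i.e. expiration <= now."""
    if now is None:
        now = datetime.now(timezone.utc)
    return expiration <= now
```

Note that under this rule metadata expiring at exactly the current instant is already considered expired, which is the behavior the linked discussion converged on.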

Resolve documentation inconsistency in number of signing methods supported.

Section 4.2: There's an inconsistency in the number of signing methods supported.

Spotted during a read: "We define two key types at present: 'rsa' and 'ed25519'."

Yet earlier in the section it states:
The current reference implementation of in-toto defines three signing methods, although in-toto is not restricted to any particular key signing method, key type, or cryptographic library:

"RSASSA-PSS" : RSA Probabilistic signature scheme with appendix. The underlying hash function is SHA256.
"ed25519" : Elliptic curve digital signature algorithm based on Twisted Edwards curves.
"ecdsa" : Elliptic curve digital signature algorithm

Needs to be reconciled.

Add Vale Linter for Documentation Quality Checks

Description:

To ensure the quality and consistency of our documentation, we propose adding the Vale linter to the in-toto documentation repository. Vale is a highly customizable, syntax-aware linter that helps enforce writing style guides and catch common documentation issues.

Benefits:

  • Improved documentation quality and consistency.
  • Automated checks for common documentation issues.
  • Clear guidelines for contributors, helping maintain high standards.
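A possible starting point would be a minimal `.vale.ini` at the repository root; the paths and style choices below are only examples to be adjusted for this repo:

```ini
; .vale.ini — hypothetical starting configuration
StylesPath = .vale/styles
MinAlertLevel = suggestion

[*.md]
BasedOnStyles = Vale
```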

expiration date might prevent installation

[based on the X41 specification and source code audit]

The expiration date in the layout files might prevent users from properly verifying and installing a product after that date. This might force the users of in-toto to create additional releases that offer no functional changes, or to use overly long expiration dates.

Solution Advice
X41 recommends adding an option to enforce a certain version counter value.

Problem with conflicting rules

This is a problem that @lukpueh points out in in-toto/layout-web-tool#49 (comment):

Here is an example of a material rule:

[
  ['MATCH', 'foo.txt', 'WITH', 'PRODUCTS', 'FROM', 'previous_step'],
  ['DELETE', 'foo.txt'],
  ['DISALLOW', '*']
]

Because both the MATCH rule and the DELETE rule apply to the artifact foo.txt, the first MATCH rule makes the subsequent DELETE rule moot. However, both rules are meaningful here: we need the MATCH rule to guarantee the integrity of artifacts between steps, and we need the DELETE rule to guarantee that deleted artifacts don't appear as products of this step.
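The conflict comes from the fact that rules consume artifacts from a shared queue in order. This toy model (all names hypothetical; it deliberately ignores the WITH/FROM clauses and real in-toto semantics from verifylib) shows why the DELETE rule never sees foo.txt:

```python
def apply_rules(rules, materials, products_of_previous):
    """Toy rule processing: each rule consumes the artifacts it matches
    from the queue, so later rules only see what is left over."""
    queue = set(materials)
    for rule in rules:
        verb, pattern = rule[0], rule[1]
        matched = {a for a in queue if a == pattern or pattern == "*"}
        if verb == "MATCH":
            # Consume artifacts that also appear in the previous step's products.
            queue -= {a for a in matched if a in products_of_previous}
        elif verb == "DELETE":
            # Sees an empty match here, because MATCH already consumed foo.txt.
            queue -= matched
        elif verb == "DISALLOW" and matched:
            raise ValueError(f"disallowed artifacts left in queue: {matched}")
    return queue

# foo.txt is consumed by MATCH, so the DELETE rule has nothing to check.
leftover = apply_rules(
    [["MATCH", "foo.txt"], ["DELETE", "foo.txt"], ["DISALLOW", "*"]],
    materials={"foo.txt"},
    products_of_previous={"foo.txt"},
)
```

Under this model the layout passes verification even though the DELETE guarantee was never actually enforced, which is the problem being raised.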

Revisit artifact rule path patterns

Description of issue:
Rethink the behavior of artifact rule path pattern filtering, especially if no artifacts are filtered by the pattern.

Current behavior:
All artifact rules take a pattern argument that is used to filter artifacts reported by a link.
If the pattern does not filter any artifacts, the rule is practically not applied.

This issue was already pointed out in the discussion in in-toto/in-toto#43 and is also described in the docstrings of the artifact rules verification functions, e.g. for the MATCH rule:
https://github.com/in-toto/in-toto/blob/0beaf5b131b5860e8bf0bb059c9f97a0736851b2/in_toto/verifylib.py#L451-L461

Expected behavior:
Expected behavior is open for discussion. My suggestions:

  • each pattern must match at least one artifact
  • extend the rule syntax to indicate whether the pattern can/must match ?, +, * artifact(s) (cf. glob characters)
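The first suggestion can be sketched as follows; this is an illustrative helper (not in-toto's actual filtering code) showing how a glob pattern that matches nothing is silently a no-op today, and how an opt-in check would turn that into an error:

```python
import fnmatch

def filter_artifacts(pattern, artifacts, require_match=False):
    """Filter artifact names by a glob pattern; optionally fail on an
    empty match, as suggested above."""
    matched = [a for a in artifacts if fnmatch.fnmatch(a, pattern)]
    if require_match and not matched:
        raise ValueError(f"pattern {pattern!r} matched no artifacts")
    return matched

artifacts = ["dist/app.tar.gz", "dist/app.tar.gz.sig"]

# Today's behavior: an empty match means the rule is effectively skipped.
filter_artifacts("*.whl", artifacts)

# Proposed behavior: the same situation becomes a verification error.
# filter_artifacts("*.whl", artifacts, require_match=True)  # raises ValueError
```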

Potential grammatical error

Hi there 👋

I am just reading through the spec and I noticed a small sentence that might be a grammatical error? I just thought it would be worth raising in case it needed a cleanup. The line in question is L431-L432:

A functionary can allow a third-party define a step or series of steps of the supply chain a sublayout

I was thinking maybe this is meant to read as something like:

A functionary can allow a third-party to define a step or series of steps of the supply chain known as a sublayout

Maybe the current sentence is technically grammatically correct; it just didn't quite read right to me.

Thanks 😄
