
Comments (28)

natemcmaster commented on May 26, 2024

One thing we tried to do in KoreBuild was generate the sign data automatically from the .csproj. https://github.com/aspnet/BuildTools/blob/dev/docs/Signing.md#config-via-csproj For the most part, aspnet packages follow standard conventions, and we can auto-generate all the metadata we need for signing. I'm not sure if this would be flexible enough for other teams.

jaredpar commented on May 26, 2024

Why do you think dropping a file next to a binary to indicate that it should be signed is more reliable than updating a file? In both cases it requires developer effort.

In general we haven't really had the problem you specified with SignTool. While the individual binaries within a repository do get added / deleted with some regularity, the consumable outputs of the repository change much less often. For example, the set of VSIX / NuGet packages Roslyn produces changes a couple of times a year, while the set of projects changes with some regularity.

SignTool pretty aggressively digs through these VSIX / NuGet packages and requires that every item contained in them be accounted for, either by explicitly labeling it as not needing to be signed or by verifying it's in the list of binaries to be signed. This makes it quite hard to sneak in a signing error: the moment a new DLL is embedded in any of our VSIX / NuGet packages, the Jenkins PR build catches that it's not properly accounted for.
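
For a concrete sense of what that check involves, a simplified version of the package walk might look like the sketch below (the method, parameter names, and list shapes are illustrative assumptions, not the actual SignTool code):

```csharp
// Rough sketch: verify that every DLL inside a package (NuGet .nupkg and VSIX
// are both zip archives) is either in the sign list or explicitly excluded.
// Names and shapes here are illustrative, not the real SignTool implementation.
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

static class PackageSigningCheck
{
    public static IEnumerable<string> FindUnaccountedBinaries(
        string packagePath,          // path to a .nupkg or .vsix
        ISet<string> filesToSign,    // file names listed for signing
        ISet<string> excludedFiles)  // file names explicitly marked as not needing a signature
    {
        using ZipArchive archive = ZipFile.OpenRead(packagePath);
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            string name = Path.GetFileName(entry.FullName);
            if (!name.EndsWith(".dll", StringComparison.OrdinalIgnoreCase))
                continue;

            // Anything not accounted for is a signing configuration error.
            if (!filesToSign.Contains(name) && !excludedFiles.Contains(name))
                yield return $"{packagePath}: {entry.FullName}";
        }
    }
}
```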

Note: the necessity of listing binaries like System.* as not needing to be signed is tedious. We have a work item to remove it now. Probably one of the first changes I'll do after I move it over to arcade.

weshaggard commented on May 26, 2024

Why do you think dropping a file next to a binary to indicate that it should be signed is more reliable than updating a file? In both cases it requires developer effort.

The file is generated as part of the build based on the set of projects that opt in via a signing property defined in our common dir.props.

My main goal is that we have some strategy that makes maintaining this simple and automatic, ideally as a side effect of the build as opposed to something committed and manually authored.

jaredpar commented on May 26, 2024

The file is generated as part of the build based on the set of projects that opt in via a signing property defined in our common dir.props.

Why do you think developers are more likely to set this property than update the json file?

My viewpoint on this: any manual action by developers is equally likely to be forgotten. I don't care what it is. Even if you make it central and on by default, developers will forget to turn it off and you'll end up signing and publishing code you never should have.

The reason we went with this approach is the following:

  1. Understanding the set of binaries signed by a repository and the set of certificates / strong names used to sign them requires no evaluation. This means all manner of tooling can be applied to this list to evaluate the correctness of the repository.
  2. It requires a manual update but so does every other signing system we looked at.
  3. It has built-in redundancy. As stated before, big items (VSIX, meta-packages, etc ...) don't change often and they tend to include the vast majority of the signing assets a repository builds. These are easy to dig through, look at the contents and validate everything is accounted for.
  4. It's easy to validate signing success or failure at PR time. Knowing the totality of the assets to be signed allows us to replicate most of the signing failures we've seen in practice and catch them at PR time.
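
As an illustration of points 1 and 4, a PR-time check over such a declarative list can be plain tooling with no build evaluation. The sketch below assumes a simplified stand-in manifest shape, not the real SignToolData.json schema:

```csharp
// Hypothetical example: load a signing manifest (simplified shape, keyed by
// certificate) and report listed files that are missing from the build output.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

record SignManifest(Dictionary<string, List<string>> Sign, List<string> Exclude);

static class ManifestCheck
{
    public static IEnumerable<string> MissingFiles(string manifestPath, string outputRoot)
    {
        SignManifest manifest = JsonSerializer.Deserialize<SignManifest>(
            File.ReadAllText(manifestPath),
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

        // Every file the manifest claims we sign should exist in the build output.
        return manifest.Sign.Values
            .SelectMany(files => files)
            .Where(file => !File.Exists(Path.Combine(outputRoot, file)));
    }
}
```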

I agree that there is a hole: if you add a new project and it's not included in one of your existing assets, it can slip through the cracks. In practice, though, that hasn't happened since we moved to this system.

weshaggard commented on May 26, 2024

Why do you think developers are more likely to set this property than update the json file?

Because the property is automatically set in the common props file that is included in all the shipping library projects. So as long as folks copy/paste a project file and follow the same convention, it will just happen for them.

I don't know how many times I've had to fix official build breaks because binaries that should be signed weren't correctly added to the signing list, so as part of our move to GitHub/open source I made sure we would minimize those types of issues by not requiring developers to do anything special to get their binaries signed.

chcosta commented on May 26, 2024

I tend to lean toward the side where signing is just a by-product of the build and the right things happen, versus having to maintain a config file. It's also nice to have the code itself handle the unique cases rather than a separate config. Either way, there are certainly ways that things can fall through the cracks or be signed incorrectly or whatever. The main reason I can get behind the "SignTool" concept is for potential prodcon scenarios where signing is a separate process from the repo build itself and occurs over a composed set of repos. I know there used to be discussion about this; I'm not certain if that is still a goal or non-goal for prodcon.

The reason we went with this approach is the following:

  1. Understanding the set of binaries signed by a repository and the set of certificates / strong names used to sign them requires no evaluation. This means all manner of tooling can be applied to this list to evaluate the correctness of the repository.
  2. It requires a manual update but so does every other signing system we looked at.
  3. It has built-in redundancy. As stated before, big items (VSIX, meta-packages, etc ...) don't change often and they tend to include the vast majority of the signing assets a repository builds. These are easy to dig through, look at the contents and validate everything is accounted for.
  4. It's easy to validate signing success or failure at PR time. Knowing the totality of the assets to be signed allows us to replicate most of the signing failures we've seen in practice and catch them at PR time.

If the build itself generated the config file (and you maybe commit it to the repo), that would provide many of the same benefits.

jaredpar commented on May 26, 2024

I tend to lean toward the side where signing is just a by-product of the build and the right things happen, versus having to maintain a config file.

How do you propose the build "does the right things"? It requires some sort of manual intervention. Better to have it in a declarative form that can be tooled.

It bears repeating, I guess, but we had all of the hypothetical problems you all listed when we used your recommended approach (derived from the build). Since switching we've had none of them.

If the build itself generated the config file (and you maybe commit it to the repo), that would provide many of the same benefits.

Disagree. One of the other benefits of the config file is that it's declarative in history. When files are added or removed from signing, it's visible in the PR. Changes to the build which accidentally remove signing (through subtle build interactions) are not visible.

chcosta commented on May 26, 2024

One of the other benefits of the config file is that it's declarative in history. When files are added or removed from signing, it's visible in the PR.

I agree with your principle, and I suppose we're talking about an implementation detail, but I'd expect a build-generated file that's intended for commit to still have to go through a PR. I.e., it'd be weird if an official build generated a file different from what was in the repo.

How do you propose the build "does the right things"

I imagine you'd still need something in the build (MSBuild properties or metadata), something more akin to aspnet's implementation which @natemcmaster referenced: https://github.com/aspnet/BuildTools/blob/dev/docs/Signing.md#config-via-csproj

weshaggard commented on May 26, 2024

aspnet's solution is very similar to the core repo's solution as well. See https://github.com/dotnet/buildtools/blob/master/src/Microsoft.DotNet.Build.Tasks/PackageFiles/sign.targets. A library project can opt in to signing (which is the default for src projects) and can set the strong-name key and Authenticode signatures (also defaulted).

jaredpar commented on May 26, 2024

Also FWIW: a large number of repos use SignToolData.json. There have been zero complaints so far about the need to maintain the file (other than the exclusion lists). It's also led to a big jump in reliability, errors caught at PR time, and build throughput.

Definitely we're going to be doing more initial maintenance in this repo at the start, but this isn't a normal developer cycle. We're in the process of merging a lot of existing and well-formed tools into a central repository. Edits to SignToolData will certainly be more frequent here at first, but this is also an abnormal level of churn.

weshaggard commented on May 26, 2024

I'm fine if folks would like to manually maintain the file, but I don't think it should be a requirement. I'm OK with the format; I just want to enable an option to generate it as part of the build instead of manually maintaining it.

jaredpar commented on May 26, 2024

I have another suggestion: please try this before you reject it. Literally the only people who want this option have not tried it. Why create deviations across the repositories based on a fear of a solution you haven't tried?

weshaggard commented on May 26, 2024

We have tried it for many years in the TFS world and I worked extremely hard to eliminate maintaining this file list as part of our move to the open. Sorry but this is something I feel strongly about not having.

jaredpar commented on May 26, 2024

Then the implementation is up to you. This is a case where CoreFx is going to deviate from the rest of the organization, without ever trying the solution at all. The work and maintenance is on you.

Got to say though this is very disappointing to see. The point of arcade is to unify our solution, not deviate. Kicking off the effort by deviating before trying is not a good start.

weshaggard commented on May 26, 2024

Got to say though this is very disappointing to see. The point of arcade is to unify our solution, not deviate. Kicking off the effort by deviating before trying is not a good start.

Why is it that you guys don't want to try the approach that both core and aspnet are already using?

jaredpar commented on May 26, 2024

Why is it that you guys don't want to try the approach that both core and aspnet are already using?

It doesn't meet the needs of our repositories:

  • Declarative: we need to understand the full set of artifacts that we sign. Having it be implicit has led to numerous build authoring errors over the years. Making it declarative has eliminated those entirely.
  • Local verification: developers need to be able to verify locally that signing is consistent.
  • Batch processing: a must for us to meet our throughput and reliability goals.

markwilkie commented on May 26, 2024

To update this thread, @jaredpar, @weshaggard, and I have agreed on the following requirements. @maririos - please be sure to add these to the overall SignTool requirements doc which is being created.

  • Signtool will (batch) sign from a manifest (list of files)
  • This manifest can be checked in (explicit)
  • This manifest can be generated during the build (implicit)

tmat commented on May 26, 2024

Jumping a little bit late to the discussion but hopefully not too late :).

@weshaggard If I understand your concern correctly, it's the build breaks that you had to fix that bother you. Is that the only issue you have with the existing implementation? If so, I'd like to emphasize Jared's point on local verification and verification in Jenkins. The CI (PR and official) builds can't get broken, because Jenkins will validate that all assets that need signing are signed. The output of a repo is a set of packages; that's what's being published. The sign tool validates that all binaries in these packages are listed in the .json file and thus will be signed in the official build.

That said, I wonder if we could do the following:
Packages (NuGet and VSIX) are categorized as shipping (intended for publishing on NuGet) and non-shipping. The signing tool could assume that all binaries in all shipping packages must be signed, while binaries in non-shipping packages do not need to be signed. We can also assume that most binaries will be signed with a default certificate. So, how about we only specify in the .json the binaries that need to be signed with a non-default certificate? In other words, the .json file would only list assets whose signing properties differ from the defaults. The default is that a binary in a shipping package is signed with a default certificate.
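
To make the defaults concrete, the lookup a signing tool would perform under this convention might behave roughly as follows (the certificate name and types are made up for illustration):

```csharp
// Sketch of the proposed convention: binaries in shipping packages get the
// default certificate unless the .json lists an override; non-shipping
// binaries are not signed. All names here are placeholders.
using System.Collections.Generic;

record SigningInfo(string Certificate, string StrongName);

static class SigningDefaults
{
    const string DefaultCertificate = "DefaultAuthenticodeCert"; // placeholder name

    public static SigningInfo Resolve(
        string fileName,
        bool inShippingPackage,
        IReadOnlyDictionary<string, SigningInfo> overrides)
    {
        if (overrides.TryGetValue(fileName, out SigningInfo info))
            return info;                                          // explicitly listed in the .json
        if (inShippingPackage)
            return new SigningInfo(DefaultCertificate, StrongName: null);
        return null;                                              // non-shipping: no signing required
    }
}
```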

Would that work for everyone?

weshaggard commented on May 26, 2024

I like the idea of signing validation during both CI (we currently don't do this in corefx) and official builds (we do this), but I don't believe that requires a checked-in file list; it just requires the list of what we consider shipping bits, as you put it. That list is easily gathered from our builds today based on project metadata. From there we can verify that all the files in our containers (MSIs, VSIXes, nupkgs, etc.) are in the list of shipping bits to be signed.

We are planning to convert (or at least create a new) version of the SignTool that is an MSBuild task. Ideally everyone would start using that version of the tool, and the only pivot would be how repos decide to pass the list of files to it. That list can be a static JSON file as it is today (or perhaps a static file list in an MSBuild props file), or it can be generated as part of the build and passed to the tool.

For the core repos, the plan is that for any shipping library we will produce a ".requires_signing" file next to the binary in the output, which will contain its strong name and cert IDs. Then when we hit the signing phase, we will gather all those .requires_signing files and produce the list of things we need to pass to the signing service. For validation we will pass that list, as well as a list of all our containers (mostly nupkgs), to the SignTool task so it can validate that all assets we are going to ship are planned to be signed (for CI) or are actually signed (in the case of official builds).
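
As a rough sketch of that marker-file scheme (the two-line file format, type, and method names below are assumptions for illustration, not the actual corefx implementation):

```csharp
// Hypothetical helpers: write a ".requires_signing" marker next to each
// shipping binary at build time, then gather the markers during the signing
// phase to build the list handed to the signing service.
using System.Collections.Generic;
using System.IO;

record SigningRequest(string BinaryPath, string StrongName, string Certificate);

static class RequiresSigningMarkers
{
    // Written next to each shipping binary, e.g. Foo.dll.requires_signing.
    public static void Write(string binaryPath, string strongName, string certificate) =>
        File.WriteAllLines(binaryPath + ".requires_signing", new[] { strongName, certificate });

    // Scanned during the signing phase to produce the signing request list.
    public static IEnumerable<SigningRequest> Gather(string outputRoot)
    {
        foreach (string marker in Directory.EnumerateFiles(
                     outputRoot, "*.requires_signing", SearchOption.AllDirectories))
        {
            string[] lines = File.ReadAllLines(marker);
            string binary = marker.Substring(0, marker.Length - ".requires_signing".Length);
            yield return new SigningRequest(binary, lines[0], lines[1]);
        }
    }
}
```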

tmat commented on May 26, 2024

Why are .requires_signing files needed? As I pointed out, the set of files that need to be signed is given by the content of the shipping packages (NuGet, VSIX, etc.) that the repo produces.

If each project that produces a shipping package outputs that package to an artifacts directory dedicated to shipping packages, while non-shipping packages are placed in a different directory, then the requirement is that all artifacts (the packages as well as the binaries contained in those packages) in the shipping directory must be signed.

The same logic can be applied to symbol publishing. All symbols contained in packages in the shipping directory must be published.

In addition, we would use the set of shipping packages found in the shipping directory for NuGet repacking. NuGet repacking changes the per-build pre-release version of a NuGet package to a release version, which is readily available for publishing if we decide we need to publish the build. Roslyn (and other repos) use repacking since we don't know upfront which build is going to be the final build. So we produce final packages for every build with extra information that tells us whether these packages can be published (i.e. whether they do not have pre-release dependencies).

markwilkie commented on May 26, 2024

@tmat - I like that you had a good suggestion as to how further unification might be accomplished. However, at the end of the day, I think this is an area where we can leave the decision up to the repo owner as to whether it's explicit or implicit. The point of convergence is that, regardless of the path, there's still a "manifest" of what should have been signed.

Is there something specific that you think blocks moving forward with the following?

  • Signtool will (batch) sign from a manifest (list of files)
  • This manifest can be checked in (explicit)
  • This manifest can be generated during the build (implicit)

tmat commented on May 26, 2024

I don't think we disagree on the above bullet points. I'm just saying that if we structure the package output directory based on whether a package is shipping or not, we won't need the explicit manifest, nor would we need extra ".requires_signing" files. At that point I don't see why we wouldn't use the implicit one in all repos.

I'm currently working on incorporating the package shipping/non-shipping classification into RepoToolset, since it's needed for NuGet and symbol publishing. The publishing logic needs to know which packages are shipping and which are not. The easiest way to do that, imo, is to separate them into directories during the build and let each project that produces a package decide which directory to put the package in (based on an <IsShipping> project property).

markwilkie commented on May 26, 2024

I see @tmat. So you're saying that the publishing work you're doing in RepoToolset could potentially be used to determine what needs to be signed?

tmat commented on May 26, 2024

Exactly.

weshaggard commented on May 26, 2024

I'm just saying that if we structure the package output directory based on whether a package is shipping or not, we won't need the explicit manifest, nor would we need extra ".requires_signing" files. At that point I don't see why we wouldn't use the implicit one in all repos.

That is exactly what we do today for our shipping packages and how we gather them to publish and validate. However, one key missing piece of data is figuring out what strong name and certificate to use for a given binary. The certificate is generally the same, but the strong name varies. If we can figure out a way to get all the necessary information (perhaps we could crack open the files and read the public key token to figure out the strong name), then I'm all for eliminating the .requires_signing marker files as well.
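
For illustration, reading the public key token out of the assembly metadata could look roughly like the sketch below (using System.Reflection.Metadata; mapping the token back to a named strong-name key would still be repo-specific):

```csharp
// Sketch: compute an assembly's public key token from its metadata. The token
// is the last 8 bytes of the SHA-1 hash of the public key, in reverse order.
using System;
using System.IO;
using System.Linq;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
using System.Security.Cryptography;

static class StrongNameReader
{
    public static string GetPublicKeyToken(string assemblyPath)
    {
        using FileStream stream = File.OpenRead(assemblyPath);
        using var peReader = new PEReader(stream);
        MetadataReader md = peReader.GetMetadataReader();
        byte[] publicKey = md.GetBlobBytes(md.GetAssemblyDefinition().PublicKey);
        if (publicKey.Length == 0)
            return null; // assembly is not strong-name signed

        using var sha1 = SHA1.Create();
        byte[] hash = sha1.ComputeHash(publicKey);
        byte[] token = hash.Skip(hash.Length - 8).Reverse().ToArray();
        return string.Concat(token.Select(b => b.ToString("x2")));
    }
}
```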

I think we still have to work out some of the details about unpacking/signing/repacking, so I'm not sure if that approach will fully work, but it is worth exploring. One thing we need to be careful with is that some binaries ship in multiple containers but today are only signed once; if we move to a model of unpack/sign/repack then we may end up signing the same binary multiple times. That may not be that common and isn't the end of the world, but it is something to keep in mind.

tmat commented on May 26, 2024

perhaps we could crack open the files and read the public key token to figure out the strong name

I think that's a good idea. Let's try to read the strong name from the metadata.

One thing we need to be careful with is that some binaries ship in multiple containers but today are only signed once; if we move to a model of unpack/sign/repack then we may end up signing the same binary multiple times.

Absolutely. I believe the SignTool has this built in already. It calculates a checksum of each artifact to see if it matches the file that was listed in the .json.

jaredpar commented on May 26, 2024

@tmat

Absolutely. I believe the SignTool has this built in already. It calculates a checksum of each artifact to see if it matches the file that was listed in the .json.

Indeed this is the case. The tool itself primarily operates on checksums of artifacts for identity. The only time names are used is when printing out error messages to developers.
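
A minimal illustration of that content-based identity, assuming SHA-256 and made-up names (not the actual SignTool code): files that hash to the same value are treated as one artifact, so a binary that ships in several containers is only signed once.

```csharp
// Group candidate files by checksum; each group is signed once and the result
// can be copied back to every location that carried that content.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static class ContentIdentity
{
    public static string Checksum(string path)
    {
        using var sha256 = SHA256.Create();
        using FileStream stream = File.OpenRead(path);
        return BitConverter.ToString(sha256.ComputeHash(stream)).Replace("-", "");
    }

    public static IEnumerable<IGrouping<string, string>> GroupByContent(IEnumerable<string> files) =>
        files.GroupBy(Checksum);
}
```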

JohnTortugo commented on May 26, 2024

Closing this issue since the changes are now being tracked in issue #396 and PR #410.
