licensecheck's Introduction

Licensecheck

The licensecheck package scans source texts for known licenses. The design aims never to give a false positive. It also reports matches of known license URLs.

See the package documentation for API details.

The license scanner recognizes nearly all the licenses gathered by the SPDX project, along with a few others.

See licenses/README.md for license details.

licensecheck's People

Contributors

dsymonds, jba, kortschak, robpike, rsc, spakin

licensecheck's Issues

License not recognized in pkg.go.dev

Hi,

I'm opening this issue because my repository uses the MIT License, which I filled in through GitHub (I believe). GitHub recognizes the license without any trouble, but pkg.go.dev does not recognize it for the package.

The GitHub repository: https://github.com/elafontaine/smpp/
The file: https://github.com/elafontaine/smpp/blob/master/LICENSE
The package link on pkg.go.dev: https://pkg.go.dev/github.com/elafontaine/smpp

I believe the bug comes from the fact that the license starts with "MIT License".

I compared it with this repository: https://github.com/fiorix/go-smpp
The license used in that repository: https://github.com/fiorix/go-smpp/blob/master/LICENSE

Dual license problems

We have a repository that is dual-licensed. I have followed the guidelines for file naming (LICENSE-APACHE, LICENSE-BSD3) but the package seems to be bothered by the standard LICENSE file which specifies that the package is dual-licensed. What is the suggested way to fix this? There are two options:

  • Modify this package to support the SPDX header that we have in our LICENSE file.
  • We remove the LICENSE file, leaving only the LICENSE-APACHE and LICENSE-BSD3 files.

Please advise. It's quite unfortunate that users can't see documentation.
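
For illustration, the first option could start from something like the sketch below. It is not licensecheck code: the function name spdxChoices is hypothetical, and it only handles a flat "A OR B" expression on a single SPDX-License-Identifier line, returning nothing for AND, WITH, or parenthesized expressions.

```go
package main

import (
	"fmt"
	"strings"
)

const spdxPrefix = "SPDX-License-Identifier:"

// spdxChoices returns the license IDs of a flat "A OR B" expression found on
// the first SPDX-License-Identifier line of text. It is a hypothetical helper
// and returns nil for any expression it does not understand.
func spdxChoices(text string) []string {
	for _, line := range strings.Split(text, "\n") {
		i := strings.Index(line, spdxPrefix)
		if i < 0 {
			continue
		}
		var ids []string
		for _, f := range strings.Fields(line[i+len(spdxPrefix):]) {
			switch {
			case f == "OR":
				// separator between alternatives
			case f == "AND", f == "WITH", strings.ContainsAny(f, "()"):
				return nil // outside the scope of this sketch
			default:
				ids = append(ids, f)
			}
		}
		return ids
	}
	return nil
}

func main() {
	license := "SPDX-License-Identifier: Apache-2.0 OR BSD-3-Clause\n\nThis package is dual-licensed.\n"
	fmt.Println(spdxChoices(license))
}
```

A caller could then look for LICENSE-APACHE and LICENSE-BSD3 (or similar) to find the full text of each alternative.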

allow dynamic additions to the corpus

A licensecheck user may want to recognize licenses that are not part of the built-in corpus. It would be nice if they could avail themselves of the text-matching machinery of licensecheck without being limited by the implicit legal judgments it makes in virtue of having a fixed corpus of license texts.

Jurisdiction-ported Creative Commons license URLs are not recognised

The CC BY 3.0 license (and possibly others) has jurisdiction ports, indicated by a two-letter country code in the URL path. licensecheck misses these. An example is the (old) license for ajstarks/svgo:

The contents of this repository are Licensed under 
the Creative Commons Attribution 3.0 license as described in
http://creativecommons.org/licenses/by/3.0/us/
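
A minimal sketch of a URL pattern that tolerates the jurisdiction segment is shown below. The regular expression and the parseCCURL helper are assumptions made for illustration; they are not how licensecheck matches URLs.

```go
package main

import (
	"fmt"
	"regexp"
)

// ccURL is a hypothetical pattern. It accepts an optional two-letter
// jurisdiction code after the version number.
var ccURL = regexp.MustCompile(
	`^https?://creativecommons\.org/licenses/([a-z-]+)/(\d\.\d)(?:/([a-z]{2}))?/?$`)

// parseCCURL returns the license kind, version, and jurisdiction of a
// Creative Commons license URL. The jurisdiction is "" for unported licenses.
func parseCCURL(u string) (kind, version, jurisdiction string, ok bool) {
	m := ccURL.FindStringSubmatch(u)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	for _, u := range []string{
		"http://creativecommons.org/licenses/by/3.0/",
		"http://creativecommons.org/licenses/by/3.0/us/",
	} {
		kind, ver, jur, ok := parseCCURL(u)
		fmt.Printf("%s kind=%q version=%q jurisdiction=%q ok=%v\n", u, kind, ver, jur, ok)
	}
}
```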

Variable sections are not considered in license descriptions

The SPDX license descriptions (available here) include coding for sections of license that are variable (as described here, with additional notes in the Matching Guidelines paragraph here).

Because the license descriptions do not annotate which text is variable and which is required, detection has to rely on a heuristic based on the proportion of matching text. This leads to both false positives, where malicious licenses are clearly misclassified as OSS licenses, and false negatives, where licenses that match SPDX perfectly are not detected.

These issues have been raised at golang-dev and golang-nuts.
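
To make the idea concrete, here is a rough sketch of turning a template with variable and optional sections into a regular expression, so that required text must match exactly while variable text may differ. The tag syntax follows my understanding of the SPDX template format (`<<var;...;match="...">>`, `<<beginOptional>>`, `<<endOptional>>`) and is heavily simplified; compileTemplate and literal are hypothetical names, and none of this is licensecheck's implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	tagRE   = regexp.MustCompile(`<<(var;.*?|beginOptional|endOptional)>>`)
	matchRE = regexp.MustCompile(`match="([^"]*)"`)
)

// literal quotes fixed text and lets any run of whitespace match any other.
func literal(s string) string {
	words := strings.Fields(s)
	if len(words) == 0 {
		return ""
	}
	for i, w := range words {
		words[i] = regexp.QuoteMeta(w)
	}
	return strings.Join(words, `\s+`) + `\s*`
}

// compileTemplate builds a regexp in which fixed text is required, var tags
// match their own pattern, and optional sections may be absent.
func compileTemplate(tmpl string) (*regexp.Regexp, error) {
	var b strings.Builder
	b.WriteString(`(?s)^\s*`)
	pos := 0
	for _, loc := range tagRE.FindAllStringSubmatchIndex(tmpl, -1) {
		b.WriteString(literal(tmpl[pos:loc[0]]))
		switch body := tmpl[loc[2]:loc[3]]; body {
		case "beginOptional":
			b.WriteString(`(?:`)
		case "endOptional":
			b.WriteString(`)?`)
		default:
			m := matchRE.FindStringSubmatch(body)
			if m == nil {
				return nil, fmt.Errorf("var tag without match attribute: %q", body)
			}
			b.WriteString(`(?:` + m[1] + `)\s*`)
		}
		pos = loc[1]
	}
	b.WriteString(literal(tmpl[pos:]))
	b.WriteString(`$`)
	return regexp.Compile(b.String())
}

func main() {
	tmpl := `Copyright (c) <<var;name="copyright";original="2006 Russ Cox";match=".+">>` + "\n" +
		`Permission is granted.<<beginOptional>> All rights reserved.<<endOptional>>`
	re, err := compileTemplate(tmpl)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, text := range []string{
		"Copyright (c) 2020 Jane Doe\nPermission is granted.",
		"Copyright (c) 2020 Jane Doe\nPermission is granted. All rights reserved.",
		"Copyright (c) 2020 Jane Doe\nPermission is denied.",
	} {
		fmt.Printf("%v %q\n", re.MatchString(text), text)
	}
}
```

In this sketch a change to the copyright holder is accepted, while a change to the required grant text is not, which is the distinction a proportion-based heuristic cannot draw.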

Cover() returns {NaN, []} on text that only contains newlines.

Hi.
Thanks for this great library. I'm making a tool with it and happened to find this.

How to reproduce:

  1. create an empty file or a file with only a newline in;
  2. input the content of the file to Cover();
  3. check the output.

Not sure if this is intended behavior. Seems like a bug.
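
One plausible cause, which the report does not confirm, is a percentage computed as a floating-point ratio whose denominator is zero when the input has no words. The sketch below uses hypothetical functions (percentCovered, safePercentCovered) to show the effect and a guard; it is not licensecheck's code.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// percentCovered is a hypothetical stand-in for a coverage computation.
// With totalWords == 0 the floating-point division 0/0 yields NaN.
func percentCovered(matchedWords, totalWords int) float64 {
	return 100 * float64(matchedWords) / float64(totalWords)
}

// safePercentCovered reports 0 for input that contains no words.
func safePercentCovered(matchedWords, totalWords int) float64 {
	if totalWords == 0 {
		return 0
	}
	return percentCovered(matchedWords, totalWords)
}

func main() {
	total := len(strings.Fields("\n")) // a file holding only a newline has no words
	fmt.Println(math.IsNaN(percentCovered(0, total)))
	fmt.Println(safePercentCovered(0, total))
}
```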

Add Plan 9 licenses

https://9fans.github.io/usr/local/plan9/src/cmd/9pfuse/COPYRIGHT is currently a no match (note: the HTTP server seems to force a download). It seems to be a variant of the MIT license, but it's a variation I can't find elsewhere (e.g. https://fedoraproject.org/wiki/Licensing:MIT https://en.wikipedia.org/wiki/MIT_License https://opensource.org/licenses/alphabetical )

There's a secondary question of whether that "without fee" is a problematic clause or not for FOSS (later variants add commas or say "with or without", to clarify intent). Not sure if that's in scope of this project at all.

Nudge @rsc

The files in this directory are subject to the following license.

The author of this software is Russ Cox.

        Copyright (c) 2006 Russ Cox

Permission to use, copy, modify, and distribute this software for any
purpose without fee is hereby granted, provided that this entire notice
is included in all copies of any software which is or includes a copy
or modification of this software and in all copies of the supporting
documentation for such software.

THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED
WARRANTY.  IN PARTICULAR, THE AUTHOR MAKES NO REPRESENTATION OR WARRANTY
OF ANY KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS
FITNESS FOR ANY PARTICULAR PURPOSE.

Proprietary software licenses

Hello,

I would like to know the current status of proprietary licenses. Are you considering detecting them (for example, by detecting specific headers in the license file)?

I am trying to understand how I can get my proprietary library properly handled by pkg.go.dev. Falling back to godoc.org no longer seems possible.

Add BUSL 1.1 license

BUSL 1.1 is an SPDX license that I think originally came from https://mariadb.com/bsl11/.

I was going to make a PR to add an lre file for this, but I'm not sure what should count as a valid BUSL-1.1 license.

For example, I was looking into the license of github.com/hashicorp/terraform, and it seems to have the bulk of the BUSL-1.1 license but not the “Standard License Header” and “Covenants of Licensor” sections. Both sections appear on the SPDX license text page.

Does licensecheck have a process to define what counts as a valid license?

EUPL type doesn't match

Despite the presence of the license file in

, the license is not matched by licensecheck.

package main

import (
	"fmt"
	"os"

	"github.com/google/licensecheck"
)

func main() {
	data, err := os.ReadFile("EUPL-1.2%20EN.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	cov := licensecheck.Scan(data)
	fmt.Printf("%.1f%% of text covered by licenses:\n", cov.Percent)
	for _, m := range cov.Match {
		fmt.Printf("%s at [%d:%d] IsURL=%v\n", m.Type, m.Start, m.End, m.IsURL)
	}
}

$ curl -LO https://joinup.ec.europa.eu/sites/default/files/custom-page/attachment/2020-03/EUPL-1.2%20EN.txt

$ go run main.go
100.0% of text covered by licenses:
Unknown at [0:13827] IsURL=false

Cover no longer thread-safe

Previously, Cover could be invoked concurrently. Now (a18cfb0), thanks to the shared state in Checker, it can't.

The particular error I saw was "concurrent map read and map write", with one offender at normalize.go:57.

This isn't that important since checking is now a lot faster, but it still takes about 30s to run on about 29K licenses.
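
The error message suggests a map that is read and written from several goroutines at once. A generic way to make such shared state safe is to guard it with a mutex, as in the sketch below. The dict type is hypothetical and only stands in for whatever shared state Checker holds; it is not the actual fix.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// dict is a hypothetical word-to-ID table shared by concurrent callers.
type dict struct {
	mu  sync.Mutex
	ids map[string]int
}

func newDict() *dict {
	return &dict{ids: make(map[string]int)}
}

// id returns the ID for w, assigning the next free ID on first use.
// The mutex serializes every map read and write.
func (d *dict) id(w string) int {
	d.mu.Lock()
	defer d.mu.Unlock()
	if id, ok := d.ids[w]; ok {
		return id
	}
	id := len(d.ids)
	d.ids[w] = id
	return id
}

func main() {
	d := newDict()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, w := range strings.Fields("permission to use copy modify and distribute") {
				d.id(w)
			}
		}()
	}
	wg.Wait()
	fmt.Println(len(d.ids)) // number of distinct words seen
}
```

Callers that cannot change the library can get a similar effect by giving each goroutine its own checker, if the API allows constructing one, or by serializing calls behind their own mutex.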

Match.Start is incorrect

The value given for Match.Start is sometimes wrong. I suspect that the copyright logic is to blame.

Example:

ISC License

Copyright (c) 2013-2017 The btcsuite developers
Copyright (c) 2015-2018 The Decred developers
Copyright (c) 2017 The Lightning Network Developers

Permission to use, copy, modify, and distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Calling licensecheck.Cover with this and the default options returns the Match

{Name:ISC Type:Other Percent:93.75 Start:197 End:852 IsURL:false}

Byte offset 197 is the beginning of the word "distribute". I would have expected the start to be at "Permission", or maybe one of the Copyright lines.

Python 2.0 incorrectly recognized

The license file at https://github.com/nodeca/argparse/blob/a645a9a9d3d0a347f383d0b795859e67dfae6ad8/LICENSE isn't recognized as a Python 2.0 license.

When I run

package main

import (
	"fmt"
	"os"

	"github.com/google/licensecheck"
)

func main() {
	content, err := os.ReadFile("LICENSE")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	cov := licensecheck.Scan(content)
	fmt.Printf("%.1f%% of text covered by licenses:\n", cov.Percent)
	for _, m := range cov.Match {
		fmt.Printf("%s at [%d:%d] IsURL=%v\n", m.ID, m.Start, m.End, m.IsURL)
	}
}

on the license file, the output is

7.9% of text covered by licenses:
HPND at [11702:12775] IsURL=false

It looks like the license file referenced is a longer version of the Python 2.0 license, which is throwing the checks off.

REUSE Compatibility

I changed a Go project of mine to use SPDX and the FSF's REUSE tooling to make licensing clearer and simpler to manage. It looks like this license checker doesn't look in a LICENSES directory for the SPDX licenses. The purpose of this issue is to find out whether the maintainers would be willing to accept a change that adds this.

For reference, this is the structure that REUSE shows:
https://reuse.software/tutorial/#result

Here is my project (that uses the reuse cli tool):
https://github.com/schmidtw/goschtalt/tree/main/LICENSES

Note that the code isn't dual-licensed (meaning available under both licenses); instead, parts are covered by one license and other parts are covered by a different license.
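
As a rough sketch of what such a change might involve, the code below lists the files in a LICENSES directory and derives an SPDX identifier from each file name by dropping the extension, which is how I understand the REUSE layout (for example LICENSES/Apache-2.0.txt). The function names are hypothetical, and the in-memory file system exists only to keep the example self-contained.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// reuseLicenseIDs returns the SPDX identifiers implied by the file names in
// the LICENSES directory of fsys. It is a hypothetical helper.
func reuseLicenseIDs(fsys fs.FS) ([]string, error) {
	entries, err := fs.ReadDir(fsys, "LICENSES")
	if err != nil {
		return nil, err
	}
	var ids []string
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		name := e.Name()
		ids = append(ids, strings.TrimSuffix(name, path.Ext(name)))
	}
	return ids, nil
}

// sampleFS builds a tiny in-memory repository with two license files.
func sampleFS() fs.FS {
	return fstest.MapFS{
		"LICENSES/Apache-2.0.txt":   {Data: []byte("Apache License text")},
		"LICENSES/BSD-3-Clause.txt": {Data: []byte("BSD license text")},
		"main.go":                   {Data: []byte("package main")},
	}
}

func main() {
	ids, err := reuseLicenseIDs(sampleFS())
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ids)
}
```

Each file found this way could then be scanned individually, and the identifier from the file name compared with what the scanner detects.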

Add license MIT No Attribution (SPDX MIT-0)

Please add MIT No Attribution (MIT-0).

I don't know how prevalent it is in the Go ecosystem, but the license is used by Amazon, for example.

Apologies for not making this a pull request, but signing a CLA for this was too much friction.

SPDX-License-Identifier: MIT-0

Text:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Add BlueOak Model License

Please add Blue Oak Model License 1.0.0. It looks like you're not limiting license detection to OSI-approved licenses, and BlueOak-1.0.0 is not approved by OSI. However, it's a nice alternative to the MIT license, with better (read: understandable) language and specifics on patents.

Apache license recognized as ECL-2

The text below, from https://github.com/hyperledger/burrow/blob/v0.21.0/LICENSE.md, is identical to the Apache 2.0 license without appendix.

At the previous head (e43db20), Cover reported a single Match, to Apache-2.0:

{Name:Apache-2.0 Type:Apache Percent:89.40355329949239 Start:33 End:10172 IsURL:false}

At the current head (9240d01), it reports a single match to ECL-2.0:

{Name:ECL-2.0 Type:Other Percent:81.6793893129771 Start:161 End:10172 IsURL:false} 

The text lacks ECL-2.0's preamble and its additional sentence in section 3.

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

Dual-licensed BSD 3-clause and GPL-2.0 not recognized as such

This license file that specifies both BSD 3-clause and GPL-2.0 is only recognized as GPL-2.0. It's a bit mysterious to me why it's not recognizing the BSD part of the file, as it seems to contain a lot of text that fits the matcher.

I wrote a sample program that runs licensecheck against the file, and it gives this output:

$ go run main.go
90.0% of text covered by licenses:
GPL-2.0 at [2276:20460] IsURL=false

Apache-2.0 URL alone is enough to match the license; is this intentional?

The Apache-2.0 URL (http://www.apache.org/licenses/LICENSE-2.0) alone counts as a match to the license. Text containing parts of other licenses usually does not produce a match, so I wanted to ask if this was intentional.

Here's the example using my Julia wrapper LicenseCheck.jl:

julia> using LicenseCheck

julia> licensecheck("http://www.apache.org/licenses/LICENSE-2.0")
(licenses_found = ["Apache-2.0"], license_file_percent_covered = 100.0)
