nuget-package-scanner

nuget-package-scanner is a Python module that will query your Github organization for Nuget dependencies in your .Net projects and produce a report on how up-to-date they are. This can be useful for identifying which projects' dependencies are out of date (and how badly). It can also be useful for identifying how many disparate versions of the same common package are in use across your codebase (cough... Newtonsoft.Json... cough).

Currently, results are saved to a CSV file that can be imported into a spreadsheet or a database for displaying, sorting, and further analysis.

Installation

pip install nuget-package-scanner

Usage (as a script)

  1. Ensure that you have a Github personal token
  2. (Optionally) Set a GITHUB_TOKEN environment variable with the value. If you don't set this variable, you'll have to provide it at the prompt at runtime.
  3. cd to your local clone of this repo
  4. python -m nuget_package_scanner
  5. Follow the prompt(s)
  6. Import the exported .csv into Google Sheets (or another spreadsheet app)
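
Step 2 describes an environment variable with a prompt as the fallback. A minimal sketch of that lookup follows. It is not the module's actual code: the function name `resolve_github_token`, its parameters, and the prompt text are assumptions made for illustration.

```python
import os
from getpass import getpass
from typing import Callable, Mapping


def resolve_github_token(
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the token from GITHUB_TOKEN, or ask for it interactively.

    Hypothetical helper: `env` and `prompt` are injectable so the
    lookup can be exercised without touching the real environment.
    """
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        # Assumed prompt text; getpass keeps the typed token off the screen.
        token = prompt("Github personal access token: ").strip()
    if not token:
        raise ValueError("A Github token is required")
    return token
```

Calling `resolve_github_token()` with no arguments reads the real environment and only prompts when the variable is missing or empty.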

Report Data

  • From Github
    • Repo Name - The name of the github repository in which the package configuration was discovered
    • Container Path - The file path to the package configuration within the repository
    • Name - The name of the package config container that the package was listed in. This will be either a .Net Framework packages.config or a .Net Core .csproj file.
    • Referenced Version - The version referenced in the package container, if one was specified (some core MSFT libraries don't specify a version)
  • From Nuget Server
    • Date - The date the Referenced Version was published to the Nuget Server
    • Latest Release - The latest full-release version of the package that has been published to the Nuget Server
    • Latest Release Date - The date the Latest Release was published to the Nuget Server
    • Latest Package - The latest published version of the package (inclusive of pre-release) that has been published to the Nuget Server
    • Latest Package Date - The date the Latest Package was published to the Nuget Server
    • Link - If it is a public package on the Nuget Server (e.g. nuget.org), this will be a url to the detail page for the package. This link is not likely to be provided by a private package repo.
    • Source - Url of the registration index for the package that was used to GET details
  • Calculated (included for convenience)
    • Major Release Behind - The number of major releases by which the Referenced Version trails the Latest Release
    • Minor Release Behind - The number of minor releases by which the Referenced Version trails the Latest Release. This will only be calculated if both versions have the same major version
    • Patch Release Behind - The number of patch releases by which the Referenced Version trails the Latest Release. This will only be calculated if both versions have the same major version and minor version
    • Available Version Count - The total number of versions of the package that are published to the Nuget Server.
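
The calculated columns can be illustrated with a short sketch. It assumes that "releases behind" means the plain difference between version-number components; the module may instead count published releases, which this README does not specify. The names `parse_version` and `releases_behind` are hypothetical.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring pre-release and build suffixes."""
    core = version.split("-", 1)[0].split("+", 1)[0]
    parts = [int(p) for p in core.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # treat '13.0' as '13.0.0'
    return parts[0], parts[1], parts[2]


def releases_behind(
    referenced: str, latest: str
) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (major, minor, patch) gaps between two versions.

    Assumption: a gap is the difference of the numeric components.
    Minor is only computed when majors match; patch only when both
    major and minor match. Uncomputed values are None.
    """
    ref = parse_version(referenced)
    new = parse_version(latest)
    major = new[0] - ref[0]
    minor = new[1] - ref[1] if major == 0 else None
    patch = new[2] - ref[2] if major == 0 and minor == 0 else None
    return major, minor, patch
```

Under that assumption, a project referencing 9.0.1 of a package whose latest release is 13.0.3 would be reported as four major releases behind, with the minor and patch columns left empty.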

Basic Application Flow

  1. Search the specified Github org for all Nuget server configurations (nuget.config) to detect any additional Nuget Server sources. (Nuget.org will be included by default.)
  2. Search the specified Github org for all .Net Core and .Net Framework project configurations that contain references to Nuget package dependencies.
  3. Cycle through each Nuget Package discovered
    1. Cycle through each Nuget Server (preferring nuget.org) to find where the package lives
    2. Use the appropriate Nuget Server to fetch registration information for the package
  4. Generate and save CSV
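
Step 2 relies on reading package references out of two file formats. The sketch below shows one way to do that with the standard library, assuming the usual layouts: `<package id="..." version="..."/>` entries in packages.config and `<PackageReference Include="..." Version="..."/>` items in a .csproj. The function names are hypothetical, and this is not necessarily how the module parses these files.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# (package name, referenced version or None when none is specified)
PackageRef = Tuple[str, Optional[str]]


def _local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://...}PackageReference'."""
    return tag.rsplit("}", 1)[-1]


def parse_packages_config(xml_text: str) -> List[PackageRef]:
    """Extract references from a .Net Framework packages.config."""
    root = ET.fromstring(xml_text)
    return [
        (p.get("id"), p.get("version"))
        for p in root.iter("package")
        if p.get("id")
    ]


def parse_csproj(xml_text: str) -> List[PackageRef]:
    """Extract PackageReference items from a .Net Core .csproj."""
    root = ET.fromstring(xml_text)
    refs: List[PackageRef] = []
    for element in root.iter():
        if _local_name(element.tag) != "PackageReference":
            continue
        name = element.get("Include")
        if not name:
            continue
        version = element.get("Version")
        if version is None:
            # The version may also be given as a child <Version> element.
            for child in element:
                if _local_name(child.tag) == "Version":
                    version = child.text
        refs.append((name, version))
    return refs
```

A reference without a version comes back as `None`, which matches the note that some core MSFT libraries don't specify one.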

Runtime Note: My org (168 repositories w/ 100+ Nuget-referencing projects and ~2k individual package references) can take around 2 minutes to fully process.

TODOs

  • Shared session(s) in web requests to support connection pooling and boost performance
  • More resiliency to web call timeout errors. Currently, any timeout crashes things.
  • Implement async web requests in nuget module. This would speed this up a good bit. Most of the time is currently spent waiting on web requests to complete and there is little reason for that to happen serially.
  • Build a visual front end consumer
  • Rate limiting checks on calls to the github api. When searching within a very large github org, there is the possibility that the search api rate limit budget could be exhausted (currently 30 calls/minute if authenticated)
  • Possibly break out the nuget module into a stand-alone Python package. I'm not sure if there's any use beyond basic GET functionality.
  • Optimizing json object scanning algorithms. It's currently a very simple brute force approach. This may be a lot of work for little gain.
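
The async TODO above could look roughly like the following sketch, which runs lookups concurrently while capping how many are in flight at once. It assumes a caller-supplied coroutine `fetch` that returns registration data for one package; `fetch_all`, `fetch`, and `max_concurrent` are illustrative names, not part of the module.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def fetch_all(
    package_ids: Iterable[str],
    fetch: Callable[[str], Awaitable[dict]],
    max_concurrent: int = 10,
) -> Dict[str, dict]:
    """Fetch details for every package id with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(package_id: str):
        # At most `max_concurrent` fetches run at the same time.
        async with semaphore:
            return package_id, await fetch(package_id)

    pairs = await asyncio.gather(*(fetch_one(p) for p in package_ids))
    return dict(pairs)
```

The cap keeps a large org from opening thousands of connections at once. It limits concurrency only, so it would not by itself keep calls within a per-minute rate limit.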

You may be wondering...

Why did you write your own github client?

I originally tried to make use of PyGithub. I couldn't get it working correctly with my personal access token, so I wrote a simple client of my own. This also gave me a chance to familiarize myself a bit more with the Github API. I wanted to use the Github GraphQL API for this, but it doesn't support code search as of yet. If I need to support any more-complicated use cases, I will look at switching back to PyGithub.

Why did you write this in Python? Nuget only supports .Net.

I wanted to learn something new and Python is new to me. This project seemed like a good use case for the high-level scripting support available in Python. I could have written this in C#, but I wouldn't have learned as much in the process.

Contributors

doneholmes

nuget-package-scanner's Issues

Issue with reading the response on specific file from Github

I occasionally get errors in the following places when hitting the Github API.

except aiohttp.ClientPayloadError:

except aiohttp.ClientPayloadError:

Found an issue that seems related (curiously also w/ calling the Github API)

Since these are intermittent errors where it's often the same problem files (but not always), my assumption is that I'm tripping on Possibility 2 listed in the aiohttp docs.

Handle aiohttp.client_exceptions.ClientPayloadError Here

This call occasionally fails with

Traceback (most recent call last):
  File "C:\Python38\lib\runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python38\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\__main__.py", line 15, in <module>
    loop.run_until_complete(app.run(org, token, output))
  File "C:\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\app.py", line 151, in run
    package_containers: List[PackageContainer] = await build_org_report(org, token)
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\app.py", line 95, in build_org_report
    await wait_or_raise([net_core_task, net_framework_task])
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\async_utils.py", line 14, in wait_or_raise
    raise d.exception()
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\github_search.py", line 115, in search_package_configs
    return await self.search_github_code(f'package+org:{org}+filename:packages.config', limit)
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\github_search.py", line 103, in search_github_code
    await wait_or_raise(tasks)
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\async_utils.py", line 14, in wait_or_raise
    raise d.exception()
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\github_search.py", line 74, in __process_search_page
    details = await self.get_request_as_json(details_url)
  File "E:\repositories\nuget-package-scanner\nuget_package_scanner\github_search.py", line 48, in get_request_as_json
    return await response.json()
  File "E:\repositories\nuget-package-scanner\env\lib\site-packages\aiohttp\client_reqrep.py", line 1021, in json
    await self.read()
  File "E:\repositories\nuget-package-scanner\env\lib\site-packages\aiohttp\client_reqrep.py", line 973, in read
    self._body = await self.content.read()
  File "E:\repositories\nuget-package-scanner\env\lib\site-packages\aiohttp\streams.py", line 358, in read
    block = await self.readany()
  File "E:\repositories\nuget-package-scanner\env\lib\site-packages\aiohttp\streams.py", line 380, in readany
    await self._wait('readany')
  File "E:\repositories\nuget-package-scanner\env\lib\site-packages\aiohttp\streams.py", line 296, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

There's likely something missing or off in how I'm handling errors/timeouts since updating to async calls.
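
One possible way to handle an intermittent failure like this is to retry the whole request a few times with a growing delay. The sketch below is a generic helper, not a confirmed fix for this issue: `retry_async`, `make_call`, `retry_on`, and the default values are all assumptions. It uses only the standard library, so the exception types to retry on are passed in by the caller.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    make_call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await make_call(), retrying when one of `retry_on` is raised.

    `make_call` must start a fresh request on every invocation,
    because a partially read response cannot be resumed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")
```

With aiohttp, `retry_on` could be `(aiohttp.ClientPayloadError,)` and `make_call` a function that issues the GET and reads the JSON body in one go, so that a retry repeats both steps. Whether retrying actually resolves the error reported here is untested.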
