Patent Analysis

This is a tool for downloading US patent data, in particular the citation network, from patentsview.org and storing it in MongoDB for analysis. Currently, the dataset contains 6,215,171 patents (vertices) and 86,184,397 citations (edges).

For example, we can plot the in-degree distribution of the citation network:

[Figure: in-degree distribution of the citation network]

or average citations (out-degree) added per year:

[Figure: average citations (out-degree) added per year]

or edge/node (loglog) count over time:

[Figure: edge and node counts over time (log-log)]
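As a rough illustration of the first two statistics, here is a minimal sketch on a toy edge list. The patent IDs, grant years, and function names are hypothetical and are not taken from the project's code; the real analysis runs against the data stored in MongoDB.

```python
from collections import Counter, defaultdict

# Toy citation edges as (citing_patent, cited_patent); hypothetical IDs, not real data.
edges = [("C", "A"), ("C", "B"), ("D", "A"), ("D", "C"), ("E", "A")]
# Hypothetical grant year for each patent.
grant_year = {"A": 2000, "B": 2000, "C": 2001, "D": 2002, "E": 2002}


def in_degree_distribution(edges, nodes):
    """Map each in-degree value to the number of patents that have it."""
    in_degree = Counter(cited for _, cited in edges)
    # Patents that are never cited count as in-degree 0.
    return Counter(in_degree.get(node, 0) for node in nodes)


def average_out_degree_per_year(edges, grant_year):
    """Average number of citations made by the patents granted in each year."""
    out_degree = Counter(citing for citing, _ in edges)
    totals = defaultdict(int)
    counts = defaultdict(int)
    for patent, year in grant_year.items():
        totals[year] += out_degree.get(patent, 0)
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in sorted(counts)}


distribution = in_degree_distribution(edges, grant_year)
averages = average_out_degree_per_year(edges, grant_year)
```

Plotting `distribution` on log-log axes would give a chart of the same kind as the in-degree figure, at toy scale.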

See the notebook for other demos.

Getting Started

1) Python dependencies

First, install the required packages listed in requirements.txt (for example with pip install -r requirements.txt), either inside a virtualenv or into your user site-packages.

2) Database

Second, you need a running MongoDB instance. If you have Docker, you can start one with the following command:

$ docker run --name patent-mongo -v `pwd`/db:/data/db -p 27017:27017 -d mongo:3.4.6

(If your database host is not localhost, set it through an environment variable before running the app, e.g. $ export MONGO_URL="mongodb://host:27017".)
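How the app reads this variable is not shown here. The following is a minimal sketch of the usual pattern, assuming a fallback to localhost on the default port; the helper name `resolve_mongo_url` and the default constant are hypothetical.

```python
import os

# Assumed default: a local MongoDB on the standard port, as started by the docker command above.
DEFAULT_MONGO_URL = "mongodb://localhost:27017"


def resolve_mongo_url(environ=None):
    """Return MONGO_URL from the environment, falling back to localhost."""
    environ = os.environ if environ is None else environ
    return environ.get("MONGO_URL") or DEFAULT_MONGO_URL


# With pymongo installed, the URL could then be passed to a client:
#   from pymongo import MongoClient
#   client = MongoClient(resolve_mongo_url())
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.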

3) Downloading and storing the dataset:

This will download the dataset from patentsview.org/download/:

$ ./download_data.sh

Then insert it into the database:

$ python -m patent_analysis.insert_db

(This takes about 4 hours on an SSD.)

Now we are ready to work on the patent data.

Roadmap

  • Queries for WIPO categories
  • Centrality measure analysis such as PageRank
  • Community analysis
  • Embeddings
  • Assignee network analysis
  • ...
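To give a sense of what the planned PageRank analysis involves, here is a minimal power-iteration sketch on a toy citation graph. It is an illustration only, not the project's implementation; the damping factor, iteration count, and patent IDs are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (citing, cited) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by patents that cite nothing is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Toy citation edges (hypothetical patent IDs).
toy_edges = [("C", "A"), ("C", "B"), ("D", "A"), ("D", "C"), ("E", "A")]
ranks = pagerank(toy_edges)
```

For a graph with tens of millions of edges, a sparse-matrix or graph-library implementation would be more practical than this pure-Python loop.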

License

Released under the MIT license.

patent-analysis's People

Contributors

aksakalli
