openinsiderdata's Introduction

OpenInsider Scraper

If you've always wanted to dive into the riveting world of insider trading data but were too lazy to manually sift through thousands of pages, buckle up: you're in for a treat. This is your magic carpet for traveling through time, from 2013 to the present, gathering juicy tidbits of insider trading data along the way. It only goes up to the current month, though (if it could gather data from the future, I'd probably be on my yacht in the Caribbean by now).

What does this badboy do?

The code in this repository makes use of the requests and BeautifulSoup libraries in Python to scrape data from openinsider. The results are neatly tucked away in a CSV file. I also added threading so it's as fast as a leopard.

The script also comes with a built-in logger that logs events into a file because why not.
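To make the moving parts concrete, here is a minimal sketch of the fan-out, CSV and logging pattern described above. It is not the repository's actual code: `fetch_month`, `scrape`, the column names and the placeholder rows are all hypothetical, and the download step (requests plus BeautifulSoup in the real script) is replaced by a stub so the sketch runs offline.

```python
import csv
import logging
from concurrent.futures import ThreadPoolExecutor

# The real script logs to a file; this sketch logs to the console.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("openinsider-sketch")


def fetch_month(month):
    """Hypothetical stand-in for the fetch-and-parse step.

    The actual scraper would download a page with requests and extract
    table rows with BeautifulSoup; here we return one placeholder row.
    """
    logger.info("Fetching %s", month)
    return [[month, "TICKER", "TRADE_TYPE"]]  # placeholder, not real data


def scrape(months, out_path, max_workers=4):
    """Fetch all months in parallel and write the combined rows to a CSV."""
    all_rows = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order, so rows stay sorted by month.
        for rows in executor.map(fetch_month, months):
            all_rows.extend(rows)

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["period", "ticker", "trade_type"])  # assumed header
        writer.writerows(all_rows)
    logger.info("Wrote %d rows to %s", len(all_rows), out_path)
    return all_rows
```

Threads suit this kind of job because the work is dominated by waiting on network responses rather than by CPU time.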

How to Run

Docker

Simply build the image and run:

docker buildx build -t openinsider ./
mkdir data
docker run \
-v "${PWD}/data":/data \
-e OUTPUT_DIR="data" \
-it openinsider

You can also build the daily image and tell it when to start scraping:

docker buildx build -t openinsider-daily -f Dockerfile.daily ./
mkdir data
docker run \
-v "${PWD}/data":/data \
-e OUTPUT_DIR="data" \
-e START_DATE="2024-03-01" \
-it openinsider-daily
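
For orientation, this is roughly how a script could pick up the two environment variables passed above. It is a sketch under assumptions, not the code in this repository: `read_config` and `dates_from` are hypothetical names, and the fallbacks (the `data` directory, today's date) are guesses rather than the script's documented defaults.

```python
import os
from datetime import date, datetime, timedelta


def read_config(env=os.environ):
    """Read OUTPUT_DIR and START_DATE, falling back to assumed defaults."""
    output_dir = env.get("OUTPUT_DIR", "data")
    raw_start = env.get("START_DATE")
    if raw_start:
        # Same YYYY-MM-DD format as in the docker run example.
        start = datetime.strptime(raw_start, "%Y-%m-%d").date()
    else:
        start = date.today()  # assumption: start today when unset
    return output_dir, start


def dates_from(start, end):
    """Yield every calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)
```

A caller would then do something like `output_dir, start = read_config()` and loop over `dates_from(start, date.today())`.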

Bare Python

Running the script is as easy as a walk in the park... on a sunny day... with your favorite ice cream in your hand. Clone the repository, make sure you have the required libraries installed:

pip install --upgrade pip
pip install requests beautifulsoup4

and then just run the Python script (logging and datetime are part of the standard library, so they don't need installing). Grab a cup of coffee, and watch it do the work.

Disclaimer

While this tool is quite powerful, it comes with no guarantee of making you rich. It will just make you data rich, which isn't necessarily the same thing. Also, this tool does not promote insider trading. It's called insider trading data scraper, not insider trading-data scraper.

Last words

Enjoy the script and may the odds be ever in your favor!

openinsiderdata's People

Contributors

santiago-mooser, sd3v


openinsiderdata's Issues

If a date has no data, the script exits with an error

See:

Traceback (most recent call last):
  File "/daily.py", line 76, in <module>
    get_openinsider_data()
  File "/daily.py", line 59, in get_openinsider_data
    all_data.extend(future.result())
                    ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/daily.py", line 35, in get_data_for_date
    rows = soup.find('table', {'class': 'tinytable'}).find('tbody').findAll('tr', recursive=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'find'

This can be fixed if this line is split into two and a None check is made:

    # Find all the rows in the table on the website
    rows = soup.find('table', {'class': 'tinytable'}).find('tbody').findAll('tr', recursive=False)

Like this:

    tinytable = soup.find('table', {'class': 'tinytable'})

    if tinytable is None:
        print(f"No data for {date_string}")
        return data

    # Find all the rows in the table on the website
    rows = tinytable.find('tbody').findAll('tr', recursive=False)
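
As a complement to the None check, the caller could also be made tolerant of a single date failing, since `future.result()` re-raises whatever the worker thread raised (which is how the AttributeError in the traceback reached the top level). The following is only a sketch of that idea; `collect` and its `fetch` parameter are hypothetical and do not appear in daily.py.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger("openinsider-sketch")


def collect(fetch, date_strings, max_workers=4):
    """Run fetch(date_string) per date; log failures instead of aborting."""
    all_data, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its date so failures can be reported.
        futures = {executor.submit(fetch, d): d for d in date_strings}
        for future in as_completed(futures):
            date_string = futures[future]
            try:
                all_data.extend(future.result())
            except Exception:
                # result() re-raises the worker's exception here.
                logger.exception("Failed to fetch %s", date_string)
                failed.append(date_string)
    return all_data, failed
```

Because `as_completed` yields futures in completion order, the collected rows are not guaranteed to be in date order; sort them afterwards if that matters.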
