requests-ratelimiter's Introduction

Requests-Ratelimiter


This package is a simple wrapper around pyrate-limiter v2 that adds convenient integration with the requests library.

Full project documentation can be found at requests-ratelimiter.readthedocs.io.

Features

  • pyrate-limiter is a general-purpose rate-limiting library that implements the leaky bucket algorithm, supports multiple rate limits, and has optional persistence with SQLite and Redis backends
  • requests-ratelimiter adds some conveniences for sending rate-limited HTTP requests with the requests library
  • It can be used as either a session or a transport adapter
  • It can also be used as a mixin, for compatibility with other requests-based libraries
  • Rate limits are tracked separately per host
  • Different rate limits can optionally be applied to different hosts
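For intuition, the leaky bucket algorithm mentioned above can be sketched in a few lines of plain Python. This is an illustration only; pyrate-limiter's real implementation and API differ (it supports blocking rather than rejecting, multiple rates, and persistent backends):

```python
import time

class LeakyBucket:
    """Minimal illustration of the leaky-bucket idea (not pyrate-limiter's
    actual API): allow at most `capacity` requests per `period` seconds."""

    def __init__(self, capacity, period, clock=time.monotonic):
        self.capacity = capacity
        self.period = period
        self.clock = clock
        self.timestamps = []

    def try_acquire(self):
        now = self.clock()
        # "Leak" entries older than the period out of the bucket
        self.timestamps = [t for t in self.timestamps if now - t < self.period]
        if len(self.timestamps) < self.capacity:
            self.timestamps.append(now)
            return True
        return False
```

A rate limiter like this rejects requests once the bucket is full; pyrate-limiter (and therefore requests-ratelimiter) instead waits until capacity frees up, which is why the example output below shows requests pausing rather than failing.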

Installation

pip install requests-ratelimiter

Usage

Usage Options

There are three ways to use requests-ratelimiter:

Session

The simplest option is LimiterSession, which can be used as a drop-in replacement for requests.Session.

Note: By default, each session performs rate limiting independently. In a multi-threaded or multi-process environment, use a persistent backend such as SQLite or Redis, which can share the rate limit across threads, processes, and/or application restarts. When using requests-ratelimiter as part of a web application, a persistent backend is recommended so that the rate limit is shared across all requests.

Example:

from requests_ratelimiter import LimiterSession
from time import time

# Apply a rate limit of 5 requests per second to all requests
session = LimiterSession(per_second=5)
start = time()

# Send requests that stay within the defined rate limit
for i in range(20):
    response = session.get('https://httpbin.org/get')
    print(f'[t+{time()-start:.2f}] Sent request {i+1}')

Example output:

[t+0.22] Sent request 1
[t+0.26] Sent request 2
[t+0.30] Sent request 3
[t+0.34] Sent request 4
[t+0.39] Sent request 5
[t+1.24] Sent request 6
[t+1.28] Sent request 7
[t+1.32] Sent request 8
[t+1.37] Sent request 9
[t+1.41] Sent request 10
[t+2.04] Sent request 11
...

Adapter

For more advanced usage, LimiterAdapter is available to be used as a transport adapter.

Example:

from requests import Session
from requests_ratelimiter import LimiterAdapter

session = Session()

# Apply a rate-limit (5 requests per second) to all requests
adapter = LimiterAdapter(per_second=5)
session.mount('http://', adapter)
session.mount('https://', adapter)

# Send rate-limited requests
for user_id in range(100):
    response = session.get(f'https://api.some_site.com/v1/users/{user_id}')
    print(response.json())

Mixin

Finally, LimiterMixin is available for advanced use cases in which you want to add rate-limiting features to a custom session or adapter class. See the Custom Session Example below.

Rate Limit Settings

Basic Settings

The following parameters are available for the most common rate limit intervals:

  • per_second: Max requests per second
  • per_minute: Max requests per minute
  • per_hour: Max requests per hour
  • per_day: Max requests per day
  • per_month: Max requests per month
  • burst: Max number of consecutive requests allowed before applying per-second rate-limiting

Advanced Settings

If you need to define more complex rate limits, you can create a Limiter object instead:

from pyrate_limiter import Duration, RequestRate, Limiter
from requests_ratelimiter import LimiterSession

nanocentury_rate = RequestRate(10, Duration.SECOND * 3.156)
fortnight_rate = RequestRate(1000, Duration.DAY * 14)
trimonthly_rate = RequestRate(10000, Duration.MONTH * 3)
limiter = Limiter(nanocentury_rate, fortnight_rate, trimonthly_rate)

session = LimiterSession(limiter=limiter)

See pyrate-limiter docs for more Limiter usage details.

Backends

By default, rate limits are tracked in memory and are not persistent. You can optionally use either SQLite or Redis to persist rate limits across threads, processes, and/or application restarts. You can specify which backend to use with the bucket_class argument. For example, to use SQLite:

from pyrate_limiter import SQLiteBucket
from requests_ratelimiter import LimiterSession

session = LimiterSession(per_second=5, bucket_class=SQLiteBucket)

See pyrate-limiter docs for more details.

Other Features

Per-Host Rate Limit Tracking

With either LimiterSession or LimiterAdapter, rate limits are tracked separately for each host. In other words, requests sent to one host will not count against the rate limit for any other hosts:

session = LimiterSession(per_second=5)

# Make requests for two different hosts
for _ in range(10):
    response = session.get('https://httpbin.org/get')
    print(response.json())
    response = session.get('https://httpbingo.org/get')
    print(response.json())

If you have a case where multiple hosts share the same rate limit, you can disable this behavior with the per_host option:

session = LimiterSession(per_second=5, per_host=False)

Per-Host Rate Limit Definitions

With LimiterAdapter, you can apply different rate limits to different hosts or URLs:

# Apply a different set of rate limits (2/second and 100/minute) to a specific host
adapter_2 = LimiterAdapter(per_second=2, per_minute=100)
session.mount('https://api.some_site.com', adapter_2)

Behavior for matching requests is the same as other transport adapters: requests will use the adapter with the most specific (i.e., longest) URL prefix that matches a given request. For example:

session.mount('https://api.some_site.com/v1', adapter_3)
session.mount('https://api.some_site.com/v1/users', adapter_4)

# This request will use adapter_3
session.get('https://api.some_site.com/v1/')

# This request will use adapter_4
session.get('https://api.some_site.com/v1/users/1234')
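The longest-prefix rule can be sketched in plain Python. This is an illustration of the lookup behavior described above, not requests' actual implementation:

```python
def pick_adapter(mounts, url):
    """Return the adapter mounted at the longest prefix matching the URL,
    mimicking how requests selects a transport adapter."""
    # Check mounted prefixes from longest to shortest; the first prefix
    # the URL starts with wins
    for prefix in sorted(mounts, key=len, reverse=True):
        if url.lower().startswith(prefix.lower()):
            return mounts[prefix]
    raise KeyError(f'No adapter mounted for {url}')

mounts = {
    'https://': 'default_adapter',
    'https://api.some_site.com/v1': 'adapter_3',
    'https://api.some_site.com/v1/users': 'adapter_4',
}
print(pick_adapter(mounts, 'https://api.some_site.com/v1/'))         # adapter_3
print(pick_adapter(mounts, 'https://api.some_site.com/v1/users/1'))  # adapter_4
```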

Custom Tracking

For advanced use cases, you can define your own custom tracking behavior with the bucket option. For example, with an API that enforces rate limits based on a tenant ID, this feature can be used to track rate limits per tenant. If bucket is specified, host tracking is disabled.

Note: It is advisable to use the SQLite or Redis backend with custom tracking, because with the default in-memory backend each session tracks rate limits independently, even if both sessions call the same URL.

sessionA = LimiterSession(per_second=5, bucket='tenant1')
sessionB = LimiterSession(per_second=5, bucket='tenant2')

Rate Limit Error Handling

Sometimes, server-side rate limiting may not behave exactly as documented (or may not be documented at all). Or you might encounter other scenarios where your client-side limit gets out of sync with the server-side limit. Typically, a server will send a 429: Too Many Requests response for an exceeded rate limit.

When this happens, requests-ratelimiter will adjust its request log in an attempt to catch up to the server-side limit. If a server uses a status code other than 429 to indicate an exceeded limit, you can set this with limit_statuses:

session = LimiterSession(per_second=5, limit_statuses=[429, 500])

Or if you would prefer to disable this behavior and handle it yourself:

session = LimiterSession(per_second=5, limit_statuses=[])

Compatibility

There are many other useful libraries out there that add features to requests, most commonly by extending or modifying requests.Session or requests.HTTPAdapter.

To use requests-ratelimiter with one of these libraries, you have a few different options:

  1. If the library provides a custom Session class, mount a LimiterAdapter on it
  2. Or use LimiterMixin to create a custom Session class with features from both libraries
  3. If the library provides a custom Adapter class, use LimiterMixin to create a custom Adapter class with features from both libraries

Custom Session Example: Requests-Cache

For example, to combine with requests-cache, which also includes a separate mixin class:

from requests import Session
from requests_cache import CacheMixin
from requests_ratelimiter import LimiterMixin, SQLiteBucket


class CachedLimiterSession(CacheMixin, LimiterMixin, Session):
    """
    Session class with caching and rate-limiting behavior. Accepts arguments for both
    LimiterSession and CachedSession.
    """


# Optionally use SQLite as both the bucket backend and the cache backend
session = CachedLimiterSession(
    per_second=5,
    cache_name='cache.db',
    bucket_class=SQLiteBucket,
    bucket_kwargs={
        "path": "cache.db",
        'isolation_level': "EXCLUSIVE",
        'check_same_thread': False,
    },
)

This example has an extra benefit: cache hits won't count against your rate limit!

requests-ratelimiter's People

Contributors

dependabot[bot], jwcook, paddyroddy


requests-ratelimiter's Issues

Only limit if request is not cached

I am trying to implement rate limiting with an additional cache. This works as far as it goes, but I want to apply the rate limit only if the entry cannot be loaded from the cache. What is the best way to implement this?

self.session = CachedLimiterSession(
    limiter=Limiter(RequestRate(2, Duration.SECOND * 5)),  # max 2 requests per 5 seconds
    bucket_class=SQLiteBucket,
    backend=SQLiteCache(ROOT_DIR + "/cache/cache.sqlite"),
)

Handle 304 responses

Some APIs that support conditional requests, like the GitHub API, do not count it against your rate limit if you make a conditional request and get a 304 Not Modified response. It might be convenient to add an option to not apply client-side rate-limiting for 304s.

How to set random or changing delays?

I'd like to spread my requests out a bit so that it's not just once every x seconds. However, I don't see an example of how to stagger the delays. Ideally, I'd like to set two limits: a random limit (one request every .5 to 5 seconds) and then also an overall limit (30 per minute). Is this possible with requests-ratelimiter?
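One possible approach (not a built-in feature of requests-ratelimiter) is to add a random jitter before each request and let a per_minute limit enforce the overall cap. A sketch:

```python
import random
import time

def jittered_sleep(min_s=0.5, max_s=5.0):
    """Sleep for a random interval between min_s and max_s seconds,
    returning the delay that was used."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay

# Call jittered_sleep() before each session.get(); a session created with
# LimiterSession(per_minute=30) would then still enforce the overall limit.
```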

Add feature for custom buckets

I need finer-grained control over the rate-limiting, as I am using an API that is rate limited on a per-tenant basis. I see that the underlying pyrate-limiter library supports this, and that it is used for per-host tracking. I would like to add support for custom caching buckets as an optional replacement for host-based tracking.

`limiter` attribute lost in pickling

I'm experimenting with passing a rate-limiter object to processes, and encountered a pickling bug:

from requests_ratelimiter import LimiterSession
from pyrate_limiter import Duration, RequestRate, Limiter
history_rate = RequestRate(1, Duration.SECOND)
limiter = Limiter(history_rate)
session = LimiterSession(limiter=limiter)

import pickle
pickled_session = pickle.dumps(session)
unpickled_session = pickle.loads(pickled_session)
unpickled_session.limiter  # raises AttributeError: the limiter attribute was lost
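A likely cause (my reading of requests' pickling behavior, not confirmed in the issue): requests.Session pickles only the attributes listed in its __attrs__ class variable, so attributes added by a subclass, such as limiter, are silently dropped. A minimal reproduction of that mechanism, using hypothetical stand-in classes:

```python
import pickle

class AttrsOnlySession:
    """Mimics requests.Session, which pickles only the names in __attrs__."""
    __attrs__ = ['headers']

    def __init__(self):
        self.headers = {}

    def __getstate__(self):
        # Only __attrs__ members survive pickling
        return {attr: getattr(self, attr, None) for attr in self.__attrs__}

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)

class LimitedSession(AttrsOnlySession):
    def __init__(self):
        super().__init__()
        self.limiter = 'limiter object'  # not in __attrs__, so not pickled

restored = pickle.loads(pickle.dumps(LimitedSession()))
print(hasattr(restored, 'limiter'))  # False: the attribute was lost
```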

RedisBucket not applying rate limit

import http
import json
import time

import redis
from loguru import logger
from pyrate_limiter import RedisBucket
from requests_ratelimiter import LimiterSession

from config import settings


class EtherscanAPI:
    """Client for the Etherscan API (https://docs.etherscan.io/).

    Free API plan: 5 calls per second.
    """

    proto = 'https://'
    domain = 'api.etherscan.io'
    redis_pool = redis.ConnectionPool(
        host=settings.REDIS_HOST,
        port=settings.REDIS_PORT,
        password=settings.REDIS_PASSWORD,
        db=settings.REDIS_DB,
    )
    session = LimiterSession(
        per_second=5,
        bucket_class=RedisBucket,
        bucket_kwargs={'redis_pool': redis_pool, 'bucket_name': 'etherscan_api'},
    )

    @classmethod
    def get(cls, path='/api', **params):
        """

        :param path: 
        :param params:
        :return:
        """
        url = cls.proto + cls.domain + path

        cls.session.headers.update(
            {
                "User-Agent": settings.UA,
                "Connection": "keep-alive",
            }
        )
        r = cls.session.get(url, params=params, timeout=30)
        if r.status_code == http.HTTPStatus.OK:
            try:
                rsp = r.json()
            except json.decoder.JSONDecodeError:
                return r
            if rsp["status"] == '1' and rsp["message"] == "OK":
                return rsp
            else:
                logger.error(
                    "request etherscan api error, response data's status code is error: %s, %s, %s, %s"
                    % (path, rsp["result"], rsp["status"], rsp["message"])
                )
                return False
        else:
            logger.error(
                "request etherscan api error, request url: %s, param: %s, response status_code: %s"
                % (url, params, r.status_code)
            )
            return False

    @classmethod
    def get_gas_price(cls):
        payload = {"module": "gastracker", "action": "gasoracle", "apikey": settings.ETHERSCAN_API_KEY}
        return cls.get('/api', **payload)


if __name__ == '__main__':
    start = time.time()

    # Send requests that stay within the defined rate limit
    # for i in range(20):
    #     EtherscanAPI.get_gas_price()
    #     print(f'[t+{time.time()-start:.2f}] Sent request {i+1}')

    from concurrent.futures import ThreadPoolExecutor

    with ThreadPoolExecutor(max_workers=20) as executor:
        futures = [executor.submit(EtherscanAPI.get_gas_price) for _ in range(20)]
    print([future.result() for future in futures])

The output contains errors showing that the server-side rate limit was still exceeded:

2024-02-04 17:53:25.102 | ERROR    | __main__:get:76 - request etherscan api error, response data's status code is error: /api, Max rate limit reached, 0, NOTOK
2024-02-04 17:53:25.117 | ERROR    | __main__:get:76 - request etherscan api error, response data's status code is error: /api, Max rate limit reached, 0, NOTOK
[{'status': '1', 'message': 'OK', 'result': {'LastBlock': '19154180', 'SafeGasPrice': '15', 'ProposeGasPrice': '15', 'FastGasPrice': '15', 'suggestBaseFee': '14.182493938', 'gasUsedRatio': '0.387329566666667,0.496199666666667,0.2551076,0.296484866666667,0.630901033333333'}}, ..., False, ..., False, ...]

Handle 429 responses

If client-side rate-limiting doesn't match the server-side rate-limiting (or the server doesn't behave as documented/expected), it may still be possible to get a 429 Too Many Requests response.

This could potentially be handled by this library. Or would it be better just to let the user handle this with standard retry behavior, e.g. with urllib3.util.Retry?
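For comparison, the user-side alternative mentioned above can be configured with urllib3's Retry on a plain HTTPAdapter. A sketch, independent of requests-ratelimiter:

```python
from requests import Session
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# Retry up to 5 times on 429, honoring Retry-After and backing off exponentially
retry = Retry(
    total=5,
    status_forcelist=[429],
    backoff_factor=1,
    respect_retry_after_header=True,
)
session = Session()
session.mount('https://', HTTPAdapter(max_retries=retry))
```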

Handle rate-limiting headers?

Some APIs, like the GitHub API, send rate-limiting response headers:

  • X-RateLimit-Limit
  • X-RateLimit-Remaining
  • X-RateLimit-Reset
  • Retry-After

When these are provided, they could potentially be used instead of the leaky bucket to apply client-side rate-limiting. Or that might be out of scope for this library. Just an idea.
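Reading those headers could look roughly like this. The header names are as listed above; the helper function itself is hypothetical and not part of requests-ratelimiter:

```python
def parse_rate_limit_headers(headers):
    """Extract the standard rate-limit headers from a response header dict,
    returning None for any that the server did not send."""
    def to_int(value):
        return int(value) if value is not None else None
    return {
        'limit': to_int(headers.get('X-RateLimit-Limit')),
        'remaining': to_int(headers.get('X-RateLimit-Remaining')),
        'reset': to_int(headers.get('X-RateLimit-Reset')),
        # Note: Retry-After may also be an HTTP date; this sketch assumes seconds
        'retry_after': to_int(headers.get('Retry-After')),
    }

info = parse_rate_limit_headers({'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '0'})
print(info['remaining'])  # 0
```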
