Git Product home page Git Product logo

gspread_asyncio's Introduction

gspread_asyncio

An asyncio wrapper for burnash's excellent Google Spreadsheet API library. gspread_asyncio isn't just a plain asyncio wrapper around the gspread API, it implements several useful and helpful features on top of those APIs. It's useful for long-running processes and one-off scripts.

Requires Python >= 3.8.

Documentation Status CI status

Features

  • Complete async wrapping of the gspread API. All gspread API calls are run off the main thread in a threadpool executor.
  • Internal caching and reuse of gspread Client/Spreadsheet/Worksheet objects.
  • Automatic renewal of expired credentials.
  • Automatic retries of spurious failures from Google's servers (HTTP 5xx).
  • Automatic rate limiting with defaults set to Google's default API limits.
  • Many methods that don't need to return a value can optionally return an already-scheduled Future (the nowait kwarg). You can ignore that future, allowing forward progress on your calling coroutine while the asyncio event loop schedules and runs the Google Spreadsheet API call at a later time for you.

Example usage

import asyncio

import gspread_asyncio

# from google-auth package
from google.oauth2.service_account import Credentials 

# First, set up a callback function that fetches our credentials off the disk.
# gspread_asyncio needs this to re-authenticate when credentials expire.

def get_creds():
    # To obtain a service account JSON file, follow these steps:
    # https://gspread.readthedocs.io/en/latest/oauth2.html#for-bots-using-service-account
    creds = Credentials.from_service_account_file("serviceacct_spreadsheet.json")
    scoped = creds.with_scopes([
        "https://spreadsheets.google.com/feeds",
        "https://www.googleapis.com/auth/spreadsheets",
        "https://www.googleapis.com/auth/drive",
    ])
    return scoped

# Create an AsyncioGspreadClientManager object which
# will give us access to the Spreadsheet API.

agcm = gspread_asyncio.AsyncioGspreadClientManager(get_creds)

# Here's an example of how you use the API:

async def example(agcm):
    # Always authorize first.
    # If you have a long-running program call authorize() repeatedly.
    agc = await agcm.authorize()

    ss = await agc.create("Test Spreadsheet")
    print("Spreadsheet URL: https://docs.google.com/spreadsheets/d/{0}".format(ss.id))
    print("Open the URL in your browser to see gspread_asyncio in action!")

    # Allow anyone with the URL to write to this spreadsheet.
    await agc.insert_permission(ss.id, None, perm_type="anyone", role="writer")

    # Create a new spreadsheet but also grab a reference to the default one.
    ws = await ss.add_worksheet("My Test Worksheet", 10, 5)
    zero_ws = await ss.get_worksheet(0)

    # Write some stuff to both spreadsheets.
    for row in range(1, 11):
        for col in range(1, 6):
            val = "{0}/{1}".format(row, col)
            await ws.update_cell(row, col, val + " ws")
            await zero_ws.update_cell(row, col, val + " zero ws")
    print("All done!")

# Turn on debugging if you're new to asyncio!
asyncio.run(example(agcm), debug=True)

Observational notes and gotchas

  • This module does not define its own exceptions, it propagates instances of gspread.exceptions.GSpreadException.
  • Always call AsyncioGspreadClientManager.authorize(), AsyncioGspreadClient.open_*() and AsyncioGspreadSpreadsheet.get_worksheet() before doing any work on a spreadsheet. These methods keep an internal cache so it is painless to call them many times, even inside of a loop. This makes sure you always have a valid set of authentication credentials from Google.
  • The only object you should store in your application is the AsyncioGspreadClientManager (agcm).
  • Right now the gspread library does not support bulk appends of rows or bulk changes of cells. When this is done gspread_asyncio will support batching of these Google API calls without any changes to the Python gspread_asyncio API.
  • I came up with the default 1.1 second delay between API calls (the gspread_delay kwarg) after extensive experimentation. The official API rate limit is one call every second but however Google measures these things introduces a tiny bit of jitter that will get you rate blocked if you ride that limit exactly.
  • Google's service reliability on these endpoints is surprisingly bad. There are frequent HTTP 500s and the retry logic will save your butt in long-running scripts or short, one-shot, one-off ones.
  • Experimentation also found that Google's credentials expire after an hour and the default reauth_interval of 45 minutes takes care of that just fine.

License

MIT

Sponsorship

Development of gspread_asyncio is sponsored by Pro Football History.com, your source for NFL coaching biographies.

gspread_asyncio's People

Contributors

2qar avatar azjps avatar dgilman avatar erakli avatar imasters11 avatar klauria-factorial avatar kraptor avatar mat0mba24 avatar sethis avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

gspread_asyncio's Issues

AsyncioGspreadWorksheet.range() stopped accepting numeric boundaries

Hi,

With this commit: 75bdaee the definition of function range() in AsyncioGspreadWorksheet class changed from:

async def range(self, *args, **kwargs)

to:

async def range(self, name: str)

But wrapped library seems to still support different notations for this function: https://docs.gspread.org/en/latest/api/models/worksheet.html#gspread.worksheet.Worksheet.range. So code that used numeric boundaries now gets an error after updating to a newer version of gsread_asyncio:

AsyncioGspreadWorksheet.range() takes 2 positional arguments but 5 were given

[Bug] AsyncioGspreadWorksheet.get doesn't work

Problem

AsyncioGspreadWorksheet.get throws the TypeError exception.

values = await sh.get()
../python3.10/site-packages/gspread/utils.py:666: TypeError
TypeError: get got unexpected keyword arguments: ['range_name']

I think it happens because gspread decorates Worksheet.get function with:

@accepted_kwargs(
        major_dimension=None,
        value_render_option=None,
        date_time_render_option=None,
    )

so range_name parameter for some reason can no longer be in kwargs

Traceback

    @wraps(f)
    def wrapper(*args, **kwargs):
        unexpected_kwargs = set(kwargs) - set(default_kwargs)
        if unexpected_kwargs:
            err = "%s got unexpected keyword arguments: %s"
>           raise TypeError(err % (f.__name__, list(unexpected_kwargs)))
E           TypeError: get got unexpected keyword arguments: ['range_name']

Possible Solution

Move range_name from kwargs to args in _call function

Before

return await self.agcm._call(
            self.ws.get,
            range_name=range_name,
            major_dimension=major_dimension,
            value_render_option=value_render_option,
            date_time_render_option=date_time_render_option,
        )

After

return await self.agcm._call(
            self.ws.get,
            range_name,
            major_dimension=major_dimension,
            value_render_option=value_render_option,
            date_time_render_option=date_time_render_option,
        )

[Bug] Worksheet.get_all_records() fails when numericise_ignore is omitted

Problem

When you call Worksheet.get_all_records() without the numericise_ignore parameter, it fails with:

TypeError: argument of type 'NoneType' is not iterable

This is due to 75bdaee in gspread, which boldly changed the default from None to [] (it's still None in gspread_asyncio). As a workaround, you can pass numericise_ignore=[], so it's probably just a matter of making that the default.

Steps to reproduce

  1. Open a worksheet with values past the head row, e.g., a sheet with just a 1 in A2.
  2. Call ws.get_all_records() with no parameters.
  3. The call will raise a TypeError.

Code example

async def main():
  agc = await agcm.authorize()
  ss = await agc.open_by_url(SHEETS_URL)
  ws = await ss.worksheet(WORKSHEET)
  await ws.get_all_records()

Stack trace

Traceback (most recent call last):
  File "/home/yob/programming/discord/callbot/sheets.py", line 31, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/yob/programming/discord/callbot/sheets.py", line 28, in main
    print(await ws.get_all_records())
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread_asyncio/__init__.py", line 1926, in get_all_records
    return await self.agcm._call(
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread_asyncio/__init__.py", line 122, in _call
    rval = await self._loop.run_in_executor(None, fn)
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread/worksheet.py", line 507, in get_all_records
    values = [
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread/worksheet.py", line 508, in <listcomp>
    numericise_all(
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread/utils.py", line 204, in numericise_all
    numericised_list = [
  File "/home/yob/programming/discord/callbot/venv/lib/python3.10/site-packages/gspread/utils.py", line 206, in <listcomp>
    if index + 1 in ignore
TypeError: argument of type 'NoneType' is not iterable

Await find does not have case insensitive option

I was wondering when or if AsyncioGspreadWorksheet.find() will have the case_insensitive parameter like Gspread 5.3.0+ has.

This would be very helpful for the application I'm making. Right now I have to make an extra API call and get the values of a column to make it case insensitive.

Threads piling up when on a loop. Some say to trim these down after my code has executed a single loop?

Hi there,

I basically am reading a random row of data every 5 seconds, but the program (expectedly) is making a new thread every loop. Is there any adverse effects to leaving these threads open for long periods of time (somewhere in the range of weeks at a time)? Can I close the thread at the end of the code loop in someway so as to save memory (if it affects memory)?

Code:
`async def run(agcm):

while True:

    gclient = await agcm.authorize()

    sheet = await gclient.open_by_key('1Hkwo9gSpk3NjgPLPkG8kh0zBNw2nxsYWRw0cVdn0JA0')

    ws = await sheet.get_worksheet(0)

    rcount = ws.row_count

    msg = await ws.cell(random.randint(1,rcount),1)

    msg = msg.value

    print(msg)

    await asyncio.sleep(5)`

spreadsheets.get

In the Google Sheets for Developers documentation, there is a "spreadsheets.get", which has the "includeGridData" parameter, I need this method to get the cell color, is there such a thing in gspread_asyncio or is it planned to add such a method?

Can you release a 0.1.0?

Would you be interested in a pull-request to help you release this as a lib? I would be interested in using the Asyncio wrapper.

Worksheet update method raise warning

The following warning is raised:
UserWarning: [Deprecated][in version 6.0.0]: method signature will change to: 'Worksheet.update(value = [[]], range_name=)' arguments 'range_name' and 'values' will swap, values will be mandatory of type: 'list(list(...))
Could be possible updating code calling underlying lib.

Add __iter__ to spreadsheet

This is for compatibility with gspread, however it has async implications that I'm not totally sure of yet.

TypeError: get_values got unexpected keyword arguments: ['range_name']

burnash's gspread somewhy restricts the kwargs that can be used for worksheet.get_values: https://github.com/burnash/gspread/blob/master/gspread/worksheet.py#L248-L253

Your library makes a call providing range_name as a kwarg: https://github.com/dgilman/gspread_asyncio/blob/master/gspread_asyncio/__init__.py#L1208

This leads to the following error:

  File "…/site-packages/gspread/utils.py", line 588, in wrapper
    raise TypeError(err % (f.__name__, list(unexpected_kwargs)))
TypeError: get_values got unexpected keyword arguments: ['range_name']

Versions: gspread-asyncio==1.5.0 gspread=4.0.1

updateDimensionProperties error: string indices must be integers

Hello.
Trying to set width of columns:
body = { "requests": [ { "updateDimensionProperties": { "dimensions": { "sheetId": sheet_id, "dimension": "COLUMNS", "startIndex": 0, "endIndex": 1 }, "properties": { "pixelSize": 200 }, } } ] } await sheet.batch_update(body)

and get TypeError: "string indices must be integers"

p.s. same with
body = { "requests": [ { "updateDimensionProperties": { "range": { "sheetId": sheet_id, "dimension": "COLUMNS", "startIndex": 0, "endIndex": 1 }, "properties": { "pixelSize": 100 }, "fields": "pixelSize" } } ] }

Remove internal logger

Remove the PFH internal logger and just use the python logger directly in a normal fashion. Move them out of callbacks whenever they were in callbacks.

Set up travis

  • Lint, pep8
  • run tests, possibly
  • have it publish to pypi

What is the BEST way to authorize credentials? In a loop that runs every so often, or before every function that uses gspread_asyncio?

Right now, I have an authorization loop that runs every 60 seconds that automatically re-authorizes the credentials for the sheet I am using.

async def refresh_google_creds(agcm):
    await bot.wait_until_ready() #waits until discord.client has initialized and is ready
    while not bot.is_closed:  #returns true if the discord.client is running
        if bot.creds.access_token_expired:
            print('Refreshed google creds at ' + time.ctime())
        bot.gclient = await agcm.authorize()
        bot.sheet = await bot.gclient.open_by_key('sheetkey')
        bot.database_sheet = await bot.gclient.open_by_key('sheet key')
        await asyncio.sleep(60) # task runs every 60 seconds

All my other code is calling on the regular functions and methods to read/write to the spreadsheet that is being authorized in in the refresh_google_creds function.

@bot.command(pass_context=True)
async def function(ctx):
    ws = await bot.sheet.get_worksheet(0)
    rcount = ws.row_count
    msg = await ws.cell(random.randint(1,rcount),1)
    await bot.send_message(ctx.message.channel, msg.value)

The reason I ask is because I still am getting some blocking code for some reason. Last night, my bot crashed because of the error Task was destroyed but it is pending! citing the loop refresh_google_creds as the offending line. Any ideas?

Extra logging handler

I'm using gspread_asyncio 1.9.0 (with gspread 5.10.0) and I've been noticing a weird logging behaviour.

Whenever AsyncioGspreadClient.open() is called, an extra handler is added to the one I'm configure in my code, dupplicating any log output

Small example that works as expected with vanilla gspread 5.10.0

import logging

import gspread
from google.oauth2.service_account import Credentials

def main():
    
    steam_handler = logging.StreamHandler()
    logger = logging.getLogger("test")
    logger.setLevel(logging.INFO)
    logger.addHandler(steam_handler)
    
    creds = Credentials.from_service_account_file(
        "service_account.json",
        scopes = [
            "https://spreadsheets.google.com/feeds",
            "https://www.googleapis.com/auth/spreadsheets",
            "https://www.googleapis.com/auth/drive",
        ]
    )
    client = gspread.authorize(creds)
    logger.info("Before client.open")
    spreadsheet = client.open(title="My Spreadsheet")
    logger.info('After client.open')

if __name__ == "__main__":
    main()

Output

Before client.open
After client.open

The same example with gspread_asyncio 1.9.0 (relying on gspread 5.10.0)

import asyncio

import logging

import gspread_asyncio
from google.oauth2.service_account import Credentials

async def main():
    
    steam_handler = logging.StreamHandler()
    logger = logging.getLogger("test")
    logger.setLevel(logging.INFO)
    logger.addHandler(steam_handler)

    def get_creds():
        creds = Credentials.from_service_account_file("service_account.json")
        scoped = creds.with_scopes([
            "https://spreadsheets.google.com/feeds",
            "https://www.googleapis.com/auth/spreadsheets",
            "https://www.googleapis.com/auth/drive",
        ])
        return scoped

    agcm = gspread_asyncio.AsyncioGspreadClientManager(get_creds)
    
    agc = await agcm.authorize()
    
    logger.info("Before agc.open")
    spreadsheet = await agc.open(title="My Spreadsheet")
    logger.info('After agc.open')

if __name__ == "__main__":
    asyncio.run(main())

Output

Before agc.open
After agc.open
INFO:test:After agc.open

How to migrate from gspread to gspread_asyncio?

Hi, I still can not migrade my code from gspread to gspread_asyncio?

my current code:

async def test(self, ctx):

        scope = ["https://spreadsheets.google.com/feeds", 'https://www.googleapis.com/auth/spreadsheets', "https://www.googleapis.com/auth/drive.file", "https://www.googleapis.com/auth/drive"]

        creds = ServiceAccountCredentials.from_json_keyfile_name("Creds.json", scope)

        client = gspread.authorize(creds)

       sheet1 = client.open('sheet_name')

       sheet = sheet1.worksheet('Summary')

        name = sheet.acell('Z5').value

        print(name)

I did try this one but I got error Command raised an exception: AttributeError: 'ServiceAccountCredentials' object has no attribute 'auth_lock'

        scope = ["https://spreadsheets.google.com/feeds", 'https://www.googleapis.com/auth/spreadsheets', "https://www.googleapis.com/auth/drive.file", "https://www.googleapis.com/auth/drive"]

        creds = ServiceAccountCredentials.from_json_keyfile_name("Creds.json",scope)

        client = await gspread_asyncio.AsyncioGspreadClientManager.authorize(creds)

        sheet1 = await client.open('sheet_name')

        sheet = await sheet1.worksheet('Summary')

        name = sheet.acell('Z5').value

        print(name)

But when I try the code from your example document it works fine.

AGC.del_spreadsheet only updates 1 of 2 caches

AsyncioGspreadClient.del_spreadsheet doesn't remove the sheet from the _ss_cache_title cache. So if you delete and recreate a sheet with the same name, then the AGC.open() function grabs the cached, deleted sheet and things go bad.

worksheet.freeze()

Hello! Keep on getting this error for some reason:
asyncio.run(test()) File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run return runner.run(main) ^^^^^^^^^^^^^^^^ File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete return future.result() ^^^^^^^^^^^^^^^ File "/Users/amirsakov/Downloads/Telegram Bots/brmanager_reloaded/sheet.py", line 20, in test await worksheet.freeze() File "/Users/amirsakov/Downloads/Telegram Bots/brmanager_reloaded/.venv/lib/python3.11/site-packages/gspread_asyncio/__init__.py", line 1852, in freeze return await self.agcm._call(self.ws.format, rows=rows, cols=cols) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/amirsakov/Downloads/Telegram Bots/brmanager_reloaded/.venv/lib/python3.11/site-packages/gspread_asyncio/__init__.py", line 122, in _call rval = await self._loop.run_in_executor(None, fn) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TypeError: Worksheet.format() got an unexpected keyword argument 'rows'

This is what I use in my code agcm = gspread_asyncio.AsyncioGspreadClientManager(get_creds) agc = await agcm.authorize() ss = await agc.open_by_url('https://docs.google.com/spreadsheets/d/14FzPFJ5Y7Sh9KkmALKA6-jj7TgYA9oqWLoWbTvXmOHk/edit#gid=543611067') worksheet = await ss.worksheet("Учётность состава") await worksheet.freeze(rows=1,cols=1)

Kindly advice on solving the issue

Question about the library

is this library a regular wrapper in the form of asyncio? Ie all functions are the same as in synchronous?

Error with gspread/models.py

Recently I have some problem with my project.

File "/usr/local/lib/python3.9/site-packages/gspread/models.py", line 93, in title
return self._properties['title']
KeyError: 'title'

How could I fix it?
I use next requirements:
gspread==4.*
gspread-asyncio

Bulk cell and bulk row methods

These need to be implemented upstream. Then, the gspread_asyncio code can have optional batching modes for row appends and cell updates that use these methods.

Worksheet's `update()` coroutine method's `raw` parameter is ignored

Using gspread_asyncio.AsyncGspreadWorksheet's update() method to add, say, formulas to a cell, isn't possible. Formulas will be sent as raw content, even if you pass the raw=False parameter:

import gspread_asyncio

FORMULA: str = '=SUM(A1:C1)'
LOCATION: str = 'A2'

sheet: gspread_asyncio.AsyncGspreadWorksheet = ...
await sheet.update(LOCATION, FORMULA, raw=False)

Cell A2 will be updated with the text =SUM(A1:C1), and not the formula it should be parsed as.

This works, however:

import asyncio

await asyncio.to_thread(
  sheet.ws.update,
  LOCATION,
  FORMULA,
  raw=False
)

Cell A2 will be updated correctly with the formula =SUM(A1:C1), and not just raw text.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.