mosquito / aiofile
Real asynchronous file operations with asyncio support.
License: Other
Since most users prefer to install packages with pip, please update the version published for pip.
For example, the following problem has already been solved but is still present in the pip version:
My task is to create a/b/c/d/e.txt, where the directories b, c and d don't exist yet.
How can I create a new directory using aiofile?
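One possible approach, sketched here with the standard library only: create the missing parent directories with os.makedirs in a worker thread, then write the file. The names write_with_parents and demo are hypothetical, and the blocking write inside _write is an assumption that could be swapped for an aiofile call.

```python
import asyncio
import os
import tempfile


async def write_with_parents(path: str, text: str) -> None:
    """Create missing parent directories, then write ``text`` to ``path``."""
    loop = asyncio.get_running_loop()
    parent = os.path.dirname(path)
    if parent:
        # os.makedirs blocks, so run it in the default thread pool.
        await loop.run_in_executor(
            None, lambda: os.makedirs(parent, exist_ok=True)
        )

    def _write() -> None:
        # Plain blocking write (assumption); replace with aiofile if preferred.
        with open(path, "w") as fp:
            fp.write(text)

    await loop.run_in_executor(None, _write)


async def demo() -> str:
    root = tempfile.mkdtemp()
    target = os.path.join(root, "a", "b", "c", "d", "e.txt")
    await write_with_parents(target, "hello")
    return target
```

exist_ok=True makes the call safe when some of the directories already exist.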
Can't use in Docker image (FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim)
async with async_open(f"{folder}/{filename}", "w+") as afp:
/usr/local/lib/python3.8/site-packages/aiofile/utils.py:327: in async_open
afp = AIOFile(str(file_specifier), mode, *args, **kwargs)
/usr/local/lib/python3.8/site-packages/aiofile/aio.py:127: in __init__
self.__context = context or get_default_context()
/usr/local/lib/python3.8/site-packages/aiofile/aio.py:316: in get_default_context
return create_context()
/usr/local/lib/python3.8/site-packages/aiofile/aio.py:298: in create_context
context = caio.AsyncioContext(max_requests, loop=loop)
/usr/local/lib/python3.8/site-packages/caio/asyncio_base.py:22: in __init__
self.context = self._create_context(max_requests, **kwargs)
/usr/local/lib/python3.8/site-packages/caio/linux_aio_asyncio.py:10: in _create_context
context = super()._create_context(max_requests)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <caio.linux_aio_asyncio.AsyncioContext object at 0x4035c58a30>
max_requests = 512, kwargs = {}
def _create_context(self, max_requests, **kwargs):
> return self.CONTEXT_CLASS(max_requests=max_requests, **kwargs)
E SystemError: (38, 'Function not implemented')
/usr/local/lib/python3.8/site-packages/caio/asyncio_base.py:25: SystemError
python==3.8
aiofile==3.7.2
caio==0.9.3
The example at https://github.com/mosquito/aiofile/blob/master/README.rst#async_open-helper is missing a seek before readline. I have added "afp.seek(0)" in the code snippet below.
{code}
await afp.write("Hello from\nasync world")
>> afp.seek(0) <<
print(await afp.readline())
print(await afp.readline())
{code}
Hi, your example at https://github.com/mosquito/aiofile#async-csv-dict-reader
has a bug in it: __aiter__ shouldn't be a coroutine. It will never be awaited, since it isn't supposed to be async.
You will get the following:
RuntimeWarning: coroutine 'AsyncDictReader.__aiter__' was never awaited
TypeError: 'async for' received an object from __aiter__ that does not implement __anext__: coroutine
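For reference, a minimal sketch of the async iterator protocol with a plain (non-async) __aiter__ and an async __anext__. The Countdown class and the collect helper are hypothetical and unrelated to the README's AsyncDictReader.

```python
import asyncio


class Countdown:
    """Async iterator that yields n, n-1, ..., 1."""

    def __init__(self, n: int) -> None:
        self._n = n

    def __aiter__(self):
        # Plain method: returns the iterator object directly, nothing to await.
        return self

    async def __anext__(self) -> int:
        if self._n <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # stand-in for real asynchronous work
        value = self._n
        self._n -= 1
        return value


async def collect(n: int) -> list:
    return [item async for item in Countdown(n)]
```

Declaring __aiter__ with async def would make it return a coroutine object instead of the iterator, which is what the TypeError above complains about.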
Installing via the git repository fails due to the above missing file. AFAICT it's not in the repository.
I just realized 0.5.2 is on pypi but it would be nice to have this as a backup.
Hello,
Thanks for the great package!
I want to make a feature request for adding support for aiofile's equivalent of StringIO and BytesIO from the standard Python library. These would be particularly useful as test doubles when writing tests. Currently, the only way to test functions that use aiofile is to create temporary files and wrap them in AIOFile.
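Until something like that exists, a hand-written test double may be enough. The sketch below is an assumption about what such a double could look like: InMemoryAsyncFile, save_greeting and demo are hypothetical names, and the offset-style read/write signature is only loosely modelled on the calls shown elsewhere in these issues.

```python
import asyncio
import io


class InMemoryAsyncFile:
    """Hypothetical test double: async read/write over an in-memory buffer."""

    def __init__(self, initial: bytes = b"") -> None:
        self._buffer = io.BytesIO(initial)

    async def __aenter__(self) -> "InMemoryAsyncFile":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Nothing to release; the buffer stays readable for assertions.
        return None

    async def read(self, size: int = -1, offset: int = 0) -> bytes:
        self._buffer.seek(offset)
        return self._buffer.read(size)

    async def write(self, data: bytes, offset: int = 0) -> int:
        self._buffer.seek(offset)
        return self._buffer.write(data)

    def getvalue(self) -> bytes:
        return self._buffer.getvalue()


async def save_greeting(afp) -> None:
    """Code under test: only needs an object with an async write()."""
    await afp.write(b"hello")


async def demo() -> bytes:
    async with InMemoryAsyncFile() as fake:
        await save_greeting(fake)
        return fake.getvalue()
```

The code under test receives the file object as a parameter, so a test can pass the double instead of a real file.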
Hello, I am using it to write a WAV file downloaded asynchronously from the internet with aiohttp. It works fine on Linux, but the downloaded audio sounds weird on Windows.
async def download_audio(self, task_line: TaskLine):
    url = f'{self.base_url}{task_line.raw_data_file_path}'
    logger.debug(url)
    async with aiohttp.ClientSession(cookie_jar=self._cookie_jar) as session:
        async with session.get(url=url) as resp:
            assert resp.status == 200, f'request error status code: {resp.status}'
            path = task_line.audio_file_path()
            dir, _ = os.path.split(path)
            if not os.path.exists(dir):
                os.mkdir(dir)
            async with AIOFile(path, 'wb') as f:
                await f.write(await resp.read())
When I change the file-writing code to use the built-in open function, the downloaded audio file turns out fine.
async def download_audio(self, task_line: TaskLine):
    url = f'{self.base_url}{task_line.raw_data_file_path}'
    logger.debug(url)
    async with aiohttp.ClientSession(cookie_jar=self._cookie_jar) as session:
        async with session.get(url=url) as resp:
            assert resp.status == 200, f'request error status code: {resp.status}'
            path = task_line.audio_file_path()
            dir, _ = os.path.split(path)
            if not os.path.exists(dir):
                os.mkdir(dir)
            # async with AIOFile(path, 'wb') as f:
            #     await f.write(await resp.read())
            with open(path, 'wb') as f:
                f.write(await resp.read())
Downloaded audio file info:
AIOFile:
open:
By the way, I am using manjaro18 linux system and windows10 ltsb2016 and both python3 latest version 3.7.3
aiofile 1.5.2
aiohttp 3.5.4
We can create a FIFO (https://docs.python.org/3/library/os.html#os.mkfifo) and then open it asynchronously for both reading and writing using aiofile.
It should be:
if m == 'b':
    binary = True
    if 'nt' == os.name:
        flags |= os.O_BINARY
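A portable way to express the same idea is to look the flag up with getattr, since os.O_BINARY is only defined on Windows. The function below is a simplified, hypothetical mode parser written for illustration; it is not aiofile's actual implementation.

```python
import os


def mode_to_flags(mode: str) -> int:
    """Translate a simplified open() mode string into os.open() flags."""
    flags = 0
    if "+" in mode:
        flags |= os.O_RDWR
    elif "r" in mode:
        flags |= os.O_RDONLY
    else:
        flags |= os.O_WRONLY
    if "w" in mode:
        flags |= os.O_CREAT | os.O_TRUNC
    if "a" in mode:
        flags |= os.O_CREAT | os.O_APPEND
    if "b" in mode:
        # os.O_BINARY exists only on Windows; fall back to 0 elsewhere.
        flags |= getattr(os, "O_BINARY", 0)
    return flags
```

Without O_BINARY, a descriptor opened on Windows is in text mode and newline bytes are translated on write, which would be one plausible explanation for binary data such as audio coming out corrupted.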
Not sure if it's an AWS environment problem or an aiofile problem. I'm running a cron job with Python 3.8, aiofile 3.3.3 and asyncio + uvloop 0.14.0; it runs every minute and opens a YAML file in the same package.
Sometimes I get this traceback:
File "/var/task/util/track_errors.py", line 76, in wrapper
return func(*args, **kwargs)
File "/var/task/cron_jobs/process_notification_webhook.py", line 64, in handler
return asyncio.run(main(event))
File "/var/lang/lib/python3.8/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
return future.result()
File "/var/task/cron_jobs/process_notification_webhook.py", line 12, in main
async with AsyncDynamodbTable(
File "/var/task/fluxo_aws/async_dynamodb_table.py", line 54, in __aenter__
async with async_open(self.schema_path, "r") as opened_file:
File "/var/task/aiofile/utils.py", line 315, in async_open
afp = AIOFile(str(file_name), mode, *args, **kwargs)
File "/var/task/aiofile/aio.py", line 116, in __init__
self.__context = context or get_default_context()
File "/var/task/aiofile/aio.py", line 289, in get_default_context
return create_context()
File "/var/task/aiofile/aio.py", line 276, in create_context
context = caio.AsyncioContext(max_requests, loop=loop)
File "/var/task/caio/asyncio_base.py", line 23, in __init__
self.context = self._create_context(
File "/var/task/caio/linux_aio_asyncio.py", line 10, in _create_context
context = super()._create_context(max_requests)
File "/var/task/caio/asyncio_base.py", line 62, in _create_context
return self.CONTEXT_CLASS(max_requests=max_requests, **kwargs)
SystemError: (11, 'Resource temporarily unavailable')
The method that opens the file, in /var/task/fluxo_aws/async_dynamodb_table.py, is:
async def __aenter__(self):
    self.client = await aioboto3.client("dynamodb").__aenter__()
    self.resource = await aioboto3.resource("dynamodb").__aenter__()
    self.table = await self.resource.Table(self.table_name)
    if self.schema_path:
        async with async_open(self.schema_path, "r") as opened_file:
            file_content = await opened_file.read(length=-1)
            self.schema = yaml.safe_load(file_content)
            self._build_validator()
    return self
This is an open source helper for AWS that I made for my projects, that you can find here: https://github.com/Fluxo-Resultados/fluxo-aws/blob/master/fluxo_aws/async_dynamodb_table.py
I suspect that's an AWS environment problem, as I get this error 2 to 5 times in a row, and after that it runs fine for some minutes (or even hours) before the error occurs again.
I don't know how to proceed to help debugging this and help to fix the issue. Thanks!
I'm considering using aiofile for the FileHandler implementation of aiologger. Could you help me with the pros and cons of aiofile compared with the aiofiles thread-pool implementation?
LineReader is very slow
Expected: LineReader is as fast as normal line reading, and the read chunk size actually prevents extra reads.
Actual: LineReader is slow, taking many ms per line, and causes a read for each line.
import asyncio
import functools
import time

import aiofile

def print_on_call_decorator(func):
    @functools.wraps(func)
    def wrapper_decorator(*args, **kwargs):
        print("real read called")
        value = func(*args, **kwargs)
        return value
    return wrapper_decorator

aiofile.AIOFile.read_bytes = print_on_call_decorator(aiofile.AIOFile.read_bytes)

async def main():
    async with aiofile.AIOFile("test_line_iter_file", "r") as f:
        last_line_time = time.perf_counter()
        async for line in aiofile.LineReader(f, chunk_size=aiofile.LineReader.CHUNK_SIZE * 16*16):
            # print("line_time", time.perf_counter() - last_line_time)
            last_line_time = time.perf_counter()
            # print(line, end="")

if __name__ == "__main__":
    open("test_line_iter_file", "w").write("\n".join(str(i) for i in range(1000000)))
    asyncio.run(main())
Sync version to compare:
import time

open("test_line_iter_file", "w").write("\n".join(str(i) for i in range(100000)))
start = time.perf_counter()
with open("test_line_iter_file", "r", buffering=4192 * 16) as f:
    last_line_time = time.perf_counter()
    for line in f:
        # print("line_time", time.perf_counter() - last_line_time)
        last_line_time = time.perf_counter()
        # print(line, end="")
print("end_time", time.perf_counter() - start)
My temporary solution, which only handles the newlines recognized by the file object's __iter__, is only twice as slow as the sync version:
import asyncio
import collections.abc
import functools
import io
import itertools
import time
from typing import Union, Self

import aiofile

class CustomLineReader(collections.abc.AsyncIterable):
    CHUNK_SIZE = 4192

    def __init__(
        self,
        aio_file: aiofile.AIOFile,
        offset: int = 0,
        chunk_size: int = CHUNK_SIZE,
        line_sep: str = "\n",
    ):
        self.__reader = aiofile.Reader(aio_file, chunk_size=chunk_size, offset=offset)
        self._buffer = None
        self.linesep = aio_file.encode_bytes(line_sep) if aio_file.mode.binary else line_sep
        self.chunk_iterator = None
        self.last_read = None

    async def setup_buffer(self, buffer_initialization=None):
        chunk = await self.__reader.read_chunk()
        if not chunk:
            raise StopAsyncIteration(chunk)
        if self._buffer:
            self._buffer.close()
            del self._buffer
        self._buffer = io.BytesIO() if self.__reader.file.mode.binary else io.StringIO()
        if buffer_initialization:
            self._buffer.write(buffer_initialization)
        self._buffer.write(chunk)
        self._buffer.seek(0)
        self.chunk_iterator = self._buffer.__iter__()

    async def __anext__(self) -> Union[bytes, str]:
        if not self._buffer:
            await self.setup_buffer()
        try:
            self.last_read = next(self.chunk_iterator)
            if self.last_read[-1] != "\n":
                await self.setup_buffer(self.last_read)
                self.last_read = next(self.chunk_iterator)
        except StopIteration:
            await self.setup_buffer(self.last_read)
            self.last_read = next(self.chunk_iterator)
        return self.last_read

    def __aiter__(self) -> Self:
        return self
I've prepared the following code to reproduce the exception:
import asyncio
import os
import sys
import aiofile

fifo = '/tmp/test.fifo'
if not os.path.exists(fifo):
    os.mkfifo(fifo)

async def read():
    print('Reading')
    async with aiofile.AIOFile(fifo, 'r') as afp:
        while True:
            data = await afp.read(4096)
            if not data:
                break
    print('Reading done')

async def write():
    print('Writing')
    async with aiofile.AIOFile(fifo, 'w') as afp:
        await afp.write('test')
        await afp.fsync()
    print('Writing done')

async def main():
    arg = sys.argv[1]
    if arg == 'reader':
        await read()
    elif arg == 'writer':
        await write()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Execution:
$ python test.py reader &
[1] 30822
$ Reading
$ python test.py writer
Writing
Segmentation fault
$ Reading done
Is it a bug, or am I doing something wrong?
I see the NFS label, but there are no examples. Is there NFS support?
AIOFile.open() does not prevent concurrent open() calls from each opening the file. As a result, the following scenario can occur:
AIOFile object is opened by task A, resulting in FD (file descriptor) A
open()
AIOFile object using FD A.
open() call completes, overwriting the AIOFile._file_obj created by task A's open() call, causing it to be GC'd and FD A to be closed
This program reproduces this behaviour:
import asyncio
import os
import tempfile
from aiofile import AIOFile

async def use_file(label: str, af: AIOFile) -> None:
    fd = await af.open()
    if fd is None:
        print(f"{label}: file already open")
    else:
        print(f"{label}: opened file: {fd=}")
    try:
        await af.fsync()  # could be anything
        print(f"{label}: fsync() succeeded")
    except Exception as error:
        print(
            f"{label}: fsync() failed: opened fd={fd}, current AIOFile fd={af.fileno()}, {error=}"
        )

async def main():
    fd, name = tempfile.mkstemp()
    os.close(fd)
    af = AIOFile(name)
    await asyncio.gather(*(use_file(f"task {n}", af) for n in range(10)))

if __name__ == "__main__":
    asyncio.run(main())
$ python aiofile_open_bug.py
task 0: opened file: fd=8
task 1: opened file: fd=9
task 2: opened file: fd=10
task 3: opened file: fd=11
task 5: opened file: fd=13
task 7: opened file: fd=15
task 4: opened file: fd=12
task 0: fsync() failed: opened fd=8, current AIOFile fd=12, error=OSError(9, 'Bad file descriptor')
task 6: opened file: fd=14
task 8: opened file: fd=8
task 9: opened file: fd=16
task 1: fsync() failed: opened fd=9, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 2: fsync() failed: opened fd=10, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 3: fsync() failed: opened fd=11, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 5: fsync() failed: opened fd=13, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 7: fsync() failed: opened fd=15, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 4: fsync() failed: opened fd=12, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 6: fsync() failed: opened fd=14, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 8: fsync() failed: opened fd=8, current AIOFile fd=16, error=OSError(9, 'Bad file descriptor')
task 9: fsync() succeeded
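One common way to avoid this kind of race is to serialize the open step with an asyncio.Lock, so that only the first caller opens the descriptor and later callers reuse it. The sketch below illustrates the idea with a hypothetical LazyFile class built on os.open; it is not a patch for AIOFile and makes no claim about how the library fixes the issue.

```python
import asyncio
import os
import tempfile


class LazyFile:
    """Hypothetical wrapper that opens its descriptor at most once."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._fd = None
        self._lock = asyncio.Lock()
        self.open_calls = 0

    async def open(self) -> int:
        async with self._lock:
            # Only the first task to take the lock performs the open.
            if self._fd is None:
                loop = asyncio.get_running_loop()
                self._fd = await loop.run_in_executor(
                    None, os.open, self._path, os.O_RDONLY
                )
                self.open_calls += 1
        return self._fd

    def close(self) -> None:
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None


async def demo():
    fd, name = tempfile.mkstemp()
    os.close(fd)
    lazy = LazyFile(name)
    try:
        fds = await asyncio.gather(*(lazy.open() for _ in range(10)))
    finally:
        lazy.close()
        os.remove(name)
    return lazy.open_calls, set(fds)
```

With ten concurrent callers, every task receives the same descriptor and the underlying open runs once.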
Python's open may be used directly, or as a context manager in a with block.
I love contextmanagers, but they're not always the right tool because cleanup on the stack is not always the right time to perform cleanup.
Can I use aiofile.async_open without a with statement?
import asyncio
import aiofile

async def test():
    f = aiofile.async_open('filename.txt', mode='w')
    try:
        await f.write('123')  # asyncio.base_futures.InvalidStateError: AIOFile closed
    finally:
        pass
        # await f.close()

asyncio.run(test())
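A traceback quoted in another issue on this page shows the wrapper's __aenter__ awaiting self.file.open(), which suggests the file is only opened when the async context is entered. For any object that supports async with, contextlib.AsyncExitStack lets you enter the context now and run the cleanup later at a point of your choosing. The Resource class below is a hypothetical stand-in, not an aiofile object.

```python
import asyncio
import contextlib


class Resource:
    """Stand-in for any object that supports ``async with``."""

    def __init__(self) -> None:
        self.events = []

    async def __aenter__(self) -> "Resource":
        self.events.append("opened")
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.events.append("closed")

    async def write(self, data: str) -> None:
        self.events.append(f"write:{data}")


async def demo() -> list:
    resource = Resource()
    stack = contextlib.AsyncExitStack()
    # Enter the context now; cleanup is deferred until aclose() is called.
    handle = await stack.enter_async_context(resource)
    try:
        await handle.write("123")
    finally:
        await stack.aclose()
    return resource.events
```

The stack object can be stored on a long-lived owner and closed from a different function, which covers the case where cleanup on the stack is not the right moment.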
Can someone explain?
I try to write to a file, but I get this traceback, and
the main thread freezes and nothing happens.
I use:
Traceback (most recent call last):
File "aiofile/posix_aio.pyx", line 179, in aiofile.posix_aio.AIOOperation._set_result
File "aiofile/posix_aio.pyx", line 180, in aiofile.posix_aio.AIOOperation._set_result
File "uvloop/loop.pyx", line 1265, in uvloop.loop.Loop.call_soon_threadsafe
File "uvloop/loop.pyx", line 634, in uvloop.loop.Loop._call_soon
File "uvloop/cbhandles.pyx", line 341, in uvloop.loop.new_Handle
File "uvloop/cbhandles.pyx", line 15, in uvloop.loop.Handle._set_loop
File "uvloop/cbhandles.pyx", line 428, in uvloop.loop.extract_stack
ValueError: call stack is not deep enough
Exception ignored in: 'aiofile.posix_aio.on_event'
Traceback (most recent call last):
File "aiofile/posix_aio.pyx", line 179, in aiofile.posix_aio.AIOOperation._set_result
File "aiofile/posix_aio.pyx", line 180, in aiofile.posix_aio.AIOOperation._set_result
File "uvloop/loop.pyx", line 1265, in uvloop.loop.Loop.call_soon_threadsafe
File "uvloop/loop.pyx", line 634, in uvloop.loop.Loop._call_soon
File "uvloop/cbhandles.pyx", line 341, in uvloop.loop.new_Handle
File "uvloop/cbhandles.pyx", line 15, in uvloop.loop.Handle._set_loop
File "uvloop/cbhandles.pyx", line 428, in uvloop.loop.extract_stack
ValueError: call stack is not deep enough
I'm running the aiofile 3.1.1 pipe example under Linux (Debian stable, python 3.7). It usually stops with no other output than this:
201021 14:17 ~ srn@lije{1}% python3 /tmp/z.py
Start reader
Start reader
Start writer
C-c C-cExited
Traceback (most recent call last):
File "/tmp/z.py", line 46, in <module>
loop.run_until_complete(main())
File "/usr/lib/python3.7/asyncio/base_events.py", line 571, in run_until_complete
self.run_forever()
File "/usr/lib/python3.7/asyncio/base_events.py", line 539, in run_forever
self._run_once()
File "/usr/lib/python3.7/asyncio/base_events.py", line 1775, in _run_once
handle._run()
File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.7/dist-packages/caio/asyncio_base.py", line 55, in _run
step(operation, future)
File "/usr/local/lib/python3.7/dist-packages/caio/asyncio_base.py", line 43, in step
self.context.submit(*operations.keys())
KeyboardInterrupt
201021 14:18 ~ srn@lije{1}%
(Since nothing is output, I give up and press control-C.)
I have made this test several times, trying to use pdb and with other minor changes, and one of the tests yielded a few loop.time() outputs, like this:
Start reader
Start reader
Start writer
1131834.8124811131834.8126551131834.812847
1131834.813134
C-c C-c C-c C-cTraceback (most recent call last):
File "/usr/local/ch-tools3/z.py", line 64, in <module>
loop.run_until_complete(main())
File "/usr/lib/python3.7/asyncio/base_events.py", line 571, in run_until_complete
self.run_forever()
File "/usr/lib/python3.7/asyncio/base_events.py", line 539, in run_forever
self._run_once()
File "/usr/lib/python3.7/asyncio/base_events.py", line 1775, in _run_once
handle._run()
File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.7/dist-packages/caio-0.6.3-py3.7-linux-x86_64.egg/caio/asyncio_base.py", line 55, in _run
step(operation, future)
File "/usr/local/lib/python3.7/dist-packages/caio-0.6.3-py3.7-linux-x86_64.egg/caio/asyncio_base.py", line 43, in step
self.context.submit(*operations.keys())
Here's /tmp/z.py, which I believe is a verbatim copy of the example from the README:
import os
import asyncio
from aiofile import AIOFile, Reader, Writer

async def reader(fname):
    print('Start reader')
    async with AIOFile(fname, 'r') as afp:
        while True:
            # Maximum expected chunk size, must be passed.
            # Otherwise will be read zero bytes
            # (because unix pipe has zero size)
            data = await afp.read(4096)
            print(data)

async def writer(fname):
    print('Start writer')
    async with AIOFile(fname, 'w') as afp:
        while True:
            await asyncio.sleep(1)
            await afp.write('%06f' % loop.time())

async def main():
    fifo_name = "/tmp/test.fifo"
    if os.path.exists(fifo_name):
        os.remove(fifo_name)
    os.mkfifo(fifo_name)
    # Starting two readers and one writer, but only one reader
    # will be reading at the same time.
    await asyncio.gather(
        reader(fifo_name),
        reader(fifo_name),
        writer(fifo_name),
    )

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    loop.run_until_complete(main())
finally:
    # Shutting down and closing file descriptors after interrupt
    loop.run_until_complete(loop.shutdown_asyncgens())
    loop.close()
    print('Exited')
201021 14:18 ~ srn@lije{1}% uname -a
Linux lije 4.19.0-11-amd64 #1 SMP Debian 4.19.146-1 (2020-09-17) x86_64 GNU/Linux
I get the following error.
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.7m -c caio/thread_aio.c -o build/temp.linux-x86_64-3.7/caio/thread_aio.o -g
caio/thread_aio.c:6:10: fatal error: Python.h: No such file or directory
#include <Python.h>
^~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
DictReader can take any iterator that defines __next__, but not an async generator.
It would be great to create an aiofile.DictReader for that.
Here is the workaround:
import csv
import io

raw_bytes = await stream.read(10000000)
wrapped_bytes = io.BytesIO(raw_bytes)
text = io.TextIOWrapper(wrapped_bytes, encoding=encoding, errors='replace')
for i, row in enumerate(csv.DictReader(text)):
    print(row)
The following code, listed on the README page for managing the context manually, has a bug.
import asyncio
from aiofile import async_open
from caio import linux_aio, thread_aio

async def main():
    linux_ctx = linux_aio.Context()
    threads_ctx = thread_aio.Context()

    async with async_open("/tmp/test.txt", "a", context=linux_ctx) as afp:
        await afp.write("Hello")

    async with async_open("/tmp/test.txt", "a", context=threads_ctx) as afp:
        print(await afp.read())

await main()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [94], in <cell line: 18>()
14 async with async_open("/tmp/test.txt", "a", context=threads_ctx) as afp:
15 print(await afp.read())
---> 18 await main()
Input In [94], in main()
8 linux_ctx = linux_aio.Context()
9 threads_ctx = thread_aio.Context()
---> 11 async with async_open("/tmp/test.txt", "a", context=linux_ctx) as afp:
12 await afp.write("Hello")
14 async with async_open("/tmp/test.txt", "a", context=threads_ctx) as afp:
File /usr/local/lib/python3.10/site-packages/aiofile/utils.py:200, in FileIOWrapperBase.__aenter__(self)
199 async def __aenter__(self) -> "FileIOWrapperBase":
--> 200 await self.file.open()
201 return self
File /usr/local/lib/python3.10/site-packages/aiofile/aio.py:177, in AIOFile.open(self)
174 return None
176 if self.__open_result is None:
--> 177 self.__open_result = self._run_in_thread(
178 open,
179 self._fname,
180 self._open_mode,
181 )
182 self._file_obj = await self.__open_result
183 self.__open_result = None
File /usr/local/lib/python3.10/site-packages/aiofile/aio.py:154, in AIOFile._run_in_thread(self, func, *args, **kwargs)
151 def _run_in_thread(
152 self, func: "Callable[..., _T]", *args: Any, **kwargs: Any
153 ) -> "asyncio.Future[_T]":
--> 154 return self.__context.loop.run_in_executor(
155 self._executor, partial(func, *args, **kwargs),
156 )
AttributeError: 'Context' object has no attribute 'loop'
It would be very helpful if you could tag releases as well again. This would enable distributions to fetch the package from GitHub instead of PyPI.
Thanks
Hello,
Thank you very much for your hard work on this. I am testing aiofile to see if I can write a large file (it might go up to 400-500 MB) from a continuous data stream in append mode. I can't seem to find an example of writing a large file in append mode using aiofile on PyPI or in the README. Can you please point me to an example?
When I test my program using aiofiles==0.5.0 (latest available version) with pytest in Python 3.8.0, I get the following warning:
[...]/lib/python3.8/site-packages/aiofiles/os.py:10: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
def run(*args, loop=None, executor=None, **kwargs):
-- Docs: https://docs.pytest.org/en/latest/warnings.html
This is similar to e.g. this issue, where also this information is probably helpful for this repo:
The async def type of coroutine was added in Python 3.5, and is recommended if there is no need to support older Python versions.
import asyncio
from datetime import datetime
from aiofile import AIOFile, Writer

def add_timestamp(log_line):
    timestamp = datetime.strftime(datetime.now(), "[%d.%m.%y %H:%M]")
    return f"{timestamp} {log_line}"

async def chat_logging():
    async with AIOFile("log.txt", 'w') as afp:
        writer = Writer(afp)
        for _ in range(10):
            log_line = "Hello world\n"
            # log_line = "Hello май ворлд\n"
            print(add_timestamp(log_line))
            await writer(add_timestamp(log_line))

async def main():
    await asyncio.gather(chat_logging())

asyncio.run(main())
If the string being written contains Russian letters, log.txt ends up with garbage. If all the letters are English, the lines are saved correctly. Perhaps the file offset is calculated incorrectly.
OS Linux Debian 8 / Python 3.7.4 / aiofile==1.5.2
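The reporter's guess about the offset can be illustrated without aiofile. In UTF-8 each Cyrillic letter takes two bytes, so an offset advanced by character count falls behind the real byte position. The offsets helper below is hypothetical and only does the arithmetic; whether aiofile 1.5.2 actually tracked offsets this way is an assumption.

```python
def offsets(chunks, encoding="utf-8"):
    """Return (char_offset, byte_offset) after writing all chunks."""
    char_offset = 0
    byte_offset = 0
    for chunk in chunks:
        char_offset += len(chunk)                   # what a str-based counter sees
        byte_offset += len(chunk.encode(encoding))  # what the file actually holds
    return char_offset, byte_offset


ascii_offsets = offsets(["Hello world\n"] * 2)
mixed_offsets = offsets(["Hello май ворлд\n"] * 2)
print(ascii_offsets)  # (24, 24)
print(mixed_offsets)  # (32, 48)
```

For pure ASCII the two counters agree, which would explain why English-only lines are saved correctly. With the Cyrillic line they drift apart by 8 bytes per write, so a write positioned by character count would overwrite the tail of the previous one.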
I have tried to check performance of the library and found some confusing results. I am on Ubuntu 16.04 Python 3.7. This is the snippet I use to check performance: http://dpaste.com/3SH4BAZ
I check by toggling comment of lines 31 and 32.
When I run the sequential version the code is always faster (did a huge number of tries). In seconds, it is near this number: 0.0032964419997369987
When I run the async version the code is slower and near a value of 0.04831875699983357
Again, I run the tests a huge amount of times, to safely conclude that the sequential code is about 10 times faster than the asyncio one.
Am I using aiofile wrong? Do I miss something in the asyncio utilization?
Thanks in advance
Hi!
Do you plan on adding seek() and tell() ?
thank you very much
Just an idea/request, please close if you don't care:
As somebody who uses aiofile a lot, it would be helpful to see a changelog or release notes somewhere. Wouldn't even need to be anything detailed, just maybe a mention of methods that have changed or compatibility that has been added/dropped.
Currently, when I'm upgrading I'm scrolling through the commits and hope that I'd catch things that are important to me. Definitely doable, but I'm sure I'm missing stuff this way.
I have a script that is working on my Windows 10 Pro machine, that is not working on my AWS Environment. Am I missing something simple?
MS Windows Version 2004 (OS Build 19041.685)
Python 3.8.0
UNIX Version:
uname -r: 4.14.214-160.339.amzn2.x86_64
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
Python 3.8.0 (Manually compiled, same issue on the stock python 3.7.9 that I could install with yum)
Requirements file:
aiofile==3.3.3
aiohttp==3.7.3
async-timeout==3.0.1
attrs==20.3.0
caio==0.6.3
cffi==1.14.4
chardet==3.0.4
discord.py==1.6.0
idna==3.1
multidict==5.1.0
pycparser==2.20
PyNaCl==1.4.0
python-dotenv==0.15.0
six==1.15.0
typing-extensions==3.7.4.3
yarl==1.6.3
The following code throws the error:
async def save_8ball(guild_id):
    async with aiofile.async_open(os.path.join(CONFIG['8Ball']['FileDirectory'], f'{guild_id}.txt'), 'w')\
            as create_8ball:
        for resp in EIGHT_BALL_RESPONSES[guild_id]:
            await create_8ball.write(f'{resp}\r\n')
I am using a relative path for the os.path.join, as the config file is currently pointing to "./8ball"
Error Listing:
Traceback (most recent call last):
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/caio/asyncio_base.py", line 43, in step
self.context.submit(*operations.keys())
SystemError: (9, 'Bad file descriptor')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/discord/ext/commands/core.py", line 85, in wrapped
ret = await coro(*args, **kwargs)
File "./botGames.py", line 96, in prediction
await create_guild_8ball(guild_id=ctx.guild.id)
File "./botGames.py", line 66, in create_guild_8ball
await save_8ball(guild_id=guild_id)
File "./botGames.py", line 34, in save_8ball
await create_8ball.write(f'{resp}\r\n')
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/aiofile/utils.py", line 183, in __aexit__
await self.close()
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/aiofile/utils.py", line 176, in close
await self.file.close()
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/aiofile/aio.py", line 166, in close
await self.fsync()
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/aiofile/aio.py", line 261, in fsync
return await self.__context.fdsync(self.fileno())
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/caio/asyncio_base.py", line 87, in submit
await future
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/caio/asyncio_base.py", line 48, in step
self.context.submit(operation)
SystemError: (9, 'Bad file descriptor')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/discord/ext/commands/bot.py", line 902, in invoke
await ctx.command.invoke(ctx)
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/discord/ext/commands/core.py", line 1345, in invoke
await super().invoke(ctx)
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/discord/ext/commands/core.py", line 864, in invoke
await injected(*ctx.args, **ctx.kwargs)
File "/home/ec2-user/python_discord_bot/lib/python3.8/site-packages/discord/ext/commands/core.py", line 94, in wrapped
raise CommandInvokeError(exc) from exc
discord.ext.commands.errors.CommandInvokeError: Command raised an exception: SystemError: (9, 'Bad file descriptor')
When substituting the aiofile.async_open for a regular open this succeeds without issue on the Linux machine.
Reading online this error usually means that the file was closed and then attempted to be referenced, but I'm writing into the file, and then closing the file.
I've also tried to write into this using AIOFile with a writer instead of async_open, but I'm receiving the same result.
I have this csv file
"year","name","percent","sex"
1880,"John",0.081541,"boy"
1880,"William",0.080511,"boy"
1880,"James",0.050057,"boy"
1880,"Charles",0.045167,"boy"
1880,"George",0.043292,"boy"
and I want to read the second column.
I have this code
import asyncio
from aiofile import AIOFile, LineReader, Writer

async def main():
    async with AIOFile("file.csv", 'r') as afp:
        async for line in LineReader(afp):
            print(line[:10])

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
but this only reads the first 10 characters of each line. How can I read the second column value from the simple CSV shown here?
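One way to do this, assuming each line yielded by the reader is one complete CSV record, is to pass the line through the standard csv module and take the field at index 1. The second_column helper is a hypothetical name, and the sample lines are copied from the file above.

```python
import csv


def second_column(line: str) -> str:
    """Return the second field of one CSV line, honouring quotes."""
    row = next(csv.reader([line]))
    return row[1]


lines = [
    '"year","name","percent","sex"\n',
    '1880,"John",0.081541,"boy"\n',
    '1880,"William",0.080511,"boy"\n',
]
names = [second_column(line) for line in lines[1:]]
print(names)  # ['John', 'William']
```

Inside the async for loop, print(second_column(line)) would replace print(line[:10]). This per-line approach does not handle quoted fields that contain embedded newlines.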
aio (POSIX asynchronous I/O), which this package aiofile uses, is not real asynchronous file I/O. You can find these sentences in the man page (man 7 aio):
The current Linux POSIX AIO implementation is provided in user space by glibc. This has a number of limitations, most notably that maintaining multiple threads to perform I/O operations is expensive and scales poorly.
It's just based on user-space threads, which is no different from simply using a ThreadPool to read files in Python.
Hi,
Could you please add support for Python 3.8? Including the wheel?
Thanks in advance,
Samuel
Hi, I use async_open to write a file, but in mode 'a+' it does not create the file if the file does not exist.
I am reporting this issue because the built-in open creates the file in mode 'a+' if it does not exist.
Mode 'w+' creates the file if it does not exist, but it is not append mode.
async with async_open(p, 'w+') as af:
    print('222')
    await af.write('12345\n')
    await af.write('67890\n')
Mode 'a+' is append mode, but it needs the file to exist.
async with async_open(p, 'a+') as af:
    print('222')
    await af.write('12345\n')
    await af.write('67890\n')
Traceback (most recent call last):
File "C:\Users\rondou.chen\workspace\system-team-tools\dump_edge_sys_logs\dump_logcat.py", line 66, in <module>
asyncio.run(main())
File "C:\Users\rondou.chen\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\rondou.chen\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
return future.result()
File "C:\Users\rondou.chen\workspace\system-team-tools\dump_edge_sys_logs\dump_logcat.py", line 61, in main
await asyncio.gather(*[task2,], return_exceptions=False)
File "C:\Users\rondou.chen\workspace\system-team-tools\dump_edge_sys_logs\dump_logcat.py", line 37, in write_log
async with async_open(p, 'a+') as af:
File "C:\Users\rondou.chen\workspace\system-team-tools\.venv\lib\site-packages\aiofile\utils.py", line 334, in async_open
return TextFileWrapper(afp)
File "C:\Users\rondou.chen\workspace\system-team-tools\.venv\lib\site-packages\aiofile\utils.py", line 257, in __init__
super().__init__(afp)
File "C:\Users\rondou.chen\workspace\system-team-tools\.venv\lib\site-packages\aiofile\utils.py", line 167, in __init__
self._offset = os.stat(afp.name).st_size
FileNotFoundError: [WinError 2] No such file or directory:。: 'C:\\Users\\rondou.chen\\workspace\\system-team-tools\\logcat_logs.txt'
(.venv) PS C:\Users\rondou.chen\workspace\system-team-tools>
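The traceback shows os.stat(afp.name) failing because the file does not exist yet. One possible workaround is to create the file before calling async_open(p, 'a+'). A minimal sketch using only the standard library; ensure_file is a hypothetical helper and not part of aiofile:

```python
from pathlib import Path
import tempfile


def ensure_file(path):
    """Create an empty file (and missing parent directories) if absent.

    Hypothetical helper, not part of aiofile. Existing content is kept,
    which matches what the built-in open(path, 'a+') would do.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.touch(exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    log_path = ensure_file(Path(tmp) / "logs" / "logcat_logs.txt")
    created = log_path.exists()
    log_path.write_text("12345\n")
    ensure_file(log_path)  # a second call must not truncate the file
    preserved = log_path.read_text()
```

Calling ensure_file(p) right before the async with line should make the 'a+' case behave like the 'w+' case without truncating existing data.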
import asyncio
from aiofile import AIOFile
data = b'hi'
async def main():
    with open("test.data", "wb") as fp:
        fp.write(data)
    async with AIOFile("test.data", "rb") as fp:
        while True:
            buf = await fp.read(2)
            if not buf:
                break
            print(buf)
asyncio.run(main())
I get the following error when running the async_open example.
Traceback (most recent call last):
File "/home/pi/.local/lib/python3.7/site-packages/caio/asyncio_base.py", line 43, in step
self.context.submit(*operations.keys())
SystemError: (22, 'Invalid argument')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "async_open_test.py", line 18, in <module>
loop.run_until_complete(main())
File "/usr/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
return future.result()
File "async_open_test.py", line 7, in main
await afp.write("Hello ")
File "/home/pi/.local/lib/python3.7/site-packages/aiofile/utils.py", line 284, in write
await operation
File "/home/pi/.local/lib/python3.7/site-packages/aiofile/aio.py", line 243, in write_bytes
data[written:], self.fileno(), offset + written,
File "/home/pi/.local/lib/python3.7/site-packages/caio/asyncio_base.py", line 87, in submit
await future
File "/home/pi/.local/lib/python3.7/site-packages/caio/asyncio_base.py", line 48, in step
self.context.submit(operation)
SystemError: (22, 'Invalid argument')
Versions:
Am I missing something?
thanks
Is it possible to load the following from /tmp/huge.file in parallel?
10 bytes from offset 100
10 bytes from offset 200
10 bytes from offset 300
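As an illustration of the idea, here is a sketch of concurrent positional reads with asyncio.gather. It uses os.pread from the standard library (POSIX only) in a thread pool, not aiofile itself. read_ranges and the demo file are hypothetical names, and whether aiofile's own read accepts an offset in the same way is an assumption to check against its README.

```python
import asyncio
import os
import tempfile


async def read_ranges(path, ranges):
    """Read several (offset, size) ranges of one file concurrently.

    Each os.pread call runs in the default thread pool and carries its
    own offset, so the reads do not share a file position.
    """
    loop = asyncio.get_running_loop()
    fd = os.open(path, os.O_RDONLY)
    try:
        tasks = [
            loop.run_in_executor(None, os.pread, fd, size, offset)
            for offset, size in ranges
        ]
        return await asyncio.gather(*tasks)
    finally:
        os.close(fd)


# Demo on a throwaway file standing in for /tmp/huge.file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 2)
    demo_path = tmp.name

chunks = asyncio.run(read_ranges(demo_path, [(100, 10), (200, 10), (300, 10)]))
os.unlink(demo_path)
```

asyncio.gather returns the results in the order the ranges were given, regardless of which read finishes first.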
Hi, when I run the UNIX pipe example, it does not work as it should. Looking at it in a debugger, it appears that the readers and writers are blocked on os.open(). My OS is Ubuntu 18.04.
Cheers
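One thing that may explain the blocking, offered as a guess rather than a diagnosis: on POSIX systems, opening a FIFO blocks until the other end is opened as well, unless the read end is opened with O_NONBLOCK. The sketch below shows that behaviour with the standard library only; open_fifo_pair and the file name are hypothetical.

```python
import os
import tempfile


def open_fifo_pair(path):
    """Open both ends of a FIFO from a single thread without blocking."""
    if not os.path.exists(path):
        os.mkfifo(path)
    # O_NONBLOCK lets the read end open even though no writer exists yet.
    read_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    # A reader is now present, so the write end opens immediately.
    write_fd = os.open(path, os.O_WRONLY)
    return read_fd, write_fd


with tempfile.TemporaryDirectory() as tmp:
    r, w = open_fifo_pair(os.path.join(tmp, "demo.fifo"))
    os.write(w, b"ping")
    received = os.read(r, 4)
    os.close(w)
    os.close(r)
```

Opening the write end first without a reader would block, and opening it with O_NONBLOCK in that situation fails with ENXIO, so the order of the two os.open calls matters.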
We are using Yandex Cloud Speech, specifically with the lpcm codec.
https://cloud.yandex.ru/docs/speechkit/tts/request
This is the code from their example:
with wave.open(test_filepath, 'wb') as file_obj:
    file_obj.setparams((1, 2, 8000, 0, 'NONE', 'NONE'))
    file_obj.writeframes(content)
Could you add support for the wave library?
For instance, like this:
async with AIOWaveFile("/tmp/test.wav", 'wb') as afp:
    await afp.setparams((1, 2, 8000, 0, 'NONE', 'NONE'))
    await afp.write(content)
    await afp.fsync()
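Until something like AIOWaveFile exists, one possible approach is to build the WAV container in memory with the standard wave module and then hand the resulting bytes to whatever asynchronous writer is in use. A minimal sketch; build_wav_bytes and the placeholder audio are hypothetical, and only the wave parameters come from the example above.

```python
import io
import wave


def build_wav_bytes(frames, params=(1, 2, 8000, 0, "NONE", "NONE")):
    """Wrap raw LPCM frames in a WAV container, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setparams(params)
        wav.writeframes(frames)
    return buffer.getvalue()


content = b"\x00\x01" * 8000  # one second of placeholder 16-bit mono audio
wav_bytes = build_wav_bytes(content)

# Round-trip check: the header carries the parameters that were set.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    channels, sample_width, frame_rate, frame_count = wav.getparams()[:4]
```

The header is produced synchronously but only touches memory, so the single disk write of wav_bytes is the only part that needs to be asynchronous.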
Hi, would this be possible? If you don't want to add all missing tags, could you add the latest and tag new versions moving forward?
Thanks!
It's not immediately clear that this project is tested for 3.7
If you could add python 3.8 in the test pipeline that would be great.
When an AIOFile object is used as a context manager, some data is lost. Below is a slightly modified example from README.rst:
import asyncio
from aiofile import AIOFile, LineReader, async_open
async def main():
    print("\nAIOFile:")
    async with AIOFile("/tmp/hello.txt", 'w') as afp:
        for i in range(10):
            await afp.write("%d Hello World\n" % i)
        await afp.write("Tail-less string\n")
    async with AIOFile("/tmp/hello.txt", 'r') as afp:
        async for line in LineReader(afp):
            print(line[:-1])
    print("\nasync_open:")
    async with async_open("/tmp/hello.txt", 'w') as afp:
        for i in range(10):
            await afp.write("%d Hello World\n" % i)
        await afp.write("Tail-less string\n")
    async with AIOFile("/tmp/hello.txt", 'r') as afp:
        async for line in LineReader(afp):
            print(line[:-1])
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
The output is:
Python 3.8.16 (default, Dec 10 2022, 12:00:00)
[GCC 7.5.0] on localhost
>>>
AIOFile:
Tail-less string
async_open:
0 Hello World
1 Hello World
2 Hello World
3 Hello World
4 Hello World
5 Hello World
6 Hello World
7 Hello World
8 Hello World
9 Hello World
Tail-less string
>>>
So it seems that async_open worked fine, while the AIOFile context manager lost what was written in the for loop.
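A possible explanation, stated as an assumption rather than a confirmed diagnosis: if AIOFile.write is a positional write whose offset defaults to 0, every write in the loop lands on top of the previous one, and only the longest, last string survives. The sketch below reproduces that effect with os.pwrite from the standard library (POSIX only) and shows the offset bookkeeping a sequential writer needs. All function names are hypothetical.

```python
import os
import tempfile

lines = [b"%d Hello World\n" % i for i in range(10)] + [b"Tail-less string\n"]


def write_without_offsets(fd, chunks):
    """Every chunk is written at offset 0, so later chunks overwrite earlier ones."""
    for chunk in chunks:
        os.pwrite(fd, chunk, 0)


def write_with_offsets(fd, chunks):
    """Advance the offset by the number of bytes actually written."""
    offset = 0
    for chunk in chunks:
        offset += os.pwrite(fd, chunk, offset)


def run(writer):
    """Apply a writer to a fresh temporary file and return the file content."""
    with tempfile.TemporaryFile() as tmp:
        writer(tmp.fileno(), lines)
        return os.pread(tmp.fileno(), 4096, 0)


overwritten = run(write_without_offsets)
sequential = run(write_with_offsets)
```

In this sketch the first variant leaves only the last line in the file, which matches the AIOFile output reported above, while the second variant keeps all eleven lines.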
Hi,
Today AIOFile stopped working and throws this exception (installed from pip in a venv):
ERROR 05:07:03 __eml:162 fetch module 'os' has no attribute 'pwrite'
Traceback (most recent call last):
File "c:\xxx\pyimap\v.1.2\export__eml.py", line 150, in fetch
await afp.write(_eml)
File "C:\xxx\pyimap\v.1.2\venv\lib\site-packages\aiofile\aio.py", line 219, in write
return await self.__context.write(bytes_data, self.__fileno, offset)
File "C:\xxx\pyimap\v.1.2\venv\lib\site-packages\caio\asyncio_base.py", line 88, in submit
return op.get_value()
File "C:\xxx\pyimap\v.1.2\venv\lib\site-packages\caio\python_aio.py", line 188, in get_value
raise self.exception
File "C:\xxx\pyimap\v.1.2\venv\lib\site-packages\caio\python_aio.py", line 71, in _in_thread
result = op_map[operation.opcode](operation)
File "C:\xxx\pyimap\v.1.2\venv\lib\site-packages\caio\python_aio.py", line 39, in _handle_write
return os.pwrite(
AttributeError: module 'os' has no attribute 'pwrite'
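os.pwrite is documented as available on Unix only, which is consistent with this error on Windows. A fallback can emulate it with a seek followed by a write. The sketch below is a hypothetical compatibility helper, not code from caio or aiofile:

```python
import os
import tempfile
import threading

_seek_lock = threading.Lock()


def pwrite_compat(fd, data, offset):
    """Write data at offset, using os.pwrite where the platform has it.

    Hypothetical fallback: lseek + write under a lock, so two threads
    cannot interleave their seek and write on the same descriptor.
    """
    if hasattr(os, "pwrite"):
        return os.pwrite(fd, data, offset)
    with _seek_lock:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.write(fd, data)


with tempfile.TemporaryFile() as tmp:
    pwrite_compat(tmp.fileno(), b"Hello ", 0)
    pwrite_compat(tmp.fileno(), b"world", 6)
    os.lseek(tmp.fileno(), 0, os.SEEK_SET)
    written = os.read(tmp.fileno(), 32)
```

Unlike a real pwrite, the fallback moves the descriptor's file position, so it is only safe when every access to the descriptor goes through the same lock.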
When reading a file on Windows, the last line is not read correctly. Here is the example code:
import os
from aiofile import AIOFile, LineReader

async def read_text():
    path = os.getcwd()
    myfile = os.path.join(path, 'dummy.txt')
    a = 1
    async with AIOFile(myfile, 'r') as afp:
        async for line in LineReader(afp):
            print(a, ': ', line[:-1])
            a += 1
Where the dummy.txt file is this:
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
including the extra line at the end. The output is:
1 : This is line 1
2 : This is line 2
3 : This is line 3
4 : This is line 4
5 : This is line 5
6 : This is line 6
7 : This is line 7
8 : This is line 8
9 : line 8
10 :
If the extra line is removed, the output is this:
1 : This is line 1
2 : This is line 2
3 : This is line 3
4 : This is line 4
5 : This is line 5
6 : This is line 6
7 : This is line 7
8 : This is line 8 line
aiofile 1.5.1 worked fine on Ubuntu 18.04, but aiofile 3.0.0 raises SystemError: (22, 'Invalid argument') on Ubuntu 18.04.
aiofile 3.0.0 works fine on Ubuntu 20.04.
test script:
root@ubuntu-s-1vcpu-1gb-fra1-01:~# cat aiofile_test.py
import asyncio
from aiofile import AIOFile
async def main():
    async with AIOFile("/tmp/hello.txt", 'w+') as afp:
        await afp.write("Hello ")
        await afp.write("world", offset=7)
        await afp.fsync()
        print(await afp.read())
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
output:
root@ubuntu-s-1vcpu-1gb-fra1-01:~# pip3 install aiofile==3.0.0
Collecting aiofile==3.0.0
Using cached https://files.pythonhosted.org/packages/f9/e5/a3215b7da4fb90c3abb7850ae1f3fa4c7553ab33381c7f49aaccc8adc531/aiofile-3.0.0-py3-none-any.whl
Requirement already satisfied: caio~=0.5.3 in /usr/local/lib/python3.6/dist-packages (from aiofile==3.0.0)
Installing collected packages: aiofile
Found existing installation: aiofile 1.5.1
Uninstalling aiofile-1.5.1:
Successfully uninstalled aiofile-1.5.1
Successfully installed aiofile-3.0.0
root@ubuntu-s-1vcpu-1gb-fra1-01:~# python3 aiofile_test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/caio/asyncio_base.py", line 43, in step
self.context.submit(*operations.keys())
SystemError: (22, 'Invalid argument')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "aiofile_test.py", line 15, in <module>
loop.run_until_complete(main())
File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "aiofile_test.py", line 11, in main
print(await afp.read())
File "/usr/local/lib/python3.6/dist-packages/aiofile/aio.py", line 163, in close
await self.fsync()
File "/usr/local/lib/python3.6/dist-packages/aiofile/aio.py", line 223, in fsync
return await self.__context.fdsync(self.fileno())
File "/usr/local/lib/python3.6/dist-packages/caio/asyncio_base.py", line 87, in submit
await future
File "/usr/local/lib/python3.6/dist-packages/caio/asyncio_base.py", line 48, in step
self.context.submit(operation)
SystemError: (22, 'Invalid argument')
works fine with ubuntu20.04
regards Stefan
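A hypothesis worth checking, not a confirmed cause: the failing call in the traceback is fdsync submitted through Linux native AIO, and fsync support in native AIO is understood to have arrived in kernel 4.18. Ubuntu 18.04 ships a 4.15 kernel and Ubuntu 20.04 a 5.4 kernel, which would fit the report. A small sketch for testing a kernel release string against that threshold; kernel_at_least is a hypothetical helper and the 4.18 figure is the assumption just described.

```python
import platform
import re


def kernel_at_least(release, major, minor):
    """Return True if a release string such as '4.15.0-112-generic' is >= major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False  # unknown format: assume the feature is missing
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


# The 4.18 threshold is an assumption, see the note above the code.
native_fsync_likely = kernel_at_least(platform.release(), 4, 18)
```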
The object produced by async_open has a seek() method. However, it accepts only two positional arguments (the object itself and the offset), while Python file objects accept three (including the object itself); the third is the position from which the offset is applied (0 for the beginning, 1 for the current position, 2 for the end of the file). This leads to conflicts between this module and other modules that rely on the standard Python file interface. Will this ever be changed?
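For reference, the whence semantics described above can be reduced to a small function that turns (offset, whence) into an absolute offset. This is a sketch of the standard file-object behaviour, not aiofile code; resolve_seek is a hypothetical name, and the current position and file size are assumed to be tracked by the caller.

```python
import os


def resolve_seek(offset, whence, current, size):
    """Translate a file-style seek(offset, whence) into an absolute offset.

    current is the wrapper's present position and size the file length;
    both are assumed to be tracked by the caller.
    """
    if whence == os.SEEK_SET:
        target = offset
    elif whence == os.SEEK_CUR:
        target = current + offset
    elif whence == os.SEEK_END:
        target = size + offset
    else:
        raise ValueError("invalid whence: %r" % (whence,))
    if target < 0:
        raise ValueError("negative seek position: %d" % target)
    return target
```

A wrapper whose seek() takes only an absolute offset could call this first and pass the result through unchanged.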
Source (opening a nonexistent file):
import aiofile
import asyncio
loop = asyncio.get_event_loop()
async def main():
    try:
        async with aiofile.AIOFile('x.txt', 'r', loop):
            pass
    except:
        pass
loop.run_until_complete(main())
I'd expect this to close normally with no error, but I get this:
Exception ignored in: <bound method AIOFile.__del__ of <AIOFile: 'x.txt'>>
Traceback (most recent call last):
File ".../lib/python3.6/site-packages/aiofile/aio.py", line 74, in __del__
self.close()
File ".../lib/python3.6/site-packages/aiofile/aio.py", line 58, in close
if self.__fileno == -2:
AttributeError: _AIOFile__fileno
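The message suggests that __del__ runs on an object whose __init__ raised before the file descriptor attribute was assigned. A common way to make a destructor tolerate that is a class-level default. The sketch below shows the pattern on a hypothetical SafeFile class and is not aiofile's actual implementation:

```python
import os


class SafeFile:
    """Hypothetical stand-in for a file wrapper; not aiofile's real class."""

    # Class-level default: the attribute exists even if __init__ raises early.
    _fileno = -1

    def __init__(self, path):
        # os.open raises FileNotFoundError before _fileno is reassigned.
        self._fileno = os.open(path, os.O_RDONLY)

    def close(self):
        if self._fileno < 0:
            return  # never opened, or already closed
        os.close(self._fileno)
        self._fileno = -1

    def __del__(self):
        self.close()


try:
    SafeFile("definitely-missing-x.txt")
    open_failed = False
except FileNotFoundError:
    open_failed = True
```

With the default in place, the half-constructed object is cleaned up silently and only the FileNotFoundError reaches the caller.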
how to reproduce:
async with aiofile.AIOFile("10_mb_file", "rb") as f:
    async for c in aiofile.Reader(f, chunk_size=1024*1024):
        print(len(c))
expected: prints 10x 1048576
actual: prints 1x 587
system: win10, python 3.8.1 x64, aiofile pip 1.5.2
Adding flags |= os.O_BINARY in aio.py after line 96 fixes the issue for me.
Not sure if this is an oversight or if I am misunderstanding something here.
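Some background that may explain the fix: on Windows, a descriptor opened without O_BINARY is in text mode, where line endings are translated and a Ctrl-Z byte (0x1A) is treated as end of file. That would produce a short read like the 587 bytes above if such a byte sits at that position, although this has not been verified for the file in question. Since os.O_BINARY exists only on Windows, a portable version of the flag change could look like this sketch (binary_safe_flags is a hypothetical name):

```python
import os


def binary_safe_flags(flags):
    """Add O_BINARY where the platform defines it (Windows); no-op elsewhere."""
    return flags | getattr(os, "O_BINARY", 0)


read_flags = binary_safe_flags(os.O_RDONLY)
```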
It raises AttributeError: 'AIOFile' object has no attribute 'flush' when I run the code example on Windows 10 with Python 3.6.
Why?