vt-py's Issues

JSON parse error during iteration

I can't figure out whether this is a problem with the data the VirusTotal API returns or with the vt-py client. I am using vt-py to automate processing of Livehunt notifications. It works most of the time, but intermittently I get the following exception in my logs, and I can't figure out what the problem is:

2021-06-24 05:37:26 job_scheduler.log [ERROR]::job_scheduler->check_vthunt:314 An exception occured when trying to process vthunt
Traceback (most recent call last):
  File "/opt/ems/job_scheduler.py", line 279, in check_vthunt
    for notification in notifications:
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/iterator.py", line 137, in __iter__
    self._items, self._server_cursor = self._get_batch()
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/iterator.py", line 114, in _get_batch
    self._path, params=self._build_params())
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/client.py", line 403, in get_json
    return _make_sync(self.get_json_async(path, *path_args, params=params))
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/client.py", line 53, in _make_sync
    return event_loop.run_until_complete(future)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/asyncio/base_events.py", line 488, in run_until_complete
    return future.result()
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/client.py", line 408, in get_json_async
    return await self._response_to_json(response)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/client.py", line 236, in _response_to_json
    return await response.json_async()
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/vt/client.py", line 99, in json_async
    return await self._aiohttp_resp.json()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/aiohttp/client_reqrep.py", line 1032, in json
    return loads(self._body.decode(encoding))  # type: ignore
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 14378 column 35 (char 792013)

The traceback was identical on the other occurrences; only the timestamp and the JSON parse position differ:

2021-06-24 13:38:18 ... json.decoder.JSONDecodeError: Unterminated string starting at: line 15493 column 33 (char 869532)
2021-06-25 03:00:12 ... json.decoder.JSONDecodeError: Unterminated string starting at: line 17004 column 17 (char 956831)
2021-06-24 03:37:15 ... json.decoder.JSONDecodeError: Unterminated string starting at: line 14384 column 40 (char 792288)

A snippet of my code looks like the following:

with Client(apikey=item.get('api_key'), host=self.vt_host) as client:
    notifications = client.iterator('/intelligence/hunting_notification_files', batch_size=10, limit=self.vthunt_limit)
    for notification in notifications:
        # Process notification...
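Until the root cause is found, one pragmatic mitigation is to retry the whole fetch when a `JSONDecodeError` surfaces, since the failures are intermittent. A minimal, library-agnostic sketch under that assumption; `retry_on_error` is a hypothetical helper, not part of vt-py:

```python
import time

def retry_on_error(fn, exceptions, attempts=3, base_delay=1.0):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts, re-raise the last error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage around the iteration shown above:
# notifications = retry_on_error(
#     lambda: list(client.iterator('/intelligence/hunting_notification_files',
#                                  batch_size=10, limit=100)),
#     (json.JSONDecodeError,))
```

Note that retrying restarts the iteration from the beginning, so this only suits jobs that can tolerate reprocessing a batch.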

Naming of tags

All tags are named X.X.X (0.1.0, 0.2.0, etc.), but the latest one has a "v" prefix (v0.6.3).

Results issue

When I get the results from scanning a file, the JSON data that comes back never flags it as suspicious or malicious, no matter what file I scan, even though scanning the same file on the VirusTotal website reports it as malicious. (I'm using the EICAR test file.) I attached the file I'm using.
test.txt

Tag the source

Could you please tag the source again? This allows distributions to get the complete source from GitHub.

Thanks

Use next in iterator objects

Python 3.7
vt-py 0.5.4

import vt

client = vt.Client(VT_API_KEY)
query = 'entity:domain p:0 urls_max_detections:10+'
attacked_legit_domains = client.iterator('/intelligence/search', params={'query': query}, limit=1)

legit_domain = next(attacked_legit_domains)

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-806f21dacbfe> in <module>()
      2 attacked_legit_domains = client.iterator('/intelligence/search', params={'query': query}, limit=1)
      3 
----> 4 legit_domain = next(attacked_legit_domains)

TypeError: 'Iterator' object is not an iterator
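This is the standard Python distinction between an *iterable* (defines `__iter__`) and an *iterator* (also defines `__next__`): `next()` only accepts the latter. Until the class implements `__next__`, a generic workaround is to wrap the object with `iter()` first. A stdlib-only sketch; the `Batches` class is a hypothetical stand-in for vt-py's `Iterator`:

```python
class Batches:
    """Iterable (has __iter__) but not an iterator (no __next__)."""
    def __init__(self, items):
        self.items = items

    def __iter__(self):
        return iter(self.items)

batches = Batches([1, 2, 3])
# next(batches) would raise TypeError: 'Batches' object is not an iterator
first = next(iter(batches))  # wrapping with iter() yields a real iterator
```

In the snippet above, `next(iter(attacked_legit_domains))` would likewise avoid the TypeError.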

Task was destroyed but it is pending!

I'm running "python3 search_and_download_topn_files.py -n 5 'type:"peexe"' --apikey 1234" and the following error is returned.

2022-08-17 10:26:03 INFO Starting VirusTotal Intelligence downloader
2022-08-17 10:26:03 INFO * VirusTotal Intelligence search: type:"peexe"
2022-08-17 10:26:03 INFO * Number of files to download: 5
2022-08-17 10:26:06 ERROR Task was destroyed but it is pending!
task: <Task pending name='Task-3' coro=<DownloadTopNFilesHandler.download_files() running at search_and_download_topn_files.py:55>>
2022-08-17 10:26:06 ERROR Task was destroyed but it is pending!
task: <Task pending name='Task-4' coro=<DownloadTopNFilesHandler.download_files() running at search_and_download_topn_files.py:55>>
2022-08-17 10:26:06 ERROR Task was destroyed but it is pending!
task: <Task pending name='Task-5' coro=<DownloadTopNFilesHandler.download_files() running at search_and_download_topn_files.py:55>>
2022-08-17 10:26:06 ERROR Task was destroyed but it is pending!
task: <Task pending name='Task-6' coro=<DownloadTopNFilesHandler.download_files() running at search_and_download_topn_files.py:55>>

add example for file scan

The most important feature of VT is scanning a file and getting the results. We should have an example of that, even in the README.

File scan results

How can I get the results of a file scan? I specifically want the body-sha256 ID of the file scan.

Add logging

It would be nice if there were some logging.debug() calls throughout the library to assist with troubleshooting exactly what requests are being made with what headers.
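The usual pattern is for a library to log to a named logger via `logging.debug()` and let applications opt in by configuring that logger. A stdlib sketch of what this could look like; the logger name "vt" and the `make_request` function are assumptions for illustration, not current vt-py behavior:

```python
import logging

# "vt" as the logger name is an assumption, not something vt-py does today.
log = logging.getLogger("vt")

def make_request(path, headers):
    # A library instrumented this way would log every outgoing request.
    log.debug("GET %s headers=%s", path, headers)
    return 200  # stand-in for the real HTTP call

# Applications opt in to the library's debug output:
logging.basicConfig(level=logging.DEBUG)
status = make_request("/files/abc", {"x-apikey": "<redacted>"})
```

With this in place, troubleshooting which requests and headers are sent becomes a matter of raising the log level, with no code changes in the library's callers.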

module vt has no attribute 'Client'

C:\Python37\python.exe intel_to_net_infra.py --apikey=xxx --query='type:apk'
Traceback (most recent call last):
  File "intel_to_net_infra.py", line 162, in <module>
    loop.run_until_complete(main())
  File "C:\Python37\lib\asyncio\base_events.py", line 579, in run_until_complete
    return future.result()
  File "intel_to_net_infra.py", line 149, in main
    await asyncio.gather(enqueue_files_task)
  File "intel_to_net_infra.py", line 62, in get_matching_files
    async with vt.Client(self.apikey) as client:
AttributeError: module 'vt' has no attribute 'Client'
Task was destroyed but it is pending!
task: <Task pending coro=<VTISearchToNetworkInfrastructureHandler.get_network() running at intel_to_net_infra.py:77> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x000002177533B078>()]>>
Task was destroyed but it is pending!
task: <Task pending coro=<VTISearchToNetworkInfrastructureHandler.build_network() running at intel_to_net_infra.py:106> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x000002177533B8B8>()]>>

I have asyncio and vt-py imported, and I get this error when running the test query shown above. It seems like it may trace back to an asyncio issue? I'm not sure what's going on here.

Pytest warnings about Python 3.10

DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.

$ python setup.py test

=============================== warnings summary ===============================
tests/test_client.py: 14 warnings
tests/test_iterator.py: 5 warnings
  /opt/hostedtoolcache/Python/3.9.6/x64/lib/python3.9/site-packages/aiohttp-4.0.0a1-py3.9-linux-x86_64.egg/aiohttp/connector.py:944: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
    hosts = await asyncio.shield(self._resolve_host(

tests/test_client.py: 11 warnings
tests/test_iterator.py: 3 warnings
  /opt/hostedtoolcache/Python/3.9.6/x64/lib/python3.9/site-packages/aiohttp-4.0.0a1-py3.9-linux-x86_64.egg/aiohttp/locks.py:21: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
    self._event = asyncio.Event(loop=loop)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================= 17 passed, 33 warnings in 0.90s ========================

URL objects not always populated with same keys

Today I was experimenting with vt-py using code a bit like this:

with vt.Client(apikey=[SOME_KEY]) as cli:
    iterator = cli.iterator(
        path="/files/a0b9ddaa108d8dd6faca8b661fc0890be5f8077a131a5585e386dd25801276b6/embedded_urls",
        limit=response.slider,
        batch_size=min(40, response.slider)
    )
    linkLabel = "embeds URL"
    for obj in iterator:
        if obj.type == "url":
            url = obj.url

The code above will fail, because VT doesn't always respond in the same way (in this case, the object it crashes on is):

{"_type": "url", "_id": "125aa0fff3f9792f95947f44c120037c2ea0e10ed039a728cf94719a87af972a", "_modified_attrs": ["_modified_data", "_context_attributes"], "_modified_data": {}, "_context_attributes": {"url": "https://client.api.ufiler.pro/api/v1/integrator/%25s/rev1"}}

I think when VT responds like this, vt-py should populate the obj.url field.

Cheers,
Tom

Hash scan missing and few other endpoints

I was exploring the VirusTotal API v3. Some of the major features are missing:

  • Hash scan is missing.
  • All x/report endpoints are missing, where x is url, domain, or ip.
  • Also, analysing a URL does not give a positives field like it used to in v2.

Are they present already? Is there any alternative? Can we expect those actions in v3 as well?

Default limit is 0 in client.iterator

Hello,

When using client.iterator over a collection without specifying a limit parameter, it is set to 0 by default, which is confusing at first. I think it should be set to some value, such as 10, to prevent situations like this one:

>>> it = client.iterator('/domains/google.com/downloaded_files')
>>> list(it)
[]
>>> it = client.iterator('/domains/google.com/downloaded_files', limit=10)
>>> list(it)
[<vt.object.Object object at 0x10ab00fd0>, <vt.object.Object object at 0x10bc08690>, <vt.object.Object object at 0x10bc0d910>, <vt.object.Object object at 0x10b9ad290>, <vt.object.Object object at 0x10b90b0d0>, <vt.object.Object object at 0x10b90b5d0>, <vt.object.Object object at 0x10b90b610>, <vt.object.Object object at 0x10b90b350>, <vt.object.Object object at 0x10b90b890>, <vt.object.Object object at 0x10b90bd50>]

Regards,
Marta

Unclosed client session after File Upload

I do get the following error after uploading a file to scan.

Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x000001E137254D90>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x000001E1382814C0>, 200746.125)]']
connector: <aiohttp.connector.TCPConnector object at 0x000001E137254880>

The scan itself runs flawlessly with the following code.

with open(file, "rb") as f:
    analysis = client.scan_file(f)

while True:
    analysis = client.get_object("/analyses/{}", analysis.id)
    if analysis.status == "completed":
        break
    time.sleep(30)

If there is a way to just suppress the error message, I would be fine for now.
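The warning usually means the client's underlying network session was never closed. The general remedy is not to suppress the message but to guarantee cleanup, either by calling close explicitly or by using a context manager so the session is released even on errors. A stdlib-only sketch of the pattern; the `Session` class is a hypothetical stand-in for a client that owns a connection:

```python
import contextlib

class Session:
    """Stand-in for a client object holding a network session."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

# contextlib.closing guarantees close() runs when the block exits,
# including when an exception is raised inside it.
with contextlib.closing(Session()) as s:
    pass  # ... scan files, poll analyses ...
assert s.closed
```

With vt-py, the snippets earlier in this page suggest the equivalent is `with vt.Client(apikey) as client:` (or calling `client.close()` in a `finally` block) so the aiohttp session is torn down deterministically.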

`examples/file_feed_async.py` seems broken

# pipenv run python examples/file_feed_async.py
examples/file_feed_async.py:52: RuntimeWarning: coroutine 'Feed.__aiter__' was never awaited
  async for file_obj in feed:
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task exception was never retrieved
future: <Task finished coro=<FeedReader._get_from_feed_and_enqueue() done, defined at examples/file_feed_async.py:48> exception=TypeError("'async for' received an object from __aiter__ that does not implement __anext__: coroutine")>
Traceback (most recent call last):
  File "examples/file_feed_async.py", line 52, in _get_from_feed_and_enqueue
    async for file_obj in feed:
TypeError: 'async for' received an object from __aiter__ that does not implement __anext__: coroutine

client.download_file "RuntimeError: Timeout context manager should be used inside a task"

While working with the API client, I am unable to download a file using the download_file method. The following exception occurs:

  File "/usr/local/lib/python3.9/site-packages/vt/client.py", line 314, in download_file
    return make_sync(self.download_file_async(hash, file))
  File "/usr/local/lib/python3.9/site-packages/vt/utils.py", line 26, in make_sync
    return event_loop.run_until_complete(future)
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/vt/client.py", line 334, in download_file_async
    await self.__download_async(f'/files/{hash}/download', file)
  File "/usr/local/lib/python3.9/site-packages/vt/client.py", line 322, in __download_async
    response = await self.get_async(endpoint)
  File "/usr/local/lib/python3.9/site-packages/vt/client.py", line 423, in get_async
    await self._get_session().get(
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 466, in _request
    with timer:
  File "/usr/local/lib/python3.9/site-packages/aiohttp/helpers.py", line 701, in __enter__
    raise RuntimeError(
RuntimeError: Timeout context manager should be used inside a task

TLS verification turned off

You should remove the ssl=False from

connector=aiohttp.TCPConnector(ssl=False),

This means that currently anyone in a man-in-the-middle position can silently sniff the API key and other sensitive data.

Exceptions for Rate Limits

I'm having trouble understanding which exceptions would be thrown if my API key hits one of the rate limits associated with it. The documentation mentions receiving a response code >400, and I traced that through some of vt.Client's methods; it seems to construct an exception from a dict built from the server response, but that is as far as I got. I hate to open an issue that's really a support request, but any help would be greatly appreciated.

EDIT: Also querying vt like so

>>> url_id = vt.url_id("http://www.virustotal.com")
>>> url = client.get_object("/urls/{}", url_id)

does not count towards rate limits, does it? Or am I understanding this incorrectly?

Thanks

Enhancement: Add ZIP Download Function

Having a function to ease downloading an encrypted ZIP archive (/zip_file) of files with a specified password would be useful.

  def download_zip(self, hashes, file, password):
    """Downloads a zip archive of files given their hashes (SHA-256, SHA-1 or MD5).

    The zip archive created on the server will be downloaded to the provided file
    object. The file object must be opened in write binary mode ('wb').

    :param hashes: List of file hashes.
    :param file: A file object where the downloaded archive will be written to.
    :param password: Password used to encrypt the archive.
    :type hashes: list
    :type file: file-like object
    :type password: str
    """
    return _make_sync(self.download_zip_async(hashes, file, password))

  async def download_zip_async(self, hashes, file, password):
    """Like :func:`download_zip` but returns a coroutine."""
    if password is None:
      password = 'infected'
    zip_files = await self.get_data_async(
        '/intelligence/zip_files',
        params={'password': password, 'hashes': hashes})

    while zip_files['attributes']['status'] in ('starting', 'creating'):
      await asyncio.sleep(20)
      zip_files = await self.get_data_async(
          '/intelligence/zip_files/{}'.format(zip_files['id']))

    if zip_files['attributes']['status'] == 'finished':
      response = await self.get_async(
          '/intelligence/zip_files/{}/download'.format(zip_files['id']))
      while True:
        chunk = await response.content.read_async(1024 * 1024)
        if not chunk:
          break
        file.write(chunk)

Errors Related to Client.__del__

Hey there,
My functions are working fine, but when I run my script without my arguments I get errors.
Any idea how to fix this?

Exception ignored in: <function Client.__del__ at 0x0000000003DB9160>
Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\vt\client.py", line 263, in __del__
  File "C:\Python39\lib\site-packages\vt\client.py", line 295, in close
  File "C:\Python39\lib\site-packages\vt\utils.py", line 22, in make_sync
  File "C:\Python39\lib\asyncio\events.py", line 725, in get_event_loop_policy
  File "C:\Python39\lib\asyncio\events.py", line 718, in _init_event_loop_policy
ImportError: sys.meta_path is None, Python is likely shutting down
sys:1: RuntimeWarning: coroutine 'Client.close_async' was never awaited

TypeError: Expected a file to be a file object, got <class 'tempfile.SpooledTemporaryFile'> (FastAPI Request Files)

I am trying to use the scan_file_async function with the FastAPI Request Files function:

@app.post('/foo')
async def foo(file: UploadFile = File(...)):
    while chunk := await file.read(1024 * 1024):
        sha1.update(chunk)

    result = None
    
    with suppress(Expression):
        result = await vt_client.get_object_async(f'/files/{sha1.hexdigest()}')
        result = result.last_analysis_stats
        result['name'] = file.filename
        result['hash'] = sha1.hexdigest()

    if not result:
        analysis = await vt_client.scan_file_async(file.file, True)
        result = analysis.stats
        result['name'] = file.filename
        result['hash'] = sha1.hexdigest()

    return result

And I'm not getting the result I want because I get this error:

Traceback (most recent call last):
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\uvicorn\protocols\http\httptools_impl.py", line 404, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\fastapi\applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\middleware\exceptions.py", line 75, in __call__
    raise exc
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\middleware\exceptions.py", line 64, in __call__
    await self.app(scope, receive, sender)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\fastapi\middleware\asyncexitstack.py", line 21, in __call__
    raise e
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\fastapi\middleware\asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\routing.py", line 680, in __call__
    await route.handle(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\routing.py", line 275, in handle
    await self.app(scope, receive, send)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\starlette\routing.py", line 65, in app
    response = await func(request)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\fastapi\routing.py", line 231, in app
    raw_response = await run_endpoint_function(
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\fastapi\routing.py", line 160, in run_endpoint_function
    return await dependant.call(**values)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\.\scratch2.py", line 36, in root
    analysis = await vt_client.scan_file_async(file.file, True)
  File "C:\Users\ktako\OneDrive\Documents\Repositories\ebook-collector\__pypackages__\3.10\lib\vt\client.py", line 686, in scan_file_async
    raise TypeError(f'Expected a file to be a file object, got {type(file)}')
TypeError: Expected a file to be a file object, got <class 'tempfile.SpooledTemporaryFile'>

I re-checked the documentation for FastAPI UploadFile and, according to it, file.file should be a file-like object. As stated in your documentation here, scan_file accepts a file-like object. Is there a workaround for this issue?

Edit: Changed the Python script above
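The check likely rejects `tempfile.SpooledTemporaryFile` because, despite behaving like a file, it is not a regular file object type. A generic workaround is to hand the client a real in-memory file instead, e.g. by copying the upload into an `io.BytesIO`. A hedged stdlib sketch; `to_real_file` is a hypothetical helper, and this buffers the whole upload in memory:

```python
import io
import tempfile

def to_real_file(fileobj):
    """Copy a file-like object's contents into a fresh io.BytesIO."""
    fileobj.seek(0)
    return io.BytesIO(fileobj.read())

# SpooledTemporaryFile is what FastAPI hands you via UploadFile.file.
spooled = tempfile.SpooledTemporaryFile()
spooled.write(b"hello")
buf = to_real_file(spooled)  # a real io object, positioned at the start
```

In the FastAPI handler above, that would mean something like `await vt_client.scan_file_async(io.BytesIO(await file.read()), True)`; whether that trade-off (memory for compatibility) is acceptable depends on the upload sizes involved.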

Bad formatting of request

When uploading content (for instance a hunting ruleset), one can hit the following error:
vt.error.APIError: ('BadRequestError', 'Expecting JSON dictionary')

After investigation, it is due to the request using the wrong content type, 'text/plain', instead of 'application/json'.

Get the data as response from URL report

Here is the code

import vt
client = vt.Client(<API KEY>)
analysis = client.scan_url('https://21stcenturywire.com/2021/04/07/texas-governor-signs-order-banning-use-of-vaccine-passports/', wait_for_completion=True)
print(analysis)

Output

analysis u-17cca8a680d8c2b04a044cc689cdb2dadde2b43abcc2edfe290f3cb552d49bbe-1619945963
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x0000017280919610>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x0000017280944220>, 6975.046)]']
connector: <aiohttp.connector.TCPConnector object at 0x0000017280919430>

But I want the result as JSON, to be added to my Streamlit web app, like this example from https://developers.virustotal.com/reference#url-report:

{
 'response_code': 1,
 'verbose_msg': 'Scan finished, scan information embedded in this object',
 'scan_id': '1db0ad7dbcec0676710ea0eaacd35d5e471d3e11944d53bcbd31f0cbd11bce31-1390467782',
 'permalink': 'https://www.virustotal.com/url/__urlsha256__/analysis/1390467782/',
 'url': 'http://www.virustotal.com/',
 'scan_date': '2014-01-23 09:03:02',
 'filescan_id': null,
 'positives': 0,
 'total': 51,
 'scans': {
    'CLEAN MX': {
      'detected': false, 
      'result': 'clean site'
    },
    'MalwarePatrol': {
      'detected': false, 
      'result': 'clean site'
    }
  }
}

I do not think that batch_size and limit work properly

I am having trouble figuring out the best way to use the iterator. I am using an iterator to call the /intelligence/hunting_notification_files endpoint.

with Client(apikey) as vt_client:
    cursor = None
    while True:
        notifications = vt_client.iterator('/intelligence/hunting_notification_files', batch_size=2, limit=5, cursor=cursor)
        for notification in notifications:
            if not notification:
                cursor = notifications.cursor
                break
            # do something with notification

batch_size seems to work, limit seems to do nothing. What ends up happening is the for loop loops over twice (batch_size=2) and on the third iteration notification is None, so I get the cursor, break out of the for loop and create a new iterator with an updated cursor...

What am I doing wrong? The API endpoint has a maximum limit of 40, so the maximum I can set the batch_size to is 40. How could I write something that would use a maximum batch_size (40) but limit to 100 total iterations?
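For the last question: a single iterator with `batch_size` set to the endpoint's per-request maximum and `limit` set to the total you want should be enough — the iterator follows cursors internally, so re-creating it by hand shouldn't be needed. A hedged sketch (the helper name is mine, untested against the live API):

```python
def iter_notifications(apikey, max_items=100):
    """Yield up to `max_items` livehunt notifications from one iterator.

    `limit` caps the total number of objects yielded across all requests,
    while `batch_size` only controls how many objects come per request.
    """
    import vt  # deferred so the sketch can be read without vt installed

    with vt.Client(apikey) as client:
        it = client.iterator(
            '/intelligence/hunting_notification_files',
            batch_size=40,    # per-request maximum for this endpoint
            limit=max_items)  # total cap across all requests
        for notification in it:
            yield notification
```

If `limit` has no effect in your installed version, it may be worth checking the `Client.iterator` signature you actually have, since the keyword arguments have evolved between releases.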

sha-256

@mgmacias95 you had given me code which gets the ID of a URL and then asks VirusTotal for the results, but I want to scan a file and then get the SHA-256 ID of that file. Can you help?
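If all you need is the file's SHA-256 so you can look it up afterwards, you can compute it locally with the standard library before (or after) uploading — no API call required. A stdlib-only sketch:

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 of a file without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex digest can then be used with the API, e.g. `client.get_object(f"/files/{sha256_of_file('sample.bin')}")`.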

Enhancement: Enable adding Proxy settings to Client initialization

Currently, there is no ability to add proxy settings to the vt.Client where a proxy is required within the usage environment. Proposing adding a proxy argument to vt.Client which would then be added to get_async or post_async, etc, as appropriate, e.g.:

    async def get_async(self, path, *path_args, params=None):
        """Like :func:`get` but returns a coroutine."""
        return vt.ClientResponse(
            await self._get_session().get(
                self._full_url(path, *path_args),
                params=params,
                proxy=self._proxy))

Using the client in a Web Application

Morning,
We have a standard web application (in our case Django, but the example holds for most frameworks, including Flask, CherryPy, etc.).

I'd have expected this to work inside a function in my web app

        with vt.Client(api.api_key) as client:
            file_info = client.get_object(
                '/files/44d88612fea8a8f36de82e1278abb02f')
            return file_info.sha256

But it doesn't unfortunately. This is due to the _make_sync call which tries to get the current event loop and run the awaitable to completion. I believe this will cause a fair bit of consternation for most web applications.

Any suggestions on how the library is intended to be used inside web applications? Here is the traceback

Traceback (most recent call last):
  File "/..../actions/vt_get_file.py", line 47, in process
    file_info = client.get_object(
  File "/.../lib/python3.8/site-packages/vt/client.py", line 416, in get_object
    return _make_sync(self.get_object_async(path, *path_args, params=params))
  File "/.../lib/python3.8/site-packages/vt/client.py", line 47, in _make_sync
    return asyncio.get_event_loop().run_until_complete(future)
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/events.py", line 639, in get_event_loop
    raise RuntimeError('There is no current event loop in thread %r.'

[help] Pagination for endpoints that are not collections

There are some URL endpoints in VT that are paginated and use a structure like this:

{
  ...
  "links": {
    "self": "somelink",
    "next": "somelink"
  }
}

It seems that the general approach in this library for these is to use client.iterator, but some of these endpoints don't yield a collection, e.g.

/files/sha256goeshere?relationships=embedded_urls

What is the right way of querying this endpoint for more than 20 elements using the library? An example file with >20 embedded URLs is:

a0b9ddaa108d8dd6faca8b661fc0890be5f8077a131a5585e386dd25801276b6
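One approach that should work is addressing the relationship as its own collection path (`/files/{id}/embedded_urls` rather than `?relationships=embedded_urls`) and letting the iterator page through it. A hedged sketch (the helper name is mine, untested against the live API):

```python
def iter_embedded_urls(apikey, file_hash, max_urls=100):
    """Page through a file's embedded_urls relationship.

    Querying the relationship as its own collection path lets
    client.iterator() follow the cursor/`next` links for you.
    """
    import vt  # deferred so the sketch can be read without vt installed

    with vt.Client(apikey) as client:
        for url_obj in client.iterator(
                '/files/{}/embedded_urls'.format(file_hash),
                limit=max_urls):
            yield url_obj
```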

scan_file_async closes file object

I have been meaning to ask this: does the function intend to close the file after scanning?

async def virus_analysis_hash(md5_hash):
    with suppress(Exception):
        result = await client.get_object_async(f"/files/{md5_hash}")
        return result.last_analysis_stats

    return None


async def virus_analysis_file(file):
    tmp_file = file.file
    result = await client.scan_file_async(tmp_file, True)
    return result.stats


async def virus_analysis(md5_hash, file):
    if not (result := await virus_analysis_hash(md5_hash)):
        result = await virus_analysis_file(file)

    return result["malicious"] > 0 or result["suspicious"] > 0

I wanted to upload the file after checking if it isn't suspicious or malicious, is there a workaround or how do I stop the function from closing the file?
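Until that behaviour changes, one workaround is to hand the client an in-memory copy of the data, so the original handle is untouched. A stdlib-only sketch, with a stand-in that mimics a scan function closing its argument (names are illustrative):

```python
import io

def scan_with_copy(fileobj, scan_func):
    """Give `scan_func` an in-memory copy so `fileobj` stays open."""
    pos = fileobj.tell()
    copy = io.BytesIO(fileobj.read())
    fileobj.seek(pos)       # rewind the original for later use
    return scan_func(copy)  # the copy may be closed; the original survives

# Stand-in for a scan call that closes the handle it is given.
def fake_scan(f):
    data = f.read()
    f.close()
    return len(data)

buf = io.BytesIO(b"payload")
size = scan_with_copy(buf, fake_scan)
```

With vt-py you would pass the `io.BytesIO` copy to `scan_file_async` and keep your original file object for the later upload.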

How to acess the results of scan

Hello
I am running the following:

with open("download.png", "rb") as f:
    analysis = client.scan_file(f, wait_for_completion=True)

How do I get the results of the scan?
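With `wait_for_completion=True`, `scan_file` returns the finished Analysis object; its `stats` and `results` attributes should hold the verdict summary and the per-engine detail respectively. A hedged sketch (the helper name is mine):

```python
def scan_and_report(apikey, path):
    """Scan a file and return the finished analysis' summary and detail."""
    import vt  # deferred so the sketch can be read without vt installed

    with vt.Client(apikey) as client:
        with open(path, 'rb') as f:
            analysis = client.scan_file(f, wait_for_completion=True)
        # `stats` holds the verdict counts (malicious, suspicious, ...);
        # `results` holds the per-engine detail.
        return analysis.stats, analysis.results
```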

'urls' feed support?

Hi and thanks for your support for open-source libraries!

I was attempting to support the "URL" feed in the premium API, using an apikey which I have confirmed has access to that feed, and confirming the script has access to the apikey value

--
After updating the FeedType enum, which it seems would produce the endpoint I want,
e.g. https://www.virustotal.com/api/v3/feeds/urls/$time

[feeds.py]

class FeedType(enum.Enum):
  FILES = 'files'
  URLS = 'urls'

The below code seems to spin infinitely without writing anything
(which may be due to me using it wrong, or perhaps the Feed class doesn't yet implement the details of the URL feed)

with vt.Client(args.apikey) as client:
    # Iterate over the file feed, one file at a time. This loop doesn't
    # finish, when the feed is consumed it will keep waiting for more files.

    for url_obj in client.feed(vt.FeedType.URLS, cursor=args.cursor):
      # Write the file's metadata into a JSON-encoded file.

      url_path = os.path.join(args.output, url_obj.id)
      with open(url_path + '.json', mode='w') as f:
        f.write(json.dumps(url_obj.to_dict()))

search_and_download_topn_files.py not working as expected

When I run the provided "search_and_download_topn_files.py" and attempt to download 100+ files, it just downloads the same 10 files over and over.
The target directory lists only ten files (sha256 hashes of files that
match the query), but the same files are overwritten repeatedly.

molinajavier@molinajavier:~/Downloads$ python3 search_and_download_topn_files.py --apikey "$(cat /home/molinajavier/code/jramirez.key)" s:400+ p:0 type:pdf
2019-10-24 10:26:48 DEBUG Using selector: EpollSelector
2019-10-24 10:26:48 INFO Starting VirusTotal Intelligence downloader
2019-10-24 10:26:48 INFO * VirusTotal Intelligence search: s:400+ p:0 type:pdf
2019-10-24 10:26:48 INFO * Number of files to download: 100
molinajavier@molinajavier:~/Downloads$ tree intelligencefiles/
intelligencefiles/
└── 20191024T102648
├── 585adc5a586d2ebf64f9dd010b7ba3a1836364c3c2ae86a04f0c2fa17952dc4c
├── 9c6cf8869fb74345cf375385ed04d74f34b3af73de8599775d87d5b62eeb5c81
└── c95e8427278e0b7c90b41f3372299bad029614f924399567378d2f30480ff58a

1 directory, 3 files

RuntimeError: Event loop is closed in multiprocessing enviroments (celery).

Hi,

First of all, sorry for not providing a PR; lack of spare time.
The problem I faced is not strictly a vt-py issue, but I believe protection against it could also be added here.
In short, I've got software where Celery tasks using vt-py are processed among other tasks, and those other tasks use packages containing async code too. While trying to debug the error mentioned in the title I came up with a PoC of the problem and a possible solution. I believe the code is the best description.

Right now the only more or less clear solution is wrapping async code using threads.

#!/usr/bin/env python3


import os
from multiprocessing import Pool
import asyncio


def other_celery_task(job_id):
    '''
        Here we've got a common pattern in other libs; note the 'idiomatic'
        (or pretending to be) view of the event loop as your own resource:
        obtain a new event_loop
        try:
            do your job
        finally:
            event_loop.close()
    '''
    cpid = os.getpid()
    event_loop = asyncio.new_event_loop()
    asyncio.set_event_loop(event_loop)

    async def some_async_op_possibly_throwing_exception():
        print(
            "[*] async def some_async_op_possibly_throwing_exception() start id", job_id, cpid)
        # await asyncio.sleep(0)
        print(
            "[*] async def some_async_op_possibly_throwing_exception() done id", job_id, cpid)
        return 1
    try:
        return event_loop.run_until_complete(some_async_op_possibly_throwing_exception())
    except Exception:
        print("Oh no, a failure; let's ignore it and forget about the finally below")
    finally:
        event_loop.close()


def vt_client_related_celery_task(job_id):
    '''
        Here we've got reconstructed flow from vt.Client
    '''
    cpid = os.getpid()

    async def vt_client_async_op():
        print("[*] async def vt_client_async_op() start id", job_id, cpid)
        # await asyncio.sleep(0)
        print("[*] async def vt_client_async_op() done id", job_id, cpid)
        return 1

    try:
        event_loop = asyncio.get_event_loop()
        print("event loop was in place id",
              job_id, cpid, event_loop.is_closed())

        '''
            Try to uncomment the 2 lines below. I assume that a closed loop is
            not a nonexistent loop, so the RuntimeError will never be thrown.
            When the next Celery task using vt.Client arrives, we get
            RuntimeError: Event loop is closed
        '''
        # if event_loop.is_closed():
        #     raise RuntimeError("other task closed our loop?")
    except RuntimeError:
        # Generate an event loop if there isn't any.
        event_loop = asyncio.new_event_loop()
        asyncio.set_event_loop(event_loop)
        print("event loop regenerated id", job_id, cpid)

    return event_loop.run_until_complete(vt_client_async_op())


if __name__ == '__main__':
    '''
        Here we've got an oversimplification of the default Celery worker
        setup, dealing also with other tasks. Pool size 2 is intentionally
        small; this problem can be non-deterministic.
    '''

    with Pool(2) as pool:

        vt_round_one = pool.map_async(vt_client_related_celery_task,
                                      [i for i in range(0, 9)])
        other_round = pool.map_async(other_celery_task, [9])
        vt_round_two = pool.map_async(vt_client_related_celery_task,
                                      [i for i in range(10, 20)])
        if 20 == sum(vt_round_one.get() + other_round.get() + vt_round_two.get()):
            print("all tasks executed")
        else:
            print("yay, fail")

BR
Artur Augustyniak

Result get different from API and UI

Hi,

I am trying to get the verdict for a domain using the library, and for many domains I get a NotFoundError.
However, those domains do get a verdict in the UI. This has happened with many domains.

This is an example:

[Screenshot: "Screen Shot 2022-03-14 at 12 11 17" — the VT UI showing a verdict for the domain]

This is the code that I am using:

import vt
import json
import nest_asyncio
nest_asyncio.apply()


def get_analysis_result_for_url_from_vt(url):
    try:
        client = vt.Client(API_KEY)
        url_id = vt.url_id(url)
        url_obj = client.get_object(f"/urls/{url_id}")
        client.close()
        analysis_result = url_obj.last_analysis_stats
        if analysis_result['malicious'] >= 5:
            return 'VT_malicious'
        elif analysis_result['malicious'] < 2:
            return 'VT_benign'
        else:
            return 'VT_undefined'
    except Exception as e:
        client.close()
        return e


print(get_analysis_result_for_url_from_vt('dcanvasreporting.anglia.ac.uk'))
print(get_analysis_result_for_url_from_vt('google.com'))

Won't work inside Jupyter Notebook

When attempting to use the new v3 package within a Jupyter notebook I receive the error "RuntimeError: This event loop is already running." A little searching suggests that since Jupyter is already using an event loop, the VT package can't.


Add type hints

Type hinting improves DX if you are using a modern IDE like VS Code, so how about adding type hints to the library? I will work on that if it makes sense to you.

Proxy Configuration

Hi there,
The client does not support configuring proxy settings right now. That is unfriendly for developers who must go through a proxy to reach the API, for example when calling it to download samples or PCAP packages. Could you add this feature in the near future? Thanks a lot.

Best Regards,
Nigel

Documentation Improvement - iterator vs get_object

Original Submission:

      client.get_object fails when attempting to pull resolutions.

      Example:

          resolutions = client.get_object("/ip_addresses/46.232.251.191/resolutions")

      This works through the API developers console, but not via the vt-py library.

      https://developers.virustotal.com/reference/ip-relationships

      Is there an improvement that can be made in handling these responses?

While submitting this issue, I realized I should have been using the iterator to iterate through the response instead of using the get_object method. I read through the examples in the documentation and didn't see any references to when I might need to use the iterator vs when I could use single object calls.

I read through the quick start (https://virustotal.github.io/vt-py/quickstart.html) and the examples, but it wasn't clear that some endpoints don't support single get_object calls and need to be iterated. Instructions on when to use each of the different methods may be helpful, or maybe I missed something documented elsewhere. Thanks!
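As a rule of thumb: get_object fits endpoints that return a single object, while the iterator fits endpoints that return paged collections (such as /resolutions). A hedged sketch illustrating both (the helper name is mine, untested against the live API):

```python
def ip_and_resolutions(apikey, ip):
    """get_object for one object; iterator for a paged collection."""
    import vt  # deferred so the sketch can be read without vt installed

    with vt.Client(apikey) as client:
        # Single object: the IP address itself.
        ip_obj = client.get_object('/ip_addresses/{}', ip)
        # Collection: its resolutions, paged via the iterator.
        resolutions = list(
            client.iterator('/ip_addresses/{}/resolutions'.format(ip),
                            limit=40))
    return ip_obj, resolutions
```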

vt.Client doesn't seem to (successfully) connect

After creating a client session with:

nest_asyncio.apply()
cli = vt.Client(getpass.getpass('Introduce your VirusTotal API key: '))

And using the following code to iterate over results:
url_objects = [
u.to_dict() for u in cli.iterator(....

I get the following error msg:

AttributeError: 'Task' object has no attribute 'to_dict'

All tasks seem to be pending.

[bug] examples/download_files.py broken for python >= 3.10

This example is broken for Python >= 3.10:

https://github.com/VirusTotal/vt-py/blob/master/examples/download_files.py

Results in:

  File "/mnt/hgfs/Work/misc_ti/virustotal/download_util.py", line 85, in main
    queue = asyncio.Queue(loop=loop)
  File "/usr/lib/python3.10/asyncio/queues.py", line 34, in __init__
    super().__init__(loop=loop)
  File "/usr/lib/python3.10/asyncio/mixins.py", line 17, in __init__
    raise TypeError(
TypeError: As of 3.10, the *loop* parameter was removed from Queue() since it is no longer necessary

Needs something like this for Python 3.10 or greater:

loop = asyncio.get_event_loop()
queue = asyncio.Queue()
for hash in input_file:
    queue.put_nowait(hash.strip())

worker_tasks = []
for _ in range(args.workers):
    worker_tasks.append(
        loop.create_task(download_files(queue, args)))

# Wait until all worker tasks have completed.
loop.run_until_complete(asyncio.gather(*worker_tasks))
loop.close()
if not isinstance(input_file, list):
    input_file.close()

aiohttp ClientSession upload times-out

In case I have a pretty big file (specifically 60 MB) and a quite slow upload speed, the upload can take quite some time. The aiohttp ClientSession, as I read, times out at 5 minutes by default. In my case the upload, when I tested on the VirusTotal website, took around 10-15 minutes.
Is there a way to disable the timeout the aiohttp session has, from what I can access through vt-py's Client.scan_file? I'm not familiar with this library.
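Recent vt-py releases appear to accept a `timeout` argument on `vt.Client` that is forwarded to the underlying aiohttp session; check the `Client` signature in your installed version before relying on it. A hedged sketch (the helper name is mine):

```python
def make_client(apikey, timeout_seconds=1800):
    """Build a client with a raised HTTP timeout for slow uploads.

    Assumes a vt-py release whose Client accepts a `timeout` keyword
    (in seconds); older releases may not, so verify your version.
    """
    import vt  # deferred so the sketch can be read without vt installed

    return vt.Client(apikey, timeout=timeout_seconds)
```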

url scan issue

Why doesn't the URL scan return a body SHA-256 ID? Does it not return one, or can I just not find it?

Relationships support

Hi, thanks for your work.

How can I get relationships, like behaviour reports, using vt-py?
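Relationships can be queried as collection paths; for behaviour reports something like the following should work (the helper name is mine, untested against the live API):

```python
def iter_behaviours(apikey, file_hash):
    """Fetch a file's behaviour reports via its relationship path."""
    import vt  # deferred so the sketch can be read without vt installed

    with vt.Client(apikey) as client:
        for behaviour in client.iterator(
                '/files/{}/behaviours'.format(file_hash)):
            yield behaviour
```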
