Git Product home page Git Product logo

advocate's People

Contributors

coldheat avatar jordanmilne avatar sbdchd avatar spladug avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

advocate's Issues

Tests broken with requests 2.9.1

Seems to have something to do with the new usage of @pytest.fixture and our wrapper for requests' tests that auto-passes tests that call Session.mount(). I'm guessing fixture uses introspection to figure out what fixtures to add in based on arg names, but our wrapper uses *args, **kwargs.

Add support for user-defined DNS64 prefixes

A network admin may configure their DNS64 instance to use a prefix other than the well-known prefix, which is in the private IP space. We want to allow people to specify ranges that should be treated the same as the DNS64 WK prefix.

Refactor requests compatibility tests to work with new requests versions

Currently we have a pretty heinous way of checking that our wrappers don't break existing functionality across multiple versions of requests. We pull the source for requests, read in its test_requests.py file and make modifications so that any calls to requests.<foo>() will instead use the advocate wrapper. We also use regexes to replace code in older requests versions' test code that is broken under Python 3.

We then take all that regex-mangled code, exec() it, and shove the resulting test functions' globals into the test_advocate module so they get run with the rest of our tests. Gross.

This sort of worked up until recently, when requests did the sensible thing and migrated from a single test script to a directory of test scripts. Rather than have separate code-paths for searching / replacing source in the different requests versions we should make these tests less brittle.

My current plan is:

  • Throw out the regex-replacement junk.
  • Stop running tests against versions of requests where the upstream test suite fails without modification (such as 2.4 which broke due to httpbin changes.)
  • Stop running Python 3 tests against versions of requests where the upstream test suite failed under Python 3 without modification
  • Instead of replacing requests.<method>() with advocate_wrapper.<method>() in the source, figure out a better way of intercepting uses of the requests module directly by requests' test suite (something like the _WrappedSocket we already use.)

Rip out Python 2.x support

It's been at least a year since anybody's asked me to care about Python 2.x, so it seems reasonable to drop 2.x support for new versions of Advocate. Requests no longer supports Python 2.x on new versions either!

This'd let us drop all of the six stuff along with the otherwise-useless deps we're pulling in to fix HTTPS cert validation under Python 2.7.

got exception

i got exception

TypeError: new() got an unexpected keyword argument 'key_validator'

Strange behavior with RequestsAPIWrapper.head()

Hey Jordan,

I ran into some weird behavior in advocate today. I haven't tried to dig into the cause, but it seems to be mostly just related to the head() function for some reason. We have a RequestsAPIWrapper instance that's defined like this:

advocate = RequestsAPIWrapper(AddrValidator(
    ip_whitelist=g.scraper_ip_whitelist,    
))                                          

Using this for advocate.head(url) behaves strangely when used on our S3 thumbnail buckets, even though it seems to work if I just use advocate.request('HEAD', url) instead. Here's some testing that should show what I mean:

RequestsAPIWrapper version

>>> from r2.lib.media import advocate
>>> advocate.head('https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg')
<Response [400]>
>>> advocate.head('https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg').content
'<?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>BadRequest</Code><Message>Insufficient information. Origin request header needed.</Message><RequestId>1CB3BBB8D0A65360</RequestId><HostId>eno2W4lV4rPy1ZpQbpybvxtXSFrcTmxcjyrdVSf8/ongUZBGJKnZMRtv5eQhVJDto6LxxOHCSD8=</HostId></Error>'

>>> advocate.get('https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg')
<Response [200]>

>>> advocate.request('HEAD', 'https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg')
<Response [200]>

Standard advocate, no RequestsAPIWrapper

>>> import advocate
>>> advocate.head('https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg')
<Response [200]>

Standard requests library

>>> import requests
>>> requests.head('https://b.thumbs.redditmedia.com/6hn7cHXUEXSYh8K5ga-LCeHWBpPADv6cS6-34ty2RuE.jpg')
<Response [200]>

Let me know if you need any more information or anything.

Make netifaces dependency optional

For platforms where using Python modules implemented in C is particularly difficult, it would be useful to be able to use Advocate without those features included.

In particular, it looks like netifaces exists solely to support detection of local addresses; in the vast majority of cases, I would expect all addresses reported by netifaces to be private addresses, and thus not pass the filter anyway. I'm not sure that it's worth a breaking change, but I'm curious why this functionality is enabled by default, given that (I assume) most hosts don't have direct public IP addresses.

Would you be open to a PR that disables this functionality if netifaces is not available?

hostname tests are skipped due to incorrect type of `canonname`

Hi, when I try to run the pytest, I noticed that hostname-related tests are skipped even when the nameserver supports the AI_CANONNAME flag. This was possibly due to the incorrect type of the expected result when testing for the AI_CANONNAME flag: https://github.com/JordanMilne/Advocate/blob/master/test/test_advocate.py#L55

The cannonname returned by socket.getaddrinfo is a string now. ref: https://docs.python.org/3/library/socket.html#socket.getaddrinfo

When running hostname-related tests, HostnameTests.test_idn failed:

======================================= FAILURES =======================================
________________________________ HostnameTests.test_idn ________________________________
self = <test.test_advocate.HostnameTests testMethod=test_idn>

    def test_idn(self):
        # test some basic globs
>       self.assertFalse(self._is_hostname_allowed(
            u"**.example.org",
            fake_lookup=True,
            hostname_blacklist={"*.org"}
        ))

test/test_advocate.py:359:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test/test_advocate.py:350: in _is_hostname_allowed
    if validator.is_addrinfo_allowed(res):
advocate/addrvalidator.py:58: in wrapper
    return func(self, *args, **kwargs)
advocate/addrvalidator.py:319: in is_addrinfo_allowed
    if not self.is_hostname_allowed(canonname):
advocate/addrvalidator.py:259: in is_hostname_allowed
    if self._hostname_matches_pattern(hostname, pattern):
advocate/addrvalidator.py:221: in _hostname_matches_pattern
    hostname = canonicalize_hostname(hostname)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

hostname = b'xn--fiqs8s.example.org'

    def canonicalize_hostname(hostname):
        """Lowercase and punycodify a hostname"""
        # We do the lowercasing after IDNA encoding because we only want to
        # lowercase the *ASCII* chars.
        # TODO: The differences between IDNA2003 and IDNA2008 might be relevant
        # to us, but both specs are damn confusing.
>       return str(hostname.encode("idna").lower(), 'utf-8')
E       AttributeError: 'bytes' object has no attribute 'encode'

advocate/addrvalidator.py:23: AttributeError
=============================== short test summary info ================================
FAILED test/test_advocate.py::HostnameTests::test_idn - AttributeError: 'bytes' object has no attribute 'encode'

Similar to the cannoname bug, this is also introduced by feeding bytes as the cannoname in the _is_hostname_allowed function: https://github.com/JordanMilne/Advocate/blob/master/test/test_advocate.py#L346

I can make a PR later for a quick fix of the issues mentioned above

Unvendor ipaddress module

I don't know what I was thinking. There's no reason for this to be vendored, and the code to vendor it is gnarly.

What should be done about IPv6?

Should it be on by default? I'm kind of iffy on this, but I don't know enough about IPv6 as-deployed-by most people to say if it's dangerous one way or the other.

I get the feeling people may currently be relying on NAT / firewall rules to prevent external requests from getting in. With IPv4, if you use private ranges like you should, it's super easy to detect and block if an IP refers to something on the LAN (192.x.x.x, 10.x.x.x., etc.) Even if there's a service that's bound to 0.0.0.0.

But what about with IPv6? Say I've got an internal service that binds to :: on server X, and server Y running a site that uses Advocate on the same network. If server Y makes a request to server X's globally-routable IPv6 IP, do we have any guarantees as to how the packet will be routed? Might it be routed directly to server server X without hitting the external firewall?

I'm concerned that if that is the case, users will be vulnerable to SSRF unless the servers are assigned ULAs, or if the organization is large enough to have a globally-routable block assigned to them for internal use, which be added to Advocate's blacklist.

TL;DR: We somewhat abuse the side-effects of IPv4 + NAT so we can tell by an address if the destination is likely on the LAN or not. Non-globally-routable (RFC1918 or otherwise) addresses are blocked, globally routable ones are not.

This doesn't appear to be possible with most IPv6 setups as LAN IPs usually == globally routable WAN IPs, frustrating the ability to prevent SSRF via checks in the client.

Is any of that actually a concern? Am I fundamentally misunderstanding IPv6? Any relevant IPv6 documentation is helpful!

Advocate stops working with requests

Since requests 2.14.0 advocate is broken.
requests updated its internal urllib3 version and this can be the cause of the problem.
I get an error which relates to urllib3:

File "D:\******\******\env36\lib\site-packages\requests\sessions.py", line 531, in get
    return self.request('GET', url, **kwargs)
  File "D:\******\******\env36\lib\site-packages\requests\sessions.py", line 518, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\******\******\env36\lib\site-packages\requests\sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File "D:\******\******\env36\lib\site-packages\requests\adapters.py", line 403, in send
    conn = self.get_connection(request.url, proxies)
  File "D:\******\******\env36\lib\site-packages\requests\adapters.py", line 307, in get_connection
    conn = self.poolmanager.connection_from_url(url)
  File "D:\******\******\env36\lib\site-packages\requests\packages\urllib3\poolmanager.py", line 279, in connection_from_url
    pool_kwargs=pool_kwargs)
  File "D:\******\******\env36\lib\site-packages\requests\packages\urllib3\poolmanager.py", line 227, in connection_from_host
    return self.connection_from_context(request_context)
  File "D:\******\******\env36\lib\site-packages\requests\packages\urllib3\poolmanager.py", line 238, in connection_from_context
    pool_key = pool_key_constructor(request_context)
  File "D:\******\******\env36\lib\site-packages\requests\packages\urllib3\poolmanager.py", line 103, in _default_key_normalizer
    return key_class(**context)
TypeError: __new__() got an unexpected keyword argument 'key_validator'

Tests are broken with `setup.py test`

They work fine with nosetests, need to look into this.

======================================================================
ERROR: test_response_iter_lines_reentrant (test_advocate.RequestsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: 'RequestsTestCase' object has no attribute '__name__'

----------------------------------------------------------------------

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.