Different behavior of `get_header` between urllib.Request and WrappedRequest (scrapy, closed)

marinelay commented on June 1, 2024

Different behavior of `get_header` between urllib.Request and WrappedRequest


Comments (4)

wRAR commented on June 1, 2024

Removing the to_unicode() call is obviously incorrect.


marinelay commented on June 1, 2024

Then how about handling the exception, like this?

def get_header(self, name, default=None):
    try:
        return to_unicode(self.request.headers.get(name, default), errors="replace")
    except TypeError:
        return default
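An alternative to catching the exception is to check for a missing value before decoding. The following is only a sketch under stated assumptions: `FakeWrappedRequest` and the local `to_unicode` are hypothetical stand-ins written for illustration (a plain dict of bytes replaces the real headers object), not scrapy's actual code.

```python
def to_unicode(text, encoding="utf-8", errors="strict"):
    """Stand-in mimicking the behavior described in this thread (not scrapy's code)."""
    if isinstance(text, str):
        return text
    if not isinstance(text, bytes):
        raise TypeError(
            "to_unicode must receive a bytes or str object, "
            f"got {type(text).__name__}"
        )
    return text.decode(encoding, errors)


class FakeWrappedRequest:
    """Hypothetical stand-in for WrappedRequest; headers is a dict of bytes values."""

    def __init__(self, headers):
        self.headers = headers

    def get_header(self, name, default=None):
        value = self.headers.get(name, default)
        # Only the "nothing to decode" case is special-cased; any other
        # unexpected type still raises TypeError instead of being hidden.
        if value is None:
            return None
        return to_unicode(value, errors="replace")


req = FakeWrappedRequest({"Content-Type": b"text/html"})
print(req.get_header("Content-Type"))
print(req.get_header("xxxx"))
```

Compared with the try/except version, this handles only the missing-header case and lets a TypeError raised for any other reason surface, which may or may not be what the maintainers prefer.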


siddharth07-ui commented on June 1, 2024

Hi @marinelay and @wRAR, while looking through this good first issue, I found the line in scrapy that seems to be the root cause of this error when a Request is wrapped in WrappedRequest. The following snippet reproduces it:

=====================================================================
from urllib.request import Request as _Request
from scrapy.http.request import Request
from scrapy.http.cookies import WrappedRequest

a = _Request(url="https://a.example")
print(a.get_header('xxxx'))

b = WrappedRequest(Request(url="https://a.example"))
print("WrappedRequest get-header result:", b.get_header('xxxx'))  # this call raises TypeError

=====================================================================

None   (from urllib's Request)
TypeError: to_unicode must receive a bytes or str object, got NoneType   (from WrappedRequest)

=====================================================================

The expected behavior is that both calls return None.
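For reference, the urllib side of the comparison can be checked with the standard library alone. A small sketch (the `Accept` header and the "n/a" fallback are illustrative values, not taken from the report):

```python
from urllib.request import Request

req = Request(url="https://a.example", headers={"Accept": "text/html"})

present = req.get_header("Accept")       # header that was set
missing = req.get_header("xxxx")         # header that was never set
fallback = req.get_header("xxxx", "n/a") # explicit default for a missing header

print(present, missing, fallback)
```

A missing header yields the default (None unless one is passed) rather than an exception, which is the behavior the report expects WrappedRequest to match.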

This is line 173 in "http/cookies.py":

return to_unicode(self.request.headers.get(name, default), errors="replace")

The value is passed through "to_unicode", whose docstring says:
"""Return the unicode representation of a bytes object text. If
text is already an unicode object, return it as-is."""

to_unicode checks the value returned by 'self.request.headers.get(name, default)', which must be bytes or str to be valid. When the header is missing and the default is None, the value is None, hence the error "TypeError: to_unicode must receive a bytes or str object, got NoneType".

One possible solution is "return self.request.headers.get(name, default)", which returns None for a missing header.

Do let me know whether this is the correct solution, or whether it might interfere with other functionality.
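One thing to weigh before dropping the decode step: as far as I understand, scrapy keeps header values as bytes, so returning them untouched would change the return type for headers that are present. A minimal sketch of that effect, using a plain dict as a hypothetical stand-in for the real headers object:

```python
# Hypothetical stand-in for the request headers: values stored as bytes.
headers = {"Content-Type": b"text/html"}


def get_header_raw(name, default=None):
    # The proposed change: return the stored value without decoding it.
    return headers.get(name, default)


raw = get_header_raw("Content-Type")
decoded = raw.decode("utf-8", errors="replace")

# Callers comparing against str would see different results for the two forms.
print(type(raw).__name__, type(decoded).__name__)
print(raw == "text/html", decoded == "text/html")
```

Under that assumption, callers that expect str (as urllib's `get_header` returns) would receive bytes, so the missing-header case probably needs to be handled without removing the conversion.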


siddharth07-ui commented on June 1, 2024

Hi @marinelay, I think that gives the required output. Since the call raises a 'TypeError' for a missing header, handling it with an exception block is neat 👍.

