
Comments (4)

wRAR commented on June 1, 2024

Removing the to_unicode() call is obviously incorrect.

from scrapy.

marinelay commented on June 1, 2024

Then how about exception-handling code like this?

def get_header(self, name, default=None):
    try:
        return to_unicode(self.request.headers.get(name, default), errors="replace")
    except TypeError:
        return default
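
For comparison, here is a minimal sketch of the same idea with an explicit None check instead of a try/except, so that an unrelated TypeError would not be silently swallowed. It is not Scrapy's actual code: `to_unicode` below is a simplified stand-in for the real helper, and `FakeWrappedRequest` with its dict-based `headers` is a hypothetical substitute for `WrappedRequest`, used only so the snippet runs on its own.

```python
def to_unicode(text, encoding="utf-8", errors="strict"):
    """Simplified stand-in for Scrapy's to_unicode helper (assumption)."""
    if isinstance(text, str):
        return text
    if not isinstance(text, bytes):
        raise TypeError(
            "to_unicode must receive a bytes or str object, "
            f"got {type(text).__name__}"
        )
    return text.decode(encoding, errors)


class FakeWrappedRequest:
    """Hypothetical stand-in for WrappedRequest; headers is a plain dict."""

    def __init__(self, headers):
        self.headers = headers

    def get_header(self, name, default=None):
        value = self.headers.get(name, default)
        # Only skip decoding when there is nothing to decode.
        if value is None:
            return None
        return to_unicode(value, errors="replace")


req = FakeWrappedRequest({"Cookie": b"a=1"})
print(req.get_header("Cookie"))  # header present: decoded to str
print(req.get_header("xxxx"))    # header missing: None, no TypeError
```

Whether the explicit check or the try/except reads better is a judgment call; both leave the to_unicode() call in place for headers that do exist.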


siddharth07-ui commented on June 1, 2024

Hi @marinelay and @wRAR, while looking through this good first issue, I found the line in Scrapy that seems to be the root cause of this error when a Request is wrapped in WrappedRequest, as the snippet below shows --

=====================================================================
from urllib.request import Request as _Request
from scrapy.http.request import Request
from scrapy.http.cookies import WrappedRequest

a = _Request(url="https://a.example")
print(a.get_header('xxxx'))

b = WrappedRequest(Request(url="https://a.example"))
print("WrappedRequest get-header result:", b.get_header('xxxx'))  # <-- this one raises

=====================================================================

None
TypeError: to_unicode must receive a bytes or str object, got NoneType -- Result

=====================================================================

Whereas the expected behavior is to return None for both requests.

This is line 173 in "http/cookies.py" --

return to_unicode(self.request.headers.get(name, default), errors="replace")

This happens because "to_unicode" is used, whose docstring says --
"""Return the unicode representation of a bytes object text. If
text is already an unicode object, return it as-is."""

It checks the value returned by 'self.request.headers.get(name, default)', which must be bytes or str to be valid. Here the header is missing and the default is None, so the value is None, hence the error - "TypeError: to_unicode must receive a bytes or str object, got NoneType".

Hence a viable solution could be "return self.request.headers.get(name, default)", which returns None in this case.

Do let me know whether this is the correct solution, or whether it might interfere with other functionality.
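
One thing to check before dropping the decoding step is the return type when the header does exist. The sketch below assumes header values are stored as bytes and uses a plain dict (`raw_headers`, a hypothetical stand-in, not Scrapy's Headers class) to show the difference between returning the stored value directly and decoding it first.

```python
# Hypothetical stand-in for a headers mapping whose values are bytes.
raw_headers = {"Cookie": b"session=abc"}


def get_header_raw(name, default=None):
    """Return the stored value untouched (no decoding)."""
    return raw_headers.get(name, default)


def get_header_decoded(name, default=None):
    """Decode bytes values to str; pass through None and str defaults."""
    value = raw_headers.get(name, default)
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value


print(type(get_header_raw("Cookie")).__name__)      # bytes
print(type(get_header_decoded("Cookie")).__name__)  # str
print(get_header_raw("xxxx"), get_header_decoded("xxxx"))  # None None
```

Both variants return None for a missing header, but only the decoding one returns str for a present header, which is what urllib's own Request.get_header returns.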

siddharth07-ui commented on June 1, 2024

Hi @marinelay, I guess that gives the required output. Since it usually throws a 'TypeError', handling this part using an exception block is neat 👍.
