
Comments (12)

initstring commented on June 25, 2024

I'm glad I could help!
I read a book recently called "How Linux Works" by Brian Ward. If you're interested in learning more about Linux, I highly recommend it. I've been using Linux for a lot of years but still learned a lot from it.
Good luck with your projects!

from lyricpass.

initstring commented on June 25, 2024

Hi @stereonov
Thanks for reporting! Yeah, it was broken. This is the problem with building webscrapers - regexes break whenever the site makes changes.
I think I see what changed. I pushed a new update to master just now. Can you try again?

stereonov commented on June 25, 2024

I replaced lyricpass.py with the updated version. This time it scanned almost 100 songs rather than all 642, but I still get this:
$ python3 ./lyricpass.py -a Taylor+Swift
`[+] Looking up artist Taylor+Swift
[+] Found 642 songs for artists Taylor+Swift
Traceback (most recent call last):
File "/usr/lib/python3.4/urllib/request.py", line 1182, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/usr/lib/python3.4/http/client.py", line 1125, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.4/http/client.py", line 1163, in _send_request
self.endheaders(body)
File "/usr/lib/python3.4/http/client.py", line 1121, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.4/http/client.py", line 951, in _send_output
self.send(msg)
File "/usr/lib/python3.4/http/client.py", line 886, in send
self.connect()
File "/usr/lib/python3.4/http/client.py", line 1260, in connect
super().connect()
File "/usr/lib/python3.4/http/client.py", line 863, in connect
self.timeout, self.source_address)
File "/usr/lib/python3.4/socket.py", line 494, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/lib/python3.4/socket.py", line 533, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./lyricpass.py", line 241, in <module>
main()
File "./lyricpass.py", line 223, in main
raw_words.update(scrape_lyrics(url_list))
File "./lyricpass.py", line 183, in scrape_lyrics
with urllib.request.urlopen(url) as response:
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 463, in open
response = self._open(req, data)
File "/usr/lib/python3.4/urllib/request.py", line 481, in _open
'_open', req)
File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 1225, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib/python3.4/urllib/request.py", line 1184, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>`

initstring commented on June 25, 2024

It looks like a DNS lookup failed (socket.gaierror: [Errno -2] Name or service not known).
There isn't currently anything in the tool to try to work around network errors. Probably using the requests library instead of urllib would help.
I pushed a new branch using requests... can you see if it helps?
https://github.com/initstring/lyricpass/tree/requests-lib
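
For anyone who wants to confirm that the failure really is name resolution rather than the scraper, a minimal sketch along these lines may help. It repeats the same `socket.getaddrinfo` call that appears in the traceback and counts how often it fails. The name `count_dns_failures` and its parameters are made up for illustration and are not part of lyricpass; the `resolver` argument exists only so the function can be exercised without a network.

```python
import socket


def count_dns_failures(hostname, attempts=20, port=443, resolver=socket.getaddrinfo):
    """Resolve hostname repeatedly and return how many lookups failed."""
    failures = 0
    for _ in range(attempts):
        try:
            # Same call the traceback shows failing inside socket.py.
            resolver(hostname, port, 0, socket.SOCK_STREAM)
        except socket.gaierror:
            # "Name or service not known" surfaces as socket.gaierror.
            failures += 1
    return failures


# Example (needs network access):
# print(count_dns_failures("www.lyrics.com"))
```

A non-zero count would suggest the resolver is dropping lookups intermittently. A count of zero does not rule that out, since the operating system may cache the answer between calls.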

initstring commented on June 25, 2024

You might need to run `pip3 install requests` first.

stereonov commented on June 25, 2024

Here is what I did:
sudo apt-get install python3-pip

pip3 install -r requirements.txt
Requirement already satisfied (use --upgrade to upgrade): requests in /usr/lib/python3/dist-packages (from -r requirements.txt (line 1))
Cleaning up...

`python3 ./lyricpass.py -a Taylor+Swift
[+] Looking up artist Taylor+Swift
[+] Found 642 songs for artists Taylor+Swift
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 562, in urlopen
body=body, headers=headers)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.4/http/client.py", line 1125, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.4/http/client.py", line 1163, in _send_request
self.endheaders(body)
File "/usr/lib/python3.4/http/client.py", line 1121, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.4/http/client.py", line 951, in _send_output
self.send(msg)
File "/usr/lib/python3.4/http/client.py", line 886, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 111, in connect
timeout=self.timeout)
File "/usr/lib/python3.4/socket.py", line 494, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/lib/python3.4/socket.py", line 533, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 330, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 612, in urlopen
raise MaxRetryError(self, url, e)
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.lyrics.com', port=443): Max retries exceeded with url: /db-print.php?id=28163737 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./lyricpass.py", line 245, in <module>
main()
File "./lyricpass.py", line 227, in main
raw_words.update(scrape_lyrics(url_list))
File "./lyricpass.py", line 187, in scrape_lyrics
response = requests.get(url)
File "/usr/lib/python3/dist-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 467, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 570, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 378, in send
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.lyrics.com', port=443): Max retries exceeded with url: /db-print.php?id=28163737 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)`

Did I do something wrong?
Thanks again for helping out!

stereonov commented on June 25, 2024

Update: Your last update worked and successfully generated the raw and wordlist files for an artist with 150 songs. When an artist has more songs, like Taylor Swift (642), I get the error shown in my previous comment.

initstring commented on June 25, 2024

Sorry you're still having issues. I can confirm the tool works for artists with more songs. Your error is related to a failed DNS lookup, so it is due to something other than the Python script.
You can try a few things:

  • Create a hosts file entry for www.lyrics.com to skip DNS lookups
  • Wrap the "requests.get" lines in try/except, but this may mean you will miss out on lyrics that fail
  • Try a different DNS server
  • Try connecting to a different network
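
The try/except suggestion could look roughly like the sketch below. It is not the code in lyricpass: `fetch_all`, `fetch_with_urllib`, and their parameters are hypothetical names, and the retry count and delay are arbitrary. It uses the standard library so it runs without extra packages, but the same loop should work with `requests.get` passed in as `fetch`, since the exceptions raised by `urllib`, `socket`, and `requests` all derive from `OSError`.

```python
import time
import urllib.request


def fetch_with_urllib(url):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch=fetch_with_urllib, attempts=3, delay=1.0):
    """Fetch every URL, retrying failures and skipping ones that never succeed.

    Returns (pages, failed): a dict of url -> body and a list of skipped URLs.
    """
    pages = {}
    failed = []
    for url in urls:
        for attempt in range(1, attempts + 1):
            try:
                pages[url] = fetch(url)
                break
            except OSError:
                # Covers urllib.error.URLError, socket.gaierror and
                # requests.exceptions.RequestException.
                if attempt == attempts:
                    failed.append(url)  # give up on this song, keep going
                else:
                    time.sleep(delay)  # brief pause before retrying
    return pages, failed
```

Returning the list of skipped URLs makes the trade-off visible: the run finishes, and the songs that were missed can be retried later.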

stereonov commented on June 25, 2024

Thank you so much for your help.
I really don't know how to do any of your suggestions, except the last one, but I use it from home and have only one network. I'm a beginner on Ubuntu.
About the first suggestion, do you mean something like this? https://rimuhosting.com/knowledgebase/linux/misc/bypassing-dns-servers-using-etc-hosts

initstring commented on June 25, 2024

If you are using Ubuntu, you can try this:

  • Edit the '/etc/hosts' file (you will need to do this as root)
  • Add this line in it anywhere (on its own line):

52.203.75.1 www.lyrics.com

And try again. Don't forget to delete that line out of hosts when you are all done, as the address for the website may change later.

Hope this helps!
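
Because the address may change, it can be safer to look it up at the moment of editing instead of copying an IP from an old comment. The sketch below builds the hosts line from whatever the name resolves to right now and checks whether a hosts file already contains an entry. `hosts_line` and `has_entry` are hypothetical helper names, and the lookup only works at a moment when DNS is answering.

```python
import socket


def hosts_line(hostname, resolve=socket.gethostbyname):
    """Build an /etc/hosts line from the address the name resolves to right now."""
    return "{} {}".format(resolve(hostname), hostname)


def has_entry(hosts_text, hostname):
    """Return True if hosts_text already maps hostname on a non-comment line."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return True
    return False


# Example (needs network access and a readable /etc/hosts):
# print(hosts_line("www.lyrics.com"))
# print(has_entry(open("/etc/hosts").read(), "www.lyrics.com"))
```

The printed line still has to be added to '/etc/hosts' by hand as root, and `has_entry` is a quick way to check afterwards that the temporary line has been removed again.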

stereonov commented on June 25, 2024

Ok I will, thanks in advance!!!

stereonov commented on June 25, 2024

Yep, it worked! Thank you so much again, you are amazing!
