Comments (7)
Thanks for your input! I'm really glad this project has been useful!
I wasn't able to duplicate the problem above on my computer.
As you might expect, this is very troubling. About a month ago, there was a pull request targeting a different issue that I also couldn't reproduce. It's like other people are getting different HTML pages from Billboard's servers than me, which shouldn't be happening.
Do me a favor, please—run this script and tell me what you get.
import json
import requests

url = 'http://www.billboard.com/charts/hot-100'
headers_current = {'User-Agent': 'billboard.py (https://github.com/guoguo12/billboard-charts)'}
req = requests.get(url, headers=headers_current)
# Response headers sent back by Billboard's servers
print(json.dumps(dict(req.headers), sort_keys=True, indent=4, separators=(',', ': ')))
# Request headers that were actually sent
print(json.dumps(dict(req.request.headers), sort_keys=True, indent=4, separators=(',', ': ')))
This script sends an HTTP GET request to Billboard's servers and prints the response and request headers to stdout in JSON format. What I got was this.
Not sure if this will help, but it's worth a shot. Let me know if you have any other ideas as to why this might be happening.
from billboard-charts.
Here's my output: http://pastebin.com/4Gy49tbi
The only notable differences I see are the server ("server": "ECS (cpm/F9B6)") and cache hits ("x-cache-hits": "HIT (5)"), but other than that, it's pretty much the same. I'm not very familiar with how HTTP requests work at the moment, so I'm not too certain what might be happening. I tried fooling with user agents on this site, but both my browser and the Windows FF/Chrome browsers came back with pretty much the same info.
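For comparing two header dumps like the ones above, a small helper can list only the fields that differ instead of eyeballing the JSON. This is just a sketch: `diff_headers` is a made-up name, and the "mine" values below are placeholders, not real output from either machine.

```python
def diff_headers(mine, theirs):
    """Return {header: (my_value, their_value)} for headers that differ."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    a = {k.lower(): v for k, v in mine.items()}
    b = {k.lower(): v for k, v in theirs.items()}
    return {
        name: (a.get(name), b.get(name))  # None means the header is absent
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Placeholder values for "mine"; the "theirs" values are the two quoted above.
mine = {"Server": "ECS (placeholder)", "Content-Type": "text/html"}
theirs = {
    "server": "ECS (cpm/F9B6)",
    "content-type": "text/html",
    "x-cache-hits": "HIT (5)",
}

for name, (left, right) in diff_headers(mine, theirs).items():
    print("{}: {!r} vs {!r}".format(name, left, right))
```

Each dict would come from `dict(req.headers)` in the script above, one per machine.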
I also ran the script again this morning and I'm still getting all-null albums.
from billboard-charts.
Hmm. Well, I'm not sure where to go from here. I'm not familiar with the intricacies of HTTP either, but I'm guessing content might be varied based on the client IP address.
I can think of two possible options. We can put something like this in:
    if chartInfoSoup.contents[3].string:
        album = chartInfoSoup.contents[3].string.strip()
    elif chartInfoSoup.contents[4].string:
        album = chartInfoSoup.contents[4].string.strip()
    else:
        album = None
    # This might not work for songs without album names on my end.
Alternatively, we can rewrite the code to ignore the line breaks, maybe using regex. Let me know what you think is best.
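A possible middle ground between the two options, sketched here as an assumption rather than what the library does: check each candidate position, but strip the text before testing it, so a node that only holds a line break doesn't count as an album name. `first_nonblank_string` and `FakeNode` are hypothetical names; `FakeNode` just stands in for the Beautiful Soup nodes in `chartInfoSoup.contents`, which expose the same `.string` attribute.

```python
def first_nonblank_string(nodes, indices=(3, 4)):
    """Return the first stripped, non-empty .string among nodes[i], else None."""
    for i in indices:
        if i >= len(nodes):
            continue  # page layout has fewer nodes than expected
        text = getattr(nodes[i], "string", None)
        # A bare "\n" is truthy, so strip before deciding whether it's empty.
        if text and text.strip():
            return text.strip()
    return None


class FakeNode(object):
    """Hypothetical stand-in for a parsed node; only .string is used."""

    def __init__(self, string):
        self.string = string


# Layout where position 3 holds only a line break and the album sits at 4.
nodes = [FakeNode(None)] * 3 + [FakeNode("\n"), FakeNode("  Some Album \n")]
print(first_nonblank_string(nodes))
```

With real data the call would be something like `album = first_nonblank_string(chartInfoSoup.contents)`. The default positions 3 and 4 are the ones from the snippet above.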
from billboard-charts.
I was leaning more towards the first option to keep things simple for now. Also, a lot of people seem to think that parsing HTML with regex screams bloody murder, so maybe we'll hold back on it for now since the Billboard HTML code is pretty big.
It's been about 24 hours or so and I haven't run into any problems, so I'll put up a PR. If anything comes up we can reopen this.
from billboard-charts.
Merged. Thank you for your help!
I've given you full access to the repository. If there are any fixes or improvements you want to make in the future, feel free to do so.
from billboard-charts.
Oh wow, I wasn't expecting that, thanks again!
To be honest, I think you've gotten the main stuff nailed down at the moment. The only other feature I was thinking about implementing with the data we can get is determining whether an entry rose or fell in the ranks from the previous week, or whether it's a new entry or re-entry, which should be pretty easy to do since we already have the necessary info to determine it.
from billboard-charts.
Actually, that's already sort of included. Each song has attributes lastPos and peakPos for last position and peak position on the chart. There's also a weeks attribute for number of weeks on chart.
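As a rough sketch of how the rose/fell/new/re-entry idea could be built on those attributes: the function below takes plain integers so it doesn't depend on the library. It assumes lastPos is 0 when the song wasn't on the previous week's chart, and that the current position is available as an integer; neither is confirmed in this thread, and the name `movement` is made up.

```python
def movement(rank, last_pos, weeks):
    """Classify a chart entry's week-over-week movement.

    Assumes last_pos == 0 means "not on last week's chart" (unconfirmed).
    """
    if last_pos == 0:
        # Absent last week: first week on chart means new, otherwise re-entry.
        return "new" if weeks <= 1 else "re-entry"
    if rank < last_pos:
        return "rose"  # a smaller number is a better position
    if rank > last_pos:
        return "fell"
    return "unchanged"


print(movement(rank=5, last_pos=9, weeks=12))
print(movement(rank=40, last_pos=0, weeks=1))
print(movement(rank=88, last_pos=0, weeks=7))
```

For an entry object, the call would be along the lines of `movement(position, song.lastPos, song.weeks)`, where `position` is however the current rank is exposed.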
from billboard-charts.