Comments (29)

jessb321 commented on July 4, 2024

Hmmm. I haven't set any blocking on my end. Could be a Cloudflare issue. Will check the settings.

from trade-dangerous.

eyeonus commented on July 4, 2024

All I can tell you is that the code that's getting the 403 is a standard python urllib.request.urlopen call:
https://docs.python.org/3/library/urllib.request.html

try:
    response = request.urlopen(url)
except Exception as e:
    tdenv.WARN("Problem with download:\nURL: {}\nError: {}", url, str(e))
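
For diagnosing a failure like this 403, it can help to catch the HTTP error separately so the status code is visible. A minimal sketch, assuming a hypothetical helper `try_download` (not part of Trade Dangerous) with an injectable `opener` so it can be exercised without touching the network:

```python
from urllib import error, request


def try_download(url, opener=request.urlopen):
    """Return (response, None) on success or (None, message) on failure.

    `try_download` and its `opener` parameter are illustrative names only.
    """
    try:
        return opener(url), None
    except error.HTTPError as e:
        # The server answered, but with an error status such as 403.
        return None, "HTTP {}: {}".format(e.code, e.reason)
    except error.URLError as e:
        # No usable answer at all (DNS failure, refused connection, ...).
        return None, "Connection problem: {}".format(e.reason)
```

HTTPError is a subclass of URLError, so it has to be caught first for the status code to be reported.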

jessb321 commented on July 4, 2024

Looks like a user agent issue. The following code works on my end:

from urllib.request import urlopen, Request

url = 'https://beta.coriolis.io/data/index.json'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3'}
response = urlopen(Request(url=url, headers=headers))

eyeonus commented on July 4, 2024

So your server is refusing because there's no User-Agent header in the request. Interesting. I've got a fix for it already, but is it possible to turn it off?

jessb321 commented on July 4, 2024

Looks like it is Cloudflare blocking the requests. I'll have a poke around to see if I can turn that off, but I'm not sure it's possible.

eyeonus commented on July 4, 2024

Okay. I'll upload the fix anyway. Who knows, maybe the other sources will one day also require that header.

eyeonus commented on July 4, 2024

Apparently the default User-Agent string is "Python-urllib/x.x" where x.x is the Python version.

Might be CF is specifically blocking Python, since passing "Trade-Dangerous" works fine.
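
The default can be inspected without making a request. A small sketch (the `make_request` helper is a hypothetical name; only the "Trade-Dangerous" string comes from this thread):

```python
from urllib.request import Request, build_opener


def make_request(url, user_agent="Trade-Dangerous"):
    """Build a Request that sends an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


# A freshly built opener carries urllib's default User-agent header.
default_agent = dict(build_opener().addheaders)["User-agent"]
print(default_agent)  # "Python-urllib/" followed by the Python version

req = make_request("https://beta.coriolis.io/data/index.json")
# Request stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```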

Tromador commented on July 4, 2024

Not convinced this is (completely) fixed. Now it doesn't mention the download at all and proceeds directly to using the Default Ship Index.

NOTE: Checking for update to 'index.json'.
NOTE: Using Default Ship Index.

This happens if there is no index.json, or an earlier dated file.

eyeonus commented on July 4, 2024

I'm sorry, I can't reproduce this:

Trade-Dangerous\test>trade import -P eddblink -O ship
NOTE: Checking for update to 'index.json'.
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://beta.coriolis.io/data/index.json
NOTE: Downloaded   0.8MB of gziped data   1.8MB/s
NOTE: Processing Ships: Start time = 2019-05-23 03:43:26.978057
NOTE: Finished processing Ships. End time = 2019-05-23 03:43:26.998074
NOTE: G:\Elite Dangerous Programs\Trade-Dangerous\test\data\Ship.csv exported.
NOTE: Import completed.

Double-check your TD is at v10.4.6?

eyeonus commented on July 4, 2024

I should mention it doesn't actually check for a new index.json.
It checks every other file's modification date, and only downloads if newer, but because I couldn't check the old site's mod time (it didn't have one), I had to build an exception for the checking, and I haven't removed that exception.
So for this file, and only this file, it just checks if the web copy exists, downloads it if there was no error on that check, or uses the template copy if there was.
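
As a rough illustration of the "only download if newer" check described above, here is a sketch that compares a Last-Modified header with a local timestamp. This is an assumption about how such a check can be written, not the eddblink code; `remote_is_newer` is a hypothetical name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def remote_is_newer(last_modified_header, local_mtime):
    """Decide whether a download is needed.

    last_modified_header: value of the HTTP Last-Modified header, or None.
    local_mtime: POSIX timestamp of the local copy, or None if it is missing.
    """
    if local_mtime is None:
        return True  # nothing local, so fetch
    if not last_modified_header:
        return True  # server gives no date, so we cannot compare
    remote = parsedate_to_datetime(last_modified_header)
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote > local
```

A server that sends no modification time falls into the second branch, which is the situation the exception for index.json was built for.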

Tromador commented on July 4, 2024

Roger on TD version. I'll try running TD directly and see what it does.

Installing collected packages: tradedangerous
  Found existing installation: tradedangerous 10.4.5
    Uninstalling tradedangerous-10.4.5:
      Successfully uninstalled tradedangerous-10.4.5
Successfully installed tradedangerous-10.4.6

Tromador commented on July 4, 2024

Can you reproduce it when TD is called from the listener? Running TD manually, I get the same result as you.

[elite@quoth tradedangerous]$ trade import -P eddblink -O ship
NOTE: Checking for update to 'index.json'.
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://beta.coriolis.io/data/index.json
NOTE: Downloaded   0.8MB of gziped data   1.9MB/s
NOTE: Processing Ships: Start time = 2019-05-23 10:54:08.234504
NOTE: Finished processing Ships. End time = 2019-05-23 10:54:08.281628
NOTE: /home/elite/tradedangerous/tddata/Ship.csv exported.
NOTE: Import completed.

eyeonus commented on July 4, 2024

Indeed I cannot:

>eddblink_listener.py
Starting listener.
Running EDDBlink to perform any needed infrastructure updates.
NOTE: Import completed.
Press CTRL-C at any time to quit gracefully.
EDDB update available, waiting for busy signal acknowledgement before proceeding.
Message processor acknowledging busy signal.
Busy signal acknowledged, performing EDDB dump update.
NOTE: Checking for update to 'modules.json'.
NOTE: Downloading file 'modules.json'.
NOTE: Requesting http://elite.tromador.com/files/modules.json
NOTE: Downloaded 359.8KB of gziped data   1.8MB/s
NOTE: Processing Upgrades: Start time = 2019-05-23 11:14:57.751580
NOTE: Finished processing Upgrades. End time = 2019-05-23 11:14:57.798629
NOTE: Checking for update to 'index.json'.
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://beta.coriolis.io/data/index.json
NOTE: Downloaded   0.8MB of gziped data   1.9MB/s
NOTE: Processing Ships: Start time = 2019-05-23 11:14:59.451013
NOTE: Finished processing Ships. End time = 2019-05-23 11:14:59.522073
NOTE: Checking for update to 'systems_populated.jsonl'.
NOTE: Downloading file 'systems_populated.jsonl'.
NOTE: Requesting http://elite.tromador.com/files/systems_populated.jsonl

eyeonus commented on July 4, 2024

> Roger on TD version. I'll try running TD directly and see what it does.
>
> Installing collected packages: tradedangerous
>   Found existing installation: tradedangerous 10.4.5
>     Uninstalling tradedangerous-10.4.5:
>       Successfully uninstalled tradedangerous-10.4.5
> Successfully installed tradedangerous-10.4.6

You weren't on 10.4.6, but you are now.

Tromador commented on July 4, 2024

I was on the correct version. I had to scroll up a fair distance to get that quote - though in fairness, I probably should have said as much.

I will leave it for now and test it again later. Maybe it was having an unexplainable moment of idiocy and will act normally next time I look, otherwise I will shove some debug statements into eddblink and see if I can get some more useful info.

So leave it with me for now and I will report back later.

eyeonus commented on July 4, 2024

Sounds like a plan.

On that note, the good news is that for the server, it doesn't matter. The only thing TD clients get from the listener directly is the live listings. They get the ship index.json from beta.coriolis.io/data/index.json as well, and the other things they get from the server are literally the same thing they'd get from EDDB.

Tromador commented on July 4, 2024

It does matter, though not urgently. If FDev were to add new ships, then TD would barf until we had the default templates fixed. Though as I'm typing and come to think about it, didn't you work around that already? In which case, you're right and it doesn't matter.

In any event, the server is still performing this minor misbehaviour and it's worth getting to the bottom of - if there's something funky in my environment vs yours, it's possible (however unlikely) that it will bite us in something more important later.

Server is running python 3.7.2 - I don't imagine you are wildly different, but doesn't hurt to check.

Tromador commented on July 4, 2024

This may get rambling, I'm typing as I go :D

Okay I added a simple debug message ( tdenv.NOTE() ) immediately after each if statement referring to ship url. The line numbers referred to relate to the original eddblink, excluding my additions.

What happens, when calling TD directly (correct behaviour).

[elite@quoth plugins]$ trade import -P eddblink -O ship
NOTE: Checking for update to 'index.json'.
NOTE: SHIPS_URL set to 'https://beta.coriolis.io/data/index.json'.
NOTE: urlTail matched @ 163
NOTE: url set to 'https://beta.coriolis.io/data/index.json'.
NOTE: url matched @ 167
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://beta.coriolis.io/data/index.json
NOTE: Downloaded   0.8MB of gziped data   1.6MB/s
NOTE: Processing Ships: Start time = 2019-05-26 08:43:23.001934
NOTE: Finished processing Ships. End time = 2019-05-26 08:43:23.044372
NOTE: /home/elite/tradedangerous/tddata/Ship.csv exported.
NOTE: Import completed.

What happened from listener.

NOTE: Checking for update to 'index.json'.
NOTE: SHIPS_URL set to 'https://beta.coriolis.io/data/index.json'.
NOTE: urlTail matched @ 163
NOTE: url set to 'https://beta.coriolis.io/data/index.json'.
NOTE: url matched @ 167
NOTE: urlTail matched @ 181

So we know all the if/then logic is firing correctly, as the debug tells us so. Thus, if I may borrow from Mr Sherlock Holmes: "When you have eliminated the impossible, whatever remains, however improbable, must be the truth."

So the response = URLopen(url) @ line 169 must be failing.

So next, as I have no clue how to make TD dump warnings since it got modularised and called from a script, I'll change the tdenv.WARN @ 173 to a tdenv.NOTE. (Note to self: ask eyeonus how to turn on warnings/debug messages. Tried some stuff with -Wall and got a bunch of extra stuff, but not what I wanted.)

Line 173's output never appears (under any circumstances). Which is a shame, as it would be really helpful to see an error.

Trying

        def URLopen(url):
            user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
            return request.urlopen(request.Request(url, headers = {'User-Agent': user_agent}))

No change in behaviour, didn't really expect it. With this construct, running trade directly still downloads index.json as expected. So I revert that change.

I add debug to see if anything different is being passed to URLopen()
NOTE: URLopen got passed 'https://beta.coriolis.io/data/index.json'
So that is correct.

In summary narrowed it down to this:
When called from listener, eddblink fails at line 169 to download from Will's server for unknown reason, in contrast to being called directly from trade which works correctly. I've taken it as far as I can and need help to determine what exactly is happening during that web transaction. Either logs from @willyb321 or better (than I can do) diagnostic code from @eyeonus

eyeonus commented on July 4, 2024

debug is triggered with -w, there are three debug levels. -www is the maximum level.

AFAIK, warn is an always on thing, or at least a thing that has to be turned off, not on.

I think I have an idea as to what's going on. The server has the 'fallback' option on. Duh.

So it always goes to the template file. Oog. I'm not exactly sure how I'd fix this, but I'll think of something.

eyeonus commented on July 4, 2024

I think I got it, but I'm hitting the sack now, so, I haven't tried it.

If you like, change the code to:

@167
        if url == SHIPS_URL or not self.getOption('fallback'):
            try:
                response = openURL(url)
                urlTail == ""
            except Exception as e:

That should fix it without introducing any other unexpected behaviour.

Tromador commented on July 4, 2024

First off I made sure my changes were reverted and starting from clean code -
pip install --upgrade --force-reinstall tradedangerous

Then I added the extra line you show above and reran the listener. No change, it still backs off to the default ship index.

eyeonus commented on July 4, 2024

Oh. I'm an idiot.

Should be urlTail = ""
One =, not two.
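
For anyone skimming later, a tiny stand-alone illustration of why the two-equals version changed nothing (the variable names here are stand-ins, not the eddblink ones):

```python
url_tail = "index.json"

# Comparison: evaluates to False and the result is thrown away.
url_tail == ""
after_compare = url_tail

# Assignment: this is what actually resets the value.
url_tail = ""
after_assign = url_tail

print(repr(after_compare), repr(after_assign))  # 'index.json' ''
```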

Tromador commented on July 4, 2024

Don't feel bad, I mentally parsed it as setting urlTail to "" and then blindly copied the == anyway.

OK - Firstly, index.json has to exist, otherwise it bitches about the lack of it. Workaround is I 'touch /blah/index.json' which cures that (though ultimately needs a programmatic fix).

Secondly... ermm... do what now?

NOTE: Checking for update to 'index.json'.
WARNING: Problem with download (fallback enabled):
URL: https://eddb.io/archive/v6/
Error: HTTP Error 403: Forbidden

Followed by a succession of JSON errors, clearly caused by the lack of a valid index.json.

That error must be generated by line 190. So, it's still skipped/ignored/screwed up the download at 169, though happily reset urlTail in the inserted next line. Then gone into the code segment starting @176 (server is on fallback), created a new url "https://eddb.io/archive/v6/" (FALLBACK_URL + urlTail) @ 186 and no surprise, couldn't download it @ 188, leading to the error.

Maybe if you add
or urlTail == ""
into line 178, it doesn't fall through that codeblock and barf at 190, but that doesn't fix the problem.

So, I wanted to try and rule Coriolis out in case it was another user-agent thing. I copied index.json to my server.

trade import -P eddblink -O ship
NOTE: Checking for update to 'index.json'.
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://quoth.tromador.com/index.json
NOTE: Downloaded   0.8MB of uncompressed data  49.0MB/s

But with listener

NOTE: Checking for update to 'index.json'.
NOTE: Using Default Ship Index.

At which point I realised that I, Tromador, had the web logs and could tell what's happening @169. Why I didn't do this days ago I don't know, I'm supposed to be the webmaster and for me, this is a bit like putting two = where I only need one :)

Calling trade directly -
10.168.159.17 - - [28/May/2019:10:39:46 +0100] "GET /index.json HTTP/1.1" 200 869197 "-" "Trade-Dangerous"

Calling from listener -
10.168.159.17 - - [28/May/2019:10:39:46 +0100] "GET /index.json HTTP/1.1" 200 869197 "-" "python-requests/2.22.0"
10.168.159.17 - - [28/May/2019:10:57:56 +0100] "GET /index.json HTTP/1.1" 200 869197 "-" "Trade-Dangerous"
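
Pulling the status code and User-Agent out of lines like these can be scripted. A minimal sketch, assuming the log lines follow the combined access-log layout shown above; `LOG_LINE` and `parse_log_line` are hypothetical names:

```python
import re

# Pattern for the "combined" access-log layout seen in the lines above.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


sample = ('10.168.159.17 - - [28/May/2019:10:39:46 +0100] '
          '"GET /index.json HTTP/1.1" 200 869197 "-" "python-requests/2.22.0"')
fields = parse_log_line(sample)
print(fields["status"], fields["agent"])  # 200 python-requests/2.22.0
```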

"Curiouser and curiouser!" cried Alice. On the one hand, it's not set the correct user-agent (except on my second test it did, so I'll ignore that for now as my server doesn't actually care). On the other hand it appears to have successfully downloaded the file (hence no exception triggered, I guess), but carried on into fallback territory regardless.

Which, of course it does, because server is set fallback and line 176 doesn't much care about what came before and sends us down that rabbit hole regardless. We then bomb out with "return False" at line 190 and grab the default.

So what we need is to detect the successful download @ 169 and then skip all the fallback.

So I did this to detect and act upon the http response code from the download:

        tdenv.NOTE("Checking for update to '{}'.", path)
        # Initialised here so the fallback check below can always read it.
        responseCode = 0
        if urlTail == SHIPS_URL:
            url = SHIPS_URL
        else:
            url = BASE_URL + urlTail
        if url == SHIPS_URL or not self.getOption('fallback'):
            try:
                response = URLopen(url)
                responseCode = response.getcode()
            except Exception as e:
                # If Tromador's server fails for whatever reason,
                # fallback to download direct from EDDB.io
                tdenv.WARN("Problem with download:\nURL: {}\nError: {}", url, str(e))
                self.options["fallback"] = True

        if self.getOption('fallback') and responseCode != 200:
            # EDDB.io doesn't have live listings or the ship index.
            if urlTail == LIVE_LISTINGS:
                return False

and Listener said

NOTE: Checking for update to 'index.json'.
NOTE: Downloading file 'index.json'.
NOTE: Requesting https://coriolis.io/data/index.json
NOTE: Downloaded   0.8MB of gziped data   1.9MB/s

\o/

I'm sure "that's not how we do that, Trom" ;) So I'll let you push your more elegant version of this (or an equivalent) fix. Honestly, given it must have been going into fallback @ 176 since forever, I struggle to know how it EVER worked right. Anyway - that's it.

Tromador commented on July 4, 2024

As an aside, note the url "https://coriolis.io/data/index.json". I happened to be using this as a test before I narrowed down the problem and never got around to changing it back, but per EDCD/coriolis-data#75 this is the master branch anyway, so should we be using it as standard?

eyeonus commented on July 4, 2024

> So, it's still skipped/ignored/screwed up the download at 169, though happily reset urlTail in the inserted next line.

Well crap. The success of that line change depended on Python going straight into the exception handling if it failed on the openURL. It's supposed to skip that line when the previous one errors.

Poo.

> I'm sure "that's not how we do that, Trom" ;)

The only thing I have to say about your fix is that I wouldn't have bothered making a new variable and would have directly compared response.getcode().

It's a good fix.

> I struggle to know how it EVER worked right.

Before I added the code inside the fallback check to pull the default template on failure to download, the ships download always eventually made it to line 222 because it never even went into the checking code:

tdenv.DEBUG0("Checking for update to '{}'.", path)
if urlTail == SHIPS_URL:
    url = SHIPS_URL
else:
    if not self.getOption('fallback'):

> this is the master branch anyway, so should we be using it as standard?

Yeah, I was planning on updating that, I've already made the commit, I just haven't pushed because I didn't see the need to update TD when the beta works just as well, and we were in the middle of solving a different problem.

Tromador commented on July 4, 2024

> Well crap. The success of that line change depended on Python going straight into the exception handling if it failed on the openURL. It's supposed to skip that line when the previous one errors.

Yes. I thought about this. The problem is that it doesn't report success until way later in the code and said report was thrown off by the actual bug. Thus getting hung up on the download. It was misdirection.

> It's a good fix.

Thanks :)
Not sure if we can compare directly with response.getcode(). Unless the response object is initialised elsewhere (I haven't looked), then if we're attempting to download anything except the ships, response doesn't exist and Python will barf when we try to use the nonexistent object, for the same reason I have to initialise the variable outside that if block.
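
A minimal sketch of that concern, using stand-in functions rather than the eddblink code (both function names are hypothetical, and 200 stands in for a real response code):

```python
def status_without_init(download):
    """Only binds response_code when a download was attempted."""
    if download:
        response_code = 200
    return response_code  # unbound when download is False


def status_with_init(download):
    """Binds response_code up front so the later check is always safe."""
    response_code = 0
    if download:
        response_code = 200
    return response_code


try:
    status_without_init(False)
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome, status_with_init(False), status_with_init(True))
```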

jessb321 commented on July 4, 2024

I would recommend using the file from beta if you need 100% up-to-date info, e.g. around update times, because I usually don't merge to master until the update is released and I can confirm everything.

Tromador commented on July 4, 2024

> I would recommend using the file from beta if you need 100% up-to-date info, e.g. around update times, because I usually don't merge to master until the update is released and I can confirm everything.

Roger that, we'll stick with beta then I think. Thanks for the clarification.

Tromador commented on July 4, 2024

Resolved by ab6e48e
