Git Product home page Git Product logo

Comments (21)

bradenbest avatar bradenbest commented on May 24, 2024 5

@ryukinix What if you were to create a dummy HTTP server with python's http.server module and re-implement the API but in a way where it gives boilerplate responses? Then you could run a server locally and not have to rely upon MAL (the site) for testing.

$ ./test-mal-server.py
Running API testing server at 127.0.0.1 on port 8080.
--
$ mal list --server http://127.0.0.1:8080
1: Test
    Watching at 0/0 episodes with score -

The dummy API should implement the API for MAL (the website) so that mal (the program) can connect to it during testing and test how it responds to various specific requests.

Using this, you could add switches to test how mal handles different errors. Such as a 500 internal server error, or a 403 forbidden

$ ./test-mal-server.py --500
Running API testing server at 127.0.0.1 on port 8080.
Will respond with an HTTP 500 to all requests
--
$ mal list  --server http://127.0.0.1:8080
<program gets HTTP 500 response from request to localhost:8080 and crashes>

Then you just add this to the unit tests and give mal a switch (--server) that lets you specify a different base domain, like localhost:8080 instead of myanimelist.net, as illustrated above.

AFAIK, http.server is part of python's standard library, so no new dependencies.

https://docs.python.org/3/library/http.server.html

The basic rundown, if you aren't familiar with HTTP, is that your mal client will send a request that looks like this:

...

Well, after probably two hours trying to use my /etc/hosts (127.0.0.1 myanimelist.net) to redirect mal to use my ncat server so I could read the request headers, I couldn't get past the SSL part. Either your security is really good, or it has to do with librequest and the string "https" appearing in the URL base. But I can't get mal to connect to my server, as it fails each time with...

mal:
SSLError: MaxRetryError
          reason: SSLError ¯\_(ツ)_/¯

server (sudo ncat --listen --ssl localhost 443 -v, among many others including using certificates generated by openssl):
...
Ncat: Listening on 127.0.0.1:443
Ncat: Connection from 127.0.0.1.
Ncat: Connection from 127.0.0.1:53202.
Ncat: Failed SSL connection from 127.0.0.1: error:14094418:SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca

cURL is able to connect without triggering an error on the ncat server, and this is what it says:

$ curl https://localhost
curl: (60) server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

Using cURL's insecure mode, you can see what an HTTP request looks like:

cURL's end:
$ curl https://localhost --insecure

ncat's end:
$ sudo ncat --listen --ssl localhost 443 -v
...
Ncat: Listening on 127.0.0.1:443
Ncat: Connection from 127.0.0.1.
Ncat: Connection from 127.0.0.1:53224.
[vv this is the request that cURL sends vv]
GET / HTTP/1.1
Host: localhost
User-Agent: curl/7.47.0
Accept: */*

But it's a GET request. And I wanted to show you what request your program is sending so you have a better idea of how to handle incoming requests on the dummy server.

So, one suggestion I have, for the dummy server, is to use raw HTTP. Also be sure to use a port above 1023, because ports 0 - 1023 are owned by root. So it's easier to just use 8000 or 8080.

I suggest isolating in base_url the https://myanimelist.net part and calling that domain or server. Then by passing the aforementioned test flag (--server, e.g. mal list --server http://localhost:8080), have the server variable get switched to that, such that base_url becomes http://localhost:8080/api. That way, when the test server gets the request, it'll look something like:

POST /api HTTP/1.1
data...

But you probably want to know what a POST request actually looks like. Well, I did, too. So I wrote a quick HTML form and served it via http.server (python3 -m http.server), then swapped it out with netcat (nc -l 8000) before submitting.

Here's the form:

<form action="/senpai" method="post">
<input name="username">
<input name="password">
<input type="submit">
</form>

And here's the POST request (+data) it generated with the inputs "sen", "pai":

POST /senpai HTTP/1.1
Host: localhost:8000
Connection: keep-alive
Content-Length: 25
Cache-Control: max-age=0
Origin: http://localhost:8000
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/64.0.3282.167 Chrome/64.0.3282.167 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: http://localhost:8000/form.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

username=sen&password=pai

of course, there is no /senpai page on the server, but netcat doesn't care, because it's not a fileserver.

Anyways, I'm sure python abstracts a lot of that stuff out. You can probably just do something like, I dunno, http.request.url to get /senpai, for example. And then taking that, you can start to implement the behavior for the different "pages" of the dummy API, like /api/anime/search.xml?q=..., for example. In which case, you could treat "Cowboy Bebop" as the only anime in existence, and anything else should 404 so you can test how it reacts to 404's.

This has been a long-winded reply. Sorry about that. But I hope you get the gist. The dummy server/API could be a valuable asset in testing, especially for reproducing errors like this one. Since everything would be under your control. It would also ensure that a run-through of travis-ci doesn't fail if, say, MAL just happens to be down at the time.

Also note that a url like /page?a=somevalue&... is a GET request. Which is what it looks like MAL's api takes in. But I've also seen evidence that it takes POST requests (like in the curl ... -d ... examples given in the API's documentation), so I'm not sure how MAL's API is organized. Seems like a bit of a clusterfuck to me.

Just as a quick reminder, I'm not suggesting to re-implement the entirety of MAL. You are only implementing the interface, while the actual technology that the interface accesses is just a bunch of boilerplate that will always give the same hard-coded response depending on the input.

It's like

int getRandomNumber(){
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

versus

int getRandomNumber(){
    int ret;
    int rand_fd = open("/dev/urandom", O_RDONLY);

    read(rand_fd, &ret, 4); // chosen by reading 4 bytes from /dev/urandom
    close(rand_fd);

    return ret;
}

from mal.

lubien avatar lubien commented on May 24, 2024 2

@bradenbest I don't see the point on boilerplating a server just to mock some HTTP responses as even the exception itself is unknown.

Not only feels like overkill but also seems like missing the original point.

MAL API itself is a mystery. Sometimes it gives "Created" messages in the HTML body, sometimes it don't say anything. It's a "deal with it developers" kinda thing. Trying to mock the unexpected feels just wrong.

from mal.

lubien avatar lubien commented on May 24, 2024 2

@bradenbest Unfortunately. It's too unpredictable and incomplete. Not to talk about undocumented.

from mal.

ryukinix avatar ryukinix commented on May 24, 2024 2

MAL API is the worst API that I knew in the my entire life.

from mal.

bradenbest avatar bradenbest commented on May 24, 2024 1

@lubien The point of my reply was to suggest a fix for the non-reproducible roadblock. So if you have something to add, I think we'd all love to hear your input. It'd certainly be more useful to the discussion than a thumbs-down emoji.

By doing that, you've broadcasted for all of us to see that you object to something I said. It's kinda rotten to just keep that to yourself. Get mad! Tell me why I suck.

from mal.

bradenbest avatar bradenbest commented on May 24, 2024 1

@lubien Ah. MAL's API is really that bad? Well, thanks for the input. I don't really have anything to add, so have a thumbs up.

from mal.

ryukinix avatar ryukinix commented on May 24, 2024 1

Due to a recent update of vulnerabilities, MAL API parsing it's broken for some reason.

image

We need debug it.

Sorry @bradenbest for the so late response, I was busy in a recent job after being hired and a lot of infernal stuff at university. Feel free to to fill your ideas where you think it's more appropriate. If you need elaborate your ideas a little longer, I will recommend to you create independent issues instead commenting here.

This thread can freely close just handling better the exceptions.

BTW, would be nice as well fixing this errors.

I'll be still busy in the next days, just reporting these errors here.

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

Ok. We can create a similar way to treat errors like I did for regex/connection exceptions, using just a decorator to be used in some method or function as defined here:

mal/mal/utils.py

Lines 62 to 96 in 99b7289

def checked_regex(func):
"""Wrap the function in a try/except to catch and handle a BadRegexError."""
@wraps(func) # keeps the wrapped function's name and docstring intact
def wrapper(*args, **kwargs):
result = None
try:
result = func(*args, **kwargs)
except BadRegexError:
if AnimatedDecorator.spinner.running:
AnimatedDecorator.stop()
print_error('BadRegexError', 'invalid regex', 'reason: you')
sys.exit(1)
return result
return wrapper
def checked_connection(func):
"""Wrap the function in a try/except to catch and handle a ConnectionError."""
@wraps(func) # keeps the wrapped function's name and docstring intact
def wrapper(*args, **kwargs):
result = None
try:
result = func(*args, **kwargs)
except ConnectionError as e:
if AnimatedDecorator.spinner.running:
AnimatedDecorator.stop()
error_name = e.__class__.__name__
status = e.args[0].__class__.__name__
reason = e.args[0].reason.__class__.__name__
print_error(error_name, status, "reason: {}".format(reason))
sys.exit(1)
return result
return wrapper

And used here: https://github.com/ryukinix/mal/blob/master/mal/api.py

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

@kutsan on that image:

image

I don't see the name of the exception. What is?

from mal.

kutsan avatar kutsan commented on May 24, 2024

@ryukinix Silly me, for some reason I didn't include the underside of the logs. Don't know what was I thinking. I've already closed that terminal session; so can't provide any useful information on that topic. I'm so sorry.

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

@kutsan hmm, I would try implement this, but would be more difficult without the name of the exception on full log stacktrace. Well... I'll try replicate this later. Would really nice having the full log of this exception.

from mal.

kutsan avatar kutsan commented on May 24, 2024

I am really so sorry, I feel dumb now. Won't be happen again. :(

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

I was thinking... if I just try change the API url randomly, I can get the same exception? 🤔 because if the API is not working, probably the result is fetch another page...

from mal.

kutsan avatar kutsan commented on May 24, 2024

I am looking forward to an possible maintenance. :P

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

Thanks! Be watching! xD

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

@kutsan any news on that? No maintenance? I was looking to a way to reproduce this thing... Maybe if I try just change the mal.api.MyAnimeList.base_url I can replicate a similar issue.

from mal.

kutsan avatar kutsan commented on May 24, 2024

@ryukinix Yeah, haven't encountered any maintenance since then.

I can replicate a similar issue.

Fix that problem and let's close this issue; if happens again in future, I'll let you know.

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

ok, thanks

from mal.

bradenbest avatar bradenbest commented on May 24, 2024

So, as an alternative idea, what if you made a dummy function that returns what you expect from MAL (the site) and returns something that causes the program to crash:

def fetch_site(url):
    # do some stuff
    # return response text from MAL

def fetch_site(url): # version used for testing
    # do some stuff depending on what url says
    # return arbitrary boilerplate

At least then you'd cut out the server and wouldn't have to attempt to recreate the API. Just recreate the responses you expect to get from the server, and add in responses that the program isn't supposed to be receiving (plain text, improperly formatted XML, JSON, HTTP error code, HTML page, RSS feed, all of the above with varying combinations of CR/LF/CRLF line terminators), to test fault tolerance. Surely with enough experimentation, you'll manage to reproduce the bug (and it might even come from something completely unexpected like incompatible line terminators), and can fix it and add the API simulation to the unit tests for the future.

Outside of that, the next time MAL goes down, be it for maintenance, downtime, breakage or domain expiry, document into a file the HTTP header and response you get from MAL with curl -I <url> and curl <url> respectively, along with a description of the circumstances (e.g. "This is from when MAL went down for maintenance on "). Then, you can begin to re-create those responses for testing. I think everybody should try to do this. I'll do it. Next time my copy of mal crashes and myanimelist.net is also having problems, I'll try to remember to make a log and submit it to @ryukinix , with this information, more or less:

copy of python backtrace, including the command that invoked it
screenshot of MAL, if necessary
description of site's state, if necessary
The output of the following commands:
    malusr=<your MAL username>
    malpwd=<your MAL password>
    ^ don't include those in the log, obviously
    curl -I "https://myanimelist.net/malappinfo.php?u=$malusr&status=all&type=anime"
    curl "https://myanimelist.net/malappinfo.php?u=$malusr&status=all&type=anime"
    curl -Iu $malusr:$malpwd "https://myanimelist.net/api/anime/search.xml?q=naruto"
    curl -u $malusr:$malpwd "https://myanimelist.net/api/anime/search.xml?q=naruto"
Which will give the HTTP headers and bodies of both major parts of the API, so it's known EXACTLY what the server responded with to cause the error, and what it might cause for information retrieval
Date in ISO 8601 e.g. 2018-04-02

That's probably the last I'll say on this topic. I wouldn't want to waste anyone's time.

@ryukinix , do you want reports like these, and if so, where should we preferentially post them? On this issue's thread? DM?

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

https://myanimelist.net/api

MAL API seems pretty dead for me. Any news about it that I'm missing?

image

ping @lubien

from mal.

ryukinix avatar ryukinix commented on May 24, 2024

More info about the API dead in #100

from mal.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.