First of all, thank you for this awesome tool. I've been testing it for a while, and it has been doing a great job and is very stable. It's great that it exists as a Python SDK: I can customize what I want, and the API is very simple.
Here's a bug I encountered while batch-scanning some URLs:
2022-07-26 17:57:44,470 [DEBUG] Generated user agent: Mozilla/5.0 (iPad; CPU OS 8_4_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) CriOS/43.0.2357.61 Mobile/12H321 Safari/600.1.4
2022-07-26 17:57:44,482 [DEBUG] Starting new HTTP connection (1): news.co.th:80
2022-07-26 17:57:45,094 [DEBUG] http://news.co.th:80 "GET /news.co.th.zip HTTP/1.1" 200 16314266
2022-07-26 18:01:52,014 [DEBUG] SHIFT_JIS Japanese prober hit error at byte 11
2022-07-26 18:01:52,014 [DEBUG] EUC-JP Japanese prober hit error at byte 10
2022-07-26 18:01:52,014 [DEBUG] GB2312 Chinese prober hit error at byte 11
2022-07-26 18:01:52,014 [DEBUG] EUC-KR Korean prober hit error at byte 10
2022-07-26 18:01:52,014 [DEBUG] CP949 Korean prober hit error at byte 11
2022-07-26 18:01:52,015 [DEBUG] Big5 Chinese prober hit error at byte 10
2022-07-26 18:01:52,015 [DEBUG] EUC-TW Taiwan prober hit error at byte 10
2022-07-26 18:01:57,974 [DEBUG] windows-1251 confidence = 0.04284181968104123, below negative shortcut threshhold 0.05
2022-07-26 18:02:03,923 [DEBUG] KOI8-R confidence = 0.041309005742054414, below negative shortcut threshhold 0.05
2022-07-26 18:02:09,873 [DEBUG] ISO-8859-5 confidence = 0.04158455218939259, below negative shortcut threshhold 0.05
2022-07-26 18:02:15,734 [DEBUG] MacCyrillic confidence = 0.043316827603991484, below negative shortcut threshhold 0.05
2022-07-26 18:02:21,768 [DEBUG] IBM866 confidence = 0.04203255622482825, below negative shortcut threshhold 0.05
2022-07-26 18:02:27,898 [DEBUG] IBM855 confidence = 0.04124558620798153, below negative shortcut threshhold 0.05
2022-07-26 18:02:45,389 [DEBUG] ISO-8859-5 confidence = 0.04065749192599313, below negative shortcut threshhold 0.05
2022-07-26 18:02:51,299 [DEBUG] windows-1251 confidence = 0.041915259363694175, below negative shortcut threshhold 0.05
2022-07-26 18:02:57,212 [DEBUG] TIS-620 confidence = 0.04620826074805007, below negative shortcut threshhold 0.05
2022-07-26 18:03:11,205 [DEBUG] windows-1255 confidence = 0.04420672943568373, below negative shortcut threshhold 0.05
2022-07-26 18:03:17,123 [DEBUG] windows-1255 confidence = 0.04426233160412784, below negative shortcut threshhold 0.05
2022-07-26 18:03:21,357 [DEBUG] windows-1251 not active
2022-07-26 18:03:21,357 [DEBUG] KOI8-R not active
2022-07-26 18:03:21,357 [DEBUG] ISO-8859-5 not active
2022-07-26 18:03:21,357 [DEBUG] MacCyrillic not active
2022-07-26 18:03:21,357 [DEBUG] IBM866 not active
2022-07-26 18:03:21,357 [DEBUG] IBM855 not active
2022-07-26 18:03:21,357 [DEBUG] ISO-8859-7 Greek confidence = 0.07158825613241834
2022-07-26 18:03:21,357 [DEBUG] windows-1253 Greek confidence = 0.0728303862329963
2022-07-26 18:03:21,357 [DEBUG] ISO-8859-5 not active
2022-07-26 18:03:21,357 [DEBUG] windows-1251 not active
2022-07-26 18:03:21,357 [DEBUG] TIS-620 not active
2022-07-26 18:03:21,357 [DEBUG] ISO-8859-9 Turkish confidence = 0.06570938840808227
2022-07-26 18:03:21,357 [DEBUG] windows-1255 Hebrew confidence = 0.0
2022-07-26 18:03:21,357 [DEBUG] windows-1255 not active
2022-07-26 18:03:21,357 [DEBUG] windows-1255 not active
2022-07-26 18:03:21,358 [DEBUG] no probers hit minimum threshold
2022-07-26 18:03:21,358 [DEBUG] utf-8 confidence = 0.010000000000000009
2022-07-26 18:03:21,358 [DEBUG] SHIFT_JIS Japanese confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] EUC-JP Japanese confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] GB2312 Chinese confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] EUC-KR Korean confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] CP949 Korean confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] Big5 Chinese confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] EUC-TW Taiwan confidence = 0.01
2022-07-26 18:03:21,358 [DEBUG] windows-1251 Russian confidence = 0.04284181968104123
2022-07-26 18:03:21,358 [DEBUG] KOI8-R Russian confidence = 0.041309005742054414
2022-07-26 18:03:21,358 [DEBUG] ISO-8859-5 Russian confidence = 0.04158455218939259
2022-07-26 18:03:21,358 [DEBUG] MacCyrillic Russian confidence = 0.043316827603991484
2022-07-26 18:03:21,358 [DEBUG] IBM866 Russian confidence = 0.04203255622482825
2022-07-26 18:03:21,358 [DEBUG] IBM855 Russian confidence = 0.04124558620798153
2022-07-26 18:03:21,358 [DEBUG] ISO-8859-7 Greek confidence = 0.07158825613241834
2022-07-26 18:03:21,358 [DEBUG] windows-1253 Greek confidence = 0.0728303862329963
2022-07-26 18:03:21,358 [DEBUG] ISO-8859-5 Bulgarian confidence = 0.04065749192599313
2022-07-26 18:03:21,358 [DEBUG] windows-1251 Bulgarian confidence = 0.041915259363694175
2022-07-26 18:03:21,358 [DEBUG] TIS-620 Thai confidence = 0.04620826074805007
2022-07-26 18:03:21,358 [DEBUG] ISO-8859-9 Turkish confidence = 0.06570938840808227
2022-07-26 18:03:21,358 [DEBUG] windows-1255 Hebrew confidence = 0.0
2022-07-26 18:03:21,358 [DEBUG] windows-1255 Hebrew confidence = 0.04420672943568373
2022-07-26 18:03:21,358 [DEBUG] windows-1255 Hebrew confidence = 0.04426233160412784
2022-07-26 18:03:21,358 [DEBUG] ISO-8859-1 confidence = 0.01
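
To put numbers on the delay visible in the log above, here is a minimal sketch that measures the gaps between three of the logged timestamps. It only does arithmetic on lines copied from the log; the helper names `parse_timestamp` and `elapsed_seconds` are made up for this illustration and are not part of the tool or of any library.

```python
from datetime import datetime

# Three lines copied verbatim from the log above: the HTTP response,
# the first prober message, and the last confidence message.
LOG_LINES = [
    '2022-07-26 17:57:45,094 [DEBUG] http://news.co.th:80 "GET /news.co.th.zip HTTP/1.1" 200 16314266',
    "2022-07-26 18:01:52,014 [DEBUG] SHIFT_JIS Japanese prober hit error at byte 11",
    "2022-07-26 18:03:21,358 [DEBUG] ISO-8859-1 confidence = 0.01",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    # The timestamp occupies the first 23 characters of each line.
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(first: str, last: str) -> float:
    """Seconds elapsed between the timestamps of two log lines."""
    return (parse_timestamp(last) - parse_timestamp(first)).total_seconds()


if __name__ == "__main__":
    response_line, first_prober_line, last_line = LOG_LINES
    print("response -> first prober message:",
          elapsed_seconds(response_line, first_prober_line), "s")
    print("first prober message -> last message:",
          elapsed_seconds(first_prober_line, last_line), "s")
    print("total:", elapsed_seconds(response_line, last_line), "s")
```

By this measure, roughly five and a half minutes pass between the response line for the 16314266-byte `.zip` file and the final encoding-detection message for this single URL.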