
mercury-parser-api's People

Contributors

adampash, dependabot-preview[bot], dependabot[bot], evgensk, fossabot, greenkeeper[bot], henryqw, nickwynja, renovate-bot, renovate[bot], snyk-bot, trustin

mercury-parser-api's Issues

[Question] Feed processing time

Just as a sanity check: I installed this plugin on a long-running, stable installation (I have been using readability for some years).

As soon as I enabled it, the installation's active multi-process daemon seemed to pause. Does the plugin do something to tables or data when it is added or enabled that could explain this? Or does Tiny do something once a plugin is added? I am just trying to understand why things stalled.

After a while the daemon kicked back in, but it feels like it is processing at about a third of the speed it did before (I have 4k feeds). The plugin is simply enabled; no feed uses it and no filter triggers it yet. I still have readability enabled until I can find out what is going on.

I'm concerned about the massive drop in feed processing speed after adding this plugin. Does this match anyone's experience? Could it be related to the plugin, or is that unlikely, and should feeds process as fast as usual?

Site Parsing Issue Q

Hi there Henry, I am running this on my NAS with a TTRSS build, and generally everything works well.

However, I have just noticed it has issues with at least one site. I am a bit new to this and was hoping that you, as a fellow user, had a tip or two on how it could be fixed. At a glance I did not see a way to customize anything on a per-feed basis, and I believe your build is the latest one there is (it feels like the Postlight version has not been maintained for a good while).

For this site it pulls the article text, but also the text of multiple articles below it:
https://www.thumbsticks.com/nintendo-switch-releases-september-21-25-2020-09242020/

docker pull fails

  • Platform: Debian 12, Docker v25.0.3

Expected Behavior

The image should be able to be pulled without a problem.

Current Behavior

When running docker pull wangqiru/mercury-parser-api, the pull fails with the following error:
failed to register layer: failed to Lchown "/app/node_modules/content-type/HISTORY.md" for UID 1516583083, GID 0 (try increasing the number of subordinate IDs in /etc/subuid and /etc/subgid): lchown /app/node_modules/content-type/HISTORY.md: invalid argument
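
The hint in the error message points at the subordinate ID ranges in /etc/subuid and /etc/subgid. As a rough sketch of what "large enough" means, the following parses subuid-style lines (name:start:count) and checks whether a range could hold the UID from the error. The `dockremap` entry is a hypothetical example, not taken from the report, and the rule that container ID n needs a range with more than n entries is a simplifying assumption about how the mapping works.

```python
from typing import List, Tuple


def parse_subid(text: str) -> List[Tuple[str, int, int]]:
    """Parse /etc/subuid-style lines of the form name:start:count."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, start, count = line.split(":")
        entries.append((name, int(start), int(count)))
    return entries


def covers_id(entries: List[Tuple[str, int, int]], owner: str, container_id: int) -> bool:
    """Return True if some range owned by `owner` has more than container_id entries.

    Assumption: a container ID n can only be mapped when the range
    contains more than n IDs.
    """
    return any(name == owner and container_id < count
               for name, _start, count in entries)


# Hypothetical subuid entry; the UID is the one reported in the error above.
sample = "dockremap:100000:65536\n"
print(covers_id(parse_subid(sample), "dockremap", 1516583083))  # False
```

Under these assumptions a UID as large as 1516583083 does not fit in a 65536-entry range, which is consistent with the lchown failure.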

Handle the full-text content in other language

The result of retrieving a non-English webpage is not encoded properly. It returns strings of hex digits in place of characters such as "中新网", rather than readable text. Is there a way to fix it? I tried the CLI version of Mercury Parser with the parameter --format markdown, which produced the correct text, but I have no idea how to pass this kind of parameter when calling mercury-parser-api. Please try the example URLs below to reproduce the problem:

  1. https://news.sina.com.cn/c/2021-01-23/doc-ikftssan9988691.shtml
  2. http://www.chinanews.com/sh/2021/01-24/9395190.shtml
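
If the hex strings in the response are HTML numeric character references (an assumption; the report does not show the raw output), a client can decode them after the fact. A minimal sketch using Python's standard library, with `decode_entities` as a hypothetical helper name:

```python
import html


def decode_entities(text: str) -> str:
    """Turn HTML character references such as &#x4E2D; back into characters."""
    return html.unescape(text)


# Assumed shape of the returned content for the text "中新网".
print(decode_entities("&#x4E2D;&#x65B0;&#x7F51;"))  # 中新网
```

This is a client-side workaround only; it does not change what the API returns.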

Mercury API key

Where can I get the Mercury API key? I am using the Docker image wangqiru/mercury-parser-api.

Error, response message: getaddrinfo EAI_AGAIN (address)

  • Platform: Linux iZuf63oqyjp30wx07gi8jvZ 4.18.0-80.11.2.el8_0.x86_64 #1 SMP Tue Sep 24 11:32:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Mercury Parser API Version: latest
  • Node Version: v10.19.0

Expected Behavior

A parsed result is returned.

Current Behavior

{"error":true,"messages":"getaddrinfo EAI_AGAIN china.caixin.com"}

The same happens with other links; only the address at the end changes. The example in README.md fails the same way.

Steps to Reproduce

  1. Run docker run -p 3000:3000 -d mercury-parser-api
  2. Use the service on port 3000

Or use my instance directly: http://139.196.180.51:3000

Detailed Description

Go directly to this link:

http://139.196.180.51:3000/parser?url=

and append the article URL at the end.
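
The endpoint above takes the article URL as the `url` query parameter. A minimal sketch of building such a request URL follows; `build_parser_url` is a hypothetical helper, `http://localhost:3000` assumes the port mapping from the docker run command in the steps, and percent-encoding the article URL is a general precaution for query values rather than a documented requirement of this API.

```python
from urllib.parse import quote


def build_parser_url(base: str, article_url: str) -> str:
    """Append a percent-encoded article URL to the /parser?url= endpoint."""
    return f"{base.rstrip('/')}/parser?url={quote(article_url, safe='')}"


# Assumed local instance started with: docker run -p 3000:3000 ...
print(build_parser_url("http://localhost:3000",
                       "http://www.chinanews.com/sh/2021/01-24/9395190.shtml"))
```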

Possible Solution

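I looked this up online.

Reportedly, a return value of EAI_AGAIN from getaddrinfo means that the DNS (name server) returned a temporary error, and the lookup can be retried later.

But no matter how many times I retry, it does not help.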
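
For reference, the retry behaviour described for EAI_AGAIN can be sketched as below. This is an illustration in Python, not the code of mercury-parser-api; `resolve_with_retry` and its parameters are hypothetical, and the `resolver` argument exists only so the lookup can be swapped out for testing.

```python
import socket
import time


def resolve_with_retry(host, attempts=3, delay=1.0, resolver=socket.getaddrinfo):
    """Resolve `host`, retrying only when the failure is EAI_AGAIN."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(host, None)
        except socket.gaierror as exc:
            if exc.errno != socket.EAI_AGAIN:
                raise  # a permanent failure, retrying will not help
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # wait before the next attempt
    raise last_error
```

If every attempt still fails, as reported here, the cause may lie in the DNS configuration available to the container rather than in a transient name server error.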

Update to v2.2.1?

A great tool. Thanks a lot for the great work.
Is there any chance that this version will be updated to the current code base?
https://github.com/postlight/mercury-parser/releases/tag/v2.2.1

As far as I can see, it is the first major update in a long time.
(The Chrome extension, which is probably based on the latest code, works much better for some sites than my self-hosted Docker version, hence my request.)
