easyliterature's People

Contributors

psycoy, simlif

easyliterature's Issues

Cannot Fetch from Google Scholar.

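Hello. Recently, fetching has been failing consistently for me, even with a proxy (VPN) enabled. Is this likely a problem on the scholarly side? I looked at their issue tracker and other people have reported it as well. Does it work normally for you? Thanks.
Error message: Exception ConnectError while fetching page: ('[Errno 11001] getaddrinfo failed',)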

2024-03-18 21:19:21,365 - scholarly - INFO - Enabling proxies: http=http://127.0.0.1:10809/ https=http://127.0.0.1:10809/
2024-03-18 21:19:22,633 - scholarly - INFO - Proxy works! IP address: 
2024-03-18 21:19:23,071 - scholarly - INFO - Proxy setup successfully
Proxy setup sucess: True.
2024-03-18 21:19:25,159 - scholarly - INFO - Proxy works! IP address: 
2024-03-18 21:19:25,858 - scholarly - INFO - Proxy works! IP address: 
2024-03-18 21:19:28,435 - scholarly - INFO - Getting https://scholar.google.com/scholar?hl=en&q=A%20lightweight%20network%20for%20photovoltaic%20cell%20defect%20detection%20in%20electroluminescence%20images%20based%20on%20neural%20architecture%20search%20and%20knowledge%20distillation&as_vis=0&as_sdt=0,33
2024-03-18 21:19:32,429 - scholarly - INFO - Exception ConnectError while fetching page: ('[Errno 11001] getaddrinfo failed',)
2024-03-18 21:19:32,430 - scholarly - INFO - Retrying with a new session.
2024-03-18 21:19:36,985 - scholarly - INFO - Exception ConnectError while fetching page: ('[Errno 11001] getaddrinfo failed',)
2024-03-18 21:19:36,985 - scholarly - INFO - Retrying with a new session.

The final error is: Cannot Fetch from Google Scholar.

File D:/miniconda/envs/research/lib/site-packages/scholarly/_navigator.py

    288 def search_publications(self, url: str) -> _SearchScholarIterator:
    289     """Returns a Publication Generator given a url
    290
    291     :param url: the url where publications can be found.
...
    188     return self._get_page(pagerequest, True)
    189 else:
--> 190     raise MaxTriesExceededException("Cannot Fetch from Google Scholar.")
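
The `[Errno 11001] getaddrinfo failed` message in the log above is a name-resolution (DNS) failure reported by Windows, not an HTTP-level rejection. A minimal sketch for checking whether the machine can resolve a host name at all, using only the standard library, is shown below. The helper name `can_resolve` is hypothetical and is not part of scholarly or easyliter. It only tests the local resolver, so it does not show what happens when DNS is resolved on the proxy side.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the local resolver can turn `host` into an address.

    Hypothetical diagnostic helper (not part of scholarly or easyliter).
    socket.getaddrinfo raises socket.gaierror when resolution fails, which
    is the same kind of failure the log reports as "getaddrinfo failed".
    """
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False
```

Calling `can_resolve("scholar.google.com")` from the same environment that runs the tool gives a quick hint: if it returns False while the proxy check succeeds, the retried sessions may be bypassing the proxy and falling back to local DNS. That is an assumption to verify, not something the log confirms.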

BERT paper is downloaded as a folder instead of a PDF

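Thanks to the author for this tool. I ran into the following problem while using it.
My note.md and the pdfs folder are in the same directory, and in note.md I entered the example - {{BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.}}
I ran easyliter -i "./note.md" -o "./pdfs" in cmd.
The cmd output is as follows.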

C:\Users\Administrator\Desktop\note>easyliter -i "./note.md" -o "./pdfs"
INFO:easyliter:Updating the file ./note.md
INFO:easyliter:Number of files to download -  1
  0%|                                                                                                | 0/1 [00:00<?, ?it/s]
INFO:Downloads:ID type: title.
D:\software\Anaconda3\Lib\site-packages\easy_literature\dblp_source.py:19: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 19 of the file D:\software\Anaconda3\Lib\site-packages\easy_literature\dblp_source.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

  return BeautifulSoup(resp.content)
INFO:Downloads:The Google scholar bib: {'title': 'Bert: Pre-training of deep bidirectional transformers for language understanding', 'author': 'J Devlin and MW Chang and K Lee and K Toutanova', 'journal': 'arXiv preprint arXiv …', 'year': '2018', 'url': 'https://arxiv.org/abs/1810.04805', 'pdf_link': 'https://arxiv.org/pdf/1810.04805.pdf&usg=ALkJrhhzxlCL6yTht2BRmH9atgvKFxHsxQ', 'cited_count': 82010}; The DLBP bib: {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.', 'author': 'Jacob Devlin and Ming-Wei Chang and Kenton Lee and Kristina Toutanova', 'journal': 'NAACL-HLT (1)', 'year': '2019', 'url': 'https://doi.org/10.18653/v1/n19-1423', 'pdf_link': None, 'cited_count': None}.
INFO:utils:The paper's arxiv url: https://arxiv.org/abs/1810.04805; The converted arxiv id: 1810.04805; The pdf link: https://arxiv.org/pdf/1810.04805.pdf&usg=ALkJrhhzxlCL6yTht2BRmH9atgvKFxHsxQ.
100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:07<00:00,  7.54s/it]

The resulting BERT output is a folder named Bert_Pre-training_of_deep_bidirectional_transformers_for_language_understanding.pdf, not a PDF file.
I do not know what the problem is.
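
One detail that stands out in the log above is that the Google Scholar `pdf_link` carries a trailing `&usg=...` fragment after `.pdf`. Whether this is what leads to a folder being created instead of a file is not confirmed by the output. As a sketch of how such a link could be normalized before downloading, the two hypothetical helpers below (not part of easyliter) either cut the link at the first `.pdf` or rebuild an arXiv PDF URL from the arXiv ID that the log shows as `1810.04805`.

```python
import re
from typing import Optional

# Matches new-style arXiv IDs such as 1810.04805 (optionally with a version)
# inside an abs/ or pdf/ URL. Old-style IDs (e.g. cs/0112017) are not handled.
ARXIV_ID = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(v\d+)?", re.IGNORECASE)


def clean_pdf_link(link: str) -> str:
    """Drop anything that follows the first '.pdf' (for example '&usg=...')."""
    head, sep, _tail = link.partition(".pdf")
    return head + sep if sep else link


def arxiv_pdf_url(url: str) -> Optional[str]:
    """Build an arXiv PDF URL from an abs/pdf URL, or return None if no ID is found."""
    match = ARXIV_ID.search(url)
    if match is None:
        return None
    return f"https://arxiv.org/pdf/{match.group(1)}{match.group(2) or ''}"
```

With the values from the log, `clean_pdf_link` applied to the Scholar `pdf_link` yields `https://arxiv.org/pdf/1810.04805.pdf`, and `arxiv_pdf_url` applied to `https://arxiv.org/abs/1810.04805` yields `https://arxiv.org/pdf/1810.04805`. For the DBLP entry, whose URL is a DOI, `arxiv_pdf_url` returns None, so a caller would need another source for the PDF.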
