nsfc's Introduction

A query system for National Natural Science Foundation of China (NSFC) data

Installation

pip3 install nsfc

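Data download

The database files are large and can be downloaded from Baidu Netdisk (download link; extraction code: 2nw5)

  • Download the database file you need, e.g. project.A.sqlite3, or the full data set project.all.sqlite3
  • Save it in the data directory under the nsfc installation path, e.g. /path/to/site-packages/nsfc/data/project.db
  • Or save it in the nsfc_data directory under your HOME path, e.g. ~/nsfc_data/project.db
  • Alternatively, use the -d option to specify the database file to use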
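
The lookup locations listed above can be sketched in Python as follows. This is a minimal sketch, not nsfc's actual implementation: the function name `resolve_dbfile` and its parameters are hypothetical, and the order in which the two default locations are checked is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_dbfile(package_dir: Path, home_dir: Path,
                   explicit: Optional[str] = None,
                   filename: str = "project.db") -> Optional[Path]:
    """Return the first existing database file, or None if none is found.

    Hypothetical helper: the search order is an assumption and is not
    taken from the nsfc source code.
    """
    if explicit is not None:
        # An explicitly given path (like the -d option) is used on its own.
        path = Path(explicit)
        return path if path.is_file() else None

    candidates = [
        package_dir / "data" / filename,    # <site-packages>/nsfc/data/project.db
        home_dir / "nsfc_data" / filename,  # ~/nsfc_data/project.db
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Calling `resolve_dbfile(Path("/path/to/site-packages/nsfc"), Path.home())` would return whichever of the two default files exists, which is a quick way to see where a copy of the database has actually been found.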

Usage examples

Local query

# Show help
nsfc query

# List the available query fields
nsfc query -K

# Print the number of matching records
nsfc query -C

# Query by approval year
nsfc query -C -s approval_year 2019

# Query by approval year + subject code (fuzzy match)
nsfc query -C -s approval_year 2019 -s subject_code "%A%"

# The approval year can also be a range
nsfc query -C -s approval_year 2015-2019 -s subject_code "%C01%"

# Write the results to a .jl file
nsfc query -s approval_year 2019 -s subject_code "%C0501%" -o C0501.2019.jl

# Write the results to an xlsx file
nsfc query -s approval_year 2019 -s subject_code "%C0501%" -o C0501.2019.xlsx -F xlsx

# Limit the maximum number of output records
nsfc query -L 5 -s approval_year 2019
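
For readers who want to inspect the database file directly, the filters above map onto ordinary SQL. The following is a minimal sketch using Python's standard `sqlite3` module. It assumes a `project` table with `approval_year` and `subject_code` columns (these names appear in the SQL that nsfc logs in the last issue further down); the helper name `count_projects` is hypothetical, and treating a year range as an inclusive BETWEEN is an assumption about how nsfc interprets `2015-2019`. The rows are made up for the demonstration.

```python
import sqlite3


def count_projects(conn: sqlite3.Connection, year_from: str, year_to: str,
                   subject_like: str = "%") -> int:
    """Count projects approved in [year_from, year_to] whose subject code
    matches an SQL LIKE pattern such as "%C01%"."""
    sql = (
        "SELECT COUNT(*) FROM project "
        "WHERE approval_year BETWEEN ? AND ? AND subject_code LIKE ?"
    )
    (count,) = conn.execute(sql, (year_from, year_to, subject_like)).fetchone()
    return count


# Tiny in-memory stand-in for a downloaded database (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project (project_id TEXT, approval_year TEXT, subject_code TEXT)"
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?)",
    [("p1", "2019", "C0101"), ("p2", "2019", "A0101"), ("p3", "2015", "C0102")],
)

print(count_projects(conn, "2019", "2019"))           # year only -> 2
print(count_projects(conn, "2015", "2019", "%C01%"))  # year range + fuzzy code -> 2
```

Passing the pattern as a bound parameter rather than formatting it into the SQL string keeps the `%` wildcards intact and avoids quoting problems.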

Conclusion report download

nsfc report 20671004

nsfc report 20671004 -o out.pdf

Other features

Fetching LetPub data

nsfc crawl

Building/updating the local database

nsfc build

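Other notes

  • At present the data covers essentially only the years up to 2019; there is very little data for 2020
  • The data will be updated when more becomes available

Changelog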

  • [2022-01-14] version 2.0.4
    • update the urls of Official

nsfc's People

Contributors

roronoa-dong, suqingdong

nsfc's Issues

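Question about data usage

Thank you very much for contributing the data and code. I am a graduate student at Nanjing University and would like to use your data in my research. I have a few questions about data usage permissions; if convenient, please contact me at [email protected]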

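Installation error

On Windows 7, running pip install nsfc always fails with "error: command errored out with exit status: 1". How can I fix this? I would really like to use this tool, but I keep getting stuck here. Also, can it be used on macOS? Sorry to bother you.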

Crawl error

Does anyone know why this command fails and how to fix it?

-bash-3.2$ nsfc search -c H -y 2019 -o out
[2020-12-19 11:09:59 NSFC search DEBUG MainThread:72] input arguments: {'codes': 'H', 'years': '2019', 'outfile': 'out', 'projects': None, 'outtype': 'xlsx', 'type': 'Z'}
[2020-12-19 11:09:59 NSFC search DEBUG MainThread:84] >>> crawling: H0101 - 2019 - 630
right captcha: 655n
{'code': 'H0101', 'projectType': '630', 'conclusionYear': '', 'ratifyYear': '2019', 'ratifyNo': '', 'projectName': '', 'personInCharge': '', 'dependUnit': '', 'keywords': '', 'subPType': '', 'psPType': '', 'pageNum': 0, 'pageSize': 10, 'beginYear': '', 'endYear': '', 'adminID': '', 'checkDep': '', 'checkType': '', 'quickQueryInput': '', 'queryType': 'input', 'complete': '', 'tryCode': '655n'}
error code: {"code": 500, "data": null, "message": "请正确输入检索条件"}

(The server message translates to "Please enter valid search conditions".)

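Wrong value of info in the parse function

In nsfc_spider.py, the info variable in the parse function is now always the fixed value "请输入细化的查找条件" ("Please enter more specific search conditions"), so the rest of the program cannot run successfully.
Could you check whether this is caused by a website redesign or a similar change?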

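Cannot query projects locally

Thank you for sharing such a great tool. Perhaps I am using it incorrectly, but I keep getting errors and cannot run local queries. Downloading conclusion reports works fine. How should I set things up so that I can query the local database? Any advice would be greatly appreciated.
Following the instructions, I placed the file downloaded from Baidu Netdisk in several of the directories named there, such as C:\Python\Python39\Lib\site-packages\nsfc\data\project.db (the error message says proejct.db, presumably a typo; I also tried that misspelling), C:\Python\Python39\nsfc\nsfc_data, C:\Python\Python39\nsfc_data and C:\Python\Python39\nsfc\nsfc_data, but I always get STATS main ERROR MainThread:88 dbfile not exists! [C:\Python\Python39\Lib\site-packages\nsfc\data\project.db]. I then used the -d option to point directly at the directory and the database file project.all.sqlite3. The file-not-found message went away, but the query always reports STATS main ERROR MainThread:122 no result for your input. My query commands are: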
nsfc query -d C:\Python\Python39\Lib\site-packages\nsfc\data\project.db -s approval_year 2021 -o 2021.xlsx -F xlsx

C:\Python\Python39>nsfc query -d C:\Python\Python39\Lib\site-packages\nsfc\data\project.db\project.all.sqlite3 -s approval_year 2021 -s subject_code "%A%" -o 2021.xlsx -F xlsx

C:\Python\Python39>nsfc query -d C:\Python\Python39\Lib\site-packages\nsfc\data\project.db\project.all.sqlite3 -s approval_year 1990 -o 1990.xlsx -F xlsx
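
When a query reports no result, one way to check whether the database file contains the requested year at all is to count the rows per approval year. The README above notes that the data covers essentially only the years up to 2019, so a query for 2021 may simply have nothing to return. The following is a minimal sketch with Python's standard `sqlite3` module; the table and column names are taken from the SQL that nsfc logs, while the function name `years_available` is hypothetical.

```python
import os
import sqlite3
from typing import Dict


def years_available(dbfile: str) -> Dict[str, int]:
    """Map each approval year found in the database to its row count."""
    if not os.path.isfile(dbfile):
        # sqlite3.connect() would silently create a new empty file otherwise.
        raise FileNotFoundError(dbfile)
    conn = sqlite3.connect(dbfile)
    try:
        rows = conn.execute(
            "SELECT approval_year, COUNT(*) FROM project "
            "GROUP BY approval_year ORDER BY approval_year"
        ).fetchall()
    finally:
        conn.close()
    return {str(year): count for year, count in rows}
```

If the year being queried is missing from the returned mapping, the empty result comes from the data rather than from the command line.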

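Problem with crawl results

In the crawl results, the last record returned by the query is saved over and over, while the other records found are not written to the file, so a large amount of data is missing.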

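Cannot find the data directory under the nsfc path

Thanks for sharing. I am a beginner, and when following the step "save it in the data directory under the nsfc installation path", I could not find a data directory inside the nsfc folder.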
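
One way to find out where a package is installed, and therefore where its data directory would be, is to ask Python's import machinery. The following is a minimal sketch; the function name `package_data_dir` is hypothetical, and creating the directory when it is missing is an assumption about what the README intends, not documented nsfc behavior.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_data_dir(package: str, create: bool = False) -> Optional[Path]:
    """Return <package install dir>/data, or None if the package is not found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a single-file module without a folder
    data_dir = Path(list(spec.submodule_search_locations)[0]) / "data"
    if create:
        # Assumption: creating the missing directory by hand is acceptable.
        data_dir.mkdir(exist_ok=True)
    return data_dir
```

With nsfc installed, `print(package_data_dir("nsfc"))` shows the full path that the installation step refers to.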

Data from before 2015 cannot be crawled

nsfc query -s approval_year 2014 -s subject_code "%H08%" -o /cluster/home/zfli/H08.2014.xlsx -F xlsx
[2021-04-14 00:24:58 STATS main INFO MainThread:72] input arguments: {'search': (('approval_year', '2014'), ('subject_code', '%H08%')), 'outfile': '/cluster/home/zfli/H08.2014.xlsx', 'format': 'xlsx', 'dbfile': '/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/nsfc/data/proejct.db', 'keys': False, 'count': False, 'limit': None, 'log_level': 'info'}
Traceback (most recent call last):
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1706, in _execute_context
    cursor, statement, parameters, context
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 692, in do_execute
    cursor.execute(statement, parameters)
sqlite3.DatabaseError: database disk image is malformed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/cluster/home/zfli/anaconda2/envs/NSFC/bin/nsfc", line 8, in <module>
    sys.exit(main())
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/nsfc/bin/main.py", line 27, in main
    cli()
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/nsfc/bin/query.py", line 121, in main
    elif not query.count():
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3064, in count
    return self._from_self(col).enable_eagerloads(False).scalar()
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 2805, in scalar
    ret = self.one()
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 2782, in one
    return self._iter().one()
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 2823, in _iter
    execution_options={"_sa_orm_load_options": self.load_options},
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1670, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1520, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 314, in _execute_on_connection
    self, multiparams, params, execution_options
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1399, in _execute_clauseelement
    cache_hit=cache_hit,
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1749, in _execute_context
    e, statement, parameters, cursor, context
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1930, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1706, in _execute_context
    cursor, statement, parameters, context
  File "/cluster/home/zfli/anaconda2/envs/NSFC/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 692, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT count(*) AS count_1 
FROM (SELECT project.project_id AS project_project_id, project.title AS project_title, project.project_type AS project_project_type, project.project_type_code AS project_project_type_code, project.approval_year AS project_approval_year, project.person AS project_person, project.money AS project_money, project.institution AS project_institution, project.start_time AS project_start_time, project.end_time AS project_end_time, project.subject AS project_subject, project.subject_class_list AS project_subject_class_list, project.subject_code_list AS project_subject_code_list, project.subject_code AS project_subject_code, project.finished AS project_finished, project.keyword AS project_keyword, project.keyword_en AS project_keyword_en, project.abstract AS project_abstract, project.abstract_en AS project_abstract_en, project.abstract_conc AS project_abstract_conc, project.result_stat AS project_result_stat 
FROM project 
WHERE project.approval_year = ? AND project.subject_code LIKE ?) AS anon_1]
[parameters: ('2014', '%H08%')]
(Background on this error at: http://sqlalche.me/e/14/4xp6)
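
The message "database disk image is malformed" comes from SQLite itself and often points to a damaged file, for example an incomplete download. The following is a minimal sketch for checking a file with SQLite's built-in `PRAGMA integrity_check`, using Python's standard `sqlite3` module; the function name `is_healthy` is hypothetical.

```python
import sqlite3


def is_healthy(dbfile: str) -> bool:
    """Return True if SQLite's own integrity check reports "ok"."""
    conn = sqlite3.connect(dbfile)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised for "file is not a database" or a badly damaged image.
        return False
    finally:
        conn.close()
    # A healthy database yields exactly one row containing the string "ok".
    return rows == [("ok",)]
```

If the check fails, downloading the database file again is a reasonable first step.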
