
Comments (14)

junmo2 avatar junmo2 commented on August 15, 2024

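At a glance, the cause is that the format_context method in __init__.py reuses the same context dict object on every iteration, so the array ends up holding the same object each time and every result is identical. Moving the context assignment into the for loop below, so that each iteration gets a new OrderedDict object, fixes it.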

The simplest change is to move

        context = OrderedDict()

        for each in header:
            context[each] = ''

inside this loop, before the context.update(tmp) call:

        for data in query_result:
            for row in data['data']['resultsData']:
                tmp = dict(zip(header, row[1:]))
                context.update(tmp)

The result is:

        header = '项目名称 批准号 项目类别 依托单位 项目负责人 资助经费(万元) 批准年度 关键词 是否结题 研究成果(期刊论文;会议论文;著作;奖励;专利) 依托单位ID 项目负责人ID 项目类别代码 申请代码 起止年月'.split()
  

        for data in query_result:
            for row in data['data']['resultsData']:
                tmp = dict(zip(header, row[1:]))
				context = OrderedDict()

                for each in header:
                    context[each] = ''   
                context.update(tmp)

                conclusion_context = {}
                if context['是否结题'] == 'true':
                    conclusion_context = self.conclusion_project(context['批准号'])
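
To illustrate why the rows come out identical, here is a minimal standalone sketch. It does not use the nsfc code itself: the column names, the sample rows, and the two helper functions are hypothetical, and it assumes the original method collects each context into a list, as the description above suggests.

```python
from collections import OrderedDict

header = ['name', 'approval_no']       # hypothetical column names
rows = [['A', '001'], ['B', '002']]    # hypothetical result rows


def collect_shared(header, rows):
    """Buggy pattern: the dict is created once, outside the loop."""
    context = OrderedDict((key, '') for key in header)
    results = []
    for row in rows:
        context.update(zip(header, row))
        results.append(context)  # appends a reference to the same object
    return results


def collect_fresh(header, rows):
    """Fixed pattern: a new dict is created for every row."""
    results = []
    for row in rows:
        context = OrderedDict((key, '') for key in header)
        context.update(zip(header, row))
        results.append(context)
    return results


shared = collect_shared(header, rows)
fresh = collect_fresh(header, rows)

print(shared[0] is shared[1])        # True: both entries are one object
print([dict(c) for c in shared])     # every entry shows the last row
print([dict(c) for c in fresh])      # each entry keeps its own row
```

In the shared version the data is read correctly on each pass, but every list entry points at one dict, so all entries end up showing the values of the last row.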

from nsfc.

ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024

I tried it, and this change does not solve the problem. The problem does not seem to be here.


NothingOffice avatar NothingOffice commented on August 15, 2024

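Quoting ChrisLiang2020: "I tried it, and this change does not solve the problem. The problem does not seem to be here."

I tried it and it works. The indentation of the context = OrderedDict() line above needs to be adjusted; after that there are no more duplicate values.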


ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024
    for data in query_result:
        for row in data['data']['resultsData']:
            tmp = dict(zip(header, row[1:]))
            context = OrderedDict()

            for each in header:
                context[each] = ''
            context.update(tmp)

            conclusion_context = {}
            if context['是否结题'] == 'true':
                conclusion_context = self.conclusion_project(context['批准号'])

Then run pip setup.py install in the DOS command prompt. Is that right?


NothingOffice avatar NothingOffice commented on August 15, 2024

I opened __init__.py directly and replaced this block of code. I did not run the install again.


ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024

It ran successfully. In the DOS command prompt, change into the nsfc-master directory and run: pip install .


NothingOffice avatar NothingOffice commented on August 15, 2024

It is the file named __init__.py.


ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024

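Running that file does not output the crawled data. It outputs the sub-categories of the code category you enter, and that result was already correct to begin with.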


NothingOffice avatar NothingOffice commented on August 15, 2024

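Quoting ChrisLiang2020: "Running that file does not output the crawled data. It outputs the sub-categories of the code category you enter, and that result was already correct to begin with."

Of course it doesn't. After replacing this .py file, the crawl only happens when you enter the query command at the command prompt. The author provides examples of this.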


ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024

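Then how did you determine that the duplicate-data problem was solved, if you had not entered the query command to crawl anything? And running the __init__.py inside the package does not let you enter a command to crawl data.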


junmo2 avatar junmo2 commented on August 15, 2024

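Quoting ChrisLiang2020: "Then how did you determine that the duplicate-data problem was solved, if you had not entered the query command to crawl anything? And running the __init__.py inside the package does not let you enter a command to crawl data."

It is not crawling the data repeatedly, man... You can modify the code yourself and print the values to check. The crawled results are correct; the only problem is that the same OrderedDict() object is reused...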


junmo2 avatar junmo2 commented on August 15, 2024
    for data in query_result:
        for row in data['data']['resultsData']:
            tmp = dict(zip(header, row[1:]))
            context = OrderedDict()

            for each in header:
                context[each] = ''
            context.update(tmp)

            conclusion_context = {}
            if context['是否结题'] == 'true':
                conclusion_context = self.conclusion_project(context['批准号'])

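Quoting ChrisLiang2020: "Then run pip setup.py install in the DOS command prompt. Is that right?"

If you reinstall after making the change, the install overwrites your modification, man.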
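
Whether an edit takes effect depends on which copy of the package Python actually imports. As a rough sketch, the standard library can report that path. The helper name module_origin is hypothetical, and json is used here only because it ships with Python.

```python
import importlib.util


def module_origin(name):
    """Return the file Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# 'json' stands in for the package being checked.
print(module_origin('json'))
```

Calling module_origin('nsfc') on a machine where the package is installed should show whether the edited __init__.py is the one being loaded.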


ChrisLiang2020 avatar ChrisLiang2020 commented on August 15, 2024

I know, and I got it running. What I meant is that you have to install before you can crawl data and find out whether the problem is fixed.


suqingdong avatar suqingdong commented on August 15, 2024

@ChrisLiang2020 @junmo2 @NothingOffice

The bug has been fixed. You can reinstall with:

pip install -U nsfc==1.0.4

