I am Blanker.
- 🇺🇸English
- 🇨🇳Chinese
- TypeScript / JavaScript
- Solidity
- Python 3
COVID-19/2019-nCoV Infection Time Series Data Warehouse
Home Page: https://lab.isaaclin.cn/nCoV/
License: MIT License
Was this value left blank for some indescribable reason?
undefined:475
"countryName": "法属波,
^
SyntaxError: Unexpected token
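For downstream users who hit this, Python's json module reports the exact position of such a malformed record. A minimal sketch reproducing a missing-quote failure of this kind (the snippet below is a fabricated stand-in, not the actual line 475 of DXYArea.json):

```python
import json

# Hypothetical record mirroring the report above: the closing quote of the
# "countryName" value is missing, so parsing fails partway through.
broken = '{"countryName": "法属波, "countryEnglishName": "French Polynesia"}'

try:
    json.loads(broken)
except json.JSONDecodeError as e:
    # e.lineno / e.colno point at the offending token, which is how an
    # "undefined:475"-style error can be traced back to a specific line.
    print(f"JSON error at line {e.lineno}, column {e.colno}: {e.msg}")
```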
so that the English name of 山西 (Shanxi) can be distinguished from 陕西 (Shaanxi)
see http://www.stats.gov.cn/tjsj/ndsj/2019/indexeh.htm
Feedback on the DXYArea data
The Feb 14 Wuhan death data has a row with 1124, which interferes with data cleaning (I use each day's maximum value when compiling statistics, so this row is very disruptive):
湖北省,武汉,51986,0,4131,1426,35991,0,2286,1124,2020-02-14 08:10:27.048
It should be 1106.
Wuhan's cured count on Feb 2 has a 252, which is also wrong:
湖北省,武汉,9074,0,215,294,4109,0,252,224,2020-02-02 18:23:15.451
Also, why not crawl the daily new-case figures as well, instead of only the cumulative totals? Subtracting the previous day from the next day does yield the increments, but because entries are sometimes written off, that calculation is not always accurate: the cumulative count on a later day can even be lower than the day before, producing negative increments that distort statistics and trend analysis.
I wrote a dedicated script just to derive the increments; it would be great if they could be crawled directly.
Thanks to the author for the effort.
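The increment calculation described above can be sketched with pandas; the column names here are illustrative, not necessarily the repository's actual schema:

```python
import pandas as pd

# Illustrative cumulative confirmed counts for one city (column names are
# assumptions, not the repository's actual schema).
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-02-01", "2020-02-02",
                            "2020-02-03", "2020-02-04"]),
    "city_confirmedCount": [100, 150, 140, 180],  # note the drop on 02-03
})

# New cases = today's cumulative total minus yesterday's.
df["new_confirmed"] = df["city_confirmedCount"].diff()

# Retroactive write-offs can make the cumulative total shrink, yielding
# negative "new" counts; clamp them to zero for trend plots.
df["new_confirmed_clamped"] = df["new_confirmed"].clip(lower=0)
print(df[["date", "new_confirmed", "new_confirmed_clamped"]])
```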
Currently the json/csv directories only contain the latest data.
Could the data be stored by date under the json/csv directories? Combined with a git CDN, this would relieve server pressure, save the author's server costs, and make the data more convenient to use.
"3795" 应该是多了个 3。
'https://img1.dxycdn.com/2020/0208/356/3395436496692894611-135.png', 'https://img1.dxycdn.com/2020/0208/599/3395474215095538530-135.png', 'https://img1.dxycdn.com/2020/0208/502/3395474230127927756-135.png', 'https://img1.dxycdn.com/2020/0208/704/3395474279520515356-135.png', 'https://img1.dxycdn.com/2020/0208/629/3395474292405418005-135.png']",,,36833,27657,2602,3795,6101.0,,,,,,该字段已替换为说明1,易感人群:人群普遍易感。老年人及有基础疾病者感染后病情较重,儿童及婴幼儿也有发病,潜伏期:一般为 3~7 天,最长不超过 14 天,潜伏期内可能存在传染性,其中无症状病例传染性非常罕见,宿主:野生动物,可能为中华菊头蝠,,,病毒:新型冠状病毒 2019-nCoV,传染源:新冠肺炎的患者。无症状感染者也可能成为传染源。,传播途径:经呼吸道飞沫、气溶胶传播、接触传播是主要的传播途径。消化道等传播途径尚待明确。,疑似病例数来自国家卫健委数据,目前为全国数据,未分省市自治区等,,[],2020-02-09 08:10:06.607
Thank you for your support.
Recently, I found that Ding Xiang Yuan's data was being updated abnormally: the createTime and modifyTime fields of a large amount of overseas data and a small amount of mainland China data change even when the case numbers do not. As a result, foreign data were duplicated several times in the database, with the only differences between those duplicates being createTime and modifyTime.
I suspect that the createTime and modifyTime fields change whenever the data of any country/province/city is modified by Ding Xiang Yuan, which causes this problem. Therefore, in the last two updates of the real-time crawler, ced5fda and 540ae98, these two fields have been removed, and similar problems will not occur in the future.
At the same time, for the historical data, I deleted the duplicate entries. The deletion logic: entries identical in every field other than createTime and modifyTime are treated as duplicates, but if the operator who entered the data is different, both entries are retained even when the numbers are the same. 12,716 documents were removed in total.
In the latest update d166029 and future updates, duplicate entries will no longer be retained. If you would like to backtrack duplicated entries, you can check out c8d6947 and earlier data.
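The deduplication logic above can be sketched with pandas: rows matching on every field except createTime and modifyTime are collapsed, while a differing operator keeps both. Column names and values here are illustrative, not the actual database schema.

```python
import pandas as pd

# Illustrative records: rows 0 and 1 differ only in createTime/modifyTime,
# while row 2 has a different operator (all names are assumptions).
df = pd.DataFrame({
    "countryName": ["Bhutan", "Bhutan", "Bhutan"],
    "confirmedCount": [1, 1, 1],
    "operator": ["alice", "alice", "bob"],
    "createTime": [1581200000, 1581210000, 1581220000],
    "modifyTime": [1581200000, 1581210000, 1581220000],
})

# Duplicate key = every column except the two volatile timestamp fields;
# operator stays in the key, so entries by different operators survive.
key_cols = [c for c in df.columns if c not in ("createTime", "modifyTime")]
deduped = df.drop_duplicates(subset=key_cols, keep="first")
print(len(deduped))  # rows 0 and 2 remain
```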
Someone raised this before and the issue was closed, but the problem has not been solved. Why is that?
Not consistent with the data in other files, and it's also not correct right now.
"countryEnglishName": "United Kiongdom",
should actually be
United Kingdom
Almost no records contain any "suspected" cases (province_suspectedCount); the suspected column is nearly always 0, and the occasional nonzero value is almost always 1 or 2, far fewer than Sina's and other sources' figures.
Does Ding Xiang Yuan simply not publish suspected counts?
Can you add a data timeline api to check the tendency of the virus?
The "confirmedCount" value for
https://lab.isaaclin.cn/nCoV/api/area?provinceEng=China AND
https://lab.isaaclin.cn/nCoV/api/overall?latest=0
is the same. Please have a look at it!
Hi,
There's a double quote missing in DXYArea.json line 475.
Adrien
Thank you for your project. I have two questions.
Previously I analyzed the data obtained from the API, but API access from outside China seems unstable, so I would now like to add a way to read the csv files in this data warehouse instead. To stay consistent with the data returned by the API, would it be possible, in the area file,
Thanks
新疆维吾尔自治区,Xinjiang,兵团第八师石河子市,"Shihezi, Xinjiang Production and Construction Corps 8th Division",70,0,10,1,3,0,0,1,2020-02-15 22:08:57.299
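Note that the English city name in the row above contains a comma and is therefore double-quoted in the CSV; splitting the line on ',' naively would misalign the columns, so a real CSV parser is needed:

```python
import csv
import io

# The row reported above: the English city name contains a comma, so it is
# quoted and must be parsed with csv.reader rather than str.split(',').
row = ('新疆维吾尔自治区,Xinjiang,兵团第八师石河子市,'
       '"Shihezi, Xinjiang Production and Construction Corps 8th Division",'
       '70,0,10,1,3,0,0,1,2020-02-15 22:08:57.299')

fields = next(csv.reader(io.StringIO(row)))
print(len(fields))   # 13 columns: the quoted name stays one field
print(fields[3])     # comma intact inside the name
```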
Hi,
I created a Python Analysis repo here:
https://github.com/jianxu305/nCov2019_analysis
Could you please add this to reference? Thanks.
Hello, developer. I found that the per-province data under your CSV folder is missing Hong Kong, Taiwan, and Macau, although the API test does return them. Looking forward to your reply, thank you.
Ding Xiang Yuan shows 6 cured, but here it is 4.
Hi,
Thank you for sharing this dataset. But it would be great if you can provide the dataset with the names of cities and provinces in English.
Thank you,
Best Regards,
Mahasen
Hello!
Thank you for your contribution to Wuhan and the open source community. Your project has been listed in OpenSourceWuhan (武汉开源); if you have any questions, please open an issue.
OpenSourceWuhan (武汉开源)
collects the open source projects supporting Wuhan and serves as an entry point linking them together.
And some data are mixed and invalid, like:
Just search 'Bhutan' to take a look; there are 3 results.
Failed to connect to your database; an error comes up with "getaddrinfo failed", which I think might be due to an improper URI? Your kind help is very much appreciated!
Hello, this is my first time using GitHub. How do I download the data?
Hello, thank you very much for providing such rich data on this platform.
We are planning to use the COVID-19 data for research. May we use the data you provide directly? Do you have any other requirements? Please let us know, thank you!
I am trying to add some analysis code to the project. Can you please review the pull request?
If you feel ok to merge that into master, then I can directly contribute to this project this way. Please let me know if you have any concerns. Thanks.
DTdxy <- read.csv("https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/csv/DXYArea.csv", header = TRUE, stringsAsFactors = FALSE)
It raises an error:
Error in file(file, "rt") :
cannot open the connection to 'https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/csv/DXYArea.csv'
With a warning message:
Warning message:
In file(file, "rt") :
URL 'https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/csv/DXYArea.csv': status was 'Couldn't connect to server'
And when I copy the https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/csv/DXYArea.csv to browser, the connection cannot be opened either.
Hello, the newly generated csv data format seems inconsistent: sometimes the updateTime field is in the middle, sometimes at the end, and when it is in the middle the counts become floating-point. Are two scheduled tasks both processing it?
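On the floating-point counts: this is a common pandas artifact rather than necessarily a crawler bug. Any missing value in an integer column upcasts it to float64; reading with pandas' nullable Int64 dtype keeps it integral. A minimal sketch, with an assumed column name:

```python
import io

import pandas as pd

# Illustrative CSV with one missing count (column name is an assumption).
csv_text = "city,confirmedCount\nWuhan,51986\nUnknown,\n"

# Default read: the empty cell makes confirmedCount float64 (51986.0).
df_float = pd.read_csv(io.StringIO(csv_text))
print(df_float["confirmedCount"].dtype)  # float64

# Pandas' nullable Int64 dtype keeps the column integral despite the gap.
df_int = pd.read_csv(io.StringIO(csv_text), dtype={"confirmedCount": "Int64"})
print(df_int["confirmedCount"].dtype)  # Int64
```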
Hello, I noticed that the suspected data (suspectedCount) is always 0. Can this be handled?
Hello,
I think the json data could cover all records, like the csv data does; alternatively, two versions could be provided, one latest=0 and one latest=1. Currently the json version only contains the latest data.
The csv fields you provide are not as complete as the json ones; for example, the country-level country* fields are missing. So anyone who needs the full data has to use the api, which invisibly adds load on it. If the json files also contained all the data, users could derive whatever format or fields they need directly from the json instead of hitting the api, reducing its load.
Currently there is only Macau's.
The Feb 3 CSV data contains logically duplicated entries with cityName "南阳" and "南阳(含邓州)", "商丘" and "商丘(含永城)", etc. The effect is that counts aggregated at the province level will be too large for Feb 3.
If this project doesn't clean this data, it would probably be better to notify users so they can be aware of it. Thanks.
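Until the feed is cleaned upstream, users can drop the combined "(含…)" rows before aggregating to the province level to avoid double counting; a sketch with illustrative column names and numbers:

```python
import pandas as pd

# Illustrative Feb 3 rows: "南阳(含邓州)" already includes 南阳, so summing
# both double-counts (column names and values are assumptions).
df = pd.DataFrame({
    "provinceName": ["河南省"] * 3,
    "cityName": ["南阳", "南阳(含邓州)", "商丘"],
    "city_confirmedCount": [100, 130, 50],
})

# Drop the combined "(含...)" rows before the province-level aggregation.
clean = df[~df["cityName"].str.contains("含")]
print(clean["city_confirmedCount"].sum())  # 150 instead of 280
```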
As titled: currently there is only real-time data. Could daily province/city data be provided?
May I boldly ask:
1. Where can I find the cumulative confirmed count as of January 29, 2020 (any time between 10:00 and 23:00 works)?
2. Where can I find the cumulative confirmed count as of January 30, 2020 (any time between 10:00 and 23:00 works)?
Thanks a lot!
Thanks
For the time-series epidemic data: what is the data type of the updateTime field in /nCoV/api/area, and how should it be parsed?
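If updateTime is a Unix epoch timestamp in milliseconds (an assumption worth verifying against the API documentation), it parses like this:

```python
from datetime import datetime, timezone

# Assuming updateTime is a Unix epoch in milliseconds (verify against the
# API documentation); the value below is an illustrative sample.
update_time_ms = 1581725337299

# Divide by 1000 to get seconds, then convert to an aware UTC datetime.
dt = datetime.fromtimestamp(update_time_ms / 1000, tz=timezone.utc)
print(dt.isoformat())
```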
Hello, I am a contributor at Wuhan2020, and I've recently been using the data you crawl to make charts.
Would it be possible to also sync json-format files into this repository?
https://lab.isaaclin.cn/nCoV/ seems to be under heavy load; I've been hesitant to use it much and only update once a day, but now it can't be reached at all... Having json files in this repository might relieve the api access pressure. Having
https://lab.isaaclin.cn/nCoV/api/area
https://lab.isaaclin.cn/nCoV/api/area?latest=0
https://lab.isaaclin.cn/nCoV/api/overall?latest=0
would, I think, cover most applications.
Thanks!