
junsonchen's Projects

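2020_university-scholar-discovery-system

Technical categories: Web, data mining, data analysis. Basic functionality: uses web crawling to find and download information on experts, professors, and other scholars from the public websites and resources of major universities in China and abroad, and builds a profiling system for these scholars' research directions. Users can view and compare scholars' research fields along dimensions such as university, major, discipline, papers, and research direction. Basic modules: expert data crawling, paper information crawling, profile-based information extraction, expert search engine, research direction extraction, topic circle discovery in an expert's ego network, data visualization. Difficulty: medium. Extended functionality: expert profiling system. Difficulty: hard.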

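administrative-divisions-of-china

Administrative divisions of the People's Republic of China: province level (provinces, municipalities, autonomous regions), prefecture level (cities), county level (districts and counties), township level (townships, towns, subdistricts), and village level (village and neighborhood committees). A Node.js crawler for Chinese province/city/district/town/village address data, suitable for two- to five-level linked (cascading) address selection.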

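china_stock_announcement

This project uses Python scripts to fetch Chinese stock market (sz, sh) announcements, from both listed companies and regulators, from the servers of Juchao Network (巨潮网络). It stores the announcement information in a database, downloads the announcement files locally, and supports querying and reading them through a web page.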

court_project

Crawler for the People's Court Announcement Network (人民法院公告网): fetches all data except bankruptcy documents.

dxy-covid-19-crawler

COVID-19/2019-nCoV Realtime Infection Crawler and API

nyspider

Assorted crawlers: Dianping, Anjuke, 58, Renrendai, Paipaidai, ITjuzi, Lagou, Douban, SouFun, ASO100, weather data, Maoyan Movies, Lianjia, PM25.in...

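parselawdocuments

Runs a series of analyses on collected legal documents, including automatic segmentation according to document conventions, case similarity computation, case clustering, and recommendation of legal provisions. The experiments are currently based on marriage cases and can be extended to other domains.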

pdf-analysis

Parses PDFs and extracts data tables and text from them.

pdfkp

A tool that extracts the passages of a PDF in which a keyword appears.
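
As a rough illustration of keyword-based passage extraction, here is a minimal sketch. It is not taken from pdfkp: the function name `find_passages` is hypothetical, and it assumes the PDF text has already been extracted to a plain string in which paragraphs are separated by blank lines.

```python
import re


def find_passages(text: str, keyword: str) -> list[str]:
    """Return the paragraphs of `text` that contain `keyword` (case-insensitive)."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    needle = keyword.casefold()
    return [p for p in paragraphs if needle in p.casefold()]
```

Real PDF text rarely has clean paragraph breaks, so an actual tool would likely need a more robust segmentation step before matching.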

pkulaw_spider

Crawls legal documents (law documents) from PKULaw (北大法宝, http://www.pkulaw.cn/Case/) month by month, about 20 million documents in total. For research use only.
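
Month-by-month crawling typically means splitting the target period into calendar-month windows and issuing one batch of queries per window. The sketch below only shows that windowing step; the helper name `month_windows` is hypothetical and is not taken from pkulaw_spider, and how the site actually accepts date filters is not covered here.

```python
from datetime import date, timedelta


def month_windows(start: date, end: date):
    """Yield (first_day, last_day) for each calendar month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        first = date(year, month, 1)
        # The last day of a month is the day before the first day of the next month.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        last = date(next_year, next_month, 1) - timedelta(days=1)
        yield first, last
        year, month = next_year, next_month
```

Each window can then be crawled and checkpointed independently, which makes it easier to resume a long-running job after a failure.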

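python-for-data-mining

This resource supports the author's CSDN articles on data mining and data analysis with Python. It mainly contains Python implementations of data mining, machine learning, and text mining algorithms. We hope you find it helpful.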

ricf

Research Infrastructure of Chinese Foundations: a foundational research database on Chinese foundations.

sinafinancespider

Crawls Sina Finance (http://finance.sina.com.cn/stock/) for the daily announcements of each listed company, as corpus material for stock analysis.

spider-text-message

Uses Python to crawl text content from across the web, including Sohu, Sina, Ifeng, People, Tencent, NetEase, Xinhua, and CCTV.
