
anyfly8's Projects

akka-crawler

A simple web crawler based on Scala and the Akka framework

android-auto-repeat-button

Android user-interface controls that repeat their associated click actions so long as the user is touching them.

android-crawler

An Android web crawler built on Jsoup that scrapes listings of university campus recruiting talks from the Haitou website (海投网).

arachnez

A Scala crawler framework that uses Akka actors.

crypto-js

Automatically exported from code.google.com/p/crypto-js

distributecrawler

A Map/Reduce-based crawler that extracts article body text from major news sites, then classifies and clusters it.

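distributed_spider_pku_java

1. The project has three main modules: a crawler module, a data processing module, and a user module.
2. The crawler module collects football news and user comments about football from Zhibo8 (直播吧), Sina Sports, and NetEase Sports. It fetches pages on a Hadoop cluster, analyzes them to build a URL set, and extracts the feature URLs.
3. Linux scripts filter the fetched pages to obtain the raw pages; a second filtering pass extracts the text, which is kept in distributed storage.
4. The processing module derives a word segmenter from rule 1 and rule 2 of the training set, then runs it over the text to produce the training results.
5. A feature script classifies the feature words in the training results, from which a fuzzy set of teams and a fuzzy set of star players are extracted.
6. Further filtering yields an exact set of teams and an exact set of star players, which are stored in a MySQL database.
7. Player and team information is read from the database for chart analysis, wiki information is displayed dynamically, and the results are passed to the display module for interaction with the user.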
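The URL handling in step 2 (build a URL set from fetched pages, then keep only the feature URLs) can be sketched as follows. This is a minimal single-machine illustration using only the Python standard library, not the project's Hadoop code. The sample HTML, the `example.com` host, and the URL pattern are all hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute link targets from the <a href> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.urls.add(urljoin(self.base_url, value))


def extract_url_set(html, base_url):
    """Return the set of absolute URLs linked from one page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.urls


def feature_urls(urls, pattern):
    """Keep only URLs matching the (assumed) feature pattern, sorted."""
    regex = re.compile(pattern)
    return sorted(u for u in urls if regex.search(u))


# Hypothetical page content and pattern, for illustration only.
SAMPLE_HTML = """
<a href="/football/news/1001.html">Match report</a>
<a href="/basketball/news/2002.html">Other sport</a>
<a href="https://example.com/football/news/1002.html">Transfer news</a>
<a href="/about">About</a>
"""

url_set = extract_url_set(SAMPLE_HTML, "https://example.com/")
football = feature_urls(url_set, r"/football/news/\d+\.html$")
print(football)
```

In a cluster setting, `extract_url_set` would presumably run per page in the map phase and the set union in the reduce phase, but that split is an assumption rather than something the description states.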

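distributedcrawler

A distributed crawler project from the Gao Ying (高英) lab at South China University of Technology. Apart from lab members, no one may distribute it without permission.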

dzikka

Dzikka: a web crawler implemented using the Akka actor framework

egg

A general-purpose crawler.

ferrit

Ferrit is a web crawler service written in Scala using Akka, Spray and Cassandra.

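guozhongcrawler

GuozhongCrawler is an open-source crawler framework that needs no configuration and is easy to extend. It offers a simple, flexible API, so a crawler can be implemented with only a small amount of code. Its design draws on a survey of several crawler frameworks from China and abroad. It is fully modular and covers the whole crawler life cycle (link extraction, page download, content extraction, persistence). It supports multithreaded and distributed crawling, along with automatic retries, custom JavaScript execution, custom cookies, and similar features. To handle IP bans after a site has been crawled repeatedly, GuozhongCrawler rotates IPs dynamically to keep them from being blocked. All source comments and log output are written in plain, easy-to-understand Chinese, so that beginners can gain a deeper understanding.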
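The life cycle named above (link extraction, page download, content extraction, persistence), together with automatic retry and IP rotation, can be sketched in a few lines. This is an illustrative Python sketch and not GuozhongCrawler's actual API. Every name here (`FlakyDownloader`, `download_with_retry`, `crawl`, the `site.test` pages, the proxy labels) is hypothetical, and the downloader is an in-memory stand-in that fails once per URL so the retry path is exercised without network access.

```python
import itertools
import re


class FlakyDownloader:
    """In-memory stand-in for a downloader; the first attempt per URL fails."""

    def __init__(self, pages):
        self.pages = pages
        self.attempts = {}

    def fetch(self, url, proxy):
        self.attempts[url] = self.attempts.get(url, 0) + 1
        if self.attempts[url] == 1:
            raise ConnectionError(f"simulated block via {proxy}")
        return self.pages[url]


def download_with_retry(downloader, url, proxies, max_retries=3):
    """Page download with automatic retry; each attempt takes the next proxy."""
    last_error = None
    for _ in range(max_retries):
        proxy = next(proxies)
        try:
            return downloader.fetch(url, proxy)
        except ConnectionError as err:
            last_error = err
    raise last_error


def extract_links(html):
    """Link extraction: collect every href value on the page."""
    return re.findall(r'href="([^"]+)"', html)


def extract_title(html):
    """Content extraction: take the page title as the scraped content."""
    match = re.search(r"<title>(.*?)</title>", html)
    return match.group(1) if match else None


def crawl(seed, downloader, proxy_list):
    """Breadth-first crawl; a dict stands in for the persistence layer."""
    proxies = itertools.cycle(proxy_list)  # rotate through the proxies
    queue, seen, store = [seed], {seed}, {}
    while queue:
        url = queue.pop(0)
        html = download_with_retry(downloader, url, proxies)
        store[url] = extract_title(html)
        for link in extract_links(html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return store


# Hypothetical three-page site, for illustration only.
PAGES = {
    "http://site.test/": '<title>Home</title><a href="http://site.test/a">A</a>'
                         '<a href="http://site.test/b">B</a>',
    "http://site.test/a": '<title>Page A</title><a href="http://site.test/">home</a>',
    "http://site.test/b": "<title>Page B</title>",
}

downloader = FlakyDownloader(PAGES)
result = crawl("http://site.test/", downloader, ["proxy-1", "proxy-2"])
print(result)
```

Keeping each stage as a separate function mirrors the modular design the description mentions: any one stage can be swapped (for example, a real HTTP client in place of `FlakyDownloader`) without touching the others.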

hive

A simple general-purpose crawler.

hosts

🗽 The latest working Google hosts file.

kanon

A configurable crawler tool with automatic scheduling.

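luckymoney

An Android project meant to help slow-fingered users grab WeChat red envelopes. It grabs WeChat red envelopes automatically, and the minimum supported version is Android 4.1.2. It works by watching notification-bar messages to detect a red envelope, automatically opening the WeChat chat list, and using AccessibilityService to simulate manual taps, so red envelopes are grabbed within seconds. Good news for the slow-fingered!!!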
