SearchEngine

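This project follows on from the crawler project FirstSpider, which collects data from Jobbole (伯乐在线), Zhihu and Lagou. It builds a search website with Django and uses the open-source search engine ElasticSearch for advanced search.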

Search suggestions: the front end and back end communicate through AJAX calls made with jQuery.
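As a rough illustration of what a suggestion endpoint might send to ES and return to the jQuery caller, here is a minimal sketch built on the ES completion suggester. It is not taken from the project: the suggester name `title_suggest`, the field name `suggest`, and both helper functions are assumptions made for the example.

```python
def build_suggest_body(prefix, field="suggest", size=10, fuzziness=2):
    """Build a completion-suggester body for POST <index>/_search.

    `field` must be a field mapped with type "completion"; the name
    "suggest" is hypothetical, as is the suggester name "title_suggest".
    """
    return {
        "suggest": {
            "title_suggest": {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    "size": size,
                    "fuzzy": {"fuzziness": fuzziness},
                },
            }
        }
    }


def extract_suggestions(response, name="title_suggest"):
    """Collect the suggested texts from an ES response, skipping duplicates."""
    suggestions = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            text = option.get("text")
            if text and text not in suggestions:
                suggestions.append(text)
    return suggestions
```

A Django view could serialize the list returned by `extract_suggestions` as JSON, and the jQuery callback would render it under the search box.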

Pagination of the query results is now handled in the template language.
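For the template to render pagination, the view still has to hand it a few numbers. The sketch below shows one way to derive them from the total hit count; the function name, the default page size of 10, and the dictionary keys are assumptions for illustration, not the project's actual code.

```python
import math


def page_context(total_hits, page, page_size=10):
    """Compute the values a template needs to render pagination links."""
    # Always expose at least one page, even for an empty result set.
    page_count = max(1, math.ceil(total_hits / page_size))
    # Clamp out-of-range page numbers instead of raising an error.
    page = min(max(1, page), page_count)
    return {
        "page": page,
        "page_count": page_count,
        "offset": (page - 1) * page_size,  # usable as the ES "from" value
        "has_previous": page > 1,
        "has_next": page < page_count,
    }
```

The template can then decide which links to show by testing `has_previous` and `has_next` and looping over the page numbers.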

Demo

[Screenshot: SearchEngine]

Development environment

Environment         Version
Language            Python 3.5.3
Back-end framework  Django 1.11
Search engine       ElasticSearch 5.1.1
IDE                 PyCharm 2017.1 x64

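Usage

1. Install the dependencies (the procedure differs between operating systems and versions; look up a platform-specific fix if the install fails):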

pip install -r requirements.txt

2. Download and install ElasticSearch-rtf 5.1.1 (see the last section), change to the ES root directory, and start ES from the command line:

./bin/elasticsearch.bat

3. Crawl data with the FirstSpider spiders and write it into ES. Elasticsearch-head can be used to manage the data (see its official documentation for usage).

4. From the project directory, run:

python manage.py runserver

5. Open the site in a browser:

127.0.0.1:8000
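Once the steps above are done, each search on the site comes down to a query against ES. The following is a hedged sketch of such a query using only the standard library. The index name passed to `search`, the field names `title`, `tags` and `content`, and the highlight tags are all hypothetical; the real names depend on the mappings FirstSpider creates.

```python
import json
from urllib.request import Request, urlopen


def build_search_body(keywords, page=1, page_size=10,
                      fields=("title^3", "tags", "content")):
    """Build a multi_match query with highlighting and from/size paging."""
    return {
        "query": {"multi_match": {"query": keywords, "fields": list(fields)}},
        "from": (page - 1) * page_size,
        "size": page_size,
        "highlight": {
            "pre_tags": ['<span class="keyWord">'],
            "post_tags": ["</span>"],
            "fields": {"title": {}, "content": {}},
        },
    }


def search(index, keywords, base_url="http://127.0.0.1:9200", **kwargs):
    """POST the query to <base_url>/<index>/_search and return the parsed JSON."""
    payload = json.dumps(build_search_body(keywords, **kwargs)).encode("utf-8")
    request = Request(
        "{}/{}/_search".format(base_url, index),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With ES running, a call such as `search("my_index", "python")` would return the raw response; the matching documents sit under `hits.hits`, and the total count under `hits.total`.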

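Using ElasticSearch on Windows

ElasticSearch is a search server built on Lucene (which is hard to use directly). It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface (HTTP). Elasticsearch is written in Java, released as open source under the Apache license, and is a popular enterprise search engine. It is designed for cloud environments and offers near real-time search; it is stable, reliable, fast, and easy to install and use (other search engines include Solr and Sphinx).

  • It became popular along with the ELK log analysis stack;
  • It behaves much like a NoSQL database, but its focus is search rather than data storage.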

1. ElasticSearch is written in Java, so install JDK 8 first and make sure the reported version is 1.8 or later:

java -version

2. Install ElasticSearch-RTF (a distribution that bundles Chinese word segmentation and a set of plugins):

git clone https://github.com/EagleChen/docker_elasticsearch_rtf.git

3. Edit the configuration file config/elasticsearch.yml:

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers: "X-Requested-With, Content-Type, Content-Length, X-User"

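4. Change to the ES root directory and start ES from the command line:

./bin/elasticsearch.bat

After a successful start, the following address responds in a browser:

127.0.0.1:9200

5. To make ES easier to work with, a few companion tools are recommended (their versions must match the ES version).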
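The startup check and the version note above can also be scripted. This sketch reads the JSON document that ES serves at its root URL and compares version numbers. Treating "matching" as "same major version" is an assumption of this example, as are the function names; check each tool's own compatibility notes.

```python
import json
from urllib.request import urlopen


def fetch_es_info(base_url="http://127.0.0.1:9200", timeout=5):
    """Fetch the JSON document ES serves at its root URL."""
    with urlopen(base_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def es_version(info):
    """Return the version string, e.g. "5.1.1", from the root document."""
    return info["version"]["number"]


def same_major_version(es_number, tool_number):
    """Assumption: a tool is compatible when the major versions are equal."""
    return es_number.split(".")[0] == tool_number.split(".")[0]
```

With ES running, `es_version(fetch_es_info())` should report the installed version; if ES is not reachable, `urlopen` raises `urllib.error.URLError`.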

Contributors

dependabot[bot], kyle-ip
