Comments (3)

kodxana commented on August 18, 2024

I would say look at how YaCy does that. It would also be nice to be able to specify which websites to crawl.

from future.

rtrevinnoc commented on August 18, 2024

Hello!

First of all, thank you very much for your interest. I am really excited!
Regarding Docker: I had not thought about it, but I will add support over the next few days so that you and anyone else can run your own server in a container.

I appreciate your suggestion <3

rtrevinnoc commented on August 18, 2024

Hi!

I have added a Dockerfile to the repo (commit 7259297) that builds an Ubuntu-based image for running FUTURE in a container. It copies the files from the downloaded repository into the container and sets everything up to start crawling.

As for specifying which sites to crawl, that can be partially accomplished with the SEED_URLS variable in the config.py file. However, the crawler will follow links onto other sites, so over the next few days I will add a way to limit which domains are crawled.
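
To make the idea concrete, here is a minimal sketch of what such a domain limit could look like. It is not the actual FUTURE code: only SEED_URLS comes from config.py, while ALLOWED_DOMAINS, is_allowed and filter_links are hypothetical names, and the example URLs are placeholders.

```python
from urllib.parse import urlparse

# SEED_URLS is the variable mentioned above; the values are placeholders.
SEED_URLS = ["https://example.org/", "https://docs.example.org/start"]

# Hypothetical setting (an assumption, not an existing FUTURE option):
# the crawler would only follow links whose host falls under these domains.
ALLOWED_DOMAINS = {"example.org"}


def is_allowed(url, allowed_domains=ALLOWED_DOMAINS):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def filter_links(links, allowed_domains=ALLOWED_DOMAINS):
    """Keep only the links that stay inside the allowed domains."""
    return [link for link in links if is_allowed(link, allowed_domains)]
```

With a check like this applied to every extracted link, the crawler would still start from SEED_URLS but would drop links that lead onto other sites instead of following them.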

Thank you very much for your interest <3.
