bigdata_stack's People

Contributors

hungunicorn, johannestang

bigdata_stack's Issues

great work, how to improve?

Hi @johannestang, thanks for the quick-start repo you have made!

I am also thinking of setting up a big data stack for a Docker cluster, possibly using Helm charts for k8s.

Before I start testing with your stack, I would like to ask a few questions to understand which direction to take when making possible improvements:

  • First off, what do you think should be improved first in this project? Do you have a roadmap in mind?
  • Secondly, if I understand correctly, the Hive service is only needed as a requirement for Presto to run SQL, right? Is it possible to get rid of Hive if we only want to use Presto / Impala, or is that not currently possible?
  • Thirdly, in your blog post you state that "There are of course many other interesting big data SQL engines, e.g. Impala, Spark SQL, and Drill. For background on these (and more) have a look at this great post." Does this mean that if one wants to use Spark (and thus Spark SQL), one can use Hive and remove Presto from the stack, or would you still recommend connecting Spark SQL to Presto to run queries?

Thanks for the clarification.

P.S. Have you published any newer blog posts since 2019? Let me know.
