docker-hadoop's Introduction

Run Apache Bigdata projects in Kubernetes

Flokkr is a project to run Apache Bigdata projects.

It provides:

  • Ready-to-use Docker containers for running Hadoop, Ozone, Flink, and Spark.
  • Example cluster definitions and tests.
  • A powerful, composition-based approach to generating Kubernetes resource files for different use cases.
  • A toolset for running Bigdata projects in the containers.

Active repositories to check:

For more information, check https://github.flokkr.iot

docker-hadoop's People

Contributors

adoroszlai, elek, jzhuge

docker-hadoop's Issues

Example config doesn't work

Hi!

I'm using the docker-compose.yml from the example.

The namenode is failing. Here are the logs:

Attaching to hdfs_namenode
hdfs_namenode | Called launcher with command parameters: /opt/hadoop/bin/hdfs namenode
hdfs_namenode | Configuration type: simple
hdfs_namenode | hdfs-site.xml
hdfs_namenode | File hdfs-site.xml has been written out successfullly.
hdfs_namenode | log4j.properties
hdfs_namenode | File log4j.properties has been written out successfullly.
hdfs_namenode | Listening for transport dt_socket at address: 5005
hdfs_namenode | log4j:ERROR Could not find value for key log4j.appender.stdout
hdfs_namenode | log4j:ERROR Could not instantiate appender named "stdout".
hdfs_namenode | log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.server.namenode.NameNode).
hdfs_namenode | log4j:WARN Please initialize the log4j system properly.
hdfs_namenode | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Can you help me make it work, please?

/cc @frol
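
The two log4j:ERROR lines above usually mean that log4j.rootLogger names an appender ("stdout") for which no log4j.appender.stdout entry exists in the generated log4j.properties. The following is a minimal Python sketch, assuming the log4j 1.2 properties format, that lists such undefined appenders. The function name missing_appenders and the sample property values are illustrative and are not taken from the example's docker-compose.yml.

```python
def missing_appenders(properties_text):
    """Return appender names referenced by log4j.rootLogger but never defined."""
    props = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()

    root = props.get("log4j.rootLogger", "")
    # The first comma-separated item is the level; the rest are appender names.
    referenced = [part.strip() for part in root.split(",")[1:] if part.strip()]
    return [name for name in referenced
            if "log4j.appender." + name not in props]


# Hypothetical contents: the root logger points at "stdout" with no definition.
broken = "log4j.rootLogger=INFO, stdout\n"

# The same file with the appender defined (standard log4j 1.2 class names).
fixed = broken + (
    "log4j.appender.stdout=org.apache.log4j.ConsoleAppender\n"
    "log4j.appender.stdout.layout=org.apache.log4j.PatternLayout\n"
)

print(missing_appenders(broken))
print(missing_appenders(fixed))
```

If the check reports "stdout", adding the appender definition to whatever source the container uses to generate log4j.properties is one possible fix. The example's actual configuration is not shown in this report, so this remains an assumption.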

Upgrade to hadoop 3.3.4

Hadoop 3.3.4 was released on August 8, 2022. We would like to move to this version to potentially mitigate vulnerabilities found in Hadoop 3.3.1 and in the JAR versions it bundles.

yarn logs error

Thank you very much for providing such a good project.
However, when I used it, yarn logs reported an error:
[hadoop@yarn-resourcemanager-1-0 bin]$ yarn logs -applicationId application_1640440677063_0006
2021-12-25 14:21:56 INFO RMProxy:134-Connecting to ResourceManager at yarn-resourcemanager-1-0.yarn-resourcemanager-1.default.svc.cluster.local/10.2.1.1:8032
Can not find any log file matching the pattern: [ALL] for the application: application_1640440677063_0006
Can not find the logs for the application: application_1640440677063_0006 with the appOwner: hadoop

Any good suggestions?
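
One common cause of this message, which the report above does not confirm, is that YARN log aggregation is disabled. yarn.log-aggregation-enable defaults to false, and yarn logs reads aggregated logs, which are written only after the application finishes. The Python sketch below, under that assumption, renders the two relevant properties in the Hadoop configuration XML layout. The helper name to_hadoop_xml is hypothetical, and how the resulting yarn-site.xml reaches the containers depends on the deployment.

```python
import xml.etree.ElementTree as ET


def to_hadoop_xml(properties):
    """Render a dict as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


yarn_site = to_hadoop_xml({
    # Off by default; must be true for `yarn logs` to find aggregated logs.
    "yarn.log-aggregation-enable": "true",
    # Directory (on the default file system) where NodeManagers upload logs.
    # /tmp/logs is the stock default, repeated here only for visibility.
    "yarn.nodemanager.remote-app-log-dir": "/tmp/logs",
})
print(yarn_site)
```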

Failed to start the KeyspaceManager

Hi,
I tried to start up the Hadoop/Ozone cluster as explained here:
https://cwiki.apache.org/confluence/display/HADOOP/Getting+Started+with+docker

But I got the following error when trying to start:

......
ksm_1       | STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r 056a9783337f7e384f651cf86b30abf995d1ead8; compiled by 'elek' on 2017-09-27T19:10Z
ksm_1       | STARTUP_MSG:   java = 1.8.0_121
ksm_1       | ************************************************************/
ksm_1       | 2017-09-29 22:03:33 INFO  KeySpaceManager:51 - registered UNIX signal handlers for [TERM, HUP, INT]
ksm_1       | 2017-09-29 22:03:34 INFO  CallQueueManager:84 - Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 20000 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
ksm_1       | 2017-09-29 22:03:34 INFO  Server:1067 - Starting Socket Reader #1 for port 9862
ksm_1       | 2017-09-29 22:03:34 WARN  NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ksm_1       | 2017-09-29 22:03:34 ERROR KeySpaceManager:191 - Failed to start the KeyspaceManager.
ksm_1       | java.lang.NullPointerException
ksm_1       | 	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
ksm_1       | 	at org.apache.hadoop.ozone.web.utils.OzoneUtils.getScmMetadirPath(OzoneUtils.java:256)
ksm_1       | 	at org.apache.hadoop.ozone.ksm.KSMMetadataManagerImpl.<init>(KSMMetadataManagerImpl.java:71)
ksm_1       | 	at org.apache.hadoop.ozone.ksm.KeySpaceManager.<init>(KeySpaceManager.java:104)
ksm_1       | 	at org.apache.hadoop.ozone.ksm.KeySpaceManager.main(KeySpaceManager.java:187)
ksm_1       | 2017-09-29 22:03:34 INFO  ExitUtil:210 - Exiting with status 1: java.lang.NullPointerException
ksm_1       | 2017-09-29 22:03:34 INFO  KeySpaceManager:51 - SHUTDOWN_MSG: 
ksm_1       | /************************************************************
ksm_1       | SHUTDOWN_MSG: Shutting down KeySpaceManager at 25ef3a8c8827/172.18.0.4
ksm_1       | ************************************************************/
ksm_1       | Process exited with exit code 1
ksm_1       | Process has been failed (exit code: 1), restarting after 60 seconds... (4/10)
......

Is there a way to fix this? It would be great to use Docker for testing the Hadoop/Ozone feature.

Thanks for any help

How to persist data on the datanodes?

It seems that the example config persists only the namenode data to the host volume (at least, I see only the ./namenode/ folder in the mounted volume after I start the namenode and datanode). What is the point of mounting the /tmp:/data volume for the datanode container in this case? How can I set the data root folder to /data/datanode? (Judging by the logs, the default datanode folder is /tmp/hadoop-root/dfs/.)

P.S. Thank you for building these nice BigData images!
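
The /tmp/hadoop-root/dfs/ path mentioned above matches Hadoop's stock defaults: dfs.datanode.data.dir defaults to file://${hadoop.tmp.dir}/dfs/data, and hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}. The Python sketch below imitates that variable substitution to show why the data lands outside /data, and what overriding dfs.datanode.data.dir would change. The resolve helper is hypothetical and simplified, and how the override is passed to the container is not shown in the report.

```python
import re

# Stock defaults from hdfs-default.xml and core-default.xml.
DEFAULTS = {
    "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
    "dfs.datanode.data.dir": "file://${hadoop.tmp.dir}/dfs/data",
}

VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve(key, overrides, user):
    """Expand ${...} references the way Hadoop-style configs do (simplified)."""
    props = dict(DEFAULTS)
    props["user.name"] = user
    props.update(overrides)
    value = props[key]
    # Bounded loop so a self-referencing value cannot spin forever.
    for _ in range(20):
        match = VARIABLE.search(value)
        if match is None:
            break
        value = value[:match.start()] + props[match.group(1)] + value[match.end():]
    return value


# With no override, a datanode running as root stores blocks under /tmp.
print(resolve("dfs.datanode.data.dir", {}, "root"))

# With an explicit override, blocks go under the mounted /data volume.
print(resolve("dfs.datanode.data.dir",
              {"dfs.datanode.data.dir": "/data/datanode"}, "root"))
```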
