sparkdemo's Introduction

SparkDemo

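Overview

This project provides Spark examples in Python, Java, Scala, and R, covering Streaming, SQL, MLlib, GraphX, and SparkR.

The code examples are based on the examples in the Spark project; we have also added and modified a good deal of content, and documented how to use the code.

We also provide some practical cases. They are abstractions of real scenarios and show the issues that a real project has to consider. These cases have been thoroughly tested, so readers can use them as a reference in their own projects.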

Versions

hadoop: 2.6.0

spark: 1.6.1

For how to set up the cluster, please refer to the official documentation.

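Documentation index

  • For the Scala documentation, see: Scala documentation (recommendation: ★★★★★ ★★★★★ )
  • For the Python documentation, see: Python documentation (recommendation: ★★★★★ ★★★★ )
  • For the Java documentation, see: Java documentation (recommendation: ★★★★★ )
  • For the R documentation, see: R documentation (recommendation: ★★★★ )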

sparkdemo's Issues

Kafka2Hdfs question

  rdd.foreachPartition {
    partitionOfRecords =>
      val connection = HdfsConnection.getHdfsConnection(config)
      partitionOfRecords.foreach(
        record => {
          // connection.writeUTF(record)
          connection.write(record.getBytes("UTF-8"))
          connection.writeBytes("\n")
        }
      )
      // flush once the whole partition has been written
      try {
        connection.hflush()
      } catch {
        case e: Exception => logger.error(s"hflush exception: ${e.getMessage}")
      }
  }

Won't this fail when multiple partitions write to the same HDFS path?
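
Whether the snippet above actually fails depends on how `HdfsConnection.getHdfsConnection` opens its output stream, which is not shown here. HDFS allows only one writer per file at a time, so one common way to sidestep the question is to give every partition its own file. The following is a minimal sketch of that idea, not the project's implementation: `PartitionPaths`, `partitionPath`, and the `batch-…/part-…` layout are hypothetical names chosen for illustration.

```scala
// Hypothetical helper (not part of sparkdemo): builds a distinct file path
// for every (batch, partition) pair so that no two tasks share one file.
object PartitionPaths {
  def partitionPath(baseDir: String, batchTimeMs: Long, partitionId: Int): String = {
    // Drop one trailing slash so the result never contains "//".
    val dir = baseDir.stripSuffix("/")
    // Zero-pad the partition id to five digits, e.g. part-00003.
    f"$dir/batch-$batchTimeMs/part-$partitionId%05d"
  }
}
```

Inside `rdd.foreachPartition`, the partition id could be taken from Spark's `TaskContext.getPartitionId()`, and the batch time from the `(rdd, time)` overload of `foreachRDD` via `time.milliseconds`. The resulting path would then have to be passed to whatever opens the HDFS stream, which assumes the connection helper can accept a per-task path.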
