
spark_'s Introduction

Spark_:)

Spark is an excellent engine for developing machine learning pipelines, conducting exploratory data analysis at scale, and building ETL jobs for data platforms. It is essential for anybody working with large amounts of data. PySpark, the Python API for Spark, lets you run massive-scale, data-intensive operations and extract business insights by processing large volumes of data in a distributed manner, without compromising developer efficiency. To put it succinctly, PySpark is a powerful and fast framework for massively distributed processing of large data sets.

Stock Price Analysis using Apache Spark

The data for this case study is publicly available on Yahoo Finance, and it includes information about a company’s daily stock values from 2010 through 2020. The analysis proceeds in three steps:

  1. Load the data into Apache Spark as a DataFrame
  2. Analyze the data by computing various statistics such as mean, standard deviation, and correlation
  3. Visualize the data by plotting the daily closing prices over the years.

Notebook: Stock Price Analysis

PySpark DataFrame Complete Guide (with COVID-19 India Dataset)

Spark is one of the most widely used tools for working with Big Data. Whereas Spark used to rely heavily on RDD manipulations, it now provides a DataFrame API. In this notebook, we will learn the standard Spark functionality needed to work with DataFrames, and finally some tips for handling the inevitable errors you will face.

Notebook: COVID-19 Dataset Analysis

Binary Tabular Data Classification with PySpark

This notebook covers a classification problem in machine learning and goes through a comprehensive guide to successfully developing an end-to-end ML class prediction model using PySpark.

Notebook: Tabular Data Classification

spark_'s People

Contributors

anerisonani09
