emrekutlug / getting-started-with-pyspark
In this tutorial, I explain SparkContext using the map and filter methods with lambda functions in Python. I create RDDs from Python objects and external files, apply transformations and actions to RDDs and pair RDDs, build PySpark DataFrames from RDDs and external files, query DataFrames with Spark SQL, and apply machine learning with PySpark MLlib.
Home Page: https://developer.ibm.com/tutorials/getting-started-with-pyspark/
License: Apache License 2.0
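As a rough illustration of the map and filter pattern with lambda functions that the description mentions, here is a minimal sketch. It uses Python's built-in `map` and `filter` so that it runs without a Spark installation; the names `numbers`, `is_even`, and `square` are hypothetical and not taken from the tutorial. The comment at the end shows what the equivalent RDD chain would look like, assuming an existing SparkContext named `sc`.

```python
# Sample data; in PySpark this list could be turned into an RDD
# with sc.parallelize(numbers), where sc is an existing SparkContext.
numbers = [1, 2, 3, 4, 5, 6]

# Lambda functions of the kind passed to RDD.filter and RDD.map.
is_even = lambda x: x % 2 == 0
square = lambda x: x * x

# Keep the even numbers, then square each of them.
result = list(map(square, filter(is_even, numbers)))
print(result)  # [4, 16, 36]

# Equivalent RDD chain (not executed here, requires PySpark and `sc`):
# sc.parallelize(numbers).filter(is_even).map(square).collect()
```

In the RDD version, `filter` and `map` are transformations and are evaluated lazily; nothing is computed until an action such as `collect()` is called.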