
PySpark DataFrame Operations - Basics | PySpark DataFrames

In this post, we discuss how to work with DataFrames in PySpark and perform common Spark DataFrame operations such as aggregations, ordering, joins, and other similar data manipulations.

Introduction

The Spark DataFrame API lets the user perform parallel, distributed processing of structured data. A Spark DataFrame is a distributed dataset organized into a named set of columns.
