
Previously we saw how to create and work with Spark dataframes. In this post we discuss how to read semi-structured data such as JSON from different data sources and store it as a Spark dataframe. The dataframe can in turn be used to perform aggregations and other data manipulations.

Semi-Structured Data in Spark (pyspark) - JSON
