Spark Tutorial: How to Create a Spark DataFrame from JSON | PySpark 2

Step 1: Create a Spark Session

# SparkSession is the entry point for DataFrame operations in Spark 2.x
from pyspark.sql import SparkSession
sparkSession = SparkSession.builder.appName('abc').getOrCreate()
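
If you run the script directly rather than through spark-submit, you may also want to set the master explicitly when building the session. A minimal sketch, assuming local execution and an illustrative app name:

sparkSession = (SparkSession.builder
                .master('local[*]')        # assumed: run locally on all available cores
                .appName('json_example')   # hypothetical application name
                .getOrCreate())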

Step 2: Define a JSON

import json

# A single JSON record, defined as a plain Python dict
json_data = {"column1": "value1",
             "column2": "value2"}

Step 3: Create a Spark DataFrame

# Serialize the dict to a JSON string, distribute it as a one-element RDD,
# and let Spark infer the schema while parsing it into a DataFrame
json_strings = [json.dumps(json_data)]
jsonrdd = sparkSession.sparkContext.parallelize(json_strings)
df = sparkSession.read.json(jsonrdd)
df.show()
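
With the single record above, df.show() should print a two-column table (Spark orders inferred JSON fields alphabetically):

+-------+-------+
|column1|column2|
+-------+-------+
| value1| value2|
+-------+-------+

If the JSON already lives in a file with one object per line, the RDD step is not needed; a minimal sketch, assuming a hypothetical path /tmp/input.json:

df = sparkSession.read.json('/tmp/input.json')
df.show()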
