Importing Stringified JSON Objects Into Pandas (Part 1)

Published November 24, 2017 in data, programming

All Python code in this post targets Python 3.5+.

I’m continuing to work with the same Kaggle movies dataset as in the SQL import experiment.
This time, I imported the data into Pandas DataFrames.

The trickiest dataset to import was movies_metadata.csv. I first tried to use pandas.read_csv with the default settings.
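As a rough sketch of that first attempt: the rows below are made up so the snippet runs without the Kaggle file, and the variable names are my own placeholders. With the real data, the file path would be passed to pandas.read_csv instead of the in-memory buffer.

```python
import io
import pandas as pd

# Hypothetical stand-in for movies_metadata.csv (made-up rows, not the real data).
sample_csv = io.StringIO(
    "adult,belongs_to_collection,title\n"
    "False,\"{'id': 1, 'name': 'Example Collection'}\",Example Movie\n"
    "False,,Another Movie\n"
)

# Default settings: pandas infers every column's dtype on its own.
# For the real file you would pass its path here instead of the buffer.
movies = pd.read_csv(sample_csv)
print(movies.shape)
```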

I was able to read the movies_metadata dataset into Pandas, but the resulting DataFrame did not contain the expected data types. For instance, the first column (adult) should be a boolean.
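One way to check what pandas inferred is to look at the DataFrame's dtypes attribute. The sketch below uses made-up rows and assumes that a stray non-boolean value is what keeps a column like adult from being read as a boolean; that cause is my assumption for illustration, not something verified against the real file.

```python
import io
import pandas as pd

# Hypothetical rows; the last one has a non-boolean value in "adult".
sample_csv = io.StringIO(
    "adult,title\n"
    "False,Example Movie\n"
    "True,Another Movie\n"
    "not a boolean,Broken Row\n"
)

movies = pd.read_csv(sample_csv)

# Show the dtype pandas inferred for each column.
print(movies.dtypes)

# True only when the column was actually parsed as booleans.
adult_is_bool = movies["adult"].dtype == bool
print("adult parsed as bool:", adult_is_bool)
```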

Pandas also raised a data type warning during the import, another sign that the default pandas.read_csv behavior did a poor job of inferring column data types.

Next, I tried setting the data types explicitly, starting with just the first two columns (adult and belongs_to_collection) as a test:
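A minimal sketch of that kind of test, using the dtype parameter of pandas.read_csv. The specific types (bool and str) and the sample rows are assumptions for illustration. The try/except is there because pandas can refuse the import when a column contains values that do not fit the requested type.

```python
import io
import pandas as pd

# Hypothetical rows; the third one has a non-boolean value in "adult".
sample_csv = (
    "adult,belongs_to_collection,title\n"
    "False,\"{'id': 1, 'name': 'Example Collection'}\",Example Movie\n"
    "True,,Another Movie\n"
    "not a boolean,,Broken Row\n"
)

# Assumed dtype choices for the two test columns (not taken from the post).
column_types = {"adult": bool, "belongs_to_collection": str}

try:
    movies = pd.read_csv(io.StringIO(sample_csv), dtype=column_types)
    import_error = None
except (ValueError, TypeError) as exc:
    # pandas could not coerce a column to the requested dtype.
    movies = None
    import_error = exc

if import_error is not None:
    print("import failed:", import_error)
else:
    print(movies.dtypes)
```

Leaving a column out of the dtype mapping keeps pandas' default inference for it, so the mapping can be grown one column at a time.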

That change broke the import.

Clearly I needed to try a different approach.
