I often run into cases where a Pandas dataframe contains columns with JSON or dictionary structures.
In most cases, bashing that sort of structure with the following hammer of a snippet works to fully flatten the structure, such that each column’s dictionary keys get horizontally stretched out into new columns.
For instance a column named person
with a row containing a record like {"Name":"Bob", "Age":100}
would become two rows named person.Name
and person.Age
.
import pandas as pd
import json
json_struct = json.loads(df.to_json(orient="records"))
df_flat = pd.json_normalize(json_struct)