Building a DataFrame directly from JSON-style data (a list of dicts)

import pandas as pd

data = [
    {
        "id": "001",
        "name": "Lee",
        "class": "0701"
    },
    {
        "id": "002",
        "name": "wang",
        "class": "0801"
    },
    {
        "id": "003",
        "name": "zhang",
        "class": "0901"
    }
]
df = pd.DataFrame(data)
print(df)
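
When the records arrive as raw JSON text rather than as a Python list, one possible approach is to parse the text with the standard-library json module first and then hand the result to pd.DataFrame. The sketch below is not from the original post; the `raw` string is a hypothetical stand-in for text received from a file or an API.

```python
import json

import pandas as pd

# Hypothetical JSON text (an assumption for illustration).
raw = '[{"id": "001", "name": "Lee"}, {"id": "002", "name": "wang"}]'

records = json.loads(raw)     # a list of dicts
df = pd.DataFrame(records)    # one row per dict, one column per key

# The values keep the types json.loads produced, so the ids stay strings
# and their leading zeros are preserved.
print(df["id"].tolist())
```

Going through json.loads keeps "001" as a string, which is usually what is wanted for identifiers.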

Converting a dictionary to a DataFrame

s = {
    "col1": {"row1": 1, "row2": 2, "row3": 3},
    "col2": {"row1": "a", "row2": "b", "row3": "c"}
}
df = pd.DataFrame(s)
print(df)
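
With a dict of dicts, the outer keys become column labels and the inner keys become row labels. If the outer keys should be rows instead, pd.DataFrame.from_dict with orient="index" is one option. A minimal sketch, reusing the same `s` dictionary:

```python
import pandas as pd

s = {
    "col1": {"row1": 1, "row2": 2, "row3": 3},
    "col2": {"row1": "a", "row2": "b", "row3": "c"},
}

# Default behaviour: outer keys -> columns, inner keys -> index.
df = pd.DataFrame(s)

# orient="index": outer keys -> index, inner keys -> columns.
df_rows = pd.DataFrame.from_dict(s, orient="index")

print(df)
print(df_rows)
```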

Reading data from a URL

import pandas as pd
url = 'https://youtome.club/sample.json'
df = pd.read_json(url)
print(df)
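
The URL above may not always be reachable. To try read_json without network access, the JSON text can be wrapped in io.StringIO so that it behaves like a file. This is only a sketch; the `json_text` content below is made-up sample data, not the content served at that URL.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample payload (an assumption for illustration).
json_text = '[{"name": "Lee", "city": "Beijing"}, {"name": "wang", "city": "Shanghai"}]'

# read_json accepts a path, a URL, or a file-like object such as StringIO.
df = pd.read_json(StringIO(json_text))
print(df)
```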

Reading nested JSON

import pandas as pd
df = pd.read_json('inner.json')
print(df)

# Output
       school location                                     info
0  ABC school     Asia    {'id': '001', 'name': 'Lee', 'class': '0701'}
1  ABC school     Asia   {'id': '002', 'name': 'wang', 'class': '0801'}
2  ABC school     Asia  {'id': '003', 'name': 'zhang', 'class': '0901'}
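
As the output shows, each cell of the info column still holds a whole dict. One manual way to expand such a column, sketched here under the assumption that every dict has the same keys, is to build a second DataFrame from the dicts and concatenate it with the remaining columns. The frame is rebuilt inline so the snippet runs without inner.json.

```python
import pandas as pd

# Same shape as the frame printed above: two scalar fields plus a list of dicts.
df = pd.DataFrame({
    "school": "ABC school",
    "location": "Asia",
    "info": [
        {"id": "001", "name": "Lee", "class": "0701"},
        {"id": "002", "name": "wang", "class": "0801"},
        {"id": "003", "name": "zhang", "class": "0901"},
    ],
})

# Turn the dicts into their own columns, then join them back side by side.
expanded = pd.DataFrame(df["info"].tolist())
combined = pd.concat([df.drop(columns="info"), expanded], axis=1)
print(combined)
```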

The inner.json file

{
    "school": "ABC school",
    "location": "Asia",
    "info": [
    {
         "id": "001",
         "name": "Lee",
         "class": "0701"
     },
     {
         "id": "002",
         "name": "wang",
         "class": "0801"
     },
     {
         "id": "003",
         "name": "zhang",
         "class": "0901"
     }]
}
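
To follow along, the file can be created from Python. A small sketch, assuming it is acceptable to write inner.json into the current working directory:

```python
import json
from pathlib import Path

data = {
    "school": "ABC school",
    "location": "Asia",
    "info": [
        {"id": "001", "name": "Lee", "class": "0701"},
        {"id": "002", "name": "wang", "class": "0801"},
        {"id": "003", "name": "zhang", "class": "0901"},
    ],
}

# Write the structure shown above to inner.json in the working directory.
Path("inner.json").write_text(json.dumps(data, indent=4), encoding="utf-8")
```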

Flattening nested data with json_normalize()

import pandas as pd
import json
with open('inner.json', 'r') as f:
    data = json.load(f)  # parse the file straight into a Python dict
# record_path names the key whose list items become the rows
df_inner_list = pd.json_normalize(data, record_path=['info'])
print(df_inner_list)

# Output
    id   name class
0  001    Lee  0701
1  002   wang  0801
2  003  zhang  0901
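
json_normalize can also flatten nested dicts, not only lists of records: nested keys are joined into dotted column names, and the sep parameter changes the separator. The `record` below is hypothetical sample data used only to illustrate this.

```python
import pandas as pd

# Hypothetical nested record (an assumption for illustration).
record = {
    "school": "ABC school",
    "contact": {"city": "Example City", "phone": {"office": "000-0000"}},
}

# Nested keys are joined with "." by default.
flat = pd.json_normalize(record)
print(flat.columns.tolist())

# sep controls the separator used when joining nested keys.
flat_underscore = pd.json_normalize(record, sep="_")
print(flat_underscore.columns.tolist())
```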


Keeping the school and location metadata alongside each record

import pandas as pd
import json
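with open('inner.json', 'r') as f:
    data = json.load(f)  # parse the file straight into a Python dict
df_inner_list = pd.json_normalize(
    data,
    record_path=['info'],          # each item of "info" becomes one row
    meta=['school', 'location']    # top-level fields repeated on every row
)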
print(df_inner_list)

# Output
    id   name class      school location
0  001    Lee  0701  ABC school     Asia
1  002   wang  0801  ABC school     Asia
2  003  zhang  0901  ABC school     Asia
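
The same call also works when the input is a list of such objects, in which case the meta values follow the records of each parent. The sketch below uses a hypothetical second school ("XYZ school") and the record_prefix parameter to mark which columns came from info.

```python
import pandas as pd

# Hypothetical input: two schools, each with its own "info" list.
schools = [
    {
        "school": "ABC school",
        "location": "Asia",
        "info": [{"id": "001", "name": "Lee"}, {"id": "002", "name": "wang"}],
    },
    {
        "school": "XYZ school",
        "location": "Europe",
        "info": [{"id": "101", "name": "zhao"}],
    },
]

df = pd.json_normalize(
    schools,
    record_path="info",            # one row per item in each "info" list
    meta=["school", "location"],   # parent fields copied onto each row
    record_prefix="student_",      # prefix added to the record columns only
)
print(df)
```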

Categories: python
