Pandas data structures - Series: pandas.Series(data, index, dtype, name, copy)
1. A simple pandas example
import pandas as pd
a = [1,2,3]
myvar = pd.Series(a)
print(myvar)
# Output
0 1
1 2
2 3
dtype: int64
2. We can specify the index
import pandas as pd
a = ["Google","Baidu","Huawei"]
data = pd.Series(a,index=["x","y","z"])
print(data)
#print(data['y'])  # read a value by its index label
# Output
x Google
y Baidu
z Huawei
dtype: object
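The commented-out line in the example above reads a value by its index label. A minimal sketch of that lookup, next to the equivalent position-based lookup with iloc, might look like this (the names by_label and by_position are illustrative only, not part of the original example):

```python
import pandas as pd

data = pd.Series(["Google", "Baidu", "Huawei"], index=["x", "y", "z"])

by_label = data["y"]        # look up by index label
by_position = data.iloc[1]  # look up by integer position (second element)

print(by_label, by_position)
```

Both lookups address the same element here, because "y" is the second label in the index.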
3. We can also create a Series from a key/value object such as a dictionary; the keys become the index
import pandas as pd
sites = {1:"Google",2:"Baidu",3:"Wiki"}
myvar = pd.Series(sites)
print(myvar)
# Output
1 Google
2 Baidu
3 Wiki
dtype: object
4. We can pass index to choose which dictionary keys are used
import pandas as pd
sites = {1:"Google",2:"Wiki"}
myvar = pd.Series(sites,index=[1,2])
print(myvar)
# Output
1 Google
2 Wiki
dtype: object
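When index is passed together with a dictionary, only the listed keys are kept, and a label that has no matching key gets a missing value. The sketch below illustrates this; the label 4 and the name subset are made up for the illustration:

```python
import pandas as pd

sites = {1: "Google", 2: "Baidu", 3: "Wiki"}

# Key 3 is dropped because it is not listed in index.
# Label 4 (hypothetical) has no entry in the dictionary, so its value is NaN.
subset = pd.Series(sites, index=[1, 2, 4])

print(subset)
```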
5. Setting the name parameter of a Series
import pandas as pd
sites = {1: "Google",2:"Baidu",3:"Wiki"}
myvar = pd.Series(sites,index=[1,2],name="Hello World")
print(myvar)
# Output
1 Google
2 Baidu
Name: Hello World, dtype: object
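The name is stored on the Series and can be read back through the name attribute. As a small sketch of where it shows up later, converting the Series to a DataFrame with to_frame() uses the name as the column label (the variable frame is illustrative only):

```python
import pandas as pd

myvar = pd.Series({1: "Google", 2: "Baidu"}, name="Hello World")

print(myvar.name)

# The Series name becomes the label of the single column.
frame = myvar.to_frame()
print(list(frame.columns))
```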
Pandas data structures - DataFrame: pandas.DataFrame(data, index, columns, dtype, copy)
1. Creating from a list
import pandas as pd
data = [['Google',10],['Baidu',12],['Wiki',13]]
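# dtype=float in the constructor would apply to every column and fails on the
# string column in recent pandas versions, so only 'Age' is converted here
df = pd.DataFrame(data,columns=['Site','Age']).astype({'Age': float})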
print(df)
# Output
Site Age
0 Google 10.0
1 Baidu 12.0
2 Wiki 13.0
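To confirm which columns ended up as floats, the dtypes attribute can be inspected. The following sketch converts only the Age column after construction and then checks the result; it is one possible approach, not the only one:

```python
import pandas as pd

data = [["Google", 10], ["Baidu", 12], ["Wiki", 13]]
df = pd.DataFrame(data, columns=["Site", "Age"])

# Convert a single column; the string column is left untouched.
df["Age"] = df["Age"].astype(float)

print(df.dtypes)
```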
The following example creates a DataFrame from a dictionary of equal-length lists (ndarrays work the same way)
import pandas as pd
data = {'site':['Google','Baidu','Wiki'],'Age':[10,12,13]}
df = pd.DataFrame(data)
print(df)
# Output
site Age
0 Google 10
1 Baidu 12
2 Wiki 13
Creating from a list of dictionaries; keys missing from a row are filled with NaN
import pandas as pd
data = [{'a':1,'b':2},{'a':5,'b':'10','c':'20'}]
df = pd.DataFrame(data)
print(df)
# Output
a b c
0 1 2 NaN
1 5 10 20
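The NaN in the first row appears because that dictionary has no 'c' key. A short sketch of detecting and filling such gaps is shown below. It uses integer values throughout, which differs from the strings in the example above, and the names rows, missing and filled are illustrative only:

```python
import pandas as pd

rows = [{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}]
df = pd.DataFrame(rows)

# Row 0 has no "c" key, so that cell is NaN.
missing = df["c"].isna().tolist()
print(missing)

# Replace the missing cells with 0 in a new DataFrame.
filled = df.fillna(0)
print(filled)
```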
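Pandas can return the data of a given row through the loc attribute. If no index has been set, the first row has index 0, the second row has index 1, and so on. Selecting a single row returns a Series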
import pandas as pd
data = {
"calories": [420,380, 390],
"duration": [50,40,45]
}
df = pd.DataFrame(data)
print(df.loc[0])
# Output
calories 420
duration 50
Name: 0, dtype: int64
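loc can also return several rows when given a list of labels in the [[ ... ]] form; the result is then a DataFrame rather than a Series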
import pandas as pd
data = {
"calories": [420,380, 390],
"duration": [50,40,45]
}
df = pd.DataFrame(data)
print(df.loc[[0,1]])
# Output
calories duration
0 420 50
1 380 40
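The difference between the two forms of loc can be checked directly from the type of the result. A minimal sketch, with the names one_row and two_rows chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

one_row = df.loc[0]        # single label: a Series
two_rows = df.loc[[0, 1]]  # list of labels: a DataFrame

print(type(one_row).__name__, type(two_rows).__name__)
```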
Specifying the index of a DataFrame
import pandas as pd
data = {
"calories":[420,300,500],
"duration":[50,30,40]
}
df = pd.DataFrame(data,index=["day1","day2","day3"])
print(df)
# Output
calories duration
day1 420 50
day2 300 30
day3 500 40
Using loc with a custom index label
import pandas as pd
data = {
"alpha": [1,2,3],
"Xma":[10,30,40]
}
df = pd.DataFrame(data,index=["day1","day2","day3"])
print(df.loc['day1'])
# Output
alpha 1
Xma 10
Name: day1, dtype: int64
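Once a custom index is set, loc expects those labels, while iloc still addresses rows by integer position. The sketch below compares the two and also reads a single cell by row and column label; the variable names are illustrative only:

```python
import pandas as pd

data = {"alpha": [1, 2, 3], "Xma": [10, 30, 40]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

row_by_label = df.loc["day2"]  # label-based lookup
row_by_position = df.iloc[1]   # position-based lookup of the same row
cell = df.loc["day3", "Xma"]   # single value by row label and column label

print(row_by_label.equals(row_by_position), cell)
```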