<div class="tip inlineBlock error"> ipynb version: [03.pandas数据结构.html](http://type.zimopy.com/usr/uploads/2022/12/2504297832.html) </div>

Last modified: December 13, 2022. © Reposting permitted with proper attribution.

> The Markdown version follows; use whichever suits you. Later posts use the same format.

```python
import pandas as pd
import numpy as np
```

# 1.Series

A Series is a one-dimensional, array-like object. It consists of a sequence of data (which may be of different data types) and an associated set of data labels (its index).

## 1.1 Creating the simplest Series from just a list of data

```python
s1 = pd.Series([1,"a",5.2,7])
print(s1)
```

    0      1
    1      a
    2    5.2
    3      7
    dtype: object

## Getting the index

```python
s1.index
```

    RangeIndex(start=0, stop=4, step=1)

## Getting the data

```python
s1.values
```

    array([1, 'a', 5.2, 7], dtype=object)

## 1.2 Creating a Series with a label index

```python
s2 = pd.Series([1,"a",5.2,7],index = ['d','b','a','c'])
```

```python
s2
```

    d      1
    b      a
    a    5.2
    c      7
    dtype: object

```python
s2.index
```

    Index(['d', 'b', 'a', 'c'], dtype='object')

## 1.3 Creating a Series from a Python dict

```python
sdata = {"Ohio":3500,"Texas":7200,"Oregon":1600,"Utah":500}
```

```python
s3 = pd.Series(sdata)
```

```python
s3
```

    Ohio      3500
    Texas     7200
    Oregon    1600
    Utah       500
    dtype: int64

## 1.4 Querying data by label index

This works much like a Python dict: a single label returns the stored value, and a list of labels returns a Series.

```python
s2
```

    d      1
    b      a
    a    5.2
    c      7
    dtype: object

```python
s2['a']
```

    5.2

```python
type(s2['a'])
```

    float

```python
s2[["b","a"]]
```

    b      a
    a    5.2
    dtype: object

```python
type(s2[["b","a"]])
```

    pandas.core.series.Series

# 2.DataFrame

A DataFrame is a tabular data structure. Each column can hold a different value type (numeric, string, boolean, and so on). It has both a row index (`index`) and a column index (`columns`), and it can be viewed as a dict of Series.

The most common ways to create a DataFrame are covered in section 02: reading plain-text files, Excel, and MySQL databases.

## 2.1 Creating a DataFrame from a dict of sequences

```python
data = {
    "state":["Ohio","Ohio","Ohio","nevada","nevada"],
    "year":[200,201,202,203,204],
    "pop":[1.5,1.7,1.6,1.9,6.8]
}
df=pd.DataFrame(data)
```

```python
df
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>state</th> <th>year</th> <th>pop</th> </tr>
  </thead>
  <tbody>
    <tr> <th>0</th> <td>Ohio</td> <td>200</td> <td>1.5</td> </tr>
    <tr> <th>1</th> <td>Ohio</td> <td>201</td> <td>1.7</td> </tr>
    <tr> <th>2</th> <td>Ohio</td> <td>202</td> <td>1.6</td> </tr>
    <tr> <th>3</th> <td>nevada</td> <td>203</td> <td>1.9</td> </tr>
    <tr> <th>4</th> <td>nevada</td> <td>204</td> <td>6.8</td> </tr>
  </tbody>
</table>
</div>

```python
df.dtypes
```

    state     object
    year       int64
    pop      float64
    dtype: object

```python
df.columns
```

    Index(['state', 'year', 'pop'], dtype='object')

```python
df.index
```

    RangeIndex(start=0, stop=5, step=1)

# 3.Querying a Series from a DataFrame

Querying a single row or a single column returns a pd.Series. Querying multiple rows or multiple columns returns a pd.DataFrame.

## 3.1 Querying one column returns a pd.Series

```python
df["year"]
```

    0    200
    1    201
    2    202
    3    203
    4    204
    Name: year, dtype: int64

```python
type(df["year"])
```

    pandas.core.series.Series

## 3.2 Querying multiple columns returns a pd.DataFrame

```python
df[["year","pop"]]
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>year</th> <th>pop</th> </tr>
  </thead>
  <tbody>
    <tr> <th>0</th> <td>200</td> <td>1.5</td> </tr>
    <tr> <th>1</th> <td>201</td> <td>1.7</td> </tr>
    <tr> <th>2</th> <td>202</td> <td>1.6</td> </tr>
    <tr> <th>3</th> <td>203</td> <td>1.9</td> </tr>
    <tr> <th>4</th> <td>204</td> <td>6.8</td> </tr>
  </tbody>
</table>
</div>

```python
type(df[["year","pop"]])
```

    pandas.core.frame.DataFrame

## 3.3 Querying one row returns a pd.Series

```python
df.loc[1]
```

    state    Ohio
    year      201
    pop       1.7
    Name: 1, dtype: object

## 3.4 Querying multiple rows returns a pd.DataFrame

```python
df.loc[1:3]
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>state</th> <th>year</th> <th>pop</th> </tr>
  </thead>
  <tbody>
    <tr> <th>1</th> <td>Ohio</td> <td>201</td> <td>1.7</td> </tr>
    <tr> <th>2</th> <td>Ohio</td> <td>202</td> <td>1.6</td> </tr>
    <tr> <th>3</th> <td>nevada</td> <td>203</td> <td>1.9</td> </tr>
  </tbody>
</table>
</div>

```python
type(df.loc[1:3])
```

    pandas.core.frame.DataFrame
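
A supplementary sketch, not part of the original notebook: the `df.loc[1:3]` query returns three rows (labels 1, 2, and 3) because label-based slices in pandas include both endpoints. For contrast, the example below also uses `.iloc`, which this post does not otherwise cover and which follows the usual Python rule of excluding the stop position. The variable names (`by_label`, `by_position`, `one_row`, `one_row_frame`) are chosen here for illustration only.

```python
import pandas as pd

# Same data as in section 2.1
data = {
    "state": ["Ohio", "Ohio", "Ohio", "nevada", "nevada"],
    "year": [200, 201, 202, 203, 204],
    "pop": [1.5, 1.7, 1.6, 1.9, 6.8],
}
df = pd.DataFrame(data)

# Label-based slice: both endpoints (labels 1 and 3) are included
by_label = df.loc[1:3]

# Position-based slice: the stop position (3) is excluded
by_position = df.iloc[1:3]

print(len(by_label))     # 3
print(len(by_position))  # 2

# A scalar label returns a Series; a one-element list returns a DataFrame
one_row = df.loc[1]
one_row_frame = df.loc[[1]]

print(type(one_row).__name__)        # Series
print(type(one_row_frame).__name__)  # DataFrame
```

With the default `RangeIndex`, labels and positions happen to coincide, which is why the two slices differ only in whether the last row is included. Wrapping a single label in a list is one way to keep a DataFrame even when selecting a single row.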