Loading... 报警复现,原因,解决方案 <div class="tip inlineBlock error"> ipynb文件格式:[8.pandas的SettingWithCopyWarning.html](http://type.zimopy.com/usr/uploads/2022/12/3591179633.html) </div> # pandas的SettingWithCopyWarning 报警复现,原因,解决方案 ```python import pandas as pd fpath = "../datas/beijing_tianqi/beijing_tianqi_2018.csv" df= pd.read_csv(fpath) df.head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>2018-01-01</td> <td>3℃</td> <td>-6℃</td> <td>晴~多云</td> <td>东北风</td> <td>1-2级</td> <td>59</td> <td>良</td> <td>2</td> </tr> <tr> <th>1</th> <td>2018-01-02</td> <td>2℃</td> <td>-5℃</td> <td>阴~多云</td> <td>东北风</td> <td>1-2级</td> <td>49</td> <td>优</td> <td>1</td> </tr> <tr> <th>2</th> <td>2018-01-03</td> <td>2℃</td> <td>-5℃</td> <td>多云</td> <td>北风</td> <td>1-2级</td> <td>28</td> <td>优</td> <td>1</td> </tr> <tr> <th>3</th> <td>2018-01-04</td> <td>0℃</td> <td>-8℃</td> <td>阴</td> <td>东北风</td> <td>1-2级</td> <td>28</td> <td>优</td> <td>1</td> </tr> <tr> <th>4</th> <td>2018-01-05</td> <td>3℃</td> <td>-6℃</td> <td>多云~晴</td> <td>西北风</td> <td>1-2级</td> <td>50</td> <td>优</td> <td>1</td> </tr> </tbody> </table> </div> ```python df.loc[:,"bWendu"] = df["bWendu"].str.replace("℃","").astype("int32") df.loc[:,"yWendu"] = df["yWendu"].str.replace("℃","").astype("int32") ``` ```python df.head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>2018-01-01</td> <td>3</td> <td>-6</td> <td>晴~多云</td> <td>东北风</td> <td>1-2级</td> <td>59</td> <td>良</td> <td>2</td> </tr> <tr> <th>1</th> <td>2018-01-02</td> <td>2</td> <td>-5</td> <td>阴~多云</td> <td>东北风</td> <td>1-2级</td> <td>49</td> <td>优</td> <td>1</td> </tr> <tr> <th>2</th> <td>2018-01-03</td> <td>2</td> <td>-5</td> <td>多云</td> <td>北风</td> <td>1-2级</td> <td>28</td> <td>优</td> <td>1</td> </tr> <tr> <th>3</th> <td>2018-01-04</td> <td>0</td> <td>-8</td> <td>阴</td> <td>东北风</td> <td>1-2级</td> <td>28</td> <td>优</td> <td>1</td> </tr> <tr> <th>4</th> <td>2018-01-05</td> <td>3</td> <td>-6</td> <td>多云~晴</td> <td>西北风</td> <td>1-2级</td> <td>50</td> <td>优</td> <td>1</td> </tr> </tbody> </table> </div> # 1. 复现 ## 只选出3月份的数据进行分析 ```python condition = df["ymd"].str.startswith("2018-03") ``` ## 设置温差 ```python df.loc[condition]["wen_cha"]=df["bWendu"]-df["yWendu"] ``` c:\users\administrator\appdata\local\programs\python\python37\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy """Entry point for launching an IPython kernel. ## 查看是否成功 ```python df[condition].head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> </tr> </thead> <tbody> <tr> <th>59</th> <td>2018-03-01</td> <td>8</td> <td>-3</td> <td>多云</td> <td>西南风</td> <td>1-2级</td> <td>46</td> <td>优</td> <td>1</td> </tr> <tr> <th>60</th> <td>2018-03-02</td> <td>9</td> <td>-1</td> <td>晴~多云</td> <td>北风</td> <td>1-2级</td> <td>95</td> <td>良</td> <td>2</td> </tr> <tr> <th>61</th> <td>2018-03-03</td> <td>13</td> <td>3</td> <td>多云~阴</td> <td>北风</td> <td>1-2级</td> <td>214</td> <td>重度污染</td> <td>5</td> </tr> <tr> <th>62</th> <td>2018-03-04</td> <td>7</td> <td>-2</td> <td>阴~多云</td> <td>东南风</td> <td>1-2级</td> <td>144</td> <td>轻度污染</td> <td>3</td> </tr> <tr> <th>63</th> <td>2018-03-05</td> <td>8</td> <td>-3</td> <td>晴</td> <td>南风</td> <td>1-2级</td> <td>94</td> <td>良</td> <td>2</td> </tr> </tbody> </table> </div> # 2.原因 发出警告的代码 df.loc[condition]["wen_cha"]=df["bWendu"]-df["yWendu"] 相当于:df.get(condition).set(wen_cha),第一步骤的get发出了报警 **链式操作其实就是两个步骤,先get后set,gei得到的dataframe可能是view也可能是copy,pandas发出警告** 核心要诀:pandas的dataframe的修改操写操作,只允许在源dataframe上进行 # 3.解决办法1 将get+set的两步操作,改成set的一步操作 ```python df.loc[condition,"wen_cha"] = df["bWendu"] - df["yWendu"] df[condition].head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> <th>wen_cha</th> </tr> </thead> <tbody> <tr> <th>59</th> <td>2018-03-01</td> <td>8</td> <td>-3</td> <td>多云</td> <td>西南风</td> <td>1-2级</td> <td>46</td> <td>优</td> <td>1</td> <td>11.0</td> </tr> <tr> <th>60</th> <td>2018-03-02</td> <td>9</td> <td>-1</td> <td>晴~多云</td> <td>北风</td> <td>1-2级</td> <td>95</td> <td>良</td> <td>2</td> <td>10.0</td> </tr> <tr> <th>61</th> <td>2018-03-03</td> <td>13</td> <td>3</td> <td>多云~阴</td> <td>北风</td> <td>1-2级</td> <td>214</td> <td>重度污染</td> <td>5</td> <td>10.0</td> </tr> <tr> <th>62</th> <td>2018-03-04</td> <td>7</td> <td>-2</td> <td>阴~多云</td> <td>东南风</td> <td>1-2级</td> <td>144</td> <td>轻度污染</td> <td>3</td> <td>9.0</td> </tr> <tr> <th>63</th> <td>2018-03-05</td> <td>8</td> <td>-3</td> <td>晴</td> <td>南风</td> <td>1-2级</td> <td>94</td> <td>良</td> <td>2</td> <td>11.0</td> </tr> </tbody> </table> </div> # 4.解决方法2 如果需要预筛选数据做后续的处理分析,使用copy复制dataframe ```python df_month3 = df[condition].copy() ``` ```python df_month3.head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> <th>wen_cha</th> </tr> </thead> <tbody> <tr> <th>59</th> <td>2018-03-01</td> <td>8</td> <td>-3</td> <td>多云</td> <td>西南风</td> <td>1-2级</td> <td>46</td> <td>优</td> <td>1</td> <td>11.0</td> </tr> <tr> <th>60</th> <td>2018-03-02</td> <td>9</td> <td>-1</td> <td>晴~多云</td> <td>北风</td> <td>1-2级</td> <td>95</td> <td>良</td> <td>2</td> <td>10.0</td> </tr> <tr> <th>61</th> <td>2018-03-03</td> <td>13</td> <td>3</td> <td>多云~阴</td> <td>北风</td> <td>1-2级</td> <td>214</td> <td>重度污染</td> <td>5</td> <td>10.0</td> </tr> <tr> <th>62</th> <td>2018-03-04</td> <td>7</td> <td>-2</td> <td>阴~多云</td> <td>东南风</td> <td>1-2级</td> <td>144</td> <td>轻度污染</td> <td>3</td> <td>9.0</td> </tr> <tr> <th>63</th> <td>2018-03-05</td> <td>8</td> <td>-3</td> <td>晴</td> <td>南风</td> <td>1-2级</td> <td>94</td> <td>良</td> <td>2</td> <td>11.0</td> </tr> </tbody> </table> </div> ```python df_month3["wen_cha"] = df["bWendu"] -df["yWendu"] df_month3.head() ``` <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>ymd</th> <th>bWendu</th> <th>yWendu</th> <th>tianqi</th> <th>fengxiang</th> <th>fengli</th> <th>aqi</th> <th>aqiInfo</th> <th>aqiLevel</th> <th>wen_cha</th> </tr> </thead> <tbody> <tr> <th>59</th> <td>2018-03-01</td> <td>8</td> <td>-3</td> <td>多云</td> <td>西南风</td> <td>1-2级</td> <td>46</td> <td>优</td> <td>1</td> <td>11</td> </tr> <tr> <th>60</th> <td>2018-03-02</td> <td>9</td> <td>-1</td> <td>晴~多云</td> <td>北风</td> <td>1-2级</td> <td>95</td> <td>良</td> <td>2</td> <td>10</td> </tr> <tr> <th>61</th> <td>2018-03-03</td> <td>13</td> <td>3</td> <td>多云~阴</td> <td>北风</td> <td>1-2级</td> <td>214</td> <td>重度污染</td> <td>5</td> <td>10</td> </tr> <tr> <th>62</th> <td>2018-03-04</td> <td>7</td> <td>-2</td> <td>阴~多云</td> <td>东南风</td> <td>1-2级</td> <td>144</td> <td>轻度污染</td> <td>3</td> <td>9</td> </tr> <tr> <th>63</th> <td>2018-03-05</td> <td>8</td> <td>-3</td> <td>晴</td> <td>南风</td> <td>1-2级</td> <td>94</td> <td>良</td> <td>2</td> <td>11</td> </tr> </tbody> </table> </div> # **总之,pandas不允许先筛选子dataframe,再进行修改写入** 要么使用.loc实现一个步骤直接修改源dataframe 要么先复制一个子dataframe再一个步骤执行修改 最后修改:2022 年 12 月 19 日 © 允许规范转载 打赏 赞赏作者 支付宝微信 赞 如果觉得我的文章对你有用,请随意赞赏