pandas apply: Common Usage Patterns Explained

Time: 2025-03-18 08:48:48

Calling apply on a DataFrame is very common. How often do you use it?

When apply runs, each object passed to the function is a Series whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
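To make this concrete, here is a minimal sketch that reports what the function actually receives on each axis. The `describe_input` helper and the small frame are illustrative assumptions, not part of the pandas API.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def describe_input(s):
    # Hypothetical helper: report the type and index labels of the object apply passes in.
    return f"{type(s).__name__} indexed by {list(s.index)}"

by_column = df.apply(describe_input, axis=0)  # one call per column
by_row = df.apply(describe_input, axis=1)     # one call per row

print(by_column)
print(by_row)
```

With axis=0 each column arrives labeled by the row index (0, 1, 2); with axis=1 each row arrives labeled by the column names ('A', 'B').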

Basic syntax:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)

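Parameters

func : function. The function to apply to each column or row.
axis : {0 or 'index', 1 or 'columns'}, default 0. The axis along which the function is applied.
- 0 or 'index' : apply the function to each column.
- 1 or 'columns' : apply the function to each row.
raw : bool, default False. Determines whether each row or column is passed as a Series or as an ndarray.
- False : each row or column is passed to the function as a Series.
- True : the function receives ndarray objects instead. If you are only applying a NumPy reduction function, this gives better performance.
result_type : {'expand', 'reduce', 'broadcast', None}, default None. These only take effect when axis=1 (columns).
- 'expand' : list-like results are turned into columns.
- 'reduce' : return a Series if possible, rather than expanding list-like results. This is the opposite of 'expand'.
- 'broadcast' : results are broadcast to the original shape of the DataFrame; the original index and columns are retained.

The default behavior (None) depends on the return value of the applied function: list-like results are returned as a Series of those results. However, if the applied function returns a Series, the results are expanded to columns.

args : tuple. Positional arguments to pass to func in addition to the array/Series.
**kwds : additional keyword arguments passed through to func as keyword arguments.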
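The effect of the raw parameter can be checked with a short sketch like the one below. The `col_max` function and the `seen_types` list are hypothetical names introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

seen_types = []

def col_max(values):
    # Record the type that apply hands to the function, then reduce it to a scalar.
    seen_types.append(type(values).__name__)
    return values.max()

as_series = df.apply(col_max, raw=False)  # function receives a Series per column
as_ndarray = df.apply(col_max, raw=True)  # function receives a numpy.ndarray per column

print(seen_types)
print(as_series)
print(as_ndarray)
```

Both calls produce the same column maxima; only the type of the object seen inside the function differs. With raw=True the function loses access to index labels, so it should rely on positions only.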

Examples

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9

Applying a NumPy universal function:

>>> df.apply(np.sqrt)
     A    B
0  2.0  3.0
1  2.0  3.0
2  2.0  3.0

Using a reducing function on either axis:

>>> df.apply(np.sum, axis=0)
A    12
B    27
dtype: int64

>>> df.apply(np.sum, axis=1)
0    13
1    13
2    13
dtype: int64

Returning a list-like object from the function produces a Series:

>>> df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]
dtype: object

Passing result_type='expand' expands list-like results into the columns of a DataFrame:

>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2

Returning a Series from the function is similar to passing result_type='expand'. The resulting column names are the index of the returned Series.

>>> df.apply(lambda x: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
   foo  bar
0    1    2
1    1    2
2    1    2
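A slightly more practical variant of the same idea computes several values per row and returns them as a labeled Series. This is only a sketch: the `row_stats` helper and the labels 'total' and 'diff' are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def row_stats(row):
    # Hypothetical helper: the labels of the returned Series become the result's column names.
    return pd.Series({'total': row['A'] + row['B'], 'diff': row['B'] - row['A']})

stats = df.apply(row_stats, axis=1)
print(stats)
```

Each row of `stats` holds the two computed values, and the columns are named after the labels chosen inside the function.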

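Passing result_type='broadcast' ensures the result has the same shape as the original DataFrame, whether the function returns a list-like or a scalar, with the result broadcast along the axis. The resulting column names are the original column names.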

>>> df.apply(lambda x: [1, 2], axis=1, result_type='broadcast')
   A  B
0  1  2
1  1  2
2  1  2
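The scalar case mentioned above can be sketched as follows, assuming a function that reduces each row to its sum. The variable name `row_sums` is an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

# Each row is reduced to a scalar (its sum), which is then broadcast back across that row.
row_sums = df.apply(lambda row: row.sum(), axis=1, result_type='broadcast')
print(row_sums)
```

The result keeps the original index and the columns 'A' and 'B', with every cell of a row holding that row's sum.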

Patterns for applying custom functions

Basic usage

# Define the custom function's logic
>>> def fx(x):
...     return x * 3 + 7
...
# Apply the custom function
>>> df.apply(fx)
    A   B
0  19  34
1  19  34
2  19  34
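When the custom function needs extra parameters, they can be forwarded through args and keyword arguments as described in the parameter list. The sketch below assumes a hypothetical `scale_shift` function that generalizes fx; its name and parameters are not from the original.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def scale_shift(x, factor, offset=0):
    # Hypothetical generalization of fx: factor arrives via args, offset via a keyword argument.
    return x * factor + offset

# Equivalent to applying x * 3 + 7 to every column.
result = df.apply(scale_shift, args=(3,), offset=7)
print(result)
```

Note that args must be a tuple, so a single positional argument is written as `(3,)`.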

Applying a function to a single column

>>> df['B'].apply(fx)
0    34
1    34
2    34
Name: B, dtype: int64

Applying a function to a single column and adding the result as a new column

>>> df['new'] = df['B'].apply(fx)
>>> df
   A  B  new
0  4  9   34
1  4  9   34
2  4  9   34

Applying the same logic with a list comprehension

>>> df['new2'] = [x * 3 + 7 for x in df['B']]
>>> df
   A  B  new  new2
0  4  9   34    34
1  4  9   34    34
2  4  9   34    34
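For simple arithmetic like this, plain column arithmetic is a third option that gives the same values and is usually faster than calling a Python function per element. The sketch below compares the three approaches; the variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def fx(x):
    return x * 3 + 7

via_apply = df['B'].apply(fx)            # calls fx once per value
via_listcomp = [fx(x) for x in df['B']]  # same calls, collected into a plain list
via_vectorized = df['B'] * 3 + 7         # arithmetic on the whole column at once

print(via_apply.tolist(), via_listcomp, via_vectorized.tolist())
```

apply and list comprehensions remain useful when the per-element logic cannot be expressed as column arithmetic.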
