Get frequency counts based on multiple DataFrame columns
- 2025-01-20 09:07:00
- admin (original post)
Problem description:
I have the following DataFrame.
Group | Size |
---|---|
Short | Small |
Short | Small |
Moderate | Medium |
Moderate | Small |
Tall | Large |
I want to count how many times each identical row appears in the DataFrame.
Group | Size | Time |
---|---|---|
Short | Small | 2 |
Moderate | Medium | 1 |
Moderate | Small | 1 |
Tall | Large | 1 |
Solution 1:
You can use groupby together with size.
import pandas as pd
# load the sample data
data = {'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'], 'Size': ['Small', 'Small', 'Medium', 'Small', 'Large']}
df = pd.DataFrame(data)
Option 1:
dfg = df.groupby(by=["Group", "Size"]).size()
# which results in a pandas.core.series.Series
Group Size
Moderate Medium 1
Small 1
Short Small 2
Tall Large 1
dtype: int64
Option 2:
dfg = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")
# which results in a pandas.core.frame.DataFrame
Group Size Time
0 Moderate Medium 1
1 Moderate Small 1
2 Short Small 2
3 Tall Large 1
Option 3:
dfg = df.groupby(by=["Group", "Size"], as_index=False).size()
# which results in a pandas.core.frame.DataFrame (pandas 1.1+; the count column is named "size")
Group Size size
0 Moderate Medium 1
1 Moderate Small 1
2 Short Small 2
3 Tall Large 1
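With Option 3 the count column comes back under the default name `size`. A minimal sketch of renaming it to `Time` so it matches the requested output, assuming pandas 1.1 or later (where `size()` with `as_index=False` returns a DataFrame); the variable name `counts` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# With as_index=False, size() returns a DataFrame whose count column is named "size"
counts = df.groupby(["Group", "Size"], as_index=False).size()

# Rename the count column to match the requested output
counts = counts.rename(columns={"size": "Time"})
print(counts)
```

Compared with Option 2, this keeps the grouping keys as ordinary columns from the start and only needs a rename at the end.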
Solution 2:
Update: since pandas 1.1, value_counts
accepts multiple columns
df.value_counts(["Group", "Size"])
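The call above returns a Series indexed by (Group, Size) and sorted by count in descending order. A minimal sketch of turning it into the three-column layout from the question, assuming pandas 1.1 or later; the name `counts` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# value_counts on a DataFrame counts unique combinations of the listed columns
# and sorts them by frequency, most common first
counts = df.value_counts(["Group", "Size"]).reset_index(name="Time")
print(counts)
```

The order of rows with equal counts is not something to rely on; sort explicitly with `sort_values` if a fixed order matters.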
You can also try pd.crosstab()
Group Size
Short Small
Short Small
Moderate Medium
Moderate Small
Tall Large
pd.crosstab(df.Group,df.Size)
Size Large Medium Small
Group
Moderate 0 1 1
Short 0 0 2
Tall 1 0 0
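Edit: to get your output
import numpy as np
# zeros are replaced with NaN so that stack() can drop the empty combinations
# (this relies on stack() dropping NaN rows, its long-standing default)
pd.crosstab(df.Group, df.Size).replace(0, np.nan).stack().reset_index().rename(columns={0: 'Time'})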
Out[591]:
Group Size Time
0 Moderate Medium 1.0
1 Moderate Small 1.0
2 Short Small 2.0
3 Tall Large 1.0
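The NaN replacement above is why the counts come out as floats (1.0, 2.0). A minimal sketch of a variant that keeps integer counts by stacking first and then filtering out the zero rows; this is an alternative to the answer's approach, not the answer's own code, and the names `table` and `long_counts` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# Wide contingency table: one row per Group, one column per Size
table = pd.crosstab(df["Group"], df["Size"])

# Back to long form; every (Group, Size) combination is present, including zeros
long_counts = table.stack().reset_index(name="Time")

# Keep only the combinations that actually occur
long_counts = long_counts[long_counts["Time"] > 0].reset_index(drop=True)
print(long_counts)
```

Filtering on the count avoids depending on how `stack()` treats missing values, and the `Time` column stays an integer type.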
Solution 3:
Another possibility is to use .pivot_table()
with aggfunc='size'
df_solution = df.pivot_table(index=['Group','Size'], aggfunc='size')
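A minimal sketch of turning this result into the requested three-column layout. It assumes the call returns a Series indexed by (Group, Size), which is what happens when no `values` argument is passed; the name `counts` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# Assumption: with no values argument, aggfunc="size" gives a Series
# whose MultiIndex holds the (Group, Size) pairs
df_solution = df.pivot_table(index=["Group", "Size"], aggfunc="size")

# Name the counts and move the index levels back into columns
counts = df_solution.rename("Time").reset_index()
print(counts)
```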