Finding the average of a list

Problem description:

How do I find the arithmetic mean of a list in Python? For example:

[1, 2, 3, 4]  ⟶  2.5

Solution 1:

For Python 3.8+, use statistics.fmean for numerical stability with floats. (Fast.)

For Python 3.4+, use statistics.mean for numerical stability with floats. (Slower.)

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(xs)  # = 20.11111111111111

For older versions of Python 3, use

sum(xs) / len(xs)

For Python 2, convert len to a float to get float division:

sum(xs) / float(len(xs))
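
The example above calls statistics.mean; the faster statistics.fmean recommended for Python 3.8+ is used the same way. A minimal sketch, reusing the xs list from this answer:

```python
import statistics

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# fmean converts every value to float and always returns a float (Python 3.8+).
result = statistics.fmean(xs)
print(result)
```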

Solution 2:

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
sum(xs) / len(xs)

Solution 3:

Use numpy.mean:

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import numpy as np
print(np.mean(xs))

Solution 4:

For Python 3.4+, use mean() from the new statistics module to calculate the average:

from statistics import mean
xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
mean(xs)

Solution 5:

Why would you use reduce() for this when Python has a perfectly cromulent sum() function?

print sum(l) / float(len(l))

(The float() is necessary in Python 2 to force Python to do floating-point division.)

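Solution 6:

If you are using Python >= 3.4, there is a statistics library:

https://docs.python.org/3/library/statistics.html

You can use its mean function like this. Suppose you have a list of numbers whose mean you want to find:

numbers = [11, 13, 12, 15, 17]  # named "numbers" to avoid shadowing the built-in list
import statistics as s
s.mean(numbers)

It also offers other functions, such as standard deviation, variance, mode, harmonic mean, and median, which are very useful.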
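
As a rough illustration of those other functions, here is a minimal sketch using the same numbers. It assumes Python 3.6+, since harmonic_mean was added to the statistics module after mean:

```python
import statistics

# Same numbers as in the answer, under a name that does not shadow list().
numbers = [11, 13, 12, 15, 17]

middle = statistics.median(numbers)           # middle value of the sorted data
spread = statistics.variance(numbers)         # sample variance (divides by n - 1)
deviation = statistics.stdev(numbers)         # square root of the sample variance
harmonic = statistics.harmonic_mean(numbers)  # reciprocal of the mean of reciprocals

print(middle, spread, deviation, harmonic)
```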

Solution 7:

Edit:

I added two other ways to get the average of a list (both require Python 3.8+). Here is the comparison I made:

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd
import math

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()


def mean7():
    return statistics.fmean(l)


def mean8():
    return math.fsum(l) / len(l)


for func in [mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 ]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

These are the results I got:

mean1 took:  0.09751558300000002
mean2 took:  0.005496791999999973
mean3 took:  0.07754683299999998
mean4 took:  0.055743208000000044
mean5 took:  0.018134082999999968
mean6 took:  0.6663848750000001
mean7 took:  0.004305374999999945
mean8 took:  0.003203333000000086

Interesting! It looks like math.fsum(l) / len(l) is the fastest way, then statistics.fmean(l), and only then sum(l) / len(l). Nice!

Thank you @Asclepius for showing me these two other ways!
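
Speed is not the only difference between these functions. Below is a small sketch, with made-up values, of why math.fsum and statistics.fmean are described as numerically stable: a plain running float sum can drop small terms that fsum keeps. The explicit loop stands in for naive summation here, because the built-in sum() may itself compensate for rounding in newer Python releases.

```python
import math
import statistics

# Hypothetical data chosen so that a left-to-right float sum loses the 1.0.
data = [1e16, 1.0, -1e16]

def naive_mean(values):
    """Add the values one by one in floating point, then divide."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)

naive = naive_mean(data)                 # the 1.0 is rounded away
accurate = math.fsum(data) / len(data)   # fsum tracks the lost low-order bits
via_fmean = statistics.fmean(data)

print(naive, accurate, via_fmean)
```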

Old answer:

In terms of efficiency and speed, these are the results I got when testing the other answers:

# test mean calculation

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()



for func in [mean1, mean2, mean3, mean4, mean5, mean6]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

The results:

mean1 took:  0.17030245899968577
mean2 took:  0.002183011999932205
mean3 took:  0.09744236000005913
mean4 took:  0.07070840100004716
mean5 took:  0.022754742999950395
mean6 took:  1.6689282460001778

Clearly the winner is:
sum(l) / len(l)

Solution 8:

Instead of casting to float, you can add 0.0 to the sum:

def avg(l):
    return sum(l, 0.0) / len(l)
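
This works because the second argument to sum() is the start value, so the running total is a float from the first addition onward. A quick check, as a sketch:

```python
def avg(values):
    # Starting the sum at 0.0 makes the total a float even for integer input.
    return sum(values, 0.0) / len(values)

total = sum([1, 2, 3, 4], 0.0)
print(type(total).__name__, avg([1, 2, 3, 4]))
```

Note that, like the original, this sketch raises ZeroDivisionError for an empty list.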

Solution 9:

sum(l) / float(len(l)) is the right answer, but for completeness, you can also compute the average with a single reduce:

>>> reduce(lambda x, y: x + y / float(len(l)), l, 0)
20.111111111111114

Note that this can introduce a slight rounding error:

>>> sum(l) / float(len(l))
20.111111111111111

Solution 10:

I tried the options above, but they did not work for me. Try this:

from statistics import mean

n = [11, 13, 15, 17, 19]

print(n)
print(mean(n))

Tested with Python 3.5.

Solution 11:

Or use the pandas Series.mean method:

pd.Series(sequence).mean()

Demo:

>>> import pandas as pd
>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> pd.Series(l).mean()
20.11111111111111
>>>

From the docs:

Series.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Here is the relevant documentation:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.mean.html

And the documentation as a whole:

https://pandas.pydata.org/pandas-docs/stable/10min.html

Solution 12:

I had a similar question to solve in one of Udacity's problems. Instead of using a dedicated built-in function, I wrote this:

def list_mean(n):

    summing = float(sum(n))
    count = float(len(n))
    if n == []:
        return False
    return float(summing/count)

It is much longer than usual, but for a beginner it is quite a challenge.
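
For comparison, here is a possible variant of the same exercise that avoids sum() and len() altogether and signals an empty list with an exception rather than False. Raising ValueError is a design choice of this sketch, not of the original answer:

```python
def list_mean(numbers):
    """Mean computed with an explicit loop; raises ValueError on empty input."""
    total = 0.0
    count = 0
    for number in numbers:
        total += number
        count += 1
    if count == 0:
        raise ValueError("list_mean() requires at least one number")
    return total / count

print(list_mean([1, 2, 3, 4]))
```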

Solution 13:

As a beginner, I just wrote this:

L = [15, 18, 2, 36, 12, 78, 5, 6, 9]

total = 0

def average(numbers):
    total = sum(numbers)
    total = float(total)
    return total / len(numbers)

print(average(L))

Solution 14:

If you want more than just the mean (a.k.a. the average), you might check out scipy's stats module:

from scipy import stats
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(stats.describe(l))

# DescribeResult(nobs=9, minmax=(2, 78), mean=20.11111111111111, 
# variance=572.3611111111111, skewness=1.7791785448425341, 
# kurtosis=1.9422716419666397)

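Solution 15:

To use reduce for taking a running average, you need to track both the running total and the number of elements seen so far. Because that pair is not simply an element of the list, you also have to pass reduce an extra argument, the initial value to fold into.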

>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> running_average = reduce(lambda aggr, elem: (aggr[0] + elem, aggr[1]+1), l, (0.0,0))
>>> running_average
(181.0, 9)
>>> running_average[0]/running_average[1]
20.111111111111111
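
On Python 3, reduce has to be imported from functools. The following sketch repeats the fold above and, as an assumption about what "running average" might mean in practice, also lists the mean after each element using itertools.accumulate:

```python
from functools import reduce
from itertools import accumulate

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# Fold the list into a (total, count) pair, starting from (0.0, 0).
total, count = reduce(lambda aggr, elem: (aggr[0] + elem, aggr[1] + 1), l, (0.0, 0))
overall = total / count

# Running mean after each element: prefix sum divided by prefix length.
running = [s / n for n, s in enumerate(accumulate(l), start=1)]

print(overall, running[0], running[-1])
```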

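Solution 16:

Both approaches give similar results for integers, or agree to at least 10 decimal places. If you really care about long floating-point values, though, the two can differ. Which approach to use depends on what you want to achieve.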

>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> print reduce(lambda x, y: x + y, l) / len(l)
20
>>> sum(l)/len(l)
20

Floating-point values

>>> print reduce(lambda x, y: x + y, l) / float(len(l))
20.1111111111
>>> print sum(l)/float(len(l))
20.1111111111

@Andrew Clark was correct in his statement.

Solution 17:

Suppose that

x = [
    [-5.01,-5.43,1.08,0.86,-2.67,4.94,-2.51,-2.25,5.56,1.03],
    [-8.12,-3.48,-5.52,-3.78,0.63,3.29,2.09,-2.13,2.86,-3.33],
    [-3.68,-3.54,1.66,-4.11,7.39,2.08,-2.59,-6.94,-2.26,4.33]
]

You can see that x has dimensions 3*10. If you need the mean of each row, you can type this:

theMean = np.mean(x, axis=1)

Don't forget to import numpy as np.
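
If numpy is not available, row and column means can also be sketched with the standard library alone. The matrix below is a small made-up example, and statistics.fmean assumes Python 3.8+:

```python
from statistics import fmean

# Small made-up matrix (2 rows, 3 columns) for illustration.
matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 8.0],
]

row_means = [fmean(row) for row in matrix]           # like axis=1 in numpy
column_means = [fmean(col) for col in zip(*matrix)]  # like axis=0 in numpy

print(row_means, column_means)
```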

Solution 18:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

l = list(map(float, l))  # materialize the map so len() also works on Python 3
print('%.2f' % (sum(l) / len(l)))

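Solution 19:

Use the following Python code to find the average of a list:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(sum(l) / len(l))  # true division; // would truncate the result to 20

Try it, it's simple.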

Solution 20:

print reduce(lambda x, y: x + y, l)/(len(l)*1.0)

or, as posted previously,

sum(l)/(len(l)*1.0)

The 1.0 is there to make sure you get floating-point division.

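Solution 21:

Combining a couple of the answers above, I came up with the following, which works with reduce and does not assume that L is available inside the reducing function:

from functools import reduce  # reduce is no longer a built-in on Python 3
from operator import truediv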

L = [15, 18, 2, 36, 12, 78, 5, 6, 9]

def sum_and_count(x, y):
    try:
        return (x[0] + y, x[1] + 1)
    except TypeError:
        return (x + y, 2)

print(truediv(*reduce(sum_and_count, L)))  # prints 20.11111111111111

Solution 22:

I want to add just another approach:

import itertools,operator
list(itertools.accumulate(l,operator.add)).pop(-1) / len(l)

Solution 23:

A simple solution is avemedi-lib:

pip install avemedi_lib

Then include it in your script:

from avemedi_lib.functions import average, get_median, get_median_custom


test_even_array = [12, 32, 23, 43, 14, 44, 123, 15]
test_odd_array = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Getting average value of list items
print(average(test_even_array))  # 38.25

# Getting median value for ordered or unordered numbers list
print(get_median(test_even_array))  # 27.5
print(get_median(test_odd_array))  # 5

# You can use your own sorted and your count functions
a = sorted(test_even_array)
n = len(a)

print(get_median_custom(a, n))  # 27.5

Enjoy.

Solution 24:

Unlike statistics.mean(), statistics.fmean() works for a list of objects with different numeric types. For example:

from decimal import Decimal
import statistics

data = [1, 4.5, Decimal('3.5')]
statistics.mean(data)     # TypeError
statistics.fmean(data)    # OK

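This is because, under the hood, mean() uses statistics._sum(), which returns a data type to convert the mean into (and Decimal is not part of Python's number hierarchy), while fmean() uses math.fsum(), which just adds the numbers up (and is also much faster than the built-in sum() function).

One consequence is that fmean() always returns a float (because averaging involves division), while mean() may return a different type depending on the numeric types in the data. The following example shows that mean() can return different types, whereas for the same lists fmean() returns the float 3.0 in every case.

from fractions import Fraction
statistics.mean([2, Fraction(4,1)])   # Fraction(3, 1) <--- fractions.Fraction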
statistics.mean([2, 4.0])             # 3.0            <--- float
statistics.mean([2, 4])               # 3              <--- int
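
For contrast, a short sketch that passes the same three lists to fmean() (Python 3.8+); each call should yield a float:

```python
import statistics
from fractions import Fraction

samples = [
    [2, Fraction(4, 1)],
    [2, 4.0],
    [2, 4],
]

# fmean converts every input to float before averaging.
results = [statistics.fmean(sample) for sample in samples]
print(results)
```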

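In addition, unlike sum(data)/len(data), fmean() (and mean()) works not only on lists but also on general iterables such as generators. This is useful if your data is large and/or you need to do some ad hoc filtering before computing the mean.

For example, if a list contains NaN values, the mean is NaN. To average the list while ignoring NaN values, you can filter them out and pass a generator to fmean: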

data = [1, 2, float('nan')]
statistics.fmean(x for x in data if x==x)     # 1.5

Note that numpy has a function, numpy.nanmean(), that does the same job.

import numpy as np
np.nanmean(data)                              # 1.5
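
The x == x test in the generator works because NaN is the only float value that does not compare equal to itself. A more explicit spelling of the same filter, sketched with the standard library's math.isnan:

```python
import math
import statistics

data = [1, 2, float('nan')]

# Generator expression: NaN values are skipped lazily, with no intermediate list.
clean_mean = statistics.fmean(x for x in data if not math.isnan(x))
print(clean_mean)
```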