A Complete Guide to Generator Functions in the Python Standard Library (Very Detailed)

码农世界 2024-05-17 Backend

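Generator functions for filtering

These yield a subset of the items produced by the input iterable, without changing the items themselves. Most filtering generators take a predicate argument: a one-argument Boolean function that is applied to each input item to decide whether the item is included in the output.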

itertools.compress(it, selectors_it)

Consumes two iterables in parallel; yields an item from it whenever the corresponding item in selectors_it is truthy.

import itertools

it = "床前明月光疑是地上霜"
selectors_it = [1, 0, True, False, '', 1, 1, 0, 1, None]
res = itertools.compress(it, selectors_it)
print(list(res))  # ['床', '明', '疑', '是', '上']

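itertools.dropwhile(predicate, it)

Consumes it, skipping items for as long as predicate returns a truthy value; once predicate returns a falsy value, yields that item and every remaining item without further checks.

numbers = [3, 5, 7, 10, 12, 13, 14, 15]
# Drop the leading items that are less than 10
dropped_numbers = itertools.dropwhile(lambda x: x < 10, numbers)
print(list(dropped_numbers))  # [10, 12, 13, 14, 15]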
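
The example above cannot show that dropwhile stops consulting the predicate after its first falsy result, because its input happens to be sorted. A minimal sketch with made-up data (the list `values` is purely illustrative) makes the behavior visible:

```python
import itertools

# Hypothetical sample: small values appear again after the first item >= 10.
values = [1, 3, 12, 4, 20, 2]

# The predicate is consulted only until it first returns False (at 12);
# everything from that point on is yielded unchecked, including 4 and 2.
tail = list(itertools.dropwhile(lambda x: x < 10, values))
print(tail)  # [12, 4, 20, 2]
```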

Built-in function filter(predicate, it)

Applies predicate to each item of it and yields the item if predicate(item) is truthy; if predicate is None, only truthy items are yielded.

filter_numbers = filter(lambda x: x > 10, numbers)
print(list(filter_numbers))  # [12, 13, 14, 15]
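
The predicate=None case mentioned in the description can be sketched as follows; the list `mixed` is illustrative data, not taken from the original article:

```python
# Hypothetical mixed data containing several falsy values.
mixed = [0, 1, '', 'a', None, [], [0], False, True]

# With None as the predicate, filter keeps only the truthy items.
truthy_items = list(filter(None, mixed))
print(truthy_items)  # [1, 'a', [0], True]
```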

itertools.filterfalse(predicate, it)

Same as filter, with the predicate logic negated: yields an item whenever predicate returns a falsy value for it.

numbers = [3, 5, 7, 10, 12, 13, 14, 15]
# Keep the items that are not greater than 10
filter_numbers = itertools.filterfalse(lambda x: x > 10, numbers)
print(list(filter_numbers))  # [3, 5, 7, 10]

itertools.islice(it, stop) or itertools.islice(it, start, stop, step=1)

Yields items from a slice of it, similar to s[:stop] or s[start:stop:step], except that it can be any iterable and the operation is lazy.

it = "床前明月光疑是地上霜"
s1 = itertools.islice(it, 4)
print(list(s1))  # ['床', '前', '明', '月']
s2 = itertools.islice(it, 4, 7)
print(list(s2))  # ['光', '疑', '是']
s3 = itertools.islice(it, 1, 7, 3)
print(list(s3))  # ['前', '光']
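
Because islice is lazy, it also works on inputs that cannot be sliced with square brackets, such as an endless generator. A small sketch, where the generator `naturals` is a hypothetical helper written only for this illustration:

```python
import itertools


def naturals():
    """Hypothetical endless generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1


# naturals()[2:10:3] would raise TypeError; islice consumes only what it needs.
picked = list(itertools.islice(naturals(), 2, 10, 3))
print(picked)  # [2, 5, 8]
```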

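itertools.takewhile(predicate, it)

Returns an iterator that yields items from it for as long as predicate returns a truthy value. As soon as predicate returns a falsy value, takewhile stops and performs no further checks.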

numbers = [3, 5, 7, 10, 12, 13, 14, 15]
take_numbers = itertools.takewhile(lambda x: x < 10, numbers)
print(list(take_numbers))  # [3, 5, 7]
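
takewhile is easy to confuse with filter. The sketch below runs both on the same made-up, unsorted list (`values` is illustrative) to show that takewhile stops at the first failing item, whereas filter examines every item:

```python
import itertools

# Hypothetical unsorted sample.
values = [1, 3, 12, 4, 20, 2]

# takewhile stops for good at 12, the first item that fails the predicate.
head = list(itertools.takewhile(lambda x: x < 10, values))
# filter keeps testing and also picks up 4 and 2 later in the list.
kept = list(filter(lambda x: x < 10, values))

print(head)  # [1, 3]
print(kept)  # [1, 3, 4, 2]
```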

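Generator functions for mapping

These yield items computed from each individual item of the input iterable, producing one result per input item. If the input comes from more than one iterable, output stops as soon as any one of them is exhausted.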

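itertools.accumulate(it, [func])

Yields running totals, starting with the first item unchanged. If func is provided, it is applied to the first two items, then to that result and the next item, and so on, and each intermediate result is yielded.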

sample = [5, 4, 2, 8, 7, 6, 3, 0, 9, 1]
accumulated_numbers = itertools.accumulate(sample)
print(list(accumulated_numbers))  # [5, 9, 11, 19, 26, 32, 35, 35, 44, 45]
accumulated_numbers_mul = itertools.accumulate(sample, lambda x, y: x * y)
print(list(accumulated_numbers_mul))  # [5, 20, 40, 320, 2240, 13440, 40320, 0, 0, 0]
accumulated_numbers_max = itertools.accumulate(sample, max)
print(list(accumulated_numbers_max))  # [5, 5, 5, 8, 8, 8, 8, 8, 9, 9]
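
To make the "result plus next item" rule concrete, here is a simplified pure-Python sketch of the same idea. The function name `running` is made up for this illustration, and the sketch is not the actual implementation (it ignores the optional initial argument, for instance):

```python
import itertools
import operator


def running(iterable, func=operator.add):
    """Simplified sketch of accumulate: yield every intermediate result."""
    iterator = iter(iterable)
    try:
        total = next(iterator)  # the first item is yielded unchanged
    except StopIteration:
        return  # empty input: nothing to yield
    yield total
    for item in iterator:
        total = func(total, item)  # previous result combined with next item
        yield total


sample = [5, 4, 2, 8, 7, 6, 3, 0, 9, 1]
same_sum = list(running(sample)) == list(itertools.accumulate(sample))
same_max = list(running(sample, max)) == list(itertools.accumulate(sample, max))
print(same_sum, same_max)  # True True
```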

Built-in function enumerate(iterable, start=0)

Yields 2-tuples of the form (index, item), where index counts up from start and item is taken from iterable.

it = "床前明月光"
es = enumerate(it)
print(list(es))  # [(0, '床'), (1, '前'), (2, '明'), (3, '月'), (4, '光')]
es1 = enumerate(it, start=100)
print(list(es1))  # [(100, '床'), (101, '前'), (102, '明'), (103, '月'), (104, '光')]

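Built-in function map(func, it1, [it2, ..., itN])

Applies func to each item of it1 in turn and yields the result; if N iterables are given, func must accept N arguments, and the iterables are consumed in parallel.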

sample = [1, 2, 3, 4, 5]
m1 = map(lambda x: x * 2, sample)
print(list(m1))  # [2, 4, 6, 8, 10]
m2 = map(str, sample)  # use the built-in str
print(list(m2))  # ['1', '2', '3', '4', '5']
numbers = [6, 7, 8, 9, 1]
m3 = map(max, sample, numbers)
print(list(m3))  # [6, 7, 8, 9, 5]
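
In the example above both lists have the same length. When the lengths differ, map stops as soon as the shortest iterable is exhausted, as this small sketch with made-up lists (`short` and `longer`) suggests:

```python
import operator

# Hypothetical inputs of different lengths.
short = [1, 2, 3]
longer = [10, 20, 30, 40, 50]

# operator.add takes two arguments, one from each iterable;
# the extra items 40 and 50 are never used.
sums = list(map(operator.add, short, longer))
print(sums)  # [11, 22, 33]
```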

Generator functions that merge multiple input iterables

itertools.chain(it1, ..., itN)

Yields all items from it1, then all items from it2, and so on, seamlessly.

c1 = itertools.chain([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(list(c1))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
c2 = itertools.chain([1, 2, 3], 'abc', (4, 5))
print(list(c2))  # [1, 2, 3, 'a', 'b', 'c', 4, 5]
c3 = itertools.chain(range(1, 5), range(4, 6))
print(list(c3))  # [1, 2, 3, 4, 4, 5]

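itertools.chain.from_iterable(it)

Yields all items from each iterable produced by it, one after another, seamlessly; the items of it must themselves be iterables, for example a list of tuples.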

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
c1 = itertools.chain.from_iterable(list_of_lists)
print(list(c1))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
c2 = itertools.chain.from_iterable(enumerate('床前明月光'))
print(list(c2))  # [0, '床', 1, '前', 2, '明', 3, '月', 4, '光']
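
chain(*rows) and chain.from_iterable(rows) give the same result for a finite list, but only from_iterable can take a lazy or endless source, because it does not have to unpack the outer iterable up front. A hedged sketch with illustrative names (`rows`, `lazy_rows`):

```python
import itertools

# Hypothetical nested data, including an empty inner list.
rows = [[1, 2], [3], [], [4, 5, 6]]
flat_a = list(itertools.chain(*rows))
flat_b = list(itertools.chain.from_iterable(rows))
print(flat_a == flat_b, flat_b)  # True [1, 2, 3, 4, 5, 6]

# An endless source of iterables: range(1), range(2), range(3), ...
# chain(*lazy_rows) would never return; from_iterable pulls one at a time.
lazy_rows = (range(n) for n in itertools.count(1))
first_six = list(itertools.islice(itertools.chain.from_iterable(lazy_rows), 6))
print(first_six)  # [0, 0, 1, 0, 1, 2]
```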

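Built-in function zip(it1, ..., itN, strict=False)

Yields N-tuples built from items taken in parallel from the input iterables, silently stopping as soon as any one of them is exhausted, unless strict=True is given (the strict argument requires Python 3.10 or later).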

z1 = zip('ABCD', range(1, 6), ['N', 'O', 'P', 'Q'])
print(list(z1))  # [('A', 1, 'N'), ('B', 2, 'O'), ('C', 3, 'P'), ('D', 4, 'Q')]
z2 = zip('ABCD', range(1, 6), ['N', 'O', 'P', 'Q'], strict=True)
print(list(z2))  # raises ValueError: zip() argument 2 is longer than argument 1

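itertools.zip_longest(it1, ..., itN, fillvalue=None)

Yields N-tuples built from items taken in parallel from the input iterables, stopping only when the longest iterable is exhausted; missing values are filled with fillvalue.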

z1 = itertools.zip_longest('ABC', range(1, 6), ['N', 'O', 'P'], fillvalue='N/A')
print(list(z1))  # [('A', 1, 'N'), ('B', 2, 'O'), ('C', 3, 'P'), ('N/A', 4, 'N/A'), ('N/A', 5, 'N/A')]

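itertools.product(it1, ..., itN, repeat=1)

Cartesian product: yields N-tuples made by combining one item from each input iterable, just as nested for loops would; repeat sets how many times the input iterables are repeated.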

print(list(itertools.product("ABC")))  # [('A',), ('B',), ('C',)]
p1 = itertools.product('AB', [1, 2, 3])
print(list(p1))  # [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 3)]
p2 = itertools.product('AB', repeat=2)
print(list(p2))  # [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
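
The "nested for loops" comparison can be checked directly. The sketch below builds the same pairs with a nested comprehension, and also shows that repeat=2 behaves like passing the same iterable twice:

```python
import itertools

# Two nested loops: the rightmost iterable varies fastest, as in product.
nested = [(letter, number) for letter in 'AB' for number in [1, 2, 3]]
same_as_loops = nested == list(itertools.product('AB', [1, 2, 3]))

# repeat=2 is equivalent to writing the input iterable twice.
same_as_repeat = (list(itertools.product('AB', repeat=2))
                  == list(itertools.product('AB', 'AB')))

print(same_as_loops, same_as_repeat)  # True True
```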

Generators that expand each input item into multiple output items

itertools.combinations(it, out_len)

Yields combinations of out_len items drawn from the items yielded by it.

c1 = itertools.combinations('ABC', 2)
print(list(c1))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]

itertools.combinations_with_replacement(it, out_len)

Yields combinations of out_len items drawn from the items yielded by it, including combinations with repeated items.

c1 = itertools.combinations_with_replacement('ABC', 2)
print(list(c1))  # [('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')]

itertools.permutations(it, out_len=None)

Yields permutations of out_len items drawn from the items yielded by it; by default, out_len is len(list(it)).

c1 = itertools.permutations('ABC')
print(list(c1))  # [('A', 'B', 'C'), ('A', 'C', 'B'), ('B', 'A', 'C'), ('B', 'C', 'A'), ('C', 'A', 'B'), ('C', 'B', 'A')]
c2 = itertools.permutations('ABC', 2)
print(list(c2)) # [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]
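
As a sanity check on the three combinatoric generators, the number of tuples each one yields can be compared with the usual counting formulas. This sketch assumes Python 3.8 or later for math.comb and math.perm, and uses an illustrative four-letter input:

```python
import itertools
import math

items = 'ABCD'  # hypothetical input with n = 4 distinct items
n, k = len(items), 2

n_comb = len(list(itertools.combinations(items, k)))
n_cwr = len(list(itertools.combinations_with_replacement(items, k)))
n_perm = len(list(itertools.permutations(items, k)))

print(n_comb, math.comb(n, k))         # 6 6    -> C(n, k)
print(n_cwr, math.comb(n + k - 1, k))  # 10 10  -> C(n + k - 1, k)
print(n_perm, math.perm(n, k))         # 12 12  -> n! / (n - k)!
```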

itertools.count(start=0, step=1)

Yields numbers starting at start, incremented by step, indefinitely.

ct = itertools.count()
print(next(ct))  # 0
print((next(ct), next(ct), next(ct)))  # (1, 2, 3)
ct_step = itertools.count(start=10, step=5)
print((next(ct_step), next(ct_step), next(ct_step)))  # (10, 15, 20)

itertools.cycle(it)

Yields items from it, storing a copy of each, then yields the entire sequence repeatedly, indefinitely.

cy = itertools.cycle('ABC')
print(next(cy), next(cy), next(cy), next(cy), next(cy))  # A B C A B
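
Since cycle never ends, it is usually paired with something finite. One common pattern, sketched here with made-up task and worker names, is round-robin assignment via zip, which stops when the finite side runs out:

```python
import itertools

# Hypothetical data for the illustration.
tasks = ['t1', 't2', 't3', 't4', 't5']
workers = ['alice', 'bob']

# zip stops when tasks is exhausted, so the endless cycle is safe here.
assignment = list(zip(tasks, itertools.cycle(workers)))
print(assignment)
# [('t1', 'alice'), ('t2', 'bob'), ('t3', 'alice'), ('t4', 'bob'), ('t5', 'alice')]
```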

itertools.pairwise(it)

Yields successive overlapping pairs taken from the input iterable (Python 3.10 or later).

cp = itertools.pairwise(range(7))
print(list(cp))  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

itertools.repeat(item, [times])

Yields the given item repeatedly, indefinitely unless a number of times is given.

rp = itertools.repeat(7)
print(next(rp), next(rp), next(rp))  # 7 7 7
rpt = itertools.repeat(8, 5)
print(list(rpt))  # [8, 8, 8, 8, 8]
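
A typical use of repeat is to supply a constant argument to map. In this short sketch the endless repeat(2) is harmless because map stops when range(5) is exhausted:

```python
import itertools

# pow(base, 2) for each base in range(5); repeat(2) supplies the exponent.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```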

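Generator functions for rearranging items

Built-in function reversed(seq)

Yields items from seq in reverse order, from last to first; seq must be a sequence or an object that implements the __reversed__ special method.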

animals = ['duck', 'eagle', 'rat', 'giraffe', 'bear', 'bat', 'dolphin', 'shark', 'lion']
print(list(reversed(animals)))  # ['lion', 'shark', 'dolphin', 'bat', 'bear', 'giraffe', 'rat', 'eagle', 'duck']
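
The __reversed__ requirement can be illustrated with a small class. `Countdown` is a hypothetical class written for this sketch; it is not a sequence, yet reversed() accepts it, while a plain generator is rejected:

```python
class Countdown:
    """Hypothetical class: not a sequence, but it implements __reversed__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Forward iteration counts down: start, ..., 2, 1
        return iter(range(self.start, 0, -1))

    def __reversed__(self):
        # reversed() calls this method, so it counts up: 1, 2, ..., start
        return iter(range(1, self.start + 1))


forward = list(Countdown(3))
backward = list(reversed(Countdown(3)))
print(forward)   # [3, 2, 1]
print(backward)  # [1, 2, 3]

# A plain generator is neither a sequence nor reversible.
error_name = None
try:
    reversed(n for n in range(3))
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError
```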

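itertools.groupby(it, key=None)

Yields 2-tuples of the form (key, group), where key is the grouping criterion and group is a generator yielding the items in the group.

g = itertools.groupby("LLLLAAGGG")
print(list(g))
# [('L', <itertools._grouper object at 0x...>), ('A', <itertools._grouper object at 0x...>), ('G', <itertools._grouper object at 0x...>)]
animals = ['duck', 'eagle', 'rat', 'giraffe', 'bear', 'bat', 'dolphin', 'shark', 'lion']
animals.sort(key=len)
print(animals)  # ['rat', 'bat', 'duck', 'bear', 'lion', 'eagle', 'shark', 'giraffe', 'dolphin']
for length, group in itertools.groupby(animals, len):
    print(length, '-->', list(group))  # list(group) shows the items in each group
# Output:
# 3 --> ['rat', 'bat']
# 4 --> ['duck', 'bear', 'lion']
# 5 --> ['eagle', 'shark']
# 7 --> ['giraffe', 'dolphin']
for length, group in itertools.groupby(reversed(animals), len):
    print(length, '-->', list(group))
# 7 --> ['dolphin', 'giraffe']
# 5 --> ['shark', 'eagle']
# 4 --> ['lion', 'bear', 'duck']
# 3 --> ['bat', 'rat']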
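
The example sorts animals before grouping for a reason: groupby only groups consecutive items that share a key. A minimal sketch with a short made-up list (`words`) shows what happens without sorting:

```python
import itertools

# Hypothetical unsorted input: lengths alternate 3, 4, 3, 4.
words = ['rat', 'duck', 'bat', 'bear']

# Without sorting, equal keys that are not adjacent form separate groups.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(words, len)]
print(unsorted_groups)
# [(3, ['rat']), (4, ['duck']), (3, ['bat']), (4, ['bear'])]

# Sorting by the same key first brings equal keys together.
sorted_groups = [(k, list(g))
                 for k, g in itertools.groupby(sorted(words, key=len), len)]
print(sorted_groups)
# [(3, ['rat', 'bat']), (4, ['duck', 'bear'])]
```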

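itertools.tee(it, n=2)

Returns a tuple of n generators, each independently yielding the items of the input iterable.

t = itertools.tee('ABC')
print(list(t))  # [<itertools._tee object at 0x...>, <itertools._tee object at 0x...>]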
t1, t2 = t
print(list(t1))  # ['A', 'B', 'C']
print(list(t2))  # ['A', 'B', 'C']
print(list(zip(*itertools.tee('ABC'))))  # [('A', 'A'), ('B', 'B'), ('C', 'C')]
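
"Independently" means that advancing one of the returned iterators does not advance the others. The sketch below uses a one-shot generator as the source (the names `source`, `a` and `b` are illustrative); after calling tee, only the returned iterators should be used, not the source itself:

```python
import itertools

# A one-shot generator: normally it could be consumed only once.
source = (n * n for n in range(4))
a, b = itertools.tee(source)

first_two = [next(a), next(a)]  # advance a by two items
all_b = list(b)                 # b still starts from the beginning
rest_a = list(a)                # a continues where it left off

print(first_two)  # [0, 1]
print(all_b)      # [0, 1, 4, 9]
print(rest_a)     # [4, 9]
```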
