Split a string every nth character

2024-11-25 08:50:00
How do I split a string every nth character?

'1234567890'   →   ['12', '34', '56', '78', '90']


Solution 1:

>>> line = '1234567890'
>>> n = 2
>>> [line[i:i+n] for i in range(0, len(line), n)]
['12', '34', '56', '78', '90']
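
The same slicing idea is not specific to strings. As a minimal sketch (the helper name `chunk_sequence` is made up for illustration), it should work for any sequence that supports slicing, and the last chunk is simply shorter when the length is not a multiple of n:

```python
def chunk_sequence(seq, n):
    """Return consecutive slices of `seq`, each at most `n` items long."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [seq[i:i + n] for i in range(0, len(seq), n)]

print(chunk_sequence('123456789', 2))      # the last chunk is the single leftover character
print(chunk_sequence([1, 2, 3, 4, 5], 2))  # lists are chunked into sub-lists
print(chunk_sequence(b'abcdef', 4))        # bytes are chunked into bytes
```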

Solution 2:


>>> import re
>>> re.findall('..','1234567890')
['12', '34', '56', '78', '90']


>>> import re
>>> re.findall('..?', '123456789')
['12', '34', '56', '78', '9']


>>> import re
>>> re.findall('.{1,2}', '123456789')
['12', '34', '56', '78', '9']
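
The patterns above hard-code a chunk size of 2. A sketch of how the same idea might be generalized to any n follows; the helper name `split_with_regex` is hypothetical. Since `.` does not match a newline by default, the sketch passes `re.DOTALL` so that newlines are kept rather than skipped:

```python
import re

def split_with_regex(text, n):
    """Split `text` into chunks of up to `n` characters using re.findall()."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # '.{1,n}' greedily matches n characters, or fewer at the very end.
    # re.DOTALL lets '.' match newlines, which would otherwise be skipped.
    return re.findall('.{1,%d}' % n, text, flags=re.DOTALL)

print(split_with_regex('123456789', 2))
print(split_with_regex('ab\ncd', 2))
```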


Solution 3:

Python's standard library already has a function for this.

>>> from textwrap import wrap
>>> s = '1234567890'
>>> wrap(s, 2)
['12', '34', '56', '78', '90']


>>> help(wrap)
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
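
As the help text suggests, wrap() is designed for wrapping paragraphs, so it treats whitespace specially. The short comparison below is only an illustration of that difference on a string that contains a space; for strings without whitespace the two approaches agree:

```python
from textwrap import wrap

s = 'ab cd'
by_wrap = wrap(s, 3)                                  # breaks at the space and drops it
by_slice = [s[i:i + 3] for i in range(0, len(s), 3)]  # keeps every character

print(by_wrap)
print(by_slice)
```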

Solution 4:

Another common way of grouping elements into groups of length n:

>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))
['12', '34', '56', '78', '90']

This method comes straight from the documentation for zip().
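
The idiom works because `[iter(s)] * 2` holds the same iterator twice, so zip() pulls consecutive characters into each tuple. The sketch below spells that out with an explicit iterator and also shows a caveat: zip() stops as soon as it cannot fill a tuple, so a leftover character is silently dropped.

```python
s = '123456789'           # odd length: one character is left over

it = iter(s)
pairs = zip(it, it)       # same as zip(*[iter(s)] * 2): both arguments share one iterator
chunks = [''.join(pair) for pair in pairs]

print(chunks)             # the final '9' is missing
```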

Solution 5:

I think this is shorter and more readable than the itertools version:

def split_by_n(seq, n):
    '''A generator to divide a sequence into chunks of n units.'''
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n('1234567890', 2)))
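
One thing to watch for: with n <= 0 the loop above never ends, because the slice taken off the front is empty and the remaining sequence never shrinks. A guarded variant might look like this (the name `split_by_n_checked` is made up for illustration):

```python
def split_by_n_checked(seq, n):
    """Like split_by_n above, but refuses chunk sizes that would loop forever."""
    if n <= 0:
        # Raised when iteration starts, since this is a generator function.
        raise ValueError("n must be a positive integer")
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n_checked('1234567890', 3)))
```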

Solution 6:

Using more-itertools from PyPI:

>>> from more_itertools import sliced
>>> list(sliced('1234567890', 2))
['12', '34', '56', '78', '90']

Solution 7:


s = '1234567890'
o = []
while s:
    o.append(s[:2])  # take the first two characters
    s = s[2:]        # and continue with the rest

Solution 8:


Python 2.x:

from itertools import izip_longest    

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

Python 3.x:

from itertools import zip_longest

def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')
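
Note that grouper() yields tuples of characters rather than strings. To get strings back, each tuple can be joined; using '' as the fill value is one way to let the padding vanish in the join. The sketch below repeats only the fill branch of the recipe so that it runs on its own:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length tuples, padding the last one with `fillvalue`."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

s = '123456789'
as_tuples = list(grouper(s, 2))
as_strings = [''.join(group) for group in grouper(s, 2, fillvalue='')]

print(as_tuples[-1])   # the last tuple is padded with None
print(as_strings)      # padding with '' leaves a shorter final chunk
```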


Solution 9:

This can be done with a simple for loop.

a = '1234567890a'
result = []

for i in range(0, len(a), 2):
    result.append(a[i : i + 2])

The result looks like ['12', '34', '56', '78', '90', 'a']

Solution 10:



x = "1234567890"
n = 2
my_list = []
for i in range(0, len(x), n):
    my_list.append(x[i:i+n])  # slice out the next n characters

print(my_list)

['12', '34', '56', '78', '90']

Solution 11:


s = '1234567890'
print([s[idx:idx+2] for idx in range(len(s)) if idx % 2 == 0])


['12', '34', '56', '78', '90']

Solution 12:


from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

s = '1234567890'
print(list(split_every(2, list(s))))
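
Because split_every() only needs an iterator, it can also chunk input that cannot be sliced, such as a generator. The pieces are lists, so for text they still have to be joined. A small sketch, repeating the function so that it runs on its own:

```python
from itertools import islice

def split_every(n, iterable):
    """Yield lists of up to `n` items taken from any iterable."""
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

# A generator has no len() and cannot be sliced, but it can still be chunked.
digits = (str(d) for d in range(1, 8))
chunks = [''.join(piece) for piece in split_every(3, digits)]
print(chunks)
```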

Solution 13:


n = 2  
line = "this is a line split into n characters"  
line = [line[i * n:i * n+n] for i, blah in enumerate(line[::n])]

Solution 14:

>>> from functools import reduce
>>> from operator import add
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x)]
['12', '34', '56', '78', '90']
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x, x)]
['123', '456', '789']

Solution 15:


def split(s, n):
    if len(s) < n:
        return []  # a trailing chunk shorter than n is dropped
    else:
        return [s[:n]] + split(s[n:], n)

print(split('1234567890', 2))


def split(s, n):
    if len(s) < n:
        return []
    elif len(s) == n:
        return [s]
    else:
        return split(s[:n], n) + split(s[n:], n)


Solution 16:


import more_itertools as mit

s = "1234567890"

# older releases took the arguments as grouper(2, s)
["".join(c) for c in mit.grouper(s, 2)]

["".join(c) for c in mit.chunked(s, 2)]

["".join(c) for c in mit.windowed(s, 2, step=2)]

["".join(c) for c in  mit.split_after(s, lambda x: int(x) % 2 == 0)]


['12', '34', '56', '78', '90']


Solution 17:


from itertools import groupby, chain, repeat, cycle

text = "wwworldggggreattecchemggpwwwzaz"
n = 3
c = cycle(chain(repeat(0, n), repeat(1, n)))
res = ["".join(g) for _, g in groupby(text, lambda x: next(c))]


['www', 'orl', 'dgg', 'ggr', 'eat', 'tec', 'che', 'mgg', 'pww', 'wza', 'z']

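Solution 18:


from itertools import groupby

text = "abcdefghij"
n = 3

result = []
# Pair each character with its position so that position // n can serve as the group key.
for idx, chunk in groupby(enumerate(text), key=lambda pair: pair[0] // n):
    result.append("".join(char for _, char in chunk))

# result = ['abc', 'def', 'ghi', 'j']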

Solution 19:

Starting with Python 3.12, the itertools library includes the batched() iterator:

>>> from itertools import batched
>>> s = '1234567890'
>>> [''.join(batch) for batch in batched(s, 2)]
['12', '34', '56', '78', '90']
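
On interpreters older than 3.12 the import above fails. One possible workaround is a small fallback along the lines of the rough equivalent given in the itertools documentation; the sketch below is such a stand-in, not the library's actual implementation:

```python
from itertools import islice

try:
    from itertools import batched          # available from Python 3.12
except ImportError:
    def batched(iterable, n):
        """Fallback: yield tuples of up to `n` items from `iterable`."""
        if n < 1:
            raise ValueError('n must be at least one')
        it = iter(iterable)
        while True:
            batch = tuple(islice(it, n))
            if not batch:
                return
            yield batch

print([''.join(batch) for batch in batched('123456789', 2)])
```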

Solution 20:


def SplitEvery(string, length):
    if len(string) <= length: return [string]
    # Ceiling division, so that a shorter final chunk is counted as well.
    sections = -(-len(string) // length)
    lines = []
    start = 0
    for i in range(sections):
        line = string[start:start+length]
        lines.append(line)
        start += length
    return lines


text = '1234567890'
lines = SplitEvery(text, 2)

# output: ['12', '34', '56', '78', '90']

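Solution 21:

You can find the full article with updated solutions on GitHub.

Note: the solutions are written for Python 3.10+.

Using list comprehension and slicing: This is a simple, direct approach that uses Python's slicing to split the string into chunks of n characters. A list comprehension iterates over the string with a step of n and slices from the current index to the current index plus n.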

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters.

    This function uses list comprehension and slicing to split the string into groups.
    It includes error handling to check if `n` is a positive integer.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings, where each string is a group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use list comprehension and slicing to split the string into groups of `n` characters.
    return [s[i:i + n] for i in range(0, len(s), n)]

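Using the re (regex) module: Python's re module provides a function called findall() that finds all occurrences of a pattern in a string. We can use it with a regular expression that matches up to n arbitrary characters to split the string into chunks of n characters.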

import re

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters.

    This function uses the `re.findall()` function from the `re` (regex) module to solve the problem.
    It includes error handling to check if `n` is a positive integer.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings, where each string is a group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use `re.findall()` to split the string into groups of `n` characters.
    return re.findall(f'.{{1,{n}}}', s)

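Using the textwrap module: Python's textwrap module provides a function called wrap() that splits a string into a list of output lines of a specified width. We can use it to split the string into chunks of n characters. Note that wrap() prefers to break at whitespace, so it only matches the other approaches for strings that contain no whitespace.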

import textwrap

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters.

    This function uses the `textwrap.wrap()` function from the `textwrap` module to solve the problem.
    It includes error handling to check if `n` is a positive integer.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings, where each string is a group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use `textwrap.wrap()` to split the string into groups of `n` characters.
    return textwrap.wrap(s, n)

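Using a loop and string concatenation: We can also solve this by looping over the string manually and concatenating characters onto a new string. Once that string holds n characters, we append it to the list and reset it to the empty string.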

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters.

    This function uses a loop and string concatenation to solve the problem.
    It includes error handling to check if `n` is a positive integer.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings, where each string is a group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Initialize an empty list to store the groups.
    result = []

    # Initialize an empty string to store the current group.
    group = ''

    # Iterate over each character in the string.
    for c in s:
        group += c  # Add the current character to the current group.

        # If the current group has `n` characters, add it to the result and reset the group.
        if len(group) == n:
            result.append(group)
            group = ''

    # If there are any remaining characters in the group, add it to the result.
    if group:
        result.append(group)
    return result

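Using a generator function: We can write a generator function that takes a string and a number n and yields chunks of n characters from the string. This approach is memory-efficient because it does not need to hold all the chunks in memory at once.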

from typing import Generator

def split_string_into_groups(string: str, n: int) -> Generator[str, None, None]:
    """
    Generator function to split a string into groups of `n` consecutive characters.

    Args:
        string (str): The input string to be split.
        n (int): The size of the groups.

    Yields:
        str: The next group of `n` characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> list(split_string_into_groups("HelloWorld", 3))
        ['Hel', 'loW', 'orl', 'd']
        >>> list(split_string_into_groups("Python", 2))
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Iterate over the string with a step size of `n`.
    for i in range(0, len(string), n):
        # Yield the next group of `n` characters.
        yield string[i:i + n]
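
A subtlety of the generator version: its body does not run until the first item is requested, so the ValueError for a bad n surfaces when iteration starts, not when the function is called. If failing early is preferred, one option is to validate in a regular function and return an inner generator. This is only a sketch, and the name `split_eagerly_checked` is hypothetical:

```python
from typing import Iterator

def split_eagerly_checked(string: str, n: int) -> Iterator[str]:
    """Validate `n` immediately, then return a lazy iterator over the chunks."""
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    def chunks() -> Iterator[str]:
        for i in range(0, len(string), n):
            yield string[i:i + n]

    return chunks()

groups = split_eagerly_checked("HelloWorld", 3)
print(next(groups))    # only the first chunk has been produced so far
print(list(groups))    # the remaining chunks

try:
    split_eagerly_checked("HelloWorld", 0)   # fails here, before any iteration
except ValueError as exc:
    print(exc)
```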

Using itertools: Python's itertools module provides a function called islice() that slices an iterable. We can use it to split the string into chunks of n characters.

from itertools import islice
from typing import Iterator

def split_string_into_groups(s: str, n: int) -> Iterator[str]:
    """
    Splits a string into groups of `n` consecutive characters using itertools.islice().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Yields:
        Iterator[str]: An iterator that yields each group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> list(split_string_into_groups("HelloWorld", 3))
        ['Hel', 'loW', 'orl', 'd']
        >>> list(split_string_into_groups("Python", 2))
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Create an iterator from the string.
    it = iter(s)

    # Use itertools.islice() to yield groups of `n` characters from the iterator.
    while True:
        group = ''.join(islice(it, n))

        # Stop once the iterator is exhausted.
        if not group:
            break

        yield group

Using numpy: We can also use the numpy library. We convert the string into a numpy array of characters and then use reshape() to split the array into rows of n characters.

import numpy as np

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using numpy.reshape().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Convert the string to a list of characters
    chars = list(s)

    # Add extra empty strings only if the length of `s` is not a multiple of `n`
    if len(s) % n != 0:
        chars += [''] * (n - len(s) % n)

    # Reshape the array into a 2D array with the number of groups as the number of rows and n as the number of columns
    arr = np.array(chars).reshape(-1, n)

    # Convert each row of the 2D array back to a string; the empty-string padding disappears in the join
    result = [''.join(row) for row in arr]

    return result

Using pandas: The pandas library provides a groupby() method that splits data into groups. We can use it to split the string into chunks of n characters.

import pandas as pd

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a given string into groups of `n` consecutive characters.

    This function uses the pandas library to convert the string into a pandas Series,
    then uses the groupby method to group the characters into groups of `n` characters.
    The groups are then converted back to a list of strings.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings, where each string is a group of `n` consecutive characters from the input string.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Convert the string to a pandas Series
    s = pd.Series(list(s))

    # Use pandas groupby to group the characters
    # The index of each character is divided by `n` using integer division,
    # which groups the characters into groups of `n` characters.
    groups = s.groupby(s.index // n).agg(''.join)

    # Convert the result back to a list and return it
    return groups.tolist()

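Using more_itertools: The more_itertools library provides a function called chunked() that splits an iterable into chunks of a specified size. We can use it to split the string into chunks of n characters.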

import more_itertools

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using more_itertools.chunked().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use more_itertools.chunked() to split the string into chunks of `n` characters.
    chunks = more_itertools.chunked(s, n)

    # Convert each chunk to a string and add it to the result list.
    result = [''.join(chunk) for chunk in chunks]

    return result

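Using toolz: The toolz library provides a function called partition_all() that splits an iterable into chunks of a specified size. We can use it to split the string into chunks of n characters.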

import toolz

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using toolz.partition_all().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use toolz.partition_all() to split the string into chunks of `n` characters.
    chunks = toolz.partition_all(n, s)

    # Convert each chunk to a string and add it to the result list.
    result = [''.join(chunk) for chunk in chunks]

    return result

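Using cytoolz: The cytoolz library provides a function called partition_all() that splits an iterable into chunks of a specified size. We can use it to split the string into chunks of n characters.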

from cytoolz import partition_all

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using cytoolz.partition_all().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use cytoolz.partition_all() to split the string into chunks of `n` characters.
    chunks = partition_all(n, s)

    # Convert each chunk to a string and add it to the result list.
    result = [''.join(chunk) for chunk in chunks]

    return result

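Using itertools: The itertools library provides a function called zip_longest which, when given the same iterator n times, groups an iterable into chunks of a specified size and pads the last one. We can use it to split the string into chunks of n characters.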

from itertools import zip_longest

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using itertools.zip_longest().

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use itertools.zip_longest() to split the string into chunks of `n` characters.
    args = [iter(s)] * n
    chunks = zip_longest(*args, fillvalue='')

    # Convert each chunk to a string and add it to the result list.
    result = [''.join(chunk) for chunk in chunks]

    return result

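Using list + map + join + zip: This problem can also be solved by combining these building blocks. In the version below, zip is given the same iterator n times, so it yields tuples of n consecutive characters; join turns each tuple into a string, and any leftover characters are appended at the end.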

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using list, map, join, and zip.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Use list, map, join, and zip to split the string into chunks of `n` characters.
    result = [''.join(chunk) for chunk in zip(*[iter(s)] * n)]

    # If the string length is not a multiple of `n`, add the remaining characters to the result.
    remainder = len(s) % n
    if remainder != 0:
        result.append(s[-remainder:])

    return result

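Using recursion and slicing: We can also solve this with recursion and slicing. We define a recursive function that takes a string and a number n and returns a list of chunks of n characters. The function slices off the first n characters and calls itself on the rest of the string until no more than n characters remain.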

def split_string_into_groups(s: str, n: int) -> list[str]:
    """
    Splits a string into groups of `n` consecutive characters using recursion with slicing.

    Args:
        s (str): The input string to be split.
        n (int): The size of the groups.

    Returns:
        list[str]: A list of strings where each string is a group of `n` consecutive characters.

    Raises:
        ValueError: If `n` is not a positive integer.

    Examples:
        >>> split_string_into_groups("HelloWorld", 3)
        ['Hel', 'loW', 'orl', 'd']
        >>> split_string_into_groups("Python", 2)
        ['Py', 'th', 'on']
    """
    # Check if `n` is a positive integer.
    if n <= 0:
        raise ValueError("The group size must be a positive integer")

    # Base case: if the length of the string is less than or equal to `n`, return a list containing `s`.
    if len(s) <= n:
        return [s]

    # Recursive case: split the string into two parts and recursively call `split_string_into_groups` on the rest of the string.
    return [s[:n]] + split_string_into_groups(s[n:], n)
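
The recursive version makes one call per chunk, so a long input with a small n can exceed the interpreter's recursion limit. The sketch below illustrates this and falls back to a loop; the helper names `split_recursive` and `split_iterative` are made up for illustration:

```python
import sys

def split_recursive(s, n):
    """Recursive splitting, one call per chunk (same idea as the function above)."""
    if len(s) <= n:
        return [s]
    return [s[:n]] + split_recursive(s[n:], n)

def split_iterative(s, n):
    """Loop-based splitting with no recursion depth to worry about."""
    return [s[i:i + n] for i in range(0, len(s), n)]

# Build a string that needs about as many chunks as the recursion limit allows frames.
long_text = 'ab' * sys.getrecursionlimit()

try:
    chunks = split_recursive(long_text, 2)
except RecursionError:
    chunks = split_iterative(long_text, 2)

print(len(chunks))
```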
