Slicing a 2D array into smaller 2D arrays

2024-12-31 08:37:00
admin

Problem description:

Is there a way to slice a 2D array in numpy into smaller 2D arrays?

Example

[[1,2,3,4],   ->    [[1,2] [3,4]   
 [5,6,7,8]]          [5,6] [7,8]]

So I basically want to cut a 2x4 array down into 2 2x2 arrays. I am looking for a generic solution that can be used on images.


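Solution 1:

There was another question a couple of months ago that clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense, since this keeps the first block's rows together. It also makes sense that nrows and ncols need to be part of the shape. -1 tells reshape to fill in whatever number is needed to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

You should be able to break your array into "blocks" using some combination of reshape and swapaxes: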

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))

This turns c

np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)

[out]:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

into

print(blockshaped(c, 2, 3))

[out]:
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 9 10 11]]

 [[12 13 14]
  [18 19 20]]

 [[15 16 17]
  [21 22 23]]]
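To make the reasoning behind the reshape/swapaxes chain easier to follow, here is a small step-by-step sketch on the same 4x6 array. The variable names step1, step2 and step3 are only illustrative and are not part of the original answer.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# Step 1: split each axis into (number of blocks, size within a block).
# Axes are now (block_row, row_in_block, block_col, col_in_block).
step1 = c.reshape(h // nrows, nrows, -1, ncols)

# Step 2: bring the two block axes next to each other.
# Axes are now (block_row, block_col, row_in_block, col_in_block).
step2 = step1.swapaxes(1, 2)

# Step 3: merge the two block axes into a single block index.
step3 = step2.reshape(-1, nrows, ncols)

print(step1.shape, step2.shape, step3.shape)
```

After step 2, indexing with (block_row, block_col) already picks out one block; step 3 only flattens that 2D grid of blocks into a list-like first axis.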

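I have also posted an inverse function, unblockshaped, and an N-dimensional generalization in separate answers. The generalization gives a bit more insight into the reasoning behind this algorithm.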
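The inverse is not reproduced on this page, so the following is only a sketch of how such a function could look, obtained by running the same reshape/swapaxes steps backwards. The signature unblockshaped(blocks, h, w) is an assumption for illustration, not necessarily the one used in the linked answer.

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Same function as above, repeated so the example is self-contained."""
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped(blocks, h, w):
    """Assumed inverse: rebuild an (h, w) array from (n, nrows, ncols) blocks."""
    n, nrows, ncols = blocks.shape
    assert n * nrows * ncols == h * w, "blocks do not fill an h-by-w array"
    return (blocks.reshape(h // nrows, -1, nrows, ncols)  # (block_row, block_col, row, col)
                  .swapaxes(1, 2)                         # (block_row, row, block_col, col)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
restored = unblockshaped(blockshaped(c, 2, 3), 4, 6)
print(np.array_equal(restored, c))
```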


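Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes), but it has the advantage of (1) always returning a view and (2) being able to handle arrays of any dimension.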
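To illustrate the view-versus-copy point, here is a minimal 2D sketch in the same spirit. It is not superbatfish's implementation; the name blocks_4d_view is hypothetical, and the "no copy" behaviour assumes a C-contiguous input.

```python
import numpy as np

def blocks_4d_view(arr, nrows, ncols):
    """Hypothetical helper: return blocks indexed as [block_row, block_col, row, col]."""
    h, w = arr.shape
    assert h % nrows == 0 and w % ncols == 0, "blocks must tile the array exactly"
    # Only reshape + swapaxes: for a C-contiguous array this does not copy data.
    return arr.reshape(h // nrows, nrows, w // ncols, ncols).swapaxes(1, 2)

c = np.arange(24).reshape(4, 6)
view = blocks_4d_view(c, 2, 3)

# Collapsing the two block axes into one (as blockshaped does) forces a copy here.
flat = view.reshape(-1, 2, 3)

print(np.shares_memory(view, c), np.shares_memory(flat, c))
```

Keeping the extra axis is what lets the result stay a view; the final reshape in blockshaped is the step that has to copy.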

Solution 2:

It seems to me that this is a task for numpy.split or some variant of it.

For example:

a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
a1 = np.split(a,3,axis=1) 
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)

If you have an NxN image, you can create, for example, a list of 2 NxN/2 subimages and then divide them along the other axis.

numpy.hsplit and numpy.vsplit are also available.
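Applied to the 2x4 array from the question, the split-based approach could look like the sketch below. The 4x4 array in the second half is an extra illustration that is not part of the original answer.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Two 2x2 blocks, split along the column axis.
blocks = np.hsplit(a, 2)

# For a grid of blocks, split the rows first and then each strip by columns.
b = np.arange(16).reshape(4, 4)
nested = [np.hsplit(strip, 2) for strip in np.vsplit(b, 2)]

print(blocks[0])
print(blocks[1])
```

These functions raise an error if the axis cannot be divided into equal parts; np.array_split is the variant that tolerates uneven sizes.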

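Solution 3:

There are some other answers that already seem well suited to your specific case, but your question piqued my interest in the possibility of a memory-efficient solution that works up to the maximum number of dimensions numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; it's just that I still haven't used most of the really fancy features numpy supports, so most of the time was spent researching what numpy has available and how much it can do so that I didn't have to do it myself.)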

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
array(_like); bpa is an array_like consisting of the number of blocks per axis
(minimum of 1, must be a divisor of the corresponding axis size of array). As
the blocks are selected using normal numpy slicing, they will be views rather
than copies; this is good for very large multidimensional arrays that are being
blocked, and for very large blocks, but it also means that the result must be
copied if it is to be modified (unless modifying the original data as well is
intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers (np.int was removed from NumPy)
        or (bpa < 1).any()             # all values in bpa must be >= 1
        or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))


    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))

        yield array[blockbounds]
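For comparison, here is a more compact sketch of the same idea (a generator that yields views for a given number of blocks per axis) built on itertools.product. The name iter_blocks is hypothetical and the parameter checking is reduced to a single test.

```python
import itertools

import numpy as np

def iter_blocks(array, blocks_per_axis):
    """Hypothetical compact variant: yield N-dimensional blocks as views."""
    array = np.asarray(array)
    if len(blocks_per_axis) != array.ndim:
        raise ValueError("need one block count per array axis")
    if any(n < 1 or dim % n for dim, n in zip(array.shape, blocks_per_axis)):
        raise ValueError("block counts must be >= 1 and divide the axis sizes")
    # One list of slices per axis, each slice covering one block along that axis.
    axis_slices = [
        [slice(i * (dim // n), (i + 1) * (dim // n)) for i in range(n)]
        for dim, n in zip(array.shape, blocks_per_axis)
    ]
    # The Cartesian product of the per-axis slices enumerates every block.
    for combo in itertools.product(*axis_slices):
        yield array[combo]

c = np.arange(24).reshape(4, 6)
blocks = list(iter_blocks(c, (2, 2)))
print(len(blocks), blocks[1].shape)
```

Because each block comes from plain slicing, it shares memory with the input, so modifying a block modifies the original array as well.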

Solution 4:

Your question is practically the same as this one. You can use a one-liner built on np.ndindex() and reshape():

def cutter(a, r, c):
    # integer division: np.ndindex() and reshape() need ints, not floats
    lenr = a.shape[0] // r
    lenc = a.shape[1] // c
    return np.array([a[i*r:(i+1)*r, j*c:(j+1)*c]
                     for (i, j) in np.ndindex(lenr, lenc)]).reshape(lenr, lenc, r, c)

To create the result you want:

a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])

cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])

Solution 5:

A small improvement on TheMeaningfulEngineer's answer that handles the case where the big 2D array cannot be perfectly sliced into equally sized subarrays:

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of blocks along the row axis
    bpc = ((n-1)//q + 1) #number of blocks along the column axis
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Example:

a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)

a->
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

out[0] ->
array([[0., 1., 2.],
       [5., 6., 7.]])

out[1]->
array([[3., 4.],
       [8., 9.]])

out[-1]->
array([[23., 24.]])

Solution 6:

For now, it only works when the big 2D array can be perfectly sliced into equally sized subarrays.

The code below slices

a ->array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

into this

block_array->
    array([[[ 0,  1,  2],
            [ 6,  7,  8]],

           [[ 3,  4,  5],
            [ 9, 10, 11]],

           [[12, 13, 14],
            [18, 19, 20]],

           [[15, 16, 17],
            [21, 22, 23]]])

p and q determine the block size

Code

import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size

p = 2     #block row size
q = 3     #block column size

blocks_per_row = m // p      #number of blocks along the row axis
blocks_per_column = n // q   #number of blocks along the column axis

block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p   
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)

block_array = np.array(block_array)

Solution 7:

If you want a solution that also handles the case where the matrix is not evenly divided, you can use this:

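from functools import reduce  # reduce is not a builtin in Python 3
from operator import add
import numpy as np

arr = np.arange(25).reshape(5, 5)  # example input; use your own 2D array here
half_split = np.array_split(arr, 2)

res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)  # join the per-strip lists into one flat list of blocks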
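To show what "not evenly divided" means in practice, here is a small sketch that splits a 5x5 array into a 2x2 grid and prints the resulting block shapes. The list-comprehension form is an alternative spelling chosen for illustration; it is not the code of the original answer.

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)

# Split into row strips first, then split every strip along the columns.
blocks = [block
          for strip in np.array_split(arr, 2, axis=0)
          for block in np.array_split(strip, 2, axis=1)]

shapes = [block.shape for block in blocks]
print(shapes)
```

np.array_split gives the leading sections the extra row or column when the axis length is not a multiple of the section count, so the blocks differ in shape instead of raising an error.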

Solution 8:

a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)

yields

[[7 6 2 4 4 2 5 2 3]
 [2 3 7 6 8 8 2 6 2]
 [4 1 3 1 3 8 1 3 7]
 [6 1 1 5 7 2 1 5 8]
 [8 8 7 6 6 1 8 8 4]
 [6 1 8 2 1 4 5 1 8]
 [7 3 4 2 5 6 1 2 7]
 [4 6 7 5 8 2 8 2 8]
 [6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
       [2, 3, 7],
       [4, 1, 3]]), array([[4, 4, 2],
       [6, 8, 8],
       [1, 3, 8]]), array([[5, 2, 3],
       [2, 6, 2],
       [1, 3, 7]])], [array([[6, 1, 1],
       [8, 8, 7],
       [6, 1, 8]]), array([[5, 7, 2],
       [6, 6, 1],
       [2, 1, 4]]), array([[1, 5, 8],
       [8, 8, 4],
       [5, 1, 8]])], [array([[7, 3, 4],
       [4, 6, 7],
       [6, 6, 5]]), array([[2, 5, 6],
       [5, 8, 2],
       [5, 6, 1]]), array([[1, 2, 7],
       [8, 2, 8],
       [2, 6, 4]])]]

Solution 9:

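Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In that case, it resizes the matrix with some interpolation before splitting. You need OpenCV for this. Note that the block height has to come from the number of row blocks and the block width from the number of column blocks; mixing the two up only works when both numbers are equal.

import numpy as np
import cv2

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of blocks along the rows (vertical direction)
    c_nbrs   number of blocks along the columns (horizontal direction)
    """

    arr_h, arr_w = arr.shape

    # largest size, not exceeding the original, that divides evenly
    size_w = (arr_w // c_nbrs) * c_nbrs
    size_h = (arr_h // r_nbrs) * r_nbrs

    if size_w != arr_w or size_h != arr_h:
        # cv2.resize takes the target size as (width, height)
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)

    block_h = size_h // r_nbrs   # rows per block
    block_w = size_w // c_nbrs   # columns per block

    return (arr.reshape(r_nbrs, block_h, -1, block_w)
               .swapaxes(1,2)
               .reshape(-1, block_h, block_w))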

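Solution 10:

Here is my solution. Note that this code does not actually create copies of the original array, so it works well with big data. Moreover, it does not crash if the array cannot be divided evenly (but you can easily add a condition for that by removing ceil and checking that v_slices and h_slices divide without a remainder).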

import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)

p, q = 2, 2
height, width = a.shape  # shape is (rows, columns)

v_slices = ceil(height / p)  # number of blocks along the rows
h_slices = ceil(width / q)   # number of blocks along the columns

for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block

This code changes (or, more precisely, gives you direct access to parts of) this array:

[[0 1 2]
 [3 4 5]
 [6 7 8]]

into this:

[[0 1]
 [3 4]]
[[2]
 [5]]
[[6 7]]
[[8]]

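If you need actual copies, Aenaon's code is what you are looking for.

If you are sure that the big array can be divided evenly, you can use the numpy splitting tools.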
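The loop above can also be wrapped in a generator so that callers simply iterate over the blocks. The sketch below is one possible way to do that; the name iter_tiles is hypothetical. It relies on the fact that numpy slices that run past the edge are clipped, which is why the last blocks come out smaller instead of raising an error.

```python
import numpy as np

def iter_tiles(a, p, q):
    """Hypothetical helper: yield p-by-q views of a, with smaller tiles at the edges."""
    rows, cols = a.shape
    for top in range(0, rows, p):
        for left in range(0, cols, q):
            # Slices beyond the array bounds are clipped automatically.
            yield a[top:top + p, left:left + q]

a = np.arange(9).reshape(3, 3)
tiles = list(iter_tiles(a, 2, 2))
print([t.shape for t in tiles])

# The tiles are views: writing into one changes the original array.
tiles[0][0, 0] = -1
print(a[0, 0])
```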

Solution 11:

Adding to @Aenaon's answer and his blockfy function: if you are working with color images / 3D arrays, here is my pipeline for creating 224 x 224 crops from a 3-channel input.

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of blocks along the row axis
    bpc = ((n-1)//q + 1) #number of blocks along the column axis
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

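Then extend to

import os
import numpy as np
from skimage import io   # assumed source of io.imread(..., as_gray=...)
from PIL import Image

# path_to_crop and path_save_crop are directories you define yourself
for file in os.listdir(path_to_crop):   ### list files in your folder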
   img = io.imread(path_to_crop + file, as_gray=False) ### open image 

   r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
   g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
   b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel

   for x in range(0,len(r)):
       img = np.array((r[x],g[x],b[x])) ### combine each channel into one patch by patch

       img = img.astype(np.uint8) ### cast back to proper integers

       img_swap = img.swapaxes(0, 2) ### need to swap axes due to the way things were processed
       
       img_swap_2 = img_swap.swapaxes(0, 1) ### do it again

       Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                        format = 'jpeg',
                                        subsampling=0,
                                        quality=100) ### save patch with new name etc 
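As an alternative to splitting the channels, padding with NaN and recombining, the channel axis can simply be carried along while slicing the first two axes. The sketch below is a swapped-in technique, not the pipeline of this answer: the name crop_tiles is hypothetical, and a zero-filled array stands in for a loaded image.

```python
import numpy as np

def crop_tiles(img, p, q):
    """Hypothetical helper: cut an (H, W, C) image into p-by-q tiles, all channels kept."""
    rows, cols = img.shape[:2]
    return [img[top:top + p, left:left + q]   # trailing channel axis is left untouched
            for top in range(0, rows, p)
            for left in range(0, cols, q)]

img = np.zeros((500, 300, 3), dtype=np.uint8)  # stand-in for an RGB image
tiles = crop_tiles(img, 224, 224)
print(len(tiles), tiles[0].shape, tiles[-1].shape)
```

This keeps the original dtype and axis order, so no cast back to uint8 and no axis swapping is needed before saving a tile; edge tiles are smaller than 224 x 224.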

Solution 12:

Method 1:

import numpy as np
from skimage.util import view_as_blocks

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)

# Slice the array into blocks
blocks = view_as_blocks(arr, block_shape)

print(blocks)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

Method 2 (concise):

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Block size
block_shape = (2, 2)

a_shape = np.array(a.shape)
print(a_shape)#[2 4]
#convert the arrays to lists
new_shape = (a_shape // block_shape).tolist() + list(block_shape)
print(new_shape)#[1,2,2, 2]


new_strides = (a.strides[0] * block_shape[0], a.strides[1] * block_shape[1]) + a.strides 
print(new_strides)
res = as_strided(a, shape = new_shape, strides = new_strides)
print(res)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]
'''

Method 2 (expanded):

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
# Block size
block_shape = (2, 2)

shape = a.shape
strides = a.strides

newShape1 =( shape[0] // block_shape[0] ) 
newShape2 =( shape[1] // block_shape[1] ) 

newShape = (newShape1,newShape2,block_shape[0],block_shape[1])
print(newShape)#(1, 2, 2, 2)

newStrides1 = strides[0] * block_shape[0]
newStrides2 = strides[1] * block_shape[1]

newStrides = (newStrides1, newStrides2,strides[0],strides[1] )
print(newStrides) # (64, 16, 32, 8) for 8-byte integers; (32, 8, 16, 4) for 4-byte integers

blocks = as_strided(a, shape = newShape, strides = newStrides)
print(blocks)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''
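Since as_strided performs no bounds checking, it can help to wrap the stride arithmetic in a small function that validates the block shape and returns a read-only view. The following is a sketch under those assumptions; the name strided_blocks is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def strided_blocks(a, block_shape):
    """Hypothetical helper: 4D read-only block view of a 2D array."""
    bh, bw = block_shape
    h, w = a.shape
    if h % bh or w % bw:
        raise ValueError("block_shape must evenly divide a.shape")
    shape = (h // bh, w // bw, bh, bw)
    # Stepping one block = stepping bh rows or bw columns; inside a block
    # the original element strides apply.
    strides = (a.strides[0] * bh, a.strides[1] * bw) + a.strides
    # writeable=False guards against accidental writes through the view.
    return as_strided(a, shape=shape, strides=strides, writeable=False)

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
blocks = strided_blocks(a, (2, 2))
print(blocks.shape)
print(blocks[0, 1])
```

Deriving the strides from a.strides rather than hard-coding byte counts keeps the function independent of the integer size on a given platform.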

Extra:

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)

grsize = 4
halfsize = grsize // 2

reshaped = a.reshape(-1, grsize)
aa = np.einsum('ij -> ij', reshaped[:, :halfsize])
bb = np.einsum('ij -> ij', reshaped[:, halfsize:])

# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
'''
[[[1 2]
  [5 6]]

 [[3 4]
  [7 8]]]
'''

# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)

print(final_output)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

tensordot:

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)
grsize = 4
halfsize = grsize // 2
reshaped = a.reshape(-1, grsize)
# Split the reshaped array into two halves using tensordot
aa = np.tensordot(a[:, :halfsize], np.ones((1,), dtype=int), axes=0)
bb = np.tensordot(a[:, halfsize:], np.ones((1,), dtype=int), axes=0)

# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)

print(final_output)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

Method 5:

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Block size
block_shape = (2, 2)

blockVertical = a.shape[0] // block_shape[0]

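blockHorizontal = a.shape[1] // block_shape[1]

# split each axis into (number of blocks, block size), then bring the block axes together
reshapedArray1 = a.reshape(blockVertical, block_shape[0], blockHorizontal, block_shape[1]).swapaxes(1,2)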
print(reshapedArray1)

'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''