Slicing a 2D array into smaller 2D arrays
- 2024-12-31 08:37:00
- admin (original post)
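Question:
Is there a way to slice a 2D array in numpy into smaller 2D arrays?
Example
[[1,2,3,4],   ->   [[1,2]   [3,4]
 [5,6,7,8]]         [5,6]   [7,8]]
So I basically want to cut a 2x4 array into two 2x2 arrays. I am looking for a generic solution that can be used on images.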
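Solution 1:
There was another question a couple of months ago that gave me the idea of using reshape and swapaxes. The h//nrows makes sense because it keeps the first block's rows together. It also makes sense that nrows and ncols need to be part of the shape. The -1 tells reshape to fill in whatever number makes the reshape valid. Armed with the form of the solution, I just tried things until I found a formula that worked.
You should be able to break your array into "blocks" using some combination of reshape and swapaxes: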
import numpy as np

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size
    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))
This turns c
np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)
[out]:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
into
print(blockshaped(c, 2, 3))
[out]:
[[[ 0 1 2]
[ 6 7 8]]
[[ 3 4 5]
[ 9 10 11]]
[[12 13 14]
[18 19 20]]
[[15 16 17]
[21 22 23]]]
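I have posted an inverse function, unblockshaped, in another answer, and an N-dimensional generalization in yet another. The generalization gives a little more insight into the reasoning behind this algorithm.
Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes), but it has the advantages of (1) always returning a view and (2) handling arrays of any dimension.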
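The inverse function mentioned above is not reproduced in this thread. A minimal sketch of what it might look like follows. It assumes the blocks were produced by blockshaped, and the signature unblockshaped(blocks, h, w) is chosen here for illustration only:

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Split a 2D array into (n, nrows, ncols) blocks, as in the answer above."""
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped(blocks, h, w):
    """Assumed inverse: reassemble (n, nrows, ncols) blocks into shape (h, w)."""
    n, nrows, ncols = blocks.shape
    # Undo the final reshape, then swap the axes back, then flatten to 2D.
    return (blocks.reshape(h // nrows, -1, nrows, ncols)
                  .swapaxes(1, 2)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
blocks = blockshaped(c, 2, 3)
restored = unblockshaped(blocks, 4, 6)
print(np.array_equal(restored, c))  # True
```

The inverse simply applies the same three steps in reverse order, which is why the original height and width have to be passed in.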
Solution 2:
It seems to me that this is a task for numpy.split or some variant.
For example:
import numpy as np
a = np.arange(30).reshape([5,6]) #a.shape = (5,6)
a1 = np.split(a,3,axis=1)
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)
If you have an NxN image, you can create, for example, a list of two Nx(N/2) subimages and then divide them along the other axis.
numpy.hsplit and numpy.vsplit are also available.
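As a rough illustration of the "split, then split along the other axis" idea, here is a small sketch on a 4x4 array standing in for an image. The variable names are made up for this example:

```python
import numpy as np

img = np.arange(16).reshape(4, 4)  # stand-in for an NxN image

# First split into left/right halves, then split each half top/bottom.
halves = np.split(img, 2, axis=1)
quarters = [q for half in halves for q in np.split(half, 2, axis=0)]

shapes = [q.shape for q in quarters]
print(shapes)       # [(2, 2), (2, 2), (2, 2), (2, 2)]
print(quarters[0])  # top-left block: rows 0-1, columns 0-1
```

Note that np.split raises an error when the axis does not divide evenly, so this only suits sizes that are exact multiples of the block size.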
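Solution 3:
There are other answers that already seem well suited to your specific case, but your question piqued my interest in a memory-efficient solution that works up to the maximum number of dimensions numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple. It is just that I still have not used most of the really fancy features numpy offers, so most of the time went into researching what numpy had available and how much it could do for me so that I did not have to do it myself.)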
import numpy as np

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
    array(_like); bpa is an array_like consisting of the number of blocks per axis
    (minimum of 1, must be a divisor of the corresponding axis size of array). As
    the blocks are selected using normal numpy slicing, they will be views rather
    than copies; this is good for very large multidimensional arrays that are being
    blocked, and for very large blocks, but it also means that the result must be
    copied if it is to be modified (unless modifying the original data as well is
    intended)."""
    array = np.asarray(array)  # accept array_like input
    bpa = np.asarray(bpa)  # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:  # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers
            or (bpa < 1).any()  # all values in bpa must be >= 1
            or (np.asarray(array.shape) % bpa).any()):  # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))

    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))
        yield array[blockbounds]
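The docstring's point about views versus copies can be checked with a small sketch. It uses plain slicing on a toy array rather than blockgen itself, and the variable names are only for illustration:

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# A block taken with basic slicing is a view onto the same memory.
block = arr[0:2, 0:2]
shares = np.shares_memory(block, arr)
block[0, 0] = -1           # this also changes arr[0, 0]

# An explicit copy is independent of the original array.
independent = arr[0:2, 0:2].copy()
independent[0, 0] = 99     # arr is left untouched

print(shares, arr[0, 0])   # True -1
```

So if the blocks are going to be modified and the original must stay intact, calling .copy() on each yielded block is one way to do it.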
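Solution 4:
Your question is practically the same as another one. You can use a one-liner with np.ndindex() and reshape():
import numpy as np

def cutter(a, r, c):
    lenr = a.shape[0] // r  # number of blocks along the rows
    lenc = a.shape[1] // c  # number of blocks along the columns
    return np.array([a[i*r:(i+1)*r, j*c:(j+1)*c]
                     for (i, j) in np.ndindex(lenr, lenc)]).reshape(lenr, lenc, r, c)
To create the result you want:
a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])
cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])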
Solution 5:
A small improvement on TheMeaningfulEngineer's answer that handles the case where the big 2D array cannot be perfectly sliced into equally sized subarrays:
import numpy as np

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  # image row size
    n = a.shape[1]  # image column size
    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1)  # number of block rows
    bpc = ((n-1)//q + 1)  # number of block columns
    M = p * bpr
    N = q * bpc
    A = np.nan * np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a
    block_list = []
    for row_block in range(bpr):
        previous_row = row_block * p
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]
            ## append
            if block.size:
                block_list.append(block)
    return block_list
Example:
a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)
a->
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
out[0] ->
array([[0., 1., 2.],
[5., 6., 7.]])
out[1]->
array([[3., 4.],
[8., 9.]])
out[-1]->
array([[23., 24.]])
Solution 6:
As of now, it only works when the big 2D array can be perfectly sliced into equally sized subarrays.
The code below slices
a ->array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
into this
block_array->
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]]])
p and q determine the block size.
Code:
import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size
p = 2  #block row size
q = 3  #block column size
blocks_per_row = m // p     #number of row blocks (vertical steps)
blocks_per_column = n // q  #number of column blocks (horizontal steps)
block_array = []
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)
block_array = np.array(block_array)
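Solution 7:
If you want a solution that also handles the case where the matrix does not divide evenly, you can use this:
from functools import reduce
from operator import add
import numpy as np

arr = np.arange(25).reshape(5, 5)  # example input
half_split = np.array_split(arr, 2)
res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)  # flat list of the four blocks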
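To see what np.array_split does with a size that does not divide evenly, here is a small sketch on a 5x5 array. The list comprehension is just an alternative spelling of the map/reduce above, not the answer's own code:

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)

# Split into 2 row groups, then split each group into 2 column groups.
blocks = [b
          for rows in np.array_split(arr, 2, axis=0)
          for b in np.array_split(rows, 2, axis=1)]

shapes = [b.shape for b in blocks]
print(shapes)  # [(3, 3), (3, 2), (2, 3), (2, 2)]
```

Unlike np.split, np.array_split does not raise on uneven sizes. The leading pieces simply get one extra row or column.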
Solution 8:
import numpy as np
a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)
yields
[[7 6 2 4 4 2 5 2 3]
[2 3 7 6 8 8 2 6 2]
[4 1 3 1 3 8 1 3 7]
[6 1 1 5 7 2 1 5 8]
[8 8 7 6 6 1 8 8 4]
[6 1 8 2 1 4 5 1 8]
[7 3 4 2 5 6 1 2 7]
[4 6 7 5 8 2 8 2 8]
[6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
[2, 3, 7],
[4, 1, 3]]), array([[4, 4, 2],
[6, 8, 8],
[1, 3, 8]]), array([[5, 2, 3],
[2, 6, 2],
[1, 3, 7]])], [array([[6, 1, 1],
[8, 8, 7],
[6, 1, 8]]), array([[5, 7, 2],
[6, 6, 1],
[2, 1, 4]]), array([[1, 5, 8],
[8, 8, 4],
[5, 1, 8]])], [array([[7, 3, 4],
[4, 6, 7],
[6, 6, 5]]), array([[2, 5, 6],
[5, 8, 2],
[5, 6, 1]]), array([[1, 2, 7],
[8, 2, 8],
[2, 6, 4]])]]
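The nested list produced this way can be put back together with np.block, which gives a quick sanity check that no data was lost. A minimal sketch, using a deterministic array instead of random values so the result is reproducible:

```python
import numpy as np

a = np.arange(81).reshape(9, 9)

# Same splitting as above: 3 row bands, each cut into 3 column blocks.
out = [np.hsplit(x, 3) for x in np.vsplit(a, 3)]

# np.block joins the inner lists horizontally and the outer list vertically.
rebuilt = np.block(out)
print(np.array_equal(rebuilt, a))  # True
```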
Solution 9:
Here is a solution based on unutbu's answer that handles the case where the matrix cannot be divided equally. In that case, it resizes the matrix with some interpolation before splitting it. You need OpenCV for this. Note that cv2.resize expects the target size as (width, height), the reverse of NumPy's (rows, cols) shape order.
import numpy as np
import cv2

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of block rows
    c_nbrs   number of block columns
    """
    arr_h, arr_w = arr.shape
    # largest size, not exceeding the original, that divides evenly
    size_w = (arr_w // c_nbrs) * c_nbrs
    size_h = (arr_h // r_nbrs) * r_nbrs
    if size_w != arr_w or size_h != arr_h:
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)
    nrows = size_h // r_nbrs  # rows per block
    ncols = size_w // c_nbrs  # columns per block
    return (arr.reshape(r_nbrs, nrows, c_nbrs, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))
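Solution 10:
I am posting my solution. Note that this code does not actually create copies of the original array, so it works well with big data. Moreover, it does not crash if the array cannot be divided evenly (but you can easily add a condition for that by dropping ceil and checking that v_slices and h_slices come out without a remainder).
import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)
p, q = 2, 2
height, width = a.shape
v_slices = ceil(height / p)  # number of block rows
h_slices = ceil(width / q)   # number of block columns
for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block
This code turns (or, more precisely, gives you direct access to parts of) this array: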
[[0 1 2]
[3 4 5]
[6 7 8]]
into this:
[[0 1]
[3 4]]
[[2]
[5]]
[[6 7]]
[[8]]
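If you need actual copies, Aenaon's code is what you are looking for.
If you are sure that the big array can be divided evenly, you can use the numpy splitting tools.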
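The optional divisibility check mentioned above could be sketched as a generator. The name iter_blocks and the strict flag are made up for this example and are not part of the original answer:

```python
import numpy as np
from math import ceil

def iter_blocks(a, p, q, strict=False):
    """Yield p-by-q views of a 2D array; edge blocks may be smaller.

    With strict=True (a hypothetical option), refuse shapes that do not
    divide evenly instead of yielding smaller edge blocks.
    """
    rows, cols = a.shape
    if strict and (rows % p or cols % q):
        raise ValueError(f"shape {a.shape} is not divisible by ({p}, {q})")
    for i in range(ceil(rows / p)):
        for j in range(ceil(cols / q)):
            yield a[i * p:(i + 1) * p, j * q:(j + 1) * q]

a = np.arange(9).reshape(3, 3)
shapes = [b.shape for b in iter_blocks(a, 2, 2)]
print(shapes)  # [(2, 2), (2, 1), (1, 2), (1, 1)]
```

Because each block is a plain slice, the yielded arrays are still views onto a, as in the loop above.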
Solution 11:
Adding to @Aenaon's answer and his blockfy function: if you are working with color images / 3D arrays, here is my pipeline for creating 224 x 224 crops from 3-channel input.
import numpy as np

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  # image row size
    n = a.shape[1]  # image column size
    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1)  # number of block rows
    bpc = ((n-1)//q + 1)  # number of block columns
    M = p * bpr
    N = q * bpc
    A = np.nan * np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a
    block_list = []
    for row_block in range(bpr):
        previous_row = row_block * p
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]
            ## append
            if block.size:
                block_list.append(block)
    return block_list
Then extend it to all three channels:
import os
import numpy as np
from skimage import io
from PIL import Image

# path_to_crop and path_save_crop are your input and output folders; define them before running
for file in os.listdir(path_to_crop):  ### list files in your folder
    img = io.imread(path_to_crop + file, as_gray=False)  ### open image
    r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
    g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
    b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel
    for x in range(0,len(r)):
        img = np.array((r[x],g[x],b[x]))  ### combine the channels, patch by patch
        img = img.astype(np.uint8)  ### cast back to proper integers
        img_swap = img.swapaxes(0, 2)  ### need to swap axes due to the way things were processed
        img_swap_2 = img_swap.swapaxes(0, 1)  ### do it again
        Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                         format = 'jpeg',
                                         subsampling=0,
                                         quality=100)  ### save patch with new name etc
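As a possible simplification, plain slicing on the first two axes keeps all channels together, so the per-channel split and the axis swaps are not needed. A minimal sketch, where tile_image is a made-up helper name and a zero array stands in for a real image:

```python
import numpy as np

def tile_image(img, p, q):
    """Cut an (H, W, C) array into p-by-q tiles; edge tiles may be smaller."""
    h, w = img.shape[:2]
    # Slicing only the first two axes leaves the channel axis intact.
    return [img[i:i + p, j:j + q]
            for i in range(0, h, p)
            for j in range(0, w, q)]

img = np.zeros((5, 7, 3), dtype=np.uint8)  # stand-in for an RGB image
tiles = tile_image(img, 2, 3)
print(len(tiles), tiles[0].shape, tiles[-1].shape)  # 9 (2, 3, 3) (1, 1, 3)
```

This also keeps the original uint8 dtype, since no NaN padding is involved.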
Solution 12:
Method 1:
import numpy as np
from skimage.util import view_as_blocks
arr = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# Define block shape
block_shape = (2, 2)
# Slice the array into blocks
blocks = view_as_blocks(arr, block_shape)
print(blocks)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''
Method 2 (concise):
import numpy as np
from numpy.lib.stride_tricks import as_strided
a = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# Block size
block_shape = (2, 2)
a_shape = np.array(a.shape)
print(a_shape)#[2 4]
#convert the arrays to lists
new_shape = (a_shape // block_shape).tolist() + list(block_shape)
print(new_shape)#[1,2,2, 2]
new_strides = (a.strides[0] * block_shape[0], a.strides[1] * block_shape[1]) + a.strides
print(new_strides)
res = as_strided(a, shape = new_shape, strides = new_strides)
print(res)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''
Method 2 (expanded):
import numpy as np
from numpy.lib.stride_tricks import as_strided
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
# Block size
block_shape = (2, 2)
shape = a.shape
strides = a.strides
newShape1 =( shape[0] // block_shape[0] )
newShape2 =( shape[1] // block_shape[1] )
newShape = (newShape1,newShape2,block_shape[0],block_shape[1])
print(newShape)#(1, 2, 2, 2)
newStrides1 = strides[0] * block_shape[0]
newStrides2 = strides[1] * block_shape[1]
newStrides = (newStrides1, newStrides2,strides[0],strides[1] )
print(newStrides) #(32, 8, 16, 4) with a 4-byte integer dtype; (64, 16, 32, 8) with int64
blocks = as_strided(a, shape = newShape, strides = newStrides)
print(blocks)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''
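as_strided does no bounds checking, so a wrong stride can read arbitrary memory. A somewhat safer sketch of the same idea uses sliding_window_view (available in NumPy 1.20 and later) and keeps only every block-sized step. This is an alternative technique, not part of the methods above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
block_shape = (2, 2)

# All overlapping 2x2 windows, then keep only the non-overlapping ones.
windows = sliding_window_view(a, block_shape)
blocks = windows[::block_shape[0], ::block_shape[1]]

print(blocks.shape)  # (1, 2, 2, 2)
print(blocks[0, 1])  # the block [[3 4], [7 8]]
```

The result is a read-only view by default, so no data is copied.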
Extra (einsum):
import numpy as np
a = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# Define block shape
block_shape = (2, 2)
grsize = 4
halfsize = grsize // 2
reshaped = a.reshape(-1, grsize)
aa = np.einsum('ij -> ij', reshaped[:, :halfsize])
bb = np.einsum('ij -> ij', reshaped[:, halfsize:])
# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
'''
[[[1 2]
[5 6]]
[[3 4]
[7 8]]]
'''
# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)
print(final_output)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''
Tensordot:
import numpy as np
a = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# Define block shape
block_shape = (2, 2)
grsize = 4
halfsize = grsize // 2
reshaped = a.reshape(-1, grsize)
# Split the reshaped array into two halves using tensordot
aa = np.tensordot(a[:, :halfsize], np.ones((1,), dtype=int), axes=0)
bb = np.tensordot(a[:, halfsize:], np.ones((1,), dtype=int), axes=0)
# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)
print(final_output)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''
Method 5:
import numpy as np
a = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# Block size
block_shape = (2, 2)
blockVertical = a.shape[0] // block_shape[0]    # number of blocks along the rows
blockHorizontal = a.shape[1] // block_shape[1]  # number of blocks along the columns
reshapedArray1 = a.reshape(blockVertical, block_shape[0], blockHorizontal, block_shape[1]).swapaxes(1,2)
print(reshapedArray1)
'''
[[[[1 2]
[5 6]]
[[3 4]
[7 8]]]]
'''