How to remove convexity defects in a Sudoku square?

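Question:

I am working on a fun project: solving a Sudoku from an input image using OpenCV (as in Google Goggles etc.). I have completed the task, but at the end I ran into a small problem, which is why I am here.

I did the programming with the Python API of OpenCV 2.3.1.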

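Here is what I did:

1. Read the image.
2. Find the contours.
3. Select the one with the largest area (which is also roughly square).
4. Find the corner points.

For example:

[image]

(Note that the green line here coincides with the true boundary of the Sudoku, so the Sudoku can be warped correctly. Check the next image.)

5. Warp the image to a perfect square.

For example:

[image]

6. Perform OCR (using the method I described in "Simple Digit Recognition OCR in OpenCV-Python").

And this method worked well.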

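Problem:

Check out this image.

Performing step 4 on this image gives the following result:

[image]

The red line is the original contour, which is the true outline of the Sudoku boundary.

The green line is the approximated contour, which becomes the outline of the warped image.

There is clearly a difference between the green line and the red line at the top edge of the Sudoku. So when I warp, I do not get the original boundary of the Sudoku.

My question:

How can I warp the image on the correct boundary of the Sudoku (i.e. the red line), or how can I remove the difference between the red line and the green line? Is there a method for this in OpenCV?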

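Solution 1:

I have a solution that works, but you will have to translate it to OpenCV yourself. It is written in Mathematica.

The first step is to adjust the brightness of the image by dividing each pixel by the result of a closing operation: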

src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]

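[image]

The next step is to find the Sudoku area, so that I can ignore (mask out) the background. For that, I use connected component analysis and select the component with the largest convex area: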

components = 
  ComponentMeasurements[
    ColorNegate@Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All, 
    2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]

[image]

By filling this image, I get a mask for the Sudoku grid:

mask = FillingTransform[largestComponent]

[image]

Now I can use a second-order derivative filter to find the vertical and horizontal lines in two separate images:

lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];

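[image]

I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use the caliper length to select only the grid-line components. Sorting them by position, I get 2x10 mask images, one for each vertical/horizontal grid line in the image: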

verticalGridLineMasks = 
  SortBy[ComponentMeasurements[
      lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 
      2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks = 
  SortBy[ComponentMeasurements[
      lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 
      2]], #[[2, 2]] &][[All, 3]];

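[image]

Next I take each pair of vertical/horizontal grid lines, dilate them, compute the pixel-by-pixel intersection, and compute the center of the result. These points are the grid line intersections: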

centerOfGravity[l_] := 
 ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters = 
  Table[centerOfGravity[
    ImageData[Dilation[Image[h], DiskMatrix[2]]]*
     ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h, 
    horizontalGridLineMasks}, {v, verticalGridLineMasks}];

[image]
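
For readers working in Python, the intersection step can be illustrated with a minimal NumPy sketch. It is not a translation of the Mathematica code above: the function name `intersection_centroid` is made up for this example, and it assumes the two line masks are already thick enough (for example after dilation) to overlap.

```python
import numpy as np

def intersection_centroid(mask_h, mask_v):
    """Return the (x, y) centroid of the pixels set in both binary masks,
    or None if the masks do not overlap."""
    both = np.logical_and(mask_h > 0, mask_v > 0)
    ys, xs = np.nonzero(both)          # row indices are y, column indices are x
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Toy 9x9 example: a horizontal line and a vertical line, each 3 pixels thick.
mask_h = np.zeros((9, 9), np.uint8)
mask_h[3:6, :] = 255                   # rows 3..5
mask_v = np.zeros((9, 9), np.uint8)
mask_v[:, 5:8] = 255                   # columns 5..7

center = intersection_centroid(mask_h, mask_v)
```

Here the overlap is the 3x3 block at rows 3..5 and columns 5..7, so the centroid sits at the middle pixel of that block.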

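The last step is to define two interpolation functions for the X/Y mapping through these points, and to transform the image using these functions: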

fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed = 
 ImageTransformation[
  srcAdjusted, {fnX @@ Reverse[#], fnY @@ Reverse[#]} &, {9*50, 9*50},
   PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]

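[image]

All of these operations are basic image processing functions, so this should be possible in OpenCV too. The spline-based image transformation might be harder, but I don't think you really need it. Applying the perspective transformation you use now to each individual cell will probably give good enough results.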
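
As a rough illustration of the mapping idea in Python, the sketch below builds per-pixel sampling maps from the grid intersections. It deliberately swaps the spline interpolation for plain bilinear interpolation inside each cell, so it is a simpler substitute rather than an equivalent of `ListInterpolation`. The function name `build_sampling_maps` and the `cell` parameter are assumptions made for this example.

```python
import numpy as np

def build_sampling_maps(grid, cell=50):
    """grid: (n, n, 2) array holding the (x, y) source position of every
    grid intersection, ordered row by row.
    Returns (map_x, map_y): for each output pixel, the source coordinate
    to sample, using bilinear interpolation inside each grid cell."""
    grid = np.asarray(grid, dtype=np.float64)
    n = grid.shape[0]
    size = (n - 1) * cell
    u = np.arange(size) / cell                 # output column in grid units
    v = np.arange(size) / cell                 # output row in grid units
    ci = np.minimum(u.astype(int), n - 2)      # cell column index
    ri = np.minimum(v.astype(int), n - 2)      # cell row index
    fu = (u - ci)[None, :, None]               # horizontal fraction inside the cell
    fv = (v - ri)[:, None, None]               # vertical fraction inside the cell
    R, C = np.meshgrid(ri, ci, indexing="ij")
    top = grid[R, C] * (1 - fu) + grid[R, C + 1] * fu
    bottom = grid[R + 1, C] * (1 - fu) + grid[R + 1, C + 1] * fu
    out = top * (1 - fv) + bottom * fv
    return out[..., 0].astype(np.float32), out[..., 1].astype(np.float32)

# Toy example: a regular 3x3 grid with 20-pixel spacing, offset by (10, 5).
cols, rows = np.meshgrid(np.arange(3), np.arange(3))
grid = np.stack([10 + 20 * cols, 5 + 20 * rows], axis=-1)
map_x, map_y = build_sampling_maps(grid, cell=4)
```

With the 10x10 intersections of a real Sudoku and `cell=50`, the maps would be 450x450. Maps of this kind are the sort of input that `cv2.remap(src, map_x, map_y, cv2.INTER_LINEAR)` accepts, although that combination was not tested against the images in this thread.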

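Solution 2:

Nikie's answer solved my problem, but his answer is in Mathematica, so I thought I should give its OpenCV adaptation here. After implementing it, I found that the OpenCV code is much longer than nikie's Mathematica code. Also, I couldn't find in OpenCV the interpolation method nikie used (it can be done with SciPy, though; I will come to that when the time comes).

1. Image preprocessing (closing operation)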

import cv2
import numpy as np

img = cv2.imread('dave.jpg')
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))

close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)

Result:

[image: result of closing]

2. Finding the Sudoku square and creating a mask image

thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

max_area = 0
best_cnt = None
for cnt in contour:
    area = cv2.contourArea(cnt)
    if area > 1000:
        if area > max_area:
            max_area = area
            best_cnt = cnt

cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)

res = cv2.bitwise_and(res,mask)

Result:

[image]

3. Finding vertical lines

kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))

dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)

contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if h/w > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()

Result:

[image]

4. Finding horizontal lines

kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)

contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if w/h > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)

close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()

Result:

[image]

Of course, this one is not so good.

5. Finding grid points

res = cv2.bitwise_and(closex,closey)

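Result:

[image]

6. Correcting the defects

Here, nikie does some kind of interpolation, which I don't know much about. I also couldn't find a corresponding function for it in OpenCV. (Maybe there is one; I don't know.)

Check out this SOF question, which explains how to do this with SciPy, which I didn't want to use: Image transformation in OpenCV

So here I take the 4 corners of each sub-square and apply a perspective warp to each one.

For that, we first find the centroids.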

contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
    mom = cv2.moments(cnt)
    (x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
    cv2.circle(img,(x,y),4,(0,255,0),-1)
    centroids.append((x,y))

But the resulting centroids are not sorted. Check the image below to see their order:

[image]

So we sort them from left to right, top to bottom.

centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]

b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))

Now see their order:

[image]
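
The sorting step relies on exactly 100 centroids being found, because the `reshape((100,2))` call fails otherwise. The following is a minimal sketch of the same row-then-column sort with an explicit count check. The helper name `sort_grid_points` is made up for this example, and like the code above it assumes the grid is roughly upright, so that every point of one row has a smaller y than every point of the next row.

```python
import numpy as np

def sort_grid_points(points, n=10):
    """Order n*n grid points top to bottom, then left to right.
    Returns an (n, n, 2) array indexed as [row, column]."""
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 2)
    if len(pts) != n * n:
        raise ValueError("expected %d grid points, found %d" % (n * n, len(pts)))
    by_y = pts[np.argsort(pts[:, 1], kind="stable")]   # sort all points by y
    rows = by_y.reshape(n, n, 2)                        # consecutive chunks of n points form rows
    # Inside each row, sort by x.
    return np.stack([row[np.argsort(row[:, 0])] for row in rows])

# Toy 2x2 example with slightly jittered, shuffled points.
shuffled = [(58, 63), (10, 12), (11, 61), (60, 9)]
grid = sort_grid_points(shuffled, n=2)
```

Raising an error when the count is wrong makes a missed or duplicated intersection visible immediately instead of producing a misaligned grid.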

Finally we apply the transformation and create a new image of size 450x450.

output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
    ri = i/10
    ci = i%10
    if ci != 9 and ri!=9:
        src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
        dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
        retval = cv2.getPerspectiveTransform(src,dst)
        warp = cv2.warpPerspective(res2,retval,(450,450))
        output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()

Result:

[image]

The result is almost the same as nikie's, but the code is longer. Maybe better methods exist, but until then, this works.
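
To make the per-cell warp in step 6 more concrete, here is a minimal NumPy sketch of the kind of computation behind a four-point perspective transform: solving for the 3x3 matrix that maps four source corners onto four destination corners. It is only an illustration of the math, not OpenCV's implementation, and the names `perspective_from_4_points` and `apply_perspective` are made up for this example.

```python
import numpy as np

def perspective_from_4_points(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] fixed to 1) such that each
    src point (x, y) maps to its dst point (u, v) after the perspective divide."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=np.float64), np.array(b, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def apply_perspective(H, pt):
    """Map one (x, y) point through H, including the perspective divide."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# One slightly skewed cell (corner order: top-left, top-right, bottom-left,
# bottom-right, as in the code above) mapped onto a 50x50 square.
src = [(12, 8), (61, 11), (10, 58), (63, 60)]
dst = [(0, 0), (49, 0), (0, 49), (49, 49)]
H = perspective_from_4_points(src, dst)
```

Because each cell gets its own matrix, small bends in the grid lines are absorbed cell by cell, which is why the per-cell approach can follow a curved border that a single global warp cannot.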

Regards, ARK.

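Solution 3:

You could try some kind of grid-based modeling of the arbitrary warping. Since the Sudoku is already a grid, that shouldn't be too hard.

So you could try to detect the boundaries of each 3x3 subregion and then warp each region individually. If the detection succeeds, it would give you a better approximation.

Solution 4:

I thought this was a great post, and a great solution by ARK; very well laid out and explained.

I was working on a similar problem and built the entire thing. There were some changes (i.e. xrange to range, and the call to cv2.findContours, which returns three values in OpenCV 3.x but two in OpenCV 2.x and 4.x), but this should work out of the box (Python 3.5, Anaconda).

This is a compilation of the elements above, with some of the missing code added (i.e. labeling of points).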

'''

https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square

'''

import cv2
import numpy as np

img = cv2.imread('test.png')

winname="raw image"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,100)


img = cv2.GaussianBlur(img,(5,5),0)

winname="blurred"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,150)

gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))

winname="gray"
cv2.namedWindow(winname)
cv2.imshow(winname, gray)
cv2.moveWindow(winname, 100,200)

close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)

winname="res2"
cv2.namedWindow(winname)
cv2.imshow(winname, res2)
cv2.moveWindow(winname, 100,250)

# find elements
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
img_c, contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

max_area = 0
best_cnt = None
for cnt in contour:
    area = cv2.contourArea(cnt)
    if area > 1000:
        if area > max_area:
            max_area = area
            best_cnt = cnt

cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)

res = cv2.bitwise_and(res,mask)

winname="puzzle only"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,300)

# vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))

dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)

img_d, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if h/w > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()

winname="vertical lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_d)
cv2.moveWindow(winname, 100,350)

# find horizontal lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)

img_e, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)

for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if w/h > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)

close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()

winname="horizontal lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_e)
cv2.moveWindow(winname, 100,400)


# intersection of these two gives dots
res = cv2.bitwise_and(closex,closey)

winname="intersections"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,450)

# text color: green (OpenCV uses BGR channel order)
textcolor=(0,255,0)
# point color: blue (OpenCV uses BGR channel order)
pointcolor=(255,0,0)

# find centroids and sort
img_f, contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
    mom = cv2.moments(cnt)
    (x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
    cv2.circle(img,(x,y),4,(0,255,0),-1)
    centroids.append((x,y))

# sorting
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]

b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in range(10)])
bm = b.reshape((10,10,2))

# make copy
labeled_in_order=res2.copy()

for index, pt in enumerate(b):
    cv2.putText(labeled_in_order,str(index),tuple(pt),cv2.FONT_HERSHEY_DUPLEX, 0.75, textcolor)
    cv2.circle(labeled_in_order, tuple(pt), 5, pointcolor)

winname="labeled in order"
cv2.namedWindow(winname)
cv2.imshow(winname, labeled_in_order)
cv2.moveWindow(winname, 100,500)

# create final

output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
    ri = int(i/10) # row index
    ci = i%10 # column index
    if ci != 9 and ri!=9:
        src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
        dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
        retval = cv2.getPerspectiveTransform(src,dst)
        warp = cv2.warpPerspective(res2,retval,(450,450))
        output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()

winname="final"
cv2.namedWindow(winname)
cv2.imshow(winname, output)
cv2.moveWindow(winname, 600,100)

cv2.waitKey(0)
cv2.destroyAllWindows()

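Solution 5:

I want to add that the above method works only when the Sudoku board stands straight; otherwise the height/width (or vice versa) ratio test will most probably fail and you will not be able to detect the edges of the Sudoku. (I also want to add that if the lines are not perpendicular to the image borders, the Sobel operations (dx and dy) will still work, because the lines still have edges with respect to both axes.)

To be able to detect straight lines, you should work on contour-based or pixel-wise analysis, such as contourArea/boundingRectArea, the top-left and bottom-right points, and so on.

Edit: I managed to check whether a set of contours forms a line by applying linear regression and checking the error. However, linear regression performed poorly when the slope of the line is too large (i.e. >1000) or very close to 0. Therefore, applying the ratio test above (in the most upvoted answer) before the linear regression is logical, and it did work for me.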
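
A minimal sketch of one way to combine the two ideas: use the width/height extent of the points to decide which axis to regress on, then check the fit error. The function name `is_line` and the `max_rms` threshold are assumptions made for this example and would need tuning on real contours.

```python
import numpy as np

def is_line(points, max_rms=1.5):
    """Return True if the points lie close to a straight line.
    The longer extent (a ratio-style test) selects the independent variable,
    so near-vertical lines are fitted as x = a*y + b instead of y = a*x + b."""
    pts = np.asarray(points, dtype=np.float64)
    xs, ys = pts[:, 0], pts[:, 1]
    if (ys.max() - ys.min()) > (xs.max() - xs.min()):
        t, s = ys, xs                  # tall shape: regress x on y
    else:
        t, s = xs, ys                  # wide shape: regress y on x
    a, b = np.polyfit(t, s, 1)         # least-squares straight-line fit
    rms = np.sqrt(np.mean((a * t + b - s) ** 2))
    return bool(rms <= max_rms)

vertical = [(5, y) for y in range(20)]             # infinite slope as y = f(x)
tilted = [(x, 0.1 * x) for x in range(30)]         # gently sloped line
square = [(0, 0), (10, 0), (10, 10), (0, 10)]      # not a line

results = (is_line(vertical), is_line(tilted), is_line(square))
```

Swapping the axes for tall shapes avoids the very large slopes mentioned above, because a near-vertical line has a slope close to 0 when x is expressed as a function of y.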

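Solution 6:

To remove undetected corners, I applied gamma correction with a gamma value of 0.8.

[image: before gamma correction]

The red circle is drawn to show the missing corner.

[image: after gamma correction]

The code is: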

gamma = 0.8
invGamma = 1/gamma
table = np.array([((i / 255.0) ** invGamma) * 255
                  for i in np.arange(0, 256)]).astype("uint8")
cv2.LUT(img, table, img)

This is an addition to Abid Rahman's answer, for cases where some corner points are missing.
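
To see what the lookup table does without needing an image file, here is a small NumPy-only sketch. The helper name `gamma_table` is made up for this example, and `table[img]` is used as a stand-in for the per-pixel lookup that `cv2.LUT` performs on an 8-bit image with a 256-entry table.

```python
import numpy as np

def gamma_table(gamma):
    """Build the same 256-entry lookup table as the snippet above."""
    inv_gamma = 1.0 / gamma
    return np.array([((i / 255.0) ** inv_gamma) * 255
                     for i in range(256)]).astype("uint8")

table = gamma_table(0.8)

# A tiny one-row "image" covering black, dark, mid-gray and white.
img = np.array([[0, 64, 128, 255]], dtype=np.uint8)
corrected = table[img]     # index the table with the pixel values
```

With gamma = 0.8 the exponent 1/gamma is 1.25, which is greater than 1, so black and white stay fixed while the values in between are pushed darker.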