How to check if a file is a valid image file?
- 2025-03-13 08:54:00
- admin (original)
Problem description:
I am currently using PIL.
from PIL import Image
try:
    im = Image.open(filename)
    # do stuff
except IOError:
    # filename not an image file
    pass
However, while this covers most cases, some image formats such as xcf, svg and psd are not detected. PSD files throw an OverflowError exception.
Is there some way I could include them as well?
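Solution 1:
You can use the built-in imghdr module. From its documentation:
The imghdr module determines the type of image contained in a file or byte stream.
It is used like this: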
>>> import imghdr
>>> imghdr.what('/tmp/bass')
'gif'
Note: imghdr has been deprecated since Python 3.11 because it supports only a small number of file formats, and it was removed in Python 3.13.
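Solution 2:
In addition to what Brian suggests, you can use PIL's verify method to check whether the file is broken.
im.verify()
Attempts to determine if the file is broken, without actually decoding the image data. If this method finds any problems, it raises suitable exceptions. This method only works on a newly opened image; if the image has already been loaded, the result is undefined. Also, if you need to load the image after using this method, you must reopen the image file.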
Solution 3:
In addition to the PIL image check, you can also add a file name extension check like this:
filename.lower().endswith(('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif'))
Note that this only checks whether the file name has a valid image extension. It does not open the file to see whether it is a valid image, which is why you still need PIL or one of the libraries suggested in the other answers.
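The extension test above can be wrapped in a small helper. The following is a minimal sketch: the names has_image_extension and IMAGE_EXTENSIONS are hypothetical and do not come from PIL or any other library, and the allow-list simply repeats the extensions listed in this answer. It compares only the final extension, so a name such as photo.jpg.zip is rejected.

```python
import os

# Assumed allow-list (hypothetical name); extend it to the formats you accept.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def has_image_extension(filename):
    """Return True if the final extension of filename is in the allow-list.

    This inspects the name only; it never opens the file.
    """
    _, ext = os.path.splitext(filename)
    return ext.lower() in IMAGE_EXTENSIONS
```

As with the one-liner, a True result says nothing about the file contents, so it is best used as a cheap filter before a real decode with PIL.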
Solution 4:
One option is to use the filetype package.
Installation
python -m pip install filetype
Advantages
Fast: it loads only the first few bytes of the file and checks the magic number.
Supports different MIME types: images, videos, fonts, audio, archives.
Example
For filetype >= 1.0.7:
import filetype

filename = "/path/to/file.jpg"
if filetype.is_image(filename):
    print(f"{filename} is a valid image...")
elif filetype.is_video(filename):
    print(f"{filename} is a valid video...")
For filetype <= 1.0.6:
import filetype

filename = "/path/to/file.jpg"
if filetype.image(filename):
    print(f"{filename} is a valid image...")
elif filetype.video(filename):
    print(f"{filename} is a valid video...")
Additional information on the official repo: https://github.com/h2non/filetype.py
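Solution 5:
A lot of the time the first few bytes of a file are a magic number that identifies its format. You could check for this in addition to the exception checking above.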
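A magic-number check of this kind can be sketched with the standard library alone. The names SIGNATURES, sniff_image_type and sniff_image_file below are hypothetical, and the table lists only a handful of well-known signatures (including PSD and XCF, which the question mentions); it is not an exhaustive registry.

```python
# Leading byte signatures for a few common image formats (not exhaustive).
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
    b"II*\x00": "tiff",   # little-endian TIFF
    b"MM\x00*": "tiff",   # big-endian TIFF
    b"8BPS": "psd",
    b"gimp xcf": "xcf",
}


def sniff_image_type(data):
    """Return a format name if data starts with a known signature, else None."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None


def sniff_image_file(path):
    """Read only the first bytes of the file at path and sniff its type."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(16))
```

A match only tells you what the file claims to be. A truncated or otherwise damaged file still has an intact header, so this complements the decode-based checks rather than replacing them.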
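Solution 6:
Update
I have also implemented the following solution in a Python script on GitHub.
I also verified that damaged files (jpg) are often not "broken" images: a damaged picture file sometimes remains a legitimate picture file. The original image is lost or altered, but you can still load it without errors. File truncation, however, always causes errors.
End of update
You can use the Python Pillow (PIL) module, which supports most image formats, to check whether a file is a valid and intact image file.
If you also want to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but it does not detect every possible image defect. For example, im.verify does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these defects too, but you have to apply an image manipulation or an image decode/re-encode to trigger the check. In the end I suggest using this code: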
from PIL import Image
try:
    im = Image.open(filename)
    im.verify()  # also run verify(); it may catch other kinds of defects
    im.close()   # the file must be reopened after verify()
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # forces the pixel data to be decoded
    im.close()
except Exception:
    # manage exceptions here
    pass
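If the image is defective, this code raises an exception. Keep in mind that im.verify is about 100 times faster than performing the image manipulation (and I think flip is one of the cheaper transformations). With this code you can verify a set of images at about 10 MBytes/sec with standard Pillow, or about 40 MBytes/sec with the Pillow-SIMD module (on a modern 2.5 GHz x86_64 CPU).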
For other formats such as xcf, you can use Wand, the ImageMagick wrapper. See the Wand documentation for usage and installation instructions. The code is as follows:
import wand.image

im = wand.image.Image(filename=filename)
im.flip()  # flip() is a method; calling it forces the image to be processed
im.close()
From my experiments, however, Wand does not detect truncated images. I think it loads the missing parts as a greyed area without any warning.
I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
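One possible way to call identify from Python is the standard subprocess module. The sketch below is untested against truncated images and rests on several assumptions: the helper name identify_image is hypothetical, the executable is assumed to be on the PATH under the name identify (ImageMagick 7 installations may expose it as "magick identify" instead), and a non-zero exit status is taken to mean that identify could not read the file.

```python
import shutil
import subprocess


def identify_image(path, tool="identify", timeout=30):
    """Run ImageMagick's identify command on path (hypothetical helper).

    Returns True if the command exits with status 0, False if it exits
    with an error or times out, and None if the tool is not installed.
    """
    executable = shutil.which(tool)
    if executable is None:
        return None  # ImageMagick not found on PATH
    try:
        result = subprocess.run(
            [executable, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

Whether a plain identify call notices truncation may depend on the format and on the options used, so it is worth trying it on a deliberately truncated sample before relying on it.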
I suggest always performing a preliminary check first: verifying that the file size is not zero (or very small) is very cheap:
import os

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # manage the 'faulty image' case here
    pass
Solution 7:
On Linux, you could use python-magic, which uses libmagic to identify file formats.
AFAIK, libmagic looks into the file and tries to tell you more about it than just the format, such as bitmap dimensions and format version. So you might see this as a superficial test for "validity".
For other definitions of "valid" you might have to write your own tests.
Solution 8:
You could use python-magic, the Python bindings to libmagic, and then check the MIME types. This won't tell you whether the files are corrupted or intact, but it should be able to determine what type of image each one is.
Solution 9:
Adapting from Fabiano and Tiago's answer.
from PIL import Image

def check_img(filename):
    try:
        im = Image.open(filename)
        im.verify()
        im.close()
        # reopen: the image cannot be used after verify()
        im = Image.open(filename)
        im.transpose(Image.FLIP_LEFT_RIGHT)
        im.close()
        return True
    except Exception:
        print(filename, 'corrupted')
        return False

if not check_img('/dir/image'):
    print('do something')
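Solution 10:
import os

# root_dir is the directory to scan (the original reused the name path for it)
extensions = (".jpg", ".png", ".jpeg")
for path, dirs, files in os.walk(root_dir):
    for file in files:
        # case-insensitive match on the file name extension only
        if file.lower().endswith(extensions):
            print(path)
            print("Valid", file)
        else:
            print(path)
            print("Invalid", file)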
Solution 11:
The file extension can be used to check for image files as follows.
import os

for f in os.listdir(folderPath):
    # match the end of the name; a substring test would also accept names like a.jpg.txt
    if f.lower().endswith((".jpg", ".bmp")):
        filePath = os.path.join(folderPath, f)