How to automatically classify images by dominant color?

I have tens of thousands of fairly large JPG images. Each is an image of an index card. Most of the cards are white, but some come in the standard index-card colors.

The colors correspond to data attributes, so I would like to programmatically classify these cards by color. I know it's possible to extract the dominant color from an image in a web browser using a canvas element and an algorithm like Color Thief, and it works: it gives me an RGB value, which is enough to bin the cards.

But I can't see how I could run such a thing through a web browser on so many images.

I'm wondering if anyone can recommend a command-line tool, perhaps a Python or Ruby module, that could do something similar.


This appears to strongly overlap with this question.

If I were in your situation, though, there are a couple of things I would try. First, any one of several Python libraries should work: OpenCV, Mahotas, PIL (now maintained as Pillow), or scikit-image (formerly scikits.image) can probably all handle this, since you really just need to extract average RGB values.

Write a script to load each image and append its average RGB value to a list. You can then use the clustering tools in scikit-learn (formerly scikits.learn) to run either mean shift or k-means on this list of colors. Since you likely know how many different colors there are, k-means is a good choice. Alternatively, if you already know the RGB values of the standard colors, then for each image you load, just compute the average RGB value and "classify" the image according to whichever standard color's RGB vector is closest.
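The second approach (nearest known color) can be sketched roughly as below. This is only an illustration under some assumptions: the names `REFERENCE_COLORS`, `average_rgb`, and `classify` are made up for this example, the RGB values in the table are placeholders rather than measured card colors, and Pillow is assumed as the image library.

```python
import math

# Hypothetical reference colors: placeholder values, not measured ones.
# Replace them with average RGB values taken from a few of your own cards.
REFERENCE_COLORS = {
    "white": (255, 255, 255),
    "yellow": (250, 240, 150),
    "pink": (250, 180, 200),
    "blue": (170, 210, 240),
    "green": (180, 230, 180),
}


def average_rgb(path):
    """Return the mean (R, G, B) of the image at `path` as floats."""
    # Imported here so the classifier below can be used without Pillow installed.
    from PIL import Image, ImageStat

    with Image.open(path) as img:
        # Convert first so grayscale or CMYK JPGs also yield three channels.
        return tuple(ImageStat.Stat(img.convert("RGB")).mean)


def classify(rgb, references=REFERENCE_COLORS):
    """Return the name of the reference color nearest to `rgb`.

    "Nearest" here means smallest Euclidean distance in RGB space.
    """
    return min(references, key=lambda name: math.dist(rgb, references[name]))
```

To process a whole directory, you could loop over the files (for example with `pathlib.Path("cards").glob("*.jpg")`, where `cards` is a placeholder folder name) and call `classify(average_rgb(path))` on each one. If the card does not fill the frame, the background will pull the average toward its own color, so cropping to the card first may give cleaner results.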
