Looking for a simple way to compare tiny images
category: code [glöplog]
I'm working on a thing that needs to do some basic image comparison of tiny images, around 16^2 pixels. I don't want to compare the things pixel by pixel as they might be 1 or 2 pixels offset or something. Any idea on how to do it?
I'm currently doing something like blurring both images and calculating the error (RMS), and it kind of works, but I would like to know if there's something "standard" for this sort of thing.
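For reference, roughly what I'm doing now (a sketch, not my exact code; assumes numpy and scipy, same-shape grayscale float images, and an arbitrary sigma):

```python
# Blur-then-RMS sketch: low-pass both images so a 1-2 pixel offset
# matters less, then take the root-mean-square difference.
# sigma=1.0 is an arbitrary choice.
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_rms(a, b, sigma=1.0):
    a_blur = gaussian_filter(a.astype(np.float64), sigma)
    b_blur = gaussian_filter(b.astype(np.float64), sigma)
    return np.sqrt(np.mean((a_blur - b_blur) ** 2))
```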
By "comparing" do you mean you want a "distance" or just test for equality?
If it's for block equality, I would try a "block digest" function which creates an approximation of the block, sort of like a hash. Candidate replacement or "similar enough" blocks are those that get the same digest. Make a few different digest functions so it doesn't depend on exact rounding; that way each block goes into several groupings. When you want to find a replacement for a block, get all the digest groups it belongs to and do the final selection more accurately.

Or, if you want to find the best overall combination of replacements, look at how many groups a block belongs to and select the "most median" blocks. Or try to construct new blocks that aren't necessarily exactly the originals but could replace many of them. One digest function might look at brightness values only, others at color hues too, and whatever color spaces you think are important for visual similarity. (These are ideas I thought about when making graphics converters for MSX, where the number of blocks/characters available is limited.)
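A rough sketch of what I mean, with one hypothetical digest (the quadrant-brightness feature and all the names are just an illustration, not what my converters use; assumes numpy and grayscale blocks in [0, 1]):

```python
# Bucket blocks by a coarse, hashable "digest"; blocks sharing a bucket
# are candidates for "similar enough". Several digest functions would
# give several independent groupings, as described above.
import numpy as np
from collections import defaultdict

def digest_quadrant_brightness(block, levels=8):
    # Mean brightness of the four quadrants, coarsely quantized.
    h, w = block.shape
    quads = (block[:h//2, :w//2], block[:h//2, w//2:],
             block[h//2:, :w//2], block[h//2:, w//2:])
    return tuple(int(q.mean() * (levels - 1)) for q in quads)

def group_blocks(blocks, digest_fn=digest_quadrant_brightness):
    groups = defaultdict(list)
    for i, block in enumerate(blocks):
        groups[digest_fn(block)].append(i)
    return groups
```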
equality, as in "probability of being this".
xernobyl: yeah, you didn't specify the color bit-depth in your first post, but as you talk about "color hues" I guess it's something like 24 or 32 bits per pixel.
However, one standard way of doing things is to think of each bit-plane as one "frame" in the overall comparison scheme, then apply the same technique/method or whatever you're using to all the frames. One might use a distance metric (not sure if that is what Gargaj meant to say).
For a 24-bit color 16x16 image there are 6144 bits, so the total range is 1 out of 2^6144 ≈ 3.38e+1849 combinations, which is a very big number and hard to guess for a pseudo-random generator that has no presumption of what it should recognize in your pattern. A simple method would be to use Euclidean distance as the metric, treating each image as a vector of pixel values, and use those distances as error measurements. Hope this can give you some directions.
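A minimal sketch of that Euclidean metric, assuming numpy and same-shape images:

```python
# Treat each image as one long vector of channel values and use plain
# Euclidean (L2) distance as the error measurement.
import numpy as np

def euclidean_distance(a, b):
    diff = a.astype(np.float64).ravel() - b.astype(np.float64).ravel()
    return np.sqrt(np.dot(diff, diff))
```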
First, you'll need to decide what you want to be invariant to. Translation? Rotation? Brightness changes? This will have an impact on your algorithm.
Assuming the images are anywhere near natural images (i.e., not just random noise), what you want is a way to align them. There are plenty of ways of doing this (e.g. gradient descent), but if you're only at 16x16 and the possible offsets are only 1–2 pixels each way (possibly in half- or quarter-pixels?), brute force is likely to work best.
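A quick sketch of the brute-force search, integer offsets only (half- or quarter-pixel steps would need resampling first; numpy assumed):

```python
# Try every offset in [-max_shift, +max_shift] on both axes, score the
# overlapping region with RMS, and keep the best. Works for grayscale
# or color arrays of the same shape.
import numpy as np

def best_offset_rms(a, b, max_shift=2):
    best = (np.inf, (0, 0))
    h, w = a.shape[:2]
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Region where a[y, x] overlaps b[y - dy, x - dx].
            y0, y1 = max(0, dy), min(h, h + dy)
            x0, x1 = max(0, dx), min(w, w + dx)
            sub_a = a[y0:y1, x0:x1].astype(np.float64)
            sub_b = b[y0 - dy:y1 - dy, x0 - dx:x1 - dx].astype(np.float64)
            rms = np.sqrt(np.mean((sub_a - sub_b) ** 2))
            if rms < best[0]:
                best = (rms, (dy, dx))
    return best
```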
you are looking for SSIM or a variant (MS-SSIM, CW-SSIM, etc.); these were specifically designed for your purpose.
Start here: SSIM
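For example, with scikit-image (structural_similarity is the real function name; the parameters below are just what fits 16x16 inputs, and channel_axis needs scikit-image >= 0.19):

```python
# SSIM returns a score in [-1, 1]; 1.0 means identical.
# Assumes uint8 images; win_size must be odd and <= the image size.
from skimage.metrics import structural_similarity

def ssim_score(a, b):
    return structural_similarity(a, b, win_size=7, data_range=255,
                                 channel_axis=-1 if a.ndim == 3 else None)
```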
SSIM still wants alignment first, though, if you can manage it. And it's only better than directly subtracting if you care about human perception. Again, it's important to figure out what the question is before looking for answers. :-)
SSIM is a good starting point for research. He could also train an autoencoder and measure the difference in latent space, but I agree he should specify the case before delving deeper into possible answers ;)
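Very roughly, something like this (a toy PyTorch encoder, untrained and purely illustrative; in practice you'd first train it as part of an autoencoder on images like his):

```python
# Encode both images into a small latent vector and compare there.
# Architecture and sizes are arbitrary; assumes 3x16x16 float tensors.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # -> 16x8x8
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # -> 32x4x4
    nn.Flatten(),
    nn.Linear(32 * 4 * 4, 32),                             # -> 32-d latent
)

def latent_distance(a, b):
    with torch.no_grad():
        za = encoder(a.unsqueeze(0))
        zb = encoder(b.unsqueeze(0))
    return torch.norm(za - zb).item()
```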
SSIM seems to be a great fit for my particular problem, thank you.
since I know it works great for audio, why not try something in the frequency domain: FFT plus some kind of preprocessing (especially with regard to normalization). It really depends on the desired invariants, though. Working in a particular color representation, especially separating color and brightness, might help. I'd take a look at OpenCV too; it provides some interesting tools.
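e.g. something like this (grayscale sketch, numpy only; note that FFT magnitudes are only truly invariant to circular shifts, and the normalization here is just one option):

```python
# Compare energy-normalized FFT magnitude spectra: dropping the phase
# makes the comparison insensitive to small translations.
import numpy as np

def fft_magnitude_distance(a, b):
    fa = np.abs(np.fft.fft2(a.astype(np.float64)))
    fb = np.abs(np.fft.fft2(b.astype(np.float64)))
    fa /= fa.sum() + 1e-12   # normalize away overall brightness/contrast
    fb /= fb.sum() + 1e-12
    return np.sqrt(np.mean((fa - fb) ** 2))
```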
Subtract one image from the other and count all the non-zeros?
ImageDupless
@rutra80 that would only work for 100% identical images; mine might be a few pixels off
if we're going into 3rd party soft, then XnView has such a function with a definable threshold
we're not.