Well, today I discovered an Imatest article that demonstrates just how wildly slanted edge implementations can (and apparently do) vary. I will leave my critique of said article for another day, but I will note that this article makes reference to the "5 degrees" rule that is often seen when slanted edge measurements are performed.
The "5 degrees" rule states that the orientation of the edge relative to the sensor's photosite grid should be approximately 5 degrees (measured from either the horizontal or the vertical).
There are two notable reasons for this: firstly, a 5 degree angle is far from the critical angles (the topic of this post), and secondly, a 5 degree angle ensures that the potential non-rotationally symmetric behaviour of the PSF is minimized. A discussion of the non-rotationally symmetric PSFs will also be postponed to a future article.
A closer look at the slanted edge method
Figure 1 illustrates how MTF Mapper constructs the oversampled edge spread function (ESF) that is the starting point of the MTF calculation.
Figure 1: How the ESF is sampled
Pushing up our sampling rate to 8x moves the Nyquist limit up to 4 cycles per pixel, which allows us to examine the behaviour of our MTF curve near 0.5 cycles per pixel without fear that we are being misled by aliasing artifacts.
How can we increase the spatial sampling rate of our sensor? Well, we cannot change the sensor, but we can use a trick to generate a synthetic ESF. Looking at Figure 1 above we can see that the edge (represented as a black line) crosses the pixel grid in different places as we move along the edge. More importantly, pay attention to the shortest distance from each black dot (representing the centre of each pixel/photosite) to the black edge. Notice how this distance varies by a fraction of the pixel spacing as we move along the edge.
Let us assume that we have a coordinate system with its origin at the centre of our top/leftmost pixel of our sensor, such that the black dots representing the pixel centres can be addressed by integer coordinates. If we take the (x, y) coordinate of a pixel near the edge, and project this coordinate onto the vector representing the edge normal (i.e., the vector perpendicular to the edge under analysis), then we obtain a real-valued scalar that represents the distance of the pixel centre from our edge. We can pair this projected distance-from-edge value with the intensity of that pixel to form a sample point on our synthetic ESF, as shown in Figure 1.
How does this help us to oversample the ESF? Well, if we choose an appropriate edge orientation angle, say, 5 degrees, then the projected ESF points will be densely spaced. In other words, the average distance between two consecutive samples in our projected ESF will be a fraction of the pixel spacing. We can partition the projected ESF points into bins of width 0.125 pixels to produce a regularly-spaced sampled ESF with 8x oversampling.
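The projection and binning steps described above can be sketched as follows. This is only a minimal illustration, not MTF Mapper's actual code: the function name `project_onto_normal`, the size of the pixel patch, and the assumption that the edge passes exactly through the pixel at the origin are all mine.

```python
import numpy as np

def project_onto_normal(xs, ys, theta_deg):
    """Signed perpendicular distance of pixel centres (xs, ys) from an
    edge through the origin at theta_deg degrees to the horizontal."""
    theta = np.radians(theta_deg)
    normal = np.array([-np.sin(theta), np.cos(theta)])  # unit edge normal
    return xs * normal[0] + ys * normal[1]

# Hypothetical patch of pixel centres at integer coordinates.
xs, ys = np.meshgrid(np.arange(40), np.arange(-5, 6))
d = np.sort(project_onto_normal(xs.ravel(), ys.ravel(), 5.0))

# Keep the samples that fall within one pixel of the edge.
near = d[(d >= 0.0) & (d < 1.0)]
mean_spacing = np.mean(np.diff(near))

# Assign each sample to a bin of width 0.125 pixels (8x oversampling).
bin_index = np.floor(near / 0.125).astype(int)
bins_hit = np.unique(bin_index)

print("mean spacing between projected samples:", mean_spacing)
print("bins containing at least one sample:", bins_hit)
```

At 5 degrees the mean spacing should come out well below 0.125 pixels, and every one of the eight bins between 0 and 1 should receive at least one sample.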
We know this works well for 5 degrees (because that is what everyone is doing), but what is so special about 5 degrees? To answer that, we have to slog through some elementary math.
Spacing of projected samples
Figure 2 illustrates one possible way in which we can assign integer coordinates to the pixels near the edge under analysis.
Figure 2: How pixel coordinates are assigned
The orientation of the edge can be specified in degrees as measured from the horizontal, but I prefer using the slope of the line. If the angle between the edge and the horizontal is θ, then the direction perpendicular to the edge can be represented as the unit length vector (-sin(θ), cos(θ)). This would be expressed as a slope 1/Δx = tan(θ), such that Δx = 1/tan(θ).
The normal vector (-sin(θ), cos(θ)) then becomes (-1, Δx) * 1/√(1 + Δx²). We project our pixel centres, represented as integer coordinates (x, y), onto this normal vector by computing the dot product (x, y) · (-1, Δx) * 1/√(1 + Δx²), which evaluates to d(x,y) = 1/√(1 + Δx²) * (-x + yΔx).
The function d(x,y) thus computes the distance that the pixel located at (x, y) is from the origin (X0,Y0), which we will pretend falls on the edge; this means that d(x,y) measures the perpendicular distance of point (x, y) from the edge. The projected ESF point is thus [d(x,y), I(x+X0, y + Y0)], where I(i, j) denotes the intensity of the pixel located at pixel(i,j).
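As a quick sanity check on the algebra, the slope form of d(x, y) can be compared with the direct dot product against the unit normal (-sin(θ), cos(θ)). The function names here are hypothetical, and the pixel (3, 2) is an arbitrary choice.

```python
import math

def edge_distance(x, y, dx):
    """d(x, y) = (-x + y*dx) / sqrt(1 + dx^2), with dx = 1/tan(theta)."""
    return (-x + y * dx) / math.sqrt(1.0 + dx * dx)

def edge_distance_from_angle(x, y, theta_deg):
    """The same distance, computed via the unit normal (-sin, cos)."""
    t = math.radians(theta_deg)
    return -x * math.sin(t) + y * math.cos(t)

theta_deg = 5.0
dx = 1.0 / math.tan(math.radians(theta_deg))  # roughly 11.43 for 5 degrees

a = edge_distance(3, 2, dx)
b = edge_distance_from_angle(3, 2, theta_deg)
print(a, b)
```

The two values should agree to within floating point rounding for any θ strictly between 0 and 90 degrees.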
Suppose that we focus only on the subset of pixels with integer coordinates (p, q) such that 0 ≤ d(p, q) < 1. If we are to achieve 8x oversampling, then there must be at least 8 unique distance values d(p, q) in this interval. In fact, we would require these 8 points to be spread out uniformly such that at least one d(p, q) value falls in the interval [0, 0.125), one in [0.125, 0.25), and so on, such that each of the sub-intervals of length 0.125 between 0 and 1 contains at least one point.
Consider, for example, the case where Δx = 4. This reduces d(p, q) to 1/√(1 + 4²) * (-p + 4q) = (-p + 4q)/√17. Because both p and q are integers, we can deduce that d(p, q) must be an integer multiple of 1/√17. How many integer multiples of 1/√17 can we fit in between 0 and 1? If we enumerate them, we can choose p and q such that (-p + 4q) takes on any value in the set {0, 1, 2, 3, 4, 5, 6, ...}. But √17 = 4.123106 (and change), so if (-p + 4q) ≥ 5, then d(p, q) > 1. That leaves only the set {0, 1, 2, 3, 4}, such that the only values of d(p, q) with 0 ≤ d(p, q) < 1 are {0, 1/√17, 2/√17, 3/√17, 4/√17}.
Whoops! If Δx = 4, then there will only be 5 unique values of d(p, q) between 0 and 1, and we need at least 8 points between 0 and 1 to achieve 8x oversampling! The implications of the failure to achieve 8x oversampling will be covered a bit later; first we must identify the critical angles.
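The count for Δx = 4 is easy to confirm by brute force. The sketch below simply enumerates d(p, q) over a small window of integer coordinates; the helper name `unique_distances` and the window size are arbitrary choices of mine.

```python
import math

def unique_distances(dx, extent=20):
    """Distinct d(p, q) values in [0, 1) for an integer slope dx,
    enumerated over integer p, q in [-extent, extent]."""
    norm = math.sqrt(1 + dx * dx)
    # Work with the integer numerator (-p + dx*q) to avoid float duplicates.
    numerators = {-p + dx * q
                  for p in range(-extent, extent + 1)
                  for q in range(-extent, extent + 1)}
    return [k / norm for k in sorted(numerators) if 0 <= k / norm < 1]

values = unique_distances(4)
print(len(values), values)
```

For Δx = 4 this should list five values, spaced 1/√17 ≈ 0.2425 apart, which is almost twice the 0.125 bin width.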
Enumerating the problem angles
We already know that Δx = 4 causes our 8x oversampling to fail; this corresponds to an angle of atan(1/4) = 14.036 degrees. In fact, it is fairly simple to see that for any integer value Δx, we will have Δx + 1 unique values between 0 and 1 (if we include the 0 in our count), spaced 1/√(1 + Δx²) apart. For 8x oversampling, the spacing between d(p, q) values must be less than 0.125. For Δx = 8, we see that 1/√(1 + Δx²) = 1/√65 ≈ 0.12403, which is just small enough, whereas for Δx = 7 the spacing is 1/√50 ≈ 0.14142, which is too large.
The angles that will lead to a failure of the 8x oversampling mechanism are thus atan(1/Δx) for Δx = 1 through 7: 45, 26.565051, 18.434949, 14.036243, 11.309932, 9.462322, and 8.130102 degrees.
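The list of angles can be reproduced with a few lines of Python. This is just the arithmetic from the paragraph above; the function name `oversampling_report` is made up for this sketch.

```python
import math

def oversampling_report(max_dx=8, bin_width=0.125):
    """For integer slopes dx, return (dx, angle in degrees, sample spacing,
    whether the spacing is too coarse for the given bin width)."""
    rows = []
    for dx in range(1, max_dx + 1):
        spacing = 1.0 / math.sqrt(1.0 + dx * dx)
        angle = math.degrees(math.atan(1.0 / dx))
        rows.append((dx, angle, spacing, spacing >= bin_width))
    return rows

rows = oversampling_report()
critical_angles = [angle for _, angle, _, fails in rows if fails]

for dx, angle, spacing, fails in rows:
    status = "FAIL" if fails else "ok"
    print(f"dx={dx}  angle={angle:9.6f}  spacing={spacing:.5f}  {status}")
```

Only Δx = 8 (about 7.125 degrees) should be reported as "ok" in this range.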
Some other Δx values are also problematic: 1.5 and 2.5. These yield only 2Δx + 1 unique values (including zero), i.e., 4 and 6 values respectively. Setting Δx = 1.25 yields a total of only 7 unique values. These fractional slopes occur at angles of 33.69007, 21.80141, and 38.65981 degrees, respectively.
There may even be more of these problematic angles, but this is as far as I have come with this analysis. Feel free to comment if you can help me identify other values of Δx that will lead to undersampling.
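One way to explore further is to write the slope as a fraction Δx = a/b in lowest terms. Substituting into d(p, q) gives d(p, q) = (-bp + aq)/√(a² + b²), and because a and b have no common factor, (-bp + aq) can take on any integer value. The spacing between projected samples is therefore 1/√(a² + b²). The sketch below uses this to count the unique values in [0, 1); the function names are hypothetical, and this is a consequence of the formula above rather than something checked against MTF Mapper itself.

```python
from fractions import Fraction
import math

def sample_spacing(dx):
    """Spacing of projected samples for a rational slope dx = a/b."""
    frac = Fraction(dx)  # Fraction reduces a/b to lowest terms
    a, b = frac.numerator, frac.denominator
    return 1.0 / math.sqrt(a * a + b * b)

def count_unique(dx):
    """Number of distinct d(p, q) values in [0, 1), including zero."""
    frac = Fraction(dx)
    a, b = frac.numerator, frac.denominator
    n = a * a + b * b
    # Count the integers k >= 0 with k / sqrt(n) < 1, i.e. k*k < n.
    return math.isqrt(n - 1) + 1

for dx in (Fraction(3, 2), Fraction(5, 2), Fraction(5, 4), Fraction(4)):
    angle = math.degrees(math.atan(1.0 / float(dx)))
    print(f"dx={float(dx):5.2f}  angle={angle:9.5f}  "
          f"unique={count_unique(dx)}  spacing={sample_spacing(dx):.5f}")
```

The same two functions can be looped over other small fractions to look for additional slopes whose spacing is not below 0.125.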
Dealing with the critical angles
So what exactly happens when we do not have at least one sample every 0.125 pixels along the ESF? The corresponding bin in the resampled ESF will be missing, and leaving gaps in the resampled ESF leads to severe distortion of the MTF because those gaps show up as high-frequency transitions in the FFT.
A workable strategy is to fall back on 4x oversampling. Another strategy is to simply interpolate the missing values from nearby bins. Both of these solutions address the primary issue (gaps in the ESF/PSF), but the residual impact of the interpolation/replacement on the final MTF is harder to mitigate.
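The interpolation strategy might look something like the sketch below. The function names, the tanh-shaped synthetic edge, and the choice of linear interpolation are all assumptions made for illustration; this is not how MTF Mapper necessarily fills its gaps.

```python
import numpy as np

def bin_esf(d, intensity, lo, hi, width=0.125):
    """Mean intensity per bin on [lo, hi); empty bins are returned as NaN."""
    n_bins = int(round((hi - lo) / width))
    idx = np.floor((d - lo) / width).astype(int)
    keep = (idx >= 0) & (idx < n_bins)
    sums = np.bincount(idx[keep], weights=intensity[keep], minlength=n_bins)
    counts = np.bincount(idx[keep], minlength=n_bins)
    esf = np.full(n_bins, np.nan)
    filled = counts > 0
    esf[filled] = sums[filled] / counts[filled]
    return esf

def fill_gaps(esf):
    """Linearly interpolate NaN bins from the nearest filled bins."""
    x = np.arange(esf.size)
    filled = ~np.isnan(esf)
    return np.interp(x, x[filled], esf[filled])

# Synthetic example at the critical slope dx = 4: samples 1/sqrt(17) apart.
d = np.arange(-8, 9) / np.sqrt(17.0)
intensity = 0.5 * (1.0 + np.tanh(2.0 * d))  # made-up smooth edge profile

esf = bin_esf(d, intensity, lo=-2.0, hi=2.0)
repaired = fill_gaps(esf)
print("empty bins before:", int(np.isnan(esf).sum()))
print("empty bins after: ", int(np.isnan(repaired).sum()))
```

With only 17 samples spread over 32 bins, a good fraction of the bins start out empty, which is exactly the situation that arises at a critical angle.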
A new hope
After my previous post (on improved apodization) I started thinking about the notion of applying low-pass filters to an interpolating function applied directly to the dense ESF samples, before binning is performed. I realized that my explanation of the equivalence between binning and fitting an interpolating function + low-pass filtering + sampling only holds when the points are relatively uniformly distributed within each bin.
This got me thinking that I can probably apply a low-pass filter directly to the dense ESF samples, even before binning. The implementation of this approach feels familiar; it turns out to be similar to the method I implemented to perform importance sampling when using an Airy + photosite aperture PSF (this post). Before describing the new method, first consider this illustration of plain vanilla unweighted binning:
Figure 3: Unweighted binning
The pink boxes denote the bins, each 0.125 pixels wide; the horizontal direction depicted here corresponds to the "d" axis in Figure 2. The midpoint, or representative "x" value, of each bin is indicated by the arrows and the values in blue. The green dots represent individual dense ESF samples; their "y" values are not important in this diagram. The positions of the green dots merely illustrate where each dense ESF sample is located within each bin in terms of x value, and the number of dots gives a rough indication of the density of the dense ESF samples.
If we use plain binning, then we choose as representative x value for each bin the midpoint of the bin. The representative y value is obtained as the mean of the y values of the ESF samples within that bin. In Figure 3, the rightmost bin has many ESF samples quite close to the midpoint of the bin, but almost as many ESF samples near the edge of the bin. The effect of unweighted averaging would be that the samples near the right edge of the bin will carry roughly the same weight as the samples near the middle of our bin, but clearly the samples near the middle of the bin should have had a larger weight in computing the representative value for this bin.
A much better way of binning would be to combine the binning step with the low-pass filtering step. Instead of representing each dense ESF sample as a point, it instead becomes a small rectangle, as shown here:
Figure 4: Weighted binning
Yes, I have just reinvented kernel density estimation. Sigh.
Anyhow, this binning approach also makes the low-pass filtering step explicit, so if each dense ESF sample is now represented by an interval of width w pixels, then we are effectively convolving the ESF with a rect(w * x) function. We can remove the low-pass filtering effect on the MTF (calculated further down the pipeline) by dividing the MTF by sinc(0.5 * w * f), as I have shown in my previous post.
Our binning process is beginning to look more like a proper approach to sampling: we apply a low-pass filter to our dense ESF points to remove (or at least strongly attenuate) higher frequencies, followed by choosing one representative value at the midpoint of each bin (the downsampling step). By choosing w = 0.33333 pixels, we have a fairly strong low-pass filter, but one that still has a cut-off frequency that is high enough to allow good detail at least up to 3 cycles per pixel.
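To make the idea concrete, here is a minimal sketch of one plausible reading of the weighted binning: each dense ESF sample is treated as an interval of width w centred on its d value, and its weight in a given bin is the length of the overlap between that interval and the bin. The overlap weighting, the function name `weighted_bin`, and the tanh-shaped test edge are my assumptions; the actual weights used in MTF Mapper may well differ.

```python
import numpy as np

def weighted_bin(d, y, mids, w=1.0 / 3.0, bin_width=0.125):
    """Overlap-weighted binning: every sample is an interval of width w,
    weighted in each bin by how much of the bin it covers."""
    d = np.asarray(d, dtype=float)[:, None]         # shape (n_samples, 1)
    y = np.asarray(y, dtype=float)[:, None]
    mids = np.asarray(mids, dtype=float)[None, :]   # shape (1, n_bins)
    left = np.maximum(d - 0.5 * w, mids - 0.5 * bin_width)
    right = np.minimum(d + 0.5 * w, mids + 0.5 * bin_width)
    overlap = np.clip(right - left, 0.0, None)      # zero where disjoint
    total = overlap.sum(axis=0)
    weighted = (overlap * y).sum(axis=0)
    out = np.full(total.shape, np.nan)              # NaN if nothing overlaps
    np.divide(weighted, total, out=out, where=total > 0)
    return out

# Samples at the critical slope dx = 4, i.e. spaced 1/sqrt(17) apart.
d = np.arange(-8, 9) / np.sqrt(17.0)
y = 0.5 * (1.0 + np.tanh(2.0 * d))                  # made-up edge profile
mids = np.arange(-1.5, 1.5, 0.125) + 0.0625         # bin midpoints

esf = weighted_bin(d, y, mids)
print("bins:", esf.size, " empty bins:", int(np.isnan(esf).sum()))
```

Because w = 0.33333 pixels is wider than the 1/√17 ≈ 0.2425 sample spacing at this critical angle, every bin overlaps at least one sample interval, so no bin is left empty.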
Because of the (relatively) wide low-pass filter, we could probably drop from 8x oversampling down to 4x oversampling, but I like the extra frequency resolution the 8x oversampling produces in the MTF.
Results
Simulating synthetic images with noise similar to that produced by a D7000 at ISO 800 (but a Gaussian PSF), we can investigate the benefits of the new binning method. Ideally, what we would like to see is no difference between accuracy at a 4 degree angle, and accuracy at one of the critical angles. To quantify this, here is a comparison of the 95th percentile of the relative MTF50 error (over a range of MTF50 values from 0.08 cycles/pixel to 0.5 cycles/pixel):
Figure 5: 95th percentile of relative MTF50 error
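For clarity, the error metric can be computed as in the sketch below. The measured values are made-up placeholders purely to show the arithmetic; they are not results from MTF Mapper, and the function name `relative_error_p95` is hypothetical.

```python
import numpy as np

def relative_error_p95(measured, expected):
    """95th percentile of |measured - expected| / expected."""
    rel = np.abs(np.asarray(measured, dtype=float) - expected) / expected
    return np.percentile(rel, 95)

# Placeholder MTF50 measurements (cycles/pixel) for a known MTF50 of 0.25.
expected_mtf50 = 0.25
measured_mtf50 = [0.245, 0.252, 0.251, 0.248, 0.26]

p95 = relative_error_p95(measured_mtf50, expected_mtf50)
print(f"95th percentile of relative MTF50 error: {100 * p95:.1f}%")
```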
Just to be sure, I checked for bias at an edge orientation of 4 degrees (although I recycled the ISO 800 images):
Figure 6: Relative MTF50 deviation (%)
Conclusion
With the new binning method I can say that MTF Mapper no longer has significant problems with edges of certain orientations. More testing is required, but the 95th percentile of relative MTF50 error appears to be below 5%, regardless of edge orientation, for MTF50 values from 0.08 cycles/pixel through to 0.5 cycles/pixel.
The improved binning method will be included in the next release (which should be v0.4.17).
Hi, fvdbergh! I'm an university student in Sweden writing my bachelor thesis on GRD and slanted edge. I'm wondering if I may use figure 1 from this post in my report? I will of course give credit to you! Best regards, Christoffer
Sure, you are welcome to use figure 1!