Tuesday 3 November 2015

PffffFFTttt...

There is no doubt that FFTW is one of the fastest FFT implementations available. It can be a pain to include in a Microsoft Visual Studio project, though. Maybe I am "using it wrong"...

One solution to this problem is to include my own FFT implementation in MTF Mapper, thereby avoiding the FFTW dependency entirely. Although it is generally frowned upon to use a homebrew FFT implementation in lieu of an existing, proven library, I decided it was time to ditch FFTW.

One of the main advantages of using a homebrew FFT implementation is that it avoids the GPL license of FFTW. Not that I have any fundamental objection to the GPL, but the main sources of MTF Mapper are available under a BSD license, which is a less strict license than the GPL. In particular, the BSD license makes allowance for commercial use of the code. Before anyone asks, no, MTF Mapper is not going closed source or anything like that. All things being equal, the BSD license is just less restrictive, and avoiding FFTW brings MTF Mapper closer to being a pure BSD (or compatible) license project.

FFT Implementation

After playing around with a few alternative options, including considering the my first c++ FFT implementation way back from first year at university, I settled on Sorenson's radix-2 real-valued FFT (Sorenson, H.B, et al, Real-Valued Fast Fourier Transform Algorithms, IEEE Transactions on Accoustics, Speech, and Signal Processing, 35(6), 1987). This algorithm appears to be a decent balance between complexity and theoretical efficiency, but I had to work fairly hard at the code to produce a reasonably efficient implementation.

I tried to implement it in fairly straightforward c++, but taking care to use pointer walks in stead of array indexing, and using look up tables for both the bit-reversal process and the sine/cosine functions. These changes produced an algorithm that was at least as fast as my similarly optimized complex FFT implementation augmented with a two-for-the-price-of-one step for real-valued inputs.

One thing I did notice is that the FFT in its "natural" form does not lend itself to an efficient streaming implementation. For example, the first pass of the radix-2 algorithm looks like this:
for (; xp <= xp_sentinel; xp += 2) {  
    double xt = *xp;
    *(xp)   = xt + *(xp+1);
    *(xp+1) = xt - *(xp+1);
}
Note that the value of x[i] (here *xp) is overwritten in the 3rd line of the code, while the original value of x[i] (copied into xt) is still required in the 4th line of the code. This write-after-read dependency causes problems for out-of-order execution. Maybe the compiler is smart enough to unroll the loop and intersperse the reads and writes to achieve maximal utilization of all the processing units on the CPU, but the stride of the loop and the packing of the values is not ideal for SSE2/AVX instructions either. I suppose that this can be addressed with better code, but before I spend time on that I first have to determine how significant raw performance of the FFT is in the context of MTF Mapper.

Real world performance in MTF Mapper

So how much time does MTF Mapper spend calculating FFTs? Well, one FFT for every edge. A high-density grid-style test chart has roughly 1452 edges. According to a "callgrind" trace produced using valgrind, MTF Mapper v0.4.21 spends 0.09% of its instruction count inside FFTW's real-valued FFT algorithm.

Using the homebrew FFT of MTF Mapper 0.4.23 the total number of instruction fetches increase by about 1.34%, but this does not imply a 1.34% increase in runtime. The callgrind trace indicates that 0.31% of v0.4.23's instructions are spent in the new FFT routine.

In relative terms, this implies that the new routine is roughly 3.5 times slower, but this does not account for the additional overheads incurred by FFTW's memory allocation routines (the FFTW routine is not in-place, hence requires a new buffer to be allocated before every FFT to keep the process thread-safe).

Measuring the actual wall-clock time gives us a result of 22.27 ± 0.14 seconds for 20 runs of MTF Mapper v0.4.21 on my test image, versus 21.631 ± 0.16 seconds for 20 runs of v0.4.23 (each experiment repeated 4 times for computing standard deviations). These timings were obtained on a Sandy-bridge laptop with 8/4 threads. The somewhat surprising reversal of the standings (the homebrew FFT now outperforms the FFTW implementation) just goes to show that the interaction between hyperthreading, caching, and SSE/AVX unit contention can produce some surprising results.

Bottom line: the homebrew FFT is fast enough (at least on the two hardware/compiler combinations I tested).

Are we done yet?

Well, surely you want to know how fast the homebrew FFT is in relation to FFTW in a fair fight, right?

I set up a simple test using FFTW version 3.3.4 built on gentoo using gcc-4.9.3, running on a Sandy-bridge laptop cpu (i7-2720QM) running at a base clock of 2.2 GHz. This was a single-threaded test, so we should see a maximum clock speed of 3.3GHz, if we are lucky.

For a 1024-sample real-valued FFT, 2 million iterations took 14.683 seconds using the homebrew code, and only 5.798 seconds using FFTW. That is a ratio of ~2.53.

For a 512-sample (same as what MTF Mapper uses) real-valued FFT, 2 million iterations took 6.635 seconds using the homebrew code, and only 2.743 seconds using FFTW. That is a ratio of ~2.42.

According to general impressions gathered from the Internet, you are doing a good-enough job if you are less than 4x slower than FFTW. I ran metaFFT's benchmarks, which gave a ratio of 2.4x and 2.1x relative to FFTW for size 1024 and 512, respectively (these were probably complex transforms, so not a straight comparison).

The MTF Mapper homebrew FFT at least appears to be in the right ballpark, at least fast enough not to cause embarrassment....

Sunday 5 July 2015

A critical look

Most of the posts on this blog are tutorial / educational in style. I have come across a paper published by an Imatest employee that requires some commentary of a more critical nature. With some experience in the academic peer review process, I hope I can maintain the appropriate degree of objectivity in my commentary.

At any rate, if you have no interest in this kind of commentary / post, please feel free to skip it.

The paper

The paper in question is : Jackson K. M. Roland, " A study of slanted-edge MTF stability and repeatability ", Proc. SPIE 9396, Image Quality and System Performance XII, 93960L (January 8, 2015);  doi:10.1117/12.2077755; http://dx.doi.org/10.1117/12.2077755.

A copy can be obtained directly from Imatest here.

Interesting point of view

One of the contributions of the paper is a discussion of the impact of edge orientation on MTF measurements. The paper appears to approach the problem from a direction that is more closely aligned with the ISO12233:2000 standard, rather than Kohm's method ("Modulation transfer function measurement method and results for the Orbview-3 high resolution imaging satellite", Proceedings of ISPRS, 2004).

By that I mean that Kohm's approach (and MTF Mapper's approach) is to compute an estimate of the edge normal, followed by projection of the pixel centre coordinates (paired with their intensity values) onto this normal. This produces a dense set of samples across the edge in a very intuitive way; the main drawback of this approach being the potential increase in the processing cost because it lends itself better to a floating point implementation.

The ISO12233:2000 approach rather attempts to project the edge "down" (assuming a vertical edge) onto the bottom-most row of pixels in the region of interest (ROI). Using the slope of the edge (estimated earlier), each pixel's intensity (sample) can be shifted left or right by the appropriate phase offset before being projected onto the bottom row. If the bottom row is modelled as bins with 0.25-pixel spacing, this process allows us to construct our 4x-oversampled, binned ESF estimate with the minimum amount of computational effort (although that might depend on whether a particular platform has strong floating-point capabilities).

The method proposed in the Imatest paper is definitely of the ISO12233:2000 variety. How can we tell? Well, the Imatest paper proposes that the ESF must be corrected by appropriate scaling of the x values using a scaling factor of cos(theta), where theta is the edge orientation angle. What this accomplishes is to "squash" the range of x values (i.e. pixel column) to be spaced at an interval that is consistent with the pixel's distance as measured along the normal to the edge. For a 5 degree angle, this correction factor is only 0.9962, meaning that distances will be squashed by a very small amount indeed. So little, in fact, that the ISO12233:2000 standard ignores this correction factor, because a pixel at a horizontal distance of 16 pixels will be mapped to a normal distance of 15.94. Keeping in mind that the ESF bins are 0.25 pixels wide, this error must have seemed small.

I recognize that the Imatest paper proposes a valid solution to this "stretching" of the ESF that would occur in its absence, and that this stretching would become quite large at larger angles (about a 1.5 pixel shift at 25 degrees for our pixel at a horizontal distance of 16 pixels).

My critique of this approach is that it would typically involve the use of floating point calculations, the potential avoidance of which appears to have been one of the main advantages of the ISO12233:2000 method. If you are going to use floating point values, then Kohm's method is more intuitive.

Major technical issues

  1. The Point Spread Functions (PSFs) used to perform the "real world" and simulated experiments were rather different, particularly in one very important aspect. The Canon 6D camera has a PSF that is anisotropic, which follows directly from its square (or even L-shaped) photosites. The composite PSF for the 6D would be an Airy pattern (diffraction) convolved with a square photosite aperture (physical sensor) convolved with a 4-dot beam splitter (the OLPF). Of course I do not have inside information on the exact photosite aperture (maybe chipworks has an image) nor the OLPF (although a 4-dot Lithium Niobate splitter seems reasonable). The point remains that this type of PSF will yield noticeably higher MTF50 values when the slanted edge approaches 45 degrees. Between the 5 and 15 degree orientations employed in the Imatest paper, we would expect a difference of about 1%. This is below the error margin of Imatest, but with a large enough set of observations this systematic effect should be visible.

    In contrast, the Gaussian PSF employed to produce the simulated images is  (or at least is supposed to be) isotropic, and should show no edge-orientation dependent bias. Bottom line: the "real world" images had an  anisotropic PSF, and the simulated images had an isotropic PSF. This means that the one cannot be used in the place of the other to evaluate the effects of edge orientation on measured MTF. Well, at least not without separating the PSF anisotropy from the residual orientation-depended artifacts of the slanted edge method.
  2.  On page 7 the Imatest paper states that "The sampling of the small Gaussian is such that the normally rotationally-invariant Gaussian function has directional factors as you approach 45 degree increments." This is further "illustrated" in Figure 13.

    At this point I take issue with the reviewers who allowed the Imatest paper to be published in this state. If you suddenly find that your Gaussian PSF becomes anisotropic, you have to take a hard look at your implementation. The only reason that the Gaussian (with a small standard deviation) is starting to develop "directional factors" is because you are undersampling the Gaussian beyond repair.

    The usual solution to this problem is to increase the resolution of your synthetic image. By generating your synthetic image at, say, 10x the scale, all your Gaussian PSFs will be reasonably wide in terms of samples in the oversampled image. For MTF measurement using the slanted edge method, you do not even have to downsize your oversampled image before applying the slanted edge method. All you have to do is to change the scale of your resolution axis in your MTF plot. That way you do not even have to worry about the MTF of the downsampling kernel.

    There are several methods that produce even higher quality simulated images. At this point I will plug my own work: see this post or this paper. These approaches rely on importance sampling (for diffraction PSFs) or direct numerical integration of the Gaussian in two dimensions; both these approaches avoid any issues with downsampling and do not sample on a regular grid. These methods are implemented in mtf_generate_rectangle.exe, which is part of the MTF Mapper package.

Minor technical issues

  1. On page 1 the Imatest paper states that the ISO 12233:2014 standard lowered the edge contrast "because with high contrast the measurement becomes unstable". This statement is quite vague, and appears to contradict the results presented in Figure 8, which shows no degradation of performance at high contrast, even in the presence of noise.

    I would offer some alternative explanations: the ISO12233 standard is often applied to images compressed with DCT-based quantization methods, such as JPEG. A high-contrast edge typically shows up with a large-magnitude DCT coefficient at higher frequencies; exactly the frequencies that are more strongly quantized, hence the well-kown appearance of "mosquito noise" in JPEG images. A lower contrast edge will reduce the relative energy at higher frequencies, thus the stronger quantization of high frequencies will have a proportionately smaller effect. I am quite temtpted to go and test this theory right away.

    Another explanation, one that is covered in some depth on Imatest's own website, is of course the potential intensity clipping that may result from incorrect exposure. Keeping the edge contrast in a more manageable range reduces the chance of clipping. Another more subtle reason is that a lower contrast chart allows more headroom for sharpening without clipping. By this I mean that sharpening (of the unsharp masking type) usually results in some "ringing" which manifests as overshoot (on the bright side of the edge) and undershoot (on the dark side of the edge). If chart contrast was so high that the overshoot of overzealous sharpening would be clipped, then it would be harder to measure (and observe) the extent of oversharpening.
  2. The noise model is employed a little basic. Strictly speaking the standard deviation of the additive Gaussian white noise should be signal dependent; this is a more accurate model of photon shot noise, and is trivial to implement. I have not done a systematic study of the effects of noise simulation models on the slanted edge method, but in 2015 one really should simulate photon shot noise as the dominant component of additive noise.
  3. Page 6 of the Imatest paper states that "There is a problem with this 5 degree angle that has not yet been addressed in any standard or paper." All I can say to this is that Kohm's paper has presented an alternative solution to this problem that really should be recognized in the Imatest paper.

Summary

Other than the unforgivable error in the generation of the simulated images, a fair effort, but more time spent on the literature, especially papers like Kohm's, would have changed the tone of the paper considerably, which in turn would have made it more credible.
 

Taking on Imatest


After having worked on MTF Mapper for almost five years now, I have decided that it is time to go head-to-head with Imatest. I downloaded a trial version of Imatest 4.1.12 to face off against MTF Mapper 0.4.18.

For the purpose of this comparison I decided to generate synthetic images using mtf_generate_rectangle. This allows me to use a set of images rendered using an accurately known PSF, meaning that we know exactly what the actual MTF50 value should be for those images. I decided to render a test chart conforming to the SFRPlus format, since that allows me to extract a fair number of edges for each test case. The approximately-sfrplus-chart looks like this:
Figure 1: SFRPlus style chart with an MTF50 value of 0.35 cycles/pixel
 SFRPlus was quite happy to automatically identify and extract regions of interest (ROIs) over all the relevant edges from this image. MTF Mapper can also extract edges from this image automatically. One notable difference is that SFRPlus includes the edges of the squares that overlap with the black bars at the top and bottom of the images, whereas MTF Mapper only considers edges that form part of a complete square. To keep the comparison fair, I discarded the results from the top and bottom rows of squares (as extracted by SFRPlus), leaving us with 19*4 edges per image (SFRPlus ignores the third square in the middle column).

Validating the test images

(This section can be skipped if you trust my methodology)

Although I have posted quite a few posts here on this blog regarding the algorithms used by mtf_generate_rectangle to render synthetic images, I will now show from first principles that the synthetic images truely have the claimed point spread functions (PSFs), and thus known MTFs.

I rendered the synthetic image using a command like this:

mtf_generate_rectangle.exe --b16 --pattern-noise 0.0085 --read-noise 2.5 --adc-gain 0.641 --adc-depth 12 -c 0.33 --target-poly sfrchart.txt -m 0.35 -p gaussian-sampled --airy-samples 100

This particular command renders the SFRPlus chart using a Gaussian PSF with an MTF50 value of 0.35. Reasonably realistic sensor noise is simulated, including photon shot noise, which implies that the noise standard deviation scales as the square root of the signal level; in plain English: we have more noise in bright parts of the image.

I ran a version of  mtf_mapper that dumped the raw samples extracted from the image (normally used to construct the binned ESF); I specified the edge angle as 5 degrees to remove all possible sources of error. NB: the "raw_esf_values.txt" file produced by MTF Mapper contains the binned ESF, and is not suitable for this particular experiment because of the smoothing inherent in the binning.

Given that I specified an MTF50 value of 0.35 cycles per pixel, we know that the standard deviation of the true PSF should be 0.5354018 pixels [ sqrt( log(0.5)/(-2*pi*pi*0.35*0.35) ]. From this we can calculate the expected analytical ESF, which is simply erf(x/sigma)*(upper-lower) + lower, where erf() is the standard "error function", defined as the integral of the unit Gaussian. The values upper and lower merely represent the mean white and black levels, which were defined as lower = 65536*0.33/2 and upper = 65536 - lower. With these values, I can now plot the expected analytical ESF along with the raw ESF samples dumped by MTF Mapper. 

Figure 2: Raw ESF samples along with analytical ESF
I should mention that I shifted the analytical ESF along the "d" axis to compensate for any residual bias in MTF Mapper's edge position estimate. We can see that the overall shape of the analytical ESF appears to line up quite well with the ESF samples extracted from the synthetic image. Next we look at the difference between the two curves:
Figure 3: ESF difference
 
We see two things in Figure 3: The mean difference appears to be close to zero, and the noise magnitude appears to increase with increasing signal levels (to the right). The increase in noise was expected, since that follows from the photon shot noise model used to simulate sensor noise. We can normalize the noise by dividing the ESF difference (noise) by the square root of the analytical ESF, which gives us this plot:

Figure 4: Normalised ESF difference
This normalization appears to keep the noise standard deviation constant, which would be consistent with garden-variety additive Gaussian white noise. The density estimate of the normalized noise looks Gaussian:
Figure 5: Normalized ESF difference density
Running the normalized residuals through the Shapiro-Wilk normality test gives us a p-value of 0.03722 over our 3285 samples. That is bad news, because it means our data is non-Gaussian at a 5% significance level. It is, however, Gaussian at a 10% confidence level. Correction: The normalized residuals are Gaussian at a 3% (or 2.5%, or 1%) significance level. The qqnorm() plot is pretty straight too, which tells us it is more likely that the Shapiro-Wilk test is negatively affected by the large number of samples, than that the residuals are truely not Gaussian.

Now that we have confirmed that the distribution of the residuals are Gaussian, we can fit a line through them. This line comes out with a slope of -0.005765, which means that our normalized residuals are fairly flat. Lastly, we can perform some LOESS smoothing on the normalized residuals:
Figure 6: LOESS fit on normalized ESF difference
Again, we can see that the LOESS-smoothed values oscillate around 0, i.e., there is no trend in the difference between the analyical ESF and the ESF measured from our synthetic image.

The mean signal-to-noise ratio in the bright regions of the images comes out at around 15dB; because we compute the LSF (or PSF if you prefer)) from the derivative of the ESF, the bright parts of the image are representative of the worst-case noise. Alternatively, we can say that the noise is quite similar to that produced by a Nikon D7000 at ISO400, for an SRFplus test chart at a 5:1 contrast ratio.

I have shown that there is no systematic difference between the ESF extracted from a synthetic image and the expected analytical ESF. The simulated noise also behaves in the way that we would expect from properties of the simulated sensor. Based on these observations, we can safely assume that the synthetic images have the desired PSF, i.e., the simulated MTF50 values are spot-on. (In previous posts I examined the properties of the simulated ESF values in the absence of noise, but here I chose to demonstrate the PSF properties directly on the actual images used in the Imatest vs MTF Mapper comparison).


The results

The results presented here were obtained by running Imatest 4.1.12 and MTF Mapper 0.4.18 on these images (about 100MB). SFRPlus (from Imatest, of course) was configured to enable the LSF correction that was recently introduced. Other than that, all settings were left to defaults, including leaving the apodization option enabled. I turned off the "quick mtf" option, although I did not check to see whether this affected the results. After a run of SFRPlus, the "save data" option was used to store the results, after which the "MTF50" column values were extracted, discarding the top and bottom row edges as explained before.

MTF Mapper was run using the "-t 0.5 -r" settings; the "-t 0.5" option is required to allow MTF Mapper to work with the rather low  5:1 contrast ratio. The values output to "raw_mtf_values.txt" were used as the representative MTF50 values extracted by MTF Mapper.

Simulated images were produced over the MTF50 range 0.1 cycles/pixel to 0.7 cycles/pixel in increments of 0.05 cycles/pixel, with one extra data point at 0.08 cycles/pixel to represent the low end (which is quite blurry). For each MTF50 level a total of three images were simulated, each with a different seed to produce unique sensor noise. This gives us 19*3*4 = 228 samples at each MTF50 level.  

As in previous posts, the results will be evaluated in two ways: bias and variance. The first plots to consider illustrate both bias and variance simultaneously, although it is somewhat harder to compare the variance of the methods on these plots.
Figure 7: Imatest relative error boxplot
Figure 8: MTF Mapper relative error boxplot
In figures 7 and 8, the relative difference (or error) is calculated as 100*(measured_mtf50 - expected_mtf50)/expected_mtf50. It is clear that Imatest 4.1.12 underestimates MTF50 values sligthly for MTF50 values above 0.2 cycles/pixel; this pattern is typical of what one would expect if the MTF curve is not adequately corrected for the low-pass filtering effect of the ESF binning step (see this post; ). MTF Mapper corrects for this low-pass filtering effect, producing no clear trend in median MTF50 error over the range considered. We can plot the median measured MTF50 relative error for Imatest and MTF Mapper on the same plot:
Figure 9: Median relative MTF50 error comparison

Figure 9 shows us that the Imatest bias is not all that severe; it remains below 2% over the range of MTF50 values we are likely to encounter in actual photos. (NB: Up to July 30, 2015, this figure had Imatest and MTF Mapper swapped around).

So that illustrates bias. To measure variance we can plot the standard deviation at each MTF50 level:
Figure 10: Standard deviation of relative MTF50 error
Other than at very low MTF50 values (say, 0.08 cycles/pixel and lower), it would appear that MTF Mapper 0.4.18 produces more consistent MTF50 measurements than Imatest 4.1.12.

A final performance metric to consider is the 95th percentile of relative MTF50 error. By computing this value on the absolute value of the relative error, it combines both variance and bias into a single measurement that tells us how close our measurements will be to the true MTF50 value, in 95% of measurements. Here is the plot:
Figure 11: 95th percentile of MTF50 error
Of all the performance metrics presented here, I consider Figure 11 to be the most practical measure of accuracy.

Conclusion

It took quite a bit of effort on my part to improve MTF Mapper to the point where it produces more accurate results than Imatest. There are some other aspects I have not touched on here, such as how accuracy varies with edge orientation. For now, I will say that MTF Mapper produces accurate results at known critical angles, whereas Imatest appears to fail at an angle of 26.565 degrees. Given that Imatest never claimed to work well at angles other than 5 degrees, I will let that one slide.

I have also not included any comparisons to other freely available slanted edge implementations (sfrmat, Quick MTF, the slanted edge ImageJ plugin, mitreSFR). I can tell you from informal testing that most of them appear to perform significantly worse than Imatest, mostly because none of those implementations appear to include the finite-difference-derivative correction. Maybe I will back this opinion up with some more detailed results in future.

So where does that leave your typical Imatest user? Well, the difference in accuracy between Imatest and MTF Mapper is relatively small. What I mean by that is that these results do not imply that Imatest users have to switch over to using MTF Mapper, rather, these results show that MTF Mapper users can trust their measurements to be at least as good as those obtained by Imatest. And, of course, MTF Mapper is free, and the source code is available.

There are some fairly nifty features that I noticed in SFRPlus during this experiment. It appears that SFRPlus will perform lens correction automatically, meaning that radial distortion curvature can be corrected for on the fly. MTF Mapper currently limits the length of the edge it will include in the analysis as a means of avoiding the effects of strong radial distortion. But now that I am aware of this feature, I think it would be relatively straightforward to include lens distortion correction in MTF Mapper. So little time, so many neat ideas to play with ...


Wednesday 24 June 2015

Truncation of the ESF

A really quick post to highlight one specific aspect: what happens to the MTF produced by the slanted edge method if the ESF is truncated.

To recap: The slanted edge method projects image intensity values onto the normal of the edge to produce the Edge Spread Function (ESF). Any practical implementation has to place an upper limit on the maximum distance that pixels can be from the edge (as measured along the edge normal). MTF Mapper, for example, only considers pixels up to a distance of 16 pixels from the edge.

Looking back at the Airy pattern that results from the diffraction of light through a circular aperture we can see that the jinc2 function has infinite support, in other words, it tapers off to zero but never quite reaches zero if we consider a finite domain.

We also know that the effective width of the Airy pattern increases with increasing f-number. Herein lies the problem: a slanted edge implementation that truncates the ESF will necessarily discard part of the Airy pattern. The discarded part is of course the samples furthest from the edge, and we know that those samples tend to contribute more to the lower frequencies in the MTF.

Simulating a slanted edge image using the Airy + photosite aperture model, with an aperture of f/8, light at 550 nm, a 100% fill-factor square photosite aperture, and 4.886 micron photosite pitch (something approximating the D810), we can investigate the impact of the truncation distance on the MTF as measured by the slanted edge method. Here goes:
The green dotted line represents the expected MTF curve (from our simple model). I have zoomed in on the low-frequency region, but we can see that both the truncated MTF measurements (red and black curves) tend to follow the green curve more closely after about 0.10 cycles per pixel. We also note that both the red and black curves contain a few points that are clearly above the green curve between 0 and 0.05 cycles per pixel. It is physically impossible for the measured MTF to exceed the diffraction MTF (blue curve), so we can state with confidence that this is a measurement error.

If we compare the red and the black curves we can see that a wider truncation window (red curve) reduces the overshoot at low frequencies. If we had the opportunity to use an even wider truncation window, we would be able to reduce the overshoot to even lower levels.

Lastly, if we introduce apodization into the mix we are compounding the problem even further by attenuating the edges of the PSF. This leads to even greater overshoot (at low frequencies) in our measured MTF curve.

Bottom line: The slanted edge method is constrained by practical limitations, most notably the desire to have a finite truncation window, and the desire to reduce the impact of image noise using apodization of the PSF. These constraints lead to overshoot in the lowest frequencies of the measured MTF. It may be possible to apply an empirical correction to minimize the overshoot, but only at the cost of making strong assumptions regarding the shape of the MTF, which is best avoided.

Tuesday 23 June 2015

Anisotropy

In my post on "critical angles" I mentioned that there was one other factor to consider when looking at the influence of edge orientation on slanted edge analysis. I will refer to that phenomenon as the influence of anisotropic point spread functions. In this context, I use the term anisotropic to refer to point spread functions that are not radially symmetric.

The simplest example of an anisotropic PSF is to consider just a square photosite aperture, without any lens aperture diffraction.
Figure 1: Edge orientation relative to photosite aperture

In figure 1 we can see the interaction between our slanted edge (shown in blue here) and the photosite aperture (orange). If the value t represents the distance from the centre of our photosite aperture to the right edge of our slanted edge (rectangle or step edge), then we can consider the overlapping area between the two as a function of t. The interesting range of values for t would be between -√0.5 and √0.5, if we assume the photosite is a square with sides of length 1. Plotting this overlapping area as a function of t gives us Figure 2:
Figure 2: Fraction of square photosite covered by slanted edge as a function of edge distance to photosite centre
When the edge orientation angle theta is 0 degrees, then we obtain a linear function, which is what one would expect. If the edge is at a 45 degree angle (as shown in the right panel of Figure 1), then we obtain the other extreme. Angles between 0 and 45 degrees produce a curve that is somewhere in between these extremes.

What can we learn from these curves? Well, we can see that an edge orientation of 45 degrees will overlap with the photosite square from -√0.5 to √0.5, whereas the 0 degrees edge orientation only results in overlap between -0.5 and 0.5. From this we can infer that the square appears wider when approached by an edge with a 45 degree orientation. We also know that a square photosite acts as a low-pass filter, in the sense that the image captured by our sensor is the convolution of this low-pass filter and the analytical model of our scene. This might lead one to believe that the 45 degree case would result in a stronger low-pass filter, because it is clearly "wider" than the 0 degree case.

We can plot the derivative of the curves from Figure 2:
Figure 3: Instantaneous width of PSF

The 0 degree case is easy to visualize with the help of the left panel of Figure 1: clearly, the width of the photosite square (measured along the step edge) is constant. The 45 degree case is also readily visualized by noting that the we cross the widest part of the photosite square when t=0 (right panel of Figure 1); this nicely corresponds to the peak instantaneous width of √2 in Figure 3.

We can interpret the curve in Figure 3 as a weighting function, i.e., the relative contribution to the convolution of the edge and the photosite aperture at distance t from the centre of the photosite aperture. Looking at the problem this way reveals a new angle: the 45 degree case presents a fair amount of its total weight located close to t=0. Roughly 50.6% of its weight is located in the part where it is wider than the 0 degree case, corresponding to the central of Figure 3 where the red curve is above the gray curve. In contrast, only about 8.6% of the weight of the 45 degree curve is located in the two tail ends (t < -0.5 and t > 0.5). If we compare this to the 0 degree case, we obtain 42% in the centre (area under gray curve where the red curve is above the gray curve), and of course 0% in the tails.

This is a rather unexpected turn of events, since it implies that even though the 45 degree case starts overlapping with the edge sooner (the regions -√0.5 < t < -0.5 and 0.5 < t < √0.5), it represents only a small fraction of the total interaction with the edge. Instead of the 45 degree case being a stronger low-pass filter than the 0 degree case, we expect the opposite because the 45 degree case has roughly 20% (50.6/42) more of its weight located close to t=0.

We appear to have two mildly conflicting views:
a) the 45 degree case is "wider" at its widest point, thus it should be a stronger low-pass filter than the 0 degree case, and
b) more of the weight of the 45 degree case is close to the centre, hence it should present a weaker low-pass filter than the 0 degree case.

I am betting on outcome b), mostly because I already know what the empirical results will tell us ....

Empirical results for square photosites (no diffraction)

The prediction favoured by outcome b) in the previous section tells us that we should expect MTF50 values to increase as we progress from a relative edge orientation of 0 degrees through to 45 degrees. Simulations were performed in the absence of noise, using 30 repetitions over sub-pixel shifts. Keep in mind that the MTF50 value of a square photosite aperture is about 0.6033 cycles per pixel, which is quite high.
Figure 4: Square (box) PSF relative MFT50 error as function of edge orientation
We can see that MTF50 overestimation steadily increase to about 5% as we approach 45 degrees.

Just to check, let us examine an isotropic PSF: a pure Gaussian without any photosite aperture simulation. This should yield a purely Gaussian MTF. Same simulation, but with the radially symmetric Gaussian PSF:
Figure 5: Gaussian PSF relative MTF50 error as a function of edge orientation
Other than a bit of a glitch at 2 degrees producing a few outliers, we see a fairly flat median MTF50 error with the Gaussian PSF. No systematically increasing MTF50 error with increasing angle appears.

Somewhat real world: squares plus diffraction

We have seen that a box PSF (without diffraction) produces strong anisotropy, and that a Gaussian PSF (without photosite aperture) produces no noticeable anisotropy. Using a PSF consisting of an Airy pattern convolved with a square photosite aperture should put us somewhere in the middle of the anisotropy scale.

Simulations were repeated using a simulated aperture at f/2.8, light at 550 nm, a photosite pitch of 4.73 micron and no AA (OLPF) filter. These settings give an expected MTF50 value of ~ 0.504 cycles per pixel, which is slightly lower than the expected MTF50 value of ~ 0.6 cycles per pixel seen in the previous section. Accordingly, the MTF50 errors may be slightly reduced (or at least the expected variance should be reduced).
Figure 6: Airy+box PSF relative MTF50 error as a function of edge orientation

The trend is clearly visible, but appears to be only about 60% of the magnitude of the case without diffraction (about 2.5% at 44 degrees, vs about 4% without diffraction). Smaller apertures (larger f-numbers) will reduce the anisotropy as the Airy component of the PSF will start to dominate the photosite aperture PSF.

Any practical implications?

The effect of PSF anisotropy on MTF measurements is real, but appears to be relatively small. At 2.5%, do we even have to worry about it?

Unfortunately, we have to at least be aware of this for certain types of testing and measurement. Because the error (overestimation) is systematic, it will show up in any measurement that sweeps through a range of angles, just like the MTF Mapper grid test chart, pictured here:
MTF Mapper grid test chart
This chart can be used to produce Sagittal/Meridional MTF50 plots across your lens/sensor/camera. The chart aims to keep one edge perpendicular to the virtual line connecting that edge to the centre of the chart, which inevitably causes some of the squares to approach a 45 degree edge orientation.

I simulated this chart using mtf_generate_rectangle, using an aperture of f/4, an Airy+box PSF, green light and a photosite pitch of 4.73 micron. Passing this synthetic image through MTF mapper to produce a surface plot (-s option) produces this result:
Figure 7: MTF50 plot obtained from simulated rendering of the MTF Mapper grid test chart using the Airy-box MTF with a square photosite aperture

The systematic distortion of MTF50 values is clearly visible, even though the range of values is quite small. The maximum value on the scale is 0.47, which is only about 2% higher than the expected MTF50 value of 0.46073 (at 0 degrees, of course). But the cross pattern is clearly visible. At least I have confirmed the cause.

Pushing for even greater realism I repeated the simulation using the "rounded-square" photosite aperture that MTF Mapper provides. Here is the surface plot:
Figure 8: MTF50 plot obtained from simulated rendering of the MTF Mapper grid test chart using the Airy-box MTF with a rounded-square photosite aperture
We can see that the MTF50 values are slightly higher (I think the effective fill factor is slightly lower for my hand-crafted rounded corner photosite aperture), but ignore that bit for the moment. Instead, notice that the range is even smaller than the square aperture case (Figure 7), but the cross pattern is still visible.

Lastly, if we use a circular photosite aperture, we get this:
Figure 9: MTF50 plot obtained from simulated rendering of the MTF Mapper grid test chart using the Airy-box MTF with a circular photosite aperture
Other than the fact that the resulting image is appallingly ugly, we can see that the cross structure has disappeared, as expected.

Conclusion

Anisotropy is a reality that we have to deal with if we apply the slanted edge method to edges that approach a relative orientation of 45 degrees with respect to the (presumed) square photosites. The isotropy of the Airy pattern helps to attenuate the overestimation of edges approaching 45 degrees, but the systematic effect is still clearly visible in simulated images.

I tried to construct an elegant analytical explanation for the interaction between the edge orientation and a square photosite aperture. This turned out to be harder than I expected, so I only have some interesting plots to offer for now. What did emerge from the theory is that we should not focus on the apparent width of the photosite aperture, but rather on the distribution of its weight relative to the centre. The somewhat startling conclusion is that we should observe higher MTF50 measurements when the orientation approaches 45 degrees.

This was supported by the actual experiments using simulated imagery.

So what can we do about this systematic distortion? Well, the only sound solution would be to stick to edges with a relative orientation of about 5 degrees. This is not a universal solution, though, because it makes it impossible to measure in the true Sagittal/Meridional directions. Imatest solved the problem by sticking to 5 degree angles and referring to "horizontal" and "vertical" MTF. This works well enough if you wish to measure peak astigmatism, but it does not allow you to measure MTF in the optically more appropriate sagittal/meridional directions.

I might add a 5-degree test chart to MTF Mapper in future, just to cover all bases.

Tuesday 16 June 2015

MTF Mapper v0.4.17 Windows binary released

Must be a slow news day.

Anyhow, a Windows binary of the latest release of MTF Mapper, v.0.4.17, is now available on sourceforge.

Version 0.4.17 does not add any new functionality as such, but it does incorporate a few improvements in measurement accuracy. If I broke anything, please let me know!

Also, I finally upgraded the dcraw version included in the Windows binaries to 9.26.

Monday 15 June 2015

Critical angles

It is often said that there is more than one way to skin a cat.

Well, today I discovered an Imatest article that demonstrates just how wildly different slanted edge implementations can (and apparently do) vary. I will leave my critique of said article for another day, but I will note that this article makes reference to the "5 degrees" rule that is often seen when slanted edge measurements are performed.

The "5 degrees" rule states that the orientation of the edge relative to the sensor's photosite grid should be approximately 5 degrees (either horizontal or vertical).

There are two notable reasons for this: firstly, a 5 degree angle is far from the critical angles (the topic of this post), and secondly, a 5 degree angle ensures that the potential non-rotationally symmetric behaviour of the PSF is minimized. A discussion of the non-rotationally symmetric PSFs will also be postponed to a future article.

A closer look at the slanted edge method

Figure 1 illustrates how MTF Mapper constructs the oversampled edge spread function (ESF) that is the starting point of the MTF calculation.
Figure 1: How the ESF is sampled

We want to oversample the ESF so that we can increase the effective Nyquist limit; this is extremely important if we want to measure frequencies close to the natural Nyquist limit of 0.5 cycles per pixel of our sensor. The Shannon-Nyquist theorem shows us that we will have aliasing at frequencies above 0.5 cycles per pixel if we sample at a rate of 1 sample per pixel.

Pushing up our sampling rate to 8x moves the Nyquist limit up to 4 cycles per pixel, which allows us to examine the behaviour of our MTF curve near 0.5 cycles per pixel without fear that we are being misled by aliasing artifacts.

How can we increase the spatial sampling rate of our sensor? Well, we cannot change the sensor, but we can use a trick to generate a synthetic ESF. Looking at Figure 1 above we can see that the edge (represented as a black line) crosses the pixel grid in different places as we move along the edge. More importantly, pay attention to the shortest distance from each black dot (representing the centre of each pixel/photosite) to the black edge. Notice how this distance varies by a fraction of the pixel spacing as we move along the edge.

Let us assume that we have a coordinate system with its origin at the centre of our top/leftmost pixel of our sensor, such that the black dots representing the pixel centres can be addressed by integer coordinates. If we take the (x, y) coordinate of a pixel near the edge, and project this coordinate onto the vector representing the edge normal (i.e., the vector perpendicular to the edge under analysis), then we obtain a real-valued scalar that represents the distance of the pixel centre from our edge. We can pair this projected distance-from-edge value with the intensity of that pixel to form a sample point on our synthetic ESF, as shown in Figure 1.

How does this help us to oversample the ESF? Well, if we choose an appropriate edge orientation angle, say, 5 degrees, then the projected ESF points will be densely spaced. In other words, the average distance between to consecutive samples in our projected ESF will be a fraction of the pixel spacing. We can partition the projected ESF points into bins of width 0.125 pixels to produce a regularly-spaced sampled ESF with 8x oversampling.

We know this works well for 5 degrees (because that is what everyone is doing), but what is so special about 5 degrees? To answer that, we have to slog through some elementary math.

Spacing of projected samples

Figure 2 illustrates one possible way in which we can assign integer coordinates to the pixels near the edge under analysis.
Figure 2: How pixel coordinates are assigned

Note that we can pick an arbitrary origin (shown as (X0,Y0) in red); this just simplifies the math that will follow. This point need not fall exactly on the edge, but without loss of generality we can pretend that it does, since this means we can use integer coordinates to refer to the pixel centres of pixels near the edge.

The orientation of the edge can be specified in degrees as measured from the horizontal, but I prefer using the slope of the line. If the angle between the edge and the horizontal is  θ, then the direction perpendicular to the edge can be represented as the unit length vector (-sin(θ), cos(θ)). This would be expressed as a slope 1/Δx = tan(θ), such that Δx = 1/tan(θ).

The normal vector (-sin(θ), cos(θ)) then becomes (-1, Δx) * 1/√(1 + Δx2). We project our pixel centres, represented as integer coordinates (x, y), onto this normal vector by computing the dot product (x, y) · (-1, Δx) * 1/√(1 + Δx2), which evaluates to d(x,y) = 1/√(1 + Δx2) * (-x + yΔx).

The function d(x,y) thus computes the distance that the pixel located at (x, y) is from the origin (X0,Y0), which we will pretend falls on the edge; this means that d(x,y) measures the perpendicular distance of point (x, y) from the edge. The projected ESF point is thus [d(x,y), I(x+X0, y + Y0)], where I(i, j) denotes the intensity of the pixel located at pixel(i,j).

Suppose that we focus only on the subset of pixels with integer coordinates (p, q) such that 0 ≤ d(p, q) < 1. If we are to achieve 8x oversampling, then there must be at least 8 unique distance values d(p, q) in this interval. In fact, we would require these 8 points to be spread out uniformly such that at least one d(p, q) value falls in the interval [0, 0.125), one in [0.125, 0.25), and so on, such that each of the sub-intervals of length 0.125 between 0 and 1 contain at least one point.

Consider, for example, the case where Δx = 4. This reduces d(p, q) to 1/√(1 + 42) * (-p + 4q) = (-p + 4q)/√17. Because both p and q are integers, we can deduce that d(p, q) must be an integer multiple of 1/√17. How many integer multiples of 1/√17 can we fit in between 0 and 1? If we enumerate them, we can choose p and q such that (-p + 4q) is the set {0, 1, 2, 3, 4, 5, 6 ...}. But √17 = 4.123106 (and change), so if (-p + 4q) ≥ 5, then d(p, q) > 1. That leaves only the set {0, 1, 2, 3, 4}, such that the only values of 0 ≤ d(p, q) < 1 are {0, 1/√17, 2/√17, 3/√17, 4/√17}.

Whoops! If Δx = 4, then there will only be 5 unique values of d(p, q) between 0 and 1, and we need at least 8 points between 0 and 1 to achieve 8x oversampling! The implications of the failure to achieve 8x oversampling will be covered a bit later; first we must identify the critical angles.

Enumerating the problem angles

We already know that Δx = 4 causes our 8x oversampling to fail; this corresponds to an angle of atan(1/4) = 14.036 degrees. In fact, it is fairly simple to see that for any integer value Δx, we will have Δx + 1 unique values between 0 and 1 (if we include the 0 in our count). For 8x oversampling, the spacing between d(p, q) values must be less than 0.125, which happens when we have at least 8 unique d(p, q) values between 0 and 1. For Δx = 8, we see that 1/√(1 + Δx2) = 1/√65 ≈ 0.12403.

The angles that will lead to a failure of the 8x oversampling mechanism are thus:  45, 26.565051, 18.434949, 14.036243, 11.309932,  9.462322, and  8.130102.

Some other Δx values are also problematic: 1.5, and 2.5. These yield only 2Δx + 1 unique values (including zero). Setting Δx = 1.25 only yields 4Δx + 1 a total of 7 unique values. These fractional slopes occur at angles of 33.69007, 21.80141, and 38.65981 degrees.

There may even be more of these problematic angles, but this is as far as I have come with this analysis. Feel free to comment if you can help me identify other values of Δx that will lead to undersampling.

Dealing with the critical angles

So what exactly happens when we do not have at least one sample every 0.125 pixels along the ESF? The corresponding bin in the resampled ESF will be missing, and leaving gaps in the resampled ESF leads to severe distortion of the MTF because those gaps show up as high-frequency transitions in the FFT.

A workable strategy is to fall back on 4x oversampling. Another strategy is to simply interpolate the from nearby bins. Both of these solutions address the primary issue (gaps in the ESF/PSF), but the residual impact of the interpolation/replacement on the final MTF is harder to mitigate.

A new hope

After my previous post (on improved apodization) I started thinking about the notion of applying low-pass filters to an interpolating function applied directly to the dense ESF samples, before binning is performed. I realized that my explanation of the equivalence between binning and fitting an interpolating function + low-pass filtering + sampling only holds when the points are relatively uniformly distributed within each bin.

This got me thinking that I can probably apply a low-pass filter directly to the dense ESF samples, even before binning. The implementation of this approach feels familiar; it turns out to be similar to the method I implemented to perform importance sampling when using an Airy + photosite aperture PSF (this post). Before describing the new method, first consider this illustration of plain vanilla unweighted binning:

Figure 3: Unweighted binning

The pink boxes denote the bins, each 0.125 pixels wide; the horizontal direction depicted here corresponds to the "d" axis in Figure 2. The midpoint, or representative "x" value for each bin is indicated by the arrows and the values in blue. The green dots represent individual dense ESF samples --- their "y" values are not important in this diagram; the position of the green dots are merely to illustrate where each dense ESF sample is located within each bin in terms of x value, and the number of dots give a rough indication of the density of the dense ESF samples.

If we use plain binning, then we choose as representative x value for each bin the midpoint of the bin. The representative y value is obtained as the mean of the y values of the ESF samples within that bin. In Figure 3, the rightmost bin has many ESF samples quite close to the midpoint of the bin, but almost as many ESF samples near the edge of the bin. The effect of unweighted averaging would be that the samples near the right edge of the bin will carry roughly the same weight as the samples near the middle of our bin, but clearly the samples near the middle of the bin should have had a larger weight in computing the representative value for this bin.

A much better way of binning would be to combine the binning step with the low-pass filtering step. Instead of representing each dense ESF sample as a point, it instead becomes a small rectangle, as shown here:

Figure 4: Weighted binning
Now we can make the weight of each sample point proportional to the overlap of the sample's rectangle and the bin extents. This will allow the samples closer to the midpoint more weight, but it also allows a point to contribute to multiple adjacent bins, depending on the width of the rectangle. This smooths out the transition from one bin to the next, especially if the rectangle is wider than the bin width. (Ok, so the rectangle in the diagram is really just a 1-D interval, not a 2D shape. But the principle still holds.)

Yes, I have just reinvented kernel density estimation. Sigh.

Anyhow, this binning approach also makes the low-pass filtering step explicit, so if each dense ESF sample is now represented by an interval of width w pixels, then we are effectively convolving the ESF with a rect(w * x)  function. We can remove the low-pass filtering effect on the MTF (calculated further down the pipeline) by dividing the MTF by sinc(0.5 * w * f), as I have shown in my previous post.

Our binning process is beginning to look more like a proper approach to sampling: we apply a low-pass filter to our dense ESF points to remove (or at least strongly attenuate) higher frequencies, followed by choosing one representative value at the midpoint of each bin (the downsampling step). By choosing w = 0.33333 pixels, we have a fairly strong low-pass filter, but one that still has a cut-off frequency that is high enough to allow good detail at least up to 3 cycles per pixel.

Because of the (relatively) wide low-pass filter, we could probably drop from 8x oversampling down to 4x oversampling, but I like the extra frequency resolution the 8x oversampling produces in the MTF.

Results

Simulating synthetic images with noise similar to that produced by a D7000 at ISO 800 (but a Gaussian PSF), we can investigate the benefits of the new binning method. Ideally, what we would like to see is no difference between accuracy at a 4 degree angle, and accuracy at one of the critical angles. To quantify this, here is a comparison of 95% percentile of the relative MTF50 error (over a range of MTF50 values from 0.08 cycles/pixel to 0.5 cycles/pixel):
Figure 5: 95% percentile of relative MTF50 error (click to enlarge)
The results are very promising. Most notable is the fact that the new binning method performs virtually identically regardless of edge orientation, with 26.565 degrees being the only angle that is slightly worse than the others.  There may be a slight drop relative to MTF Mapper v0.4.16 (at 4 degrees), but keep in mind the contribution of the change in windowing method discussed in my previous post.

 Just the be sure, I checked for bias at an edge orientation of 4 degrees (although I recycled the ISO800 images):
Figure 6: Relative MTF50 deviation (%)
We can see that the new binning method does not introduce any bias in MTF50 estimates --- of course this is after correction using the MTF of the low-pass filter, as described above.

Conclusion

With the new binning method I can say that MTF Mapper no longer has significant problems with edges of certain orientations. More testing is required, but the 95% percentile of relative MTF50 error appears to be below 5%, regardless of edge orientation, for MTF50 values from 0.08 cycles/pixel through to 0.5 cycles/pixel.

The improved binning method will be included in the next release (which should be v0.4.17).

Thursday 11 June 2015

Improved apodization and bias correction

Following on the relatively recent addition of LSF correction to Imatest, I decided to revisit some of the implementation details of MTF Mapper.

The brutal truth is that MTF Mapper used an empirical correction factor (shock, shock, horror!) to remove the observed bias in measured MTF curves. The empirical correction factor (or rather, family of correction factors) was obtained by generating a synthetic image with a known, analytical MTF curve, and calculating the resulting ratio of the measured curve (as produced by MTF Mapper) to the expected analytical curve.

This had the advantage that it would remove both known distortions, such as that generated by the finite-difference approximation to the derivative (which Imatest refers to as the LSF correction factor), and other distortions which were produced by processes that I did not fully understand at the time.

This post will deal with two of the distortions that I have identified, and I will propose solutions that will enable MTF Mapper to do away with the empirical correction approach.

Apodization

Apodization, also called "windowing", is a way to attenuate some of the artifacts resulting from the application of the FFT (or DFT, if you like) to a signal of a finite length. The DFT/FFT assumes that the signal is periodic, that is, the first (leftmost) sample is preceded (circularly) by the last (rightmost) sample. Applying the FFT to a signal that is discontinuous when treated in this circularly wrapped-around way usually results in significant energy spuriously appearing on the high frequency end of the frequency spectrum.

A common windowing function is the Hamming window, which looks like a cosine function centered on the center of the sequence of samples. The samples are multiplied component-wise with the window function, effectively producing a new set of samples such that the leftmost and rightmost samples are scaled to very low magnitudes. Since the left- and rightmost samples are now all close to zero, we are guaranteed to have a signal that no longer has a discontinuity when wrapping around the left/right ends.

So why would we use apodization as part of the slanted edge method? First, recall how the slanted edge method works:
Step 1: generate the edge spread function (ESF)
This diagram shows how the individual pixel intensities are projected along a line that coincides with the edge we are analyzing. Owing to the angle of the edge relative to the pixel grid, the density of the projected values (along the direction perpendicular to the edge) is much greater than the original pixel spacing. The densely-spaced projected values are binned to form a regularly-spaced set of samples at (usually) 4x or 8x oversampling relative to the pixel grid. This allows us to measure frequencies above the Nyquist limit imposed by the original pixel grid.

Now we can compute the MTF as illustrated here:
Step 2: Compute MTF from PSF using FFT

Notice that the PSF is usually quite compact, i.e., most of the area under the PSF curve is located close to the centre of the PSF curve. This is typical of a PSF extracted from a real-world edge. We see some noise on the tails of the PSF, with visibly more noise on the right side --- this is an artifact of photon shot noise being relative to the signal level, so the noise magnitude is larger in the bright parts of the image.

Anyhow, since the noise is random, we might end up with large values on the edges, such as can be seen on the right end of the PSF samples. This is exactly the scenario which we would like to avoid, so we can apply a window to "squash" the samples near the edges of the PSF.

MTF Mapper had been using a plain Hamming window up to now --- this resulted in a systematic bias in MTF measurements, particularly affecting edges with an MTF50 value below 0.1 cycles per pixel.
Hamming window

Two things are visible here: the noise is suppressed reasonably well (on the ends of the green curve) after multiplying the PSF by the Hamming window function (see right side of illustration), and the PSF appears to contract slightly, effectively becoming slightly narrower after windowing.

The apparent narrowing of the PSF has the expected impact on MTF50 values: they are overestimated slightly.

I identified three possible methods to address this systematic overestimation of MTF50 values (on the low end of MTF50 values): empirical correction (as MTF Mapper has been doing so far), deconvolution, and using a different window function.

We can "reverse" the effect of the windowing after we have applied the FFT to obtain the MTF. By the convolution theorem, we know that convolution in the time domain becomes multiplication in the frequency domain. Since we multiply the PSF by the window function in the time domain, it stands to reason that we must deconvolve the MTF by the Fourier transform of the window function. Except that deconvolution is a black art that is best avoided.

I have tried many different approaches, but the high noise levels in the PSF makes for a poor experience, more apt to inject additional distortion into our MTF than to undo the slight distortion caused by windowing in the first place.


That leaves us only with the last option: choose a different window function. Purely based on aesthetics, I decided on the Tukey window with an alpha parameter of 0.6:
Tukey window
Notice that we may get slightly less noise suppression, but in return we distort the PSF far less. In fact, at this level (MTF50 = 0.05) the distortion is negligible, and no further correction factors are required. This is the new apodization method employed by MTF Mapper.

LSF correction and beyond

As already mentioned, the finite-difference method used to calculate the PSF (or LSF, if you are pedantic) from the ESF is not identical to the ideal analytical derivative of the ESF. A sin(x)/(x) correction factor can be employed to effectively remove this distortion. The Imatest article on this topic does a fine job of explaining the maths behind this correction; the method was originally published by Burns while working at Kodac.

Since MTF Mapper employs 8x oversampling, we must divide the calculated MTF by the function sin(π * f/4)/(π * f/4). Clarification: This stems from the sample spacing that is 0.125 pixels. Plugging this into the finite-difference derivative calculation as explained in the Imatest article we see that for 8x oversampling we will have a correction factor of sin(π * f/4)/(π * f/4) as opposed to the sin(π * f/2)/(π * f/2) we would have had for 4x oversampling.

Even after applying this correction factor, though, we can see a systematic difference between the expected ideal MTF and the MTF produced by the slanted edge method. To understand this (final?) distortion, we have to rewind back to the step where we construct the ESF (helpfully captioned "Step 1" above...).

The projection used to form the dense ESF samples produces a dense set of points, but these points are no longer spaced at convenient regular intervals. The FFT rather depends on being fed regularly spaced samples, so the simplest solution is to bin the samples at our desired oversampling factor. An oversampling factor of 8x thus produces bins that are 0.125 pixels wide.

Again following the path of least resistance, we simply average all the values in each bin to obtain our regularly-sampled ESF. This seems like such a harmless little detail, but if we stop and think about it, we realize that this must be a low-pass filter. Why?

Well, consider first a continuous interpolation function passing through all the ESF samples before binning. We would like to sample this function at regular intervals (0.125 pixels, to be exact), but we know that point sampling will produce horrible aliasing artifacts. The correct approach is to apply a low-pass filter, i.e., convolve our interpolating function with some filter. Let us choose a simple box filter of width 0.125 pixels. If we first convolve the interpolating function with this box filter, and then point-sample at intervals of 0.125 pixels, we end up with exactly the same result as we would obtain from binning followed by averaging all the values in each bin. This approach is optimal in terms of noise suppression for a Gaussian noise source, so even though it sounds simplistic, it is a good solution.

Fortunately, this process is easily reversible by indiscriminate application of the convolution theorem: convolution in the time domain can be reversed by dividing the MTF (in the frequency domain) by the Fourier transform of our low-pass filter. And by now we know that the Fourier transform of a box filter is the sinc() function --- all we have to do is choose the proper frequency.

At 8x oversampling, our bin width is 0.125 pixels, resulting in a low-pass filter of rect(8x). In the Fourier domain, this means we must divide the MTF by sinc(π * f/8) --- this will effectively reverse the attenuation of the MTF induced by the low-pass filter.

To illustrate the effect of these two components (discrete derivative and binning low-pass filter) we can look at a simple example using a Gaussian PSF, with no added noise, and no apoditization. We start with the dense ESF of an edge with an MTF50 value of exactly 0.25 cycles/pixel:
Figure 1: Dense ESF

This ESF is binned into bins of width 0.125 pixels:
Figure 2: binned ESF

Next we calculate the discrete derivative to obtain the PSF:
Figure 3: discrete PSF

 This PSF is passed through the FFT to obtain the following MTF curve:

Figure 4: measured MTF curve
That MTF curve looks pretty good. And it looks very much like half of a Gaussian, just as we would expect. But looks can be deceiving at this scale. We know the true analytical MTF curve that we would expect: a Gaussian with a standard deviation of about 0.2123305 (and change). So next we plot the measured MTF curve divided by the expected MTF curve:
Figure 5: Uncorrected MTF ratio (red)
The dashed blue curve is the sin(π * f/4)/(π * f/4) function, corresponding to the discrete derivative correction, and the red curve is the ratio of measured to expected MTF. Clearly these two curves have roughly the same shape. Let us take our measured MTF curve, divide it by the sinc(f) curve to apply the discrete derivative correction, and plot the ratio of the corrected curve to the expected curve:
Figure 6: Partially corrected MTF ratio (red)
Note how the red curve (corrected MTF divided by expected MTF) has flattened out --- keep in mind that we would expect this curve to flatten out into a straight line. The black dashed line is the function sin(π * f/8)/(π * f/8), i.e., the Fourier transform of the rect(8x) low-pass filter induced by the binning process. Now we can combine the two corrections, i.e., take the measured MTF, divide by the discrete derivative correction, and then divide the result by the low-pass correction; this gives us the "fully corrected" MTF curve. Plotting the fully corrected MTF curve divided by the expected analytical MTF curve yields this:
Figure 7: Fully corrected MTF ratio (red)
The red curve is almost, but not quite, a constant value of 1.0. This demonstrates that the low-pass correction helps to bring us closer to the expected ideal MTF curve.

If we zoom out a bit on the last plot, we see things are not entirely rosy:
Figure 8: Fully corrected MTF ratio (red), wide view
Once we move past a frequency of 1 cycle per pixel, the corrected curve does not match the expected curve so well anymore, at least not when expressed as a ratio. But looking back at Figure 4 above, we see that the measured MTF curve is practically zero beyond 1 cyc/pixel anyway, so we should expect some numerical instability when dividing the measured curve by the expected curve. This also explains my choice of scale in a few of the plots above.

If we express the difference between the fully corrected curve and the expected analytical curve as a percentage of the magnitude of the analytical curve, we see that the fully corrected curve deviates only about 0.15% at 1 cyc/pixel, and only about 0.05% at 0.5 cyc/pixel (Nyquist). For reference, the relative deviation of a completely uncorrected curve is about 10% and 3% at 1 and 0.5 cyc/pixel respectively. Applying only the discrete derivative correction leaves a deviation of about 2.8% and 0.6%.

So adding the correction for the low-pass filter effect of the binning is definitely in the diminishing returns category, but I certainly aim to make MTF Mapper the most accurate tool out there, so no expense is spared.

Summary: The full correction to take care of both the finite-difference correction, and the removal of the attenuation induced by the low-pass filter (implicitly part of the binning operation) is the product of the two individual term, i.e.,
c(f) = sin(π * f/4)/(π * f/4) * sin(π * f/8)/(π * f/8),
The MTF curve is corrected by dividing by this correction factor.

Accuracy evaluation

To demonstrate the effect of the new apoditization and MTF correction approaches, we can look at the MTF50 accuracy over a range of MTF50 values. For each of the MTF50 levels shown below, a number of synthetic images were rendered without adding any simulated noise --- this is to emphasize the inherent bias in measured MTF50 values. All edges were kept at a relative angle of 4.5 degrees, with 30 repetitions rendered using small sub-pixel shifts of the rectangle.
Figure 9: Relative MTF50 deviation on a Gaussian PSF
Our three contestants are MTF Mapper v0.4.16, which employs a Hamming windowing function and empirical MTF curve correction, followed by an implementation that uses a Hamming window with only the discrete derivative correction, and finally the new implementation using a Tukey windowing function with both discrete derivative and binning low-pass corrections.

It is clear that the Hamming window + derivative correction (blue curve) produces a significant bias at low MTF50 values, raising their values artificially (as expected from the apparent narrowing of the PSF). Also note how the MTF50 values are underestimated at higher MTF50 values, which is again consistent with the effects of the binning low-pass filter.

Both the empirical correction method (red curve) and the new Tukey window plus full correction (black curve) display much lower bias in their MTF50 estimates, as seen in Figure 9.

What happens when we use a different PSF to generate our synthetic images? This time I chose the Airy + photosite aperture (square aperture, 100% fill factor) as a representative. This corresponds to something like a D7000 sensor without an OLPF, but without noise.
Figure 10: Relative MTF50 deviation on an Airy+box PSF
Firstly, we see some shockingly large errors on the low MTF50 side. The data points correspond to a simulated aperture of f/64, followed by f/32, f/16, f/8, f/5.6, f/4 and finally f/2.8. A reasonable explanation for the difference between the results in Figure 9 and 10 might be the wider support of the Airy PSF. Typically, the central peak of the Airy PSF is narrower than a Gaussian, but the Gaussian also drops off to zero more quickly, i.e., the Airy PSF has more energy in the tails of the PSF. This means that a wide (f/64) Airy PSF will be affected more strongly by the windowing function, and may even suffer from some truncation of the PSF --- this notion seems to be supported by the difference between the Tukey and Hamming window curves (black vs blue).

Interestingly the empirical correction performed better than expected, doing almost as well as the Tukey + full correction method. This is somewhat unexpected, since the empirical correction factors were calculated from a Gaussian PSF.

Since these experiments were all performed in the absence of simulated noise, they really only test the inherent bias of the various methods. The good news is that the Tukey + full correction approach appears to be an overall improvement over the existing empirical correction, even thought the improvement is really quite small.

Adding in some noise

It always makes sense to look at both bias and variance when comparing the quality of two competing models. In this spirit, the experiments above were repeated under mild noise conditions, corresponding to roughly ISO 800 on a D7000 sensor. First up, the Gaussian PSF:
Figure 11: Standard deviation of relative MTF error on Gaussian PSF
Figure 11 presents the standard deviation of the relative MTF50 error, expressed as a percentage. We see the impact of the Tukey windowing function quite clearly: since the Tukey window does not attenuate such a large part of the PSF (i.e., less of the edge of the PSF is attenuated), we see a small increase in the standard deviation of the relative error. As expected, both the methods using the Hamming window perform nearly identically.

Conclusion

MTF Mapper will employ the new apodization function (Tukey window) as well as the analytically-derived full correction in lieu of the older Hamming window + empirical correction, starting from the next release. This should be v0.4.17 onwards.

The new correction method is more elegant, and makes fewer assumptions regarding the shape of the MTF curve, unlike the empirical correction that was trained on only Gaussian MTFs. But throwing out the empirical correction brings back the strong attenuation of the PSF at lower MTF50 values, so the Hamming window had to be replaced with the Tukey window.

We pay a small price for using the Tukey window, but realistically the MTF50 error should remain below 5% (for an expected MTF50 value of 0.5 c/p) even under quite noisy conditions.

In theory it should be possible to incorporate strong low-pass filtering of the PSF, followed by suitable reversal-via-division of the low-pass filter in the frequency domain. In practice, I have not seen any worthwhile improvement in accuracy. I suspect that some non-linear adaptive filter may be able to strike the right balance, but that will have to wait for now.