MTF Mapper

Roving

2021-03-04T10:11:00.001+02:00

Are you sending a robotic rover to Mars? Do you require software to characterize and calibrate the cameras on your rover? Then you should consider using MTF Mapper, just like the Mars 2020 Perseverance Mastcam-Z team did! [1]

ps: Of course, MTF Mapper also works on earthbound cameras of all shapes and sizes.

References

1. Hayes, A.G., Corlies, P., Tate, C. et al. Pre-Flight Calibration of the Mars 2020 Rover Mastcam Zoom (Mastcam-Z) Multispectral, Stereoscopic Imager. Space Sci Rev, 217, 29 (2021).

Lateral Chromatic Aberration measurements now available in MTF Mapper

2020-07-12T19:07:00.007+02:00

The article title rather ruins the surprise, but MTF Mapper now supports the measurement of Chromatic Aberration (CA). For now, the emphasis is on the measurement of lateral CA, and in this post I will demonstrate the accuracy of MTF Mapper's implementation, and I will also provide some usage examples.

Before we jump in, I'd just like to point out that I now accept donations through PayPal to support the development of MTF Mapper.

It goes without saying that MTF Mapper remains free and Open Source. There is no obligation to send me money if you use MTF Mapper, and I will continue to add new features to MTF Mapper regardless.

Ok, back to the topic. This post is quite long, so you can skip ahead to the part that interests you if you like. First, I have a review of the concept of lateral chromatic aberration, so you can skip over this to an explanation of how lateral CA is measured in MTF Mapper. Or you can skip to the MTF Mapper CA measurement accuracy experiments and results. Lastly, you can go straight to the usage example if you want to see how to use this, and what the output looks like.

Edge orientation convention

In order to fully explain lateral CA measurement as implemented in MTF Mapper, it is necessary to agree on some convention to describe the orientation of an edge. In this context, "edge" refers to the edge of a trapezoidal slanted-edge target. A radial edge falls along a line passing through the centre of the lens, as seen from the front of the lens. The midpoint of a tangential edge is tangent to a circle concentric with the optical axis of the lens.

Fig 1: Edge orientation convention

Although these definitions are strict, we will allow for some deviation from the exact radial and tangential orientations in practice when performing measurements.
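Here is a minimal Python sketch of how this convention could be turned into a test. The function name, the use of the image centre as a stand-in for the optical axis, and the default 45-degree tolerance (borrowed from the cut-off mentioned later in this post) are assumptions made for illustration; this is not MTF Mapper's actual code.

```python
import math

def classify_edge(mid, direction, centre, threshold_deg=45.0):
    """Label an edge 'radial' or 'tangential'.

    mid:       (x, y) midpoint of the edge
    direction: (dx, dy) vector pointing along the edge
    centre:    (x, y) image centre, assumed to coincide with the optical axis
    """
    rx, ry = mid[0] - centre[0], mid[1] - centre[1]
    r_len = math.hypot(rx, ry)
    d_len = math.hypot(direction[0], direction[1])
    if r_len == 0 or d_len == 0:
        raise ValueError("midpoint must differ from centre; direction must be non-zero")
    # Angle between the edge direction and the radial line, folded into [0, 90] degrees.
    cos_a = abs(rx * direction[0] + ry * direction[1]) / (r_len * d_len)
    angle = math.degrees(math.acos(min(1.0, cos_a)))
    return "radial" if angle < threshold_deg else "tangential"

print(classify_edge((100, 0), (1, 0), (0, 0)))   # edge points along the radius
print(classify_edge((100, 0), (0, 1), (0, 0)))   # edge is perpendicular to the radius
```

The tolerance is what allows for the deviation from the exact orientations mentioned above.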

Longitudinal Chromatic aberration

Longitudinal Chromatic Aberration (LoCA) appears when a lens does not focus light with different wavelengths at the same focus plane. For example, blue light might be focused slightly in front of the sensor at the same time that green light is perfectly focused on the sensor.

Fig 2: Illustration of longitudinal CA

If you are capturing an image of a crisp step edge (like we often see in slanted-edge MTF measurements), then the image of the edge in the blue wavelengths still lines up perfectly with the image of the edge in the green wavelengths; it might just be slightly softer because it is out of focus. As you might imagine, LoCA does not necessarily have any preferential direction, i.e., edges in a tangential orientation could be affected by LoCA in exactly the same way as edges with a radial orientation. Of course, things are never quite this simple, so tangential and radial LoCA might be different if your lens suffers from severe astigmatism. Fig 3 gives you some idea of what longitudinal CA could look like in practice.

Fig 3: Example of simulated longitudinal CA

For another excellent discussion of CA, including a demonstration of how MTF50 varies with focus position between the three colour channels, you should head over to Jack Hogan's article on the topic, and I should mention that Jack used Jim Kasson's data in places.

Lateral Chromatic aberration

Lateral Chromatic Aberration simply means that the image magnification is dependent on the wavelength. In other words, we might have a lens that can keep blue, green and red wavelengths all concurrently in focus at the sensor (where "in focus" means focus is within some tolerance), but the blue wavelengths are projected with a magnification of 0.99 relative to the green wavelengths, and the red wavelengths with a magnification of 1.01, for example.
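To put some numbers to this, here is a small Python sketch that converts a relative magnification into a radial shift in pixels. It assumes a purely linear magnification difference with green as the reference; the function name and the radii are made up for illustration.

```python
def lateral_shift(radius_px, magnification):
    """Radial displacement (pixels) of a channel relative to green, assuming
    green has a magnification of exactly 1 and the difference is purely linear."""
    return (magnification - 1.0) * radius_px

for r in (500, 1000, 2000):
    blue = lateral_shift(r, 0.99)   # blue magnification from the example above
    red = lateral_shift(r, 1.01)    # red magnification from the example above
    print(f"r={r:5d} px  blue shift={blue:+.1f} px  red shift={red:+.1f} px")
```

Under these assumptions the shift grows in proportion to the distance from the centre, with blue moving inwards and red moving outwards.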

Fig 4: Illustration of lateral CA

Returning to our slanted-edge target, which we can pretend is located near the edge of the image with a tangential orientation, we will see that the red, green and blue channels have comparable sharpness, but that the edge in the blue channel appears to be shifted inwards towards the centre of the image, and the red channel edge appears to be shifted outwards, relative to the green edge. Keep in mind that this is just one example of how the magnification varies by wavelength, and that other combinations are possible, e.g., both red and blue could have sub-unity magnification. Fig 5 presents a simulated example of lateral CA:

Fig 5: Example of simulated lateral CA

So what about radially oriented edges? Well, because the wavelength-dependent magnification is radial, we tend to not see any red or blue channel shift at all on radial edges. You could still see some longitudinal CA on those edges, though.

How do you measure CA

Well, I do not know how other people measure it, but MTF Mapper uses the Line Spread Function (LSF) of slanted-edge targets to measure the apparent shift between the red-green and green-blue channel pairs. This means that I first extract an Edge Spread Function (ESF) for each channel, using the tried-and-tested code in MTF Mapper that already compensates for geometric lens distortion. Taking the derivative of the ESF yields the LSF, and the weighted centroid of the LSF is a fairly good estimate of the sub-pixel location of the edge. Rinse and repeat for all three colour channels, and you can estimate CA by subtracting the green channel centroid from the red and blue channel centroids.
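The centroid idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not MTF Mapper's implementation: the ESF is a synthetic Gaussian-blurred step, the per-channel edge positions are made-up values, the samples are noise-free and already 4x oversampled, and no windowing is applied to the LSF.

```python
import math

def esf(x, edge_pos, sigma=1.5):
    """Synthetic edge spread function: a step blurred by a Gaussian (assumed model)."""
    return 0.5 * (1.0 + math.erf((x - edge_pos) / (sigma * math.sqrt(2.0))))

def lsf_centroid(xs, esf_vals):
    """Differentiate the ESF by finite differences, then return the
    weighted centroid of the resulting LSF."""
    mids = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    lsf = [(v1 - v0) / (x1 - x0)
           for x0, x1, v0, v1 in zip(xs, xs[1:], esf_vals, esf_vals[1:])]
    return sum(m * w for m, w in zip(mids, lsf)) / sum(lsf)

xs = [i * 0.25 for i in range(-80, 81)]                 # positions across the edge, in pixels
true_pos = {"red": 0.30, "green": 0.0, "blue": -0.45}   # hypothetical edge positions
centroid = {c: lsf_centroid(xs, [esf(x, p) for x in xs]) for c, p in true_pos.items()}

ca_red = centroid["red"] - centroid["green"]
ca_blue = centroid["blue"] - centroid["green"]
print(f"red-green shift: {ca_red:+.3f} px, blue-green shift: {ca_blue:+.3f} px")
```

With real data the LSF is noisy, so the weighting of the centroid matters a lot more than it does in this clean example.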

I suppose there are many other ways to implement this; really, any method that is typically used to perform dense image co-registration will work. Personally, I have played with some Fourier cross-correlation methods [1], as well as a really cool method [2] that estimates the least-squares optimal FIR filter (convolution filter, if you want to sound more hip) to reconstruct the moving image from the fixed image, e.g., estimating the red channel from the green channel, in the CA application case. The centroid of this FIR filter will give you an accurate estimate of the sub-pixel shift. Anyhow, these methods are fine, but there are some advantages to the LSF-based method. The first two that come to mind are that the LSF-based method is not affected by lens distortion, and that you can apply it directly to a Bayer-mosaiced image without requiring a demosaicing algorithm. Just like MTF/SFR measurements, if you are interested in measuring the lens properties, then capturing a raw image and processing the Bayer-mosaiced image directly is the way to go.

Note that the LSF-centroid method I described above can be applied to any edge orientation that is suitable for slanted-edge measurements, so more-or-less anything but integer multiples of 45°. I started out measuring the CA on all edges, but I eventually decided to discard the measurements obtained from radial edges (more precisely, edges that are within 45 degrees of being radially oriented), mostly because I did not know what I was measuring. You can get around this MTF Mapper restriction at the moment by cropping out a single slanted edge, and using the --single-roi MTF Mapper option, which will measure CA regardless of the radial / tangential nature of the edge.

Experimental set-up

To test the accuracy of MTF Mapper's CA measurement functionality, I generated synthetic images with simulated CA. The following three aspects could affect the measurements:

The magnitude of the shift between the colour channels. Just to make sure that we can detect both subtle and severe CA.
The apparent sharpness of the edge, since a softer simulated lens yields an image with more blurry edges. Intuitively, a more blurry edge makes the exact location of the edge less well defined, which should increase the uncertainty in the CA measurement.
Simulated noise, using a realistic sensor noise model (FPN, read noise and photon shot noise), but with a tunable overall SNR. Again, one would expect higher noise levels to increase the uncertainty of the CA measurement.

Some combinations of the above dimensions will now be investigated.

Results: blur vs. noise at a fixed shift, full density RGB

Although one would expect the absolute shift between, say, the red and the green channels to increase as a function of radial distance, with worse lateral CA at the edge of the image, this does not make the best candidate for low-level accuracy evaluation. My first set of experiments simulated a constant radial shift of 1 pixel between the red and the green channels across the entire simulated image (and a -1 shift for blue). This yields 322 usable radial edge measurements from one simulated "lensgrid" chart image, with most edges ending up with a length between 100 and 120 pixels (this is relevant when discussing the impact of noise). This is what the synthetic image looks like, if you are curious:

The simulated system was similar to my usual diffraction + square photosite with a simulated aperture of f/4, and a photosite pitch of 4.73 micron. This time, however, I further employed the "-p wavefront-box" option to simulate varying degrees of defocus, sweeping the W020 term from 0λ through to 3.5λ. Such a range of simulated defocus yields images with MTF50 values in the range 0.47 to 0.05 cycles per pixel, covering the range you are likely to encounter in actual use.

Fig 6: 95th percentile shift error (pixels), full density RGB input images

In Fig 6 we can see the 95th percentile of the absolute error in the measured shift, relative to the expected shift, as we vary the edge sharpness and the noise level. To put things in perspective, keep in mind that an SNR of 10 is quite poor, and that you should be able to achieve an SNR of 100 with a printed paper test chart without too much effort. Of course, one has to keep in mind that vignetting can be a problem in the image corners: an SNR of 100 in the image centre with about 3.3 stops of vignetting will result in an SNR of 10 in the image corners. Anyhow, this is what the MTF50=0.05, SNR=5 case, with red shifted +1 pixels, and blue shifted -1 pixels looks like:

Fig 7: An example of what a 1-pixel lateral shift in the red and blue channels looks like under MTF50=0.05 with SNR=5 conditions

As I will demonstrate later, the measured CA shift values are completely unbiased. Here I show the 95th percentile of the error, but the mean value in each cell (over all 322 edges) deviates less than 0.013 pixels from the expected value in all cells, including the worst-case combination of blur and noise. One minor detail remains, though: the results in Fig 6 were obtained with full density RGB input images, meaning the simulation generated individual R, G and B values at each pixel, so this is representative of a Foveon-like sensor, not a typical Bayer CFA sensor.

Results: blur vs. noise at a fixed shift, Bayer-mosaiced

A more representative example would be to reduce the full-density RGB image to a Bayer-mosaiced image, and then to process the resulting image using MTF Mapper's "--bayer green" input option. This will cause the CA measurements to be performed using the mosaiced data; note that the "--bayer green" option will still cause MTF Mapper to use the appropriate red and green Bayer subsets for CA measurements, and that only the MTF data will be extracted using only the green Bayer subset. Since the number of samples along each edge is reduced by the Bayer mosaicing, we expect an increase in the CA measurement error.
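For reference, reducing a full-density RGB image to a Bayer mosaic can be sketched as follows in Python with NumPy. The RGGB layout and the function name are assumptions for illustration; the random image merely stands in for a simulated chart.

```python
import numpy as np

def mosaic_rggb(rgb):
    """Reduce an H x W x 3 image to a single-plane Bayer mosaic (RGGB layout assumed)."""
    h, w, _ = rgb.shape
    bayer = np.empty((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites on the red rows
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites on the blue rows
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return bayer

rgb = np.random.default_rng(0).random((8, 8, 3))   # placeholder for a simulated image
bayer = mosaic_rggb(rgb)

red_fraction = bayer[0::2, 0::2].size / bayer.size
green_fraction = (bayer[0::2, 1::2].size + bayer[1::2, 0::2].size) / bayer.size
print(red_fraction, green_fraction)  # prints 0.25 0.5
```

Only a quarter of the pixels carry red (or blue) samples and half carry green, which is why fewer samples are available along each edge.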

Fig 8: 95th percentile shift error (pixels), Bayer mosaic input images

Comparing Fig 8 to Fig 6 reveals that we now require better SNR values to maintain the same level of accuracy in the CA shift measurements. For example, the SNR=40 column in Fig 8 is a reasonable match for the SNR=20 column in Fig 6. Ok, perhaps Bayer-mosaiced SNR=40 is slightly better than full-RGB SNR=20, but you get the idea.

Results: blur vs. noise at a fixed shift, Bayer demosaiced

What happens if we take the Bayer-mosaiced images generated in the previous step, and we run them through a demosaicing algorithm to produce another full-density RGB image? I chose to use OpenCV's built-in demosaicing algorithm, COLOR_BayerRG2BGR_EA. This may not be the best demosaicing algorithm on the planet, but it produced output that was consistent.

Fig 9: 95th percentile shift error (pixels), demosaiced input images

Comparing Fig 9 to Fig 8 delivers a few surprises: At the SNR=100 and SNR=50 noise levels, the demosaiced image produced slightly more accurate results around the MTF50=0.24 scenario rather than at the sharpest setting (MTF50=0.47), and the demosaiced image actually performed slightly better than the raw Bayer-mosaiced case in the high-blur, low-SNR corner.

In retrospect, it makes sense that the demosaiced CA measurement is less accurate on very sharp edges, because aliasing is likely to make demosaicing harder, or less accurate, perhaps. The other surprise is perhaps also not so unexpected, since demosaicing is an interpolation process, thus we expect it to filter out high-frequency noise to some extent. In the lower-right corner of our table we encounter scenarios where there is little high-frequency content to the edge location (the edge is blurry, after all), but we have a lot of noise, so some additional low-pass filtering courtesy of the demosaicing interpolation helps just enough. Perhaps this is an indication that MTF Mapper could benefit from applying additional smoothing during low-SNR CA measurements in future.

Results: shift vs. noise at a fixed blur, Bayer-mosaiced

Note that I will now skip over the full-density RGB results for brevity, and move right along to raw Bayer-mosaiced results obtained at different simulated lateral CA shifts. The expectation is that the shift measurement accuracy should be independent of the actual shift magnitude within the usable measurement range, which is probably around ±20 pixels. Only two noise levels were investigated, SNR=50 and SNR=10, representing the good-but-not-perfect, and need-to-improve-lighting scenarios.

Fig 10: Difference between measured CA shift and expected CA shift, low noise scenario, raw mosaiced image

Fig 11: Difference between measured CA shift and expected CA shift, high noise scenario, raw mosaiced image

The boxplots not only give us an indication of the spread of the CA shift measurements, but also hint at the smallest difference in shift that can be discerned. As shown in Fig 10, MTF Mapper can easily measure differences in CA shift of well below 0.1 pixels under good SNR conditions, but Fig 11 shows that there is some overlap between adjacent shift steps under very noisy conditions. By overlap I mean that adjacent boxes are separated by an expected difference of 0.1 pixels in CA shift, but we can see that the height of the whiskers of the boxes in Fig 11 exceeds 0.1 pixels. Another reassuring feature is that we can see that all the boxes are centred nicely around an error of zero, thus the measurements are unbiased.

Results: shift vs. noise at a fixed blur, Bayer demosaiced

Similar to the previous experiment, but with an additional demosaicing step, again using OpenCV's algorithm. I will only show the noisy case for brevity:

Fig 12: Difference between measured CA shift and expected CA shift, high noise scenario, OpenCV demosaiced image

If you compare Figs 11 and 12, you might notice that the spread of the differences is slightly smaller in Fig 12, i.e., just like we observed in Fig 9 we are seeing a small noise reduction benefit with demosaicing, but without adding any bias to the measurement. Well, that is with OpenCV's demosaicing algorithm. I also repeated the experiment using LibRaw's open_bayer() function, but I suspect that I am doing something wrong, because I get some pronounced zippering artifacts, and this boxplot:

Fig 13: Difference between measured CA shift and expected CA shift, low noise scenario, LibRaw AHD demosaic image

Unlike all the other scenarios, the LibRaw AHD experiment yields CA shift measurements that have a very obvious bias, seen as the systematic deviation from a zero-mean error in Fig 13. I am not going to say much more here, because it is most likely user error on my part.

Recommendations following the experiments

Several aspects that affect the accuracy of lateral CA measurements were considered above, including the magnitude of the shift between channels, the overall sharpness of the edge, and the prevailing signal-to-noise conditions. It is hard to make any recommendation without choosing some arbitrary threshold of when a CA measurement is considered "accurate enough". My rule of thumb in these matters is that an accuracy of 0.1 pixels is a reasonable target, since this figure often appears in machine vision literature. Given this target, we can break down the results above to produce the following recommendations:

The easiest parameter to control, and the one you should probably put the most effort into controlling, is the signal-to-noise ratio. If your images look like Fig 7 above, you simply cannot expect accurate results. If you can keep the SNR at 30 or higher, you can expect to hit the target accuracy of 0.1 pixels.
The next parameter you should try to control is focus. CA measurements on well-focused images are more accurate than those obtained from blurry photos. Having said that, some lenses are just not very sharp, and if you have extreme field curvature your image corners are bound to be out of focus. One recommendation here is to capture two shots: one with the "optimal" focus (whatever your criteria for that are), and another obtained after focusing in one of the image corners. This two-photo strategy is really only necessary in extreme cases when you face blurry corners and extreme vignetting, or you could not control the SNR because of external factors.
Do not measure CA on demosaiced images if you can help it. For Fuji X-trans you have no choice with MTF Mapper at the moment, but for any Bayer CFA layout you should use raw images when possible. At some point, I will look at other demosaicing algorithms, maybe even Lightroom, just to see what is possible.

User interface and visualization

CA measurements are now available through the MTF Mapper GUI. If you are new to MTF Mapper, you can view the user guide after installation (e.g., using this link to install the Windows 10 binaries) under the help menu in the GUI, or you can download a PDF version from this link if you want to browse through the guide first before you decide to install MTF Mapper.

Fig 14: Preferences dialogue

As shown in Fig 14, MTF Mapper now has a new output type (a), as well as an option to control which type of visualization to generate (b). The CA measurement is visualized either as the actual shift in pixels (or microns, if you set the pixel size and select the "lp/mm units" option), measured in the red/blue channels, or the shift can be expressed as a percentage of the radial distance from the measurement to the centre of the image. I processed some images of my Sigma 10-20 mm lens on a Nikon D7000 to illustrate:

Fig 15: lateral CA map, with shift indicated in pixels

Fig 16: same lateral CA map, but this time the shift is expressed as a fraction of the radial distance

Personally, I prefer the visualization in Fig 15, but I have noticed that other software suites offer the visualization in Fig 16; normalizing the lateral CA shift by the radial distance should help to make the measurement less dependent on the final print size, I suppose. The main problem with the way that I have implemented this method (Fig 16) is that there is an asymptote at the centre of the lens where the radial distance is very small. This rather ruins the scaling of the plot; I think Imatest works around this problem by suppressing the central 30% of the plot. I should probably do something similar, but suggestions are welcome.
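The asymptote is easy to reproduce. The Python sketch below expresses a shift as a percentage of the radial distance and optionally suppresses measurements inside a central radius; the function name, the `min_radius_px` parameter and the numbers are hypothetical and are not part of MTF Mapper.

```python
import math

def shift_as_percent_of_radius(shift_px, x, y, cx, cy, min_radius_px=0.0):
    """Express a CA shift as a percentage of the radial distance from (cx, cy).
    Returns None at or inside min_radius_px, where the ratio becomes unstable."""
    r = math.hypot(x - cx, y - cy)
    if r <= min_radius_px:
        return None
    return 100.0 * shift_px / r

# A genuine 1-pixel shift far from the centre...
print(shift_as_percent_of_radius(1.0, 2000, 0, 0, 0))
# ...is dwarfed by a 0.1-pixel measurement error very close to the centre.
print(shift_as_percent_of_radius(0.1, 5, 0, 0, 0))
# Suppressing the central region avoids the problem.
print(shift_as_percent_of_radius(0.1, 5, 0, 0, 0, min_radius_px=300))
```

Even a tiny measurement error near the centre produces a larger percentage than a real shift near the corner, which is what ruins the scaling of the plot.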

Here is a crop of a target trapezoid from the top-left corner of the image, with a measured red channel shift of -0.4 (i.e., red is smaller than green), and a measured blue channel shift of -1.06 on the upper left edge, which gives it the green tinge we see in Fig 17. The opposite edge sports a magenta tinge, as we would expect.

Fig 17: Crop of a target from the top-left corner of the chart image used to produce Figs 15 and 16. Please excuse the sad condition of the test chart I used here :)

For my usage patterns, I think it would be very helpful to be able to select a specific edge, and have MTF Mapper display the lateral CA measurement for that edge. My plan is to add the CA measurement to the info display that goes with the MTF/SFR curve plot that pops up when you click on an edge in the Annotated image output. In the meantime, both the smoothed, gridded data and the raw measurements are made available for your enjoyment.

If you are a command-line user, you can enable CA measurement with the "--ca" flag. The raw CA measurements are available in a self-documenting file called "chromatic_aberration.txt" in the output directory, but you also get the plot as "ca_image.png". Note that GUI users can now also get their hands on both these files by using the "Save all results" button on the main GUI window after completing a run.

Lastly, I should emphasize that it is important to achieve good alignment between the sensor and the test chart for CA measurements. I think that there might be some tilt in my results shown in Figs 15 and 16 above. Or perhaps lateral CA is not supposed to be perfectly symmetric; come to think of it, I suppose it would be affected by tilted lens elements, for example. At any rate, you can use MTF Mapper's built-in Chart Orientation output mode to help you refine your chart alignment.

Conclusion

You can try out the latest version of MTF Mapper yourself to play with the new features. I have released Windows binaries of version 0.7.29, and I will probably release some Ubuntu .debs as well soon. Or you can build from source :)

In future, I plan on doing some comparisons with other related tools, probably Imatest and QuickMTF, to see how well they perform on my synthetic images. And I hope to understand the problems I have run into with certain demosaicing algorithms, although I still think that lateral CA measurements should be performed on raw images whenever possible.

References

1. Almonacid-Caballer, Jaime, Josep E. Pardo-Pascual, and Luis A. Ruiz. "Evaluating Fourier cross-correlation sub-pixel registration in Landsat images." Remote Sensing 9, no. 10 (2017): 1051.
2. Gilman, Andrew, and Arno Leist. "Global Illumination-Invariant Fast Sub-Pixel Image Registration." The Eighth International Multi-Conference on Computing in the Global Information Technology (ICCGI), Nice, France (2013): 95-100.

Aliasing and the slanted-edge method: what you have to know

2019-08-29T13:50:00.000+02:00

Aliasing, as it pertains to the digital sampling of signals, is a tricky subject to discuss, even amongst people who have a background in signal processing. It is not entirely unlike the notion that the speed of light in a vacuum represents an absolute upper speed limit to everything in the universe as we understand it. Once people accept the speed of light as an absolute limit, they often have trouble accepting subtle variations on the theme, like how Cerenkov radiation is caused by particles moving through a medium at a speed exceeding the speed of light in that same medium.

Similarly, I have found that once people embrace the Shannon-Nyquist sampling theorem and its consequent implications, they are reluctant to accept new information that superficially appears to violate the theorem. In this post I hope to explain how the slanted-edge method works, in particular the way that it only appears to violate the Shannon-Nyquist theorem, but really does not. I also circle back to the issue of critical angles discussed in an earlier post, but this time I show the interaction between aliasing and edge orientation angle.

What were we trying to do anyway?

I will assume that you would like to measure the Modulation Transfer Function (MTF) of your optical system, and that you have your reasons.

To obtain the MTF, we first obtain an estimate of the Point Spread Function (PSF); the Fourier Transform of the PSF is the Optical Transfer Function (OTF), and the modulus of the OTF is the MTF we are after. But what is the PSF? Well, the PSF is the impulse response of our optical system, i.e., if you were to capture the image of a point light source, such as a distant star, then the image of your star as represented in your captured image will be a discretely sampled version of the PSF. For an aberration-free in-focus lens + sensor system at a relatively large aperture setting, the bulk of the non-zero part of the PSF will be concentrated in an area roughly the size of a single photosite; this should immediately raise a red flag if you are familiar with sampling theory.
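The PSF to OTF to MTF chain is short enough to sketch in Python with NumPy. A circular Gaussian is assumed as the PSF here purely because its MTF has a known closed form to check against; the width `sigma` and the fine sample spacing `step` are arbitrary illustrative values.

```python
import numpy as np

sigma = 0.6                        # assumed Gaussian PSF width, in pixels
step = 0.25                        # sample spacing in pixels (finer than the sensor grid)
x = np.arange(-16, 16, step)       # 128 samples per axis
xx, yy = np.meshgrid(x, x)
psf = np.exp(-0.5 * (xx**2 + yy**2) / sigma**2)

otf = np.fft.fft2(psf)                      # Fourier transform of the PSF
mtf = np.abs(otf) / np.abs(otf[0, 0])       # modulus, normalised to 1 at zero frequency
freq = np.fft.fftfreq(x.size, d=step)       # frequencies in cycles per pixel

# Compare a slice along the horizontal frequency axis with the closed form
# exp(-2 * (pi * sigma * f)^2) that holds for a Gaussian PSF.
half = x.size // 2
analytic = np.exp(-2.0 * (np.pi * sigma * freq[:half]) ** 2)
err = np.max(np.abs(mtf[0, :half] - analytic))
print("max |numeric - analytic| =", err)
```

Note that the PSF is sampled four times more finely than one sample per pixel here, which is exactly the luxury we do not have with a real sensor.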

Nyquist-Shannon sampling theory tells us that if our input signal has zero energy at frequencies of F Hertz and higher, then we can represent the input signal exactly using a series of samples spaced 1/(2F) seconds apart. We can translate this phrasing to the image domain by thinking in terms of spatial frequencies, i.e., using cycles per pixel rather than cycles per second (Hertz). Or you could use cycles per millimetre, but it will become clear later that cycles per pixel is a very convenient unit.

If we look at an image captured by a digital image sensor, we can think of the image as a 2D grid of samples with the samples spaced 1 pixel apart in both the x and y dimensions, i.e., the image represents a sampled signal with a spacing of one sample per pixel. I really do mean that the image represents a grid of dimensionless point-sampled values; do not think of the individual pixels as little squares that tile neatly to form a gap-free 2D surface. The fact that the actual photosites on the sensor are not points, but that the actively sensing part of each photosite (its aperture) does approximately look like a little square on some sensors, is accounted for in a different way: rather think of it as lumping the effect of this non-pointlike photosite aperture with the spatial effects of the lens. In other words, the system PSF is the convolution of the photosite aperture PSF and the lens PSF, and we are merely sampling this system PSF when we capture an image. This way of thinking about it makes it clear that the image sampling process can be modeled as a "pure" point-sampling process, exactly like the way it is interpreted in the Nyquist-Shannon theory.

With that out of the way, we can get back to the sampling theorem. Recall that the theorem requires the energy of the input signal to be zero above frequency F. If we are sampling at a rate of 1 sample per pixel, then our so-called Nyquist frequency will be F = 0.5 cycles per pixel. Think of an image in which the pixel columns alternate between white and black --- one such pair of white/black columns is one cycle.

So how do we know if our signal power is zero at frequencies above 0.5 cycles per pixel? We just look at the system MTF to see if the contrast is non-zero above 0.5 cycles per pixel. Note that the system MTF is the combination (product) of the lens MTF, and the photosite aperture MTF. If our photosites are modelled as 100% fill-factor squares, then the photosite aperture MTF is just |sinc(f)|, and if we model an ideal lens that is perfectly focused, then the lens MTF is just the diffraction MTF. I have simulated some examples of such a system using a photosite pitch of 5 micron, light at 550 nm, and apertures of f/4 and f/16, illustrated in Fig 1.
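A rough Python sketch of this system MTF model follows. It uses the standard closed-form MTF of a diffraction-limited circular aperture and |sinc(f)| for the photosite aperture, with the pitch, wavelength and apertures quoted above; the function names are mine, and this is an illustration rather than the code used to produce Fig 1.

```python
import math

def diffraction_mtf(f, pitch_um, wavelength_um, f_number):
    """MTF of an ideal circular aperture; f is in cycles per pixel."""
    cutoff = pitch_um / (wavelength_um * f_number)   # diffraction cutoff, cycles per pixel
    s = f / cutoff
    if s >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(s) - s * math.sqrt(1.0 - s * s))

def aperture_mtf(f):
    """|sinc(f)| of a 100% fill-factor square photosite; f is in cycles per pixel."""
    if f == 0.0:
        return 1.0
    return abs(math.sin(math.pi * f) / (math.pi * f))

def system_mtf(f, f_number, pitch_um=5.0, wavelength_um=0.55):
    """Product of the lens (diffraction) MTF and the photosite aperture MTF."""
    return diffraction_mtf(f, pitch_um, wavelength_um, f_number) * aperture_mtf(f)

for n in (4, 16):
    cutoff = 5.0 / (0.55 * n)
    print(f"f/{n}: cutoff = {cutoff:.2f} c/p, "
          f"MTF at 0.5 c/p = {system_mtf(0.5, n):.3f}, "
          f"MTF at 0.75 c/p = {system_mtf(0.75, n):.3f}")
```

With these parameters the f/16 diffraction cutoff lands only slightly above 0.5 cycles per pixel, whereas the f/4 cutoff is more than four times the Nyquist frequency.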

Fig 1: Simulated system MTF curves, ideal lens and sensor

We can see that the f/16 curve in Fig 1 is pretty close to meeting that criterion of zero contrast above 0.5 cycles/pixel, whereas the f/4 curve represents a system that is definitely not compliant with the requirements of the sampling theorem, with abundant contrast after 0.5 cycles/pixel.

Interestingly enough, the Nyquist-Shannon sampling theorem does not tell us what will happen if the input signal does not meet the requirement. So what does actually happen? Aliasing, and potentially lots of it. The effect of aliasing can be illustrated in the frequency domain: energy at frequencies above Nyquist will be "folded back" onto the frequencies below Nyquist. But aliasing does not necessarily destroy information; it could just make it hard to tell whether the energy we see at frequency f (for f < Nyquist) should be attributed to the actual frequency f, or whether it is an alias for energy at a higher frequency of 2*Nyquist - f, for example. The conditions under which we can successfully distinguish between aliased and non-aliased frequencies are quite rare, so for most practical purposes we would rather avoid aliasing if we can.
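The fold-back is easy to demonstrate numerically. In this Python sketch a cosine at 0.7 cycles per sample (an arbitrary choice above Nyquist) produces exactly the same samples as a cosine at 2*Nyquist - 0.7 = 0.3 cycles per sample, so the samples alone cannot tell the two apart.

```python
import math

nyquist = 0.5                     # cycles per sample
f_high = 0.7                      # arbitrary frequency above Nyquist
f_alias = 2 * nyquist - f_high    # folds back to 0.3 cycles per sample

samples_high = [math.cos(2 * math.pi * f_high * n) for n in range(32)]
samples_alias = [math.cos(2 * math.pi * f_alias * n) for n in range(32)]

max_diff = max(abs(a - b) for a, b in zip(samples_high, samples_alias))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

The difference is zero up to floating-point rounding.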

A subtle but important concept is that even though our image sensor is sampling at a rate of 1 sample per pixel, and therefore we expect aliasing because of non-zero contrast at frequencies above 0.5 cycles per pixel, it does not mean that the information at higher frequencies has already been completely "lost" at the instant the image was sampled. The key to understanding this is to realize that aliasing does not take effect at the time of sampling our signal: aliasing only comes into play when we start to reconstruct the original continuous signal, or if we want to process the data in a way that implies some form of reconstruction, e.g., computing an FFT. A trivial example might illustrate this argument: Consider that we have two analogue to digital converters (ADCs) simultaneously sampling an audio signal at a sample rate of 4 kHz each. If we offset the timing of the second ADC with half the sampling period (i.e., 1/8000 s) relative to the first ADC, then we have two separate sets of samples each with a Nyquist limit of 2 kHz. If we combine the two sets of samples, we find that they interleave nicely to give us an effective sampling period of only 1/8000 s, so that the combined set of samples now has a higher Nyquist limit of 4 kHz. Note that sampling at 4 kHz did not harm the samples from the individual ADCs in any way; we can combine multiple lower-rate samples at a later stage (under the right conditions) to synthesize a higher-rate set of samples without contradicting the Nyquist-Shannon sampling theorem. The theorem does not require all the samples to be captured in a single pass by a single sampler. This dual ADC example illustrates the notion that the Nyquist limit does not affect the capture of the samples themselves, but rather comes into play with the subsequent reconstruction of a continuous signal from the samples.
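The dual ADC argument can be checked with a few lines of Python. The 3 kHz test tone is my own choice: it lies above the 2 kHz Nyquist limit of each individual ADC, but below the 4 kHz limit of the combined sample set.

```python
import math

def signal(t):
    """Hypothetical 3 kHz test tone."""
    return math.sin(2 * math.pi * 3000.0 * t)

fs = 4000.0     # sample rate of each ADC, in Hz
n = 16          # samples per ADC

adc1 = [(k / fs, signal(k / fs)) for k in range(n)]
adc2 = [(k / fs + 1 / 8000.0, signal(k / fs + 1 / 8000.0)) for k in range(n)]

combined = sorted(adc1 + adc2)          # interleave the two sets by timestamp
times = [t for t, _ in combined]
spacings = [b - a for a, b in zip(times, times[1:])]

print(f"samples: {len(combined)}, spacing: {min(spacings):.6f} s to {max(spacings):.6f} s")
```

The interleaved set is indistinguishable from what a single 8 kHz ADC would have captured.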

Normally, this subtle distinction is not useful, but if we are dealing with a strictly cyclical signal, then it allows us to combine samples from different cycles of the input signal as if they were sampled from a single cycle. An excellent discussion of this phenomenon can be found in this article, specifically the sections titled "Nyquist and Signal Content" and "Nyquist and Repetitive Signals". That article explains how we can sample the 60 Hz US power line signal with a sample rate of only 19 Hz, but only because of the cyclic nature of the input signal. This technique produces a smaller effective sample spacing, which is accomplished by re-ordering the 19 Hz samples, captured over multiple 60 Hz cycles, to reconstruct a single cycle of the 60 Hz signal. The smaller effective sample spacing obtained in this way is sufficiently small, according to the Nyquist-Shannon theorem, to adequately sample the 60 Hz signal.
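One way to see the re-ordering at work is the Python sketch below, which assumes an ideal, perfectly periodic 60 Hz sine and exact sample timing; the linked article may well present it differently. Each 19 Hz sample is tagged with its position within the 60 Hz cycle, and the samples are then sorted by that position.

```python
import math

f_signal = 60   # Hz
f_sample = 19   # Hz

samples = []
for n in range(19):                       # one second of samples, spanning 60 signal cycles
    # Position within the 60 Hz cycle, computed exactly with integer arithmetic.
    phase = (f_signal * n) % f_sample / float(f_sample)
    value = math.sin(2 * math.pi * f_signal * n / f_sample)
    samples.append((phase, value))

samples.sort()                            # re-order by position within the cycle
phases = [p for p, _ in samples]
print(phases[:4])
```

Because 60 and 19 share no common factor, the 19 samples land on 19 distinct, evenly spaced positions within a single cycle, which is equivalent to sampling one cycle at 19 x 60 = 1140 Hz.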

We can design a similar technique to artificially boost the sampling rate of a 2D image. First, imagine that we captured an image of a perfectly vertical knife-edge target, like this:

A perfectly vertical knife-edge image, magnified 4x here

We can plot the intensity profile of the first row of pixels, as a function of column number, like this:

An example of a discretely sampled ESF

This plot is a sparse discrete sample of the Edge Spread Function (ESF), with a spacing of one sample per pixel. In particular, note how sparse the samples appear around columns 20-22, right at the edge transition where the high-frequency content lives. The slanted-edge method constructs the ESF as an intermediate step towards calculating the system MTF; more details are provided in the next section below. The important concept here is that the first row of our image is a sparse sample of the ESF.

But what about the next row in this image? It is essentially another set of samples across the edge, i.e., another set of ESF samples. In fact, if we assume that the region of interest (ROI) represented by the image above is small enough, we can pretend that the system MTF is constant across the entire ROI, i.e., the second row is simply another "cycle" of our cyclical signal. Could we also employ the "spread your low-rate samples across multiple cycles" technique, described above in the context of power line signals? Well, the problem is that our samples are rigidly spaced on a fixed 2D grid, thus it is impossible to offset the samples in the second row of the image, relative to the first, which is required for that technique. What if we moved the signal in the second row, rather than moving the sampling locations?

This is exactly what happens if we tilt our knife-edge target slightly, like this:

A slanted knife-edge image, magnified 4x

Now we can see that the location of the edge transition in the second row is slightly offset from that of the first row; if we plot the sparse ESFs of the first two rows we can just see the small shift:

Sparse ESFs of rows 0 and 1 of the slanted edge image

Since only the relative movement between our sample locations and the signal matters, we can re-align the samples relative to the edge transition in each row to fix the signal phase, which effectively causes our sample locations to shift a little with each successive image row.
This approach creates the conditions under which we can use a low sampling rate to build a synthetic set of samples, constructed over multiple cycles, with a much smaller effective sample spacing. The details of how to implement this are covered in the next two sections.

To recap: the slanted-edge method takes a set of 2D image samples, and reduces it to a set of more closely spaced samples in one dimension (the ESF), effectively increasing the sampling rate. There are some limitations, though: some angles thwart our strategy, and we will end up with some unrecoverable aliasing. To explain this phenomenon, I will first have to explain some implementation details of the slanted-edge method.

The slanted-edge method

I do not want to describe all aspects of the slanted-edge method in great detail in this post, but very briefly, the major steps are:

Identify and model the step edge in the Region Of Interest (ROI);
Construct the irregularly-spaced oversampled Edge Spread Function (ESF) with the help of the edge model;
Construct a regularly-spaced ESF from the irregularly-spaced ESF;
Take the derivative of the ESF to obtain the Line Spread Function (LSF);
Compute the FFT of the LSF to obtain the Modulation Transfer Function (MTF);
Apply any corrections to the MTF (derivative correction, kernel correction if necessary).

Today I will focus on steps 2 & 3.
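Although steps 4 to 6 are not the focus here, a small sketch may help to show where the regularly-spaced ESF ends up. The example below is an illustration under stated assumptions and not MTF Mapper's code: it assumes a Gaussian blur with a made-up width `SIGMA`, starts from an analytically generated 8× oversampled ESF, uses a direct DFT sum in place of an FFT, and applies only the derivative correction.

```python
import cmath
import math

OVERSAMPLING = 8            # ESF samples per pixel
STEP = 1.0 / OVERSAMPLING   # sample spacing in pixels
HALF_WIDTH = 16             # ESF extends this many pixels either side of the edge
SIGMA = 0.5                 # assumed Gaussian blur width in pixels


def esf(d):
    """Analytical ESF of a Gaussian-blurred step edge at distance d."""
    return 0.5 * (1.0 + math.erf(d / (SIGMA * math.sqrt(2.0))))


# Regularly spaced ESF (the output of step 3).
n = 2 * HALF_WIDTH * OVERSAMPLING
esf_samples = [esf(-HALF_WIDTH + i * STEP) for i in range(n + 1)]

# Step 4: a finite-difference derivative gives the LSF.
lsf = [b - a for a, b in zip(esf_samples, esf_samples[1:])]


def mtf_at(freq):
    """Steps 5 and 6: normalised DFT magnitude of the LSF at freq (cycles/pixel)."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * i * STEP)
              for i, v in enumerate(lsf))
    raw = abs(acc) / abs(sum(lsf))
    # Derivative correction: the finite difference acts as a box filter of
    # width STEP, which attenuates the result by sin(x)/x with x = pi*freq*STEP.
    x = math.pi * freq * STEP
    return raw if x == 0.0 else raw * x / math.sin(x)


def expected_mtf(freq):
    """Analytical MTF of the assumed Gaussian blur, for comparison."""
    return math.exp(-2.0 * (math.pi * SIGMA * freq) ** 2)


for f in (0.0, 0.25, 0.5, 1.0):
    print(f, round(mtf_at(f), 6), round(expected_mtf(f), 6))
```

Because the ESF is sampled at 8 samples per pixel, the comparison remains meaningful at 1.0 cycles/pixel, well above the 0.5 cycles/pixel limit of the original image grid.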

Constructing the ESF

We construct an oversampled Edge Spread Function (ESF) by exploiting the knowledge we have of the existing structure in the image, as illustrated in Fig 2.

Fig 2: The slanted-edge method, following Kohm's description

We start with a region of interest (ROI) that is defined as a rectangular region of the image aligned with the edge orientation, as shown in Fig 2. We use the orientation of the edge-under-analysis (1) to estimate θ, the edge angle, which is used to construct a unit vector that is normal (perpendicular) to the edge (2), called n. Now we take the (x, y) coordinates of the centre of each pixel (3), and project them onto the normal vector n (4) to obtain for each pixel i its signed distance to the edge, represented by d_i. The intensity of pixel i is denoted I_i, and we form a tuple (d_i, I_i) to represent the projection of pixel i onto the normal n. If we process all the pixels i inside the ROI, we obtain the set of tuples ESF_irregular = {(d_i, I_i)}, our dense-but-irregularly spaced ESF (5).

We can see quite clearly in Fig 2 that even though the original 2D pixel centres have a strict spacing of one sample per pixel-pitch, the projection onto the normal vector n produces a set of 1D samples with a much finer spacing. Consider a concrete example with θ = 7.125°, producing a line slope of 1/8 with respect to the vertical axis, as illustrated in Fig 2; this means the vector n can be written as (-1, 8)/√((-1)² + 8²), or n = (-1, 8)/√65. If we take an arbitrary pixel p = (x, y), we can project it onto the normal n by first subtracting the coordinates of a point c that falls exactly on the edge, using the well-known formula involving the vector dot product to yield the scalar signed-distance-from-edge d such that

d = (p - c) · n.

If we substitute our concrete value of n and expand this formula, we end up with the expression d = (8y - x)/√65 + C, where C = -(c · n) is simply a scalar offset that depends on exactly where we chose to put c on the edge. If we let k = 8y - x, where both x and y are integers (we can choose our coordinate frame to accomplish this), then d = k/√65 + C, or after rearranging, d - C = k/√65.
This implies that d - C is an integer multiple of 1/√65 ≈ 0.12403, meaning that the gaps between consecutive d_i values in ESF_irregular must be approximately 0.12403 times the pixel pitch. We have thus transformed our 2D image samples with a spacing of 1 pixel pitch into 1D samples with a spacing of 0.12403 times the pixel pitch, to achieve roughly 8× oversampling.
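The spacing claim is easy to verify numerically. The sketch below projects a 16×16 block of integer pixel centres onto the normal from the worked example and measures the gaps between the distinct projected distances. The block size is arbitrary, and the on-edge point c is placed at the origin for convenience.

```python
import math

# Unit normal from the worked example: n = (-1, 8)/sqrt(65).
norm = math.sqrt((-1) ** 2 + 8 ** 2)
n = (-1 / norm, 8 / norm)

# Signed distances d = (p - c) . n for a 16x16 block of pixel centres,
# with the on-edge point c placed at the origin for simplicity.
distances = sorted(x * n[0] + y * n[1] for x in range(16) for y in range(16))

# Collapse floating-point duplicates, then measure gaps between neighbours.
unique = [distances[0]]
for d in distances[1:]:
    if d - unique[-1] > 1e-9:
        unique.append(d)
gaps = [b - a for a, b in zip(unique, unique[1:])]

print(len(unique), min(gaps), max(gaps))  # every gap is close to 1/sqrt(65)
```

All the gaps come out at 1/√65 of a pixel pitch, even though the 2D samples are a full pixel pitch apart.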

The spacing of our oversampled ESF can indeed support a higher sampling rate, e.g. 8× as shown in the previous paragraph. This means that the Nyquist limit of our 8× oversampled ESF is now 4 cycles/pixel, compared to the original 0.5 cycles/pixel of the sampled 2D image, effectively allowing us to reliably measure the system MTF at frequencies between 0.5 cycles/pixel and 1.0 cycles/pixel (and higher!), provided of course that our original system MTF had zero energy at frequencies above 4 cycles/pixel. For typical optical systems, this 4 cycles/pixel limit is much more realistic compared to assuming zero contrast above the sensor Nyquist limit of 0.5 cycles/pixel.

The argument presented here to show how the slanted-edge projection yields a more densely-spaced ESF holds for most, but not all, angles θ in the range [0°, 45°]. We typically take the irregularly spaced ESF_irregular and construct from it a regularly spaced ESF at a convenient oversampling factor such as 8× so that we can apply the FFT algorithm at a later stage (recall that the FFT expects a regular sample spacing). The simplest approach to producing such a regular spacing is to simply bin the values in ESF_irregular into bins with a width of 0.125 times the pixel pitch; we can then analyse this binning to determine which specific angles cause problems.

Revisiting the critical angles

Above I have shown that you can attain 8× oversampling at an edge orientation angle of θ = 7.125°. But what about other angles? In a previous blog post I attempted to enumerate the problematic angles, but I could not come up with a foolproof way to list them all. Instead, we can follow the empirical approach taken by Masaoka, and simply step through the range 0° to 45° in small increments. At each angle we can attempt to bin the irregular ESF into an 8× oversampled regular ESF. For some angles, such as 45°, we already know that consecutive d_i will be exactly 1/√2 ≈ 0.70711 times the pixel pitch apart, leaving most of the 8× oversampling bins empty. The impact of this is that the effective oversampling factor will be less than 8.

How can we estimate the effective oversampling factor at a given edge orientation angle? To keep things simple, we will just look at d_i values in the range [0, 1) since we expect this to be representative of all intervals of length 1 (near the edge; at the extremes of the ROI the samples will be more sparse). We divide this interval into 8 equal bins, each 0.125 pixels wide, and we take all the samples of ESF_irregular such that 0 ≤ d_i < 1, and bin them while keeping track of which of the 8 bins received samples, which I'll call the non-empty bins. The position of the edge relative to the origin of our pixel grid (what Masaoka called the "edge phase") will influence which of the 8 bins receive samples when we are simulating certain critical angles. For example, at an edge angle of 45° and an edge phase of zero we could expect two non-empty bins, since the consecutive d_i values might be (0, 0.70711, 1.4142, ...). If we had an edge phase of 0.5, then we would have only one non-empty bin because the consecutive d_i values might be (0.5, 1.2071, 1.9142, ...). A reasonable solution is to sweep the edge phase through the range [0, 1) in small increments, say 1000 steps, while building a histogram of the number of non-empty bins. We can then calculate the mean effective oversampling factor (= mean number of non-empty bins) directly from this histogram, which is shown in Fig 3:

Fig 3: Mean effective oversampling factor as a function of edge orientation

This simulation was performed with a simulated edge length of 30 pixels, so Fig 3 is somewhat of a worst-case scenario. We can readily identify the critical angles I discussed in the previous post on this topic: 45, 26.565, 18.435, 14.036, 11.310, 9.462, and 8.130. We can also see a whole bunch I previously missed, including 30.964, 33.690, 36.870, and 38.660. In addition to helping us spot critical angles, Fig 3 also allows us to estimate their severity: near 45° we can only expect 1× oversampling, near 26.565° we can only expect 2× oversampling, and so on.
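The phase-sweep simulation described above can be sketched in a few lines of Python. This is my own minimal reconstruction of the idea rather than the code that produced Fig 3: the ROI half-width, the number of phase steps, the choice of normal direction and the function name `mean_effective_oversampling` are all assumptions made for the illustration.

```python
import math


def mean_effective_oversampling(theta_deg, edge_len=30, half_width=8,
                                bins=8, phase_steps=200):
    """Mean number of non-empty ESF bins in [0, 1) over a sweep of edge phases."""
    theta = math.radians(theta_deg)
    nx, ny = math.cos(theta), -math.sin(theta)  # unit normal to the edge
    # Projection of every pixel centre in the ROI onto the normal (phase 0).
    base = [x * nx + y * ny
            for y in range(edge_len)
            for x in range(-half_width, half_width + 1)]
    total = 0
    for step in range(phase_steps):
        phase = (step + 0.5) / phase_steps  # mid-points avoid bin boundaries
        occupied = set()
        for b in base:
            d = b + phase
            if 0.0 <= d < 1.0:
                occupied.add(int(d * bins))
        total += len(occupied)
    return total / phase_steps


for angle in (5.0, 26.565, 45.0):
    print(angle, round(mean_effective_oversampling(angle), 2))
```

Sweeping `theta_deg` from 0 to 45 in small increments and plotting the result should produce a curve with dips at the critical angles, although the exact values will depend on the assumed ROI geometry.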

What can we do with this information? Well, if we know our system MTF is zero above 0.5 cycles/pixel, then we can actually use the slanted-edge method safely on a 45° slanted edge, as I will show below. Similarly, using an edge at an angle near 26.565° is only a problem if our system MTF is non-zero above 1 cycle/pixel. Alternatively, we could decide that we require a minimum oversampling factor of 4×, thus allowing us to measure system MTFs with cut-off frequencies up to 2 cycles/pixel, and use Fig 3 to screen out edge angles that could lead to aliasing, such as 0°, 18.435°, 33.690°, 26.565° and 45°.

What aliasing looks like when using the slanted-edge method

This post is already a bit longer than I intended, but at the very least I must give you some visual examples of what aliasing looks like. Of course, to even be able to process edges at some of the critical angles requires a slanted-edge implementation that deals with the issue of empty bins, or perhaps an implementation that adapts the sampling rate to the edge orientation. MTF Mapper, and particularly the newer "loess" mode introduced in 0.7.16, does not bat an eye when you feed it a 45° edge.

A 45° edge angle will give us d_i values with a relative spacing in multiples of √0.5, so our effective sample spacing is approximately 0.70711 times the pixel pitch, giving us a sampling frequency of 1/√0.5 ≈ 1.4142 samples per pixel, with a resulting Nyquist frequency of 0.70711 cycles/pixel. But first I will show you the SFR of three simulated systems at f/4, f/11 and f/16 under ideal conditions (perfect focus, diffraction only, no image noise) at a simulated edge angle of 5° so that you can see approximately what we expect the output to look like:

Fig 4: Reference SFR curves at 5°. Blue=f/4, Green=f/11, Orange=f/16

Fig 4 is the baseline, or what we would expect if our sampling rate was high enough. Note that the f/4 curve in particular has a lot of non-zero contrast above 0.70711 cycles/pixel, so we should be prepared for some aliasing at critical angles. If we process the 45° case with the default MTF Mapper algorithm (or "kernel" in version 0.7.16 and later) then we obtain Fig 5. This is rather more exciting than we hoped for.

Fig 5: MTF Mapper "kernel" SFR curves at 45°. Blue=f/4, Green=f/11, Orange=f/16

In particular, notice the sharp "bounce" in the blue f/4 curve at about 0.71 cycles/pixel, our expected Nyquist frequency; this is very typical aliasing behaviour. Also notice the roughly symmetrical behaviour around Nyquist. Normally we do not expect to see a sharp bounce at 0.71 cycles/pixel on a camera without an OLPF (my simulation is without one); however, we do often see a bounce at 0.68 cycles/pixel for the apparently common OLPF "strength", which makes it a little bit tricky to tell the two apart. The best way to eliminate the OLPF as the cause of the bounce is to check another slanted-edge measurement at 5°. Can we do something about those impressively high contrast values (> 1.2) near 1.41 cycles/pixel? Well, MTF Mapper does have a second ESF construction algorithm (announced here): the "loess" option, which gives us the SFR curves seen in Fig 6.

Fig 6: MTF Mapper "loess" SFR curves at 45°. Blue=f/4, Green=f/11, Orange=f/16

We can see that the output of the "loess" algorithm does not have a huge bump around 1.41 cycles/pixel, which is much closer to the expected output shown in Fig 4. It does, however, make it much harder to identify the effects of aliasing. It is tempting to try and directly compare the 45° case to the 5° case (Fig 4), but we have to acknowledge that the system MTF is anisotropic (see one of my earlier posts on anisotropy, or Jack Hogan's detailed description of our sensor model, including the anisotropic square pixel apertures), meaning that the analytical system MTF at a 45° angle is not identical to that at a 5° angle owing to the square photosite aperture. I will rather use a nearby angle of 40°, thus comparing the 45° results to the 40° results at each aperture in Figures 7 through 9.

Fig 7: f/4: Comparing "loess" algorithm at 45° (green), and "kernel" algorithm at 45° (orange) to reference 40° (blue)

Fig 8: f/11: Comparing "loess" algorithm at 45° (green), and "kernel" algorithm at 45° (orange) to reference 40° (blue)

Fig 9: f/16: Comparing "loess" algorithm at 45° (green), and "kernel" algorithm at 45° (orange) to reference 40° (blue)

In all cases, the "loess" algorithm produced more accurate results, although we can see that neither the "loess" nor the "kernel" algorithm performed well on the f/4 case. This is to be expected: the effective sampling rate at a 45° edge angle pegs the Nyquist frequency at 0.70711 cycles/pixel, and we know that our input signal has significant contrast above that frequency (as shown again in the blue curve in Fig 7). On the other hand, both algorithms performed acceptably on the f/16 simulation (if we ignore the "kernel" results above 1.0 cycles/pixel), which supports the claim that if our system MTF has zero contrast above the effective Nyquist frequency (0.70711 cycles/pixel in this case), then a 45° edge angle should not present us with any difficulties in applying the slanted-edge method.

Does this mean that we should allow 45° edge angles? Well, they will not break MTF Mapper, and they can produce accurate enough results if our system is bandlimited (zero contrast above 0.71 c/p), but I would still avoid them; it is just not worth the risk of aliasing. As such, the MTF Mapper test charts will steer clear of 45° edge angles.

What about our second-worst critical angle at 26.565°? A quick comparison at f/4 of the "loess" and "kernel" algorithms at 26.565° against a reference edge angle of 22° is shown in Fig 10.

Fig 10: f/4: Comparing "loess" algorithm at 26.565° (green), and "kernel" algorithm at 26.565° (orange) to reference 22° (blue)

We can see in Fig 10 that the "loess" algorithm at 26.565° (green) is pretty close to the reference at 22° (blue), but the "kernel" algorithm does its usual thing in the face of aliasing, and adds another bump at twice the Nyquist frequency. So far so good at 26.565° angles, right?

Unfortunately the story does not end here. Recall that MTF Mapper has the ability to use a subset of the Bayer channels to perform a per-channel MTF analysis. We can pretend that the grayscale image we have been using so far is a Bayer mosaic image, and process only the green and red subsets, as shown in Fig 11.

Fig 11: f/4: Comparing "loess" algorithm at 26.565° using only the Green Bayer channel (green), and the "loess" algorithm at 26.565° using only the Red Bayer channel (orange) to the "loess" algorithm on the grayscale image at 26.565° as reference (blue)

Whoops! Even though the "loess" algorithm performed adequately at 26.565° using a grayscale image, and almost the same when using only the Green Bayer channel, it completely falls apart when we apply it to only the Red Bayer channel. This is not entirely unexpected, since we are only using 1/4 of the pixels to represent the Red Bayer channel, and these samples are effectively spaced out at 2 times the photosite pitch in the original image. The resulting projection onto the edge normal still decreases the sample spacing like it normally does, but our effective oversampling factor is now only 0.5 * √5 ≈ 1.118, which drops the Nyquist frequency down to 0.559 cycles/pixel, as can be seen from the bounce in the orange curve in Fig 11. You can see that the "loess" algorithm is struggling to suppress the aliased peak at 1.118 cycles/pixel.
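The numbers quoted for the critical angles follow from the same projection argument used earlier. The sketch below is an idealised calculation only: it assumes an edge whose slope is exactly the rational number p/q, and it models the Red (or Blue) Bayer subset as a square grid with a stride of 2 photosites. The function name `effective_nyquist` and its parameters are made up for the illustration, and the Green channel, which forms a different lattice, is not covered.

```python
import math


def effective_nyquist(p, q, stride=1):
    """Effective Nyquist frequency (cycles/pixel) for an edge with tan(theta) = p/q.

    stride is the spacing of the contributing samples in photosite pitches:
    1 for a grayscale image, 2 for the Red (or Blue) subset of a Bayer mosaic.
    """
    g = math.gcd(p, q)
    p, q = p // g, q // g
    spacing = stride / math.hypot(p, q)   # gap between projected samples
    return 0.5 / spacing                  # Nyquist is half the sampling rate


for label, p, q in (("45 deg", 1, 1), ("26.565 deg", 1, 2)):
    print(label,
          round(effective_nyquist(p, q), 3),
          round(effective_nyquist(p, q, stride=2), 3))
```

For the 26.565° edge this gives 1.118 cycles/pixel on the grayscale image and 0.559 cycles/pixel on the Red subset, matching the figures quoted above.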

If you really want to see the wheels come off, you can process a 45° angled edge using only the Red Bayer subset, as shown in Fig 12. As a general rule, the SFR/MTF curve should decrease with increasing frequency. There are exceptions to this rule: 1) we know that if the impact of diffraction is relatively low, such as with an f/4 aperture on a 5 micron photosite pitch sensor, then we expect the curve to bounce back after 1 cycle/pixel owing to the photosite aperture MTF, and 2) when we are dealing with a severely defocused MTF we can expect multiple bounces. Incidentally, these bounces are actually phase inversions, but that is a topic for another day. Even in these exceptional cases, we expect a generally decreasing envelope to the curve, so if you see something like Fig 12, you should know that something is amiss.

Fig 12: MTF Mapper "loess" algorithm applied to a simulated f/4 aperture image, with an edge orientation of 45° while limiting the analysis to only the Red Bayer channel. In case you are just reading this caption and skipping over the text, when you see an SFR curve like this, you should be worried.

As long-term MTF Mapper user Jack Hogan pointed out, the SFR curves above do look rather scary, and some reassurance might be necessary here. It is important to realize that only a handful of edge orientation angles will produce such scary SFR curves; if you stick to angles near 5° you should always be safe. Sticking to small angles (around 4° to 6°) also avoids issues with the anisotropy induced by the square photosite aperture, but if we stick to those angles then we cannot orient our slanted-edge to align with the sagittal/meridional directions of the lens near the corners of our sensor. As long as you are aware of the impact of the anisotropy, you can use almost any edge orientation angle you want, but navigate using Fig 3 above to avoid the worst of the critical angles.

Wrapping up

We have looked closely at how the ESF is constructed in the slanted-edge method, and how this effectively reduces the sample spacing to allow us to reliably measure the SFR at frequencies well beyond the Nyquist rate of 0.5 cycles/pixel implied by the photosite pitch. Unfortunately there are a handful of critical angles that have poor geometry, leading to a much lower than expected effective oversampling factor. At these critical angles, the slanted-edge method will still alias badly if the system MTF has non-zero contrast above the Nyquist frequency specific to that critical angle.

If you are absolutely certain that your system MTF has a sufficiently low cut-off frequency, such as when using large f-numbers or very small pixels, then you can safely use any edge orientation above about 1.7°. I would strongly suggest, though, that you rather stick to the range (1.73, 43.84) degrees, excluding the range (26.362, 26.773) degrees to avoid the second-worst critical angle. It would also be prudent to pad out these bounds a little to allow for a slight misalignment of your test chart / object.

I guess it is also important to point out that for general purpose imaging, your effective sample spacing will remain at 1 sample per pixel, and aliasing will set in whenever your system MTF has non-zero contrast above 0.5 cycles/pixel. The slanted-edge method can be used to measure your system MTF to check if aliasing is likely to be significant, but it will not help you to avoid or reduce aliasing in photos of subjects other than slanted-edge targets.

Lastly, this article demonstrated the real advantage of using the newer "loess" algorithm that has been added to MTF Mapper. The new algorithm is much better at handling aliased or near-aliased scenarios, without giving up any accuracy on the general non-aliased scenarios.

Acknowledgements

I would like to thank DP Review's forum members, specifically Jack Hogan and JACS, for providing valuable feedback on earlier drafts of this post.

Journal paper on MTF Mapper's robust edge-spread function construction algorithms now available

2019-07-03T20:04:00.000+02:00

A new paper on MTF Mapper's robust edge-spread function (ESF) construction algorithms has been published in the Optical Society of America's JOSA A journal. The paper provides an analysis of the impact of the slanted-edge orientation angle on the uniformity (or lack thereof) of the distribution of the samples used to construct the ESF; this is essentially the evolution of the notion of critical angles first described in this blog post. Next, the paper describes two different methods that can be used to construct an ESF to minimize the impact of the non-uniformity of the samples, with some results to demonstrate the efficacy of the proposed methods.

The full citation for this paper is:
F. van den Bergh, "Robust edge-spread function construction methods to counter poor sample spacing uniformity in the slanted-edge method," Journal of the Optical Society of America A, Vol. 36, Issue 7, pp. 1126-1136, 2019.

You can see the abstract of the paywalled article here. Or you can go and take a look on SourceForge, where you can find pre-press versions of my MTF Mapper related papers.

I am particularly fond of the LOESS-based algorithm, which is available in MTF Mapper version 0.7.16 and later (on SourceForge now). The LOESS-based algorithm performs better than the current implementation at higher frequencies, with less bias in the MTF curve above 0.5 cycles per pixel. This does not result in a huge improvement in the accuracy of MTF50 values; for practical use the main advantage of the LOESS-based algorithm is that it is able to produce more consistent results regardless of the slanted edge orientation angle. As an example of the improvement you can expect with the LOESS-based algorithm, the following figure illustrates the difference between the legacy ESF model (now called "kernel" in the preferences), and the LOESS ESF model (called "loess" in the preferences):

The error between the expected analytical SFR and the legacy ESF model (black curve), compared to the error obtained with the LOESS model (red curve), as measured on a simulated slanted edge. The legacy ESF model tends to overestimate contrast at high frequencies; the LOESS ESF model no longer does this. Note the scale of the y-axis!

Initially, I plan on making the use of the new algorithm optional (0.7.16 still defaults to "kernel"), but I hope to make the LOESS-based algorithm the default in versions 0.8.0 and later, after it has endured some more real-world testing. Any feedback will be appreciated!

Simulating defocus and spherical aberration in mtf_generate_rectangle

2018-08-23T15:24:00.000+02:00

It has been many years since I last tinkered with the core functionality of mtf_generate_rectangle. For the uninitiated, mtf_generate_rectangle is a tool in the MTF Mapper package that is used to generate synthetic images with a known Point Spread Function (PSF). The tool can generate black rectangular targets on a white background for use with MTF Mapper's main functionality, but mtf_generate_rectangle can do quite a bit more, including rendering arbitrary multi-polygon scenes (I'll throw in some examples below).

A synthetic image is the result of convolving the 2D target scene with the 2D PSF of the system. In practice, this convolution is only evaluated at the centre of each pixel in the output image; see this previous post for a more detailed explanation of the algorithm. That post covers the method used to simulate an aberration-free ideal lens, often called a diffraction-limited lens, combined with the photosite aperture of the sensor. The Airy pattern which models the effects of diffraction is readily modelled using a jinc(x) function, but if we want to add in the effects of defocus and spherical aberration things become a little bit harder.

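I can recommend Jack Hogan's excellent article on the topic, which describes the analytical form of a PSF that combines diffraction, defocus and spherical aberration. I repeat a version of this equation here with the normalization constant omitted:

$$\mathrm{PSF}(r) = \left| \int_0^1 \gamma(\rho)\, J_0\!\left(\frac{\pi r}{\lambda N}\,\rho\right) \rho \, d\rho \right|^2$$

where

$$\gamma(\rho) = \exp\!\left( \frac{2\pi i}{\lambda} \left( W_{020}\,\rho^2 + W_{040}\,\rho^4 \right) \right)$$

specifies the defocus (W020 coefficient) and spherical aberration (W040 coefficient), both expressed as a multiple of the wavelength λ.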

The PSF(r) equation contains two parameters that relate to radial distances: r, which denotes the normalized distance from the centre of the PSF, and ρ, which denotes a normalized radial distance across the exit pupil of the lens. Unfortunately we have to integrate out the ρ parameter to obtain our estimate of the PSF at each radial distance r at which we wish to sample the PSF. Figure 1 below illustrates our sampled PSF as a function of r (the PSF is shown mirrored around r=0 for aesthetic reasons). Keep in mind that for a radially symmetric PSF, r is in the range [0, ∞). For practical purposes we have to truncate r at some point, but this point is somewhere around 120 λN units (λ is the wavelength, N is the f-number of the lens aperture).

Figure 1: Sampled PSF of the combined effect of circular aperture diffraction and spherical aberration.

Integrating oscillating functions is hard

Numerical integration is hard, like eating a bag of pine cones, but that is peanuts compared to numerical integration of oscillating functions. Where does all this oscillation come from? Well, that Bessel function of the first kind (of order 0), J₀, looks like this when r=20 (and λN = 1):

Figure 2: Bessel J₀ when r=20

Notice how the function crosses the y=0 axis exactly 20 times. As one would expect, when r=120, we have 120 zero crossings. Why is it hard to numerically integrate a function with zero crossings? Well, consider the high-school equivalent of numerical integration using the rectangle rule: we evaluate the function f at some arbitrary x_i values (orange dots in Figure 3); we form rectangles with a "height" of f(x_i), and a width of (x_i+1 - x_i), and we add up the values f(x_i) * (x_i+1 - x_i) to approximate the area under the curve.

Figure 3: Notice how the area of the negative rectangles does not appear to cancel the area of the positive rectangles very well.

Ok, so I purposely chose some poorly and irregularly spaced x_i values. The integral approximation will be a lot more accurate if we use more (and better) x_i values, but you can see what the problem is: there is no way that the positive and negative rectangles will add up to anything near the correct area under the curve (which should be close to zero). We can use more sophisticated numerical integration methods, but it will still end in tears.

Fortunately there is a very simple solution to this problem: we just split our function into intervals between the zero crossings of f(), integrate each interval separately, and add up the partial integrals. This strategy is excellent if we know where the zero crossings are.

And our luck appears to be holding, because the problem of finding the zero crossings of the Bessel J₀ function has already been solved [1]. The algorithm is fairly simple: we know the location of the first two roots of J₀, they are at x₁=2.404826 and x₂=5.520078, and subsequent roots can be estimated as x_i = x_i-1 + (x_i-1 - x_i-2), for i >= 3. We can refine those roots using a Newton-Raphson iteration, where x_i,j = x_i,j-1 + J₀(x_i,j-1)/J₁(x_i,j-1), and J₁ is the Bessel function of the first kind of order 1. We also know that if our maximum r is 120, then we expect 120 roots, and we just have to apply our numerical integration routine to each interval defined by [x_i-1, x_i] (plus the endpoints 0 and 120, of course).
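The root-finding recipe above can be sketched as follows. To keep the example free of dependencies, the Bessel functions are evaluated with a plain trapezoidal rule applied to Bessel's integral; this is a stand-in chosen for the illustration, and `bessel_j` and `j0_roots` are hypothetical helper names rather than anything taken from MTF Mapper.

```python
import math


def bessel_j(order, x, steps=512):
    """J_order(x) from Bessel's integral, evaluated with the trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # the two endpoint terms
    for k in range(1, steps):
        tau = k * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total / steps


def j0_roots(count, iterations=5):
    """First `count` positive roots of J0, refined by Newton-Raphson."""
    roots = [2.404826, 5.520078]
    for i in range(count):
        if i >= 2:
            # Initial estimate: previous root plus the previous gap.
            roots.append(2.0 * roots[i - 1] - roots[i - 2])
        x = roots[i]
        for _ in range(iterations):
            x += bessel_j(0, x) / bessel_j(1, x)  # uses J0'(x) = -J1(x)
        roots[i] = x
    return roots[:count]


roots = j0_roots(20)
print(roots[:3], roots[-1])
```

The default of 512 trapezoid steps is adequate for the first 20 roots; locating roots much further out would need more steps, or a proper Bessel implementation from a numerical library.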

But we are not quite there yet. Note that our aberration function γ(ρ) is a complex exponential function. If either the W020 or the W040 coefficient is non-zero, then our J₀ function is multiplied by another oscillatory function, but fortunately the oscillations are a function of ρ, and not r, so this does not introduce enough oscillation to cause major problems for realistic values, since W020 < 20 and W040 < 20. This is not the only challenge that γ(ρ) introduces, though. Although our final PSF is a real-valued function, the entire integrand is complex, and care must be taken to only apply the modulus operator to the final result. This implies that we must keep all the partial integrals of our [x_i-1, x_i] intervals as complex numbers, but it also requires the actual numerical integration routine to operate on complex numbers. Which brings us to the selection of a suitable numerical integration routine ...

Candidate numerical integration routines

The simplest usable integration routine that I know of is the adaptive variant of Simpson's rule, but this only works well if your integrand is sufficiently smooth, and your subdivision threshold is chosen carefully. The adaptive Simpson algorithm is exceedingly simple to implement, so that was the first thing I tried. Why did I want to use my own implementation? Why not use something like Boost or GSL to perform the integration?

Well, the primary reason is that I dislike adding dependencies to MTF Mapper unless it is absolutely necessary. There is nothing worse, in my experience, than trying to build some piece of open-source code from source, only to spend hours trying to get just the right version of all the dependencies. Pulling in either Boost or GSL as a dependency just because I want to use a single integration routine is just not acceptable. Anyway, why would I pass up the opportunity to learn more about numerical integration? (Ok, so I admit, this is the real reason. I learn most when I implement things myself.)

So I gave the adaptive Simpson algorithm a try, and it seemed to work well enough when I kept r < 20, thus avoiding the more oscillatory parts of the integrand. It was pretty slow, just like the literature predicted [2]. I decided to look for a better algorithm, but one that is still relatively simple to implement. This led me to TOMS468, and a routine called QSUBA. This FORTRAN77 implementation employs an adaptive version of Gauss-Kronrod integration. Very briefly, one of the main differences between Simpson's rule and Gaussian quadrature (quadrature = archaic name for integration) is that the former approximates the integrand with a quadratic polynomial with regularly-spaced samples, whereas the latter can approximate the integrand as the product of a weighting function and a higher-order polynomial (with custom spacing). The Kronrod extension is a clever method of choosing our sample spacing that allows us to re-use previous values while performing adaptive integration.
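For reference, the adaptive Simpson approach mentioned above can be sketched in a few lines. This is a textbook-style version written for illustration, not the routine used in mtf_generate_rectangle and not QSUBA; the tolerance handling and the names `adaptive_simpson` and `_refine` are my own choices. It accepts complex-valued integrands, which is the property the PSF integral requires.

```python
import cmath
import math


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=40):
    """Integrate f over [a, b]; f may return complex values."""
    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    whole = (b - a) / 6.0 * (fa + 4.0 * fm + fb)
    return _refine(f, a, b, fa, fm, fb, whole, tol, max_depth)


def _refine(f, a, b, fa, fm, fb, whole, tol, depth):
    m = 0.5 * (a + b)
    flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
    left = (m - a) / 6.0 * (fa + 4.0 * flm + fm)
    right = (b - m) / 6.0 * (fm + 4.0 * frm + fb)
    delta = left + right - whole
    # abs() is the modulus for complex values, so one test covers both parts.
    if depth <= 0 or abs(delta) <= 15.0 * tol:
        return left + right + delta / 15.0
    return (_refine(f, a, m, fa, flm, fm, left, 0.5 * tol, depth - 1) +
            _refine(f, m, b, fm, frm, fb, right, 0.5 * tol, depth - 1))


# Example: the integral of exp(i*x) over [0, pi] is exactly 2i.
result = adaptive_simpson(lambda x: cmath.exp(1j * x), 0.0, math.pi)
print(result)
```

Note that the partial results stay complex throughout; the modulus would only be applied after all the partial integrals have been summed.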

Much to my surprise, I could translate the TOMS468 FORTRAN77 code to C code using f2c, and it worked out of the box. It took quite a bit longer to port that initial C code to something that resembles good C++ code; all the spaghetti GOTO statements in the FORTRAN77 code were faithfully preserved in the f2c output. I also had to extend the algorithm a little to support complex integrands.

Putting together the QSUBA routine and the root intervals of J₀ described in the previous section seemed to do the trick. If I used only QSUBA without the root intervals, the integration was much slower, and led to errors at large values of r, as shown in Figure 4.

Figure 4: QSUBA integration without roots (top), and with roots (bottom). Note the logarithmic y-axis scale

Those spikes in the PSF certainly look nasty. Figure 5 illustrates how they completely ruin the MTF.

Figure 5: MTF derived from the two PSFs shown in Figure 4.

So how much faster is QSUBA compared to my original adaptive Simpson routine? Well, I measured about a 20-fold increase in speed.

Rendering the image

After having obtained a decent-looking PSF as explained above, the next step is to construct a cumulative distribution function (CDF) using the PSF, but this time we take into account that the real PSF is 2D. We can still do this with a 1D CDF, but at radial distance r we have to correct the probability proportionally to the area of a disc of radius r. The method is described in detail in the section titled "Importance sampling and the Airy disc" in this post. The next step is to generate 2D sampling locations drawn from the CDF, which effectively places sampling locations with a density proportional to the intensity of the 2D PSF.
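The sampling step can be sketched in a few lines of Python. This is only an illustration of the idea, not the mtf_generate_rectangle implementation: the radial profile is a Gaussian stand-in rather than the aberrated PSF, and the function names, table size and r_max are assumptions. The area correction is applied here by weighting the profile with r (the circumference of the ring at radius r) before accumulating the CDF.

```python
import bisect
import math
import random

def radial_psf(r):
    # Stand-in radial profile (a Gaussian), NOT the aberrated PSF from the post.
    return math.exp(-0.5 * r * r)

def build_radial_cdf(psf, r_max, n=2000):
    """Tabulate the CDF of the radius, weighting psf(r) by r (the 2D area element)."""
    radii = [r_max * i / n for i in range(n + 1)]
    weights = [psf(r) * r for r in radii]
    cdf = [0.0]
    for i in range(1, n + 1):
        # Trapezoidal accumulation of psf(r) * r dr.
        step = radii[i] - radii[i - 1]
        cdf.append(cdf[-1] + 0.5 * (weights[i - 1] + weights[i]) * step)
    total = cdf[-1]
    return radii, [c / total for c in cdf]

def sample_2d(radii, cdf, rng):
    """Draw one 2D sampling location by inverting the tabulated radial CDF."""
    u = rng.random()
    i = min(max(bisect.bisect_left(cdf, u), 1), len(cdf) - 1)
    # Linear interpolation inside the bracketing CDF interval.
    span = cdf[i] - cdf[i - 1]
    t = (u - cdf[i - 1]) / span if span > 0 else 0.0
    r = radii[i - 1] + t * (radii[i] - radii[i - 1])
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(1)
radii, cdf = build_radial_cdf(radial_psf, r_max=6.0)
samples = [sample_2d(radii, cdf, rng) for _ in range(20000)]
# For a unit-variance 2D Gaussian the mean squared radius should be close to 2.
mean_r2 = sum(x * x + y * y for x, y in samples) / len(samples)
print(mean_r2)
```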

If we are simulating just the lens part, we just calculate what fraction of our sampling locations are inside the target polygon geometry (e.g., our black rectangles), and shade our output pixel accordingly. To add in the effect of the sensor's photosite aperture, which is essentially an additional convolution of our PSF with a square the size of our photosites if we assume a 100% fill factor, we just replace the point-in-target-polygon test with an intersect-photosite-aperture-polygon-with-target-polygon step. This trick means that we do not have to resort to any approximations (e.g., discrete convolutions) to simulate the sensor side of our system PSF. Now for a few examples, as promised.
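The lens-only case (the point-in-target-polygon test) can be sketched as follows. This is a hedged illustration rather than the actual renderer: the sample offsets are drawn from a Gaussian stand-in instead of the real PSF, the target is a hypothetical 10 by 10 square, and the photosite-aperture polygon intersection is not shown.

```python
import random

def point_in_polygon(x, y, poly):
    """Even-odd ray-casting test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        # Does this polygon edge straddle the horizontal line through y?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def target_coverage(px, py, offsets, poly):
    """Fraction of the sample points centred on (px, py) that land inside the target."""
    hits = sum(point_in_polygon(px + dx, py + dy, poly) for dx, dy in offsets)
    return hits / len(offsets)

rng = random.Random(7)
# Stand-in sampling offsets; in the real renderer these follow the simulated PSF.
offsets = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(4000)]
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

deep_inside = target_coverage(5.0, 5.0, offsets, square)
on_edge = target_coverage(0.0, 5.0, offsets, square)
far_outside = target_coverage(-5.0, 5.0, offsets, square)
print(deep_inside, on_edge, far_outside)
```

A pixel centred on the edge of the target ends up roughly half covered, which is exactly the gradual edge transition that the slanted-edge method later measures.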

Figure 6: MTF curves of simulated aberrations. The orange curve is f/5.6 with W040 = 0.75 (i.e., significant spherical aberration). The blue curve is f/5.6 with W040 = 2 (i.e., extreme spherical aberration). The green curve is f/5.6 with no spherical aberration, but W020 = 1 (i.e., a lot of defocus).

Before I show you the rendered images, take a look at Figure 6 to see what the MTF curves of the simulated images look like. First up, the orange curve with W040 = 0.75. This curve sags a little between 0.1 and 0.4 cycles/pixel, compared to the same lens without the simulated spherical aberration, but otherwise it still looks relatively normal. Figure 7 illustrates what a simulated image with such an MTF curve looks like.

Figure 7: Simulated lens at f/5.6 with W040 = 0.75, corresponding to the orange curve in Figure 6. Looks OK, if a little soft. (remember to click on the image for a 100% view)

The blue curve (in Figure 6) represents severe spherical aberration with W040 = 2, also rendered at f/5.6. Notice how the shape of the blue curve looks very different from what we typically see on (most?) real lenses, where the designers presumably do their best to keep the spherical aberrations from reaching this magnitude. The other interesting thing about the blue curve is that contrast drops rapidly in the range 0 to 0.1 cycles/pixel, but despite the sudden drop we then have a more gradual decrease in contrast. This simulated lens gives us Figure 8.

Figure 8: Simulated lens at f/5.6 with W040 = 2, corresponding to the blue curve in Figure 6. This image has the typical glow that I associate with spherical aberration.

Figure 9 illustrates a simulated scene corresponding to the green curve in Figure 6, representing significant defocus with W020 = 1, but no spherical aberration with W040 = 0. It also exhibits a new MTF curve shape that requires some explanation. It only appears as if the curve "bounces" at about 0.3 cycles per pixel; what is happening is that the OTF undergoes phase inversion between, say, 0.3 and 0.6 cycles per pixel, but because the MTF is the modulus of the OTF we see this rectified version of the OTF (see Jack Hogan's article on defocus for an illustration of the OTF under defocus).

Figure 9: Simulated lens at f/5.6 with W020 = 1.0, corresponding to the green MTF curve in Figure 6. If you look very closely at 100% magnification (click on the image), you might just see some detail between the "2" and the "1" marks on the trumpet. This corresponds to the frequencies around 0.4 c/p, i.e., just after the bounce.

If we were to increase W020 to 2.0, we would see even more "bounces" as the OTF oscillates around zero contrast. Figure 10 shows what our simulated test chart would look like in this scenario. If you look closely near the "6" mark, you can see that the phase inversion manifests as an apparent reversal of the black and white stripes. Keep in mind that this amount of defocus aberration (say, W020 <= 2) is still in the region where diffraction interacts strongly with defocus. If you push the defocus much higher, you would start to enter the "geometric defocus" domain where defocus is essentially just an additional convolution of the scene with a circular disc.

Figure 10: Simulated lens at f/5.6 with W020 = 2.0. This image looks rather out of focus, as one would expect, but notice how the contrast fades near the "7" mark on the trumpet, but then recovers somewhat between the "6" and "3" marks. This corresponds to the "bounce" near 0.3 cycles/pix we saw in the MTF curve. Look closely, and you will see the black and white stripes have been reversed between the "6" and "3" marks.
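The "bounce" is easy to reproduce with a toy example. The Python sketch below uses the OTF of a simple 1D box blur (a sinc function) as a stand-in; it is not the diffraction-plus-defocus OTF simulated above, and the blur width of 3 pixels is a hypothetical value chosen so that the first zero falls near 0.33 cycles/pixel.

```python
import math

def box_otf(f, width):
    """OTF of a 1D box blur of the given width (pixels) at frequency f (cycles/pixel)."""
    x = math.pi * f * width
    return 1.0 if x == 0 else math.sin(x) / x

width = 3.0  # hypothetical blur width in pixels
for f in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    otf = box_otf(f, width)
    mtf = abs(otf)  # the MTF is the modulus of the OTF
    note = "phase inverted" if otf < 0 else ""
    print(f"{f:.1f} c/p  OTF = {otf:+.3f}  MTF = {mtf:.3f}  {note}")
```

Wherever the OTF is negative the stripes of a test pattern swap places (black becomes white), while the MTF only shows the magnitude, hence the apparent bounce off zero.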

Sample usage

To reproduce the simulated images shown above, first grab an instance of the "pinch.txt" test chart geometry here. Then you can render the scenes using the following commands:

mtf_generate_rectangle.exe --target-poly pinch.txt -p wavefront-box --aperture 5.6 --w040 2.0 -n 0.0 -o pinch_example.png

This should produce an 8-bit sRGB PNG image called pinch_example.png with significant spherical aberration (the --w040 parameter). You can use the --w020 parameter to add defocus to the simulation. Note that both these aberration coefficient arguments only take effect if you use either the wavefront-box or wavefront PSF models (as argument to the -p option); in general I recommend using the wavefront-box PSF, unless you specifically want to exclude the sensor photosite aperture component for some reason.

These new PSF models are available in MTF Mapper versions 0.7.4 and up.

Caveats

I am satisfied that the rendering algorithm implemented in mtf_generate_rectangle produces the correct PSF, and eventually MTF, for simulated systems with defocus and/or spherical aberration. What I have not yet confirmed to my own satisfaction is that the magnitude of the aberrations W020 and W040, currently expressed as multiples of the wavelength (λ), does indeed effect an aberration with a physically-correct magnitude. In other words, see them as unitless parameters that control the magnitude of the aberrations.

I hope to verify that the magnitude is physically correct at some point in the future.

If you simulate highly aberrated lenses, such as f/5.6 with W020 = 2, and measure a slanted-edge in the resulting image, you may notice that MTF Mapper does not reproduce the sharp "bounces" in the SFR curve quite as nicely as shown in Figure 6. You can add the "--nosmoothing" option to the Arguments field in the Preferences dialog to make those "bounces" more crisp, but this is only recommended if your image contains very low levels of image noise.

Another somewhat unexpected property of the W020 and W040 aberration coefficients is that the magnitude of their effect interacts strongly with the simulated f-number of the lens. In other words, simulating W020 = 0.5 at f/16 looks a lot more out-of-focus than W020 = 0.5 at f/2.8. This is because the W020 and W040 coefficients are specified as a displacement (along the axis of the lens) at the edge of the exit pupil of the lens, meaning that their angular impact depends strongly on the diameter of the exit pupil. Following Jack's article, the diameter of the defocus disc at the sensor plane scales as 8N λ W020 (translated to my convention for W020 and W040). Thus, if we want the same defocus disc diameter at f/2.8 that we saw at f/16 with W020 = 0.5, then we should choose W020 = 16/2.8 * 0.5 = 2.857. If I compare simulated images at f/2.8 with W020 = 2.857 to those at f/16 with W020 = 0.5, then it at least looks as if both images have a similar amount of blur, but obviously the f/16 image will lack a lot of the higher-frequency details outright.
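The f-number scaling is simple enough to capture in a small helper. The following Python sketch just encodes the 8N λ W020 relationship quoted above; the function names and the wavelength of 0.55 micron are assumptions for illustration.

```python
def defocus_disc_diameter(n, w020, wavelength):
    """Defocus disc diameter at the sensor, using the 8 * N * lambda * W020 scaling."""
    return 8.0 * n * wavelength * w020

def equivalent_w020(w020, n_from, n_to):
    """W020 at f-number n_to that gives the same disc diameter as w020 at n_from."""
    return w020 * n_from / n_to

wavelength = 0.55  # micron, assumed for illustration
w_f28 = equivalent_w020(0.5, 16.0, 2.8)
print(round(w_f28, 3))
print(defocus_disc_diameter(16.0, 0.5, wavelength))
print(defocus_disc_diameter(2.8, w_f28, wavelength))
```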

Hold on a minute ...

The astute reader might have noticed something odd about the equation given for PSF(r) in the introduction. If we set W020 = 0, W040 = 0, N = 1 and λ = 1, then γ(ρ) is real-valued and equal to 1.0. This reduces the integral to

But without any defocus or spherical aberration, we should obtain the Airy pattern PSF that we have used in the past, i.e., we should get

It helps to call in some reinforcements at this stage. The trick is to invoke the Hankel transform. As shown by Piessens [3], the Hankel transform (of order zero) of a function f(r) is

If we choose the function f(r) to be 1.0 when |r| < a, and zero otherwise, then Piessens [3, Example 9.2] shows that

If we make the substitutions a = 1.0, and s = πr, then we can see that our PSF model that includes defocus and spherical aberration readily reduces (up to a constant scale factor in this case, I could have been more careful) to the plain old Airy pattern PSF if we force the aberrations to be zero.
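For readers who want to see the steps written out, the following is a sketch of the reduction. It assumes that the zero-aberration integral has the form shown on the left of the last line, with any constant prefactor omitted, and it uses the commonly quoted normalised Airy form for comparison; the exact expressions given in the introduction may differ by a scale factor.

```latex
\begin{align}
F_0(s) &= \int_0^\infty f(r)\, J_0(s r)\, r \, dr
        && \text{(order-zero Hankel transform)} \\
F_0(s) &= \int_0^a J_0(s r)\, r \, dr = \frac{a\, J_1(a s)}{s}
        && \text{(for } f(r) = 1 \text{ when } |r| < a \text{, zero otherwise)} \\
\left| \int_0^1 J_0(\pi r \rho)\, \rho \, d\rho \right|^2
       &= \left[ \frac{J_1(\pi r)}{\pi r} \right]^2
        = \frac{1}{4} \left[ \frac{2 J_1(\pi r)}{\pi r} \right]^2
        && \text{(with } a = 1,\ s = \pi r \text{)}
\end{align}
```

The factor of 1/4 in the last line is the constant scale factor mentioned above.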

References

1. S.K. Lucas and H.A. Stone, "Evaluating infinite integrals involving Bessel functions of arbitrary order", Journal of Computational and Applied Mathematics, 64:217-231, 1995.
2. I. Robinson, "A comparison of numerical integration programs", Journal of Computational and Applied Mathematics, 5(3):207-223, 1979.
3. R. Piessens, "The Hankel transform", in The transforms and applications handbook, CRC press, 2000. (PDF here)

Adding OpenGL to the GUI to make it zippy

2018-07-03T17:30:00.003+02:00

The MTF Mapper GUI has always been a bit of a red-headed stepchild compared to the command-line version. In fact, the GUI just calls the command-line version to do the actual work. I have tried to keep the GUI functional, but minimal, mostly because I find working on the actual slanted-edge algorithm a lot more interesting than working on the GUI. At least the GUI is written in Qt, rather than, say, MATLAB ...

Fortunately I have found a way to make the GUI-related coding work a bit more interesting. I decided to upgrade the main image viewer of the GUI to an OpenGL implementation. The main motivation is that with an OpenGL rendering engine you essentially get high-quality image scaling for free, meaning you can effortlessly zoom into and out of an image without any noticeable lag. If you are familiar with the older MTF Mapper GUI (prior to version 0.7.0), then you may have experienced the unbearable delays when you try to adjust the image magnification with the mouse wheel.

Integrating OpenGL into Qt was a lot simpler than I expected; maybe this is because I only had to deal with the more modern QOpenGLWidget implementation. The somewhat more unexpected learning curve hit me when I tried to draw something in OpenGL. I think the last time I wrote any OpenGL code must have been in 2002, i.e., a while before OpenGL 2.0 was released. This meant that my knowledge of OpenGL was firmly stuck in the fixed-function pipeline era, so I had to start learning from scratch how to use the modern shader-based pipeline. Fortunately Joey de Vries created an excellent set of tutorials to help me get up to speed.

Anyhow, the idea is to cut the image (which may be larger than 10000 by 10000 pixels) into manageable tiles, and to map these tiles as textures onto quads. The textures are loaded with mipmapping enabled, so the textures on the tiles always appear smoothly rescaled regardless of the final display size of each tile. I chose to stick to power-of-two dimensions for the textures, even though modern GPUs should be able to handle non-power-of-two (NPOT) textures, mostly because I read some unconfirmed reports that certain integrated GPUs may experience slowdowns or other unexpected behaviour. With a bit of luck all these choices will maximise compatibility.
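The tiling itself is straightforward bookkeeping. Here is a minimal Python sketch of how an image could be cut into power-of-two tiles; the tile size of 512 and the function name are assumptions, not the values used in the MTF Mapper GUI, and the OpenGL texture upload is not shown.

```python
def tile_grid(width, height, tile=512):
    """Split an image into tile x tile texture tiles, where tile is a power of two.

    Returns (x, y, w, h) rectangles of valid image pixels; tiles along the right
    and bottom borders are only partially filled.
    """
    assert tile > 0 and tile & (tile - 1) == 0, "tile size must be a power of two"
    tiles = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            tiles.append((x, y, min(tile, width - x), min(tile, height - y)))
    return tiles

tiles = tile_grid(10000, 10000)
print(len(tiles), tiles[-1])
```

Each rectangle would then be copied into its own power-of-two texture and drawn as a quad at the matching position.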

The hardest part of the OpenGL viewer was actually to implement the scrolling / zooming behaviour with the help of Qt's QAbstractScrollArea; there are very few examples of how to use this class, at least according to Google. I also discovered that if you enable zooming in/out with a mouse wheel, then it is critical that the image appears to zoom around the point in the image directly under the mouse cursor --- any other zooming strategy feels disorienting. And of course I learnt that doing this while getting both your QOpenGLWidget and your QAbstractScrollArea objects to agree on the state (i.e., where you are in the image) is not trivial.
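The zoom-around-the-cursor behaviour boils down to one small calculation. The Python sketch below shows the arithmetic only, under the assumed convention that the scroll offset is the position of the viewport's top-left corner in scaled image pixels and the cursor position is in viewport coordinates; it does not use or represent the Qt classes mentioned above.

```python
def zoom_about_cursor(scroll_x, scroll_y, cursor_x, cursor_y, old_scale, new_scale):
    """Return the scroll offset that keeps the image point under the cursor
    at the same screen position after the scale changes."""
    # Image-space coordinate currently under the cursor.
    img_x = (scroll_x + cursor_x) / old_scale
    img_y = (scroll_y + cursor_y) / old_scale
    # Pick the new offset so that the same image point maps back to the cursor.
    return img_x * new_scale - cursor_x, img_y * new_scale - cursor_y

new_scroll = zoom_about_cursor(100.0, 50.0, 200.0, 150.0, old_scale=1.0, new_scale=2.0)
print(new_scroll)
```

The hard part described above is not this formula, but keeping the scroll bars and the rendering widget in agreement about the resulting state.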

I will update the MTF Mapper help/user guide accordingly, but for the record, here is a rundown of the image viewer controls:

You can scroll/pan the image by holding down the left mouse button while moving the mouse.
You can scroll/pan the image by using the mouse wheel; the default is vertical scrolling, but you can select horizontal scrolling by holding down the shift key while scrolling the wheel.
You can zoom in/out by scrolling the mouse wheel while holding down the control key. Zooming in is limited to a maximum magnification of 2x. Images that are smaller than the current viewport (window) size cannot be zoomed, nor can you zoom out past the point where one edge of your image matches the viewport width/height.
You can also zoom by holding down the right mouse button while moving the mouse up/down.
You can zoom in/out using the "+" and "-" keys on the keyboard after you have clicked (with any mouse button) at the location in the image around which you would like to zoom.
If you are viewing an "Annotated image", you can display the SFR curve of an edge by clicking on the annotation text, as described in Section 5.3 of the MTF Mapper help/user guide. The new feature is that a coloured dot will be drawn to indicate which edge you have selected, as illustrated below. (Yes, I know this feature should have been there from the start, but it would have been an enormous pain to implement without the new OpenGL viewer.)

I also discovered that actually loading a large annotated image can take a while, around 0.3 seconds on my test machine for a D7000-sized image, and around 1.8 seconds for an IQ180 image. If you are examining multiple images in a session, then switching between the images still felt painfully slow, especially if you repeatedly go back-and-forth between them. I decided to add a cache to speed this up; the default cache size is 1GB, but you can change this (in the preferences) if your machine is memory constrained, or you regularly open multiple 100 MP images (and have RAM to burn).
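A size-limited cache of this kind can be sketched in Python as shown below. The least-recently-used eviction policy, the class name and the method names are all assumptions made for this illustration; the post does not say which eviction policy the GUI uses.

```python
from collections import OrderedDict

class ImageCache:
    """Byte-budgeted cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, max_bytes=1 << 30):  # 1 GB, matching the default mentioned above
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> (data, size in bytes)

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key][0]

    def put(self, key, data, size):
        if key in self.items:
            self.used -= self.items.pop(key)[1]
        if size > self.max_bytes:
            return  # too large to cache at all
        self.items[key] = (data, size)
        self.used += size
        while self.used > self.max_bytes:
            _, (_, old_size) = self.items.popitem(last=False)  # evict the oldest entry
            self.used -= old_size

cache = ImageCache(max_bytes=100)
cache.put("a.png", "image A", 60)
cache.put("b.png", "image B", 30)
cache.get("a.png")                 # touching a.png makes b.png the oldest entry
cache.put("c.png", "image C", 40)  # exceeds the budget, so b.png is evicted
print(list(cache.items))
```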

Since the new OpenGL-based viewer introduced a whole lot of brand-new code, I expect that there may be a few issues, so please let me know if you encounter any!
(You can download version 0.7.0 from SourceForge)

Automatic processing of Imatest charts

2018-05-16T17:14:00.000+02:00

It turns out that people sometimes want to process an image of an Imatest SFRplus type chart in MTF Mapper. Or at the very least, I have received requests about this.

When I first released MTF Mapper, I took it as a given that people would just print out the MTF Mapper charts if they wanted to use MTF Mapper. In the meantime I have gotten wiser, and I now know how hard (or expensive?) it is to print high-quality test charts. So it actually makes perfect sense to use a good quality chart that you already own (e.g., an SFRplus chart) with MTF Mapper.

Unfortunately the design of the SFRplus-style charts includes a black bar that runs through the top of the top row of target squares, like in this example:

An example of an SFRplus chart. I blatantly copied this example from Imatest's website (please don't sue me).

This black bar causes MTF Mapper to see the entire top row of squares plus the black bar as a single object, and since this compound object does not resemble a square, it ignores it. The same thing happens with the black bar at the bottom of the chart. As a result, MTF Mapper only processes the interior squares, like so:

Ignore the actual MTF50 values, but do note that only the interior square targets were detected automatically

Other than the obvious spurious detections on the non-target squares (which you can cover up with post-its or such if necessary) the output is usable, but you lose the top and bottom rows of squares, which is not ideal.

A simple solution is to just crop your image to exclude the bars, and then to process the cropped image with MTF Mapper's "-b" option. This works, but it is rather clunky. So I added a convenience feature that will do this automatically.

You can choose the new File/Open Imatest image(s) option in the GUI, or you can add the --imatest-chart option if you use the command-line interface. Because the gray target squares of the SFRplus chart cover more of the white background than a typical MTF Mapper chart, you probably have to adjust the "Threshold" value (-t on the CLI, or under Settings/Preferences/Advanced in the GUI) a little to detect all the target squares. The default Threshold is 0.55, and bumping it down to 0.4 should work a little better. For our test image above, we then get this:

Much better; all target squares detected

Note that the top edges of the top row of squares are tagged with "N/A" rather than MTF50 values; this is just MTF Mapper's way to indicate that these edges do not represent valid measurements. If you are parsing the "edge_mtf_values.txt" file produced by the "-q" output option, these edges will have an MTF50 value of 1.0 (which is an impossible / invalid MTF50 value). Or you could identify them by their pixel coordinates, which is probably the better way.

This feature is available from MTF Mapper 0.6.21 onward.

Cropped single edge images now handled more elegantly in the GUI

2018-05-14T07:56:00.000+02:00

A while back I wrote a post that described the "--single-roi" option of MTF Mapper. The "--single-roi" mode specifically allows you to feed MTF Mapper with images that have already been cropped to contain only a single edge, i.e., they look like this:

From MTF Mapper 0.6.20 onwards, you can now use the menu option File/Open single edge image(s) to load images that look like the one pictured above. Note that by opening an image using this new menu option MTF Mapper will automatically enable the outputs that make sense (Annotated image output, with the ability to click on edges to view the SFR curve), while silently ignoring all the output types that do not make sense if you only have one edge.

Journal paper on MTF Mapper's deferred slanted edge analysis algorithm now available!

2018-02-20T09:01:00.000+02:00

A new paper on MTF Mapper's deferred slanted edge analysis algorithm has been published in the Optical Society of America's JOSA A journal. The paper describes one of the methods that MTF Mapper can use to compensate for radial lens distortion. The paper also covers the technique that MTF Mapper uses to process Bayer CFA subsets, e.g., when processing just the green Bayer channel of a raw mosaiced image.

The full reference:
F. van den Bergh, Deferred slanted-edge analysis: a unified approach to spatial frequency response measurement on distorted images and color filter array subsets, Journal of the Optical Society of America A, Vol. 35, Issue 3, pp. 442-451 (2018).

You can see the on-line abstract here. The full article is paywalled, but if you contact me by email I can send you an alternative document that covers the same topic. I will probably post some articles on this topic here on the blog sometime too.

Improved user documentation

2018-02-10T12:05:00.000+02:00

If you are a long-time MTF Mapper user, then you are probably just as surprised by this unexpected turn of events as I am, but I have updated the docs. Actually, it gets better: I completely rewrote the user documentation to produce the new and improved (yes, it is new, and yes, it is an improvement on the old docs) MTF Mapper user guide.

You can grab a PDF copy of the user guide here. I have discovered a way to produce a decent-looking HTML version of the PDF documentation; by selecting Help in the GUI (MTF Mapper version 0.6.14 and later) you should see a copy of the user guide open in your system web browser. This is probably a better way to ensure that you are reading the latest version of the user guide.

I have tried to make the user guide more task-focused so that new users will be able to have a better idea of what they can do with MTF Mapper, as well as how they can get started. However, it took me about two weeks to write the new user guide, and it weighs in at 50+ pages, so it is probably still a little intimidating at first glance. If you are a new user, and you have any suggestions on ways in which I can improve the user guide, please let me know.

Unfortunately, even 50+ pages are not enough to really cover all the functionality, and the sometimes non-intuitive behaviour, of MTF Mapper so there is still some work to be done.

Device profile support

2018-02-08T14:31:00.000+02:00

From version 0.6.16 onwards, MTF Mapper now supports device profiles embedded in input image files. If you normally feed MTF Mapper with raw camera files (via the GUI), then this new feature will not affect you in any way.

If you have been feeding MTF Mapper with JPEG files, or perhaps TIFF files produced by your preferred raw converter, then this new feature could have a meaningful impact on your results. You can jump ahead to the section on backwards compatibility if you want the low-down.

To explain what device profiles are, and why they affect MTF Mapper, I first have to explain what linear light is.

Linear light

At the sensor level we can assume that both CCD and CMOS sensors have a linear response, meaning that a linear increase in the amount of light falling on the sensor will produce a linear increase in the digital numbers (DNs) we read from the sensor. This is true up to the point where the sensor starts to saturate, where the response is likely to become noticeably non-linear.

The slanted-edge method at the heart of MTF Mapper expects that the image intensity values (DNs) are linear. If your intensity values are not linear, then the resulting SFR you produce using the slanted-edge method is incorrect.

Gamma and Tone Reproduction Curves

Rather than exploring the history of gamma correction in great detail, I'll try to summarize: Back in the day of Cathode Ray Tube (CRT) displays they found that a linear change in the signal (voltage) sent to the tube did not produce a linear change in the display brightness. If you were to produce a graph of the input signal vs brightness, you would obtain something that looks like this:

Figure 1: A non-linear display response

If you fast-forward to the digital era, you can see how this non-linearity in the brightness response can become rather tiresome if you want to display, say, a grayscale image. If you took a linear light image from a CCD sensor, which we assume produced linear values in the range 0 to 255, and put that in the display adaptor frame buffer, then your image would appear to be too dark. The solution was to pre-distort the image with the inverse of the non-linear response of the display, i.e., using this function:

Figure 2: The inverse of Figure 1

If you take a linear signal and apply the inverse curve of Figure 2, then take the result and apply the non-linear display response curve of Figure 1, you end up with a linear signal again. The process of pre-distorting the digital image intensity values to compensate for the non-linear display response is called gamma correction.

This scheme worked well enough when you knew what the approximate non-linear display response was: on PCs the curve could be approximated as f(x) = x^2.2. Things became a lot more complicated when you wanted to display an image on different platforms, for example, early Macintosh systems were characterized by a gamma of 1.8, i.e., f(x) = x^1.8.

The solution to the platform interoperability problem was to attach some metadata to your image to clearly state whether the digital numbers were referring to linear light intensity values, or whether they were pre-distorted non-linear values chosen to produce a perceived linear brightness image on a display. So how do you specify what your actual image represents? Do you specify the properties of the non-linear pre-distortion you applied to produce the values found in the image file, or do you instead specify the properties of the display device for which your image has been corrected? It turns out that both strategies were followed, with the PNG specification choosing the former, and most of the other formats (including ICC profiles) choosing the latter.

To cut to the chase: One important component of a device profile is its Tone Reproduction Curve (TRC); Figure 1 can be considered to be a TRC of our hypothetical CRT display device. Depending on the metadata format, you can either provide a single gamma parameter (γ) to describe the TRC as f(x) = x^γ, or you can provide a look-up table to describe the TRC.
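The round trip through gamma correction and the display response can be demonstrated with a few lines of Python. This is a minimal sketch using a pure power-law TRC with γ = 2.2; the function names and the 256-entry look-up table are illustrative assumptions, and real profiles may use different curves.

```python
def gamma_encode(linear, gamma=2.2):
    """Pre-distort a linear value in [0, 1] with the inverse display curve (Figure 2)."""
    return linear ** (1.0 / gamma)

def display_response(signal, gamma=2.2):
    """Simple power-law TRC f(x) = x^gamma (Figure 1)."""
    return signal ** gamma

def trc_from_lut(signal, lut):
    """Evaluate a TRC stored as a look-up table, using linear interpolation."""
    pos = signal * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)
    t = pos - i
    return (1.0 - t) * lut[i] + t * lut[i + 1]

linear_value = 0.5
too_dark = display_response(linear_value)    # linear data sent straight to the display
encoded = gamma_encode(linear_value)         # gamma-corrected value stored in the file
shown = display_response(encoded)            # the display undoes the pre-distortion

lut = [display_response(i / 255.0) for i in range(256)]
shown_lut = trc_from_lut(encoded, lut)       # same TRC, described by a table instead
print(too_dark, encoded, shown, shown_lut)
```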

Colour spaces

The other component of a device profile that potentially has an impact on how MTF Mapper operates is the definition of the colour space. The colour space matters because MTF Mapper transforms an RGB input image to a luminance image automatically. The main reason for this is that the slanted-edge method does not naturally apply to colour images; you have to choose to either apply it to each of the R, G and B channels separately, or you have to synthesize a single grayscale image from the colour channels. For MTF Mapper I chose the luminance (Y) component of the image, as represented in the CIE XYZ colour space, because this luminance component should correlate well with how our human vision system perceives detail.

So what is a colour space? To keep the explanation simple(r), I will just consider tristimulus colour spaces; in practice, those that describe a colour as a combination of three primary colours such as RGB (Red, Green, and Blue). Now consider the digital numbers associated with a given pixel in a linear 8-bit RGB colour space, e.g., (0, 255, 0) would represent a green pixel. Sounds straightforward, right? The catch is that we have not defined what we mean by "green". Two common RGB colour spaces that we encounter are sRGB and Adobe RGB; they have slightly different TRCs, but here we focus on their colour spaces. The difference between sRGB and Adobe RGB is that they (literally) have different definitions of the colour "green": Our "green" pixel with linear RGB values (0, 255, 0) in the sRGB colour space would have the values (73, 255, 10) in the linear Adobe RGB colour space because the Adobe RGB colour space uses a different green primary compared to sRGB. Note that the actual colour we wanted has not changed, but the internal representation has changed.

The nub of the matter is that our image file may contain the value (0, 255, 0), but we only really know what colour that refers to once we know in which colour space we are working. I hope you can see the parallel to the TRC discussion above: the image file contains some numbers, but we really do have to know how to interpret these numbers if we want consistent results.
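The change of representation for the "green" pixel can be checked with a short Python sketch. The two matrices are commonly published D65 values for linear sRGB to CIE XYZ and CIE XYZ to linear Adobe RGB; they are quoted here from memory as an assumption, and are not taken from MTF Mapper or from an ICC profile (which would use D50-adapted values).

```python
# Linear sRGB -> CIE XYZ (D65), commonly published values (assumed).
SRGB_TO_XYZ = [
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
]
# CIE XYZ (D65) -> linear Adobe RGB, commonly published values (assumed).
XYZ_TO_ADOBE = [
    [2.0413690, -0.5649464, -0.3446944],
    [-0.9692660, 1.8760108, 0.0415560],
    [0.0134474, -0.1183897, 1.0154096],
]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

green_srgb = [0.0, 1.0, 0.0]  # linear sRGB (0, 255, 0) scaled to [0, 1]
green_xyz = mat_vec(SRGB_TO_XYZ, green_srgb)
green_adobe = mat_vec(XYZ_TO_ADOBE, green_xyz)
print([255.0 * c for c in green_adobe])
```

The result should land close to the (73, 255, 10) quoted above: the same colour, but with non-zero red and blue components in the Adobe RGB representation.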

ICC profiles

A valid ICC profile always contains both the TRC information and the colour space information that MTF Mapper requires to produce a linear luminance grayscale image. In fact, the ICC profile contains the matrix that tells us how to transform linear RGB values into CIE XYZ values adapted for D50 illumination (that is very convenient if you do not want to get into chromatic adaptation).

So if you provide MTF Mapper with a TIFF, PNG^* or JPEG file with an embedded ICC profile, you can be sure that the resulting synthesized grayscale image will be essentially identical regardless of which colour profile you saved your image in.

^*MTF Mapper 0.6.17 and later.

JPEG/Exif files

If your JPEG file has no ICC profile, and no Exif metadata, then MTF Mapper will just assume that your image is encoded in the sRGB profile. Most cameras appear to at least add Exif metadata, but that only helps a little bit, since the Exif standard only really has a definitive way of indicating that the image is encoded in an sRGB profile. If your JPEG file is encoded in the Adobe RGB space (most DSLRs allow you to configure the JPEG output this way), then MTF Mapper will try to infer this from the clues in the Exif data.

MTF Mapper will use the appropriate TRC (either sRGB, or Adobe RGB), and the appropriate D50-adapted RGB-to-XYZ matrix will be selected for the luminance conversion.

Backwards compatibility (or the lack thereof)

Unfortunately the addition of device profile support has encouraged me to change the way in which JPEG files are converted to grayscale luminance images. In MTF Mapper version 0.6.14 and earlier, the JPEG RGB-to-YCbCr conversion was used to obtain a luminance image; from version 0.6.16 onwards the device profile conversion is used. In practice, this means that older versions would use

Y = 0.299R + 0.587G + 0.114B

regardless of whether the JPEG file was encoded in sRGB or Adobe RGB, which is clearly incorrect in the case of Adobe RGB files (regardless of the TRC differences). A typical weighting for an sRGB profile in version 0.6.16 and later would be

Y = 0.223R + 0.717G + 0.061B.

The practical implication is that results derived from JPEG files will be different between versions <= 0.6.14 and 0.6.16 and later. Figure 3 illustrates this difference on a sample sRGB JPEG file. To make matters worse, the difference will be exacerbated by lenses with significant chromatic aberration because the relative weights of the RGB channels have changed.

Figure 3: SFR difference on the same edge of a JPEG image (Green is v0.6.14, Blue is v0.6.16)

In Figure 3 the difference is small, but noticeable. For example, when rounded to two decimal places this edge will display as an MTF50 of 0.19 c/p on version 0.6.16, but 0.20 c/p on version 0.6.14. I expect that there will be examples out there that will exhibit larger differences than what we see here, but I do not expect to see completely different SFR curves.
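To make the change in channel weighting concrete, here is a small Python sketch that applies the two sets of weights quoted above to the same pixel values. It only compares the weights; the TRC handling that differs between the versions is deliberately left out, and the function names are illustrative.

```python
def luma_old(r, g, b):
    """Weights of the JPEG conversion used in version 0.6.14 and earlier."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def luminance_new(r, g, b):
    """Typical sRGB profile weighting quoted for version 0.6.16 and later."""
    return 0.223 * r + 0.717 * g + 0.061 * b

for name, rgb in (("red", (1.0, 0.0, 0.0)),
                  ("green", (0.0, 1.0, 0.0)),
                  ("blue", (0.0, 0.0, 1.0)),
                  ("gray", (0.5, 0.5, 0.5))):
    print(f"{name:5s} old = {luma_old(*rgb):.3f}  new = {luminance_new(*rgb):.3f}")
```

Neutral gray is almost unaffected, but the red and blue channels count for less and the green channel for more, which is why colour fringes from chromatic aberration shift the measured SFR.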

More importantly, MTF Mapper's behaviour regarding TIFF files has changed. In version 0.6.14 and earlier, all 8-bit input images were treated as if they were encoded in the sRGB profile; this probably produced the desired behaviour most of the time. If, however, a 16-bit TIFF file was used as input in version 0.6.14 and earlier, then MTF Mapper assumed the file contained linearly coded intensity values (i.e., gamma = 1.0). This behaviour worked fine if you used "dcraw -4 ..." to produce the TIFF (or .ppm) file, but would not work on 16-bit TIFF files produced by most raw converters or image editors. From version 0.6.16 onwards all TIFF files with embedded ICC profiles will work correctly, whether they are encoded in linear, sRGB, Adobe RGB or ProPhoto profiles. Figure 4 illustrates the difference on a 16-bit sRGB encoded TIFF file.

Figure 4: SFR difference on the same edge of a 16-bit sRGB TIFF image (Green is v0.6.14, Blue is v0.6.16)

In Figure 4 we see much larger differences between the SFR curves produced by versions 0.6.14 and 0.6.16; this larger difference is because 0.6.14 incorrectly interpreted the sRGB encoded (roughly gamma = 2.2) values as if they were linear.

One last important rule: MTF Mapper version 0.6.16 still interprets all other 16-bit input files without ICC profiles (PNG, PPM) as if they have a linear (gamma = 1.0) encoding.

Summary recommendations

Overall, you should be able to obtain more consistent results now that MTF Mapper supports embedded device profiles. For best results, choose to embed an ICC profile if your raw converter or image editor supports it. Given a choice, I still recommend using raw camera files directly with the GUI, or doing the raw conversion using "dcraw -4 -T ..." to convert your raw files if you use the command-line interface.

A brief overview of lens distortion correction

2017-08-22T13:44:00.000+02:00

Before I post an article on the details of MTF Mapper's automatic lens distortion correction features, I would like to describe in some detail the lens distortion model adopted by MTF Mapper.

The basics

Radial lens distortion is pretty much as the name suggests: the lens distorts the apparent radial position of an imaged point, relative to its ideal position predicted by the simple pinhole model. The pinhole model tells us that the position of a point in the scene, P(x, y, z) [assumed to be in the camera reference frame], is projected onto the image plane at position p(x, y) as governed by the focal length f, such that

p_x = (P_x - C_x) * f/(P_z - C_z)

p_y = (P_y - C_y) * f/(P_z - C_z)

where C(x, y, z) represents the centre of projection of the lens (i.e., the apex of the imaging cone).

We can express the point p(x, y) in polar coordinates as p(r, theta), where r² = p_x² + p_y²; the angle theta is dropped, since we assume that the radial distortion is symmetrical around the optical axis.
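The pinhole equations and the radial distance can be sketched in a few lines. The function name and the sample values below are mine, chosen only to illustrate the formulas above.

```python
import math

def pinhole_project(P, C, f):
    """Project scene point P = (x, y, z) through the centre of projection C.

    Returns the image-plane position (p_x, p_y) for focal length f,
    following the two pinhole equations above.
    """
    depth = P[2] - C[2]
    if depth == 0:
        raise ValueError("point lies in the plane of the centre of projection")
    p_x = (P[0] - C[0]) * f / depth
    p_y = (P[1] - C[1]) * f / depth
    return p_x, p_y

# Hypothetical values: a point 2 m in front of a 50 mm lens (units: metres).
p = pinhole_project(P=(0.2, -0.1, 2.0), C=(0.0, 0.0, 0.0), f=0.05)
r = math.hypot(p[0], p[1])   # radial position, r^2 = p_x^2 + p_y^2
print(p, r)
```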

Given this description of the pinhole part of the camera model, we can then model the observed radial position r_d as

r_d = r_u * F(r_u)

where the function F() is some function that describes the distortion, and r_u is the undistorted radial position, which we simply called "r" above in the pinhole model.

Popular choices of F() include:

Polynomial model (simplified version of Brown's model), with
F(r_u) = 1 + k₁ * r_u² + k₂ * r_u⁴
Division model (extended version of Fitzgibbon's model), with
F(r_u) = 1 / (1 + k₁ * r_u² + k₂ * r_u⁴)

Note that these models are really just simple approximations to the true radial distortion function of the lens; these simple models persist because they appear to be sufficiently good approximations for practical use.
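Both choices of F() are one-liners. The sketch below uses function names of my own invention and simply evaluates r_d = r_u * F(r_u) for each model.

```python
def poly_F(r_u, k1, k2):
    """Polynomial model: F(r_u) = 1 + k1*r_u^2 + k2*r_u^4."""
    return 1.0 + k1 * r_u**2 + k2 * r_u**4

def div_F(r_u, k1, k2):
    """Division model: F(r_u) = 1 / (1 + k1*r_u^2 + k2*r_u^4)."""
    return 1.0 / (1.0 + k1 * r_u**2 + k2 * r_u**4)

def distort(r_u, F, k1, k2):
    """Forward distortion: r_d = r_u * F(r_u)."""
    return r_u * F(r_u, k1, k2)

# Same radius, both models (coefficient values chosen only for illustration).
print(distort(1.0, div_F, 0.3, 0.0))    # 1 / 1.3
print(distort(1.0, poly_F, -0.3, 0.0))  # 1 - 0.3
```

Since 1/(1 + x) is approximately 1 - x for small x, a division model with coefficient k₁ behaves much like a polynomial model with coefficient -k₁ near the image centre; the two only part ways at larger radii.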

I happen to prefer the division model, mostly because it is reported in the literature to perform slightly better than the polynomial model [1, 2].

Some examples of radial distortion

Now for some obligatory images of grid lines to illustrate the common types of radial lens distortion we are likely to encounter. First off, the undistorted grid:

Figure 1: What the grid should look like on a pinhole camera

Add some barrel distortion (k₁ = -0.3, k₂ = 0 using division model) to obtain this:

Figure 2: Barrel distortion, although I think "Surface of an inflated balloon distortion" would be more apt.

Note how the outer corners of our grid lines appear at positions closer to the centre than we saw in the undistorted grid. We can instead move those corners further outwards from where they were in the undistorted grid to obtain pincushion distortion (k₁ = 0.3, k₂ = 0 using division model):

Figure 3: Pincushion distortion, although I would prefer "inaccurate illustration of gravitationally-induced distortion in space-time".

If we combine these two main distortion types, we obtain moustache distortion (k₁ = -1.0, k₂ = 1.1 using division model):

Figure 4: Moustache distortion.

We can swap the order of the barrel and pincushion components to obtain another type of moustache distortion, although I do not know if any extant lenses actually exhibit this combination (k₁ = 0.5, k₂ = -0.5 using division model):

Figure 5: Alternative (inverted?) moustache distortion.

Quantifying distortion

Other than using the k₁ and k₂ parameters (which might be a bit hardcore for public consumption), how would we summarize both the type and the magnitude of a lens' radial distortion? It appears that this is more of a rhetorical question than we would like it to be. There are several metrics currently in use, most of them unsatisfying in some respect or another.

One of the most widely used metrics is SMIA "TV distortion", which expresses distortion as a percentage in accordance with the following diagram:

Figure 6: Slightly simplified SMIA TV distortion

The SMIA TV distortion metric is just 100*(A - B)/B. If the value is negative you have barrel distortion, and positive values imply pincushion distortion. If you have moustache distortion as shown in Figures 4 and 5, then you could very likely obtain a value of 0% distortion. Whoops!

I only show SMIA TV distortion here to make a concrete link to the k₁ parameter, and to highlight that SMIA TV distortion is not useful in the face of moustache distortion.

Using the division model

There is one subtlety that is worth pondering a while: are we modelling the forward distortion, i.e., the distortion model maps our undistorted pinhole projected points to their distorted projected points, or are we modelling the reverse mapping, i.e., we model the correction required to map the distorted projected points to their undistorted pinhole projected points?

The important point to note is that neither the polynomial model, nor the division model, compels us to choose a specific direction, and both models can successfully be applied in either direction by simply swapping r_d and r_u in the equations above. I can think of two practical implications of choosing a specific direction:

If we choose the forward direction (such as presented above in "The basics") where r_d = r_u * F(r_u), then we must have a way of inverting the distortion if we want to correct an actual distorted image as received from the camera. If we undistort an entire image, then we would prefer to have an efficient implementation of the reverse mapping, i.e., we require an efficient inverse function F^-1() such that r_u = r_d * F^-1(r_d), or equivalently F^-1(r_d) = r_u/r_d. In this form it is not immediately clear that we can find a closed-form solution to the reverse mapping, and we may have to resort to an iterative method to effect the reverse mapping. Depending on how we plan to obtain our distortion coefficients k₁ and k₂, it may be that the forward distortion approach could be far more computationally costly than the reverse distortion approach. To summarize: inverting the distortion model for each pixel in the image can be costly.
The process of estimating k₁ and k₂ typically involves a non-linear optimization process, which can be computationally costly if we have to compute the reverse mapping on a large number of points during each iteration of the optimization algorithm. I have a strong aversion to using an iterative approximation method inside of an iterative optimization process, since this is almost certainly going to be rather slow. To summarize: inverting the distortion model during non-linear optimization of k₁ and k₂ can be costly.

Just how costly is it to compute the undistorted points given the distorted points and a forward distortion model?

Polynomial model:
r_d = r_u * (1 + k₁ * r_u² + k₂ * r_u⁴), or after collecting terms,
r_u + r_u * k₁ * r_u² + r_u * k₂ * r_u⁴ - r_d = 0
k₁ * r_u³ + k₂ * r_u⁵ + r_u - r_d = 0
Since we are given r_d, we can compute potential solutions for r_u by finding the roots of a 5th-order polynomial.
Division model:
r_d = r_u / (1 + k₁ * r_u² + k₂ * r_u⁴), or
r_d * k₁ * r_u² + r_d * k₂ * r_u⁴ - r_u + r_d = 0
This looks similar to the polynomial model, but at least we only have to find the roots of a 4th-order polynomial, which we can do using Ferrari's formula; conveniently, the quartic is already in depressed form, since it has no r_u³ term.

In both cases we have to find all the roots, including the complex ones, and then choose the appropriate real root to obtain r_u given r_d (I assume here that the distortion is invertible, which we can enforce in practice by constraining k₁ and k₂ as proposed by Santana-Cedres et al. [3]).
Alternatively, we could try a fixed-point iteration scheme, i.e., initially guess that r_u = r_d, substitute this into the equation r_u = r_d / F(r_u) to obtain a new estimate of r_u, rinse and repeat until convergence (this is what OpenCV does). Both of these approaches are far too computationally demanding to calculate for every pixel in the image, so it would appear that we would be better off by estimating the reverse distortion model.
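A minimal sketch of the fixed-point scheme, written for the forward division model. The function names, tolerance and iteration cap are my own choices for illustration; this is neither MTF Mapper's nor OpenCV's code, and it assumes coefficients mild enough for the iteration to converge.

```python
def div_forward(r_u, k1, k2):
    """Forward division model: r_d = r_u / (1 + k1*r_u^2 + k2*r_u^4)."""
    return r_u / (1.0 + k1 * r_u**2 + k2 * r_u**4)

def undistort_fixed_point(r_d, k1, k2, tol=1e-12, max_iter=100):
    """Solve r_u = r_d / F(r_u) by fixed-point iteration, starting at r_u = r_d."""
    r_u = r_d
    for _ in range(max_iter):
        # For the division model, 1/F(r_u) is just the denominator polynomial.
        r_next = r_d * (1.0 + k1 * r_u**2 + k2 * r_u**4)
        if abs(r_next - r_u) < tol:
            return r_next
        r_u = r_next
    raise RuntimeError("fixed-point iteration did not converge")

# Round trip: distort a known radius, then recover it.
r_d = div_forward(0.8, 0.025, 0.0)
print(r_d, undistort_fixed_point(r_d, 0.025, 0.0))
```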

But there is a trick that we can employ to speed up the process considerably. First, we note that our normalized distorted radial values are in the range [0, 1], if we normalize such that the corner points of our image have r = 1, and the image centre has r = 0. Because the interval is closed, it is straightforward to construct a look-up table to give us r_u for a given r_d, using, for example, the root-finding solutions above. If we construct our look-up table such that r_d is sampled with a uniform step length, then we can use a pre-computed quadratic fit to interpolate through the closest three r_d values to obtain a very accurate estimate of r_u. The combination of a look-up table plus quadratic interpolation is almost as fast as evaluating the forward distortion equation. The only limitation to the look-up table approach, though, is that we have to recompute the table whenever k₁ or k₂ changes, meaning that the look-up table method is perfect for undistorting an entire image for a given k₁ and k₂, but probably too expensive to use during the optimization task to find k₁ and k₂.

So this is exactly what MTF Mapper does: the forward distortion model is adopted so that the optimization of k₁ and k₂ is efficient, with a look-up table + quadratic interpolation implementation for undistorting the entire image.
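The look-up table idea can be sketched as follows. This is my own simplified rendition, not MTF Mapper's implementation: the table is filled using a plain fixed-point solver as a stand-in for the root-finding step, r_d is assumed to be normalized to [0, 1], and the table size of 64 is arbitrary.

```python
def undistort_exact(r_d, k1, k2, iters=200):
    """Slow reference inverse of the forward division model (fixed-point)."""
    r_u = r_d
    for _ in range(iters):
        r_u = r_d * (1.0 + k1 * r_u**2 + k2 * r_u**4)
    return r_u

def build_lut(k1, k2, n=64):
    """Sample r_u at n uniformly spaced r_d values in [0, 1] and pre-compute
    a quadratic (a, b, c) through each interior node and its two neighbours."""
    step = 1.0 / (n - 1)
    y = [undistort_exact(i * step, k1, k2) for i in range(n)]
    coeffs = []
    for i in range(1, n - 1):
        a = 0.5 * (y[i + 1] - 2.0 * y[i] + y[i - 1])
        b = 0.5 * (y[i + 1] - y[i - 1])
        coeffs.append((a, b, y[i]))
    return coeffs, step

def undistort_lut(r_d, lut):
    """Fast inverse: pick the nearest interior node, evaluate its quadratic."""
    coeffs, step = lut
    n = len(coeffs) + 2                      # number of table nodes
    pos = r_d / step
    i = min(max(int(round(pos)), 1), n - 2)  # clamp to interior nodes
    t = pos - i                              # offset from node i, in steps
    a, b, c = coeffs[i - 1]
    return (a * t + b) * t + c

lut = build_lut(0.025, 0.0)
print(undistort_lut(0.5, lut), undistort_exact(0.5, 0.025, 0.0))
```

Once the table is built, each look-up costs one rounding operation and two multiply-adds, which is why it is attractive for whole-image undistortion but has to be rebuilt whenever k₁ or k₂ changes.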

Some further observations on the models

If you stare at the equation for the inversion of the division model for a while, you will see that

r_d * k₁ * r_u² + r_d * k₂ * r_u⁴ - r_u + r_d = 0

neatly reduces to

r_d * k₁ * r_u² - r_u + r_d = 0

if we assume that k₂ = 0. This greatly simplifies the root-finding process, since we can use the well-known quadratic formula, or at least, the numerically stable version of it. This is such a tempting simplification of the problem that many authors [1, 2] claim that a division model with only a single k₁ parameter is entirely adequate for modeling radial distortion in lenses.

That, however, is demonstrably false in the case of moustache distortion, which requires a local extremum or inflection point in the radial distortion function. For example, the distortion function that produces Figure 4 above looks like this:

Figure 7: The distortion function F() corresponding to Figure 4.

It is clear that the division model with k₂ = 0 cannot simultaneously produce the local minimum observed at the left (r_d = 0) and the local maximum to the right (r_d ~ 0.65).

Similar observations apply to the polynomial model, i.e., we require k₂ ≠ 0 to model moustache distortion.
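Both points in this section can be checked numerically. The sketch below (function names are mine) first inverts the single-parameter division model with the cancellation-free form of the quadratic formula, and then lists the radii at which the polynomial 1 + k₁r² + k₂r⁴ is stationary. Since F() is either this polynomial or its reciprocal, F() can only have an extremum where the polynomial does.

```python
import math

def undistort_div1(r_d, k1):
    """Invert r_d = r_u / (1 + k1*r_u^2) in closed form.

    Solves r_d*k1*r_u^2 - r_u + r_d = 0 for the root that tends to r_d as
    k1 -> 0, written so that no subtraction of nearly equal numbers occurs.
    """
    disc = 1.0 - 4.0 * k1 * r_d * r_d
    if disc < 0.0:
        raise ValueError("no real solution: r_d is outside the invertible range")
    return 2.0 * r_d / (1.0 + math.sqrt(disc))

def stationary_radii(k1, k2):
    """Radii r >= 0 where d/dr (1 + k1*r^2 + k2*r^4) = 2*k1*r + 4*k2*r^3 = 0."""
    radii = [0.0]
    if k2 != 0.0 and -k1 / (2.0 * k2) > 0.0:
        radii.append(math.sqrt(-k1 / (2.0 * k2)))
    return radii

r_d = 0.9 / (1.0 + 0.3 * 0.9**2)
print(undistort_div1(r_d, 0.3))      # recovers 0.9
print(stationary_radii(-0.3, 0.0))   # only r = 0
print(stationary_radii(-1.0, 1.1))   # r = 0 and one interior radius
```

With k₂ = 0 the only stationary point is r = 0, so there is no way to obtain a second extremum; the Figure 4 parameters, on the other hand, give an additional stationary point at r = sqrt(1/2.2).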

Wrapping up

I think that covers the basics of radial distortion modelling. In a future article I will demonstrate how one would go about determining the parameters k₁ and k₂ from a sample image.

References

1. Fitzgibbon, A.W., Simultaneous linear estimation of multiple view geometry and lens distortion, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001.
2. Wu, F, Wei, H, Wang, X, Correction of image radial distortion based on division model, SPIE Optical Engineering, 56(1), 2017.
3. Santana-Cedres, D, et al., Invertibility and estimation of two-parameter polynomial and division lens distortion models, SIAM Journal on Imaging Sciences, 8(3):1574-1606, 2015.

Image interpolation: Fighting the fade, part 1

2017-08-01T13:29:00.000+02:00

Over the last six months or so I kept on bumping into a particularly vexing problem related to image interpolation: the contrast in an interpolated image drops to zero in the worst case of interpolating at an exact half-pixel shift relative to the original image.

Consider the case where you are trying to co-register (align) two images, such as two images captured by the same camera, but with a small translation of the camera between the two shots. If we translate the camera purely in the horizontal direction, then the shift between the two images will be h pixels, where h can be any real number. The integer part of h will not cause us any trouble for reasonable values of h, such that the two images still overlap, of course. The trouble really lies in the fractional part of h, since this forces us to interpolate pixel values from the moving image if we want it to line up correctly with the fixed image.

The worst case scenario, as mentioned above, is if the fractional part of h is exactly 0.5 pixels, since this implies that the value of a pixel in the interpolated moving image will be the mean of the two closest pixels from the original moving image. Figure 1 illustrates what such an interpolated moving image will look like for a half-pixel shift; the edges are annotated with their MTF50 values.

Figure 1: Scaled (Nearest Neighbour interpolation) view of an interpolated moving image that experienced a half-pixel shift. The numbers are the measured MTF50 values, in cycles/pixel.

Looking closely at the vertical edges of the gray square, we can see that there are some visible interpolation artifacts manifesting as overshoot and undershoot. This image was interpolated using OMOMS cubic spline interpolation [2], which is the best method that I am aware of. Linear interpolation would produce much more blurring (but no overshoot/undershoot). And of course we see a marked drop in MTF50 on the vertical edges!

At any rate, the MTF curves for the edges are illustrated in Figure 2. The blue curve corresponds to the horizontal edge (i.e., the direction that experienced no interpolation), and the orange curve corresponds to the vertical edge (a 0.5-pixel horizontal shift interpolation). The green curve was obtained from another simulation where the moving image was shifted by 0.25 pixels.

Figure 2: MTF curves of interpolated moving images corresponding to fractional horizontal shifts of zero (blue), 0.25 pixels (green), and 0.5 pixels (orange).

Certainly the most striking feature of the orange curve is how the contrast drops to exactly zero at the Nyquist frequency (0.5 cycles/pixel). The smaller 0.25-pixel shift (green curve) shows a dip in contrast around Nyquist, but this would probably not be noticeable in most images.
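The zero at Nyquist is easy to reproduce for the simplest possible case. The sketch below uses plain two-tap linear interpolation, not the OMOMS method used for the figures, so the curves will not match Figure 2 in detail; the function name is mine. A shift of s pixels blends two neighbouring samples with weights (1 - s) and s, and the contrast at a given frequency is the magnitude of that two-tap filter's frequency response.

```python
import cmath
import math

def linear_interp_mtf(shift, freq):
    """Contrast retained by two-tap linear interpolation.

    shift: fractional pixel shift in [0, 1]
    freq:  spatial frequency in cycles/pixel (Nyquist = 0.5)
    """
    h = (1.0 - shift) + shift * cmath.exp(-2j * math.pi * freq)
    return abs(h)

for s in (0.0, 0.25, 0.5):
    print(s, round(linear_interp_mtf(s, 0.5), 6))
```

At Nyquist the signal alternates in sign from one pixel to the next, so averaging two neighbours with equal weight cancels it completely. The same pairing argument holds for any symmetric interpolation kernel evaluated at a half-pixel offset, which is why a better interpolator cannot rescue this particular case.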

In Figure 3 we can see that this loss of contrast around Nyquist follows a smooth progression as we approach a fractional shift of 0.5 pixels.

Figure 3: MTF curves of interpolated moving images corresponding to fractional horizontal shifts of 0.25 pixels (blue), 0.333 pixels (green), and 0.425 pixels (orange).

The conclusion from this experiment is that we really want to avoid interpolating an image with a fractional shift of 0.5 pixels (in either/both horizontal and vertical directions), since this will produce a very noticeable loss of contrast at higher frequencies, i.e., we will lose all the fine details in the interpolated image.

Radial distortion lens correction

An applied example of where this interpolation problem crops up is when we apply a radial distortion correction model to improve the geometry of images captured by a lens exhibiting some distortion (think barrel or pincushion distortion). I aim to write a more thorough article on this topic soon, but for now it suffices to say that our radial distortion correction model specifies for each pixel (x, y) in our corrected image where we have to go and sample the distorted image.

I prefer to use the division model [1], which implies that for a pixel (x, y) in the corrected image, we go and sample the pixel at

x' = (x - x_c) / (1 + k₁r² + k₂r⁴) + x_c

where

r = sqrt((x - x_c)² + (y - y_c)²)
and (x_c,y_c) denotes the centre of distortion (which could be the centre of the image, for example).

The value of y' is calculated the same way. The actual distortion correction is then simply a matter of visiting each pixel (x, y) in our undistorted image, and setting its value to the interpolated value extracted from (x', y') in the distorted image.
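As a sketch, the coordinate mapping for a single pixel might look like this. The function name is mine, and the r_norm parameter is an assumption I have added so that r is normalized (e.g., r = 1 at the image corner); set r_norm = 1 to get the equations exactly as written above.

```python
import math

def source_coords(x, y, x_c, y_c, k1, k2, r_norm=1.0):
    """Map pixel (x, y) of the corrected image to (x', y') in the distorted image.

    r_norm is an assumed normalization radius (e.g., the centre-to-corner
    distance in pixels); it is not part of the equations in the text.
    """
    dx, dy = x - x_c, y - y_c
    r = math.hypot(dx, dy) / r_norm
    denom = 1.0 + k1 * r**2 + k2 * r**4
    return dx / denom + x_c, dy / denom + y_c

# Hypothetical 4000x3000 image, centre of distortion at the image centre.
xp, yp = source_coords(0, 0, 2000, 1500, 0.025, 0.0, r_norm=2500.0)
print(xp, yp)   # both values land between pixel centres
```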

The important part to remember here is that the value (x', y') can assume any fractional pixel value, including the dreaded half-pixel shift.

An example of mild pincushion distortion

In order to illustrate the effects of radial distortion correction, I thought it best to start with synthetic images with known properties. Figure 4 illustrates a 100% crop near the top-left corner of the reference image, i.e., what we would have obtained if the lens did not have any distortion.

Figure 4: The pure, undistorted reference image. Note that the closely-spaced black lines blur into gray bars because of the simulated Gaussian Point Spread Function (PSF) with an MTF50 of 0.35 c/p. If you squint hard enough, you can see some traces of the original black bars. Rendered at 400% size with nearest-neighbour upscaling. (click for 100% view)

I simulated a very mild pincushion distortion with k₁ = 0.025 and k₂ = 0, which produces an SMIA lens distortion figure of about -1.62%. This distortion was applied to the polygon geometry, which was again rendered with a Gaussian PSF with an MTF50 of 0.35 c/p. The result is shown in Figure 5. Keep in mind that you cannot really see the pincushion distortion at this scale, since we are only looking at the top-left corner of a much larger image.

Figure 5: Similar to Figure 4, but with about 1.62% pincushion distortion applied to the polygon geometry. Rendered at 400% size with nearest-neighbour upscaling. (click for 100% view)

We can see the first signs of trouble in Figure 5: Notice how the black/white bars appear to "fade out" at regular intervals. The straight lines of Figure 4 are no longer perfectly straight, nor are they aligned with the image rows and columns. The lines thus cross from one row (or column) to the next, and the gray patches correspond to the regions where the lines fell halfway between two rows (or columns), leading to the apparent loss of contrast.

It is important to understand at this point that the fading in Figure 5 is not a processing artifact; this is exactly what would happen if you were to photograph similar thin bars that are not aligned with the image rows/columns.

Finally, we arrive at the radial distortion correction phase. Figure 6 illustrates what the corrected image would look like if we used standard cubic interpolation to resample the image.

Figure 6: The undistorted version of Figure 5. Resampling was performed using standard cubic interpolation. Rendered at 400% size with nearest-neighbour upscaling. (click for 100% view).

Some additional fading appears in Figure 6. If you flick between Figures 5 and 6 (after clicking for 100% view) you will notice that an extra set of fading patches appears in between the original fading patches. These extra fades are the manifestation of the phenomenon illustrated in Figure 2: the contrast drops to zero as the interpolation sample position approaches a fractional pixel offset of 0.5. The interesting thing about these additional fades is that they are not recoverable using sharpening --- once the contrast reaches zero, no amount of sharpening will be able to recover it.

A potential workaround

The aim of radial distortion correction is to remove the long range (or large scale) distortion, since the curving of supposedly straight lines (e.g., building walls) is only really visible once the distortion produces a shift of more than one pixel. Unfortunately we cannot simply ignore the fractional pixel shifts --- this would be equivalent to using nearest-neighbour interpolation, with its associated artifacts.

Perhaps we can cheat a little: what if we pushed our interpolation coordinates away from a fractional pixel shift of 0.5? Let x' be the real-valued x component of our interpolation coordinate obtained from the radial distortion correction model above. Further, let x_f be the largest integer not exceeding x' (the floor of x'). If x' - x_f < 0.5, then let d = x' - x_f. (We can deal with the d > 0.5 case by symmetry).

Now, if d > 0.375, we compress the value of d linearly such that 0.375 <= d' <= 0.425. We can obtain the new value of x', which we can call x", such that x" = x_f + (x' - x_f ) * 0.4 + 0.225. Looking back at Figure 3, we see that a fractional pixel shift of 0.425 seems to leave us with at least a little bit of contrast; this is where the magic numbers and thresholds were divined from.
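Written out as code, including the mirrored case, the coordinate nudge could look like the sketch below. The function name is mine, and sending a shift of exactly 0.5 downwards (to 0.425) rather than upwards is an arbitrary choice on my part.

```python
import math

def nudge_coordinate(x):
    """Pull the fractional part of x away from 0.5, using the thresholds above."""
    x_f = math.floor(x)
    d = x - x_f
    if d <= 0.5:
        if d > 0.375:
            d = d * 0.4 + 0.225          # maps (0.375, 0.5] onto (0.375, 0.425]
    else:
        e = 1.0 - d                      # mirror image of the case above
        if e > 0.375:
            d = 1.0 - (e * 0.4 + 0.225)  # maps (0.5, 0.625) onto (0.575, 0.625)
    return x_f + d

for x in (10.2, 10.375, 10.5, 10.6):
    print(x, nudge_coordinate(x))
```

The mapping is continuous at 0.375 and 0.625, so coordinates just outside the compressed band are left untouched; the only jump is at 0.5 itself, where the result flips from 0.425 to 0.575.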

Does this work? Well, Figure 7 shows the result of the above manipulation of the interpolation coordinates, followed by the same cubic interpolation method used in Figure 6.

Figure 7: The undistorted version of Figure 5. Resampling was performed using the modified interpolation coordinates followed by cubic interpolation. Rendered at 400% size with nearest-neighbour upscaling. (click for 100% view).

Careful squinting reveals that the additional fading patches observed in Figure 6 have been reduced noticeably. This looks promising. Of course, one might argue that I have just added some more aliasing to the image. Which might be the case.

Further testing will be necessary, especially on more natural looking scenes. I might be able to coax sufficient distortion from one of my lenses to perform some real-world experiments.

Further possibilities

Using the forced geometric error method proposed above, we can now extract at least some contrast at the frequencies near Nyquist. We also know what the fractional pixel shift was in both x and y, so we know what the worst-case loss-of-contrast would be. By combining these two bits of information we can sharpen the image adaptively, where the sharpening strength is adjusted according to the expected loss of contrast.

Stay tuned for part two, where I plan to investigate this further.

References

1. Fitzgibbon, A.W.: Simultaneous linear estimation of multiple view geometry and lens distortion. In: Proc. IEEE International Conference on Computer Vision and Pattern Recognition, pp. 125–132 (2001).
2. Thevenaz, P., Blu, T. and Unser, M.: Interpolation revisited, IEEE Transactions on Medical Imaging, 19(7), pp. 739–758, 2000.

Windows binaries are now 64-bit

2017-07-18T12:06:00.000+02:00

I figured that by 2017 most Windows users will probably be running a 64-bit version of Windows, so it should be reasonably safe to switch to distributing 64-bit binaries from version 0.6 onward.

The practical benefit of this move is that larger images can now be processed safely; late in the 0.5 series of MTF Mapper you could cause it to crash by feeding it a 100 MP image. While it is possible to rework some of MTF Mapper's code to use substantially less memory (e.g., some of the algorithms can be run in a sliding-window fashion rather than whole-image-at-a-time, and I could add some on-the-fly compression in other places), it just seemed like much less work to switch to 64-bit Windows binaries.

That being said, if there is sufficient demand, I am willing to build 32-bit binaries occasionally.

There are quite a few new things in the 0.6 series of MTF Mapper (check the Settings dialog of the GUI):

Fully automatic radial distortion correction using only one image (preferably of an MTF Mapper test chart, but anything with black trapezoidal targets on a white background will work). Enabling this feature slows down the processing quite a bit, so I do not recommend using this by default. More on this in an upcoming blog article.
Correction of equiangular (f-theta) fisheye lens images.
Correction of stereographic fisheye lens images.

I plan on posting an article or two on these new features, so stay tuned!

New --single-roi input mode

2017-05-03T10:25:00.000+02:00

My original vision was for MTF Mapper to be fully automated; all you had to do was provide it with an image of one of the MTF Mapper test charts. The implementation was centered on the idea that detecting a dark, roughly rectangular target on a white background was a much more tractable problem than detecting arbitrary edges (hopefully representing slanted edges) in arbitrary input images. Figure 1 illustrates what a suitable MTF Mapper input image looks like.

Figure 1: A single target (black rectangle) on a white background. MTF Mapper can detect any number of such shapes in your input image; the target objects need not be perfectly rectangular either, as some deviation from perfect 90-degree corners is allowed.

This approach did pay off, and still does, allowing users to design their own test charts that just work with MTF Mapper without requiring specific support for each custom test chart design.

As it turns out, many users have a very different workflow which does not allow them to specify their own chart. Examples of this include Jack Hogan's analysis of the DP Review test chart images, or Jim Kasson's razor-blade focus rail experiments. This type of workflow produces a rectangular Region Of Interest (ROI) that contains only a single edge. Figure 2 illustrates what a typical input image from this use case looks like.

Figure 2: A rectangular ROI containing a single slanted edge.

In the past, MTF Mapper could only process images that look like Figure 2 by specifying the -b option, which would add a white border around the image, thereby transforming it to look more like the expected input convention illustrated in Figure 1. This was a bit of a hack, and has some severe drawbacks. The most prominent disadvantage of the -b option is that the automatic dark target detection code in MTF Mapper could fail to detect the target if the edge contrast was poor, or if the edge was extremely blurry. Fussing with the detection threshold (-t option) sometimes helped, but this just highlighted the fact that the -b option was a hack.

From MTF Mapper version 0.5.21 onwards, there is a new option, --single-roi, which is intended to replace the use of the -b option when the input images look like Figure 2.

The --single-roi input mode completely bypasses the automatic thresholding and target detection code, and instead assumes that the input image contains only a single edge. The ROI does not have to be centered perfectly on the edge, but I recommend that your ROI include at least 30 pixels on each side of the edge. MTF Mapper will automatically restrict the analysis to the region of the image that falls within a distance of 28 pixels from the actual edge, so it does not hurt to have a few extra pixels on the sides of the edge (meaning the left and/or right side of an edge oriented as shown in Figure 2).

A typical invocation would look like this:

mtf_mapper.exe --single-roi -q image.png output_dir

which would produce two files (edge_mtf_values.txt, edge_sfr_values.txt) in output_dir. The second and third columns of edge_mtf_values.txt give you the image coordinates of the centre of the detected edge (not really that useful in combination with --single-roi), and the fourth column gives you the measured MTF50 value. To learn the mysteries of the format of the edge_sfr_values.txt file you must first signal the secret MTF Mapper handshake.
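If you want to pull the MTF50 values into a script, something along these lines should do. I am assuming here that the columns of edge_mtf_values.txt are separated by whitespace; only the three columns described above are interpreted, and the sample line is made up purely for illustration (it is not real MTF Mapper output).

```python
def read_mtf50(lines):
    """Yield (x, y, mtf50) tuples from lines laid out as described above.

    Assumes whitespace-separated columns; columns 2 and 3 are the edge
    centre coordinates and column 4 is the MTF50 value.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or short lines
        yield float(fields[1]), float(fields[2]), float(fields[3])

# Made-up sample line, for illustration only.
sample = ["0 123.5 456.25 0.187"]
edges = list(read_mtf50(sample))
print(edges)
```

To read the real thing, pass an open file object instead of the sample list, e.g. the file handle obtained from opening edge_mtf_values.txt in your output directory.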

Note that it is also possible to use the --single-roi mode in conjunction with the MTF Mapper GUI, provided that your images have already been cropped to look like Figure 2. Just add the string "--single-roi" to the "Arguments" field of the Settings dialog; now you can view the SFR curve of your edge as described in this post. Update: From MTF Mapper 0.6.20 onward you can use the menu option File/Open single edge image(s) as a much more convenient method of processing cropped edges in the GUI.

You can still use the --bayer red option with the --single-roi option to process only the red channel (for example) from an un-demosaiced Bayer image, such as produced by dcraw -4 -d; just be careful that your ROI is cropped such that the starting row/column of the Bayer pattern is RGGB (the only format currently supported by MTF Mapper).

View MTF (SFR) curves in the GUI

2017-04-15T10:02:00.001+02:00

The Easter bunny has delivered. That most elusive of egg-laying mammals has brought you a new GUI feature, which finally completes the feature set I originally envisioned for MTF Mapper.

To visualize the MTF curve (or SFR curve, if you prefer), load up any suitable image in MTF Mapper using the menu option "File -> Open". Select the "Annotated image" output mode, like so:

Make sure "Annotated image" is selected in the desired output types box

You may select any of the other output types (e.g., "Grid") concurrently, except the "Focus position" output type, which is not currently compatible with the other output types.

Click the "Open" button, and wait for the outputs to start appearing in the "Data set" tree-view panel. Expand the entry in the tree-view to expose the "annotated" entry, and click on it:

Note the cyan-coloured text superimposed on the edges of your black target squares. These values are the MTF50 values of the edges, expressed in cycles per pixel.

Those cyan-coloured text labels serve two purposes: a) they tell you the MTF50 (cycles per pixel) of the edge on top of which it is drawn, and b) they are clickable (left mouse button) targets to bring up the MTF curve display for the selected edge.

A brief digression: If the text label is displayed in yellow (rather than cyan), then it indicates that MTF Mapper has deemed the edge to be of "medium" quality. This usually means that the edge orientation is poor, that is, close to one of the critical angles that could mean the displayed MTF50 value is less accurate than the ideal. The edge will also be displayed in yellow if the edge length is sub-optimal (meaning too few pixels along the edge), which also degrades the MTF accuracy. Normally the values displayed in yellow are still usable, but be careful.

Occasionally you will see some of the MTF50 labels displayed in red, like in this example:

The leftmost red label coincides with an edge with a ~25 degree orientation, which is within 2 degrees of the critical angle at 26.565 degrees

It is possible that an edge labelled in red is still usable, but there is no way to know for certain. I would rather recommend that you try to re-align your camera so that no edges end up with red labels, or that you ignore the edges with red labels for any serious analysis.

Back to the main story: clicking on an edge label pops up the MTF curve display window,

The result of a single left-button click on an MTF50 label

I hope the plot is self-explanatory. It corresponds to the MTF curve plots found in other slanted edge tools (Imatest, QuickMTF, ImageJ slanted edge plugin, etc.), with the x-axis of the plot indicating spatial frequency in cycles per pixel, and the y-axis indicating the contrast at the indicated frequency. In the top-right corner you can see a tag "MTF50=0.087" in a shade of blue that matches the plotted curve; this indicates the MTF50 value of the edge you just clicked on, as you might have expected.

The vertical gray bar is a cursor that follows the mouse, which will read off the actual contrast value corresponding to the MTF curve at the indicated spatial frequency; the read-out of this cursor is displayed just below the plot ("frequency: 0.098 contrast: 0.403" in the example). Again, the colour of the "contrast: <xyz>" readout matches that of the plotted curve.

While the MTF curve window is open, you may left-click on any other edge in the "annotated" image to replace the contents of the MTF curve window with the data corresponding to the newly selected edge. This includes clicking on edges in any other "annotated" output available in the "Data set" tree-view.

If you would like to compare the MTF curves of two edges, select the first edge as above, but hold down <shift> while left-clicking on the second edge. This adds the second edge to the plot:

Note the addition of the green curve, and the two green text labels, corresponding to the newly added edge's MTF

The read-out below the plot tracks and displays the MTF for both curves at the current spatial frequency, making it easy to read off accurate values for comparison purposes. Note that you can again select the second edge from any other "annotated" output in the "Data set" tree-view, making it easy to compare curves from different lenses or cameras.

Lastly, you can add a third curve to the plot using the same <shift>+left click method.

Adding a third curve behaves as expected

If you already have three curves plotted, the last curve is replaced by this action. If you left click on a new edge without holding down shift, the plot reverts back to displaying only a single MTF curve (using the newly selected edge's MTF).

Update (2017/04/16)

I have since added the two save buttons; grab a copy of version 0.5.19 or later to give it a try.

Future improvements

It might be useful to drop a pin, or some other visual marker onto the "annotated" image to indicate which edge was selected. It might be even more useful to show the actual ROI used by MTF Mapper, but that information is not currently available in the outputs.

Let me know if you have any other suggestions for useful features to add to the MTF curve display function.

Where?

This feature is available from MTF Mapper 0.5.18 onwards, available from SourceForge.

Focus peak measurement with MTF Mapper: Description and Validation

2017-04-04T16:14:00.000+02:00

It is a truth universally acknowledged that a single man in possession of a large aperture lens must capture images with as shallow a depth of field as he can manage. (Jane, please forgive me ...).

All kidding aside, the downside to employing a shallow depth of field is the way in which it accentuates even the smallest focus error. By focus error I mean that the apparent position of the focus plane is not where the photographer intended. And of course there is no such thing as a focus plane, since in reality it is a curved surface, but for convenience I will use the term focus plane here.

Even if we accept the convenient notion of a focus plane, we still have not really explained clearly what a focus plane is. One way of describing the focus plane would be to say that it is the distance at which the circle of confusion is minimized (as projected onto the image sensor). Personally, I am not a fan of using the circle of confusion to measure focus (or defocus, to be more precise), mostly because it is hard to measure the circle of confusion. The other difficulty with the notion of the circle of confusion is that it conjures up the image of these perfect little circular discs being formed on the image sensor, which is a rather crude simplification that does not take into account the actual point spread function (PSF) of the imaging system.

A much more convenient (to me, at least) way of defining a focus plane is to do so in terms of MTF, since this explicitly acknowledges the full PSF. This idea has been proposed recently by Jim Kasson (example from his blog, sample discussion from DPR forum), using MTF50 as the final criterion. It only takes a little bit of thought to see that circle of confusion diameter and MTF50 are both approximations of the degree of sharpness of an image; note that using MTF50 might discard some of the useful information that we could extract from the full MTF curve, but it is convenient to have only a single value to express our measure of "sharpness" (I am deliberately avoiding the term "resolution" here, since the slanted edge method measures MTF, not resolution).

We can plot the "sharpness" measure of our choice (MTF50) as a function of distance from the camera to produce a curve like this one:

Figure 1: An example of MTF50 at a fixed position on a hypothetical image sensor, plotted as a function of the distance between the sensor and the slanted edge target (it happens to be a 50 mm f/1.8 lens at f/2.8)

Figure 1 illustrates the MTF50 that we would measure as we move our slanted edge target relative to the camera. Now that we have something to visualize, it is easy to explain what a focus plane is: the hypothetical plane that is parallel to our image sensor, located at the distance that maximizes MTF50 (e.g., the peak of the curve, as indicated by the green line in Figure 1). Similarly, we could define depth of field (DOF) as the length of the interval between the two dashed gray lines, where the dashed gray lines correspond to the distances from the camera at which MTF50 = 0.15 cycles per pixel. Note that this is an arbitrary re-definition of DOF only for illustration of the concept as it applies to the curve shown in Figure 1.
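
To make these two definitions concrete, here is a small sketch that takes a sampled MTF50-versus-distance curve, picks the distance with the highest MTF50 as the focus distance, and finds the two distances where the curve crosses 0.15 cycles per pixel. The synthetic curve, its shape, and the function name are all made up for illustration; only the 0.15 threshold comes from the text above.

```python
def focus_peak_and_dof(dist, mtf50, threshold=0.15):
    """Return (peak distance, near limit, far limit) for a sampled MTF50 curve."""
    i_peak = max(range(len(mtf50)), key=lambda i: mtf50[i])

    def crossing(step):
        # Walk away from the peak while the curve stays at or above the threshold,
        # then interpolate linearly between the two bracketing samples.
        i = i_peak
        while 0 <= i + step < len(mtf50) and mtf50[i + step] >= threshold:
            i += step
        j = i + step
        if mtf50[i_peak] < threshold or not 0 <= j < len(mtf50):
            return None  # no crossing found on this side
        t = (mtf50[i] - threshold) / (mtf50[i] - mtf50[j])
        return dist[i] + t * (dist[j] - dist[i])

    return dist[i_peak], crossing(-1), crossing(+1)

# Hypothetical curve: peak MTF50 of 0.3 c/p at 2000 mm, sampled every 1 mm
dist = [1900 + i for i in range(201)]
mtf50 = [0.3 / (1.0 + ((d - 2000) / 30.0) ** 2) for d in dist]

peak, near, far = focus_peak_and_dof(dist, mtf50)
print(f"focus distance: {peak} mm, DOF: {far - near:.1f} mm ({near:.1f} to {far:.1f} mm)")
```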

In this article, I will use the terms "focus peak distance", "focus distance", and "focus plane position" interchangeably to refer to the distance corresponding to the green line in Figure 1. As you can probably deduce from the title, this article deals with the measurement of this focus distance value using MTF Mapper.

A new test chart

The "classic" MTF Mapper test charts proved to be inadequate when it came to accurate measurement of the focus peak distance. Firstly, none of the older charts provided the required density of slanted edges to obtain a robust measurement. Secondly, the older charts did not allow MTF Mapper to convert image space (pixel) coordinates to real-world coordinates (in mm). The new chart is illustrated in Figure 2:

Figure 2: The new "focus" MTF Mapper chart type

This chart is a 45-degree slanted chart design. The camera should be pointed towards the centre of the chart; the centre of the chart falls on the dashed line, halfway between the two central fiducials (the large black dots). The chart should be tilted at 45 degrees around the axis illustrated with the dashed (horizontal) line. Notice that the large black bars down the centre of the chart decrease in size towards the bottom of the chart --- that end of the chart should be closer to the camera (so that perspective ends up distorting the bars to have roughly the same size in the final image, although this is not critical). Figures 3 (a) and (b) illustrate two possible chart orientations relative to the camera.

Figure 3a: One possible set-up, with the chart tilting top-to-bottom at 45 degrees

Figure 3b: An alternative set-up, with the chart tilting left-to-right at 45 degrees

The chart does not have to be positioned in a portrait orientation; landscape orientation works just fine (compare Figure 3(a) to 3(b)). The camera can also be used in either landscape or portrait orientation, as long as you can fit most of the chart in the image. If you happen to have a sub-optimal combination of focal length, chart size and distance from the chart, then you may crop the chart a little if you must. It is important that the 45-degree tilt of the chart is around the correct axis (running through the dashed line of the chart shown in Figure 2), and that the other two axes are close to being square.

It is critical to print the chart at the correct size (without "fit to page" scaling). MTF Mapper relies on the fact that distances on the chart are correct --- note the four "+" markers near the corners of the chart, which you can use to verify that your print came out at the right scale. Of course, if you do print with some page scaling, or you print the A3 chart on an A4 page, then MTF Mapper will still work, but the distances that it reports will no longer be accurate. Lastly, note that the fiducials are coded to allow MTF Mapper to identify the correct chart size, so if you pay close attention, you will see the differently sized charts are not just scaled copies of a base chart size.

This is a manual-focus chart only. The camera should preferably be focused (manually) on the centre of the chart, i.e., roughly at the point halfway between the two central fiducials (black dots). The chart features around this point are not suitable for auto-focus use because there is no way to tell what part of the chart (in the general region around the centre) the camera chooses to focus on. Just to clarify: This chart should not be used to perform PDAF micro-adjust / fine tuning with.

A minor digression: Although this chart is not suitable for use with auto-focus, nothing prevents you from using a removable overlay target. You could, for example, use a second printed page containing only a large black rectangle (like the central rectangle in the MTF Mapper "perspective" chart type) to perform the auto-focus operation. Lock the focus, remove the overlay, and capture the image of this new chart. If you plan ahead, you could use fridge magnets to make the process of adding/removing the auto-focus overlay target more convenient. Or you could wait for the eventual release of a new auto-focus MTF Mapper chart I plan on introducing.

A new output type

To process images of the new "focus" chart just introduced, a new output type has been added to MTF Mapper. As of MTF Mapper version 0.5.16, this output type is not compatible with other typical output types produced by MTF Mapper, i.e., when you choose the "focus position" output type, then you should not enable any other output types (they will not produce usable output). This is a temporary inconvenience, and I aim to fix this sometime. The corresponding command-line switch for this new output type is "--focus"; it produces a file called "focus_peak.png".

An example of this output type is illustrated in Figure 4:

Figure 4: An example of the "--focus" MTF Mapper output. Note that the chart was oriented as shown in Figure 3b

The curve illustrated in Figure 1 is overlaid on top of the captured image by back-projecting the curve onto the image using the estimated camera perspective transformation, to produce the green curve in Figure 4. The "height" of this curve is simply scaled to fill the image of the chart, so the peak of the curve will always be on the midline of the black slanted edge bars.

The dark blue line illustrates the intersection of the hypothetical focus plane with the surface of the chart. In the centre of the image illustrated in Figure 4 we see an orange-ish coordinate origin marker (four outwards pointing arrows), representing the physical centre of the chart. The red reticule (with its four inwards pointing arrows) indicates the centre of the captured image; together these two features provide feedback for centering the camera to the chart.

Right under the peak of the green curve we see two values reported in cyan-coloured text. The first line is the MTF50 value measured at the peak, and the second is the focus plane position relative to the centre of the chart. In other words, MTF Mapper subtracts the estimated position (distance from the camera) of the centre of the chart from the focus peak distance to compute the value displayed in this second line below the green curve.

Lastly, it may be worth reading my article on chart orientation estimation, since the underlying method of extracting the camera pose parameters is the same one used by the "--chart-orientation" output mode. If you are using the command line version of MTF Mapper, take note that you may have to specify the focal ratio of your camera + lens combination in order for the camera pose parameters to be correct. For example, a 105 mm lens mounted on a Nikon APS-C body (23.6 mm sensor width) would require the command "--focal-ratio 4.45" to improve the accuracy of the "Estimated chart distance" value reported at the bottom of the "focus_peak.png" output image. The value 4.45 is derived from (lens focal length)/(sensor width), i.e., 105/23.6 ~ 4.45. Because the "focus peak depth" value reported in the output (the -24.7 mm in Figure 4) is a relative measurement, it is expected that an incorrect "Estimated chart distance" value will not have a large impact, but more testing has to be performed to confirm this.
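
The focal ratio arithmetic is easy to reproduce. The helper below is just a convenience sketch (the function name is my own, not part of MTF Mapper); it computes the value to pass to the "--focal-ratio" option from the lens focal length and the sensor width.

```python
def focal_ratio(focal_length_mm, sensor_width_mm):
    """Focal ratio as used above: (lens focal length) / (sensor width)."""
    return focal_length_mm / sensor_width_mm

# 105 mm lens on a Nikon APS-C body with a 23.6 mm wide sensor
ratio = focal_ratio(105.0, 23.6)
print(f"--focal-ratio {ratio:.2f}")
```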

The principle

The curve presented in Figure 1 seems to imply that we have a single slanted edge that we measure as we move it away from the camera, starting at a distance closer than the focus plane distance. This is an entirely valid way of obtaining the measurements required to produce Figure 1, and Jim Kasson has done exactly that. It does require a good linear rail, preferably a computer controlled one to automate the capture of a large number of images from our desired range of distances (from the camera).

We can obtain a fairly decent approximation if we use a 45-degree chart with a large number of slanted edges. The tilt in the chart naturally ensures that these slanted edges appear at different distances from the camera. All that MTF Mapper has to do is extract slanted edge MTF values, and reconstruct the MTF50 vs distance curve.

That sounds straightforward, but there is one fairly large caveat: if our edge is slanted (as required for the slanted edge method to work), then that edge will pass through a range of distances, e.g., the starting tip of the edge is closer to the camera, and the endpoint of the edge is further from the camera. If the MTF50 value varies as a function of distance (as illustrated in Figure 1), then strictly speaking the PSF of the image formation process also varies along this edge. This violates the central assumption of the slanted edge method, which implicitly assumes that we can measure the MTF at a single location in the field by examining a small region around that location. In practice, the MTF we measure with the slanted edge method is a blend of the MTFs at the various distances the edge passes through.

There is not much we can do about it, but it helps to oversample. Each of the long edges of the slanted edge bars in the "focus" test chart (see Figure 2) is processed with a sliding window that uses only a small section of the edge to apply the slanted edge method to. This approach increases our sampling density whilst minimising the range of depth values over which each slanted edge MTF calculation is performed, i.e., we only assume that the true MTF is constant over a very small section of the edge. It is a well-known fact that the slanted edge method produces estimates with a smaller standard deviation if the length of the edge that it is applied to is increased (and the true MTF remains constant); conversely, we expect that each of our individual slanted edge measurements performed with the sliding window method will result in a large standard deviation in the estimated MTF50 value. Fortunately, we can safely assume that our desired MTF50 vs distance curve must be smooth, thus we can fit a smooth model to our multiple noise-contaminated MTF50 measurements. It turns out that a rational polynomial function of order (4, 2) seems to fit rather nicely in all the cases I have examined so far, so that is what MTF Mapper uses internally.
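
To give a feel for the model-fitting step, here is a minimal sketch of fitting a rational polynomial of order (4, 2) to noisy MTF50 samples and reading off the peak. This is not MTF Mapper's internal code: it uses a simple linearised least-squares formulation (multiply through by the denominator and solve the normal equations), and the synthetic curve, noise level and function names are all assumptions for illustration.

```python
import random

def solve_linear(a, b):
    """Solve a*x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_rational_42(xs, ys):
    """Fit y ~ P(u)/Q(u) with deg P = 4, deg Q = 2, Q(0) = 1, u = x scaled to [-1, 1]."""
    lo, hi = min(xs), max(xs)
    centre, scale = 0.5 * (lo + hi), 0.5 * (hi - lo)
    rows = []
    for x, y in zip(xs, ys):
        u = (x - centre) / scale
        # y*Q(u) = P(u)  =>  y = a0 + a1*u + ... + a4*u^4 - b1*u*y - b2*u^2*y
        rows.append([1.0, u, u**2, u**3, u**4, -u * y, -u * u * y])
    n = 7
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve_linear(ata, aty), centre, scale

def eval_rational_42(model, x):
    c, centre, scale = model
    u = (x - centre) / scale
    num = c[0] + c[1] * u + c[2] * u**2 + c[3] * u**3 + c[4] * u**4
    den = 1.0 + c[5] * u + c[6] * u**2
    return num / den

def peak_position(model, lo, hi, steps=2000):
    """Locate the maximum of the fitted curve on a dense grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: eval_rational_42(model, x))

def true_curve(d):
    # Hypothetical MTF50 vs distance curve with its peak 5 mm behind the chart centre
    return 0.3 / (1.0 + ((d - 5.0) / 20.0) ** 2)

rng = random.Random(1)
dist = [-50.0 + 0.5 * i for i in range(201)]
noisy = [true_curve(d) + rng.gauss(0.0, 0.002) for d in dist]
model = fit_rational_42(dist, noisy)
print(f"estimated focus peak at {peak_position(model, -50.0, 50.0):.2f} mm")
```

The linearised formulation is convenient because it needs only a single linear solve, but it is slightly biased in the presence of noise; a nonlinear least-squares refinement would be the obvious next step if more accuracy were needed.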

This strategy violates any number of model-fitting assumptions (e.g., my noisy samples are bound to be correlated, and the noise might be correlated too), but it seems to work in practice.

One last observation: Why are the slanted edge bars oriented so that they run left-to-right if the chart is tilted at 45-degrees top-to-bottom (assuming portrait orientation, as shown in Figure 2)? What would happen if we had only a single edge running top-to-bottom, and we applied the sliding window approach to that edge? It turns out that this top-to-bottom method is viable, but because each short edge segment passes through a larger range of distance (from the camera) values, compared to the left-to-right edges, the sensitivity of the detection of the peak of the MTF50 vs distance curve is compromised. So it works, just not as well as the edge orientation of Figure 2.

Accuracy assessment: set-up

So does it work? This turns out to be a fairly hard question to answer. One approach would be to validate the single-image-45-degree-chart method against a computer-controlled focusing rail (i.e., physically moving the edge like Jim Kasson does), but I do not have one of those handy.

I settled on a rather different approach that relies on the observation that a lens fitted on an extension tube can no longer focus at infinity. From what I could gather, a lens set to focus at infinity will focus at a distance d = f*(f/e + 2) + e, where f is the focal length, and e is the extension length (update: see Appendix A below for a discussion of this formula). I happen to have a Micro-Nikkor 105 mm f/4 Ai lens with a hard stop at infinity. After a bit of iterative experimentation (translation: building something, then going back to the drawing board, then salvaging the hardware) I found that an extension of about 6.4 mm will cause the 105 mm lens to focus at a distance of about 1939 mm. At this distance, the lens covers an object size just a tad smaller than an A3 test chart.
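
The formula is simple enough to check numerically. The sketch below evaluates d = f*(f/e + 2) + e for the 105 mm lens with a 6.4 mm extension; the function name is my own, and all lengths are in millimetres.

```python
def conjugate_distance(f_mm, e_mm):
    """Focus distance d = f*(f/e + 2) + e for a lens set to infinity on an extension e."""
    return f_mm * (f_mm / e_mm + 2.0) + e_mm

d = conjugate_distance(105.0, 6.4)
print(f"focus distance with 6.4 mm extension: {d:.1f} mm")
```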

Of course, there are some practical problems. Firstly, it is rather difficult to measure a distance of 1939 mm with good accuracy using my available tools. More importantly, I am not quite sure where to measure this distance from (update: As mentioned in Appendix A, this is the total lens conjugate distance, i.e., the distance between the image plane and the focus plane. Since I do not know the principal plane separation distance of my lens, I still cannot use the total lens conjugate distance directly). Even if I could solve the measurement problem, I would still only end up with a single measurement, and no experimental variables to vary.

My solution was to build a variable-length extension tube. The idea was that I could preset the effective length of the extension tube with good accuracy if I used shim stock (or feeler gauges) --- all I had to do is build an extension tube from scratch, since none of the commercially available ones appear to go below 8 mm. Here is a photo of my custom extension tube mounted between my D7000 and the 105 mm Nikkor lens:

Figure 5: The bronze-coloured ring is part of my extension tube

Here is what the front of my extension tube looks like with the lens removed:

Figure 6: extension tube with the lens removed

As you can see from Figure 6, it was a bit of a tight fit to build an adaptor that was wide enough to allow adjustment without removing the lens, but still small enough to physically fit below the prism housing.

Here is what the extension tube looks like with some shims installed:

Figure 7: extension tube with some shims installed, front view

Figure 8: extension tube with some shims installed, rear view

As can be seen in Figure 7, the front part of the extension tube comprises two parts: the front flange (visible as the large ring with the black pen markings), and a Nikon F-mount female bayonet mount. The inner four screws fix the female mount to the outer flange.

The rear flange has an integrated Nikon F-mount male bayonet. I discovered that the male bayonet is a lot easier to manufacture than the female F-mount bayonet --- that probably explains why Nikon sells them :) Figure 8 also shows how the shims are installed between the front and rear flanges. Careful lapping of the flanges (and a bit of shimming with aluminium foil) ensured that the front surface of the female bayonet mount was parallel to the rear surface of the male bayonet mount to within 5 micron.

The front of the rear flange looks like this when we open up the extension tube:

Figure 9: front face of rear flange seen in the foreground

And lastly, we can see the rear face of the front flange:

Figure 10: rear face of front flange.

Notice the dowel pins in Figure 10: these acted as the registration mechanism so that the flanges always line up correctly without any rotation or tilt.

The length of the extension tube can be adjusted by installing three shims between the front and rear flanges. Measurements with a micrometer show that the repeatability of this process was around 5 micron. I also learned that bargain-store feeler gauges are not necessarily manufactured down to the tolerances that this experiment demanded --- I found some evidence that the feeler gauge thickness varied a little bit across their surfaces. I compensated as much as possible by labeling the position at which a particular shim should be installed, and I measured the effective extension tube length rather than relying on the nominal shim thickness.

I ended up with the following (effective) shim thicknesses: 130, 94, 72, 58, and 45 micron.

Accuracy assessment: the results

The basic experiment involves setting up the chart so that the apparent focus plane position was just slightly in front of the chart center when the 130 micron shims were installed. For each shim set, I then captured 10 images to yield 50 images in total. I repeated the whole experiment a second time to check for repeatability. Figure 11 presents the resulting box-and-whisker plot.

Figure 11: Focus position shift measured by MTF Mapper as a function of shim thickness

The y-axis denotes the "focus peak depth" value reported by MTF Mapper; all the values are positive indicating that the measured focus plane position was slightly in front of the chart centre in all cases.

Other than the large variability (across 10 images) of Set A with 45 micron shims, it would appear that the individual measurements were quite robust. Typical standard deviation within a particular batch of 10 images was below 0.3 mm.

Using the formula presented above, we can compute the expected focus plane position for each of the shim sets; however, we still have no idea how to measure these absolute distances (and the principal plane separation distance of the lens is unknown). Instead, we can subtract the focus distance obtained from the formula with the 45 micron shim; doing the same for the focus peak depth values reported by MTF Mapper allows us to perform a relative comparison. The results are presented in Figure 12:

Figure 12: Summary of results. All values reported in millimeters

The second column contains the "focus peak depth" value reported by MTF Mapper. The second last column lists the relative focus peak depth value; these values should be compared to the relative values derived from the formula appearing in the last column.

Overall we see a reasonable agreement between the relative values derived from the formula, and the relative values as measured by MTF Mapper. There are some outliers (set B, 58 micron shim), but the differences between expected and measured values are typically below 1 mm. Keep in mind that a 5 micron change in shim thickness produces a change of 1.3 mm in focus plane position using the formula, i.e., the measured values appear to be within the mechanical repeatability of the shimming process itself.
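
The 1.3 mm figure can be reproduced with a quick calculation. The sketch below assumes a nominal total extension of 6.4 mm (the approximate value quoted earlier; the exact effective extension for each shim set is not listed here) and evaluates the formula before and after a 5 micron increase in extension. The function and variable names are my own.

```python
def conjugate_distance(f_mm, e_mm):
    """Focus distance d = f*(f/e + 2) + e, all lengths in mm."""
    return f_mm * (f_mm / e_mm + 2.0) + e_mm

f, e = 105.0, 6.4          # focal length and assumed nominal extension (mm)
delta = 0.005              # 5 micron change in shim thickness (mm)

shift = conjugate_distance(f, e) - conjugate_distance(f, e + delta)
slope = 1.0 - (f / e) ** 2  # analytic derivative dd/de, in mm per mm

print(f"focus plane shift for a 5 micron change: {shift:.2f} mm")
print(f"sensitivity: {slope:.1f} mm of focus shift per mm of extension")
```

The sensitivity of roughly 268 mm per mm of extension is what makes micron-level shim changes visible as millimetre-level focus shifts at this distance.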

The smallest change in shim thickness tested here was 13 micron (45 micron shim set swapped out with 58 micron shim set), followed closely by the 72 vs 58 micron shim combination with a difference of 14 micron. In both those cases it is clear (see Figure 11) that, using the "focus peak depth" values reported by MTF Mapper, one can easily discern the change in focus plane position induced by a 13 micron change in shim thickness.

Why would we want to do this? One application would be the calibration of a camera system where we have to shim the flange distance (distance from sensor to lens mounting flange front surface) to ensure that the image formed on the sensor is in focus when a reference lens is mounted. This is particularly useful for systems with hard infinity focus stops.

Of course, one would have to consider things like actual image magnification relative to sensor resolution when considering this "smallest discernible change in flange distance" measurement, because MTF Mapper performs the analysis of images at the pixel level. More testing!

References

[Burke2012]: Burke, Michael W, Image acquisition: handbook of machine vision engineering, Springer Science & Business Media, 2012.

Appendix A

The formula used to calculate the focus distance of the lens with focal length f and extension e is d = f*(f/e + 2) + e. This formula is taken from [Burke2012, p311], where d is stated to be the total lens conjugate distance. The total lens conjugate distance is the sum of the object-to-lens-centre and image-to-lens-centre distances when looking at the thin lens model. Burke notes that the derivation of this equation depends on the lens being symmetric, which allows us to assume that d = d_o + 2f + d_i, where d_o is the object-to-focal-point distance, and d_i is the image-to-focal-point distance.

I strongly doubt that my Micro-Nikkor 105 mm f/4 Ai is really a symmetric lens, so I just assume that this formula still gives reasonable results. Burke's formula only applies to a thin lens, which I am fairly certain my lens is not (being a compound lens). The implication of this, from my understanding, is that there is an additional distance d_p that separates the two principal planes which must be added to the total lens conjugate distance, which implies that d = d_o + 2f + d_i + d_p. Using my convention above, where d_i is called e (denoting extension), we see that the thick lens version of this equation should be d = f*(f/e + 2) + e + d_p.

Unfortunately I have no idea what the value of d_p would be for my lens. Serendipitously, I only end up using the difference between d values computed using different values of e, meaning that the subtraction removes the d_p term from the difference, so I can get away with using the thin lens version of the formula.
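
Written out, the cancellation looks like this (using the same symbols as above, with e_1 and e_2 denoting two different extension values; these two subscripted names are introduced here only for the derivation):

```latex
\begin{align*}
d(e) &= f\left(\frac{f}{e} + 2\right) + e + d_p \\
d(e_1) - d(e_2) &= \left(\frac{f^2}{e_1} + 2f + e_1 + d_p\right)
                 - \left(\frac{f^2}{e_2} + 2f + e_2 + d_p\right) \\
                &= f^2\left(\frac{1}{e_1} - \frac{1}{e_2}\right) + (e_1 - e_2)
\end{align*}
```

Both the 2f term and the unknown d_p term drop out of the difference, which is why only relative focus shifts are compared in this article.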

Automatic chart orientation estimation: validation experiment

2017-02-10T16:41:00.002+02:00

In my previous post I mentioned that it is rather important to ensure that your MTF Mapper test chart is parallel to your sensor (or that the chart is perpendicular to the camera's optical axis, which is almost the same thing) to ensure that you do not confuse chart misalignment with a tilted lens element. I have added the functionality to automatically estimate the orientation of the MTF Mapper test chart relative to the camera using circular fiducials embedded in the test chart. Here is an early sample of the output, which nicely demonstrates what I am talking about:

Figure 1: Sample output of chart orientation estimation

Figure 1 shows an example of the MTF Mapper "lensprofile" chart type, with the new embedded circular fiducials (they are a bit like 2D circular bar codes). Notice that the actual photo of the chart is rendered in black-and-white; everything that appears in colour was drawn in by MTF Mapper.
There is an orange plus-shaped coordinate origin marker (in the centre of the chart), as well as a reticle (the red circle with the four triangles) to indicate where the camera is aimed at. Lastly, we have the three orientation indicators in red, green and blue, showing us the three Tait-Bryan angles: Roll, Pitch and Yaw.

But how do I know that the angles reported by MTF Mapper are accurate?

The set-up

I do not have access to any actual optics lab hardware, but I do have some machinist tools. Fortunately, being able to ensure that things are flat, parallel or perpendicular is a fairly important part of machining, so this might just work. First I have to ensure that I have a sturdy device for mounting my camera; in Figure 2 you can see the hefty steel block that serves as the base of my camera mount.

Figure 2: Overview of my set-up

I machined the steel block on a lathe to produce a "true" block, meaning that the two large faces of the large shiny steel block are parallel, and that those two large faces are also perpendicular to the rear face on which the steel block is standing in the photo. The large black block in Figure 2 is a granite surface plate; this one is flat to something ridiculous like 3.5 micron maximum deviation over its entire surface. The instrument with the clock face is a dial test indicator; this one has a resolution of 2 micron per division. It is used to accurately measure small relative displacements through the pivoting action of the lever you can see in contact with the lens mount flange of the camera body.

Using this dial test indicator, surface plate and surface gauge, I first checked that the two large faces of the steel block were parallel: they were parallel to within about 4 micron. Next, I stood up the block on its rear face (bottom face in Figure 2), and measured the perpendicularity. The description of that method is a bit outside of the scope of this post, but the answer is what matters: near the top of the steel block the deviation from perpendicularity was also about 4 micron. The result of all this fussing with parallelism and perpendicularity is that I know (because I measured it) that my camera mounting block can be flipped through 90 degrees by either placing it on the large face with the camera pointing horizontally, or stood up with the camera pointing to the ceiling.

That was the easiest part of the job. Now I had to align my camera mount so that the actual mounting flange was parallel to the granite surface plate.

Figure 3: Still busy tweaking the mounting flange parallel to the surface plate

The idea is that you keep on adjusting the camera (bumping it with the tripod screw partially tightened, or adding shims) until the dial test indicator reads almost zero at four points, as illustrated between Figures 2 and 3. Eventually I got it parallel to the surface plate to within 10 micron, and called it good.

This means that when I flip the steel block into its horizontal position (see Figure 4) the lens mount flange is perpendicular to the surface plate with a reasonably high degree of accuracy. Eventually, I will arrange my test chart in a similar fashion, but bear with me while I go through the process.

Figure 4: Using a precision level to ensure my two reference surfaces are parallel

In Figure 4 you can see more of my set-up. The camera is close to its final position, and you can see a precision level placed on the granite surface plate just in front of the camera itself. That spirit level measures down to a one-division movement of the bubble for each 20 micron height change at a distance of one metre, or 0.0011459 decimal degrees if you prefer. I leveled the granite surface plate in both directions. Next, I placed a rotary table about 1 metre from the camera --- you can see it to the left in Figure 4. The rotary table is fairly heavy (always a good thing), quite flat, and will later be used to rotate the test chart. The rotary table was shimmed until it too was level in both directions.
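
The conversion from the level's sensitivity to degrees is a one-line calculation, sketched here for reference (the variable names are my own):

```python
import math

height_change_m = 20e-6   # 20 micron height change per division
baseline_m = 1.0          # measured over a distance of one metre

angle_deg = math.degrees(math.atan2(height_change_m, baseline_m))
print(f"one division corresponds to {angle_deg:.7f} degrees")
```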

The logic is as follows: I cannot directly measure if the rotary table's surface is parallel with the granite surface plate, but I can ensure that both of them are level, which is going to ensure that their surfaces are parallel to within the tolerances that I am working to here. This means that I know that my camera lens mount is perpendicular to the rotary table's surface. All I now have to do is place my test chart so that it is perpendicular to the rotary table's surface, and I can be certain that my test chart is parallel to my camera's mounting flange. I aligned and shimmed my test chart until it was perpendicular to the rotary table top, using a precision square, resulting in the set-up shown in Figure 5.

Figure 5: overview of the final set-up. Note the obvious change in colour temperature relative to Figure 4. Yes, it took that long to get two surfaces shimmed level.

One tiny little detail (or make that two)

Astute readers may have picked up on two important details:

I am assuming that my camera's lens mounting flange is parallel to the sensor. In theory, I could stick the dial test indicator into the camera and drag the stylus over the sensor itself to check, but I do actually use my camera to take photographs occasionally, so no sense in ruining it just yet. Not even in the name of science.
The entire process above only ensures that I have two planes (the test chart, and the camera's sensor) standing perpendicularly on a common plane. From the camera's point of view, this means there is no up/down tilt, but there may be any amount of left/right tilt between the sensor and the chart. This is not the end of the world, since my initial test will only involve the measurement of pitch (as illustrated in Figure 1).

The first measurements

Note: Results updated on 13/02/2017 to reflect improvements in MTF Mapper code. New results are a bit more robust, i.e., lower standard deviations.

From the set-up above, I know that my expected pitch angle should be zero. Or at least small. MTF Mapper appears to agree: the first measurement yielded a pitch angle of -0.163148 degrees, which is promising. Of course, if your software gives you the expected answer on the first try, you may not be quite done yet. More testing!

I decided to shim the base of the plywood board that the test chart was mounted on. The board is 20 mm thick, so the 180 micron shim (0.18 mm) that I happened to have handy should give me a tilt of about 0.52 degrees. I also had a 350 micron (0.35 mm) shim nearby, which yields a 1 degree tilt. That gives me three test cases (~zero degrees, ~zero degrees plus 0.52 degree relative tilt, and ~zero degrees plus 1 degree relative tilt). I captured 10 shots at each setting, which produced the following results:

Expected = 0 degrees. Measurements ranged from -0.163 degrees to -0.153 degrees, for a mean measurement of -0.1597 degrees and a standard deviation of 0.00286 degrees.
Expected = 0.52 degrees. Measurements ranged from 0.377 to 0.394 degrees, for a mean measurement of 0.3910 degrees with a standard deviation of 0.00509 degrees. Given that our zero measurement started at -0.16 degrees, the relative angle between the two test cases comes down to 0.5507 degrees (compared to the expected 0.52 degrees).
Expected = 1.00 degrees. Measurements ranged from 0.814 to 0.828, for a mean measurement of 0.8210 degrees with a standard deviation of 0.00423 degrees. The tilt relative to the starting point is 0.9806 degrees (compared to the expected 1.00 degrees).
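The expected tilts and the relative angles quoted above follow from simple trigonometry. A minimal sketch, assuming the shim lifts one edge of the 20 mm base so that the tilt is atan(shim / base); the function names are hypothetical and not part of MTF Mapper:

```cpp
#include <cmath>

// Tilt (in degrees) produced by raising one edge of a base of the given
// depth by the shim thickness. Both lengths must use the same unit.
double shim_tilt_degrees(double shim_thickness, double base_depth) {
    const double pi = std::acos(-1.0);
    return std::atan(shim_thickness / base_depth) * 180.0 / pi;
}

// Measured tilt relative to the mean of the "zero" reference measurements.
double relative_tilt(double measured_mean, double zero_mean) {
    return measured_mean - zero_mean;
}
```

With these, shim_tilt_degrees(0.18, 20.0) comes out at roughly 0.52 degrees and shim_tilt_degrees(0.35, 20.0) at roughly 1.00 degrees, while relative_tilt(0.3910, -0.1597) reproduces the 0.5507 degrees quoted for the second test case.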

I am calling that good enough for government work. It seems that there may have been a small residual error in my set-up, leading to the initial "zero" measurement coming in at -0.16 degrees instead, or perhaps there is another source of bias that I have not considered.

Compound angles

Having established that the pitch angle measurement appears to be fairly close to the expected absolute angle, I set out to test the relative accuracy of yaw angle measurements. Since my set-up above does not establish an absolute zero for the yaw angle, I cheated a bit: I used MTF Mapper to bring the yaw angle close to zero by nudging the chart a bit, so I started from an estimated yaw angle of 0.67 degrees. At this setting, I zeroed my rotary table, which as you can see from Figure 5 above, will rotate the test chart approximately around the vertical (y) axis to produce a desired (relative) yaw angle. At this point I got a bit lazy, and only captured 5 shots per setting, but I did rotate the chart to produce the sequence of relative yaw rotations in 0.5 degree increments. The mean values measured over each set of 5 shots were 0.673, 1.189, 1.685, 2.211, 2.717, and 3.157. If we subtract the initial 0.67 degrees (which represents our zero for relative measurements), then we get 0.000, 0.5165, 1.012, 1.538, 2.044, and 2.484, which seem pretty close to the expected multiples of 0.5.

In the final position, I introduced the 0.18 mm shim to produce a pitch angle of 0.5 degrees. Over 5 shots a mean yaw angle of 3.132 degrees was measured (or 2.459 if we subtract our zero angle of 0.67). I should have captured a few more shots, since at such small sample sizes it is hard to tell if the added yaw angle has changed the pitch angle, or not. It is entirely possible that I moved the chart while inserting the shim. That is what you get with a shoddy experimental procedure, I guess. Next time I will have to machine a more positive mechanism for adjusting the chart position.

Discussion

Note that MTF Mapper could only extract the chart orientation correctly if I provided the focal length of the lens explicitly. My previous post demonstrated why it appears to be impossible to estimate the focal length automatically when the test chart is so close to being parallel with the sensor. This is unfortunate, because it means that there is no way that MTF Mapper can estimate the chart orientation completely automatically --- some user-provided input is required.

The good news is that it seems that MTF Mapper can actually estimate the chart orientation with sufficient accuracy to aid the alignment of the test chart. Both repeatability (worst-case spread) and relative error appear to be better than 0.05 degrees, or about three minutes of arc, which compares favourably with the claimed accuracy of Hasselblad's linear mirror unit. Keep in mind that I tested under reasonably good conditions (ISO 100, 1/200 s shutter speed, f/2.8), so my accuracy figures do not represent the worst-case scenario. Lastly, because of the limitations of my set-up, my absolute error was around 0.16 degrees, or 10 minutes of arc; it is possible that actual accuracy was better than this.

How does this angular accuracy relate to the DOF of the set-up? To put some numbers up: I used a 50 mm lens on an APS-C size sensor at a focus distance of about 1 metre. If we take the above results, and simplify them to say that MTF Mapper can probably get us to within 0.1 degrees under these conditions, then we can calculate the depth error at the extreme edges of the test chart. I used an A3 chart, so our chart width is 420 mm. If the chart has a yaw angle of 0.1 degrees (and we are shooting for 0 degrees), then the right edge of our chart will be 0.37 mm further away than expected, or our total depth error from the left edge of the chart to the right edge will be twice that, about 0.73 mm. If I run the numbers through vwdof.exe, the "critical" DOF criterion (CoC of 0.01 mm) yields a DOF of 8.95 mm. So our total depth error will be around 8% of our DOF. Will that be enough to cause us to think our lens is tilted when we look at a full-field MTF map?
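The depth error arithmetic above can be sketched as follows, assuming the chart rotates about its vertical centre line so that each edge moves by half the chart width times sin(yaw). The DOF value itself is taken as an input (8.95 mm from vwdof.exe in the text); the function names are hypothetical:

```cpp
#include <cmath>

// Depth offset (same unit as chart_width) between the two extreme edges of a
// chart that is rotated by yaw_degrees about its vertical centre line.
double total_depth_error(double chart_width, double yaw_degrees) {
    const double pi = std::acos(-1.0);
    const double half_width = 0.5 * chart_width;
    // Each edge moves by half_width * sin(yaw), in opposite directions.
    return 2.0 * half_width * std::sin(yaw_degrees * pi / 180.0);
}

// Depth error expressed as a percentage of the depth of field.
double percent_of_dof(double depth_error, double dof) {
    return 100.0 * depth_error / dof;
}
```

For a 420 mm wide chart at 0.1 degrees, total_depth_error returns about 0.73 mm, and percent_of_dof of that value against 8.95 mm gives roughly 8%.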

Only one way to find out. More testing!

Limitations of using single-shot planar targets to perform automatic camera calibration

2017-02-08T11:21:00.000+02:00

When you are trying to measure the performance of your system across the entire field, it is rather important to ensure that your test chart is parallel to your sensor. If you are not careful, then a slight tilt in your test chart could look very much like a tilted lens element if you are looking at the MTF values, i.e., two opposite corners of your MTF image would appear to be soft: is your lens tilted along the diagonal, or is the chart tilted along the same diagonal?

My solution to this problem is to directly estimate the camera pose from the MTF test chart. I have embedded fiducial markers in the latest MTF Mapper test charts which will allow me to measure the angle between your sensor and your test chart. This post details a particular difficulty I encountered while implementing the camera pose estimation method as part of MTF Mapper.

The classical approach

Classical planar calibration target methods like Tsai [Tsai1987] or Zhang [Zhang2000] prescribe that you capture several images of your planar calibration target, while ensuring that there is sufficient translation and rotation between the individually captured images. From each of the images you can extract a set of correspondences, e.g., the location of a prominent image feature (corner of a square, for example) and the corresponding real-world coordinates of that feature.

This sounds tricky, until you realize that you are allowed to express the real-world coordinates in a special coordinate system attached to your planar calibration target. This implies that you can put all the reference features at z=0 in your world coordinate system (their other two coordinates are known through measurement with a ruler, for example), meaning that even if you moved the calibration object (rather than the camera) to capture your multiple calibration images, the model assumes that the calibration object was fixed and the camera moved around it.

A set of four such correspondences is sufficient to estimate a 3x3 homography matrix up to a scale factor, since four correspondences yield 8 equations to solve for the 8 free parameters of the matrix. A homography is a linear transformation that can map one plane onto another, such as mapping our planar calibration target onto the image sensor. For each of our captured calibration images we can solve these equations to obtain a different homography matrix. The key insight is that this homography matrix can be decomposed to separate the intrinsic camera parameters from the extrinsic camera parameters. We can use a top-down approach to understand how the homography matrix is composed.
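To make the "8 equations for 8 free parameters" concrete, here is a minimal sketch of the four-point case. It fixes the scale by setting h33 = 1 (an assumption that breaks down if the true h33 is zero) and solves the resulting 8x8 linear system with Gauss-Jordan elimination. This is only an illustration, not the method used by MTF Mapper or OpenCV, and the names Point2 and homography_from_four are hypothetical:

```cpp
#include <array>
#include <cmath>
#include <utility>

struct Point2 { double x, y; };

// Estimate a homography from exactly four correspondences world[i] -> image[i],
// with h33 fixed to 1. The result is row-major: {h11, h12, h13, h21, ..., h33}.
// Returns false if the 8x8 system is singular (e.g. three collinear points).
bool homography_from_four(const std::array<Point2, 4>& world,
                          const std::array<Point2, 4>& image,
                          std::array<double, 9>& H) {
    double A[8][9] = {};  // augmented matrix [A | b], zero-initialized
    for (int i = 0; i < 4; ++i) {
        const double X = world[i].x, Y = world[i].y;
        const double u = image[i].x, v = image[i].y;
        double* r0 = A[2 * i];      // equation for the u coordinate
        double* r1 = A[2 * i + 1];  // equation for the v coordinate
        r0[0] = X; r0[1] = Y; r0[2] = 1.0; r0[6] = -u * X; r0[7] = -u * Y; r0[8] = u;
        r1[3] = X; r1[4] = Y; r1[5] = 1.0; r1[6] = -v * X; r1[7] = -v * Y; r1[8] = v;
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int row = col + 1; row < 8; ++row)
            if (std::fabs(A[row][col]) > std::fabs(A[pivot][col])) pivot = row;
        if (std::fabs(A[pivot][col]) < 1e-12) return false;
        for (int k = 0; k < 9; ++k) std::swap(A[col][k], A[pivot][k]);
        for (int row = 0; row < 8; ++row) {
            if (row == col) continue;
            const double factor = A[row][col] / A[col][col];
            for (int k = col; k < 9; ++k) A[row][k] -= factor * A[col][k];
        }
    }
    for (int i = 0; i < 8; ++i) H[i] = A[i][8] / A[i][i];
    H[8] = 1.0;
    return true;
}
```

With more than four correspondences the system becomes over-determined and would normally be solved in a least-squares sense instead.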

To keep things a bit simpler, we can assume that the principal point of the system is fixed at the centre of the captured image. We can thus normalize our image coordinates so that the principal point maps to (0,0) in normalized image coordinates, and while we are at it we can divide the result by the width of the image so that x coordinates run from -0.5 to 0.5 in normalized image coordinates. This centering and rescaling generally improves the numerical stability of the camera parameter estimation process. This gives us the intrinsic camera matrix K, such that

$$K = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

where f denotes the focal length of the camera. Note that I am forcing square pixels without skew. This appears to be a reasonable starting point for interchangeable lens cameras. We can combine the intrinsic camera parameters and the extrinsic camera parameters into a single 3x4 matrix P, such that

$$P = K \begin{bmatrix} R & t \end{bmatrix}$$

where the 3x3 matrix R represents a rotation matrix, and the vector t represents a translation vector. The extrinsic camera parameters R and t are often referred to as the camera pose, and represent the transformation required to transform from world coordinates (i.e., our calibration target local coordinates) to homogeneous camera coordinates. If we have multiple calibration images, then we obtain a different R and t for each image, but the intrinsic camera matrix K must be common to all views of the chart.

The process of estimating K and the set of R_i and t_i over all the images i is called bundle adjustment [Triggs1999]. Typically we will use all the available point correspondences (hopefully more than four) from each view to minimize the backprojection error, i.e., we take our known chart-local world coordinates from each correspondence, transform them with the appropriate P matrix, divide by the third (z) coordinate to convert homogeneous coordinates to normalized image coordinates, and calculate the Euclidean distance between this back-projected image point and the measured image coordinates (e.g., output of a corner-finding algorithm) of the corresponding point in the captured image. The usual recommendation is to use a Levenberg-Marquardt algorithm to solve this non-linear optimization problem to minimize the sum of the squared backprojection errors.

Strictly speaking, we usually include a radial distortion coefficient or two in the camera model to arrive at a more realistic camera model than the pinhole model presented here, but I am going to ignore radial distortion here to simplify the discussion.

Single-view calibration using a planar target

From the definition of the camera matrix P above we can see that even if we only have a single view of the planar calibration target, we can still estimate both our intrinsic and extrinsic camera parameters using the usual bundle adjustment algorithms. Zhang observed that when a planar calibration target is employed, we can estimate a 3x3 homography matrix H such that

$$H = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}$$

where the vectors r₁ and r₂ define the first two basis vectors of the world coordinate frame in camera coordinates, and t is a translation vector. Since we require r₁ and r₂ to be orthonormal, the third basis vector of the world coordinate frame is just the cross product of r₁ and r₂. This little detail explains how the 8 free parameters of the homography H are able to represent all the required degrees of freedom we expect in our full camera matrix P.

In the previous section we restricted our intrinsic camera parameters to a single unknown f, since both P_x and P_y are already known because we assume the principal point coincides with the image centre. With a little bit of algebraic manipulation we can see that Zhang's orthonormality constraints allow us to estimate the focal length f directly from the homography matrix H (see Appendix A below).

So this leaves me with a burning question: if we can estimate all the required camera parameters using only a single view of a planar calibration target, why do all the classical methods require multiple views (with different camera poses)?

Limitations of single-view calibration using planar targets

To answer that question, we simply have to find an example of where the single-view case would fail to estimate the camera parameters correctly. The simplest case would be to assume that our rotation matrix R is the 3x3 identity matrix (camera axis is perpendicular to planar calibration target), and that our translation vector is of the form [0 0 d] where d represents the distance of the calibration target from the camera's centre of projection. This scenario reduces our camera matrix P to

$$P = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & d \end{bmatrix}$$

A given point [x y 0] in world coordinates is thus transformed to [fx fy d] in homogeneous camera coordinates. We can divide out the homogeneous coordinate to obtain our desired normalized image coordinates as [fx/d fy/d].

And there we see the problem: the normalized image coordinates depend only on the ratio f/d, which implies that we do not have sufficient constraints to estimate both f and d from this single view. The intuitive interpretation is simple to understand: you can always increase d, i.e., move further away from the calibration target while adjusting the focal length f (zooming in) to keep f/d constant without affecting the image captured by the camera.

This happens because there is no variation in the depth of the calibration target correspondence points expressed in camera coordinates, thus the depth-dependent properties of a perspective projection are entirely absent.

We can try to apply the formula in Appendix A to estimate the focal length directly from the homography corresponding to the matrix P above, but we quickly run into a divide-by-zero problem. This should give us a hint. If we choose to ignore the hint, we can apply a bundle adjustment algorithm to estimate both the intrinsic and extrinsic camera parameters from correspondences generated using the matrix P. All that this will achieve is that we will find an arbitrary pair of f and d values that satisfy the constant ratio f/d imposed by P.

The middle road

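What happens if we have a slightly less pathological scenario? Let us assume that there is a small tilt between the calibration target plane and the sensor. For simplicity, we can just choose a rotation around the y axis so that

$$P = K \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & d \end{bmatrix}$$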

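We know that for a small angle θ, sin(θ) ≈ 0 and cos(θ) ≈ 1, so our matrix P will be very similar to the sensor-parallel-to-chart case above. The corresponding homography H should be

$$H = \begin{bmatrix} f\cos\theta & 0 & 0 \\ 0 & f & 0 \\ -\sin\theta & 0 & d \end{bmatrix}$$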

We can apply the formula in Appendix A to H, which simplifies to f² = f², which is a relief. The question is: how accurately can we estimate the homography H using actual correspondences extracted from the captured images?

I know from simulations using MTF Mapper that the position of my circular fiducials can readily be estimated to an accuracy of 0.1 pixels under fairly heavy simulated noise. The objective now is to measure the impact of this uncertainty on the accuracy of the homography estimated using OpenCV's findHomography function. I start out with a camera matrix P like the one above with only a rotation around the y axis. A set of 25 points is generated on my virtual calibration target, serving as the world coordinates (with the same real-world dimensions as the actual A3 chart used by MTF Mapper). These are transformed using P to obtain the "perfect" simulated corresponding image coordinates representing the position of the fiducials. I perturb these perfect coordinates by adding Gaussian noise with a standard deviation of about 0.000020210 units, which corresponds to an error of 0.1 pixels, but expressed in normalized image coordinates (divided by 4948, the width of a D7000 raw image). Now I can systematically measure the uncertainty in the focal length estimated with the formula of Appendix A as a function of the angle between the chart and the sensor, θ. I ran 100000 iterations at a selection of angles, and calculated the difference between the 75th and 50th percentile of the estimated focal length as a measure of spread.
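The perturbation step of such a simulation might look like the sketch below. It only covers the noise model (conversion of the 0.1 pixel uncertainty to normalized coordinates, and independent Gaussian noise on each coordinate); the homography fit and the focal length estimate are not shown. The names are hypothetical, and the choice of std::mt19937 as the generator is an assumption:

```cpp
#include <random>
#include <vector>

struct Point2 { double x, y; };

// Convert a positional uncertainty in pixels to normalized image coordinates,
// where coordinates are divided by the image width in pixels.
double normalized_sigma(double sigma_pixels, double image_width_pixels) {
    return sigma_pixels / image_width_pixels;
}

// Add independent zero-mean Gaussian noise to each coordinate.
std::vector<Point2> perturb(const std::vector<Point2>& pts, double sigma,
                            std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, sigma);
    std::vector<Point2> out;
    out.reserve(pts.size());
    for (const Point2& p : pts) {
        out.push_back({p.x + noise(rng), p.y + noise(rng)});
    }
    return out;
}
```

Here normalized_sigma(0.1, 4948) gives the standard deviation of about 0.000020210 quoted above; each Monte Carlo iteration would call perturb on the 25 ideal image points before fitting the homography.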

Figure 1

In Figure 1 we see that the spread of the focal length estimates increases dramatically once the angle θ drops below about 2 degrees. For the purpose of using the estimated camera pose to measure if you have aligned your chart parallel to your camera sensor, this is really terrible news: essentially, we cannot estimate the focal length of the camera reliably if the chart is close to being correctly aligned.

Figure 2

Figure 2 shows that the focal length estimate is relatively unbiased for angles above about 1 degree, but once the angle becomes small enough, we overestimate the focal length dramatically.

This experiment demonstrated that small errors in the estimated position of features (e.g., corners or centre of circular targets) lead to dramatic errors in focal length estimation. Intuitively, this makes sense, since the relative magnitude of perspective effects decreases the closer we approach a parallel alignment between the sensor and the calibration target. Since perspective effects depend on the distance from the chart, and the estimated distance from the chart is effectively controlled by the estimated focal length (assuming the same framing), this seems reasonable.

I have tried using bundle adjustment, rather than homography estimation as an intermediate step, but clearly the problem lies with the unfavourable viewing geometry and the resulting subtlety of the perspective effects, not with the algorithm used to estimate the focal length. At least, as far as I can tell.

Hobson's choice

If we take the focal length of the camera as a given parameter, then the ambiguity is resolved, and we can obtain a valid, unique estimate of the calibration target distance d. This is not entirely surprising, since our assumed constrained intrinsic camera parameters depend only on the focal length f, i.e., K is known, thus the pose of the camera can be estimated for any given view, even the degenerate case where the calibration target is parallel to the sensor.

In other words, I see no way other than requiring the user to specify the focal length as an input to MTF Mapper. I will try to extract this information from the EXIF data when the MTF Mapper GUI is used, but it seems that not all cameras report this information. Fortunately, it seems that a user-provided focal length need not be 100% accurate in order to obtain a reasonable estimate of the chart orientation relative to the camera.

References

[Zhang2000], Z. Zhang, A flexible new technique for camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), pp. 1330-1334, 2000.
[Tsai1987], R. Tsai, A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE Journal on Robotics and Automation, 3(4), pp. 323-344, 1987.
[Triggs1999], B. Triggs, P. McLauchlan, R. Hartley, A. Fitzgibbon, Bundle Adjustment — A Modern Synthesis, ICCV '99: Proceedings of the International Workshop on Vision Algorithms, Springer-Verlag, pp. 298-372, 1999.

Appendix A

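If we have a homography H between our normalized image coordinate plane and our planar calibration target, such that

$$H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}$$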

where h₃₃ is an arbitrary scale factor, then the focal length of the camera can be estimated assuming square pixels, zero skew and a principal point of (0,0) in normalized image coordinates, using the formula

Note that this is only one possibility, derived from the constraint that r₁ is a unit vector.
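For readers who want something concrete to experiment with, here is one formula that is consistent with the description above. Treat it as a reconstruction under stated assumptions, not necessarily the exact expression that belongs in this appendix. With K = diag(f, f, 1), the first two columns of H are proportional to K r₁ and K r₂, so requiring r₁ and r₂ to have the same length gives

$$f^2 = \frac{h_{12}^2 + h_{22}^2 - h_{11}^2 - h_{21}^2}{h_{31}^2 - h_{32}^2}$$

The function name and the tolerance in the sketch below are hypothetical:

```cpp
#include <array>
#include <cmath>

using Matrix3 = std::array<std::array<double, 3>, 3>;

// Focal length from the equal-length constraint |r1| = |r2|, assuming
// K = diag(f, f, 1). Returns false when the estimate is undefined, e.g. when
// the chart is parallel to the sensor and the denominator vanishes.
bool focal_from_homography(const Matrix3& H, double& f) {
    const double num = H[0][1] * H[0][1] + H[1][1] * H[1][1] -
                       H[0][0] * H[0][0] - H[1][0] * H[1][0];
    const double den = H[2][0] * H[2][0] - H[2][1] * H[2][1];
    // The absolute tolerance is arbitrary; it is not invariant to the scale of H.
    if (std::fabs(den) < 1e-15) return false;
    const double f_squared = num / den;
    if (f_squared <= 0.0) return false;
    f = std::sqrt(f_squared);
    return true;
}
```

Applied to a homography built from a pure rotation around the y axis, this returns the f that was used to build it. For a chart that is exactly parallel to the sensor both the numerator and the denominator are zero, which is the divide-by-zero problem mentioned in the main text.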

MTF Mapper finally gets a logo!

2016-11-25T14:30:00.001+02:00

It is a sad day for command line enthusiasts, but MTF Mapper has finally conformed by adopting a logo for its GUI version.

I guess in the world of graphical user interfaces, a logo is to an application what a flag is to a nation (cue the Eddie Izzard reference).

There is of course a new version of MTF Mapper (0.5.11 or later) available over on SourceForge. Lots of fixes and cleanup to the GUI; please let me know what you think of the new(ish) interface.

Running MTF Mapper under Wine

2016-06-14T08:08:00.000+02:00

MTF Mapper 0.5.2 was compiled using MSVC Express 2013, which Microsoft calls "vc12". The Windows binaries have been linked statically against the runtime, but this does not appear to be sufficient to run MTF Mapper under wine without further tweaks.

For me, running "winetricks vcrun2013" in the console seemed to do the trick. I would say that this is a necessary step to get MTF Mapper to work under wine.

In case you are wondering, without the winetricks step I get the following error:
wine: Call from 0x7b83c506 to unimplemented function msvcr120.dll.?_Trace_ppl_function@Concurrency@@YAXABU_GUID@@EW4ConcRT_EventType@1@@Z, aborting

Let me know if there are any other issues related to wine, and I'll see what I can do.

MTF Mapper vs Imatest vs Quick MTF

2016-04-13T17:30:00.000+02:00

I recently noticed that Quick MTF now has an automated region-of-interest (ROI) detection function. This allows me (in theory) to perform the same type of automated testing that I applied to MTF Mapper and Imatest. Now would be a good time to read the Imatest comparison post to familiarise yourself with my testing procedure.

Anyhow, the automatic ROI functionality in Quick MTF is almost able to work with the simulated Imatest charts I produced with mtf_generate_rectangle. I had to manually adjust about half of the ROIs to ensure that Quick MTF was using as much of each edge as possible, i.e., similar ROIs to what Imatest and MTF Mapper used. Since the edge locations remain the same across all the test images, I used the "open with the same ROI" option to keep the experiment as fair as possible.

I also discovered that QuickMTF's "trial" limit of 40 tests can be bypassed with relatively little fuss (Oleg, if you are reading this, I promise not to share the secret).

Lastly, note that I performed these tests using the "ISO 12233" mode of Quick MTF. The default settings produce much smoother plots, but these are severely biased, i.e., they report MTF50 values that are much too low. To illustrate: the default settings produce a 95th percentile relative error of 13% when measured using images with an expected MTF50 of 0.25 c/p; switching to ISO 12233 mode reduces the error to only 5%. As expected, the standard deviation of MTF50 error is lower in the default mode, but I maintain that bias and variance should both be managed well.

The results

Figure 1: Quick MTF MTF50 relative error boxplot

Figure 1 illustrates the relative MTF50 error boxplot, calculated as 100*(measured_mtf50 - expected_mtf50)/expected_mtf50. Firstly, Quick MTF should be commended for its unbiased performance between expected MTF50 values of 0.1 and 0.4 cycles/pixel; the median error is exactly zero. Unfortunately, a strong bias appears after 0.4 c/p, which is consistent with some (light) smoothing of the ESF. The boxes, and especially the whiskers, are a bit wide, which is more readily seen in Figure 2.

Figure 2: Standard deviation of relative MTF50 error

Things go a bit pear-shaped when we look at the standard deviation of the relative MTF50 error. If we consider the "usable" range of 0.08 to 0.5 c/p, then Quick MTF keeps the standard deviation below 3.5%, which is not bad, but Imatest and MTF Mapper perform a bit better here. A more useful (and my preferred) measure is the 95th percentile of relative MTF50 error magnitude, as illustrated in Figure 3.


Figure 3: 95th percentile of relative MTF50 error magnitude

The values in Figure 3 have a natural interpretation: the magnitude of the error will remain below the indicated value in about 95% of the edges measured with each tool. This measure combines the effects of bias (Figure 1) and variance (Figure 2) in one convenient value. Consider again the "usable" range of 0.08 to 0.5 c/p: Quick MTF only manages to keep the error below about 9% across the range. It does quite a bit better in the centre of the range, almost matching Imatest at 0.2 c/p.
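The two quantities used in these figures can be sketched as follows. The relative error formula is the one given above; the percentile uses the nearest-rank definition, which is an assumption, since the post does not say which percentile definition was used. The function names are hypothetical:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative MTF50 error in percent.
double relative_error_percent(double measured_mtf50, double expected_mtf50) {
    return 100.0 * (measured_mtf50 - expected_mtf50) / expected_mtf50;
}

// Percentile of the error magnitudes using the nearest-rank definition.
// Expects a non-empty input and 0 < percentile <= 100.
double error_magnitude_percentile(std::vector<double> errors, double percentile) {
    for (double& e : errors) e = std::fabs(e);
    const std::size_t n = errors.size();
    std::size_t rank = static_cast<std::size_t>(std::ceil(percentile / 100.0 * n));
    if (rank < 1) rank = 1;
    if (rank > n) rank = n;
    // Partial sort: the element at rank - 1 ends up in its sorted position.
    std::nth_element(errors.begin(), errors.begin() + (rank - 1), errors.end());
    return errors[rank - 1];
}
```

Feeding the relative errors of all edges at one expected MTF50 level into error_magnitude_percentile with a percentile of 95 gives a value of the kind plotted in Figure 3.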

Conclusion

The Imatest results were not based on the latest version; I do not have an Imatest license, and my trial has expired, so it will take a fair bit of effort to refresh the Imatest results. The Quick MTF 2.09 results are current, though.
Based on these versions, it would appear that MTF Mapper still produces competitive results. And you cannot beat MTF Mapper's price.

PffffFFTttt...

2015-11-03T12:16:00.001+02:00

There is no doubt that FFTW is one of the fastest FFT implementations available. It can be a pain to include in a Microsoft Visual Studio project, though. Maybe I am "using it wrong"...

One solution to this problem is to include my own FFT implementation in MTF Mapper, thereby avoiding the FFTW dependency entirely. Although it is generally frowned upon to use a homebrew FFT implementation in lieu of an existing, proven library, I decided it was time to ditch FFTW.

One of the main advantages of using a homebrew FFT implementation is that it avoids the GPL license of FFTW. Not that I have any fundamental objection to the GPL, but the main sources of MTF Mapper are available under a BSD license, which is a less strict license than the GPL. In particular, the BSD license makes allowance for commercial use of the code. Before anyone asks, no, MTF Mapper is not going closed source or anything like that. All things being equal, the BSD license is just less restrictive, and avoiding FFTW brings MTF Mapper closer to being a pure BSD (or compatible) license project.

FFT Implementation

After playing around with a few alternative options, including my first C++ FFT implementation from way back in my first year at university, I settled on Sorensen's radix-2 real-valued FFT (Sorensen, H.V., et al., Real-Valued Fast Fourier Transform Algorithms, IEEE Transactions on Acoustics, Speech, and Signal Processing, 35(6), 1987). This algorithm appears to be a decent balance between complexity and theoretical efficiency, but I had to work fairly hard at the code to produce a reasonably efficient implementation.

I tried to implement it in fairly straightforward C++, but taking care to use pointer walks instead of array indexing, and using lookup tables for both the bit-reversal process and the sine/cosine functions. These changes produced an algorithm that was at least as fast as my similarly optimized complex FFT implementation augmented with a two-for-the-price-of-one step for real-valued inputs.

One thing I did notice is that the FFT in its "natural" form does not lend itself to an efficient streaming implementation. For example, the first pass of the radix-2 algorithm looks like this:

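for (; xp <= xp_sentinel; xp += 2) {  // walk the buffer one pair at a time
    double xt = *xp;                  // keep the original x[i]
    *(xp)   = xt + *(xp+1);           // x[i]   <- x[i] + x[i+1]
    *(xp+1) = xt - *(xp+1);           // x[i+1] <- x[i] - x[i+1]
}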

Note that the value of x[i] (here *xp) is overwritten in the 3rd line of the code, while the original value of x[i] (copied into xt) is still required in the 4th line of the code. This write-after-read dependency causes problems for out-of-order execution. Maybe the compiler is smart enough to unroll the loop and intersperse the reads and writes to achieve maximal utilization of all the processing units on the CPU, but the stride of the loop and the packing of the values is not ideal for SSE2/AVX instructions either. I suppose that this can be addressed with better code, but before I spend time on that I first have to determine how significant raw performance of the FFT is in the context of MTF Mapper.
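For reference, the same first pass can be written as a self-contained function that loads both inputs of each butterfly before storing either output. This is only a sketch using array indexing rather than the pointer walk above, and whether it actually helps the compiler or the CPU is untested here; the function name is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// First pass of a radix-2 transform: replace each adjacent pair (a, b)
// with (a + b, a - b), in place. A trailing unpaired sample is left as-is.
void radix2_first_pass(std::vector<double>& x) {
    for (std::size_t i = 0; i + 1 < x.size(); i += 2) {
        const double a = x[i];      // load both inputs first
        const double b = x[i + 1];
        x[i]     = a + b;           // then store both outputs
        x[i + 1] = a - b;
    }
}
```

For example, the input {1, 2, 3, 4} becomes {3, -1, 7, -1}.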

Real world performance in MTF Mapper

So how much time does MTF Mapper spend calculating FFTs? Well, one FFT for every edge. A high-density grid-style test chart has roughly 1452 edges. According to a "callgrind" trace produced using valgrind, MTF Mapper v0.4.21 spends 0.09% of its instruction count inside FFTW's real-valued FFT algorithm.

Using the homebrew FFT of MTF Mapper 0.4.23 the total number of instruction fetches increases by about 1.34%, but this does not imply a 1.34% increase in runtime. The callgrind trace indicates that 0.31% of v0.4.23's instructions are spent in the new FFT routine.

In relative terms, this implies that the new routine is roughly 3.5 times slower, but this does not account for the additional overheads incurred by FFTW's memory allocation routines (the FFTW routine is not in-place, hence requires a new buffer to be allocated before every FFT to keep the process thread-safe).

Measuring the actual wall-clock time gives us a result of 22.27 ± 0.14 seconds for 20 runs of MTF Mapper v0.4.21 on my test image, versus 21.631 ± 0.16 seconds for 20 runs of v0.4.23 (each experiment repeated 4 times for computing standard deviations). These timings were obtained on a Sandy-bridge laptop with 8/4 threads. The somewhat surprising reversal of the standings (the homebrew FFT now outperforms the FFTW implementation) just goes to show that the interaction between hyperthreading, caching, and SSE/AVX unit contention can produce some surprising results.

Bottom line: the homebrew FFT is fast enough (at least on the two hardware/compiler combinations I tested).

Are we done yet?

Well, surely you want to know how fast the homebrew FFT is in relation to FFTW in a fair fight, right?

I set up a simple test using FFTW version 3.3.4 built on gentoo using gcc-4.9.3, running on a Sandy-bridge laptop cpu (i7-2720QM) running at a base clock of 2.2 GHz. This was a single-threaded test, so we should see a maximum clock speed of 3.3GHz, if we are lucky.

For a 1024-sample real-valued FFT, 2 million iterations took 14.683 seconds using the homebrew code, and only 5.798 seconds using FFTW. That is a ratio of ~2.53.

For a 512-sample (same as what MTF Mapper uses) real-valued FFT, 2 million iterations took 6.635 seconds using the homebrew code, and only 2.743 seconds using FFTW. That is a ratio of ~2.42.

According to general impressions gathered from the Internet, you are doing a good-enough job if you are less than 4x slower than FFTW. I ran metaFFT's benchmarks, which gave a ratio of 2.4x and 2.1x relative to FFTW for size 1024 and 512, respectively (these were probably complex transforms, so not a straight comparison).

The MTF Mapper homebrew FFT appears to be in the right ballpark, or at least fast enough not to cause embarrassment....

A critical look

2015-07-05T19:30:00.000+02:00

Most of the posts on this blog are tutorial / educational in style. I have come across a paper published by an Imatest employee that requires some commentary of a more critical nature. With some experience in the academic peer review process, I hope I can maintain the appropriate degree of objectivity in my commentary.

At any rate, if you have no interest in this kind of commentary / post, please feel free to skip it.

The paper

The paper in question is: Jackson K. M. Roland, "A study of slanted-edge MTF stability and repeatability", Proc. SPIE 9396, Image Quality and System Performance XII, 93960L (January 8, 2015); doi:10.1117/12.2077755; http://dx.doi.org/10.1117/12.2077755.

A copy can be obtained directly from Imatest here.

Interesting point of view

One of the contributions of the paper is a discussion of the impact of edge orientation on MTF measurements. The paper appears to approach the problem from a direction that is more closely aligned with the ISO12233:2000 standard, rather than Kohm's method ("Modulation transfer function measurement method and results for the Orbview-3 high resolution imaging satellite", Proceedings of ISPRS, 2004).

By that I mean that Kohm's approach (and MTF Mapper's approach) is to compute an estimate of the edge normal, followed by projection of the pixel centre coordinates (paired with their intensity values) onto this normal. This produces a dense set of samples across the edge in a very intuitive way; the main drawback of this approach being the potential increase in the processing cost because it lends itself better to a floating point implementation.
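A minimal sketch of the projection idea in Python follows. It is not MTF Mapper's actual code: the function name `project_onto_normal`, the edge parameterisation (a point on the edge plus an angle measured from the vertical axis), and the toy ROI are all assumptions made for illustration.

```python
import math

def project_onto_normal(pixels, edge_point, edge_angle_deg):
    """Return (distance, intensity) pairs sorted by signed distance to the edge.

    pixels: iterable of (x, y, intensity), with (x, y) the pixel centre.
    edge_point: any (x, y) point lying on the edge.
    edge_angle_deg: edge direction, measured from the vertical (y) axis.
    """
    theta = math.radians(edge_angle_deg)
    # Unit vector along the edge, and the unit normal perpendicular to it.
    ex, ey = math.sin(theta), math.cos(theta)
    nx, ny = ey, -ex
    samples = []
    for x, y, value in pixels:
        # Signed distance of the pixel centre from the edge, along the normal.
        d = (x - edge_point[0]) * nx + (y - edge_point[1]) * ny
        samples.append((d, value))
    samples.sort(key=lambda s: s[0])
    return samples

# Toy ROI: an ideal step edge through (8, 0), tilted by 5 degrees.
theta = math.radians(5.0)
roi = []
for y in range(16):
    for x in range(16):
        d = (x - 8.0) * math.cos(theta) - y * math.sin(theta)
        roi.append((x, y, 1.0 if d > 0 else 0.0))

esf_samples = project_onto_normal(roi, (8.0, 0.0), 5.0)
```

Because the slanted edge gives almost every pixel a slightly different distance to the edge, the 256 pixels in this toy ROI produce far more distinct sample positions than the 16 columns of the ROI would suggest.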

The ISO12233:2000 approach rather attempts to project the edge "down" (assuming a vertical edge) onto the bottom-most row of pixels in the region of interest (ROI). Using the slope of the edge (estimated earlier), each pixel's intensity (sample) can be shifted left or right by the appropriate phase offset before being projected onto the bottom row. If the bottom row is modelled as bins with 0.25-pixel spacing, this process allows us to construct our 4x-oversampled, binned ESF estimate with the minimum amount of computational effort (although that might depend on whether a particular platform has strong floating-point capabilities).
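The shift-and-bin idea can be sketched as follows. This is an illustration under stated assumptions rather than the reference implementation from the standard: the function name `binned_esf_iso_style` is hypothetical, the edge is assumed to be near-vertical with a known slope, row 0 is used as the reference row, and no cos(theta) correction is applied.

```python
import math

def binned_esf_iso_style(image, slope, oversample=4):
    """Build an oversampled ESF by shifting each row and binning along x.

    image: list of rows (each a list of intensities); the edge is near-vertical.
    slope: horizontal edge displacement per row, i.e. tan(theta).
    oversample: bins per pixel (4 gives 0.25-pixel bins).
    """
    width = len(image[0])
    n_bins = width * oversample
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for y, row in enumerate(image):
        shift = y * slope  # how far the edge has moved relative to row 0
        for x, value in enumerate(row):
            b = math.floor((x - shift) * oversample)
            if 0 <= b < n_bins:
                sums[b] += value
                counts[b] += 1
    # Average each bin; bins that received no samples are left as None.
    return [s / c if c else None for s, c in zip(sums, counts)]

# Ideal step edge that starts at x = 8 in row 0 and drifts right at 5 degrees.
slope = math.tan(math.radians(5.0))
image = [[1.0 if x > 8.0 + y * slope else 0.0 for x in range(16)]
         for y in range(32)]
esf = binned_esf_iso_style(image, slope)
```

Note that the only per-pixel arithmetic is a subtraction and a scaling along the x axis, which is what makes this variant attractive on platforms with weak floating-point support.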

The method proposed in the Imatest paper is definitely of the ISO12233:2000 variety. How can we tell? Well, the Imatest paper proposes that the ESF must be corrected by appropriate scaling of the x values using a scaling factor of cos(theta), where theta is the edge orientation angle. What this accomplishes is to "squash" the range of x values (i.e. pixel column) to be spaced at an interval that is consistent with the pixel's distance as measured along the normal to the edge. For a 5 degree angle, this correction factor is only 0.9962, meaning that distances will be squashed by a very small amount indeed. So little, in fact, that the ISO12233:2000 standard ignores this correction factor, because a pixel at a horizontal distance of 16 pixels will be mapped to a normal distance of 15.94. Keeping in mind that the ESF bins are 0.25 pixels wide, this error must have seemed small.

I recognize that the Imatest paper proposes a valid solution to the "stretching" of the ESF that would occur without this correction, and that this stretching would become quite large at larger angles (about a 1.5 pixel shift at 25 degrees for our pixel at a horizontal distance of 16 pixels).
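The numbers quoted above follow directly from the cos(theta) factor. A short check in Python (the helper name `normal_distance` is just for illustration):

```python
import math

def normal_distance(horizontal_distance, edge_angle_deg):
    """Distance along the edge normal for a given horizontal pixel distance."""
    return horizontal_distance * math.cos(math.radians(edge_angle_deg))

for angle in (5.0, 25.0):
    d = normal_distance(16.0, angle)
    print(f"{angle:4.1f} deg: factor {math.cos(math.radians(angle)):.4f}, "
          f"16 px -> {d:.2f} px, shift {16.0 - d:.2f} px")
```

At 5 degrees the shift is about 0.06 pixels, well inside a 0.25-pixel bin; at 25 degrees it grows to about 1.5 pixels, or roughly six bins.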

My critique of this approach is that it would typically involve the use of floating point calculations, the potential avoidance of which appears to have been one of the main advantages of the ISO12233:2000 method. If you are going to use floating point values, then Kohm's method is more intuitive.

Major technical issues

The Point Spread Functions (PSFs) used to perform the "real world" and simulated experiments were rather different, particularly in one very important aspect. The Canon 6D camera has a PSF that is anisotropic, which follows directly from its square (or even L-shaped) photosites. The composite PSF for the 6D would be an Airy pattern (diffraction) convolved with a square photosite aperture (physical sensor) convolved with a 4-dot beam splitter (the OLPF). Of course I do not have inside information on the exact photosite aperture (maybe chipworks has an image) nor the OLPF (although a 4-dot Lithium Niobate splitter seems reasonable). The point remains that this type of PSF will yield noticeably higher MTF50 values when the slanted edge approaches 45 degrees. Between the 5 and 15 degree orientations employed in the Imatest paper, we would expect a difference of about 1%. This is below the error margin of Imatest, but with a large enough set of observations this systematic effect should be visible.

In contrast, the Gaussian PSF employed to produce the simulated images is (or at least is supposed to be) isotropic, and should show no edge-orientation dependent bias. Bottom line: the "real world" images had an anisotropic PSF, and the simulated images had an isotropic PSF. This means that the one cannot be used in the place of the other to evaluate the effects of edge orientation on measured MTF. Well, at least not without separating the PSF anisotropy from the residual orientation-dependent artifacts of the slanted edge method.

On page 7 the Imatest paper states that "The sampling of the small Gaussian is such that the normally rotationally-invariant Gaussian function has directional factors as you approach 45 degree increments." This is further "illustrated" in Figure 13.

At this point I take issue with the reviewers who allowed the Imatest paper to be published in this state. If you suddenly find that your Gaussian PSF becomes anisotropic, you have to take a hard look at your implementation. The only reason that the Gaussian (with a small standard deviation) is starting to develop "directional factors" is because you are undersampling the Gaussian beyond repair.

The usual solution to this problem is to increase the resolution of your synthetic image. By generating your synthetic image at, say, 10x the scale, all your Gaussian PSFs will be reasonably wide in terms of samples in the oversampled image. For MTF measurement using the slanted edge method, you do not even have to downsize your oversampled image before applying the slanted edge method. All you have to do is to change the scale of your resolution axis in your MTF plot. That way you do not even have to worry about the MTF of the downsampling kernel.
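To illustrate why undersampling makes a Gaussian look anisotropic, here is a small Python sketch. It assumes the Gaussian is point-sampled at integer pixel positions, and the standard deviation of 0.3 pixels is an arbitrary choice for the demonstration, not a value taken from the Imatest paper. The sketch compares the frequency response of the sampled kernel along a pixel axis with the response along the diagonal, at the same radial frequency; a truly isotropic PSF would give identical values.

```python
import math

def sampled_gaussian_mtf_1d(sigma, freq, radius=None):
    """Response at `freq` (cycles/sample) of a Gaussian sampled at integer positions."""
    if radius is None:
        radius = max(3, math.ceil(10 * sigma))
    num = 0.0
    den = 0.0
    for i in range(-radius, radius + 1):
        w = math.exp(-(i * i) / (2.0 * sigma * sigma))
        num += w * math.cos(2.0 * math.pi * freq * i)
        den += w
    return num / den

def anisotropy(sigma, freq):
    """Axis response minus diagonal response at the same radial frequency.

    The sampled 2-D Gaussian is separable, so its response at (u, v) is the
    product of two 1-D responses. An isotropic PSF would give zero here.
    """
    axis = sampled_gaussian_mtf_1d(sigma, freq) * sampled_gaussian_mtf_1d(sigma, 0.0)
    u = freq / math.sqrt(2.0)
    diagonal = sampled_gaussian_mtf_1d(sigma, u) ** 2
    return axis - diagonal

native = anisotropy(sigma=0.3, freq=0.4)        # sampled on the native pixel grid
oversampled = anisotropy(sigma=3.0, freq=0.04)  # same PSF on a 10x finer grid
print(f"native grid:      {native:.6f}")
print(f"10x oversampled:  {oversampled:.2e}")
```

On the native grid the two directions disagree, while on the 10x grid the difference drops to the level of floating-point rounding. The frequency argument is divided by 10 in the second call because one cycle per native pixel corresponds to 0.1 cycles per oversampled pixel, which is the axis rescaling mentioned above.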

There are several methods that produce even higher quality simulated images. At this point I will plug my own work: see this post or this paper. These approaches rely on importance sampling (for diffraction PSFs) or direct numerical integration of the Gaussian in two dimensions; both these approaches avoid any issues with downsampling and do not sample on a regular grid. These methods are implemented in mtf_generate_rectangle.exe, which is part of the MTF Mapper package.

Minor technical issues

On page 1 the Imatest paper states that the ISO 12233:2014 standard lowered the edge contrast "because with high contrast the measurement becomes unstable". This statement is quite vague, and appears to contradict the results presented in Figure 8, which shows no degradation of performance at high contrast, even in the presence of noise.

I would offer some alternative explanations: the ISO12233 standard is often applied to images compressed with DCT-based quantization methods, such as JPEG. A high-contrast edge typically shows up with a large-magnitude DCT coefficient at higher frequencies; exactly the frequencies that are more strongly quantized, hence the well-known appearance of "mosquito noise" in JPEG images. A lower contrast edge will reduce the relative energy at higher frequencies, thus the stronger quantization of high frequencies will have a proportionately smaller effect. I am quite tempted to go and test this theory right away.

Another explanation, one that is covered in some depth on Imatest's own website, is of course the potential intensity clipping that may result from incorrect exposure. Keeping the edge contrast in a more manageable range reduces the chance of clipping. Another more subtle reason is that a lower contrast chart allows more headroom for sharpening without clipping. By this I mean that sharpening (of the unsharp masking type) usually results in some "ringing" which manifests as overshoot (on the bright side of the edge) and undershoot (on the dark side of the edge). If chart contrast was so high that the overshoot of overzealous sharpening would be clipped, then it would be harder to measure (and observe) the extent of oversharpening.

The noise model employed is a little basic. Strictly speaking the standard deviation of the additive Gaussian white noise should be signal dependent; this is a more accurate model of photon shot noise, and is trivial to implement. I have not done a systematic study of the effects of noise simulation models on the slanted edge method, but in 2015 one really should simulate photon shot noise as the dominant component of additive noise.

Page 6 of the Imatest paper states that "There is a problem with this 5 degree angle that has not yet been addressed in any standard or paper." All I can say to this is that Kohm's paper has presented an alternative solution to this problem that really should be recognized in the Imatest paper.

Summary

Other than the unforgivable error in the generation of the simulated images, the paper is a fair effort. More time spent on the literature, especially papers like Kohm's, would have changed the tone of the paper considerably, which in turn would have made it more credible.

Taking on Imatest

2015-07-05T14:38:00.000+02:00

After having worked on MTF Mapper for almost five years now, I have decided that it is time to go head-to-head with Imatest. I downloaded a trial version of Imatest 4.1.12 to face off against MTF Mapper 0.4.18.

For the purpose of this comparison I decided to generate synthetic images using mtf_generate_rectangle. This allows me to use a set of images rendered using an accurately known PSF, meaning that we know exactly what the actual MTF50 value should be for those images. I decided to render a test chart conforming to the SFRPlus format, since that allows me to extract a fair number of edges for each test case. The approximately-sfrplus-chart looks like this:

Figure 1: SFRPlus style chart with an MTF50 value of 0.35 cycles/pixel

SFRPlus was quite happy to automatically identify and extract regions of interest (ROIs) over all the relevant edges from this image. MTF Mapper can also extract edges from this image automatically. One notable difference is that SFRPlus includes the edges of the squares that overlap with the black bars at the top and bottom of the images, whereas MTF Mapper only considers edges that form part of a complete square. To keep the comparison fair, I discarded the results from the top and bottom rows of squares (as extracted by SFRPlus), leaving us with 19*4 edges per image (SFRPlus ignores the third square in the middle column).

Validating the test images

(This section can be skipped if you trust my methodology)

Although I have written quite a few posts on this blog regarding the algorithms used by mtf_generate_rectangle to render synthetic images, I will now show from first principles that the synthetic images truly have the claimed point spread functions (PSFs), and thus known MTFs.

I rendered the synthetic image using a command like this:

mtf_generate_rectangle.exe --b16 --pattern-noise 0.0085 --read-noise 2.5 --adc-gain 0.641 --adc-depth 12 -c 0.33 --target-poly sfrchart.txt -m 0.35 -p gaussian-sampled --airy-samples 100

This particular command renders the SFRPlus chart using a Gaussian PSF with an MTF50 value of 0.35. Reasonably realistic sensor noise is simulated, including photon shot noise, which implies that the noise standard deviation scales as the square root of the signal level; in plain English: we have more noise in bright parts of the image.
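The square-root scaling can be illustrated with a generic sensor noise sketch in Python. This is not the code used by mtf_generate_rectangle, and it does not try to reproduce the meaning or units of the command-line options above: the function names, the Gaussian approximation to Poisson shot noise, and the signal levels are assumptions made for illustration only.

```python
import math
import random

def noisy_sample(signal, read_noise, rng):
    """Return one noisy observation of a pixel with mean `signal` (in electrons).

    Shot noise is approximated as Gaussian with variance equal to the mean
    signal; read noise contributes a signal-independent variance.
    """
    variance = signal + read_noise ** 2
    return signal + rng.gauss(0.0, math.sqrt(variance))

def measured_noise_std(signal, read_noise, n=20000, seed=1):
    """Empirical standard deviation of n simulated samples at one signal level."""
    rng = random.Random(seed)
    samples = [noisy_sample(signal, read_noise, rng) for _ in range(n)]
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))

for level in (100.0, 10000.0):
    std = measured_noise_std(level, read_noise=2.5)
    print(f"signal {level:8.1f} e-: noise std {std:7.2f} e-")
```

A hundredfold increase in signal raises the noise standard deviation by roughly a factor of ten, so bright regions are noisier in absolute terms even though their signal-to-noise ratio is better.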

I ran a version of mtf_mapper that dumped the raw samples extracted from the image (normally used to construct the binned ESF); I specified the edge angle as 5 degrees to remove all possible sources of error. NB: the "raw_esf_values.txt" file produced by MTF Mapper contains the binned ESF, and is not suitable for this particular experiment because of the smoothing inherent in the binning.

Given that I specified an MTF50 value of 0.35 cycles per pixel, we know that the standard deviation of the true PSF should be 0.5354018 pixels [ sqrt( log(0.5)/(-2*pi*pi*0.35*0.35) ) ]. From this we can calculate the expected analytical ESF, which is simply erf(x/sigma)*(upper-lower) + lower, where erf() here denotes the integral of the unit Gaussian from minus infinity up to its argument (i.e., the cumulative distribution function of the standard normal distribution, which runs from 0 to 1, rather than the textbook error function, which runs from -1 to 1). The values upper and lower merely represent the mean white and black levels, which were defined as lower = 65536*0.33/2 and upper = 65536 - lower. With these values, I can now plot the expected analytical ESF along with the raw ESF samples dumped by MTF Mapper.
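The calculation can be reproduced with a few lines of Python. The function names are hypothetical, and the sketch takes "the integral of the unit Gaussian" to mean the cumulative distribution function of the standard normal distribution (running from 0 to 1), which is expressed here in terms of Python's `math.erf`.

```python
import math

def gaussian_sigma_from_mtf50(mtf50):
    """Standard deviation (pixels) of a Gaussian PSF with the given MTF50 (cycles/pixel)."""
    return math.sqrt(math.log(0.5) / (-2.0 * math.pi ** 2 * mtf50 ** 2))

def unit_gaussian_cdf(z):
    """Integral of the unit Gaussian from minus infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def analytical_esf(x, sigma, lower, upper):
    """Expected edge spread function at distance x (pixels) from the edge."""
    return unit_gaussian_cdf(x / sigma) * (upper - lower) + lower

sigma = gaussian_sigma_from_mtf50(0.35)
lower = 65536 * 0.33 / 2
upper = 65536 - lower
print(f"sigma = {sigma:.7f} pixels")
for x in (-2.0, 0.0, 2.0):
    print(f"ESF({x:+.1f}) = {analytical_esf(x, sigma, lower, upper):.1f}")
```

The expression for sigma comes from the MTF of a Gaussian PSF, exp(-2*pi*pi*sigma*sigma*f*f), set equal to 0.5 at f = 0.35 cycles/pixel. Exactly on the edge (x = 0) the ESF sits halfway between the black and white levels.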

Figure 2: Raw ESF samples along with analytical ESF

I should mention that I shifted the analytical ESF along the "d" axis to compensate for any residual bias in MTF Mapper's edge position estimate. We can see that the overall shape of the analytical ESF appears to line up quite well with the ESF samples extracted from the synthetic image. Next we look at the difference between the two curves:

Figure 3: ESF difference

We see two things in Figure 3: The mean difference appears to be close to zero, and the noise magnitude appears to increase with increasing signal levels (to the right). The increase in noise was expected, since that follows from the photon shot noise model used to simulate sensor noise. We can normalize the noise by dividing the ESF difference (noise) by the square root of the analytical ESF, which gives us this plot:

Figure 4: Normalised ESF difference
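The normalisation behind Figure 4 amounts to one line of arithmetic. A tiny Python sketch with made-up numbers (the function name and the sample values are purely illustrative):

```python
import math

def normalise_residuals(measured, expected):
    """Divide each residual by the square root of the expected signal level."""
    return [(m - e) / math.sqrt(e) for m, e in zip(measured, expected)]

# Two samples whose raw residuals (10 and 20) differ by a factor of two,
# taken at signal levels that differ by a factor of four.
expected = [100.0, 400.0]
measured = [110.0, 420.0]
print(normalise_residuals(measured, expected))
```

Both residuals normalise to the same value, which is what we expect if the noise standard deviation grows with the square root of the signal.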

This normalization appears to keep the noise standard deviation constant, which would be consistent with garden-variety additive Gaussian white noise. The density estimate of the normalized noise looks Gaussian:

Figure 5: Normalized ESF difference density

Running the normalized residuals through the Shapiro-Wilk normality test gives us a p-value of 0.03722 over our 3285 samples. That is bad news, because it means our data is non-Gaussian at a 5% significance level. ~~It is, however, Gaussian at a 10% confidence level.~~ Correction: The normalized residuals are Gaussian at a 3% (or 2.5%, or 1%) significance level. The qqnorm() plot is pretty straight too, which tells us it is more likely that the Shapiro-Wilk test is negatively affected by the large number of samples, than that the residuals are truly not Gaussian.
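The decision rule behind that correction is easy to get backwards, so here it is spelled out in Python. The null hypothesis of the Shapiro-Wilk test is that the data are Gaussian, and it is rejected when the p-value falls below the chosen significance level; the helper name is hypothetical.

```python
def rejects_normality(p_value, alpha):
    """Reject the null hypothesis (data are Gaussian) when p < alpha."""
    return p_value < alpha

p_value = 0.03722  # reported above for the 3285 normalized residuals
for alpha in (0.10, 0.05, 0.03, 0.025, 0.01):
    verdict = "reject" if rejects_normality(p_value, alpha) else "cannot reject"
    print(f"alpha = {alpha:<5}: {verdict} normality")
```

Normality is rejected at the 10% and 5% levels, but not at 3%, 2.5% or 1%. Failing to reject is of course weaker than proving that the residuals are Gaussian, which is why the qqnorm() plot is a useful second opinion.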

Now that we are satisfied that the distribution of the residuals is Gaussian, we can fit a line through them. This line comes out with a slope of -0.005765, which means that our normalized residuals are fairly flat. Lastly, we can perform some LOESS smoothing on the normalized residuals:

Figure 6: LOESS fit on normalized ESF difference

Again, we can see that the LOESS-smoothed values oscillate around 0, i.e., there is no trend in the difference between the analytical ESF and the ESF measured from our synthetic image.

The mean signal-to-noise ratio in the bright regions of the images comes out at around 15dB; because we compute the LSF (or PSF if you prefer) from the derivative of the ESF, the bright parts of the image are representative of the worst-case noise. Alternatively, we can say that the noise is quite similar to that produced by a Nikon D7000 at ISO400, for an SFRPlus test chart at a 5:1 contrast ratio.

I have shown that there is no systematic difference between the ESF extracted from a synthetic image and the expected analytical ESF. The simulated noise also behaves in the way that we would expect from properties of the simulated sensor. Based on these observations, we can safely assume that the synthetic images have the desired PSF, i.e., the simulated MTF50 values are spot-on. (In previous posts I examined the properties of the simulated ESF values in the absence of noise, but here I chose to demonstrate the PSF properties directly on the actual images used in the Imatest vs MTF Mapper comparison).

The results

The results presented here were obtained by running Imatest 4.1.12 and MTF Mapper 0.4.18 on these images (about 100MB). SFRPlus (from Imatest, of course) was configured to enable the LSF correction that was recently introduced. Other than that, all settings were left to defaults, including leaving the apodization option enabled. I turned off the "quick mtf" option, although I did not check to see whether this affected the results. After a run of SFRPlus, the "save data" option was used to store the results, after which the "MTF50" column values were extracted, discarding the top and bottom row edges as explained before.

MTF Mapper was run using the "-t 0.5 -r" settings; the "-t 0.5" option is required to allow MTF Mapper to work with the rather low 5:1 contrast ratio. The values output to "raw_mtf_values.txt" were used as the representative MTF50 values extracted by MTF Mapper.

Simulated images were produced over the MTF50 range 0.1 cycles/pixel to 0.7 cycles/pixel in increments of 0.05 cycles/pixel, with one extra data point at 0.08 cycles/pixel to represent the low end (which is quite blurry). For each MTF50 level a total of three images were simulated, each with a different seed to produce unique sensor noise. This gives us 19*3*4 = 228 samples at each MTF50 level.

As in previous posts, the results will be evaluated in two ways: bias and variance. The first plots to consider illustrate both bias and variance simultaneously, although it is somewhat harder to compare the variance of the methods on these plots.

Figure 7: Imatest relative error boxplot

Figure 8: MTF Mapper relative error boxplot

In figures 7 and 8, the relative difference (or error) is calculated as 100*(measured_mtf50 - expected_mtf50)/expected_mtf50. It is clear that Imatest 4.1.12 underestimates MTF50 values slightly for MTF50 values above 0.2 cycles/pixel; this pattern is typical of what one would expect if the MTF curve is not adequately corrected for the low-pass filtering effect of the ESF binning step (see this post). MTF Mapper corrects for this low-pass filtering effect, producing no clear trend in median MTF50 error over the range considered. We can plot the median measured MTF50 relative error for Imatest and MTF Mapper on the same plot:

Figure 9: Median relative MTF50 error comparison

Figure 9 shows us that the Imatest bias is not all that severe; it remains below 2% over the range of MTF50 values we are likely to encounter in actual photos. (NB: Up to July 30, 2015, this figure had Imatest and MTF Mapper swapped around).

So that illustrates bias. To measure variance we can plot the standard deviation at each MTF50 level:

Figure 10: Standard deviation of relative MTF50 error

Other than at very low MTF50 values (say, 0.08 cycles/pixel and lower), it would appear that MTF Mapper 0.4.18 produces more consistent MTF50 measurements than Imatest 4.1.12.

A final performance metric to consider is the 95th percentile of relative MTF50 error. By computing this value on the absolute value of the relative error, it combines both variance and bias into a single measurement that tells us how close our measurements will be to the true MTF50 value, in 95% of measurements. Here is the plot:

Figure 11: 95th percentile of MTF50 error

Of all the performance metrics presented here, I consider Figure 11 to be the most practical measure of accuracy.
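For readers who want to compute the same three summaries on their own measurements, here is a minimal Python sketch. The function names are hypothetical, the sample measurements are made up, and two details are assumptions on my part rather than a description of how the figures above were produced: the standard deviation is the sample (n-1) version, and the percentile uses linear interpolation between order statistics.

```python
import statistics

def relative_errors(measured, expected_mtf50):
    """Relative MTF50 error in percent: 100*(measured - expected)/expected."""
    return [100.0 * (m - expected_mtf50) / expected_mtf50 for m in measured]

def percentile(values, q):
    """Percentile (0 <= q <= 100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def summarise(measured, expected_mtf50):
    """Bias (median), spread (standard deviation) and 95th percentile of |error|."""
    errors = relative_errors(measured, expected_mtf50)
    return {
        "median_bias": statistics.median(errors),
        "std_dev": statistics.stdev(errors),
        "p95_abs": percentile([abs(e) for e in errors], 95.0),
    }

# Made-up measurements around a true MTF50 of 0.35 cycles/pixel.
print(summarise([0.343, 0.351, 0.349, 0.356, 0.346], 0.35))
```

Taking the percentile of the absolute error is what folds bias and variance into one number: a method with a small spread but a consistent offset is penalised just as a method with no offset but a large spread would be.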

Conclusion

It took quite a bit of effort on my part to improve MTF Mapper to the point where it produces more accurate results than Imatest. There are some other aspects I have not touched on here, such as how accuracy varies with edge orientation. For now, I will say that MTF Mapper produces accurate results at known critical angles, whereas Imatest appears to fail at an angle of 26.565 degrees. Given that Imatest never claimed to work well at angles other than 5 degrees, I will let that one slide.

I have also not included any comparisons to other freely available slanted edge implementations (sfrmat, Quick MTF, the slanted edge ImageJ plugin, mitreSFR). I can tell you from informal testing that most of them appear to perform significantly worse than Imatest, mostly because none of those implementations appear to include the finite-difference-derivative correction. Maybe I will back this opinion up with some more detailed results in future.

So where does that leave your typical Imatest user? Well, the difference in accuracy between Imatest and MTF Mapper is relatively small. What I mean by that is that these results do not imply that Imatest users have to switch over to using MTF Mapper; rather, these results show that MTF Mapper users can trust their measurements to be at least as good as those obtained by Imatest. And, of course, MTF Mapper is free, and the source code is available.

There are some fairly nifty features that I noticed in SFRPlus during this experiment. It appears that SFRPlus will perform lens correction automatically, meaning that radial distortion curvature can be corrected for on the fly. MTF Mapper currently limits the length of the edge it will include in the analysis as a means of avoiding the effects of strong radial distortion. But now that I am aware of this feature, I think it would be relatively straightforward to include lens distortion correction in MTF Mapper. So little time, so many neat ideas to play with ...