Wednesday, 28 March 2012

Validating MTF Mapper accuracy, part 1

Validation of synthetic images generated by mtf_generate_rectangle

The first step in the validation of MTF Mapper accuracy is to generate images with known MTF curves --- this is why I have developed (and included) the mtf_generate_rectangle tool.

But before I can use the mtf_generate_rectangle tool, I have to test whether the images generated by the tool indeed have the desired edge profile. Fortunately, this is relatively straightforward if the synthetic images are generated using a Gaussian Point Spread Function (PSF). The PSF defines the kernel with which each pixel in the synthetic image is sampled. A step edge (such as the black-> white edges used in the slanted edge method) sampled with such a Gaussian PSF will produce an Edge Spread Function (ESF) with a known functional form. The slanted edge method usually computes the PSF by taking the derivative of the ESF, i.e., the ESF must therefore be the integral of the PSF. For a Gaussian PSF, we expect an ESF in the shape of the integral of a Gaussian, also called the error function, or erf. 

To test whether mtf_generate_rectangle produces correct images, all I have to do is measure the empirical ESF, and compare it to the expected analytical erf function. With mtf_generate_rectangle I thus generated a synthetic image with an expected MTF50 value of 0.5, and measured the ESF empirically using mtf_mapper. Then I plotted the measured ESF along with the expected analytical ESF, which will have the form erf(x/0.53002):
At this scale, the measured ESF appears to match the analytical ESF perfectly, but we have to look more closely. A better visualisation is to plot the difference between the two ESFs:
I chose a percentage scale for the y-axis to better convey the magnitude of the differences. We can see from the plot that the difference between the two curves is less than 0.3% in absolute value, relative to range 0.0 to 1.0 used to represent pixel intensities. I expected the random, noisy appearance of the differences in the middle, but the apparently systematic error around -1.5 and 1.5 is slightly worrying.

This difference is slightly larger than I had hoped for, but how much does this affect the MTF50 estimates of mtf_mapper? Take the ESF data sets, and bin them according to the same method used in mtf_mapper, take the finite difference to obtain the PSF, and finally pass this through an FFT. The resulting SFR curves are (again) visually indistinguishable, so here is the difference between the SFR curves instead:
Please note the scale of the plot, and keep in mind that SFR values are in the range 0.0 to 1.0. In the frequency range 0 to 1 cycles per pixel, the maximum absolute error is on the order of 0.13%. The impact of this error on the resulting MTF50 value is also small, resulting in a difference of 0.0566% in MTF50 values between the empirically measured ESF and the expected analytical ESF.

Ok, so what happens at different MTF50 values? I repeated the test by generating a synthetic image with an MTF50 value of 0.06 --- this is a pretty blurry image. The maximum absolute difference between the measured ESF and the analytical ESF came in at 0.356%, and the absolute error in the SFR curve was about 0.8%. The larger errors appear to come from higher frequencies, i.e., because the area under the SFR curve is smaller than that of the SFR curve for the MTF50=0.5 edge above, the relative magnitude of the high-frequency noise (quantization errors?) has increased. Still, the impact on  measured MTF50 was small, differing from the MTF50 derived from the analytical ESF by only 0.066%.

So why do we see errors at all? Why does mtf_generate_rectangle not produce "perfect" results? This is down to the stochastic sampling method used to measure the overlap between a rectangle fragment and the Gaussian PSF centered at each pixel. The default setting is to use 2025 samples per pixel, which is a lot, but not sufficient to ensure perfection. By increasing the number of samples to 31329 per pixel, and repeating the above experiments, the difference between the MTF50 value computed from the measured ESF, and that computed from the analytical ESF is reduced to 0.0147% and 0.0108% for the MTF50=0.5 and MTF50=0.06 cases, respectively. It is therefore reasonable to suppose that these errors will continue to decrease as the number of samples per pixel is increased, eventually converging on the "perfect" case. 

I have an idea on how to change the sampling strategy used by mtf_generate_rectangle (performing numerical integration directly, rather than with a Monte Carlo method), which may turn out to be more accurate at a lower computational cost. But for now, I am keeping the number of samples at 2025, since this appears to offer sufficient accuracy for my purposes.

This experiment demonstrated that the ESF of a synthetic image generated by the mtf_generate_rectangle tool is close to the expected analytical ESF. The small differences observed have no meaningful impact on the accuracy of the resulting SFR curves, nor the computed MTF50 values. In further validation experiments, the synthetic images will be taken as ground truth to measure the accuracy of various slanted edge method implementations.

Thursday, 15 March 2012

AA filters, RL-deconvolution sharpening and optimal raw conversion

When I measured the MTF of the D40 AA filter, I speculated that the "optimal" radius (standard deviation, really) for RL-deconvolution sharpening for the D40 would be 0.57. (See Measuring D40 AA filter MTF)
We can perform an experiment to see what "optimal" sharpening looks like, and what happens when we sharpen more than that. First, generate a synthetic image with a known MTF50 value, say, about 0.38 cycles per pixel. Here it is:
This image has an exact Gaussian point spread function, i.e., it does not include the effects of diffraction. I added realistic sensor noise (based on actual measured D7000 sensor parameters), so this image could just as well have come straight from a D7000 (barring Bayer filters, and assuming the AA filter is slightly less blurry). This image is then sharpened with the default settings of the RL-deconvolution sharpening found in RawTherapee. In short, the RL-deconvolution algorithm will undo the effects of a known low pass filter very well. Here is the sharpened version of the above image:
Notice how there is a white halo around the black rectangle (click to view full image). The default radius for the RL-deconvolution sharpening algorithm in RawTherapee is 0.75. This is too large a radius, and over-sharpens the image a bit, causing the appearance of haloes. The haloes I can live with, but notice how much more pronounced the noise in the white parts of the image has become. Looks familiar, doesn't it ... ? If I process the same image (two up, MTF50 of 0.38) with the "optimal" parameters for the D40, i.e., radius set to 0.57, I get the following image:
This image looks a little bit sharper than the input image, but the noise certainly does not appear as pronounced as in the over-sharpened image (radius 0.75). So here is the summary of the MTF50 values:
Original image                        -> MTF50 = 0.38
Over-sharpened image           -> MTF50 = 0.6
Optimally sharpened image   -> MTF50 = 0.45

Typically, we would consider an MTF50 value of 0.5 to be the upper limit before severe aliasing sets in. Take a close look at the edges of the over-sharpened image: observe the "waves" along the edges of the black rectangle. These are textbook examples of aliasing: frequencies that are too high to be represented in the target resolution will masquerade as lower frequency components. An MTF50 value of 0.6 is definitely too high, and sharpening artefacts are evident. (Note: frequencies above Nyquist are measured here, but this is OK because the slanted edge method produces a over-sampled edge spread function at a much higher sampling rate than the pixel pitch)

The optimally sharpened image has no "waves" on its edges, i.e., it does not suffer from aliasing. Because the sharpening is less severe, and has less impact on high frequencies, the noise is also amplified less than in the over-sharpened image.

So that leaves an important question: Why is the default in RawTherapee to use a radius of 0.75? Adobe LightRoom appears to use a default radius of 1.0 (I have heard that LightRoom uses RL-deconvolution to perform sharpening, but do not quote me as a source on this). Surely this will lead to aliasing, as could be seen in the over-sharpened image above?

Well, the "original" synthetic image above already had a very good MTF50 value, higher than what I believe is possible with a Nikon (non D800E) camera. But more importantly, this synthetic image only modelled two aspects: the low-pass filter effect of the AA filter, and sensor noise. If one throws in a diffraction MTF and a defocus MTF (maybe in a next post?) then the overall sharpness, and the MTF50, will decrease quite a bit. In other words, real images can probably tolerate sharpening by a radius of 0.75, and maybe even 1.0, because they are not perfectly focused, and typically do not contain perfect razor blade edges.

But the noise component remains because this happens at the sensor level, after all the other resolution-robbing distortions (lens MTF, sensor AA MTF, defocus). So even if you can justify sharpening with a radius of 1.0 because of these other factors, you still end up with significant noise magnification.

So here is the bottom line: In theory, you can perform sharpening with RL-deconvolution to mostly reverse the blurring effects of the lens and sensor AA filter --- but because the noise is not affected by any of those distortions, you end up sharpening the noise too, i.e., rather than getting a more faithful representation of the noise (not that you would want that in the first place), you are magnifying the effects of noise from the first time you apply any kind of sharpening.

One possible solution is to improve the raw converter directly, which may be what converters such as ACR / LR do in any case. My proposed strategy would be as follows:
  1. Subtract a dark frame from the raw image
  2. Perform flat-field correction to remove the PRNU effects (Pixel Response Non-Uniformity)
  3. Perform noise suppression on the raw Bayer channels (before Bayer demosaicing) to yield image I1
  4. Perform sharpening using RL-deconvolution on I1 to yield I2
  5. Perform Bayer demosaicing using both I1 and I2
  6. Profit!
There are some potential problems. For example, if you perform sharpening before Bayer demosaicing, you are effectively removing the AA filter, hence you will increase colour Moire artefacts. This is why I propose to use both the sharpened and the unsharpened image during Bayer demosaicing, hopefully getting the best of both worlds in the process.

First attempt at determining what the Nikon D40 anti-aliasing (AA) filter does to your image

Update: Please read follow-up article to this one.

It would appear that the same question pops up every time a new higher resolution sensor is announced: "Will my lenses be sharp enough for an X megapixel sensor?"
I was curious to see exactly what happens when the lens is much sharper than the sensor, so I set up an experiment. The idea is to select a sharp lens using a D7000 body, and then evaluate the performance of the lens on a lower resolution body, a D40 in this case. A by-product of this experiment will be a pretty good idea of what the anti-aliasing (AA) filter (also called the low pass filter) on the D40 does to the image.

First, I took my sharpest lens, a Sigma 17-50 mm f/2.8. At 35 mm f/5, this lens can resolve more than 60 line pairs per mm in the centre, which is a bit better than the Nikkor 35 mm f/1.8 prime (about 57 lp/mm in the centre). It is possible that this lens is even sharper than that, but as you will see, the D7000 sensor may be the limiting factor.

To construct a high quality, high-contrast straight edge I took a normal razor blade (2-sided variety), and blackened it with candle soot. The blade was mounted on a sheet of white paper (high quality, not normal copier paper). The camera was positioned at a distance of about 1.4 m from the target, to ensure we are in the "sweet spot" of lens performance. A flash was used to ensure maximum contrast. Here is what the raw Bayer image data from the D7000 looks like (upscaled for display)
On the D7000, focus was performed with LiveView, and the sharpest shot of a sequence was selected. On the D40, I started with AF, and then bracketed focus manually, again selecting the sharpest shot from the sequence.

Before we look at the details of the curve shapes, first some justification: I do not have the equipment to measure if the Sigma lens is diffraction limited, but if I can measure an MTF50 of 60 lp/mm on a D7000 sensor, then I should be safe on the D40 sensor, which has a theoretical Nyquist limit of 64 lp/mm. I have thus assumed that lens resolution is not the limiting factor for the D40, i.e., that the MTF curve we measure is really the D40 AA filter MTF curve.

The razor edge was then processed with MTF Mapper to extract MTF curves of the green photosites (i.e., no demosaicing was applied, no interpolation, just the raw green photosite data). Plotting the MTF curves in cycles per pixel gives us the following:
The red curve (which looks like a straight line here) is the diffraction-limited MTF curve for the selected aperture (f/5), plotted against the scale of the D40 pixels (7.8 micron). The real diffraction-limit MTF is independent of pixel size, but because the plots are in terms of cycles per pixel, the diffraction curve would be "lower" (i.e., steeper slope) if we plotted it for the D7000 pixel pitch (not shown). The diffraction limit is the absolute physical upper resolution limit imposed by the lens aperture.

The orange curve is the measured MTF curve for the D40. The dashed blue curve is the MTF curve of a Gaussian blur kernel with the same MTF50 value as the D40 curve, i.e., 0.294 cycles/pixel. The D40 curve is quite close to the Gaussian, which would mean that sharpening the D40 image with an inverse Gaussian, i.e., Richardson-Lucy (RL) deconvolution such as implemented in RawTherapee, is a really good match for the D40.

The black curve is the measured MTF curve for the D7000. This curve is quite different from the D40 curve. There are two possible explanations: either the D7000 AA filter has a different MTF curve shape (compared to D40), or the lens is not quite sharp enough to reveal the sensor AA filter MTF completely. I suspect the latter is more likely. At any rate, the interesting thing here is that the contrast is much lower than the D40 at around 0.15 cycles per pixel. In other words, per-pixel microcontrast may appear lower than the D40, but this will probably be rather subtle because the MTF50 values are so close, which can be observed at around 0.29 cycles per pixel.

That wraps it up for comparing per-pixel sharpness (which would be analogous to comparing 100% crops for sharpness). What about real-world sharpness? We can plot the same data on a line pairs per mm scale:
Here we can see that the D7000 performs much better than the D40. The measured MTF50 values are 60.47 lp/mm for the D7000, vs 37.64 for the D40. The D7000 thus captures images that are sharper by a factor of 1.61 than the D40. Now for the punchline: the linear increase in resolution going from the D40 to the D7000 is 1.649 (7.8 micron pixels vs 4.73 micron pixels). From this, I deduce that the Sigma 17-50 mm f/2.8 lens is sharper than the D7000 sensor. This is of course only in the centre of the lens. This is further corroborated by Sigma's published MTF curve for this lens. They claim roughly 0.9 contrast at 30 lp/mm, I measured 0.77 on the D7000, and 0.63 on the D40 at 30 lp/mm.

The other feature that I would like to point out is the contrast at 40 lp/mm. The D7000 obtains a contrast value of 0.68, D40 only 0.46, thus the D7000 manages to pull out about 49% more contrast at that resolution. This will produce a lot more "pop" at 6x9 print scales.

Lastly, my take on the Nikon AA filters: It would appear that the D40 AA filter has an MTF50 of around 0.3 cycles per pixel (allowing for experimental error), which produces a blur that is very close in shape to a Gaussian. This seems to put the effective resolution of the D40 at about 60% of the resolution implied by the pixel pitch, which is lower than the 70% figure often quoted for sensors with Bayer and AA filters. The D7000 seems to be heading in the same direction, i.e., an MTF50 of 0.3 cycles/pixel, or a peak resolution of about 63.4 lp/mm.

Norman Koren stated that 0.33 cycles per pixel is "pretty decent for digital cameras" ( He must have been referring to Canon bodies :)

... Some time passes ....
Ok, so I speculated that the D40 MTF curve looks like a Gaussian, and that the D40 AA filter may have an MTF50 value of 0.3. After thinking about this a bit, I had another idea: If the AA filter MTF is a Gaussian, and we assume that the lens is diffraction limited (which is likely for the Sigma 17-50 mm, as explained in above), then we should obtain the diffraction limit MTF if we deconvolve the measured D40 MTF curve with the best-fitting Gaussian MTF.

Since that paragraph was unhelpfully technical, let me try to explain in a more straightforward way: With digital SLRs, you have to sharpen the image to counteract the effects of the AA filter. So if the AA filter function is a Gaussian (i.e., shaped like the blue dashed curve in the plots in this post), then optimal sharpening would completely "undo" the effect of the AA filter. If our lens is diffraction limited, we expect to measure the diffraction-limit MTF (the dashed red curve in the plots), but obviously our sensor will "blur" that curve. So if we use a Gaussian MTF to "sharpen" our image, and the result is the diffraction curve, then our lens must have been diffraction limited.
The following plot illustrates this:

From this plot it would appear that the D40 AA filter is closer to a Gaussian with an MTF50 of 0.33, or that the Sigma lens is almost, but not quite, diffraction limited.

This would put the "optimal" radius for an unsharp mask (or standard deviation for RL deconvolution) at 0.57 pixels for the D40. In other words, this amount of sharpening would not introduce "oversharpening" artifacts in the sharpest part of the image. Other parts may require more sharpening (larger radius or standard deviation), but then you run the risk of oversharpening the sharp parts.

One more thing: above I measured the MTF50 to be 0.293 for the D40, and speculated that the AA filter MTF can be approximated by a Gaussian with an MTF50 of 0.3, and here we see that a Gaussian with MTF50 of 0.33 appears to be a better fit. Why the difference? Well, a slight defocus blur will also be approximately Gaussian in shape, i.e., a small amount of defocus blur will likely be present in all MTF measurements. This defocus blur will lower the measured MTF, hence requiring a more "sharpening" using the deconvolution method above.

Update: Please read follow-up article to this one.

How dcraw demosaicing skews sharpness, part two

In the first post on this topic I provided some evidence of orientation-depended distortion of sharpness caused by the Bayer demosaicing algorithms found in dcraw. Now we will look at some samples from real photos.

I used MTF Mapper to produce maps of both Sagittal and Meridional MTF50 values of a Nikkor 35 mm f/1.8 AF-S G lens mounted on a Nikon D7000 body. The actual photo looks like this:
I captured this shot in both raw and JPEG. The .NEF was then processed with dcraw v9.10, with various demosaicing options. First up, the results for Adaptive Homogeneity-Directed (AHD) interpolation (dcraw -q 3):
This is the image that started this whole thread. While I was developing the MTF surface map rendering option for MTF Mapper, I obviously used synthetic images to see whether MTF Mapper was producing the desired output. Once I was satisfied that the output look good, I tried running MTF Mapper on some .NEF shots developed with dcraw. The strange cross shapes that appeared in both the Meridional and Sagittal MTF maps did not really look like what I was expecting, which got me thinking about possible causes.

Update: In the years since I first posted this article, I have learnt a lot. In particular, the anisotropy of the photosite aperture must be considered when interpreting results captured with an actual camera, as opposed to synthetic images generated with isotropic PSF. See my my article on anisotropy for more information.

I initially though that it was caused by something in the CMOS sensor pixel geometry. Maybe the pixels were not really square, and some edge angles thus produced biased MTF50 estimates (Update: actually, square pixels are one of the main contributors to MTF50 varying according to edge angle). I have yet to test this, but it seems that the demosaicing algorithms are responsible for the bulk of the distortion. Next, I modified the slanted edge MTF estimation algorithm used in MTF Mapper to use only pixels from a specific filter colour, e.g., only the red sites. Using the dcraw -D option to extract the raw Bayer image from the .NEF files, I could then directly measure only the red channel MTF without going through a demosaicing algorithm first.

The main disadvantage of this approach is that the red and blue filters cover only 25% of the pixels, meaning that effectively I have only 25% of the usual number of intensity observations around each edge. This has the effect of increasing the uncertainty in MTF50 estimates, and I still want to look at ways of improving performance --- maybe a test chart with larger squares will be the best solution. At any rate, here is the MTF surface of the raw red channel, without any demosaicing:
Notice how the meridional MTF map now appears much more symmetrical. The left side still looks a bit low, but then again, I did not take every possible precaution to ensure that the sensor is perfectly parallel to the test chart. Maybe the lens is even slightly decentered. Time will tell.

The sagittal MTF map also looks much more radially symmetrical. Although it is somewhat skew, it looks as if there is a donut-shaped region of maximum sharpness, something that is certainly possible if the field of this lens is not perfectly flat. The most important feature, though, is that the cross-shaped pattern seems to have disappeared, like we may expect based on the results derived from the synthetic images in the first post.

Next, I extracted the red channel from the in-camera JPEG. There are several important things to take into account with JPEGs, though. The chroma channels are downsampled by a factor two, meaning that MTF values should actually decrease, as this will involve a low-pass filter. Extracting only the red channel of a JPEG involves re-mixing the chroma and luminance channels, and then extracting the red channel from the resulting RGB image. This means that both the blue and green MTF values will be mixed in with the red channnel MTF, so if the red channel is actually softer than the other two, then the JPEG process will appear to increase the (relative) MTF50 values somewhat. Anyhow, here is the image

The most disturbing thing about these MTF maps is the grid of bumps that appear. I cannot explain their presence at the moment, but I will try to get some answers at some point. We can see that the centre of the meridional MTF map seems broader, and the sagittal map appears quite flat (other than the local bumps). There is only a very slight hint of a horizontal and vertical stripe through the centre --- more testing will be required to see if this is just random coincidence, or a real feature. Anyhow, the in-camera JPEG does not appear to suffer from the distortion we saw with the dcraw AHD demosaicing interpolation, so whatever the camera is using, it seems to be of better quality.

In future posts I will add some more plots, specifically for other popular raw convertors.

How dcraw Bayer demosaicing skews MTF

I have run some experiments with dcraw to investigate the effects of various Bayer demosaicing algorithms on image sharpness.

A very basic Bayer demosaicing algorithm (see here for a brief introduction) will effectively act as a low-pass (i.e., blurring) filter because it interpolates the missing colours from nearby neighbours. In smooth areas, this is actually beneficial, since it will help to suppress image noise. On sharp edges (or detailed regions) this will obviously introduce unwanted blurring. Smarter demosaicing algorithms, such as Pattern Pixel Grouping (PPG) or Adaptive Homogeneity-Directed (AHD) interpolation have crude mechanisms to detect edges. They can then avoid interpolating values across the edges, thus reducing blurring in areas with high detail or sharp edges. Unfortunately, because they use only local information (maybe 1 or two pixels away), they cannot estimate edge orientation very accurately, and hence appear to introduce varying amounts of blurring, depending on the orientation of edges in the image.

I set out to investigate this. The first step is to generate synthetic images for each channel (Red, Green, and Blue). These images are generated with a known edge orientation, and a known edge sharpness (MTF50 value). The gray scales ranges of these images have been manipulated so that the black level and the intensity range differs, to match more closely how a real Bayer sensor performs. A little bit of Gaussian noise was added to each channel. Next, the three channels are combined to form a Bayer image. Each pixel in the Bayer image comes from only one of the three channels, according to a regular pattern, e.g., RGRGRG on even image rows, and GBGBGB on odd rows.
Here is an example:
The red, green and blue synthetic images were generated with different MTF50 values. In this case, red had edges with MTF50 = 0.15, G = 0.25, and B = 0.35. This is somewhat exaggerated, but not too far from values that I have observed in practice. Real lenses will typically have lower sharpness in the red anyway. The simplest case is if the lens is diffraction limited, at which point the longer wavelength of red light leads to more diffraction and lower sharpness. Longitudinal chromatic aberration (clearly visible in my Nikkor 35 mm f/1.8 lens) can cause the red channel to become slightly defocused, reducing sharpness.

 Using a set of these synthetic images with known MTF50 values, I could then measure the MTF50 values with and without Bayer demosaicing. The difference between the known MTF50 value for each edge, and the measured MTF50 value, is then expressed as a percentage (of the known MTF50 value). This was computed over 15 repetitions over edge angles from 3 degrees through 43 degrees (other edge orientations can be reduced to this range through symmetry). The result is a plot like this:

This clearly shows a few things:

  1. Using the original synthetic full resolution red channel image (before any Bayer mosaicing) produces very accurate, unbiased results (blue boxes in the plot). The height of each box represents the width of the middle-most 50% of individual measurements, i.e., the typical spread of values.
  2. Both PPG (black) and AHD (red) start off with a 5% error, which increases as the edge angle increases from 3 through 43 degrees. This upwards trend is the real problem -- a constant 5% error would have been fine. The values are positive, i.e., sharpness was higher than expected. This is because the green and blue channels both have higher MTF50 values, so some of that sharpness is bleeding through to the red channel.
  3. A variant of the MTF measurement algorithm, designed to run on the raw Bayer mosaiced image, was run on the red sites only -- see the green boxes. Since this means only 25% of the pixels could be used, the effect of noise is magnified (remember, the synthetic images had a little bit of Gaussian noise added). This increases the spread of the errors, but the algorithm remains unbiased, regardless of edge orientation.

The data is quite clear: both PPG and AHD distort the MTF50 measurements in an orientation-dependent manner. These two algorithms are the most sophisticated ones included with dcraw. Other raw converters, such as ACR, may perform differently, and I intend to demonstrate their performance in future posts. I have not had time to investigate this in great detail, but I suspect that the AHD implementation in LibRaw does not produce the same result, based on a quick-and-dirty test.