Tuesday, 24 April 2012

An accurate method for rendering synthetic images with a specified point spread function

Since I could not easily find this information when I needed it, I thought it would be helpful to write a short article describing the method I use to generate synthetic images with a specified point spread function.

Point spread functions

The point spread function of an imaging system is simply the impulse response of the system. In terms of digital cameras, this can be understood as the photo of an infinitely thin beam of light head-on, somewhat like taking a picture of a star. The camera pixels have a fixed, finite size, but the hypothetical beam of light is infinitely thin, so how does the camera sensor represent such an input? The short (but approximate) answer is that the average light over the physical area of the pixel is recorded by the sensor. Unfortunately, this approximation does not produce a well-sampled image, and would lead to severe aliasing. The solution is to apply some low-pass filtering to the incoming light to remove (or at least strongly attenuate) the spatial frequencies above the Nyquist limit of the sensor. In plain English: you have to blur the light before it reaches the sensor if you want to minimise sampling artifacts.

A suitable filter for this task must spread the light of our hypothetical infinitely thin beam over several adjacent pixels. This is fairly simple to prove, but requires considerable background knowledge of sampling theory and frequency domain representations of images, so I will have to treat this as a given fact in this article.

Such a blurring filter is found in most DSLR cameras, and is often called the Anti-Aliasing (AA) filter, the Low-Pass filter, or the Optical Low-Pass Filter (OLPF). Back to our example: if we capture an image of our thin beam with a sensor that has an AA filter, we will see something like this (magnified for visualisation here):
This image is a direct visualization of the actual PSF. An intensity cross-section of this image will look like this:
Lo and behold, it looks like a Gaussian distribution! This is, of course, only an approximation, but in my experience Nikon OLPFs are quite close to a Gaussian, so this is not a terrible assumption for DSLRs.

The x axis of the cross-section above is marked in units of pixels, with the origin being treated as the centre of the pixel to which this PSF belongs. In other words, if we are generating a synthetic image with a given PSF, we should place that PSF at the centre of each pixel that we wish to render. The sample PSF above is substantially more blurry than what you would find in a real camera, but even here we can see that the bulk of the distribution is only 4 pixels wide in terms of the desired target image's pixel size.

Direct Gaussian PSF simulation by blurring

Let us assume that we are interested in generating synthetic images with a Gaussian PSF. We want to render black rectangular targets on white backgrounds to obtain images that look like this:

The most naive method for rendering a synthetic image would be to take our target image (generated by drawing it in a paint program, for example) and blur it with the desired Gaussian PSF --- this is easily done with something like GIMP or ImageMagic. While this method may be suitable for a very quick-and-dirty implementation, it is far from ideal because is only samples the PSF at integer multiples of the pixel spacing, producing the following discretized PSF:
Clearly, this is a very crude approximation of the smooth PSF above, and this approximation is even worse for narrower Gaussians --- imagine if the width of the Gaussian is only 1 pixel in stead of 4 as shown here.

Direct Gaussian PSF simulation with oversampling

The first refinement would be to start with a higher resolution image. You could start by drawing your target at 4 times the final resolution. The Gaussian blur is then applied (after adjusting its scale) to the higher resolution image, followed by downsampling. In this case, downsampling would mean taking only every 4th pixel from the high resolution image after the blur has been applied. At 4x oversampling, the PSF approximation would look like this:
This is already much better. If we keep on increasing the oversampling factor, we should eventually obtain a good enough approximation, but our oversampled image quickly becomes unmanageable large (e.g., 128x oversampling would consume 16384 as much memory as our final image).

Inverse Gaussian PSF rendering basics

A better way of rendering the image is to reverse our thinking. First, imagine that we are not applying the blur to the high resolution image, followed by downsampling, but instead we position our PSF at the centre of a given pixel in our final image. This PSF is then mapped back onto the high-resolution image, where the PSF is used to compute a weighted sum of intensity valued taken from the high resolution image. Effectively, we choose our target pixel from the final image, and go back to the high resolution image to perform weighted sampling. In this form we are not gaining anything yet, but it should be understood that this formulation is functionally equivalent to the direct approach.

The next step is to abandon the manual generation of the high resolution image altogether. For simple targets, such as rectangles, we can easily provide an analytical description: if we know the position of the four corners of the rectangle, then we can easily test whether a point is inside the rectangle or not. Using this analytical description, we can now render the final image without keeping the entire high resolution image. We proceed to centre the PSF at each pixel in the final image. This PSF is then mapped back to the coordinate system in which the analytical description of the target is defined.

Now we have a few options. The first approach would be discrete sampling: for each pixel in the final image, project a grid of closely spaced (something like 16x oversampling or more) points around the pixel centre onto the analytical target. Test each point to see if it is inside the target (rectangle) or not, and evaluate the PSF at the same location to accumulate the weighted sum over this fine grid. This works, but Gaussian PSFs with large standard deviations (wide Gaussians) would require larger sampling grids, e.g., 7x7 pixels or more at 16x oversampling, so this strategy becomes inefficient.

Enter importance sampling

A much better strategy is to have a non-uniform set of points in stead of a grid of evenly spaced sampling points. The best distribution of these points would be one that matched the density of the PSF itself. For a Gaussian, we can use something like the Box-Muller transform or Moro's inversion to transform numbers in the range [0,1] to match the desired Gaussian distribution. Using quasi-random samples in stead of uniform or randomly spaced samples, we can get equally good results with fewer samples.

Thus, we follow the same steps as with plain inverse sampling: project the PSF centred at each pixel in the final image back onto the coordinate system in which the analytical target (rectangle) is defined. For each sampling point, test to see whether it is inside the target, or not. Note that because our sampling points already follow the PSF's density distribution, we should not weight our samples with the PSF again. Just count the number of samples inside the rectangle, and divide by the number of samples in total to obtain the correct intensity for each pixel.

This method works reasonably well, and has the advantage that it automatically adapts the sampling density according to the width of the PSF, while keeping the computational cost per pixel constant. I have had reasonably good results using 2025 samples per final pixel. But can we do even better with comparable computational cost?

Direct integration of the PSF over the target

The importance sampling method is really just Monte Carlo integration of the PSF over the region defined by the target rectangle. If you are happy with this statement, then you are ready for the next (and final) method: Directly computing the integral of the PSF over the rectangle.

It helps to visualise this step as follows: Imagine the PSF centered at a given pixel as a red dot superimposed on the target rectangle.
Suppose that the PSF is quite wide, with a standard deviation of around 6 pixels, i.e., a width of about 30 pixels. If we represent the rectangular region as a slightly raised platform above the background, and the PSF superimposed on top of that, we obtain something like this:

or from a different angle,
The final intensity of the pixel (centered at the peak of the PSF surface) should be proportional to the are under the curve. Note how the integral of the PSF surface is limited to the region defined by the rectangle.

Given that our PSF is a Gaussian, this boils down to evaluating the double integral of the PSF over a the rectangular region which serves as our target object. There is no closed-form solution to this integral. In one dimension, the integral of a Gaussian is often called the error function or erf(x). There is still no closed form solution, but good polynomial approximations exist.

If we thus restrict ourselves to radially symmetric Gaussian PSFs, we can decompose the computation of the double integral into two parts. Imagine a thin horizontal slice of our rectangle, illustrated with a green line here:
We only want to integrate inside the rectangle, so we can compute the intersection of the horizontal line with our rectangle to find the integration limits. The cross section of the PSF along this line is simply a Gaussian centered at our target pixel. The height of this horizontal Gaussian is determined by the vertical cross section of the PSF at our pixel centre.
We can directly compute the integral of the horizontal cross-section using the error function, and we simply scale the result according to vertical Gaussian cross section of our PSF (using the distance from our horizontal line to the PSF centre as argument).

If we proceed to slice our rectangle into such horizontal lines, it becomes clear that we are evaluating the numerical integral in the y direction. I settled on the adaptive version of Simpson's rule, mostly because it is straightforward to implement. This method is comparable to the importance sampling method above using 2025 samples per pixel in terms of run time, but produces significantly more accurate images.

Some care should be take to ensure that the adaptive integration does not "skip over" parts of the PSF, but the details can be found by reading the mtf_generate_rectangle source code.

This last method is quite accurate, yielding synthetic images that have PSFs that are extremely close to the expected analytical PSFs --- see part 2 of the accuracy validation experiment.

Variations and improvements

This method can easily be adapted to convex polygon targets, or even integration over elliptical targets. If more than one target must be rendered in a single image, then you just accumulate the PSF integral over each of the target objects. This will clearly become rather slow if you have many target objects, but spatial partitioning such as used in ray tracing should help to speed up rendering.

I have assumed a stationary PSF throughout, but it is straightforward to modify the method to vary the standard deviation of the Gaussian according to the pixel's position in the image. The numerical integration method can also easily accommodate Gaussian PSFs with differing horizontal and vertical standard deviations.

The more general case of a astigmatic Gaussian PSF, i.e., where the PSF contours are ellipsoidal, and the orientation of the PSF varies across the image, can be accommodated by rotating the targets around the centre of the PSF until the ellipsoidal contours are aligned with the x and y directions.

Monday, 23 April 2012

Validating MTF Mapper accuracy, part 2

Validation of synthetic images generated by mtf_generate_rectangle, revisited

The first step in the validation of MTF Mapper accuracy is to generate images with known MTF curves --- this is why I have developed (and included) the mtf_generate_rectangle tool. I have always suspected that the synthetic images generated by mtf_generate_rectangle were reasonably accurate, and this was confirmed in part 1 of the validation exercise.

Of course, having seen some evidence of systematic error (second figure of validation part 1), I could not leave it at that. Well, not for long, anyhow.

The "old" method used to render synthetic images relied upon discrete sampling of the PSF of a pixel, using a quasi-random numbers following the desired Gaussian distribution. This method has a fixed cost per pixel, and is reasonably accurate when enough samples are used.

An alternate method of rendering the synthetic images is conceptually very simple. The target is the desired black rectangular shape we wish to render, and we know that the image value should be zero (black) inside the rectangle, and 1 (white) outside. Thus, to determine the intensity of an arbitrary pixel, we place the Gaussian PSF at the centre of our pixel, and measure the integral of the PSF over the region defined by the rectangle. Since the rectangle is finite, and the surrounding white region is infinite, I choose to integrate over the inside of the rectangle. This is simply a double integral of the PSF, with the limits of integration defined by the target rectangle.

Urgh. Double integrals. Well, I did say it was conceptually simple. I will leave the nasty details for another post (now available), but suffice it to say that this particular double integral can be evaluated in closed form using the erf() function in one axis (I chose x), leaving us to use numerical integration (adaptive Simpson's rule) on the other axis.

Since there is numerical integration involved, we still have to choose some parameters. I went with parameters that produce good results, but still result in run times that are comparable to the old discrete sampling method.

At this point, I assume that you have read part 1. First up, we compare the measured edge spread function (ESF) of the synthetic image with that of the expected analytical ESF, which happens to be an erf() function:
No surprises here. At this scale, the two functions are expected to appear identical. Next is the plot of the difference between the measured and expected ESFs:
This looks much better than the comparable plot of the "old" discrete sampling rendering method. To put this in perspective: the largest absolute difference is now 4/65536. This in no coincidence, since both the measured and the analytical ESFs were sampled as 16-bit unsigned integers. Differences of up to 1/65536 can probably be labelled as "rounding errors" --- you can imagine that if the measured ESF was rounded up to the nearest 16-bit number, and the analytical one happened to be rounded, then you will measure an error of 1/65536.

With the excuses out of the way, we can compare the old discrete PSF sampling and new PSF integral methods side-by-side:
The integral method thus produces significantly more accurate edge profiles: the difference between the measured ESF and the analytical ESF is very close to the quantization error. There is still some structure visible in the difference plot, but at this point I am going to claim that this difference is negligible.

Moving on to the SFR curve, we can plot the difference between the measured SFR and the analytical SFR:
Note that the scale has been magnified 100x compared to the plot in part 1, but we still see very little structure in the SFR difference plot. In the frequency range 0 to 1 cycle per pixel, the maximum absolute difference is only 0.0002967%.

Computing the MTF50 value from the SFR data reveals that the measured MTF50 differs from the analytical MTF50 by 0.000351%.

The whole experiment was repeated on a synthetic image with an expected MTF50 value of 0.06 (just like in part 1). The results were, surprisingly, somewhat worse. For example, the maximum absolute difference in ESF curves increased to 5/65536, which is still negligible, but the difference curve was visibly structured:
The maximum absolute error in the SFR curves amounted to 0.13%, which is much higher than the 0.0002967% seen in the MTF50=0.5 case above, but still substantially better than the 0.356% seen with the old discrete PSF sampled rendering method. This peak error in the SFR curve occurs after the MTF50 frequency, though, so that the difference between the measured MTF50 and the analytical MTF50 came down to only -1.496e-6%, which is negligible, and substantially better than the old PSF sampling result.

These results demonstrate that the synthetic images generated with the mtf_generate_rectangle tool are valid, and that the actual ESF/PSFs of the synthetic images agree with the expected analytical ESF/PSFs to such a high degree that the images can safely be used as ground truth in evaluating the accuracy of an SFR (or MTF50) measurement tool.

ps: No binaries have yet been released with the PSF integral rendering algorithm. You will have to compile from source checked out from the repository to obtain the new rendering algorithm, for now.