It is a truth universally acknowledged that a single man in possession of a large aperture lens must capture images with as shallow a depth of field as he can manage. (Jane, please forgive me ...).
All kidding aside, the downside to employing a shallow depth of field is the way in which it accentuates even the smallest focus error. By focus error I mean that the apparent position of the focus plane is not where the photographer intended. And of course there is no such thing as a focus plane, since in reality it is a curved surface, but for convenience I will use the term focus plane here.
Even if we accept the convenient notion of a focus plane, we still have not really explained clearly what a focus plane is. One way of describing the focus plane would be to say that it is the distance at which the circle of confusion is minimized (as projected onto the image sensor). Personally, I am not a fan of using the circle of confusion to measure focus (or defocus, to be more precise), mostly because it is hard to measure the circle of confusion. The other difficulty with the notion of the circle of confusion is that it conjures up the image of these perfect little circular discs being formed on the image sensor, which is a rather crude simplification that does not take into account the actual point spread function (PSF) of the imaging system.
A much more convenient (to me, at least) way of defining a focus plane is to do so in terms of MTF, since this explicitly acknowledges the full PSF. This idea has been proposed recently by Jim Kasson (example from his blog, sample discussion from DPR forum), using MTF50 as the final criterion. It only takes a little bit of thought to see that circle of confusion diameter and MTF50 are both approximations of the degree of sharpness of an image; note that using MTF50 might discard some of the useful information that we could extract from the full MTF curve, but it is convenient to have only a single value to express our measure of "sharpness" (I am deliberately avoiding the term "resolution" here, since the slanted edge method measures MTF, not resolution).
We can plot the "sharpness" measure of our choice (MTF50) as a function of distance from the camera to produce a curve like this one:
Figure 1: An example of MTF50 at a fixed position on a hypothetical image sensor, plotted as a function of the distance between the sensor and the slanted edge target (it happens to be a 50 mm f/1.8 lens at f/2.8)
Figure 1 illustrates the MTF50 that we would measure as we move our slanted edge target relative to the camera. Now that we have something to visualize, it is easy to explain what a focus plane is: the hypothetical plane that is parallel to our image sensor, located at the distance that maximizes MTF50 (i.e., the peak of the curve, as indicated by the green line in Figure 1). Similarly, we could define depth of field (DOF) as the length of the interval between the two dashed gray lines, where the dashed gray lines correspond to the distances from the camera at which MTF50 = 0.15 cycles per pixel. Note that this is an arbitrary re-definition of DOF only for illustration of the concept as it applies to the curve shown in Figure 1.
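The two definitions above can be made concrete with a small sketch. The sample values below are invented purely for illustration (they are not the data behind Figure 1), the function name is hypothetical, and the threshold crossings are found by simple linear interpolation between samples, which is only one of several reasonable choices.

```python
def focus_peak_and_dof(distances, mtf50, threshold=0.15):
    """Return (peak distance, near limit, far limit) for a sampled MTF50 curve.

    The near/far limits are the distances at which the curve crosses
    `threshold`, found by linear interpolation between adjacent samples.
    """
    peak_idx = max(range(len(mtf50)), key=lambda i: mtf50[i])

    def crossing(i_lo, i_hi):
        d0, d1 = distances[i_lo], distances[i_hi]
        m0, m1 = mtf50[i_lo], mtf50[i_hi]
        return d0 + (threshold - m0) * (d1 - d0) / (m1 - m0)

    near = None
    for i in range(peak_idx, 0, -1):          # walk towards the camera
        if mtf50[i - 1] < threshold <= mtf50[i]:
            near = crossing(i - 1, i)
            break

    far = None
    for i in range(peak_idx, len(mtf50) - 1):  # walk away from the camera
        if mtf50[i + 1] < threshold <= mtf50[i]:
            far = crossing(i, i + 1)
            break

    return distances[peak_idx], near, far


# Hypothetical samples: distance in mm, MTF50 in cycles per pixel.
distances = [1900, 1920, 1940, 1960, 1980, 2000, 2020]
mtf50 = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]

peak, near, far = focus_peak_and_dof(distances, mtf50)
print(f"focus distance: {peak} mm, DOF: {far - near:.1f} mm")
```

With real measurements the peak would be taken from a smooth curve fitted to the samples rather than from the largest raw sample.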
In this article, I will use the terms "focus peak distance", "focus distance", and "focus plane position" interchangeably to refer to the distance corresponding to the green line in Figure 1. As you can probably deduce from the title, this article deals with the measurement of this focus distance value using MTF Mapper.
A new test chart
The "classic" MTF Mapper test charts proved to be inadequate when it came to accurate measurement of the focus peak distance. Firstly, none of the older charts provided the required density of slanted edges to obtain a robust measurement. Secondly, the older charts did not allow MTF Mapper to convert image space (pixel) coordinates to real-world coordinates (in mm). The new chart is illustrated in Figure 2:
Figure 2: The new "focus" MTF Mapper chart type
This chart is a 45-degree slanted chart design. The camera should be pointed towards the centre of the chart; the centre of the chart falls on the dashed line, halfway between the two central fiducials (the large black dots). The chart should be tilted at 45 degrees around the axis illustrated with the dashed (horizontal) line. Notice that the large black bars down the centre of the chart decrease in size towards the bottom of the chart --- that end of the chart should be closer to the camera (so that perspective ends up distorting the bars to have roughly the same size in the final image, although this is not critical). Figures 3 (a) and (b) illustrate two possible chart orientations relative to the camera.
Figure 3a: One possible set-up, with the chart tilting top-to-bottom at 45 degrees
Figure 3b: An alternative set-up, with the chart tilting left-to-right at 45 degrees
The chart does not have to be positioned in a portrait orientation; landscape orientation works just fine (compare Figure 3(a) to 3(b)). The camera can also be used in either landscape or portrait orientation, as long as you can fit most of the chart in the image. If you happen to have a sub-optimal combination of focal length, chart size and distance from the chart, then you may crop the chart a little if you must. It is important that the 45-degree tilt of the chart is around the correct axis (running through the dashed line of the chart shown in Figure 2), and that the chart is close to square with the camera around the other two axes.
It is critical to print the chart at the correct size (without "fit to page" scaling). MTF Mapper relies on the fact that distances on the chart are correct --- note the four "+" markers near the corners of the chart, which you can use to verify that your print came out at the right scale. Of course, if you do print with some page scaling, or you print the A3 chart on an A4 page, then MTF Mapper will still work, but the distances that it reports will no longer be accurate. Lastly, note that the fiducials are coded to allow MTF Mapper to identify the correct chart size, so if you pay close attention, you will see the differently sized charts are not just scaled copies of a base chart size.
This is a manual-focus chart only. The camera should preferably be focused (manually) on the centre of the chart, i.e., roughly at the point halfway between the two central fiducials (black dots). The chart features around this point are not suitable for auto-focus use because there is no way to tell what part of the chart (in the general region around the centre) the camera chooses to focus on. Just to clarify: this chart should not be used to perform PDAF micro-adjustment / fine tuning.
A minor digression: Although this chart is not suitable for use with auto-focus, nothing prevents you from using a removable overlay target. You could, for example, use a second printed page containing only a large black rectangle (like the central rectangle in the MTF Mapper "perspective" chart type) to perform the auto-focus operation. Lock the focus, remove the overlay, and capture the image of this new chart. If you plan ahead, you could use fridge magnets to make the process of adding/removing the auto-focus overlay target more convenient. Or you could wait for the eventual release of a new auto-focus MTF Mapper chart I plan on introducing.
A new output type
To process images of the new "focus" chart just introduced, a new output type has been added to MTF Mapper. As of MTF Mapper version 0.5.16, this output type is not compatible with other typical output types produced by MTF Mapper, i.e., when you choose the "focus position" output type, then you should not enable any other output types (they will not produce usable output). This is a temporary inconvenience, and I aim to fix this sometime. The corresponding command-line switch for this new output type is "--focus"; it produces a file called "focus_peak.png".
An example of this output type is illustrated in Figure 4:
Figure 4: An example of the "--focus" MTF Mapper output. Note that the chart was oriented as shown in Figure 3b
The curve illustrated in Figure 1 is overlaid on top of the captured image by back-projecting the curve onto the image using the estimated camera perspective transformation, to produce the green curve in Figure 4. The "height" of this curve is simply scaled to fill the image of the chart, so the peak of the curve will always be on the midline of the black slanted edge bars.
The dark blue line illustrates the intersection of the hypothetical focus plane with the surface of the chart. In the centre of the image illustrated in Figure 4 we see an orange-ish coordinate origin marker (four outwards pointing arrows), representing the physical centre of the chart. The red reticule (with its four inwards pointing arrows) indicates the centre of the captured image; together these two features provide feedback for centering the camera to the chart.
Right under the peak of the green curve we see two values reported in cyan-coloured text. The first line is the MTF50 value measured at the peak, and the second is the focus plane position relative to the centre of the chart. In other words, MTF Mapper subtracts the estimated position (distance from the camera) of the centre of the chart from the focus peak distance to compute the value displayed in this second line below the green curve.
Lastly, it may be worth reading my article on chart orientation estimation, since the underlying method of extracting the camera pose parameters is the same one used by the "--chart-orientation" output mode. If you are using the command line version of MTF Mapper, take note that you may have to specify the focal ratio of your camera + lens combination in order for the camera pose parameters to be correct. For example, a 105 mm lens mounted on a Nikon APS-C body (23.6 mm sensor width) would require the option "--focal-ratio 4.45" to improve the accuracy of the "Estimated chart distance" value reported at the bottom of the "focus_peak.png" output image. The value 4.45 is derived from (lens focal length)/(sensor width), i.e., 105/23.6 ~ 4.45. Because the "focus peak depth" value reported in the output (the -24.7 mm in Figure 4) is a relative measurement, it is expected that an incorrect "Estimated chart distance" value will not have a large impact, but more testing has to be performed to confirm this.
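As a quick check of the arithmetic, the focal ratio is just the focal length divided by the sensor width, both in millimetres. The helper below is a hypothetical convenience function and is not part of MTF Mapper; only the "--focal-ratio" switch itself comes from the text above.

```python
def focal_ratio(focal_length_mm, sensor_width_mm):
    """Focal ratio as described above: (lens focal length) / (sensor width)."""
    return focal_length_mm / sensor_width_mm


# 105 mm lens on a 23.6 mm wide APS-C sensor.
ratio = focal_ratio(105.0, 23.6)
print(f"--focal-ratio {ratio:.2f}")  # prints "--focal-ratio 4.45"
```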
The principle
The curve presented in Figure 1 seems to imply that we have a single slanted edge that we measure as we move it away from the camera, starting at a distance closer than the focus plane distance. This is an entirely valid way of obtaining the measurements required to produce Figure 1, and Jim Kasson has done exactly that. It does require a good linear rail, preferably a computer controlled one to automate the capture of a large number of images from our desired range of distances (from the camera).
We can obtain a fairly decent approximation if we use a 45-degree chart with a large number of slanted edges. The tilt in the chart naturally ensures that these slanted edges appear at different distances from the camera. All that MTF Mapper has to do is extract slanted edge MTF values, and reconstruct the MTF50 vs distance curve.
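To get a feel for how the tilt spreads the edges over a range of distances, here is a back-of-the-envelope sketch. It assumes a flat chart tilted by exactly 45 degrees and ignores perspective entirely; MTF Mapper itself works from the estimated camera pose, so this is only a rough illustration, and the function name and sign convention are my own.

```python
import math


def depth_offset_mm(chart_offset_mm, tilt_deg=45.0):
    """Approximate change in distance along the optical axis for a point that
    lies `chart_offset_mm` from the chart centre, measured on the chart
    surface perpendicular to the tilt axis. Perspective is ignored."""
    return chart_offset_mm * math.sin(math.radians(tilt_deg))


# A point 100 mm from the centre line of the chart sits roughly 70.7 mm
# closer to (or further from) the camera than the chart centre.
print(f"{depth_offset_mm(100.0):.1f} mm")
```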
That sounds straightforward, but there is one fairly large caveat: if our edge is slanted (as required for the slanted edge method to work), then that edge will pass through a range of distances, i.e., the starting tip of the edge is closer to the camera, and the endpoint of the edge is further from the camera. If the MTF50 value varies as a function of distance (as illustrated in Figure 1), then strictly speaking the PSF of the image formation process also varies along this edge. This violates the central assumption of the slanted edge method, which implicitly assumes that we can measure the MTF at a single location in the field by examining a small region around that location. In practice, the MTF we measure with the slanted edge method is a blend of the MTFs at the various distances the edge passes through.
There is not much we can do about it, but it helps to oversample. Each of the long edges of the slanted edge bars in the "focus" test chart (see Figure 2) is processed with a sliding window that uses only a small section of the edge to apply the slanted edge method to. This approach increases our sampling density whilst minimising the range of depth values over which each slanted edge MTF calculation is performed, i.e., we only assume that the true MTF is constant over a very small section of the edge. It is a well-known fact that the slanted edge method produces estimates with a smaller standard deviation if the length of the edge that it is applied to is increased (and the true MTF remains constant); conversely, we expect that each of our individual slanted edge measurements performed with the sliding window method will result in a large standard deviation in the estimated MTF50 value. Fortunately, we can safely assume that our desired MTF50 vs distance curve must be smooth, thus we can fit a smooth model to our multiple noise-contaminated MTF50 measurements. It turns out that a rational polynomial function of order (4, 2) seems to fit rather nicely in all the cases I have examined so far, so that is what MTF Mapper uses internally.
This strategy violates any number of model-fitting assumptions (e.g., my noisy samples are bound to be correlated, and the noise might be correlated too), but it seems to work in practice.
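For readers who want to experiment with this kind of fit, the sketch below fits a rational function to a set of samples using a linearised least-squares formulation. Several things here are assumptions on my part: that "order (4, 2)" means a degree-4 numerator over a degree-2 denominator, that the denominator's constant term is fixed at 1, and that the x values have been rescaled to roughly [-1, 1]. The text above does not say how MTF Mapper performs the fit internally, so this is not its implementation, and the test data is synthetic and noise-free.

```python
def solve_linear(A, b):
    """Solve A x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_rational(xs, ys, num_deg=4, den_deg=2):
    """Fit y ~ P(x) / (1 + Q(x)) by solving the linearised problem
    y = P(x) - y * Q(x) in the least-squares sense (normal equations)."""
    rows = []
    for x, y in zip(xs, ys):
        row = [x ** k for k in range(num_deg + 1)]
        row += [-y * x ** k for k in range(1, den_deg + 1)]
        rows.append(row)
    m = num_deg + 1 + den_deg
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    coeffs = solve_linear(AtA, Aty)
    return coeffs[:num_deg + 1], coeffs[num_deg + 1:]


def eval_rational(a, b, x):
    """Evaluate the fitted rational function at x."""
    num = sum(c * x ** k for k, c in enumerate(a))
    den = 1.0 + sum(c * x ** (k + 1) for k, c in enumerate(b))
    return num / den


# Synthetic, noise-free samples of a peaked curve on a normalised axis.
xs = [-1.0 + 0.05 * i for i in range(41)]
ys = [(0.3 + 0.1 * x) / (1.0 + 0.5 * x + 2.0 * x * x) for x in xs]

a, b = fit_rational(xs, ys)

# Locate the peak of the fitted curve with a dense scan.
grid = [-1.0 + 0.001 * i for i in range(2001)]
peak_x = max(grid, key=lambda x: eval_rational(a, b, x))
print(f"peak at x = {peak_x:.3f}")
```

The linearised form weights the residuals by the denominator, so it is not a true least-squares fit of the rational function; it could serve as a starting point for a nonlinear refinement if that matters for your data.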
One last observation: Why are the slanted edge bars oriented so that they run left-to-right if the chart is tilted at 45-degrees top-to-bottom (assuming portrait orientation, as shown in Figure 2)? What would happen if we had only a single edge running top-to-bottom, and we applied the sliding window approach to that edge? It turns out that this top-to-bottom method is viable, but because each short edge segment passes through a larger range of distance (from the camera) values, compared to the left-to-right edges, the sensitivity of the detection of the peak of the MTF50 vs distance curve is compromised. So it works, just not as well as the edge orientation of Figure 2.
Accuracy assessment: set-up
So does it work? This turns out to be a fairly hard question to answer. One approach would be to validate the single-image-45-degree-chart method against a computer-controlled focusing rail (i.e., physically moving the edge like Jim Kasson does), but I do not have one of those handy.
I settled on a rather different approach that relies on the observation that a lens fitted on an extension tube can no longer focus at infinity. From what I could gather, a lens set to focus at infinity will focus at a distance d = f*(f/e + 2) + e, where f is the focal length, and e is the extension length (update: see Appendix A below for a discussion of this formula). I happen to have a Micro-Nikkor 105 mm f/4 Ai lens with a hard stop at infinity. After a bit of iterative experimentation (translation: building something, then going back to the drawing board, then salvaging the hardware) I found that an extension of about 6.4 mm will cause the 105 mm lens to focus at a distance of about 1939 mm. At this distance, the lens covers an object size just a tad smaller than an A3 test chart.
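The formula is easy to evaluate directly. The sketch below simply plugs in the values quoted above (f = 105 mm, e = 6.4 mm); the function name is my own.

```python
def conjugate_distance_mm(f_mm, e_mm):
    """Focus distance d = f*(f/e + 2) + e for a lens of focal length f set to
    infinity focus and mounted on an extension of length e (all in mm)."""
    return f_mm * (f_mm / e_mm + 2.0) + e_mm


d = conjugate_distance_mm(105.0, 6.4)
print(f"{d:.0f} mm")  # about 1939 mm
```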
Of course, there are some practical problems. Firstly, it is rather difficult to measure a distance of 1939 mm with good accuracy using my available tools. More importantly, I am not quite sure where to measure this distance from (update: As mentioned in Appendix A, this is the total lens conjugate distance, i.e., the distance between the image plane and the focus plane. Since I do not know the principal plane separation distance of my lens, I still cannot use the total lens conjugate distance directly). Even if I could solve the measurement problem, I would still only end up with a single measurement, and no experimental variables to vary.
My solution was to build a variable-length extension tube. The idea was that I could preset the effective length of the extension tube with good accuracy if I used shim stock (or feeler gauges) --- all I had to do is build an extension tube from scratch, since none of the commercially available ones appear to go below 8 mm. Here is a photo of my custom extension tube mounted between my D7000 and the 105 mm Nikkor lens:
Figure 5: The bronze-coloured ring is part of my extension tube
Here is what the front of my extension tube looks like with the lens removed:
Figure 6: extension tube with the lens removed
As you can see from Figure 6, it was a bit of a tight fit to build an adaptor that was wide enough to allow adjustment without removing the lens, but still small enough to physically fit below the prism housing.
Here is what the extension tube looks like with some shims installed:
Figure 7: extension tube with some shims installed, front view
Figure 8: extension tube with some shims installed, rear view
As can be seen in Figure 7, the front part of the extension tube comprises two parts: the front flange (visible as the large ring with the black pen markings), and a Nikon F-mount female bayonet mount. The inner four screws fix the female mount to the outer flange.
The rear flange has an integrated Nikon F-mount male bayonet. I discovered that the male bayonet is a lot easier to manufacture than the female F-mount bayonet --- that probably explains why Nikon sells them :) Figure 8 also shows how the shims are installed between the front and rear flanges. Careful lapping of the flanges (and a bit of shimming with aluminium foil) ensured that the front surface of the female bayonet mount was parallel to the rear surface of the male bayonet mount to within 5 micron.
The front of the rear flange looks like this when we open up the extension tube:
Figure 9: front face of rear flange seen in the foreground
And lastly, we can see the rear face of the front flange:
Figure 10: rear face of front flange.
Notice the dowel pins in Figure 10: these acted as the registration mechanism so that the flanges always line up correctly without any rotation or tilt.
The length of the extension tube can be adjusted by installing three shims between the front and rear flanges. Measurements with a micrometer show that the repeatability of this process was around 5 micron. I also learned that bargain-store feeler gauges are not necessarily manufactured down to the tolerances that this experiment demanded --- I found some evidence that the feeler gauge thickness varied a little bit across their surfaces. I compensated as much as possible by labeling the position at which a particular shim should be installed, and I measured the effective extension tube length rather than relying on the nominal shim thickness.
I ended up with the following (effective) shim thicknesses: 130, 94, 72, 58, and 45 micron.
Accuracy assessment: the results
The basic experiment involves setting up the chart so that the apparent focus plane position was just slightly in front of the chart center when the 130 micron shims were installed. For each shim set, I then captured 10 images to yield 50 images in total. I repeated the whole experiment a second time to check for repeatability. Figure 11 presents the resulting box-and-whisker plot.
Figure 11: Focus position shift measured by MTF Mapper as a function of shim thickness
The y-axis denotes the "focus peak depth" value reported by MTF Mapper; all the values are positive, indicating that the measured focus plane position was slightly in front of the chart centre in all cases.
Other than the large variability (across 10 images) of Set A with 45 micron shims, it would appear that the individual measurements were quite robust. Typical standard deviation within a particular batch of 10 images was below 0.3 mm.
Using the formula presented above, we can compute the expected focus plane position for each of the shim sets; however, we still have no idea how to measure these absolute distances (and the principal plane separation distance of the lens is unknown). Instead, we can subtract the focus distance obtained from the formula with the 45 micron shim; doing the same for the focus peak depth values reported by MTF Mapper allows us to perform a relative comparison. The results are presented in Figure 12:
Figure 12: Summary of results. All values reported in millimeters
The second column contains the "focus peak depth" value reported by MTF Mapper. The second last column lists the relative focus peak depth value; these values should be compared to the relative values derived from the formula appearing in the last column.
Overall we see a reasonable agreement between the relative values derived from the formula, and the relative values as measured by MTF Mapper. There are some outliers (set B, 58 micron shim), but the differences between expected and measured values are typically below 1 mm. Keep in mind that a 5 micron change in shim thickness produces a change of 1.3 mm in focus plane position using the formula, i.e., the measured values appear to be within the mechanical repeatability of the shimming process itself.
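The 1.3 mm figure can be reproduced from the formula d = f*(f/e + 2) + e. The sketch below assumes a nominal extension of 6.4 mm (the approximate value quoted earlier) and evaluates the change in d for a 5 micron increase in e; the function names are hypothetical.

```python
def conjugate_distance_mm(f_mm, e_mm):
    """Focus distance d = f*(f/e + 2) + e, all lengths in mm."""
    return f_mm * (f_mm / e_mm + 2.0) + e_mm


def focus_shift_mm(f_mm, e_mm, delta_e_mm):
    """Reduction in focus distance when the extension grows by delta_e_mm."""
    return (conjugate_distance_mm(f_mm, e_mm)
            - conjugate_distance_mm(f_mm, e_mm + delta_e_mm))


# 105 mm lens, roughly 6.4 mm extension, 5 micron (0.005 mm) thicker shim.
shift = focus_shift_mm(105.0, 6.4, 0.005)
print(f"{shift:.2f} mm")  # roughly 1.3 mm
```

Because d depends on 1/e, the sensitivity is roughly f^2/e^2 (about 270 here), which is why a change of a few micron in the extension translates to more than a millimetre at the chart.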
The smallest change in shim thickness tested here was 13 micron (45 micron shim set swapped out with 58 micron shim set), followed closely by the 72 vs 58 micron shim combination with a difference of 14 micron. In both those cases it is clear (see Figure 11) that, using the "focus peak depth" values reported by MTF Mapper, one can easily discern the change in focus plane position induced by a 13 micron change in shim thickness.
Why would we want to do this? One application would be the calibration of a camera system where we have to shim the flange distance (distance from sensor to lens mounting flange front surface) to ensure that the image formed on the sensor is in focus when a reference lens is mounted. This is particularly useful for systems with hard infinity focus stops.
Of course, one would have to consider things like actual image magnification relative to sensor resolution when considering this "smallest discernible change in flange distance" measurement, because MTF Mapper performs the analysis of images at the pixel level. More testing!
References
[Burke2012]: Burke, Michael W, Image acquisition: handbook of machine vision engineering, Springer Science & Business Media, 2012.
Appendix A
The formula used to calculate the focus distance of the lens with focal length f and extension e is d = f*(f/e + 2) + e. This formula is taken from [Burke2012, p311], where d is stated to be the total lens conjugate distance. The total lens conjugate distance is the sum of the object-to-lens-centre and image-to-lens-centre distances when looking at the thin lens model. Burke notes that the derivation of this equation depends on the lens being symmetric, which allows us to assume that d = do + 2f + di, where do is the object-to-focal-point distance, and di is the image-to-focal-point distance.
I strongly doubt that my Micro-Nikkor 105 mm f/4 Ai is really a symmetric lens, so I just assume that this formula still gives reasonable results. Burke's formula only applies to a thin lens, which I am fairly certain my lens is not (being a compound lens). The implication of this, from my understanding, is that there is an additional distance dp that separates the two principal planes which must be added to the total lens conjugate distance, which implies that d = do + 2f + di + dp. Using my convention above, where di is called e (denoting extension), we see that the thick lens version of this equation should be d = f*(f/e + 2) + e + dp.
Unfortunately I have no idea what the value of dp would be for my lens. Serendipitously, I only end up using the difference between d values computed using different values of e, meaning that the subtraction removes the dp term from the difference, so I can get away with using the thin lens version of the formula.
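For completeness, here is a short reconstruction of the algebra behind the two claims above. It relies on the Newtonian form of the lens equation, d_o * d_i = f^2, with d_o and d_i measured from the focal points; this is my own working and is not quoted from Burke, although it is consistent with the d = do + 2f + di decomposition given above.

$$
\begin{aligned}
d_o\, d_i &= f^2, \quad d_i = e \;\Rightarrow\; d_o = \frac{f^2}{e} \\
d(e) &= d_o + 2f + d_i + d_p = \frac{f^2}{e} + 2f + e + d_p = f\left(\frac{f}{e} + 2\right) + e + d_p \\
d(e_1) - d(e_2) &= f^2\left(\frac{1}{e_1} - \frac{1}{e_2}\right) + (e_1 - e_2)
\end{aligned}
$$

Both the 2f term and the d_p term cancel in the difference, so the relative focus shifts do not depend on the principal plane separation, provided d_p itself does not change with the extension.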