Optical Engineering Science

2.11.3 Simple Telescope

A classical optical telescope is an example of an afocal system. That is to say, no clearly defined focus is presented either in object or image space. As the name suggests, the telescope views distant objects, nominally at the infinite conjugate and provides a collimated output for ocular viewing in the case of a traditional instrument. As far as the instrument is concerned, both object and image are located at the infinite conjugate. Of course, this narrative does assume that the instrument is designed for ocular viewing as opposed to image formation at a detector or photographic plate. In any case, the design principles are similar. Fundamentally, the telescope provides angular magnification of a distant object, and this angular magnification is a key performance attribute.

The basic layout of a simple telescope is shown in Figure 2.9. Light from the distant object is collected by an objective lens whose focal length is f1 and then collimated by an eyepiece with a focal length of f2. These two lenses are separated by the sum of their focal lengths, thus creating an afocal system with an angular magnification given by the ratio of the lens focal lengths.

The matrix of the telescope is similar to that of the compound microscope, with an objective lens and eyepiece separated by some fixed distance.


The separation, s, is simply the sum of the two focal lengths and the system matrix is given by:

$$
\mathbf{M} = \begin{bmatrix} -f_2/f_1 & f_1 + f_2 \\ 0 & -f_1/f_2 \end{bmatrix} \qquad (2.13)
$$

The angular magnification (the D value of the matrix) is simply −f1/f2. It is important to note the sign of the magnification: for two positive lenses, the magnification is negative. In line with the previous discussion of the optical invariant, the linear magnification (given by matrix element A) is the inverse of the angular magnification. Also, the C element of the matrix, which expresses the focal power of the system, is zero; this is characteristic of an afocal system.

As in the case of the microscope, the objective lens forms the system entrance pupil. The exit pupil is formed by the eyepiece imaging the objective lens. It is located a short distance, approximately f2, beyond the eyepiece; this distance determines the ‘eye relief’. Ideally, for ocular viewing, the pupil of the eye should be co-incident with the exit pupil. Unlike the compound microscope, the exit pupil of a simple (ocular) telescope is relatively large, about the size of the pupil of the eye. Clearly, if the exit pupil were significantly larger than the pupil of the eye, then any light falling outside the ocular pupil would be wasted. In fact, in a typical telescope, where f1 ≫ f2, the size of the exit pupil is approximately given by the diameter of the objective lens multiplied by the ratio of the focal lengths, f2/f1.

As an example, a small astronomical refracting telescope might comprise a 75 mm diameter objective lens with a focal length of 750 mm (f/10) and might use a ×10 eyepiece. Eyepiece magnification is classified in the same way as for microscope eyepieces and so the focal length of this eyepiece would be 25 mm, as derived from Eq. (2.12b). The angular magnification (f1/f2) would be ×30 and the size of the exit pupil about 2.5 mm, which is smaller than the pupil of the eye.
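The system matrix for this example is easy to check numerically. The sketch below assumes the common thin-lens ray-transfer convention, with 2 × 2 matrices acting on (ray height, ray angle) vectors; the function names are illustrative, not from the text:

```python
import numpy as np

def thin_lens(f):
    # Ray-transfer matrix of a thin lens of focal length f,
    # acting on column vectors of (ray height, ray angle)
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

def gap(d):
    # Ray-transfer matrix for free propagation over a distance d
    return np.array([[1.0, d], [0.0, 1.0]])

f1, f2 = 750.0, 25.0  # objective and eyepiece focal lengths in mm

# Rightmost matrix acts first: objective, then the f1 + f2 gap, then the eyepiece
M = thin_lens(f2) @ gap(f1 + f2) @ thin_lens(f1)

print(M[0, 0])  # A = -f2/f1 (linear magnification)
print(M[1, 0])  # C = 0 (zero focal power: afocal)
print(M[1, 1])  # D = -f1/f2 = -30 (angular magnification)

# Exit pupil: image of the 75 mm objective formed by the eyepiece
print(75.0 * f2 / f1)  # 2.5 mm
```

The zero C element and the −30 angular magnification drop straight out of the matrix product, confirming the afocal analysis above.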

In the preceding discussion, the basic description of the instrument function assumes ocular viewing, i.e. viewing through an eyepiece. However, increasingly, across a range of optical instruments, the eye is being replaced by a detector chip. This is true of microscope, telescope, and camera instruments.

2.11.4 Camera

In essence, the function of a camera is to image an object located at the infinite conjugate and to form an image on a light sensitive planar surface. Of course, traditionally, this light sensitive surface consisted of a film or a plate upon which a silver halide emulsion had been deposited. This allowed the recording of a latent image which could be chemically developed at a later stage. Depending upon the grain size of the silver halide emulsion, feature sizes of around 10–20 μm or so could be resolved. That is to say, the ultimate system resolution is limited by the recording media as well as the optics. For the most part, this photographic film has now been superseded by pixelated silicon detectors, allowing the rapid and automatic capture and processing of images. These detectors are composed of a rectangular array of independent sensor areas (usually themselves rectangular) that each produce a charge in proportion to the amount of light collected. Resolution of these detectors is limited by the pixel size which is analogous to the grain size in photographic film. Pixel sizes range from about one micron to a few microns.

Optically from a paraxial perspective, the camera is an exceptionally simple instrument. Its purpose is simply to image light from an object located at the infinite conjugate onto the focal plane, where the sensor is located. As such, from a system perspective one might regard the camera as a single lens with the sensor located at the second focal point. This is illustrated in Figure 2.10.

If this system is the essence of simplicity, then the Pinhole Camera, a very early form of camera, takes this further by dispensing with the lens altogether! A pinhole camera relies on a very small system aperture (a pinhole) defining the image quality. In this embodiment of the camera, all rays admitted by the entrance pupil follow closely the chief ray. However, light collection efficiency is low. Whilst, in the paraxial approximation, the camera presents itself as a very simple instrument (as indeed early cameras were), the demands of light collection efficiency require the use of a large aperture, which results in the breakdown of the paraxial approximation. As we shall see in later chapters, this leads to the creation of significant imperfections, or aberrations, in image formation which can only be combated by complex multi-element lens designs. Thus, in practice, a modern camera, i.e. its lens, is a relatively complex optical instrument.


Figure 2.10 Basic camera.


In defining the function of the camera, we spoke of the imaging of an object located at infinity. In this context, ‘infinity’ means a substantially greater object distance than the lens focal length. For the traditional 35 mm format photographic camera, a typical standard lens focal length would be 50 mm. The ‘35 mm’ format refers to the film frame size which was 36 mm × 24 mm (horizontal × vertical). As mentioned in Chapter 1, the focal length of the camera lens determines the ‘plate scale’ of the detector, or the field angle subtended per unit displacement of the detector. Overall, for this example, the plate scale is 1.15° mm−1. The total field covered by the frame size is ±20° (horizontal) × ±13.5° (vertical). ‘Wide angle’ lenses with a shorter focal length (e.g. 28 mm) have a larger plate scale and, naturally, a wider field angle. By contrast, telephoto lenses with longer focal lengths (e.g. 200 mm), have a smaller plate scale, thus producing a greater magnification, but a smaller field of view.
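The plate scale and field figures quoted above can be reproduced with a short calculation; the helper names below are illustrative only:

```python
import math

def plate_scale_deg_per_mm(f_mm):
    # Plate scale: field angle subtended per unit displacement at the
    # focal plane, using the small-angle approximation (1/f radians per mm)
    return math.degrees(1.0 / f_mm)

def semi_field_deg(half_frame_mm, f_mm):
    # Semi-field angle covered by half of the frame dimension
    return math.degrees(math.atan(half_frame_mm / f_mm))

print(plate_scale_deg_per_mm(50.0))  # ≈ 1.15 deg/mm for the 50 mm standard lens
print(semi_field_deg(18.0, 50.0))    # ≈ 19.8 deg, i.e. about ±20° across the 36 mm width
print(semi_field_deg(12.0, 50.0))    # ≈ 13.5 deg, i.e. ±13.5° across the 24 mm height
```

Substituting 28 mm or 200 mm for the focal length reproduces the wide-angle and telephoto behaviour described in the text.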

Modern cameras with silicon detector technology are generally significantly more compact instruments than traditional cameras. For example, a typical digital camera lens might have a focal length of about 8 mm, whereas a mobile phone camera lens might have a focal length of about half of this. The plate scale of a digital camera is thus considerably larger than that of the traditional camera. Overall, as dictated by the imaging requirements, the field of view of a digital camera is similar to its traditional counterpart, although, in practice, equivalent to that of a wide field lens. Therefore, in view of the shorter focal length, the detector size in a digital camera is considerably smaller than that of a traditional film camera, typically a few mm. Ultimately, the miniaturisation of the digital camera is fundamentally driven by the resolution of the detector, with the pixel size of a mobile phone camera being around 1 μm. This is over an order of magnitude superior to the resolution, or ‘grain size’ of a high specification photographic film.

Further Reading

Haija, A.I., Numan, M.Z., and Freeman, W.L. (2018). Concise Optics: Concepts, Examples and Problems. Boca Raton: CRC Press. ISBN: 978-1-1381-0702-1.

Hecht, E. (2017). Optics, 5e. Harlow: Pearson Education. ISBN: 978-0-1339-7722-6.

Keating, M.P. (1988). Geometric, Physical, and Visual Optics. Boston: Butterworths. ISBN: 978-0-7506-7262-7.

Kidger, M.J. (2001). Fundamental Optical Design. Bellingham: SPIE. ISBN: 0-81943915-0.

Kloos, G. (2007). Matrix Methods for Optical Layout. Bellingham: SPIE. ISBN: 978-0-8194-6780-5.

Longhurst, R.S. (1973). Geometrical and Physical Optics, 3e. London: Longmans. ISBN: 0-582-44099-8.

Smith, F.G. and Thompson, J.H. (1989). Optics, 2e. New York: Wiley. ISBN: 0-471-91538-1.

3
Monochromatic Aberrations

3.1 Introduction

In the first two chapters, we have been primarily concerned with an idealised representation of geometrical optics involving perfect or Gaussian imaging. This treatment relies upon the paraxial approximation where all rays present a negligible angle with respect to the optical axis. In this situation, all primary optical ray behaviour, such as refraction, reflection, and beam propagation, can be represented in terms of a series of linear relationships involving ray heights and angles. The inevitable consequence of this paraxial approximation and the resultant linear algebra is apparently perfect image formation. However, for significant ray angles, this approximation breaks down and imperfect image formation, or aberration, results. That is to say, a bundle of rays emanating from a single point in object space does not uniquely converge on a single point in image space.

This chapter will focus on monochromatic aberrations only. These aberrations occur where there is departure from ideal paraxial behaviour at a single wavelength. In addition, chromatic aberration can also occur where first order paraxial properties of a system, such as focal length and cardinal point locations, vary with wavelength. This is generally caused by dispersion, or the variation in the refractive index of a material with wavelength. Chromatic aberration will be considered in the next chapter.

A simple scenario is illustrated in Figure 3.1 where a bundle of rays originating from an object located at the infinite conjugate is imaged by a lens. Figure 3.1a presents the situation for perfect imaging and Figure 3.1b illustrates the impact of aberration.

In Figure 3.1b, those rays that are close to the axis are brought to a focus at the paraxial focus. This is the ideal focus. However, those rays that are further from the axis are brought to a focus at a point closer to the lens than the paraxial focus. In fact, the behaviour illustrated in Figure 3.1b is representative of a simple lens; marginal rays are brought to a focus closer to the lens than the paraxial rays. However, in general terms, the sense of the aberration could be either positive or negative, with the marginal rays coming to a focus either before or after the paraxial focus.

3.2 Breakdown of the Paraxial Approximation and Third Order Aberrations

In formulating perfect or Gaussian imaging we assumed all relationships are linear. For example, Snell's law of refraction was reduced in the following way:

$$
n_1 \sin\theta_1 = n_2 \sin\theta_2 \;\;\longrightarrow\;\; n_1\theta_1 = n_2\theta_2 \qquad (3.1)
$$

In making the paraxial approximation, we are considering just the first or linear term in the Taylor series. The next logical stage in the process is to consider higher order terms in the Taylor series.

$$
\sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots \qquad (3.2)
$$


Figure 3.1 (a) Gaussian imaging. (b) Impact of aberration.


Following the term that is linear in θ, we have terms that are cubic or third order in θ. Of course, these third order terms are followed by fifth and seventh order terms etc. in succession. Third order aberration theory deals exclusively with those imperfections associated with the third order departure from ideal behaviour, as illustrated in Eq. (3.2). Much of classical aberration theory is restricted to consideration of these third order terms and is, in effect, a refinement or successive approximation to paraxial theory. Higher order (≥5) terms can be important in practical design scenarios. However, these are generally dealt with by numerical computation, rather than by a simple generically applicable theory.
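The successive approximations in the Taylor expansion are easy to check numerically; the angle below is an arbitrary illustration:

```python
import math

theta = 0.35  # radians (about 20 degrees), large enough that the paraxial model strains

paraxial = theta                      # first (linear) Taylor term only
third_order = theta - theta**3 / 6.0  # linear plus cubic terms

print(math.sin(theta) - paraxial)     # ≈ -7.1e-3, error of the paraxial approximation
print(math.sin(theta) - third_order)  # ≈ 4.4e-5, residual is now fifth order in theta
```

Retaining the cubic term shrinks the error by over two orders of magnitude at this angle, which is why third order theory is such a useful refinement of the paraxial picture.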

Third order aberration theory forms the basis of the classical treatment of monochromatic aberrations. Unless specific steps are taken to correct third order aberrations in optical systems, third order behaviour dominates. That is to say, error terms in the ray height or angle (compared to the paraxial) have a cubic dependence upon the angle or height. As a simple illustration of this, Figure 3.1b shows rays originating from a single object (at the infinite conjugate). For perfect image formation, the height of all rays at the paraxial focus should be zero, as in Figure 3.1a. However, the consequence of third order aberration is that the ray height at the paraxial focus is proportional to the third power of the original ray height (at the lens).

In dealing with third order aberrations, the location of the entrance pupil is important. Let us assume, in the example set out in Figure 3.1b, that the pupil is at the lens. If the radius of the entrance pupil is r0 and the height of a specific ray at this point is h, then we may define a new parameter, the normalised pupil co-ordinate, p, in the following way:

$$
p = \frac{h}{r_0} \qquad (3.3)
$$

The normalised pupil co-ordinate can have values ranging from −1 to +1, with the extremes representing the marginal ray. The chief ray corresponds to p = 0. At this stage, it is useful to provide a specific and quantifiable definition of aberration. The quantity, transverse aberration, is defined as the difference in height between a specific ray and the corresponding chief ray, as measured at the paraxial focus. The ‘corresponding chief ray’ emanates from the same object point as the ray under consideration. In addition, the term longitudinal aberration is also used to describe aberration. Longitudinal aberration (LA) is the axial distance between the point at which the ray in question intersects the chief ray and the location of the paraxial focus. The transverse aberration (TA) and longitudinal aberration definitions are illustrated in Figure 3.2.

In keeping with the previous arguments, the TA has a third order dependence upon the pupil function. This is illustrated in Eq. (3.4):

$$
TA = TA_0\,p^3 \qquad (3.4)
$$

Transverse aberration has dimensions of length, whereas the pupil function is a dimensionless ratio. Geometrically, the LA is approximately equal to the transverse aberration divided by the ray angle which itself is proportional to the pupil function. Therefore, the longitudinal aberration has a quadratic dependence upon the pupil function. This is illustrated in Eq. (3.5).


Figure 3.2 Transverse and longitudinal aberration.


$$
LA \approx \frac{TA}{\theta} \propto p^2 \qquad (3.5)
$$

In fact, if the radius of the pupil aperture is r0 and the lens focal length is f, then the longitudinal and transverse aberration are related in the following way:

$$
LA = \frac{TA}{p\,NA}, \qquad NA \approx \frac{r_0}{f} \qquad (3.6)
$$

NA is the numerical aperture of the lens.

A plot of the transverse aberration against the pupil function is referred to as a ‘ray fan’. Ray fans are widely used to provide a simple description of the fidelity of optical systems. If one views the transverse aberration at the paraxial focus, then the transverse aberration should show a purely cubic dependence upon the pupil function. This is illustrated in Figure 3.3a which shows the aberrated ray fan. If, on the other hand, the transverse aberration is plotted away from the paraxial focus, then an additional linear term is present in the plot. This is because pure defocus (i.e. without third order aberration) produces a transverse aberration that is linear with respect to pupil function. This is illustrated in Figure 3.3b which shows a ray fan where both the linear defocus and third order aberration terms are present.

The underlying amount of third order aberration is the same in both plots. However, the overall transverse aberration in Figure 3.3b (plotted on the same scale) is significantly lower than that seen in Figure 3.3a. This is because defocus can, to some extent, be used to ‘balance’ the original third order aberration. As a result, by moving away from the paraxial focus, the size of the blurred spot is reduced. In fact, there is a point at which the size (root mean square radius) of the spot is minimised. The blur spot at this optimum focal position is referred to as the circle of least confusion. This is illustrated in Figure 3.4.

Most generally, the transverse aberration where third order aberration is combined with defocus can be represented as:

$$
TA = TA_0\left(p^3 + \alpha p\right) \qquad (3.7)
$$

TA0 is the nominal third order aberration and α represents the defocus.


Figure 3.3 (a) Ray fan for pure third order aberration. (b) Ray fan with third order aberration and defocus.


Since the geometry is assumed to be circular, to calculate the rms (root mean square) aberration, one must introduce a weighting factor that is proportional to the pupil function, p. The mean squared transverse aberration is thus:

$$
\langle TA^2 \rangle = \frac{\displaystyle\int_0^1 TA_0^2\,(p^3 + \alpha p)^2\,p\,dp}{\displaystyle\int_0^1 p\,dp} = 2\,TA_0^2\left(\frac{1}{8} + \frac{\alpha}{3} + \frac{\alpha^2}{4}\right) \qquad (3.8)
$$


Figure 3.4 Balancing defocus against aberration – optimal focal position.


The expression is minimised where α = −2/3. To understand the significance of this, examination of Eq. (3.6) suggests that, without defocus, the marginal ray (p = 1) has a longitudinal aberration of TA0/NA. The defocus term itself produces a constant longitudinal aberration or defocus of αTA0/NA. Therefore, the optimum defocus is equivalent to placing the adjusted focus at 2/3 of the distance between the paraxial and marginal focus, as shown in Figure 3.4. Without this focus adjustment, with the third order aberration viewed at the paraxial focus, the rms aberration is TA0/2. However, adding the optimum defocus reduces the rms aberration to TA0/6, a reduction by a factor of 3.
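The optimum defocus can be recovered numerically by minimising the p-weighted rms transverse aberration over the pupil. This is a sketch under the weighting assumption described above, with an arbitrary unit value for TA0:

```python
import numpy as np

TA0 = 1.0
p = np.linspace(0.0, 1.0, 20001)  # normalised pupil coordinate

def rms_ta(alpha):
    # rms transverse aberration over a circular pupil; the extra factor
    # of p is the area weighting for an annulus of radius p
    ta = TA0 * (p**3 + alpha * p)
    return np.sqrt(np.sum(ta**2 * p) / np.sum(p))

alphas = np.linspace(-1.0, 0.0, 401)
best = alphas[np.argmin([rms_ta(a) for a in alphas])]
print(best)                        # ≈ -2/3, the optimum defocus coefficient
print(rms_ta(0.0) / rms_ta(best))  # ≈ 3, the improvement from balancing defocus
```

The brute-force search lands on α ≈ −2/3 and a threefold rms reduction, in agreement with the analytical minimisation.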

This analysis provides a very simple introduction to the concept of third order aberrations. In the basic illustration so far considered, we have looked at the example of a simple lens focussing an on-axis object located at infinity. In the more general description of monochromatic aberrations that we will come to, this simple, on-axis aberration is referred to as spherical aberration. In developing a more general treatment of aberration in the next sections we will introduce the concept of optical path difference (OPD).

3.3 Aberration and Optical Path Difference

In the preceding section, we considered the impact of optical imperfections on the transverse aberration and the construction of ray fans. Unfortunately, this treatment, whilst providing a simple introduction, does not lead to a coherent, generalised description of aberration. At this point, we introduce the concept of optical path difference (OPD). For a perfect imaging system, with no aberration, if all rays converge onto the paraxial focus, then all ray paths must have the same optical path length from object to image. This is simply a statement of Fermat's principle. We now consider an aberrated system where we accurately (not relying on the paraxial approximation) trace all rays through the system from object to image. However, at the last surface, we (hypothetically) force all rays to converge onto the paraxial focus. For all rays, we compute the optical path from object to image. The OPD is the difference between the integrated optical path of a specified ray and the optical path of the chief ray. Of course, if there were no aberration present, the OPD would be zero. Thus, the OPD represents a quantitative description of the violation of Fermat's principle.

The general concept is shown in Figure 3.5. Rays are accurately traced from the object through the system, emerging into image space. That is to say, ray tracing proceeds until the last optical surface, mirror or lens etc. Following the preceding discussion, at some point, we force all rays to converge upon the paraxial focus. However, the convention for computing OPD is that all rays are traced back to a spherical surface centred on the paraxial focus and which lies at the exit pupil of the system. Of course, it must be emphasised that the real rays do not actually follow this path. In the generic system illustrated, the real ray is traced to point P located in image space and the optical path length computed. Thereafter, instead of tracing the real ray further into image space, a dummy ray is traced, as shown by the dotted line. This dummy ray is traced from point P to point Q that lies on the reference surface – a sphere located at the exit pupil and centred on the paraxial focus. The optical path length of this segment is then added to the total.


Figure 3.5 Illustration of optical path difference.


After calculating the optical path length for the dummy ray OPQ, we need to calculate the OPD with respect to the chief ray. The chief ray path is calculated from the object to its intersection with the reference sphere at the pupil, represented, in this instance, by the path OR. In calculating the OPD, the convention is that the OPD is the chief ray optical path (OR) minus the dummy ray optical path (OPQ). Note the sign convention.

$$
OPD = [OR] - [OPQ]
$$

where the square brackets denote the optical path length along the indicated ray path.
Having established an additional way of describing aberrations in terms of the violation of Fermat's principle, the question is: what is the particular significance and utility of this approach? The answer is that, when expressed in terms of the OPD, aberrations are additive through a system. As a consequence, this treatment provides an extremely powerful general description of aberrations and, in particular, third order aberrations. Broadly, aberrations can be computed for individual system elements, such as surfaces, mirrors, or lenses and applied additively to the system as a whole. This generality and flexibility is not provided by a consideration of transverse aberrations.

There is a correspondence between transverse aberration and OPD. This is illustrated in Figure 3.6. At this point, we introduce a concept that is related to that of OPD, namely wavefront error (WFE). We must remember that, according to the wave description, the rays we trace through the system represent normals to the relevant wavefront. The wavefront itself originates from a single object point and represents a surface of equal phase. As such, the wavefront represents a surface of equal optical path length. For an aberrated optical system, the surface normals (rays) do not converge on a single point. In Figure 3.6, this surface is shown as a solid line. A hypothetical spherical surface, shown as a dashed line, is now added to represent rays converging on the paraxial focus. This surface intersects the real surface at the chief ray position. The distance between these two surfaces is the WFE.

In terms of the sign convention, the wavefront error, WFE, is given by:

$$
WFE = n\,\Delta z
$$

where Δz is the distance from the aberrated wavefront to the reference sphere, measured along the local direction of propagation, and n is the refractive index of the medium.
The sign convention is important, as it now concurs with the definition of OPD. As the wavefronts form surfaces of constant optical path length, there is a direct correspondence between OPD and WFE. A positive OPD indicates the optical path of the ray at the reference sphere is less than that of the chief ray. Therefore, this ray has to travel a small positive distance to ‘catch up’ with the chief ray to maintain phase equality. Hence, the WFE is also positive.


Figure 3.6 Wavefront representation of aberration.


Figure 3.7 Simplified wavefront and ray geometry.


Both OPD and WFE quantify the violation of Fermat's principle in the same way. OPD is generally used to describe the path length difference of a specific ray. WFE tends to be used when describing OPD variation across an assembly of rays, specifically across a pupil. The concept of WFE enables us to establish the relationship between OPD and transverse aberration in that it helps define the link between wave (phase and path length) geometry and ray geometry. This is shown in Figure 3.7. It is clear that the transverse aberration is related to the angular difference between the wavefront and reference sphere surfaces.

We now describe the WFE, Φ, as a function of the reference sphere (paraxial ray) angle, θ. The radius of the reference sphere (distance to the paraxial focus) is denoted by f. This allows us to calculate the difference in angle, Δθ, between the real and paraxial rays. This is simply equal to the difference in local slope between the two surfaces.

$$
\Delta\theta = \frac{1}{nf}\,\frac{d\Phi}{d\theta} \qquad (3.9)
$$

n is the medium refractive index.

In this analysis, the WFE represents the difference between the real and reference surfaces with the positive axial direction represented by the propagation direction (from object to image). In this convention, the WFE has the opposite sign to the OPD. The transverse aberration, t, can be derived from simple trigonometry.

$$
t = f\,\Delta\theta = \frac{1}{n}\,\frac{d\Phi}{d\theta} \qquad (3.10)
$$

If θ describes the angle the ray makes to the chief ray, then Eq. (3.10) may be reformed in terms of the numerical aperture, NA. The numerical aperture is equal to n sin θ, and Eq. (3.10) may be recast as:

$$
t = \frac{d\Phi}{d(NA)} \qquad (3.11)
$$

So, the transverse aberration may be represented by the first differential of the WFE with respect to the numerical aperture. In terms of third order aberration theory, the numerical aperture of an individual ray is directly proportional to the normalised pupil function, p. If the overall system, or marginal ray, numerical aperture is NA0, then the individual ray numerical aperture is simply NA0p. The transverse aberration is then given by:

$$
t = \frac{1}{NA_0}\,\frac{d\Phi}{dp} \qquad (3.12)
$$

Equation (3.12) provides a simple direct relationship between OPD and transverse aberration. Of course, we know that, for third order aberration, the transverse aberration is proportional to the third power of the pupil function, p. If this is the case, then it is apparent, from Eq. (3.12), that the OPD is proportional to the fourth power of the pupil function. So, for third order aberration, the transverse aberration shows a third power dependence upon the pupil function whereas the OPD shows a fourth power dependence.
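The fourth-power/third-power correspondence of Eq. (3.12) can be verified by numerical differentiation of a quartic wavefront error. Φ0 and NA0 below are arbitrary illustrative values:

```python
import numpy as np

PHI0, NA0 = 1.0, 0.1  # illustrative wavefront error amplitude and system NA

p = np.linspace(-1.0, 1.0, 4001)  # normalised pupil coordinate across a diameter
phi = PHI0 * p**4                 # quartic OPD across the pupil

ta_numeric = np.gradient(phi, p) / NA0  # Eq. (3.12): TA = (1/NA0) dPhi/dp
ta_cubic = 4.0 * PHI0 * p**3 / NA0      # the expected third order ray fan

# Away from the array ends the two agree to within the discretisation error
print(np.max(np.abs(ta_numeric[1:-1] - ta_cubic[1:-1])))
```

Differentiating the quartic OPD recovers the cubic transverse-aberration ray fan, illustrating why the same aberration is ‘third order’ in one description and fourth power in the other.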

Applying these arguments to the analysis of the simple on-axis example illustrated earlier, with the object placed at the infinite conjugate, the WFE can be represented by the following equation:

$$
\Phi = \Phi_0\,p^4 \qquad (3.13)
$$

p is the normalised pupil function.

Figure 3.8 shows a plot of the OPD against the normalised pupil function; such a plot is referred to as an OPD fan.

Despite the fact that this simple aberration has a quartic dependence on the pupil function, it is still referred to as third order aberration after the transverse aberration dependence. As with the optimisation of transverse aberration, the OPD can be balanced by applying defocus to offset the aberration. We saw earlier that a simple defocus produces a linear term in the transverse aberration. Referring to Eq. (3.12), it is clear that defocus may be represented by a quadratic term. Equation (3.14) describes the OPD when some defocus has been added to the initial aberration.

$$
\Phi = \Phi_0\left(p^4 + \alpha p^2\right) \qquad (3.14)
$$

An OPD fan with aberration plus balancing defocus is shown in Figure 3.9.

In this instance, the plot has a characteristic ‘W’ shape, with the curve in the vicinity of the origin dominated by the quadratic defocus term. As with the case for transverse aberration, the defocus can be optimised to produce the minimum possible OPD value when taken as a root mean squared value over the circular pupil. Again, using a weighting factor that is proportional to the pupil function, p, (to take account of the circular geometry), the mean squared OPD is given by:

$$
\langle \Phi^2 \rangle = \frac{\displaystyle\int_0^1 \Phi_0^2\,(p^4 + \alpha p^2)^2\,p\,dp}{\displaystyle\int_0^1 p\,dp} = 2\,\Phi_0^2\left(\frac{1}{10} + \frac{\alpha}{4} + \frac{\alpha^2}{6}\right) \qquad (3.15)
$$


Figure 3.8 Quartic OPD fan.


Figure 3.9 OPD fan with balancing defocus.


The above expression has a minimum where α = −¾. To understand the magnitude of this defocus, it is useful first to convert the new OPD expression into a transverse aberration using Eq. (3.12).

$$
TA = \frac{\Phi_0}{NA_0}\left(4p^3 + 2\alpha p\right) \qquad (3.16)
$$

From Eq. (3.16), it can be seen that the optimum defocus is 3/8 of the distance between the paraxial and marginal ray foci. This value is different to that derived for the optimisation of the transverse aberration itself. It should be understood that the optimisation of the transverse aberration and the OPD, although having the same ultimate purpose in minimising the aberration, nonetheless produce different results. Indeed, in the optimisation of optical designs, one is faced with a choice of minimising either the geometrical spot size (transverse aberration) or OPD in the form of rms WFE. The rationale behind this selection will be considered in later chapters when we examine measures of image quality, as applied to optical design.

The balanced defocus, as illustrated in Eq. (3.15) does significantly reduce the rms OPD. In fact, it reduces the OPD by a factor of four. Resultant rms values are set out in Eq. (3.17).
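The α = −3/4 optimum and the factor of four improvement can be checked numerically with the same p-weighted rms over the pupil, taking an arbitrary unit value for Φ0:

```python
import numpy as np

PHI0 = 1.0
p = np.linspace(0.0, 1.0, 20001)  # normalised pupil coordinate

def rms_opd(alpha):
    # rms OPD over a circular pupil, weighted by p for the annular area element
    phi = PHI0 * (p**4 + alpha * p**2)
    return np.sqrt(np.sum(phi**2 * p) / np.sum(p))

alphas = np.linspace(-1.5, 0.0, 601)
best = alphas[np.argmin([rms_opd(a) for a in alphas])]
print(best)                          # ≈ -0.75, the optimum balancing defocus
print(rms_opd(0.0) / rms_opd(best))  # ≈ 4, the rms reduction
```

Note that the same search over the transverse aberration yields α = −2/3, confirming that minimising rms OPD and minimising rms spot size are genuinely different optimisation targets.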

$$
\Phi_{rms}(\alpha = 0) = \frac{\Phi_0}{\sqrt{5}} \approx 0.45\,\Phi_0, \qquad \Phi_{rms}(\alpha = -3/4) = \frac{\Phi_0}{4\sqrt{5}} \approx 0.11\,\Phi_0 \qquad (3.17)
$$
