Read the book: «Optical Engineering Science», page 12

6 Diffraction, Physical Optics, and Image Quality

6.1 Introduction

Hitherto, we have presented optics purely in terms of the geometrical interpretation provided by the propagation and tracing of rays. Simple as it is, this convenient picture is ultimately derived from an understanding of the wave nature of light. More specifically, Fermat's principle, which underpins geometrical optics, is itself ultimately derived from Maxwell's famous wave equations, as introduced in Chapter 1. In this chapter, however, we shall focus on the circumstances where the assumptions underlying geometrical optics break down and this convenient formulation is no longer tractable. Under these circumstances, we must look to another approach, more explicitly tied to the wave nature of light: the study of physical optics. To look at this a little more closely, we must further examine Maxwell's equations. The ubiquitous vector form in which Maxwell's equations are now cast is actually due to Oliver Heaviside, and these are set out below:

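$$\nabla \cdot \mathbf{D} = \rho \tag{6.1a}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{6.1b}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{6.1c}$$

$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{6.1d}$$

D, B, E, H, and J are all vector quantities, where D is the electric displacement, B the magnetic field, E the electric field, H the magnetic field strength and J the current density; ρ is the (scalar) charge density.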

The quantities D and E and B and H are themselves interrelated:

$$\mathbf{D} = \varepsilon\varepsilon_0 \mathbf{E}, \qquad \mathbf{B} = \mu\mu_0 \mathbf{H} \tag{6.2}$$

The quantities ε₀ and μ₀ are the permittivity and magnetic permeability of free space (vacuum), respectively. The quantities ε and μ are the relative permittivity and relative permeability of a specific medium or substance.

These equations may be greatly simplified if we assume that the local current and charge densities are zero, whereupon we are ultimately presented with the classical wave equation:

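$$\nabla^2 \mathbf{E} = \mu\mu_0\varepsilon\varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{6.3}$$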
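The route from Maxwell's equations to the wave equation can be sketched as follows. This is a minimal outline, assuming a homogeneous, isotropic medium (ε and μ constant), zero charge and current density, and the standard Heaviside forms of the two curl equations; it is not necessarily the form in which the result is presented elsewhere in this text.

```latex
\begin{align*}
\nabla \times (\nabla \times \mathbf{E})
  &= -\frac{\partial}{\partial t}\left(\nabla \times \mathbf{B}\right)
   = -\mu\mu_0 \frac{\partial}{\partial t}\left(\nabla \times \mathbf{H}\right)
   = -\mu\mu_0\varepsilon\varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \\
\nabla \times (\nabla \times \mathbf{E})
  &= \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E}
   = -\nabla^2 \mathbf{E}
   \qquad \text{(since } \nabla \cdot \mathbf{D} = 0\text{)} \\
\Rightarrow \quad \nabla^2 \mathbf{E}
  &= \mu\mu_0\varepsilon\varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2}
\end{align*}
```

Comparing the final line with the general form of a wave equation, the propagation speed of the disturbance is 1/√(μμ₀εε₀).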

The next stage in this critique of geometrical optics is to use Maxwell's equations to derive the Eikonal equation, which was briefly introduced in Chapter 1.

6.2 The Eikonal Equation

In Eq. (6.3), we have presented the wave equation in its true vector format. That is to say, the equation describes the electric field, E, as a vector quantity. However, much of what we will present in this chapter is a simplification of the wave equation, known as scalar theory. In this case, it is assumed that the electric field may be represented as a pseudo-scalar quantity. That is to say, the electric field, although varying in magnitude, is confined to one specific orientation and may be treated as if it were a scalar quantity. In fact, this approximation is reasonable where light is closely confined to some axis of propagation, i.e. consistent with the paraxial approximation. Thus, we are to understand that there are some limitations to this treatment.

In presenting the Eikonal equation according to the scalar view, we assume that solutions to the wave equation are of the form:

$$E(x, y, z, t) = E_0(x, y, z)\, e^{i\left[kS(x, y, z) - \omega t\right]} \tag{6.4}$$

E₀(x, y, z) is a slowly varying envelope function and S(x, y, z) is the spatially varying phase of the wave. In fact, S(x, y, z) has dimensions of length and, when it is equal to the wavelength, the phase term it describes is equal to 2π. The angular frequency is denoted by ω and the spatial frequency (wavenumber) by k.

The scalar form of the wave equation may be written as

From the above, we can derive the Eikonal equation, but we must assume that E₀(x, y, z) and the first differential of S(x, y, z) vary slowly with respect to position. The classical Eikonal equation is set out in Eq. (6.5).

(6.5)

By differentiating Eq. (6.4) twice with respect to x, y, and z, it is clear that, in deriving Eq. (6.5), we are neglecting terms containing the second differential of S. We are also ignoring changes in the envelope function. Thus, in deriving Eq. (6.5), we are making the following assumptions:

(6.6a)

and

(6.6b)

What Eq. (6.6a) suggests is that the envelope function must vary slowly compared to the wavelength. In addition, Eq. (6.6b) suggests that the curvature of the wavefront must be small when compared to the spatial frequency, k. In other words, the assumptions underlying the Eikonal equation are only justified where the radius of any wavefront is much greater than the wavelength. As the Eikonal equation underpins geometrical optics, this sets the limits on the applicability of this methodology, and we must then seek other, more general, means to describe the behaviour of light. These methods are, of course, based on a more rigorous application of Maxwell's equations and are generally categorised under the heading of physical optics.
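The origin of these assumptions can be made explicit by carrying out the substitution. The sketch below assumes a monochromatic field, so that the scalar wave equation reduces to the Helmholtz form ∇²E + n²k²E = 0, with k taken as the free-space wavenumber and n as the refractive index of the medium; both the symbol n and the sign convention of the exponent are assumptions introduced here for illustration.

```latex
\begin{align*}
E &= E_0\, e^{i(kS - \omega t)} \\
\nabla^2 E &= \left[\nabla^2 E_0 + 2ik\,\nabla E_0 \cdot \nabla S
              + ik E_0 \nabla^2 S - k^2 E_0 \lvert \nabla S \rvert^2\right] e^{i(kS - \omega t)} \\
0 &= \nabla^2 E_0 + 2ik\,\nabla E_0 \cdot \nabla S + ik E_0 \nabla^2 S
     + k^2 E_0 \left(n^2 - \lvert \nabla S \rvert^2\right) \\
\lvert \nabla S \rvert^2 &= n^2
  \qquad \text{when } \lvert \nabla^2 E_0 \rvert \ll k^2 \lvert E_0 \rvert,\;
  \lvert \nabla E_0 \rvert \ll k \lvert E_0 \rvert,\;
  \lvert \nabla^2 S \rvert \ll k
\end{align*}
```

Only the terms in k² survive in the short wavelength limit. The discarded terms are those containing derivatives of the envelope and the second derivative of S, which is why the approximation fails when the envelope changes on the scale of a wavelength or when the wavefront is tightly curved.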

6.3 Huygens Wavelets and the Diffraction Formulae

Although Maxwell's equations form the rigorous description of electromagnetic wave propagation, we will first proceed from the rather more intuitive description provided by Huygens' principle. Huygens' principle states that, given a known wave disturbance described by a continuous surface of equal phase (the wavefront), the amplitude of the wave at any point in space may be determined as the sum of the amplitudes of the forward propagating wavelets from that surface. This is illustrated in Figure 6.1.

Figure 6.1 Conceptual illustration of Huygens' principle.

The amplitude of the wave represents the strength of the local electric or magnetic field. In this case, in our scalar representation, we consider the amplitude as the magnitude of the vector electric field. The flux or power per unit area transmitted by the wave is determined by the Poynting vector, which is the cross product of the electric and magnetic fields. In the context of this scalar treatment, the flux density is proportional to the square of the electric field. In the Huygens' representation, as illustrated in Figure 6.1, the amplitude of the secondary waves emerging from some point on the original wavefront is inversely proportional to the distance from that point. It follows, therefore, that the flux density associated with that secondary wave follows an inverse square dependence with distance. This is further illustrated in Figure 6.2 which summarises the geometry.

Figure 6.2 describes the contribution to the wave amplitude at point P′ made by a single point, P, on the original wavefront. The original wavefront has an amplitude, A(x, y, z), which may be complex. The angle, χ, is the angle that the line from P to P′ makes with the normal to the wavefront. As indicated in Figure 6.2, there is some dependence of the secondary wave amplitude upon this angle, in the form of f(χ). There is no intuitive process that can shed further light on the precise form of this function; it can only be elucidated by a proper application of Maxwell's equations. The Huygens representation in Figure 6.2 can be described more formally, as in Eq. (6.7).

Figure 6.2 Huygens secondary wave geometry.

Figure 6.3 Geometry for Rayleigh diffraction equation of the first kind.

(6.7)

Proper application of Maxwell's equations gives rise to a series of equations that are similar in form to the Huygens' representation shown in Eq. (6.7). These include the so-called Rayleigh diffraction formulae of the first and second kinds. In the first case, it is assumed that the amplitude of the wave disturbance A(x, y, z) is known across some semi-infinite plane. We now seek to determine the amplitude, A(x′, y′, z′) at some other point in space. The geometry of this is illustrated in Figure 6.3.

Equation (6.8) shows the Rayleigh diffraction formula of the first kind.

(6.8)

In form, Eq. (6.8) is very similar to what one might expect from the summation of an expression of the form shown in the Huygens representation in Eq. (6.7). We have formally expressed the summation of the Huygens wavelets as a surface integral over the plane, as shown in Figure 6.3. Note, however, that instead of the decay of the wavelet amplitude with distance being expressed as in Eq. (6.7), a differential with respect to the axial distance is added. This is crucial, since it gives an insight into the formulation of the inclination term f(χ), which will be explored further a little later.

The other condition covered by the Rayleigh formulae occurs where the axial gradient of the amplitude is known rather than the amplitude itself. In this instance, we have the Rayleigh diffraction formula of the second kind.

(6.9)

If we combine these two solutions and make the qualifying assumption that k ≫ 1/s, then we obtain the so-called Kirchhoff diffraction formula, which is replicated in Eq. (6.10).

(6.10)

The Kirchhoff diffraction formula lacks the generality of the Rayleigh formulae, as it only applies where the secondary wave propagation distance is much greater than the wavelength. However, it provides a useful reference point for comparison with the Huygens approach. The factor 1 + cos χ is the inclination factor that was alluded to previously. A further approximation may be made where the system is paraxial, i.e. where cos χ ∼ 1. In this case, there is no inclination factor to speak of. Furthermore, if the axial displacement, s, is very much larger than the lateral extent of the illuminated area defined by A(x, y, z), then, to all intents and purposes, s is constant and the inverse term may be taken outside the integral. This is the so-called Fraunhofer approximation and may be written as:

(6.11)

6.4 Diffraction in the Fraunhofer Approximation

The assumptions underlying the Fraunhofer approximation are relevant to a wealth of problems in optical engineering. In particular, the approximation relates the behaviour and distribution of electromagnetic radiation in two distinct zones, the so-called near field and far field. Separation of these two zones must be such that the preceding approximations apply, i.e. that the axial displacement is much larger than the lateral extent of the radiation field and, of course, is much greater than the wavelength. We now wish to calculate the amplitude on a sphere whose vertex is located at z′ = z₀ and whose centre is located at z = 0, where the near field is located. Figure 6.4 shows the general scheme.

Choice of the reference sphere centred on the near field location places the following constraint upon the variables, x′, y′, and z′:

$$x'^2 + y'^2 + z'^2 = z_0^2$$

Expanding the propagation distance, s, in terms of the variables, x, y, and x′, y′ we can re-write Eq. (6.11):

In the Fraunhofer approximation, we are seeking to calculate the amplitude in the limit where z₀ tends to infinity, i.e. the far field distribution at some angle, θ. Therefore, we can assume that x′ ≫ x and y′ ≫ y. Hence, the diffraction formula may be recast in the following form:

(6.12)

Figure 6.4 Far field diffraction.

Figure 6.5 Far field diffraction of laser beam emerging from fibre.

Equation (6.12) has the form of a Fourier transform. So, the far field diffraction pattern of a near field amplitude distribution is simply given by the Fourier transform of that near field distribution. Of course, we must bear in mind the caveats that apply to this treatment, namely that the distance of the ‘far field’ location from the near field location must be sufficiently great. Finally, we might like to cast Eq. (6.12) more conveniently in terms of the angles involved:

(6.13)

NAx and NAy are the numerical apertures (sine of the angles) in the x and y directions respectively.

A typical example of the application of Fraunhofer diffraction might be the emergence of a laser beam from a very small, single mode optical fibre a few microns across. As the beam emerges from the fibre, it will have some near field distribution. In fact, the spatial variation of this amplitude may be approximated by a Gaussian distribution. In the light of the previous analysis, the angular distribution of the emitted radiation far from the fibre will be the Fourier transform of the near field distribution. This is shown in Figure 6.5.
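The Fourier relationship for the fibre example can be checked numerically. The following is a minimal one-dimensional sketch, assuming a Gaussian near field amplitude A(x) = exp(−x²/w₀²); the mode field radius `w0`, the wavelength and the function names are illustrative assumptions rather than values taken from the text. For this profile the transform is also Gaussian, and the amplitude falls to 1/e of its on-axis value at a numerical aperture of λ/(πw₀).

```python
import math

def far_field_amplitude(na, w0, wavelength, half_width_factor=6.0, samples=4001):
    """Numerically evaluate the 1D Fourier integral of a Gaussian near field.

    Near field: A(x) = exp(-x^2 / w0^2).
    Far field at numerical aperture `na`: integral of A(x) * cos(k * na * x) dx
    (the sine part of the complex exponential vanishes by symmetry).
    """
    k = 2.0 * math.pi / wavelength
    x_max = half_width_factor * w0
    dx = 2.0 * x_max / (samples - 1)
    total = 0.0
    for i in range(samples):
        x = -x_max + i * dx
        weight = 0.5 if i in (0, samples - 1) else 1.0  # trapezoidal rule end points
        total += weight * math.exp(-(x / w0) ** 2) * math.cos(k * na * x)
    return total * dx

def analytic_far_field(na, w0, wavelength):
    """Closed-form Fourier transform of the same Gaussian, for comparison."""
    k = 2.0 * math.pi / wavelength
    return w0 * math.sqrt(math.pi) * math.exp(-((k * na * w0) ** 2) / 4.0)

# Assumed, illustrative values for a single mode fibre.
wavelength = 1.55e-6  # metres
w0 = 5.0e-6           # near field 1/e amplitude radius, metres

na_1e = wavelength / (math.pi * w0)  # NA at which the far field amplitude is 1/e
on_axis = far_field_amplitude(0.0, w0, wavelength)
ratio = far_field_amplitude(na_1e, w0, wavelength) / on_axis

print(f"NA at 1/e amplitude: {na_1e:.4f}")
print(f"numerical amplitude ratio at that NA: {ratio:.6f}")
```

Halving `w0` doubles the far field angular width, which is the inverse relationship between near field and far field extent that characterises a Fourier transform pair.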

We will be returning to the subject of diffraction and laser beam propagation later in this chapter. A more traditional concern is the impact of diffraction upon image formation in real systems. As far as the design of optical systems is concerned, hitherto we have only been concerned with the impact of aberrations in limiting optical performance. In the next section, we will examine the application of Fraunhofer diffraction to the study of image formation in optical systems and the way in which the presence of diffraction limits optical resolution.

6.5 Diffraction in an Optical System – the Airy Disc

In the Fraunhofer approximation, we considered the effect of diffraction by considering a near field amplitude distribution and a far field, nominally located at infinity. However, it is not necessary for the far field to be physically located at infinity. For example, the (second) focal point of an optical system is conjugate to an object plane located at infinity. In this instance, the relation of the two planes is perfectly described by the Fraunhofer approximation. This is illustrated schematically in Figure 6.6, which shows the realisation of the far field of a laser source. The focus of the lens in Figure 6.6 is conjugate to infinity in object space and, assuming the lens aberration is not significant, the Fraunhofer diffraction pattern would be imaged at this location.

In the case of the near field distribution associated with the laser, the far field distribution will be given by the Fourier transform of the near field distribution, but mediated by the focal length of the lens. In other words, the spatial distribution, A(x′, y′) of the far field at the lens focus is given by:

(6.14)

In practice, the quantity, x′/f, can be regarded as equivalent to the numerical aperture (NA) associated with the far field distribution.

Figure 6.6 Imaging of a Fraunhofer diffraction pattern by a simple lens.

Figure 6.7 Diffraction of evenly illuminated pupil.

In terms of a real optical system, the greatest practical interest is invested in the diffraction produced by the pupil. For an object located at infinity and the physical stop located in object space, the far field diffraction pattern of the pupil will be formed at the focal point of the system. Of course, the pupil, or its image, the exit pupil, is of great significance in the analysis of an optical system as the system optical path difference (OPD) is, by convention, referenced to a sphere whose vertex lies at the exit pupil. As such, the diffraction pattern produced by a uniformly illuminated circular disc is of prime importance in the analysis of optical systems.

We will now assume that an optical system is seeking to image a point object, and the exit pupil size can be expressed as an even cone of light with a numerical aperture, NA. It will produce a diffraction pattern at the focus of the system, whose extent and form we wish to elucidate. A schematic of this scenario is shown in Figure 6.7.

We are now simply required to determine the Fourier transform of a circular disc. In fact, the Fourier transform of a circular disc is described in terms of J₁(x), a Bessel function of the first kind. Proceeding along the lines set out in Eq. (6.14), we find that the far field distribution at the system focus is given by:

(6.15)

It is natural, of course, that the far field distribution retains the circular symmetry of the near field. We have to remember that, in this analysis, we have calculated the amplitude (electric field) of the far field distribution. The flux density, I(r′), is proportional to the square of the electric field and this is given by:

$$I(r') = \left[\frac{2J_1(r'/r_0)}{r'/r_0}\right]^2, \qquad r_0 = \frac{\lambda}{2\pi\,NA} \tag{6.16}$$

The pattern produced at the far field location, as defined by Eq. (6.16) is known as the Airy disc. For r′ → 0, Eq. (6.16) tends to one. Thus, all values computed by Eq. (6.16) represent the local flux taken in ratio to the central maximum. The form of the Airy disc consists of a bright central region surrounded by a number of weaker rings. This is shown in Figure 6.8.

Figure 6.8 Airy disc.

The importance of the Airy disc lies in the fact that it represents the ideal replication of a point source in a totally unaberrated system. Hitherto, in the idealised geometrical optics representation, a point source would be replicated as a point image. The presence of diffraction, therefore, critically compromises resolution. That is to say, even in a perfect optical system, the lateral resolution of the system is limited by the extent of the Airy disc. At this point it is useful to examine the form of the Airy disc in more detail. Figure 6.9 shows a graphical trace of the Airy disc, expressed in terms of the ratio r′/r₀.

Figure 6.9 Graphical trace of Airy disc.

Figure 6.10 The Rayleigh criterion and ideal diffraction limited resolution.

As illustrated in Figure 6.9, the full width at half maximum (FWHM) is equal to 3.233r₀. Equally significant is the presence of local minima at 3.832r₀ and 7.016r₀. It is more informative to express these values in terms of the wavelength and numerical aperture. This gives the FWHM as 0.514λ/NA and the locations of the minima as 0.610λ/NA and 1.117λ/NA. At first sight, the FWHM may seem a useful indication of the ideal optical system resolution. In practice, it is the location of the first minimum that forms the basis for the conventional definition of ideal resolution. The rationale for this is shown in Figure 6.10.
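The quoted widths can be reproduced numerically from the normalised Airy profile [2J₁(x)/x]², with x = r′/r₀. The sketch below uses only the standard library: the Bessel function is evaluated from its integral representation, and the helper names `bessel_j1`, `airy` and `bisect` are illustrative. The conversion to units of λ/NA assumes r₀ = λ/(2πNA), which follows from equating 3.832r₀ with 0.610λ/NA.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from the integral (1/pi) * int_0^pi cos(t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy(x):
    """Normalised Airy flux [2*J1(x)/x]^2, where x = r'/r0."""
    if abs(x) < 1e-8:
        return 1.0  # limiting value at the centre of the pattern
    return (2.0 * bessel_j1(x) / x) ** 2

def bisect(f, lo, hi, tol=1e-9):
    """Locate a sign change of f between lo and hi by interval halving."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

half_max_x = bisect(lambda x: airy(x) - 0.5, 1.0, 2.5)  # radius at half maximum
fwhm = 2.0 * half_max_x
first_min = bisect(bessel_j1, 3.0, 4.5)   # first zero of J1
second_min = bisect(bessel_j1, 6.0, 8.0)  # second zero of J1

scale = 1.0 / (2.0 * math.pi)  # converts units of r0 to units of lambda/NA
print(f"FWHM        = {fwhm:.3f} r0 = {fwhm * scale:.3f} lambda/NA")
print(f"1st minimum = {first_min:.3f} r0 = {first_min * scale:.3f} lambda/NA")
print(f"2nd minimum = {second_min:.3f} r0 = {second_min * scale:.3f} lambda/NA")
```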

Considering two adjacent point sources, these are said to be resolved when the maximum of one Airy disc lies at the first minimum of the other. Therefore the separation of the two images must be equal to 0.610λ/NA. This is the so-called Rayleigh criterion for diffraction limited imaging. Under the Rayleigh criterion, two separated and resolved peaks are seen with a local minimum between them at 73.5% of the maximum. This is illustrated in Figure 6.11.
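The depth of the central dip can be checked by summing two Airy profiles. This is a minimal sketch, assuming two mutually incoherent point sources of equal brightness (so that flux densities, rather than amplitudes, add) and working in units of r′/r₀; the helper names are illustrative.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from the integral (1/pi) * int_0^pi cos(t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy(x):
    """Normalised Airy flux [2*J1(x)/x]^2, where x = r'/r0."""
    if abs(x) < 1e-8:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

SEPARATION = 3.8317  # first zero of J1, i.e. the Rayleigh separation in units of r0

def two_point_profile(x):
    """Summed flux of two equal incoherent point images centred at 0 and SEPARATION."""
    return airy(x) + airy(x - SEPARATION)

# Sample the profile between the two image centres.
samples = [two_point_profile(i * SEPARATION / 2000) for i in range(2001)]
peak = max(samples)
midpoint = two_point_profile(SEPARATION / 2.0)
dip_ratio = midpoint / peak

print(f"flux at midpoint relative to peak: {dip_ratio:.3f}")
```

Reducing `SEPARATION` fills in the dip; the criterion itself is a convention rather than a hard physical limit.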

At this point, we will re-iterate the formula describing diffraction limited resolution under the Rayleigh criterion, as it is fundamental to the enumeration of resolution in a perfect optical system. This is set out in Eq. (6.17).

$$\Delta r = \frac{0.610\,\lambda}{NA} \tag{6.17}$$

Figure 6.11 Profile of two point sources just resolved under Rayleigh criterion.

Worked Example 6.1 Microscope Objective

A microscope objective has a numerical aperture of 0.8. What is the diffraction limited resolution at the D wavelength of 589.3 nm?

The calculation is very straightforward: we simply need to substitute the relevant values into Eq. (6.17), which gives 0.610 × 589.3 nm / 0.8 ≈ 449 nm.

The resolution is 0.45 μm.

This figure applies only to a ‘perfect’, i.e. diffraction limited, aberration-free system. The presence of aberrations will affect the resolution, as will be considered in the next section.
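The arithmetic of the worked example can be expressed as a short helper. The function name `rayleigh_resolution` is an illustrative choice; the factor 0.610 is the Rayleigh criterion value quoted above.

```python
def rayleigh_resolution(wavelength, numerical_aperture):
    """Diffraction limited resolution, 0.610 * wavelength / NA (same units as wavelength)."""
    return 0.610 * wavelength / numerical_aperture

# Worked Example 6.1: NA = 0.8 at the sodium D wavelength of 589.3 nm.
resolution_nm = rayleigh_resolution(589.3, 0.8)
print(f"{resolution_nm / 1000:.2f} um")  # prints 0.45 um
```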

6.6 The Impact of Aberration on System Resolution

6.6.1 The Strehl Ratio

In the preceding analysis we examined the diffraction pattern produced by a circular disc – namely the pupil. This produced the Airy diffraction pattern. In this analysis, we ignored the impact of phase, i.e. the possibility that the amplitude across the pupil might have a complex component. In fact, for a point source, the phase across the pupil is, by definition, directly related to the OPD. That is to say, if we assume that the modulus of the near field amplitude, A(x, y) is unity, the complex amplitude is given by:

$$A(x, y) = e^{ik\Phi(x, y)}$$

where Φ(x, y) is the wavefront error across the pupil.

The final diffraction pattern is given by the Fourier transform of the above which, from Eq. (6.13), is given by:

(6.18)

We now wish to compute the amplitude at the central location of the far field pattern, i.e. where NAx = NAy = 0. In this case the Fourier transform can be further simplified:

(6.19)

For an optical system that is close to perfection, or almost diffraction limited, we can make the further assumption that kΦ ≪ 1 at all locations across the pupil. We find that the ratio of the amplitude with the presence of aberration to that without is approximately given by:

$$\frac{A}{A_0} \approx 1 - \frac{k^2}{2}\langle \Phi^2 \rangle + ik\langle \Phi \rangle \tag{6.20}$$

The expressions in the pointed brackets in Eq. (6.20) represent the mean square wavefront error and the mean wavefront error respectively.

However, the expression in Eq. (6.20) is merely the amplitude of the disturbance and not the flux. To calculate the flux density at the centre of the diffraction pattern, we need to multiply Eq. (6.20) by its complex conjugate. This gives:

$$\frac{I}{I_0} \approx 1 - k^2\left(\langle \Phi^2 \rangle - \langle \Phi \rangle^2\right) \tag{6.21}$$

The expression contained within the brackets is merely the variance of the wavefront error taken across the pupil. If we define the root mean square (rms) wavefront error, Φ_rms, as the rms value computed under the assumption that the average wavefront error has been normalised to zero, we get the following fundamental relationship:

$$\frac{I}{I_0} \approx 1 - k^2\Phi_{rms}^2 \tag{6.22}$$

Equation (6.22) is of great significance. The ratio expressed in Eq. (6.22), the ratio of the aberrated and unaberrated flux density, is referred to as the Strehl ratio. The Strehl ratio is a measure of the degradation produced by the introduction of system imperfections. Of course, Eq. (6.22) only applies where k²Φ²_rms ≪ 1. The fact that the peak flux of a diffraction pattern is reduced by the introduction of aberration necessarily implies that the distribution is in some way broadened, i.e. the resolution is reduced. For example, if the Strehl ratio is 0.8, then the area associated with the diffraction pattern is likely to have increased by about 20% and the linear dimension by about 10%.
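The small aberration approximation can be compared with a direct evaluation of the pupil average. The sketch below assumes, purely for illustration, a defocus-like wavefront error Φ = a(2ρ² − 1) over a uniformly illuminated circular pupil of unit radius, for which the mean is zero and the rms value is a/√3; the function names and the chosen rms error of 0.05 waves are likewise assumptions.

```python
import cmath
import math

def strehl_from_pupil(defocus_amplitude, wavelength, samples=2000):
    """Strehl ratio as |<exp(i*k*Phi)>|^2 for Phi = a*(2*rho^2 - 1) on a circular pupil.

    For a uniformly filled disc, u = rho^2 is uniformly distributed on [0, 1],
    so the area average reduces to a simple average over u.
    """
    k = 2.0 * math.pi / wavelength
    total = 0j
    for i in range(samples):
        u = (i + 0.5) / samples
        phi = defocus_amplitude * (2.0 * u - 1.0)
        total += cmath.exp(1j * k * phi)
    return abs(total / samples) ** 2

def strehl_small_aberration(phi_rms, wavelength):
    """Approximation 1 - (k * Phi_rms)^2, valid only where (k * Phi_rms)^2 << 1."""
    k = 2.0 * math.pi / wavelength
    return 1.0 - (k * phi_rms) ** 2

wavelength = 1.0  # work in units of the wavelength
phi_rms = 0.05    # assumed rms wavefront error, in waves
exact = strehl_from_pupil(phi_rms * math.sqrt(3.0), wavelength)
approx = strehl_small_aberration(phi_rms, wavelength)

# rms wavefront error at which the approximation gives a Strehl ratio of 0.8
phi_rms_for_0p8 = wavelength * math.sqrt(0.2) / (2.0 * math.pi)

print(f"direct pupil average: {exact:.4f}")
print(f"1 - (k*Phi_rms)^2:    {approx:.4f}")
print(f"rms error for Strehl 0.8: {phi_rms_for_0p8:.4f} waves")
```

The two estimates agree closely for small errors and diverge as the rms error grows, which is the practical meaning of the validity condition stated above. A Strehl ratio of 0.8 corresponds to an rms wavefront error of roughly one fourteenth of a wavelength.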