Diapycnal mixing in the Southern Ocean diagnosed using the DIMES Tracer and Realistic Velocity Fields

In this work, we use realistic isopycnal velocities with a 3‐D eddy diffusivity to advect and diffuse a tracer in the Antarctic Circumpolar Current, beginning in the Southeast Pacific and progressing through Drake Passage. We prescribe a diapycnal diffusivity which takes one value in the SE Pacific west of 67°W and another value in Drake Passage east of that longitude, and optimize the diffusivities using a cost function to give a best fit to experimental data from the DIMES (Diapycnal and Isopycnal Mixing Experiment in the Southern Ocean) tracer, released near the boundary between the Upper and Lower Circumpolar Deep Water. We find that diapycnal diffusivity is enhanced 20‐fold in Drake Passage compared with the SE Pacific, consistent with previous estimates obtained using a simpler advection‐diffusion model with constant, but different, zonal velocities east and west of 67°W. Our result shows that diapycnal mixing in the ACC plays a significant role in transferring buoyancy within the Meridional Overturning Circulation.


Introduction
Global climate is strongly influenced by the overturning circulation of the oceans, transporting up to 5 PW of heat meridionally and mediating the exchange of gases between the ocean and atmosphere (Rahmstorf, 2002; Trenberth & Caron, 2001). In recent years, the importance of the Southern Ocean to the closure of the circulation has been highlighted (e.g., Lumpkin & Speer, 2007; Marshall & Speer, 2012). Cold dense waters created at high latitudes must return to the surface, and this is achieved in part by adiabatic upwelling along isopycnals in the Southern Ocean, and in part through diapycnal mixing. Although the spatial variability of this diapycnal mixing has significant impacts on the circulation, it is not well represented in conceptual or numerical circulation models (Wunsch & Ferrari, 2004). In this work, we quantify and characterize the diapycnal mixing taking place in the Drake Passage region of the Southern Ocean, a region of greatly enhanced turbulence (Naveira Garabato et al., 2004; Thompson et al., 2007).
The Diapycnal and Isopycnal Mixing Experiment in the Southern Ocean (DIMES), which commenced in 2008, has aimed to characterize the spatial and temporal variability in Southern Ocean mixing and to understand its controlling physical processes. It comprises a large-scale tracer experiment combined with microstructure, finestructure, and mooring-based measurements in a region stretching from the Southeast Pacific through Drake Passage to the Scotia Sea and beyond (Ledwell et al., 2011; Sheen et al., 2013; St. Laurent et al., 2012; Watson et al., 2013). Through the monitoring of its horizontal and vertical distribution as it evolves, the tracer provides a direct measurement of temporally and spatially integrated isopycnal and diapycnal mixing, the first of its kind in the region. The path of the tracer during the first 2 years of the experiment goes through two contrasting regions of the Southern Ocean: the SE Pacific, where smooth topography leads us to expect relatively weak mixing, and Drake Passage, where the interaction of meandering jets with complex topography is expected to produce enhanced mixing rates (see, e.g., Naveira Garabato et al., 2004; Nikurashin & Ferrari, 2010a, 2010b; Scott et al., 2011). Watson et al. (2013; hereafter W13) used a simpler advection-diffusion model to estimate diapycnal diffusivities in the two regions from the tracer observations, demonstrating strongly enhanced mixing in Drake Passage compared with the SE Pacific. They obtained the values by fitting Gaussian profiles to observed and modeled vertical tracer distributions and calculating a cost function comparing the vertical widths of these profiles. They allowed both a constant zonal velocity u and a constant diapycnal diffusivity $K_z$ to take on different values east and west of 67°W and adjusted the values prescribed to the model in the two regions to give the best fit to the observations. The main limitation of the model of W13 is the difficulty in precisely determining u. This is important because, when diagnosing $K_z$ by comparing model with observed vertical profile widths, estimates of $K_z$ and u are approximately proportional to one another.
W13 obtained a first guess of u in each region by calculating a spatiotemporal mean of zonal velocities from the SatGEM product (Meijers & Bindoff, 2011) at the tracer depth. They estimated uncertainties on these means from the standard error of the contributing values and allowed u to vary within these limits when optimizing $K_z$. However, since the values of u in each region are not constrained by the observed concentrations, they remain rather uncertain; in addition, they are not necessarily the same as the actual mean zonal velocities from 3-D models.
In the present work, we use a version of the advection-diffusion equation (equation (1)) in which (u, v) are along-layer velocities in the zonal (x) and meridional (y) directions, $K_h$ is an isotropic, spatially independent along-isopycnal diffusivity intended to represent eddy stirring at scales not resolved by the models, and z is the height above the target isopycnal surface. We use equation (1) to describe the dispersion of the tracer along and across density layers, making the approximation that layer thickness is constant and neglecting diapycnal velocity; these approximations are discussed in sections 4.4 and 4.5, where we find their effect to be small. Our 3-D isopycnal framework with constant layer thickness allows us to do two things not possible with the 2-D model of W13. First, we can apply realistic velocities to the tracer advection, derived from two products, SatGEM and SOSE (see section 2.2), while minimizing spurious diapycnal diffusion. Compared to the simple zonal flow of the W13 model, the velocity fields reproduce much more of the complexity of the evolution of the DIMES tracer as it is advected from the SE Pacific through Drake Passage by a combination of eddies and the mean flow. Second, we can test the validity of the model tracer advection by comparing its lateral distribution with the observations.
The paper is organized as follows: in section 2, we summarize the DIMES tracer experiment and describe the model and the methods used to compare the model output with the tracer observations in order to diagnose $K_z$. In section 3, we compare the lateral distribution of the model tracer with observations from three of the DIMES cruises. In section 4, we examine the vertical tracer distributions, report the optimized $K_z$, and explore its sensitivities to various factors. Section 5 contains a discussion of the results and section 6 our conclusions.
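For reference, a form of equation (1) that is consistent with the definitions given in this introduction is sketched below. It is a reconstruction from the surrounding text, not a transcription: it assumes that $K_h$ is constant, that the diapycnal velocity is zero, and that the diapycnal term is written in flux form so that $K_z$ may vary in space.

```latex
\frac{\partial C}{\partial t}
  + u\,\frac{\partial C}{\partial x}
  + v\,\frac{\partial C}{\partial y}
  = K_h\left(\frac{\partial^2 C}{\partial x^2}
           + \frac{\partial^2 C}{\partial y^2}\right)
  + \frac{\partial}{\partial z}\left(K_z\,\frac{\partial C}{\partial z}\right)
  \tag{1}
```

Here C is the tracer concentration; where $K_z$ is independent of z, the last term reduces to $K_z\,\partial^2 C/\partial z^2$.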

Tracer Release Experiment
In February 2009, 76 kg of trifluoromethyl sulfur pentafluoride (CF₃SF₅) was released at around 1,500 m depth onto the "target" neutral density surface $\gamma^n$ = 27.906 kg m⁻³ near 107°W, 58°S in the SE Pacific sector of the Antarctic Circumpolar Current (ACC), and surveyed approximately annually as it progressed eastward (Ledwell et al., 2011; Watson et al., 2013). Profiles of tracer concentrations (C) with depth (z) used in this study were collected on three cruises at roughly 1, 1.9, and 2.2 years after release (Figure 1 and Table 1). The bathymetry of the region is relatively smooth over most of the SE Pacific sector but is characterized by steep ridges and hills in Drake Passage (Figure 1).

3-D Model Setup

Model Overview
We employ code from the offline version of the Massachusetts Institute of Technology General Circulation Model (MITgcm) as a platform for our model tracer evolution. The code solves the advection-diffusion equation in three dimensions (equation (1)), using prescribed time-evolving velocity fields $\mathbf{u}(i, j, k, t) = (u, v, w)$, a scalar horizontal diffusivity $K_h$, and a diapycnal diffusivity field $K_z = K_z(i, j, k)$, where (i, j, k) are model grid cell indices in the zonal, meridional, and vertical directions. We use the Prather advection scheme with a flux limiter (Prather, 1986). We have constructed a grid that is effectively isopycnal with constant layer thicknesses, and with nominally zero diapycnal velocity. Thus, the version of equation (1) that we solve is highly simplified; these simplifications are discussed later in this paper. We simulate the tracer evolution from the time of the release to the time of the survey designated UK2.5, in April 2011, 2.2 years after release, extracting outputs at the relevant time steps to be compared with observations (Table 1).

Velocity Fields
We carry out two separate sets of model runs using different velocity products for the period of the experiment. The first is SatGEM (Meijers & Bindoff, 2011), which combines a Gravest Empirical Mode (GEM; Meinen & Watts, 2000) projection of temperature and salinity with satellite altimetry to generate 3-D time-evolving geostrophic velocities for the Southern Ocean at weekly intervals. The SatGEM fields have been constructed through optimal interpolation of historical hydrographic observations. The horizontal resolution of the velocity fields is 1/3°, and we use SatGEM's horizontal grid definitions for these runs. The SatGEM grid extends from 126°W to 0°W in longitude and from 82°S to 19°S in latitude.
The second product is the Southern Ocean State Estimate (SOSE; Mazloff et al., 2010), which is a 3-D time-evolving estimate of u, v, w, temperature, and salinity for the Southern Ocean, also at weekly intervals. SOSE is constructed using model outputs from the MITgcm by adjusting model parameters to give the best fit to a large number of available observations, including satellite altimetry and sea surface temperature. It has a horizontal resolution of 1/6°, and again we use its native horizontal grid for our model runs. We use the state estimate with the Gent-McWilliams parameterization switched off (this decision is discussed in Appendix A). The SOSE grid extends from 127°W to 27°W in longitude and from 70°S to 28°S in latitude. Both grids cover a large enough area that negligible amounts of tracer reach the edges of the domain during our model runs.
[Figure caption fragment: ... Smith and Sandwell (1997), with the coastline shown in black.]
There are advantages and disadvantages of each product. SatGEM derives directly from observations, so should be the best estimate of the geostrophic velocities at the 100 km scale resolved by the altimeter. SOSE resolves smaller scales and obeys dynamics well beyond geostrophic balance but is less strongly constrained by observations than SatGEM. Thus, the eddy fields it generates are allowed to depart somewhat from the observations. SatGEM suffers a drawback due to the method of calculating the GEM: there are no velocities in water shallower than 1,500 m (see Figure 2), and this includes a region in the north of Drake Passage along the South American continental slope. SOSE velocities, meanwhile, extend all the way to the boundary. We pad the empty regions of the SatGEM velocity fields with zeros; the effect is discussed in section 4.5.

An Isopycnal Framework
Since our primary goal is to determine diapycnal diffusivity, it is critical to ensure that the vertical spread in the model tracer is due, as far as possible, only to the $K_z$ field imposed upon the model. We adapt the framework of the offline MITgcm in the following manner: the grid is set up with 67 layers, to each of which we assign a neutral density level, with 0.003 kg m⁻³ between the layers. The tracer release isopycnal corresponds to the 36th layer. The density range covers that occupied by the DIMES tracer during years 1-2 of the experiment. The layer spacing is smaller than the separation in density between data points in the observed tracer profiles in 85% of the observations; we have thereby chosen a grid which matches the vertical resolution of the observations while minimizing computation time. The isopycnal framework we use allows us to resolve well the distribution of tracer in density space in the face of the large slope of isopycnals across the ACC (Figure 2), with relatively low computational expense.
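The layer grid described above can be written down in a few lines. The sketch below is illustrative only: it assumes that layer indices are 1-based and that density increases with layer index, neither of which is stated in the text, and the function names are hypothetical.

```python
# Sketch of the neutral-density layer grid (assumed ordering: lightest layer first).
N_LAYERS = 67
SPACING = 0.003          # kg m^-3 between adjacent layers
TARGET_DENSITY = 27.906  # kg m^-3, neutral density of the tracer release surface
TARGET_LAYER = 36        # 1-based index of the release layer

def layer_densities():
    """Return the neutral density assigned to each of the 67 layers."""
    return [round(TARGET_DENSITY + (k - TARGET_LAYER) * SPACING, 3)
            for k in range(1, N_LAYERS + 1)]

def nearest_layer(gamma_n):
    """Return the 1-based index of the layer whose density is closest to gamma_n."""
    dens = layer_densities()
    return min(range(N_LAYERS), key=lambda i: abs(dens[i] - gamma_n)) + 1
```

Under these assumptions the grid spans roughly 27.801 to 27.999 kg m⁻³, with 35 layers lighter and 31 layers denser than the target surface.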
For each velocity product (SatGEM and SOSE), we use their respective temperature and salinity fields to calculate neutral densities for each time slice, then linearly interpolate the zonal and meridional components of velocity vertically onto our neutral density grid. We then map the interpolated velocity fields to the model layers. The interpolation introduces some horizontal divergence into the velocity fields, with a mean typically less than 10⁻⁹ s⁻¹. This is enough to cause a small local and transient nonconservation of tracer but does not affect our diffusivity estimates (see section 4.5 and Appendix B). We set the vertical velocities to zero everywhere, and therefore assume no diapycnal velocity; this approximation is discussed in section 4.4. We then define the thickness, h, of the model layers as follows. First, $h(x, y, \gamma^n, t)$ at every point on the grid is determined from the SatGEM fields for each time slice. We then take a time mean of those fields, $\langle h(x, y, \gamma^n) \rangle$. Finally, we calculate the zonal and meridional average, $\langle h(\gamma^n) \rangle$, for the range 110°W-50°W in longitude and 52°S-68°S in latitude, a range which covers the extent of the bulk of the tracer patch over the first 2 years. $\langle h(\gamma^n) \rangle$ then becomes the model layer thickness at each neutral density. We use the same layer thickness, calculated from SatGEM, for model runs with both sets of velocity fields, for ease of comparison. The effect of using a constant layer thickness throughout the system, and thus neglecting fluxes due to correlations between thickness and mass or tracer, was tested using the full SOSE model with variable layer thickness, and found to be small (Appendix A).
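The vertical interpolation step described above can be illustrated for a single water column. This is a minimal sketch, not the processing code used for SatGEM or SOSE: it assumes that neutral density increases monotonically down the column, and it returns `None` for target densities that the column does not contain (an assumed convention).

```python
def interp_to_density(gamma_profile, u_profile, gamma_targets):
    """Linearly interpolate a velocity component onto target neutral densities.

    gamma_profile: neutral densities at the product's depth levels,
                   strictly increasing (stably stratified column assumed).
    u_profile:     velocity component at the same levels.
    gamma_targets: neutral densities of the model layers.
    """
    out = []
    for g in gamma_targets:
        value = None  # stays None if g lies outside the column's density range
        for k in range(len(gamma_profile) - 1):
            g0, g1 = gamma_profile[k], gamma_profile[k + 1]
            if g0 <= g <= g1:
                w = (g - g0) / (g1 - g0)  # fractional position between levels
                value = (1 - w) * u_profile[k] + w * u_profile[k + 1]
                break
        out.append(value)
    return out
```

Because each column is interpolated independently, neighbouring columns sample slightly different depths, which is one way to see why the interpolated fields acquire a small horizontal divergence.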
The procedure described above for setting model layer thicknesses is equivalent to choosing a single depth-density profile for the entire domain. We chose to use a time and spatial mean over the region occupied by the tracer in the model so that our diagnosed diffusivities are most representative of the tracer's path. However, the $K_z$ values reported should be taken in the context of that depth-density profile (the black curve in Figure 3); the diffusivity inferred varies inversely with the square of the mean vertical density gradient assumed. Calculating the density gradients by approximating the profile within ±100 m of the target surface to a straight line, our estimates might change by a factor of 1.08 if the western profile were used instead of the mean (light blue dotted curve in Figure 3) or by a factor of 0.86 if the eastern profile were used (pink dotted curve in Figure 3). The factors are much more significant for the northern and southern profiles (1.66 and 0.76, respectively). These are extreme examples of variability in the density gradient, however, obtained by going well outside the path of the tracer, especially to the south. The gradient following the tracer varies much less than suggested here, and is close to the gradient we have used in the model.
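The inverse-square dependence follows because the tracer width is fixed in density space: a width in metres scales as 1/(dγ/dz), and a diffusivity scales as width squared over time. The sketch below applies that scaling; the function names are hypothetical, and the gradient ratios it returns are simply those implied by the factors quoted above, not values taken from Figure 3.

```python
def rescale_diffusivity(k_z, gradient_used, gradient_alternative):
    """Rescale K_z diagnosed with one mean density gradient to another.

    K_z is proportional to 1 / (d gamma / dz)^2 for a fixed width in density space.
    """
    return k_z * (gradient_used / gradient_alternative) ** 2

def implied_gradient_ratio(factor):
    """Return gradient_alternative / gradient_used implied by a K_z factor."""
    return factor ** -0.5
```

For example, the factor of 1.08 quoted for the western profile corresponds to a density gradient about 4% weaker than the mean, and the factor of 1.66 for the northern profile to a gradient about 22% weaker.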

Model Initialization
We initialize the model tracer as a small Gaussian patch:
$$C(x, y, z) = \frac{N}{(2\pi)^{3/2}\,\sigma_x \sigma_y \sigma_z} \exp\left[-\frac{(x - x_0)^2}{2\sigma_x^2} - \frac{(y - y_0)^2}{2\sigma_y^2} - \frac{(z - z_0)^2}{2\sigma_z^2}\right], \tag{2}$$
where x, y, z are the model zonal, meridional, and vertical coordinates in meters, N is the total tracer released (388 mol), $\sigma_x$ = 20 km, $\sigma_y$ = 20 km, and $\sigma_z$ = 5 m are the dimensions of the initial tracer patch, ($x_0$, $y_0$) are the coordinates of the release (107°W, 58°S) converted into meters, and $z_0$ is the model depth of the layer corresponding to the target density. The total tracer quantity, patch dimensions, and release location are taken from Ledwell et al. (2011; hereafter L11). We explore the effects on our results of reasonable adjustments to $x_0$, $y_0$, $\sigma_x$, $\sigma_y$, and $\sigma_z$ in section 4.3. We run the simulation in two parts: year 1 from release to US2, and year 2 from US2 up to UK2.5. We optimize parameters for year 1, and use the output as the initial condition for year 2; this is to save computation time as we optimize fewer parameters in year 1 while the tracer remains wholly in the SE Pacific.
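As a check on equation (2), the sketch below evaluates the initial patch and sums it numerically to confirm that the normalization returns the released quantity. Coordinates are taken relative to the release point, and the summation grid (5 km by 5 km by 1.25 m, out to six standard deviations) is an arbitrary choice for this illustration, not the model grid.

```python
import math

N_TOTAL = 388.0   # mol of tracer released
SIGMA_X = 20e3    # m
SIGMA_Y = 20e3    # m
SIGMA_Z = 5.0     # m

def initial_concentration(x, y, z, x0=0.0, y0=0.0, z0=0.0):
    """Tracer concentration (mol m^-3) from equation (2)."""
    norm = N_TOTAL / ((2 * math.pi) ** 1.5 * SIGMA_X * SIGMA_Y * SIGMA_Z)
    arg = ((x - x0) ** 2 / (2 * SIGMA_X ** 2)
           + (y - y0) ** 2 / (2 * SIGMA_Y ** 2)
           + (z - z0) ** 2 / (2 * SIGMA_Z ** 2))
    return norm * math.exp(-arg)

def total_tracer(dx=5e3, dz=1.25, n_sigma=6):
    """Riemann-sum the patch over +/- n_sigma standard deviations (mol)."""
    nx = int(n_sigma * SIGMA_X / dx)
    ny = int(n_sigma * SIGMA_Y / dx)
    nz = int(n_sigma * SIGMA_Z / dz)
    total = 0.0
    for i in range(-nx, nx + 1):
        for j in range(-ny, ny + 1):
            for k in range(-nz, nz + 1):
                total += initial_concentration(i * dx, j * dx, k * dz)
    return total * dx * dx * dz
```

Calling `total_tracer()` should recover N to within numerical error, which confirms that the prefactor in equation (2) normalizes the three-dimensional Gaussian.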

Validation and Optimization

Model Outputs
Model tracer fields are output at times corresponding to US2 (12 months after release), UK2 (23 months), and UK2.5 (27 months) to be compared with observations. A single snapshot in time of the model tracer is taken in the middle of each cruise, under the assumption that cruises are short compared with the length of the tracer experiment (we examine the effect of sampling time on our results in section 4.3). For year 1, since the tracer was confined to the region west of 67°W, $K_z$ and $K_h$ each take a single value throughout the domain, and we optimize these for the best fit to the US2 observations. For year 2, $K_z$ takes one value, $K_{zp}$, west of 67°W and another value, $K_{zd}$, east of that longitude, both independent of depth, as in W13. We optimize $K_{zp}$ and $K_{zd}$ through comparison with the UK2 and UK2.5 observations. The location of the split between the SE Pacific "low diffusivity zone" and the Drake Passage "high diffusivity zone" was chosen for the W13 model as it was found to give the best fit to the observations. We continue to use it here because it is just to the west of the prominent Phoenix Ridge, which is arguably the start of the rough topography of Drake Passage (Figure 1). We also note that the geographical positions of the three transects (Pacific, Albatross, and SR1) surveyed on UK2 and UK2.5 are such that the split at 67°W, intersecting the Albatross transect, is a logical choice for our aim of diagnosing the area-averaged diffusivities based on the tracer observations. Various $K_z$ fields with more spatial variability have been tested and found to give a poorer fit between the model and observed tracer.
10.1002/2017JC013536
For each cruise, we extract a vertical profile of model tracer concentration at the nearest grid cell to each station location marked in Figure 1. We then create mean profiles according to the groupings in Figure 1: one long track for US2; Pacific, Albatross, and SR1 transects for UK2; and Pacific and SR1 transects for UK2.5. The individual profiles are averaged by layer on the model grid to produce mean profiles $C_{mod}(\gamma^n)$. We construct observed mean profiles of tracer concentration, $C_{obs}(\gamma^n)$, in a similar way, having mapped them onto a fine neutral density grid from accompanying in situ observations. We then map both sets of profiles back into z space using the mean depth-density profile described in section 2.2.3 and shown in Figure 3 (black solid line).
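The two-zone diapycnal diffusivity used for year 2 in this subsection can be sketched as a simple function of longitude. The names below are hypothetical, longitudes are taken in degrees east, and assigning a grid cell lying exactly on 67°W to the Drake Passage zone is an assumption made for this illustration.

```python
SPLIT_LON = -67.0  # 67°W expressed in degrees east

def kz_two_zone(lon, k_zp, k_zd):
    """Depth-independent K_z at longitude lon: K_zp west of the split, K_zd east."""
    return k_zp if lon < SPLIT_LON else k_zd

def kz_row(lons, k_zp, k_zd):
    """K_z along one zonal row of grid cells; the same row applies to every layer."""
    return [kz_two_zone(lon, k_zp, k_zd) for lon in lons]
```

For year 1, the same functions can be called with `k_zp == k_zd`, which reproduces the single domain-wide value used while the tracer remains west of 67°W.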

Method of Estimating Diffusivities
We use two cost functions to assess the quality of the model fit to the observations and optimize the model parameters. We optimize the vertical and isopycnal components of the diffusivity separately to save computation time, since the optimal values have been found to be nearly independent of one another. To optimize $K_z$, we fit Gaussians to the model and observed mean profiles in z space and compare the Gaussian widths of these profiles using the cost function $\chi^2_W$ (equation (3)), in which $W^T_{obs}$ and $W^T_{mod}$ are the observed and model mean profile widths for each transect, defined as the square roots of the second moments of the Gaussians, and the sum is over all the transects being compared. We manually adjust one parameter at a time and repeatedly rerun the model to determine the cost function, ceasing the optimization when we have found the $K_z$ values to two significant figures which minimize $\chi^2_W$. As described in the next subsection, for each run $\chi^2$ is actually taken as the mean $\chi^2$ from 5,000 random selections of individual stations to be included in transect mean profiles in a bootstrap approach to estimating uncertainties. This ensures that where we have short transects with significant variability in the widths of individual profiles (e.g., SR1 on UK2, which has only five stations), the cost function is not underestimated where the model mean profile happens to match the observations well. The bootstrap process simulates scenarios whereby a different sampling strategy for the real tracer would have altered the fit between the model and the observations.
We estimate the isopycnal diffusivity, $K_h$, which is taken to be the same east and west of 67°W, by comparing observed and modeled tracer column integrals at individual stations. The column integrals are calculated from equation (4), in which $c_i$ is the modeled or observed concentration at the center of layer i, which is of thickness $\Delta z_i$.
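The width comparison can be illustrated with the sketch below. Two simplifications are assumed and should not be read as the authors' method: the width is computed directly as the square root of the second central moment of the profile rather than from a fitted Gaussian, and the cost is taken as an unnormalized sum of squared width differences over transects, since the exact form of equation (3) is not reproduced in this text.

```python
import math

def profile_width(z, c):
    """Square root of the second central moment of a profile c(z).

    Assumes uniformly spaced z values, so the spacing cancels out.
    """
    total = sum(c)
    mean = sum(zi * ci for zi, ci in zip(z, c)) / total
    var = sum((zi - mean) ** 2 * ci for zi, ci in zip(z, c)) / total
    return math.sqrt(var)

def width_cost(widths_obs, widths_mod):
    """Assumed form of the width cost: sum over transects of squared differences."""
    return sum((wo - wm) ** 2 for wo, wm in zip(widths_obs, widths_mod))
```

For a clean Gaussian profile the moment-based width equals the fitted Gaussian width; for noisy or asymmetric observed profiles the two can differ, which is one reason a fit may be preferred in practice.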
We then define our second cost function (equation (5)), to validate the model lateral tracer distribution, in two parts. The first part compares the model and observed column integrals at each sampled station on a transect, normalized by the number of stations sampled and the variance of the observations on that transect. This provides a measure of the station-by-station quality of the model-observations fit. The second part compares the variance of the model column integrals with the variance of the observed column integrals and is again normalized by the variance of the observations. This term enables the models to capture some of the variability within the transects of the column integrals at scales of 100 km or so (see Figure 6).
Equation (5) is the cost function that we minimize to optimize $K_h$. $n_T$ is the number of stations sampled on a given transect T, $\sigma_{Tmod}$ and $\sigma_{Tobs}$ are the standard deviations of the model and observed column integrals on that transect, and $I^i_{mod}$ and $I^i_{obs}$ are the model and observed individual column integrals on the same transect. We search a wide range of $K_h$ from 10 to 2,000 m² s⁻¹, with approximately a factor of two between trial values, to find the optimum value for years 1 and 2 using both sets of velocity fields. The optimum value for $K_h$ is nearly independent of $K_z$, the only feedback from $K_z$ being a weak one due to shear dispersion. Hence, we optimize $K_h$ first and then use that optimal value in the optimization of $K_z$. Our diagnosed $K_h$ should not be confused with an estimate for the overall isopycnal diffusivity acting on the tracer, since tracer is dispersed by eddies resolved by the model in addition to the specified diffusivity, $K_h$.
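The column integral and the two-part lateral cost function described above can be sketched for a single transect as follows. The column integral follows directly from the definitions of $c_i$ and $\Delta z_i$. The form of the cost is an assumption based only on the verbal description: the second term is taken here as the squared difference of standard deviations divided by the observed variance, and population (divide-by-n) variances are used.

```python
def column_integral(conc, dz):
    """Sum of layer concentration times layer thickness over a profile."""
    return sum(c * h for c, h in zip(conc, dz))

def variance(values):
    """Population variance (assumed convention for this sketch)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def transect_cost(i_mod, i_obs):
    """Assumed two-part misfit between model and observed column integrals."""
    n = len(i_obs)
    var_obs = variance(i_obs)
    var_mod = variance(i_mod)
    # Part 1: station-by-station misfit, normalized by n and the observed variance.
    station_term = sum((m - o) ** 2 for m, o in zip(i_mod, i_obs)) / (n * var_obs)
    # Part 2: mismatch in along-transect variability, same normalization.
    variability_term = (var_mod ** 0.5 - var_obs ** 0.5) ** 2 / var_obs
    return station_term + variability_term
```

The second term penalizes a model transect that matches the observations on average but is too smooth, which is the situation that a large prescribed $K_h$ tends to produce.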

Uncertainty of the Diffusivities
Stations where the tracer was sampled are sparse; in the worst case, there are only five stations along the SR1 section during UK2. To estimate the uncertainty in diffusivity due to this sampling, we employ a bootstrap method in conjunction with the cost functions presented in section 2.3.2. We describe our bootstrap method in this section.
On each transect on UK2 and UK2.5, we select a sample of five stations to contribute to that transect's mean profile (65 stations for the single US2 mean profile) using a random number generator. Stations are drawn with replacement, so a single station may be counted more than once, but the full set of stations is available to contribute to the mean. By choosing a sample of five from each transect on UK2/UK2.5, we ensure that the transects have equal weight in the cost function. Having selected the stations that will contribute to each transect, we calculate the mean profiles for both the model and observed tracer and compute the cost function, $\chi^2_W$, comparing the two, using equation (3). For a given model run, we repeat the random sampling process 5,000 times, computing $\chi^2_W$ each time. We then use the mean of these, $\overline{\chi^2_W}$, as the cost function for that run when evaluating which model run (and therefore which $K_z$) gives the best fit to the observations. When optimizing $K_h$, we compute the cost function values $\chi^2_I$ using the same sampling strategy but compare the column integrals one-by-one instead of by transect, as in equation (5). This is to take account of how well the model has reproduced the lateral distribution of the tracer. We carry out the bootstrapping in the same way as for $\chi^2_W$.
We use our bootstrapped values of $\chi^2$ to estimate the uncertainty on $K_z$ (or $K_h$) due to the sampling of stations in the following way. For the runs fully optimized for both $K_h$ and $K_z$, the 5,000 values of $\chi^2_W$ (or $\chi^2_I$) contributing to the mean are ordered from lowest to highest, and the ninetieth percentile (i.e., the 4,500th value) is taken as the upper limit for the cost function for that run. We then carry out additional model runs, shifting $K_z$ (or $K_h$) upward or downward away from its optimal value until $\overline{\chi^2_W}$ (or $\overline{\chi^2_I}$) has increased above this upper limit. The diffusivities giving these higher cost function values are then taken as the limits on $K_z$ (or $K_h$).
This is equivalent to a standard bootstrap method for estimating the uncertainty on a metric that has an unknown distribution within a population, except that, since we are unable to bootstrap the diffusivities themselves, we use the cost function as a proxy for the diffusivity in order to estimate the uncertainty.
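The resampling procedure in this section can be sketched as below for a single transect. The callback `cost_fn` is hypothetical: it stands for whatever computes the cost from a list of selected station indices (for example, by averaging the chosen profiles and comparing widths). The seed is fixed only so that the illustration is reproducible.

```python
import random

def bootstrap_costs(cost_fn, n_stations, sample_size=5, n_draws=5000, seed=0):
    """Evaluate cost_fn on n_draws random station selections (with replacement)."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_draws):
        picks = [rng.randrange(n_stations) for _ in range(sample_size)]
        costs.append(cost_fn(picks))
    return costs

def mean_and_upper_limit(costs, percentile=90):
    """Return the mean cost and the value at the given percentile.

    With 5,000 draws and percentile=90 this picks the 4,500th ordered value.
    """
    ordered = sorted(costs)
    idx = int(len(ordered) * percentile / 100) - 1
    return sum(ordered) / len(ordered), ordered[idx]
```

The mean serves as the cost for ranking model runs, while the percentile value from the optimal run sets the threshold that the mean cost of a perturbed run must exceed for its diffusivity to be taken as an uncertainty limit.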

Lateral Tracer Distributions
At US2, the lateral distribution of the model tracer advected using the SatGEM velocities is fairly close to the observed column integrals in terms of the general shape, extent, and zonal displacement of the tracer patch (Figure 4a; see also Figure 7 and the discussion of the zonal velocity below). There is also some filamentation of the model tracer, some of which coincides with filaments in the observations, and some of which does not (Figure 6a). $\chi^2_I$ is 1.03 for this output, and the optimal $K_h$ for year 1 is 100 m² s⁻¹ with a range of 20-1,000 m² s⁻¹ at the 90% confidence level. Thus, the fit of the model to the observations is good, and not very sensitive to the value chosen for $K_h$. By UK2 (Figure 4b), the leading edge of the model tracer has gone through Drake Passage, but the bulk of the tracer patch is still some way further west. The model and observed column integrals on the Pacific transect are similar (Figure 6b), with the decrease to the southern edge of the tracer patch matching well with the observations. At the Albatross transect, the model tracer patch seems to extend slightly too far northward and southward, and the variance in the observed column integrals is larger than in the model (Figure 6c), probably indicating that the model velocity fields are too smooth. At SR1, the fit is generally good, although there is very little tracer in the model in the two most northerly stations where more has been found in the observations (Figure 6d). The tracer at UK2 has also become more homogenized, due to the effect of the optimized model $K_h$ of 400 m² s⁻¹ for year 2. This homogenization is also evident in the observations. The uncertainty range for $K_h$ is 30-1,400 m² s⁻¹ at the 90% confidence level. On the Pacific transect at UK2.5 (Figures 4c and 6e), the model and observed column integrals are similar in the north, but in the south model values are too low. There is also evidence of filamentation in the high-resolution section carried out here which is not reproduced in the model. On SR1, the model values are low (Figure 6f), particularly in the north where model tracer has been lost into the region with no SatGEM velocities (see section 4.5). $\chi^2_I$ for the UK2/UK2.5 comparison for this run was 5.07.
Journal of Geophysical Research: Oceans
For the model tracer advected using the SOSE fields, the leading edge at US2 is slightly further east than for the SatGEM run (Figure 5a). The model tracer has higher values than the observations in stations to the east of the tracer patch, suggesting that the model may be advecting slightly too fast. $\chi^2_I$ is 1.25, slightly higher than for the SatGEM run; the optimized $K_h$ is the same at 100 m² s⁻¹. By UK2 (Figure 5b), more of the leading edge of the tracer has gone through Drake Passage compared with the SatGEM run, and the SOSE model tracer perhaps better matches the observations here. The southern edge of the tracer patch as indicated by the observed column integrals at both the Pacific and Albatross transects is captured well in the model (Figures 6b and 6c). The model values are higher than the observations at the northern end of the Albatross and SR1 transects, however, indicating the model tracer may extend too far north here. At UK2.5 (Figures 5c, 6e, and 6f), the model values along the Pacific transect are slightly too low; at SR1, the fit is good in the north but the model values are too low in the south. There is also almost no variability in the model values on the Pacific transect, suggesting that the prescribed $K_h$ may be too large. $\chi^2_I$ for UK2/UK2.5 was 4.56: a slightly better fit than the SatGEM run ($\chi^2_I$ = 5.07). The optimal value of $K_h$ was 400 m² s⁻¹, the same as with SatGEM.
Given the differences between the two velocity field products, one might expect some difference in the optimized $K_h$ values; however, since $\chi^2_I$ is not very sensitive to $K_h$, the fact that they are the same is probably not significant. In summary, both models seem to do a reasonable job of reproducing the lateral dispersion of the tracer, so far as is possible to tell from the observations. The SatGEM velocities produce a better fit at US2 as measured by the cost function; at UK2 and UK2.5 the fit is better with SOSE. Adding a weighting factor to the second term in equation (5) to either double or halve its contribution to $\chi^2_I$ does not affect these comparisons.
For the first 2 years of the experiment, the tracer progresses principally in a zonal direction. By integrating the model tracer column integrals over latitude through the model domain, we can visualize the tracer evolution from release up to UK2.5 (Figure 7). As we saw in Figures 4 and 5, the SOSE velocities advect the tracer into Drake Passage and the model's high diffusivity zone sooner than SatGEM. However, the zonal first moment for the SatGEM runs shows fairly little eastward movement for the first 20 weeks of the simulation, while the SOSE first moment moves eastward steadily from the outset. This reflects the fact that tracer was intentionally released at a stationary hyperbolic point, chosen on the basis of altimetry and geostrophy; a feature captured by SatGEM due to its construction. Using the movement of the center of mass to calculate an average speed from 20 weeks after release to the end of the simulation, the result is very close for the two runs: 0.31° longitude/week for SatGEM and 0.32° for SOSE. So the earlier arrival of the SOSE advected tracer in Drake Passage does not seem to imply a faster transit through Drake Passage compared with SatGEM. In W13, the velocity applied to the SE Pacific zone of the model converts to 0.32° longitude/week, which agrees well with these calculations; however, it should be noted that our estimate is a tracer concentration-weighted average velocity, so is not the same as the velocity applied in W13. The center of mass does not reach the model's high $K_z$ zone (Drake Passage) in either simulation, so the model tracer sampled at SR1 is from the leading edge of the patch in both cases. We can crudely estimate the speed of the tracer advance through Drake Passage from the slope of the tracer leading edge in Figure 7, obtaining 0.64° longitude/week for the SatGEM model and 0.60° for SOSE; these compare to an equivalent of 0.66° longitude/week applied to the model of W13, with a stronger caveat due to the differing methods of calculation.
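The center-of-mass speed quoted above can be computed from the meridionally integrated tracer as sketched below. The function names are hypothetical, longitudes are in degrees east, and the conversion to metres per second assumes a spherical Earth and a single representative latitude (the release latitude of 58°S is used in the example), which is an approximation introduced here.

```python
import math

EARTH_RADIUS_M = 6.371e6
SECONDS_PER_WEEK = 7 * 24 * 3600

def zonal_centroid(lons, tracer):
    """Tracer-weighted mean longitude (zonal first moment)."""
    total = sum(tracer)
    return sum(lon * t for lon, t in zip(lons, tracer)) / total

def mean_zonal_speed(weeks, centroids, start_week=20):
    """Average centroid speed (degrees longitude per week) from start_week to the end."""
    i0 = weeks.index(start_week)
    return (centroids[-1] - centroids[i0]) / (weeks[-1] - weeks[i0])

def to_metres_per_second(deg_per_week, lat_deg):
    """Convert degrees longitude per week to m/s at a given latitude."""
    metres_per_degree = (math.radians(1.0) * EARTH_RADIUS_M
                         * math.cos(math.radians(lat_deg)))
    return deg_per_week * metres_per_degree / SECONDS_PER_WEEK
```

Under these assumptions, `to_metres_per_second(0.31, -58.0)` gives roughly 0.03 m s⁻¹, a tracer-weighted drift speed rather than a mean current speed.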

Vertical Tracer Distributions
The model mean profiles from the optimized SatGEM run compare fairly well with the observations, although not all are within observational uncertainty (Figure 8). At US2, the model mean profiles almost match the observations within uncertainties. At UK2, the model profiles on all three transects match well with the observations, with the exception of the shallow tail on the Pacific transect due to a slight asymmetry in the observed profile. At UK2.5, the fit is slightly less good, with the model underestimating the amount of tracer at both transects.
For the optimized SOSE run, the model slightly underestimates the tracer at US2 (Figure 9). At UK2, the model mean profiles all agree with the observations within uncertainties, but the agreement is not quite as ...
We can compare the vertical spread of the tracer in the observations with the models by looking at the vertical widths of Gaussians fitted to the mean profiles on each transect (Table 2). For US2, which is used to diagnose the year 1 K_z in the SE Pacific, the two models and the observations are in agreement. This is because the diffusivity only needs to fit one profile, so it may be fully optimized to fit the observations (the 0.1 m difference is due to only having optimized K_z to the nearest 0.1 × 10⁻⁵ m² s⁻¹ to save on model runs). For the remaining six transects, the model mean profiles reflect the best fit that can be achieved for each profile while adjusting the two parameters K_zp and K_zd. For all transects, the model mean profile widths are within 7% of the observed widths, indicating that the assumption of two zones is sufficient to explain the tracer evolution. The Pacific transect widths are dependent only on K_zp, while the Albatross transect width is dependent on both K_zp and K_zd. The SR1 transect widths are dependent almost entirely on K_zd since it is much larger than K_zp. The SatGEM velocity field seems to give a better fit than the SOSE velocity field, as the widths at the Albatross, UK2 SR1, and UK2.5 SR1 transects are closer to the observations; this is backed up by the cost function χ²_W, which is 171.2 for the SatGEM run and 185.5 for the SOSE run.
Table 3 summarizes the diffusivities obtained by optimizing both the SatGEM and SOSE implementations of our model. The two models are in good agreement with one another, and both show a 20-fold increase in K_z between the SE Pacific and Drake Passage zones in year 2.
Both also show a slight increase in the SE Pacific diffusivity between years 1 and 2, but it is well within the uncertainties. The similarity between the results for the two models is not surprising, since both velocity fields are designed to mimic the observations; therefore tracer transit times, and consequently the diagnosed K_z, should be similar. It is reassuring that although they are constructed with different methodologies, the two velocity fields are in reasonably close agreement. Our optimized values for K_z in both the SE Pacific and Drake Passage regions for both model implementations agree with W13 within the uncertainties. Our uncertainty ranges are rather more generous, principally because we fully take account of the error due to the selection of stations to include in the cost function calculation, which is quite large. It is encouraging to note that our K_zd for both SatGEM and SOSE agrees with W13 to within the smaller uncertainties quoted in that paper. Whereas in our model the SE Pacific K_z was optimized separately for years 1 and 2 of the tracer experiment, the W13 model used the same value from release up to UK2.5. That both the SatGEM and SOSE runs here show an increase between year 1 and year 2, and that the W13 value lies between the SatGEM year 1 and year 2 estimates, might point toward temporal variability in the mixing rate in the SE Pacific. This could be explained by an increase in the strength and/or frequency of storms, since changes in wind stress cause downward propagating internal waves which result in diapycnal mixing when they break. In the region occupied by the tracer in years 1-2, there is an increase in the wind stress that is statistically significant at the 95% level, based on ERA-Interim wind fields (Dee et al., 2011) with a wind stress formulation from Large and Yeager (2009).
Larger wind stresses would tend to be associated with larger wind stress changes, indicating that such downward propagating waves are a likely candidate for the increase in K_z diagnosed here. Depth-density profiles calculated from SatGEM averaged over parts of the SE Pacific occupied by the model tracer in year 1 (110°W-80°W) and year 2 (105°W-70°W) are virtually identical around the tracer depth, so a change in stratification can probably be ruled out as a contributing factor.

Sensitivity of K_z Estimates
The results of an exploration of various factors that might affect our estimates of K_z are presented in Table 4. In each case, the model was rerun multiple times in order to reoptimize each K_z with the altered parameters. To test the effect of varying K_h, we reoptimized the SatGEM model K_z for the extremes of ... well the location of the tracer release and have measurements of the initial diapycnal spread of the tracer (Ledwell et al., 2011), the velocity fields from the models at the location of the release may not be accurate in detail. To test the sensitivity of the model runs to initial conditions, we halved and doubled the horizontal and vertical widths in four separate sets of runs. We also moved the release location by two model grid cells northwest, northeast, southwest, and southeast. For each initialization, we reoptimized all three K_z values. To test the effect of sample time, the model outputs taken to correspond to each cruise were shifted by 3 weeks earlier or later. For all the parameters explored, the effect on the optimized K_z values was small, and in many cases there was no effect to within the two-significant-figure search tolerance for optimization.

Diapycnal Velocity
The effect of diapycnal diffusion that is important to the Meridional Overturning Circulation is modification of the density of the fluid. We can estimate the diapycnal velocity of the tracer due to K_z acting on the ...
Figure caption: Mean concentration profiles have been mapped from neutral density to depth using the standard depth-density curve for this study and are plotted relative to the mean depth of the tracer target density. The uncertainties on the mean profiles, calculated from the standard error of the individual profiles contributing to the mean, are shown as shaded areas.
where γ here is neutral density and K_z is assumed to be the same for salt, tracer, and heat. We can write this equation in terms of the diapycnal velocity w_γ:
$$\left(w_\gamma - \frac{\partial K_z}{\partial z}\right)\frac{\partial \gamma}{\partial z} = K_z\,\frac{\partial^2 \gamma}{\partial z^2}.$$
The vertical density gradient of the standard density profile used here on the target isopycnal of the tracer release is −2.7 × 10⁻⁴ kg m⁻⁴, estimated by fitting a quadratic function to γ(z) in a 312 m depth range centered on the tracer target depth (the fit is not sensitive to different choices of range from 150 to 512 m). The second derivative of the neutral density from this quadratic fit is −4.30 × 10⁻⁷ kg m⁻⁵. We have not attempted here to place limits on the vertical gradient of the diffusivity, which would be reflected in asymmetry of the tracer profiles. Though the asymmetries appear small in Figures 8 and 9, Ledwell et al. (1998) found that the shape of mean tracer profiles in a similar tracer release experiment was not very sensitive to vertical variations in diffusivity. Work that is underway by the authors indicates that there may be a significant increase in diffusivity with depth, and this would be consistent with the work of Mashayek et al. (2017), who find the DIMES tracer diffusivity in Drake Passage is dominated by near-bottom processes. Thus, the ∂K_z/∂z term in (7) may be nonnegligible and have negative sign. Perhaps an extreme limit of the upward diapycnal velocity that we are neglecting can be found using (7) and setting ∂K_z/∂z = 0, which gives a value of 20.4 m/yr in Drake Passage. We can estimate the transit time across Drake Passage from the start of our high diffusivity zone at 67°W to the SR1 transect at 58°W to be 29 weeks, based on the average speed of the center of mass of the SatGEM model tracer established in section 3. This implies a mean diapycnal displacement of the center of mass of 11.4 m for the tracer up to SR1, which is around the thickness of a model grid cell at the tracer depth.
Such a displacement should not affect the diagnosed K_z and represents an upper limit. A more conservative estimate could be made using the transit time implied by the leading edge of the model tracer in Figure 7, around 14 weeks. This would give an even smaller diapycnal displacement of 5.5 m. Note that it is the square of these displacements (of order 100 m²) compared with the square of the widths of the mean profiles at SR1 (of order 8,000 m²) that gives a measure of the effect of diapycnal displacements on the estimated diffusivity.
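The arithmetic behind these displacement estimates can be retraced with a short sketch. It assumes the diapycnal velocity relation with the vertical gradient of K_z set to zero, uses the density derivatives, the 20.4 m/yr velocity, and the tracer speeds quoted in the text, and assumes a 52-week year; the function names are illustrative. K_z is left as an argument because the optimized Drake Passage value is tabulated elsewhere in the paper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
WEEKS_PER_YEAR = 52.0  # assumption used to convert m/yr to m/week

# Derivatives of neutral density at the tracer target depth (from the text).
DGAMMA_DZ = -2.7e-4     # kg m^-4
D2GAMMA_DZ2 = -4.30e-7  # kg m^-5


def diapycnal_velocity(k_z, dkz_dz=0.0):
    """Diapycnal velocity (m/s) from (w - dK/dz) * dgamma/dz = K * d2gamma/dz2."""
    return dkz_dz + k_z * D2GAMMA_DZ2 / DGAMMA_DZ


def transit_weeks(lon_start, lon_end, speed_deg_per_week):
    """Weeks needed to cross a longitude range at a given zonal speed."""
    return abs(lon_end - lon_start) / speed_deg_per_week


def diapycnal_displacement(w_m_per_year, weeks):
    """Displacement (m) accumulated at velocity w over a number of weeks."""
    return w_m_per_year * weeks / WEEKS_PER_YEAR


# Transit from 67W to 58W at the center-of-mass and leading-edge speeds.
weeks_center = transit_weeks(-67.0, -58.0, 0.31)
weeks_edge = transit_weeks(-67.0, -58.0, 0.64)

# Displacements for the quoted 20.4 m/yr using the rounded transit times.
shift_center = diapycnal_displacement(20.4, 29)
shift_edge = diapycnal_displacement(20.4, 14)

# Squared displacement relative to the squared SR1 profile width scale.
relative_effect = shift_center ** 2 / 8000.0
```

With these inputs the transit times round to 29 and 14 weeks and the displacements to 11.4 m and 5.5 m, and the squared displacement is a small fraction of the squared profile width, which is the comparison that matters for the diagnosed diffusivity.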

Effects of the Offline Approximation
There are a few consequences of our decision to use an offline tracer advection in an isopycnal framework. In reality, the distance between isopycnals varies in time and space; in our model, they are fixed on each density level. One effect of these fluctuations on the tracer is a bolus flux (Gent et al., 1995), which causes the lateral movement of tracer down the thickness gradient. To investigate the impact of neglected effects, we ran an online experiment with the MITgcm to compare with our offline SOSE model implementation (see Appendix A). This suggests that our estimates of K_z might increase by 10% if these details were accounted for. However, we note that the online experiment resulted in a poorer fit to the observations in terms of the lateral distribution of the tracer. It may well be that our offline simulations more accurately reproduce the tracer evolution, even if that may be as a result of compensating errors.
Note (Table 4). K_z uncertainties δK_z are in × 10⁻⁵ m² s⁻¹. Values given are the largest change in optimized diffusivity found when varying each factor within the explored ranges. Where values of δK_z are reported as 0, this means the uncertainty is smaller than the search tolerance for optimizing K_z (two significant figures).
In addition, the movement of isopycnals in the real ocean means that vertical velocities are required in order to conserve volume. We neglect the vertical velocity, and our horizontal velocity fields are not perfectly nondivergent. This manifests in imperfect conservation of tracer in our model runs (note that the flux limiter used in the advection scheme can also affect tracer conservation). For the model implementation using the SOSE velocities, the overall tracer is no more than 2.05% from perfectly conserved over the full 2 years of the run. For SatGEM, the conservation is better than 1.25% in year 1, but in year 2 drops off sharply (up to 10% loss by the end of the run) due to the tracer advecting into the areas marked in white in Figure 2 where there are no SatGEM velocities. The good conservation in the SOSE model and in SatGEM for year 1 while the tracer remains within the SatGEM fields is reassuring. Nevertheless, even with good overall conservation it is still possible that local divergences in our velocity fields could have an unexpected effect on our measurement of K_z. We carry out an experiment to quantify any spurious diapycnal mixing in the model (see Appendix B), and conclude that it is less than 0.15 × 10⁻⁵ m² s⁻¹ in the SE Pacific and less than 1.5 × 10⁻⁵ m² s⁻¹ in Drake Passage.

Other Estimates of Diffusivity
The value of the diffusivity of (1.6 ± 0.2) × 10⁻⁵ m² s⁻¹ for the first year in the SE Pacific reported here is higher than the value of (1.3 ± 0.2) × 10⁻⁵ m² s⁻¹ reported by L11. Though the uncertainties overlap, one might expect the best estimate and confidence intervals from two legitimate statistical approaches to the same data set to be in closer agreement. L11 fit the evolution of the tracer profile from the initial observed mean profile to the observed mean profile from US2 with a 1-D diffusion equation, as is appropriate for diffusion of a tracer in a statistically homogeneous region with distant boundaries. Their cost function was based on variations in the shape of the observed profiles. They obtained nearly the same result, however, by fitting Gaussians to the mean initial and US2 tracer profiles, which is more akin to what is done in the present work. The density profile used to transform the mean tracer profile from density to depth was virtually the same in the two analyses. Most of the difference between the results must be due to details of the nonlinear least squares fits of Gaussians to the mean tracer profiles, and to the bootstrap approach used here, but not in L11. Perhaps the difference in the results, compared with the internally generated uncertainty estimates, can be taken as an example of how large the effect can be of taking different reasonable approaches to the same data set.
Turbulent kinetic energy dissipation rates were found to be much greater in Drake Passage than upstream in the SE Pacific (St. Laurent et al., 2012), though the diffusivity inferred from the dissipation rates in the layer occupied by the tracer appeared to fall short of that indicated by our results and those of W13 by nearly an order of magnitude. A possible explanation for the discrepancy was offered by Mashayek et al. (2017), who calculated the diapycnal dispersion of a numerical tracer released at the entrance to Drake Passage with a high-resolution model. In their model, the diapycnal diffusivity was based on the relation between dissipation and height above bottom averaged over all the microstructure measurements available from Drake Passage. They found that the model tracer spent enough time in the vicinity of topographic peaks, where average dissipation rates, and hence in their model diffusivity, are very large, that the overall diapycnal diffusivity felt by the tracer was of the same order as that observed.
Estimates of diapycnal diffusivity in the ACC from less direct methods based on profiles of shear and strain at scales greater than 10 m have been made by several investigators, anticipating our results, at least qualitatively. Kunze et al. (2006, their Figure 10) estimated diffusivities on the order of 10⁻⁵ m² s⁻¹ between 50°S and 60°S at 110°W from WOCE LADCP/CTD profiles, in agreement with the DIMES tracer and microstructure measurements in the SE Pacific. Thompson et al. (2007) estimated diffusivities greater than 5 × 10⁻⁴ m² s⁻¹ from CTD strain in their bottom bin between 900 and 1,000 m depth in the ACC in Drake Passage, and even higher values from Thorpe scale analyses. Their bottom bin did not quite reach as deep as the tracer layer but was close to that layer where it rises toward the Polar Front. Wu et al. (2011) estimated diffusivities of 2.5 × 10⁻⁵ m² s⁻¹ in the layer occupied by the tracer in the SE Pacific from strain spectra measured by profiling Argo floats, and values greater than 10⁻⁴ m² s⁻¹ in Drake Passage, though not as high as we have reported. Whalen et al. (2012) reported diffusivities from strain spectra from Argo profiles of less than 10⁻⁵ m² s⁻¹ in the SE Pacific and of order 10⁻⁴ m² s⁻¹ in Drake Passage between 1,000 and 2,000 m depth in the ACC.
It is promising that the relatively inexpensive observations of strain and shear from Argo floats and LADCP/CTD profiles give diffusivities of the same order of magnitude, and with similar patterns, as more direct observations of the dispersion of a tracer. However, it must be remembered that the chain of assumptions and approximations leading from shear and strain at scales of 10 m or more to turbulent diapycnal diffusion is a long one. See especially Kunze et al. (2006) for a full discussion. Comparisons with direct measurements from tracer release experiments, such as we have in DIMES and earlier tracer experiments, are required to build the credibility of the less direct techniques. A similar statement can be made about diffusivities inferred from turbulence dissipation measurements, though those are more expensive than shear and strain measurements on the one hand, but more directly related to diffusivity on the other.

Interpretation and Mechanisms
High dissipation rates and diapycnal diffusivities in deep Drake Passage were estimated from LADCP shear profiles by Naveira Garabato et al. (2004). They proposed that the mechanism was the interaction of geostrophic flows with the rough bathymetry. Nikurashin and Ferrari (2010a, 2010b) and Scott et al. (2011) developed a theory for such interactions and applied the theory and numerical simulations to the environment characteristic of Drake Passage. The amount of dissipation predicted due to lee waves generated by flow over 2-D bathymetry for steepness characteristic of Drake Passage was more than an order of magnitude larger than estimated from microstructure profiles, but the estimate was lowered by a factor of 5 when flow around features allowed by 3-D topography was taken into account (Nikurashin et al., 2014). Cusack et al. (2017) report evidence for an energetic wave with the characteristics of a lee wave generated by flow over the Shackleton Fracture Zone. Thus, much of the enhancement of turbulent mixing in Drake Passage could well be due to the breaking of lee waves generated by the interaction of strong near-bottom geostrophic flows with complex topography, the likes of which can be seen in other parts of the Southern Ocean such as the Scotia Sea, Kerguelen Plateau, and the southeast Indian Ridge (see, e.g., Brearley et al., 2013; Nikurashin & Ferrari, 2011; Sloyan, 2005; Waterman et al., 2013). Furthermore, de Lavergne et al. (2016) and Ferrari et al. (2016) argue that convergence of buoyancy flux into deep waters of the ocean is dominated by mixing in the vicinity of continental slopes, ocean ridges, and other bathymetric features. There is some evidence that the tracer distribution observed in Drake Passage had been affected by such boundary mixing; we see some amplification of the diapycnal spread of tracer in the two stations closest to the continental slope (see, for example, stations 54 and 55 in Figure 3.3 of the JR276 cruise report, available from BODC).
In addition, the numerical simulations of Mashayek et al. (2017) suggest that mixing of the tracer was enhanced over many topographic features of Drake Passage, such as the Phoenix Ridge and Shackleton Fracture Zone that cross the passage. In their simulation, the effect is rapidly homogenized across the ACC by eddy stirring, so that spatial patterns of the tracer dispersion may not reflect topographic enhancement at specific sites.

Implications for the Overturning Circulation
W13 estimated that a 20-fold enhancement of K_z in Drake Passage over background levels was consistent with a 4 Sv contribution by diapycnal mixing to the Meridional Overturning Circulation at the density of the deepest Upper Circumpolar Deep Water. They extrapolated the enhancement of diffusivity over rough topography to the whole of the ACC using an estimate of lee wave generation for the Southern Ocean (Nikurashin & Ferrari, 2011), and neglected possible covariances between K_z, ∂ρ/∂z, and ∂²ρ/∂z², also assuming K_z to be independent of depth. If we make the same assumptions, our confirmation of the enhancement of mixing over the rough topography of Drake Passage supports that estimate. However, the work of Mashayek et al. (2017) suggesting that much of the mixing of the tracer occurred close to topography implies a significant role for ∂K_z/∂z, as K_z decreases with height above the bottom over rough topography. What we can say with certainty is that there is a significant downward buoyancy flux through the density layer occupied by the tracer. If we assume the mean density gradient from section 4.4 of −2.7 × 10⁻⁴ kg m⁻⁴, we can estimate the buoyancy flux as −(g/ρ) K_z dρ/dz, where g is the acceleration due to gravity, equal to 9.8 m s⁻². The diapycnal diffusivity found here for Drake Passage implies a downward buoyancy flux of approximately 10⁻⁹ m² s⁻³ through the layer occupied by the tracer, if the diffusivities of heat and salt are similar to that of tracer and to one another. This buoyancy flux is significant compared with annual circumpolar mean air-sea buoyancy fluxes, whose absolute values are of order 5 × 10⁻⁹ m² s⁻³ or less in the ACC (e.g., Sallée et al., 2010). The buoyancy flux implied by the background diffusivity we have determined for the SE Pacific would be 4 × 10⁻¹¹ m² s⁻³ if extrapolated to the whole ACC.
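The order of magnitude of these buoyancy fluxes can be checked with a minimal sketch of the expression −(g/ρ) K_z dρ/dz. The reference density of 1028 kg m⁻³ is an assumption (the text does not state the value used), the SE Pacific diffusivity is the year 1 value of 1.6 × 10⁻⁵ m² s⁻¹ quoted above, and the Drake Passage diffusivity is simply taken as 20 times that background as an illustration of the 20-fold enhancement rather than as the optimized value.

```python
G = 9.8              # m s^-2, acceleration due to gravity (from the text)
RHO_REF = 1028.0     # kg m^-3, assumed reference density (not from the text)
DRHO_DZ = -2.7e-4    # kg m^-4, mean density gradient at the tracer depth


def buoyancy_flux_magnitude(k_z, drho_dz=DRHO_DZ, rho=RHO_REF, g=G):
    """Magnitude (m^2 s^-3) of the downward buoyancy flux -(g/rho) K_z drho/dz."""
    return -(g / rho) * k_z * drho_dz


# Background SE Pacific diffusivity (year 1 value quoted in the text).
K_SE_PACIFIC = 1.6e-5
# Illustrative Drake Passage value: a 20-fold enhancement of the background.
K_DRAKE_ILLUSTRATIVE = 20.0 * K_SE_PACIFIC

flux_se_pacific = buoyancy_flux_magnitude(K_SE_PACIFIC)
flux_drake = buoyancy_flux_magnitude(K_DRAKE_ILLUSTRATIVE)
```

Under these assumptions the SE Pacific flux comes out near 4 × 10⁻¹¹ m² s⁻³ and the Drake Passage flux is of order 10⁻⁹ m² s⁻³, in line with the values quoted above. Since the expression is linear in K_z, the flux ratio simply mirrors the enhancement of the diffusivity.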
The tracer layer is centered near γⁿ = 27.91 kg m⁻³, which in the ACC region lies near the boundary between the Upper and Lower Circumpolar Deep Water. More specifically, according to Lumpkin and Speer (2007), it is at the bottom of the south-going limb of the deep Meridional Overturning Circulation. Thus, buoyancy is being transported by interior diapycnal mixing from somewhere above the tracer level to somewhere below, where it must ultimately converge. This convergence may contribute to the lightening of bottom water in the deep north-going limb of the overturning circulation. If the source of the buoyancy, i.e., buoyancy divergence, were in the south-going upper limb, it would oppose the strong tendency of that water to become lighter as it travelled south (see Lumpkin & Speer, 2007). If buoyancy is being drained from the north-going limb of the upper cell, then it would contribute to the densification of water seen in that limb.

Conclusion
Our results have confirmed those previously obtained using a much simpler model. W13 were able to arrive at accurate estimates for the area-averaged diffusivities in the SE Pacific and Drake Passage regions implied by the evolution of the DIMES tracer using such a simple model for a number of reasons. The fast zonal flow of the ACC on its approach to and journey through Drake Passage means that the evolution of the DIMES tracer was adequately described by the simple area-averaged zonal velocities used in W13. Nonetheless, they were fortunate that their optimization resulted in zonal velocities that were appropriate for diagnosing K_z, given the weak constraints on these velocities. In addition, the sampling strategy of cross-ACC transects employed in the DIMES tracer experiment and the locations of the three transects mean that a model with two zones for mean diffusivity and zonal velocity lends itself well to reproducing the observations. Since the observed mean profiles at each transect are themselves the result of integrated effects on the tracer, it is only the area-averaged quantities between the transects that are constrained by these observations.
The strength of the current work over the simpler model of W13 is that we have successfully validated the models' lateral tracer evolution through comparison of the column integrals with the observations (Figure 6). The resolution of our models is not high enough to capture variability observed at scales of less than 100 km; this smaller-scale variability is no doubt due to filamentation of the tracer by straining motions not resolved by the models, coupled with the larger-scale tracer gradients. However, the models have done a good job of reproducing the tracer evolution on larger scales, which are most relevant to the transit times between transects, and hence to the determination of K_z. Consequently, we can be satisfied that our result for K_z is robust. We also found an increase in the optimal lateral diffusivities applied to the models between years 1 and 2, though the fit to the data is not very sensitive to K_h. Early in the simulation, a large prescribed K_h homogenizes the tracer too quickly. Later on, when the tracer patch has grown and taken on a large-scale structure with less filamentation, larger values of model K_h fit the data.
The implementation of a 3-D model to diagnose diapycnal diffusivities from the DIMES tracer has been a worthwhile exercise in that it has confirmed the result of W13 and provided increased confidence in its conclusions. We might add that the ability of the models to reproduce the along-isopycnal movement and dispersion of a tracer at 1,500 m depth speaks well for their accuracy. The ability to advect the tracer with realistic velocity fields and compare the resulting model tracer distributions with the observations is important to the robust determination of K_z and should be considered a valuable step to take in analyzing the results of similar future tracer release experiments.
"online" primitive equation model of MITgcm. Its setup is equivalent to our offline SOSE model runs except that the vertical grid used is the original SOSE grid, with 42 vertical levels covering the whole water column, and the online simulation contains the full 3-D velocity fields (u, v, w). We switch off the Gent-McWilliams parameterization (GM) in the online run so that it only includes effects due to the model physics resolved at 1/6°. The ability of a model to resolve mesoscale eddies is dependent on the ratio of the first baroclinic deformation radius to the model horizontal grid spacing. Hallberg (2013) argues that in a region where eddies are resolved, GM will suppress them rather than replicating their effects. In the DIMES region, the deformation radius and SOSE grid spacing are similar (see Jones et al., 2016, Figure S1), which means that larger mesoscale features are well resolved; hence, the disabling of GM is appropriate. We cannot use the online run to look at vertical dispersion because of the much coarser vertical resolution, but we compare the lateral dispersion by calculating the zonal first moment of the tracer patch as it evolves (Figure A1). The progress of the center of mass is somewhat slower in the online model, with a rate of eastward propagation of 0.29° longitude/week after week 20 (to compare with 0.32°/week calculated for the offline model). Therefore, we estimate that our offline framework could add a relative uncertainty of 10% to the diagnosed K_z. However, we note that χ²_I at UK2/UK2.5 for the online simulation is 6.25; the fit to the observations is worse than that for either of our offline models.

Appendix B: Quantifying Spurious K_z
We have stated that a principal requirement of our model is that the diapycnal dispersion of the tracer is as far as possible caused by the prescribed K_z field, with minimal spurious diapycnal diffusion. To test this, we ran an experiment to measure the change in the mean profile vertical widths obtained from sampling the tracer at each of our five UK2/UK2.5 transects under the influence of a prescribed model K_z set to zero everywhere. As for our main study, this experiment is run in two stages. First, we initialize the tracer with our normal Gaussian initial condition and subject it to a uniform diffusivity for 1 year, advected using our SatGEM offline model setup. We then take the output from this run as the initial condition for a year 2 run, this time with K_z = 0. We measure the mean profile widths at the UK2/UK2.5 transects at the beginning and end of the year 2 run and look for any change. We find that there is a small change in the widths of some mean profiles under zero K_z, and that the change is larger for wider profiles. For profiles 25 m wide, appropriate for the US2 optimization, the largest change between start and end for year 2 implies a diffusivity of 1.4 × 10⁻⁷ m² s⁻¹, which is negligible. For profiles 50 m wide, the width at the Albatross and Pacific transects used to optimize K_zp, the largest change implies a diffusivity of 1.5 × 10⁻⁶ m² s⁻¹, a small error on our year 2 SE Pacific diffusivity. Finally, for profiles of 90 m width, needed to optimize K_zd, the implied diffusivity is 1.5 × 10⁻⁵ m² s⁻¹, again introducing only a small percentage error on the estimated value for Drake Passage.
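One way to turn a change in profile width into an implied diffusivity is sketched below. It assumes that the quoted width is the standard deviation of a Gaussian profile spreading under a constant diffusivity, so that the variance grows as 2 K_z t; this relation and the function names are illustrative assumptions, since the exact conversion used for the values above is not spelled out here.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_diffusivity(sigma_start, sigma_end, elapsed_seconds):
    """Diffusivity (m^2/s) implied by Gaussian width growth: K = d(sigma^2) / (2 dt)."""
    return (sigma_end ** 2 - sigma_start ** 2) / (2.0 * elapsed_seconds)


def width_after(sigma_start, k_z, elapsed_seconds):
    """Gaussian width (m) after spreading at constant diffusivity k_z."""
    return (sigma_start ** 2 + 2.0 * k_z * elapsed_seconds) ** 0.5


# Round trip for a hypothetical 90 m wide profile over a 1 year run:
# spread it with a known diffusivity, then recover that diffusivity.
sigma_end = width_after(90.0, 1.5e-5, SECONDS_PER_YEAR)
recovered = implied_diffusivity(90.0, sigma_end, SECONDS_PER_YEAR)
```

Because the implied diffusivity depends on the difference of squared widths, a given absolute change in width corresponds to a larger diffusivity for a wider profile, which is consistent with the larger spurious values found for the wider Drake Passage profiles.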