Repeatability analysis of airborne electromagnetic surveys
 Avril Hegarty^{1},
 Gerry Stanley^{3},
 Eugene Kashdan^{2, 4},
 Jim Hodgson^{3} and
 Andrew C. Parnell^{2, 5}
DOI: 10.1186/s40929-016-0008-1
© The Author(s) 2016
Received: 13 October 2015
Accepted: 19 August 2016
Published: 9 September 2016
Abstract
Purpose
We provide methods for determining the repeatability of airborne electromagnetic surveys when conducted at different altitudes over a number of repeated flights. Our data arise from the TELLUS project carried out by the Geological Surveys of Ireland and Northern Ireland and we examine the repeatability of the apparent resistivity at different frequencies.
Methods
After considering a number of issues with the data, we propose two different models from the functional data analysis literature: a Wiener process with random effects, and a penalised spline smoother.
Results
Both methods arrive at the same conclusion regarding repeatability of the data; results obtained are more repeatable for flights at lower altitudes.
Conclusions
The target altitude for aircraft carrying out airborne electromagnetic surveys should be as low as possible.
Keywords
Apparent resistivity; Functional data analysis; P-splines; TELLUS project; Wiener process
Mathematics Subject Classification (2010)
MSC 62
Introduction
Airborne electromagnetic (AEM) surveying is a common induction technique [1–3] used to interpret subsurface geology, structure, mineralization and contamination. To conduct an AEM survey, a transmitter (coil) on the aircraft emits a sinusoidally varying current at a specific frequency. This generated magnetic field induces a secondary electric field within the ground. The receiver coil on the aircraft measures this response and the relationship between the two fields can be used to determine the apparent resistivity of the ground. Electromagnetic (EM) surveys are commonly carried out in either the frequency domain (FEM), where the effects are measured at different frequencies, or in the time domain (TEM). The study used in this paper is of the former type, the main output of which is the apparent resistivity (R) at different frequencies. Our main concern in this paper is the repeatability of these apparent resistivity values, especially with respect to the target altitude at which the aircraft is aiming to fly.
Following the ISO definition [5], we take repeatability to mean:
“The closeness of agreement of independent test results obtained using the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time.”
We show that the test run flights are most repeatable when undertaken at the lowest feasible target altitudes.
While it is generally recognised that repeatability is an important part of AEM surveys [6], few studies have been published. Most studies that do mention repeatability assess it by repeating a test line flight daily or on a number of occasions at a single altitude [7, 8]. For example, Green and Lane [9] analysed AEM data from a flight line flown repeatedly at a single target altitude to monitor system performance. They described an approximate method for correcting for altitude and obtained a good measure of the repeatability after applying this correction. Foged et al. [10] investigated repeatability of airborne and ground-based TEM systems at three different altitudes (10, 20 and 30 m) and concluded that results were satisfactorily repeatable within and between altitudes, and that there was good agreement with a ground-based reference section. A more extensive study of repeatability was carried out by Huang and Cogbill [11], who concluded that spatially consistent flight paths are required for repeatability analysis of the EM data, and that this analysis is more meaningful if the apparent resistivity is used instead of the EM response itself. Our paper is an attempt to follow and validate this recommendation.
 1. An initial data clean-up stage to remove flights that went off course, and areas of the test line where data were recorded over water.
 2. An exploratory data analysis to determine which variables are important and to reveal any hidden structure in the data.
 3. Statistical models to quantify the variability between test flights that took place at the same target altitude.
For each flight and for each frequency we have the aforementioned in-phase and out-of-phase components, but we focus our analysis on the apparent resistivities since the former tend to vary with altitude. This is in line with Huang and Cogbill’s conclusion [11]. To our knowledge this is the first paper to compare statistically the repeatability at different altitudes of an FEM survey.
We do not attempt to quantify any error in the apparent resistivity arising from sources other than the altitude of the flight and its position across the test line route. Furthermore, high repeatability, which we estimate via a variance computation, does not necessarily correspond to a reduction in bias. As we will show, in nearly all cases the flights that were most repeatable were those where the aircraft was aiming to fly at the lowest altitudes. This suggests that future flights to determine apparent resistivity should also be conducted at the lowest feasible altitude. However, it is possible (though unlikely) that apparent resistivity data from flights taken at the lowest altitude also contain considerable bias, and that a higher altitude may be preferable when considering both repeatability and bias simultaneously. While ground-based apparent resistivity measurements were available for a short 300 m section of the flight test line, these data were insufficient to examine this potential bias, so we leave this as a topic for further research.
We evaluate the repeatability of the apparent resistivity via two methods from the field of functional data analysis (FDA; e.g. [12]). Since there are many different ways to analyse such data, we choose a Bayesian and a frequentist version, the former corresponding to a functional ANOVA (FANOVA) model. We evaluate the repeatability for each target altitude using a signal to noise ratio (SNR) [13] appropriate to each model. (Note that there are a number of different definitions of SNR; we use the reciprocal of the coefficient of variation.) We find that both approaches reach the same conclusion: lower altitudes are more repeatable.
This paper is structured as follows. In Section “Methods” we describe the design of the study, outline our data set and perform some exploratory data analysis. The data require careful cleaning before analysis and we document these steps here. In Section “Statistical models for measuring repeatability of apparent resistivity” we outline our two statistical approaches and detail their various advantages and shortcomings. We discuss results in Section “Results and discussion” and conclude with some ideas for further analysis in Section “Conclusions”.
Methods
Design of study
Test flights were flown along a 6 km test line with readings taken every 0.1 s, or approximately every 6 m. The aircraft used for all survey work was a De Havilland DHC-6 Twin Otter (registration number C-GSGF). A map of the test line is given in the left panel of Fig. 1. One end of the test line, for approximately 1 km, was over the sea. There were 5 individual flights on different days, and each of these flew up and down the line, changing altitude after each turn, to give 7 different altitudes, nominally 60, 65, 70, 75, 80, 85 and 90 m. Although a target altitude was set, fluctuations were unavoidable, and actual altitudes were also recorded via a laser altimeter. Measurements were collected at four different frequencies: 912, 3005, 11962 and 24510 Hz, referred to henceforth as 0.9, 3, 12 and 25 kHz. Negative values of apparent resistivity were ignored and replaced by interpolation from neighbouring measurement points.
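The replacement of negative apparent resistivities can be sketched as follows. The text does not say which interpolation scheme was used, so this example assumes plain linear interpolation in distance between the nearest valid neighbours; the function name fill_negative and the sample values are hypothetical.

```python
from typing import List, Sequence


def fill_negative(distance: Sequence[float], resistivity: Sequence[float]) -> List[float]:
    """Replace negative readings by linear interpolation between the nearest
    non-negative neighbours (an assumed scheme, not confirmed by the source).
    Negative readings at either end of the line take the nearest valid value."""
    valid = [i for i, r in enumerate(resistivity) if r >= 0]
    if not valid:
        raise ValueError("no non-negative readings to interpolate from")
    filled = list(resistivity)
    for i, r in enumerate(resistivity):
        if r >= 0:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = resistivity[right]
        elif right is None:
            filled[i] = resistivity[left]
        else:
            # Weight by position between the two valid neighbours.
            w = (distance[i] - distance[left]) / (distance[right] - distance[left])
            filled[i] = resistivity[left] + w * (resistivity[right] - resistivity[left])
    return filled


# Readings roughly every 6 m along the line; the two negative values are replaced.
d = [0.0, 6.0, 12.0, 18.0, 24.0]
r = [100.0, -3.0, 130.0, 120.0, -1.0]
cleaned = fill_negative(d, r)
```

Interpolating in distance rather than in sample index keeps the result sensible if the spacing between readings is not exactly uniform.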
Cleaning of test flights data set
Exploratory analysis
Weather and power line interference
Weather reports for the days in which the 5 different flights were taken (source: Met Éireann)
Flight  Date        Station  Rain (mm)  Max Temp (°C)  Min Temp (°C)  Mean Wind (knots)
L7001   26/10/2011  Finner   0.3        11.0           4.5            9.8
L7025   10/12/2011  Finner   7.4        8.6            2.9            12.8
L7050   23/01/2011  Finner   2.9        8.4            3.7            10.7
L7129   08/05/2011  Finner   2.2        11.1           2.4            7.9
L7177   15/07/2011  Finner   1.9        16.2           9.6            9.1
Variation with altitude
One key aspect of the repeatability problem is the discrepancy between target altitude and actual altitude. During the course of the flights, the pilot was set a task of flying at a set altitude (the ‘target’ altitude), but due to weather conditions or other obstacles the actual altitude of the plane can vary widely. This causes something of a confounding problem in our approach, as it might be that poor repeatability in, e.g. apparent resistivity, is caused by the inability of the pilot to fly at that target altitude consistently, rather than because the target altitude is simply higher or lower. We thus performed an initial analysis on the data set to investigate the relationship between the variability of the actual altitude at the different target altitudes.
Comparison of standard deviations for two chosen sections of the test line data
Nominal Altitude  No. of Observations  Mean of Actual Altitude (m)  Standard Deviation of Actual Altitude (m)
Section A: 3000–3500 m
90 m              415                  91.1                         3.76
85 m              438                  84.7                         6.26
80 m              418                  79.4                         4.06
75 m              441                  74.3                         5.21
70 m              397                  69.7                         4.35
65 m              426                  65.1                         5.80
60 m              408                  56.9                         4.41
Section B: 4800–5200 m
90 m              341                  98.5                         8.91
85 m              364                  86.0                         10.53
80 m              327                  89.3                         9.79
75 m              299                  74.3                         9.67
70 m              331                  80.3                         7.75
65 m              318                  65.4                         8.07
60 m              327                  63.9                         6.50
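A summary of this kind (number of observations, mean and standard deviation of the actual altitude for each nominal altitude) could be computed along the following lines. This is a sketch only: the record layout, the function name altitude_summary and the altimeter readings are made up for illustration and are not the survey data.

```python
from collections import defaultdict
from statistics import mean, stdev


def altitude_summary(records):
    """records: iterable of (nominal_altitude_m, actual_altitude_m) pairs.
    Returns {nominal: (n_obs, mean_actual, sd_actual)}; each group needs
    at least two readings for the sample standard deviation."""
    groups = defaultdict(list)
    for nominal, actual in records:
        groups[nominal].append(actual)
    return {nom: (len(v), mean(v), stdev(v)) for nom, v in sorted(groups.items())}


# Hypothetical laser-altimeter readings within one section of the test line.
records = [(60, 58.0), (60, 62.0), (60, 60.0), (90, 85.0), (90, 95.0), (90, 90.0)]
summary = altitude_summary(records)
```

Restricting the records to a fixed distance window before grouping would give one block of the table per section of the line.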
Statistical models for measuring repeatability of apparent resistivity
In this section we outline and build models for each flight’s apparent resistivity and quantify the variability between the different target altitude replicates. We explore this problem with two different approaches, both falling within the framework of Functional Data Analysis (FDA; e.g. [12]). The first approach involves fitting a single statistical model where the apparent resistivity for each frequency/altitude combination is given an overall mean modelled as a continuous time random walk (a Wiener process, e.g. [14]) in distance, together with a random effect for replicate. Under this approach repeatability is quantified by a specific parameter in the model: the variance of the random effect. The second approach involves fitting splines individually to each frequency/altitude/replicate combination. We can subsequently calculate the variance between replicates to give an estimate of variability over the entire course of the test line. Whilst providing richer summaries, this second approach does not utilise a holistic statistical model on all of the data, and so results are more influenced by outlying values. A further contrast between the two models is in the smoothness of the stochastic process applied to the apparent resistivity. In the Wiener process model, this is considerably rougher than in the spline approach. Using two such different models is a deliberate attempt to show that our conclusions are robust to the choice of statistical model.
We define, for both approaches, \(y_{ijk}(d)\) as the natural log of apparent resistivity for frequency \(i=1,\ldots,4\), target altitude \(j=1,\ldots,7\), and replicate \(k=1,\ldots,5\) at continuous transect distance \(d\). This variable forms our response. For the first approach we treat each frequency/altitude combination as independent, so for notational simplicity we write out the models as \(y_{k}(d)\) and we ask the reader to remember that each of these models is run independently for each frequency/altitude combination. For the second approach we simplify further to write \(y(d)\) as each model is run on every frequency/altitude/replicate combination. With so many fitted models, the number of plots and results that we can display becomes cumbersome. Instead we show only those plots that we feel are of most interest, usually corresponding to those where the models fit best and worst.
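To make the structure of the first approach concrete, the following sketch simulates data of the kind it describes: a shared mean curve that follows a Wiener process in distance, a constant offset for each replicate, and independent measurement noise. The model equations are not reproduced in this section, so the parameterisation used here (mean increments with variance proportional to the distance step, one additive offset per replicate) and every numerical value are assumptions made purely for illustration.

```python
import math
import random


def simulate_replicates(distances, n_rep=5, mu0=5.0, sigma_mu=0.05,
                        sigma_b=0.1, sigma=0.02, seed=1):
    """Simulate log apparent resistivity y_k(d) = mu(d) + b_k + noise.

    mu(d) is a Wiener process in distance: the increment over a step of
    length delta is N(0, sigma_mu^2 * delta). b_k ~ N(0, sigma_b^2) is the
    replicate effect and the noise is N(0, sigma^2). Values are illustrative.
    """
    rng = random.Random(seed)
    mu = [mu0]
    for prev, cur in zip(distances, distances[1:]):
        step_sd = sigma_mu * math.sqrt(cur - prev)
        mu.append(mu[-1] + rng.gauss(0.0, step_sd))
    offsets = [rng.gauss(0.0, sigma_b) for _ in range(n_rep)]
    y = [[m + b + rng.gauss(0.0, sigma) for m in mu] for b in offsets]
    return mu, offsets, y


distances = [6.0 * n for n in range(1000)]  # about 6 m spacing over about 6 km
mu, offsets, y = simulate_replicates(distances)
```

In such a simulation a small sigma_b means the five replicate curves lie close to the common mean, which is the sense in which the variance of the random effect measures repeatability.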
Approach 1: Wiener process
Different frequencies have different SNRs and different penetration depths; higher frequencies have higher SNRs than lower frequencies. However, high-frequency signals decay very fast and the penetration of lower frequencies is deeper. A good model will have a higher SNR, as both the within-replicate variability term \(\sigma\) and the between-replicate variability \(\sigma_{b}\) will be small in comparison to the level of signal as represented by \(\mu\). When calculating the final SNR, we average over distance via \(\frac {1}{N} \sum _{i=1}^{N} \text {SNR}(d_{i})\), where \(N\) is the number of unique distances. We thus get a single estimate of SNR from the model, which allows us to compare different target altitudes via boxplots (see Section “Results and discussion”).
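The averaging step can be illustrated with a short sketch. The per-distance definition of SNR is not shown in this section, so the form used below (signal level divided by the combined standard deviation of the two variability terms) is an assumption; only the averaging over the N distances is taken from the text. The function name mean_snr and the numbers are hypothetical.

```python
import math


def mean_snr(mu, sigma_b, sigma):
    """Average SNR over distance, assuming
    SNR(d) = mu(d) / sqrt(sigma_b^2 + sigma^2).

    mu: mean log apparent resistivity at each of the N unique distances.
    sigma_b, sigma: between- and within-replicate standard deviations.
    """
    total_sd = math.sqrt(sigma_b ** 2 + sigma ** 2)
    return sum(m / total_sd for m in mu) / len(mu)


# Illustrative mean curve at four distances, with two levels of variability.
mu = [5.0, 5.2, 4.8, 5.0]
snr_small_var = mean_snr(mu, sigma_b=0.3, sigma=0.4)  # combined sd 0.5
snr_large_var = mean_snr(mu, sigma_b=0.6, sigma=0.8)  # combined sd 1.0
```

With the same mean curve, doubling both standard deviations halves the averaged SNR, so a run with tighter replicates scores higher.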
We fitted the Wiener process model using the Bayesian Hamiltonian Monte Carlo package Stan [15], with half-Cauchy weakly informative priors on the standard deviation terms \(\sigma_{\mu}\), \(\sigma_{b}\) and \(\sigma\). We ran the model for 1000 iterations on 4 chains, for each of the 7 target altitudes at each of the 4 different frequency values (0.9, 3, 12 and 25 kHz), totalling 28 model runs. On a 3.4 GHz Core i7 processor with 16 GB of RAM the computing took about 12 h. The main advantage of using Hamiltonian Monte Carlo is that far fewer iterations are required, as it explores the posterior parameter space more efficiently [16]. We removed 200 iterations as burn-in and checked for convergence using the standard Brooks-Gelman-Rubin statistic [17, 18]. A more complete joint model incorporating all frequency/altitude/replicate combinations was attempted but found to be too computationally expensive.
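As an illustration of the convergence check, the sketch below computes the classic Gelman-Rubin potential scale reduction factor for a single scalar parameter from several chains, after discarding a burn-in. It is a textbook version written for this example, not the code used for the analysis; Stan itself reports a related split-chain variant. The simulated draws stand in for posterior samples and are not real output.

```python
import math
import random
from statistics import mean, variance


def gelman_rubin(chains, burn_in=0):
    """Potential scale reduction factor (R-hat) for one scalar parameter.

    chains: list of equal-length lists of draws, one list per chain.
    burn_in: number of initial draws to discard from every chain.
    """
    kept = [c[burn_in:] for c in chains]
    n = len(kept[0])
    chain_means = [mean(c) for c in kept]
    within = mean([variance(c) for c in kept])   # W: mean within-chain variance
    between = n * variance(chain_means)          # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


rng = random.Random(0)
# Four chains of 1000 draws each; the first 200 are discarded as burn-in.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 0.0, 3.0, 3.0)]
rhat_mixed = gelman_rubin(mixed, burn_in=200)
rhat_stuck = gelman_rubin(stuck, burn_in=200)
```

Values close to 1 indicate that the chains agree with one another; chains that settle in different regions, as in the second example, give a value well above 1.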
Approach 2: splines
We fit the above model using the frequentist smooth.spline function in R [20] with K≈100 basis functions (the exact number is set for each run by the function according to the response variability). As stated above, we run the model for each of the 4 frequencies (0.9, 3, 12 and 25 kHz) at each of the 7 altitudes and each of the 5 replicates, a total of 140 runs. We estimate λ via 10-fold generalised cross-validation, where the optimisation criterion is the root mean square error (RMSE). The smoothed functions are derived, and the average and standard deviation functions are calculated across the different distance values. Since the model fitting in this approach is a deterministic procedure, the computing time required is a matter of seconds.
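The step that follows the spline fits, computing average and standard deviation functions across replicates, can be sketched as below. The example assumes the replicate curves have already been smoothed and evaluated on a common distance grid, and it assumes the SNR is taken as the pointwise mean divided by the pointwise standard deviation (the reciprocal of the coefficient of variation) and then averaged over distance. The function name replicate_summary and the curve values are hypothetical.

```python
from statistics import mean, stdev


def replicate_summary(curves):
    """curves: list of smoothed replicate curves on a common distance grid.

    Returns the pointwise mean function, the pointwise standard deviation
    function across replicates, and the SNR (mean / sd) averaged over
    distance. Assumes the sd is non-zero at every distance.
    """
    mean_fn = [mean(vals) for vals in zip(*curves)]
    sd_fn = [stdev(vals) for vals in zip(*curves)]
    snr = mean([m / s for m, s in zip(mean_fn, sd_fn)])
    return mean_fn, sd_fn, snr


# Three illustrative smoothed curves evaluated at two distances.
curves = [[4.0, 5.0], [5.0, 6.0], [6.0, 7.0]]
mean_fn, sd_fn, snr = replicate_summary(curves)
```

Because the standard deviation is computed separately at each distance, this summary also shows where along the line the replicates disagree most, not only an overall figure.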
Results and discussion
We fitted the Wiener process model to the apparent resistivities for each frequency/altitude combination to get a mean apparent resistivity over the sets of five replications. See Fig. 2 for two data cases chosen to showcase a situation of low repeatability (12 kHz at target altitude 90 m) and one of high repeatability (25 kHz at target altitude 65 m). The estimated posterior mean apparent resistivity \(\mu(d)\) is shown as a solid black line and the raw data as coloured lines.
Conclusions
We applied two models using very different approaches, one Bayesian (Wiener process) and one frequentist (spline model). Boxplots of the results show that for both models the conclusion regarding repeatability of apparent resistivity, over the range of altitudes 60–90 m, is that lower altitudes give more repeatable results than higher altitudes. Note, however, that due to lack of data the conclusion regarding repeatability ignores any possible bias in the apparent resistivity at the different altitudes arising from any source; that is, it is possible that the lower altitude measurements contain more bias than the higher altitudes.
Several opportunities present themselves for future research. First, it would be desirable to quantify both repeatability and bias in the apparent resistivity measurements. To do this we would need ground-based measurements across a long segment of the test line. A second extension would be to run a richer cross-validation experiment to determine which amongst a larger family of statistical models fits the data (and so quantifies replication) best. A final possible extension would be to include a richer Wiener process or spline model that treats all of the data simultaneously. We leave such an extension to another paper.
Abbreviations
 AEM: Airborne electromagnetic
 EM: Electromagnetic
 FEM: Frequency (domain) electromagnetic
 SNR: Signal to noise ratio
 TEM: Time (domain) electromagnetic
Declarations
Acknowledgements
AH wishes to acknowledge the support of the Mathematics Applications Consortium for Science and Industry (www.macsi.ul.ie) funded by the Science Foundation Ireland Investigator Award 12/IA/1683. Published with the permission of the Director of the Geological Survey of Ireland. The authors would like to thank Miguel Bustamante for organising the European Study Group with Industry conference (ESGI102) which brought the authors together.
Authors’ contributions
JH and GS were responsible for the design of the study, collecting and interpreting the data. EK participated in the preprocessing of the project and the correct interpretation of the electromagnetic data, and helped to draft the report. AH and AP carried out the statistical analysis and the interpretation of the results and wrote the paper. All authors read and approved the final manuscript.
Authors’ information
AH is a Research Fellow in Statistics in the University of Limerick. GS is a senior geologist and head of the minerals section and JH is the geophysics programme manager for the TELLUS project, both at the Geological Survey of Ireland. EK is a lecturer in Applied and Computational Mathematics, interested in geological data interpretation from AEM surveys and AP is an Associate Professor in statistics, chartered statistician and involved in Insight: the National Centre for Data Analytics, both at University College Dublin.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 1. Grant FS, West GF (1965) Interpretation Theory in Applied Geophysics. McGraw-Hill Book Company, New York.
 2. Fraser DC (1978) Resistivity Mapping with an Airborne Multicoil Electromagnetic System. Geophysics 43(1): 144–172.
 3. Fraser DC (1979) The Multicoil II Airborne Electromagnetic System. Geophysics 44(8).
 4. Huang H, Fraser DC (2003) Inversion of helicopter electromagnetic data to a magnetic conductive layered earth. Geophysics 68(4): 1211–1223.
 5. ISO 3534-1:1993 (1993) Statistics - Vocabulary and Symbols - Part 1: Probability and general statistical terms. International Organization for Standardization, Geneva. www.iso.org/iso/catalogue_detail.htm?csnumber=8919.
 6. Lane R, Worrall L (2002) Interpretation of Airborne Electromagnetic Data: Summary Report on the Challenger Workshop. Geoscience Australia Record 2002/02. http://www.ga.gov.au/webtemp/image_cache/GA16669.pdf.
 7. Basheer AA, Taha AI, El-Kotb A, Abdalla FA, Elkhateeb SO (2014) Relevance of AEM and TEM to Detect the Groundwater Aquifer at Faiyum Oasis Area, Faiyum, Egypt. Int J Geosci 5: 611–621.
 8. Pfaffhuber AA, Monstad S, Rudd J (2009) Airborne electromagnetic hydrocarbon mapping in Mozambique. Explor Geophys 40: 1–9.
 9. Green A, Lane R (2003) Estimating noise levels in AEM data. ASEG, 16th Geophysical Conference and Exhibition, Extended Abstract, Preview, 70. Australian Society of Exploration Geophysicists, ASEG Special Publications (2): 1–5.
 10. Foged N, Auken E, Christiansen AV, Sorensen KI (2013) Test-site calibration and validation of airborne and ground-based TEM systems. Geophysics 78(2): E95–E106.
 11. Huang H, Cogbill A (2006) Repeatability study of helicopter-borne electromagnetic data. Geophysics 71(6): 285–290.
 12. Ramsay JO, Silverman BW (2005) Functional Data Analysis. Springer, New York.
 13. Taguchi G (1987) System of Experimental Design. Unipub/Kraus International, White Plains, New York.
 14. Davison AC (2003) Statistical models, Vol. 11. Cambridge University Press, Cambridge.
 15. Stan Development Team (2014) RStan: the R interface to Stan, Version 2.5. http://mc-stan.org/interfaces/rstan.org.
 16. Hoffman M, Gelman A (2014) The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. J Mach Learn Res 15: 1593–1623.
 17. Brooks SP, Gelman A (1998) General methods for monitoring convergence of iterative simulations. J Comput Graph Stat 7: 434–455.
 18. Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7(4): 457–472.
 19. Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Statistical Science 11(2): 89–121.
 20. R Core Team (2014) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.r-project.org.