In environmental monitoring studies, certain desired data-quality objectives (DQO's) can be identified at the outset; that is, the degree of sample representativeness, the data precision, and the site conditions over which data are collected are established at the inception of a study so that appropriate sampling methods can be designed (Technical Appendix I). Those DQO's define not only how a given study or monitoring program is carried out, but also how or when such information could be appropriately utilized by other users. This is a significant issue because, without such explicit communication of DQO's and method characteristics, it is difficult to separate field-method error from natural variation.
The DQO's will dictate, among other things, two critical components of any field method--the geographic extent of the site and field-method timing. Both of these components must be defined for any field method because they bear directly on the representativeness of the samples or data collected. The same field method executed either at a different type of site or at a different time (season, for example) may not perform with similar efficiency, precision, or bias. The DQO's are critical in defining the types of sites and sampling times over which a given field method is likely to yield data representative of the actual conditions of interest.
Figure 1 shows the steps involved in many types of field methods. In situ field methods in which no samples are actually collected for laboratory analysis are distinguished from those in which samples are collected because the two types of methods require somewhat different treatment in defining performance criteria. In situ methods follow an abbreviated sequence of steps as shown in figure 1. Performance criteria are associated with each step of a given method. Table 1 illustrates examples of performance criteria and ways in which these criteria would be addressed for a generic field method in which samples are collected and analyzed by using laboratory procedures. In this type of scenario, performance criteria for a given procedure or protocol, which consists of several procedures, can be characterized by subjecting the field method to a specific range of tests, each one followed by the same laboratory analysis. Differences among laboratory results are assumed to be due to performance characteristics of the method and not to either the laboratory analysis or differences in the analyte among samples. The degree to which these assumptions are true will depend on the precision of the laboratory method used and the type of site.
Figure 1. Procedural steps required in field methods.
Table 1. Translation of some performance criteria derived for laboratory analytical testing to field methods
Aquatic systems have certain factors or considerations that bear directly on appropriate sample timing and location within the context of developing performance criteria. Table 2 summarizes some of the factors for several different types of aquatic systems, which include streams, lakes, estuaries, and ground water. Depth, for example, may be a factor for examining certain analytes in large streams, lakes, and estuaries where a vertical profile component could be important. Therefore, depth-integrated samples and surface grabs can yield very different results and perhaps different method-performance characteristics, depending on the system. Similarly, for systems where there is a flow, such as in streams and some shallow aquifers, flow-proportional samples may yield a much different measurement than grab samples or time-composite samples. Again, these different forms of sampling may have different associated performance characteristics even for the same analyte and accompanying laboratory procedure. Knowledge of important site factors can be used to minimize differences among replicate samples, thereby ensuring a more precise determination of field-method-performance criteria.
The information presented in table 2 suggests another important effect of the type of site on performance-criteria characterization. For some systems, such as shallow ground water and small streams, season or precipitation can have a significant effect on the analyte being measured. Therefore, at some types of sites, a given field method may need to be tested over a broad range of environmental conditions to characterize criteria, such as performance range or interferences, adequately. Note that for such sites as deep ground-water systems, seasonality or precipitation may play a very minor role in terms of certain analyte concentrations or other characteristics of the water. In this case, sample timing may not be a major factor that affects performance characteristics for some deep ground-water field methods and analytes.
Performance criteria       Procedural steps or methods
Precision                  Duplicate samples/split samples for later analysis,
                           replicate samples and measurements from the same site.
Bias                       Field-spiked samples, equipment blanks, sampling
                           reference sites from different regions.
Performance range          Sampling in a range of habitat environments consistent
                           with DQO's, examination of range of related analytes
                           or measurements.
Interferences              Habitat effects on measurement quality, sampling device
                           performance over different environmental conditions,
                           spiked samples.
Method detection limit     Equipment blanks, sampling in sites known to have
                           absence of analyte, spiked samples.
Field methods, whether they yield in situ measurements or laboratory-based measurements, rely on adequate training to carry out the method with the most accuracy and precision (Technical Appendix I). It is desirable to have training evaluations or proficiency testing of results available for the corresponding field data so that a secondary user could independently judge the quality of the information. Part of characterizing performance criteria for a given field method will include aspects of training and the level of expertise necessary to perform specific steps. Unlike laboratory methods, where operator training can be directly evaluated (through the use of performance-evaluation samples and fortified spike samples, for example), adequate field-method training is evaluated more indirectly. One way in which field-method training and performance characteristics may be evaluated is through the use of "standard" sites. Standard sites are locations in which the variability in the analyte or measurement of interest is low over a specific time period or habitat condition. Furthermore, the variability around the mean value is well defined. As a result, samples can be repeatedly taken in such a location over that time period, and similar measurements can be obtained. In this way, the standard site is analogous to a performance standard in laboratory analytical work. Adequate training can be evaluated by having a particular field crew sample at least one, and preferably more than one, standard site. Significant deviations between the new crew results and those obtained historically for the site and similar environmental conditions (with a mean and some measure of variance) could indicate inadequate training or proficiency.
Also, selected regional training centers under interagency direction ("Methods and Data Comparability Committee," see below) could review crew or individual training, survey methods, or protocols so that some standardization of training or methods could be achieved on a geographic basis.
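The standard-site comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the dissolved-oxygen values, and the flagging threshold (`z_limit`) are hypothetical and do not come from this report, and the check assumes the historical mean and standard deviation of the site are already well defined.

```python
from statistics import mean


def standard_site_check(crew_values, hist_mean, hist_sd, z_limit=2.0):
    """Compare a crew's mean result at a standard site with the historical record.

    z_limit is an illustrative threshold, not a value taken from the report.
    """
    n = len(crew_values)
    crew_mean = mean(crew_values)
    # Standard error of a mean of n measurements, given the historical spread.
    std_error = hist_sd / n ** 0.5
    z = (crew_mean - hist_mean) / std_error
    return {"crew_mean": crew_mean, "z": z, "flag": abs(z) > z_limit}


# Hypothetical dissolved-oxygen results (mg/L) from a new crew at a standard site
# whose historical mean is 8.2 mg/L with a standard deviation of 0.2 mg/L.
result = standard_site_check([8.1, 8.3, 8.2, 8.4], hist_mean=8.2, hist_sd=0.2)
print(result)
```

A flagged result would not by itself establish inadequate training; it would only indicate that the crew's results differ from the historical record by more than the chosen threshold and deserve a closer look.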
Characteristics that define a reference site will be specific to what is being measured. For biological collection methods (Technical Appendixes F, O), geomorphic and cultural factors, such as ecological region, watershed or basin, land use, habitat type, and lack of anthropogenic disturbances, are critical in defining a reference condition that is analogous to a standard site in the present context. When controlled or defined, these attributes yield consistent results over a given time period for biological data. Similar attributes may be useful in defining reference sites for some chemical and physical field methods. Certain types of measurements, however, may require different reference-site attributes. For example, certain freshwater springs may serve as reference sites for a field method designed to collect water temperature or major ion data because a fairly consistent water temperature or major ion concentration is observed during a certain time period. Similarly, some deep ground-water aquifers may provide appropriate reference sites for certain analytes because the concentration is stable over time.
In addition to using carefully selected reference sites, another way to evaluate proficiency of training and to characterize various performance criteria for analytical, biological, and some physical methods is through the use of field blanks. For analytical and biological measurements, results of field blanks will indicate the degree of cross-contamination among samples and overall carefulness in carrying out the field procedures. Clearly, use of field blanks is limited to those methods in which samples are collected for later laboratory analysis. Field methods that yield in situ measurements may not be amenable to this procedure. Instead, such methods must rely on several field teams and several measurements at the same locations to characterize method proficiency and other performance criteria.
The flow chart presented in figure 2 summarizes the major steps in defining performance criteria for a given field method involving sampling. As noted previously, the DQO's will define what is measured, the site, and the timing of interest. For those methods in which samples are collected for later analysis, several types of tests are available to characterize performance criteria. Several samples should be collected from the same location (a reference site) at the same time to quantify sampling precision or reproducibility. Ideally, this should be repeated at different times (seasons) and different sites to ensure that realistic precision estimates are obtained. Also, this sampling will help quantify the performance range and potential interferences of the method. In addition, field blanks should be performed with sufficient frequency to quantify contamination and method sensitivity. For biological methods, field blanks could be samples that consist of water without the organisms of interest into which the sampling device is placed. Assuming that laboratory methods have been satisfactorily validated, field blanks that contain significant quantities of the analyte of interest suggest that the field method may introduce a certain bias or lack proficiency. Recovery may also be addressed for some chemical analytes by utilizing field-spiked samples at the point of sample collection or before a particular prelaboratory procedure (sample preservation, filtering) if that is the method of interest. Finally, the field method should be performed over a range of site conditions applicable to the DQO's to characterize the performance range and method robustness. Site conditions would include conditions other than those represented at standard sites. In many ways, the process just described may be iterative by defining new sites and new sampling index periods and repeating the sampling and laboratory analyses.
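The three tests named above (replicates, field blanks, and field spikes) reduce to simple summary statistics. The following minimal sketch shows one common way to express them; the function names, the nitrate values, and the reporting limit are hypothetical, and the report does not prescribe these particular formulas.

```python
from statistics import mean, stdev


def relative_std_dev(replicates):
    """Percent relative standard deviation of replicate results (precision)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def spike_recovery(spiked_result, unspiked_result, spike_added):
    """Percent recovery of a known field spike (one indicator of bias)."""
    return 100.0 * (spiked_result - unspiked_result) / spike_added


def blank_exceeds(blank_results, reporting_limit):
    """True if any field blank exceeds the chosen reporting limit."""
    return any(b > reporting_limit for b in blank_results)


# Hypothetical nitrate results (mg/L) from one reference site on one date.
replicates = [0.48, 0.52, 0.50, 0.50]
rsd = relative_std_dev(replicates)
recovery = spike_recovery(spiked_result=0.95, unspiked_result=0.50, spike_added=0.50)
contaminated = blank_exceeds([0.00, 0.01], reporting_limit=0.02)
print(round(rsd, 2), round(recovery, 1), contaminated)
```

Repeating the same calculations for other seasons and sites, as the text recommends, would show whether the precision and recovery estimates hold across the conditions covered by the DQO's.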
The flow chart presented in figure 2 can pertain to a field protocol as a whole, which would consist of several steps or methods, or could pertain to an individual step. For example, the USGS study on nutrient-preservation methods for ambient samples dealt with one step within a larger field-sampling protocol, namely how samples are preserved. If individual steps are to be examined, then it is critical that other steps in the process be held constant; that is, field and laboratory methods for steps outside the one of particular interest need to be performed in a similar manner by using the same equipment and standard operating procedures.
The discussion thus far has focused on field methods in which samples are collected and analyzed. Several types of field methods, however, do not result in samples being collected. Data are collected directly instead. Examples would include in situ measurement of pH or dissolved oxygen by using a field meter and probes, in situ enumeration and identification of fish species collected, and physical habitat measurements, such as percent shading, stream velocity, and stream gradient. In these instances, the framework just discussed cannot be utilized to characterize performance criteria. In situ field methods must be subjected to a framework that relies heavily on interfield crew evaluations and several measurements in the same locations (fig. 3).
Figure 3. Procedural steps in relation to developing performance criteria for in situ methods.
Figure 4. Scheme for comparing field methods that involve sampling and subsequent analyte analysis.
Reference sites are as important for in situ measurements as they are for true sampling methods because the value of the reference site becomes a "standard" by which to judge measurement precision and relative bias. However, test sites or nonreference sites are just as important in defining the degree of measurement consistency among different field crews and certain performance characteristics of the method. Where an instrument is involved, such as for stream velocity or pH, it should be calibrated before data are recorded. Furthermore, for some parameters, such as dissolved oxygen, samples can be preserved and analyzed by using appropriate laboratory procedures. The laboratory results are then used to verify the results of the on-site method.
An additional component for comparing field methods is to sample a range of test sites that includes the extremes of environmental conditions likely to be encountered by using the method. At each test site, both methods should obtain several measurements to evaluate precision, performance range, and potential interferences of the methods (bias) (fig. 4). Two methods may be fairly comparable in some types of sites or under certain conditions and not others. For example, an impeller-type current-velocity probe yields measurements similar to those obtained by using an ultrasound probe under low- to intermediate-flow conditions in streams and rivers. At higher flows, however, turbulence and wave eddies increase propeller friction in the impeller probe, which results in consistently lower current velocity readings than the ultrasound probe. Such information can be used to quantify the range over which the two methods (instruments in this case) yield comparable results and where they do not.
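One way to locate the range over which two methods agree is to group side-by-side readings by site condition and examine the mean difference within each group. The sketch below assumes hypothetical velocity readings and condition labels chosen only to mirror the impeller and ultrasound example; none of the numbers come from a real comparison.

```python
from statistics import mean


def mean_difference_by_condition(paired):
    """Mean (method A - method B) difference for each site condition.

    paired: iterable of (condition, value_a, value_b) measured side by side.
    """
    groups = {}
    for condition, a, b in paired:
        groups.setdefault(condition, []).append(a - b)
    return {condition: mean(diffs) for condition, diffs in groups.items()}


# Hypothetical current velocities (m/s): impeller probe (A) vs ultrasound probe (B).
readings = [
    ("low flow", 0.20, 0.21),
    ("low flow", 0.25, 0.24),
    ("high flow", 1.40, 1.60),
    ("high flow", 1.55, 1.75),
]
diffs = mean_difference_by_condition(readings)
for condition, diff in diffs.items():
    print(condition, round(diff, 3))
```

A mean difference near zero in one group and a consistent offset in another would suggest that the two instruments are comparable only under the first set of conditions.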
An example that demonstrates the importance of testing several environmental conditions is a recent USGS nutrient preservation study, in which several nutrients were measured in a range of different types of ambient-water samples. Each water sample was examined in side-by-side tests by using different preservation procedures. The results of that study are robust because a range of nutrients and a range of ambient sample types were examined. However, the comparability of different preservation methods under nonambient conditions (waste-water effluents, for example) is unknown and likely to be different from that observed for ambient samples in which natural microbiological activity was low. Comparability of nutrient preservation methods for nonambient samples will require additional study.
The comparison of field methods that include in situ measurements needs to be handled somewhat differently from that above (fig. 5). Because samples are not collected, it is even more critical that the methods to be compared include measurements in the same locations and at the same time. This is because method results, in this case, often pertain to a narrowly defined region in space and time. For example, an in situ pH measurement will be relevant for a certain vertical stratum of water, at a certain horizontal or transect location, and only for a very restricted time period that spans perhaps 1 to 2 hours (or less in some eutrophic systems). After sampling in a different vertical stratum, a different horizontal location, or morning instead of afternoon, the same method could yield a significantly different measurement result. Therefore, if the objective is to determine comparability between a certain pH probe/meter and a certain pH test-strip paper, then the two methods would need to sample side by side at all sites. Only then can interferences that result from various site factors (table 2) be sufficiently controlled to examine method comparability. As discussed for field methods in which samples are collected, reference sites and a range of test sites are equally important in determining performance criteria and examining method comparability for in situ field measurements.
Institutional Framework for Examining Field-Method Comparability
Field-method comparability tests require a certain degree of resources, in particular trained personnel to collect samples or to make measurements at the different standard and test sites. If follow-up laboratory work is required to obtain a measurement, then laboratory resources (equipment and trained people) also need to be available. Given the resources needed to examine comparability of field methods, it is imperative that a system be in place that will adequately store and manage such information so that others can use the results. Furthermore, it should be clear that reference sites are extremely valuable in evaluating the performance criteria and the method comparability of a given method. Therefore, reference sites (possibly regional ones) must be identified, cataloged, and easily accessible so that other users or methods can choose appropriate sampling locations.
Figure 5. Scheme for comparing field methods involving in situ measurements (no sample collection).
A second issue that pertains to the institutional framework is that of defining or characterizing adequate method training. As explained earlier, satisfactory training and demonstrated proficiency are essential elements of all methods, particularly field methods. Furthermore, certain field methods or procedures require significantly more sophisticated training and expertise than others. The level of training and expertise needs to be clearly indicated for a given field method so that other users can evaluate the proficiency of different field personnel and the resulting data.
In the MDCB Charter (Technical Appendix H), one of the stated objectives is to evaluate the need for a certification or proficiency testing program for field methods much like that already proposed for laboratory analytical methods under the U.S. Environmental Protection Agency's Environmental Monitoring Methods Council. A certification program for field methods will require a large commitment of resources initially, and the specifics are undetermined at this time owing to the complexity of this issue and the many types of field methods used. A more realistic goal would be to have the MDCB be the repository of the information that pertains to training requirements and the level of expertise necessary for various field methods. Once enough methods are formally characterized with respect to performance criteria, it would be realistic to embark on a certification or proficiency testing program.
The Methods and Data Comparability Board (MDCB) is intended to carry out the institutional functions described above (Technical Appendix H). The MDCB, as mandated in the Charter, would store and manage information that pertains to method-performance criteria and results of any tests of method comparability. Furthermore, the MDCB would identify and catalog reference-site information that would be easily accessible for users, agencies, and the public.
One issue in this regard is how field methods should be classified for ease of organization and accessibility. Possible classification schemes include matrix (sediment, freshwater, saltwater, ground water), type of analyte or measurement (metal, nutrient, current velocity, pH), and submethod or procedure (sampling, preservation, measurement procedure if done in situ). It is likely that the primary level of classification should be the measurement or analyte because this is the primary topic of interest for which users would want information (table 3). Under this classification would be a subclassification according to submethod or procedure because this is typically the next critical level of interest to users. Finally, a given procedure for an analyte would be classified according to the matrix and (or) type of site. Within a given type of site or matrix, tests of comparability would specify the types of samples examined (ambient, surface grabs, depth-integrated composites, flow-proportioned composites). Alternatively, protocols could be set up by geographic region or area and (or) by type of habitat or parameters being measured.
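The three-level classification described above (analyte or measurement, then submethod or procedure, then matrix or type of site) maps naturally onto a keyed catalog. The sketch below is one hypothetical way to organize such records; the class names, fields, and example entries are assumptions made for illustration and do not describe any existing MDCB system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MethodKey:
    analyte: str    # primary level: measurement or analyte
    procedure: str  # second level: submethod or procedure
    matrix: str     # third level: matrix and (or) type of site


@dataclass
class Catalog:
    records: dict = field(default_factory=dict)

    def add(self, key, sample_type, summary):
        """Store a comparability-test summary under its classification key."""
        self.records.setdefault(key, []).append((sample_type, summary))

    def find(self, analyte, procedure=None, matrix=None):
        """Return keys for an analyte, optionally narrowed by the lower levels."""
        return [
            k for k in self.records
            if k.analyte == analyte
            and (procedure is None or k.procedure == procedure)
            and (matrix is None or k.matrix == matrix)
        ]


# Hypothetical entries.
catalog = Catalog()
catalog.add(MethodKey("nutrient", "preservation", "freshwater"),
            "ambient", "side-by-side preservation test")
catalog.add(MethodKey("nutrient", "sampling", "ground water"),
            "surface grab", "replicate precision test")
print(catalog.find("nutrient", matrix="freshwater"))
```

Searching by analyte first reflects the suggestion that the analyte is the level users are most likely to start from; a catalog organized by geographic region would simply use a different leading field.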