SECTION ONE: COMPARISON OF KEY STUDIES
Hundreds of millions of dollars were spent by government and Exxon scientists to determine the effects of the Exxon Valdez oil spill on natural resources. Results of the two separate data sets paint starkly contrasting pictures of injury to and recovery of the Prince William Sound ecosystem. In general, government studies show long-term damage in a variety of species and delayed ecosystem recovery, while Exxon studies conclude there is virtually no long-term damage and that there is rapid ecosystem recovery.
This section compares key studies on natural resources presented by government and Exxon scientists to determine possible causes for the extreme differences in results. Key studies were identified as those which were central to the data base (chemistry), or understanding overall ecosystem injury and recovery (intertidal and bird), or which monitored effects in species that are "cornerstones" in the ecosystem and/or the economy (herring and salmon).
The experimental design, methodology, statistics, results, discussion and conclusions of key studies for the government and Exxon are examined and compared for sources of discrepancy which could have resulted in under- or over-estimating injury.
CHEMISTRY STUDIES: SEAWATER OIL CONTAMINATION
Measurement of oil contamination in seawater provides a critical link to laboratory studies, which provide the bulk of known effects of oil on marine organisms. Concentrations of oil pollutants must be measured in order to compare the impact of oil spills on marine life, especially on an individual species basis, with known effects from laboratory studies.
In addition, measurement of oil contamination in seawater may identify oil exposure or uptake pathways for fish and other marine organisms that live in the water column, the volume of water between the sea surface and the seafloor.
Seawater measurements are most significant immediately after an oil spill and during the first few weeks that follow, because the most volatile and highly toxic components of crude oil rapidly dissolve into the water column or evaporate into the air (Wolfe et al. 1993), where they are carried away by currents and diluted. After those first few weeks, oil contamination of seawater is usually confined to waters adjacent to persistent residual reservoirs of oil, such as heavily-oiled shorelines.
This section compares "government" studies by Short & Rounds (1993) with "Exxon" studies by Neff (1991).
Different Allocation of Sampling Effort
Both government and Exxon scientists sampled seawater at approximately the same number of stations and at generally similar locations inside Prince William Sound. However, government sampling was greatest immediately after the spill, and ended six weeks after the spill, because contamination of seawater by oil was not expected to persist in the open waters of the Sound after currents flushed the source of the contamination--the oil slick--out of the Sound (Short & Rounds 1993).
Exxon scientists continued to sample seawater throughout 1989 and into 1990, accumulating a large number of samples in which oil contamination was not detected (and not expected). Problems arose when these data were averaged (pooled) by depth and station (see below).
Different Seawater Sampling Methods
Measuring oil contaminants in seawater requires great care to avoid losing the contaminants from the sample through evaporation into the air, adsorption onto the storage container, or consumption by naturally occurring bacteria in the seawater sample.
Government scientists used methods developed by NOAA specifically to measure oil contaminants in seawater. These methods involved extracting the sample into organic solvents within minutes of sample collection to avoid oil contaminant losses from prolonged storage. This method is known to preserve the oil contaminants indefinitely in the organic solvents.
Exxon scientists used EPA sampling and analysis methods designed for drinking (fresh)water to measure oil contaminants in seawater. These drinking water methods are appropriate for freshwater samples that contain very low numbers of bacteria, because preservatives can be added to the water samples to suppress any hydrocarbon-decomposing bacteria that may be present in the water during the prolonged sample storage periods allowed by this method.
However, seawater usually contains much higher numbers of hydrocarbon-decomposing bacteria than drinking water, and the efficacy of preservatives for suppressing large numbers of these marine bacteria is not well established. Consequently, most of any oil contaminants in Exxon's samples could have been "eaten" by marine bacteria during sample storage, so that by the time the samples were analyzed, the samples no longer would have contained the oil contaminants. Obviously, this method would have underestimated oil contaminants in seawater.
Sampling Effort at Stations and Times Not Reported
Government scientists reported the number of stations (32), depths (1 and 5 meters), number of replicates (3), and the sample location for each of three sampling periods between 31 March 1989 and 8 May 1989 (Appendix I, Short & Rounds 1993a). Government scientists concentrated their sampling and sample analysis effort on the first sampling period: all stations and depths were sampled and all replicates analyzed. Most of the stations were visited during the other two periods, but only one sample from each depth was analyzed (for economy), for a total of 501 samples collected.
Exxon scientists provided only a generalized summary of sampling effort; critical, specific details were lacking. They reported an impressive number of sampling stations (over 100 inside and outside the Sound) and depths (surface, 1, 3, 9, and in some cases, 30 meters below the water surface), for a total of several thousand samples.
However, Exxon does not provide the exact number and location of samples collected during each of the seven sampling periods from 30 March 1989 to February 1990. This makes it impossible to verify the data interpretation presented by the Exxon scientists. For example, the very low oil-contaminant concentrations reported by Exxon scientists for their first sample collection period may be the consequence of relatively few samples collected within the actual spill path, averaged with a relatively large number of samples collected within the eventual spill path, but taken before the oil actually physically reached these sites.
Without reporting where and when the samples were collected, data interpretations may substantially underestimate levels of oil-contaminants in seawater.
Chemical Analyses (Raw Data) Not Reported
Government scientists reported all the quantitative data, including the chemical analysis of each sample (Appendix III, Short & Rounds 1993a), so that the study could be evaluated, verified, or reproduced by others.
Exxon scientists did not report quantitative chemical data for samples in a way that made it possible for others to verify the study results. Selective reporting of data most likely underestimated the concentrations of oil-contaminants in seawater. Absence of quantitative data may make interpretations of Exxon's seawater data meaningless.
No Statistical Analysis Provided
Government scientists provided statistical summaries of their data in the form of statistical means for oil-contamination levels, and variance estimates for these means. The scientists further demonstrated a high degree of measurement reproducibility, about ±10% for 95% confidence intervals, that is, all measurements would be within 10% of the statistical mean 95% of the time.
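As an illustration of what such a statistical summary involves, the sketch below computes a mean, a sample variance, and a 95% confidence interval for the mean of one set of three replicates. The concentration values and the function name are hypothetical and are not taken from Short & Rounds; the critical values are the standard two-sided Student's t values for small samples. This is one common way to summarize replicates, not necessarily the procedure the government scientists used.

```python
import math
import statistics

# Two-sided 95% Student's t critical values, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}

def summarize_replicates(values):
    """Return (mean, sample variance, 95% CI half-width) for replicate measurements."""
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    half_width = T_CRIT_95[n - 1] * math.sqrt(variance / n)
    return mean, variance, half_width

# Hypothetical replicate concentrations (arbitrary units), NOT data from the study.
replicates = [4.9, 5.0, 5.1]
mean, variance, half_width = summarize_replicates(replicates)
relative = 100 * half_width / mean
print(f"mean={mean:.2f} variance={variance:.4f} "
      f"CI=+/-{half_width:.2f} ({relative:.1f}% of mean)")
```

Reporting the variance alongside the mean is what allows a reader to judge whether two station means differ by more than measurement scatter.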
Exxon scientists provided only "average" oil-contaminant levels with no variance estimates and no raw data. Without this quantitative information, it is impossible for others to verify the results. In the absence of these data, the reported averages most likely underestimate oil-contaminant levels.
Extreme Pooling of Stations
Government scientists analyzed their data for temporal and spatial distribution of oil-contaminants in seawater. They made explicit comparisons of mean results among sampling stations and depths. Further, "replicate samples" were defined as samples collected at the same place, depth and time, using identical sample collection procedures and equipment.
Exxon scientists obscured temporal and spatial distribution of oil-contaminants in seawater by treating all samples collected during a single sampling period as replicates. This is an invalid assumption. No arguments are presented to justify this pooling, which would require all of Prince William Sound to be a single well-mixed homogeneous waterbody.
Nonetheless, samples were pooled by depth (i.e., average of surface samples or average of samples taken from depths greater than 1 meter), and by station (i.e., average of all primary sites). The pooling was inconsistent, and completely obscured differences among areas (i.e., nearshore and offshore, lightly and heavily oiled) and effectively "diluted" any high oil-contaminant levels with low oil-contaminant levels from depths or locations where the oil did not go.
This extreme pooling of stations led to data interpretations that substantially underestimated the oil contaminants in the seawater.
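The dilution effect of pooling can be illustrated with a short calculation. The strata, sample counts, and concentrations below are invented for illustration only and are not data from either study. The point is simply that a single pooled average hides a contrast that per-stratum means preserve.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (stratum, concentration in arbitrary units). Not study data.
samples = (
    [("heavily oiled nearshore", 40.0)] * 5
    + [("unoiled offshore", 0.5)] * 45
)

def pooled_mean(rows):
    """Average every sample together, as if all were replicates."""
    return mean(conc for _, conc in rows)

def stratum_means(rows):
    """Average samples separately within each stratum."""
    groups = defaultdict(list)
    for stratum, conc in rows:
        groups[stratum].append(conc)
    return {stratum: mean(values) for stratum, values in groups.items()}

pooled = pooled_mean(samples)
by_stratum = stratum_means(samples)
print(f"pooled mean: {pooled:.2f}")
for stratum, value in by_stratum.items():
    print(f"{stratum}: {value:.2f}")
```

In this made-up case the pooled figure is roughly a tenth of the concentration at the oiled stations, even though no individual measurement changed.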
Temporal Variation of Reported Results Implausible
Government scientists found temporal and spatial distributions of oil-contaminants in seawater that were consistent with the physical dynamics of weathering (i.e., dissolution and dispersion) and the flushing action of prevailing currents. Government scientists reported: (1) oil-contaminant levels were highest in samples collected during the first sampling period after the oil spill, and declined steadily thereafter, (2) stations with high oil-contaminant levels were all grouped together in the middle of the spill-path through the Sound, (3) where samples at one depth were high in oil-contaminants, they were also high at other depths, and (4) oil-contaminant concentrations were consistently lowest at reference and low-impact spill locations.
Exxon scientists found temporal and spatial distributions of oil-contaminants in seawater that were both implausible given the physical factors and inconsistent with oil spill literature. For example, Exxon scientists reported that "average" oil-contamination levels in seawater were initially low after the spill, then increased to a high in late April, then steadily decreased. It is highly probable that these "results" were themselves a result of the extensive data pooling, e.g. initially low "averages" could be an artifact of pooling large numbers of samples from stations that were not yet oiled with relatively few numbers of samples that were oiled, as admitted by the author (Neff 1990 p. 5-6).
Exxon scientists do not address possible causes for these highly implausible "results." Since the raw data are not reported, there is no evidence to support their data interpretations, which most likely significantly underestimate the levels of oil-contaminants in seawater.
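The pooling artifact described above can be sketched numerically. Every number in this illustration is hypothetical. The concentration at already-oiled stations declines steadily, as the government data indicate, while the fraction of sampled stations that the slick has reached grows over the first periods. The pooled average nevertheless rises before it falls.

```python
# Hypothetical sampling periods: (label, oiled-station concentration, fraction of
# sampled stations already reached by oil). None of these values are study data.
periods = [
    ("period 1", 100.0, 0.10),
    ("period 2", 60.0, 0.50),
    ("period 3", 30.0, 0.60),
    ("period 4", 10.0, 0.60),
]
BACKGROUND = 0.0  # assumed concentration at stations the oil has not reached

def pooled_average(oiled_conc, fraction_oiled, background=BACKGROUND):
    """Mean over all stations when oiled and not-yet-oiled samples are pooled."""
    return fraction_oiled * oiled_conc + (1.0 - fraction_oiled) * background

pooled_series = [pooled_average(conc, frac) for _, conc, frac in periods]
for (label, conc, _), pooled in zip(periods, pooled_series):
    print(f"{label}: oiled stations {conc:.0f}, pooled average {pooled:.0f}")
```

Without the per-station raw data, a reader cannot tell whether a reported rise of this kind reflects real contamination dynamics or only a change in which stations were sampled.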
Misrepresentation of Study Findings
Government scientists methodically eliminated all possible alternative oil-contamination sources before attributing contamination found in their samples to Exxon Valdez oil. Biological effects could then be correlated with exposure to Exxon Valdez oil, as opposed to other non-Exxon Valdez oil.
Exxon scientists used their "averages" as indicative of the actual oil contamination levels of seawater in the Sound following the spill, despite study flaws so serious that all the data interpretations could be meaningless. The "averages" were used to justify the claims for absence of biological effects.
The design of Exxon's study is so seriously flawed as to result in data completely at variance with all other known reports.
CHEMISTRY STUDIES: SEDIMENT OIL-CONTAMINATION
The beaches of Prince William Sound and beyond were coated with Exxon Valdez oil when it washed ashore after the tanker grounding. Subsequent activities (storms and cleanup) mixed the oil with sediments, which were then transported downward into adjacent subtidal areas. Because sediment acts as a reservoir, reintroducing oil over time into biological communities, sediment contamination levels and persistence are issues fundamental to ecosystem recovery.
This section compares "government" studies on intertidal sediments by O'Clair et al. (1994) with "Exxon" studies on intertidal sediments by Boehm et al. (1993) and, by reference, studies on experimental design by Page et al. (1993b), and studies on chemical analysis by Page et al. (1993a).
Different Sample Designs
Government scientists designed their studies to assess injury with the goal of quantifying recovery over time at specific areas. Fixed sampling sites were chosen according to habitat type (including wave energy), degree of oiling, and treatment (cleanup) history. This controlled a high degree of variability and increased the chances of finding significant differences due to oil or beach treatment. However, by selecting nonrandom sites, results could not be extrapolated to the entire spill-affected area of the Sound.
Exxon scientists appear to have designed their studies to assess injury with the goal of extrapolating results to the entire spill-affected area of the Sound. Random sites were chosen within habitat types for a "stratified random sampling" design. However, random sites were sampled only for one year (1990). While this may have adequately assessed injury, it did not quantify recovery, because of lack of comparisons over time. The attempt by Exxon scientists to project recovery to the entire Sound is not supported by the experimental design.
Exxon scientists also chose a limited number of fixed sites, similar to the government scientists' design, to assess changes in sediment oil-contaminants over time at specific areas. As mentioned, because of the subjective site selection, results should not have been extrapolated to other areas.
Different Criteria for Identifying Exxon Valdez Oil
Oil released into the environment changes in composition over time. As the weathering proceeds, the characteristic chemical "fingerprint" becomes more difficult to distinguish. Care must be taken to correctly identify the source of oil-contaminants in samples, so that cause-effect relationships between an oil source and a biological effect can be clearly demonstrated.
Both government and Exxon scientists used criteria for identifying oil that may have misidentified weathered Exxon Valdez oil as "extraneous" oil. This would have underestimated the extent and persistence of Exxon Valdez oil contamination in the sediments.
Both government and Exxon scientists identified "extraneous" oil and discounted these contaminants when addressing biological impacts due to Exxon Valdez oil.
Pseudoreplication & Low Signal-to-Noise Ratio
For their stratified random design, Exxon scientists chose 16 beaches within four habitat types, collected data from three "widely separated transects" on each beach, and extrapolated results to all similar beaches within the spill zone. Exxon scientists treated the three transects within each beach as independent observations rather than subsamples, e.g. 48 (16 times 3) independent samples rather than 16 independent samples each with three replicate subsamples. Statistical independence of the within-site subsamples was not clearly demonstrated, raising the problem of increased variability in the study design due to "pseudoreplication."
Because of the problems created by pseudoreplication, the ability of this transect model to detect statistically significant differences among treatment areas is considerably less than a model using 48 beaches randomly selected from the impact zone. In effect, the Type II error in the transect model is unconstrained. Consequently, there is a high likelihood that Exxon's model will not detect a significant difference when there is one. Technically, this translates to a low "signal-to-noise ratio" in which the signal (the differences due to degree of oiling) is lost in the high background noise (the high variability).
Therefore, it is likely that use of this model significantly underestimated injury, both in 1990 and in the extrapolation forward in time to the entire Sound.
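One standard way to express the information lost when subsamples within a site are correlated is the Kish design effect, n_eff = n*m / (1 + (m - 1)*rho), where n is the number of sites, m the number of subsamples per site, and rho the intraclass correlation among subsamples. The sketch below applies it to the 16-beach, 3-transect layout. The rho values are assumptions chosen for illustration, since no intraclass correlation is reported here, and the formula is offered as a generic illustration rather than as the analysis used by either group.

```python
def effective_sample_size(n_sites, per_site, rho):
    """Kish effective sample size for n_sites clusters of per_site subsamples.

    rho is the intraclass correlation among subsamples within a site
    (0 = fully independent, 1 = subsamples carry no extra information).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    design_effect = 1.0 + (per_site - 1) * rho
    return n_sites * per_site / design_effect

N_BEACHES = 16   # from the study description
TRANSECTS = 3    # from the study description

# Assumed intraclass correlations; illustrative only.
for rho in (0.0, 0.5, 1.0):
    n_eff = effective_sample_size(N_BEACHES, TRANSECTS, rho)
    print(f"rho={rho:.1f}: effective independent samples = {n_eff:.1f}")
```

Only when rho is zero do the 48 transects carry as much information as 48 independently chosen beaches; as rho approaches one, the design is worth no more than its 16 beaches.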
By comparison, government scientists chose specific sites in different habitat types to control variability, and level of oiling was established by hydrocarbon measurements at each site. This study design was well suited to detect significant adverse impacts from the spill; however, the use of nonrandom sites did not allow extrapolation of results to other beaches or forward in time. The scientists recognized their design limitations and drew conclusions by comparing their data set over the years in which the study was conducted.
Misrepresentation of Study Findings
Government scientists concentrated their efforts on sampling shallow sediments (0-20 meters), although deep sediments (40-100 meters) were also collected to track the movement of Exxon Valdez oil. Between 1989 and 1991, a total of 3,127 sediment chemistry samples were collected, of which 1,640 (52%) were from shallow areas (0-20 meters). Of these, 1,154 (70%) were from oiled areas and 486 (30%) were from unoiled areas. The chemistry effort paralleled studies of biological injury, which also focused efforts in the area most at risk -- the intertidal and shallow subtidal zones.
Government scientists also focused their efforts on identifying fate and effects of Exxon Valdez oil. Extraneous sources were identified when found, but biological effects in the intertidal zone were unequivocally attributed to the Exxon Valdez spill and its cleanup, based on chemistry data.
Exxon scientists concentrated their sediment sampling effort in deeper (benthic) areas of Prince William Sound and the Gulf of Alaska. More than 2,350 sediment chemistry samples were collected from these two areas, of which at least 865 (37%) were from shallow areas (0-20 meters). Of these, 533 (62%) were from oiled areas and 332 (38%) were from unoiled areas.
Government scientists collected more than twice as many samples (1154/533 = 217%) in oiled shallow areas as Exxon scientists did. Exxon scientists systematically diverted sampling effort away from the area most at risk from Exxon Valdez oil -- the intertidal and shallow subtidal zone.
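The percentages quoted for the two sampling programs follow directly from the reported counts, as the short check below shows. The counts are those given in the text. Exxon's total is reported only as "more than 2,350" and its shallow count as "at least 865", so the Exxon shares are approximate. The variable and function names are illustrative.

```python
def share(part, whole):
    """Return part as a rounded whole-number percentage of whole."""
    return round(100 * part / whole)

# Counts as reported in the text.
gov = {"total": 3127, "shallow": 1640, "oiled": 1154, "unoiled": 486}
exxon = {"total": 2350, "shallow": 865, "oiled": 533, "unoiled": 332}

for name, c in (("government", gov), ("Exxon", exxon)):
    print(
        f"{name}: shallow {share(c['shallow'], c['total'])}% of all samples; "
        f"oiled {share(c['oiled'], c['shallow'])}% and "
        f"unoiled {share(c['unoiled'], c['shallow'])}% of shallow samples"
    )

ratio = share(gov["oiled"], exxon["oiled"])
print(f"government / Exxon oiled shallow samples: {ratio}%")
```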
Exxon scientists also overstated the possibilities of other sources of oil contamination relative to Exxon Valdez oil. Natural sub-marine oil seeps of Katalla oil in the Gulf of Alaska were the primary non-Exxon Valdez oil identified in the benthic sediment study (Page et al. 1993a). Katalla oil was found to occur predominantly in deep (100 meter plus) basins, where it had accumulated over time.
The results of this Exxon study are in agreement with an extensive baseline study conducted by NOAA from 1977 through 1980 to determine oil-contaminant levels in intertidal sediments and mussels (Karinen 1993). The government study found no evidence of Katalla oil in the intertidal zone anywhere in the Sound. However, low levels of Katalla oil, similar to the levels found by Exxon, were found in deep basins where the oil had accumulated over time.
Despite the intense benthic sampling effort, Exxon scientists did not attempt to correlate their sediment chemistry findings with biological effects in the intertidal zone on a station-by-station basis. As a result, the emphasis on Katalla oil shifts attention away from the intertidal zone and serves to understate the biological injury from Exxon Valdez oil.
CHEMISTRY STUDIES: MUSSEL TISSUE OIL-CONTAMINATION
Mussels are used worldwide to monitor marine systems for changes in concentrations of organic contaminants (Farrington 1980). They are sensitive, accurate and accepted indicators of water column quality, as these organisms filter substantial volumes of water and are known to both rapidly accumulate oil present in the environment, and rapidly flush out oil when placed in clean seawater.
Mussels are a major component of intertidal and subtidal ecosystems. Because mussels are an important food source for birds, fish and mammals, they provide a significant exposure pathway for oil into these higher trophic consumers.
Collection and chemical analysis of mussels for oil contaminants played a significant role in the assessment of the 1978 Amoco Cadiz oil spill in France (Wolfe et al. 1981). Transplanted caged mussels in the same spill helped identify persistent, low-level seawater contamination from the spill (Wolfe et al. 1979).
Comparisons of government and Exxon studies on mussels reveal an enormous difference in emphasis of importance of this monitoring tool.
Government scientists collected and analyzed over 2,000 mussel tissue samples to monitor different aspects of the Exxon Valdez spill. Caged mussels were used to identify persistent widespread oil contamination in seawater through the summer of 1989 (Short & Rounds 1993b). Mussels from intertidal areas were collected and analyzed to track oil exposures of target species such as spawning herring (Hose et al. accepted 1993a, b). Mussel beds were found to contain extremely high levels of essentially unweathered Exxon Valdez oil as late as 1993 (Peterson 1993), both in the mussels themselves and in the underlying sediments (Babcock et al. 1993). Continuing concentrations of oil in intertidal mussels serve as a potential source of oil associated with reproductive failures in harlequin ducks (Patten 1993) and delayed effects such as the continuing poor survival of sea otters (Gorbics 1993), and river otters (Bowyer et al. 1993).
Exxon scientists failed to exploit one of the most sensitive and easily accessible indicators of oil pollution available after the Exxon Valdez spill -- mussels. Exxon scientists collected and analyzed mussel samples from only 76 sites from 1989 to 1991 as part of the sediment monitoring study (Boehm et al. 1993). Problems with the design of that study have already been discussed (see the sediment oil-contamination section above). This limited use of mussels could have led these scientists to significantly underestimate the overall extent of both short- and long-term biological effects from the spill.
NATURAL RESOURCE STUDIES: BEACH & NEARSHORE COMMUNITIES
Over 1,500 miles of Alaska's shoreline were oiled. Understanding the injury to and recovery of plant and animal communities in this critical habitat is central to understanding overall effects of the Exxon Valdez oil spill on the coastal ecosystem.
Beach (intertidal) and nearshore (shallow subtidal) areas are critical habitat for many commercially and ecologically important fish, birds and animals. Land mammals such as Sitka black-tailed deer, river otters and black and brown bear come to the beach to feed on algae, shellfish and fish. Migratory waterfowl and shorebirds feed on small animals that live in or on the sediments (infauna and epifauna). Salmon, king and Dungeness crab, some shrimps and Pacific herring use the beach and nearshore areas as nursery grounds, and, in some cases, as spawning grounds.
This section compares "government" studies by Houghton et al. (1991, 1993a, b) and Highsmith et al. (1992) with "Exxon" studies by Page et al. (1993) and Gilfillan et al. (1993).
Differences in Study Designs
Government scientists used complementary studies to assess injury and to quantify recovery over time. To compare specific areas over time, fixed sampling sites were chosen according to habitat type (including wave energy), degree of oiling and treatment (cleanup) history (Houghton et al. 1991, 1993a, b). This controlled a high degree of variability and increased the chances of finding significant differences due to oil or beach treatment. However, by selecting non-random sites, results from this study could not be extrapolated to the entire spill-affected area of the Sound. A stratified random sampling program, in which oiled sites were randomly selected within habitat types, was designed for this purpose (Highsmith et al. 1992) and conducted from 1989 through 1991.
Exxon scientists appear to have designed their studies to assess injury with the goal of extrapolating results to the entire spill-affected area of the Sound. A stratified random sampling program was designed by selecting random sites within habitat types. However, random sites were sampled only for one year (1990). While this may have adequately assessed injury, it did not quantify recovery, because of lack of comparisons over time. The attempt by Exxon scientists to project recovery to the entire Sound is not supported by the experimental design.
Exxon scientists also chose fixed sites, based on degree of oiling, to compare recovery over time, which was similar to the government scientists' design. However, because of the definition of "oil" (see below) and the limited number of transects and sites sampled, a high degree of variability was not controlled. This minimized the chances of finding significant differences due to oil and/or treatment and led Exxon to underestimate injury at the sites and, consequently, during the extrapolation of the injury to the Sound.
Differences in Definition of Oil Exposure
Government scientists found widely varying degrees of oiling at different elevations along their transects (upper intertidal to shallow subtidal). Site-specific oil designations, based on actual amounts of oil, were used to reduce variability and to increase chances of detecting real differences.
Exxon scientists maintain that fixed sites were chosen from heavily oiled areas to represent a worst case scenario. However, this assertion is clearly contradicted in the pebble/gravel habitat where three of six sites were classified as "heavy," two as "moderate" and one as "light" (Table 8, Gilfillan et al. 1993). In the stratified random sampling, a single oiling designation was assigned to all elevations along a transect regardless of the degree of oiling at a specific sampling site. For example, sites along a transect with little oil at most elevations, except for a "bathtub ring" of heavy tarry oil on a vertical rock wall, were designated as "light" oiling.
Mixing data from lightly to heavily oiled sites increased variability and decreased chances of finding significant correlations between chemical and biological variables and "degree of oiling." In fact, Exxon scientists found no consistent dose-response relationship (i.e., a direct correlation between oil level and organism response) (Figures 2-8, Gilfillan et al. 1993). Use of a generic definition for level of "oiling" substantially underestimated injury.
Choice of Reference Sites
Government scientists chose reference sites to minimize, as closely as possible, variability due to geographic differences such as salinity, temperature and wave energy. While this was relatively easily accomplished for fixed sites (non-random), the stratified random sampling design presented a problem in that controlling for variability required non-random choices that would invalidate the random design. To overcome this problem, control sites were matched for each oiled site through random selection from a list of potential sites, which were controlled for geographic variability.
In contrast, Exxon scientists selected three pebble gravel reference stations in the southwest corner of the Sound. These sites were naturally unproductive (low number of species and individuals within a species) relative to unoiled sites in the oiled area, because the reference sites were in a colder, less saline, and more turbid portion of the Sound near glaciers. This of course lowered the "standard" against which "heavily oiled" sites were compared and masked spill impacts by confounding the issue with geographic differences.
For example, it would be hard to prove that the number of species at a "heavily oiled" site was reduced after the spill, if the oiled site was compared to a site with a naturally low number of species in the first place. There would appear to be no difference between the two sites, or, even more misleading, there might appear to be increases in abundance and diversity at oiled sites relative to nonproductive reference sites (Figure 2 of Gilfillan et al. 1993). The pattern of rockweed biomass at the reference sites (lowest in the middle intertidal) is highly atypical for Prince William Sound. Because of this anomaly, a significantly greater rockweed biomass is shown on "heavily oiled" exposed bedrock in the middle intertidal.
The ability of this design to detect even large oil effects was greatly impaired because of the choice of control sites. Further, the variability introduced by geographic differences of the control sites also led Exxon scientists to erroneously conclude there were "increases" in species abundance and diversity in entire biological assemblages (communities) "caused" by the spill. This conclusion is highly implausible and completely at odds with all other known reports.
Small Sample Size Relative to Patchiness
It is well accepted that plant and animal distributions in Prince William Sound are "patchy" (i.e., not distributed uniformly). Studies must be designed to minimize effects of patchiness so that results are not confounded by this variable.
The quadrat size used by Exxon scientists was small (1/32 m²) relative to the patchiness of the epibiotic (surface dwelling) species. This led the scientists to extrapolate their results from sampling a total of about 14.8 ft² of sheltered rocky substrates in the middle intertidal zone to all of that habitat/elevation in the study area, roughly 275 miles(1). In comparison, government scientists drew their conclusions in the same habitat and elevation zone from having sampled some 336 ft² of beach in 1992 alone (Houghton et al. 1993b). The small quadrat size used by Exxon scientists increased the variability from patchiness and resulted in underestimation of injury.
The core size used by Exxon scientists was also small relative to distribution of some important epifauna (e.g., hardshelled clams). Exxon scientists extrapolated their results from sampling a total of 4.06 ft² of pebble gravel substrates in the lower intertidal zone to all of that habitat/elevation in the study area, roughly 18 miles(2). In contrast, government scientists drew their conclusions in the same habitat and elevation zone from having sampled 5.5 ft² in 1992 alone, as well as 129 ft² for larger species such as hardshelled clams (Houghton et al. 1993b). The small core size of the sampling instrument used by Exxon scientists, without additional sampling to account for larger epibiota, increased the variability from patchiness and resulted in underestimation of injury.
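The difference in sampling effort can be put in proportion with a short calculation using the areas, in square feet, quoted in the two paragraphs above. Two caveats apply: the government figures are for 1992 alone while the Exxon figures are totals, and the third comparison assumes that Exxon's 4.06 square feet of cores was its only sample for larger species, which is how the text describes it. The dictionary layout and function name are illustrative.

```python
# Sampled areas in square feet, as reported in the text.
AREAS_FT2 = {
    "sheltered rocky, middle intertidal": {"exxon": 14.8, "government": 336.0},
    "pebble gravel, lower intertidal (cores)": {"exxon": 4.06, "government": 5.5},
    "pebble gravel, lower intertidal (larger species)": {"exxon": 4.06, "government": 129.0},
}

def area_ratio(government, exxon):
    """How many times more area the government study sampled."""
    return government / exxon

ratios = {
    habitat: area_ratio(a["government"], a["exxon"])
    for habitat, a in AREAS_FT2.items()
}
for habitat, ratio in ratios.items():
    print(f"{habitat}: government sampled {ratio:.1f}x the Exxon area")
```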
Pooling of Species
Government scientists reported individual species of plants and animals at each site. This detailed information is central to understanding changes in community structure.
Exxon scientists provide little information on individual taxa, especially key species. They did not distinguish presence of a large number of opportunistic taxa exploiting a disturbed habitat, from presence of a normal assemblage, in variables such as total animal abundance and number of species. For example, the phrase "total algal biomass" from Exxon's findings does not distinguish between the opportunistic filamentous green algae that colonize after a spill and the reestablishment of rockweed with an understory of red algae, yet these two assemblages do not represent the same stage of ecological recovery. Without information on individual species, it is easy to underestimate injury and/or to misinterpret the data regarding injury assessment or recovery of biological communities.
Effects of Cleanup
Government scientists isolated and separated effects of the cleanup from effects of the oil itself. Specifically, they found that the high-pressure, hot-water washing significantly impacted both infauna and epifauna in the short-term (Lees et al. 1993), and subsequently retarded recovery in 1990-1992 (Houghton et al. 1993a, b). Species also reacted differently to effects of oiling and treatment. For example, while two species of littorines were both largely eliminated by hot-water washes, the opportunistic Littorina scutulata quickly recolonized impacted areas where L. sitkana did not (Houghton et al. 1993a, b). Separating out cleanup effects reduced variability from this confounding factor. It also provided valuable information on treatment effects which are especially relevant, given the intense regulatory and scientific interest in the cleanup issue (e.g. how damaging were the cleanup methods and which methods should be used to clean up the next spill?).
Exxon scientists did not distinguish between oil and cleanup effects, because they found the effects were highly correlated. However, no data are presented to substantiate this correlation which is first mentioned in the discussion of Gilfillan et al. (1993). Combining these variables confounds interpretation of the data set. To compare with the above example, Exxon scientists did not distinguish between the two species of littorines. In effect, the increase of the opportunistic species obscured the decrease of the more sensitive species, leading to a finding of no significant impacts on "littorines" from the spill. The variability created by combining oil and cleanup effects contributed to underestimates of injury, and masked the real impacts of both the spill and the cleanup.
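The masking effect of pooling the two littorine species can be illustrated with a small sketch. The counts below are hypothetical values chosen for illustration; they are not data from either study.

```python
# Hypothetical counts per sampling unit, for illustration only (not study data).
reference = {"Littorina scutulata": 40, "Littorina sitkana": 60}
treated = {"Littorina scutulata": 85, "Littorina sitkana": 15}

def pooled_change(before: dict, after: dict) -> int:
    """Change in total 'littorines' when species are lumped together."""
    return sum(after.values()) - sum(before.values())

def per_species_change(before: dict, after: dict) -> dict:
    """Change for each species reported separately."""
    return {species: after[species] - before[species] for species in before}

print("Pooled change:", pooled_change(reference, treated))
print("Per-species change:", per_species_change(reference, treated))
```

In this constructed case the pooled total does not change at all, while the species-level view shows a sharp decline in the sensitive species offset by an equal gain in the opportunist.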
Pseudoreplication & Signal-to-Noise Ratio
For their stratified random design, Exxon scientists chose 16 beaches within four habitat types, collected data from three "widely separated transects" on each beach, and extrapolated results to all similar beaches within the spill zone. Exxon scientists treated the three transects within each beach as independent observations rather than subsamples, i.e., 48 (16 x 3) independent samples rather than 16 independent samples each with three replicate subsamples. Statistical independence of the within-site subsamples was not clearly demonstrated, raising the problem of increased variability in the study design due to "pseudoreplication."
Because of the problems created by pseudoreplication, the ability of this transect model to detect statistically significant differences among treatment areas is considerably less than a model using 48 beaches randomly selected from the impact zone. In effect, the Type II error in the transect model is unconstrained. This means there is a high likelihood that the model will not detect a significant difference when there is one. Technically, this translates to a low "signal-to-noise ratio" in which the signal (the differences due to degree of oiling) is lost in the high background noise (the high variability). In the transect model, the high variability is inherent to the design, because of small quadrat size, small core size, pooling of species, generic definition of "oiling," choice of reference sites and pseudoreplication.
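One way to see why 48 transects on 16 beaches carry less information than 48 independently selected beaches is the standard design-effect approximation for clustered samples. This is a generic sketch and not a method used in either study. The intraclass correlation values (how alike transects on the same beach are) are hypothetical, since the source reports no such figure.

```python
def effective_sample_size(n_sites: int, subsamples_per_site: int, icc: float) -> float:
    """Approximate number of independent observations in a clustered design.

    Uses the design effect 1 + (m - 1) * icc for m equal-sized subsamples
    per site, where icc is the intraclass correlation (0 = transects on a
    beach are independent, 1 = they are effectively identical).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    total = n_sites * subsamples_per_site
    design_effect = 1 + (subsamples_per_site - 1) * icc
    return total / design_effect

# 16 beaches x 3 transects, under hypothetical within-beach correlations.
for icc in (0.0, 0.5, 1.0):
    print(f"icc={icc:.1f}: effective n = {effective_sample_size(16, 3, icc):.0f}")
```

Only if transects on the same beach were fully independent would the design be worth 48 observations. As the within-beach correlation rises, the effective sample size falls toward the 16 beaches actually selected.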
In sum, this transect model is well suited to the demonstration that there were no significant adverse impacts from the spill remaining in 1990. Use of this flawed model is likely to have significantly underestimated injury, both in 1990 and in the extrapolation forward in time to the entire Sound.
By comparison, government scientists controlled variability as much as possible within their design constraints to increase the ability of their study designs to detect significant effects from the spill (see "differences in study design", and "choice of reference site").
Invalid Use of Analysis of Covariance
Government scientists used a multivariate analysis to detect differences in similar habitat types due to oil. This analysis is capable of examining multiple variables simultaneously and detecting relationships among the variables, i.e., how did oil affect the different habitat types? Did one species, such as the opportunistic oligochaete worm, increase with increasing oil levels in different habitat types?
Exxon scientists used a multivariate analysis, but used oil as a covariate. This invalidated the analysis, because the covariate (oil) was not independent of the effect being tested (oiling). This has the effect of improperly assigning effects of oiling to natural differences among habitats, i.e., the reason there were more oligochaete worms at gravel than rocky beaches was attributed to the differences in habitat, rather than to gravel beaches having a more pronounced impact from oil. Improper use of the multivariate analysis would have underestimated the effects of oil.
Selective Reporting of Data & Results
Government scientists reported quantitative data so that studies could be evaluated (or reproduced) by others.
Exxon scientists did not consistently measure the same parameters each year, and some of the 1989 data were not even analyzed. For example, no surface scrapes (quadrat sites) were taken in 1989, but more core samples were taken (and processed) from pebble/gravel sites in 1989 than in the 1990 fixed-site and stratified random sampling programs combined. However, the 1989 data, such as they were, were not analyzed. Also, only the infauna appear to have been sampled in 1991. Because of these data gaps, the impression that this was a three-year program is quite misleading. Further, quantitative information was not presented, making it impossible for others to verify the studies. Selective reporting of data most likely underestimated injury.
Misrepresentation of Study Findings
Government scientists found significant injury to and continuing but slow recovery of intertidal communities from both oiled and treated beaches. For example, several sites in both sheltered rocky and pebble/gravel habitat studied in Exxon's fixed site program in 1989 remained in early or intermediate stages of recovery as late as 1992 (Houghton et al. 1993b). In 1990 and 1991, there were many areas in the Sound where intertidal communities remained severely altered by the combined effects of oiling and treatment. Many of these sites were recovering more slowly than areas that were oiled alone (no cleanup treatment). These findings were confirmed by Highsmith et al. (1993).
These conclusions are well supported by the data set and by findings of numerous other researchers working on the Exxon Valdez and other spills. These studies provide information that is relevant to understanding natural ecological processes in intertidal assemblages affected by the oil spill and its cleanup.
Exxon scientists found minimal injury to and rapid recovery of intertidal communities from the oil spill. However, these conclusions are not well supported by the data set and contrast with findings of numerous other researchers on the Exxon Valdez spill and other spills.
Further, the finding that communities in some oiled areas of the Sound were more rich and diverse than those in some unoiled parts of the Sound is also unsubstantiated by the data and contrasts with findings of other researchers. While many investigators have shown increases in certain opportunistic species, such as oligochaete worms, following spills, Exxon scientists stand alone in their assertion that consistent increases in entire assemblages of animals occurred following the spill (this assertion is probably based on the use of improper reference sites; see above).
Discussion of correlations between cleanup effects and oiling is also not substantiated with any data.
The conclusions and data presented in Exxon's studies provide little understanding of injury to and recovery of biological communities after an oil spill.
NATURAL RESOURCE STUDIES: PACIFIC HERRING
Prince William Sound herring are a cornerstone species for the ecosystem and the local economy. Herring are a critical food source for over 40 predatory birds, mammals, other fish, and even invertebrates. Herring are also an important subsistence food, and support a multi-million dollar commercial fishery.
This section compares government studies by Hose et al. (accepted 1993a, b), Kocan et al. (accepted 1994), and Norcross et al. (accepted 1993) with Exxon studies by Pearson et al. (1993).
Different Definitions of Oil Exposure
Government scientists identified "oiled beaches" by using a time series of spill trajectory maps from daily surveys by four agencies including Exxon in 1989 and by hydrocarbon analysis of mussel tissue. Oil droplets were present under the water surface everywhere within the spill trajectory in Prince William Sound, but were impossible to track aerially. Mussels bioaccumulate oil, so their tissue mirrors the levels of oil in the water they filter. Oil found in mussel tissue was identified as Exxon Valdez oil by chemical fingerprinting (Short & Rounds 1993b).
Exxon scientists used the term "oiled shorelines" in its most literal meaning and avoided the terms oil trajectory and oiled waters. According to government studies,
"over 40% of the areas used by herring to stage, spawn, or deposit eggs, and over 90% of the areas needed for summer rearing and feeding were lightly to heavily exposed to crude oil. As a result, herring encountered oil as eggs, larvae, juveniles, and adults in 1989 and, to a lesser extent, in 1990."
The different definitions of "oiled beaches" resulted in government scientists concluding 43% and 31.4% of the linear miles (about 40 actual miles) of herring spawn were within oiled sites in 1989 and 1990, respectively, while Exxon concluded 4% and 10% of the linear miles of spawn were at oiled sites in 1989 and 1990, respectively. This led Exxon to underestimate injury to herring.
Differences in Timing of Study Initiation
Government scientists started tracking schools of herring the day of the spill (March 24, 1989). They observed oil sheens of various size and duration in areas frequented by spawning herring, and observed adult herring migrating across the oil trajectory to reach the nearshore spawning areas.
Exxon did not have herring scientists in Prince William Sound until after the first week in April. While the scientists could reconstruct the track of the oil during their absence, they would not have known where the herring had been during this time. This could have contributed to Exxon's inaccurate reporting of percent of oiled versus unoiled beaches.
Different Year Classes Studied
The herring population is composed of two- to eleven-year-old fish, often with offspring of a single spawning year (i.e., a "year class") dominating. The 1988 year class currently dominates the Prince William Sound population and was exposed to oil as one-year-olds when they spent the summer of 1989 feeding and growing in nearshore bays.
Government scientists evaluated the reproductive potential of the 1988 year class, exposed to oil as one-year-old fish, in 1992 when the fish returned as adults to spawn. They found less than half of the eggs collected from kelp in areas oiled in 1989 hatched successfully compared to eggs from unoiled sites (Kocan et al. accepted 1994).
Exxon scientists did not investigate reproductive impairment or any other sublethal types of damage to juveniles and adult fish born before and exposed during the spill. By limiting the scope of their studies, Exxon scientists limited their ability to detect sublethal and other long-term damage to herring from the spill.
Pooling of Sites
Both government and Exxon scientists pooled (combined) data from all oiled sites to compare with pooled data from all unoiled sites. Because of key differences in the definition of "oiled site" (see above), Exxon's pooled data for unoiled sites included sites which were "oiled," using the government's definition of the term. Averaging these data raised the apparent level of effects at the "unoiled" sites and made it harder to detect differences from the means at the oiled sites. This would underestimate injury.
For example, government scientists reported significant elevations (30% at p < 0.05) in abnormal larvae at pooled oiled sites versus unoiled sites, which included indicators such as skeletal bends, missing fins, percent abnormal and percent genetic aberrations at individual sites (Hose et al. accepted). Exxon scientists did not find any significant differences in abnormal larvae at pooled oiled versus unoiled sites.
Selective Reporting of Data
At the spill onset, scientists knew that water concentrations of oil correlated poorly with hatching success:
"the oil spill situation most likely to impact herring reproductive success is one in which fresh oil is physically dispersed into the water column by turbulence and, subsequently, adheres as droplets to attached herring eggs... the aspect of hatching success of Pacific herring most likely to be impacted by oil exposure is the frequency of abnormal larvae... water concentrations of specific or total hydrocarbons could be poor or misleading indicators of the frequency of abnormal larvae" (Pearson et al. 1985).
Scientists also knew that concentration of hydrocarbons in egg tissue correlated poorly with hatching success, because the embryo itself metabolizes oil and the chorion around the embryo filters and flushes water and compounds in water:
"no significant regressions were found between the percentage of abnormal larvae and any measure of hydrocarbons-in-egg concentration" (Pearson et al. 1985).
Based on this prior knowledge, scientists took mussel tissue (government) or sediment (Exxon) samples in addition to water samples and egg tissue samples to test correlations with abnormal larvae.
Not surprisingly, government scientists found no significant correlations between percentage of abnormal larvae and either water or egg tissue hydrocarbon concentrations. They did, however, find highly significant correlations with mussel tissue chemistry and level of injury.
Not surprisingly, Exxon scientists also found no significant correlations between percentage of abnormal larvae and either water or egg tissue hydrocarbon concentrations. However, Exxon did not report its results for sediment samples. This omission could have dramatically underestimated injury.
Poor Use of Summary Statistics
Government scientists presented their raw data in appendices and measurements of error with summary statistics, so that their results could be evaluated, verified or reproduced by others.
Exxon scientists did not present their raw data. Further, they did not even present some of the summary statistics such as measurements of error. Without the quantitative data, it is impossible for other scientists to verify or evaluate the results.
Exxon scientists also omitted comparisons of sublethal rates in herring from pooled oiled versus unoiled sites for 1989, the year of the spill, although this comparison was presented for 1990 data (no significant findings). The omission of 1989 data is highly suspicious and could have dramatically underestimated injury.
Misrepresentation of Data
Government scientists found that the 1989 year class of herring, spawned during the spill, was missing from the population based on both 1992 and 1993 age composition data (ADF&G 1993, 1994). Further, the 1989 year class in Sitka Sound, which serves as the control for Prince William Sound (r2 = 0.89), was noticeably present in 1992 and 1993. The Prince William Sound 1989 year class clearly stands out as an anomaly. When rates of loss through to metamorphosed larvae over 20 mm long are combined, the cumulative loss of the 1989 year class is found to be 96%, which accounts for the missing year class (Funk et al. submitted 1993).
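How losses at successive life stages combine into a cumulative loss can be sketched as follows. The stage-specific rates used here are hypothetical placeholders chosen only to show the arithmetic; they are not the rates reported by Funk et al. and do not reproduce the 96% figure.

```python
from math import prod

def cumulative_loss(stage_losses: list[float]) -> float:
    """Combine per-stage loss fractions into one cumulative loss fraction.

    Survivors of each stage are the only individuals exposed to the next,
    so survival multiplies across stages and loss is one minus the product.
    """
    surviving = prod(1.0 - loss for loss in stage_losses)
    return 1.0 - surviving

# Hypothetical loss fractions for three successive life stages.
illustrative_losses = [0.5, 0.5, 0.5]
print(f"Cumulative loss: {cumulative_loss(illustrative_losses):.1%}")
```

The point of the sketch is that moderate losses at several stages compound, so a cumulative loss can be far larger than any single stage's loss.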
Exxon scientists claim the 1989 year class was a poor year for recruitment, because it followed a high recruitment year. However, this claim is not in agreement with the Sitka control data. Exxon scientists also claim that losses to the 1989 year class will not be evident until 1996. This is simply not true: state fisheries management biologists routinely track a year class from two-year-olds to 11 years and older, when they disappear from the spawning population.
Misrepresentation of Study Findings
Exxon scientists discuss the large commercial herring catches in Prince William Sound in 1990 to 1992 as evidence of low impact of the oil spill on herring. However, they fail to mention that the herring fisheries were closed in 1989 because of the oil spill, and that the subsequent years of high harvest were in large part attributed to harvesting the 1989 quota in later years, combined with the anticipated quota for these years.
Exxon scientists also fail to discuss the cyclic nature of herring populations. Because of shifting baseline data, the effects of the oil spill can only be distinguished by comparing actual versus predicted population levels, not by comparisons with absolute population levels. Oil spill mortality, plus mortality from other unforeseen events, is the difference between the predicted and the actual population levels.
NATURAL RESOURCE STUDIES: PINK SALMON
Pink salmon are vital to the economy of coastal fishing communities and the well-being of subsistence villages, and they play a crucial role in the ecosystem:
"A host of birds, including gulls, kittiwakes, cormorants, loons, mergansers, murres, auklets, puffins, kingfishers, and dippers, and small terrestrial mammals, such as mink, renew their energy stores as they feed upon migrating juvenile salmon in the spring. Numerous predatory fishes such as cod, pollack, rockfish, and sculpins also feed on juvenile salmon in marine waters.
"Adult salmon returning to spawn fall prey to killer whales, porpoises, sea lions, harbor seals, and salmon sharks. In the spawning streams, adult salmon provide a pre-winter feast for bears, land otters, mink, wolverines, wolves, coyotes, eagles, ravens, and gulls. Still later, the decomposing salmon carcasses enrich the estuaries for young fish, crab, shrimp, mussels, and many other invertebrates. In fact, most animals in Prince William Sound rely on wild pink salmon in one form or another for their continued survival" (Bue et al. 1993).
Pink salmon stocks in Prince William Sound are composed of wild and hatchery fish. Most, if not all, of these pink salmon were exposed to oil as outmigrating juveniles: all pink salmon, both wild and hatchery fish, spend about 2 months in nearshore areas of the Sound actively feeding and growing prior to migrating out the southwest district with the currents, (the same path as the oil), and into the Gulf of Alaska for the winter. The nearshore areas in the southwest district of the Sound were heavily oiled in 1989. Wild pink salmon were also exposed to oil as eggs in the streams: up to 75% of wild pink salmon in the Sound spawn in intertidal areas (Bue et al. 1993).
Pink salmon studies were conducted on five different phases of development. This analysis compares "government" studies on egg mortality in the field by Bue et al. (1993) and Sharr et al. (1989, 1990, 1991), egg mortality in the lab by Bue (1993), pre-emergent fry by Wiedmer (1993), emergence timing by Peckham (1993), juvenile growth by Willette (1993) and Wertheimer et al. (1993), and adult escapement by Sharr et al. (1993) with "Exxon" studies on early life history by Brannon et al. (1993) and adult escapement by Maki et al. (1993).
Egg Mortality in the Field
Differences in Study Design
Government scientists designed their study using large sample sizes to reduce variability and to increase the ability of the design to detect real differences in egg mortality between oiled and unoiled streams. Both government and Exxon scientists used four transects over identical tidal zones per stream, but the similarities end there. Government scientists sampled 31 streams (vs. 9 for Exxon), with 10 (1989) or 14 (1990 on) replicates per transect (vs. 3 for Exxon), every year since, and including, 1989 (vs. 1989 only for Exxon), collecting eggs which were broadly representative of the gene pool (vs. 5 fish per stream for Exxon). The power of Exxon's study design to "see" even a very large difference in egg mortality from oil was very poor compared to the government's design.
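The difference in sampling effort can be tallied directly from the figures given above. The sketch below counts samples per year and adds one rough comparison of precision. Treating streams as the independent units, and assuming standard error scales with the inverse square root of their number, is a simplifying assumption for illustration, not an analysis from either study.

```python
from math import sqrt

def samples_per_year(streams: int, transects: int, replicates: int) -> int:
    """Total samples collected in one year for a streams x transects x replicates design."""
    return streams * transects * replicates

gov_1989 = samples_per_year(streams=31, transects=4, replicates=10)
gov_1990_on = samples_per_year(streams=31, transects=4, replicates=14)
exxon_1989 = samples_per_year(streams=9, transects=4, replicates=3)

# Assumption: streams are the independent units, and the standard error of a
# mean shrinks in proportion to 1 / sqrt(number of independent units).
se_ratio = sqrt(31 / 9)

print(f"Government 1989: {gov_1989} samples; 1990 on: {gov_1990_on} samples")
print(f"Exxon 1989: {exxon_1989} samples")
print(f"Exxon standard error relative to government (stream level): about {se_ratio:.2f}x")
```

Under these assumptions the government design collects more than ten times as many samples per year, and a larger standard error on the Exxon side means a given difference in egg mortality is correspondingly harder to detect.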
Misinterpretation of Study Findings
Government scientists found that egg mortality in oiled streams was significantly higher than in unoiled streams in 1989 in the intertidal zone, and progressively higher in 1990 in the upper intertidal zone, and even higher in 1991 across all stream zones. This was interpreted as indicating potential genetic damage, and laboratory studies were initiated to verify the field results.
Exxon scientists detected no significant differences in egg mortality due to oil, and interpreted this to mean that there were no differences, rather than that their design was incapable of detecting differences. However, despite the high inherent variability in the study design, the highest egg mortality in oiled streams occurred at the highest intertidal zone (Table 3, Brannon et al. 1993). Exxon scientists also initiated a laboratory study to verify their field results.
Egg Mortality in the Laboratory
Differences in Study Timing
Government scientists started their "incubation" study in 1993 and have requested to continue the work in 1994, based on their results. Exxon scientists conducted their incubation study in 1990 and 1991. With a properly designed study, differences in egg mortality due to genetic impairment should have been easier to detect in 1990 and 1991, closer to the initial exposure event.
Differences in Study Design
Government scientists designed their study using large sample sizes to reduce variability and to increase the ability of the design to detect real differences in egg mortality between oiled and unoiled streams. Government scientists collected fish from 16 streams (vs. 4 in 1990 and 10 in 1991 for Exxon), to make 24 replicates per stream with 500 eggs each (vs. 40 "replicates" with 200 eggs each for Exxon), and repeated the sampling over four days (vs. 1 day for Exxon). The power of Exxon's study design to "see" even a very large difference in egg mortality from oil was very poor compared to the government's design.
Pseudoreplication & Low Signal-to-Noise Ratio
Government scientists collected 30 female and 30 male fish from each stream, and made a "stream specific embryo pool" consisting of all possible intra-stream crosses (30 x 30 = 900). Replicate samples were then drawn from this pool for incubation. This pooling created a composite gene pool which reduced the variability among replicate samples.
Exxon scientists collected 20 female and 40 male fish from each stream, and spawned eggs from each female into two "jars" (40 jars total), each of which was fertilized by a different male. Because of the inherent variability in genetic material in individual fish, these 40 samples were not true replicates. Treating them as such for statistical analysis increased the variability and decreased the power of the test to find statistically significant differences.
Misinterpretation of Study Findings
Government scientists found a "difference in embryo survival between oil contaminated and control streams" and concluded "that these differences are carried by the parents" (Bue 1993 p. 3). These results confirm the findings of the field studies.
Exxon scientists detected no significant differences in egg mortality due to oil, and interpreted this to mean that there were no differences, rather than that their design was incapable of detecting differences. These results are consistent with the findings of the field studies.
Pre-Emergent Fry Development
Differences in Study Design
Government and Exxon scientists sampled about the same number of streams (9 vs. 11 in 1990 and 12 in 1991, respectively), and collected about the same total number of fish per stream (about 140).
While both government and Exxon scientists looked for gross abnormalities such as various lesions, government scientists also looked for more subtle indicators of toxicant exposure, specifically induction of cytochrome P450IA, an enzyme known to be elevated in response to exposure to oil.
Misrepresentation of Study Results
Government scientists found that P450IA content was significantly elevated in oiled streams in 1990 and 1991, and that no significant change in induction intensity was observed between 1990 and 1991. Histopathological lesions were found in 1990 and 1991 in samples collected from oiled streams near the time of emergence (May and June), but the "small sample size prevented meaningful statistical analysis" of the lesion data (Wiedmer et al. 1993 p. 105).
Exxon scientists found that both survival (at the lowest tide level) in 1991 and the number of normal alevins in 1990 were significantly lower in oiled streams, similar to the government's data, despite the low sample size. However, by combining the survival data across tidal zones, which Exxon scientists note is "not valid" because of the mixed results (Brannon et al. p. 17), the statistical difference, not surprisingly, disappears. Yet these combined data are presented in Table 4 "for visual comparison only" (Brannon et al. p. 17), which is extremely misleading, as no mention is made in the table of its purpose. Exxon scientists attribute the low survival and weak correlation between number of normal alevins and oil to geographic differences in substrate and exposure, rather than to oil.
Misinterpretation of Study Findings
Government scientists found that more than 2 years after the spill, pre-emergent pink fry were still picking up levels of oil from the environment that caused detectable physiological changes.
Exxon scientists detected no "substantial" differences in pre-emergent fry survival or condition due to oil, and interpreted this to mean that there were no differences, rather than that their design was unable to detect significant differences.
Emergence Timing
Both government and Exxon scientists monitored timing of fry emergence, and both found significant differences. However, the samples in both studies were too small to determine whether the difference in timing was due to oil or other factors such as geographic differences.
Juvenile Growth
Differences in Study Design
Growth of pink salmon is strongly correlated with survival (Willette 1993): the faster the growth, the less chance for predation, and the higher the survival.
Government scientists conducted studies on growth for three years from 1989 through 1991 in two separate studies, one using coded wire tag (CWT) or marked fish (Willette 1993) and one using unmarked fish (Wertheimer et al. 1993). In 1989, a significant reduction in growth was found in fish released from a hatchery in the spill-impacted area of the Sound compared to fish released from a hatchery outside the area of the Sound that was oiled. Marginally significant reductions in growth were observed in 1990 and 1991; however, the magnitude of these reductions was only about one third of that observed in 1989, and they were thought to be due to temperature differences rather than oil effects.
Exxon scientists only conducted growth studies in 1990, a year after the spill, and found no significant differences in oiled versus unoiled areas. Given the marginal significance of the government studies in 1990, the findings of government and Exxon scientists may not be that different. However, Exxon scientists conclude from their limited study that growth of pink salmon was not affected by the oil spill. This extrapolation is not justified by the single year study and may have overlooked effects from oil in 1989.
Misinterpretation of Study Findings
Government and Exxon scientists detected no significant differences in growth due to oil in 1990. However, government scientists found in both of their studies that growth was significantly reduced in 1989 and also, to a lesser degree, in 1991. Exxon scientists did not study growth in 1989, and conclusions based on a single year of sampling, the year after the spill, may be misleading as to the true effects of oil.
Adult Escapement
Differences in Study Design
Both government and Exxon scientists monitored pink salmon spawning escapement, but for two entirely different purposes. Government scientists conducted a study in 1990 and 1991, in addition to the annual escapement counts, specifically to calibrate the accuracy of the historic methods used to estimate annual escapement. Exxon scientists conducted their study in 1989 through 1992 to determine whether adult salmon return to oiled versus unoiled streams.
Misinterpretation of Study Findings
Study findings are not comparable because of the different study design: government scientists found that historic methods underestimate escapement, and Exxon scientists found that pink salmon return to oiled streams.
However, Exxon scientists went on to conclude that there were no measurable effects of the Exxon Valdez oil spill on adult salmon, a finding which may overstate their data base. Exxon scientists assumed that the two postspill year classes at greatest risk for spill effects were 1990 and 1991, and determined that returns were strong in both years.
The obvious corollary to Exxon scientists' assumption is that the postspill year classes at the second greatest risk for spill effects would be the offspring of these two year classes, returning as adults in 1992 and 1993. Adult returns in both these years were extremely weak and about one-third and one-fifth of anticipated goals, respectively (ADF&G 1993, 1994). Exxon scientists did detect the weak return in 1992, the last year of their study, but attributed it solely to poor ocean rearing conditions, rather than oil. This conclusion may be misleading.
NATURAL RESOURCE STUDIES: MURRES, HARLEQUIN DUCKS & OTHER BIRD SURVEYS
Of all the seabirds, murres were the hardest hit by the Exxon Valdez oil spill, comprising 74% of recovered carcasses (Erikson 1993). It is estimated that over 20% of the entire northern Gulf of Alaska population was killed, 260,000 to 308,000 birds (Ford et al. 1991). Murre colonies at the Barren Islands and along the spill path dropped by up to 70% compared to pre-spill counts (Piatt 1993). Since murre reproductive success is directly dependent on density (the more birds, the better the breeding success) (Nysewander et al. 1993), murre colonies were studied to assess injury to and recovery of this species.
This analysis compares "government" studies by Nysewander et al. (1993) with "Exxon" studies by Boersma et al. (1993) and Erikson (1993).
Differences in Study Design
Government scientists surveyed murre breeding populations at five locations within and two locations outside the oil spill trajectory annually from 1989-1991. Information was also gathered about timing of nesting events and productivity at some of the sites. Because surveys were conducted over multiple colonies and years, the results could be extrapolated throughout the spill-impacted area.
Exxon scientists assessed breeding success at one closely observed 5 x 5 meter site, East Amatuli Light Rock (EALR) in the Barren Islands, from the year after the spill, 1990, to 1992 (Boersma et al. 1993). The breeding colony at EALR is atypical compared to surrounding sites: it has the best breeding occupancy, and its density is constant between years, unlike other colonies, which can vary by up to two-fold (Nysewander et al. 1993). Estimates of murre colony breeding success at EALR give a misleading impression of recovery in murre colonies, because this colony is the least likely to show an effect.
Exxon scientists also conducted a survey of attendance at many colonies, but only in 1991, two years after the spill (Erikson 1993), and compared numbers with historical surveys which are of questionable quality (Nysewander et al. 1993). It is possible that the high variability in the baseline data limited the ability to detect statistical differences.
Different Timing of Study Initiation
Government scientists initiated studies in 1989, when the greatest differences were apt to be apparent. Exxon scientists initiated their studies one (Boersma et al. 1993) or two (Erikson 1993) years after the spill.
Survey Methods
Exxon scientists substantially overestimated the chick production on EALR in 1990: at least 39 marked eggs were lost to predators when the nest site was visited to count eggs. These eggs were counted as eggs that would have hatched and fledged, and it was assumed that the adults did NOT re-lay eggs. However, data from other murre colonies show that 85% of birds that lose eggs through disturbance early in the season re-lay (Ainley & Boekelheide 1990). Thus, Exxon scientists may have over-estimated production by nearly 100% in 1990 (Figure 9, Boersma et al. 1993).
Emphasis on Statistical Uncertainty
Government and Exxon data show broad agreement on some of the murre surveys. However, when Exxon scientists performed the data analysis, they presented the data on a per colony basis, with several small colonies increasing and several very large colonies decreasing. There was no statistical trend (as noted by Exxon scientists), but the numerical change in overall colony attendance was pronounced (which Exxon scientists failed to mention).
Both government and Exxon scientists used the same baseline data set, which was outdated and incomplete. Most of the data were collected in 1972 and 1984-85, and both climatic changes (global warming) and commercial fishing may have later introduced large changes in populations. This created difficult confounding problems for the government scientists attempting to quantify injury, and provided an uneven baseline for Exxon to use in demonstrating statistical uncertainty and a consequent lack of oil effects.
Misrepresentation of Study Findings
Government scientists made more counts after the spill and were able to show large differences in colony numbers and breeding success after the spill, despite the difficulties with the baseline data set.
Exxon scientists relied upon the statistical uncertainty of baseline data to give such large variances that, when they conducted the post-spill surveys, no statistical differences could be found.
Comparisons of other bird studies conducted by government and Exxon scientists further demonstrate the attempt by Exxon scientists to obfuscate effects of oil.
Multispecies Studies
Government scientists conducted a survey from 1989-1992 that systematically searched the spill area to count and identify all species of birds present in different seasons in both oiled and unoiled areas, and made careful correlations with historical numbers to compare bird distributions pre- and post-spill (Laing & Klosiewski 1993). This survey showed changes between oiled and unoiled areas, as well as reduced numbers in oiled areas compared to pre-spill surveys. The size and scope of this survey gave it much more statistical power to detect real differences than the Exxon study.
Exxon scientists conducted a survey from 1989-1991 that searched only within the oil spill zone, comparing numbers of birds in bays that had been oiled or that had escaped local oiling because of the oil trajectory (Day et al. 1993). The limited scope of this study, conducted without control areas, narrowed the extent to which statistical differences could be found, and Exxon emphasized lack of statistical difference between areas as evidence for "no effect" of oil. It appears that Exxon's field studies by design had a very limited ability to show oil spill effects. This enabled Exxon scientists to claim that since no statistical differences were found, there was minimal injury.
Carcass Counts
Government scientists, in a very detailed study, calculated that the actual number of total dead birds was over 350,000 with considerable uncertainty due to the vast distances, infrequent beach searches in many areas, and highly local distribution of birds (Ford et al. 1991).
Exxon scientists stressed recovery, rather than injury, and completely avoided estimating the numbers of birds killed or injured in the spill. However, Exxon sought to discredit the government study by claiming that the government scientists could not show large decreases in numbers of birds at breeding colonies (Erikson 1993, Wiens 1993), and that even if large numbers had died, recruitment at the breeding colonies was apparently high, and therefore the population had recovered by 1991. Exxon scientists relied upon statistical uncertainty to avoid demonstrating an effect.
Harlequin Ducks
Government scientists conducted a systematic and detailed study on harlequin ducks from 1989-1993 that documented that near-total reproductive failure has occurred in western (oiled) versus eastern (unoiled) Prince William Sound since the spill (Patten 1993). Oil-contaminated tissues and poor body condition were found in ducks in oiled areas, and blue mussels, a principal component of the ducks' diet, were found to be contaminated with Exxon Valdez oil. Concern was expressed that unless steps were taken to remove the oil from the mussel beds, a local extinction of harlequin ducks could occur within the western Sound (Patten 1993).
Exxon scientists conducted a more limited set of field surveys specifically to gather data to counter the government's study (Day et al. 1993). Brief surveys were made only into oiled areas to search for broods of ducks late in the breeding and molting season. No attempt was made by Exxon scientists to compare breeding success in oiled areas with success in unoiled areas.
(1) Extrapolation of area sampled is from Table 3, Page et al. 1993: 14.8 ft² = 44 samples x 1/32 m² x 10.76 ft²/m². Extrapolation of miles is from Table 1, Page et al. 1993: 275 mi = 0.566 (proportion of habitat type) x 486 mi total.
(2) Extrapolation of area sampled is from Table 3, Page et al. 1993: 4.06 ft² = 48 samples x 1/32 m² x 10.76 ft²/m². Extrapolation of miles is from Table 1, Page et al. 1993: 18 mi = 0.037 (proportion of habitat type) x 486 mi total.
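The arithmetic in footnote (1) can be reproduced with a short sketch. The function and variable names are illustrative and do not come from Page et al. 1993. The inputs are the figures quoted in the footnote (44 samples, 1/32 square meter per sample, 10.76 square feet per square meter, a habitat proportion of 0.566, and 486 total miles).

```python
# Minimal sketch of the footnote (1) extrapolations.
# All names are hypothetical; the inputs are the figures quoted in the footnote.

SQ_FT_PER_SQ_M = 10.76  # conversion factor as given in the footnote


def sampled_area_sq_ft(n_samples, sample_area_sq_m):
    """Total area sampled, converted from square meters to square feet."""
    return n_samples * sample_area_sq_m * SQ_FT_PER_SQ_M


def habitat_miles(habitat_proportion, total_miles):
    """Shoreline miles represented by a habitat type."""
    return habitat_proportion * total_miles


area = sampled_area_sq_ft(44, 1 / 32)   # about 14.8 square feet
miles = habitat_miles(0.566, 486)       # about 275 miles

print(f"Area sampled: {area:.1f} sq ft")
print(f"Shoreline represented: {miles:.0f} mi")
```

Passing different sample counts or habitat proportions to the same two functions allows the other extrapolations to be checked against the source tables.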