APPENDIX D: PILOT TEST TO IDENTIFY STEWARDSHIP DATA
This appendix presents the results of a pilot test to identify stewardship data available in a DOE site document index database. The pilot test was a preliminary feasibility exercise to see whether information of stewardship value can be identified from existing information indexes and to identify barriers that may prevent more diagnostic selections. The pilot test was conducted on a records management database from the Rocky Flats Environmental Technology Site (RFETS), hereafter referred to as the focus site. The methodology, results, and conclusions from the pilot test are presented below.
D.1 Methodology
The pilot test was conducted in three steps:
1. Develop criteria to identify stewardship data.
2. Develop queries to search for the stewardship data available in the site database.
3. Query the site database.
These steps are further described in the following sections.
D.1.1 DEVELOP CRITERIA TO IDENTIFY STEWARDSHIP DATA
To develop criteria to identify stewardship data, experts from a wide range of disciplines developed a variety of scenarios likely to be encountered during stewardship, identified the decisions that would need to be made under each scenario and the data required to make those decisions, and defined criteria to identify the appropriate data (refer to Figure D-1). To focus the effort and to simulate real situations in which a future steward may want to access stewardship data, the experts conducted the pilot test in the following functional areas: barriers/buffers, natural resources, community planning, emergency response, and compliance.
An example of a stewardship functional area scenario with the corresponding decisions and information needs (for the barriers/buffers functional area) is shown in Table D-1.
| Scenarios in Which Data Would be Used | Decisions that Would Need to be Made Under the Scenario | Information Needed to Support the Decision | Criteria to Identify Information |
| Monitoring indicates that the site is not performing as expected in the original closure plan. | | | Confine requested information to publicly available, published (referenceable) material. |
| PILOT STUDY TO IDENTIFY STEWARDSHIP DATA |
To identify the information needed to support the decisions and the resulting criteria, the functional area experts reviewed the 12 stewardship data types defined in Chapter 2 of this report. For each of the stewardship data types, the functional area experts identified the information required to support the data type (see Table D-2). For each information requirement, the functional area experts also identified a temporal reference, stating when they believe the information is likely to have been generated. For example, some of the information may have been generated in the past (and hence be the subject of a search of currently available records) and some of the information will be developed only in the future. The temporal reference for each information requirement is also shown in Table D-2.
| Data Type and Specific Information to Support Stewardship | Temporal Reference |
| Hazards and Controls | |
| A. Existing hazards. This information includes the location, type, condition, and vulnerability (e.g., to fire, rain, earthquakes) of radioactive and chemical hazards left onsite after cleanup is complete. This information also includes the likelihood that these hazards will migrate or otherwise move either within the site or to offsite areas. At the point of site closure/transfer, this information essentially provides a "baseline" of the state of each onsite hazard at the start of long-term stewardship. A few examples are listed below. | |
| | Past/Future, Past/Future |
| B. Past and present releases and accidents. This information includes reports and other related data on past and present releases and accidents; radioactive and chemical contaminants or materials released during these events; who or what was known or suspected to be exposed to these contaminants or materials; and any documented or suspected exposure levels. A few examples are listed below. | |
| | Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future |
| C. Disposition of historical hazards. This information pertains to site hazards that existed in the past but were removed or otherwise mitigated to a point that allows unrestricted future uses. It also includes legal or other supporting documentation to demonstrate that the hazards are no longer present onsite or the extent to which historical hazards were mitigated to baseline conditions at the start of long-term stewardship. A few examples are listed below. | |
| | Past/Future, Past/Future, Past/Future |
| D. Information regarding existing barriers and other active or passive mechanisms for preventing exposures. This information includes the location, type, condition, and vulnerability (e.g., to fire, rain, earthquakes) of barriers and other protective mechanisms. This information includes knowledge of which specific barriers/protective mechanisms are required for each existing hazard. This information also includes schedules for maintenance or other related actions required to ensure adequate protections remain in place. A few examples are listed below. | |
| | Past/Future, Past/Future, Future, Future |
| Operations and Activities | |
| E. Process history. This information includes current and historical data on what activities occurred onsite, where these activities occurred, when these activities were conducted, and what infrastructure was used to support these activities. It includes the processes that occurred onsite, the materials used for these processes, and the products and wastes produced. This information includes a general history of the site; its historical mission(s); its role in the design, testing, production, and dismantlement of U.S. nuclear weapons; and any post-Cold War missions or activities at the site. A few examples are listed below. | |
| | Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future, Past/Future |
| F. Historical infrastructure. This information also includes what buildings, facilities, pipelines, and other infrastructure existed onsite; where they were located; and what they were used for. It also includes how onsite land areas were used. A few examples are listed below. | |
| |
| G. Post-closure/transfer operations and infrastructure. Information pertaining to the operation of the site after closure including policies and procedures, post-closure monitoring data, compliance reports, land use during stewardship, remaining buildings/facilities, processes, pipelines, infrastructure, and effluent monitoring. A few examples are listed below. | |
| | Future, Future |
| Regulatory/Legal Framework | |
| H. Regulatory framework (past and present). This information includes any compliance agreements, regulations, site closure agreements, permits, or other legal requirements associated with long-term stewardship activities at the site. A few examples are listed below. | |
| | Past/Future, Past/Future, Past/Future, Past/Future |
| I. Requirements specific to transfer/closure and post-transfer/closure. This information includes any specific monitoring, maintenance, or reporting requirements established as a part of site closure agreements. This information also includes specific reporting schedules established for monitoring or other data. A few examples are listed below. | |
| | Future, Future, Future, Future, Future, Future, Future, Future |
| J. Real estate records. Real property records related to acquisition of the site, easements and other access rights onsite and offsite through public/private property, mineral rights, and water rights. This information includes legal agreements and associated documentation to allow appropriate access to offsite monitoring stations, pumps, or other active or passive control systems. This information also includes specific schedules for data collection, maintenance, and related tasks. A few examples are listed below. | |
| | Past/Future, Past, Past, Past |
| Site Characteristics/Settings | |
| K. Information about cultural and natural resources. This information includes the location, type, and condition of onsite natural resources (including minerals, land and water resources, and habitats/species of concern), including resources of particular importance to Native American Tribes. It also includes the vulnerability of these resources to a variety of hazards, including residual radioactive and chemical hazards, other manmade hazards, and natural hazards. This information also includes relevant laws, regulations, and agreements regarding protection and/or permitted uses of these resources. A few examples are listed below. | |
| | Past/Future, Past/Future, Past/Future, Past, Past, Past, Past, Past, Past |
| L. Geophysical and political information. This information includes site topography, site hydrogeology, geotechnical hazards, physical hazards, site boundaries, political boundaries, agricultural distribution patterns, and public exposure data. A few examples are listed below. | |
| | Past, Past, Past/Future, Past/Future, Past/Future, Past/Future |
D.1.2 DEVELOP QUERIES TO SEARCH FOR STEWARDSHIP DATA
After developing the criteria to identify stewardship data, the functional area experts developed queries to search a document index database for stewardship data. The site database used was the Environmental Records Database (ERD) from the focus site. The ERD, active through 1995, is a compilation of over 30 record indexing databases from across the site and has over 408,000 records. A list of the databases included in the ERD is shown in Table D-3. The databases were included in the ERD because of their environmental data value.
| Name | Description | Summary |
| Environmental Record Database (ERD) | The ERD is a compilation of over 30 record indexing databases from across the site and has over 408,000 records. Databases included in the ERD were chosen for their environmental data value. The system has migrated over time and is currently maintained in FileMaker Pro. | Summary: Databases included in the ERD are: Procedure Tracking and Document Tracking System, RF Correspondence Control System, Env. Master File, Rocky Flats Dbase, Marcus Church Dbase, Records Mgmt. Dbase, Master Records Inventory, CERCLA Administrative Records, ERM Project File Center Dbase, Rockwell (Grand Jury Investigation) Dbase, EPA Dbase, ChemRisk, Woodward Clyde System, Doty Database, RAC Dbase, CDPH&E Dbase, Solar Ponds Files, RCRA Permitting and Compliance Library, Summary of Root Cause Analysis, Lessons Learned, Hazspills, and RCRA Regulatory Programs Permitting Files. Data Status: In most cases, data is current through 7/1/95. Current system is inactive. Document Availability: Most documents are available from Records Management at the site or from other site document custodians. |
| Rockwell (Criminal Grand Jury Investigation) | Contains records seized by FBI and EPA agents and records produced in response to Grand Jury Investigations. The database contains approximately 150,000 documents that focus mainly on the activities that occurred at the site from 1984 to 1989. The database is owned and operated by Rockwell International. | Summary: The database is primarily used for litigation defense by Rockwell International. Many of the documents within the system have been optically scanned. Data Status: The database focuses mainly on the activities that occurred at the site from 1984 to 1989. Current system is inactive. Document Availability: Documents are available through an attorney's office in Denver. |
| Marcus Church | Contains documents associated with monitoring data, reports, and scientific studies dealing with offsite environmental issues. Approximately 35,000 documents are in the system. | Summary: The database was developed in support of the Church-McKay litigation against the DOE, Dow Chemical, and Rockwell International. Data Status: The database focuses mainly on the years 1952-1981. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the Federal Records Center. |
| Cook | Contains documents collected in support of a class action lawsuit (class members are people who reside or work within a certain radius of the site) against Dow Chemical and Rockwell International in 1990. | Summary: The database was developed in support of the class action lawsuit against Dow Chemical and Rockwell International. Along with information associated with potential impacts to health or decreases in property value, the system contains records of building history for several buildings constructed from the early 1950s through the early 1970s. Data Status: The database focuses mainly on the years 1952-1990. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. However, not all documents are available due to poor quality originals. |
| Rocky Flats | Contains documents for future possible litigation purposes. The majority of these documents deal with organizational information. The system also contains security, safety and health information for the period of collection. | Summary: The database was developed for future litigation support and on a variety of subjects. Data Status: The database contains pre-1990 documents. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the Federal Records Center. |
| Environmental Master File (EMF) | The EMF contains records associated with the environmental history of the site and the surrounding lands. The database contains approximately 28,000 documents and has been used to support environmental projects and litigation activities. | Summary: The database was developed to retain a historical log of environmental activities at and around the site. Data Status: The database focuses mainly on the years 1952 to the late 1980s. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available for some documents. Hard copies are available from both the site contractor and the Federal Records Center. |
| Woodward Clyde | This system contains environmental documents collected in characterizing the environmental baseline conditions of the operable units identified at the site. | Summary: The database was developed for site characterization and delineation of operable units. Information contained in the database includes the site history, nature of contamination at the site, and environmental conditions of the site in 1992. Data Status: Data has not been updated since origination. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the Federal Records Center. |
| ChemRisk | This system contains documents collected during 1991 and 1992 for the dose reconstruction/toxicological review performed by the Colorado Department of Health. The system contains approximately 2,000 documents. | Summary: The dose reconstruction project included the collection of onsite and offsite monitoring data, routine and accidental releases of radionuclides and non-radioactive chemicals, environmental management procedures, and waste stream characterizations. Data Status: The database focuses mainly on the years 1951 to 1989. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the site contractor. |
| Doty | This system contains documents collected for the generation of the Historical Release Report in June 1992. The system contains approximately 5,700 documents. | Summary: The Historical Release Report (HRR) contains information regarding spills, releases, and/or accidents involving hazardous substances; potential cumulative effects of inside-building releases on the environment beneath buildings; and known/potential environmental impacts outside the site. Data Status: The database contains documents collected during 1991 and 1992. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the site contractor. |
| RAC | The system contains documents collected in support of the Phase II Dose Reconstruction study started by ChemRisk. Documents contained within this system came from the same collection of documents that was available to the Doty and ChemRisk efforts. Approximately 1,100 documents are in the database. | Summary: Contains historical public exposures (estimates of offsite exposures, doses, and potential health risks). Data Status: Not known, but it is expected that the information in the database has not been updated or maintained since generation of the RAC report in the early 1990s. Current system is inactive. Document Availability: Not known. |
| EPA | The system contains documents collected in response to EPA CERCLA 104(e) Requests for Information. | Summary: The system includes information regarding plutonium in the air ducts and shipments of contaminated wastes. Data Status: The database contains information regarding shipments to the Lowry Landfill covering the years 1952 through the early 1980s. Current system is inactive. Document Availability: Documents were optically scanned and full text retrieval is available. Hard copies are available from the site contractor. |
| CDPH&E | The system contains miscellaneous documents requested by the CDPH&E and used in a cancer incidents study. Approximately 180 documents are in the database. | Summary: Collection of miscellaneous documents requested by the CDPH&E for a cancer incidents study. Information includes original land selection documentation, Church-McKay land litigation, and Industrial Hygiene records. Data Status: Current system is inactive. Document Availability: Hard copies are available from the site contractor. |
| Records Management Database (RMDB) | The RMDB is the site's primary system for locating and retrieving inactive, unclassified site records. Approximately 60 million pages of inactive unclassified records are tracked by the system. The system resides on a mainframe using Oracle software. | Summary: The RMDB is used to index and retrieve inactive records that have been sent to Records Management for low-cost storage. The RMDB has been active since October 1993 and contains records from a variety of dates. Nearly 6,000 cubic feet of records and 3,000 reels of microfilm were indexed in FY 96. Data Status: Current system is active. Document Availability: Database is an indexing system only. Documents can be retrieved via formal search requests of the site contractor. |
| Master Records Inventory (MRI) | The MRI contains data from a sitewide records inventory that was conducted from June 1993 through August 1995. | Summary: The MRI contains a variety of active record information assessed from June 1993 through August 1995. Contents of the MRI provide history, use and function of the record series at the Site. The system has been used heavily by Site efforts including the epidemiology study, transition environmental database report, operating records audit, and the dose reconstruction study. Data Status: It is stated that the inventory ended in August of 1995. It is not known whether the system has been maintained. Current system is inactive. Document Availability: Documents indexed in the MRI are retained by the record originator as the system was designed to track active records. |
| Master Records Turnover Instruction (RTI) Database | The RTI is used by Records Management to retain all records turnover instructions that have been written for site records collections. | Summary: The RTI is essentially a controlled procedure that identifies the pertinent information fields that need to be captured for cost effective and efficient record retrieval. The RTI acts as a guide for data entry personnel to enter individual records into the Records Management Database. Data Status: Current system is active. Document Availability: n/a |
| Plantwide Procedures and Manuals Tracking Database (PADT) | The PADT is used to track distribution of all documents controlled by the centralized Document Control organization. The PADT resides on a mainframe running Oracle software. | Summary: The PADT consists of an index that tracks the distribution of all site policies, plans, manuals, and procedures formally controlled by the site. The system is linked to the RMDB in order to link data on inactive records for electronic transfer. Data Status: Current system is active. Document Availability: Controlled documents can be obtained through the site contractor. |
| Rocky Flats Correspondence Control System (RFCC) | The RFCC has been used at the site since 1993 to track incoming and outgoing external correspondence. The RFCC resides on a mainframe running Oracle software. | Summary: The RFCC is an index of all external correspondence controlled by the site contractor. It is primarily used to identify commitments to actions, dates, or resources for the site contractors identified in correspondence to and from the Department of Energy. Data Status: Current system is active. Document Availability: Hard copy files are available from the site contractor. |
| Building 706 Technical Library, Technical Reports Database | The Technical Library database provides an index of approximately 64,000 classified documents. The system is run on a FileMaker Pro database. | Summary: The Technical Library provides an index of classified technical reports that were used for production support at the site. Data Status: Current system is inactive. Document Availability: Hard copies are available through the Site contractor. |
| 1. Record ID (unique number for each record) |
| 2. Data Source |
| 3. Title |
| 4. Keywords |
| 5. Authors |
| 6. Addressees |
| 7. Distribution |
| 8. Comments |
| 9. Reference Numbers |
| 10. Publication Date 1 |
| 11. Publication Date 2 |
| 12. Estimated |
| 13. Type |
| 13.6. External Letters |
| 13.7. Manual |
| 13.8. Administrative |
| 13.9. Health and Safety Preventive Manuals |
| 13.10. Informational Procedure not held by Doc Control |
| 13.11. Other |
| 13.12. Old Manual Type |
| 13.13. Preventative Maintenance Order |
| 13.14. Environmental Management Procedure |
| 13.15. Program Plan |
| 13.16. WSRIC Book |
| 13.17. Requirements |
| 13.18. Waste Processing Report |
| 13.19. Miscellaneous |
| 13.20. DOE memorandum |
| 13.21. Survey |
| 13.22. Internal |
| 13.23. Analytical |
| 13.24. Presentation |
| 13.25. Graph |
| 13.26. Table |
| 13.27. Investigative Report |
| 13.28. Miscellaneous Handwritten Docs |
| 13.29. Telecommunications Message |
| 13.30. Memoranda |
| 13.31. Logbooks |
| 13.32. List |
| 13.33. Misc Traffic Documents |
| 13.34. Procedure |
| 13.35. Unplanned Event Info CTR Report |
| 13.36. Routing Slip |
| 13.37. Policy |
| 13.38. Diskette |
| 13.39. Shipping Papers |
| 13.40. Building Book |
| 13.41. Approval Forms |
| 13.42. Performance Indicator Reports |
| 14. Size |
| 15. Location |
The queries developed to search for stewardship data are based on the fields available in the ERD. The fields available in the ERD are shown in Table D-4.
The queries used to search the database are lists of keywords, developed by each functional area expert based on their information requirements (as discussed in the above section). Such searches most likely represent the method by which stewards would try to identify information in a database. Through iterative searching, functional area experts developed lists of keywords expected to encompass the majority of documents of interest to their functional area. The final keyword queries developed for the functional areas are presented in Table D-5.
One functional area, buffers/barriers, was further investigated. The keyword queries developed for buffers/barriers were grouped into sub-topics. These sub-topics and their corresponding keyword queries are shown in Table D-6. Each subtopic was then individually queried through the ERD database.
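The keyword queries described here amount to an OR over wildcard patterns applied to one or more text fields, sometimes with a data-source exclusion. The following Python sketch illustrates that logic only; the `like` and `matches_query` helpers, the sample records, and the shortened pattern list are hypothetical and are not the Access queries actually run in the pilot test.

```python
from fnmatch import fnmatchcase

def like(value, pattern):
    # Approximates Access `Like` for patterns that use only `*`:
    # case-insensitive, and the pattern must account for the whole field.
    return fnmatchcase((value or "").lower(), pattern.lower())

def matches_query(record, patterns, fields, excluded_sources=()):
    """True if any pattern matches any listed field and the source is not excluded."""
    if record.get("Data Source") in excluded_sources:
        return False
    return any(like(record.get(f), p) for f in fields for p in patterns)

# Hypothetical records that reuse ERD field names from Table D-4.
records = [
    {"Record ID": 1, "Data Source": "EMF",
     "Title": "Solar pond groundwater monitoring report", "Keywords": "", "Comments": ""},
    {"Record ID": 2, "Data Source": "RMDB",
     "Title": "Landfill closure plan", "Keywords": "", "Comments": ""},
    {"Record ID": 3, "Data Source": "ROCK",
     "Title": "Payroll memo", "Keywords": "", "Comments": "see soil sampling log"},
]

# A small subset of the Barriers/Buffers keyword list, for illustration.
patterns = ["*soil*", "*groundwater*", "*landfill*", "*closure plan*"]

selected = [
    r["Record ID"] for r in records
    if matches_query(r, patterns, ["Keywords", "Comments", "Title"],
                     excluded_sources={"RMDB"})
]
print(selected)
```

In this sketch the second record is dropped by the data-source exclusion even though its title matches, which mirrors the "not accepting data source RMDB" condition of the Barriers/Buffers query.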
| Functional Area | Final Keyword Query | Number of Records Selected |
| Barriers/Buffers | Checking for the word anywhere within Keywords, Comments and Title and not accepting data source "RMDB": Like "*electronic database*" Or Like "*soil*" Or Like "*soils*" Or Like "*surface water*" Or Like "*hydrology*" Or Like "*geology*" Or Like "*landfill*" Or Like "*pond*" Or Like "*ponds*" Or Like "*ditch*" Or Like "*NEPA*" Or Like "*environment*" Or Like "*monitoring*" Or Like "*groundwater*" Or Like "*ecology*" Or Like "*ecological*" Or Like "*EcMP*" Or Like "*SED*" Or Like "*RFEDS*" Or Like "*RI/FS*" Or Like "*RI*" Or Like "*FS*" Or Like "*ROD*" Or Like "*(ROD)*" Or Like "*RCRA*" Or Like "*CERCLA*" Or Like "*closure plan*" Or Like "*EIS*" Or Like "*(EIS)*" Or Like "*map*" Or Like "*meteorology*" Or Like "*weather*" Or Like "*sampling wells*" Or Like "*remedial investigation*" | 85,659 |
| Natural Resources | Checking for the word anywhere within Title and no screening of document types: Like "*ecolog*" or like "*cultur*" or like "*groundwater*" or like "*geolog*" or (like "*transport*" and like "*model*") or like "*archaeolog*" or like "*endangered*" or like "*mineral*" or like "*mining*" or like "*monitor*" or like "*meteorol*" or like "*weather*" or like "*radiol*" | 14,388 |
| Community Planning | Only searching Title: Like "*land use*" or like "*site development*" or like "*sitewide eis*" or like "*site wide eis*" or like "*sitewide environmental impact statement*" or like "*site wide environmental impact statement*" | 514 |
| Emergency Response | Checking for the word anywhere within Title and no screening of document types: Like "*earthquake*" Or Like "*fire*" Or Like "*firefight*" Or Like "*flood*" Or Like "*floodplain*" Or Like "*emergency response*" Or Like "*disaster*" | 3,924 |
| Compliance | There was no "final query set by subject expert" for the Compliance subject area. It is believed that the expert's query attempts may have been too restrictive and failed to find more than a minimal set of possible database entries. A set of records provided by the expert as a sample of query results had chromium in most of the records. A representative query for chromium was put together, and results comparable to other subject areas, at least in number, were obtained. Checking for the word anywhere within Title and no screening of document types: Like "*chromium*" | 470 |
1 The keyword searches were conducted using queries in an MS Access 97 database. Like "something" is the format of a basic query in Access, where something is the keyword (criterion) being searched. Access is sensitive to format. For example: Like "radiation" must match the entire field; Like "radiation*" matches a field starting with radiation; Like "*radiation" matches a field ending in radiation; and Like "*radiation*" finds radiation anywhere in the field. Note that "*radiation*" would also match irradiation, whereas "* radiation *" (radiation with a blank on each side) would match radiation only.
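The wildcard behavior described in this footnote can be checked with a short Python sketch. It uses the standard-library `fnmatch` module as a stand-in for Access `Like`, which is an assumption: only the `*` wildcard is covered, and other Access pattern characters (such as `#`) are not modeled. The `access_like` name and the sample titles are hypothetical.

```python
from fnmatch import fnmatchcase

def access_like(value: str, pattern: str) -> bool:
    # Case-insensitive match in which the pattern must cover the entire field,
    # as Access `Like` does; only `*` (any run of characters) is considered.
    return fnmatchcase(value.lower(), pattern.lower())

cases = [
    ("radiation", "radiation"),                  # exact: whole field must match
    ("radiation survey", "radiation"),           # fails: extra text in the field
    ("Radiation survey", "radiation*"),          # field starts with the keyword
    ("Gamma radiation", "*radiation"),           # field ends with the keyword
    ("Irradiation study", "*radiation*"),        # substring match, so irradiation is hit
    ("Irradiation study", "* radiation *"),      # blanks on each side exclude irradiation
    ("Annual radiation survey", "* radiation *"),
    ("Radiation survey", "* radiation *"),       # no leading blank, so no match
]

for value, pattern in cases:
    print(f"{pattern!r:18} vs {value!r:28} -> {access_like(value, pattern)}")
```

One consequence visible in the last case is that a pattern with a blank on each side misses the keyword when it is the first or last word of the field, so such patterns trade fewer false hits for some missed records.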
| ElecDB | Like "*electronic database*" Or Like "* SED *" Or Like "*RFEDS*" |
| Soil | Like "* soil *" Or Like "* soils *" |
| SurfWater | Like "*surface water*" Or Like "* pond *" Or Like "* ponds *" Or Like "*ditch*" |
| Hydro | Like "*hydrology*" Or Like "*groundwater*" |
| Geology | Like "*geology*" |
| Landfill | Like "*landfill*" |
| Acts | Like "* NEPA *" Or Like "*RCRA*" Or Like "*CERCLA*" |
| Enviro | Like "*environment*" |
| Ecolog | Like "*ecology*" Or Like "*ecological*" |
| Monitor | Like "*monitoring*" Or Like "*EcMP*" Or Like "*sampling wells*" |
| Investigate | Like "*RI/FS*" Or Like "* RI *" Or Like "* FS *" Or Like "*remedial investigation*" |
| Plans | Like "* ROD *" Or Like "* (ROD) *" Or Like "*closure plan*" Or Like "* EIS *" Or Like "* (EIS) *" |
| Map | Like "* map *" |
| Weather | Like "*meteorology*" Or Like "*weather*" |
| Note: SED = Surface Environmental Database, RFEDS = Rocky Flats Environmental Database System, and EcMP = Ecological Monitoring Program. | |
D.1.3 QUERY THE SITE DATABASE
After developing the criteria to identify stewardship data and the queries to search for the data, the functional area experts queried the database to identify stewardship data. First, the functional area experts analyzed the completeness of the data available in the database. For each of the fields in the ERD, the functional area experts identified the number of records that contained data. The functional area experts also identified how many records were contained in each of the 30 databases. Second, the functional area experts identified stewardship data by determining how many times a record was selected by the keyword queries, i.e., conducting a "triage" on the records selected by the keyword queries (Figure D-2).
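The "triage" step, counting how many keyword queries selected each record, can be sketched in Python as follows. The record IDs and query results are hypothetical, and ranking records by selection count is an assumption about how the counts would be used; the actual triage rules are those of Figure D-2.

```python
from collections import Counter

# Hypothetical query results: functional area -> set of selected Record IDs.
hits = {
    "Barriers/Buffers": {101, 102, 103, 104},
    "Natural Resources": {102, 104, 105},
    "Emergency Response": {104, 106},
}

# Count how many functional-area queries selected each record.
times_selected = Counter(rid for ids in hits.values() for rid in ids)

# Group records by selection count (assumed ranking: more selections first).
triage = {}
for rid, n in times_selected.items():
    triage.setdefault(n, []).append(rid)

for n in sorted(triage, reverse=True):
    print(f"selected by {n} queries: {sorted(triage[n])}")
```

Because each query result is held as a set, a record is counted at most once per functional area, so the count reflects how many distinct areas found it relevant.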
D.2 Results of Stewardship Pilot Study
The results of the stewardship pilot study include an analysis of the completeness of the data available in the site database and a summary of the records identified as stewardship data. These results are presented in the following sections.
D.2.1 COMPLETENESS OF INFORMATION AVAILABLE IN SITE DATABASE
As discussed above, the ERD contains 15 fields and 406,060 records. For each of the fields (except for the Record ID field, which is the unique number for each record and was populated for every record), the number of records with data for the field was counted and summarized (see Table D-7). As shown, the Data Source and Title/Description fields are completed for each record, although a small percentage of the records have a value of "N/A." Six other fields are completed for more than half of the records (Keywords, Authors, Addressees, Reference Numbers, Publication Date 1, and Location).
Of the populated fields, the only one that proved useful was Title. In practice, this field held a combined title and abbreviated abstract for each record, and its quality varied widely. Some records contained detailed abstracts outlining specific contents, while many other entries in the Title field were of little or undeterminable value.
| Data Source | | |
| Title/Description | | |
| Keywords | | |
| Authors | | |
| Addressees | | |
| Distribution | | |
| Comments | | |
| Reference Numbers | | |
| Publication Date 1 | | |
| Publication Date 2 | | |
| Estimated Date | | |
| Type | | |
| Size | | |
| Location | | |
Of some value were the Keywords and Comments fields, because they often contained useful terms to search on. They also provided information regarding the record pedigree (review information, etc.). The remaining fields were of little value.
Table D-8 presents a summary of the number of records contributed by each of the databases consolidated into the ERD. The largest single source of records was the Rockwell Criminal Grand Jury Investigation (ROCK) database, which contributed over a quarter (148,323 records) of the total records. The PFC database contributed another 50,516 records. Seven other databases contained between 10,000 and 40,000 records each. The remaining 21 databases were relatively small, each containing fewer than 10,000 records.
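The two completeness counts described in this section (records populated per field, and records contributed per source database) can be sketched as follows. This is an illustrative Python sketch, not the tool used in the pilot test: the function names and the sample records are hypothetical, and the record layout (one dictionary per record, keyed by field name) is an assumption. As in the text, a value of "N/A" is counted as populated.

```python
from collections import Counter


def field_completeness(records, fields):
    """For each field, return (records with a non-empty value, share of all records)."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for rec in records if str(rec.get(field) or "").strip())
        result[field] = (filled, filled / total if total else 0.0)
    return result


def records_per_source(records, source_field="Data Source"):
    """Count how many records each source database contributed."""
    return Counter(rec.get(source_field) for rec in records)


# Hypothetical records, invented for illustration only.
records = [
    {"Data Source": "ROCK", "Title/Description": "Memo on pond sampling", "Keywords": "ponds"},
    {"Data Source": "PFC", "Title/Description": "N/A", "Keywords": ""},
    {"Data Source": "ROCK", "Title/Description": "Groundwater map", "Keywords": None},
]
completeness = field_completeness(records, ["Data Source", "Title/Description", "Keywords"])
sources = records_per_source(records)
```

A stricter variant could treat "N/A" as empty, which would lower the reported completeness of fields such as Title/Description.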
D.2.2 STEWARDSHIP DATA IDENTIFIED
As discussed above, stewardship data was identified based on the keyword queries. The number of records selected for each of the functional areas, based on the keyword queries, is shown in Table D-9.
| Total: | |||||||
| Percent: |
As can be seen in the table, about 25 percent (100,317) of the records were selected by the queries. The Barriers/Buffers functional area selected the vast majority of the records identified as having potential stewardship value. Notably, over 75 percent of the records selected by the queries came from just three (EMF, PFC) of the 30 databases consolidated in the ERD database.
To determine the likelihood that the records selected in Table D-9 contain stewardship data, the functional area experts applied the triage logic (discussed in Section D.1.3). Figure D-3 shows the number of selections for each record in the database. As can be seen, most records were not selected. Of those selected, most were selected by only one functional area. This suggests that the functional area queries were narrowly focused on the individual needs of each subject area. It may also indicate the potential to effectively reduce the amount of data archived for sites by applying specific criteria.
If the triage decision logic presented above was applied to the data in Table D-9, then:
The results of the 14 individual Barriers/Buffers sub-topic queries are presented in Table D-10. The number of times individual records were selected by multiple Barriers/Buffers sub-topic queries is shown in Figure D-4.
| Electronic DB | |||||
| Soil | |||||
| Surface Water | |||||
| Hydrology | |||||
| Geology | |||||
| Landfill | |||||
| Regulatory Acts | |||||
| Total No. of Records in ERD Database |
If the triage decision logic presented above is applied to Figure D-4, then:
In both analyses, the triage decision logic demonstrates the potential to substantially reduce the volume of data to be placed in a stewardship archive.
D.3 Pilot Study Conclusions
As a result of this pilot study, the following conclusions can be drawn:
This pilot study indicated that a 75%-79% reduction in the volume of records required by a stewardship archive can be reasonably achieved by screening existing information archives. The reduction potential is expected to be increased with the refinement of selection criteria, the introduction of document pedigree criteria, and the enhancement of archival metadata standards.
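The lower end of this range appears consistent with the record counts reported in Section D.2.2, as the short check below shows. It uses only the two totals stated in the text; the variable names are illustrative. The upper figure is not recomputed here because the inputs it depends on are not reproduced in this appendix.

```python
TOTAL_RECORDS = 406_060     # records in the ERD (Section D.2.1)
SELECTED_RECORDS = 100_317  # records selected by the keyword queries (Section D.2.2)

selected_share = SELECTED_RECORDS / TOTAL_RECORDS
reduction = 1 - selected_share

print(f"selected: {selected_share:.1%}, reduction: {reduction:.1%}")
# prints "selected: 24.7%, reduction: 75.3%"
```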
This pilot study focused almost exclusively on document content criteria, under the hypothesis that content would allow for effective screening of information of value for stewardship. While this method was useful for developing meaningful database searches, it was not sufficient to distinguish among duplicative or similar information. For example, advising sites to archive all groundwater maps (content criteria) might still result in an unwieldy and less than useful set of information for a particular stewardship function. Far more useful in diagnostic screening would be the so-called pedigree criteria used in conjunction with the content criteria, including:
Vintage (did it cover the period of interest?)
Currency (was it the most recent edition of the work?)
Stature in decision making process (had it been used for site decision making, such as a federal facility agreement?)
Administrative pedigree (had it received the necessary reviews for release of information?).
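One way to picture content and pedigree criteria working together is the Python sketch below. Everything in it is hypothetical: the `Pedigree` attributes are invented stand-ins for the four criteria listed above, and requiring all four to hold is an assumption made for illustration, since the pilot study did not define how the criteria would be combined.

```python
from dataclasses import dataclass


@dataclass
class Pedigree:
    # Hypothetical metadata flags, one per pedigree criterion listed above.
    covers_period_of_interest: bool  # vintage
    is_latest_edition: bool          # currency
    used_in_site_decision: bool      # stature in decision making process
    cleared_for_release: bool        # administrative pedigree


def passes_screen(content_match: bool, pedigree: Pedigree) -> bool:
    """Keep a record only if it meets the content criteria and every pedigree criterion.

    Requiring all four pedigree criteria is an illustrative assumption.
    """
    return content_match and all((
        pedigree.covers_period_of_interest,
        pedigree.is_latest_edition,
        pedigree.used_in_site_decision,
        pedigree.cleared_for_release,
    ))


# Two hypothetical groundwater maps that both satisfy the content criteria.
current_map = Pedigree(True, True, True, True)
superseded_map = Pedigree(True, False, True, True)

keep_current = passes_screen(True, current_map)
keep_superseded = passes_screen(True, superseded_map)
```

In this example both maps match on content, but only the current edition survives the screen, which is the kind of discrimination between similar records that content criteria alone could not provide.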
The diagnosticity of triage screening is directly correlated with the quality and consistency of the information contained in the database. Of the 15 fields in the ERD (defined in Table D-4), many were poorly populated. Of the populated fields, the only field that proved useful in the pilot study was Title. In practice, this field held a combined title and abbreviated abstract for each record, and its quality varied widely. Some records contained detailed abstracts outlining specific contents, while many other entries in the Title field were of little or undeterminable value.
Of some value were the Keywords and Comments fields, because they often contained useful terms to search on. They also provided information regarding the record pedigree (review information, etc.). The remaining fields were of little value.
The ERD index did not include any of the pedigree information (metadata) that could potentially sharpen the resolution of stewardship triage.