The Data Management Life Cycle

The data life cycle organizes and illustrates the elements of data management. Our data are corporate assets with value beyond our immediate need and should be managed throughout the entire data life cycle.

The stages of the life cycle are:

  • Plan
  • Acquire
  • Maintain
  • Access
  • Evaluate
  • Archive
  • QA/QC

Plan for Success

Planning for a project involves making decisions about data management, potential products, and data stewardship roles and responsibilities. It is important to document all stages of the data management life cycle, including quality control, before beginning a new project.

Data Management Plans

What is a Data Management Plan?

A data management plan is a document that describes how a project’s data will be handled. The plan describes what data will be acquired; how the data will be managed, described, and stored; what standards will be used; and more. The goal of a data management plan is to consider the many aspects of the data management life cycle to ensure the data are well managed in the present and prepared for preservation in the future.

By laying out the blueprint for managing Service data throughout its life cycle, a data management plan provides valuable details, such as how the Service's data will be preserved for the long term and how data will be available for sharing.

Data management plans help the Service:

  • Increase the visibility, reproducibility, and validity of research projects because data, including approach and methodology, are well documented.
  • Reduce unnecessary duplication of data collection or procurement.
  • Ensure data and data products are accessible and available for the long term.
  • Initiate the process of gathering metadata and documentation throughout the project life cycle.

Data Management Plan vs. Project Documentation

Data management plans focus on the data-related aspects of a project and work together with other documentation (for example, project proposals, project plans, standard operating procedures, metadata, and reports) to ensure data are well managed.

What are the components of a data management plan?

For a project occurring over a long time period or involving many staff, it is important to formally document a data management plan. 

Successful data management plans address:

  • Collection Methods and Acquisition Source
  • Data Processing and Workflows
  • Quality Assurance and Quality Control
  • Formats
  • Data History
  • Metadata
  • Backup and Security
  • Access and Sharing
  • Repository
  • Archive


Acquire

Data Acquisition Methods

There are four methods of acquiring data: collecting new data, converting and transforming legacy data, sharing and exchanging data, and purchasing data. Before data are acquired by the Service, a data management plan must be in place that includes information about analysis, definitions, and standards. 

Common Data Acquisition Considerations

  • Needs: The first thing to always consider is the business need - why are these data required? What will be done with them?
  • Rules: A rule identifies the constraints under which the business operates. For instance, where applicable, all geospatial data must have Federal Geographic Data Committee (FGDC) compliant metadata. These rules will affect your data acquisition decisions.
  • Standards: Any Government, FWS, or industry standards that apply will need consideration.
  • Accuracy: Among the most familiar accuracy requirements is the locational accuracy for spatial data; but there are other accuracy requirements that you may need to consider as well.
  • Cost: Cost is always a consideration. Sometimes it's cheaper to buy than to collect.
  • Age: For many types of work, the data need to be current. For others, data may need to cover a specified time period or season. If you are trying to determine vegetation coverage, for example, you may want photographs from the summer, when vegetation is at its highest. If you are looking for landforms, you may want winter photos.
  • Timeliness: You should determine how soon you need the data.
  • Format: Do you need the data as spatial data, photos, flat files, Excel files, or XML files? Data should also comply with the Open Data Act by being machine-readable and shareable.

Data Collection

In the FWS, most data are collected by employees or volunteers. While FWS can obtain some data from outside sources, the bulk of the Service's data are collected, created, and maintained by our staff.

Data must be reviewed and updated on a regular schedule to maintain a high standard of quality. Metadata must also be updated at the same time. Managers need to be confident that they have the best possible data available when making decisions. 

Service Data Collection Champions

  • Environmental Conservation Online System (ECOS): develops systems to maintain standardized Service data for the Ecological Services Program, Fish and Aquatic Conservation Program, and the National Wildlife Refuge System.
  • Migratory Bird Program: monitors bird populations and develops and implements a variety of activities designed to inform bird conservation policies and initiatives.
  • National Wildlife Refuge System Inventory & Monitoring: uses surveys to collect data and assess the status and trends of refuge lands, waters, plants and wildlife, as well as their responses to management actions.
  • Fish and Aquatic Conservation Fish Technology Centers: use data to steer fisheries conservation practices and improve conservation techniques and methods.
  • Forensics Laboratory: examines, identifies, and compares evidence using a wide range of scientific procedures and instruments to link suspect, victim, and crime scene through physical evidence in crimes against wildlife.
  • Public Permits: The Service issues permits under various wildlife laws and treaties through offices at the national, regional, and wildlife-port levels; these permits promote conservation efforts by authorizing data collection and scientific research.

Data Conversion and Transformation

The Service has been collecting data to support management decisions for well over a century and holds a significant amount of legacy data that could inform decisions and support our mission. Legacy data often cannot be used readily because they are not in an accessible format, such as old paper data sheets or outdated storage technology.

Legacy Data Considerations:

  • Technical Issues: Is the storage medium readable? Can the data be converted into a usable format? At what cost?
  • Provenance: How were the data collected? What is known about the survey and its methodology?
  • Quality: Are the data of sufficient quality to meet Service needs?
  • Access: Can the data be easily obtained, shared, and processed to answer our management needs?

Data Sharing Agreements

Data sharing agreements are formal contracts that detail what data are being shared and the appropriate use for the data.  A Memorandum of Understanding (MOU), which is one type of formal sharing agreement, is required for data acquisition and sharing with other government agencies. 

Purpose of Data Sharing Agreements

Data sharing agreements promote early communication among agencies about questions of data handling and use.  These data sharing agreements are critical for reducing duplication of effort and ensuring the data the Service acquires through its contracts are delivered.

Example Memorandums of Understanding

Contact the Service’s Regional solicitors for more information on formal contracts and memorandums of understanding.

Purchase

The FWS may at times need to contract out data collection. Best practices for data acquisition should focus on the quality of data and documentation and may include:

  • Purchase Agreements: Data purchases require a Purchasing Agreement. By purchasing data, you are endorsing the data, and such data then become subject to the Information Quality Act.
  • Data Certification: Metadata are required for purchased data and should be specified in the Purchasing Agreement.
  • Licensing Issues: What restrictions are placed upon the use of the data? Are there Privacy Act or FOIA considerations?

Acquisition Security Requirements

It is important to protect data from harm and ensure information security policies are followed throughout the acquisition process.


Data Maintenance

Data maintenance includes processing data for analysis, creating metadata, and making sure data are in a format that others can access in the future. Maintenance matters to the Service because unmaintained data lack the information future managers need to analyze or interpret them. In addition, if the format of the data is not kept current, those data could be lost because we cannot convert them from an outdated version to a machine or software that can read them. Data maintained by Service employees are often used to make significant decisions about wildlife or habitat management, so it is imperative that these data are of high quality, can be understood and used by others, and are stored securely.

Processing newly collected or acquired data involves organizing files (e.g., combining or renaming), checking and correcting errors and inconsistencies, creating additional variables and/or intermediate data products, and loading data into databases. These steps provide quality assurance and control and enhance the utility of the data. For example, a Service employee may process data by downloading telemetry data from a receiver and converting the files to an Excel file that can be analyzed using statistical software.
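A minimal sketch of such a processing step in Python, producing a CSV file (which Excel and most statistical software read); the directory, file layout, and column names (tag_id, timestamp) are illustrative, not a Service standard:

    import csv
    from pathlib import Path

    def process_telemetry(raw_dir: str, out_file: str) -> None:
        """Combine raw tab-delimited receiver downloads into one
        analysis-ready CSV, dropping obviously bad records.
        Assumes every download shares the same column layout."""
        rows = []
        for raw in sorted(Path(raw_dir).glob("*.txt")):
            with raw.open(newline="") as fh:
                for rec in csv.DictReader(fh, delimiter="\t"):
                    # Quality check: skip records missing a tag ID or timestamp.
                    if rec.get("tag_id") and rec.get("timestamp"):
                        rec["source_file"] = raw.name  # keep provenance
                        rows.append(rec)
        if rows:
            with open(out_file, "w", newline="") as fh:
                writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
                writer.writeheader()
                writer.writerows(rows)

    process_telemetry("raw_downloads", "telemetry_combined.csv")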

Creating metadata and completing documentation of data and data products ensures that other users know how the data were collected, what the variables represent, and how they should be used. For example, GPS coordinates collected for a vegetation survey on a refuge need documentation of the projection and coordinate system so that the coordinates can be placed on a map or used for analysis.
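As a minimal illustration, a small machine-readable "sidecar" record like the one below can capture these essentials while full, standards-compliant metadata (for example, FGDC-compliant metadata for geospatial data) are prepared; all of the field names and values here are hypothetical:

    import json
    from datetime import date

    # Hypothetical minimal metadata record for a set of survey coordinates.
    metadata = {
        "title": "Vegetation survey plot locations",
        "collected_on": str(date(2023, 6, 15)),
        "collected_by": "Refuge biological staff",
        "method": "Handheld GPS, averaged fixes per plot",
        "coordinate_system": "Geographic (latitude/longitude)",
        "datum": "NAD83",
        "crs_code": "EPSG:4269",  # machine-readable equivalent of the datum/system above
    }

    # Store the record as a sidecar file next to the data it describes.
    with open("veg_survey_points_metadata.json", "w") as fh:
        json.dump(metadata, fh, indent=2)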

Uploading data to a repository or data sharing site, or otherwise publishing their availability, is a crucial step in making data available to other users. FWS employees can upload data records to ServCat, a data sharing site, which publishes these records to www.data.gov. The Service has over 40,000 datasets uploaded to www.data.gov, ranging from wildlife and vegetation surveys on refuges to locations of National Wildlife Refuge campgrounds. The Service is currently developing a new data release policy and evaluating other options for sharing and publishing data.

Even after these steps have been completed, there will be a need for ongoing maintenance: updating data as new information is obtained, correcting errors discovered later, developing new data products and documents, and migrating data to new systems or formats as needed.

Federal agency employees may have additional data maintenance responsibilities under other laws or directives, such as establishing retention schedules for data that are included in a System of Records, ensuring the safety and security of personally identifiable information (PII), or holding data requested under FOIA or for litigation.

Data Stewardship

Stewardship is the careful and responsible management of something entrusted to one's care on behalf of others. All Service employees have a responsibility to manage scientific data effectively and transparently. 

Who is a Data Steward?

A data steward, or data manager, is a person responsible for overseeing the life cycle activities of a project.  All Service employees need to play a data stewardship role.

Data Steward Roles

Data stewardship is the job of professionals who create and maintain data. The Service cannot accomplish data management without people taking on the role of data stewardship at all levels of the organization. Service staff should embrace data steward roles and responsibilities. People with knowledge about the needs of the organization are necessary at all levels to define and manage data content and quality to ensure that the data collected and maintained meet those needs.

Many of the responsibilities of Data Stewards are the same, regardless of where the person falls within the organization.

Data Steward Responsibilities

  • Be active advocates of data management
    • Endorse good data management practices, use them, and share them.
  • Participate in the data management team 
    • Data management is going on all around you. Work with others to help ensure everyone's efforts meet standards and are of high quality.
    • Working together also improves efficiency and minimizes the chance that data efforts are duplicated.
  • Ensure that information meets Service needs
    • Are data in a format that is readable and understandable?
    • Is there current documentation on the data such as when they were collected, where, how, by whom, and under what conditions?
  • Be accountable for the integrity and quality of data created and updated
    • Data stewards are responsible for establishing requirements and assessing the quality of the data
    • Can the data be relied on to be correct?
  • Ensure data documentation is developed and maintained 
    • Metadata, defined as "data about data," describe the content, quality, condition, and other characteristics of data
    • Metadata should be collected from the beginning of the data collection process
    • The data management plan can also include information to be used in metadata and other process descriptions
    • Standard Operating Procedures (SOPs) are also a good way to ensure data stewardship responsibilities are met
  • Establish data access security requirements 
  • Ensure official agency records requirements are being met 
    • The National Archives and Records Administration (NARA) rules regulate the disposal of all types of records
    • Always involve your Records Manager and Administrator early in the data collection planning process


Access Data

The ability to prepare, release, share, and disseminate quality data is an important part of the life cycle process. The Service shares data with many partners and the public.

Repositories

Making publicly funded government data readily available to the public is the responsibility of all Service employees. Data repositories are centralized places to store and maintain data. These repositories make data readily accessible within the Service as well as to our partners and the public.

There are many different types of repositories. A repository can consist of one or more databases or files which can be distributed over a network, a collection of physical data, or a reference system for discovering data.

ServCat

The Service Catalog (ServCat) application is a valuable resource for preserving and discovering a variety of the Service's data and information. Since December 2011, ServCat has been used as a master repository to manage all types of information used to inform resource management decisions. Today, a number of applications from the Environmental Conservation Online System (ECOS), which serves a variety of reports related to FWS threatened and endangered species, are beginning to integrate with ServCat, and multiple programs and branches have adopted it as their repository.

See the About ServCat help documentation for more information.

Data.gov

Data.gov is the home of the U.S. Government’s open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more.

You can search Data.gov's catalog of data from across the Federal Government. Once in the catalog, you can find datasets in several ways:

  • Enter keywords in the search box.
  • Browse by type, tag, format, group, organization type, organization, and category.
  • Search by geospatial area.

GeoPlatform

GeoPlatform.gov acts as a one-stop shop for data that can also be leveraged through web services and applications. With over 160,000 datasets already registered in the Data Catalog – and planned data acquisitions in the Marketplace – GeoPlatform provides access to all your geospatial data needs.

  • Share geospatial resources
  • Search datasets
  • Exploit data with open application services

ScienceBase

The U.S. Geological Survey ScienceBase is a collaborative scientific data and information management platform used directly by science teams. ScienceBase provides access to aggregated information derived from many data and information domains, including feeds from existing data systems, metadata catalogs, and scientists contributing new and original content. ScienceBase architecture is designed to help science teams and data practitioners centralize their data and information resources to create a foundation needed for their work. ScienceBase, both original software and engineered components, is released as an open source project to promote involvement from the larger scientific programming community both inside and outside the USGS.

Key elements include:

  • Data cataloging and collaborative data management platform
  • Central search and discovery application
  • Web services facilitating other applications
  • Research community catalogs

Publish

Publication of quality scientific data, as stand-alone products or in conjunction with scholarly articles, is integral to strengthening the Service's tradition of scientific excellence. To ensure the quality and credibility of the scientific information we use to make decisions, the Service has implemented policies and processes related to scientific integrity (Service policy 212 FW 7; Department policy 305 DM 3).

Scientific Journals

The Service welcomes submissions to its online peer-reviewed publications focused on the practical application and integration of applied science to wildlife conservation and management — the Journal of Fish and Wildlife Management and the revitalized North American Fauna monograph series.

Peer Review

In order to ensure the quality and credibility of the scientific information we use to make decisions, the Fish and Wildlife Service has implemented a formal "peer review" process for influential scientific documents.

FWS peer review process

Publication Policy

The Service does not review publications for policy implications, and employees are required to include a disclaimer on their publications.

FWS Policy Review Guidance for Scientific Publications

How/Tools

Science Excellence:  The Service’s Science Applications program works to coordinate internal and partner efforts developing and applying science for conservation outcomes by ensuring science products are high quality, non-duplicative, and accessible to fish and wildlife managers and decision makers. Science Applications has responsibility for leading Service efforts in Landscape Conservation Cooperatives, Information Quality and Scientific Integrity, and Climate Change Adaptation.

  • Scientific Journals: The Journal of Fish and Wildlife Management and North American Fauna provide a mechanism for rigorous peer review, professional publication and wide dissemination of these types of scientific data and analyses along with more traditional applied conservation studies.
  • Information Quality: The Fish and Wildlife Service is committed to using sound science in its decision-making and to providing the American public with information of the highest quality possible.
  • Peer Review: To ensure the quality and credibility of the scientific information we distribute, the Service has a peer review process. This includes a peer review checklist and an example Statement of Work for contractors performing peer reviews.
  • Service Ethics: The Service has implemented standards of conduct and guidelines that are related to scientific integrity and carrying out the mission of the Service.
  • Data Standards: The Service uses data standards to increase the quality and compatibility of its data. This approach will increase opportunities to share data and reduce incidents of redundant data development. There is a formal process for developing data standards.

Backup and Secure

It is important to protect data from accidental data loss, corruption, and unauthorized access. This includes routinely making additional copies of data files or databases that can be used to restore the original data or for recovery of earlier instances of the data. Making backups of scientific data is critically important in data management. Backups protect against human errors, hardware failure, virus attacks, power failure, and natural disasters. Backups can help save time and money if these failures occur.

Securing Your Data

Physical security and computer security of data must both be considered in good data management. While making data available to the public is encouraged, some data are considered confidential or sensitive and must be kept secure. Make sure access to data is given only to those who require it.
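As a minimal sketch of restricting access on a shared system (the file name is hypothetical, and this POSIX-style example is only one piece of access control; on Service networks, permissions are typically managed through managed groups and shares):

    import os
    import stat

    # Restrict a sensitive data file so only the file's owner can read or write it.
    os.chmod("sensitive_survey_locations.csv", stat.S_IRUSR | stat.S_IWUSR)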

Data Backup Best Practices

  • Understand the existing backup policies within your office or branch.
    • Backups should be part of the data management plan.
  • If there is no established policy, document backup standard operating procedures:
    • Clarify responsibilities
    • Specify backup location
    • Establish access rules
    • Define timelines for backup
    • Describe data interoperability and format
  • Back up digital data and digitize physical documents.
  • Automate your backups (a minimal scripted example is sketched after this list).
  • Back up the metadata along with the data.
  • Locating the backup data:
    • Back up to a designated repository, an external disk, or a network drive.
    • Place backups in a location different from the original data source to avoid a double loss.
  • Checking backups:
    • Test the backup process to ensure you can retrieve data.
    • Use Quality Assurance/Quality Controls to ensure data quality.
  • Determine how long to keep your backup.
    • This will depend upon requirements and needs.  
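As a minimal sketch of an automatable backup step in Python (the source file, the .json sidecar naming convention, and the backup-server path are placeholders; scheduling would be handled separately, for example with cron or Windows Task Scheduler):

    import hashlib
    import shutil
    from datetime import datetime
    from pathlib import Path

    def sha256(path: Path) -> str:
        """Checksum used to verify that a copy matches its original."""
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            for chunk in iter(lambda: fh.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def backup(source: str, backup_root: str) -> Path:
        """Copy a data file (and its metadata sidecar, if present) to a
        timestamped folder on a different drive or network share."""
        src = Path(source)
        dest_dir = Path(backup_root) / datetime.now().strftime("%Y%m%d_%H%M%S")
        dest_dir.mkdir(parents=True, exist_ok=True)
        for f in (src, src.with_suffix(".json")):  # back up the metadata too
            if f.exists():
                copy = shutil.copy2(f, dest_dir / f.name)
                # Test the backup: confirm the copy is readable and identical.
                assert sha256(Path(copy)) == sha256(f), f"Backup of {f} failed verification"
        return dest_dir

    backup("telemetry_combined.csv", "//backup-server/project_x")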

Data Security Best Practices

  • Share metadata but keep confidential or sensitive information unavailable.
  • Make sure to follow data encryption policies.
  • Comply with all Service computer security policies, especially those for patching and network access.
  • Make sure data are physically protected in a locked drawer or on a secure network.


Evaluate

Evaluate represents activities associated with analyzing and using data. Important goals are maximizing accuracy and productivity, while minimizing costs.  Workflows should be developed that are efficient, well-documented, and scripted where possible. Reproducible methods that simplify review and re-analysis will help achieve these goals.

Process

Data processing covers any set of structured activities resulting in the alteration or integration of data. Data processing can result in data ready for analysis, or generate output such as graphs and summary reports. Documenting the steps for how data are processed is essential for reproducibility and improves transparency.
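One lightweight way to document processing steps is to have each step write to a log as it runs; a minimal sketch in Python, with a hypothetical step that drops incomplete records:

    import logging

    logging.basicConfig(filename="processing_log.txt", level=logging.INFO,
                        format="%(asctime)s %(message)s")

    def drop_incomplete_rows(rows):
        """Processing step: remove records with missing values, and log
        what was done so the workflow can be reviewed and reproduced."""
        kept = [r for r in rows if all(v is not None for v in r.values())]
        logging.info("drop_incomplete_rows: kept %d of %d records",
                     len(kept), len(rows))
        return kept

    raw = [{"site": "A", "count": 4}, {"site": "B", "count": None}]
    clean = drop_incomplete_rows(raw)  # log records: "kept 1 of 2 records"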

Analyze

The Service analyzes the data we collect to help answer management questions and determine best practices, using many tools and techniques to address questions of conservation concern.


Data Archiving

Data archiving is a process that supports the long-term storage of scientific data and the methods used to read or interpret archived data.

Preservation

Preservation involves actions and procedures to keep data for some period of time and/or to set data aside for future use. 

An archive is a collection of historical records, no longer in active use, that are kept for long-term retention and future reference.

A repository is a centralized place to store and maintain data, such as data.gov.

Archive vs. Repository

The terms "archive" and "repository" are very similar and are sometimes used interchangeably; in the Federal Government, however, the term "archive" has a special meaning tied to the mission and activities of the National Archives and Records Administration (NARA). The National Archives serves as the nation's record keeper and ensures that valuable government records are available to the public.

Archives vs. Backups

Archives are created for long-term storage of historically important data that are no longer needed for immediate access. Backups are created to restore data and continue operations in case of disasters (e.g., deletion, catastrophic equipment loss).



The official record keeper of the Federal Government is the National Archives of the United States. The Electronic Records Archives (ERA), now being developed, supports the preservation of, and access to, the permanently valuable electronic records of the Federal Government. Find out more at FWS NARA Records.

The Service is responsible for complying with federal records management policies and sending electronic records to the National Archives.


Quality Assurance / Quality Control

Ensuring the Quality and Credibility of Information

Data quality management is the prevention of data defects or issues within a dataset that reduce our ability to apply data towards our science-based conservation efforts. There are two components of data quality management that, though often lumped together, represent different concepts:

  • Quality Assurance (QA) – Implementing processes that prevent data defects from occurring. For example, writing a detailed protocol for a long-term survey so the methodology is maintained as new staff come on board.
  • Quality Control (QC) – Detecting and repairing defects once you have the data. For example, noticing a negative value in a count field may indicate a data-entry error, which might be fixed by reviewing a field data sheet.

Potential defects may occur at any stage of the life cycle, including incorrectly entered data, invalid data, and missing or lost data. For this reason, prevention (QA) and detection (QC) of defects are equally important to ensure the quality of your data.
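To make the QC example above concrete, here is a minimal sketch in Python; the file name and the count column are hypothetical:

    import csv

    def flag_count_errors(data_file):
        """Flag rows whose 'count' value is negative or non-numeric --
        likely data-entry errors to check against the field data sheets."""
        flagged = []
        with open(data_file, newline="") as fh:
            # start=2 because line 1 of the file is the header row
            for line_no, row in enumerate(csv.DictReader(fh), start=2):
                try:
                    if int(row["count"]) < 0:
                        flagged.append((line_no, row))
                except (TypeError, ValueError):
                    flagged.append((line_no, row))
        return flagged

    for line_no, row in flag_count_errors("survey_counts.csv"):
        print(f"Line {line_no} needs review: {row}")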

What projects should consider data quality management?

These terms will be very familiar to FWS's laboratory staff, who may, for example, develop a Quality Assurance Plan to prevent cross-contamination of DNA and run various controls to ensure an assay has run properly.

Though not often discussed in wildlife or habitat work, the same fundamental QA and QC principles apply to field data collection. FWS employees are ultimately responsible for the quality of the data they collect, manage, and use in their day-to-day work.

What does data quality management look like?

This will vary by project. In most projects, basic QC functions such as proofing and reviewing data before analysis will be sufficient. In some cases, policy, funding sources, or scientific norms may set clear expectations, up to and including a full Quality Assurance Plan. For influential or highly influential work, the FWS implementation of the Information Quality Act lays out cases where peer review of our data is required. Factors that increase the need for more formal procedures include longer projects, the information needs of the decisions being made, policy requirements, and the chance of legal action.

What are some best practices?

Develop a data assessment strategy

  • QA - Observer training and testing
  • QA - Schedule data-quality reviews at important points in your workflow
  • QC - Maintain data-quality metadata and documentation
  • QA - Track data changes and implement a versioning scheme for your data
  • QC - Periodically run test data through all processing scripts to verify expected functionality
  • QC - Compare new data to historical values (a minimal sketch combining this check with summary statistics appears after this list)
  • QC - Plot spatial data on a map to verify locations
  • QC - Calculate summary statistics for data or display data using common graphs such as box plots to evaluate for possible anomalies.
  • QC - Review field notes for unusual occurrences or events that may help explain data anomalies
  • QA - Use data quality indicators, or at least comment fields, to qualify data anomalies
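A minimal sketch of the historical-comparison and summary-statistics checks, using Python's standard library (the cutoff and the example values are illustrative):

    import statistics

    def screen_outliers(new_values, historical_values, z_cutoff=3.0):
        """Flag new observations more than z_cutoff standard deviations
        from the historical mean for manual review."""
        mean = statistics.mean(historical_values)
        stdev = statistics.stdev(historical_values)
        return [v for v in new_values if abs(v - mean) / stdev > z_cutoff]

    historical = [12, 15, 14, 13, 16, 15, 14, 13, 12, 15]  # prior years' counts
    new = [14, 13, 41, 15]                                 # this year's counts

    print("Review these values:", screen_outliers(new, historical))
    # Flags 41, which is far outside the historical range.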

What is the Service doing to ensure data quality management?

Plan

  • QA - Plan formal QA and QC for all of the steps, and make process checklists.
  • QA - Given the question you are addressing, consider the information you need and the level of precision and accuracy required to answer it.

Acquire

  • QA - Properly train and test your field technicians on data collection techniques.
  • QA - Calibrate instruments and determine the precision and accuracy that will be acceptable.
  • QC - Transcribe and check data in a timely fashion to help catch errors while the field work is still fresh in your mind.

Maintain

  • QA: Schedule backups to protect against file deletion or corruption.
  • QC: Check data to make sure people are using the correct units.
  • QC: Examine data products to make sure data are entered into data tables correctly and are formatted properly.

Access

  • QA: Establish procedures for sharing data; query and format data properly for the person who will use them.
  • QA: Provide documentation and metadata to people using the data so that they can interpret them properly.
  • QA: Determine who can access the data.

Evaluate

  • QA - Produce preliminary reports to make sure things are going well.
  • QA - Define the analytic decision-making process prior to analyzing any data.
  • QC - Identify and examine outliers.
  • QC - With analytical approaches defined in advance, check for violations of assumptions.

Archive

  • QA: Establish policies for how long you will maintain the data, who owns them, and where they will be stored during your project.
  • QA: Decide where electronic data, paper datasheets, and other materials will be stored.
  • QA: Prevent data loss.
  • QA: Decide where the right people will be able to find the data in the future, what will be saved (data and physical samples), and how the connection between the two will be maintained.
