| 4. | DATA MANAGEMENT AND EXCHANGE MECHANISMS |
| 4.1 | WORLD WEATHER WATCH (WWW) |
| 4.1.1 | Introduction |
4.1.1.1 The objective of data management is to improve access to, and the usability of, data. One role of data management in the WWW environment is to solve the interface issues which arise as data originating from the GOS move via the GTS to data processing centres, and as information originating from the data processing centres moves via the GTS to users. Another role is to address the interface issues which arise when data and products are exchanged with groups outside of the WWW community. The terms of reference of the CBS Working Group on Data Management (WGDM) stress the integrative role of Data Management in reviewing the performance of the components of the WWW, its responsibilities for data representation and codes issues, its role in the quality control of data exchanged via the GTS, and its role in ensuring that the proper coordination occurs as interfaces are built between the various components of the WWW, and between elements of the basic system and those of other WMO programmes. It aims to improve the efficiency of data activities, thus enabling other WMO programmes to benefit from the WWW basic systems in support of their operational requirements.
| 4.1.2 | Data management coordination |
4.1.2.1 Over the past few years there has been increased emphasis on service to and coordination with other WMO and related international programmes. Therefore, CBS expanded its terms of reference to ensure that the Basic Systems respond to the data management requirements of all WMO and related international programmes. The Twelfth WMO Congress requested CBS to coordinate the preparation of a WMO Plan for Data Management. WDM-sponsored inter-programme data management coordination meetings provide a forum to coordinate the data management activities of all WMO technical commissions and related international programmes. These meetings have led to significant progress and to a methodology for developing an integrated WMO-wide data management plan.
4.1.2.2 Among the tasks of the CBS Working Group on Data Management are coordinating data management across all WMO programmes and leading the preparation of a WMO Guide to Data Management. A draft outline of the guide has been provided to the Presidents of the other technical commissions, seeking comment and inviting them to nominate experts who could contribute to the guide.
| 4.1.3 | WMO distributed databases |
4.1.3.1 The development of distributed databases within the WMO systems is being undertaken to expand the ability of all Members to make ad-hoc requests for data held in databases of Members who are willing to share specified data. In 1992, the tenth session of CBS renamed the WWW Distributed Databases concept WMO Distributed Databases (DDBs) to reflect more accurately its goal of providing data and information that are needed by WMO and related international programmes but are not routinely exchanged on the GTS. A trial of the DDBs concept began in October 1995, and as part of the trial the Secretariat has made the information contained in WMO Pub. 9, Vol. A and C and Publication 47 available via FTP. The trial will include reviews and adjustments to procedures every six months and will be expanded to the GTS once the Main Telecommunications Network (MTN) has been upgraded to support TCP/IP communications protocols.
| 4.1.4 | Software exchange |
4.1.4.1 WDM facilitates the exchange of software between Members through the CBS Software Registry which provides information on software available from and requested by Members. The latest version of the Registry was distributed in late and a digital version of the registry has been available via the World Wide Web since late 1995.
| 4.1.5 | Representation forms |
4.1.5.1 Agreed-upon codes and data representation forms are among the most fundamental requirements for efficient international exchange of meteorological data and products. The WGDM and its subgroup on data representation and codes are responsible for the development and maintenance of these code forms. Over the past 10 years the increasing automation of data exchange has led to the development of binary table-driven data representation forms such as BUFR and GRIB. These data forms allow much more flexible and efficient transfer and computer processing but are not directly readable by humans. Their flexibility has proven so important, however, that a new character-based table-driven code, CREX, has been developed and is now being used on a trial basis. The WWW supports and maintains a number of existing codes for oceanographic data exchange in character form (BUOY, BATHY, TESAC, TRACKOB, WAVEOB). A separate section has been included in BUFR exclusively for oceanographic data, maintained by IOC/IODE.
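The idea behind a table-driven representation can be sketched in a few lines. The example below is only an illustration of the principle: the table entries, descriptor names (`TEMP`, `PSAL`) and field widths are hypothetical and are not taken from any WMO code table. Each element is described once in a shared table (scale, reference value, bit width), so a message needs to carry only packed integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    scale: int      # power of ten applied before rounding
    reference: int  # integer offset, so negative values can be stored
    width: int      # field width in bits


# Hypothetical table for illustration only; not an actual WMO code table.
TABLE = {
    "TEMP": Element("sea_temperature_c", scale=2, reference=-500, width=12),
    "PSAL": Element("salinity", scale=2, reference=0, width=13),
}


def encode(descriptors, values):
    """Pack values into a bit string using the shared table."""
    bits = ""
    for key, value in zip(descriptors, values):
        element = TABLE[key]
        raw = round(value * 10 ** element.scale) - element.reference
        if not 0 <= raw < 2 ** element.width:
            raise ValueError(f"{key}={value} does not fit in {element.width} bits")
        bits += format(raw, f"0{element.width}b")
    return bits


def decode(descriptors, bits):
    """Recover the values from a bit string using the same table."""
    values = {}
    position = 0
    for key in descriptors:
        element = TABLE[key]
        raw = int(bits[position:position + element.width], 2)
        position += element.width
        values[element.name] = (raw + element.reference) / 10 ** element.scale
    return values


message = encode(["TEMP", "PSAL"], [12.34, 35.01])
decoded = decode(["TEMP", "PSAL"], message)
```

The packed message is compact and easy for software to process, but it is unreadable without the table, which is the trade-off described above.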
| 4.1.6 | Monitoring |
4.1.6.1 WDM is responsible for the monitoring of the operation of the WWW. It develops procedures for monitoring the quality and quantity of the data and products exchanged but these procedures are carried out by the other WWW components. The monitoring routinely includes oceanographic data exchange on the GTS in the existing oceanographic codes.
| 4.1.7 | Training and guidance material |
4.1.7.1 The WGDM compiled the Guide on WWW Data Management with an additional WWW Technical Document: "Guide to WMO Binary Code Forms" which provides an in-depth tutorial of GRIB and BUFR. Following the request of CBS-X, the Working Group on Data Management has initiated actions to coordinate development of a WMO-wide guide on data management.
4.1.7.2 Regional training seminars on WWW data management are held once every one or two years and provide up-to-date information on data management issues to WMO Members. These seminars cover all aspects of data management but often devote a considerable portion of their time to data representation and codes issues as these are often the most difficult to understand.
| 4.2 | IOC COMMITTEE ON INTERNATIONAL OCEANOGRAPHIC DATA AND INFORMATION EXCHANGE |
| 4.2.1 | Status |
4.2.1.1 The IOC's Committee on International Oceanographic Data and Information Exchange (IODE) was established in 1961 by the IOC as an intergovernmental mechanism to improve the management and exchange of marine data in delayed mode. Subsequently IGOSS was established for real-time collection, exchange and processing of oceanographic data (see Section 3.3). Today, IODE consists of over 65 member countries, with more than 40 National Oceanographic Data Centres and Designated National Agencies providing data management services to their countries and assisting the global exchange of data.
| 4.2.2 | Responsibilities |
4.2.2.1 IODE was established to:
"enhance marine research, exploration, and development by facilitating the exchange of oceanographic data and information between participating Member States."
With the advance of oceanography from a science dealing mostly with local processes to one which is also studying ocean basin and global processes, researchers depend critically on the availability of an international exchange system to provide data and information from all available sources. Additionally, scientists studying local processes benefit substantially from access to data collected by other Member States in their area of interest. The success of the IODE programme depends on the support of participating Member States, and the involvement of many individual institutions and marine scientists, who contribute not only data, but also the necessary expertise to maintain and further develop the IODE system.
| 4.2.3 | Publications |
4.2.3.1 IODE produces a range of publications and other material in support of marine data management and data exchange processes. These range from Manuals and Guides on the creation of National Oceanographic Data Centres through to data quality control procedures and data exchange formats. There are other products such as the OceanPC suite of software for marine data management, analysis and display of oceanographic data as well as a 'shoe box' of data management software.
| 4.2.4 | Structure |
4.2.4.1 The IODE Committee provides the direction and coordination for the operation of the IODE program. The physical composition of IODE is a network of agencies, data centres, expert groups and specific projects that provides a framework for the management and exchange of data. This 'infrastructure' also undertakes the development of standards, introduces new technologies and undertakes a range of training and technology transfer activities.
Designated National Agencies
4.2.4.2 Some Member States that have not established National Oceanographic Data Centres have instead officially assigned the responsibility for international exchange of oceanographic data and information to some other agency within the Member State. These agencies are referred to as Designated National Agencies (DNAs). DNAs are generally smaller agencies with few resources but with an interest in the coordination of marine data management.
National Oceanographic Data Centres (NODC)
4.2.4.3 National Oceanographic Data Centres are funded agencies with an endorsed government responsibility for the management, exchange and archiving of oceanographic data in the national interest. NODCs actively exchange data within their region and with other centres within the IODE program such as the World Data Centres. An NODC acquires, processes, quality controls, inventories, archives and disseminates data in accordance with national responsibilities. In addition to disseminating data and data products nationally, NODCs are normally charged with the responsibility for conducting international exchange. Here, the most fundamental responsibility of the NODC within the IODE is to actively seek and acquire from national sources those data which are exchangeable internationally, and to process and quality control the data and submit them in a timely fashion to the appropriate WDC for Oceanography or RNODC. In return, the NODC can request and receive from the WDCs for Oceanography or RNODCs similar data or inventory information which it needs for its own requirements.
Responsible National Oceanographic Data Centres (RNODC)
4.2.4.4 Some countries operate Responsible National Oceanographic Data Centres in association with the NODCs. RNODCs assist the World Data Centres in a specific area, such as a specific type of data or data exchange format, or they may cover a specific region, such as the RNODC Southern Ocean. Existing RNODCs include:
World Data Centres (WDC)
4.2.4.5 At the top of the data exchange pyramid are the World Data Centres (WDC) for Oceanography, which form part of the network of data centres established by the International Council of Scientific Unions (ICSU). WDCs receive oceanographic data and inventories from NODCs, RNODCs, marine science organizations, and individual scientists. These data are collected and submitted voluntarily from national programmes, or arise from international cooperative ventures. On request, the WDCs provide copies of data, inventories and publications to NODCs/DNAs, to RNODCs and to international cooperative programmes, as appropriate, either in exchange or for a charge not exceeding the cost of providing the service. Another major responsibility of the WDCs for Oceanography is to monitor the performance of the international data exchange system and report their findings to the IOC Secretariat and the IODE Committee. The Committee can use this information to take appropriate action to correct deficiencies in the international exchange system.
4.2.4.6 There are currently three World Data Centres (Oceanography):
| 4.2.5 | Activities |
4.2.5.1 The IODE program undertakes a wide range of activities. Some of the more significant ones include the :
| 4.2.6 | Analysis |
4.2.6.1 Strengths: These include: funding stability; redundancies in data archives ensuring no loss of data; demonstrated successful programmes including Global Ocean Data Archaeology and Rescue (GODAR) and the GTSPP; demonstrated capability to partner scientific organizations and programmes (e.g. WOCE DACs for UOT, Sea Level, ADCP, SVP); considerable experience in managing large amounts of a wide range of data types; ability to provide services and develop products such as CD-ROMs on GTSPP, World Ocean Atlas 94, World Ocean Database 98, GEBCO, etc.; existing infrastructure in place to manage global data.
4.2.6.2 Weaknesses: These include: a large organization sometimes difficult to move; reliance on voluntary contributions of Member States; uneven levels of skills across the system.
| 4.3 | GLOBAL TEMPERATURE AND SALINITY PROFILE PROGRAMME (GTSPP) |
| 4.3.1 | Status |
4.3.1.1 The GTSPP is a joint IOC/WMO project that knits together both real-time (typically IGOSS) and delayed mode (typically IODE) data collections of global ocean temperature and salinity observations into a single programme. Participants are governmental and scientific organizations in various countries who support their contributions to GTSPP through their own budgets. It was initiated jointly by the Intergovernmental Committees for IGOSS and IODE in 1989 as a pilot project, and converted to a long-term programme in 1996.
4.3.1.2 Tasks in the GTSPP are shared amongst the participants. Real-time data processing services are provided by the Marine Environmental Data Service (MEDS) of Canada. The U.S. NODC provides data processing services for delayed mode data and maintenance of the Continuously Managed Database, CMD. AOML, CSIRO and Scripps provide scientific advice and assessment of the data handled by the project. Through cooperation with WOCE, the WOCE Subsurface Data Centre in Brest has also contributed data and expertise. Other data centres in IODE provide data to the project as they are processed. Cooperation with the GODAR Project also brings data into the GTSPP.
| 4.3.2 | Responsibilities |
4.3.2.1 Observations: The GTSPP concerns itself with temperature and salinity profiles collected from the world's oceans. Other observations made in association with the T and S profiles, such as other profiles or surface marine observations, are also carried with the data.
4.3.2.2 Services: One of the goals of the GTSPP is to provide data of the highest possible quality as quickly as possible to users. The foundation of this goal is the CMD. This database holds both real-time and delayed mode data. Where both the real-time and delayed mode data exist from a particular location and time, the delayed mode is retained in the CMD because it represents the highest resolution and highest quality data. The contents of the CMD are available upon request from the U.S. NODC.
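The retention rule described above can be sketched as follows. This is a minimal illustration, assuming each profile is a dictionary with hypothetical `time`, `lat`, `lon` and `mode` fields and that matching profiles share exactly the same time and position; the actual CMD matching procedure is not described in this report.

```python
def update_cmd(cmd, profile):
    """Add a profile to the database, letting delayed mode replace real-time.

    `cmd` maps (time, lat, lon) to the single profile held for that
    location and time. Field names are assumptions made for this sketch.
    """
    key = (profile["time"], profile["lat"], profile["lon"])
    held = cmd.get(key)
    if held is None:
        cmd[key] = profile
    elif held["mode"] == "real-time" and profile["mode"] == "delayed":
        # The delayed mode version has higher resolution and quality.
        cmd[key] = profile
    return cmd


cmd = {}
station = {"time": "1996-03-01T12:00", "lat": 45.0, "lon": -30.0}
update_cmd(cmd, {**station, "mode": "real-time", "levels": 12})
update_cmd(cmd, {**station, "mode": "delayed", "levels": 480})
# A late real-time copy must not displace the delayed mode version.
update_cmd(cmd, {**station, "mode": "real-time", "levels": 12})
```

In practice the times and positions of the two versions rarely agree exactly, so a real system would need a tolerance-based match rather than the exact key used here.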
| 4.3.3 | Publications |
4.3.3.1 In cooperation with other partners this includes:
| 4.3.4 | Structure |
4.3.4.1 The GTSPP Steering Committee meets as required to continue the operation of the programme. Meetings have been jointly held with WOCE committee meetings to reduce costs. In the last few years of the programme, meetings have been roughly 18 months apart.
4.3.4.2 The Chair of the GTSPP Steering Committee was elected at the start of the Project and this post has been held by the same person since then.
4.3.4.3 The GTSPP functions by actions undertaken by participants to achieve common goals agreed to at the meetings. Since it is a collection of volunteer organizations, adjustments are always needed to accommodate changes in levels of participation of members. These adjustments are made by current members taking on new roles or by recruiting new members.
| 4.3.5 | Observing Network |
4.3.5.1 The GTSPP takes advantage of a number of services and infrastructures available at both the international and national levels. Internationally, the WMO provides the use of the GTS for the transmission of oceanographic messages through the IGOSS programme. GTSPP uses this service to acquire the data exchanged this way.
4.3.5.2 Some nations have developed an extensive infrastructure to provide and service ships of opportunity in the collection of temperature (and some salinity) profiles around the world. These programmes have become a key component of the SOOP, and GTSPP provides the data management component.
4.3.5.3 Many nations undertake both monitoring and research data collection programmes at sea. These may be through autonomous instruments, such as floats, or from ships. Data collected are provided to their National Oceanographic Data Centres or to RNODCs of the IODE system. From them, T and S data are provided to the GTSPP for inclusion in the programme.
| 4.3.6 | Data Exchange and Management |
4.3.6.1 Real-time data are managed by MEDS. The data are received and processed through quality assessment and duplicates resolution software three times each week. On the same schedule, the data are transferred to the CMD held in the U.S. Users who require fast access to these data can contact MEDS for this service. MEDS responds to one-time requests and provides routine downloads of the data received. At present there are both Canadian and international users of the service.
4.3.6.2 Monitoring of the exchange of real-time data takes place primarily at MEDS. Each month MEDS reviews data from ships that show a failure rate of more than 10% on profile data. Systematic problems are noted and ship operators are notified by email. Ships that have had problems consistently over time are specially noted in the report.
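The monthly selection of ships for review could be computed along the following lines. This is a sketch only: the input format (a list of ship identifier and pass/fail pairs) and the call signs are hypothetical, and the only figure taken from the text is the 10% threshold.

```python
from collections import defaultdict


def ships_to_review(profiles, threshold=0.10):
    """Return {ship: failure rate} for ships whose rate exceeds the threshold.

    `profiles` is an iterable of (ship_id, passed) pairs, one per profile
    received during the month.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for ship, passed in profiles:
        totals[ship] += 1
        if not passed:
            failures[ship] += 1
    rates = {ship: failures[ship] / totals[ship] for ship in totals}
    return {ship: rate for ship, rate in rates.items() if rate > threshold}


# Hypothetical month: SHIPA fails 2 of 10 profiles, SHIPB fails 1 of 10.
month = (
    [("SHIPA", True)] * 8 + [("SHIPA", False)] * 2
    + [("SHIPB", True)] * 9 + [("SHIPB", False)] * 1
)
flagged = ships_to_review(month)
```

Note that a ship at exactly 10% is not flagged, since the text specifies a rate of more than 10%.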
4.3.6.3 GTSPP also monitors the data received by four different centres acquiring GTS data around the world. These include German and Japanese sites and a U.S. site. Each month a report is prepared comparing the data received at North American sites, followed shortly afterwards by a report comparing all of the sites. Discrepancies noted in these reports are used to track down problems with data reaching the GTS or being relayed around the world.
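A comparison of this kind reduces to set differences between the message lists of the centres. The sketch below assumes each centre can supply a set of message identifiers for the month; the centre names and identifiers shown are hypothetical.

```python
def missing_by_centre(received):
    """Map each centre to the messages seen elsewhere but not received there.

    `received` maps a centre name to the set of message identifiers it
    acquired from the GTS. Centres with nothing missing are omitted.
    """
    all_messages = set().union(*received.values())
    report = {}
    for centre, messages in received.items():
        missing = all_messages - messages
        if missing:
            report[centre] = missing
    return report


# Hypothetical holdings for one month.
holdings = {
    "centre_a": {"msg1", "msg2", "msg3"},
    "centre_b": {"msg1", "msg3"},
    "centre_c": {"msg1", "msg2", "msg3", "msg4"},
}
report = missing_by_centre(holdings)
```

A message missing at every centre but one suggests a routing problem, whereas a message missing everywhere cannot be detected this way and must be found by comparison with the originator's records.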
4.3.6.4 GTSPP has also participated in special and routine monitoring projects of the GTS run by WMO.
4.3.6.5 Delayed mode data are managed by the U.S. NODC, which accepts the real-time data from MEDS and updates the CMD when data are received. Delayed mode data are acquired either from other NODCs or through cooperation with projects such as WOCE, GODAR and SOOP. Users are supported in a manner similar to that at MEDS. Some of the data are also available through the Internet.
4.3.6.6 Through comparisons with the holdings of the WOCE Subsurface Data Centre in Brest and the U.S. NODC, discrepancies in content have been corrected. Scientific data quality assessment has been provided on yearly files by the scientific institutions noted above. Not only does this provide another level of assessment, but it also promotes collaboration and the exchange of expertise between scientific and data management personnel.
| 4.3.7 | Analysis |
4.3.7.1 Strengths:
4.3.7.2 Weaknesses:
| 4.4 | DATA MANAGEMENT ISSUES FOR GOOS/GCOS |
| 4.4.1 | Introduction |
4.4.1.1 In the modern approach to data and information management, the delivery of oceanographic products is viewed as the result of a production line process. This approach implicitly requires a number of functions to be recognized as important. These are specified below in the form of a checklist (4.4.2 to 4.4.6). In generating any one product, not all of the items on the checklist may apply, but all should be considered.
4.4.1.2 It is likely that steps in the production line will be carried out by more than one agency. In that case, the responsibilities of each agency must be clear, and national or international support must be provided on a sound basis. Coordination between agencies will be important and mechanisms to ensure this must be established.
4.4.1.3 The steps taken, agency responsibilities, and the other factors considered for each product must be well documented. The documentation must be accessible to existing and future clients and properly maintained.
| 4.4.2 | Parameter Measurements |
Variable Selection
4.4.2.1 Each product will require one or more variables to be measured. Variables from different environmental regimes (i.e. ocean, atmosphere, land) may need to be combined to form a single product. This demands recognition of the variables required and a scientific basis for their combination into the desired product.
Instrumentation
4.4.2.2 Each variable required for a particular product will have to be measured to a required level of accuracy and precision. Consideration must be given to the ability of the various instruments available for making the measurement to meet the accuracy and precision requirements.
Space and Time Scales
4.4.2.3 Each product will demand sampling at specified time and space scales determined by the scientific requirement to meet the product specifications. It must be clear what the scales are and what the consequences are for the product when the sampling requirement is not met.
Metadata Requirements
4.4.2.4 Each product will be associated with its own set of parameters which must be documented as part of the product generation process. These metadata must be preserved with the data to make the information available for generation of other products in the future.
| 4.4.3 | Data Collection and Storage |
4.4.3.1 For the generation of each product a schedule will be required for the delivery of data from the instrument to the processing centre. This schedule will in turn place demands on the data assembly process. Failure to meet these demands may cause some degradation of the product and must be documented with the product.
4.4.3.2 The data collection system will not always perform at peak efficiency. It will usually be necessary to build in redundancies in the collection and transmission systems to provide an adequate degree of tolerance to faults in the system. Accommodating some degree of fault tolerance will help to offset degradation of the product.
4.4.3.3 From time to time, the entire collection system may be shut down. Building a degree of redundancy into the system will enable it to perform robustly under adverse circumstances. Failure of a component of the collection system which shuts off provision of data to the processing system will also affect product generation, with a variety of possible consequences that need to be allowed for in designing product delivery.
4.4.3.4 Data collected for a particular product may also serve other valuable ends. This is particularly true of long and continuous time series measurements which are useful in climate change studies. Archiving may be required not only for original data and products but also for intermediate results used in product generation. Consideration must be given to how such archives will function.
4.4.3.5 Details of data and information transmission must be decided, including such things as the appropriateness of existing transmission systems and data formats. Where needed, new forms of transmission and formats may be required. In such cases the impacts of changes on the complete production line must be considered.
| 4.4.4 | Processing |
4.4.4.1 In order for products to be useful they must usually be produced in a timely manner. The time constraints set limits on the speed at which data processing must be carried out. The production process, including data quality assessment, duplicates removal, computer model runs, etc., must be designed to be completed within the required time schedule.
4.4.4.2 Redundancies in a processing system, or other factors, can cause data to be duplicated. The impact of duplicates on a product, and the necessity to control these duplications, must be assessed. Where duplications must be controlled, specific mechanisms for doing so must be identified and implemented.
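One possible mechanism for controlling duplicates is to treat two records as the same observation when their times and positions agree within stated tolerances. The sketch below illustrates this; the field names and the tolerance values (15 minutes, 0.05 degrees) are assumptions chosen for the example and would need to be set for each product.

```python
from datetime import datetime, timedelta


def is_duplicate(a, b, max_dt=timedelta(minutes=15), max_deg=0.05):
    """True when two records are close enough in time and position."""
    return (
        abs(a["time"] - b["time"]) <= max_dt
        and abs(a["lat"] - b["lat"]) <= max_deg
        and abs(a["lon"] - b["lon"]) <= max_deg
    )


def remove_duplicates(records):
    """Keep the first record of each group of near-identical records."""
    kept = []
    for record in records:
        if not any(is_duplicate(record, other) for other in kept):
            kept.append(record)
    return kept


# Hypothetical records: the second is a re-transmission of the first.
records = [
    {"time": datetime(1996, 3, 1, 12, 0), "lat": 45.00, "lon": -30.00},
    {"time": datetime(1996, 3, 1, 12, 5), "lat": 45.01, "lon": -30.02},
    {"time": datetime(1996, 3, 2, 12, 0), "lat": 46.00, "lon": -31.00},
]
unique = remove_duplicates(records)
```

The pairwise comparison is adequate for small batches; large archives would normally sort or index the records by time first.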
4.4.4.3 Each product will have a certain tolerance to data of poor quality affecting the result. It is important that this tolerance be considered. Mechanisms for managing data quality in order to screen out lower quality data must be put in place. The explicit rules of assessing quality must be well documented.
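Keeping the quality rules explicit and documented is easier when each rule is a named check that can be listed alongside the product. The following sketch shows one way to do this. The rule names, field names and limits are illustrative assumptions, not limits prescribed by any programme described in this report.

```python
# Each rule is a (name, check) pair; the names double as documentation.
# The numeric limits are illustrative assumptions only.
RULES = [
    ("temperature_in_range", lambda obs: -2.5 <= obs["temp_c"] <= 40.0),
    ("salinity_in_range", lambda obs: 0.0 <= obs["salinity"] <= 41.0),
    ("depth_not_negative", lambda obs: obs["depth_m"] >= 0.0),
]


def failed_rules(obs):
    """Return the names of the rules that an observation fails."""
    return [name for name, check in RULES if not check(obs)]


def screen(observations):
    """Split observations into accepted ones and (observation, reasons) rejects."""
    accepted, rejected = [], []
    for obs in observations:
        failed = failed_rules(obs)
        if failed:
            rejected.append((obs, failed))
        else:
            accepted.append(obs)
    return accepted, rejected


observations = [
    {"temp_c": 12.3, "salinity": 35.0, "depth_m": 10.0},
    {"temp_c": 99.9, "salinity": 35.0, "depth_m": -5.0},
]
accepted, rejected = screen(observations)
```

Recording the reasons for rejection, rather than silently discarding data, also supports the monitoring and feedback functions discussed elsewhere in this section.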
4.4.4.4 Each data collection system should include a mechanism for monitoring the characteristics of the data collection, including data quality, timeliness of transmission, adequacy of sampling strategy, etc. If there are problems with any of these factors, appropriate corrections must be made to prevent their recurrence.
4.4.4.5 Once used for product generation, the data and information stored in archives should be available to others for other purposes.
| 4.4.5 | Product Dissemination |
4.4.5.1 Each product should be designed for a particular target audience, which should be consulted on the form and content of the product and on the mechanism for delivering it to the user. In some cases more than one delivery schedule and carrier may be used.
| 4.4.6 | Product Suitability |
4.4.6.1 Mechanisms must be devised for obtaining feedback from users on the adequacies of and desired changes in a product.