General
Information-Background
For several decades,
DFO scientists from Québec Region have collected large amounts
of data. The costs associated with these acquisitions are considerable
as is the value of the data, which are irreplaceable. In addition
to the use for which the data were originally collected, requests
for historical data are more and more common. The Oceanographic
Data Management System (ODMS) was developed to fulfill the diverse
needs related to the distribution and long-term conservation of
oceanographic data.
During the development of the system, we had to face and solve
a number of challenges, including the wide variety of file formats
and storage media in use. There are also important advantages to
a system like the ODMS, including data security, an inventory
of existing data, selective retrieval of data, and controlled
access to data.
To learn more about the problems related to the long-term conservation
of digital data, read the article by Jeff Rothenberg that was
published in Scientific American:
Rothenberg, Jeff (1995): Ensuring the Longevity of Digital Documents,
Scientific American, January 1995, pp. 42-47
Multiple file formats
With the passage of time and technological advances, different instruments
and computer systems have been used over the years to collect data.
Usually, acquisition software associated with different instruments
will store data in their particular formats, resulting in a multiplicity
of formats even for the same type of data. If one wants to facilitate
the use and sharing of data, these various formats must be converted
to standardized and documented ones. To this end, we have decided
that only standardized and documented data will be included in the
system. In the Québec Region, there are two standard formats
currently in use: the TS and ODF formats. The TS format has been used
since 1989. In 1999, we decided to adopt the ODF standard already
used by the Maritimes Region. Although the files archived in the TS
format are gradually being converted, the two formats will co-exist
in the system for a certain amount of time.
Storage media
When compiling a data inventory, one is also faced with a variety
of storage media and operating systems. Considering the speed at which
computer technology has evolved over the last 20 years, this
situation is easy to understand. For older files that were created
under operating systems that are now obsolete, there is a high risk
that the technologies needed to decode the information will no
longer be available. For current technologies, it is still possible
to transcribe data to another medium, but the exercise is
labour-intensive. If all data are stored on the same platform and
on a single medium, obtaining copies becomes much simpler, and so
does reconversion or retranscription. A centralized catalogue and
archive system meets this requirement.
Data security
If data are not stored in a central location, with all the security
measures that this implies, the physical security of the data
becomes a concern. Individuals cannot be relied upon to always
make back-up copies and store them securely. If data are stored
in a central location on a server that has a back-up strategy in place,
the risk of loss is greatly reduced, even in the event of a
catastrophe (burst pipes, fire, vandalism). The data held in the
ODMS are subject to the back-up procedures of IML's central network,
and the back-up tapes are sent to a vault located outside the building.
Knowing of the existence of data
Even if all scientists took appropriate logical and physical
security measures for their own data, the problem would remain of
informing the rest of the scientific community of the existence of
the different data sets. The only way to let others know that data
exist is to list them in a single system and make them accessible
from that system. In this way, data are not lost when the person
holding them leaves the organization.
Selective data retrieval
An important characteristic of the ODMS is the ability to retrieve
data selectively. The data are catalogued according to a relational
model, and the database may be queried using spatio-temporal
coordinates or additional information such as keywords. The system
therefore becomes a powerful tool for performing thematic searches.
Other attributes can also be added to the query, such as the file
format, the quality rating, or the name of the data collector,
allowing for even more specific search criteria.
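To illustrate the kind of query described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the column names, the `search` function and the sample rows are all invented for illustration; they are assumptions, not the actual ODMS schema or contents.

```python
import sqlite3

# Hypothetical catalogue schema: names and sample rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (
    id INTEGER PRIMARY KEY,
    title TEXT, file_format TEXT, collector TEXT,
    lat REAL, lon REAL, start_date TEXT, end_date TEXT
);
CREATE TABLE keyword (
    dataset_id INTEGER REFERENCES dataset(id),
    word TEXT
);
""")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Sample profiles A", "ODF", "Collector A", 48.6, -68.5, "1998-06-01", "1998-06-15"),
        (2, "Sample series B", "TS", "Collector B", 49.2, -66.0, "1992-05-01", "1992-10-30"),
        (3, "Sample profiles C", "ODF", "Collector A", 45.0, -60.0, "1998-07-01", "1998-07-10"),
    ],
)
conn.executemany(
    "INSERT INTO keyword VALUES (?, ?)",
    [(1, "temperature"), (1, "salinity"), (2, "temperature"), (3, "temperature")],
)


def search(conn, south, north, west, east, start, end, keyword=None, file_format=None):
    """Return (id, title) of data sets inside a bounding box and time window.

    Optional criteria (keyword, file format) narrow the search further.
    """
    sql = (
        "SELECT DISTINCT d.id, d.title FROM dataset d "
        "LEFT JOIN keyword k ON k.dataset_id = d.id "
        "WHERE d.lat BETWEEN ? AND ? AND d.lon BETWEEN ? AND ? "
        # Time windows overlap when each one starts before the other ends.
        "AND d.start_date <= ? AND d.end_date >= ?"
    )
    params = [south, north, west, east, end, start]
    if keyword is not None:
        sql += " AND k.word = ?"
        params.append(keyword)
    if file_format is not None:
        sql += " AND d.file_format = ?"
        params.append(file_format)
    sql += " ORDER BY d.id"
    return conn.execute(sql, params).fetchall()


# Data sets from 1998 in a given area that are tagged "temperature".
results = search(conn, 48.0, 50.0, -70.0, -65.0,
                 "1998-01-01", "1998-12-31", keyword="temperature")
```

Each optional attribute simply adds one more condition to the query, which is what makes the search progressively more specific.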
Controlled access to data
Security measures have also been implemented to prevent unauthorized
access to the system. To access the system and the data, users must
first enter their user names and passwords; thus, they must be
registered users. Project leaders using the system can also limit
access to project members only. The existence of the restricted
data is recorded in the catalogue, but the data themselves cannot be
consulted. This mechanism ensures exclusive use of the data for a
certain period, which is specified in the oceanographic data
management policy. Users can therefore catalogue and archive their
data while still keeping a period of exclusivity.
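The access rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CatalogueEntry` class, its field names and the use of a fixed end date for the exclusivity period are hypothetical, and the actual period is the one set by the oceanographic data management policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record; field names are illustrative only."""
    title: str
    project_members: frozenset = frozenset()
    restricted_until: Optional[date] = None  # None means no restriction


def can_see_entry(user: Optional[str]) -> bool:
    """Any registered (logged-in) user can see that a data set exists."""
    return user is not None


def can_read_data(entry: CatalogueEntry, user: Optional[str], today: date) -> bool:
    """Decide whether a user may consult the data behind a catalogue entry."""
    if user is None:
        return False  # unregistered users have no access at all
    if entry.restricted_until is None or today > entry.restricted_until:
        return True  # no restriction, or the exclusivity period is over
    return user in entry.project_members  # restricted to project members


entry = CatalogueEntry(
    title="Sample restricted data set",
    project_members=frozenset({"member_a"}),
    restricted_until=date(2001, 1, 1),
)
```

Separating "can see the entry" from "can read the data" mirrors the behaviour described in the text: restricted data sets remain listed in the catalogue even while their contents are reserved for the project.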