Reusability

For scientific work to reach its full potential, other scientists must be able to reuse and further develop its data, software, methods and findings. Reusability is therefore one of the most important quality characteristics of scientific data (Wilkinson et al., 2016).

This requires making data available in the best possible quality, with sufficient documentation (metadata), and under clearly defined rights of use. Scientific data and software can become unusable after only a few months or years for the following reasons:

Incompatibility: Data can no longer be read, and (experimental) software and scientific calculations can no longer be run, if the programs, program libraries and operating system they depend on are no longer available in their original versions, if the required licenses and settings are missing, or if the original setup is complex or even unknown. If active maintenance over a longer period is not possible, methodical archiving and documentation are therefore particularly important. Using open, standardized data formats also helps.
Data loss: Data and software can be irretrievably lost through hardware defects (e.g. unreadable hard disks, DVDs, USB sticks) or computer viruses if they are not protected by a backup strategy.
Lack of findability: If digital resources are not centrally archived and made accessible, they are hard to find, which is a barrier to subsequent use. This can also lead to data loss if, for example, staff leave and knowledge about where the data is stored is lost.
Lack of documentation and metadata: Scientific data cannot be used if it is not known how it was created (provenance) and how it is to be interpreted. This knowledge, like technical knowledge about software, is lost over time. Metadata and documentation about data and methods must therefore be available in a form that third parties can understand.
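
Several of the risks listed above can be reduced with very little tooling. The following Python sketch is one possible approach, not a prescribed method: it writes a small JSON "sidecar" file next to a data file, recording a checksum (to detect corrupted copies or backups), a description and license (documentation and rights of use), and the software environment in which the record was made (to help with later incompatibility). The function names, the `.meta.json` suffix and the field names are assumptions chosen for illustration; a real project would normally follow an established metadata schema for its discipline.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_record(data_file: Path, description: str, license_name: str) -> dict:
    """Collect minimal descriptive and environment information for one file.

    The field names are hypothetical, not taken from any metadata standard.
    """
    return {
        "file": data_file.name,
        "sha256": sha256_of(data_file),
        "size_bytes": data_file.stat().st_size,
        "description": description,
        "license": license_name,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
    }


def write_sidecar(data_file: Path, record: dict) -> Path:
    """Store the record as '<file name>.meta.json' next to the data file."""
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


def verify(data_file: Path, sidecar: Path) -> bool:
    """Check that the data file still matches the checksum in its sidecar."""
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return record["sha256"] == sha256_of(data_file)
```

A typical use would be to call `build_record` and `write_sidecar` when a dataset is finalized, archive the sidecar together with the data, and run `verify` on each backup copy. A `False` result indicates that the copy no longer matches the original content.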