This study investigates the simultaneous use of multiple metadata schemas at research data repositories. The analysis covers how eight disciplinary research data repositories from the geosciences and social sciences use disciplinary metadata schemas and the DataCite Metadata Schema, and how two metadata records describing the same dataset compare. The results show that DataCite metadata records could be improved considerably by optimizing schema crosswalks. However, the parallel use of disciplinary and multidisciplinary metadata records is complex. For example, discipline has a significant effect on the completeness of DataCite metadata. A temporal analysis also highlights that metadata workflows are diverse, and in some cases, suboptimal crosswalks are likely not the sole cause of incomplete DataCite metadata. Comparing the disciplinary metadata schemas and the DataCite Metadata Schema on a structural level reveals that most differences between schemas are the result of different approaches to modelling statements about datasets, not the lack of opportunity to express them. The element sets of both disciplinary metadata schemas and the DataCite Metadata Schema could be extended to describe datasets in more detail. These observations demonstrate that disciplinary and multidisciplinary metadata schemas serve distinct purposes. Disciplinary repositories should take full advantage of the opportunities both options provide.