2018
Kapidakis, S. International Conference on Theory and Practice of Digital Libraries, TPDL 2018, LNCS 10450, Springer, 2018, ISSN: 0302-9743.
Tags: digital libraries, harvesting, metadata, open archive
Abstract: Harvesting tasks gather information into a central repository. We studied the metadata returned from 744,179 harvesting tasks from 2,120 harvesting services in 529 harvesting rounds during a period of two years. To achieve that, we initiated nearly 1,500,000 tasks, because a significant part of the Open Archive Initiative harvesting services never worked or have ceased working, while many other services fail occasionally. We studied the synthesis (elements and verbosity of values) of the harvested metadata, and how it evolved over time. We found that most services utilize almost all Dublin Core elements, but there are services with minimal descriptions. Most services have very minimal updates and, overall, the harvested metadata is slowly improving over time, with “description” and “relation” improving the most. Our results help us to better understand how and when the metadata are improved and to have more realistic expectations about the quality of the metadata when we design harvesting or information systems that rely on them.