Data Search

Data Registers

Data registers let you search for datasets directly, without first having to find a repository. Several data registers, portals and search engines find datasets in repositories based on the datasets’ metadata.

DataCite Search

DataCite is a non-profit organization that registers the metadata of datasets and mints DOIs for them.
Major Estonian universities have formed the consortium DataCite Estonia in order to offer this service to their researchers.

To search the DataCite registry, start from the DataCite Commons page:

DC Commons

You can search by the metadata fields defined in the DataCite Metadata Schema: author, title, keywords, and so on.
The mandatory, recommended and optional metadata used at DataCite can be found on the website of the DataCite Estonia consortium.
Currently, the DataCite register contains more than 36 million records, almost 2.4 million of which are authored by Estonian scientists. The largest contribution has come from the data management platform PlutoF.
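
The same registry can also be queried programmatically. The following is a minimal sketch, assuming the public DataCite REST API endpoint https://api.datacite.org/dois and its query, resource-type-id and page[size] parameters; the helper name datacite_search_url is made up for illustration, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public DataCite REST API.
DATACITE_API = "https://api.datacite.org/dois"


def datacite_search_url(query, resource_type="dataset", page_size=25):
    """Build a search URL for DataCite records (hypothetical helper)."""
    params = {
        "query": query,                     # free-text search term, e.g. an author name
        "resource-type-id": resource_type,  # restrict the results to datasets
        "page[size]": page_size,            # number of records per page
    }
    return f"{DATACITE_API}?{urlencode(params)}"


url = datacite_search_url("Öpik")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching records as JSON.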

The registry has several useful features:
1. Since the beginning of 2020, the DataCite register has shown the number of times a dataset has been cited, viewed and downloaded. Note that earlier activity is not reflected, so an accurate overview of the use of a dataset is available only from 2020 onwards.
2. It is easy to cite the dataset in eight common formats, which can be copied directly from the register.
3. DataCite provides the Data Citation Formatter, which generates a reference in any of more than 5,000 citation styles from a pasted DOI.
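
A formatted reference can also be requested directly from the DOI resolver. The sketch below assumes DOI content negotiation, where https://doi.org/ answers a request carrying the Accept header text/x-bibliography with a citation. The function name citation_request and the DOI 10.1234/example-doi are placeholders, not real identifiers.

```python
from urllib.parse import quote
from urllib.request import Request


def citation_request(doi, style="apa", locale="en-US"):
    """Prepare (but do not send) a request for a formatted citation of a DOI."""
    # The Accept header asks the resolver for a text citation in the given style.
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    return Request(f"https://doi.org/{quote(doi, safe='/')}", headers={"Accept": accept})


# Placeholder DOI for illustration only.
req = citation_request("10.1234/example-doi")
print(req.full_url)
print(req.get_header("Accept"))
```

To fetch the citation for a real DOI, pass the prepared request to urllib.request.urlopen and decode the response body as UTF-8.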

Example: Let’s look for data published by UT researcher Maarja Öpik:

Maarja Dataset

The search results show that the dataset has been cited once, viewed 175 times and downloaded 24 times. The DOI leads to the dataset in the Dryad repository, and the dataset can be cited in several formats.
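
The same counts are exposed in machine-readable form. The sketch below assumes that a record returned by the DataCite REST API carries the attributes citationCount, viewCount and downloadCount. The sample record is written by hand to mirror the numbers above; it is not a real API response, and its DOI is a placeholder.

```python
def usage_counts(record):
    """Return the citation, view and download counts of one DataCite record."""
    attrs = record.get("attributes", {})
    return {
        "citations": attrs.get("citationCount"),  # None if the field is absent
        "views": attrs.get("viewCount"),
        "downloads": attrs.get("downloadCount"),
    }


# Hand-written sample record (placeholder DOI, counts copied from the example above).
sample_record = {
    "id": "10.1234/example-dataset",
    "attributes": {"citationCount": 1, "viewCount": 175, "downloadCount": 24},
}

counts = usage_counts(sample_record)
print(counts)
```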

OpenAIRE EXPLORE

Another place to search for data as well as publications is the OpenAIRE portal. OpenAIRE is a long-term European Commission project that brings together and links the results of research projects funded by the Commission.

To search the portal for publications and linked open data, open OpenAIRE Explore. In the search box you can choose whether to search for publications, data, software, organizations, projects or funders.

Explore

Let’s search this register for the author Rämmer. The results show that this UT researcher’s data are held in two different data repositories, yet both datasets can be found through the OpenAIRE portal. One likely reason is that EC-funded projects are collaborative and involve researchers from many countries, so the data are stored in a repository approved by all project partners. An embargo period, if any, is usually set by the data author and lasts until the data analysis is completed and the article is published.

Rammer

Mendeley Data Search

Mendeley is a UK-based company that provides products and services to researchers. The company is owned by the academic publishing house Elsevier.
Mendeley Data is a service that lets researchers store their research data and search for data across many registers.
Mendeley Data Search covers the registers mentioned above, such as DataCite and OpenAIRE. Its advantage is that, in Mendeley's own repository, it also searches the contents of the data files, not just the metadata.
At the moment, Mendeley Data Search seems to be the most practical search environment, although it offers few filtering options. The data types are shown very clearly.
Search results cannot be sorted by year, but a year can be added to the search box.
The logic is exactly the same as for the other registers: you find the dataset and then go to the repository to access and download the data.

Mendeley Data Search allows searches that combine the following fields:

  • AUTHOR()
  • AUTHOR_ID (Mendeley User ID, Scopus User ID, ORCID and all user IDs supported by DataCite)
  • TITLE()
  • INSTITUTION()
  • INSTITUTION_ID() (Scopus Institution ID, Scival Institution ID, Mendeley Institution ID)
  • ID()
  • DOI()
  • KEYWORDS
  • SUBJECT_AREA
  • IS_SUPPLEMENT_TO

The Boolean operators AND, OR and NOT, “quotation marks” for exact phrases, and parentheses () can also be used, but operators inside parentheses may not work.
Example: to find Maarja Kruusmaa’s research data on glacial hydrology, search for Kruusmaa AND “glacial hydrology”. There are two results, and they lead to the Zenodo repository.

Kruusmaa
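
Queries of this kind can be assembled with a few string helpers. This is only a sketch: the helper names field, phrase and all_of are made up, and writing a field as NAME(value) is an assumption based on the field list above. The operators are kept outside the parentheses because operators inside them may not work.

```python
def field(name, value):
    """Wrap a value in a field code, e.g. AUTHOR(Kruusmaa)."""
    return f"{name}({value})"


def phrase(text):
    """Quote a multi-word term so that it is searched as one phrase."""
    return f'"{text}"'


def all_of(*terms):
    """Join terms with AND, keeping the operator outside any parentheses."""
    return " AND ".join(terms)


# The plain-text example from above.
simple_query = all_of("Kruusmaa", phrase("glacial hydrology"))
print(simple_query)

# The same search with the author restricted to the AUTHOR field.
fielded_query = all_of(field("AUTHOR", "Kruusmaa"), phrase("glacial hydrology"))
print(fielded_query)
```

The resulting string is pasted into the Mendeley Data search box. The sketch only builds the query text and does not call any Mendeley service.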

Google Dataset Search

Google is developing a dataset search engine, Google Dataset Search, which has been available since 2018. It is similar to Google Scholar, and the two are designed to complement each other. Currently, only a simple keyword search is possible, along with filtering by data type.

Data Citation Index

Clarivate has developed the Data Citation Index, which is similar to its Web of Science citation database and at the end of 2020 included more than 10 million datasets. This product has not been purchased for Estonian libraries at the moment.