Research Data Management and Publishing
Ethical and Legal Aspects
It is not only researchers who use open research data: the data are accessible to everyone, whatever their purpose. As the objective of open science is to preserve machine-readable FAIR data and make them accessible, the data are also accessed by machines, which makes it impossible to control how open data are actually used.
Therefore, it is important to state in the data management plan that you are aware of the ethical and legal aspects involved in your project and know how to handle them. The main aspects are:
- Intellectual property rights, copyright and licences
- Sensitive data (special categories of personal data, business secrets, national security, data concerning endangered species)
- Protection of personal data (including the metadata of personal data)
Ethical norms and laws affect every stage of the data life cycle: how the data are collected, how they are preserved during the course of the work, who has the right to access them, which data processing methods are used, and where, in which format and for how long the data are stored after the end of the project.
Intellectual property rights, copyright and licences
It is important to find out who owns the intellectual property rights and copyrights. If some of these rights belong to third parties, you need to sign contracts with them that satisfy all parties.
An example: the intellectual property rights of a researcher employed by the university belong to the university, but the intellectual property rights of a student belong to the student. A doctoral student is still a student, so you should sign a contract on transferring these rights to the university with every doctoral student participating in your project. This matters for sharing and publishing the data, and also if the doctoral student leaves the university.
If your research group collaborates with a commercial enterprise that is not interested in publishing the data, this has to be indicated. Point out all factors that limit data sharing, such as property rights issues or patent applications.
See also Licences
Specific issues of copyright are dealt with by the university lawyer, whom you can contact when necessary. The UT lawyer has worked out the following statement about the licensing of grant results. It can be copied directly into your data management plan:
Licensing at UT:
• The data belong to the University of Tartu. Persons employed for filling the grant will assign the proprietary rights to the results of the research (including the data) performed under the grant agreement to the University with the Employment Contract (academic employees) or with another written document (Act of Assignment of the Intellectual Property Rights).
• Data will be disclosed under the Creative Commons license CC-BY 4.0
• A third party, whose data have been used for creating the results of the grant, may set restrictions to the usage of the data. In this case those restrictions must be considered while the data are being licensed, i.e. the university can give the license for the data usage only in the scale of rights allowed by the third person (i. e. the scale of rights that university has received from the third persons)
• If the University or a third person, whose data have been used for creating the results of the grant, wants to submit a patent or a utility model application, the publishing of the data has to be postponed until the submission of the application.
Protection of Personal Data
If you process personal data during your research, take a look at the Guide for Data Protection in Research:
Juurik, M., Mäesalu, T., & Tarkpea, T. (2023). Guide for Data Protection in Research. University of Tartu; TÜ Wiki.
Protection of personal data has become especially topical since the General Data Protection Regulation (GDPR) became applicable on 25 May 2018. Data processing should follow the Data Protection Policy of the University of Tartu.
In research, especially in social and medical studies, informing participants, asking for their consent and protecting their personal data have always been part of the code of conduct for research integrity. This is probably why the Regulation does not restrict the use of personal data in research, historical studies and statistics, provided that all appropriate legal and technical measures to safeguard the participants' privacy have been applied.
Personal data are any information about an identified or identifiable natural person. An identifiable natural person is a person who can be identified, directly or indirectly, by an identifier such as a name, an identification number, location data or an online identifier, or by one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
Special categories of personal data are data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, as well as genetic data, biometric data processed for the purpose of uniquely identifying a natural person, data concerning health, and data concerning a natural person's sex life or sexual orientation.
Processing of personal data is any operation, automated or not, performed on personal data or sets of personal data, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, reading or use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction.
Principles of processing personal data
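- lawfulness, fairness and transparency: data are processed lawfully, fairly and in a transparent manner in relation to the data subject
- purpose limitation: data are collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes (further processing for research purposes is not considered incompatible, provided appropriate safeguards are applied)
- data minimisation: only the data necessary for the purpose are processed
- accuracy: data are accurate and, where necessary, kept up to date
- storage limitation: data are kept in an identifiable form no longer than necessary (longer storage is allowed for research purposes, provided appropriate safeguards are applied)
- integrity and confidentiality: data are protected against unauthorised processing, loss and damage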
To illustrate cases where the restrictions on collecting and preserving personal data do not apply to research, consider the long-term targets of genetic research and the first paragraph of the consent form for becoming a gene donor (translated from the Estonian consent form):
The long-term targets of genetic research are:
- To create conditions for better care of the health of our descendants
- To support the development of science and medicine
- To provide doctors with opportunities for better health promotion
- To allow researchers to study our genetic descent
- To help to make Estonia better known in the world of science
In order to achieve these long-term targets, the gene donor gives their consent:
“I have been informed and I am aware of the following:
1. The aim of the Estonian Genome Project is to establish the Gene Bank, a database that contains data on the health and heredity of the people of Estonia. The Gene Bank can be used only for research, for studying and treating the diseases of gene donors, for studying the public health and for statistical purposes. Research carried out with the help of the Gene Bank shall not be limited to the present scientific level.”
By giving their consent, the gene donor allows their data to be used for various purposes related to genetic research for an unlimited period. If the requirements that personal data may be collected only for a single specific purpose, and that the data must be destroyed once this purpose has been achieved, were applied, such genetic research, as well as much other research based on personal data, would be entirely impossible.
In the context of research, we can talk about two legal bases for processing personal data.
1. A study subject (natural person) has given their consent for processing their personal data for one or several specific purposes.
Consent is a freely given, specific, informed and unambiguous indication of the data subject's wishes by which they, by a statement or by a clear affirmative action, signify agreement to the processing of their personal data.
2. Public interest (processing without consent).
Public interest must be substantiated, and it must be shown that it outweighs the person's right to private life and to the protection of their personal data. Even then, as many organisational and technical safeguards as possible must be applied.
Public interest could arise, for example, from the Statutes of the University of Tartu, stating that “The University of Tartu is the national university of the Republic of Estonia, a universal integrated research, development, study and culture institution, aiming to advance science and culture, provide the possibilities to acquire higher education based on the development of science and technology on three levels of higher education in the field of humanities, social, medical and natural sciences and to provide public services based on teaching, research and other creative activities”.
To summarise, the task of the university is to carry out research in the public interest.
Public interest is often the basis for secondary research, where the data have been collected by consent for one objective but may also be useful for another. In this case, it is necessary to assess the effort needed to obtain new consent from all the data subjects. If the effort would be unreasonably large, you have to contact the Data Protection Inspectorate and/or the Research Ethics Committee and apply, with justification, for permission for the new study.
The homepage of the Research Ethics Committee of the University of Tartu has guidelines and application forms for the correct procedures. Note that permission from the Committee must be received before the research starts; it is not granted retrospectively.
A brief overview of the ethical aspects of research that should be taken into account can also be found on the web page of the Estonian Research Council: Guidelines for Completing Your Ethics Self-Assessment for Grant Application.
Data management plans should pay special attention to the processing of personal data when working with sensitive personal data, and with children or minority groups.
When preserving and processing personal data, it is important to discuss security measures; see Data preservation and information security.
It is important to differentiate between pseudonymised data, which are personal data, and anonymised data, which are not personal data because anonymisation is irreversible. You do not need permission from the Research Ethics Committee to carry out new research using anonymised data, because the participants can no longer be identified.
Pseudonymisation is the processing of personal data in such a way that the data cannot be attributed to the data subject without additional information (a key); with the key, however, the person can still be identified.
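A minimal Python sketch of this idea, not a prescribed procedure: direct identifiers are replaced with random codes, and the table linking codes to identities (the key) is returned separately so that it can be stored apart from the research data. The function name `pseudonymise`, the field name `name`, the `P-` code format and the sample records are assumptions made for this illustration.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the identifying field with a random code.

    Returns (pseudonymised_records, key_table). The key table maps each
    code back to the original identifier and must be stored separately
    from the data, with restricted access.
    """
    code_for = {}    # original identifier -> code
    key_table = {}   # code -> original identifier (the "key")
    result = []
    for record in records:
        identifier = record[id_field]
        if identifier not in code_for:
            code = "P-" + secrets.token_hex(4)
            while code in key_table:  # avoid the (unlikely) collision
                code = "P-" + secrets.token_hex(4)
            code_for[identifier] = code
            key_table[code] = identifier
        # Copy every field except the identifier, then add the code.
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["pseudonym"] = code_for[identifier]
        result.append(cleaned)
    return result, key_table


# Hypothetical sample data, invented for this example.
participants = [
    {"name": "Mari Maasikas", "age": 34},
    {"name": "Jaan Tamm", "age": 51},
    {"name": "Mari Maasikas", "age": 35},
]
data, key = pseudonymise(participants)
# Whoever holds `key` can re-identify a record; without it only the code is visible.
print(key[data[0]["pseudonym"]])
```

As long as the key exists, the output remains personal data. Deleting the key does not by itself make the data anonymous, because the remaining fields may still point to a person.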
On the other hand, we need to think about whether anonymisation could reduce or distort the value of the data. For example, when the data are anonymised in medical research, it is not possible to give feedback to the participants about their genetic and health risks.
Anonymisation can create the illusion that such data can be shared without specific security measures. However, we still need to consider whether re-identification could be possible by combining the data with other databases of personal data or by using new technologies and methods.
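One simple way to look for such a risk is to count how many records share the same combination of indirectly identifying attributes (quasi-identifiers). This is the idea behind the k-anonymity measure, which this guide does not name and which is used here only as an illustration. The Python sketch below is a rough check under stated assumptions, not a complete risk assessment: the function name `risky_records`, the field names, the threshold `k` and the sample records are all hypothetical.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return the records whose quasi-identifier combination is shared
    by fewer than k records in the data set."""
    def combo(record):
        return tuple(record[field] for field in quasi_identifiers)

    counts = Counter(combo(r) for r in records)
    return [r for r in records if counts[combo(r)] < k]


# Hypothetical "anonymised" data: names removed, other attributes kept.
dataset = [
    {"birth_year": 1980, "municipality": "Tartu", "sex": "F", "diagnosis": "A"},
    {"birth_year": 1980, "municipality": "Tartu", "sex": "F", "diagnosis": "B"},
    {"birth_year": 1975, "municipality": "Elva", "sex": "M", "diagnosis": "C"},
]
for record in risky_records(dataset, ["birth_year", "municipality", "sex"]):
    print("unique combination:", record)
```

A record with a unique combination could be matched against another database that holds the same attributes together with names. Such records are candidates for generalisation (for example, age bands instead of birth years) or for restricted access.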
This London Metro Map Approach to PIA (privacy impact assessment) could be helpful when deciding what steps to take:
Additional material: In case you collect and process personal data, the OpenAIRE webinar could be useful:
Protection of the researcher’s privacy
Much attention is paid to how researchers should protect the personal data of research subjects.
However, we should not forget that researchers need to protect their own privacy as well, and to recognise the dangers related to the collection or processing of their own data.
For example, the eElurikkuse (eBiodiversity) portal contains data about nature observations:
By opening the entries, it is possible to learn who observed a certain species or phenomenon, and when and where. The search can be made by the name of the observer, which reveals all the regions where the observer has been. If the observer has the habit of repeatedly walking in the same area at the same time of day (for example, birdwatching on Sunday mornings), it is easy to use the data to predict their activities and pinpoint their location.
A bird observer often carries a good and valuable camera and mobile equipment, and posts current photos and observation data to databases. Photos come with plenty of technical metadata, which gives some idea of the equipment the person carries. You can examine the metadata of image files by uploading them to https://www.get-metadata.com/.
Combining this information with other metadata could pose a direct danger to a person's property and safety by revealing their possessions, location and habits.
Such data are easily accessible to everyone, and the purpose for which personal data are used cannot be controlled. Editing the metadata before publishing the data, in order to make them more secure, is the task and responsibility of the data collector.
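As an illustration of what editing metadata can involve, the Python sketch below removes the APP1 segments from a JPEG byte stream. APP1 is where Exif data (camera model, timestamps, GPS position) and XMP packets are normally stored. This is a simplified sketch under stated assumptions, not a recommended tool: the function name `strip_app1_segments` is hypothetical, only well-formed baseline files are handled, and other segments or the visible image content may still carry identifying information.

```python
def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte stream without its APP1 segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing start-of-image marker")
    output = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a segment marker at byte {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            output += jpeg[pos:]
            return bytes(output)
        # The two-byte length counts itself but not the two marker bytes.
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        if length < 2:
            raise ValueError(f"invalid segment length at byte {pos}")
        if marker != 0xE1:  # keep every segment except APP1
            output += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")


# Tiny synthetic stream (not a viewable image), just to show the effect.
sample = (
    b"\xff\xd8"                      # start of image
    b"\xff\xe0\x00\x04JF"            # APP0 segment, kept
    b"\xff\xe1\x00\x08Exif\x00\x00"  # APP1 segment, removed
    b"\xff\xda\x00\x02\x12\x34"      # start of scan and image data
    b"\xff\xd9"                      # end of image
)
cleaned = strip_app1_segments(sample)
print(b"Exif" in sample, b"Exif" in cleaned)
```

For real photos, an established metadata editor is usually the safer choice. It is still worth checking the result with a metadata viewer before publishing.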