5 Data handling, data integrity and data use
5.1 Avoiding the use of personal data, data protection guidelines
The survey and the subsequent data processing and analysis will be carried out by data scientists and researchers affiliated with Reprex. We will disclose the name of each researcher with access to individual-level data, and all of them will be duty-bound not to identify any respondent or give any other person access to individual data.
The survey is not designed to record any personal information about the respondent. The analysis will be performed in a way that minimizes the risk that a respondent is accidentally identified, for example, a respondent from a very small locality where there may be no similar artists. Any data released by the researchers will be averaged or otherwise calculated from a large enough number of responses that individual data cannot be guessed.
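The aggregation rule described above can be illustrated with a minimal sketch. It assumes a hypothetical minimum group size of five responses and hypothetical field names (`locality`, `earnings`); the actual threshold and data layout used in the survey are not specified here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical threshold: groups with fewer responses are not released.
MIN_GROUP_SIZE = 5


def aggregate_with_suppression(responses, group_key, value_key,
                               min_group_size=MIN_GROUP_SIZE):
    """Average a value per group, suppressing groups that are too small."""
    groups = defaultdict(list)
    for row in responses:
        groups[row[group_key]].append(row[value_key])

    released = {}
    for group, values in groups.items():
        if len(values) >= min_group_size:
            released[group] = {"suppressed": False, "mean": mean(values)}
        else:
            # Too few responses: neither the mean nor the count is released.
            released[group] = {"suppressed": True, "mean": None}
    return released
```

With this rule, a locality with only one or two respondents never appears with a value in a released table, so an individual answer cannot be read off the group average.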
We are asking respondents to participate in research that is primarily intended to be used in a statistically aggregated way. We do not want to find out about anybody’s earnings – we want to see the average earnings in a country. We are not interested in a particular song – we want to analyse the average earning potential of a song from your country compared to an average song from another country.
Therefore, our survey data collection does not fall under the data handling rules of the GDPR, because we want to avoid recording any personal data about our respondents. However, if somebody enters their name or email address into a text box, we will apply GDPR rules, regardless of whether the respondent resides within the European Union or not.
The researchers pledge to make an effort to avoid respondents being accidentally identified. For full transparency, they will describe the steps taken to ensure that respondents’ personal data will not accidentally come into their possession.
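As an illustration only, free-text answers could be screened for email addresses so that such responses are routed to GDPR-compliant handling. The sketch below uses a deliberately simple, hypothetical pattern; it is not a complete email validator, and it cannot detect names, which would need manual review.

```python
import re

# Simple illustrative pattern, not a full RFC 5322 validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contains_email(text):
    """Return True if the free-text answer appears to contain an email address."""
    return EMAIL_PATTERN.search(text) is not None


def flag_responses(free_text_answers):
    """Return the indices of answers that need GDPR-compliant handling."""
    return [i for i, text in enumerate(free_text_answers) if contains_email(text)]
```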
5.2 How do we get the email addresses of the respondents?
In short, usually we do not. We only work with associations, collective management organizations and other organizations that are duty-bound to their members to conduct market research, to analyze the music markets in order to charge appropriate royalties or other income for their members, or to provide them with export support or other in-kind or financial support.
In the EU, we cannot store email addresses and send out emails without a lawful reason and approval. If these organizations send an email with a link inviting a potential respondent to fill out the survey, they have a good reason to do so within the European GDPR jurisdiction, and they do not need separate permission for this. We do not ask any organization that lacks a good reason to participate in our research to invite its members to anonymously fill out our questionnaires. We even ask these partners not to publicly release the survey link, because we want to avoid vandalism from non-music professionals. Reprex carries out the survey without knowing who received an invitation to participate in it.
The researchers working with the data do not receive personal information with the completed questionnaires, and they do not even know which individuals could have filled out questionnaires in this invitation-only research program.
There are very few exceptions to this general rule. In previous years, some music professionals explicitly asked to be involved in the survey program, in emails, at professional events, or in other forms. These people will be invited until they opt out from the original CEEMID mailing lists. They are a very small minority among the respondents, and in their case the email invitations are based on their explicit request.
5.3 Statistical data integrity
CEEMID & Reprex are creating high-quality indicators that support evidence-based business and policy decisions, advocacy, and valid scientific research.
These indicators may be published in the form of a public report (as commissioned by Consolidated Independent or national partners), or, with the approval of national partners, as stand-alone data tables on the music.dataobservatory.eu website. The tables may be re-published elsewhere without limitation, but we are responsible only for the integrity of the original datasets and not for any potential modifications.
5.3.1 Methodology
We adhere to the principles of reproducible research, and various research and ethical standards, particularly:
For policy advocacy reports that are made public, we comply with the Open Policy Analysis (OPA) standards developed by the Berkeley Initiative for Transparency in the Social Sciences & Center for Effective Global Action. When we provide policy analysis for clients who do not make our work public, we apply only the applicable principles from OPA.
In case of investment and economic advisory roles, we follow the Professional Standards of the CFI Institute.
Generally, we believe that the quality and integrity of our work are based on following reproducible research standards, namely reviewability, replicability, confirmability and auditability. These originally scientific principles can be extended to evidence-based policy analysis, responsible economic and investment advisory roles, evidence-giving in juridical and legal procedures, and similar situations where high-quality evidence and its responsible use are expected.
CEEMID only uses open-source statistical software for the processing of the data. It has been publishing critical elements of its statistical software code on the peer-reviewed statistical software repository CRAN, adhering to standards not only in code quality but also in code documentation. In some cases, we publish our entire code. This makes the replication or reproduction of our findings possible without excessive burden. (See, for example, retroharmonize, our software that harmonizes the results of multi-national and multi-year surveys.)
We publish authoritative copies of our statistical indicators and, when necessary, our model results on at least two data repositories. This plays a critical role in meeting the auditability requirement of our work. The copies are kept:
On either Zenodo or figshare, which are independent data repositories that store and identify authoritative copies of our data (see the next subchapter). These widely accepted scientific data repositories make sure that our indicators have an intact, well-identifiable, non-compromised copy. They are fully independent from us, and provide such services to leading research institutions of the world.
On our music.dataobservatory.eu website, which provides methodological information and machine-readable downloads of the authoritative indicator values. This website will be connected to figshare, but provides a more convenient user interface for documenting and communicating our data.
Any of our research work that is intended to be published will be published either with a methodological annex, or with a link to a methodological annex in a repository. This allows our users, other researchers or readers to critically evaluate the validity of our work. We will process any potential data error reports promptly, and make versioned corrections, if possible. But our responsibility ends with our authoritative releases. Because of the open data nature of our releases, the data may be used, re-used and modified by anybody. Any conclusions drawn from datasets used and modified by our data users are their sole responsibility.
5.3.2 Authoritative copies
The role of authoritative copies is partly to prevent plagiarism, and partly to protect users from possible alterations of the data and research findings that could mislead their audience, partners, policy-makers or members about the valid findings.
These document identifier and repository services allow the ‘embargo’ of certain research products for a well-justified reason. This means that for a limited period of time, they may store an authoritative copy and provide a doi identifier without making the statistical table or document available to the public. This allows our partners to be the first to benefit from the communication of work that was commissioned for them, while the time limitation makes sure that any plagiarised or compromised copies of such work can later be found. Therefore, these ‘publications’ do not coincide with the publication of our research project, but provide a long-term guarantee against compromising the results, for the benefit of both our clients and our research team.
For any statistical indicator that is communicated to the public in the form of a research report or a statistical table, we will keep a documented, verifiable, authoritative copy of the data on music.dataobservatory.eu, the figshare scientific repository and/or the Zenodo repository. The authoritative copy of these indicators (with methodological description) will be published with a unique digital object identifier (doi), and any potential updates or error corrections will be published as a new version under the same doi (indicating the version change).
We will also put into one of these repositories an authoritative copy of the final public research report.
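One way a data user could check that a downloaded file is identical to an authoritative copy is to compare checksums. The sketch below assumes, hypothetically, that a SHA-256 checksum of the authoritative file is available alongside it; the function names are illustrative and not part of any repository's API.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_authoritative(data: bytes, published_checksum: str) -> bool:
    """Compare a local copy with the checksum published for the authoritative copy.

    `published_checksum` is assumed to be a SHA-256 hex digest.
    """
    return sha256_of_bytes(data) == published_checksum.strip().lower()
```

A single changed byte in the table produces a different checksum, so a modified copy fails this comparison.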
5.3.3 Non-public research products
Our national partners may receive different, non-public research results and products. They can use these products without limitations unless they publish them.
Should they decide to publish them, they must contact us and ask for approval of the publication. Depending on our agreement, as a general rule, we cannot decline the approval if:
the author(s) and their brand names, and the authoritative copy’s separate digital object identifier are placed in the published document;
the publication does not alter our research findings in a way that misrepresents our findings;
our statistical indicators and tables are not altered, or presented in a misleading way.
In case of modifications that we do not find justified or that we find misleading, depending on our agreement, we will not prevent the publication, provided that every reference to our authors and brand names is fully removed from it.
In other words, unless we agreed otherwise, we always allow our partners to publish originally non-public documents provided that they publish them in an ethical manner.
5.4 Visualization of our data
The visualizations of our data as images are copyright protected, but usually we give a relatively permissive license to re-use them. Any restriction on re-use is connected to the ethical presentation of data. We would like to avoid the misrepresentation of our data products and research findings through misleading visual presentations of our data.
We know that data visualization is very important to our clients, and we are more than happy to assist their creative team in creating aesthetic, compelling, and at the same time ethical visual representations of our data. We also work with data visualization professionals who can assist our clients in creating beautiful and ethical visualizations.
In case of modifications of our visualizations that we do not find justified or that we find misleading, depending on our agreement, we will not prevent the publication, provided that every reference to our authors and brand names is fully removed from it.