How to Deposit Data
The University's research data repository will enable you to publish any type of research data, from any discipline. Usually (although not always) the research data will underpin a research output (such as: an academic journal article, chapter in a book, a conference, an exhibition, or a performance).
For each deposit you make into the repository you will be given a unique, persistent DOI (Digital Object Identifier) that can be cited in publications as a way of providing easy access to the underlying data beyond the duration of your research project.
Not all data stored in the repository can be shared with others, owing to copyright restrictions or to commercial or ethical considerations. The repository offers options for embargo periods and for different methods of access to data.
However, even if data can't be shared they must still be archived safely and managed securely for the long term.
What do we mean by a 'Dataset'?
A dataset is a collection of your research data grouped together for deposit, which must also contain some documentation to help others to understand it.
A dataset will usually consist of two parts:
- The data: data that underpins or can be used to validate or reproduce research findings
- The documentation: at minimum a simple 'readme' text file that explains what the data are, how the data were created and how they can be used.
Other types of documentation can also be included within a dataset at the researcher's discretion. How data are organised within a dataset is up to the researcher and will probably reflect the logical way in which they were created or how other people are expected to use them.
How do I get my dataset to Enlighten: Research Data?
In the longer term, researchers will be able to upload a dataset into the repository themselves. However, at present the University operates a mediated deposit service. This means that staff in the Research Data Management team will manage the upload process, create a record of the dataset and mint a DOI for the dataset.
If your dataset is below 10GB in size, the best way to send it is to use the University's File Transfer System.
Simply log in using your GUID and use the generic email address email@example.com to drop off files for us to pick up.
When you drop files off using the File Transfer System, please include in the message section:
- Your Name
- Your College / School
- The title of your paper (if you have one)
- A list of co-authors
- The funder reference code (if you have one)
- The date on which the dataset can be made publicly available (e.g. 'now', 'when paper is published' or a future date if under embargo)
- Any restrictions on making the dataset public (e.g. ethical, legal or commercial considerations)
- If ethical approval was required for this research, please provide a copy of the consent form and information document.
- Preferred Licence
The default licence we use is Creative Commons Attribution (CC-BY), which allows re-distribution and re-use of your data on the condition that the creators are appropriately credited. Please let us know if you require an alternative.
If some of this information is already included in a draft of the publication, then you can just attach that to the dataset and we will take the details from there.
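If you prefer to prepare the message text in advance, the details listed above can be collected in a small script. The following is only a sketch: the DepositDetails class, its field names and the format_message function are hypothetical and are not part of the File Transfer System, and the values shown are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class DepositDetails:
    """Details to include in the message section (hypothetical structure)."""
    name: str
    college_school: str
    public_release: str  # e.g. "now", "when paper is published", or a date
    paper_title: str = ""
    co_authors: list = field(default_factory=list)
    funder_reference: str = ""
    restrictions: str = "None"
    licence: str = "CC-BY"  # the default licence named in this guide


def format_message(details):
    """Return the details as one 'Label: value' line per item."""
    rows = [
        ("Name", details.name),
        ("College / School", details.college_school),
        ("Title of paper", details.paper_title or "n/a"),
        ("Co-authors", ", ".join(details.co_authors) or "n/a"),
        ("Funder reference code", details.funder_reference or "n/a"),
        ("Date dataset can be made public", details.public_release),
        ("Restrictions", details.restrictions),
        ("Preferred licence", details.licence),
    ]
    return "\n".join(f"{label}: {value}" for label, value in rows)


# Placeholder values for illustration only.
example = DepositDetails(
    name="A. Researcher",
    college_school="Example College / Example School",
    public_release="when paper is published",
    co_authors=["B. Colleague", "C. Collaborator"],
)
message_text = format_message(example)
```

The resulting text can be pasted into the message section when you drop off your files. If ethical approval was required, the consent form and information document still need to be attached separately.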
Please also provide a Readme file or similar form of documentation that will be stored alongside the dataset.
This should include information that will help others to re-use the data, for example: a description of the contents within folders; codes or keys to field names in spreadsheets; software and software versions used; and anything else that will help future users to understand and work with the data.
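As an illustration, a readme covering the points above could be generated from a short script. This is a minimal sketch rather than a required template: the build_readme function, the section headings and the example values are all assumptions made for illustration.

```python
def build_readme(title, description, creation_method, usage_notes,
                 software, folder_contents):
    """Return plain-text readme content for a dataset.

    software maps a program name to the version used;
    folder_contents maps a folder name to a one-line summary.
    """
    lines = [
        f"Title: {title}",
        "",
        "What the data are:",
        description,
        "",
        "How the data were created:",
        creation_method,
        "",
        "How the data can be used:",
        usage_notes,
        "",
        "Software and versions used:",
    ]
    lines += [f"- {name} {version}" for name, version in software.items()]
    lines += ["", "Folder contents:"]
    lines += [f"- {folder}: {summary}"
              for folder, summary in folder_contents.items()]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
example_readme = build_readme(
    title="Example dataset",
    description="Survey responses collected for an example project.",
    creation_method="Online questionnaire, exported to CSV.",
    usage_notes="Open the CSV files in any spreadsheet program.",
    software={"ExampleTool": "1.2"},
    folder_contents={"raw": "unprocessed CSV exports",
                     "docs": "codebook explaining field names"},
)
```

The returned text can be saved as a plain text file (for example readme.txt) in the top-level folder of the dataset, and edited by hand to add anything else future users will need.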
If you don't know or can't answer any of the above, please contact us to discuss.
If your dataset is larger than 10GB then please contact us to arrange deposit via an alternative method.
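If you are unsure which route applies, you can total the size of the files in your dataset folder before sending it. The sketch below assumes that "10GB" means 10 x 1000^3 bytes (the limit may be measured differently in practice), and the function names are chosen for illustration only.

```python
import os

# Assumption: "10GB" is read as decimal gigabytes (10 * 1000**3 bytes).
SIZE_LIMIT_BYTES = 10 * 1000**3


def dataset_size_bytes(root):
    """Sum the sizes of all files beneath root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_file_transfer(root, limit=SIZE_LIMIT_BYTES):
    """Return True if the dataset folder is below the size limit."""
    return dataset_size_bytes(root) < limit
```

For example, fits_file_transfer("my_dataset") returns True when the folder can be sent through the File Transfer System under the assumption above. If your dataset is close to the limit, it is safest to contact us first.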
What about Software?
Software may also need to be stored securely and cited in publications.
In order to get a DOI for software there are two main options:
If you use GitHub you can get a DOI by using an additional service provided free of charge by the CERN/EU repository Zenodo. Once the integration is set up, code releases are sent automatically from GitHub to Zenodo for archiving and DOI creation. It is particularly useful for code that is updated regularly, as updates are sent to be archived automatically. There is a guide on how to do this on the GitHub website.
Archiving can't be done automatically in other code repositories, but you can replicate the process manually by creating a tarball of the code you want to store and depositing it with this repository (or any other repository that will give you a DOI). This approach is not ideal for code that is updated regularly, as each new release would need to be bundled up and archived manually. However, it would be fine for code which is comparatively static.
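The manual route can be sketched with Python's standard tarfile module. This is an illustration only: the make_release_tarball function and the "name-version.tar.gz" naming pattern are assumptions, not a repository requirement.

```python
import tarfile
from pathlib import Path


def make_release_tarball(source_dir, version, output_dir="."):
    """Bundle source_dir into <name>-<version>.tar.gz inside output_dir.

    output_dir should be outside source_dir, otherwise the archive
    may end up containing a partial copy of itself.
    """
    source = Path(source_dir).resolve()
    top_level = f"{source.name}-{version}"
    archive_path = Path(output_dir) / f"{top_level}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname gives the archive a single versioned top-level folder.
        tar.add(str(source), arcname=top_level)
    return archive_path
```

For example, make_release_tarball("mycode", "1.0.0", "releases") would write releases/mycode-1.0.0.tar.gz, provided the releases folder already exists. Including the version in the file name makes it easier to tell deposits apart if you archive more than one release.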
For more information, there is also a good article on software and related funder requirements provided by the Software Sustainability Institute.
When can I get my DOI?
You will often need the DOI for your dataset while your publication is still in draft form (so you can use it to cite the dataset in the final version of the publication). It is therefore possible that your dataset will not be ready for deposit at the time you need your DOI.
This is fine, and if you contact us we will mint a DOI for you that you can use to cite your dataset. The record that we create in the repository (which will be the DOI 'landing page') will be kept in a 'Review state' until the final dataset has been deposited.
It is therefore imperative that you send us your dataset before any citations are made public so that we can activate the DOI. Otherwise the DOI will not resolve to a webpage that describes your data.