data.ncl reaches half a million views

During the summer months, data.ncl, Newcastle’s research data repository, reached over 500,000 views of datasets and code archived by researchers across the University. Reaching this milestone gives us the opportunity to take stock of how far we have travelled in openly sharing data, given that data.ncl was only launched in spring 2019.

It also indicates the reach data can have when it is archived and becomes findable, searchable and citable. Records in data.ncl have been viewed from as far away as Chile and New Zealand, while the three countries that view and access data most frequently are the USA, the Netherlands and the UK, showing the global and national interest in research created at Newcastle University. In addition to views, data.ncl has enabled 215,000 downloads and preserves over 1200 records for future reuse.

“The long-term archiving and sharing of datasets through data.ncl is a significant part of our support for Open Research. Seeing datasets being viewed, accessed and reused shows there is real value in giving data a second life through data.ncl,” said Professor Candy Rowe, Dean for Research Culture and Strategy. Professor Brian Walker, Pro-Vice-Chancellor for Research Strategy and Resources, added: “Reaching this milestone shows Newcastle University is committed, along with UK government and other research funders, to the conduct of Open Research that is available to and used by as many people as possible for as long as possible”.

All researchers and PGRs can freely archive and publicly share data from their research through data.ncl. Archived data obtains its own DOI (Digital Object Identifier) for inclusion in research outputs, including publications. To help increase engagement and impact, archived data is indexed by Google Scholar and Google Dataset Search. Data collections can also be created to group together data records produced from a project or research theme, and each collection receives its own DOI to increase discovery. A collection can include records of data held in discipline-specific repositories to create a full showcase of the data produced by a research project.

The Research Data Service has reviewed and approved hundreds of datasets and these are a few highlights:

  • The Coral Spawning Database brings together a huge international effort that includes over 90 authors from 60 institutions in 20 countries to openly share forty years of coral data in one place for the first time. The intention is for this database to grow over time so the data isn’t set in stone and can be added to as the research progresses. Dr James Guest said: “Coral reefs have been declining in health for decades and are severely threatened by climate change. It is, therefore, more important than ever to share large datasets on these ecosystems so that they can be used to guide management of reefs in the Anthropocene”. James added: “When we were looking for a suitable data repository for the Coral Spawning Database, data.ncl was the obvious choice because it was so user friendly and has excellent support from the Research Data Service at Newcastle University”.
  • Through the National Lottery Heritage Fund, Dr Nicky Garland mapped and shared a number of features of Hadrian’s Wall including forts, towers, and road systems. The aim was to make the data open and accessible to allow researchers and the wider community to engage with Hadrian’s Wall and its conservation and research. The data records are proving to be very popular and are clearly supporting the aims of the WallCAP project. “In terms of our project decision to use data.ncl – it was a no-brainer! WallCAP will generate a considerable amount of data and we want that data to be readily accessible. Having a secure digital archive that provides DOIs that can be easily incorporated into academic publications is not only convenient, but essential in this era of data-proliferation” said Dr Rob Collis, Project Manager.
  • The Dental Micromotor Handpiece Dataset was one of the first open data examples of Newcastle University responding to the Covid-19 Pandemic. James Allison, Clinical Fellow, explained: “Our project looked at how we can use novel dental drill designs to reduce the amount of aerosol produced during dental procedures. This is important because concerns over transmitting viruses in these aerosols caused dental services to shut down during the Covid-19 pandemic. Our work showed that these drills produce less aerosol and therefore reduce this risk, allowing them to be safely used in dental practices. This also helped dental students get back to treating their patients at the School of Dental Sciences and in other institutions in the UK. We felt it was important to share our data on data.ncl so that it was available to other researchers looking at the same problem, and also to those developing guidance and policy documents to inform their decisions.”
  • Ali Alammer was a PhD researcher who shared his underpinning code for a biologically inspired machine vision model (En-HMAX), which rapidly processes 2D images with minimal computational requirements. Ali explained: “With a hierarchy of only six processing layers, the model was capable of extracting formative and unique representation to objects and scenes. It had also achieved comparable performances to existing state-of-the-art architectures including deep learning. I archived the code for research reproducibility purposes as it has a wide range of applications that includes surveillance and robotic vision.”

The Research Data Service runs data.ncl and supports researchers in planning, managing and sharing research data. For further information please visit the research data management website or contact rdm@ncl.ac.uk.

Newcastle University and Elsevier

The contract between academic publisher Elsevier and UK Universities is due for renewal in December 2021.

Newcastle University subscribes to Elsevier’s ScienceDirect, which enables University members to access Elsevier journals online, at a cost of £1.1 million for the current subscription deal.

The UK Universities sector – on behalf of its researchers and students – entered negotiations with Elsevier with two core objectives: to reduce costs to levels UK universities can sustain, and to provide full and immediate open access to UK research.

Open access to research allows for greater impact, expanding access worldwide and the potential for collaborative work to benefit the national and international research community.

Elsevier is now the only major publisher that does not have a transformative open access agreement in place. Subscription costs to Elsevier’s journals are high and continuing to increase but do not include an open access agreement. Transformative agreements are also supported by cOAlition S research funders and, from April 2022, UKRI’s new policy is similarly supportive.

Therefore, a key practical aim of the negotiations is to secure a transformative agreement with Elsevier, which will support the core objective of immediate open access publishing.

UK Universities began negotiations in March 2021. Representatives from the sector will sit on the official negotiation team and Jisc facilitates the overall negotiations.

Jisc has produced the following video which highlights the key issues involved and has also produced some background information about the negotiations.

The Library will provide more detailed information about the aims of the negotiations and news of any progress over the coming months via this blog and on the Research Services website.

John Williams

Photo credit: King’s Walk June 2013 by John Donoghue.

Research Culture Workshop: Towards Open Research

As part of Newcastle University’s Research Strategy, we are evolving our research culture in collaboration with the whole research community. We invite the research community across career stages, job families, and disciplines, to join this first Skills Academy Research Culture workshop: Towards Open Research.

The workshop will invite participants to consider open research practices and reflect on how they and the university can move towards a culture of more open research. In this workshop, we will consider open research principles and practices that increase transparency and rigour and accelerate the reach of our research.

Open research describes approaches to increase openness throughout the research cycle, including collaborative working, sharing and making research methodology, software, code, data, documentation and publications freely available online under terms that enable their reuse. Open research thereby increases the transparency, rigour and reproducibility of the research process and so can promote inclusivity, accelerate impact and improve public trust.  However, understanding and adopting open research practices can be challenging. This workshop therefore will explore strategies for culture change here at Newcastle University.

Workshop Details

Date: Thursday 30th September, 10.00 – 12.00.
Venue: Online.
Facilitators: Chris Emmerson and Steve Boneham.

Programme

  1. Introduction to open research
  2. Researcher perspectives on open research:
    1. Melissa Bateson – Professor of Ethology – Biosciences
    2. Greg Mutch – NU Academic Track Fellow – Engineering
    3. David Johnson – PhD Researcher – History, Classics and Archaeology
  3. Comfort break
  4. Breakout groups

    To discuss how the university can move towards a culture of open research by considering core aspects of the Center for Open Science strategy for culture change

    1. Systems and tools – what systems and research tools are needed to facilitate open sharing and documentation
    2. Support and training – what research support and training researchers require to undertake open research activities 
    3. Recognition and rewards – how open research behaviours can be encouraged, recognised and rewarded
    4. Policy – the role of policy changes and interventions in bringing about change in open research practices at Newcastle

  5. Reflections and next steps

*** This event is now fully booked. Please email RDM@ncl.ac.uk should you wish to discuss future Open Research events. ***

New UKRI Open Access Policy published

After its long-awaited review, UKRI announced its new open access policy on 6 August. The policy will apply to publications acknowledging UKRI funding and aims to make UKRI-funded research freely available to the public. It aligns with Plan S and the Wellcome Trust open access policy, and there is a strong indication that the policy will align with the open access requirements for the next REF (due to be published in November 2021). UKRI has pledged continued and increased funding to support the implementation of the new policy.

The policy will apply to:

  • Peer-reviewed research articles submitted for publication on or after 1 April 2022
  • Monographs, book chapters and edited collections published on or after 1 January 2024.

Summary of changes

Articles (from 1 April 2022)

  • Must be open access immediately upon publication
  • CC BY licence must apply (with some permitted CC BY-ND exceptions)
  • No embargoes
  • APCs for OA in hybrid journals no longer permitted
  • A data access statement is required (even if there is no data)
  • Biomedical research articles that acknowledge MRC or BBSRC funding are required to be archived in Europe PubMed Central 

Books, book chapters and edited collections (from 1 January 2024)

  • Must be open access within 12 months of publication
  • CC BY licence required
  • Open access can be achieved either by publishing open access or by depositing the Author’s Accepted Manuscript in an institutional repository
  • Images, illustrations, tables and other supporting content should be included in the open access content; however, more restrictive licences can apply to third-party content.

The University will be providing training and guidance before April 2022 to support implementation of the policy.

You can read the full policy documents here: https://www.ukri.org/our-work/supporting-healthy-research-and-innovation-culture/open-research/open-access-policies-review/

Guest post: Why I support the ‘Wide in Opening Access’ approach

In this guest post Jan Deckers, senior lecturer in bioethics at Newcastle University, explains his vision of how a ‘Wide in Opening Access’ approach can allow all quality research to be published.

It is probably safe to assume that most authors like their work to be read.

The traditional model of publishing operates by means of the ‘reader pays principle’. In this model, readers must generally pay either to purchase a book or to subscribe to a journal. They might do neither. However, where readers do not pay themselves, others have to do so for them. Frequently, these others are libraries. However, most libraries that lend books and provide access to journals limit access, frequently requiring the reader to be a member of an institution and/or to pay a subscription to the library.

In the age of the internet, access to published work is much greater than it used to be. Some books are available electronically, and many journals are. In spite of this rapid change, some things stay the same: publishers must still make their money. In order to provide open access to readers, many now demand that authors pay book or article processing charges. This disadvantages authors who seek to publish books and who cannot pay such charges, unless book publishers can rely on third party funds that cover publication costs for authors who cannot pay themselves. Where such funds are not available, other options are available. Authors can still find plenty of publishers who will offer contracts, free of any charge, to those who are able to produce good work. This option exists as many book publishers stand by the traditional model, at least in part because many readers still prefer the experience of reading a tangible book to that of reading a virtual one. Another option is self-publication, where authors can publish books at relatively low cost, essentially by taking on the publishing cost themselves. In sum, whilst open access book publication presents an ethical dilemma where it supports the ‘writer pays principle’, its benefits for readers and the availability of reasonable alternatives for authors who are excluded from publishing in the open access mode make open access book publication, in my view, a relatively sound moral option.

Open access journal publication presents a different challenge. Some journals find themselves in a position where, rather than adopting the ‘writer pays principle’, they are able to get the money from elsewhere, for example from governments and other institutions that are willing and able to pay. This is the ideal scenario and – in the current world – the exception rather than the norm. This is why open access journal publication raises a massive moral challenge: what does one do, for example, when the leading journal in one’s academic specialty decides to become an open access journal that charges authors, where neither the author nor the institution that they may belong to can pay? To address this challenge, the journal may be able to offer free publication to some authors, effectively by elevating the processing fee for authors who are able to pay so that it can cover the cost for authors who are unable to pay. Some journals do this already by offering either a discount or a fee waiver to some authors. The problem is that such discounts may not be sufficient and that the criteria for discounts and waivers frequently are too indiscriminate. For example, offering waivers indiscriminately to authors who are based in particular countries fails to recognise both that those authors might be relatively rich and that authors who live in relatively rich countries might be relatively poor.

The only way that I can see out of this is to ‘de-individualise’ the article processing charge completely. Journals would then be able to publish any article that survives the scrutiny of the peer-review process, regardless of the author’s willingness or ability to pay. Such de-individualisation would also address another concern that I have with the open access journal publishing movement: how can we prevent publishers from publishing work that falls below the academic standard? One might argue that peer review should be able to separate the wheat from the chaff, but the problem is that the publisher is incentivised strongly to turn a blind eye to peer review reports, which – in the worst case – might be biased themselves by the knowledge that the author is willing to pay. 

Journals that are unable to raise enough funds to publish all articles in the open access mode may provide an option for authors who can pay to publish in the open access mode and for other authors to publish in the traditional mode. Many journals now operate in this way, and are therefore known as hybrid journals. I do not consider this option to be ideal as it sets up a two-tier system, where authors who publish in the former mode are likely to enjoy a wider readership. However, it may be preferable to the traditional mode of publication, which is not free from problems either, as it provides access only to readers who can pay themselves or benefit from institutions, such as libraries, that pay for them.

The world that authors, editors, and peer reviewers must navigate is complex. In spite of this complexity, I call upon all to resist any involvement with journals that do not provide authors with the chance to publish good quality work. Whilst I hope that open access journal publishing will become the norm for all articles, I recognise that journals may not be able to publish all articles in the open access mode due to financial constraints. As long as these constraints are there, however, I believe that journals should continue to provide the option of restricted access publication according to the ‘reader pays principle’.

This is why I only publish with and do editorial or peer-reviewing work for journals that adopt what one might call a ‘Wide in Opening Access’ (WOA) approach. It consists in peer-reviewed journals being prepared to publish all articles that survive scientific scrutiny through an appropriate peer-review process, regardless of the author’s ability or willingness to pay. It guarantees that authors who produce good journal articles and who cannot or will not pay are still able to publish. In this sense, it is ‘wide’. It is wide ‘in opening access’ as it fully supports open access publication becoming the norm. Whilst it adopts the view that articles from those who cannot or will not pay should ideally also be published in the open access mode, it recognises that this may not always be possible.

With this blog post I call upon all authors to support the WOA approach in the world of journal publishing. You can do so, for example, by stating your support for it on your website. Without such support, writers who do not have the means either to pay themselves or to mobilise others to pay for them will be left behind in the transition towards greater open access journal publication. Without support for the WOA approach, those without the means to pay to publish will be disadvantaged more than they are already in a world in which the ‘writer pays principle’ is gaining significant traction. To debate the WOA approach as well as other issues in publishing ethics, I created a ‘publishing ethics’ mailing list hosted by Jiscmail.  You can (un)subscribe to this list here

Image credit: Arek Socha from Pixabay

Secondary Data Is Out There

To the credit of researchers across the globe, the amount of data being shared is growing, and this will only increase over time as open research becomes ubiquitous. There are significant benefits to data sharing including increased rigour, transparency, and visibility.

But this post isn’t going to get blogged down in the benefits of data sharing as it is a path well-trodden. Instead, let’s consider that as researchers have been archiving and sharing data in archives and repositories there is a rich source of material that can be accessed, reworked, reanalysed and compared to recent data collections.

This secondary data analysis is a growing area of interest to researchers and funders, with the latter having calls focusing solely on reanalysis of data (e.g. UKRI). Accessing historic data also allows research to be undertaken where costs are prohibitive or where data is impossible or difficult to collect, and may reduce the burden on over-researched populations. Given the continuing challenges of collecting primary data during the pandemic, there might not be a better time to consider what data is already out there.

And it is not only research that can benefit but also teaching and learning. Archived data sources can be accessed to introduce students to a fantastic range of existing data and code. Using secondary data can free students from data collection, allowing them to focus on developing their skills in framing research questions and in analysis.

Based on data from re3data.org, as of April 2021 there are over 2600 data repositories available for researchers to archive data, up from 1000 in November 2013. This isn’t a completely exhaustive list but is close enough to give an idea of the scale. Amongst these is our own data.ncl, which now houses over 1200 datasets shared by university colleagues from across all disciplines and collected using a variety of methods and techniques.

However, finding the right dataset for your latest research project or teaching idea isn’t always straightforward. To help with that I have created guidance on how to find, reuse and cite data on the RDM webpages.

I would also be very keen to hear from users of secondary data to create case studies to inspire colleagues on this approach. If you would be interested in sharing your approach and experience, then please do get in touch.

Image Credit: Franki Chamaki on Unsplash

Open Publishing Week

Photo by Andraz Lazic on Unsplash

Our transformative agreements allow researchers to publish their articles as open access for free in thousands of journals from publishers including Wiley, Springer, T&F, OUP, CUP, BMJ and the Royal Society.

To help familiarise authors with the publishing workflows of these new agreements, we are running an online ‘open publishing week’ where publishers will present details of how the agreements work in practice, explaining what authors should expect at each stage of the publication process.

The scheduled events are:

  • Royal Society (19/07/21, 11.00-12.00)
  • CUP (19/07/21, 14.00-15.00)
  • T&F (20/07/21, 10.00-11.00)
  • Springer (20/07/21, 14.00-15.00)
  • OUP (21/07/21, 11.00-11.00)
  • Wiley (21/07/21, 14.00-15.00)
  • BMJ (22/07/21, 11.00-12.00)

The broader aim of these agreements is to transform all subscription journals to full and immediate open access. You can read more about that in our post ‘Transformative agreements – an easier route to open access’ or talk to us about them at open publishing week.

Guest post: Making Astronomy Research More Reproducible

Chris Harrison is an astronomer and a Newcastle University Academic Track Fellow (NUAct). Here he reflects on the good and bad aspects of reproducible science in observational astronomy and describes how he is using Newcastle’s Research Repository to set a good example. We are keen to hear from colleagues across the research landscape so please do get in touch if you’d like to write a post.

I use telescopes on the ground and in space to study galaxies and the supermassive black holes that lurk at their centres. These observations result in gigabytes to terabytes of data being collected for each project. In particular, when using interferometers such as the Very Large Array (VLA) or the Atacama Large Millimetre Array (ALMA), the raw data can be 100s of gigabytes from just one night of observations. These raw data products are then processed to produce two-dimensional images, one-dimensional spectra or three-dimensional data cubes, which are used to perform the scientific analyses. Although I mostly collect my own data, every so often I have felt compelled to write a paper in which I wanted to reproduce the results from other people’s observational data and their analyses. This has been in situations where the results were quite sensational and appeared to contradict previous results or conflict with my expectations from my understanding of theoretical predictions. As I write this, I have another paper under review that directly challenges previous work. This has been after a year of struggling to reproduce the previous results! Why has this been and what can we do better?

On the one hand, most astronomical observatories have incredible archives where all raw data products ever taken can be accessed by anyone after the proprietary period, typically one year long, has expired (great archive examples are ALMA and the VLA). These data always include comprehensive metadata and are always provided in standard formats so that they can be accessed and processed by anyone with a variety of open access software. However, from painful experience, I can tell you that it is still extremely challenging to reproduce other people’s results based on astronomical observational data. This is due to the many complex steps that are taken to go from the raw data products to a scientific result. Indeed, these are so complex it is basically not possible to adequately describe all steps in a publication. The only real solution for completely reproducible science would be to publicly release processed data products and the codes that were used both to produce these and analyse them. Indeed, I have even requested such products and codes from authors and found that they have been destroyed forever on broken hard drives. As early-career researchers work in a competitive environment and have vulnerable careers, one cannot blame them for wanting to keep their hard work to themselves (potentially for follow-up papers) and to not expose themselves to criticism. The many disappointing reasons why early-career researchers are so vulnerable, and how this damages scientific progress, are too much to discuss here. However, as I am now in an academic track position, I feel more confident to set a good example and hopefully encourage other more senior academics to do the same.

In March 2021 I launched the “Quasar Feedback Survey”, which is a comprehensive observational survey of 42 galaxies hosting rapidly growing black holes. We will be studying these galaxies with an array of telescopes. With the launch of this survey, I uploaded 45 gigabytes of processed data products to data.ncl (Newcastle’s Research Repository), including historic data from pilot projects that led to this wider survey. All information about data products and results can also easily be accessed via a dedicated website. I already know these galaxies, and hence data, are of interest to other astronomers and our data products are being used right now to help design new observational experiments. As the survey continues the data products will continue to be uploaded alongside the relevant publications. The next important step for me is to find a way to also share the codes, whilst protecting the career development of the early career researchers that produced the codes.

To be continued!

Image Credit: C. Harrison, A. Thomson; Bill Saxton, NRAO/AUI/NSF; NASA.

2020 in review: ePrints

Following on from our annual review of data.ncl this post highlights some key statistics from our ePrints repository where researchers share their publications.

Headline stats for 2020

5086 new publication records added (total of 124,957)

2989 new full text publications made available (total 26,582)

289,864 views

33,031 downloads

Our three most viewed publications were:

  1. Agroecosystem management and nutritional quality of plant foods: The case of organic fruits and vegetables
  2. Associations between childhood maltreatment and inflammatory markers
  3. Cars, EVs and battery recycling forecasts and economic models

Author profile pages were also some of our most popular pages, so we’d encourage researchers to keep their publication list up to date.

Adding publications to ePrints makes them eligible for REF, but also means they are more visible and can have more impact. We optimise ePrints for research discovery and syndicate content to aggregation services such as CORE and Unpaywall. That helps people find free versions of research that would otherwise be inaccessible to them as well as making text and data mining more feasible.

Our aim for 2021 is to increase the proportion of research outputs we make open access in ePrints. That will be helped by our new transformative agreements with publishers that make open access free for our authors and by funder policies like that of the Wellcome Trust and Plan S that increasingly mandate this.
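
For a rough sense of where that proportion stands, the headline figures above can be turned into percentages. The short Python sketch below is only illustrative: the variable and function names are our own, and it assumes the "new full text" count can be compared directly with the "new records" count for the same year, which may not hold exactly if full text was added to older records.

```python
# Headline ePrints figures for 2020, copied from the stats listed above.
new_records_2020 = 5086
total_records = 124_957
new_full_text_2020 = 2989
total_full_text = 26_582


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: full text made available in 2020 is compared against
# records added in 2020, although the two sets may not fully overlap.
full_text_share_2020 = share(new_full_text_2020, new_records_2020)
full_text_share_overall = share(total_full_text, total_records)

print(f"Full text as a share of 2020 records: {full_text_share_2020}%")
print(f"Full text as a share of all records: {full_text_share_overall}%")
```

On these figures the share for 2020 comes out at roughly 59%, against roughly 21% across the whole repository, which suggests the proportion is already moving in the intended direction.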

2020 in review: data.ncl

This has been the first full calendar year in which data.ncl has been available for our researchers to archive and share data. And in the spirit of ‘best of 2020’ articles on film, TV shows and music, I have dug into data.ncl’s usage statistics to pull out the headlines.

360 data deposits (718 in total)

118 different researchers archiving data (174 in total)  

154,630 views

47,190 data downloads

Our top three datasets based on views and downloads in 2020 were:

  1. Newcastle Grasp Library
  2. Handwritten Chinese Numbers
  3. EMG and data glove dataset for dexterous myoelectric control

The treemap below shows, unsurprisingly, that the most popular item type uploaded was dataset (72%), followed by figure (15%), with media a distant third (9%).

upload by item type

And the USA was the country that accessed our datasets the most with nearly 100,000 views from the stars and stripes alone.

As we move into 2021, I would love for this growth to continue and to see an increase in numbers across the board but in particular:

  • A greater number of records of datasets where the data is held elsewhere
  • An increase in code and software being archived and shared (currently 3% of all items but we have a GitHub plugin to make it easy to send snapshots to data.ncl)
  • The use of data.ncl as a platform on which to build dashboards that allow data to be manipulated and visualised

Let’s see what 2021 holds for data.ncl and we’ll be here to help archive and share the full variety of data and code from research at Newcastle.