Daniel Nüst – Using Research Compendia

Introduction

Daniel Nüst was the day’s first speaker at our Remote ReproHack. He is based in Germany and works for the o2r project (https://o2r.info), which supports digital curation and geosciences. In this talk he spoke about the benefits of research compendia for open and reproducible work, and in particular for peer review.

Daniel began by quoting archaeologist Ben Marwick of the University of Washington:

‘An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.’

He went on to explain that most journal articles are too brief to include details of many of the methods and decisions made by the researcher, let alone details of the computational environment.

Daniel also outlined the "shape" of researchers. Traditionally, he says, they have been T-shaped, with broad cross-discipline knowledge and an in-depth understanding of their own research area. Modern researchers also need a second area of deep knowledge covering statistics, computation and reproducibility. This makes them π-shaped.

Research Compendia

Research compendia are an elegant way of bringing together all of these building blocks (data, software, documentation and computing environments) so that research outputs go beyond just the paper and become more accessible and reproducible. For organising a research compendium, Daniel had some key suggestions:

  • Stick with the conventions of your peers
  • Keep data, methods and outputs separate
  • Specify your computational environment as clearly as you can
  • Use the R package structure and support tools
  • Use modern tools
  • Don't forget simple formats
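
To make these suggestions concrete, here is a minimal Python sketch of one possible compendium scaffold. It is an illustration only, not a layout Daniel prescribed: the folder names, the `environment.txt` file and the functions `scaffold_compendium` and `record_environment` are all hypothetical, and an R user would more likely follow the R package structure mentioned above.

```python
import sys
import tempfile
from importlib import metadata
from pathlib import Path

# Hypothetical layout: these folder names are illustrative, not a standard.
LAYOUT = ["data/raw", "data/derived", "analysis", "outputs"]


def scaffold_compendium(root: Path, title: str) -> Path:
    """Create separate folders for data, methods and outputs, plus a README."""
    for sub in LAYOUT:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(
            f"# {title}\n\n"
            "- data/raw: original data, never edited by hand\n"
            "- data/derived: data produced by the analysis code\n"
            "- analysis: scripts and notebooks (the methods)\n"
            "- outputs: figures and tables generated by the code\n",
            encoding="utf-8",
        )
    return readme


def record_environment(root: Path) -> Path:
    """Write the interpreter version and installed packages as plain text."""
    packages = sorted(
        {f"{dist.metadata['Name']}=={dist.version}" for dist in metadata.distributions()}
    )
    env_file = root / "environment.txt"
    lines = [f"python=={sys.version.split()[0]}"] + packages
    env_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return env_file


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing in the working tree is touched.
    demo_root = Path(tempfile.mkdtemp()) / "my-compendium"
    print(scaffold_compendium(demo_root, "My paper: research compendium"))
    print(record_environment(demo_root))
```

The plain-text environment listing is deliberately a simple format; a container recipe or a lock file would pin the environment more precisely.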

Once you have a research compendium to accompany your paper, it will benefit your work in several ways. It will:

  • Make your work more transparent
  • Help you garner more credit
  • Make your work more discoverable
  • Make your work easier to reuse
  • Make it easier to collaborate

The final piece of advice from Daniel, and one that was repeated throughout the day, was simple: have a README file. It’s the simplest way to help people access, understand and reproduce your work.

CODECHECK

Daniel then introduced the o2r (https://o2r.info) and CODECHECK (https://codecheck.org.uk/) projects. o2r is building tools to introduce executable research compendia (ERC) into the publication process in the geosciences. CODECHECK is an initiative to make independent execution of the computations underlying research a part of peer review. By introducing the codechecker as an independent role in scholarly review, he and the project's Co-PI Stephen J. Eglen hope to improve the transparency and usefulness of computational research across all sciences, and to give credit to currently undervalued research outputs. The CODECHECK principles are:

  • Codecheckers record but don’t investigate or fix
  • Communication between humans is key
  • Credit is given to codecheckers
  • Workflows must be auditable

Daniel invited all to sign up as codecheckers (https://codecheck.org.uk/get-involved/), as the skills acquired at a ReproHack are just what codecheckers need.

Conclusion

Connecting the two topics, Daniel concluded by pointing out the main benefit of (executable) research compendia for peer review: they make thorough inspection and re-use of all building blocks of research a lot easier!

You can follow Daniel on Twitter: @nordholmen
