About the project
Reproducibility of research results has long been a hot topic amongst scientists.
As science becomes more data- and computationally intensive, it has become harder to reproduce others’ research. Not only is access to the data required, but also access to the same software, operating systems and other tools used.
A Queensland-based team led by the Terrestrial Ecosystem Research Network (TERN) has taken a step towards addressing this issue by developing infrastructure for reproducible science in the form of a virtual desktop, accessed through a web browser, called CoESRA: Collaborative Environment for Ecosystem Science Research and Analysis.
CoESRA, launched in July 2015, provides tools in the cloud and is equipped with the Kepler scientific workflow system and Nimrod software (other software can be added by users).
TERN, in collaboration with UQ's Research Computing Centre and QCIF, developed CoESRA, with funding from TERN and the Australian National Data Service through the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) program. QRIScloud, the Queensland node of Nectar and RDS administered by QCIF, provides CoESRA’s compute and storage infrastructure: the project is using 150 cores of QRIScompute and 5 TB of QRISdata storage.
CoESRA aims to make ecosystem science research reproducible in a form others can repeat with minimal effort by providing easy access to the execution environment and resources to build, execute and share repeatable workflow-based scientific experiments — all without having to download any software or go through a merit-based allocation process of research cloud infrastructure. It enables users to share infrastructure, data and analysis via the cloud, which drastically reduces, or removes, setup costs for others to rerun the experiment. In cloud computing terms, CoESRA is offering users a Platform as a Service through a Desktop as a Service.
“The Kepler workflow system was used due to its support for access to local and remote data, its large pool of reusable components and because it was initially developed for the discipline of ecology, but any tool can be added to the platform based on users’ need,” said project lead Dr Siddeswara Guru, TERN’s Data Integration and Synthesis Manager.
“In the future, we envisage that the reproducible experiments will be made available during the review process of the scientific paper. This will enable reviewers to check the experiment while reviewing the paper with minimal effort.”
Currently, an IUCN (International Union for Conservation of Nature) Red List of Ecosystems Risk assessment workflow for Victorian mountain ash forest is available on CoESRA in a completely reproducible form.
QCIF's Engineering Services division assisted the CoESRA project by stitching together the software that makes up the CoESRA virtual laboratory and then developing the process to create a copy of the virtual lab on a Nectar cloud virtual machine. ANDS funded this work, which was completed in mid-2015.
Gavin Kennedy, manager of QCIF's Engineering Services, said this work highlights that the division “has a strong skillset in creating, integrating and deploying (on NeCTAR) software to support research.”
TERN chose QRIScloud for the project’s compute and data storage because it is a local Queensland cloud provider. “QRIScloud’s support team is good,” said Dr Guru, “and QRIScloud provided flexibility to scale our deployment and resource allocation. QRIScloud was a great supporter of the project.”
RCC’s involvement saw it develop CoESRA’s core services and one of the first two workflows as illustrative use cases. RCC and TERN are continuing to develop the platform with more workflows and are working towards allowing users to execute workflows and interact with them from the Web, in addition to the current desktop interface. Future plans also include enabling easier overseas access and extending CoESRA’s use to a wider range of users.
Each CoESRA session is limited to 48 hours to ensure efficient use of the cloud infrastructure. Users can ask to extend the session if they need to have continuously running jobs on the machine.