Chapter 15 Finale
We have come a long way since we first met Amira, Jun, and Sami in Section 0.2. Amira now has her scripts, datasets and reports organized, in version control, and on GitHub. Her work has already paid off: a colleague spotted a problem in an analysis step that affected five figures in one of her reports. Because Amira has embraced GitHub, the colleague could easily suggest a fix through a pull request. It took Amira some time to regenerate all of the affected figures, because she had to recall which code needed to be rerun. She recognized this as a great reason to embrace Make, and has added implementing it to her to-do list.
Amira used to be intimidated by the Unix shell, but now finds it an essential part of her everyday work: she uses shell scripts to automate data processing tasks, runs command-line tools that she wrote in Python, and issues countless commands to Git. Her comfort with the shell is also helping as she learns from Sami how to run her analyses on a remote computing cluster.
Sami had experience with Git and Make in software projects from their undergraduate studies, but they have a new appreciation for how important these tools are in research projects. They’ve reworked their Git and Make workshops to use examples of data pipelines, and have been getting rave reviews from participants. Sami has also come to appreciate the importance of provenance for both data and code in research. They’ve been helping their users by suggesting ways to make their research more accessible, inspectable and reproducible.
Jun is now working on his first Python package. He’s added better error handling so he and his users get more useful information when something goes wrong, he’s implemented some testing strategies to give him and his users confidence that his code works, and he’s improved his documentation. Jun has also added a license, a Code of Conduct and contributing guidelines to his project repo, and has already received a contribution that fixes some typos in the documentation.
15.1 Why We Wrote This Book
Shell scripts, branching, automated workflows, healthy team dynamics: they all take time to learn, but they enable researchers to get more done in less time and with less pain, and that’s why we wrote this book. The climate crisis is the greatest threat our species has faced since civilization began. The COVID-19 pandemic has shown just how ill-prepared we are to deal with problems of that magnitude, or with the suspicion and disinformation that now poison every public discussion online.
Every hour that a researcher doesn’t waste wrestling with software is an hour they can spend solving a new problem; every meeting that ends early with a clear decision frees up time to explain to the public what we actually know and why it matters. We don’t expect this book to change the world, but we hope that knowing how to write, test, document, package, and share your work will give you a slightly better chance of doing so. We hope you have enjoyed reading what we have written; if so, we would enjoy hearing from you.
- Damien Irving (https://damienirving.github.io/)
- Kate Hertweck (https://katehertweck.com/)
- Luke Johnston (https://lukewjohnston.com/)
- Joel Ostblom (https://joelostblom.com/)
- Charlotte Wickham (https://www.cwick.co.nz/)
- Greg Wilson (https://third-bit.com)
So much universe, and so little time.
— Terry Pratchett