Welcome to the blog on scientific computing at the Max Planck Institute for Evolutionary Biology
In this blog, we summarize the activities in the Scientific Computing Unit. In particular, we cover topics such as
* Tips and Tricks, Dos and Don'ts in scientific computing
* Outlines and summaries of ongoing research and software projects
* Updates and new releases from our software repositories
* Recommended readings, videos, websites
Getting your data into the database is one of the most frequent tasks when working with OMERO. There are two ways to import data into OMERO: via the desktop app OMERO.insight or via the command-line client that comes with OMERO.py. Both software packages can be found on the OMERO download site https://www.openmicroscopy.org/omero/downloads/. While the desktop app is easy and intuitive to use, its drawback is that it must remain open while the data is being uploaded.
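To make the command-line route a bit more concrete, here is a minimal sketch that only assembles an `omero import` call as an argument list. The server name, user name, dataset ID and file name are placeholders, and the option layout (`-s`, `-u`, `-d`) is my assumption about a typical invocation; please check it against `omero import --help` for your installation.

```python
import shlex
from pathlib import Path


def omero_import_command(image, server, user, dataset_id=None):
    """Build the argument list for an `omero import` call (nothing is executed)."""
    # -s and -u select the server and the login name (assumed option layout).
    command = ["omero", "import", "-s", server, "-u", user]
    if dataset_id is not None:
        # -d puts the imported image into an existing dataset, given by its ID.
        command += ["-d", str(dataset_id)]
    command.append(str(Path(image)))
    return command


# Placeholder values, for illustration only.
print(shlex.join(omero_import_command("scan_001.tif", "omero.example.org", "jdoe", dataset_id=42)))
```

On a machine with OMERO.py installed, the returned list could be handed to `subprocess.run(command, check=True)`. Unlike the desktop app, such a call can run inside a terminal multiplexer or a batch job.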
[Read More]
Programming Courses at MPI Evolutionary Biology
As in previous years, some postdocs and staff scientists offer courses on computing, programming, data analysis, visualization, and related topics.
Requirements: Some courses require that participants already have a certain level of experience and knowledge. Before signing up, please assess for yourself whether you feel comfortable with these requirements. If in doubt, please contact the person responsible for the course. During the course, there will not be enough time to bring everybody up to the expected level before starting the course program.
[Read More]
Jupyter lab tutorial
On May 5 & 6, 2021, I took part in the workshop “Kompetenz Forschungsdatenmanagement” organized by the Max Planck Digital Library. Day 2 featured a full session on “Reproducible Science with Jupyter” with a presentation by Hans Fangohr (slides available here) followed by an interactive hands-on tutorial. In part 1 of the tutorial, I step through a data analysis workflow based on the Johns Hopkins University COVID-19 dataset from GitHub. Parts 2 and 3 are about Bayesian inference of SIR model parameters, kindly provided by Johannes Zierenberg from the MPI for Dynamics and Self-Organization.
[Read More]
Dask and Jupyter
Parallel Python with Dask and Jupyter
The Dask framework provides an incredibly useful environment for parallel execution of Python code in interactive settings (e.g. Jupyter) or in batch mode. Its key features are (from what I’ve seen so far):
* Representation of threading, multiprocessing, and distributed computing with one unified API and CLI
* Abstraction of HPC schedulers (PBS, Moab, SLURM, …)
* Data structures for distributed computing with pandas and numpy syntax
Dask-jobqueue
The package dask_jobqueue seems to me to be the most user-friendly option when it comes to parallelization on HPC clusters with a scheduling system such as SLURM.
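As a rough illustration of how dask_jobqueue hides the scheduler, the sketch below requests SLURM workers and maps a toy function over a list. The resource values, the helper names (`slurm_cluster_options`, `run_on_slurm`, `square`) and the default of two jobs are assumptions made for this example. The Dask imports are deferred into the function so the module also loads on a machine without Dask installed.

```python
def slurm_cluster_options(cores=4, memory_gb=16, walltime="01:00:00", queue=None):
    """Collect the per-job resources as keyword arguments for SLURMCluster."""
    options = {"cores": cores, "memory": f"{memory_gb}GB", "walltime": walltime}
    if queue is not None:
        # Partition name; site specific, so it is only set when given.
        options["queue"] = queue
    return options


def square(x):
    """Toy workload standing in for a real computation."""
    return x * x


def run_on_slurm(values, n_jobs=2, **resources):
    """Map `square` over `values` on workers started through SLURM."""
    # Deferred imports: needed only when a cluster is actually requested.
    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(**slurm_cluster_options(**resources))
    cluster.scale(jobs=n_jobs)  # submit n_jobs worker jobs to the scheduler
    client = Client(cluster)
    try:
        futures = client.map(square, values)
        return client.gather(futures)
    finally:
        client.close()
        cluster.close()
```

On a login node with dask_jobqueue installed, `run_on_slurm(range(10), n_jobs=2, cores=2, memory_gb=8)` would submit two worker jobs and return the squared values once the workers have started.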
[Read More]
Research Software Development Workshop 2020
On Dec. 10 & 11, Nikoleta Glynatsi and I ran our first workshop on “Research Software Development”. Without wanting to praise myself: judging from the feedback, it was a big success. For two days, we taught best practices in writing software (exemplified with Python), using git for version control, collaborating on GitLab projects, and employing GitLab’s built-in continuous integration tools to run automated tests and build a reference manual.
All material from the workshop, including all presentations and code examples, is available under the terms of the MIT License from this gitlab repository: https://gitlab.
[Read More]
Running MATLAB code on HPC with SLURM
Running MATLAB scripts on HPC
Today, the question came up of how to run MATLAB code on an HPC system with a SLURM scheduler. The syntax for running MATLAB on the command line is indeed a bit counterintuitive, at least if you are (like me) used to running Python or R scripts.
Example SLURM script
The following snippet is an example of how to submit a MATLAB script for execution on an HPC server with the SLURM scheduler:
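The full script sits behind the link below. As a rough sketch of what such a submission can look like, the following Python helper renders a SLURM batch script as text. The job name, the resource values, the output file pattern and the `module load matlab` line are assumptions for illustration; module names in particular differ from site to site.

```python
def matlab_slurm_script(script, job_name="matlab_job", cpus=1,
                        walltime="01:00:00", module="matlab"):
    """Render a SLURM batch script that runs one MATLAB script without a GUI."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={walltime}",
        "#SBATCH --output=%x-%j.out",  # %x = job name, %j = job ID
        "",
        f"module load {module}",  # assumed module name, site specific
        # -r takes MATLAB statements, not a file name; the trailing `exit`
        # makes MATLAB quit instead of waiting at its prompt.
        f"matlab -nodisplay -nosplash -r \"run('{script}'); exit\"",
        "",
    ]
    return "\n".join(lines)


print(matlab_slurm_script("analysis.m", cpus=4))
```

Writing the returned text to a file, say `job.sh`, and running `sbatch job.sh` would submit it. Recent MATLAB releases also offer `matlab -batch`, which exits on its own when the statement finishes.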
[Read More]
CancerSim paper
This one comes a bit late but for the sake of completeness:
Our software paper on 2D stochastic cancer simulations with CancerSim is out: doi:10.21105/joss.02436.
Converting Jupyter notebooks with embedded images to PDF
Inserting images in a jupyter notebook is just drag and drop:
This will automagically produce the image link at the drop position.
And after executing the cell, the image is rendered.
So far so good. But have you ever tried to convert a notebook with embedded images to PDF or HTML (slides)?
My first guess was: Menu -> File -> Export Notebook As -> PDF.
However, this immediately runs into an Error 500, which traces back to LaTeX not being able to locate the image attachment:5444.
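One possible workaround, sketched here under my own assumptions and not necessarily the route taken in the full post, is to write the embedded attachments to ordinary image files and rewrite the `attachment:` links before converting. The function name `externalize_attachments` and the image directory are made up for this example. The sketch relies only on the notebook JSON layout, where a cell's `attachments` entry maps a file name to a MIME type and a base64 payload.

```python
import base64
import re
from pathlib import Path

# Matches the "(attachment:name)" target of a markdown image link.
ATTACHMENT_LINK = re.compile(r"\(attachment:([^)\s]+)\)")


def externalize_attachments(notebook, image_dir):
    """Write cell attachments to image_dir and point the links at the files."""
    image_dir = Path(image_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    for cell in notebook.get("cells", []):
        attachments = cell.pop("attachments", None)
        if not attachments:
            continue
        for name, bundle in attachments.items():
            # bundle maps a MIME type to base64 data; use its first entry.
            payload = next(iter(bundle.values()))
            (image_dir / name).write_bytes(base64.b64decode(payload))

        def to_file_link(match):
            name = match.group(1)
            if name not in attachments:
                return match.group(0)  # leave unknown links untouched
            return f"({(image_dir / name).as_posix()})"

        source = cell["source"]
        was_list = isinstance(source, list)
        text = "".join(source) if was_list else source
        text = ATTACHMENT_LINK.sub(to_file_link, text)
        # Keep the original representation (list of lines or single string).
        cell["source"] = text.splitlines(keepends=True) if was_list else text
    return notebook
```

A notebook loaded with `json.load` can be passed through this function, saved under a new name with `json.dump`, and then converted with `jupyter nbconvert --to pdf`. The image paths are written exactly as given in `image_dir`, so that directory should be chosen relative to where the converted notebook lives.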
[Read More]
Research-Software Development Workshop
Our Research Software Development workshop (run by Nikoleta Glynatsi and me) starts tomorrow. All code and presentations are available here: https://gitlab.gwdg.de/glynatsi/rsd-workshop.
Rent-a-Scientist 2020
I just finished my lecture at Heinrich-Heine-Schule Büdelsdorf on “Evolution und Künstliche Intelligenz” (“Evolution and Artificial Intelligence”). Slides are here.