#ReproducibleResearch

Last year at the Open Science Retreat (#OSR24NL) I was introduced to nanopubs by @egonw and created my first nanopub, declaring citations for a paper using CiTOs (citation ontologies).

Now, travelling to #OSR25CH with plenty of time planned in because of issues with the train network, I used the opportunity and created a new one ( https://w3id.org/np/RAicq7k9QHX8EG8ho7Baib9GxHsU18O0tyFCZ4tbbGGPA ) for my latest publication on teaching #Snakemake on #HPC systems.

The teaching material is, again, in desperate need of additions and an overhaul, but that is for another day.

#OpenScience #ReproducibleResearch #AcademicChatter

**Ludovic Courtès** @civodul@toot.aquilenet.fr · Apr 11

If you’re into #ReproducibleResearch & #OpenScience, don’t miss this MOOC
https://scholar.social/@khinsen/114314471616473600

Lots of practical tools for computational reproducibility, including the unequaled #Guix :-), all this brought to you by experts in the field, starting with @khinsen.

Konrad Hinsen (@khinsen@scholar.social) on Scholar Social: The second session of the MOOC "Reproducible Research II: Practices and tools for managing computations and data" will be open from 5 May to 10 September 2025. Sign up now: https://www.fun-mooc.fr/en/courses/reproducible-research-ii-practices-and-tools-for-managing-comput/ It covers advanced topics in reproducible computation: massive data, complex calculations, management of computational environments. 1/2

**Simon Tournier** @zimoun@sciences.re · Apr 7

https://i4replication.org
https://www.stata.com/manuals/m-5luinv.pdf

Reading report 182 by the Institute for Replication:

« While demand estimates perfectly replicate, the production function estimates of Hong and Luparello (2024) differ slightly from those in Orr (2022) to the second decimal.

Hong and Luparello (2024) argue this due to numerical instability of STATA’s matrix inversion when recovering marginal cost, which I believe is correct.

The documentation for the particular command used for the marginal cost inversion — luinv in STATA — notes that different computers can give different results when a matrix is close to singular. »

OK, now I would like to inspect the numerical method behind `luinv`.

The doc reads « This function uses the MKL LAPACK by default. » … but in some cases Netlib LAPACK could also be used. Hmm?!

Which LAPACK version? Compiled using which options? In other words, which computational environment?
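
To make the quoted point about near-singular matrices concrete, here is a minimal Python sketch. It is not Stata's `luinv` and does not call LAPACK: it uses the closed-form 2×2 inverse, and the matrix, the perturbation size, and the function names are all assumptions chosen for illustration. It only shows how a change of about 1e-15 in one entry, roughly the scale of a rounding difference, can shift the inverse by a much larger relative amount.

```python
def inverse_2x2(a, b, c, d):
    """Invert [[a, b], [c, d]] with the closed-form adjugate formula."""
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def relative_change(eps=1e-10, delta=1e-15):
    """Relative change in the top-left inverse entry when one input moves by delta.

    The matrix [[1, 1], [1, 1 + eps]] is close to singular for small eps.
    Both eps and delta are illustrative values, not taken from the report.
    """
    reference = inverse_2x2(1.0, 1.0, 1.0, 1.0 + eps)
    perturbed = inverse_2x2(1.0, 1.0, 1.0, 1.0 + eps + delta)
    return abs(perturbed[0][0] - reference[0][0]) / abs(reference[0][0])


if __name__ == "__main__":
    change = relative_change()
    print(f"input perturbation: 1e-15, relative change in inverse: {change:.2e}")
```

With these illustrative values the relative change in the inverse is roughly 1e-5, many orders of magnitude larger than the perturbation itself, which is why the exact library, version, and build options can matter for such a computation.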

**Simon Tournier** @zimoun@sciences.re · Apr 7

https://en.wikipedia.org/wiki/Stata
https://www.stata.com/order/new/gov/single-user-licenses/dl/

Do you know Stata?

« a general-purpose statistical proprietary software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers in many fields, including biomedicine, economics, epidemiology, and sociology. »

Guess how much it costs? About USD 1000 per year, for the government/nonprofit single-user option. Bang!

And guess what?

For this price, you only get the “fast” option, not “twice as fast”, nor “almost four times as fast”, nor “even faster”. Haha!

Crazy to read that this is still where we are in 2025 with #ReproducibleResearch and #OpenScience in some academic fields…

**Simon Tournier** @zimoun@sciences.re · Apr 4

https://dx.doi.org/10.2139/ssrn.4790780

« Mass Reproducibility and Replicability: A New Hope »

Reproducing at scale, through « reproducible games », to change the norms… A thought-provoking approach from the Institute for Replication!

https://i4replication.org

**Simon Tournier** @zimoun@sciences.re · Apr 4

https://gt-env-logiciels.gricad-pages.univ-grenoble-alpes.fr/sandbox-notecards

As part of the French Reproducible Research network, the “Computational Environment” Working Group has just opened the collaborative writing of « notecards ».

Software: identify what? use where? develop how?

The idea is to have something like a compass for finding one's way through the existing questions and resources around “software” and “reproducibility”.

How to contribute? Proofreading, feedback, resources, writing, … Any ideas you have!

The repository: https://gricad-gitlab.univ-grenoble-alpes.fr/gt-env-logiciels/sandbox-notecards

Link preview, “Reproducible softwares environments” (work in progress): publication of notecards from the “software” working group of the French reproducible research network. Scope and goals of the group: reproducibility of the computational process (excluding data); listing the available resources; introducing the difficulties and pointing toward solutions. On identification: we read source code but we run a binary, so how to cite? What to cite? What to describe? What about the dependency tree, and which dependencies (build vs. run time)? And the composition (workflow)?

**Simon Tournier** @zimoun@sciences.re · Apr 4

https://jrfrr-2025.sciencesconf.org/?lang=fr
#ReproducibleResearch #OpenScience

What is always fascinating about the French Reproducible Research Network days is the interdisciplinarity.

Yesterday, after opening talks on editorial practices, there were presentations on MRI image analysis and preclinical research.

Then a discussion about geomatics with @NRoelandt over coffee and pralines.

This morning, a presentation on Open Science in prehistoric archaeology, followed by one on astrophysics.

A fine stimulation through cross-pollination.

**Ludovic Courtès** @civodul@toot.aquilenet.fr · Apr 3

As of 2019, less than 25% of the papers in ecology & evolution came with their data; less than 20% came with their code. Ouch.

#ReproducibleResearch

**Simon Tournier** @zimoun@sciences.re · Apr 3

https://jrfrr-2025.sciencesconf.org/?lang=fr
#ReproducibleResearch #OpenScience

Happy to be in Lyon for the French Reproducible Research Network days.

A rich programme!

**Eric R. Scott** @LeafyEricScott@fosstodon.org · Apr 1

I recently did a live demo of a "reproducibility audit" for @us_rse. Check it out: https://www.youtube.com/watch?v=Q2ZsLbBkWrk

YouTube: “US-RSE Code Review WG Demo: Reproducibility Audit”, by US Research Software Engineer Association

#ReproHack #reproducibleresearch #codeReview

**Yann Büchau** @nobodyinperson@fosstodon.org · Mar 24

Ugh, #SnakeMake apparently has a hard requirement that all input files (and previously made output files) be present on disk. Can't even do a dry run with `snakemake -n`.

#rdm #reproducibleresearch

**Christian Meesters** @rupdecat@fediscience.org · Mar 12

#OpenScience #ReproducibleComputing #ReproducibleResearch

Today is the day of closed pull requests for #Snakemake. The #SnakemakeHackathon2025 participants worked at full speed!

We decided to write a white paper summarizing our achievements rather than posting individual things. Suffice it to say that the documentation, too, made a great leap towards better readability!

**Christian Meesters** @rupdecat@fediscience.org · Mar 10

#SnakemakeHackathon2025 ! We started!

At CERN for better #ReproducibleComputing and #ReproducibleResearch.

Photo: most of the hackathon participants gathered at CERN.

**Simon Tournier** @zimoun@sciences.re · Mar 4

https://blog.namisunami.com/b38e5ff5

« Revisiting my PhD dissertation after 4 years
or Why I wanted to make my dissertation reproducible »

Regret 5: Not keeping track of dependencies

« Dependencies are not only about the R packages. Some R packages require certain software to be installed on the OS, which are called system requirements. For example, the ggplot2 package requires clang++ (C++ compiler), which usually comes with an operating system. The situation gets complex when installing a package with multiple package dependencies that require different system tools. For example, installing the kableExtra package requires the svgLite and xml2 packages that require libpng and libxml2 as system requirements, respectively. So, I had to deal with the system requirements of the 240+ packages in addition to installing those packages. The process was time-consuming. »

Maybe a perfect job for #Guix?

#OpenScience #ReproducibleResearch @nsunami

**Neil** @nshephard@fosstodon.org · Mar 4

Applying the FAIR Principles to computational workflows

https://doi.org/10.1038/s41597-025-04451-9

#fair4rs #fair #reproducibleresearch

**Christian Meesters** @rupdecat@fediscience.org · Feb 25

#ReproducibleResearch #reproduciblecomputing

I continue to find it disturbing when new #HPC cluster users explicitly instruct a program to use only one core/CPU and then complain that the cluster is so slow, slower than their basement server.

Usually they do not spot their mistake on their own.

But THIS is actually NOT the disturbing part: such users also tend to always use default parameters. This might or might not be the sensible thing to do for their problem. Also, when reading papers, software parameterization is frequently not reported.

We have a long way to go.
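
One way to make parameterization reportable is to have a tool write out every effective setting, defaults included. The sketch below does this with Python's `argparse`; the program name and the options `--threads` and `--min-quality`, along with their defaults, are hypothetical and do not belong to any real program.

```python
import argparse
import json


def build_parser():
    """Command line of a hypothetical analysis tool (option names are made up)."""
    parser = argparse.ArgumentParser(prog="analysis-tool")
    parser.add_argument("--threads", type=int, default=1,
                        help="number of worker threads")
    parser.add_argument("--min-quality", type=float, default=20.0,
                        help="quality threshold")
    return parser


def effective_parameters(argv):
    """Return all settings as a dict, including defaults the user never typed."""
    args = build_parser().parse_args(argv)
    return vars(args)


def parameter_report(argv):
    """Serialize the effective settings so they can be archived with the results."""
    return json.dumps(effective_parameters(argv), indent=2, sort_keys=True)


if __name__ == "__main__":
    # The user only sets --threads; --min-quality silently stays at its default.
    print(parameter_report(["--threads", "8"]))
```

Saving such a report next to the output makes it possible to state in a paper which values were actually used, even when they were defaults.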

**Simon Tournier** @zimoun@sciences.re · Feb 11

https://hpc.guix.info/blog/2025/02/guix-hpc-activity-report-2024/

break? #Guix #HPC activity report is out!

Check it out:

The seventh report, time flies! Highlights of key achievements:

• More than 57,000 packages, Guix + related channels for scientific packages;
• Performance and portability: more about MPI;
• ROCm/HIP software stack for AMD GPUs upgrades;
• Migration of Guix-Science channel to Codeberg;
• New version of Guix-Jupyter;
• Common Workflow Language #CWL + Guix: ccwl and ravanan;
• Ensuring Source Code Availability: #SoftwareHeritage rescue mode;
• Re-Deploying Software from the Past;
• Supporting Artifact Evaluation at SuperComputing 2024;
• “Package to Container image” conversion pipeline, #DIAMOND and #DIADEM;
• Digital Electronics Design;
• Reproducible Multiphysics Simulation and Workflows;
• Impact of Hardware Variability;
• Toward Guix on French Tier-1 and EuroHPC Tier-0 machines;
• Pangenome Genetics Research;
• Supporting RISC-V;
• List of articles, talks, tutorials, events, training sessions;
• … and more…

What a year, isn’t it?
