Stoner stochastic graphing overhaul by weshinsley · Pull Request #37 · vimc/stoner

weshinsley · 2026-04-30T16:17:24Z

A pretty horrible PR really with a lot of churn to achieve some specific things. It might be best to just review the finished stochastic_file and stochastic_graph files (and maybe read the vignette) - and not treat this as a diff, as it's so messy,

Anywyay - it achieves the following, which kind of had to come in one go, or something else would break...

stochastic_file has a new command stone_stochastic_make_meta which makes a meta.csv in the root of the stochastic files (\wpia-hn2.hpc.dide.ic.ac.uk\vimc_stochastics), giving meta-data about the groups, diseases, touchstones, scenarios, outcomes and countries, so we know what we've got at a glance, rather than having to do a nasty trawl.
stochastic_explorer (the shiny app) uses the meta file instead of having to do exactly that nasty trawl, which saves about 20 seconds on startup, and makes everything a bit less sluggish generally. Some churn in utils.R with how it finds the info.
stochastic_graph can now produce two graphs at once with the same y-axis scale, in order to compare estimates in different touchstones, or by different modellin groups (keeping all other selections the same) - with the additional possibility of being able to plot the difference (often called impact) between two scenarios. This also was quite a churny update.
stochastic_graph can also plot with age as the x-axis, and filter which years to reduce - as well as plotting by time on the x-axis, and choosing which ages to reduce. Basically this is all flexed up.
All the above is available from the shiny app....
And written up in the vignette.

codecov · 2026-05-01T20:06:34Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (4d10598) to head (ac7cbb7).
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff             @@
##           master       #37      +/-   ##
===========================================
+ Coverage   98.01%   100.00%   +1.98%     
===========================================
  Files          23        22       -1     
  Lines        2568      2640      +72     
===========================================
+ Hits         2517      2640     +123     
+ Misses         51         0      -51

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

pratikunterwegs

I've put down some comments. I wasn't able to run the tests due to an error in a script somewhere. I won't hold up merging this as GHA checks seem to pass, and this seems to work for @kgaythorpe overall - or at least she's approved it.

Some design notes: I would try going for a separation of concerns and split off data processing from plotting in stone_stochastic_graph. It might be good to break this fn up into smaller ones too.

I wonder whether it might be worth shifting to {ggplot2}-based plotting as well.

pratikunterwegs · 2026-05-11T02:13:25Z

+##' @importFrom utils write.csv
+##' @param path The root folder of the stochastic data.
+
+stone_stochastic_make_meta <- function(path) {


Could we check that path exists and is write-able? checkmate::test_path_for_output() or something similar would work.

pratikunterwegs · 2026-05-11T02:18:37Z

+    scenarios <- unique(unlist(lapply(files, `[[`, 2)))
+    df <- data.frame()
+    for (scenario in scenarios) {
+      matches <- files[unlist(lapply(files, `[[`, 2)) == scenario]
+      countries <- unique(unlist(lapply(matches, `[[`, 3)))
+      countries <- gsub(".pq", "", countries)


Could we get a comment or a section in the fn docs with what's intended here?

Added some comments throughout that section...

pratikunterwegs · 2026-05-11T02:23:02Z

+
+stone_stochastic_make_meta <- function(path) {
+
+  explore_files <- function(touchstone, folder, disease, group) {


Could we get docs on the expectations for these args? How are these injected into this scope? Is the expectation that they're present in the global scope?

It might be worth adding defaults to the fn signature - perhaps drawing from package constants?

And perhaps some input checking so users know how and why their fn calls fail would be great.

explore_files and touchstone_meta are only called from the code further down in stone_stochastic_make_meta - the user won't ever call them.

stone_stochastic_make_meta assumes the stochastic file share is perfectly curated (which is an interesting assumption), so it can work out all of these things automatically from the folder and file names.

I'll do some more commenting to clarify

pratikunterwegs · 2026-05-11T02:25:27Z

+    first <- file.path(path, touchstone, folder, files[1])
+    ds <- arrow::open_dataset(first)
+    outcomes <- ds$schema$names
+    outcomes <- outcomes[!outcomes %in% c("run_id", "disease", "year", "age",


I see a candidate for a package constant...

pratikunterwegs · 2026-05-11T03:54:53Z

+  }
+
+  touchstones <- basename(list.dirs(paste0(path, "/"), recursive = FALSE))
+  res <- data.table::rbindlist(lapply(touchstones, touchstone_meta))


Could we apply this pattern - rbindlist() applied to a functional over a list, to the loop above too? No real reason other than it's nice.

Yes, why not. It's tidier...

pratikunterwegs · 2026-05-11T04:56:42Z

+stone_stochastic_graph <- function(base,
+                                   touchstones, disease, groups, country,
+                                   scenarios, outcome,
+                                   xaxis = "time",
+                                   filter = NULL,
                                   by_cohort = FALSE, log = FALSE,
                                   packit_id = NULL, packit_file = NULL,
+                                   include_stochastics = TRUE,
                                   include_quantiles = TRUE,
                                   include_mean = TRUE,
-                                   include_median = TRUE,
-                                   scenario2 = NULL) {
+                                   include_median = TRUE) {


Could we get input checks on these args, esp. touchstones, disease etc.?

Semi-meaningful errors come out a bit later at the "file-not-found" level - but yes, I'll see if I can add something a bit better...

pratikunterwegs · 2026-05-11T05:12:47Z

+##' A feature-rich-ish graph plotting function, which can show the stochastic
+##' line for an outcome for different runs, the mean, median and quantiles.
+##' If two scenarios are specified, then the burden difference between
+##' scenarios is plotted. Additionally, if multiple groups or multiple


Would it be better to split the single scenario and scenario comparison functions into two? Might make it more modular.

The differences in code between the single/comparison graphs ends up very slight - just y-axis/title labelling, a different line in get_burden_difference, so... not sure...

pratikunterwegs · 2026-05-11T05:23:46Z

+
+stone_stochastic_graph <- function(base,


Could we get a @return tag saying what this fn returns? Seems to be a list with base graph elements. I'm not against base plotting per se, but if the list elements are going to be saved or re-used it seems like a good use case for {ggplot2} and gg-class elements.

Yep - @return tag done - it is indeed a list of exactly 1 or 2 base plots, which is a bit odd. ggplot2 and combining the graphs in a panel would probably be the nicer way, but ggplot isn't quite in my hitting zone yet, so I may resort to some more skilled ggplotters for some snazzy plots.

In the meantime, up to two base plots, stochastic_explorer can then replay the plots to put them separately in the place it wants. RStudio plots the last (or only) one, so the user has to call plot again, and can only see them one at a time, which is a bit non-ideal.

For now, I've documented it, and maybe a later PR to fix it more properly.

pratikunterwegs · 2026-05-11T05:32:05Z

I see that this file presents pre-canned outputs stored in ../figures/, which is fine if these are generated from data that doesn't live in the pkg and/or can't be easily imported - or can it? Either/or? One option would be to add some example data as package data, which would also be a good templtate when developing more functions.

Yeah, for now I plotted examples using real data in the file share and canned the results; I think I probably wouldn't want to have any genuine model data in the package, but yes, could manufacturer some interesting enough fake data which would make the vignette a better test - and would mean I wouldn't have to re-can if we snazz the plots up with ggplot2! Perhaps will push the discussion to the same time as ggplot2 improvements.

pratikunterwegs · 2026-05-11T05:34:21Z

Overall I'd push for the data processing and plotting to be separated. Functions downstream of data processing could be coded as pipe-friendly, so that operations can be chained together in vignettes etc.

pratikunterwegs · 2026-05-12T06:52:52Z

Thanks @weshinsley - happy to sit with VIMC folks and go over needs for this, for a future PR ofc.

weshinsley added 10 commits April 27, 2026 16:12

Create/use metadata for stochastic folder

e3229a5

Toggle stochastic lines

1b75e1d

Impact -> Burden Diff

a330901

Refactor graphs for y-axis and add age-based

87ee15c

Cleanup docs

1e6279b

Omit X and X.1 cols in standardise

4bbc072

Update meta creation

e71cf0d

Plot more outcomes

51fe628

Fix trailing comma in 2023 script

1ebd446

Fix packet tests

5f1face

weshinsley added 3 commits May 2, 2026 17:07

Fix tests

ac7cbb7

Export/document make_meta

61dde98

Add Rd

76ab53c

weshinsley marked this pull request as ready for review May 5, 2026 16:20

Update vignette

09896a5

weshinsley changed the title ~~Vimc 9252~~ Stoner stochastic graphing overhaul May 6, 2026

kgaythorpe approved these changes May 6, 2026

View reviewed changes

weshinsley requested review from EmmaLRussell and pratikunterwegs May 6, 2026 14:21

pratikunterwegs reviewed May 11, 2026

View reviewed changes


		stone_stochastic_make_meta <- function(path) {

		explore_files <- function(touchstone, folder, disease, group) {


		stone_stochastic_graph <- function(base,

Conversation

weshinsley commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

pratikunterwegs left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

weshinsley May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

weshinsley May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pratikunterwegs commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

weshinsley commented Apr 30, 2026 •

edited

Loading

codecov Bot commented May 1, 2026 •

edited

Loading

weshinsley May 11, 2026 •

edited

Loading

weshinsley May 11, 2026 •

edited

Loading