Fixing package infrastructure and making output portable for amRml#24
Fixing package infrastructure and making output portable for amRml#24amcim wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
@amcim I reviewed and pushed a small follow-up commit. The @export tags on retrieveMetadata and retrieveGenomes weren't actually exporting because devtools::document() hadn't been re-run, so I regenerated NAMESPACE. I also added R/imports.R with a roxygen @importFrom tag for data.table :=, otherwise the next document() run would have dropped the importFrom line from NAMESPACE.
One thing still open related to the parquet portability fix: The basename() change drops the absolute paths from the view definitions (good), but file_search_path only sticks while the connection is open, so a fresh dbConnect() doesn't remember it. So when amRml reopens the .duckdb, the views can't find the parquets and it errors out (confirmed with a small repro). I wasn't sure whether this requires additional changes in this PR, especially considering where you are taking the architecture with ExperimentHubs and Nextflow, so I'm just flagging this potential issue here to bring it to your attention in case it does warrant some resolution at this point.
I forgot to write in the notes of this PR that this PR works together with PR 22 in amRml (I noted this on amRml's side) JRaviLab/amRml#22 In that PR I added 4 lines of code that will resolve this as you say. Good catch on the @importFrom tag. When the amRml PR is approved ill merge both at the same time |
Description
amRdata did not successfully run on a user install. This PR resolves 2 bugs relating to package infrastructure and relativizes the filepath for the duckdb that will be used for amRml to quickly query the parquet output.
What kind of change(s) are included?
Added
R/imports.Rwhich includes importing the:=operator fromdata.tableAdded
@exportflags to 2 functionsretrieveMetadataandretrieveGenomes. These functions are being called in the vignette so a user should have access to themMade the parquet-backed DuckDB portable: views now reference parquet files by base filename, and the build sets file_search_path to resolve them during schema inference.
amRmlmust callDBI::dbExecute(con, sprintf("SET file_search_path='%s'", dirname(parquet_duckdb_path)))after dbConnect, and a PR inamRmlwas made that does precisely that.Feature (adds or updates new capabilities)
Bug fix (fixes an issue).
Enhancement (adds functionality).
Breaking change (these changes would cause existing functionality to not work as expected).
Checklist
Please ensure that all boxes are checked before indicating that this pull request is ready for review.