admin管理员组

文章数量:1023782

I have a first package where I use a dataset downloaded and stored in cache on the fly via BiocFileCache that it's used as a reference , called a bit everywhere in my code, but that the user can also modify it.

A second developed package, based on the first one will also make use of this reference.

I found as a convenient solution to use nested environments for this problem but it didn't pass Bioconductor good practices. I use an environment declared under the empty environment on the top of all packages loaded.

First package declare this environment variables in a zzz.R file:

.onLoad <- function(...) {

    nameEnv <- "My-Env"
    myEnv <- new.env(parent = emptyenv()) 
    attach(myEnv, name = nameEnv)
    assign("nameEnv", nameEnv, envir = as.environment(nameEnv))

    assign("Var-1", "/some/dir/here" , envir = as.environment(nameEnv))
}

In other parts (files) of the code, I sometimes use the following to retrieve content of this variable:

Var1 <- get("Var-1")

and I can decide to rewrite its content:

assign("Var-1", 
        "/some/dir/elsehwhere", envir = as.environment(nameEnv)

This mechanism works.

Major comment of the reviewer was:

  1. If you are using environment variables (or even options), there should be no need to use assign. Only call Sys.getenv. The user will set the value of the environment variable on their systems.
  2. You should not need to use get if all your variables are defined within the function environment and in that case, you simply call the object by name.

I don't know how to correct that without using assign or Sys.getenv and follow good pratices of development.

Update : I wrote Sys.getenv whereas it should have been get. Sorry.

I have a first package where I use a dataset downloaded and stored in cache on the fly via BiocFileCache that it's used as a reference , called a bit everywhere in my code, but that the user can also modify it.

A second developed package, based on the first one will also make use of this reference.

I found as a convenient solution to use nested environments for this problem but it didn't pass Bioconductor good practices. I use an environment declared under the empty environment on the top of all packages loaded.

First package declare this environment variables in a zzz.R file:

.onLoad <- function(...) {

    nameEnv <- "My-Env"
    myEnv <- new.env(parent = emptyenv()) 
    attach(myEnv, name = nameEnv)
    assign("nameEnv", nameEnv, envir = as.environment(nameEnv))

    assign("Var-1", "/some/dir/here" , envir = as.environment(nameEnv))
}

In other parts (files) of the code, I sometimes use the following to retrieve content of this variable:

Var1 <- get("Var-1")

and I can decide to rewrite its content:

assign("Var-1", 
        "/some/dir/elsehwhere", envir = as.environment(nameEnv)

This mechanism works.

Major comment of the reviewer was:

  1. If you are using environment variables (or even options), there should be no need to use assign. Only call Sys.getenv. The user will set the value of the environment variable on their systems.
  2. You should not need to use get if all your variables are defined within the function environment and in that case, you simply call the object by name.

I don't know how to correct that without using assign or Sys.getenv and follow good pratices of development.

Update : I wrote Sys.getenv whereas it should have been get. Sorry.

Share Improve this question edited Nov 19, 2024 at 19:23 ZheFrench asked Nov 19, 2024 at 9:07 ZheFrenchZheFrench 1,2073 gold badges25 silver badges49 bronze badges 2
  • FYI, “environment variables” are something completely different, and unrelated to what you are doing. – Konrad Rudolph Commented Nov 19, 2024 at 9:30
  • Your usage of get() is unnecessary: instead of Var1 <- get("Var-1") you can write Var1 <- `Var-1` . And if you use a syntactically valid variable name you don’t need the backticks either. – Konrad Rudolph Commented Nov 19, 2024 at 9:32
Add a comment  | 

1 Answer 1

Reset to default 5

The real issue in your code is the use of attach(). attach() modifies global state (specifically via the search path), and that’s (with a few exceptions) not allowed in packages — both on CRAN and on Bioconductor.

Luckily this seems to be completely unnecessary for your purposes anyway. Here’s what you could/should do instead:

  1. Create the environment inside your package’s namespace (let’s call this package A).
  2. Refer to that environment by its variable name instead of via as.environment()
  3. Create functions for reading and writing values inside that environment, and export these functions so that they can be used from other packages.

Here’s a rough outline:

store = new.env(parent = emptyenv())

.onLoad = function (lib, pkg) {
  store$var1 = "/some/dir/here"
}

#' @export
read = function (name) {
  get(name, envir = store)
}

#' @export
write = function (name, value) {
  assign(name, value, envir = store)
}

And now you can for example use A::read("var1").

You could also export the store environment itself, to allow users to write A::store$var1; however, R’s syntax forbids direct assignment via mystore::store$var1 = "new value" (this is arguably (…) a bug in the R interpreter but fixing it has been deemed to complex, with too little usefulness). So for assignment you’d need to either create a temporary alias:

store = A::store
store$var1 = "new value"

… or resort to using assign().

I have a first package where I use a dataset downloaded and stored in cache on the fly via BiocFileCache that it's used as a reference , called a bit everywhere in my code, but that the user can also modify it.

A second developed package, based on the first one will also make use of this reference.

I found as a convenient solution to use nested environments for this problem but it didn't pass Bioconductor good practices. I use an environment declared under the empty environment on the top of all packages loaded.

First package declare this environment variables in a zzz.R file:

.onLoad <- function(...) {

    nameEnv <- "My-Env"
    myEnv <- new.env(parent = emptyenv()) 
    attach(myEnv, name = nameEnv)
    assign("nameEnv", nameEnv, envir = as.environment(nameEnv))

    assign("Var-1", "/some/dir/here" , envir = as.environment(nameEnv))
}

In other parts (files) of the code, I sometimes use the following to retrieve content of this variable:

Var1 <- get("Var-1")

and I can decide to rewrite its content:

assign("Var-1", 
        "/some/dir/elsehwhere", envir = as.environment(nameEnv)

This mechanism works.

Major comment of the reviewer was:

  1. If you are using environment variables (or even options), there should be no need to use assign. Only call Sys.getenv. The user will set the value of the environment variable on their systems.
  2. You should not need to use get if all your variables are defined within the function environment and in that case, you simply call the object by name.

I don't know how to correct that without using assign or Sys.getenv and follow good pratices of development.

Update : I wrote Sys.getenv whereas it should have been get. Sorry.

I have a first package where I use a dataset downloaded and stored in cache on the fly via BiocFileCache that it's used as a reference , called a bit everywhere in my code, but that the user can also modify it.

A second developed package, based on the first one will also make use of this reference.

I found as a convenient solution to use nested environments for this problem but it didn't pass Bioconductor good practices. I use an environment declared under the empty environment on the top of all packages loaded.

First package declare this environment variables in a zzz.R file:

.onLoad <- function(...) {

    nameEnv <- "My-Env"
    myEnv <- new.env(parent = emptyenv()) 
    attach(myEnv, name = nameEnv)
    assign("nameEnv", nameEnv, envir = as.environment(nameEnv))

    assign("Var-1", "/some/dir/here" , envir = as.environment(nameEnv))
}

In other parts (files) of the code, I sometimes use the following to retrieve content of this variable:

Var1 <- get("Var-1")

and I can decide to rewrite its content:

assign("Var-1", 
        "/some/dir/elsehwhere", envir = as.environment(nameEnv)

This mechanism works.

Major comment of the reviewer was:

  1. If you are using environment variables (or even options), there should be no need to use assign. Only call Sys.getenv. The user will set the value of the environment variable on their systems.
  2. You should not need to use get if all your variables are defined within the function environment and in that case, you simply call the object by name.

I don't know how to correct that without using assign or Sys.getenv and follow good pratices of development.

Update : I wrote Sys.getenv whereas it should have been get. Sorry.

Share Improve this question edited Nov 19, 2024 at 19:23 ZheFrench asked Nov 19, 2024 at 9:07 ZheFrenchZheFrench 1,2073 gold badges25 silver badges49 bronze badges 2
  • FYI, “environment variables” are something completely different, and unrelated to what you are doing. – Konrad Rudolph Commented Nov 19, 2024 at 9:30
  • Your usage of get() is unnecessary: instead of Var1 <- get("Var-1") you can write Var1 <- `Var-1` . And if you use a syntactically valid variable name you don’t need the backticks either. – Konrad Rudolph Commented Nov 19, 2024 at 9:32
Add a comment  | 

1 Answer 1

Reset to default 5

The real issue in your code is the use of attach(). attach() modifies global state (specifically via the search path), and that’s (with a few exceptions) not allowed in packages — both on CRAN and on Bioconductor.

Luckily this seems to be completely unnecessary for your purposes anyway. Here’s what you could/should do instead:

  1. Create the environment inside your package’s namespace (let’s call this package A).
  2. Refer to that environment by its variable name instead of via as.environment()
  3. Create functions for reading and writing values inside that environment, and export these functions so that they can be used from other packages.

Here’s a rough outline:

store = new.env(parent = emptyenv())

.onLoad = function (lib, pkg) {
  store$var1 = "/some/dir/here"
}

#' @export
read = function (name) {
  get(name, envir = store)
}

#' @export
write = function (name, value) {
  assign(name, value, envir = store)
}

And now you can for example use A::read("var1").

You could also export the store environment itself, to allow users to write A::store$var1; however, R’s syntax forbids direct assignment via mystore::store$var1 = "new value" (this is arguably (…) a bug in the R interpreter but fixing it has been deemed to complex, with too little usefulness). So for assignment you’d need to either create a temporary alias:

store = A::store
store$var1 = "new value"

… or resort to using assign().

本文标签: rProperly use nested environments in packages for BioconductorStack Overflow