Obtains the size of one of several predefined corpora. The total size of a corpus is needed for calculating term frequencies: for a given corpus, the frequency of any term can range from 0 to the corpus size.

corpus_size(
  corpus = c("taxon-annotations", "taxon-variation", "annotated-taxa",
    "gene-annotations", "genes", "states")
)

Arguments

corpus

The name of the corpus. See term_freqs() for allowed values.

Value

The total size of the specified corpus as an integer number.
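
Because the corpus size bounds the frequency of any term, it can serve as the denominator when turning a raw count into a relative frequency. The following is a minimal sketch of that arithmetic. It hard-codes the size shown for "taxon-variation" in the Examples on this page rather than querying the KB, and `term_count` is a made-up value used only for illustration.

```r
# Corpus size as shown for corpus_size("taxon-variation") in the Examples;
# hard-coded here so the sketch runs without a network connection.
corpus_total <- 798L

# Hypothetical raw count for some term in that corpus (assumed value).
term_count <- 42L

# A raw count must lie between 0 and the corpus size.
stopifnot(term_count >= 0L, term_count <= corpus_total)

# Relative frequency: the count as a fraction of the corpus size.
rel_freq <- term_count / corpus_total
rel_freq
```

In practice, `corpus_total` would come from a call such as corpus_size("taxon-variation").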

Details

Corpus sizes are cached for the duration of a session once they have been obtained. Thus, if the Phenoscape KB changes, the session must be restarted for those changes to be reflected.
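
The caching behavior described above can be pictured with a small session-level cache. This is only an illustrative sketch and not the package's actual implementation: `size_cache`, `fetch_size()`, `cached_size()`, and the returned value 1000 are all hypothetical stand-ins.

```r
# Hypothetical session-level cache; NOT the package's actual implementation.
size_cache <- new.env(parent = emptyenv())

# Counts how often the (simulated) remote lookup is performed.
n_fetches <- 0L

# Stand-in for a remote KB query; returns a fixed, made-up size.
fetch_size <- function(corpus) {
  n_fetches <<- n_fetches + 1L
  1000L
}

# Look the size up in the cache, fetching it only on the first request.
cached_size <- function(corpus) {
  if (!exists(corpus, envir = size_cache, inherits = FALSE)) {
    assign(corpus, fetch_size(corpus), envir = size_cache)
  }
  get(corpus, envir = size_cache, inherits = FALSE)
}

first  <- cached_size("states")  # triggers the simulated fetch
second <- cached_size("states")  # served from the cache, no new fetch
n_fetches
```

Since a cache like this lives only in the memory of the running session, a value stored in it stays fixed until the session ends, which is why a restart is needed to pick up changes in the KB.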

Examples

corpus_size("taxon-variation")
#> [1] 798
corpus_size("annotated-taxa")
#> [1] 6533
corpus_size("states")
#> [1] 28461
corpus_size("genes")
#> [1] 27454