nexml_drop_otu drops OTUs (taxa) from a nexml object. Currently, none of the
OTUs to be dropped may be used in a tree that is part of the nexml object. If
any are, first drop them from the tree(s) (for example, using drop.tip() from
package "ape"), then replace the tree(s). Dropping OTUs not only removes them
from the OTUs block, but also drops the matrix rows that reference them. This
may in turn leave some characters unused, so it is recommended to drop unused
characters afterwards.
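The tree-pruning step mentioned above might look like the following minimal sketch. It assumes package "ape" is installed (the code is skipped otherwise) and uses a made-up four-tip tree with hypothetical tip labels A to D. Writing the pruned tree back into the nexml object is not shown.

```r
# Minimal sketch: prune a tip from a tree before dropping the matching OTU.
# The Newick string and the tip labels are made up for illustration.
pruned_tips <- NULL
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::read.tree(text = "((A,B),(C,D));")
  # Remove the tip corresponding to the OTU that is about to be dropped
  pruned <- ape::drop.tip(tree, "C")
  pruned_tips <- pruned$tip.label
  print(pruned_tips)
}
```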
nexml_drop_char drops characters from a nexml object. Note that dropping
characters may make some OTUs unused if it results in empty rows and empty
rows are pruned (see parameter pruneRows). It is therefore recommended to drop
unused OTUs afterwards.
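Why dropping characters can leave OTUs unused may be easier to see on a toy example. The sketch below uses a plain base R matrix as a stand-in for a characters matrix (it is not a nexml object, and all names are hypothetical), with NA meaning that no cell exists for that OTU and character.

```r
# Toy stand-in for a characters matrix: rows are OTUs, columns are characters,
# NA means there is no cell for that OTU/character pair.
m <- matrix(c("0", NA,  "1",
              NA,  "1", NA,
              "1", NA,  "0"),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("otu1", "otu2", "otu3"),
                            c("char1", "char2", "char3")))
otus <- rownames(m)

# Drop one character (column); drop = FALSE keeps the result a matrix.
m2 <- m[, colnames(m) != "char2", drop = FALSE]

# Prune rows that have no cells left (what pruneRows = TRUE is described as doing).
has_cells <- rowSums(!is.na(m2)) > 0
m2 <- m2[has_cells, , drop = FALSE]

# OTUs that no remaining row references have become unused.
unused_otus <- setdiff(otus, rownames(m2))
print(unused_otus)
```

Here otu2 only had a cell for the dropped character, so its row disappears and the OTU is left without any reference, which is the situation a follow-up call with filter = is_unused_otu is meant to clean up.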
is_unused_char is a filter function for nexml_drop_char for dropping unused
characters.
is_unused_otu is a filter function for nexml_drop_otu for dropping unused
OTUs.
nexml_drop_otu(nexml, filter, at = NA, block = 1, ...)
nexml_drop_char(
  nexml,
  filter,
  at = NA,
  block = 1,
  ...,
  pruneStates = TRUE,
  pruneRows = TRUE
)
is_unused_char(charList, ...)
is_unused_otu(otuList, ...)
nexml: the nexml object from which to drop parts.
filter: logical, or a function returning a logical. If the latter, it will be
passed the list of selected values for filtering (see parameter at), and any
additional parameters (see ...). The logical vector selects the list elements
to drop from the given block (see block).
at: character, selects the values for filtering if filter is a function. If NA
(the default), a list of the respective objects (otu and char, respectively)
is passed. If a string and the value matches a slot of the objects (such as
"id", "label"), a list of those slot values is passed. Otherwise, it is
assumed to be a metadata property for those objects, and a list of the
respective metadata values is selected (with NA for objects that do not have a
matching metadata annotation).
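The interplay of filter and at can be illustrated with a small base R mock. This is only a sketch of the documented behaviour, not the package implementation: the otu stand-ins are plain lists, the helpers select_values(), drop_matching() and ids() are hypothetical, and the taxon identifiers are made up.

```r
# Hypothetical stand-ins for otu objects (plain lists, not RNeXML classes).
otus <- list(
  list(id = "t1", label = "Ictalurus punctatus",
       meta = list("dwc:taxonID" = "VTO_A")),
  list(id = "t2", label = "Ictalurus sp.", meta = list()),
  list(id = "t3", label = "Ameiurus natalis",
       meta = list("dwc:taxonID" = "VTO_B"))
)

# Hypothetical helper mimicking the documented selection rules for `at`.
select_values <- function(objs, at = NA) {
  if (is.na(at)) return(objs)                      # NA: pass the objects themselves
  lapply(objs, function(o) {
    if (at %in% c("id", "label")) return(o[[at]])  # slot value
    value <- o$meta[[at]]                          # otherwise a metadata property
    if (is.null(value)) NA else value              # NA if there is no annotation
  })
}

# Hypothetical helper: resolve `filter` to a logical vector, then drop matches.
drop_matching <- function(objs, filter, at = NA, ...) {
  if (is.function(filter)) filter <- filter(select_values(objs, at), ...)
  stopifnot(is.logical(filter), length(filter) == length(objs))
  objs[!filter]
}

ids <- function(objs) vapply(objs, function(o) o$id, character(1))

# Filter on a slot value; fixed = TRUE makes the dot match literally.
by_label <- drop_matching(otus, function(x) grepl(" sp.", x, fixed = TRUE),
                          at = "label")
# Filter on a metadata value: drop objects without a dwc:taxonID annotation.
by_meta <- drop_matching(otus, function(x) is.na(unlist(x)),
                         at = "dwc:taxonID")
# A plain logical vector can be given instead of a function.
by_logical <- drop_matching(otus, c(TRUE, FALSE, FALSE))

print(ids(by_label))
print(ids(by_meta))
print(ids(by_logical))
```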
block: integer, the index of the block to drop from (an OTUs block or a characters block, respectively).
...: for nexml_drop_otu and nexml_drop_char, additional parameters to be
passed on to the filter function.
For is_unused_char and is_unused_otu, if used outside of serving as the value
for the filter parameter, a parameter characters with the characters block (or
list thereof), and/or a parameter nexml with the nexml object, must be
provided. If only nexml is passed, all characters blocks are assumed.
is_unused_otu also accepts ignoreTrees (default is FALSE). If set to TRUE,
trees will be ignored for determining whether an OTU is unused. Otherwise, if
nexml is not passed to is_unused_otu, a warning will be issued that it cannot
check trees for determining the use of OTUs.
pruneStates: logical, whether to also prune (i.e., drop) states that are no longer used after dropping the selected characters. The default is TRUE.
pruneRows: logical, whether to also prune (i.e., drop) matrix rows that have become empty (i.e., have no cells left) after dropping the selected characters. The default is TRUE.
charList: a list of char objects. This will be passed as the values to filter on.
otuList: a list of otu objects. This will be passed as the values to filter on.
The functions for dropping components return a nexml object with the filtered
components dropped.
is_unused_char returns a logical vector with TRUE for each char (character)
object that is not used in the matrix (or matrices) of the characters
block(s), and FALSE otherwise.
is_unused_otu returns a logical vector with TRUE for each otu object that is
used neither in the matrix (or matrices) of the characters block(s) nor,
unless ignoreTrees = TRUE, for a node of a tree, and FALSE otherwise.
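The rule for when an OTU counts as unused can be sketched in a few lines of base R. The helper unused_otus() and the identifier vectors below are hypothetical stand-ins for illustration. The real function works on nexml components rather than on character vectors.

```r
# Hypothetical helper illustrating the documented rule: an OTU is unused if no
# matrix row references it and, unless ignoreTrees = TRUE, no tree node does.
unused_otus <- function(otu_ids, row_otus, node_otus, ignoreTrees = FALSE) {
  used <- otu_ids %in% row_otus
  if (!ignoreTrees) used <- used | (otu_ids %in% node_otus)
  !used
}

otu_ids   <- c("t1", "t2", "t3", "t4")
row_otus  <- c("t1", "t2")   # OTUs referenced by matrix rows
node_otus <- c("t2", "t3")   # OTUs referenced by tree nodes

with_trees    <- unused_otus(otu_ids, row_otus, node_otus)
without_trees <- unused_otus(otu_ids, row_otus, node_otus, ignoreTrees = TRUE)
print(with_trees)
print(without_trees)
```

In this toy data t3 is referenced only by a tree node, so it counts as used by default and as unused once trees are ignored.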
nex <- RNeXML::nexml_read(system.file("examples", "ontotrace-result.xml", package = "rphenoscape"))
# drop by label matching
nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label")
#> A nexml object representing:
#> 1 phylogenetic tree block(s), where:
#> block 1 contains 0 phylogenetic tree(s)
#> 1 character block(s), where:
#> block 1 defines 3 standard/discrete character(s), with 2 states each
#> and 2 polymorphic or uncertain state(s) defined
#> matrix 1 has 8 row(s)
#> 8 taxonomic units in 1 block(s)
#> Taxa: Ictalurus punctatus, Ictalurus mexicanus, Ictalurus australis, Ictalurus balsanus, Ictalurus pricei, Ictalurus furcatus ...
#> Metadata annotations:
#> 18 at top level
#> 9 in block 1 at otu level
#> 3 in block 1 at char level
#> 0 in block 1 at state level
#>
#> Original description: Generated from the Phenoscape Knowledgebase on 2015-10-21 by Ontotrace query:
#> * taxa: <http://purl.obolibrary.org/obo/VTO_0036217>
#> * entities: <http://purl.obolibrary.org/obo/BFO_0000050> some <http://purl.obolibrary.org/obo/UBERON_0008897>
#> Modified with: nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label")
#>
#> Author(s): , lapp
#>
#> NeXML generated by RNeXML using schema version: 0.9
#> Size: 209.6 Kb
# can pipe dropping and then removing potentially unused characters or OTUs:
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
nexml_drop_char(nex, filter = function(x) grepl("pelvic", x), at = "label") %>%
  nexml_drop_otu(filter = is_unused_otu)
#> A nexml object representing:
#> 1 phylogenetic tree block(s), where:
#> block 1 contains 0 phylogenetic tree(s)
#> 1 character block(s), where:
#> block 1 defines 2 standard/discrete character(s), with 2 states each
#> and 1 polymorphic or uncertain state(s) defined
#> matrix 1 has 8 row(s)
#> 8 taxonomic units in 1 block(s)
#> Taxa: Ictalurus punctatus, Ictalurus mexicanus, Ictalurus australis, Ictalurus balsanus, Ictalurus pricei, Ictalurus furcatus ...
#> Metadata annotations:
#> 33 at top level
#> 9 in block 1 at otu level
#> 2 in block 1 at char level
#> 0 in block 1 at state level
#>
#> Original description: Generated from the Phenoscape Knowledgebase on 2015-10-21 by Ontotrace query:
#> * taxa: <http://purl.obolibrary.org/obo/VTO_0036217>
#> * entities: <http://purl.obolibrary.org/obo/BFO_0000050> some <http://purl.obolibrary.org/obo/UBERON_0008897>
#> Modified with: nexml_drop_char(nex, filter = function(x) grepl("pelvic", x),
#> at = "label")
#> Modified with: nexml_drop_otu(., filter = is_unused_otu)
#>
#> Author(s): , lapp, lapp
#>
#> NeXML generated by RNeXML using schema version: 0.9
#> Size: 205.8 Kb
if (FALSE) {
  nex <- get_ontotrace_data(taxon = "Ictaluridae",
                            entity = "fin", variable_only = FALSE)
  # ontotrace results store VTO IRIs in dwc:taxonID annotations:
  nexml_drop_otu(nex,
                 filter = function(x) !is_descendant("Ictalurus", x),
                 at = "dwc:taxonID") %>%
    nexml_drop_char(filter = is_unused_char)
  # anatomy IRIs are in obo:IAO_0000219 ("denotes") annotations:
  nexml_drop_char(nex,
                  filter = function(x)
                    !is_descendant("paired fin", x, includeRels = "part_of"),
                  at = "obo:IAO_0000219") %>%
    nexml_drop_otu(filter = is_unused_otu)
}