nexml_drop_otu drops OTUs (taxa) from a nexml object. Currently, none of the OTUs to be dropped may be used in a tree that is part of the nexml object. If any are, first drop them from the tree(s) (for example, using drop.tip() from package "ape"), then replace the tree(s) in the nexml object. Dropping OTUs removes them from the OTUs block and also removes the matrix rows that reference them. This may in turn leave some characters unused, so it is recommended to drop unused characters afterwards.

nexml_drop_char drops characters from a nexml object. Note that dropping characters may leave some OTUs unused if it results in empty rows and empty rows are pruned (see parameter pruneRows). It is therefore recommended to drop unused OTUs afterwards.
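To make the effect of row pruning concrete, here is a minimal base-R sketch. It does not use a nexml object: the matrix is stood in for by a plain list of rows, and drop_chars, the OTU and character ids, and the state names are all hypothetical, chosen only for illustration. It is not how the package implements pruning.

```r
# Stand-in for a characters matrix: one entry per row (OTU), holding only
# the cells that exist, keyed by character id. All ids and states are made up.
rows <- list(
  otu1 = c(char1 = "s1", char2 = "s2"),
  otu2 = c(char2 = "s3"),
  otu3 = c(char1 = "s1")
)

# Hypothetical helper mimicking the documented pruneRows behaviour.
drop_chars <- function(rows, dropIds, pruneRows = TRUE) {
  # remove the cells that refer to the dropped characters
  rows <- lapply(rows, function(cells) cells[!(names(cells) %in% dropIds)])
  # optionally remove rows that have no cells left
  if (pruneRows) rows <- rows[lengths(rows) > 0]
  rows
}

pruned <- drop_chars(rows, "char2")
names(pruned)    # otu2 no longer has a row: its only cell referred to char2

kept <- drop_chars(rows, "char2", pruneRows = FALSE)
lengths(kept)    # otu2 is still present, as an empty row

# States that no remaining cell refers to (compare parameter pruneStates):
orphaned <- setdiff(unlist(rows, use.names = FALSE),
                    unlist(pruned, use.names = FALSE))
orphaned
```

In this sketch otu2 loses its row once char2 is dropped, which is the situation in which a follow-up call with is_unused_otu as the filter becomes useful.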

is_unused_char is a filter function for nexml_drop_char for dropping unused characters.

is_unused_otu is a filter function for nexml_drop_otu for dropping unused OTUs.

nexml_drop_otu(nexml, filter, at = NA, block = 1, ...)

nexml_drop_char(
  nexml,
  filter,
  at = NA,
  block = 1,
  ...,
  pruneStates = TRUE,
  pruneRows = TRUE
)

is_unused_char(charList, ...)

is_unused_otu(otuList, ...)

Arguments

nexml

the nexml object from which to drop parts

filter

logical, or a function returning a logical. If the latter, it will be passed the list of selected values for filtering (see parameter at), and any additional parameters (see ...). The logical vector selects the list elements to drop from the given block (see block).

at

character, selects the values for filtering if filter is a function. If NA (the default), a list of the respective objects (otu or char) is passed. If a string and the value matches a slot of the objects (such as "id", "label"), a list of those slot values is passed. Otherwise, it is assumed to be a metadata property for those objects, and a list of the respective metadata values is selected (with NA for objects that do not have a matching metadata annotation).

block

integer, the respective block index (for OTUs and character blocks, respectively).

...

for nexml_drop_otu and nexml_drop_char, additional parameters to be passed on to the filter function.

For is_unused_char and is_unused_otu, if called directly rather than passed as the value of the filter parameter, a parameter characters with the characters block (or list thereof), and/or a parameter nexml with the nexml object, must be provided. If only nexml is passed, all characters blocks are assumed.

is_unused_otu also accepts ignoreTrees (default is FALSE). If set to TRUE, trees will be ignored for determining whether an OTU is unused. Otherwise, if nexml is not passed to is_unused_otu, a warning will be issued that it cannot check trees for determining the use of OTUs.

pruneStates

logical, whether to also prune (i.e., drop) states that are no longer used after dropping the selected characters. The default is TRUE.

pruneRows

logical, whether to also prune (i.e., drop) matrix rows that have become empty (i.e., have no cells left) after dropping the selected characters. The default is TRUE.

charList

a list of char objects. This will be passed as the values to filter on.

otuList

a list of otu objects. This will be passed as the values to filter on.
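The way filter and at combine into a logical selection can be sketched in base R. The snippet below is only an approximation under stated assumptions: otu objects are replaced by plain lists, metadata is kept in a made-up meta field, the taxon IRIs are placeholders, and resolve_drop is a hypothetical helper rather than a function of the package.

```r
# Stand-ins for otu objects: plain lists with "id", "label" and a made-up
# "meta" field for metadata annotations. The IRIs are placeholders.
otus <- list(
  list(id = "otu1", label = "Ictalurus punctatus",
       meta = list("dwc:taxonID" = "http://example.org/taxon/1")),
  list(id = "otu2", label = "Ictalurus sp.",
       meta = list()),
  list(id = "otu3", label = "Ictalurus furcatus",
       meta = list("dwc:taxonID" = "http://example.org/taxon/3"))
)

# Hypothetical helper: turn filter and at into a logical "drop" vector.
resolve_drop <- function(objs, filter, at = NA, ...) {
  if (is.logical(filter)) return(filter)   # a logical filter is used as is
  values <- if (is.na(at)) {
    objs                                   # pass the objects themselves
  } else if (all(vapply(objs, function(o) at %in% names(o), logical(1)))) {
    lapply(objs, function(o) o[[at]])      # at names a field of the objects
  } else {
    lapply(objs, function(o) {             # otherwise look up metadata,
      v <- o$meta[[at]]                    # with NA where there is none
      if (is.null(v)) NA else v
    })
  }
  filter(values, ...)                      # extra arguments go to the filter
}

# Select by label, by a metadata property, and by an explicit logical vector
by_label <- resolve_drop(otus,
                         function(x) grepl(" sp.", unlist(x), fixed = TRUE),
                         at = "label")
by_meta  <- resolve_drop(otus, function(x) is.na(unlist(x)),
                         at = "dwc:taxonID")
by_index <- resolve_drop(otus, c(TRUE, FALSE, FALSE))

# TRUE marks an element to drop, so the retained elements are otus[!by_label]
remaining <- vapply(otus[!by_label], function(o) o$id, character(1))
remaining
```

Note that TRUE selects an element for dropping, not for keeping, which matches the description of filter above.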

Value

The functions for dropping components return a nexml object with the filtered components dropped.

is_unused_char returns a logical vector with TRUE for each char (character) object that is not used in the matrix (or matrices) of the characters block(s), and FALSE otherwise.

is_unused_otu returns a logical vector with TRUE for each otu object that is used neither in the matrix (or matrices) of the characters block(s) nor, unless ignoreTrees = TRUE, by a node of a tree, and FALSE otherwise.
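The rule for is_unused_otu can be illustrated with set operations in base R. This is a simplified sketch, not the package's implementation: unused_otu_sketch and its parameters matrixOtus and treeOtus are hypothetical, and OTUs are represented by made-up id strings instead of otu objects.

```r
# Made-up OTU ids, and the ids referenced by matrix rows and by tree nodes
otu_ids     <- c("otu1", "otu2", "otu3", "otu4")
matrix_otus <- c("otu1", "otu2")
tree_otus   <- c("otu2", "otu3")

# Hypothetical helper: TRUE for every OTU that nothing refers to.
unused_otu_sketch <- function(ids, matrixOtus, treeOtus = NULL,
                              ignoreTrees = FALSE) {
  used <- matrixOtus
  if (!ignoreTrees) {
    # without tree information, use in trees cannot be checked
    if (is.null(treeOtus)) warning("no tree information; cannot check trees")
    used <- union(used, treeOtus)
  }
  !(ids %in% used)
}

with_trees    <- unused_otu_sketch(otu_ids, matrix_otus, tree_otus)
without_trees <- unused_otu_sketch(otu_ids, matrix_otus, tree_otus,
                                   ignoreTrees = TRUE)
with_trees      # only otu4 is unused
without_trees   # otu3 also counts as unused once trees are ignored
```

Here otu3 is referenced only by a tree node, so whether it counts as unused depends entirely on ignoreTrees.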

Examples

nex <- RNeXML::nexml_read(system.file("examples", "ontotrace-result.xml", package = "rphenoscape"))
# drop by label matching (the pattern is a regular expression, so "." matches any character)
nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label")
#> A nexml object representing:
#>  	 1 phylogenetic tree block(s), where:
#> 	   block 1 contains 0 phylogenetic tree(s) 
#>  	 1 character block(s), where:
#> 	   block 1 defines 3 standard/discrete character(s), with 2 states each 
#> 	    and  2 polymorphic or uncertain state(s) defined 
#> 	   matrix 1 has 8 row(s) 
#>  	 8 taxonomic units in 1 block(s) 
#>   Taxa:	 Ictalurus punctatus, Ictalurus mexicanus, Ictalurus australis, Ictalurus balsanus, Ictalurus pricei, Ictalurus furcatus ...
#>   Metadata annotations: 
#> 	18 at top level 
#> 	9 in block 1 at otu level 
#> 	3 in block 1 at char level 
#> 	0 in block 1 at state level
#> 
#> Original description: Generated from the Phenoscape Knowledgebase on 2015-10-21 by Ontotrace query:
#> * taxa: <http://purl.obolibrary.org/obo/VTO_0036217>
#> * entities: <http://purl.obolibrary.org/obo/BFO_0000050> some <http://purl.obolibrary.org/obo/UBERON_0008897>
#> Modified with: nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label")
#> 
#> Author(s): , lapp
#> 
#> NeXML generated by RNeXML using schema version: 0.9 
#> Size: 209.6 Kb 

# can pipe dropping and then removing potentially unused characters or OTUs:
library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
nexml_drop_char(nex, filter = function(x) grepl("pelvic", x), at = "label") %>%
  nexml_drop_otu(filter = is_unused_otu)
#> A nexml object representing:
#>  	 1 phylogenetic tree block(s), where:
#> 	   block 1 contains 0 phylogenetic tree(s) 
#>  	 1 character block(s), where:
#> 	   block 1 defines 2 standard/discrete character(s), with 2 states each 
#> 	    and  1 polymorphic or uncertain state(s) defined 
#> 	   matrix 1 has 8 row(s) 
#>  	 8 taxonomic units in 1 block(s) 
#>   Taxa:	 Ictalurus punctatus, Ictalurus mexicanus, Ictalurus australis, Ictalurus balsanus, Ictalurus pricei, Ictalurus furcatus ...
#>   Metadata annotations: 
#> 	33 at top level 
#> 	9 in block 1 at otu level 
#> 	2 in block 1 at char level 
#> 	0 in block 1 at state level
#> 
#> Original description: Generated from the Phenoscape Knowledgebase on 2015-10-21 by Ontotrace query:
#> * taxa: <http://purl.obolibrary.org/obo/VTO_0036217>
#> * entities: <http://purl.obolibrary.org/obo/BFO_0000050> some <http://purl.obolibrary.org/obo/UBERON_0008897>
#> Modified with: nexml_drop_char(nex, filter = function(x) grepl("pelvic", x), 
#>     at = "label")
#> Modified with: nexml_drop_otu(., filter = is_unused_otu)
#> 
#> Author(s): , lapp, lapp
#> 
#> NeXML generated by RNeXML using schema version: 0.9 
#> Size: 205.8 Kb 

if (FALSE) {
nex <- get_ontotrace_data(taxon = "Ictaluridae",
                          entity = "fin", variable_only = FALSE)
# ontotrace results store VTO IRIs in dwc:taxonID annotations:
nexml_drop_otu(nex,
               filter = function(x) !is_descendant("Ictalurus", x),
               at = "dwc:taxonID") %>%
  nexml_drop_char(filter = is_unused_char)
# anatomy IRIs are in obo:IAO_0000219 ("denotes") annotations:
nexml_drop_char(nex,
               filter = function(x)
                 !is_descendant("paired fin", x, includeRels = "part_of"),
               at = "obo:IAO_0000219") %>%
  nexml_drop_otu(filter = is_unused_otu)
}