R/mutual_exclusivity.R
mutually_exclusive.Rd
Determines whether the data in the KB include evidence for mutual exclusivity between all distinct pairs of phenotypes in the provided list.
mutually_exclusive(phenotypes, studies = NULL, progress_bar = FALSE)
character or phenotype, a vector of phenotype IDs, or phenotype objects for which one wants to determine the mutual exclusivity.
character, an optional vector of study IDs, for determining mutual exclusivity based only on the evidence from a particular set of studies. By default, all studies in the KB are considered.
logical, whether to print progress of the function.
WARNING: setting progress_bar to TRUE clears the R console by executing the cat('\014') command before printing the progress.
A list consisting of a matrix and a dataframe that contain the mutual exclusivity results for the phenotypes.
The mutual exclusivity type between two phenotypes is encoded as an ordered factor with the following levels (in this order):
'strong_compatibility' : the data show strong evidence of mutual compatibility;
'weak_compatibility' : the data suggest mutual compatibility, but the evidence is weak;
'inconclusive_evidence' : the default for phenotypes that lack links to character states;
'weak_exclusivity' : the data suggest mutual exclusivity, but the evidence is weak;
'strong_exclusivity' : the data show strong evidence of mutual exclusivity.
The first item in the list is a square matrix with each dimension equal to the number of phenotypes. Cell (i, j) of the matrix holds the integer representation of the factor level for the mutual exclusivity type between the i-th and j-th phenotypes in the list of phenotypes provided as input. The matrix is symmetric.
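Because the matrix stores integer codes rather than labels, it can help to map them back to the level names. The following is a minimal sketch using a made-up 3 x 3 matrix in place of the real result; the names `exclusivity_levels`, `m`, and `decoded` are illustrative only and not part of the package.

```r
# Factor levels in the documented order (code 1 = strong_compatibility).
exclusivity_levels <- c(
  "strong_compatibility",
  "weak_compatibility",
  "inconclusive_evidence",
  "weak_exclusivity",
  "strong_exclusivity"
)

# Toy symmetric matrix standing in for the matrix item of the result
# (the values are made up for illustration).
m <- matrix(c(1, 2, 4,
              2, 1, 5,
              4, 5, 1), nrow = 3, byrow = TRUE)

# Index the level vector by the integer codes, then restore the matrix shape.
decoded <- matrix(exclusivity_levels[m], nrow = nrow(m))

decoded[1, 3]   # label for the pair (1, 3)
isSymmetric(m)  # the codes for (i, j) and (j, i) agree
```

With a real result, the same indexing would be applied to the matrix item of the returned list.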
The second item in the list is a dataframe with five columns: the first four hold the IDs and labels of the two phenotypes in each pair, and the fifth holds the mutual exclusivity type of the pair.
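Assuming the dataframe holds one row per distinct unordered pair of phenotypes, its row count for n phenotypes would be n(n-1)/2. This is consistent with the example output in this page, where 10 phenotypes yield 45 observations and 4 phenotypes yield 6. A quick check in base R (variable names are illustrative):

```r
# Number of distinct unordered pairs among n items.
n_pairs_10 <- choose(10, 2)
n_pairs_4 <- choose(4, 2)

# Enumerate the index pairs explicitly; each row is one (i, j) with i < j.
index_pairs <- t(combn(4, 2))

n_pairs_10
n_pairs_4
nrow(index_pairs)
```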
Evidence for mutual exclusivity or compatibility between a pair of phenotypes is derived from whether or not they are linked to the same taxon, character, and character state. If two phenotypes are linked to the same character state at least once, this is considered strong evidence for mutual compatibility. If they are linked to the same character at least once, but never to the same character state (implying that they must be linked to different states of the same character at least once), this is considered strong evidence for mutual exclusivity. Otherwise, if there is no character to which both are linked, the presence or absence of data showing co-occurrence in taxa is used as weak evidence. That is, if they are linked to character states of different characters, and both states are exhibited by the same taxon at least once, this is considered weak evidence for mutual compatibility (weak because the co-occurrence may be due to taxon-level polymorphism). If there is no such taxon, this is considered weak evidence for mutual exclusivity (weak because the lack of evidence could still be due to gaps in the data, and absence of evidence does not imply evidence of absence).
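The decision rules described above can be summarised as a small sketch. This is not the package's implementation: the function `classify_pair()` and its logical arguments are hypothetical and assume that the relevant facts about a pair (whether both phenotypes have links to character states, and whether they share a state, a character, or a taxon) have already been looked up in the KB. The order in which the checks are applied is also an assumption.

```r
# Hypothetical helper: classify one pair of phenotypes from precomputed facts.
classify_pair <- function(has_state_links, same_state,
                          same_character, co_occur_in_taxon) {
  exclusivity_levels <- c(
    "strong_compatibility",
    "weak_compatibility",
    "inconclusive_evidence",
    "weak_exclusivity",
    "strong_exclusivity"
  )
  if (!has_state_links) {
    # No links to character states: fall back to the default level.
    result <- "inconclusive_evidence"
  } else if (same_state) {
    # Linked to the same character state at least once.
    result <- "strong_compatibility"
  } else if (same_character) {
    # Same character, but only ever different states.
    result <- "strong_exclusivity"
  } else if (co_occur_in_taxon) {
    # Different characters, but both states seen in the same taxon.
    result <- "weak_compatibility"
  } else {
    # Different characters and no taxon exhibiting both states.
    result <- "weak_exclusivity"
  }
  factor(result, levels = exclusivity_levels, ordered = TRUE)
}

a <- classify_pair(TRUE, FALSE, TRUE, FALSE)   # strong_exclusivity
b <- classify_pair(TRUE, FALSE, FALSE, TRUE)   # weak_compatibility
a > b                                          # ordered comparison
```

Because the result is an ordered factor, pairs can be compared or filtered by threshold, for example keeping everything at or above "weak_exclusivity".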
It is possible that, for any pair of phenotypes, the KB contains conflicting evidence on mutual exclusivity. For example, two phenotypes could be linked to the same character state for one character, and to different character states for another character (possibly from another study by another author). The former would present strong evidence for mutual compatibility and the latter strong evidence for mutual exclusivity. The current implementation does not detect and thus ignores this possibility.
# Example 1: pass phenotypes as characters with phenotype IDs
# get a specific study of interest based on which mutual exclusivity is to be determined
studies <- get_studies()
study <- studies$id[studies$label == 'Dillman et al. (2016)']
# get the phenotypes present in the study
phenotypes <- get_phenotypes(study = study)
# select 10 phenotype IDs randomly from those in the study
phenotypes_ids <- sample(phenotypes$id, 10)
# determine mutual exclusivity
exclusivity <- mutually_exclusive(phenotypes_ids)
# resultant matrix
exclusivity$matrix
#> 1 2 3 4 5 6 7 8 9 10
#> 1 1 2 2 4 2 4 4 4 4 2
#> 2 2 1 2 2 2 2 2 2 4 2
#> 3 2 2 1 2 2 2 2 2 4 2
#> 4 4 2 2 1 2 4 2 2 4 4
#> 5 2 2 2 2 1 2 2 2 4 2
#> 6 4 2 2 4 2 1 2 4 4 4
#> 7 4 2 2 2 2 2 1 2 2 2
#> 8 4 2 2 2 2 4 2 1 4 4
#> 9 4 4 4 4 4 4 2 4 1 4
#> 10 2 2 2 4 2 4 2 4 4 1
# resultant dataframe
str(exclusivity$dataframe)
#> 'data.frame': 45 obs. of 5 variables:
#> $ id.1 : chr "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ ...
#> $ label.1 : chr "ventral margin and (part_of some maxilla) acute angle to" "ventral margin and (part_of some maxilla) acute angle to" "ventral margin and (part_of some maxilla) acute angle to" "ventral margin and (part_of some maxilla) acute angle to" ...
#> $ id.2 : chr "http://purl.org/phenoscape/expression?value=%28%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%28%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+"| __truncated__ ...
#> $ label.2 : chr "anatomical projection and (part_of some (ventral region and (part_of some autopalatine))) absent" "infraorbital sensory canal branched" "ceratobranchial 5 tooth robust" "anatomical projection and (part_of some (dorsal region and (part_of some vertebral element 3))) absent" ...
#> $ mutual_exclusivity: Ord.factor w/ 5 levels "strong_compatibility"<..: 2 2 4 2 4 4 4 4 2 2 ...
# filter dataframe to contain only mutually exclusive phenotypes
exclusivity_df <- exclusivity$dataframe
exclusive_phenotype_pairs <- exclusivity_df[
  exclusivity_df$mutual_exclusivity == "weak_exclusivity" |
    exclusivity_df$mutual_exclusivity == "strong_exclusivity", ]
# Example 2: pass phenotypes as phenotype objects
# sample some phenotype objects present in the study
phenotypes_objs <- as.phenotype(sample(phenotypes$id, 4), withTaxa=TRUE)
# determine mutual exclusivity
exclusivity <- mutually_exclusive(phenotypes_objs)
# resultant matrix
exclusivity$matrix
#> 1 2 3 4
#> 1 1 4 2 4
#> 2 4 1 2 4
#> 3 2 2 1 2
#> 4 4 4 2 1
# resultant dataframe
str(exclusivity$dataframe)
#> 'data.frame': 6 obs. of 5 variables:
#> $ id.1 : chr "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ ...
#> $ label.1 : chr "proximal region and (part_of some rib 1) size and (not (increased size))" "proximal region and (part_of some rib 1) size and (not (increased size))" "proximal region and (part_of some rib 1) size and (not (increased size))" "melanophore spot and (posterior to some supraoccipital crest) present" ...
#> $ id.2 : chr "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ "http://purl.org/phenoscape/expression?value=%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051%3E+some+%0A"| __truncated__ ...
#> $ label.2 : chr "melanophore spot and (posterior to some supraoccipital crest) present" "posterior region and (part_of some mesethmoid element) notched" "subdivision of head and (adjacent to some orbital region) width" "posterior region and (part_of some mesethmoid element) notched" ...
#> $ mutual_exclusivity: Ord.factor w/ 5 levels "strong_compatibility"<..: 4 2 4 2 4 2
# Example 3: determine mutual exclusivity for two phenotypes
# fetch phenotypes for an entity/taxon combination
phenotypes <- get_phenotypes(entity="hyomandibular bone", taxon="Hypancistrus")
# select two phenotype ids based on the labels
phenotypes_ids <- c(
  phenotypes$id[phenotypes$label == 'hyomandibular bone in contact with prootic bone'],
  phenotypes$id[phenotypes$label == 'hyomandibular bone in contact with quadrate bone']
)
# determine mutual exclusivity
exclusivity <- mutually_exclusive(phenotypes_ids)
# exclusivity value
exclusivity$dataframe$mutual_exclusivity
#> [1] weak_compatibility
#> 5 Levels: strong_compatibility < ... < strong_exclusivity