Triplet-based similarity score for fully multi-labeled trees with poly-occurring labels
Link to publication: https://doi.org/10.1093/bioinformatics/btaa676
Please refer to our Wiki for more details on installation and usage.
Using pip
pip3 install mp3treesim
Using bioconda
conda install mp3treesim
usage: mp3treesim [-h] [-i | -u | -g] [-c CORES] [--labeled-only]
[--exclude [EXCLUDE [EXCLUDE ...]]]
TREE TREE
MP3 tree similarity measure
positional arguments:
TREE Paths to the trees
optional arguments:
-h, --help show this help message and exit
-i Run MP3-treesim in Intersection mode.
-u Run MP3-treesim in Union mode.
-g Run MP3-treesim in Geometric mode.
-c CORES, --cores CORES
Number of cores to be used in computation.
--labeled-only Ignore nodes without "label" attribute. The trees will
be interpreted as partially-labeled trees.
--exclude [EXCLUDE [EXCLUDE ...]]
String(s) of comma separated labels to exclude from
computation. If only one string is provided the labels
will be excluded from both trees. If two strings are
provided they will be excluded from the respective
tree. E.g.: --exclude "A,D,E" will exclude labels from
both trees; --exclude "A,B" "C,F" will exclude A,B
from Tree 1 and C,F from Tree 2; --exclude "" "C" will
exclude C from Tree 2 and nothing from Tree 1
For example:
$ mp3treesim examples/trees/tree10.gv examples/trees/tree3.gv
> 0.02347746030469402
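The rules for --exclude described in the help text can be sketched in a few lines of Python. This is only an illustration of how one or two comma-separated strings map to a set of excluded labels per tree; `parse_exclude` is a hypothetical helper written for this README, not a function of mp3treesim, and rejecting any other number of strings is an assumption.

```python
def parse_exclude(values):
    """Map --exclude strings to one set of excluded labels per tree.

    Hypothetical helper mirroring the rules in the help text;
    it is not part of mp3treesim.
    """
    def split(text):
        # An empty string yields an empty set, i.e. "exclude nothing".
        return {label.strip() for label in text.split(",") if label.strip()}

    if len(values) == 1:
        # One string: the same labels are excluded from both trees.
        labels = split(values[0])
        return labels, set(labels)
    if len(values) == 2:
        # Two strings: the first applies to Tree 1, the second to Tree 2.
        return split(values[0]), split(values[1])
    # Assumption: any other number of strings is treated as an error.
    raise ValueError("expected one or two comma-separated strings")


for args in (["A,D,E"], ["A,B", "C,F"], ["", "C"]):
    tree1_excluded, tree2_excluded = parse_exclude(args)
    print(args, sorted(tree1_excluded), sorted(tree2_excluded))
```

The three calls correspond to the three cases given in the help text above.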
mp3treesim can also be used directly in a Python script by importing it.
import mp3treesim as mp3
tree1 = mp3.read_dotfile('examples/trees/tree10.gv')
tree2 = mp3.read_dotfile('examples/trees/tree3.gv')
print(mp3.similarity(tree1, tree2))
# 0.02347746030469402
A more detailed example in a clustering use case is available in the example/clustering Jupyter Notebook.
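A clustering workflow usually starts from a matrix of pairwise scores. The following is a minimal sketch of that step only, assuming the score function is symmetric so that each pair is computed once. `pairwise_similarity` and `toy_score` are hypothetical names introduced here; `toy_score` is a Jaccard index on label sets used as a stand-in and is not the MP3 measure.

```python
from itertools import combinations_with_replacement


def pairwise_similarity(items, score):
    """Return a matrix with matrix[i][j] = score(items[i], items[j]).

    Assumes score is symmetric, so each unordered pair is scored once.
    """
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations_with_replacement(range(n), 2):
        value = score(items[i], items[j])
        matrix[i][j] = matrix[j][i] = value
    return matrix


def toy_score(a, b):
    """Hypothetical stand-in (Jaccard index on label sets), NOT the MP3 measure."""
    return len(a & b) / len(a | b)


label_sets = [{"A", "B"}, {"A", "B", "C"}, {"D"}]
matrix = pairwise_similarity(label_sets, toy_score)
for row in matrix:
    print(row)
```

With mp3treesim, `score` could be `mp3.similarity` and `items` a list of trees loaded with `mp3.read_dotfile`, as in the script above. The resulting matrix can then be passed to a clustering routine of choice.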
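The input file must be in a valid Graphviz format, with the following assumptions: node labels are given in the "label" attribute, and multiple labels on the same node are separated by , (comma).
Example: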
digraph Tree {
1 [label="A"];
2 [label="B,G"];
3 [label="C"];
4 [label="D"];
5 [label="E"];
6 [label="F"];
1 -> 2;
1 -> 3;
2 -> 4;
2 -> 5;
3 -> 6;
}
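To make the label convention concrete, the sketch below reads the example above with regular expressions and splits each "label" attribute on commas. It only handles this simple one-statement-per-line layout and is not how mp3treesim parses files; real inputs should be loaded with `mp3.read_dotfile`. The names `parse_tree`, `NODE_RE` and `EDGE_RE` are hypothetical.

```python
import re

DOT_TEXT = """
digraph Tree {
    1 [label="A"];
    2 [label="B,G"];
    3 [label="C"];
    4 [label="D"];
    5 [label="E"];
    6 [label="F"];
    1 -> 2;
    1 -> 3;
    2 -> 4;
    2 -> 5;
    3 -> 6;
}
"""

# Node lines look like:  2 [label="B,G"];
NODE_RE = re.compile(r'^\s*(\w+)\s*\[label="([^"]*)"\];', re.MULTILINE)
# Edge lines look like:  1 -> 2;
EDGE_RE = re.compile(r'^\s*(\w+)\s*->\s*(\w+)\s*;', re.MULTILINE)


def parse_tree(text):
    """Return ({node_id: [labels]}, [(parent, child), ...])."""
    # A node may carry several labels, separated by commas.
    labels = {node: value.split(",") for node, value in NODE_RE.findall(text)}
    edges = EDGE_RE.findall(text)
    return labels, edges


labels, edges = parse_tree(DOT_TEXT)
print(labels["2"])   # the node with two labels, B and G
print(edges)
```

Here node 2 carries the two labels B and G, while every other node carries exactly one.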
numpy >= 1.18.1
networkx >= 2.4
pygraphviz >= 1.5 (requires libgraphviz-dev)

The supplementary materials and the settings to reproduce the experiments are in https://github.com/AlgoLab/mp3treesim_supp