Phylogeny Benchmark

About

This is a small benchmark of tree-building tools on simulated DNA sequences.

Tested tools

Currently the benchmark includes tools implementing Neighbor-Joining (Saitou and Nei, 1987) or its variants. More methods may be covered in the future.

All of these tools run from the command line, accept PHYLIP-formatted distance matrices, and produce trees in Newick format.
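For illustration, here is a minimal Python sketch of the input format. It assumes the common square PHYLIP layout: the taxon count on the first line, then one row per taxon with the name padded to 10 characters. The function name format_phylip_matrix and the sample taxa are made up for this example, and individual tools may be more or less strict about name length and spacing.

```python
def format_phylip_matrix(names, matrix):
    """Return a square PHYLIP distance matrix as text (illustrative sketch)."""
    n = len(names)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square and match the number of names")
    lines = [f"{n:5d}"]  # first line: number of taxa
    for name, row in zip(names, matrix):
        # Assumption: strict PHYLIP style, names truncated/padded to 10 chars.
        label = name[:10].ljust(10)
        values = " ".join(f"{d:.6f}" for d in row)
        lines.append(f"{label} {values}")
    return "\n".join(lines) + "\n"


# Hypothetical three-taxon example.
example = format_phylip_matrix(
    ["A", "B", "C"],
    [[0.0, 0.1, 0.2],
     [0.1, 0.0, 0.2],
     [0.2, 0.2, 0.0]],
)
print(example, end="")
```

A Newick tree that fits these three distances exactly would be (A:0.05,B:0.05,C:0.15);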

Method

Testing is done on randomly generated data produced with Clockwork Evolver (http://kirill-kryukov.com/study/tools/clockwork-evolver/), which outputs aligned sequences and a reference tree. A distance matrix is built from the alignment with Distance Matrix Builder (http://kirill-kryukov.com/study/tools/distance-matrix-builder/), and the distances are then converted to p-distances using divide-matrix-by-number.
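As a rough illustration of the last step, the sketch below assumes the raw matrix holds counts of differing alignment columns, so that dividing every entry by the alignment length gives the p-distance (the proportion of differing sites). This is an assumption about the intermediate matrix, not a description of the actual tools. The function names are hypothetical, and gaps or ambiguous bases get no special treatment.

```python
def count_differences(seq_a, seq_b):
    """Number of alignment columns where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def raw_distance_matrix(seqs):
    """Symmetric matrix of pairwise difference counts."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = count_differences(seqs[i], seqs[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def divide_matrix(matrix, divisor):
    """Divide every entry by one number (here: the alignment length)."""
    return [[value / divisor for value in row] for row in matrix]


# Toy alignment of 8 columns; the benchmark itself uses 16,500 bp.
alignment = ["ACGTACGT", "ACGTACGA", "TCGTACGA"]
counts = raw_distance_matrix(alignment)
p_distances = divide_matrix(counts, len(alignment[0]))
print(p_distances)
```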

Test procedure: for each number of OTUs N, a set of test data is generated (alignment, distance matrix, and reference tree), using a sequence length of 16,500 bp. Every tool is then run on the distance matrix to build a tree, and the time spent on tree construction is recorded. Each resulting tree is compared with the reference tree by Robinson-Foulds distance, computed with "raxml-ng" (Kozlov et al., 2019). The procedure is repeated 20 times, each time with new test data, and the results are averaged for each tool before the benchmark moves on to the next N.

Two other tools were tried for measuring the distance between trees, but both proved problematic: "hashrf" (Sul and Williams, 2007) does not correctly measure the distance between unrooted trees, and "Ktreedist" (Soria-Carrasco et al., 2007) cannot measure the distance between a rooted and an unrooted tree.
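To make the Robinson-Foulds comparison concrete, here is a small self-contained Python sketch. It is not how raxml-ng computes the distance. It treats both trees as unrooted, collects the non-trivial bipartitions (splits) of the leaf set that each tree induces, and counts the splits found in only one of the two trees. Because a split is independent of root placement, a rooted and an unrooted drawing of the same topology get distance 0. The parser is deliberately minimal: it assumes unquoted labels and no comments, and all function names are hypothetical.

```python
def parse_newick(text):
    """Parse Newick into nested lists; leaves are name strings.

    Branch lengths and internal labels are skipped. Quoted labels and
    comments are not supported in this sketch.
    """
    text = "".join(text.split()).rstrip(";")  # drop whitespace and ";"
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def read_label():
        nonlocal pos
        start = pos
        while peek() and peek() not in "(),:":
            pos += 1
        name = text[start:pos]
        if peek() == ":":  # skip the branch length
            while peek() and peek() not in "(),":
                pos += 1
        return name

    def parse_node():
        nonlocal pos
        if peek() != "(":
            return read_label()
        pos += 1  # consume "("
        children = [parse_node()]
        while peek() == ",":
            pos += 1
            children.append(parse_node())
        if peek() != ")":
            raise ValueError("unbalanced parentheses in Newick string")
        pos += 1  # consume ")"
        read_label()  # discard internal label and branch length
        return children

    return parse_node()


def bipartitions(tree):
    """Return (leaf set, set of non-trivial splits) for a parsed tree."""
    leaves = set()
    clades = []

    def collect(node):
        if isinstance(node, str):
            leaves.add(node)
            return frozenset([node])
        below = frozenset().union(*[collect(child) for child in node])
        clades.append(below)
        return below

    collect(tree)
    all_leaves = frozenset(leaves)
    anchor = min(all_leaves)  # fixed leaf used to pick one side of each split
    splits = set()
    for clade in clades:
        side = all_leaves - clade if anchor in clade else clade
        if 2 <= len(side) <= len(all_leaves) - 2:  # ignore trivial splits
            splits.add(side)
    return all_leaves, splits


def robinson_foulds(newick_a, newick_b):
    """Number of splits present in exactly one of the two trees."""
    leaves_a, splits_a = bipartitions(parse_newick(newick_a))
    leaves_b, splits_b = bipartitions(parse_newick(newick_b))
    if leaves_a != leaves_b:
        raise ValueError("trees must have the same leaf set")
    return len(splits_a ^ splits_b)


reference = "((A,B),(C,D),E);"
print(robinson_foulds(reference, "((A,B),(C,E),D);"))
```

In this sketch the distance is the raw count of differing splits, so for fully resolved trees it is always an even number.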

Testing is automated with a custom-made benchmark script.
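The timing-and-averaging part of such a script could look roughly like the Python sketch below. It is not the benchmark script itself. It assumes wall-clock time is what gets measured, and the helper names (time_command, average_runtime, make_argv) are hypothetical. A no-op Python process stands in for a real tree-building command so that the example runs anywhere.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv):
    """Run one command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def average_runtime(make_argv, repeats=20):
    """Average the runtime over `repeats` runs.

    make_argv(i) returns the command line for repetition i, so every
    repetition can point at freshly generated test data.
    """
    timings = [time_command(make_argv(i)) for i in range(repeats)]
    return statistics.mean(timings)


# Stand-in command: a no-op Python process instead of a real tree builder.
demo = average_runtime(lambda i: [sys.executable, "-c", "pass"], repeats=3)
print(f"average over 3 runs: {demo:.3f} s")
```

Measuring around the whole process, as here, includes program start-up and file reading, which matters most for the smallest inputs.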

Benchmark results

Because of their slowness, Weighbor is excluded above 500 OTUs and BioNJ above 3,500 OTUs; a "-" in the tables marks an excluded tool.

Calculation time, in seconds. Each number is an average from 20 runs.

Size     bionj   ninja  quicktree  rapidnj  swiftnj-0.0.3  weighbor
5        0.026   0.277      0.120    0.129          0.114     0.068
6        0.026   0.275      0.105    0.110          0.102     0.069
7        0.024   0.271      0.066    0.066          0.069     0.063
8        0.023   0.269      0.063    0.068          0.068     0.066
9        0.023   0.275      0.064    0.067          0.066     0.065
10       0.023   0.276      0.068    0.066          0.065     0.069
20       0.028   0.292      0.111    0.128          0.100     0.079
30       0.028   0.281      0.123    0.152          0.118     0.102
40       0.028   0.299      0.117    0.181          0.120     0.137
50       0.032   0.314      0.123    0.173          0.116     0.195
60       0.035   0.309      0.135    0.179          0.112     0.277
70       0.037   0.331      0.129    0.188          0.117     0.379
80       0.039   0.336      0.126    0.182          0.118     0.550
90       0.044   0.343      0.125    0.194          0.117     0.723
100      0.048   0.345      0.128    0.207          0.121     0.953
200      0.091   0.397      0.149    0.234          0.125     6.200
300      0.176   0.430      0.200    0.289          0.142    20.461
400      0.315   0.475      0.262    0.333          0.169    48.348
500      0.516   0.550      0.357    0.372          0.191    91.524
600      0.765   0.606      0.482    0.421          0.215         -
700      1.121   0.673      0.640    0.474          0.252         -
800      1.546   0.758      0.794    0.480          0.261         -
900      2.068   0.835      0.984    0.466          0.265         -
1000     3.094   0.730      1.243    0.534          0.332         -
1500    11.012   1.021      3.428    0.896          0.628         -
2000    31.035   1.475      7.607    1.308          1.157         -
2500    71.580   2.108     14.564    1.784          1.727         -
3000   133.748   2.703     24.723    2.452          2.415         -
3500   242.768   3.332     38.142    3.126          3.203         -
4000         -   4.183     56.699    3.994          4.164         -
4500         -   5.054     80.319    4.933          5.248         -
5000         -   6.041    109.899    5.968          6.446         -
5500         -   6.937    145.015    7.239          7.895         -
6000         -   8.192    187.639    8.692          9.257         -
6500         -   9.487    237.991   10.266         11.125         -
7000         -  10.808    282.819   11.507         12.193         -
7500         -  14.758    342.903   13.097         13.992         -
8000         -  18.087    413.387   15.363         15.976         -
8500         -  48.144    499.528   16.952         18.173         -
9000         -  51.347    589.581   19.317         20.294         -
9500         -  56.832    736.976   22.130         23.720         -
10000        -  63.122    850.169   25.104         26.261         -

Topological distance (Robinson-Foulds metric) from the correct tree. Each number is an average from 20 runs.

Size     bionj   ninja  quicktree  rapidnj  swiftnj-0.0.3  weighbor
5          0.0     0.0        0.0      0.0            0.0       0.0
6          0.0     0.0        0.0      0.0            0.0       0.0
7          0.2     0.1        0.1      0.2            0.1       0.2
8          0.2     0.2        0.2      0.2            0.2       0.2
9          0.0     0.0        0.0      0.0            0.0       0.0
10         0.2     0.2        0.2      0.2            0.2       0.2
20         0.5     0.6        0.6      0.6            0.6       0.6
30         0.9     1.1        1.1      1.1            1.5       0.8
40         1.5     1.5        1.5      1.5            2.5       1.2
50         1.2     1.6        1.6      1.6            4.4       1.5
60         1.6     1.7        1.7      1.7            3.9       1.6
70         2.5     2.7        2.7      2.7            5.3       2.5
80         3.2     3.0        3.0      3.0            5.7       2.9
90         3.5     3.6        3.6      3.6            6.5       3.4
100        3.3     3.0        3.0      3.0            9.7       3.1
200        5.7     6.4        6.4      6.4           22.2       5.9
300       10.9    11.7       11.7     11.7           37.7      10.4
400       13.5    15.2       15.2     15.2           56.3      15.4
500       18.6    21.0       21.0     21.0           69.4      20.0
600       23.0    25.3       25.2     25.3           86.2         -
700       27.8    30.0       30.0     30.0          109.4         -
800       30.4    33.1       33.1     33.1          119.7         -
900       33.0    36.9       36.9     36.9          132.1         -
1000      37.9    43.6       43.7     43.7          152.1         -
1500      61.5    69.7       69.7     69.8          238.2         -
2000      78.1    85.5       85.6     85.6          319.0         -
2500     101.3   114.5      114.2    114.4          394.8         -
3000     126.7   141.4      141.0    141.2          494.7         -
3500     135.1   153.7      153.8    153.6          554.3         -
4000         -   186.5      186.4    186.8          649.6         -
4500         -   200.7      200.6    200.7          712.7         -
5000         -   233.0      233.1    233.2          829.7         -
5500         -   266.4      266.1    266.6          910.1         -
6000         -   286.4      286.7    286.1          984.8         -
6500         -   303.5      304.0    304.1         1066.0         -
7000         -   335.5      335.7    335.4         1155.0         -
7500         -   353.4      354.3    353.6         1250.1         -
8000         -   377.6      377.3    378.1         1326.5         -
8500         -   392.9      392.8    392.9         1413.4         -
9000         -   424.4      425.1    424.4         1488.1         -
9500         -   463.6      461.9    463.6         1582.3         -
10000        -   474.6      474.3    474.5         1679.7         -

References