Summary
I have added an option to select the initial trees at random from the .t file. The final result is the best one among all results obtained.
I have modified many classes and methods, so this version differs considerably from the previous one. I have added a class, resultSet, to manage all information about the results (there are now multiple results to select from). There is now only one output file, which also includes the information about the stop criteria.
Command and Options:
to compile: javac centroid.java
to run: java centroid option1=value1 option2=value2 ...
example:
java centroid tfile=seq3.nex1.run1.t randomnumber=5 hastruetree=1 itr=1000 algorithm=3 truetreefile=seq3.tree inittreetype=5
options:
help show help information
tfile the .t file
pfile the .p file
truetreefile the file that contains the true tree
inittreefile the file that contains the initial tree
itr the number of iterations
algorithm 1:K-Interval; 2:Quartet; 3:SD; 4:K-Interval Sqr
inittreetype 1: the tree with the largest LNL value in the .p file
2: the most frequently occurring tree in the .t file
3: neighbor-joining tree
4: phyml tree
5: random trees from .t file
randomnumber the number of random trees
hastruetree 0: does not have true tree; 1: has true tree
stricthillclimbing 0: use prob, c, temp; 1: use strict hill climbing
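The option1=value1 form of the command line could be parsed roughly as follows. This is only a minimal sketch and not the parsing code in centroid.java: the class name OptionParserSketch and the method parse are hypothetical, and lower-casing the option names and storing a bare option such as help with an empty value are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original centroid source.
class OptionParserSketch {

    // Splits each "name=value" argument at the first '=' and stores the pair.
    // An argument without '=' (for example "help") is stored with an empty value.
    // Option names are lower-cased (an assumption); values are kept unchanged.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq < 0) {
                options.put(arg.toLowerCase(), "");
            } else {
                options.put(arg.substring(0, eq).toLowerCase(), arg.substring(eq + 1));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // The example command line from this document.
        String[] example = {
            "tfile=seq3.nex1.run1.t", "randomnumber=5", "hastruetree=1",
            "itr=1000", "algorithm=3", "truetreefile=seq3.tree", "inittreetype=5"
        };
        Map<String, String> options = parse(example);
        int itr = Integer.parseInt(options.getOrDefault("itr", "0"));
        boolean hasTrueTree = "1".equals(options.get("hastruetree"));
        // prints: tfile=seq3.nex1.run1.t itr=1000 hasTrueTree=true
        System.out.println("tfile=" + options.get("tfile") + " itr=" + itr
                + " hasTrueTree=" + hasTrueTree);
    }
}
```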
Class and Methods:
node: the class for leaf node object
edge: the class for edge object
resultSet: the class for managing result information. It contains the initial tree, the result tree, the stop criterion, the e_T value of the result tree, the c value (if used), and the number of moves made.
tree: the class that implements the tree object. Its methods fall into two groups.
format converter: these methods convert a tree between the tree object and a Newick string (rooted or unrooted, with leaves written as taxa or ids)
int NewiToTree(String s, int pointer, node currentnode, SortedMap map)
String TreeToNewi(node r, SortedMap map)
String getUnrootedTree(node r, SortedMap map)
// the map in these methods is used to write taxa for the leaves instead of leaf ids
data formalizer: these methods formalize the tree and its data (k-interval, SD, quartet, NNI)
String NNI(edge e, int lowerid, int upperid)
ArrayList NNIlist()
void setMACAPair(node r)
void setQuartetSet()
void setKInterval()
void setEdgePartition()
// setEdgePartition() is used for SD algorithm
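To illustrate the role of the map in the format converter methods, here is a minimal sketch of writing a tree as a Newick string with leaf ids replaced by taxa. It is not the code of TreeToNewi: the class NewickSketch, the node type SketchNode, and the helpers leaf and join are hypothetical, and the real node class may store its data differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; the real tree and node classes may differ.
class NewickSketch {

    // Hypothetical node type: a leaf has an id and no children.
    static class SketchNode {
        final int id;
        final List<SketchNode> children = new ArrayList<>();
        SketchNode(int id) { this.id = id; }
        boolean isLeaf() { return children.isEmpty(); }
    }

    static SketchNode leaf(int id) { return new SketchNode(id); }

    // Builds an internal node; the id -1 is a placeholder that is never printed.
    static SketchNode join(SketchNode... kids) {
        SketchNode n = new SketchNode(-1);
        for (SketchNode k : kids) n.children.add(k);
        return n;
    }

    // Writes the subtree below r in Newick notation, without the trailing ';'.
    // If map is non-null and has an entry for a leaf id, the taxon is written
    // instead of the id.
    static String toNewick(SketchNode r, SortedMap<Integer, String> map) {
        if (r.isLeaf()) {
            if (map != null && map.containsKey(r.id)) return map.get(r.id);
            return String.valueOf(r.id);
        }
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < r.children.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(toNewick(r.children.get(i), map));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> idToTaxon = new TreeMap<>();
        idToTaxon.put(1, "A");
        idToTaxon.put(2, "B");
        idToTaxon.put(3, "C");
        idToTaxon.put(4, "D");
        SketchNode root = join(join(leaf(1), leaf(2)), join(leaf(3), leaf(4)));
        System.out.println(toNewick(root, idToTaxon) + ";"); // prints ((A,B),(C,D));
        System.out.println(toNewick(root, null) + ";");      // prints ((1,2),(3,4));
    }
}
```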
centroid: the main class which implements the algorithm
Some important public variables:
// initTreeType sets the type of the initial tree.
// 1: the sample tree in the .p file with the largest LNL
// 2: the sample tree in the .t file that occurs most often
// 3: the neighbor-joining tree
// 4: the phyml tree
// 5: random initial trees taken from the .t file
public static int initTreeType = 1;
// hasTrueTree decides if a true tree is provided
public static boolean hasTrueTree = false;
// strictHillClimbing decides if the algorithm uses strict hill climbing or temp, c, prob
public static boolean strictHillClimbing = true;
// algorithm sets the algorithm to use
// 1: k-interval
// 2: quartet
// 3: SD
// 4: k-interval square
public static int algorithm = 1;
//the size of the samples
public static int size = 0;
// the number of sample trees
public static int tree_num = 0;
// the pre-calculated value for k-interval
public static long N_c_sqr = 0;
public static ArrayList N_c = new ArrayList();
// the pre-calculated value for SD
public static SortedMap SD_set = new TreeMap();
// stop_criterion sets the stop criterion. Not used in the current version; all criteria are now in use.
public static int stop_criterion = 0;
// the list which contains all trees in newick format from .t file
public static ArrayList newickTreeList = new ArrayList();
// the maps between ids and taxa. They are used to translate leaves between ids and taxa
public static SortedMap nodeMap = new TreeMap();
public static SortedMap node2idMap = new TreeMap();
// divideBy decides by how much the c value will be divided
public static int divideBy = 1;
// the true tree
public static tree TrueTree = new tree();
// the result list
public static ArrayList result_list = new ArrayList();
// the quartet tree list contains the quartet data for all trees
public static ArrayList> quartetTreeList = new ArrayList>();
// the number of the initial tree
public static int init_tree_num = 0;
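For inittreetype 5, the initial trees are taken at random from the Newick strings read from the .t file. One way this could be done is sketched below. It is an assumption rather than the actual implementation: the class RandomInitTreesSketch and the method pick are hypothetical names, and drawing without replacement is a guess, since the source does not say whether the same tree may be picked twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; not part of the original centroid source.
class RandomInitTreesSketch {

    // Returns k trees drawn without replacement from newickTreeList.
    // If k exceeds the list size, all trees are returned in random order.
    // The input list is not modified.
    static List<String> pick(List<String> newickTreeList, int k, Random rng) {
        List<String> copy = new ArrayList<>(newickTreeList);
        Collections.shuffle(copy, rng);
        int count = Math.min(Math.max(k, 0), copy.size());
        return new ArrayList<>(copy.subList(0, count));
    }

    public static void main(String[] args) {
        // Toy stand-in for the trees read from the .t file.
        List<String> trees = Arrays.asList(
            "(1,2,(3,4));", "(1,3,(2,4));", "(1,4,(2,3));");
        // Plays the role of randomnumber=2 on the command line.
        for (String t : pick(trees, 2, new Random())) {
            System.out.println(t);
        }
    }
}
```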
Important methods:
ArrayList readFile(String file)
void getInitTree(String p_file, String t_file, String mltree)
int getDistance(tree t1, tree t2)
long getE_T(tree t, ArrayList> quartetList)
long setC(tree t, ArrayList quartetList, int algorithm)
tree centroid(int itr, long c, resultSet res, ArrayList quartetList)
resultSet hillclimbing(String p_file, String t_file, String truetree, String mltree, int itr)
void summarize(resultSet res, ArrayList> quartetList)
// centroid(int itr, long c, resultSet res, ArrayList quartetList) is the most important method
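The strict hill climbing mode can be pictured with the generic sketch below. It is not the code of centroid(...) or hillclimbing(...): the class HillClimbSketch and the method climb are hypothetical, and it assumes that a lower score (playing the role of e_T) is better and that the candidates at each step are the neighbors of the current state (for trees, presumably the NNI neighbors). The non-strict mode with prob, c and temp is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToLongFunction;

// Hypothetical illustration of strict hill climbing; not the original algorithm code.
class HillClimbSketch {

    // Repeatedly moves to the best-scoring neighbor as long as it is strictly
    // better than the current state. Stops at a local minimum or after itr moves.
    // Assumption: a lower score is better.
    static <T> T climb(T start, Function<T, List<T>> neighbors,
                       ToLongFunction<T> score, int itr) {
        T current = start;
        long currentScore = score.applyAsLong(current);
        for (int i = 0; i < itr; i++) {
            T best = null;
            long bestScore = currentScore;
            for (T candidate : neighbors.apply(current)) {
                long s = score.applyAsLong(candidate);
                if (s < bestScore) {
                    best = candidate;
                    bestScore = s;
                }
            }
            if (best == null) {
                break; // no neighbor improves the score: local minimum reached
            }
            current = best;
            currentScore = bestScore;
        }
        return current;
    }

    public static void main(String[] args) {
        // Toy problem standing in for trees: minimize (x - 7)^2 over the integers,
        // where the neighbors of x are x - 1 and x + 1.
        Integer result = climb(0,
                x -> Arrays.asList(x - 1, x + 1),
                x -> (long) (x - 7) * (x - 7),
                1000);
        System.out.println(result); // prints 7
    }
}
```

With several initial trees, the same routine would be run once per starting point and the result with the lowest score kept, which matches the idea of reporting the best one among all results.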