Convert CSV to Newick tree

Anonymous (unverified), submitted 2019-12-03 02:28:01

Question:

I have a CSV file in which each line represents hierarchical data in the form: 'Phylum','Class','Order','Family','Genus','Species','Subspecies','unique_gi'

I would like to convert this to the classic Newick tree format, without distances (branch lengths). Either a novel method or a Python package would be great. Thank you!

Answer 1:

You could use some simple Python to build a tree from the CSV rows and then write that tree out as a Newick string. I'm not sure whether this is exactly what you're trying to do.

    from __future__ import print_function  # same script runs on Python 2 and 3

    import csv
    from collections import defaultdict
    from pprint import pprint

    def tree():
        # A dict whose missing keys are created on demand as nested trees.
        return defaultdict(tree)

    def tree_add(t, path):
        # Walk down the tree along `path`, creating nodes as needed.
        for node in path:
            t = t[node]

    def pprint_tree(tree_instance):
        # Debug helper: convert to plain dicts and pretty-print them.
        def dicts(t):
            return {k: dicts(t[k]) for k in t}
        pprint(dicts(tree_instance))

    def csv_to_tree(lines):
        # Each CSV row is one root-to-leaf path; fields are single-quoted.
        t = tree()
        for row in csv.reader(lines, quotechar='\''):
            tree_add(t, row)
        return t

    def tree_to_newick(root):
        # Children go in parentheses, followed by the parent's name.
        # Sibling order follows dict iteration order, so it may vary
        # between Python versions.
        items = []
        for k in root:
            s = ''
            if root[k]:
                s += '(' + tree_to_newick(root[k]) + ')'
            s += k
            items.append(s)
        return ','.join(items)

    def csv_to_weightless_newick(lines):
        t = csv_to_tree(lines)
        # pprint_tree(t)
        return tree_to_newick(t)

    if __name__ == '__main__':
        # see https://docs.python.org/2/library/csv.html to read CSV file
        lines = [
            "'Phylum','Class','Order','Family','Genus','Species','Subspecies','unique_gi'",
            "'Phylum','Class','Order','example'",
            "'Another','Test'",
        ]

        print(csv_to_weightless_newick(lines))

Example output:

    $ python ~/tmp/newick_tree.py
    (((example,((((unique_gi)Subspecies)Species)Genus)Family)Order)Class)Phylum,(Test)Another
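One caveat about the string above: strict Newick parsers generally expect a single root and a terminating semicolon, and labels containing spaces or punctuation need single quotes. Here is a minimal Python 3 sketch of those adjustments. It is not part of the original answer. The helper names (`quote_label`, `build_tree`, `subtree`, `rows_to_newick`) are made up for illustration, and two choices are assumptions about what you want: wrapping several top-level taxa under one unnamed root, and skipping empty CSV cells.

```python
import csv
import re

# Conservative set of characters allowed in an unquoted label (an assumption).
# Underscores are left unquoted here, although some parsers read an
# unquoted underscore as a space.
_SAFE_LABEL = re.compile(r"^[A-Za-z0-9_.\-]+$")


def quote_label(name):
    """Return name as a Newick label, single-quoting it when needed."""
    if _SAFE_LABEL.match(name):
        return name
    # Inside a quoted label, a single quote is written as two single quotes.
    return "'" + name.replace("'", "''") + "'"


def build_tree(rows):
    """Nest each row (a list of names, root first) into a dict of dicts."""
    tree = {}
    for row in rows:
        node = tree
        for name in row:
            if name:  # assumption: skip empty cells, e.g. a missing Subspecies
                node = node.setdefault(name, {})
    return tree


def subtree(children):
    """Serialize a dict of sibling nodes as 'child1,child2,...'."""
    parts = []
    for name, grandchildren in children.items():
        prefix = "(" + subtree(grandchildren) + ")" if grandchildren else ""
        parts.append(prefix + quote_label(name))
    return ",".join(parts)


def rows_to_newick(rows):
    """Build a semicolon-terminated Newick string with a single root."""
    tree = build_tree(rows)
    body = subtree(tree)
    if len(tree) > 1:
        # Several top-level taxa: group them under one unnamed root.
        body = "(" + body + ")"
    return body + ";"


if __name__ == "__main__":
    lines = [
        "'Phylum','Class','Order','Family','Genus','Species','Subspecies','unique_gi'",
        "'Phylum','Class','Order','example'",
        "'Another','Test'",
    ]
    print(rows_to_newick(csv.reader(lines, quotechar="'")))
```

To read from a real file instead of the in-memory list, pass `csv.reader(open(path, newline=''), quotechar="'")` to `rows_to_newick`, where `path` is your CSV file.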

Also, Biopython's Phylo module looks useful: it can read Newick trees and lets you visualize them. See http://biopython.org/wiki/Phylo
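As a rough illustration of what that library offers, the sketch below parses a Newick string with `Bio.Phylo` and prints an ASCII rendering of it. It assumes Biopython is installed (for example via `pip install biopython`). The `show_tree` helper and the sample string are made up for this example and are not from the original answer.

```python
from io import StringIO

try:
    from Bio import Phylo  # third-party package: Biopython
except ImportError:
    Phylo = None


def show_tree(newick):
    """Print an ASCII drawing of a Newick string.

    Returns False (and draws nothing) when Biopython is not available.
    """
    if Phylo is None:
        print("Biopython is not installed")
        return False
    # Phylo.read expects a file name or handle holding exactly one tree.
    tree = Phylo.read(StringIO(newick), "newick")
    Phylo.draw_ascii(tree)
    return True


if __name__ == "__main__":
    # Hypothetical sample: two genera under one family, no branch lengths.
    show_tree("((A,B)Genus1,(C,D)Genus2)Family;")
```

Note that the sample string ends with a semicolon and has a single root, which is the form Newick parsers usually expect.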


