Can I export a tensorflow summary to CSV?

太阳男子 2020-12-14 18:07

Is there a way to extract scalar summaries to CSV (preferably from within tensorboard) from tfevents files?

Example code

The following code generates tfeve

4 Answers
  • 2020-12-14 18:16

    Here is my solution, which builds on the previous answers but scales to many runs: it writes one CSV per run, plus a combined file.

    import os
    import pandas as pd

    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator


    def tabulate_events(dpath):
        """Write one CSV per run found in dpath and return {run name: DataFrame}."""
        final_out = {}
        for dname in os.listdir(dpath):
            run_path = os.path.join(dpath, dname)
            if not os.path.isdir(run_path):
                continue  # only sub-directories are treated as runs

            print(f"Converting run {dname}", end=" ")
            ea = EventAccumulator(run_path).Reload()
            tags = ea.Tags()['scalars']

            if not tags:
                # Skip the run; previously an undefined or stale df was stored here.
                print("- No scalars to write")
                continue

            out = {}
            for tag in tags:
                events = ea.Scalars(tag)
                # One column per step; the two rows hold the value and its wall time.
                out[tag] = pd.DataFrame(
                    [[e.value for e in events], [e.wall_time for e in events]],
                    index=['value', 'wall_time'],
                    columns=[e.step for e in events],
                )

            # Rows are indexed by (tag, 'value' or 'wall_time').
            df = pd.concat(out.values(), keys=out.keys())
            df.to_csv(f'{dname}.csv')
            print("- Done")
            final_out[dname] = df

        return final_out


    if __name__ == '__main__':
        path = "your/path/here"
        runs = tabulate_events(path)
        pd.concat(runs.values(), keys=runs.keys()).to_csv('all_result.csv')
    
  • 2020-12-14 18:29

    Check the "Show data download links" option in the upper-left of TensorBoard, then click the "CSV" button that appears under each scalar chart.
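
Files downloaded this way can be stitched together with a few lines of Python. The sketch below assumes each downloaded CSV starts with the header `Wall time,Step,Value` (check your own files, since this may differ between TensorBoard versions). The function name `merge_downloads` and the run labels are made up for illustration.

```python
import csv
import io


def merge_downloads(files):
    """Merge per-run scalar CSVs into long-format rows.

    `files` maps a run label (chosen by you) to an open text file whose
    header is assumed to be: Wall time,Step,Value.
    """
    rows = []
    for run, fh in files.items():
        for rec in csv.DictReader(fh):
            rows.append({
                "run": run,
                "step": int(rec["Step"]),
                "value": float(rec["Value"]),
            })
    # Sort so that all runs for a given step sit next to each other.
    rows.sort(key=lambda r: (r["step"], r["run"]))
    return rows


if __name__ == "__main__":
    # In-memory stand-ins for two downloaded files.
    run_a = io.StringIO("Wall time,Step,Value\n1.0,0,0.9\n2.0,1,0.7\n")
    run_b = io.StringIO("Wall time,Step,Value\n1.5,0,0.8\n")
    merged = merge_downloads({"run_a": run_a, "run_b": run_b})

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["run", "step", "value"])
    writer.writeheader()
    writer.writerows(merged)
    print(out.getvalue())
```

With real downloads, pass `open(path, newline="")` handles instead of the `io.StringIO` objects.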

  • 2020-12-14 18:39

    While the answer above does what was asked from within TensorBoard, it only lets you download a CSV for a single run of a single tag. With, for example, 10 tags and 20 runs (which is not many), you would have to repeat that step 200 times, which alone will probably take more than an hour. If you then want to do something with the data for all runs of a single tag, you would have to write a CSV accumulation script or copy everything by hand, which could easily cost more than a day.

    Therefore I would like to add a solution that writes one CSV file per tag, containing all runs. Column headers are the run directory names and row indices are the step numbers.

    import os
    import pandas as pd

    from collections import defaultdict
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator


    def tabulate_events(dpath):
        # Only sub-directories are runs; skip the 'csv' output folder created by get_file_path().
        dirs = sorted(d for d in os.listdir(dpath)
                      if d != 'csv' and os.path.isdir(os.path.join(dpath, d)))
        summary_iterators = [EventAccumulator(os.path.join(dpath, dname)).Reload() for dname in dirs]

        tags = summary_iterators[0].Tags()['scalars']

        # Every run must have logged the same scalar tags.
        for it in summary_iterators:
            assert it.Tags()['scalars'] == tags

        out = defaultdict(list)
        steps = defaultdict(list)  # steps are kept per tag, since tags may differ in length

        for tag in tags:
            for events in zip(*[acc.Scalars(tag) for acc in summary_iterators]):
                # All runs must have logged this tag at the same step.
                assert len(set(e.step for e in events)) == 1

                steps[tag].append(events[0].step)
                out[tag].append([e.value for e in events])

        return out, steps, dirs


    def to_csv(dpath):
        d, steps, dirs = tabulate_events(dpath)

        # One file per tag: rows are steps, columns are runs.
        for tag, values in d.items():
            df = pd.DataFrame(values, index=steps[tag], columns=dirs)
            df.to_csv(get_file_path(dpath, tag))


    def get_file_path(dpath, tag):
        file_name = tag.replace("/", "_") + '.csv'
        folder_path = os.path.join(dpath, 'csv')
        os.makedirs(folder_path, exist_ok=True)
        return os.path.join(folder_path, file_name)


    if __name__ == '__main__':
        path = "path_to_your_summaries"
        to_csv(path)
    

    My solution builds upon: https://stackoverflow.com/a/48774926/2230045


    EDIT:

    I created a more sophisticated version and released it on GitHub: https://github.com/Spenhouet/tensorboard-aggregator

    This version aggregates multiple tensorboard runs and is able to save the aggregates to a new tensorboard summary or as a .csv file.
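
To illustrate what aggregating runs can mean here, the following is a minimal sketch using only the standard library. It is not the code from the linked repository. It assumes a per-tag CSV laid out as described above (first column the step, one further column per run), and `aggregate_runs` is a hypothetical helper name.

```python
import csv
import io
import statistics


def aggregate_runs(fh):
    """Return (step, mean, min, max) tuples computed across the run columns."""
    reader = csv.reader(fh)
    next(reader)  # header row: empty index label followed by the run names
    result = []
    for row in reader:
        step = int(row[0])
        values = [float(v) for v in row[1:]]
        result.append((step, statistics.mean(values), min(values), max(values)))
    return result


if __name__ == "__main__":
    # Stand-in for one of the per-tag files; the run names are illustrative.
    per_tag_csv = io.StringIO(",run_1,run_2\n0,1.0,3.0\n1,0.5,1.5\n")
    for step, mean, lo, hi in aggregate_runs(per_tag_csv):
        print(step, mean, lo, hi)
```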

  • 2020-12-14 18:41

    Just to add to @Spen

    In case you want to export the data when runs have varying numbers of steps: this writes one large CSV file with the runs stacked below each other. You may need to change the keys to match your own tags.

    import glob

    import pandas as pd
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    # Scalar tags to export; change these to match the tags in your own runs.
    keys = ['loss', 'mean_absolute_error', 'val_loss', 'val_mean_absolute_error']

    listDF = []

    # Every sub-directory of the current working directory is treated as one run.
    for tb_output_folder in glob.glob("*/"):
        print(tb_output_folder)
        x = EventAccumulator(path=tb_output_folder)
        x.Reload()

        # Within a run, every key is assumed to have the same number of events.
        n_steps = len(x.Scalars(keys[0]))
        printOutDict = {key: [e.value for e in x.Scalars(key)] for key in keys}
        printOutDict['Name'] = [tb_output_folder] * n_steps

        listDF.append(pd.DataFrame(data=printOutDict))

    # Runs may differ in length; they are stacked one below the other.
    df = pd.concat(listDF)
    df.to_csv('Output.csv')
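
When runs were logged at different steps, another way to line them up is an outer join on the step number, leaving a cell empty where a run has no value. The following is a minimal sketch with plain dictionaries; the name `align_runs` and the sample data are illustrative and not part of the answer above.

```python
import csv
import io


def align_runs(runs):
    """Outer-join runs on step.

    `runs` maps run name -> {step: value}. Returns a header and rows in which
    a missing (run, step) pair is represented by an empty string.
    """
    names = sorted(runs)
    all_steps = sorted(set().union(*(runs[n].keys() for n in names)))
    header = ["step"] + names
    rows = [[step] + [runs[n].get(step, "") for n in names] for step in all_steps]
    return header, rows


if __name__ == "__main__":
    # Two runs of different length (made-up values).
    runs = {"short": {0: 0.9, 1: 0.7}, "long": {0: 0.8, 1: 0.6, 2: 0.5}}
    header, rows = align_runs(runs)

    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(header)
    writer.writerows(rows)
    print(out.getvalue())
```

With an `EventAccumulator` `x` as in the script above, the per-run dictionary for one tag could be built as `{e.step: e.value for e in x.Scalars(key)}`.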
    
    