Choose File names automatically based on a calculation and then import them to python

Question

I have run into a wall and don't know how to proceed. I generate a lot of raw data from my CFD simulations, all of it in text format. The files are named "hA-'timestep'.txt", where A is one of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. For example, h1-0500.txt refers to data obtained along h1 at the 500th time step. All the hA files are saved in a single folder. In my post-processing, I want to import files at different flow times and do some analysis. I have written code that calculates the timestep from an equation, which takes the flow time as user input.

What I would like to do is import all the files that correspond to the particular timestep calculated by the equation. For example, if I enter 2400 for the flow time, the equation gives a time step of 16144. I want the file names that correspond to this time step to be imported automatically. Please see the code below.

I have uploaded sample files corresponding to 16144. How do I choose the file names automatically based on the calculated time step? Currently, after getting the time step from the equation, I have to change the file names manually. I would really appreciate it if someone could guide me on this.

 # Notes about the Simulation#
 # Total No. of Time Steps completed = 16152
 # No. of Time Steps completed in HPC = 165
 # Flow Time before HPC = 3.1212s
 # Total Flow time of Fill Cycle = 2401.2s

import numpy as np
from matplotlib import pyplot as plt
import os

FT_init = 3.1212
delt = 0.15 # Timestep size
TS_init = 165 
flowtime = input("Enter the flow time required: ") # This is user input. Timestep will be calculated based on the flow time entered.
timestep = (flowtime-FT_init)/delt
timestep = round(timestep + TS_init)
print timestep 

def xlineplots(X1,Y1,V1,Tr1):
  plt.figure(1)
  plt.plot(X1,Tr1)
  plt.legend(['h0','h3','h5','h7','h9'],loc=0)
  plt.ylabel('Tracer Concentration')
  plt.xlabel('X (m)')
  plt.title('Tracer Concentration Variation along the Tank width')
  plt.figtext(0.6,0.6,"Flow Time = 2400s",style= 'normal',alpha = 0.5)
  plt.figtext(0.6,0.55,"Case: ddn110B",style= 'normal')
  plt.savefig('hp1.png', format='png', dpi=600) 

  plt.figure(2)
  plt.plot(X1,V1)
  plt.legend(['h0','h3','h5','h7','h9'],loc=0)
  plt.ylabel('V (m/s)')
  plt.xlabel('X (m)')
  plt.title('Vertical Velocity Variation along the Tank width')
  plt.figtext(0.6,0.6,"Flow Time = 2400s",style= 'normal',alpha = 0.5)
  plt.figtext(0.6,0.55,"Case: ddn110B",style= 'normal',alpha = 0.5)
  plt.savefig('hv1.png', format='png', dpi=600) 

path1='Location of the Directory' # Location where the files are located 

filename1=np.array(['h0-16144.txt','h3-16144.txt','h5-16144.txt','h7-16144.txt','h9-16144.txt'])

for i in filename1:
  format_name= i
  data1  = os.path.join(path1,format_name)
  data2 = np.loadtxt(data1,skiprows=1)
  data2 = data2[data2[:,1].argsort()]    
  X1 = data2[:,1]  # Assign x-coordinate from the imported text file
  Y1 = data2[:,2]  # Assign y-coordinate from the imported text file
  V1 = data2[:,4]  # Assign y-velocity from the imported text file
  Tr1 = data2[:,5] # Assign Tracer Concentration from the imported text file
  xlineplots(X1,Y1,V1,Tr1)

Error Message:

Enter the flow time required: 1250
8477
timestep: 8477

file(s) found:  ['E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h0-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h1-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h2-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h3-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h4-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h5-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h6-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h7-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h8-8477.txt', 'E:/Fall2015/Research/CFD/ddn110B/Transfer/xline\\h9-8477.txt']
working in: E:/Fall2015/Research/CFD/ddn110B/Transfer/xline on: h0-8477
Traceback (most recent call last):

  File "<ipython-input-52-0503f720722f>", line 54, in <module>
    data2 = np.loadtxt(filename, skiprows=1)

  File "E:\WinPython-64bit-2.7.10.3\python-2.7.10.amd64\lib\site-packages\numpy\lib\npyio.py", line 691, in loadtxt
    fh = iter(open(fname, 'U'))

IOError: [Errno 2] No such file or directory: 'h9-8477.txt'

Answer 1:

I hope I understood what you meant, as the question wasn't entirely clear. Once the user enters the flow time, only the files corresponding to the computed timestep are loaded and passed on to your plotting function.

I considered the following structure:

project/
| cfd_plot.py
+ sample/
| | h0-16144.txt
| | h1-16144.txt
| | h3-16144.txt
| | h0-25611.txt
| | h1-25611.txt
| | <...>

and here is cfd_plot.py

from __future__ import print_function
import numpy as np
from matplotlib import pyplot as plt
import os
import re

# pth is a path for plt to save the image
def xlineplots(X1, Y1, V1, Tr1, pth):
    _, ax = plt.subplots()
    ax.plot(X1, Tr1)
    ax.legend(['h0', 'h3', 'h5', 'h7', 'h9'], loc=0)
    ax.set_ylabel('Tracer Concentration')
    ax.set_xlabel('X (m)')
    ax.set_title('Tracer Concentration Variation along the Tank width')
    plt.figtext(.6, .6, "Flow Time = 2400s", style='normal', alpha=.5)
    plt.figtext(.6, .55, "Case: ddn110B", style='normal')
    plt.savefig(pth + '-hp1.png', format='png', dpi=600)

    _, ax = plt.subplots()
    ax.plot(X1, V1)
    ax.legend(['h0', 'h3', 'h5', 'h7', 'h9'], loc=0)
    ax.set_ylabel('V (m/s)')
    ax.set_xlabel('X (m)')
    ax.set_title('Vertical Velocity Variation along the Tank width')
    plt.figtext(.6, .6, "Flow Time = 2400s", style='normal', alpha=.5)
    plt.figtext(.6, .55, "Case: ddn110B", style='normal', alpha=.5)
    plt.savefig(pth + '-hv1.png', format='png', dpi=600)


FT_init = 3.1212
delt = .15  # Timestep size
TS_init = 165
flowtime = input("Enter the flow time required: ")

timestep = (float(flowtime) - FT_init) / delt
timestep = int(round(timestep + TS_init))  # int so that '{:d}' formatting also works on Python 2

reps = ['sample']  # location where the files are located

# first simple version
# files = []
# for rep in reps:  # recursive search for the files that match the timestep
#     for dirpath, dirnames, filenames in os.walk(rep):
#         for filename in [f for f in filenames if str(timestep) in f and f.endswith('.txt')]:
#             files.append(os.path.join(dirpath, filename))

# second version, using regular expressions
reg_exp = r'^.*-({:d})\.txt'.format(timestep)  # raw string keeps the \. escape intact

files = []
for rep in reps:  # recursive search for the files that match the timestep
    for dirpath, dirnames, filenames in os.walk(rep):
        for filename in [f for f in filenames if re.search(reg_exp, f)]:
            files.append(os.path.join(dirpath, filename))

print('timestep:', timestep)
print('file(s) found: ', files)

for file in files:
    directory = os.path.dirname(file)  # directory of the .txt file
    name = os.path.splitext(os.path.basename(file))[0]  # basename of the .txt file
    print('working in:', directory, 'on:', name)

    data2 = np.loadtxt(file, skiprows=1)
    data2 = data2[data2[:, 1].argsort()]
    X1 = data2[:, 1]  # Assign x-coordinate from the imported text file
    Y1 = data2[:, 2]  # Assign y-coordinate from the imported text file
    V1 = data2[:, 4]  # Assign y-velocity from the imported text file
    Tr1 = data2[:, 5]  # Assign Tracer Concentration from the imported text file

    # here you can give directory + name or just name to xlineplots
    xlineplots(X1, Y1, V1, Tr1, os.path.join(directory, name))
    # xlineplots(X1, Y1, V1, Tr1, name)

UPDATE: made some edits (comments)

UPDATE 2: the file search now uses a regular expression; the filter is '^.*-({:d})\.txt'.format(timestep):

^      match beginning of the line
.*     match any character (except newline), zero or multiple times
-      match the character -
({:d}) match the timestep, formatted as an integer
\.     match the character .
txt    match characters txt
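
As a quick check of the pattern described above, the sketch below runs it against a few made-up file names. The names and the timestep value are illustrative assumptions, not files from the question. Because re.search is used without a trailing $, a name such as h0-16144.txt.bak also matches; appending $ is one possible way to tighten the filter.

```python
import re

timestep = 16144  # illustrative value; in the script it comes from the flow-time formula
pattern = re.compile(r'^.*-({:d})\.txt'.format(timestep))

# hypothetical file names, used only to exercise the pattern
names = ['h0-16144.txt', 'h3-16144.txt', 'h0-16145.txt',
         'h1-116144.txt', 'h0-16144.txt.bak']

matches = [n for n in names if pattern.search(n)]
print(matches)

# stricter variant: anchor the end so that nothing may follow ".txt"
strict = re.compile(r'^.*-({:d})\.txt$'.format(timestep))
strict_matches = [n for n in names if strict.search(n)]
print(strict_matches)
```

Note that h1-116144.txt is rejected by both variants, since the dash must be followed immediately by the timestep.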

Answer 2:

Is the issue with generating file names, or finding file names that match a certain pattern?

I could rework your code with:

hs = [0,3,5,7,9]
timestep = 16144
filenames = ['h%s-%s.txt'%(h, timestep) for h in hs]  # include the .txt extension

for name in filenames:
    fname = os.path.join(path1, name)
    try:
        data = np.loadtxt(fname, skiprows=1)
    except IOError:
        # cannot open this file, most likely because it does not exist
        # continue with the next
        continue  
    ...

Here I'm generating filenames with the desired format, and loading and using each one, if possible.

I could do searches with glob or re applied to directory listings, but there's nothing wrong with my try-except approach. It is good Python style.
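
A self-contained way to try the generate-and-skip idea is sketched below. To avoid depending on NumPy it uses plain open() in place of np.loadtxt, and the helper name load_existing, the default list of lines, and the demo files are all assumptions made for illustration.

```python
import os
import tempfile

def load_existing(folder, timestep, lines=(0, 3, 5, 7, 9)):
    """Try each generated name; return {name: text} for the files that could be opened."""
    loaded = {}
    for h in lines:
        name = 'h%s-%s.txt' % (h, timestep)
        try:
            with open(os.path.join(folder, name)) as fh:
                loaded[name] = fh.read()
        except IOError:  # alias of OSError on Python 3
            continue     # file missing or unreadable: move on to the next name
    return loaded

# demo in a throwaway directory where only two of the five files exist
demo_dir = tempfile.mkdtemp()
for h in (0, 5):
    with open(os.path.join(demo_dir, 'h%s-16144.txt' % h), 'w') as fh:
        fh.write('header\n')

found = sorted(load_existing(demo_dir, 16144))
print(found)
```

Swapping the open() call for np.loadtxt(path, skiprows=1) should give the behaviour described above, since a missing file raises the same exception type.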

========================

Here's an example of using glob (in an IPython session):

First, a testdir with a bunch of files (created with touch):

In [9]: ls testdir
h1-123.txt  h12-1234.txt  h2-123.txt  h2-124.txt  h3-124.txt  h343.txt

In [10]: import glob

general search for files starting with h, ending with .txt:

In [11]: glob.glob('testdir/h*.txt')
Out[11]: 
['testdir/h2-124.txt',
 'testdir/h3-124.txt',
 'testdir/h12-1234.txt',
 'testdir/h343.txt',
 'testdir/h1-123.txt',
 'testdir/h2-123.txt']

narrow it to ones with 2 fields separated by dash

In [12]: glob.glob('testdir/h*-*.txt')
Out[12]: 
['testdir/h2-124.txt',
 'testdir/h3-124.txt',
 'testdir/h12-1234.txt',
 'testdir/h1-123.txt',
 'testdir/h2-123.txt']

restrict the 1st field to single character

In [13]: glob.glob('testdir/h?-*.txt')
Out[13]: 
['testdir/h2-124.txt',
 'testdir/h3-124.txt',
 'testdir/h1-123.txt',
 'testdir/h2-123.txt']

for a specific 'time' string:

In [14]: glob.glob('testdir/h?-123.txt')
Out[14]: ['testdir/h1-123.txt', 'testdir/h2-123.txt']

The search string could be created with string formatting

In [15]: times=123
In [16]: glob.glob('testdir/h?-%s.txt'%times)
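
Tying the formatted glob pattern back to the flow-time formula from the question, a minimal sketch might look like the following. The constants are copied from the question; the helper names timestep_for and files_for, and the empty demo files, are assumptions for illustration. The pattern uses h[0-9] rather than h? so that only a single digit is accepted after the h.

```python
import glob
import os
import tempfile

FT_init = 3.1212   # flow time before HPC (s), from the question
delt = 0.15        # timestep size (s), from the question
TS_init = 165      # time steps completed before HPC, from the question

def timestep_for(flowtime):
    """Convert a flow time in seconds to the nearest integer time step."""
    return int(round((float(flowtime) - FT_init) / delt + TS_init))

def files_for(folder, flowtime):
    """Return the sorted hA-<timestep>.txt paths in folder for the given flow time."""
    pattern = os.path.join(folder, 'h[0-9]-%d.txt' % timestep_for(flowtime))
    return sorted(glob.glob(pattern))  # glob returns files in arbitrary order

# demo with empty placeholder files in a temporary directory
demo_dir = tempfile.mkdtemp()
for name in ('h0-16144.txt', 'h3-16144.txt', 'h0-8477.txt'):
    open(os.path.join(demo_dir, name), 'w').close()

print(timestep_for(2400))
print([os.path.basename(p) for p in files_for(demo_dir, 2400)])
```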

========================

With os and re I could search like:

In [28]: import os
In [29]: import re
In [30]: filelist=os.listdir('./testdir')
In [31]: [n for n in filelist if re.match('h[1-9]-123',n) is not None]
Out[31]: ['h1-123.txt', 'h2-123.txt']

======================

If the file names have to have 4 digits (or whatever) in the name then use something like:

'h%d-%04d'%(3,123)   # 'h3-0123'
'testdir/h?-%04d.txt'%times

You need this sort of padding regardless of whether you use the try, glob or re.
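
The padding can also be wrapped in a small helper, sketched here with str.format. The name build_name and the default width of 4 are assumptions based on the h1-0500.txt example in the question.

```python
def build_name(line, timestep, width=4):
    """Build 'h<line>-<timestep>.txt', left-padding the timestep with zeros."""
    return 'h{:d}-{:0{w}d}.txt'.format(line, timestep, w=width)

print(build_name(1, 500))     # h1-0500.txt
print(build_name(3, 16144))   # h3-16144.txt
```

A minimum width only pads short numbers; a timestep with more digits than the width is written out in full, so the same helper covers both 0500 and 16144.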

Related question: "Add zeros as prefix to a calculated value based on the number of digits"

Source: https://stackoverflow.com/questions/33861074/choose-file-names-automatically-based-on-a-calculation-and-then-import-them-to-p

Tags

python

numpy

matplotlib

import

filenames