Extracting text from a file where date-time is the index

夙愿已清 submitted on 2019-12-12 04:42:30

Question


I have around 800 files, each between 55 KB and 100 KB, where the data is in this format:

Date,Time,Float1,Float2,Float3,Float4,Integer

Date is in DD/MM/YYYY format and Time is in the format of HH:MM

Here the date ranges from, say, 1 May to 1 June, and on each day the time runs from 09:00 to 15:30.

I want to run a program so that, for each file, it extracts the data pertaining to a particular given date and writes to a file.

I am trying to work out how to do a search-and-extract operation. I don't know how to do it and would like some ideas.

I have written the code below:

static void Main(string[] args)
    {
        string destpath = Directory.GetCurrentDirectory();
        destpath += "\\DIR";
        DirectoryInfo Dest = Directory.CreateDirectory(destpath);
        DirectoryInfo Source = new DirectoryInfo(Directory.GetCurrentDirectory() + "\\IEOD");
        FileInfo[] fiArr = Source.GetFiles("*.csv");
        Console.WriteLine("Search Date:");
        string srchdate = Console.ReadLine();
        String FileNewLine;
        String FileNewdt;
        FileInfo r;
        foreach (FileInfo f in fiArr)
        {
            r = new FileInfo(destpath + "\\" + f.Name);
            r.Create();
            StreamWriter Sw = r.AppendText();                
            StreamReader Sr = new StreamReader(f.FullName);

            while (Sr.Peek() >= 0)
            {
                FileNewLine = Sr.ReadLine();
                FileNewdt = FileNewLine.Substring(0,10);
                if (String.Compare(FileNewdt, srchdate, true) == 0)
                {
                    //write it to a file;
                    Console.WriteLine(FileNewLine);

                }
            }

        }
        Console.ReadKey();


    }

As of now, it should only write to the console. Writing with the StreamWriter will be added later, but I am facing a runtime error. It says, " 'C:\Documents and Settings\Soham Das\Desktop\Test\DIR\ABAN.csv' because it is being used by another process." Here ABAN is a file newly created by the code. The error occurs at StreamWriter Sw = r.AppendText()

Help appreciated. Thanks, Soham


Answer 1:


Now that you have edited the question to show that the delimiter is actually a comma instead of a slash (which would have conflicted with the date format) this becomes a lot easier. I've re-posted the answer from last night below.

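// Needs: using System; using System.Globalization;
// This would come from StreamReader.ReadLine() or something
string line = "02/06/2010,10:05,1.0,2.0,3.0,4.0,5";

string[] parts = line.Split(',');
// InvariantCulture keeps '/' and '.' literal regardless of the machine's regional settings
CultureInfo culture = CultureInfo.InvariantCulture;
DateTime date = DateTime.ParseExact(parts[0], "dd/MM/yyyy", culture);
TimeSpan time = TimeSpan.Parse(parts[1], culture);
date = date.Add(time); // adds the time to the date
float float1 = Single.Parse(parts[2], culture);
float float2 = Single.Parse(parts[3], culture);
float float3 = Single.Parse(parts[4], culture);
float float4 = Single.Parse(parts[5], culture);
int integer = Int32.Parse(parts[6], culture);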

Console.WriteLine("Date: {0:d}", date);
Console.WriteLine("Time: {0:t}", date);
Console.WriteLine("Float1: {0}", float1);
Console.WriteLine("Float2: {0}", float2);
Console.WriteLine("Float3: {0}", float3);
Console.WriteLine("Float4: {0}", float4);
Console.WriteLine("Integer: {0}", integer);

Obviously you can make it more resilient by adding error handling, using TryParse, etc. But this should give you a basic idea of how to manipulate strings in .NET.
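As a minimal sketch of the more resilient variant, the helper below uses DateTime.TryParseExact to pick out the lines for one date and copies them to a destination file. The names DateFilter, IsOnDate and CopyMatchingLines are hypothetical and not part of the question or this answer. Both streams are wrapped in using blocks so they are closed even if an exception occurs, and the StreamWriter constructor creates the file itself, so no separate FileInfo.Create() call is needed. Note that FileInfo.Create() returns an open FileStream; in the question's code that stream is never closed, which is a likely cause of the "being used by another process" error.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class DateFilter
{
    // True when the first comma-separated field of 'line' is 'wanted' in dd/MM/yyyy form.
    // Malformed lines (headers, blanks, garbage) simply return false.
    public static bool IsOnDate(string line, DateTime wanted)
    {
        if (string.IsNullOrEmpty(line)) return false;
        string firstField = line.Split(',')[0];
        DateTime parsed;
        return DateTime.TryParseExact(firstField, "dd/MM/yyyy", CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out parsed)
               && parsed.Date == wanted.Date;
    }

    // Copies the matching lines from sourcePath to destPath (overwriting destPath)
    // and returns how many lines were written.
    public static int CopyMatchingLines(string sourcePath, string destPath, DateTime wanted)
    {
        int written = 0;
        using (var reader = new StreamReader(sourcePath))
        using (var writer = new StreamWriter(destPath, false))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (IsOnDate(line, wanted))
                {
                    writer.WriteLine(line);
                    written++;
                }
            }
        }
        return written;
    }
}
```

Inside the question's foreach loop this could be called as DateFilter.CopyMatchingLines(f.FullName, Path.Combine(destpath, f.Name), searchDate), where searchDate is the user's input parsed once up front.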




Answer 2:


So 800 files of around 100 KB each add up to about 80 MB, which fits comfortably in memory. So why not build a small class like

public class Entry
{
    public DateTime Date {get; set;}
    public float Float1 {get; set;}
    public int Integer1 {get; set;}

    public Entry(string values)
    {
        //ToDo: Parse single line into properties
        //      e.g. use String.Split, RegEx, etc.
    }
}

You should also take care to implement GetHashCode() and Equals() (there is a good explanation in the book Essential C#). The class should also implement the IComparable<Entry> interface, which only needs something like

public int CompareTo(Entry rhs)
{
    return this.Date.CompareTo(rhs.Date);
}
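For illustration, here is one possible completion of the Entry class with the parsing, comparison and equality members filled in. It is a sketch under the assumption that every line has exactly the seven fields shown in the question; the field handling (only Float1 and the integer are kept, as in the skeleton above) and the hash-combining constant are my own choices, not something this answer prescribes.

```csharp
using System;
using System.Globalization;

public sealed class Entry : IComparable<Entry>, IEquatable<Entry>
{
    public DateTime Date { get; private set; }
    public float Float1 { get; private set; }
    public int Integer1 { get; private set; }

    // Parses one line of the form "dd/MM/yyyy,HH:mm,Float1,Float2,Float3,Float4,Integer".
    // Throws FormatException when the line does not match.
    public Entry(string values)
    {
        if (values == null) throw new ArgumentNullException("values");
        string[] parts = values.Split(',');
        if (parts.Length != 7) throw new FormatException("Expected 7 comma-separated fields.");
        CultureInfo culture = CultureInfo.InvariantCulture;
        // Date and time are combined into a single DateTime value.
        Date = DateTime.ParseExact(parts[0] + " " + parts[1], "dd/MM/yyyy HH:mm", culture);
        Float1 = float.Parse(parts[2], culture);
        Integer1 = int.Parse(parts[6], culture);
    }

    // Orders entries chronologically; a null entry sorts first.
    public int CompareTo(Entry other)
    {
        return other == null ? 1 : Date.CompareTo(other.Date);
    }

    public bool Equals(Entry other)
    {
        return other != null
            && Date == other.Date
            && Float1 == other.Float1
            && Integer1 == other.Integer1;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Entry);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (Date.GetHashCode() * 397 ^ Float1.GetHashCode()) * 397 ^ Integer1.GetHashCode();
        }
    }
}
```

Note that CompareTo looks only at the timestamp while Equals compares all stored fields, so two entries can compare as 0 without being equal. That is acceptable for sorting a list but matters if the class is ever used as a key in a sorted set or dictionary.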

Once you have this, you can easily do the following:

// SortedList<TKey, TValue> needs a key type, so collect into a plain list and sort it afterwards
var allEntries = new List<Entry>();

string currentLine = null;

using (var streamReader = new StreamReader("C:\\MyFile.txt"))
    while ((currentLine = streamReader.ReadLine()) != null)
    {
        try
        {
            var entry = new Entry(currentLine);
            allEntries.Add(entry);
        }
        catch (Exception)
        {
            //Do whatever you like
            //maybe just
            continue;
            //or rethrow instead with
            //throw;
        }
    }

What's still missing is reading all the files instead of a single one. That can be done with another loop over Directory.GetFiles(), which may itself sit inside a loop over Directory.GetDirectories().
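A rough sketch of that outer loop follows. Instead of nesting GetDirectories() and GetFiles() by hand, it relies on the Directory.GetFiles overload that takes SearchOption.AllDirectories. The CsvLoader class, the LoadAll name and the "*.csv" pattern are assumptions made for the example; the parse delegate stands in for something like line => new Entry(line).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvLoader
{
    // Reads every *.csv file under 'root' (including subdirectories) and converts
    // each line with 'parse'. Lines that fail with FormatException are skipped.
    public static List<T> LoadAll<T>(string root, Func<string, T> parse)
    {
        var items = new List<T>();
        foreach (string path in Directory.GetFiles(root, "*.csv", SearchOption.AllDirectories))
        {
            using (var reader = new StreamReader(path))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    try
                    {
                        items.Add(parse(line));
                    }
                    catch (FormatException)
                    {
                        // Skip malformed lines such as headers or blank rows.
                    }
                }
            }
        }
        return items;
    }
}
```

Catching only FormatException keeps genuine I/O failures visible instead of silently swallowing them.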

After reading all the files into your list, you can run whatever LINQ query comes to mind.
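For example, the query the question asks for (everything belonging to one given date, in chronological order) might look like the sketch below. EntryQueries and OnDate are hypothetical names, and the method is written generically with a timestamp selector so it stands alone; with the Entry class it would be called as EntryQueries.OnDate(allEntries, e => e.Date, searchDate).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EntryQueries
{
    // Returns the items whose timestamp falls on 'day', ordered by timestamp.
    public static List<T> OnDate<T>(IEnumerable<T> items, Func<T, DateTime> getTimestamp, DateTime day)
    {
        return items.Where(x => getTimestamp(x).Date == day.Date)
                    .OrderBy(getTimestamp)
                    .ToList();
    }
}
```

Comparing the .Date parts ignores the time of day, so all rows from 09:00 to 15:30 on the chosen day are returned.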



Source: https://stackoverflow.com/questions/2955027/extracting-text-from-a-file-where-date-time-is-the-index
