I have a directory of .eml files that contain email conversations. Is there a recommended approach in C# for parsing files of this type?
Follow this link for a good solution:
The article boils down to 4 steps (the second step below is missing from the article but is required):
Add a reference to "Microsoft CDO for Windows 2000 Library", which can be found on the ‘COM’ tab in the Visual Studio ‘Add reference’ dialog. This adds two references, "ADODB" and "CDO", to your project.
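Disable embedding of interop types for the two references "ADODB" and "CDO" (References -> ADODB -> Properties -> set 'Embed Interop Types' to False, then repeat the same for CDO). Without this, the code below does not compile, because classes such as CDO.MessageClass cannot be instantiated from embedded interop types.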
Add the following method to your code:
    protected CDO.Message ReadMessage(String emlFileName)
    {
        CDO.Message msg = new CDO.MessageClass();
        ADODB.Stream stream = new ADODB.StreamClass();

        // Open an empty stream that is not bound to any source.
        stream.Open(Type.Missing,
                    ADODB.ConnectModeEnum.adModeUnknown,
                    ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified,
                    String.Empty,
                    String.Empty);

        // Load the raw .eml content into the stream.
        stream.LoadFromFile(emlFileName);
        stream.Flush();

        // Bind the message to the stream so CDO parses the content.
        msg.DataSource.OpenObject(stream, "_Stream");
        msg.DataSource.Save();

        return msg;
    }
Call this method with the full path of your .eml file. The CDO.Message object it returns exposes the parsed fields you need, including To, From, Subject, and the body (TextBody or HTMLBody).
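Since the question is about a whole directory of .eml files, here is a minimal sketch of how the method might be called in a loop. The class name EmlReader and the method PrintMessageSummaries are made up for illustration and are not from the article. The sketch assumes ReadMessage lives in the same class (for example in another part of a partial class) and that the ADODB and CDO references are set up as described above.

```csharp
using System;
using System.IO;

// Hypothetical class name; the class must also contain ReadMessage from above.
public partial class EmlReader
{
    // Hypothetical helper: prints a short summary of each message in a directory.
    protected void PrintMessageSummaries(string directory)
    {
        // Look at files matching *.eml directly inside the directory (no subfolders).
        foreach (string path in Directory.GetFiles(directory, "*.eml"))
        {
            CDO.Message msg = ReadMessage(path);

            Console.WriteLine("File:    " + Path.GetFileName(path));
            Console.WriteLine("From:    " + msg.From);
            Console.WriteLine("To:      " + msg.To);
            Console.WriteLine("Subject: " + msg.Subject);
            Console.WriteLine("Body:    " + msg.TextBody);
            Console.WriteLine();
        }
    }
}
```

TextBody holds the plain-text part of the message. For messages that only carry HTML it may be empty, in which case HTMLBody is probably the property you want instead.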