Long story short, I created a new gmail account, and linked several other accounts to it (each with 1000s of messages), which I am importing. All imported messages arrive as
Rather than try to parse our HTML why not just use the IMAP interface? Hook it up to a standard mail client and then just sort by date and mark whichever ones you want as read.