Parsing raw email in PHP

猫巷女王i 2020-12-23 20:50

I'm looking for good, working, simple-to-use PHP code for parsing raw email into parts.

I've written a couple of brute force solutions, but every time, one small cha

14 answers
  •  难免孤独
    2020-12-23 21:42

    What are you hoping to end up with? The body, the subject, the sender, an attachment? You should spend some time with RFC 2822 to understand the format of the mail, but here are the simplest rules for well-formed email:

    HEADERS\n
    \n
    BODY
    

    That is, the first blank line (double newline) is the separator between the HEADERS and the BODY. A HEADER looks like this:

    HSTRING:HTEXT
    

    HSTRING always starts at the beginning of a line and doesn't contain any white space or colons. HTEXT can contain a wide variety of text, including newlines as long as the newline char is followed by whitespace.

    The "BODY" is really just any data that follows the first double newline. (There are different rules if you are transmitting mail via SMTP, but processing it over a pipe you don't have to worry about that).
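
    The rules above can be sketched in a few lines of PHP. This is a minimal sketch for well-formed mail only, not a full RFC 2822 parser: `parseRawEmail()` is a hypothetical helper name, it assumes the whole message is already in a string, and it simplifies unfolding by collapsing each continuation to a single space.

    ```php
    <?php
    // Hypothetical helper (not a PHP built-in): split a raw message into
    // headers and body at the first blank line, then parse the header lines.
    function parseRawEmail(string $raw): array
    {
        // Normalize CRLF / CR line endings to LF so "\n\n" marks the blank line.
        $raw = str_replace(["\r\n", "\r"], "\n", $raw);

        // The first blank line separates HEADERS from BODY; the body may be absent.
        $parts = explode("\n\n", $raw, 2);
        $headerBlock = $parts[0];
        $body = $parts[1] ?? '';

        // Unfold continuations: a newline followed by whitespace belongs to
        // the previous header (simplified here to a single space).
        $headerBlock = preg_replace('/\n[ \t]+/', ' ', $headerBlock);

        $headers = [];
        foreach (explode("\n", $headerBlock) as $line) {
            $pos = strpos($line, ':');
            if ($pos === false) {
                continue; // not a well-formed HSTRING:HTEXT line, skip it
            }
            $name = strtolower(trim(substr($line, 0, $pos)));
            $value = trim(substr($line, $pos + 1));
            $headers[$name][] = $value; // some headers (e.g. Received) repeat
        }

        return ['headers' => $headers, 'body' => $body];
    }

    // Made-up sample message with one folded header.
    $raw = "Subject: Hello\r\nTo: a@example.com,\r\n  b@example.com\r\n\r\nFirst line\r\nSecond line\r\n";
    $mail = parseRawEmail($raw);
    echo $mail['headers']['subject'][0], "\n";
    echo $mail['headers']['to'][0], "\n";
    ```

    Header names are lower-cased because they are case-insensitive, and each name maps to a list because the same header can legitimately appear more than once.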

    So, in really simple, circa-1982 RFC 822 terms, an email looks like this:

    HEADER: HEADER TEXT
    HEADER: MORE HEADER TEXT
      INCLUDING A LINE CONTINUATION
    HEADER: LAST HEADER
    
    THIS IS ANY
    ARBITRARY DATA
    (FOR THE MOST PART)
    

    Most modern email is more complex than that, though. Headers can be encoded for charsets as RFC 2047 MIME words, or in a ton of other ways I'm not thinking of right now. Bodies are really hard to handle with hand-rolled code these days if you want them to be meaningful. Almost all email that's generated by an MUA will be MIME encoded. That might be base64-encoded text, it might be HTML, it might be a base64-encoded Excel spreadsheet.
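
    As a rough illustration of the decoding involved, PHP ships functions for the common cases. The sketch below assumes the iconv extension is available (it usually is) and that you have already read the part's `Content-Transfer-Encoding` header; `decodeBody()` is a hypothetical helper, and real multipart messages still need a proper MIME parser to split the parts first.

    ```php
    <?php
    // Hypothetical helper (not a PHP built-in): undo the transfer encoding
    // named in a part's Content-Transfer-Encoding header.
    function decodeBody(string $body, string $transferEncoding): string
    {
        switch (strtolower(trim($transferEncoding))) {
            case 'base64':
                return (string) base64_decode($body);
            case 'quoted-printable':
                return quoted_printable_decode($body);
            default:
                // 7bit, 8bit, binary: the data is already in its final form.
                return $body;
        }
    }

    // An RFC 2047 encoded word: UTF-8 charset, "B" (base64) encoding.
    $subject = iconv_mime_decode('Subject: =?UTF-8?B?SGVsbG8gV29ybGQ=?=', 0, 'UTF-8');
    echo $subject, "\n";

    echo decodeBody('SGVsbG8gV29ybGQ=', 'base64'), "\n";
    echo decodeBody('caf=C3=A9', 'quoted-printable'), "\n";
    ```

    Anything with an unrecognized transfer encoding is returned untouched here, which is a deliberate simplification rather than a recommendation.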

    I hope this helps provide a framework for understanding some of the most basic building blocks of email. If you provide more background on what you are trying to do with the data, I (or someone else) might be able to provide better direction.
