Download a CSV file from Gmail using Python

無奈伤痛 2020-12-17 06:40

I have tried several Python scripts to download a CSV attachment from Gmail, but none of them worked. Is this possible? If so, which Python script should I use?

4 answers
  •  情书的邮戳
    2020-12-17 07:14

    TL;DR

    • I've put together a GitHub repo that makes getting CSV data from Gmail as simple as:

      from gmail import *
      
      # connect once, then get all CSV attachments from e-mails containing 'test'
      search_query = "test"
      service = get_gmail_service()
      csv_dfs = query_for_csv_attachments(service, search_query)
      print(csv_dfs)
      
    • Follow the instructions in the README and feel free to contribute!

    THE LONG ANSWER (directly using google-api-python-client and oauth2client)

    • Follow this link and click the button: "ENABLE THE GMAIL API". After the setup you will download a file called credentials.json.
    • Install the needed Python packages:

      pip install --upgrade google-api-python-client oauth2client
      
    • The following code will allow you to connect to your Gmail account via Python:

      from googleapiclient.discovery import build
      from httplib2 import Http
      from oauth2client import file, client, tools
      
      # read-only access is enough for downloading attachments
      SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
      GMAIL_CREDENTIALS_PATH = 'credentials.json' # downloaded
      GMAIL_TOKEN_PATH = 'token.json' # this will be created
      
      store = file.Storage(GMAIL_TOKEN_PATH)
      creds = store.get()
      if not creds or creds.invalid:
          # first run: opens a browser window to authorize, then caches the token
          flow = client.flow_from_clientsecrets(GMAIL_CREDENTIALS_PATH, SCOPES)
          creds = tools.run_flow(flow, store)
      service = build('gmail', 'v1', http=creds.authorize(Http()))
      
    • With this service you can read your emails and any attachments.

    • First, query your e-mails with a search string to find the IDs of the e-mails that have the attachments:

      search_query = "ABCD"
      results = service.users().messages().list(userId='me', q=search_query).execute()
      msgs = results.get('messages', [])  # the key is absent when nothing matches
      msg_ids = [msg['id'] for msg in msgs]
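
    • The list call above returns a single page of results. When a query matches many e-mails, the response also carries a nextPageToken that you pass back as pageToken to get the next page. A minimal sketch of that loop, assuming the same service object as above (the helper name list_all_message_ids is made up for illustration):

```python
def list_all_message_ids(service, query, user_id='me'):
    """Collect the ids of every message matching `query`, across all pages."""
    ids = []
    page_token = None
    while True:
        kwargs = {'userId': user_id, 'q': query}
        if page_token:
            # only send pageToken once the API has handed us one
            kwargs['pageToken'] = page_token
        response = service.users().messages().list(**kwargs).execute()
        # 'messages' is missing from the response when nothing matches
        ids.extend(m['id'] for m in response.get('messages', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            return ids
```

      For example, msg_ids = list_all_message_ids(service, "ABCD") would replace the three lines above.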
      
    • For each messageId you can find the associated attachments in the email.

    • This part is a little messy so bear with me. First we obtain a list of "attachment parts" (and attachment filenames). These are components of the email that contain attachments:

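      messageId = 'XYZ'  # one of the ids in msg_ids
      msg = service.users().messages().get(userId='me', id=messageId).execute()
      # single-part messages have no 'parts' key, so fall back to an empty list
      parts = msg.get('payload', {}).get('parts') or []
      all_parts = []
      for p in parts:
          if p.get('parts'):
              # nested multipart: collect its children (one level deep)
              all_parts.extend(p.get('parts'))
          else:
              all_parts.append(p)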
      
      att_parts = [p for p in all_parts if p['mimeType']=='text/csv']
      filenames = [p['filename'] for p in att_parts]
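
    • The loop above only looks one level below the top-level parts. If your e-mails nest attachments deeper than that, a recursive walk may be more robust. The sketch below is one way to do it; the helper names are made up for illustration, and the extra filename check is an assumption that some mail clients label CSV files with a generic MIME type rather than text/csv:

```python
def iter_leaf_parts(part):
    """Yield every leaf part under `part`, at any nesting depth."""
    children = part.get('parts')
    if not children:
        yield part
        return
    for child in children:
        yield from iter_leaf_parts(child)


def find_csv_parts(payload):
    """Return leaf parts that look like CSV attachments."""
    return [
        p for p in iter_leaf_parts(payload)
        if p.get('mimeType') == 'text/csv'
        or p.get('filename', '').lower().endswith('.csv')
    ]
```

      With this, att_parts = find_csv_parts(msg['payload']) would stand in for the flattening loop and the mimeType filter.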
      
    • Now we can obtain the attached CSV from each part:

      messageId = 'XYZ'    # same message id as above
      part = att_parts[0]  # repeat for every part in att_parts
      data = part['body'].get('data')
      attachmentId = part['body'].get('attachmentId')
      if not data:
          att = service.users().messages().attachments().get(
                  userId='me', id=attachmentId, messageId=messageId).execute()
          data = att['data']
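
    • Since this has to be repeated for every attachment part, it may help to wrap the step in a small function. The following is only a sketch (the name fetch_part_data is hypothetical); it uses the inline data when the part carries it and otherwise makes the same attachments().get call as above:

```python
def fetch_part_data(service, message_id, part, user_id='me'):
    """Return the base64url-encoded data of one message part, or None."""
    body = part.get('body', {})
    data = body.get('data')
    if data:
        # small parts carry their data inline
        return data
    attachment_id = body.get('attachmentId')
    if not attachment_id:
        # neither inline data nor an attachment reference
        return None
    att = service.users().messages().attachments().get(
        userId=user_id, messageId=message_id, id=attachment_id).execute()
    return att.get('data')
```

      For example: all_data = [fetch_part_data(service, messageId, p) for p in att_parts].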
      
    • Now you have the CSV data, but it is base64url-encoded, so we decode it and load the result into a Pandas dataframe:

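      import base64
      import pandas as pd
      from io import StringIO  # Python 3 (the separate StringIO module was Python 2 only)
      
      # decode base64url to bytes, then to text (assumes the CSV is UTF-8 encoded)
      str_csv = base64.urlsafe_b64decode(data.encode('UTF-8')).decode('UTF-8')
      df = pd.read_csv(StringIO(str_csv))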
      
    • That's it! You now have a Pandas dataframe with the contents of the CSV attachment. You can work with this dataframe, or write it to disk with pd.DataFrame.to_csv if you simply want to download the file. Use the list of filenames obtained earlier if you want to preserve the original filename.
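
    • If all you want is the file on disk, you can also skip Pandas and write the decoded bytes directly, which keeps the attachment byte-for-byte as it was sent. A minimal sketch using only the standard library; the function name, the out_dir parameter and the 'attachment.csv' fallback are made up for illustration:

```python
import base64
import os


def save_csv_attachment(data, filename, out_dir='.'):
    """Decode base64url `data` and write it to out_dir/filename; return the path."""
    raw = base64.urlsafe_b64decode(data.encode('UTF-8'))
    # keep only the final path component so a crafted filename
    # cannot write outside out_dir
    safe_name = os.path.basename(filename) or 'attachment.csv'
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, safe_name)
    with open(path, 'wb') as fh:
        fh.write(raw)
    return path
```

      For example: save_csv_attachment(data, filenames[0], 'downloads').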
