Question
I have tried several Python scripts to download a CSV attachment from Gmail, but none of them worked. Is this possible? If so, which Python script should I use? Thank you.
Answer 1:
TL;DR
If you want to skip all the details in this answer, I've put together a GitHub repo that makes getting CSV data from Gmail as simple as:
from gmail import *

service = get_gmail_service()

# get all attachments from e-mails containing 'test'
search_query = "test"
csv_dfs = query_for_csv_attachments(service, search_query)
print(csv_dfs)
Here is the repo: https://github.com/robertdavidwest/google_api
Just follow the instructions in the README, have fun, and please feel free to contribute!
THE LONG ANSWER - directly using google-api-python-client and oauth2client
Follow this link and click on the button: "ENABLE THE GMAIL API"
https://developers.google.com/gmail/api/quickstart/python
After the setup you will download a file called credentials.json.
Install the needed Python packages:
pip install --upgrade google-api-python-client oauth2client
The following code snippet will allow you to connect to your Gmail account via Python:
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

# Read-only access is enough for downloading attachments
SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
GMAIL_CREDENTIALS_PATH = 'credentials.json'  # downloaded
GMAIL_TOKEN_PATH = 'token.json'  # this will be created

store = file.Storage(GMAIL_TOKEN_PATH)
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets(GMAIL_CREDENTIALS_PATH, SCOPES)
    creds = tools.run_flow(flow, store)
service = build('gmail', 'v1', http=creds.authorize(Http()))
Now with this service you can read your e-mails and any attachments they contain.
First, query your e-mails with a search string to find the ids of the e-mails that have the attachments:
search_query = "ABCD"
results = service.users().messages().list(userId='me', q=search_query).execute()
msgs = results['messages']
msg_ids = [msg['id'] for msg in msgs]
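A search can match more messages than fit in one response, and a search with no matches returns a response without a 'messages' key. A minimal sketch that handles both cases, assuming the service object built earlier; the helper name list_all_message_ids is hypothetical:

```python
def list_all_message_ids(service, search_query, user_id='me'):
    """Collect the ids of every message matching search_query.

    Hypothetical helper: follows nextPageToken until the API stops
    returning one, and tolerates responses without a 'messages' key.
    """
    msg_ids = []
    page_token = None
    while True:
        response = service.users().messages().list(
            userId=user_id, q=search_query, pageToken=page_token).execute()
        msg_ids.extend(msg['id'] for msg in response.get('messages', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            return msg_ids
```

Called as msg_ids = list_all_message_ids(service, "ABCD"), it returns an empty list when nothing matches instead of raising a KeyError.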
Now for each messageId you can find the associated attachments in the e-mail. This part is a little messy, so bear with me. First we obtain a list of "attachment parts" (and attachment filenames) from the e-mail. These are the components of the e-mail that contain attachments:
messageId = 'XYZ'
msg = service.users().messages().get(userId='me', id=messageId).execute()
parts = msg.get('payload').get('parts')

# Flatten one level of nested parts
all_parts = []
for p in parts:
    if p.get('parts'):
        all_parts.extend(p.get('parts'))
    else:
        all_parts.append(p)

att_parts = [p for p in all_parts if p['mimeType'] == 'text/csv']
filenames = [p['filename'] for p in att_parts]
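The loop above flattens only one level of nesting, and a CSV attachment may arrive with a generic MIME type rather than text/csv. A sketch of a walk over parts at any depth that also matches on the .csv filename extension; find_csv_parts is a hypothetical helper that takes the 'payload' dictionary of a message, and requiring a non-empty filename is an assumption about what counts as an attachment:

```python
def find_csv_parts(payload):
    """Return every part under payload that looks like a CSV attachment."""
    found = []
    stack = [payload]
    while stack:
        part = stack.pop()
        # Multipart containers hold their children under 'parts';
        # pushing them reversed keeps the original document order.
        stack.extend(reversed(part.get('parts', [])))
        filename = part.get('filename', '')
        is_csv = (part.get('mimeType') == 'text/csv'
                  or filename.lower().endswith('.csv'))
        if filename and is_csv:
            found.append(part)
    return found
```

With msg fetched as above, att_parts = find_csv_parts(msg['payload']) also copes with a single-part message that has no 'parts' key.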
Now we can obtain the attached csv from each part:
messageId = 'XYZ'
part = att_parts[0]  # repeat for each part in att_parts
data = part['body'].get('data')
attachmentId = part['body'].get('attachmentId')
if not data:
    # No inline data: fetch the attachment body by its id
    att = service.users().messages().attachments().get(
        userId='me', id=attachmentId, messageId=messageId).execute()
    data = att['data']
Now you have the CSV data, but it's in an encoded format, so finally we decode it and convert the result into a pandas dataframe:
import base64
import pandas as pd
from io import StringIO  # on Python 2 this was `from StringIO import StringIO`

str_csv = base64.urlsafe_b64decode(data.encode('UTF-8')).decode('UTF-8')
df = pd.read_csv(StringIO(str_csv))
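The decoding step can also be wrapped in a small function that does not depend on pandas. A sketch using only the standard library; decode_csv_attachment is a hypothetical name, the UTF-8 default is an assumption about how the file was saved, and the padding fix is purely defensive in case the base64url string arrives without its trailing '=' characters:

```python
import base64
import csv
import io


def decode_csv_attachment(data, encoding='utf-8'):
    """Turn a base64url string into a list of CSV rows (lists of strings)."""
    # Restore any stripped '=' padding so the decoder accepts the input.
    padded = data + '=' * (-len(data) % 4)
    text = base64.urlsafe_b64decode(padded).decode(encoding)
    # newline='' leaves line endings for the csv module to interpret.
    return list(csv.reader(io.StringIO(text, newline='')))
```

The first row of the result holds the header, so rows = decode_csv_attachment(data) followed by header, body = rows[0], rows[1:] splits the two.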
And that's it! You have a pandas dataframe with the contents of the CSV attachment. You can work with this dataframe, or you could write it to disk with pd.DataFrame.to_csv if you simply want to download the CSV. You can use the list of filenames we obtained earlier if you want to preserve the filename.
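If the goal is only to download the file, the decoded bytes can be written straight to disk without going through pandas. A sketch, assuming data is the base64url string from the API and filename comes from the filenames list; save_attachment and the 'attachment.csv' fallback name are hypothetical:

```python
import base64
import os


def save_attachment(data, filename, directory='.'):
    """Decode base64url data, write it under directory, return the path."""
    # Keep only the final path component so the file stays inside directory.
    safe_name = os.path.basename(filename) or 'attachment.csv'
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, safe_name)
    with open(path, 'wb') as fh:
        fh.write(base64.urlsafe_b64decode(data))
    return path
```

For example, save_attachment(data, filenames[0], 'DataFiles') writes the first attachment into a DataFiles folder and returns where it was saved.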
Answer 2:
I got it. This is not entirely my own work: I found some code, combined the pieces, and modified them into the script below. In the end, it worked.
import email
import imaplib
import os

print('Proceeding')

userName = 'yourgmail@gmail.com'
passwd = 'yourpassword'  # Gmail may require an app password rather than the account password
detach_dir = '.'
# Attachments are saved to <detach_dir>/DataFiles
os.makedirs(os.path.join(detach_dir, 'DataFiles'), exist_ok=True)

try:
    imapSession = imaplib.IMAP4_SSL('imap.gmail.com')
    typ, accountDetails = imapSession.login(userName, passwd)
    if typ != 'OK':
        raise RuntimeError('Not able to sign in!')

    # The mailbox name contains a space, so it is quoted for Python 3's imaplib
    imapSession.select('"[Gmail]/All Mail"')
    typ, data = imapSession.search(None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('Error searching Inbox.')

    # Iterate over every e-mail in the mailbox
    for msgId in data[0].split():
        typ, messageParts = imapSession.fetch(msgId, '(RFC822)')
        if typ != 'OK':
            raise RuntimeError('Error fetching mail.')

        emailBody = messageParts[0][1]
        mail = email.message_from_bytes(emailBody)
        for part in mail.walk():
            # Skip containers and parts that are not attachments
            if part.get_content_maintype() == 'multipart':
                continue
            if part.get('Content-Disposition') is None:
                continue
            fileName = part.get_filename()
            if fileName:
                filePath = os.path.join(detach_dir, 'DataFiles', fileName)
                if not os.path.isfile(filePath):
                    print(fileName)
                    with open(filePath, 'wb') as fp:
                        fp.write(part.get_payload(decode=True))
    imapSession.close()
    imapSession.logout()
    print('Done')
except Exception as exc:
    print('Not able to download all attachments:', exc)
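The script above saves every attachment in the mailbox. To keep only CSV files, the message-parsing step can be isolated and filtered by filename. A sketch for Python 3 using the standard email package; extract_csv_attachments is a hypothetical helper that takes the raw bytes returned by fetch(msgId, '(RFC822)'):

```python
import email


def extract_csv_attachments(raw_message):
    """Return (filename, content_bytes) pairs for each CSV attachment."""
    message = email.message_from_bytes(raw_message)
    attachments = []
    for part in message.walk():
        filename = part.get_filename()
        # Parts without a filename (bodies, containers) are skipped.
        if filename and filename.lower().endswith('.csv'):
            attachments.append((filename, part.get_payload(decode=True)))
    return attachments
```

Inside the fetch loop this would be called as extract_csv_attachments(messageParts[0][1]), and each returned pair can then be written to the DataFiles folder.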
Source: https://stackoverflow.com/questions/41749236/download-a-csv-file-from-gmail-using-python