I tried different python scripts for download a CSV attachment from Gmail. But I could not able to get it.Is this possible. If it is possible which python script should I us
I've put together a Github repo that makes getting CSV data from Gmail as simple as:
from gmail import *
service = get_gmail_service()
# get all attachments from e-mails containing 'test'
search_query = "test"
service = get_gmail_service()
csv_dfs = query_for_csv_attachments(service, search_query)
print(csv_dfs)
Follow the instructions in the README and feel free to contribute!
google-api-python-client and oauth2client)credentials.json.Install the needed Python packages:
pip install --upgrade google-api-python-client oauth2client
The following code will allow you to connect to your Gmail account via Python:
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
GMAIL_CREDENTIALS_PATH = 'credentials.json' # downloaded
GMAIL_TOKEN_PATH = 'token.json' # this will be created
store = file.Storage(GMAIL_TOKEN_PATH)
creds = store.get()
if not creds or creds.invalid:
flow = client.flow_from_clientsecrets(GMAIL_CREDENTIALS_PATH, SCOPES)
creds = tools.run_flow(flow, store)
service = build('gmail', 'v1', http=creds.authorize(Http()))
With this service you can read your emails and any attachments.
First you can query your e-mails with a search string to find the e-mail id's that have the attachments:
search_query = "ABCD"
result = service.users().messages().list(userId='me', q=search_query).execute()
msgs = results['messages']
msg_ids = [msg['id'] for msg in msgs]
For each messageId you can find the associated attachments in the email.
This part is a little messy so bear with me. First we obtain a list of "attachment parts" (and attachment filenames). These are components of the email that contain attachments:
messageId = 'XYZ'
msg = service.users().messages().get(userId='me', id=messageId).execute()
parts = msg.get('payload').get('parts')
all_parts = []
for p in parts:
if p.get('parts'):
all_parts.extend(p.get('parts'))
else:
all_parts.append(p)
att_parts = [p for p in all_parts if p['mimeType']=='text/csv']
filenames = [p['filename'] for p in att_parts]
Now we can obtain the attached CSV from each part:
messageId = 'XYZ'
data = part['body'].get('data')
attachmentId = part['body'].get('attachmentId')
if not data:
att = service.users().messages().attachments().get(
userId='me', id=attachmentId, messageId=messageId).execute()
data = att['data']
Now you have the CSV data but it's in an encoded format, so we change the encoding and convert the result into a Pandas dataframe:
import base64
import pandas as pd
from StringIO import StringIO
str_csv = base64.urlsafe_b64decode(data.encode('UTF-8'))
df = pd.read_csv(StringIO(str_csv))
That's it! you have a Pandas dataframe with the contents of the CSV attachment. You can work with this dataframe or write it to disk with pd.DataFrame.to_csv if you simply want to download it. You can use the list of filenames obtained earlier if you want to preserve the filename.