Batch fetching messages performance

你的背包 2021-02-10 09:27

I need to get the last 100 messages in the INBOX (headers only). For that I'm currently using the IMAP extension to search and then fetch the messages. This is done with two re

4 answers
  • 2021-02-10 10:03

    In addition to MaK's answer: you can send several requests in one batch using the google-api-php-client and Google_Http_Batch().

    $optParams = [];
    $optParams['maxResults'] = 5;
    $optParams['labelIds'] = 'INBOX'; // Only show messages in Inbox
    $optParams['q'] = 'subject:hello'; // search for hello in subject
    
    // First request: list the IDs of the matching messages
    $messages = $service->users_messages->listUsersMessages($email_id, $optParams);
    $list = $messages->getMessages();
    
    // From here on, API calls return request objects instead of executing
    $client->setUseBatch(true);
    $batch = new Google_Http_Batch($client);
    
    foreach ($list as $message_data) {
        $message_id = $message_data->getId();
        $optParams = array('format' => 'full');
        $request = $service->users_messages->get($email_id, $message_id, $optParams);
        $batch->add($request, $message_id);
    }
    
    // Second request: fetch all the messages in a single batch call
    $results = $batch->execute();
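
When the list of message IDs is long, it may need to be split across several batch calls, since Gmail documents a cap of 100 calls per batch request. The following is a minimal Python sketch of that splitting step only. `chunk_ids` and `DEFAULT_BATCH_SIZE` are hypothetical names, and the value 50 is an assumed conservative default, not something taken from the answer above.

```python
from typing import Iterator, List, Sequence

# Assumption: 50 IDs per batch, chosen to stay below Gmail's documented
# cap of 100 calls per batch request.
DEFAULT_BATCH_SIZE = 50


def chunk_ids(message_ids: Sequence[str],
              size: int = DEFAULT_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of message_ids, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(message_ids), size):
        yield list(message_ids[start:start + size])


if __name__ == "__main__":
    ids = [f"msg{i}" for i in range(120)]
    for chunk in chunk_ids(ids):
        # Each chunk would be sent as its own batch request.
        print(len(chunk), chunk[0], chunk[-1])
```

Each chunk can then be passed to whichever batching code is in use, one batch per chunk.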
    
  • 2021-02-10 10:04

    Here is the Python version, using the official Google API client. Note that I did not use the callback here, because I need to handle the responses synchronously.

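    import json
    
    # new_batch_http_request() returns a BatchHttpRequest that uses the
    # service's own batch endpoint
    batch = service.new_batch_http_request()
    
    # assume we got messages from the Gmail query API
    for message in messages:
        batch.add(service.users().messages().get(userId='me', id=message['id'],
                                                 format='raw'))
    
    # send the batch; without this call no responses are recorded
    batch.execute()
    
    # _order and _responses are private attributes of BatchHttpRequest,
    # so this may break between library versions
    for request_id in batch._order:
        resp, content = batch._responses[request_id]
        message = json.loads(content)
        # handle your message here, like a regular email object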
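
To treat a `format='raw'` result as a regular email object, the `raw` field has to be base64url-decoded first. The sketch below shows one way to do that with the standard library. `parse_raw_message` is a hypothetical helper name, and the sample message is made up for illustration.

```python
import base64
from email import message_from_bytes, policy
from email.message import EmailMessage


def parse_raw_message(message: dict) -> EmailMessage:
    """Decode the base64url `raw` field of a Gmail message resource."""
    raw = message["raw"]
    # Restore any stripped '=' padding before decoding.
    padded = raw + "=" * (-len(raw) % 4)
    return message_from_bytes(base64.urlsafe_b64decode(padded),
                              policy=policy.default)


if __name__ == "__main__":
    # Hypothetical message, encoded the way format='raw' responses are.
    rfc822 = b"From: alice@example.com\r\nSubject: hello\r\n\r\nHi there\r\n"
    sample = {"id": "abc123",
              "raw": base64.urlsafe_b64encode(rfc822).decode("ascii")}
    parsed = parse_raw_message(sample)
    print(parsed["From"], "|", parsed["Subject"])
```

If only a few headers are needed, requesting `format='metadata'` instead is likely to transfer much less data than `raw`.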
    
  • 2021-02-10 10:05

    Great reply!
    If anybody wants a plain PHP function (no client library) that makes a batch request to fetch the emails for a list of message IDs, feel free to use mine.

    function perform_batch_operation($auth_token, $gmail_api_key, $email_id, $message_ids, $BOUNDARY = "gmail_data_boundary"){
        // Build one nested GET request per message ID
        $post_body = "";
        foreach ($message_ids as $message_id) {
            $post_body .= "--$BOUNDARY\n";
            $post_body .= "Content-Type: application/http\n\n";
            $post_body .= 'GET https://www.googleapis.com/gmail/v1/users/'.$email_id.
                    '/messages/'.$message_id.'?metadataHeaders=From&metadataHeaders=Date&format=metadata&key='.urlencode($gmail_api_key)."\n\n";
        }
        $post_body .= "--$BOUNDARY--\n";
    
        $headers = [ 'Content-Type: multipart/mixed; boundary='.$BOUNDARY, 'Authorization: Bearer '.$auth_token ];
    
        $curl = curl_init();
        // Gmail-specific batch endpoint
        curl_setopt($curl, CURLOPT_URL, 'https://www.googleapis.com/batch/gmail/v1');
        curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "POST");
        curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
        curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 60);
        curl_setopt($curl, CURLOPT_TIMEOUT, 60);
        curl_setopt($curl, CURLOPT_POSTFIELDS, $post_body);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
        // Keep TLS certificate and host verification enabled
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, true);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 2);
        $tmp_response = curl_exec($curl);
        curl_close($curl);
        return $tmp_response;
    }
    

    FYI, the above function fetches only the headers of each email, specifically the From and Date fields. Adjust the query parameters according to the API documentation: https://developers.google.com/gmail/api/v1/reference/users/messages/get
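
With `format=metadata`, each message resource carries its headers as a list of name/value pairs under `payload.headers`. A small Python sketch for turning that list into a lookup table might look like this. `headers_to_dict` is a hypothetical helper name, and the sample message is invented.

```python
from typing import Dict


def headers_to_dict(message: dict) -> Dict[str, str]:
    """Map lower-cased header names to values for a metadata-format message."""
    headers = message.get("payload", {}).get("headers", [])
    # Header names are case-insensitive, so normalise them to lower case.
    # If a header repeats, the last value wins in this simple sketch.
    return {h["name"].lower(): h["value"] for h in headers}


if __name__ == "__main__":
    # Hypothetical message shaped like a format=metadata response.
    sample = {
        "id": "abc123",
        "payload": {
            "headers": [
                {"name": "From", "value": "alice@example.com"},
                {"name": "Date", "value": "Wed, 10 Feb 2021 09:27:00 +0000"},
            ]
        },
    }
    print(headers_to_dict(sample)["from"])
```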

  • 2021-02-10 10:18

    It's pretty much the same in the Gmail API as in IMAP: two requests. The first is messages.list to get the message IDs; the second is a (batched) messages.get to retrieve the ones you want. Depending on the language you're using, the client libraries may help with constructing the batch request.
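
As a rough sketch of that two-request flow with google-api-python-client: `fetch_inbox_headers` is a hypothetical helper name, `service` is assumed to be a Gmail v1 client built elsewhere (for example with `googleapiclient.discovery.build`), and the choice of the From, Date and Subject headers is only illustrative.

```python
def fetch_inbox_headers(service, user_id="me", max_results=100):
    """List INBOX message IDs, then fetch their headers in one batch.

    Returns (messages, errors): messages in listing order, and a dict of
    request_id -> exception for the calls that failed.
    """
    # Request 1: messages.list returns only IDs and thread IDs.
    listing = service.users().messages().list(
        userId=user_id, labelIds=["INBOX"], maxResults=max_results
    ).execute()
    message_ids = [m["id"] for m in listing.get("messages", [])]

    results = {}
    errors = {}

    def on_response(request_id, response, exception):
        # The client library calls this once per nested request.
        if exception is not None:
            errors[request_id] = exception
        else:
            results[request_id] = response

    # Request 2: one batched HTTP call containing a messages.get per ID.
    batch = service.new_batch_http_request(callback=on_response)
    for message_id in message_ids:
        batch.add(
            service.users().messages().get(
                userId=user_id,
                id=message_id,
                format="metadata",
                metadataHeaders=["From", "Date", "Subject"],
            ),
            request_id=message_id,
        )
    if message_ids:
        batch.execute()

    return [results[i] for i in message_ids if i in results], errors
```

Using `format="metadata"` keeps the responses to headers only, which matches what the question asks for.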

    "A batch request is a single standard HTTP request containing multiple Google Cloud Storage JSON API calls, using the multipart/mixed content type. Within that main HTTP request, each of the parts contains a nested HTTP request."

    From: https://developers.google.com/storage/docs/json_api/v1/how-tos/batch

    It's really not that hard; it took me about an hour to figure out in Python, even without the Python client libraries (just using httplib and mimelib).

    Here's a partial code snippet for doing it, again in plain Python. Hopefully it makes clear that there's not too much involved:

    msg_ids = [msg['id'] for msg in body['messages']]
    headers['Content-Type'] = 'multipart/mixed; boundary=%s' % self.BOUNDARY
    
    # One part per message: a part header, a blank line, then the nested GET
    post_body = []
    for msg_id in msg_ids:
        post_body.append(
            "--%s\n"
            "Content-Type: application/http\n\n"
            "GET /gmail/v1/users/me/messages/%s?format=raw\n"
            % (self.BOUNDARY, msg_id))
    # The closing boundary marks the end of the batch
    post_body.append("--%s--\n" % self.BOUNDARY)
    post = '\n'.join(post_body)
    # request() returns the response headers and the multipart response body
    (headers, body) = _conn.request(
        SERVER_URL + '/batch',
        method='POST', body=post, headers=headers)
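
The response to a batch request is itself multipart/mixed, with one nested HTTP response per part. A minimal sketch of splitting it up with the standard library could look like the following. `parse_batch_response` is a hypothetical helper name, and the code assumes each nested response body is JSON.

```python
import json
import re
from email import message_from_bytes
from typing import Any, List, Tuple


def parse_batch_response(content_type: str, body: bytes) -> List[Tuple[int, Any]]:
    """Split a multipart/mixed batch response into (status, parsed_json) pairs."""
    # Prepend the outer Content-Type so the email parser can find the boundary.
    wrapped = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    outer = message_from_bytes(wrapped)
    if not outer.is_multipart():
        raise ValueError("not a multipart response")

    results = []
    for part in outer.get_payload():
        # Each part holds one nested HTTP response:
        # status line, headers, blank line, body.
        nested = part.get_payload(decode=True)
        pieces = re.split(rb"\r?\n\r?\n", nested, maxsplit=1)
        head = pieces[0]
        payload = pieces[1] if len(pieces) > 1 else b""
        status = int(head.split(None, 2)[1])
        results.append((status, json.loads(payload) if payload.strip() else None))
    return results
```

Here `content_type` is the Content-Type header of the outer HTTP response (it carries the boundary) and `body` is the raw response body. The pairs come back in the order the parts appear in the response.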
    