Textblob - HTTPError: HTTP Error 429: Too Many Requests

青春壹個敷衍的年華 submitted on 2021-01-26 03:54:51

Question


I have a dataframe in which one column holds a list of strings in each row.

On average, each list has 150 words of about 6 characters each.

Each of the 700 rows of the dataframe represents a document, and each string is a word of that document; in other words, I have tokenised each document into words.

I want to detect the language of each document, and to do this I first try to detect the language of each word in the document.

To do so, I run the following:

from textblob import TextBlob

def lang_detect(document):
    # Count how many words of the document are detected as each language.
    lang_count = {}
    for word in document:
        # Only words of at least 4 characters are checked.
        if len(word) >= 4:
            # Each detect_language() call sends one HTTP request.
            lang_result = TextBlob(word).detect_language()
            lang_count[lang_result] = lang_count.get(lang_result, 0) + 1

    return lang_count

df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lambda x: lang_detect(x))

When I run this, I get the following error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-42-772df3809bcb> in <module>
     25 
---> 27 df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lambda x: lang_detect(x))
     28 
     29 
.
.
.

    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 429: Too Many Requests

The traceback is much longer; I have omitted the middle of it.

Now I get the same error even if I try this for only two documents/rows.

Is there any way to get responses from TextBlob for more words and documents?


Answer 1:


I had the same issue when I was trying to translate tweets. Once I exceeded the rate limit, it started returning the HTTP 429 Too Many Requests error.

So anyone who wants to work with TextBlob should check the rate limits first. Google provides information about its limits: https://cloud.google.com/translate/quotas?hl=en

If you exceed the rate limits, you have to wait until the quotas reset at midnight Pacific Time. It might take up to 24 hours before requests work again.

Alternatively, you can introduce a delay between your requests so that you do not overload the API server.

Example: translating a list of TextBlob sentences.

import time
...
translated = []
for sentence in list_of_sentences:
    translated.append(sentence.translate())  # one request per sentence
    time.sleep(1)  # pause for 1 second between requests
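
A fixed delay can still be too short or needlessly long. One common alternative is to retry only when a 429 actually occurs, doubling the wait each time. The following is a minimal sketch of that idea, not part of TextBlob: `call_with_backoff`, `max_retries`, `base_delay`, and `sleep` are hypothetical names chosen for illustration, and it assumes the failure surfaces as `urllib.error.HTTPError`, as in the traceback in the question. It will not help once a daily quota is exhausted.

```python
import time
from urllib.error import HTTPError


def call_with_backoff(func, *args, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(*args); on HTTP 429, wait and retry with a doubling delay.

    All parameter names here are illustrative assumptions.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args)
        except HTTPError as err:
            # Re-raise errors that are not rate-limit errors,
            # and give up once the retries are used up.
            if err.code != 429 or attempt == max_retries:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

With the loop above, the call inside it could become something like `call_with_backoff(sentence.translate)`, so that a delay is only paid when the server asks for one.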



Answer 2:


You can try Googletrans.

"Googletrans is a free and unlimited Python library that implemented Google Translate API. This uses the Google Translate Ajax API to make calls to such methods as detect and translate."

Similarly to TextBlob, Googletrans offers language detection and translation. It worked pretty well for me when I was flagging the language of, and translating, a large number of emails.

(When using TextBlob I tried time.sleep(1), but I still ended up reaching the API limit...)
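
If a delay alone is not enough, another option is to send fewer requests. The code in the question issues one request per word, which for 700 documents of about 150 words each comes to as many as 105,000 requests; detecting the language once per document would need at most 700. Here is a minimal sketch of that idea, assuming the detector is passed in as a function. `detect_per_document`, `detect`, and `max_chars` are hypothetical names, not part of TextBlob or Googletrans.

```python
def detect_per_document(documents, detect, max_chars=1000):
    """Return one detected language per document, using one detect() call each.

    documents: iterable of token lists (one list of words per document).
    detect:    any function mapping a string to a language code (assumption).
    max_chars: illustrative cap on how much text is sent per request.
    """
    results = []
    for tokens in documents:
        # Join the tokens back into text and keep only a prefix as a sample.
        text = " ".join(tokens)[:max_chars]
        # Skip empty documents instead of spending a request on them.
        results.append(detect(text) if text.strip() else None)
    return results


# Possible wiring with the names from the question (not executed here):
# df_per_doc['language'] = detect_per_document(
#     df_per_doc['complete_text'],
#     lambda text: TextBlob(text).detect_language(),
# )
```

This gives a single language per document rather than per-word counts, so it suits documents that are mostly in one language.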



Source: https://stackoverflow.com/questions/56189054/textblob-httperror-http-error-429-too-many-requests
