How can I download the chat history of a group in Telegram?

Submitted by 旧街凉风 on 2020-04-08 09:12:08

Question


I would like to download the chat history (all messages) posted in a public group on Telegram. How can I do this with Python?

I've found this method in the API, https://core.telegram.org/method/messages.getHistory, which looks like what I'm trying to do. But how do I actually call it? There seem to be no Python examples for the MTProto protocol they use.

I also looked at the Bot API, but it doesn't seem to have a method to download messages.


Answer 1:


You can use Telethon. The Telegram API is fairly complicated, and with Telethon you can start using it in a very short time without any prior knowledge of the API.

pip install telethon

Then register your app at https://my.telegram.org/ to obtain the API ID and API hash.

Then, to obtain the message history of a group (assuming you have the group ID):

chat_id = YOUR_CHAT_ID
api_id = YOUR_API_ID
api_hash = 'YOUR_API_HASH'

from telethon import TelegramClient
from telethon.tl.types.input_peer_chat import InputPeerChat

# Note: this targets an old Telethon release, where get_message_history
# returned a (total_count, messages, senders) tuple.
client = TelegramClient('session_id', api_id=api_id, api_hash=api_hash)
client.connect()
chat = InputPeerChat(chat_id)

total_count, messages, senders = client.get_message_history(
    chat, limit=10)

# Messages arrive newest first, so reverse them to print oldest first
for msg in reversed(messages):
    # Format the message content
    if getattr(msg, 'media', None):
        content = '<{}> {}'.format(  # The media may or may not have a caption
            msg.media.__class__.__name__,
            getattr(msg.media, 'caption', ''))
    elif hasattr(msg, 'message'):
        content = msg.message
    elif hasattr(msg, 'action'):
        content = str(msg.action)
    else:
        # Unknown message, simply print its class name
        content = msg.__class__.__name__

    # Five placeholders for five arguments; the sender name is not resolved here
    text = '[{}:{}] (ID={}) {}: {}'.format(
        msg.date.hour, msg.date.minute, msg.id, "no name",
        content)
    print(text)

This example is taken and simplified from the Telethon examples.




Answer 2:


Since an update in August 2018, the Telegram Desktop application can export chat history directly. You can save it as JSON or HTML.

To use this feature, make sure you have the latest version of Telegram Desktop installed on your computer, then click Settings > Export Telegram data.

https://telegram.org/blog/export-and-more
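
Once the export finishes, the JSON file can be post-processed in Python. The sketch below assumes a single-chat export whose JSON file has a top-level `messages` list with `date`, `from`, and `text` fields, where `text` is either a plain string or a list mixing strings with formatted fragments that carry their own `text` key. That layout is an assumption about the export format, so check it against your own file; the helper names and the `result.json` path are illustrative only.

```python
import json

def flatten_text(text):
    """Return plain text from an exported `text` field (str or list of parts)."""
    if isinstance(text, str):
        return text
    parts = []
    for part in text:
        # Assumed: formatted fragments are dicts carrying their own "text" key.
        parts.append(part if isinstance(part, str) else part.get("text", ""))
    return "".join(parts)

def iter_exported_messages(export):
    """Yield (date, sender, text) tuples from a parsed export dictionary."""
    for msg in export.get("messages", []):
        yield msg.get("date"), msg.get("from"), flatten_text(msg.get("text", ""))

def load_export(path):
    """Parse an exported JSON file into a dictionary."""
    with open(path, encoding="utf-8") as fp:
        return json.load(fp)
```

With a real export this could be used as `for date, sender, text in iter_exported_messages(load_export("result.json")): print(date, sender, text)`.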




Answer 3:


You can now use Telegram Desktop (TDesktop) to export chats.

See the blog post about the August 2018 update.


Original Answer:

Telegram's MTProto is hard for newcomers to use, so I recommend telegram-cli.

You can also use the third-party tg-export script, though it is still not easy for newcomers.




Answer 4:


A language-agnostic way to use the Telegram API is the service at https://www.t-a-a-s.ru/.

You need to sign in and create an API key. Then you can make the following request to get the chat history:

GET https://www.t-a-a-s.ru/client
{
  "api_key": "YOUR_API_KEY",
  "@type": "getChatHistory",
  "chat_id": "xxxxxxxxxxx",
  "from_message_id": "0",
  "offset": 0,
  "limit": 100
}
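
From Python, the same request could be assembled with the standard library. This is only a sketch: the endpoint and field names are copied from the request above, while `build_history_request` and `fetch_history` are hypothetical helper names. Whether the service really accepts a JSON body on a GET request, and what the reply looks like, are assumptions to verify against the service's own documentation.

```python
import json
import urllib.request

TAAS_URL = "https://www.t-a-a-s.ru/client"  # endpoint quoted in the answer

def build_history_request(api_key, chat_id, from_message_id=0, limit=100):
    """Build (but do not send) the getChatHistory request."""
    payload = {
        "api_key": api_key,
        "@type": "getChatHistory",
        "chat_id": str(chat_id),
        "from_message_id": str(from_message_id),
        "offset": 0,
        "limit": limit,
    }
    body = json.dumps(payload).encode("utf-8")
    # The answer shows GET with a JSON body; the service may expect POST
    # instead, which is an assumption to check before relying on this.
    return urllib.request.Request(
        TAAS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",
    )

def fetch_history(api_key, chat_id, **kwargs):
    """Send the request and decode the JSON reply (its shape is not assumed)."""
    request = build_history_request(api_key, chat_id, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.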



Answer 5:


The currently accepted answer is for very old versions of Telethon. With Telethon 1.0, the code can and should be simplified to the following:

chat_id = ...
api_id = ...
api_hash = ...

from telethon.sync import TelegramClient

client = TelegramClient('session_id', api_id, api_hash)

with client:
    # 10 is the limit on how many messages to fetch. Remove or change for more.
    for msg in client.iter_messages(chat_id, 10):
        print(msg.sender.first_name, ':', msg.text)

Applying formatting is still possible, but hasattr is no longer needed. For example, `if msg.media` is enough to check whether the message has media.
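
As a rough illustration of formatting without hasattr, the sketch below rebuilds the per-message line using plain truthiness checks. `format_message` is a hypothetical helper, and it assumes each message exposes `id`, `date`, `media`, `text`, and `action` attributes; the `SimpleNamespace` object only stands in for a real message so the snippet runs without a Telegram connection.

```python
from datetime import datetime
from types import SimpleNamespace

def format_message(msg):
    """Format one message-like object as a single log line."""
    if msg.media:
        # Media message: show the media class name plus any accompanying text.
        content = '<{}> {}'.format(type(msg.media).__name__, msg.text or '').strip()
    elif msg.text:
        content = msg.text
    elif msg.action:
        # Service message (for example, a user joining the group).
        content = str(msg.action)
    else:
        # Unknown message, fall back to its class name.
        content = type(msg).__name__
    return '[{:%H:%M}] (ID={}) {}'.format(msg.date, msg.id, content)

# Stand-in object for demonstration only; real messages come from iter_messages.
demo = SimpleNamespace(id=7, date=datetime(2020, 4, 8, 9, 12),
                       media=None, text='hello', action=None)
print(format_message(demo))  # prints "[09:12] (ID=7) hello"
```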

Note that if you're using Jupyter, you need to use async directly:

from telethon import TelegramClient

# chat_id, api_id and api_hash are defined as in the previous snippet
client = TelegramClient('session_id', api_id, api_hash)

# Note `async with` and `async for`
async with client:
    async for msg in client.iter_messages(chat_id, 10):
        print(msg.sender.first_name, ':', msg.text)
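
To save the whole history instead of printing a few messages, one option is to drop the limit and write each message to a file. The following is a minimal sketch, not a definitive implementation: `message_to_record`, `dump_messages`, `download_history`, and the `history.jsonl` path are illustrative names, and the choice of fields (`id`, `date`, `sender_id`, `text`) and of a JSON Lines file is an assumption about what you want to keep.

```python
import json

def message_to_record(msg):
    """Reduce a message-like object to a JSON-serializable dictionary."""
    return {
        "id": msg.id,
        "date": msg.date.isoformat(),
        "sender_id": msg.sender_id,
        "text": msg.text,
    }

def dump_messages(messages, fp):
    """Write one JSON object per line and return how many were written."""
    count = 0
    for msg in messages:
        fp.write(json.dumps(message_to_record(msg), ensure_ascii=False) + "\n")
        count += 1
    return count

def download_history(chat, api_id, api_hash, path="history.jsonl"):
    """Fetch every message of `chat` and store it at `path`."""
    # Imported here so the helpers above work without Telethon installed.
    from telethon.sync import TelegramClient
    with TelegramClient('session_id', api_id, api_hash) as client, \
            open(path, "w", encoding="utf-8") as fp:
        # No limit argument: iterate over the entire history.
        return dump_messages(client.iter_messages(chat), fp)
```

Writing line by line keeps memory use flat even for large groups, since no full list of messages is built first.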



Answer 6:


You can use telepot (see its documentation) for Python, e.g.:

import telepot

token = 'your_token'
bot = telepot.Bot(token)
# getUpdates returns the list of pending updates directly
tmp_history = bot.getUpdates()
print(tmp_history)

However, you may run into the limit of 100 records when reading the history this way.
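
One way to work around the 100-record cap is to request updates in batches, passing an offset one past the last `update_id` received. The sketch below assumes `get_updates` behaves like `bot.getUpdates`, accepting `offset` and `limit` keyword arguments and returning a list of dictionaries with an `update_id` key; `drain_updates` is a hypothetical helper name. Keep in mind that the Bot API only delivers updates the bot has received and keeps them for at most 24 hours, so this does not reach older history.

```python
def drain_updates(get_updates, batch_size=100):
    """Collect all pending updates by paging with the `offset` parameter."""
    updates = []
    offset = None
    while True:
        batch = get_updates(offset=offset, limit=batch_size)
        if not batch:
            return updates
        updates.extend(batch)
        # Requesting last update_id + 1 confirms everything up to that update,
        # so the next call returns only newer ones.
        offset = batch[-1]['update_id'] + 1
```

With the bot from the snippet above, this would be called as `drain_updates(bot.getUpdates)`.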



Source: https://stackoverflow.com/questions/44467293/how-can-i-download-the-chat-history-of-a-group-in-telegram
