Elasticsearch analyzer for parsing the application logs

我们两清 posted on 2020-07-23 06:19:12

Question


  1. I am using Filebeat and can successfully push the logs to a particular index in Elasticsearch.

  2. I have a use case where I need to find duplicates in the logs. Using an aggregation, I can find duplicates when the log lines match exactly, as below:

    2019-07-23 11:38:17,401 WARN [org.amazon.events] (default task-3) type=LOGIN_ERROR, realmId=amazon, clientId=angular-cors, userId=209fd7db-6964-41ff-bffd-3975ccbc03bb, ipAddress=44.44.44.44, error=invalid_user_credentials, auth_method=openid-connect, grant_type=password, client_auth_method=client-secret, username=testuser@amazon.com

    2019-07-23 11:38:17,401 WARN [org.amazon.events] (default task-3) type=LOGIN_ERROR, realmId=amazon, clientId=angular-cors, userId=209fd7db-6964-41ff-bffd-3975ccbc03bb, ipAddress=44.44.44.44, error=invalid_user_credentials, auth_method=openid-connect, grant_type=password, client_auth_method=client-secret, username=testuser@amazon.com
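
The question does not show the aggregation that was used. One common way to find exact duplicates is a `terms` aggregation with `min_doc_count` set to 2, so a minimal sketch of such a request body could look like the following. The field name `message.keyword` and the helper name `build_duplicate_agg` are assumptions for illustration, not taken from the question. The code only builds the JSON body and does not contact a cluster.

```python
import json


def build_duplicate_agg(field="message.keyword", max_buckets=100):
    """Build a search body whose buckets are values of `field` seen in 2+ docs.

    `field` is assumed to be a keyword-typed field holding the full log line.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "duplicate_logs": {
                "terms": {
                    "field": field,
                    "min_doc_count": 2,  # keep only values that occur at least twice
                    "size": max_buckets,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_duplicate_agg(), indent=2))
```

Note that dynamically mapped keyword sub-fields typically ignore values longer than 256 characters, so full log lines like the ones above would probably need an explicit mapping for this to work.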

  3. But suppose the time and task ID change, as below. I still want to treat this as a duplicate of the log above:

    2019-07-23 11:38:18,401 WARN [org.amazon.events] (default task-4) type=LOGIN_ERROR, realmId=amazon, clientId=angular-cors, userId=209fd7db-6964-41ff-bffd-3975ccbc03bb, ipAddress=44.44.44.44, error=invalid_user_credentials, auth_method=openid-connect, grant_type=password, client_auth_method=client-secret, username=testuser@amazon.com
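
To make the requirement concrete, here is a rough client-side sketch (not the approach described in the question) that strips the leading timestamp and the `(default task-N)` marker and then hashes what is left, so that lines differing only in those two parts get the same key. The regular expressions are assumptions based on the three sample lines, and `normalize` and `fingerprint` are hypothetical helper names.

```python
import hashlib
import re

# Leading "YYYY-MM-DD HH:MM:SS,mmm " timestamp, as in the sample lines.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+")
# Thread marker such as "(default task-3) ".
TASK = re.compile(r"\(default task-\d+\)\s*")


def normalize(line):
    """Remove the parts of a log line that should not affect duplicate detection."""
    line = TIMESTAMP.sub("", line.strip())
    return TASK.sub("", line)


def fingerprint(line):
    """Return a stable hex digest of the normalized log line."""
    return hashlib.sha1(normalize(line).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Shortened versions of the sample lines from the question.
    a = "2019-07-23 11:38:17,401 WARN [org.amazon.events] (default task-3) type=LOGIN_ERROR, realmId=amazon"
    b = "2019-07-23 11:38:18,401 WARN [org.amazon.events] (default task-4) type=LOGIN_ERROR, realmId=amazon"
    print(normalize(a))
    print(fingerprint(a) == fingerprint(b))
```

If such a fingerprint were stored as its own field at ingest time, the exact-match aggregation from step 2 could run on that field instead of on the raw line.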

I have one approach.

Solution:
i) If I use the standard analyzer with stopwords, I can split the log line into tokens
ii) Skip the and GET only the in the tokens
iii) Then use a multi_match / more_like_this query to check whether a matching log already exists.
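
The query from the last step is not shown in the question. As a hedged sketch, a `more_like_this` request body for it might be built as below. The field name `message`, the thresholds, and the helper name `build_mlt_query` are assumptions chosen for illustration. The code only constructs the body and does not send it anywhere.

```python
import json


def build_mlt_query(log_line, field="message", minimum_should_match="90%"):
    """Build a more_like_this query that looks for logs similar to `log_line`."""
    return {
        "query": {
            "more_like_this": {
                "fields": [field],
                "like": log_line,
                # Lowered from the defaults so that rare terms in a single
                # log line are still used to build the query.
                "min_term_freq": 1,
                "min_doc_freq": 1,
                "max_query_terms": 50,
                "minimum_should_match": minimum_should_match,
            }
        }
    }


if __name__ == "__main__":
    sample = "type=LOGIN_ERROR, realmId=amazon, error=invalid_user_credentials"
    print(json.dumps(build_mlt_query(sample), indent=2))
```

A high `minimum_should_match` keeps the match close to a duplicate rather than merely related; the right value would have to be tuned against real logs.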

This works for now. But is there a better way to extract only the "keywords" from the logs using an analyzer, so that I don't end up with a large set of keywords?

Any help is appreciated.

Thanks,
Harry

Source: https://stackoverflow.com/questions/62942422/elasticsearch-analyzer-for-parsing-the-application-logs
