How to run a query to find a string in blob files?

Posted by 谁都会走 on 2019-11-28 00:46:35

Question


MediaWiki has a table in its database called 'text' that contains the page content, stored as a BLOB. I would like to search all the text on the site to see which pages contain a certain string. How do I run a query that searches BLOB columns?


Answer 1:


The MediaWiki markup text is stored in the old_text field, which is of type mediumblob. You can query it like any other text-based field; MySQL will cast your search string to binary for the comparison. Note that this makes the search case-sensitive, because binary strings are compared byte by byte.

SELECT old_id FROM text WHERE old_text LIKE '%string%';

If you need a case-insensitive search, convert the column to a character set whose collation is case-insensitive. For example, latin1 defaults to the case-insensitive latin1_swedish_ci collation:

SELECT old_id FROM text WHERE CONVERT(old_text USING latin1) LIKE '%STRing%';

Be aware that a pattern with a leading wildcard cannot use an index, so MySQL has to scan every row. Unless your table is small, these queries will take a long time.




Answer 2:


According to the MediaWiki documentation, the text table stores only the text of individual revisions. To get at the complete text of a page, you therefore have to process all the revisions that belong to it. It is better to call the MediaWiki search engine through the API and process the results than to search with an SQL query.
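
To illustrate the API route, here is a minimal Python sketch that builds a full-text search request for the MediaWiki Action API (action=query with list=search) and pulls the page titles out of the JSON response. The endpoint https://wiki.example.org/w/api.php and the helper names build_search_url, titles_from_response and search_titles are assumptions made for this example, not part of MediaWiki; substitute the api.php URL of your own wiki. It fetches only one batch of results and does no error handling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the api.php URL of your own wiki.
API_URL = "https://wiki.example.org/w/api.php"


def build_search_url(term, api_url=API_URL, limit=50):
    """Build an Action API full-text search URL (action=query, list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,    # the string to look for
        "srwhat": "text",    # search page text rather than titles
        "srlimit": limit,    # number of hits to return in this batch
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def titles_from_response(payload):
    """Extract page titles from a decoded search response."""
    hits = payload.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]


def search_titles(term, api_url=API_URL):
    """Fetch one batch of search results and return the matching titles."""
    with urlopen(build_search_url(term, api_url)) as response:
        return titles_from_response(json.load(response))
```

Calling search_titles("string") against a reachable wiki should return a list of page titles. Keep in mind that the search engine tokenizes text and indexes the current revision of each page, so the results may differ from a raw substring match over every row of the text table.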



Source: https://stackoverflow.com/questions/19563838/how-to-run-a-query-to-find-a-string-in-blob-files
