Replacing keywords in text with PHP & MySQL

Posted by 本小妞迷上赌 on 2019-12-11 15:45:09

Question


I have a news site with an archive of more than 1 million news articles. I created a word definitions database with about 3000 entries, consisting of word-definition pairs.

What I want to do is add a definition next to every occurrence of these words in the news. I can't make a static change, because I may add a new keyword any day, so the replacement has to happen either in real time or through a cache.

The problem is that str_replace or preg_replace would be very slow for searching a text for 3000 keywords and replacing them.

Are there any fast alternatives?


Answer 1:


str_replace won't work for you (unless you want "perl" in "superlative" to be a keyword), you need something that takes word boundaries into account (e.g. preg_replace with \b). Of course, you cannot preg_replace all 3000 keywords at once, but one single document can hardly contain them all, therefore I'd suggest pre-indexing all documents, for example, by maintaining an index table doc_id->word_id. When serving a specific document, query the index and only replace keywords that the document actually contains (presumably no more than 100).
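As a rough illustration of the index idea, here is a minimal sketch that keeps everything in PHP arrays. The names `$keywords`, `$docs`, `$index` and `index_document()` are made up for this example; in a real setup the keyword list and the doc_id->word_id pairs would live in MySQL tables.

```php
<?php
// Hypothetical stand-in for the keyword table: word_id => keyword.
$keywords = [1 => 'perl', 2 => 'php', 3 => 'mysql'];

/** Return the ids of all keywords that occur as whole words in $text. */
function index_document(string $text, array $keywords): array
{
    $ids = [];
    foreach ($keywords as $id => $word) {
        // \b keeps "perl" from matching inside "superlative".
        if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $text) === 1) {
            $ids[] = $id;
        }
    }
    return $ids;
}

// Hypothetical stand-in for the news archive: doc_id => text.
$docs = [
    10 => 'PHP talks to MySQL.',
    11 => 'A superlative article with no keywords.',
];

// Build the index once (for example in a batch job) and reuse it when serving.
$index = [];
foreach ($docs as $docId => $text) {
    $index[$docId] = index_document($text, $keywords);
}
```

With an index like this, serving document 10 only needs to consider keywords 2 and 3, and document 11 can skip the replacement step entirely. When a new keyword is added, it should be enough to search the archive for that one word and append rows to the index, rather than re-indexing every document.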

On the other hand, if documents are short, maintaining the index table might not be worth the trouble. You can simply do the pre-indexing on the fly, e.g. with strpos:

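 $kw = array();
 foreach ($all_keywords as $k) {
     // strpos() returns 0 for a match at the very start, so compare with false;
     // preg_quote() escapes regex metacharacters in the keyword
     if (strpos($text, $k) !== false) $kw[] = preg_quote($k, '/');
 }

 // $kw contains only words that actually occur in the text
 // (and perhaps some more, but that doesn't matter)

 // skip the regex when nothing matched: an empty alternation would match everywhere
 if ($kw) $text = preg_replace_callback('/\b(' . implode('|', $kw) . ')\b/', 'insert_keyword', $text);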
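The snippet above leaves insert_keyword undefined. One possible shape for the whole step, as a sketch only: `annotate_keywords()` and the `$definitions` array are hypothetical names, the "(definition)" output format is an assumption, and the keys of `$definitions` are assumed to be lowercase ASCII keywords.

```php
<?php
// Hypothetical definitions, keyword => definition (in practice loaded from MySQL).
$definitions = [
    'perl' => 'a scripting language',
    'php'  => 'a server-side scripting language',
];

function annotate_keywords(string $text, array $definitions): string
{
    // Cheap pre-filter: keep only keywords that appear somewhere in the text.
    $present = [];
    foreach (array_keys($definitions) as $keyword) {
        if (stripos($text, $keyword) !== false) {
            $present[] = preg_quote($keyword, '/');
        }
    }
    if ($present === []) {
        return $text; // nothing to replace, skip the regex entirely
    }

    // Longest first, so a longer keyword is tried before a shorter one it starts with.
    usort($present, fn($a, $b) => strlen($b) <=> strlen($a));

    $result = preg_replace_callback(
        '/\b(' . implode('|', $present) . ')\b/i',
        function (array $m) use ($definitions): string {
            $definition = $definitions[strtolower($m[1])] ?? null;
            // Keep the original spelling and append the definition in parentheses.
            return $definition === null ? $m[0] : $m[0] . ' (' . $definition . ')';
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo annotate_keywords('Perl is superlative; so is PHP.', $definitions), "\n";
```

The word-boundary check leaves "superlative" untouched even though it passes the stripos pre-filter, which is the "perhaps some more" case mentioned in the comments above.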



Answer 2:


str_replace is pretty zippy and is, to my knowledge, the fastest you will find for PHP. You should certainly keep a cache; that will bypass performance issues.
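Since new keywords can be added at any time, a cache needs some way to notice that. A minimal sketch, assuming a hypothetical integer `$keywordVersion` that is bumped whenever the keyword list changes, and using a plain array where a real site might use APCu, memcached or a database table:

```php
<?php
/**
 * Return the rendered text for a document, calling $render only on a cache miss.
 * The keyword version is part of the key, so entries rendered with an older
 * keyword list are simply never looked up again.
 */
function cached_render(int $docId, string $text, int $keywordVersion, array &$cache, callable $render): string
{
    $key = $docId . ':' . $keywordVersion;
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = $render($text);
    }
    return $cache[$key];
}

$cache = [];
$calls = 0;
// Stand-in for the real keyword replacement; it only counts how often it runs.
$render = function (string $text) use (&$calls): string {
    $calls++;
    return strtoupper($text);
};

cached_render(1, 'news text', 1, $cache, $render); // miss: renders
cached_render(1, 'news text', 1, $cache, $render); // hit: served from cache
cached_render(1, 'news text', 2, $cache, $render); // miss: keyword list changed
```

Stale entries stay in the array in this sketch; a real cache would need expiry or eviction for them.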




Answer 3:


This is just a suggestion to speed up the process, reduce errors, etc.

  1. Create a function that processes the news archive in batches.
  2. Create a function to replace the text. str_replace is my bet.
  3. Create a function to spawn PHP processes. Refer to this thread.
  4. Add caching functions.


Source: https://stackoverflow.com/questions/2636295/replacing-keywords-in-text-with-php-mysql
