The problem: a large static list of strings is provided as A, and a long string is provided as B. The strings in A are all very short (a keywords
Split B on ' ' to get a new list of its individual words. Then, for each word in that list, test for membership in A. If you find a match (or several), delete it from A, and quit as soon as A is empty.
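A minimal Python sketch of those steps, for illustration only. The function name `find_keywords` and the sample data are hypothetical. It also assumes A is copied into a set rather than kept as a list, so each membership test and deletion takes roughly constant time instead of a scan over A.

```python
def find_keywords(keywords, text):
    """Return (found, remaining) after scanning the words of `text`.

    `keywords` plays the role of A and `text` the role of B.
    Assumption: A is copied into a set so lookups and deletions are cheap.
    """
    remaining = set(keywords)
    found = []
    # Split B on single spaces, as described above.
    for word in text.split(' '):
        if word in remaining:
            # Delete the match from A so it is not reported twice.
            remaining.discard(word)
            found.append(word)
            # Quit as soon as A is empty.
            if not remaining:
                break
    return found, remaining


if __name__ == "__main__":
    found, remaining = find_keywords(
        ["cat", "dog", "emu"],
        "the dog chased the cat and the dog",
    )
    print(found, remaining)
```

Copying A into a set leaves the original list untouched. If A really is static and reused across many strings B, that copy is made on each call, so it may be worth building the set once and tracking the found words separately.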
It seems like your approach will blaze through 500,000 candidates without an opt-out set in place.