Full text search in HTML ignoring tags / &

Submitted by 匆匆过客 on 2019-11-26 05:51:09

Question


I've recently seen a lot of libraries for searching and highlighting terms within an HTML page. However, every library I saw has the same problem: they can't find text that is partly enclosed in an HTML tag, and/or they fail to find special characters that are written as &-entities.


Example a:

<span> This is a test. This is a <b>test</b> too</span>

Searching for "a test" would find the first instance but not the second.


Example b:

<span> Pencils in spanish are called l&aacute;pices</span>

Searching for "lápices" or "lapices" would fail to produce a result.


Is there a JS library that does this or at least a way to circumvent these obstacles?

Thanks in Advance!

Bruno


Answer 1:


You can use window.find() in non-IE browsers and TextRange's findText() method in IE. Here's an example:

http://jsfiddle.net/xeSQb/6/

Unfortunately Opera prior to the switch to the Blink rendering engine in version 15 doesn't support either window.find or TextRange. If this is a concern for you, a rather heavyweight alternative is to use a combination of the TextRange and CSS class applier modules of my Rangy library, as in the following demo: http://rangy.googlecode.com/svn/trunk/demos/textrange.html

Code:

function doSearch(text) {
    if (window.find && window.getSelection) {
        document.designMode = "on";
        var sel = window.getSelection();
        sel.collapse(document.body, 0);

        while (window.find(text)) {
            document.execCommand("HiliteColor", false, "yellow");
            sel.collapseToEnd();
        }
        document.designMode = "off";
    } else if (document.body.createTextRange) {
        var textRange = document.body.createTextRange();
        while (textRange.findText(text)) {
            textRange.execCommand("BackColor", false, "yellow");
            textRange.collapse(false);
        }
    }
}



Answer 2:


There are 2 problems here. One is the nested content problem, or search matches that span an element boundary. The other is HTML-escaped characters.

One way to handle the HTML-escaped characters is, if you are using jQuery for example, to use the .text() method, and run the search on that. The text that comes back from that already has the escaped characters "translated" into their real character.
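
Decoded text solves the "lápices" half of Example b, but not the unaccented "lapices" query. A minimal sketch of one way to close that gap, assuming it is acceptable to fold diacritics and case on both the page text and the query before comparing. The helper names `foldForSearch` and `looseIncludes` are made up for illustration; in a browser the input would be the string returned by jQuery's `.text()` or the DOM's `textContent`.

```javascript
// Fold a string for loose comparison: decompose accented letters (NFD),
// drop the combining marks (U+0300 to U+036F), and lowercase the result.
function foldForSearch(s) {
  return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Return true when `query` occurs in `text`, ignoring case and diacritics.
// `text` is assumed to be already-decoded text, not raw HTML source.
function looseIncludes(text, query) {
  return foldForSearch(text).includes(foldForSearch(query));
}
```

This only answers whether a match exists. Reporting where it is needs extra care, because folding can change the string length when the input contains decomposed characters.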

Another way to handle those special characters would be to replace the actual character (in the search string) with the escaped version. Since there are a wide variety of possibilities there, however, that could be a lengthy search depending on the implementation.
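
To illustrate the "escape the search string" idea, here is a rough sketch that turns each character of the query into a regular-expression alternation of its raw, named-entity, and numeric-entity forms. The `NAMED_ENTITIES` table is a hypothetical, deliberately tiny subset, and the function names are invented for this example. A real implementation would need the full list of HTML named character references.

```javascript
// Hypothetical, deliberately tiny table of named entities (character -> name).
const NAMED_ENTITIES = { "á": "aacute", "é": "eacute", "ñ": "ntilde", "&": "amp" };

// Escape characters that are special inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex fragment matching one character in raw or entity-escaped form.
function charPattern(ch) {
  const code = ch.codePointAt(0);
  const variants = [escapeRegExp(ch)];
  if (NAMED_ENTITIES[ch]) variants.push("&" + NAMED_ENTITIES[ch] + ";");
  if (code > 127 || NAMED_ENTITIES[ch]) {
    variants.push("&#" + code + ";");                       // decimal form
    variants.push("&#[xX]0*" + code.toString(16) + ";");    // hexadecimal form
  }
  return variants.length === 1 ? variants[0] : "(?:" + variants.join("|") + ")";
}

// Build a case-insensitive regex that matches `query` in raw HTML source.
function buildEntityAwareRegExp(query) {
  return new RegExp(Array.from(query, (ch) => charPattern(ch)).join(""), "i");
}
```

For example, `buildEntityAwareRegExp("lápices")` matches `l&aacute;pices`, `l&#225;pices`, and `l&#xE1;pices` in the page source. It does nothing about tags that interrupt a match, which is the other half of the problem.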

The same sort of "text" method can be used to find content matches that span element boundaries. It gets trickier because the text has no notion of which element each part of the content came from, but it gives you a smaller domain to search over if you drill in. Once you are close, you can switch to a character-by-character search rather than a word-based one.
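
One way to keep track of where the text came from is to record, for each text node, its offset in the concatenated string, and then map every match back to a node and an offset inside it. The following is a sketch under that assumption, not a tested library. `collectTextNodes` and `findAcrossNodes` are names invented here, and `findAcrossNodes` accepts anything with a string `data` property so that it can be tried outside a browser.

```javascript
// Collect the text nodes under `root` in document order (browser only).
function collectTextNodes(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const nodes = [];
  while (walker.nextNode()) nodes.push(walker.currentNode);
  return nodes;
}

// Find every occurrence of `query` in the concatenated text of `nodes` and
// report each match as node index plus offset for its start and its end.
function findAcrossNodes(nodes, query) {
  const matches = [];
  if (!query) return matches;

  const starts = []; // starts[i] = offset of nodes[i] within the joined text
  let joined = "";
  for (const node of nodes) {
    starts.push(joined.length);
    joined += node.data;
  }

  // Map a position in the joined text back to { node, offset }. An end
  // position prefers the end of a node over the start of the next one.
  const locate = (pos, isEnd) => {
    let i = nodes.length - 1;
    while (i > 0 && (isEnd ? starts[i] >= pos : starts[i] > pos)) i--;
    return { node: i, offset: pos - starts[i] };
  };

  let from = 0;
  let at;
  while ((at = joined.indexOf(query, from)) !== -1) {
    matches.push({ start: locate(at, false), end: locate(at + query.length, true) });
    from = at + query.length;
  }
  return matches;
}
```

With the three text nodes of Example a (`" This is a test. This is a "`, `"test"`, `" too"`), searching for `"a test"` yields two matches, and the second one starts in the first node and ends inside the `<b>` element's text node. In a browser, each result could then be turned into a `Range` with `range.setStart(nodes[m.start.node], m.start.offset)` and `range.setEnd(nodes[m.end.node], m.end.offset)` for highlighting.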

I don't know of any libraries that do this however.




Answer 3:


To highlight search keywords on a web page, and to remove the highlighting again, using JavaScript:

    <script>


    function highlightAll(keyWords) { 
        document.getElementById('hid_search_text').value = keyWords; 
        document.designMode = "on"; 
        var sel = window.getSelection(); 
        sel.collapse(document.body, 0);
        while (window.find(keyWords)) { 
            document.execCommand("HiliteColor", false, "yellow"); 
            sel.collapseToEnd(); 
        }
        document.designMode = "off";
        goTop(keyWords,1); 
    }

    function removeHighLight() { 
        var keyWords = document.getElementById('hid_search_text').value; 
        document.designMode = "on"; 
        var sel = window.getSelection(); 
        sel.collapse(document.body, 0);
        while (window.find(keyWords)) { 
            document.execCommand("HiliteColor", false, "transparent"); 
            sel.collapseToEnd(); 
        }
        document.designMode = "off"; 
        goTop(keyWords,0); 
    }

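    function goTop(keyWords,findFirst) { 
        // Jump back to the top of the page, then optionally select the first match.
        // (The original wrote this assignment inside an if condition, which was always truthy.)
        window.location.href = '#'; 
        if(findFirst) { 
            window.find(keyWords, 0, 0, 1);
        }
    }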
    </script>

    <style>
    #search_para {
     color:grey;
    }
    .highlight {
     background-color: #FF6; 
    }
    </style>

    <div id="wrapper">
        <input type="text" id="search_text" name="search_text"> &nbsp; 
        <input type="hidden" id="hid_search_text" name="hid_search_text"> 
        <input type="button" value="search" id="search" onclick="highlightAll(document.getElementById('search_text').value)" >  &nbsp; 
        <input type="button" value="remove" id="remove" onclick="removeHighLight()" >  &nbsp; 
        <div>
            <p id="search_para">The European languages are members of the same family. Their separate existence is a myth. For science, music, sport, etc, Europe uses the same vocabulary. The languages only differ in their grammar, their pronunciation and their most common words. Everyone realizes why a new common language would be desirable: one could refuse to pay expensive translators. To achieve this, it would be necessary to have uniform grammar, pronunciation and more common words. If several languages coalesce, the grammar of the resulting language is more simple and regular than that of the individual languages. The new common language will be more simple and regular than the existing European languages.</p>
        </div>
    </div>



Answer 4:


Just press F3, and use the <p> and </p> tags to tell the visitors of your site about it. For example, since you know about the F3 search shortcut, you could put a message on the page for others by typing:

<p>If you're having trouble finding something, press F3 to search for and highlight the text.</p>


Source: https://stackoverflow.com/questions/5886858/full-text-search-in-html-ignoring-tags
