Need to add a search to static HTML site

不思量自难忘° 2020-12-29 10:15

Basically, I've got an old static HTML site ( http://www.brownwatson.co.uk/brochure/page1.html ). I need to add a search box to it that searches within a folder called /brochure.

4 answers
  •  爱一瞬间的悲伤
    2020-12-29 11:10

    I was looking for a way to add search to my blog, which is built with Jekyll, but didn't find a good one. Google Custom Search showed ads and returned results from subdomains, so it wasn't suitable, and I created my own solution. I've written an article about how to create search for a static site like Jekyll; it's in Polish and was translated using Google Translate.

    I will probably write a better manual translation, or rewrite it on my English blog, soon.

    The solution is a Python script that creates an SQLite database from the HTML files, plus a small PHP script that shows the search results. It does require that your static site hosting also supports PHP.

    In case the article goes down, here is the code. It was written just for my blog (my HTML and file structure), so it needs to be tweaked to work with your site.

    Python script:

    import os, sys, re, sqlite3
    from bs4 import BeautifulSoup
    def get_data(html):
        """return dictionary with title url and content of the blog post"""
        tree = BeautifulSoup(html, 'html5lib')
        body = tree.body
        if body is None:
            return None
        for tag in body.select('script'):
            tag.decompose()
        for tag in body.select('style'):
            tag.decompose()
        for tag in body.select('figure'): # ignore code snippets
            tag.decompose()
        text = tree.findAll("div", {"class": "body"})
        if len(text) > 0:
          text = text[0].get_text(separator='\n')
        else:
          text = None
        title = tree.findAll("h2", {"itemprop" : "title"}) # my h2 elements have this attribute
        url = tree.findAll("link", {"rel": "canonical"}) # get url
        if len(title) > 0:
          title = title[0].get_text()
        else:
          title = None
        if len(url) > 0:
          url = url[0]['href']
        else:
          url = None
        result = {
          "title": title,
          "url": url,
          "text": text
        }
        return result
    
    if __name__ == '__main__':
      if len(sys.argv) == 2:
        db_file = 'index.db'
        # remove the old database file
        if os.path.exists(db_file):
          os.remove(db_file)
        conn = sqlite3.connect(db_file)
        c = conn.cursor()
        c.execute('CREATE TABLE page(title text, url text, content text)')
        for root, dirs, files in os.walk(sys.argv[1]):
          for name in files:
            # my files are in 20.* directories (eg. 2018) [/\\] is for windows and unix
            if name.endswith(".html") and re.search(r"[/\\]20[0-9]{2}", root):
              fname = os.path.join(root, name)
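              # the context manager closes the file even if parsing fails
              with open(fname, "r") as f:
                data = get_data(f.read())
              if data is not None:
                # keep the dict intact so data['url'] is still usable below
                row = (data['title'], data['url'], data['text'])
                c.execute('INSERT INTO page VALUES(?, ?, ?)', row)
                print("indexed %s" % data['url'])
                sys.stdout.flush()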
        conn.commit()
        conn.close()
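
Since `get_data` relies on markup specific to that blog (`div.body`, `h2[itemprop=title]`, a canonical link), a plain static site will need a different extractor. A minimal sketch, assuming the pages have an ordinary `<title>` and that all visible body text should be indexed; it swaps BeautifulSoup for the standard library's `html.parser`, and the names `TextExtractor` and `extract` are hypothetical:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the <title> and visible text, skipping script/style/figure."""

    SKIP = {"script", "style", "figure"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip_depth = 0    # > 0 while inside a skipped element
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract(html):
    """Return a dict with the page title and its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "text": "\n".join(parser.chunks)}
```

The returned dict has no `url` key; for a site without canonical links, the file path relative to the site root is one possible stand-in.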
    

    And the PHP search script:

    function mark($query, $str) {
        // highlight matches; the <mark> wrapper is an assumption
        return preg_replace("%(" . $query . ")%i", '<mark>$1</mark>', $str);
    }
    if (isset($_GET['q'])) {
      $db = new PDO('sqlite:index.db');
      $stmt = $db->prepare('SELECT * FROM page WHERE content LIKE :var OR title LIKE :var');
      $wildcarded = '%'. $_GET['q'] .'%';
      $stmt->bindParam(':var', $wildcarded);
      $stmt->execute();
      $data = $stmt->fetchAll(PDO::FETCH_ASSOC);
      $query = str_replace("%", "\\%", preg_quote($_GET['q']));
      $re = "%(?>\S+\s*){0,10}(" . $query . ")\s*(?>\S+\s*){0,10}%i";
      if (count($data) == 0) {
        // the <p> and <h3> wrappers in this script are assumed markup
        echo "<p>Brak wyników</p>"; // Polish for "No results"
      } else {
        foreach ($data as $row) {
          if (preg_match($re, $row['content'], $match)) {
            echo '<h3>' . mark($query, $row['title']) . '</h3>';
            $text = trim($match[0], " \t\n\r\0\x0B,.{}()-");
            echo '<p>' . mark($query, $text) . '</p>';
          }
        }
      }
    }
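
To check the index locally before deploying the PHP script, the same lookup can be sketched in Python. This is an approximation, not a port: it runs the same `LIKE` query against the `page` table, but builds the snippet by slicing up to ten words on each side of the first match instead of using the PHP regular expression. The function names `snippet` and `search` are hypothetical.

```python
import re
import sqlite3


def snippet(content, q, words=10):
    """Return the first match of q with up to `words` words of context
    on each side (words must be >= 1), or None if q does not occur."""
    m = re.search(re.escape(q), content, re.IGNORECASE)
    if m is None:
        return None
    before = re.findall(r"\S+\s*", content[:m.start()])[-words:]
    after = re.findall(r"\s*\S+", content[m.end():])[:words]
    return ("".join(before) + m.group(0) + "".join(after)).strip()


def search(conn, q, words=10):
    """Return (title, url, snippet) tuples for pages whose content matches q."""
    like = "%" + q + "%"  # LIKE wildcards in q are not escaped, as in the PHP
    rows = conn.execute(
        "SELECT title, url, content FROM page "
        "WHERE content LIKE ? OR title LIKE ?",
        (like, like),
    )
    results = []
    for title, url, content in rows:
        text = snippet(content or "", q, words)
        if text is not None:  # like the PHP, skip title-only matches
            results.append((title, url, text))
    return results
```

With the database built by the indexing script, `search(sqlite3.connect("index.db"), "some phrase")` returns the matching rows; how to render them is left to the caller.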

    In my code and in the article, I've wrapped this PHP script in the same layout as the other pages by adding front matter to the PHP file.

    If you can't use PHP on your hosting, you can try sql.js, which is SQLite compiled to JavaScript with Emscripten. Here is an example of how to use Ajax to load a file.
