Need to add a search to static HTML site

不思量自难忘° 2020-12-29 10:15

Basically I've got an old static HTML site (http://www.brownwatson.co.uk/brochure/page1.html). I need to add a search box to it to search a folder called /brochure within

4 answers

爱一瞬间的悲伤 (original poster)

2020-12-29 11:10

I was looking for a search solution for my blog, which is built with Jekyll, but didn't find a good one. Custom Google Search showed ads and returned results from subdomains, so it wasn't suitable either. So I've created my own solution. I've written an article about how to create search for a static site like Jekyll; it's in Polish and translated using Google Translate.

I will probably create a better manual translation, or rewrite it on my English blog, soon.

The solution is a Python script that creates an SQLite database from the HTML files, plus a small PHP script that shows the search results. It does require that your static site hosting also supports PHP.

In case the article goes down, here is the code. It was written just for my blog (my HTML and file structure), so it needs to be tweaked to work with your site.

Python script:

import os, sys, re, sqlite3
from bs4 import BeautifulSoup
def get_data(html):
    """return dictionary with title url and content of the blog post"""
    tree = BeautifulSoup(html, 'html5lib')
    body = tree.body
    if body is None:
        return None
    for tag in body.select('script'):
        tag.decompose()
    for tag in body.select('style'):
        tag.decompose()
    for tag in body.select('figure'): # ignore code snippets
        tag.decompose()
    text = tree.findAll("div", {"class": "body"})
    if len(text) > 0:
      text = text[0].get_text(separator='\n')
    else:
      text = None
    title = tree.findAll("h2", {"itemprop" : "title"}) # my h2 elements have this attr
    url = tree.findAll("link", {"rel": "canonical"}) # get url
    if len(title) > 0:
      title = title[0].get_text()
    else:
      title = None
    if len(url) > 0:
      url = url[0]['href']
    else:
      url = None
    result = {
      "title": title,
      "url": url,
      "text": text
    }
    return result

if __name__ == '__main__':
  if len(sys.argv) == 2:
    db_file = 'index.db'
    # remove the old index file
    if os.path.exists(db_file):
      os.remove(db_file)
    conn = sqlite3.connect(db_file)
    c = conn.cursor()
    c.execute('CREATE TABLE page(title text, url text, content text)')
    for root, dirs, files in os.walk(sys.argv[1]):
      for name in files:
        # my files are in 20.* directories (eg. 2018) [/\\] is for windows and unix
        if name.endswith(".html") and re.search(r"[/\\]20[0-9]{2}", root):
          fname = os.path.join(root, name)
          # read the page and extract title, url and text
          with open(fname, "r") as f:
            data = get_data(f.read())
          if data is not None:
            # keep the dict intact so data['url'] is still available below
            row = (data['title'], data['url'], data['text'])
            c.execute('INSERT INTO page VALUES(?, ?, ?)', row)
            print("indexed %s" % data['url'])
            sys.stdout.flush()
    conn.commit()
    conn.close()
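
Before wiring up the search page, it can help to check the index from Python. The following is a minimal sketch, not part of the original script: it builds an in-memory table with the same `page(title, url, content)` schema, fills it with made-up rows, and runs the same kind of `LIKE` lookup the search script uses. The helper name `search_pages` and the sample rows are hypothetical, and escaping the `%` and `_` wildcards is an extra step added here.

```python
import sqlite3

def search_pages(conn, query):
    """Return (title, url) rows whose title or content contains `query`."""
    # Escape LIKE wildcards so a literal % or _ in the query is not a pattern.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = "%" + escaped + "%"
    sql = ("SELECT title, url FROM page "
           "WHERE content LIKE ? ESCAPE '\\' OR title LIKE ? ESCAPE '\\'")
    return conn.execute(sql, (pattern, pattern)).fetchall()

# In-memory stand-in for index.db; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page(title text, url text, content text)")
conn.executemany("INSERT INTO page VALUES(?, ?, ?)", [
    ("Static search", "/2018/search.html", "Adding search to a Jekyll blog"),
    ("Other post", "/2019/other.html", "Nothing relevant here"),
])
conn.commit()

results = search_pages(conn, "jekyll")
print(results)
```

To run it against a real index, replace `":memory:"` with the path to `index.db` and drop the `CREATE TABLE` and `INSERT` statements. SQLite's `LIKE` is case-insensitive for ASCII characters by default, which is why the lowercase query still matches.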

And the PHP search script:

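function mark($query, $str) {
    // highlight each match; the <mark> tag is an assumption, use whatever markup your layout needs
    return preg_replace("%(" . $query . ")%i", '<mark>$1</mark>', $str);
}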
if (isset($_GET['q'])) {
  $db = new PDO('sqlite:index.db');
  $stmt = $db->prepare('SELECT * FROM page WHERE content LIKE :var OR title LIKE :var');
  $wildcarded = '%'. $_GET['q'] .'%';
  $stmt->bindParam(':var', $wildcarded);
  $stmt->execute();
  $data = $stmt->fetchAll(PDO::FETCH_ASSOC);
  $query = str_replace("%", "\\%", preg_quote($_GET['q']));
  $re = "%(?>\S+\s*){0,10}(" . $query . ")\s*(?>\S+\s*){0,10}%i";
  if (count($data) == 0) {
    echo "No results";
  } else {
    foreach ($data as $row) {
      if (preg_match($re, $row['content'], $match)) {
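        // the surrounding tags are an assumption; adapt them to your layout
        echo '<h3><a href="' . $row['url'] . '">' . mark($query, $row['title']) . '</a></h3>';
        $text = trim($match[0], " \t\n\r\0\x0B,.{}()-");
        echo '<p>' . mark($query, $text) . '</p>';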
      }
    }
  }
}
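
The excerpt logic in the PHP script (up to 10 words either side of the match, with the match highlighted) can be tried out on its own. This is a rough Python sketch of the same idea, not a port of the PHP regex: it locates the match first and then widens the slice by whole words. The function names `excerpt` and `highlight` are hypothetical, the `<mark>` tag is an assumption about the intended markup, and the HTML escaping is an addition the PHP above does not do.

```python
import html
import re

def excerpt(text, query, words=10):
    """Return the first match with up to `words` words of context per side."""
    if not query.strip():
        return None
    match = re.search(re.escape(query), text, re.IGNORECASE)
    if match is None:
        return None
    # Character spans of every whitespace-separated word in the text.
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    # First word that ends after the match starts, last word that starts before it ends.
    first = next(i for i, (s, e) in enumerate(spans) if e > match.start())
    last = max(i for i, (s, e) in enumerate(spans) if s < match.end())
    lo = max(first - words, 0)
    hi = min(last + words, len(spans) - 1)
    return text[spans[lo][0]:spans[hi][1]]

def highlight(text, query):
    """HTML-escape `text` and wrap each case-insensitive match in <mark>."""
    if not query:
        return html.escape(text)
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    parts, pos = [], 0
    for m in pattern.finditer(text):
        parts.append(html.escape(text[pos:m.start()]))
        parts.append("<mark>" + html.escape(m.group(0)) + "</mark>")
        pos = m.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)

sample = "one two three four five six seven"
snippet = excerpt(sample, "four", words=2)
print(highlight(snippet, "four"))
```

Escaping the text before adding the `<mark>` tags keeps page content and user input from being interpreted as HTML in the results page.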

In my code and in the article, I've wrapped this PHP script in the same layout as the other pages by adding front matter to the PHP file.

If you can't use PHP on your hosting, you can try sql.js, which is SQLite compiled to JavaScript with Emscripten. Here is an example of how to use Ajax to load a file.
