Hugo - Integrate search using lunr.js

May 27, 2024 | Reading Time: 10 min

Adding search functionality to a static site generated with Hugo can significantly improve the user experience. lunr.js is a powerful JavaScript library for full-text search, offering a lightweight and fast solution ideal for static sites. In this guide, we will integrate search using lunr.js and address optimization concerns to ensure efficient and smooth operation. Note that the examples use Bootstrap and Font Awesome icons, but the same elements can be adapted to any styling system as required.

Step 1: Generate the Search Index

  • In your config.toml, add the following lines to enable the generation of the search index file:
[outputs]
  home = ["HTML", "RSS", "JSON"]

[outputFormats]
  [outputFormats.JSON]
    mediaType = "application/json"
    baseName = "index"
    isPlainText = true
  • Then create a layouts/index.json file containing the template that generates the index data.
  • Hugo processes this template at build time and writes index.json to the output directory (public/ by default).
  • For example, to make the title, URL, content, tags, and date available, the template looks something like this:
{{- $index := slice -}}
{{- range $.Site.RegularPages -}}
    {{- $tags := slice }}
    {{- range .Params.tags -}}
        {{- $tags = $tags | append . }}
    {{- end -}}
    {{- $content := .Content | plainify | htmlUnescape }}
    {{- $datestr := .Date.Format "Jan 2, 2006" }}
    {{- $indexItem := dict "url" .Permalink "title" .Title "content" $content "tags" $tags "date" $datestr -}}
    {{- $index = $index | append $indexItem }}
{{- end -}}
{{- $index | jsonify (dict "indent" " ") }}
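
For reference, each entry the template emits is a flat object with the five keys named in the dict call. The sketch below shows one hypothetical entry (the values are made up for illustration) and a small checker, isIndexEntry, a name introduced here and not part of Hugo or lunr.js, that could be used to sanity-check the generated file before it is handed to lunr.js.

```javascript
// One hypothetical entry in the shape the template above is expected to emit.
// The values are illustrative only.
const sampleEntry = {
  url: "https://example.org/posts/hello/",
  title: "Hello",
  content: "Plain-text body of the post",
  tags: ["hugo", "search"],
  date: "May 27, 2024",
};

// Returns true when an object has every key the search script relies on.
function isIndexEntry(entry) {
  return (
    entry !== null &&
    typeof entry === "object" &&
    typeof entry.url === "string" &&
    typeof entry.title === "string" &&
    typeof entry.content === "string" &&
    typeof entry.date === "string" &&
    Array.isArray(entry.tags)
  );
}

const complete = isIndexEntry(sampleEntry); // true
const incomplete = isIndexEntry({ url: "/x/" }); // false: title, content, tags, date missing
```

Running a check like this over the parsed index.json is one way to catch a template typo early, before it surfaces as a runtime error in the search script.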

Step 2: Create the Search Form

  • Now that the search data exists, we need a way for the user to interact with it.
  • You can embed the search form in the header or body of every page, or restrict it to a dedicated search page. Include it with {{ partial "search-form.html" . }}
  • The form should take the user's input and, on submit, open the search page with the search query embedded in the URL.
  • Below is a slightly opinionated layouts/partials/search-form.html that uses Bootstrap classes and Font Awesome icons for styling. It can be adapted to any styling system.
<!-- Form with a get page action -->
<form id="search" action='{{ with .GetPage "/search" }}{{.Permalink}}{{end}}' method="get" class="d-flex justify-content-center mt-2 mb-4">
  <!-- Hidden label for accessibility -->
  <label hidden for="search-input">Search site</label>
  <div class="input-group" style="width: 90%;">
    <!-- Icon inside the input group for visual enhancement -->
    <span class="input-group-text border-0 bg-transparent">
      <i class="fa fa-search"></i>
    </span>
    <!-- Search input field. The "name" defined here will be used to parse the URL when executing the business logic in search.js -->
    <input type="text" class="form-control rounded-pill" id="search-input" name="query" placeholder="Type here to search..." aria-label="Search">
    <!-- Submit button with an arrow icon -->
    <button class="btn border-0 bg-transparent" type="submit" aria-label="search">
      <i class="fa fa-arrow-right"></i>
    </button>
  </div>
</form>
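
To make the hand-off concrete: because the form uses method="get" and the input is named "query", submitting it requests the search page with a ?query=... parameter. The sketch below parses a hypothetical URL of that kind (example.org is a placeholder host, and readQuery is a helper name introduced here) with the standard URL API.

```javascript
// A hypothetical URL of the kind the browser requests after the form is
// submitted with "hugo lunr" typed into the input named "query".
const submitted = new URL("https://example.org/search/?query=hugo+lunr");

// Reads the value of the "query" parameter, or "" when it is absent.
function readQuery(url) {
  return url.searchParams.get("query") || "";
}

const typed = readQuery(submitted); // "hugo lunr" (the "+" decodes to a space)
const none = readQuery(new URL("https://example.org/search/")); // ""
```

If the name attribute of the input changes, the parameter name read on the search page has to change with it.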

Step 3: Set Up Your Search Content Page

  • To actually execute the query and display the results we need to create a search content page.
  • Generally it would be content/search/_index.md, but this can change depending on your chosen site organization for Hugo.
  • This file should point to a search layout, that we will create below.
  • You can customize it as required. Typically a bare minimum file will look like:
---
title: "Search"
layout: "search"
description: "Search page"
---

Step 4: Create the Search Layout

  • Now, the above page needs to be served using a layout.
  • The search layout defines the structure of the search page and includes necessary scripts for lunr.js and the custom search logic.
  • By including these scripts in the layout page only, we can ensure that the search functionality is loaded on this page only and doesn’t really affect other pages. This can help optimize performance for the rest of the site.
  • Create a search.html file in your layouts/_default directory:
{{ define "main" }}
<div id="search-container" class="container">
  <!-- This is where the search results will be displayed. Initialize as empty list. -->
  <ul id="searchresults"></ul>
</div>

<!-- Include lunr.js library. This can be included from node_modules mounted as assets/vendor, refer to their cdn or directly use from static/js.  -->
{{ $lunrJS := resources.Get "vendor/lunr/lunr.min.js" }}
<script src="{{ $lunrJS.RelPermalink }}" defer></script>

<!-- Include the custom search script where the magic happens. This can be used from assets/js like below, or directly from static/js. -->
{{ with resources.Get "js/search.js" }}
  {{ $minifiedScript := . | minify | fingerprint }}
  <script src="{{ $minifiedScript.Permalink }}" integrity="{{ $minifiedScript.Data.Integrity }}" defer></script>
{{ else }}
  {{ errorf "search.js not found in assets/js/" }}
{{ end }}
{{ end }}

Step 5: Create the Search Business Logic Script

  • Now we need to connect all of the above site elements to lunr.js, perform the search, and render the results. We will write a JavaScript file for this.
  • This script handles the entire search process, including loading the search index, processing search queries, and displaying results.
  • The script below can be placed at assets/js/search.js and included in your search layout as shown in the previous step.
  • Alternatively, you can put it directly inside the static/js folder and include it via the search layout above.
  • Flow of the Code
    • Initialization: The script initializes the lunr.js search index and ensures it only happens once for a page load.
    • Caching: It checks for cached search data in localStorage. If valid cached data is available, it uses it; otherwise, it fetches new data and caches it.
    • Building the Index: The script constructs the lunr.js search index from the fetched data.
    • Search Query Handling: It reads the search query from the URL parameters and triggers a search if a query is present.
    • Search Execution: It performs the search using the built index and processes the query to ensure it is valid.
    • Displaying Results: It limits the displayed results to a maximum of 10 to avoid overwhelming users and improve performance.
// Get the search input element (null if the search form is not on this page)
var searchElem = document.getElementById("search-input");
// Define a global object to store search-related data and ensure it's initialized only once
window.pankajpipadaCom = window.pankajpipadaCom || {};

// Initialize search only once
if (!window.pankajpipadaCom.initialized) {
  window.pankajpipadaCom.lunrIndex = null;
  window.pankajpipadaCom.posts = null;
  window.pankajpipadaCom.initialized = true;

  // Load search data and initialize lunr.js
  loadSearch();
}

// Function to load search data and initialize lunr.js
function loadSearch() {
  var now = new Date().getTime();
  // Check for cached data in localStorage
  var storedData = localStorage.getItem("postData");

  // Use cached data if available, readable, and not expired
  if (storedData) {
    try {
      storedData = JSON.parse(storedData);
    } catch (error) {
      storedData = null; // Treat an unreadable cache entry as missing
    }
    if (storedData && now < storedData.expiry) {
      console.log("Using cached data");
      buildIndex(storedData.data, checkURLAndSearch);
      return;
    }
    console.log("Cached data expired or unreadable");
    localStorage.removeItem("postData");
  }

  // Fetch search data via AJAX request
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function () {
    // Ignore intermediate states: status is 0 until the response has arrived,
    // so checking it earlier would report a false failure
    if (xhr.readyState !== 4) {
      return;
    }
    if (xhr.status !== 200) {
      console.error("Failed to load data:", xhr.status, xhr.statusText);
      showError("Failed to load search data.");
      return;
    }
    var data;
    try {
      data = JSON.parse(xhr.responseText);
    } catch (error) {
      console.error("Error parsing JSON:", error);
      showError("Failed to load search data.");
      return;
    }
    buildIndex(data, checkURLAndSearch);
    console.log("Search initialized");

    // Cache fetched data with expiry. A failed write (for example, quota
    // exceeded) is not fatal because the index is already built.
    try {
      localStorage.setItem(
        "postData",
        JSON.stringify({
          data: data,
          expiry: new Date().getTime() + 7 * 24 * 60 * 60 * 1000, // TTL for 1 week
        })
      );
    } catch (error) {
      console.warn("Could not cache search data:", error);
    }
  };
  xhr.onerror = function () {
    console.error("Network error occurred.");
    showError("Failed to load search data due to network error.");
  };
  // Relative path: assumes the search page is served one level below the site root
  xhr.open("GET", "../index.json");
  xhr.send();
}

// Function to build lunr.js index
function buildIndex(data, callback) {
  window.pankajpipadaCom.posts = data;
  window.pankajpipadaCom.lunrIndex = lunr(function () {
    this.ref("url");
    this.field("content", { boost: 10 });
    this.field("title", { boost: 20 });
    // Define the new field for concatenated tags
    this.field("tags_str", { boost: 15 });
    this.field("date");
    window.pankajpipadaCom.posts.forEach(function (doc) {
      // Create a new field 'tags_str' for indexing
      const docForIndexing = {
        ...doc,
        tags_str: (doc.tags || []).join(" "),
      };
      this.add(docForIndexing);
    }, this);
  });
  console.log("Index built at", new Date().toISOString());
  callback();
}

// Function to display error message
function showError(message) {
  var searchResults = document.getElementById("searchresults");
  searchResults.innerHTML = `<br><h2 style="text-align:center">${message}</h2>`;
  if (searchElem) {
    searchElem.disabled = true; // Disable search input on error
  }
}

// Function to check URL for search query and perform search
function checkURLAndSearch() {
  var urlParams = new URLSearchParams(window.location.search);
  var query = urlParams.get("query");
  if (query) {
    if (searchElem) {
      searchElem.value = query; // Echo the query back into the form, if present
    }
    showSearchResults(query);
  }
}

// Function to perform search and display results
function showSearchResults(query) {
  if (!window.pankajpipadaCom.lunrIndex) {
    console.log("Index not available.");
    return; // Exit function if index not loaded
  }
  // Strip everything except word characters and whitespace, which also
  // removes lunr.js query operators such as ":" "^" "~" "+" "-" "*"
  var searchString = (query || "").trim().replace(/[^\w\s]/g, "");
  if (!searchString) {
    displayResults([]);
    return; // Exit if the search string is empty or only whitespace
  }

  var matches = window.pankajpipadaCom.lunrIndex.search(searchString);
  console.log("matches", matches);
  var matchPosts = matches.map((m) =>
    window.pankajpipadaCom.posts.find((p) => p.url === m.ref)
  );
  console.log("Match posts", matchPosts);
  displayResults(matchPosts);
}

// Function to display search results
function displayResults(results) {
  const searchResults = document.getElementById("searchresults");
  const maxResults = 10; // Limit to 10 results
  // Drop matches with no corresponding post before applying the limit
  const shown = results.filter(Boolean).slice(0, maxResults);
  if (shown.length) {
    searchResults.innerHTML = shown.map((result) => getResultStr(result)).join("");
  } else {
    searchResults.innerHTML = "<li>No results found.</li>";
  }
}

// Function to format search result items.
// Note: values are interpolated as raw HTML, which assumes trusted site content.
function getResultStr(result) {
  var resultList = `
      <li style="margin-bottom: 1rem">
        <a href="${result.url}">${result.title}</a><br />
        <p>${result.content.substring(0, 150)}...</p>
        <div class="text-muted" style="display: flex; justify-content: space-between; align-items: center; font-size: small; height: 1.2em; line-height: 1em; padding: 0.25em;">
            <div>${result.date}</div>
            <div><i class="fa fa-tags"></i>
                ${result.tags
                  .map(
                    (tag) =>
                      `<a class="text-muted" href="/tags/${tag}">${tag}</a>`
                  )
                  .join(", ")}
            </div>
        </div>
      </li>`;
  return resultList;
}
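
One thing to watch in getResultStr: the title, content excerpt, and tags are written into innerHTML as-is. Because the index content passes through plainify and then htmlUnescape, a post that displays HTML in its text can end up with literal angle brackets in the index. A minimal sketch of an escaping helper follows; escapeHtml is a name introduced here for illustration and is not part of lunr.js.

```javascript
// Characters that have special meaning in HTML and their entity replacements.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Returns the value as a string that is safe to interpolate into HTML text
// or a quoted attribute.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const snippet = escapeHtml('Use <b> & "quotes"');
// snippet === "Use &lt;b&gt; &amp; &quot;quotes&quot;"
```

Under this approach, the template literal would use ${escapeHtml(result.title)}, ${escapeHtml(result.content.substring(0, 150))}, and ${escapeHtml(tag)} instead of the raw values.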

Optimization Concerns

Index Data Caching

  • By default, neither the search data nor the index is preserved across page loads, so every visit to the search page would fetch and process index.json again.
  • To improve performance, we can use localStorage to cache the search data. This caching mechanism is already implemented in the loadSearch function, where data is stored with a time-to-live (TTL) of one week.
  • As a result, index.json is fetched at most once a week per browser, which reduces load on the server and speeds up repeat searches.
  • Note that localStorage stores only strings, so the lunr.js index object cannot be stored as-is. We therefore cache the index.json post data created in the first step, which is also needed to render the results, and rebuild the index on each page load. (lunr.js can also serialize a built index with JSON.stringify and restore it with lunr.Index.load, should rebuilding become a bottleneck.)
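
The TTL logic can be isolated into two small helpers, which also makes it testable outside the browser. This is a sketch under assumptions: writeCache, readCache, and createMemoryStorage are hypothetical names, and the storage argument is anything with the getItem/setItem/removeItem methods that localStorage exposes.

```javascript
const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Stores data together with an absolute expiry timestamp.
function writeCache(storage, key, data, now, ttlMs = ONE_WEEK_MS) {
  storage.setItem(key, JSON.stringify({ data: data, expiry: now + ttlMs }));
}

// Returns the cached data, or null when it is missing, unreadable, or expired.
function readCache(storage, key, now) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw);
    if (now < entry.expiry) return entry.data;
  } catch (error) {
    // Fall through and treat a corrupt entry as a cache miss.
  }
  storage.removeItem(key);
  return null;
}

// Minimal in-memory stand-in for localStorage, used only to exercise the helpers.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => items.set(key, String(value)),
    removeItem: (key) => items.delete(key),
  };
}

const storage = createMemoryStorage();
writeCache(storage, "postData", [{ title: "Hello" }], 0);
const fresh = readCache(storage, "postData", 1000); // within the TTL: the cached array
const stale = readCache(storage, "postData", ONE_WEEK_MS); // at expiry: null, entry removed
```

In the browser, passing window.localStorage and Date.now() in place of the stand-in and the fixed timestamps would give the same behaviour as the caching in loadSearch.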

Limiting Search Results

  • Limit the number of search results displayed to the user to avoid overwhelming them and to keep rendering fast.
  • This is done in the displayResults function, which takes a slice of at most maxResults entries from the result list.
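
A variation on the same idea, sketched under assumptions: instead of resolving every match to a post and slicing afterwards, the lookup can stop as soon as enough posts are found, using a Map keyed by URL. topResults is a hypothetical helper name; the { ref } shape mirrors the objects returned by a lunr.js search.

```javascript
// Resolves lunr-style matches ({ ref }) to posts and keeps at most `max` of them.
function topResults(matches, posts, max) {
  const byUrl = new Map(posts.map((post) => [post.url, post]));
  const resolved = [];
  for (const match of matches) {
    if (resolved.length >= max) break; // Stop once the limit is reached
    const post = byUrl.get(match.ref);
    if (post) resolved.push(post); // Skip refs with no matching post
  }
  return resolved;
}

// Illustrative data only.
const posts = [
  { url: "/a/", title: "A" },
  { url: "/b/", title: "B" },
  { url: "/c/", title: "C" },
];
const matches = [{ ref: "/c/" }, { ref: "/missing/" }, { ref: "/a/" }, { ref: "/b/" }];
const top = topResults(matches, posts, 2).map((post) => post.title); // ["C", "A"]
```

The Map avoids scanning the full post list once per match, which matters more as the number of posts grows.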

Async Loading of Scripts

  • To improve page load times, make sure the search scripts do not block HTML parsing.
  • This is achieved by adding the defer attribute to the script tags in the layout. Deferred scripts download in parallel with parsing and run in document order once parsing is complete, so lunr.js is available before search.js runs.
<script src="{{ $lunrJS.RelPermalink }}" defer></script>
<script src="{{ $minifiedScript.Permalink }}" integrity="{{ $minifiedScript.Data.Integrity }}" defer></script>
  • Note that this deferral means the page renders first and the actual search execution begins afterwards.
  • The layout does not communicate with the script; it only loads it. The script itself loads the data, builds the index, checks the URL for a search query, executes the search, and modifies the HTML to add the result list items.
  • Because the query lives in the URL, users can also share searches easily.
  • This is already handled in the checkURLAndSearch function, which reads the query parameter from the URL and performs a search if it is present.
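
As an illustration of the shareable-link idea, the sketch below builds the kind of URL that checkURLAndSearch reads. buildSearchUrl is a hypothetical helper and example.org is a placeholder host; only the standard URL API is used.

```javascript
// Builds a link to the search page with the query embedded as ?query=...
function buildSearchUrl(searchPageUrl, query) {
  const url = new URL(searchPageUrl);
  url.searchParams.set("query", query);
  return url.toString();
}

const shareable = buildSearchUrl("https://example.org/search/", "hugo lunr");
// shareable === "https://example.org/search/?query=hugo+lunr"

// Reading the parameter back yields the original text.
const roundTrip = new URL(shareable).searchParams.get("query"); // "hugo lunr"
```

A link built this way could back a "copy link to this search" control on the results page.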

Conclusion

  • To recap, we created the search index data, a user interaction form, a search page, and a layout for it.
  • All of this is tied together with lunr.js by a custom JavaScript file.
  • As noted before, the styling here is based on Bootstrap and Font Awesome, but it can easily be adapted to any styling system.
  • An implemented example of this can be found in this site's search functionality. Example search query