Learning to Rank

Liferay Enterprise Search (LES) Subscribers

Search engines like Elasticsearch ship with well-tuned relevance algorithms that work well for general-purpose search.

LES Learning to Rank harnesses machine learning to improve search result rankings. It combines the expertise of data scientists with machine learning to produce a smarter scoring function that’s applied to search queries.

LES Learning to Rank requires a Liferay Enterprise Search subscription. Note that the Elasticsearch Learning to Rank plugin is not produced by Elastic, and a pre-built plugin is not available for every Elasticsearch version that Liferay supports. See the LES Compatibility Matrix for details.

Disabling Learning to Rank on a Search Page

Learning to Rank does not work with the Sort widget.

If LES Learning to Rank is deployed but you must disable it on a particular Search page (perhaps to use the Sort widget), follow these steps:

  1. Add a Low Level Search Options widget to the Search page.

  2. Open the widget’s Configuration screen by clicking

    Configure additional low level search options in this page.

  3. In the Contributors to Exclude field, enter

    com.liferay.portal.search.learning.to.rank

Now the Learning to Rank re-scoring process is skipped for queries entered into the page’s Search Bar. The page’s results are sortable and are ranked by the default relevance algorithm.

Prerequisites

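Using Learning to Rank to re-score Liferay queries sent to Elasticsearch requires the following:

  • A Liferay Enterprise Search subscription.

  • A remote Elasticsearch cluster communicating with Liferay.

  • The Elasticsearch Learning to Rank plugin installed on Elasticsearch (Step 1).

  • A trained model uploaded to the Learning to Rank plugin (Steps 2 and 3).

  • Learning to Rank enabled in Liferay (Step 4).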

Technical Overview

In a normal search, the user sends a query to the search engine via Liferay DXP’s Search Bar. The order of returned results is dictated by the search engine’s relevance scoring algorithm.

Here’s where Learning to Rank intervenes and makes that process different:

  1. The user enters a query into the Search Bar.

  2. Liferay sends the query to Elasticsearch and retrieves the first 1000 results as usual, using the search engine’s relevance algorithm.

  3. Instead of being returned as search hits, the top 1000 results are handed to Elasticsearch’s rescore functionality.

  4. The results are re-scored by the SLTR query, which includes the keywords and the trained model to use for re-scoring.

  5. Once the trained model re-ranks the results, they’re returned in Liferay’s Search Results in their new order.
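
As a rough illustration of steps 2 through 4, the request Liferay sends is conceptually similar to the sketch below, which can be run from Kibana’s console once a model is uploaded. It is not the exact query Liferay generates: the index name liferay-20101, the simplified match query, and the search keywords are placeholders assumed for this example. The window_size of 1000, the keywords parameter, and the linearregression model name follow the rest of this article.

```json
# Assumed placeholders: index name, match query, and keywords
POST liferay-20101/_search
{
  "query": {
    "match": {
      "title_en_US": "test keywords"
    }
  },
  "rescore": {
    "window_size": 1000,
    "query": {
      "rescore_query": {
        "sltr": {
          "params": {
            "keywords": "test keywords"
          },
          "model": "linearregression"
        }
      }
    }
  }
}
```

The main query selects and scores candidates with the default relevance algorithm, and only the top window_size hits are re-scored by the model named in the sltr query.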

Though the trained model appears in only one step of the list above, much of the work in this paradigm lies in creating and honing it. Model training itself is beyond the scope of this guide, but the steps below help you get all the pieces in place to apply machine learning to your Liferay queries. Here’s a brief overview of what constitutes model training.

Model Training

A useful trained model is produced when a good judgment list and a good feature set are fed to a Learning to Rank algorithm (this is the machine learning part of the puzzle). Therefore, you must assemble the following:

  • The Learning to Rank algorithm you wish to use for creating a trained model. This demonstration uses RankLib.

  • A judgment list, containing a graded list of search results. The algorithm produces a model that honors the ordering of the judgment list.

  • A feature set containing all the features you’re handing to the Learning to Rank algorithm, which it uses in conjunction with the judgment list to produce a reliable model. An example feature set for Liferay appears in Step 3.

Judgment lists are lists of search results in which each result is graded by how relevant it is to a particular query.

Features are the variables that the algorithm uses to create a function that can score results in a smarter way. If you don’t provide enough relevant features, or you provide the wrong ones, your model won’t be “smart” enough to improve results.

Before beginning, you must have a remote Elasticsearch cluster communicating with Liferay. See the Search Engine Compatibility Matrix for more information.

tip

Use Suggestions to discover the most common queries (this can be one way to decide which queries to create Learning to Rank models for).

Step 1: Install the Learning to Rank Plugin on Elasticsearch

See the Elasticsearch Learning to Rank plugin documentation to learn about installing the Learning to Rank plugin.

warning

If you’re running Liferay DXP 7.2 with Elasticsearch 7.14+, you must compile the plugin with JDK 8 or JDK 11 (whichever your Liferay installation uses) before installing it. Refer to this article for the required steps and additional background information.

You’ll be running a command like this one, depending on the plugin version you’re installing:

./bin/elasticsearch-plugin install https://github.com/o19s/elasticsearch-learning-to-rank/releases/download/v1.5.7-es7.13.4/ltr-plugin-v1.5.7-es7.13.4.zip

If your Elasticsearch cluster uses X-Pack security, additional steps may be required.
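
One way to confirm the installation, assuming you can reach the cluster through Kibana’s console, is to list the plugins installed on each node after restarting Elasticsearch. The Learning to Rank plugin should appear for every node in the cluster.

```json
# Lists the installed plugins per node; ?v adds column headers
GET _cat/plugins?v
```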

Step 2: Train a Model

Detailed instructions on training models are outside the scope of this guide. Training requires the intervention of data scientists, who can recommend appropriate tools and models. Use what works for you. In doing so, you’ll almost certainly compile judgment lists and feature sets for the training tool you select, which uses them to generate a model that produces good search results. Once you have a model, upload it to the Learning to Rank plugin.

Step 3: Upload the Model to the Learning to Rank Plugin

You upload the model with a POST request, but first make sure an _ltr index exists and a feature set is uploaded to the Learning to Rank plugin. Use Kibana (via the LES Monitoring widget) to make these tasks easier.

  1. If you don’t already have an _ltr index, create one:

    PUT _ltr
    
  2. Add a feature set to the _ltr index. In this example the set is called liferay:

    POST _ltr/_featureset/liferay
    {
      "featureset": {
        "name": "liferay",
        "features": [
          {
            "name": "title",
            "params": [
              "keywords"
            ],
            "template": {
              "match": {
                "title_en_US": "{{keywords}}"
              }
            }
          },
          {
            "name": "content",
            "params": [
              "keywords"
            ],
            "template": {
              "match": {
                "content_en_US": "{{keywords}}"
              }
            }
          },
          {
            "name": "asset tags",
            "params": [
              "keywords"
            ],
            "template": {
              "match": {
                "assetTagNames": "{{keywords}}"
              }
            }
          }
        ]
      }
    }
    

    Take note of the syntax used here, since it’s required: each feature declares the keywords parameter and references it in its template as {{keywords}}.

  3. Add the trained model to the feature set:

    POST _ltr/_featureset/liferay/_createmodel
    {
      "model": {
        "name": "linearregression",
        "model": {
          "type": "model/ranklib",
          "definition": """
    ## Linear Regression
    ## Lambda = 1.0E-10
    0:-0.717621803830712 1:-0.717621803830712 2:-2.235841905601106 3:19.546816765721594
    """
        }
      }
    }
    

This is a very high-level set of instructions, because there’s not much to do in Liferay itself. To learn in more detail about what’s required, see the Learning to Rank plugin’s documentation.
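
To check that the uploads succeeded, you can read the feature set and the model back from the plugin. This is a minimal sketch that assumes the names used in this example (liferay and linearregression).

```json
# Returns the stored feature set named liferay
GET _ltr/_featureset/liferay

# Returns the stored model named linearregression
GET _ltr/_model/linearregression
```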

tip

Keep reworking those judgment lists!
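
Reworking a judgment list means recording feature values for each graded result. The plugin can log those values for you; the request below is a sketch of its feature logging query, assuming the liferay feature set from Step 3. The index name liferay-20101, the document IDs, and the keywords are placeholders, and log_entry1 and logged_featureset are arbitrary labels chosen for this example.

```json
# Assumed placeholders: index name, document IDs, and keywords
POST liferay-20101/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "_id": ["DOCUMENT_ID_1", "DOCUMENT_ID_2"]
          }
        },
        {
          "sltr": {
            "_name": "logged_featureset",
            "featureset": "liferay",
            "params": {
              "keywords": "test keywords"
            }
          }
        }
      ]
    }
  },
  "ext": {
    "ltr_log": {
      "log_specs": {
        "name": "log_entry1",
        "named_query": "logged_featureset"
      }
    }
  }
}
```

Each hit in the response should carry the logged feature values, which you can pair with the grades in your judgment list to produce training data for the tool you selected.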

Step 4: Enable Learning to Rank

Enable Learning to Rank from Control Panel → Configuration → System Settings → Search → Learning to Rank. There’s a simple on/off configuration and a text field where you must enter the name of the trained model to apply to search queries.

The model in the previous step was named linearregression, so that’s what you’d enter.

Enable Learning to Rank in Liferay from the System Settings entry.

That’s all the configuration required to get the Elasticsearch Learning to Rank plugin using a trained model, a feature set, and search queries from Liferay.
