Legacy Knowledge Base
Published Jun. 30, 2025

Stop Elasticsearch from analyzing words that are a company's proper name

Written By

Jorge Diaz

How To articles are not official guidelines or officially supported documentation. They are community-contributed content and may not always reflect the latest updates to Liferay DXP. We welcome your feedback to improve How To articles!

While we make every effort to ensure this Knowledge Base is accurate, it may not always reflect the most recent updates or official guidelines. We appreciate your understanding and encourage you to reach out with any feedback or concerns.

Legacy Article

You are viewing an article from our legacy "FastTrack" publication program, made available for informational purposes. Articles in this program were published without a requirement for independent editing or verification and are provided "as is" without guarantee.

Before using any information from this article, independently verify its suitability for your situation and project.

Issue

We would like Elasticsearch to stop analyzing words that are the company's proper name.

For example, if our company is called "Smiths", we want to prevent Elasticsearch from analyzing that word and dropping the trailing "s".

Environment

  • Any Liferay DXP version

Resolution

How to customize the Elasticsearch analyzers configuration in Liferay

To change how Elasticsearch analyzes words at index time and/or search time, you have to customize the configuration of its analyzers.
This configuration must be changed from the Liferay side: open the Elasticsearch connector configuration and edit the "Additional Index Configurations" section. For more information, see the Liferay documentation on the Elasticsearch connector settings.

For an example of how to redefine an analyzer, see the article: How to create a custom Elasticsearch analyzer that is insensitive to accents

How to keep some words in Elasticsearch from being processed by the regular analyzers

To keep some words from being processed by the regular analyzers, you can use the "stemmer override" token filter or the "keyword marker" token filter. For more information, see the Elasticsearch documentation for each filter.
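As a minimal sketch of the "keyword marker" option, the Python snippet below builds an index settings fragment in which a keyword_marker filter protects the company name from the stemmer. The filter name "company_keywords", the analyzer name "company_keyword_english", and the helper function are hypothetical names chosen for illustration. The sketch also assumes an English stemmer and a lowercase filter placed before the marker, which is why the protected words are lowercased.

```python
import json


def build_keyword_marker_settings(protected_words):
    """Build an analysis settings fragment that shields words from stemming.

    All filter and analyzer names here are hypothetical examples.
    """
    # The lowercase filter runs first, so the keyword list must be lowercase too.
    keywords = sorted({word.lower() for word in protected_words})
    return {
        "analysis": {
            "filter": {
                # Tokens listed here are flagged as keywords and skipped by stemmers.
                "company_keywords": {
                    "type": "keyword_marker",
                    "keywords": keywords,
                },
                "english_stemmer": {"type": "stemmer", "language": "english"},
            },
            "analyzer": {
                "company_keyword_english": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # The marker must come before the stemmer to have any effect.
                    "filter": ["lowercase", "company_keywords", "english_stemmer"],
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_keyword_marker_settings(["Smiths"]), indent=2))
```

With this approach the token is indexed as "smiths", unchanged, rather than being replaced by a special marker token.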

There is also the "mapping" character filter, which rewrites text before it reaches the tokenizer. See the Elasticsearch article "Mapping character filter", whose example rewrites ":(" as "_sad_". Keep in mind that the rewritten text is still processed by the tokenizer and token filters that follow.

 

Given these options, if you want the word "Smiths" to be treated differently, you can set up a stemmer override so that "Smiths" is stored as a special keyword in Elasticsearch.

For example, you can apply a stemmer override with the rule "Smiths, smiths => __smiths__". The two "_" characters before and after keep the resulting token from being confused with ordinary words in the language.
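The rule above could be expressed as an index settings fragment along the following lines. This is only a sketch: the filter name "company_name_override", the analyzer name "company_aware_english", and the choice of an English stemmer are assumptions made for illustration, and how the analyzer is attached to fields depends on your Liferay mappings. The Python code simply builds the JSON and prints it so it can be reviewed before being pasted into "Additional Index Configurations".

```python
import json

# Hypothetical names, chosen only for this sketch.
OVERRIDE_FILTER = "company_name_override"
ANALYZER = "company_aware_english"


def build_settings():
    """Return an analysis fragment that maps "Smiths" to "__smiths__"."""
    return {
        "analysis": {
            "filter": {
                # stemmer_override rewrites the token and protects it
                # from any stemmer that runs later in the chain.
                OVERRIDE_FILTER: {
                    "type": "stemmer_override",
                    "rules": ["Smiths, smiths => __smiths__"],
                },
                "english_stemmer": {"type": "stemmer", "language": "english"},
            },
            "analyzer": {
                ANALYZER: {
                    "type": "custom",
                    "tokenizer": "standard",
                    # The override has to be placed before the stemmer.
                    "filter": ["lowercase", OVERRIDE_FILTER, "english_stemmer"],
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_settings(), indent=2))
```

One way to check the result is to run the text "Smiths" through the analyzer with the Elasticsearch _analyze API and confirm that the returned token is "__smiths__". Because analyzer settings are applied when an index is created, a reindex is normally needed after changing them.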

 
