Troubleshooting Cross-Cluster Replication
This article covers common pitfalls encountered during CCR setup, along with general troubleshooting techniques. For further troubleshooting, consult Elastic’s CCR documentation or visit Elastic’s forum.
Investigating Index Replication Issues
Successful read operations to the follower Elasticsearch cluster depend on replication of the leader’s indexes.
To help diagnose replication problems, add an INFO log level for com.liferay.portal.search.elasticsearch7.internal.ccr.CrossClusterReplicationHelperImpl. Log levels are added in Control Panel → Server Administration → Log Levels.
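Alongside the Liferay logs, the follower Elasticsearch cluster can report replication progress itself. The following is a minimal Python sketch, assuming the follower cluster is reachable over plain HTTP without authentication. It reads Elasticsearch's follow stats API (GET /<index>/_ccr/stats) and lists shards whose follower checkpoint trails the leader's, or that report a fatal exception. The function names and the base URL are hypothetical; a secured cluster would also need credentials and TLS settings.

```python
import json
import urllib.request


def follow_stats_url(base_url: str, index: str) -> str:
    """Build the URL of the CCR follow stats API for one follower index."""
    return f"{base_url.rstrip('/')}/{index}/_ccr/stats"


def fetch_follow_stats(base_url: str, index: str) -> dict:
    """Call the follow stats API on the follower cluster (needs network access)."""
    with urllib.request.urlopen(follow_stats_url(base_url, index)) as response:
        return json.load(response)


def lagging_shards(stats: dict) -> list:
    """Return (index, shard_id, lag, fatal_exception) for shards that are behind or failed."""
    problems = []
    for index_stats in stats.get("indices", []):
        for shard in index_stats.get("shards", []):
            # Lag is the number of operations the follower has not yet replicated.
            lag = shard["leader_global_checkpoint"] - shard["follower_global_checkpoint"]
            fatal = shard.get("fatal_exception")
            if lag > 0 or fatal:
                problems.append((index_stats["index"], shard["shard_id"], lag, fatal))
    return problems
```

Calling lagging_shards(fetch_follow_stats("http://localhost:9201", "<follower index>")) against the follower cluster returns an empty list when every shard has caught up. A lag that never shrinks, or a non-empty fatal exception, points at the replication problems the log category above is meant to surface.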
Inspecting Connection Request/Response
Enabling Cross-Cluster Replication requires setting up multiple connections to Elasticsearch clusters.
To help diagnose connection issues, add an INFO log level for com.liferay.portal.search.elasticsearch7.internal.connection.ElasticsearchConnectionManager. Log levels are added in Control Panel → Server Administration → Log Levels.
Exceptions During Reindex: RetentionLeaseNotFoundException and IndexNotFoundException
When a reindex is triggered on the Leader DXP node, the Follower Elasticsearch node may throw errors like this:
and this:
With a shard history retention lease, a follower can mark in the history of operations on the leader where in history that follower currently is. The leader shards know that operations below that marker are safe to be merged away, but any operations above that marker must be retained until the follower has had the opportunity to replicate them. These markers ensure that if a follower goes offline temporarily, the leader will retain operations that have not yet been replicated. Since retaining this history requires additional storage on the leader, these markers are only valid for a limited period, after which the marker will expire and the leader shards will be free to merge away history. You can tune the length of this period based on how much additional storage you are willing to retain in case a follower goes offline, and how long you’re willing to accept a follower being offline before it would otherwise have to be re-bootstrapped from the leader.
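In Elasticsearch 7, the length of this period is governed by the index setting index.soft_deletes.retention_lease.period on the leader index, which defaults to 12h. The following is a minimal Python sketch of how a request to change it could be built with the update index settings API. The function name, base URL, index name, and the 24h value are illustrative assumptions, and authentication is omitted.

```python
import json
import urllib.request


def retention_lease_request(base_url: str, index: str, period: str) -> urllib.request.Request:
    """Build a PUT /<index>/_settings request that changes the retention lease period."""
    body = json.dumps({"index.soft_deletes.retention_lease.period": period}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical leader cluster address and index name, for illustration only.
request = retention_lease_request("http://localhost:9200", "my-leader-index", "24h")
# Sending it requires a reachable cluster: urllib.request.urlopen(request)
```

A longer period lets a follower stay offline longer before it has to be re-bootstrapped, at the cost of more retained history on the leader.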
ElasticsearchSecurityException When Setting Up CCR
You may run into the following error when configuring CCR:
CCR requires a Platinum Elasticsearch license. As a Liferay Enterprise Search (LES) subscriber, you have access to CCR with the license provided to you by Liferay. If you’re testing locally, you can start a trial on each cluster.
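For local testing, a trial can be started through Elasticsearch's start trial API (POST /_license/start_trial?acknowledge=true). The following is a minimal Python sketch, assuming unauthenticated HTTP access to each cluster. The function names and cluster addresses are hypothetical.

```python
import json
import urllib.request


def start_trial_request(base_url: str) -> urllib.request.Request:
    """Build a POST to the start trial API, acknowledging the trial terms."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_license/start_trial?acknowledge=true",
        method="POST",
    )


def trial_started(response_body: str) -> bool:
    """Return True when the API response reports that the trial was started."""
    return bool(json.loads(response_body).get("trial_was_started", False))


def start_trial(base_url: str) -> bool:
    """Send the request to one cluster (needs network access)."""
    with urllib.request.urlopen(start_trial_request(base_url)) as response:
        return trial_started(response.read().decode("utf-8"))
```

Because the trial must be active on each cluster, start_trial would be called once for the leader and once for the follower, for example with "http://localhost:9200" and "http://localhost:9201" if those are the local addresses in use.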
Local DXP Node Doesn’t Read from Follower Elasticsearch Cluster
In a DXP cluster using Cross-Cluster Replication, each local DXP node must be mapped to read from the local follower Elasticsearch cluster. For example, if you have two local DXP nodes and the connectionId of your follower connection is ccr, map them to the follower Elasticsearch cluster by configuring the Local Cluster Configurations property with values like this:
Even if you’re not binding the DXP nodes to localhost, the internal clustering code continues to identify each node using it, so localhost should be the hostname in this property. If you want to use a hostname other than localhost to identify DXP nodes internally (including in the CCR configuration), you must set the following portal properties on each DXP node:
With these properties, the above Local Cluster Configurations property is
Follower Elasticsearch Cluster with Red Status
The follower cluster may go to red cluster health status after you successfully set up a CCR connection and enable CCR on the local DXP node. This can result in errors like this in the follower Elasticsearch node’s console:
This may happen if you’ve been configuring, restarting, and reindexing repeatedly during the setup procedure. If you see this happen and you are confident your connections are configured properly, delete the follower indexes, then re-enable CCR from Liferay’s System Settings:
- Delete all the follower indexes. This is most conveniently carried out in Kibana’s Index Management UI.
- Perform a full reindex from the Leader DXP node.
- To re-enable the CCR configuration, go to System Settings → Search → Cross-Cluster Replication on the Local DXP node. De-select Read from Local Clusters and click Update to disable the module, then select Read from Local Clusters and click Update again to re-enable it.
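Before and after these steps, it can help to confirm which indexes on the follower cluster are actually red. The following is a minimal Python sketch, assuming unauthenticated HTTP access to the follower cluster. It uses Elasticsearch's cat indices API with JSON output and filters the rows by health. The function names and base URL are hypothetical.

```python
import json
import urllib.request


def red_indices_url(base_url: str) -> str:
    """Build a cat indices URL that asks for red indexes only, as JSON."""
    return f"{base_url.rstrip('/')}/_cat/indices?format=json&health=red"


def red_index_names(cat_rows: list) -> list:
    """Return the sorted names of indexes whose health column is red."""
    return sorted(row["index"] for row in cat_rows if row.get("health") == "red")


def fetch_red_indices(base_url: str) -> list:
    """Query the follower cluster and return its red indexes (needs network access)."""
    with urllib.request.urlopen(red_indices_url(base_url)) as response:
        return red_index_names(json.load(response))
```

An empty result from fetch_red_indices after the reindex and after re-enabling CCR suggests the follower indexes were recreated cleanly.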
Liferay 7.2: After Deploying the CCR LPKG and the ElasticsearchConnectionConfiguration File, Search is Broken
If you see errors in the log like those below and experience a broken search engine connection after deploying the CCR LPKG file simultaneously with the ElasticsearchConnectionConfiguration-ccr.config file, you have encountered a known bug, LPS-127821. To work around this bug and fix the search engine connection, either restart Liferay or duplicate the configuration using a different file subname (e.g., -ccr2.config; update the connectionId setting as well).