Tuning Your Auto-Scaling Settings
Auto-scaling ensures your services have the appropriate number of instances at different levels of usage. You should, however, monitor your environment’s usage and observe its scaling behavior so you can tune your settings for better results.
Tuning your auto-scaling settings starts with configuring your target average utilization values.
Auto-Scaling Behavior
Liferay Cloud determines the desired number of instances using the Kubernetes Horizontal Pod Autoscaler (HPA) algorithm. For both CPU and memory requirements, the service calculates this value as

desiredInstances = ceil(currentInstances x (currentUtilization / targetUtilization))
Your service upscales or downscales when both of these criteria, relative to your configured target utilization value, are met:

- The desired number of instances changes up or down. This number is calculated as currentInstances x (currentUtilization / targetUtilization), rounded up to the next integer. A separate value is calculated using both the CPU and memory target utilization values, and the higher number of desired instances is used, ensuring both CPU and memory requirements are met for your usage.
- The difference between the current utilization and the target utilization is outside the tolerance window of 10%. To prevent rapidly scaling up and down, the ratio currentUtilization / targetUtilization must be ≤ 0.9 to trigger a downscale, and it must be ≥ 1.1 to trigger an upscale.

Once both of these conditions are sustained for 5 minutes, the service upscales or downscales to the appropriate number of instances.
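The rule above can be sketched in a few lines of Python. This is an illustrative model of the HPA behavior described here, not Liferay Cloud code; the `desired_instances` helper and its `tolerance` default are assumptions made for the example.

```python
import math

def desired_instances(current, utilization, target, tolerance=0.1):
    """Illustrative model of the HPA scaling rule (not a Liferay API)."""
    ratio = utilization / target
    # Inside the 10% tolerance window, no scaling occurs.
    if 1 - tolerance <= ratio <= 1 + tolerance:
        return current
    # Otherwise, scale to the desired count, rounded up to the next integer.
    return math.ceil(current * ratio)

# 3 instances at 85% average utilization against an 80% target:
print(desired_instances(3, 85, 80))  # 3 (85/80 = 1.0625, inside tolerance)
# 3 instances at 95% against the same target:
print(desired_instances(3, 95, 80))  # 4 (ceil(3 x 1.1875) = 4)
# CPU and memory are evaluated separately, and the higher result wins:
print(max(desired_instances(3, 95, 80), desired_instances(3, 70, 80)))  # 4
```

In the real autoscaler, both conditions must also be sustained for 5 minutes before the instance count changes.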
Configuring Target Average Utilization
System administrators can specify a target average utilization. This value is the average CPU or memory usage across a Liferay DXP service’s instances, and that threshold must be crossed before auto-scaling is triggered.
For example, if three service instances utilize 70%, 90%, and 95% of memory, respectively, the average memory utilization is 85%. If the target average utilization is set to 90, no upscaling is needed; upscaling in this situation only occurs when the average memory utilization exceeds the target.
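As a quick check of the arithmetic in this example:

```python
# Average memory utilization across three instances (70%, 90%, 95%):
utilizations = [70, 90, 95]
average = sum(utilizations) / len(utilizations)
print(average)  # 85.0
# With a target average utilization of 90, 85.0 < 90, so no upscaling occurs.
```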
The default target utilization is 80%. Services that use a high amount of memory (such as the liferay service) can surpass this target utilization value quickly, even immediately after deployment.
The total available memory is specified by the memory property in LCP.json, as referenced in Configuration via LCP.json.
Specify the target average utilization in the autoscale property of the service’s LCP.json:
"autoscale": {
"cpu": 80,
"memory": 90
}
Balance your target average utilization according to your application’s specific needs for the most efficient auto-scaling. For example, this configuration heavily prioritizes CPU usage:
"autoscale": {
"cpu": 60,
"memory": 95
}
If the autoscale property isn’t set, the target average utilization defaults to 80 for both CPU and memory utilization.
JVM-Based Auto-Scaling
In most cases, tuning your target utilization value is sufficient to promote desirable auto-scaling behavior. However, some environments may not downscale as expected because unused memory is not actively reallocated from the JVM. This can happen when the operating system prefers to keep memory in the page cache to optimize the application’s performance.
In these instances, you may see better auto-scaling results with JVM-based scaling, rather than the default scaling behavior. This graph shows an example of this behavior: the green line shows the JVM’s heap usage as garbage collection occurs, and the blue line shows the total memory allocated to the application in the service’s container.

JVM-based scaling follows the same auto-scaling behavior, but it respects the usage of the JVM’s heap memory instead of how much memory is allocated by the operating system. Your environment may perform better with JVM-based scaling if

- Your application has configured garbage collection to actively free up memory.
- Your application allocates large amounts of memory by default, so it naturally stays close to auto-scaling thresholds.
- Your application’s total memory allocation remains high even in periods of low traffic.
Enable JVM-based auto-scaling in your LCP.json file’s autoscale object:

- Set the memory property to a high value (e.g., 1000).
- Add a prometheus.googleapis.com|jvm_memory_pool_used_bytes|gauge property, and set it to a value proportional to the total bytes in the JVM’s heap.
For example, this LCP.json configuration sets a JVM heap usage threshold of about 80% of 12 GB:
"autoscale": {
"cpu": 90,
"memory": 1000,
"prometheus.googleapis.com|jvm_memory_pool_used_bytes|gauge": "10307921510"
}
The extremely high memory threshold effectively disables the standard mechanism of auto-scaling based on container metrics, in favor of JVM-based scaling.
Instead, the service uses the configured number of bytes as a target memory threshold, which gives auto-scaling a more accurate measurement of your service’s memory usage for some applications.
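The threshold in the example above can be derived with a quick calculation (assuming the 12 GB heap is measured in binary gibibytes):

```python
# 80% of a 12 GiB heap, in bytes:
heap_bytes = 12 * 1024**3           # 12884901888
threshold = int(heap_bytes * 0.8)   # truncate to a whole number of bytes
print(threshold)  # 10307921510
```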
Choosing Metrics and Target Values
Auto-scaling is automatic, but its behavior depends on the metrics and target values you configure. Custom metrics provide more control but require tuning based on real usage.
Choosing a Metric
CPU and memory utilization are sufficient for most environments. Use a custom metric only when it better reflects your application’s bottleneck. For supported metrics, see Alternative Scaling Metrics.
In LCP.json, metrics use the format prometheus.googleapis.com|<metric_name>|<type>.
- Memory pressure: prometheus.googleapis.com|jvm_memory_used_bytes|gauge, prometheus.googleapis.com|jvm_memory_pool_used_bytes|gauge (see JVM-Based Auto-Scaling for an example)
- High concurrency: prometheus.googleapis.com|catalina_threadpool_currentthreadsbusy|<type>, prometheus.googleapis.com|jvm_threads_current|gauge
- Database limits: prometheus.googleapis.com|com_zaxxer_hikari_pool_hikaripool_{1,2,3...}_activeconnections|<type>
Replace <type> with the metric type defined in Prometheus (for example, gauge or counter).
Use a single primary metric to avoid conflicting signals. If you use a custom metric, set the memory and cpu values to a high number (for example, 1000) so they do not interfere with it.
Choosing a Target Value
Targets must reflect real usage patterns. Monitor your environment under normal and peak load, identify values where performance begins to degrade, and set the target slightly below that point. Observe the resulting behavior and adjust the value incrementally as needed.
If no historical data is available, start with a target value around 70–80% of your JVM capacity (for example, heap usage) and adjust based on observed behavior. This approach is most useful when using custom metrics.
Ensuring Downscaling Works
Auto-scaling does not downscale as soon as usage drops below the target. The metric must drop significantly.
As a rule of thumb, downscaling is most restrictive with a small number of instances. For example, when running 2 instances, scaling down to 1 occurs only when the metric falls to about half of the target value.
If your target is 80 and you run 2 instances, scaling down to 1 may only occur when the metric drops to around 40.
With more instances, the required drop is smaller, so downscaling occurs closer to the target value. For example, with 10 instances and a target of 100, scaling down to 9 may occur when the metric drops to around 80–90.
If the metric never drops far enough, the service does not scale down. Increase the target so that downscaling becomes possible.
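These thresholds follow from the rounding and tolerance rules described earlier. A small sketch (the `downscale_threshold` helper is illustrative, not a Liferay API):

```python
def downscale_threshold(instances, target, tolerance=0.1):
    """Highest metric value at which a scale-down by one instance can occur.

    Combines two conditions: ceil(instances * u / target) must drop below
    the current instance count, and u / target must be <= 1 - tolerance.
    """
    return min(target * (instances - 1) / instances,
               target * (1 - tolerance))

print(downscale_threshold(2, 80))    # 40.0: half the target with 2 instances
print(downscale_threshold(10, 100))  # 90.0: close to the target with 10
```

With few instances the rounding condition dominates (a large drop is required); with many instances the 10% tolerance window dominates, so downscaling occurs much closer to the target.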
Common Pitfalls
- No downscaling occurs: The target is too low relative to the metric’s minimum values.
- Frequent scaling events: The target is too close to normal usage.
- Conflicting metrics: CPU or memory targets interfere with custom metrics.
- Wrong metric: The selected metric does not reflect the actual bottleneck.
Alternative Scaling Metrics
Use these Prometheus metrics in the autoscale property of LCP.json when your environment aligns better with metrics other than CPU or memory. These values use absolute thresholds, not percentages.
Tomcat (Request Handling and Threads)
Track request throughput and thread usage in the application server.
prometheus.googleapis.com|catalina_globalrequestprocessor_processingtime|unknown
prometheus.googleapis.com|catalina_globalrequestprocessor_requestcount|unknown
prometheus.googleapis.com|catalina_threadpool_currentthreadcount|unknown
prometheus.googleapis.com|catalina_threadpool_currentthreadsbusy|unknown
Database Connection Pool (HikariCP)
Track database connection usage and saturation.
prometheus.googleapis.com|com_zaxxer_hikari_pool_hikaripool_{1,2,3...}_activeconnections|unknown
prometheus.googleapis.com|com_zaxxer_hikari_pool_hikaripool_{1,2,3...}_idleconnections|unknown
JVM Metrics
Track memory usage, garbage collection, and thread activity.
prometheus.googleapis.com|java_lang_threading_threadcount|unknown
prometheus.googleapis.com|jvm_buffer_pool_used_bytes|gauge
prometheus.googleapis.com|jvm_classes_currently_loaded|gauge
prometheus.googleapis.com|jvm_gc_collection_seconds_count|summary
prometheus.googleapis.com|jvm_memory_pool_used_bytes|gauge
prometheus.googleapis.com|jvm_memory_used_bytes|gauge
prometheus.googleapis.com|jvm_threads_current|gauge
prometheus.googleapis.com|jvm_threads_peak|gauge
prometheus.googleapis.com|jvm_threads_state|gauge
Process Metrics
Track system-level resource usage.
prometheus.googleapis.com|process_cpu_seconds_total|counter
prometheus.googleapis.com|process_open_fds|gauge
prometheus.googleapis.com|process_resident_memory_bytes|gauge