Configuring Cross-Region Disaster Recovery
Liferay Cloud provides two Disaster Recovery (DR) procedures for major incidents: Automatic Disaster Recovery and Cross-Region Disaster Recovery. Both approaches are described in greater detail in the Disaster Recovery Overview.
Here you’ll learn how to recover data manually during a cross-region disaster. These steps are required only when all three zones in the same geographic region are compromised at the same time.
Initial Setup
Liferay offers a dedicated Liferay Cloud environment to manage a cross-region disaster. For this example, assume that a Production environment is hosted in the europe-west2 region and the region is compromised. To prevent downtime and data loss on the Production environment, locate the Disaster Recovery environment outside the region of operation, such as in us-west1. This fifth environment, the DR environment, thus serves as a backup that stores new user data generated during the incident.
Liferay Cloud customers wishing to set up a Disaster Recovery environment must contact their sales representative to provision the DR environment. This new environment appears with the other available environments (e.g., dev, infra, uat, and prd).
Liferay Cloud systems administrators must have full administrative rights on both the DR environment and the Production environment.
Verify VPN Settings in the DR Environment
If a VPN is enabled on the Production environment, verify that the DR environment’s VPN has also been enabled.
To ensure the two environments are connected:
- Click the DR environment’s Settings tab in the left menu.
- In the VPN section, enter this information:
  - VPN Type: OpenVPN
  - Server Address: The server address.
  - Account Name: The administrator’s email address.
  - Password: The administrator’s password.
  - Certificate: The certificate code.
  - Forwarding IP: The forwarding IP address.
  - Forwarding Port: The forwarding port number.
  - Local Hostname: The VPN’s hostname.
  - Local Port: The local port number.
- Click Connect VPN.
For more information on connecting to a VPN, see VPN Connection.
Deploy the Latest Stable Build from Production to the DR Environment
Next, deploy the latest stable build from Production to the DR environment. Follow the same steps outlined in Updating Services in Liferay PaaS.
Set Up Automated Backup Restores to Disaster Recovery
To finish setting up your Disaster Recovery environment, set up automated backup restores. This ensures that the latest backup is always ready in your DR environment for when an incident occurs.
First, retrieve your production environment’s master token (this requires administrator privileges to access the liferay service shell):
- In the Liferay Cloud console, navigate to your production environment → the liferay service page.
- Click on the Shell tab.
- Run this command to get the environment’s master token:
  env | grep LCP_PROJECT_MASTER_TOKEN
  The master token is the hexadecimal ID after the = in the result.
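If you want only the token value rather than the whole KEY=value line, the output can be trimmed in the same shell. The following is a minimal sketch, assuming standard grep, head, and cut are available in the service shell; extract_env_value is a hypothetical helper name, and the token in the example is made up.

```shell
# Hypothetical helper: print only the value part of a KEY=value line.
# Reads env-style output on stdin; the variable name is the first argument.
extract_env_value() {
  # Match the exact variable name at the start of a line, then keep
  # everything after the first "=" (the value itself may contain "=").
  grep "^$1=" | head -n 1 | cut -d '=' -f 2-
}

# Example with a made-up value (not a real token):
printf 'LCP_PROJECT_MASTER_TOKEN=0123abcd\n' | extract_env_value LCP_PROJECT_MASTER_TOKEN
```

In the liferay service shell, the equivalent call would be env | extract_env_value LCP_PROJECT_MASTER_TOKEN. Treat the result as a secret and avoid pasting it into logs or tickets.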
Once you have the production environment’s master token, set these environment variables in your DR environment:
- LCP_EXTERNAL_PROJECT_ID: the production environment’s project ID (for example, acme-prd)
- LCP_BACKUP_RESTORE_SCHEDULE: a cron scheduling value that defines how often backups from the prd environment are automatically restored to the Disaster Recovery environment. See Scheduling Automated Backups and Cleanups for more information.
Set this value as a secret in your DR environment:
- LCP_EXTERNAL_PROJECT_MASTER_TOKEN: your production environment’s master token
Set these environment variables in your Disaster Recovery environment, not your production environment. Setting these variables in a production environment may result in backups being restored to it unexpectedly.
Saving these variables in your DR environment enables automated restores.
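Before saving the values, it can help to sanity-check them locally. The sketch below is a hypothetical pre-flight check, not part of Liferay Cloud tooling. It assumes the schedule is written as a standard five-field cron expression; is_five_field_cron is an illustrative helper name, acme-prd is the example project ID used above, and the hourly schedule is only a sample value. The master token is deliberately left out because it belongs in a secret, not in a script.

```shell
# Hypothetical helper (not a Liferay Cloud command): succeeds when the
# argument splits into exactly five whitespace-separated cron fields.
is_five_field_cron() {
  # Run in a subshell so "set -f" (no glob expansion of "*") does not leak.
  ( set -f; set -- $1; [ "$#" -eq 5 ] )
}

# Sample values only; substitute your own project ID and schedule.
LCP_EXTERNAL_PROJECT_ID="acme-prd"
LCP_BACKUP_RESTORE_SCHEDULE="0 * * * *"   # assumed sample: minute 0 of every hour

if is_five_field_cron "$LCP_BACKUP_RESTORE_SCHEDULE"; then
  echo "Schedule for $LCP_EXTERNAL_PROJECT_ID looks like a five-field cron expression"
else
  echo "Schedule does not have five fields" >&2
fi
```

A check like this only catches a wrong field count; it does not validate field ranges or confirm which cron syntax the backup service accepts.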
During an Incident
Continuing the example above, assume that the Production environment hosted in the europe-west2 region is backed up hourly and that the most recent backup was created at 2:00 PM local time. The region is then compromised at 2:30 PM local time. Because no backups have been generated in the intervening half hour, the last stable version is the one created at 2:00 PM, and that backup of the database and documents must be restored from the Production environment to the Disaster Recovery environment.
During a cross-region incident, follow these steps:
Disable the Database Restoration Schedule
Since the DR environment becomes the main environment accessible to users for the duration of the incident, your normal database restoration schedule may overwrite data after you switch production over to it.
If you are using the LCP_BACKUP_RESTORE_SCHEDULE environment variable to regularly restore data to your DR environment, temporarily disable the restoration schedule by removing the variable. This prevents data created during the incident from being overwritten by a scheduled restore.
Follow these steps to disable the restoration schedule while it is accessible:
- On the Liferay Cloud console, navigate to your DR environment → Backup service page → Environment Variables.
- Click the eye icon to reveal the value for your LCP_BACKUP_RESTORE_SCHEDULE variable.
- Make a note of the value for LCP_BACKUP_RESTORE_SCHEDULE so that you can quickly replace it after the incident.
- Remove the LCP_BACKUP_RESTORE_SCHEDULE environment variable and save changes.
Copy Latest Production Data to the DR Environment
Next, restore a backup from your Production environment to ensure your DR environment has the most recent updates.
If you were using the LCP_BACKUP_RESTORE_SCHEDULE environment variable to regularly restore to your DR environment, you may already have the most recent stable backup restored and ready (depending on your configured frequency). If a scheduled restore has already applied that backup, skip the manual restore.
Follow these steps to restore the latest stable backup of Production to the DR environment:
- In the DR environment, click the Backups tab.
- Click the tab corresponding to the Production environment.
  Note: The Backup History lists the backups in two tabs: one for the DR environment and one for the Production environment.
- Click Actions for the latest stable backup in the Production environment; then select Restore.
Verify the DR Environment’s VPN Status and Reindex
Next, follow these steps to ensure your DR environment is ready for incoming traffic:
- Verify that your VPN is connected to the DR environment by navigating to the Settings → VPN page for your DR environment.
  If the appropriate VPN is not connected, set up the connection. See Connecting a VPN Server to Liferay Cloud for more information.
- Log in to your DXP instance (using the IP address, since the custom domain does not yet point to the DR environment).
- Navigate to the Global Menu → Control Panel → Search.
- Click Reindex all search indexes.
Allow some time for the reindex to complete.
Direct Custom Domain Traffic to the DR Environment
The web server service’s custom domain in the DR environment must match the original Production environment’s. You must also delete that configuration from the Production environment:
- In the DR environment, select Services in the left menu.
- Click webserver in the list of Services.
- Click the Custom Domains tab and configure the custom domain to match that of the Production environment.
- Navigate to the same settings in the Production environment and remove the custom domain configuration.
- Update the DNS records and add the custom domain to the DR environment. For more information, see Custom Domains.
This causes all traffic to go to the DR environment.
Post-incident Recovery
Once the regional incident is over, you must shift back to the original region’s Production environment (europe-west2 in the example). Follow these steps:
Put a Freeze on Data Creation
To prevent data loss from recent changes when you switch back to your normal production environment, you must freeze all content creation in the DR environment. When you are ready to make the switch back to your production environment, work with your database administrator to arrange a data freeze before you make a manual backup.
Create a Manual Backup of the DR Environment
During the incident, the DR environment functions as the Production environment and therefore contains all new data generated during the disaster event. To preserve this data, you must back up the DR environment:
- In the DR environment, click Backups in the menu on the left.
- Click Backup Now.
Restore the Manual Backup to Production
Restore the data from your DR environment back to your normal production environment:
- In the DR environment, click Backups in the menu on the left.
- Click the tab corresponding to the DR environment.
  Note: The Backup History lists the backups in two tabs: one for the DR environment and one for the Production environment.
- For the most recent backup (the one you created in the previous section), click the Actions button and select Restore.
- Select the Production environment and click Deploy Build.
Verify VPN Status and Reindex
Follow these steps to ensure your production environment is ready for incoming traffic:
- Verify that your VPN is connected to the production environment by navigating to the Settings → VPN page for your production environment.
  If the appropriate VPN is not connected, set up the connection. See Connecting a VPN Server to Liferay Cloud for more information.
- Log in to your DXP instance (using the IP address, since the custom domain still points to the DR environment).
- Navigate to the Global Menu → Control Panel → Search.
- Click Reindex all search indexes.
Allow some time for the reindex to complete.
Restore Custom Domain Traffic to Production
Because the web server service redirected all traffic to the DR environment during the incident, these settings must be updated again so that all traffic is redirected back to the original Production environment.
- Navigate to Services on the left menu.
- Click on webserver in the list of Services.
- Click the Custom Domains tab.
- Remove the custom domain from the DR environment.
  Warning: Removing the custom domain creates downtime until your Production environment is receiving traffic again.
- Update the DNS records and add the custom domain back to the Production environment. For more information, see Custom Domains.
- Click Update Custom Domain.
Traffic should now be directed back to the original Production environment. If you do not use automatically scheduled database restores to your DR environment, the disaster recovery process is complete.
Restore the Database Restoration Schedule
If you were using the LCP_BACKUP_RESTORE_SCHEDULE environment variable to regularly restore to your DR environment before the incident, add this variable back to resume the restoration schedule:
- On the Liferay Cloud console, navigate to your DR environment → Backup service page → Environment Variables.
- Add the LCP_BACKUP_RESTORE_SCHEDULE environment variable and restore the value that you noted previously when you removed it.
- Save the changes.
Your Liferay Cloud environments can now resume their normal operations.