Examining Data Migration
Liferay’s batch processing capabilities provide efficient, automated management of bulk object data. Once data is exported and appropriately formatted, batch client extensions leverage the batch engine to import it. Imports are executed according to configurable strategies, giving you control over how records are created and updated and how errors are handled. Automating these transfers lets you migrate models reliably and quickly, which is particularly valuable for large-scale migrations or situations requiring frequent data transfers between environments.
Here, you’ll learn about streamlining data model migration with batch client extensions.
Batch client extensions provide utility in crucial data management tasks. Using industry standard JSON to represent data structures and values, batch client extensions leverage APIs to migrate models and automate data import via headless batch endpoints. Upon deployment of a batch client extension, Liferay imports the extension’s models into the target environment, either creating new models or updating existing ones.
Batch client extensions are fundamentally composed of two parts:
- Configuration elements
- Payload resources
Batch Client Extension Configurations
Every client extension project contains a client-extension.yaml file defining its structure, resources, and configurations. This file is crucial for managing and organizing your project’s extensions. For further details, refer to official documentation (see Configuring Client Extensions) or take Foundations of Liferay Client Extensions for a holistic overview.
For batch client extensions, the client-extension.yaml file’s configuration elements are set in these three sections:
- The Assemble Block
- The (extension) Definition Block
- The Security (OAuth) Block
Assemble Block
The assemble block organizes the extension’s assets, ensuring that all necessary resources are packaged correctly for deployment. This block guides the processor in finding the extension’s required files and resources. Typical batch client extensions will include an assemble block that looks like this:
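The example itself is not reproduced in this text, so the following is only a minimal sketch in Python of what such a block might contain and how its paths could be checked before deployment. The `from` and `into` keys are assumptions modeled on the Liferay Sample Workspace, and `missing_sources` is a hypothetical helper, not part of any Liferay tooling.

```python
from pathlib import Path

# Assumed YAML form of the block (the `from` and `into` keys are assumptions):
#
# assemble:
#     - from: batch
#       into: batch
ASSEMBLE = [{"from": "batch", "into": "batch"}]


def missing_sources(assemble, project_dir):
    """Return the `from` folders that do not exist under project_dir."""
    root = Path(project_dir)
    return [
        step["from"]
        for step in assemble
        if not (root / step["from"]).is_dir()
    ]
```

Under these assumptions, calling `missing_sources(ASSEMBLE, project_dir)` returns an empty list when the project contains a `batch` folder, and lists any source folder that still needs to be created or renamed.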
The two batch references in this example refer to the folder named “batch” within the Liferay Sample Workspace’s liferay-sample-batch. If data is organized differently, paths in the assemble block will need to be adjusted accordingly.
Extension Definition Block
Core client extension configurations are defined within this block. For a batch client extension, this includes properties such as the name, type, and associated APIs. The name value identifies the client extension in the Liferay UI, while the type value of batch ensures that Liferay handles it as a batch client extension when it's deployed. The oAuthApplicationHeadlessServer value is a required attribute specifying the OAuth2 configuration used for secure API access. This value must match the name of the block that defines the properties.
A typical batch client extension will include an extension definition block similar to this example:
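As a rough illustration of the three properties described above, the sketch below mirrors a definition block as a Python dictionary and checks it. The block name and property values are illustrative assumptions modeled on the Liferay Sample Workspace, and `definition_problems` is a hypothetical helper.

```python
# Assumed YAML form of the block (names and values are illustrative):
#
# liferay-sample-batch:
#     name: Liferay Sample Batch
#     oAuthApplicationHeadlessServer: liferay-sample-batch-oauth-application-headless-server
#     type: batch
DEFINITION = {
    "liferay-sample-batch": {
        "name": "Liferay Sample Batch",
        "oAuthApplicationHeadlessServer": (
            "liferay-sample-batch-oauth-application-headless-server"
        ),
        "type": "batch",
    }
}

# The three properties the text describes for a batch definition block.
EXPECTED_KEYS = ("name", "oAuthApplicationHeadlessServer", "type")


def definition_problems(block):
    """Return a list of human-readable problems found in a definition block."""
    problems = [
        f"missing property: {key}" for key in EXPECTED_KEYS if key not in block
    ]
    if "type" in block and block["type"] != "batch":
        problems.append("type must be 'batch' for a batch client extension")
    return problems
```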
Security (OAuth) Block
To securely interact with Liferay's APIs, the extension is configured with an OAuth2 application. This setup includes standard OAuth criteria such as the service address, security scheme, and access scopes. Using the previous example’s oAuthApplicationHeadlessServer value, a typical batch client extension’s OAuth block might include:
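A minimal sketch of such a block, again expressed as a Python dictionary so the link between the two blocks can be checked. The `.serviceAddress` and `.serviceScheme` keys, the scope names, and the block names are assumptions modeled on the Liferay Sample Workspace; `unresolved_oauth_references` is a hypothetical helper.

```python
# Assumed YAML form of the OAuth block (keys and scopes are assumptions):
#
# liferay-sample-batch-oauth-application-headless-server:
#     .serviceAddress: localhost:8080
#     .serviceScheme: http
#     name: Liferay Sample Batch OAuth Application Headless Server
#     scopes:
#         - Liferay.Headless.Batch.Engine.everything
#         - Liferay.Object.Admin.REST.everything
#     type: oAuthApplicationHeadlessServer
OAUTH_BLOCK_NAME = "liferay-sample-batch-oauth-application-headless-server"

CLIENT_EXTENSION = {
    "liferay-sample-batch": {
        "name": "Liferay Sample Batch",
        "oAuthApplicationHeadlessServer": OAUTH_BLOCK_NAME,
        "type": "batch",
    },
    OAUTH_BLOCK_NAME: {
        ".serviceAddress": "localhost:8080",
        ".serviceScheme": "http",
        "name": "Liferay Sample Batch OAuth Application Headless Server",
        "scopes": [
            "Liferay.Headless.Batch.Engine.everything",
            "Liferay.Object.Admin.REST.everything",
        ],
        "type": "oAuthApplicationHeadlessServer",
    },
}


def unresolved_oauth_references(config):
    """Return names of batch blocks whose OAuth reference matches no block."""
    unresolved = []
    for name, block in config.items():
        if not isinstance(block, dict) or block.get("type") != "batch":
            continue
        if block.get("oAuthApplicationHeadlessServer") not in config:
            unresolved.append(name)
    return unresolved
```

An empty result means every batch block points at a block that exists in the same file, which is the matching requirement described for the oAuthApplicationHeadlessServer value.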
The security block is critical to the functionality of the batch execution -- without it, the process won’t be able to enforce the necessary security requirements to process the batch payload. Defining scopes within this block grants permissions for specific API resources as needed by the extension.
Collectively, these three blocks compose the client-extension.yaml file within batch client extensions. This YAML file is crucial as it provides Liferay with essential metadata and instructions for handling the deployed archive.
Understanding Payloads
While client-extension.yaml files define a batch client extension’s metadata, payload files are equally critical. Payloads, stored in one or more JSON files, define the specific actions and behaviors of the extension. Using multiple files typically simplifies maintenance, particularly for large or diverse datasets.
Payloads for batch client extensions consist of two top-level JSON objects:
- The Configuration Block
- The Data Block
Configuration Block
The configuration block provides instructions on how to process the data within the JSON file: which entities are involved and how to handle potential errors during payload processing. Within the liferay-sample-batch example's batch folder, the object-definition.batch-engine-data.json file contains this configuration block:
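The block itself is not reproduced in this text. As a hedged sketch, the Python below parses a payload excerpt of the kind described and summarizes what it declares. The nesting of `importStrategy` under a `parameters` key and the `className` value are assumptions, not a copy of the sample file; `describe_configuration` is a hypothetical helper.

```python
import json

# Hypothetical payload excerpt; the "parameters" nesting is an assumption.
PAYLOAD = json.loads("""
{
    "configuration": {
        "className": "com.liferay.object.admin.rest.dto.v1_0.ObjectDefinition",
        "parameters": {
            "importStrategy": "ON_ERROR_FAIL"
        }
    },
    "items": []
}
""")


def describe_configuration(payload):
    """Summarize the entity type, import strategy, and item count of a payload."""
    configuration = payload["configuration"]
    parameters = configuration.get("parameters", {})
    return {
        "className": configuration["className"],
        "importStrategy": parameters.get("importStrategy"),
        "itemCount": len(payload.get("items", [])),
    }
```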
Within this example, the className directs the batch engine to process any items as object definitions, and the importStrategy directs Liferay to surface an error if the import fails.
Options for Batch Strategies
Each import batch contains parameters within the configuration block specifying the import’s data mapping, validation rules, and error handling procedures:
- Import Strategy (importStrategy): Defines behavior in case of errors. The default of ON_ERROR_FAIL stops the import immediately upon encountering an error, while ON_ERROR_CONTINUE continues processing records even when errors occur.
- Create Strategy (createStrategy): Specifies how file records are handled. The default of INSERT creates only new records (and duplicate records trigger errors), while UPSERT updates existing records and creates new ones when necessary.
- Update Strategy (updateStrategy): Configures the behavior during record updates when using the UPSERT strategy. UPDATE completely overwrites the record, replacing missing values in the import file with null, while PARTIAL_UPDATE updates only the fields present in the import file, preserving existing values for other fields.
Each of these strategies can be modified by manually overwriting the configuration block for batch client extension JSON files.
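To make the difference between the create and update strategies concrete, here is a small Python model of how a single incoming record would change a stored one. It is a simplified illustration of the behavior described above, not Liferay's implementation, and it does not model importStrategy; `apply_record` is a hypothetical function.

```python
def apply_record(existing, incoming,
                 create_strategy="INSERT", update_strategy="UPDATE"):
    """Model how one incoming record would change a stored record.

    existing: the stored record as a dict, or None if nothing matches.
    incoming: the record from the import file as a dict.
    """
    if create_strategy not in ("INSERT", "UPSERT"):
        raise ValueError(f"unknown createStrategy: {create_strategy}")
    if update_strategy not in ("UPDATE", "PARTIAL_UPDATE"):
        raise ValueError(f"unknown updateStrategy: {update_strategy}")

    if existing is None:
        # Both INSERT and UPSERT create a record when none exists yet.
        return dict(incoming)
    if create_strategy == "INSERT":
        raise ValueError("duplicate record: INSERT only creates new records")
    if update_strategy == "UPDATE":
        # Full overwrite: fields absent from the import become None (null).
        return {key: incoming.get(key) for key in existing.keys() | incoming.keys()}
    # PARTIAL_UPDATE: only fields present in the import are replaced.
    return {**existing, **incoming}
```

For a stored record `{"name": "Sample", "label": "Old"}` and an incoming record `{"name": "Sample"}`, the UPDATE branch yields a `label` of `None`, while PARTIAL_UPDATE keeps `"Old"`.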
Data Block
The data block contains a list of items, where each item represents a data record to be created. The processing defined by the configuration block relies on this items array as its data source. Therefore, it's crucial that this array includes complete definitions for all data records, encompassing attributes, relationships, and any other relevant information. Within the object-definition.batch-engine-data.json example, below the configuration block, the items array contains a definition for one object (named Sample):
Instead of manually creating information within the data block, leverage Liferay's available tools to export existing definitions. Though the JSON files for exported models will not include the configuration block or the items array by default, you can either augment the exported schema with these elements or incorporate it into a pre-existing file structure that already contains them.
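One way to augment an export is to wrap it programmatically. The sketch below assumes the exported definitions have already been loaded as a list of dictionaries and that strategies sit under a `parameters` key; both are assumptions, and `build_payload` is a hypothetical helper rather than a Liferay utility.

```python
import json


def build_payload(exported_definitions, class_name,
                  import_strategy="ON_ERROR_FAIL"):
    """Wrap exported definitions in a configuration block and an items array."""
    return {
        "configuration": {
            "className": class_name,
            # Assumed location for strategy settings.
            "parameters": {"importStrategy": import_strategy},
        },
        "items": list(exported_definitions),
    }


def to_json(payload):
    """Serialize a payload with stable, readable formatting."""
    return json.dumps(payload, indent=4, sort_keys=True)
```

The result of `to_json(build_payload(definitions, class_name))` could then be saved into the folder referenced by the assemble block.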
Maintaining External Reference Codes
Items within the data block will frequently contain an external reference code (ERC), supported by many of Liferay’s headless APIs. This ERC is used to track and manage unique external data entities with human-readable keys. ERCs are unique identifiers, empowering consistent migration between environments when other values such as structure IDs may vary. Before migrating data, it’s essential to set and maintain descriptive ERCs within the Liferay UI for your environment’s objects and other assets.
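Before packaging a payload, it can help to confirm that every item carries a distinct ERC. The following is a minimal sketch, assuming each item stores its code under the `externalReferenceCode` key; `erc_problems` is a hypothetical helper.

```python
from collections import Counter


def erc_problems(items):
    """Return problems with externalReferenceCode values in a list of items."""
    problems = []
    codes = [item.get("externalReferenceCode") for item in items]
    # Flag items with a missing or empty code.
    for index, code in enumerate(codes):
        if not code:
            problems.append(f"item {index} has no externalReferenceCode")
    # Flag codes that are used by more than one item.
    for code, count in Counter(c for c in codes if c).items():
        if count > 1:
            problems.append(f"externalReferenceCode {code!r} appears {count} times")
    return problems
```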
Troubleshooting Data Migration with Batch Client Extensions
When working with batch client extensions, the following strategies can assist with troubleshooting:
- Check Compatibility: Ensure all resources inside the client extension’s JSON files match the schema, capabilities, and Liferay version of the target system.
- Evaluate Feature Flags: Be aware of any enabled feature flags that may impact client extension behavior.
- Pre-populate Scopes: To check for suspected conflicts with scope requests, access http://localhost:8080/o/api to populate all objects inside Liferay’s memory before deploying client extensions.
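The pre-population step can also be scripted. This is a minimal sketch that simply requests the URL mentioned above and reports the HTTP status; it assumes a local instance that serves the page without authentication, and `warm_up_api` is a hypothetical helper.

```python
import urllib.request


def warm_up_api(base_url="http://localhost:8080", opener=urllib.request.urlopen):
    """Request <base_url>/o/api and return the HTTP status code."""
    url = base_url.rstrip("/") + "/o/api"
    with opener(url) as response:
        return response.status
```

The `opener` parameter exists only so the function can be exercised without a running server; in normal use the default is enough.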
Conclusion
Batch client extensions offer significant advantages over manual methods for automating data migration. By understanding data migration concepts and batch client extension components, you'll be able to streamline the process of migrating object definitions, picklists, workflow definitions, and more between environments. Next, you’ll build and deploy a batch client extension from Clarity’s exported distributor management definitions.