Examining Data Migration

Liferay’s batch processing capabilities provide efficient, automated management of bulk object data. Once data is exported and appropriately formatted, batch client extensions leverage the batch engine to import it. Imports run according to configurable strategies, giving you control over how records are created, how they are updated, and how errors are handled. Automating these transfers makes model migration faster and more reliable, which is particularly valuable for large-scale migrations or frequent data transfers between environments.

Here, you’ll learn about streamlining data model migration with batch client extensions.

Batch client extensions provide utility in crucial data management tasks. Using industry standard JSON to represent data structures and values, batch client extensions leverage APIs to migrate models and automate data import via headless batch endpoints. Upon deployment of a batch client extension, Liferay imports the extension’s models into the target environment, either creating new models or updating existing ones.

Batch client extensions are fundamentally composed of two parts:

  • Configuration elements
  • Payload resources

Batch Client Extension Configurations

Every client extension project contains a client-extension.yaml file defining its structure, resources, and configurations. This file is crucial for managing and organizing your project’s extensions. For further details, refer to official documentation (see Configuring Client Extensions) or take Foundations of Liferay Client Extensions for a holistic overview.

For batch client extensions, the client-extension.yaml file’s configuration elements are set in these three sections:

  • The Assemble Block
  • The Extension Definition Block
  • The Security (OAuth) Block

Assemble Block

The assemble block organizes the extension’s assets, ensuring that all necessary resources are packaged correctly for deployment. This block guides the processor in finding the extension’s required files and resources. Typical batch client extensions will include an assemble block that looks like this:

assemble:
    - from: batch
      into: batch

The two batch references in this example refer to the folder named “batch” within the Liferay Sample Workspace’s liferay-sample-batch project. If your data is organized differently, adjust the paths in the assemble block accordingly.

Paths in the assemble block are always assumed to start at the root of the client extension, so there is no need to specify a leading slash.

Extension Definition Block

Core client extension configurations are defined within this block. For a batch client extension, this includes properties such as the name, type, and associated APIs. The name value identifies the client extension in the Liferay UI, while the type value of batch ensures that Liferay handles it as a batch client extension when it's deployed. The oAuthApplicationHeadlessServer value is a required attribute specifying the OAuth2 configuration used for secure API access. This value must match the name of the block that defines the OAuth2 application's properties.

It is not recommended to configure a headless endpoint to be accessible without an authentication context.

A typical batch client extension will include an extension definition block similar to this example:

liferay-ticket-batch-list-type-definition:
    name: Liferay Ticket Batch List Type Definition
    oAuthApplicationHeadlessServer: liferay-ticket-batch-list-type-definition-oauth-application-headless-server
    type: batch

Security (OAuth) Block

To securely interact with Liferay's APIs, the extension is configured with an OAuth2 application. This setup includes standard OAuth criteria such as the service address, security scheme, and access scopes. Using the previous example’s oAuthApplicationHeadlessServer value, a typical batch client extension’s OAuth block might include:

liferay-ticket-batch-list-type-definition-oauth-application-headless-server:
    .serviceAddress: localhost:8080
    .serviceScheme: http
    name: Liferay Ticket Batch List Type Definition OAuth Application Headless Server
    scopes:
        - Liferay.Headless.Admin.List.Type.everything
        - Liferay.Headless.Batch.Engine.everything
    type: oAuthApplicationHeadlessServer

The security block is critical to the batch execution. Without it, the process cannot satisfy the security requirements needed to process the batch payload. Defining scopes within this block grants permissions for the specific API resources the extension needs.

To find the correct API scopes for your batch client extensions, open the Global Menu, go to the Control Panel tab, and click OAuth 2 Administration. Select an OAuth 2.0 application from the list and go to the Scopes tab, which displays available Liferay API scopes.

Collectively, these three blocks compose the client-extension.yaml file within batch client extensions. This YAML file is crucial as it provides Liferay with essential metadata and instructions for handling the deployed archive.
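
The relationship between the three blocks can be checked mechanically. The following is a minimal sketch, not a Liferay tool: it mirrors the example blocks as a hand-written Python dictionary (a real project would parse client-extension.yaml with a YAML library) and verifies that each batch extension's oAuthApplicationHeadlessServer value names an existing block of the matching type. The find_problems function and its messages are hypothetical names introduced for illustration.

```python
# Hand-written mirror of the example blocks above (assumption: a real project
# would load client-extension.yaml with a YAML parser instead).
OAUTH_NAME = "liferay-ticket-batch-list-type-definition-oauth-application-headless-server"

client_extension = {
    "assemble": [{"from": "batch", "into": "batch"}],
    "liferay-ticket-batch-list-type-definition": {
        "name": "Liferay Ticket Batch List Type Definition",
        "oAuthApplicationHeadlessServer": OAUTH_NAME,
        "type": "batch",
    },
    OAUTH_NAME: {
        ".serviceAddress": "localhost:8080",
        ".serviceScheme": "http",
        "name": "Liferay Ticket Batch List Type Definition OAuth Application Headless Server",
        "scopes": [
            "Liferay.Headless.Admin.List.Type.everything",
            "Liferay.Headless.Batch.Engine.everything",
        ],
        "type": "oAuthApplicationHeadlessServer",
    },
}


def find_problems(config):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not config.get("assemble"):
        problems.append("missing assemble block")
    for block_name, block in config.items():
        # Only blocks declared as batch client extensions are checked.
        if not isinstance(block, dict) or block.get("type") != "batch":
            continue
        oauth_name = block.get("oAuthApplicationHeadlessServer")
        oauth_block = config.get(oauth_name)
        if not isinstance(oauth_block, dict):
            problems.append(f"{block_name}: no block named {oauth_name!r}")
        elif oauth_block.get("type") != "oAuthApplicationHeadlessServer":
            problems.append(f"{oauth_name}: type is not oAuthApplicationHeadlessServer")
        elif not oauth_block.get("scopes"):
            problems.append(f"{oauth_name}: no scopes defined")
    return problems


print(find_problems(client_extension))
```

Running a check like this before deployment can catch a renamed OAuth block early, which is the kind of mismatch the definition block's naming requirement guards against.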

Understanding Payloads

While client-extension.yaml files define a batch client extension’s metadata, payload files are equally critical. JSON payloads, stored in one or more files, define the specific actions and behaviors of the extension. Using multiple files typically simplifies maintenance, particularly for large or diverse datasets.

When leveraging a multiple file approach, file processing occurs in alphanumeric order. It's crucial to name and number files accordingly to ensure that prerequisite elements are processed before they are referenced by other files (e.g., importing a model’s definitions and relationships prior to its entries).
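
To see why numbering matters, the sketch below approximates alphanumeric processing order with a plain string sort. Treating the order as a simple string sort is an assumption made for illustration, and the file names are hypothetical. It shows how unpadded numeric prefixes can place entries ahead of the relationships they depend on, while zero-padded prefixes keep the intended sequence.

```python
# Hypothetical payload file names; only the ordering idea comes from the text.
unpadded = [
    "2-object-relationship.batch-engine-data.json",
    "10-object-entry.batch-engine-data.json",
    "1-object-definition.batch-engine-data.json",
]
padded = [
    "02-object-relationship.batch-engine-data.json",
    "10-object-entry.batch-engine-data.json",
    "01-object-definition.batch-engine-data.json",
]


def processing_order(file_names):
    """Approximate alphanumeric order with a plain string sort (assumption)."""
    return sorted(file_names)


# Unpadded prefixes: "10-..." sorts before "2-...", so entries would come
# ahead of the relationships they may reference.
for name in processing_order(unpadded):
    print(name)

# Zero-padded prefixes sort in the intended sequence.
for name in processing_order(padded):
    print(name)
```

Zero-padding the prefix to a fixed width is one simple convention for keeping string order and numeric order in agreement.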

Payloads for batch client extensions consist of two top-level JSON elements:

  1. The Configuration Block
  2. The Data Block

Configuration Block

The configuration block tells the batch engine how to process the data within the JSON file: which entities are involved and how to handle potential errors during payload processing. Within the liferay-sample-batch example's batch folder, the object-definition.batch-engine-data.json file contains this configuration block:

"configuration": {
    "className": "com.liferay.object.admin.rest.dto.v1_0.ObjectDefinition",
    "parameters": {
        "containsHeaders": "true",
        "createStrategy": "UPSERT",
        "importStrategy": "ON_ERROR_FAIL",
        "updateStrategy": "UPDATE"
    },
    "taskItemDelegateName": "DEFAULT"
},

Within this example, the className directs the batch engine to process all items as object definitions, and the importStrategy of ON_ERROR_FAIL directs Liferay to stop the import when it encounters an error.

Options for Batch Strategies

Each import batch contains parameters within the configuration block specifying the import’s data mapping, validation rules, and error handling procedures:

  • Import Strategy (importStrategy): Defines behavior in case of errors. The default of ON_ERROR_FAIL stops the import immediately upon encountering an error, while ON_ERROR_CONTINUE continues processing records even when errors occur.
  • Create Strategy (createStrategy): Specifies how file records are handled. The default of INSERT creates only new records (and duplicate records trigger errors), while UPSERT updates existing records and creates new ones when necessary.
  • Update Strategy (updateStrategy): Configures the behavior during record updates when using the UPSERT strategy. UPDATE completely overwrites the record, replacing missing values in the import file with null, while PARTIAL_UPDATE updates only the fields present in the import file, preserving existing values for other fields.

To change any of these strategies, edit the parameters in the configuration block of the batch client extension's JSON files.
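
The interaction between the three strategies is easier to follow with a toy model. The sketch below is not Liferay's implementation; it is a simplified in-memory simulation of the behavior described in the list above. The import_items function, the "key" identifier, and the FIELDS tuple are hypothetical names, and the model assumes that ON_ERROR_FAIL simply stops at the first error without addressing whether earlier records are rolled back.

```python
FIELDS = ("name", "label")  # hypothetical record fields


def import_items(existing, items, create_strategy="INSERT",
                 update_strategy="UPDATE", import_strategy="ON_ERROR_FAIL"):
    """Toy model of the strategies; returns (resulting records, error messages)."""
    store = {key: dict(record) for key, record in existing.items()}
    errors = []
    for item in items:
        key = item["key"]
        if key not in store:
            # New record: created under both INSERT and UPSERT.
            store[key] = {field: item.get(field) for field in FIELDS}
        elif create_strategy == "INSERT":
            # INSERT treats an existing record as an error.
            errors.append(f"duplicate record: {key}")
            if import_strategy == "ON_ERROR_FAIL":
                break  # stop at the first error
        elif update_strategy == "UPDATE":
            # Full overwrite: fields missing from the item become None.
            store[key] = {field: item.get(field) for field in FIELDS}
        else:
            # PARTIAL_UPDATE: only fields present in the item are changed.
            store[key].update({f: item[f] for f in FIELDS if f in item})
    return store, errors


existing = {"A": {"name": "Alpha", "label": "First"}}
incoming = [
    {"key": "A", "name": "Alpha 2"},
    {"key": "B", "name": "Beta", "label": "Second"},
]

print(import_items(existing, incoming))
print(import_items(existing, incoming, import_strategy="ON_ERROR_CONTINUE"))
print(import_items(existing, incoming, "UPSERT", "UPDATE"))
print(import_items(existing, incoming, "UPSERT", "PARTIAL_UPDATE"))
```

In this model, the UPSERT and UPDATE combination clears the label of record A because the incoming item omits it, while PARTIAL_UPDATE preserves it. That difference is worth checking before importing files that carry only a subset of fields.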

Data Block

The data block contains a list of items, where each item represents a data record to be created or updated. The processing defined by the configuration block relies on this items array as its data source. Therefore, it's crucial that this array includes complete definitions for all data records, encompassing attributes, relationships, and any other relevant information. Within the object-definition.batch-engine-data.json example, below the configuration block, the items array contains a definition for one object (named Sample):

"items": [
    {
        ...
        "label": {
            "en_US": "Sample"
        },
        "name": "Sample",
        "objectActions": [
            {
                "active": true,
                "conditionExpression": "",
                "dateCreated": "2023-03-03T15:10:15.037+00:00",
                "dateModified": "2023-03-03T15:10:15.037+00:00",
                "description": "",
                "errorMessage": {
                },
                "label": {
                    "en_US": "Sample Node Update 1"
                },
                ...

Instead of manually creating information within the data block, leverage Liferay's available tools to export existing definitions. Though the JSON files for exported models will not include the configuration block or the items array by default, you can either augment the exported schema with these elements or incorporate it into a pre-existing file structure that already contains them.
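
Augmenting an exported definition can be scripted. The following is a minimal sketch that wraps one or more exported definitions in the configuration block and items array shown earlier. The build_payload function and the placeholder definition are hypothetical; the configuration keys and values are copied from the example above, and a real export would contain many more fields.

```python
import json


def build_payload(exported_definitions, class_name, create_strategy="UPSERT",
                  import_strategy="ON_ERROR_FAIL", update_strategy="UPDATE"):
    """Wrap exported definitions in a configuration block and an items array."""
    return {
        "configuration": {
            "className": class_name,
            "parameters": {
                "containsHeaders": "true",
                "createStrategy": create_strategy,
                "importStrategy": import_strategy,
                "updateStrategy": update_strategy,
            },
            "taskItemDelegateName": "DEFAULT",
        },
        "items": list(exported_definitions),
    }


# Placeholder standing in for an exported object definition (assumption).
exported = {"label": {"en_US": "Sample"}, "name": "Sample"}

payload = build_payload(
    [exported],
    "com.liferay.object.admin.rest.dto.v1_0.ObjectDefinition",
)
print(json.dumps(payload, indent=4))
```

The printed JSON could then be saved into the folder referenced by the assemble block, using a file name that sorts into the intended processing position.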

Maintaining External Reference Codes

Items within the data block will frequently contain an external reference code (ERC), which many of Liferay’s headless APIs support. An ERC is a unique, human-readable key used to track and manage an entity across systems. Because other values such as structure IDs may vary between environments, ERCs enable consistent migration. Before migrating data, it’s essential to set and maintain descriptive ERCs within the Liferay UI for your environment’s objects and other assets.
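
A quick pre-flight check can confirm that a payload's items carry usable ERCs before deployment. This is a minimal sketch under two assumptions: that each item stores its ERC in an externalReferenceCode property, and that ERCs should be unique within a single file. The check_external_reference_codes function and the sample items are hypothetical.

```python
from collections import Counter


def check_external_reference_codes(items):
    """Return (indexes of items without an ERC, ERC values used more than once)."""
    missing = [
        index for index, item in enumerate(items)
        if not item.get("externalReferenceCode")
    ]
    counts = Counter(
        item["externalReferenceCode"]
        for item in items
        if item.get("externalReferenceCode")
    )
    duplicates = sorted(code for code, count in counts.items() if count > 1)
    return missing, duplicates


# Hypothetical items; real payload items contain many more fields.
items = [
    {"externalReferenceCode": "SAMPLE_DEFINITION", "name": "Sample"},
    {"name": "NoCode"},
    {"externalReferenceCode": "SAMPLE_DEFINITION", "name": "SampleCopy"},
]

missing, duplicates = check_external_reference_codes(items)
print("Items missing an ERC:", missing)
print("Duplicated ERCs:", duplicates)
```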

Troubleshooting Data Migration with Batch Client Extensions

When working with batch client extensions, the following strategies can assist with troubleshooting:

  • Check Compatibility: Ensure all resources inside the client extension’s JSON files match the schema, capabilities, and Liferay version of the target system.
  • Evaluate Feature Flags: Be aware of any enabled feature flags that may impact client extension behavior.
  • Pre-populate Scopes: To check for suspected conflicts with scope requests, access http://localhost:8080/o/api to populate all objects inside Liferay’s memory before deploying client extensions.

Conclusion

Batch client extensions offer significant advantages over manual methods by automating data migration. By understanding data migration concepts and batch client extension components, you'll be able to streamline the process of migrating object definitions, picklists, workflow definitions, and more between environments. Next, you’ll build and deploy a batch client extension from Clarity’s exported distributor management definitions.
