Knowledge Base
Published Jun. 30, 2025

HTML Parsing issues when reindexing journal articles after upgrading to Quarterly Release (Jericho)

Written By

Ahmed Abdin

How To articles are not official guidelines or officially supported documentation. They are community-contributed content and may not always reflect the latest updates to Liferay DXP. We welcome your feedback to improve How To articles!

While we make every effort to ensure this Knowledge Base is accurate, it may not always reflect the most recent updates or official guidelines. We appreciate your understanding and encourage you to reach out with any feedback or concerns.

Issue

  • I upgraded my instance to Quarterly Release. I performed a full reindex and received several errors in the log:
ERROR [default-26][jericho:211] StartTag a at (r17,c4,p1100) contains attribute name with invalid character at position (r17,c97,p1193)

Environment

  • Liferay DXP 7.0, 7.1, 7.2, 7.3, 7.4
  • Liferay DXP Quarterly Releases

Resolution

  • This is a known issue caused by the upgrade of Jericho to version 3.4 in LPS-193534. The new version changed the log level of all parsing errors from INFO to ERROR; please see the release note.
  • To resolve the issue, please add the attached XML file "com.liferay.portal.html.parser.impl-log4j-ext.xml" to your [LIFERAY_HOME]/osgi/log4j folder. This sets the log level of the Jericho HTML parser category "net.htmlparser.jericho" to FATAL, so the errors no longer appear in the log while reindexing.
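
The attached file is not reproduced in this article. As a rough sketch only, a log4j-ext file that raises the "net.htmlparser.jericho" category to FATAL might look like the following. It assumes the Log4j 2 XML format used by Liferay DXP 7.4 and the Quarterly Releases; older versions based on Log4j 1 use a different syntax, so prefer the attached file where available.

```xml
<?xml version="1.0"?>

<!-- Assumed contents of com.liferay.portal.html.parser.impl-log4j-ext.xml (Log4j 2 syntax). -->
<Configuration strict="true">
	<Loggers>
		<!-- Only FATAL messages from the Jericho HTML parser are logged; parsing errors are suppressed. -->
		<Logger level="FATAL" name="net.htmlparser.jericho" />
	</Loggers>
</Configuration>
```

Note that this only hides the messages. The journal article content that triggers them (for example, a tag with an invalid character in an attribute name) is unchanged.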