HTML Parsing issues when reindexing journal articles after upgrading to Quarterly Release (Jericho)
How To articles are not official guidelines or officially
supported documentation. They are community-contributed content and may
not always reflect the latest updates to Liferay DXP. We welcome your
feedback to improve How To articles!
While we make every effort to ensure this Knowledge Base is accurate,
it may not always reflect the most recent updates or official
guidelines. We appreciate your understanding and encourage you to reach
out with any feedback or concerns.
Legacy Article
You are viewing an article from our legacy
"FastTrack" publication program, made available for
informational purposes. Articles in this program were published without a
requirement for independent editing or verification and are provided
"as is" without guarantee.
Before using any information from this article, independently verify
its suitability for your situation and project.
Issue
- I upgraded my instance to a Quarterly Release, performed a full reindex, and received several errors like the following in the log:
ERROR [default-26][jericho:211] StartTag a at (r17,c4,p1100) contains attribute name with invalid character at position (r17,c97,p1193)
Environment
- Liferay DXP 7.0, 7.1, 7.2, 7.3, 7.4
- Liferay DXP Quarterly Releases
Resolution
- This is a known issue caused by the upgrade of the Jericho library to version 3.4 (LPS-193534). That version changed the log level of all parsing errors from INFO to ERROR; please see the release note.
- To resolve the issue, add the attached XML file "com.liferay.portal.html.parser.impl-log4j-ext.xml" to your [LIFERAY_HOME]/osgi/log4j folder. This sets the log level of the Jericho HTML parser category "net.htmlparser.jericho" to FATAL, so the errors no longer appear while reindexing.
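For reference, a file of this kind might look roughly like the sketch below. It is an illustration only, not the attached file itself: it assumes a release that uses Log4j 2 (such as DXP 7.4 and the Quarterly Releases) and the same Configuration/Loggers layout used by Liferay's other log4j-ext.xml overrides. Older versions based on Log4j 1.x use a different XML syntax, so prefer the attached file where it is available.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes Log4j 2 syntax; the attached file is authoritative. -->
<Configuration strict="true">
	<Loggers>
		<!-- Raise the threshold for the Jericho parser category so that
		     ERROR-level parsing messages are no longer written to the log. -->
		<Logger level="FATAL" name="net.htmlparser.jericho" />
	</Loggers>
</Configuration>
```

Note that this only hides the messages; the HTML in the affected journal articles is still parsed the same way as before.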