We're using HCP to archive e-mails and HCI to index their metadata. The metadata is in JSON format and is uploaded in the same process as the object.
The metadata contains fields from the e-mail header, which are processed by HCI using an HCP MQE connector.
I recently saw a couple of Java NullPointerExceptions in the Content Class Extraction stage of the pipeline. They occur if one of the fields (mostly Subject) consists only of semicolons.
1. It happens when the field value is ';' or ';;'. Values of three or more semicolons have not been tested.
2. It does not happen if the field is empty.
3. It does not happen if the field has a leading semicolon*.
4. It does not happen if the field has a trailing semicolon*.
5. It does not happen if the field has multiple semicolons along with other text.
6. It only happens in fields that are processed by Content Class Extraction (other fields may contain only a semicolon or two, but these don't throw an NPE).
*) Note: I have not seen data that has both a leading and a trailing semicolon.
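One possible explanation, and this is only a guess since I have no insight into the HCI code: the pattern above matches how Java's String.split behaves when it drops trailing empty strings. A value made up only of semicolons splits into zero tokens, while an empty value still yields one (empty) token. The sketch below is plain Java and not HCI code; the class name and the firstToken helper are hypothetical and only illustrate how code that expects at least one token could end up with a null.

```java
import java.util.Arrays;

class SemicolonSplitDemo {
    // Hypothetical helper (not from HCI): returns the first token of a
    // semicolon-separated value, or null when the split yields no tokens.
    static String firstToken(String value) {
        String[] parts = value.split(";");
        return parts.length > 0 ? parts[0] : null;
    }

    public static void main(String[] args) {
        String[] samples = {"", ";", ";;", ";abc", "abc;", "a;;b"};
        for (String s : samples) {
            // String.split with the default limit removes trailing empty
            // strings, so ";" and ";;" produce an empty array.
            String[] parts = s.split(";");
            System.out.println("\"" + s + "\" -> " + parts.length
                    + " token(s) " + Arrays.toString(parts)
                    + ", firstToken=" + firstToken(s));
        }
    }
}
```

In this sketch only ';' and ';;' produce zero tokens and a null first token; the empty value and the values with a leading or trailing semicolon all produce at least one token. Whether Content Class Extraction actually splits on semicolons is an assumption on my part.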
There are three such exceptions captured so far. The third is different and needs to be considered separately. Should this be handled in a service request, or can we discuss this in the community?