Hitachi Content Platform

 Check if file exists before Attach Stream stage

  • Object Storage
  • Hitachi Content Intelligence HCI
Eduardo Javier Huerta Yero posted 09-11-2019 18:41

A customer has created a pipeline that reads from a DB and, based on a table field, looks for an image file through a CIFS connection using the Attach Stream stage. When the image file does not exist, the Attach Stream stage generates an error and the document is no longer processed.

 

What we would like to do is to use a default image when the original file does not exist. For that purpose, we must have a way to discover if the file exists or not. There is no supported action in the CIFS connection for that.

 

Any thoughts are welcome

 

Thanks


#HitachiContentIntelligenceHCI
Jared Cohen

@Eduardo Javier Huerta Yero, can you determine the existence of the file beforehand, from the DB lookup/table info? If so, just put the Attach Stream stage within a conditional.

 

If not, you would need a custom stage (or custom action on a connector) to look up the existence of the file and then create a field value to base your conditional on.

 

Additionally, you can enable the workflow setting "Continue Processing Failed Documents" (added in 1.4) which will allow documents to continue through the pipeline after a failure in a stage, and will tag the failure reason as metadata, which you could base your conditional on.

 

Hope this helps,

-Jared

Eduardo Javier Huerta Yero

Hi @Jared Cohen, thanks for the great answer. Yes, my first approach was to determine whether the file existed before the Attach Stream stage. Since the CIFS connector does not have an action for this, I switched to the LFS connector and developed a JavaScript stage that checks for the file. This solution works (I can share the stage code if anybody is interested).

 

However, I wasn't aware of the "Continue Processing Failed Documents" setting for Workflows, which opens up a world of possibilities. Thanks for bringing this to my attention.

 

Cheers,

 

Eduardo

Alexander Dubnov

Hi @Eduardo Javier Huerta Yero, could you please share the stage code? I'm working on something similar. I need to check whether a pair of files with the same name but different extensions both exist before processing them.

 

Thanks,

Alex

Eduardo Javier Huerta Yero

Hi Alexander,

 

Here you go:

 

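function transformDocument(callback, document) {
  // Java classes used by the stage
  var Files = Java.type("java.nio.file.Files");
  var Paths = Java.type("java.nio.file.Paths");
  var BooleanDocumentFieldValue = Java.type("com.hds.ensemble.sdk.model.BooleanDocumentFieldValue");

  // Start from a copy of the incoming document
  var builder = callback.documentBuilder().copy(document);

  // "fullPath" must hold the path of the file to check
  var fullPath = document.getStringMetadataValue("fullPath");
  // A missing fullPath counts as "file does not exist" instead of failing in Paths.get
  var exists = fullPath != null && Files.exists(Paths.get(fullPath));

  // Store the result so a later conditional can test "fileExists"
  builder.setMetadata("fileExists", BooleanDocumentFieldValue.builder().setBoolean(exists).build());
  return builder.build();
}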
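
For the case Alexander describes (two files with the same name but different extensions), the same stage could probably be adapted along these lines. This is only a sketch and has not been run in HCI: the withExtension helper, the "jpg"/"xml" extension pair, and the "pairExists" field name are all made up for illustration, and it assumes "fullPath" points at one of the two files.

```javascript
// Hypothetical helper: replace (or append) the extension of a path.
// Handles both "/" and "\" separators and ignores dots in directory names.
function withExtension(path, ext) {
  var slash = Math.max(path.lastIndexOf("/"), path.lastIndexOf("\\"));
  var dot = path.lastIndexOf(".");
  // Only strip the extension when the dot belongs to the file name itself
  var base = dot > slash + 1 ? path.substring(0, dot) : path;
  return base + "." + ext;
}

function transformDocument(callback, document) {
  var Files = Java.type("java.nio.file.Files");
  var Paths = Java.type("java.nio.file.Paths");
  var BooleanDocumentFieldValue = Java.type("com.hds.ensemble.sdk.model.BooleanDocumentFieldValue");

  var builder = callback.documentBuilder().copy(document);
  var fullPath = document.getStringMetadataValue("fullPath");

  // Assumed extension pair; adjust to the real file types
  var extensions = ["jpg", "xml"];

  // True only if fullPath is present and every sibling file exists
  var allExist = fullPath != null;
  for (var i = 0; allExist && i < extensions.length; i++) {
    var candidate = withExtension(String(fullPath), extensions[i]);
    allExist = Files.exists(Paths.get(candidate));
  }

  builder.setMetadata("pairExists", BooleanDocumentFieldValue.builder().setBoolean(allExist).build());
  return builder.build();
}
```

A conditional placed after this stage could then test "pairExists" in the same way as "fileExists" above.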

 

Alexander Dubnov

Thanks Eduardo!