Hitachi Content Platform

 Workflow output write to file vs output

Christopher Rohland posted 06-10-2019 19:56

I want to be able to open a file, search for a word in the file, replace a word in the file, and save the file with the replacement. 

The last part is where I am stuck: using write to file does not modify the original source file at all, and output does not create a new file with the replacement word either.

I am not indexing, and I am not using HCP. My data connectors work, and my pipeline performs all stages correctly. It seems as if I need a stage after Replace in my pipeline that takes the output of Replace and sends it to HCI_content, so that the change is written either to the original source (writefile) or to a new target (output file).

Suggestions?


Data Conversion

What does your Replace stage operate on? I presume it's a document field, and I also presume you have a stage earlier in the pipeline that copies the HCI_content stream into that field. If those assumptions are correct, then yes, you need a stage that copies the modified field back into a stream so that the writeFile action can write it out to the file.
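As a rough illustration of that round trip, here is a plain Python model of a document with streams and fields. The helper names and the "text" field are hypothetical; this is not the HCI plugin API or a real stage configuration, only a sketch of the data flow.

```python
# Hypothetical in-memory model of a pipeline document (NOT the HCI API).
# "streams" hold raw bytes; "fields" hold values that stages can edit.

def stream_to_field(doc, stream, field):
    """Copy a stream's bytes into a text field."""
    doc["fields"][field] = doc["streams"][stream].decode("utf-8")

def replace_in_field(doc, field, old, new):
    """Replace text in a field; the stream is left untouched."""
    doc["fields"][field] = doc["fields"][field].replace(old, new)

def field_to_stream(doc, field, stream):
    """Copy the (modified) field back into a stream for a write action."""
    doc["streams"][stream] = doc["fields"][field].encode("utf-8")

doc = {"streams": {"HCI_content": b"the quick brown fox"}, "fields": {}}

stream_to_field(doc, "HCI_content", "text")
replace_in_field(doc, "text", "quick", "slow")

# At this point only the field has changed; the stream still has the old bytes.
before = doc["streams"]["HCI_content"]

field_to_stream(doc, "text", "HCI_content")
after = doc["streams"]["HCI_content"]
```

In this model, a write action that reads the stream would only see the replacement after the final copy-back step.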

Christopher Rohland

Replace is operating on HCI_snippet. 

My pipeline stages, in order:

  • MIME
  • Text/MetaData
  • Snippet Extraction
  • Replace on Snippet Extraction field name/value

So if Replace is triggered by a credit card number and the replacement is xxxxxxx, how do I use another stage to direct what I believe is now HCI_snippet (containing xxxxxxx) into HCI_content?

Data Conversion

Well, you are replacing text in a snippet extracted from the file, not in the file itself. The snippet may represent only a small part of the file content. If you write the snippet with the replacement back, you will likely lose a large portion of the original file. And if the original file was anything other than .txt (e.g. a PDF or a Word document), you will lose that format entirely.
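To make the difference concrete, here is a small Python sketch comparing a replacement done on a snippet with one done on the full text. The card pattern, the `extract_snippet` helper, and the sample text are assumptions made up for illustration; they do not reflect how HCI's Snippet Extraction stage actually works.

```python
import re

# Assumed pattern: 16 digits in groups of four, optionally separated
# by a space or hyphen. Real card detection is more involved.
CARD_RE = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")

def extract_snippet(text, match, context=10):
    """Hypothetical snippet: the match plus a few characters either side."""
    start = max(0, match.start() - context)
    end = min(len(text), match.end() + context)
    return text[start:end]

original = ("Invoice 17 for ACME Corp. Card: 4111 1111 1111 1111. "
            "Ship to 5 Main St, Springfield.")

match = CARD_RE.search(original)
snippet = extract_snippet(original, match)

# Replacing in the snippet masks the number but keeps only the snippet text.
masked_snippet = CARD_RE.sub("xxxxxxx", snippet)

# Replacing in the full text masks the number and keeps everything else.
masked_full = CARD_RE.sub("xxxxxxx", original)

print(masked_snippet)
print(masked_full)
```

Writing `masked_snippet` back as the file would discard the invoice and shipping details, while `masked_full` keeps them. This only covers plain text; it does nothing to preserve a binary format such as PDF.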

You can use a stage to copy a field into a stream and then give that stream to the write action. But as I said above, I doubt this is what you are after.

Troy Myers

You can either modify the Replace stage or work with the Append stage, but either way that is only half the battle. Once you insert or replace a word, it typically breaks the layout of the document. We have been experimenting with this and can get it working on typical office files (Word, Excel), but not on everything.

Christopher Rohland

Hi Yury,

You are correct, that is not what I was after at all. I do appreciate your explanation, though. This is one of the reasons I asked the question: I wasn't sure whether it was even possible to do what I wanted with HCI.

Thank you for your responses.

Christopher Rohland

Hi Troy,

I've actually been using your PII workflow to get this far, and it has certainly helped accelerate my understanding of HCI. Have examples of the "working" ones been posted anywhere, and is it anticipated that this will work for more file types down the road?

Thanks