Thanks Jordan. I am struggling to implement a data migration mechanism that puts content onto HCP with a variable folder structure in which each folder holds only a fixed number of entries. In the simplest example, I need to change the folder structure once a certain number of objects have been written: start with Folder_001, then when it reaches 10,000 items, switch to Folder_002.
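To be concrete, the naming logic itself is just counter arithmetic. A minimal sketch in plain Java (the class and method names here are mine, purely for illustration):

```java
// Illustrative only: derive a folder name from a running item count,
// rolling over to a new folder every 10,000 objects.
public final class FolderNamer {
    private static final long ITEMS_PER_FOLDER = 10_000;

    // itemIndex is 1-based: items 1..10000 -> Folder_001,
    // 10001..20000 -> Folder_002, and so on.
    public static String folderFor(long itemIndex) {
        long folderNumber = ((itemIndex - 1) / ITEMS_PER_FOLDER) + 1;
        return String.format("Folder_%03d", folderNumber);
    }
}
```

The hard part is not the arithmetic, it is where the running count lives, which is what the rest of this post is about.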
What I am contemplating is a pre-processing pipeline with a step that starts at an initial value of 1 and increments the value for every document that passes through. The most efficient approach would be to store the value in the stage session area and increment the in-memory value on every call. From what I can tell, this seems to work pretty well in the pre-processing workflow. However, I am not really sure when "instances" of the step are created, so I cannot be sure the behavior is predictable.
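Roughly, the step-level state I am imagining looks like this (again plain Java for illustration, not the actual SDK API):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the per-step state I have in mind: the counter lives in memory
// for the lifetime of the step instance/session. Names are mine, not the SDK's.
public final class CountingStage {
    private final AtomicLong counter = new AtomicLong(0);

    // Called once per document that passes through this step instance.
    public long onDocument() {
        return counter.incrementAndGet();
    }
}
```

The obvious catch is that each step instance keeps its own counter, which is exactly why I am asking about instance creation below.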
I am still going to take the external database approach for most things, but I need to consider whether I can effectively do any caching in a stage session. I would hate to have to update a database every time the counter is incremented.
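One middle ground I am considering is block allocation: reserve a range of counter values from the database in a single round-trip and hand them out from memory, so the database is touched once per block instead of once per document. A sketch, with an AtomicLong standing in as a hypothetical stand-in for the real DB sequence:

```java
import java.util.concurrent.atomic.AtomicLong;

// Block allocation sketch: one "DB" round-trip per BLOCK_SIZE increments.
// The AtomicLong is a stand-in for the database sequence; in practice this
// would be something like an atomic UPDATE on a counters row.
public final class BlockCounter {
    private static final long BLOCK_SIZE = 1_000;

    private final AtomicLong dbSequence = new AtomicLong(0); // stand-in for the DB
    private long next;      // next value to hand out
    private long blockEnd;  // exclusive end of the reserved block

    public synchronized long nextValue() {
        if (next >= blockEnd) {
            // Reserve the next block in one round-trip.
            blockEnd = dbSequence.addAndGet(BLOCK_SIZE);
            next = blockEnd - BLOCK_SIZE;
        }
        return next++;
    }
}
```

A nice property of this scheme is that if the process restarts, at most one block's worth of values is skipped and none are ever reused, which is harmless for folder rollover.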
So for session behavior, my guess is that separate instances/sessions would be created as follows:
1) One for each occurrence of the step in a processing pipeline. So if the step appears in a pipeline twice, there will be a separate "session" for each occurrence.
2) One for each parallel job, multiplying the "N" occurrences from item 1 above. So if there are 2 occurrences in a pipeline and 2 parallel jobs, there would be a total of 4 instances of the step, each with a separate session; see the toy illustration after this list.
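If those guesses are right, the practical consequence for my counter is that each instance counts from 1 independently, so a shared folder could collect a multiple of 10,000 items before any single instance rolls over. A self-contained toy illustration:

```java
// Toy illustration of the hazard I am worried about: two independent step
// instances (e.g. one per parallel job), each with its own in-memory counter,
// both route their early documents to Folder_001.
public final class TwoInstancesDemo {
    static String folderFor(long itemIndex) {
        return String.format("Folder_%03d", ((itemIndex - 1) / 10_000) + 1);
    }

    public static void main(String[] args) {
        long counterA = 0; // counter inside instance 1 (parallel job 1)
        long counterB = 0; // counter inside instance 2 (parallel job 2)
        System.out.println(folderFor(++counterA)); // Folder_001
        System.out.println(folderFor(++counterB)); // Folder_001 again
        // Together the two instances could put up to 20,000 items in
        // Folder_001 before either one rolls over.
    }
}
```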
Remaining questions:
1) What is the lifecycle of a session? Is it destroyed and recreated from scratch when a task is paused, stopped, or sleeps between "Check for updates" runs? At any other times? I am assuming that when the task starts back up, new sessions are created, so in this example the counter would restart from 1?
2) Does the task performance configuration for parallel jobs affect pipelines configured for pre-processing execution, or does it only affect workflow agent mode?