We are doing a lot of testing with ~6.8 million emails. We have noticed that when we clear a workflow, it shows pending for 20 minutes or more. Is anyone else seeing similar results with data sets this size or larger?
Frank, we ended up rebooting the system, which corrected the issue.
This isn't enough information for us to get a good picture of what you are doing. At what point are you clearing the workflow? We need to know whether it was cleared when it was in the Pending, Running, or Completed state. If it was running, how many documents had it processed thus far? What is your batch size set to? What do your connectors look like in the workflow: how many are there, what type are they, and how are they configured?
The workflow is completed. We let it run overnight, then come in and look at the data in the index in the morning. When we go to clear it out, that is where it goes to pending. For batch size, processing is 20,000 and output is 100.
Thank you, this information helps. We will investigate this to see if we experience similar behavior.
Frank, it is crawling when we are running a workflow, so we are restarting the system / bouncing all the nodes (14).
Now you have me confused... this contradicts your earlier comment that the workflow is in a completed state when you clear it.
Is the workflow in state running or completed when you hit clear workflow?
Additionally, are you restarting or bouncing nodes after you hit clear workflow?
Frank, our typical testing workflow is this: we come in, look at and play with the index, etc. We eventually clear it out and then run the workflow with any changes we have made to the pipelines, etc.
We cleared it out this morning, which took ~20 minutes or so. We started the workflow after that to get results from some tweaks that were made. After ~1 hour we had only indexed a small percentage of our data set. At that point we stopped the workflow and decided to reboot the system.
That is the reason for the original question. If someone says clearing a workflow with a comparable data set takes 30 seconds, we would know this is not normal. In the halo lab it takes a couple of seconds, but the data set there is super small, so it is not apples to apples.
I just ran "Clear Workflow Task" on a workflow that had indexed over 3 million documents and was in the "Running" state. The workflow metrics took around 20 seconds or less to clear out. I did, however, have "Collect Discovery Metrics" turned off within the Task Settings, as a production index most likely does not require that information.
I would suggest disabling that option the next time you run your workflow. That will minimize the amount of metrics you are gathering and should speed up clearing the workflow metrics as well.
Additionally, after you clear, make sure to refresh the page or swap pages to ensure that you are always getting the most up to date workflow state.
EDIT: Just tried it again with "Collect Discovery Metrics" turned on and 9 million documents indexed. From the Completed state, it took less than 3 seconds to clear the workflow task.
Let us know how that goes.
List-based connector plugins may take a bit longer to clear out than change-based connector plugins, as well.
Any update? Are you still experiencing this behavior?