Pentaho

 Capture Groups from Regex Evaluation

Mike White posted 06-29-2020 11:23

Hi guys, after some help please.

 

I'm trying to use a Regex Evaluation step to split out some values from a string of text.

 

I have some regex which works fine at regex101.com and gives me exactly the results I'm after:

 

^(?:.*)(<SiteClassificationCode>)(.+)(<\/SiteClassificationCode>)(?:.*)$

 

 

However, when running in Spoon it consistently fails to match and therefore generates no capture groups.

 

My simple 3-step test KTR file is attached. I have the feeling I'm missing the obvious :)

 


#Kettle
#Pentaho
#PentahoDataIntegrationPDI
Ana Gonzalez

The problem is that Pentaho uses the "Java flavor" of regex. I use regex101 myself when I need to test my regex because, not having much regex knowledge, I find its explanation and cheat sheet very useful, but sometimes there are Java-specific quirks that mean you need to modify the pattern.

Unfortunately I can't help you more, but I would try some Java-specific forums to work out what's not working there.
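
One way to narrow this down is to run the pattern against Java's own regex engine outside Spoon. The sketch below is only an illustration: the class name, the helper `extractCode`, and the sample input are made up, and it assumes the step needs the whole field to match (hence `matches()`), which the thread does not confirm.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCheck {
    // Pattern from the original post, with backslashes doubled for a Java string literal.
    static final String REGEX =
        "^(?:.*)(<SiteClassificationCode>)(.+)(<\\/SiteClassificationCode>)(?:.*)$";

    // Returns the text of capture group 2, or null when the whole input does not match.
    static String extractCode(String input) {
        Matcher m = Pattern.compile(REGEX).matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the real data from the thread is not shown.
        String sample = "<Site><SiteClassificationCode>AB12</SiteClassificationCode></Site>";
        System.out.println(extractCode(sample));
    }
}
```

If the pattern matches here but not in Spoon, the difference is more likely in what was pasted into the step than in the regex flavor.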

A suggestion: I wasn't able to open your KTR at first because it contains a bunch of master-slave details and connections that need a specific jar, so I had to strip that information out with a text editor. Also, your Text Output step had the path hardcoded. When you upload examples, it's better to use the ${Internal.Entry.Current.Directory} variable so the file is generated in the same directory as your KTR file.

Regards

Brandon Jackson

You had an extra character after the $ in your statement. It was an invisible character.
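
A rough sketch of how a stray character like this can be spotted and why it breaks the match. The thread does not say which character it was, so the zero-width space (U+200B) used here is an assumption, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class InvisibleCharCheck {
    // Lists every character outside printable ASCII as "index:U+XXXX",
    // separated by spaces; returns "" when the string contains none.
    static String findSuspectChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(i).append(":U+").append(String.format("%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String clean =
            "^(?:.*)(<SiteClassificationCode>)(.+)(<\\/SiteClassificationCode>)(?:.*)$";
        // Assumed culprit: a zero-width space pasted after the closing $.
        String pasted = clean + "\u200B";
        // Hypothetical sample input.
        String input = "<SiteClassificationCode>AB12</SiteClassificationCode>";

        System.out.println(Pattern.compile(clean).matcher(input).matches());
        System.out.println(Pattern.compile(pasted).matcher(input).matches());
        System.out.println(findSuspectChars(pasted));
    }
}
```

The trailing character is treated as a literal that must appear after the end of the input, so the pattern can never match. Retyping the expression by hand in the step, rather than pasting it, avoids carrying such characters over.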

Mike White

Cheers Brandon,

 

Sometimes it's just staring you in the face :)