This is bad. So you have two options.
1. Clean the files before they get to PDI
See: "Remove non-printable ASCII characters from a file with this Unix command" (alvinalexander.com)
2. Use a step like "Load file into memory" and set the "File content" field type to "Binary", then follow up with a JavaScript or UDJC (User Defined Java Class) step and strip the character out in code.
The real bummer here is that the character is essentially a non-printable byte sequence, probably "0001", so the XML step, and frankly any other step that tries to represent it as a Java String, is going to have a problem with it. That's why the step bombs out. So load the file as a stream of bytes in PDI and process it in code, something along the lines of this link:
Binary to Ascii Conversion
Along the way you can kick out the bytes you consider invalid, and then you are good to go.
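To make the byte-stripping idea concrete, here is a minimal sketch in plain Java. The class and method names (ControlByteFilter, stripControlBytes) are made up for illustration and are not part of PDI. It assumes a single-byte encoding or UTF-8, where every byte below 0x20 really is a control character, so it is not safe to run directly on UTF-16 data.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not a PDI class: drops control bytes from raw file content.
final class ControlByteFilter {

    // Returns a copy of the input without bytes below 0x20,
    // except tab (0x09), line feed (0x0A) and carriage return (0x0D).
    // Assumption: the data is ASCII, ISO-8859-x or UTF-8, not UTF-16.
    static byte[] stripControlBytes(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        for (byte b : input) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean allowedWhitespace = v == 0x09 || v == 0x0A || v == 0x0D;
            if (v >= 0x20 || allowedWhitespace) {
                out.write(v);
            }
        }
        return out.toByteArray();
    }
}
```

In a UDJC step you would call something like this on the binary field and write the cleaned bytes to a new field before handing them to the XML step. How you wire the fields is up to your transformation.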
I noticed that your XML files declare UTF-16 and that a Latin language with special characters may also be involved. So you'd have to define all the byte combinations that trip up your file and are not printable from a UTF-16 perspective.
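For the UTF-16 case, one possible approach (an assumption on my part, not something I have run against your files) is to decode the bytes as UTF-16 first and then drop every code point that XML 1.0 does not allow, instead of listing byte combinations by hand. The names below (Utf16XmlCleaner, clean, isValidXmlChar) are hypothetical. The allowed ranges come from the Char production in the XML 1.0 spec, so accented Latin characters are kept.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a PDI class: decodes UTF-16 and removes
// code points that are not legal in an XML 1.0 document.
final class Utf16XmlCleaner {

    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Assumption: the bytes are UTF-16; a leading byte order mark is
    // honored by the decoder, otherwise big-endian is assumed.
    static String clean(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_16);
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return sb.toString();
    }
}
```

The cleaned String can then be passed on to the XML step. If your files turn out to be little-endian without a byte order mark, you would need StandardCharsets.UTF_16LE instead.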
Hope that helps.