Remove unprintable characters from a file

From Stack Overflow

From :

Remove the garbage characters with the Unix tr command

There are several ways to get the binary characters out of your files. Probably the easiest is the Unix tr command. Here's all you have to do to remove non-printable binary characters (garbage) from a Unix text file:

tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file

This command combines the -c and -d options of tr: -c takes the complement of the listed set, and -d deletes every character in that complement. The net effect is to remove every character from the input stream except the ASCII octal values shown between the single quotes. Specifically, the following characters pass through this Unix filter:

octal 11: tab
octal 12: linefeed
octal 15: carriage return
octal 40 through octal 176: the printable ASCII characters, from space through tilde (~)

All the other binary characters -- the "garbage" characters in your file -- are stripped out during this translation process.
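
To check the filter before running it on real data, a small test along these lines may help. It is only a sketch: the sample bytes and the variable names (sample, clean) are made up for illustration, and setting LC_ALL=C is an assumption added here so that tr treats the input as single bytes. Some tr implementations complain about invalid byte sequences in a UTF-8 locale.

```shell
# Hypothetical temporary files for the demonstration.
sample=$(mktemp)
clean=$(mktemp)

# Write a sample line containing three unwanted bytes:
# a NUL (octal 000), a BEL (octal 007), and a high-bit byte (octal 200).
printf 'abc\000\007def\200\tghi\r\n' > "$sample"

# Same filter as above: keep tab, linefeed, carriage return,
# and the printable range from space (octal 40) to tilde (octal 176).
# LC_ALL=C is an assumption, used to force byte-wise processing.
LC_ALL=C tr -cd '\11\12\15\40-\176' < "$sample" > "$clean"

# Dump the result byte by byte to confirm what survived.
od -c "$clean"
```

In the od output, the tab, carriage return, and linefeed should still be present, while the NUL, BEL, and octal 200 bytes should be gone. Note that this filter also removes every byte of any multi-byte UTF-8 character, so it is best suited to files that are meant to be plain ASCII.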