Regular Expression to Strip Non-ASCII Characters from Text

By Abhijit Ghatnekar
Most websites today provide textareas in the form of editors, and people copy and paste text into them from all sorts of text editors. Editors like MS Word introduce unwanted special characters into the text, and these go into the back end along with it. The result can be baffling: the text appears valid, yet it fails the validation mechanisms of the server-side web application.

To strip these off, use the following regex:

$output = preg_replace('/[^\x20-\x7F]+/', '', $output);

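This strips every character outside the range \x20-\x7F: all non-ASCII characters, and also ASCII control characters such as tabs and newlines.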

This is PHP, but it can be translated into any equivalent web scripting language.
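As a rough sketch of how the regex might be wrapped for reuse, the snippet below puts it in a small helper. The function name `strip_non_ascii` and the `$keepWhitespace` flag are illustrative assumptions, not part of the original post; the flag simply adds tab, line feed and carriage return to the allowed set, which can be useful for multi-line textarea input.

```php
<?php
// Hypothetical helper: the name and the $keepWhitespace flag are
// assumptions made for illustration only.
function strip_non_ascii(string $text, bool $keepWhitespace = false): string
{
    // \x20-\x7F is the range used in the post; the alternative pattern
    // additionally allows tab (\x09), LF (\x0A) and CR (\x0D).
    $pattern = $keepWhitespace
        ? '/[^\x09\x0A\x0D\x20-\x7F]+/'
        : '/[^\x20-\x7F]+/';

    // preg_replace returns null on failure; fall back to the input then.
    return preg_replace($pattern, '', $text) ?? $text;
}

// Curly quotes and apostrophe of the kind pasted from a word processor.
$input = "It\u{2019}s a \u{201C}test\u{201D}\n";

echo strip_non_ascii($input);        // prints "Its a test" (newline removed too)
echo strip_non_ascii($input, true);  // prints "Its a test" followed by a newline
```

Note that the special characters are removed rather than replaced, so "It’s" becomes "Its". If the apostrophe matters, the curly characters would have to be mapped to their ASCII equivalents before stripping.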

  1. #1 by Alexandru Stefan on July 29, 2011 - 10:48 pm

    Great! It worked great for me! I like simple, straight line codes. I hate when people develop hell-long codes for a simple purpose…. maybe just to show how smart they are in coding. Thanks again!
