
With the GDPR about to come into force, I recently needed to obfuscate some form validation data. This was preferable to deleting it, since I could still run reports on the form data.

For example, if the form validator reports that an email address was invalid, I can still see something like this:

xxxx.xxxxxx@xxxxxxxx,xxx

Without storing any personal data, I can still see why the email address was rejected: the user typed a comma instead of a full stop.

So here is an obfuscation function for PHP:

// A string to test with some personal information (an email address) and some non-Latin characters
$str = 'test@test.com
This is a new line!

This is text from a non-latin language: ϒϖϡϠϛϚϙ6Ϙϸϐϲϕϵ϶ϟϞϝ

Basic punctuation is preserved! %^&*()"~<>,.|;:…';

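function obfuscate($string, $replaceWith = 'x') {

    // Punctuation to preserve, escaped so it is safe inside a character class
    $chars = preg_quote('#/\!?@%^&*()_+=[]{}~"“”‘’\'`<>,.|;:…—–-', '/');

    // u at the end is for unicode so it is multibyte safe
    // \s matches whitespace: space, tab, newline, carriage return, form feed, vertical tab
    // Every character that is not preserved punctuation or whitespace is replaced
    // Note: preg_replace() returns null if the input is not valid UTF-8
    return preg_replace('/[^' . $chars . '\s]/u', $replaceWith, $string);

}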

$str = obfuscate($str);

// Escape for HTML so characters such as < > & " are displayed literally
echo '<pre>' . htmlspecialchars($str) . '</pre>';

/**

Outputs:

xxxx@xxxx.xxx
xxxx xx x xxx xxxx!

xxxx xx xxxx xxxx x xxx-xxxxx xxxxxxxx: xxxxxxxxxxxxxxxxxx

xxxxx xxxxxxxxxxx xx xxxxxxxxx! %^&*()"~<>,.|;:…

*/
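
As a rough sketch of how this might be used on a whole form submission, the snippet below runs the function over every field before it is stored. The helper name obfuscateFields() and the field names and values are hypothetical, made up for illustration rather than taken from a real project. The function above is repeated (behind a function_exists() guard) only so the snippet runs on its own. It assumes a flat array of fields, and it stores an empty string for non-string values or when preg_replace() fails, so the raw value is never kept by accident.

```php
<?php
// Copy of obfuscate() from above so this sketch is self-contained;
// the guard avoids a redeclaration error if it is already defined.
if (!function_exists('obfuscate')) {
    function obfuscate($string, $replaceWith = 'x') {
        $chars = preg_quote('#/\!?@%^&*()_+=[]{}~"“”‘’\'`<>,.|;:…—–-', '/');
        return preg_replace('/[^' . $chars . '\s]/u', $replaceWith, $string);
    }
}

// Hypothetical helper: obfuscate every value in a flat array of form fields.
function obfuscateFields(array $fields, $replaceWith = 'x') {
    $out = [];
    foreach ($fields as $name => $value) {
        // Only strings are obfuscated; anything else is treated as a failure.
        $masked = is_string($value) ? obfuscate($value, $replaceWith) : null;
        // preg_replace() returns null on error (e.g. invalid UTF-8 with /u),
        // so fall back to an empty string instead of the original value.
        $out[$name] = ($masked === null) ? '' : $masked;
    }
    return $out;
}

// Made-up submission: the email has a comma where the full stop should be.
$submitted = [
    'email' => 'fred.bloggs@hostname,com',
    'name'  => 'Fred Bloggs',
];

print_r(obfuscateFields($submitted));
```

The email field comes out as xxxx.xxxxxx@xxxxxxxx,xxx, so the stray comma is still visible in a report even though the address itself is gone.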

Tim Bennett is a Leeds-based web designer from Yorkshire. He has a First Class Honours degree in Computing from Leeds Metropolitan University and currently runs his own one-man web design company, Texelate.