Perl Data Munging Techniques

A few fellow mongers gave their two cents' worth on this topic,
but really only one example roused discussion.

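$string = 'stuff`and|things/';
# pack("cc",38,35) is "&#" and pack("c",59) is ";", so each character outside
# the allowed set becomes its numeric entity, e.g. "/" becomes "&#47;".
$string =~ s/([^a-zA-Z0-9&#\d+;])/pack("cc",38,35) . unpack("c",$1) . pack("c",59)/ge;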

This snippet was offered up by a fellow monger. The regular
expression was designed for massaging incoming CGI data
destined for a database. It converts any nasty characters
encountered in '$string' to their harmless HTML numeric
character references (&#NNN;, where NNN is the character code).

Taking our example, stuff`and|things/ would literally be
translated to stuff&#96;and&#124;things&#47;
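
For anyone who finds the pack/unpack version hard to read, here is a rough
equivalent using ord(). It is only a sketch: the helper name encode_unsafe
is made up for illustration and was not part of the talk. The allowed set
is written out as the original character class actually behaves (letters,
digits, and the literal characters & # + ;).

```perl
use strict;
use warnings;

# Hypothetical helper name, not from the original snippet.
sub encode_unsafe {
    my ($text) = @_;
    # Anything outside letters, digits, & # + ; becomes &#NNN;
    $text =~ s/([^a-zA-Z0-9&#+;])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my $string = encode_unsafe('stuff`and|things/');
print "$string\n";    # prints stuff&#96;and&#124;things&#47;
```

Using ord() also sidesteps the signed-char behavior of unpack("c", ...),
which returns negative numbers for bytes above 127.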

That's what the database would get. This is especially handy when filling
out sticky forms: the browser renders the entities as the original
characters in your text fields and textarea boxes, or outside of <form> context.
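
As a rough illustration of the sticky-form case, the sketch below encodes a
made-up input value and drops it into a text field. The field name
"comment" and the sample value are assumptions for the example only.

```perl
use strict;
use warnings;

# Hypothetical user input that would normally break a value="..." attribute.
my $value = 'say "hi" <now>';
$value =~ s/([^a-zA-Z0-9&#+;])/'&#' . ord($1) . ';'/ge;

# The quotes and angle brackets are now entities, so the markup stays intact
# and the browser shows the original characters in the field.
my $field = sprintf '<input type="text" name="comment" value="%s">', $value;
print "$field\n";
```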

Users could finally use whatever passwords they wanted, without
being constrained to alphanumeric rules dictated by the back end.

One thing to note here is that this regex won't hose re-incoming data,
say, from sticky forms being re-submitted. Because &, #, digits and ;
are all in the allowed set, the regex passes over existing '&#NNN;'
tokens and leaves their integrity intact.
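
A quick way to convince yourself of this is to run the substitution twice
and compare the results. The sketch below reuses a made-up helper name,
encode_unsafe, around an ord()-based equivalent of the substitution.

```perl
use strict;
use warnings;

# Hypothetical helper name; same allowed set as the original character class.
sub encode_unsafe {
    my ($text) = @_;
    $text =~ s/([^a-zA-Z0-9&#+;])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my $once  = encode_unsafe('stuff`and|things/');
my $twice = encode_unsafe($once);    # simulate a sticky form being re-submitted

print $once eq $twice ? "unchanged\n" : "changed\n";    # prints unchanged
```

The flip side is that a raw &, #, + or ; typed by the user also passes
through untouched, so this is not a complete HTML escaper on its own.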

All in all, it turned out to be quite a talk!

