CL-TAINT

CL-TAINT is a library written by Alan Shields. It wraps values so that they can only be accessed in a controlled manner. This is very useful for web applications, where you should not trust input from the user.

Basic Tainting and Untainting

(taint "7")
=>
#<TAINTED-VALUE "7" {B682029}>
(* (length *) (length *))
ERROR: NOT A SEQUENCE

(* (untaint #'parse-integer *) (untaint #'parse-integer *))
=> 49

Once a value is tainted, it can't be used until you untaint it. You untaint the value by calling UNTAINT with a function; the function is run on the internal value, and its result is returned.

This helps ensure that the value is only accessed through that function.
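
The idea can be sketched in a few lines of plain Common Lisp. This is an illustration only, not CL-TAINT's source: SKETCH-TAINTED, SKETCH-TAINT and SKETCH-UNTAINT are made-up names, and the real library prints and stores its tainted values in its own way.

```lisp
;; Illustrative sketch; all names here are hypothetical, not CL-TAINT's API.
(defstruct (sketch-tainted (:constructor sketch-taint (value)))
  value)

(defun sketch-untaint (function tainted)
  "Run FUNCTION on the wrapped value and return whatever it returns."
  (funcall function (sketch-tainted-value tainted)))

(let ((input (sketch-taint "7")))
  ;; The wrapper is not a string or a number, so it has to go through
  ;; SKETCH-UNTAINT before arithmetic can be done on it.
  (* (sketch-untaint #'parse-integer input)
     (sketch-untaint #'parse-integer input)))
;; => 49
```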

Detaint and With-Detaint

I find it useful to think of accepting external values in terms of declarative statements, so I created DETAINT and its ease-of-use macro WITH-DETAINT. Keep in mind that DETAINT currently only works with strings as input. However, it works with both tainted and untainted strings.

(detaint "7" 'integer)
=> 7
(detaint "dslkd" 'integer)
=> NIL
(detaint "" '(or integer 0))
=> 0

If you have to untaint several variables at once, it's far more convenient to do it like so:

(with-detaint ((integer x y z)
               ((or integer 0) a b c))
  (list x y z a b c))

It's much like a type declaration.
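
One way to picture WITH-DETAINT is as a LET that rebinds each listed variable to its detainted value. The macro below is a guess at that shape under that assumption, not the library's definition: SKETCH-WITH-DETAINT and TOY-DETAINT are hypothetical names, and the detainting function is passed in explicitly only so that the sketch runs without CL-TAINT loaded.

```lisp
;; Illustrative sketch; the names and the extra function argument are assumptions.
(defmacro sketch-with-detaint (detaint-function clauses &body body)
  "Rebind every variable named in CLAUSES to (DETAINT-FUNCTION var 'type)."
  (let ((fn (gensym "DETAINT")))
    `(let ((,fn ,detaint-function))
       (let ,(loop for (type . vars) in clauses
                   append (loop for var in vars
                                collect `(,var (funcall ,fn ,var ',type))))
         ,@body))))

(defun toy-detaint (string type)
  "Stand-in for DETAINT: ignores TYPE and just tries to read an integer."
  (declare (ignore type))
  (parse-integer string :junk-allowed t))

(let ((x "1") (y "2") (z "oops"))
  (sketch-with-detaint #'toy-detaint ((integer x y)
                                      ((or integer 0) z))
    (list x y z)))
;; => (1 2 NIL)   ; the toy function ignores the (or integer 0) fallback
```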

Simple Types

The simplest use of DETAINT is with the simple types. These types attempt to coerce the input into the target type.

Integers

(detaint "7" 'integer)
=> 7
(detaint "a7a" 'integer)
=> 7
(detaint "a7a" 'strict-integer)
=> NIL

integer and loose-integer find the first contiguous sequence of digits in the input and parse it. If no digits are found, NIL is returned.

strict-integer only parses a string consisting entirely of digits. Otherwise, NIL is returned.
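
The difference between the loose and strict behaviour can be sketched with the standard PARSE-INTEGER. The two functions below are hypothetical stand-ins, not the library's code; in particular, they ignore signs, and how CL-TAINT treats a leading minus is not covered here.

```lisp
;; Illustrative sketch; SKETCH-LOOSE-INTEGER and SKETCH-STRICT-INTEGER are made-up names.
(defun sketch-loose-integer (string)
  "Parse the first contiguous run of digits in STRING, or return NIL."
  (let ((start (position-if #'digit-char-p string)))
    (when start
      ;; :JUNK-ALLOWED makes PARSE-INTEGER stop at the first non-digit.
      (values (parse-integer string :start start :junk-allowed t)))))

(defun sketch-strict-integer (string)
  "Parse STRING only if it is non-empty and made up entirely of digits."
  (when (and (plusp (length string))
             (every #'digit-char-p string))
    (values (parse-integer string))))

(sketch-loose-integer "a7a")   ; => 7
(sketch-strict-integer "a7a")  ; => NIL
(sketch-strict-integer "7")    ; => 7
```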

Strings

(detaint "Yes" 'string)
=> "Yes"
(detaint "Yes" 'nestring)
=> "Yes"
(detaint "" 'string)
=> ""
(detaint "" 'nestring)
=> NIL

string will return the input string.

nestring will return the input string if it is non-empty, NIL otherwise.

Symbols

(detaint "Yes" 'symbol)
=> YES
(detaint "Yes" 'exact-symbol)
=> |Yes|
(detaint "Yes" 'keyword)
=> :YES

All of the symbol types (except the keyword ones, obviously) intern the string into the current package.

symbol and keyword uppercase the string before interning it.

exact-symbol, msymbol, exact-keyword, and mkeyword leave the string unaltered.
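
In terms of standard Common Lisp, the distinction comes down to whether the string is upcased before INTERN and which package it goes into. The functions below are a sketch under that assumption, with hypothetical names; they are not how CL-TAINT is necessarily written.

```lisp
;; Illustrative sketch; the SKETCH- names are made up for this example.
(defun sketch-symbol (string)
  "Upcase STRING, then intern it in the current package."
  (values (intern (string-upcase string))))

(defun sketch-exact-symbol (string)
  "Intern STRING in the current package exactly as given."
  (values (intern string)))

(defun sketch-keyword (string)
  "Upcase STRING, then intern it in the KEYWORD package."
  (values (intern (string-upcase string) :keyword)))

(symbol-name (sketch-symbol "Yes"))        ; => "YES"
(symbol-name (sketch-exact-symbol "Yes"))  ; => "Yes"
(sketch-keyword "Yes")                     ; => :YES
```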

Advanced Types

You can add some clauses to the simple types, restricting which inputs are acceptable.

For example:

(detaint "7" '(integer :min 6))
=> 7
(detaint "7" '(integer :min 8))
=> NIL
(detaint "7" '(integer :pred oddp))
=> 7
(detaint "7" '(integer :pred evenp))
=> NIL

Integers

(INTEGER-TYPE <:min literal> <:max literal> <:pred literal>)
:min
Specifies the minimum value acceptable
:max
Specifies the maximum value acceptable
:pred
A function of one argument which shall receive the parsed integer (if no integer is found, the function will not be called). If the function returns a false value, the parse will be rejected, and NIL will be returned.
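
The three integer clauses amount to a chain of checks on the parsed value. A minimal sketch, assuming :min and :max are inclusive bounds (as "acceptable" suggests) and using the hypothetical name SKETCH-CHECK-INTEGER:

```lisp
;; Illustrative sketch; not the library's implementation.
(defun sketch-check-integer (n &key min max pred)
  "Return N if it passes every supplied check, otherwise NIL.
N may be NIL (nothing parsed), in which case PRED is never called."
  (when (and n
             (or (null min) (>= n min))
             (or (null max) (<= n max))
             (or (null pred) (funcall pred n)))
    n))

(sketch-check-integer 7 :min 6)        ; => 7
(sketch-check-integer 7 :min 8)        ; => NIL
(sketch-check-integer 7 :pred #'oddp)  ; => 7
(sketch-check-integer 7 :pred #'evenp) ; => NIL
```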

Strings

(STRING-TYPE <:min-length literal> <:max-length literal> <:every literal> <:some literal> <:pred literal> <:match literal> <:imatch literal>)
:min-length
The minimum length acceptable
:max-length
The maximum length acceptable
:every
Every character of the string will be passed to the function, one at a time. If any return value is false, NIL will result.
:some
Every character of the string will be passed to the function, one at a time. If every return value is false, NIL will result.
:pred
The string will be passed to the function as a whole. If the return value is false, NIL will result.
:match
The string must exactly match the specified string.
:imatch
The string must case-insensitively match the specified string.
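
The string clauses map closely onto standard sequence and string functions. The following sketch shows one plausible reading of them; SKETCH-CHECK-STRING is a hypothetical helper, and the inclusive length bounds are an assumption.

```lisp
;; Illustrative sketch; not the library's implementation.
(defun sketch-check-string (string &key min-length max-length
                                        ((:every every-fn)) ((:some some-fn))
                                        pred match imatch)
  "Return STRING if it passes every supplied check, otherwise NIL."
  (when (and (or (null min-length) (>= (length string) min-length))
             (or (null max-length) (<= (length string) max-length))
             (or (null every-fn) (every every-fn string)) ; all characters pass
             (or (null some-fn) (some some-fn string))    ; at least one passes
             (or (null pred) (funcall pred string))       ; whole-string test
             (or (null match) (string= string match))     ; case-sensitive
             (or (null imatch) (string-equal string imatch))) ; case-insensitive
    string))

(sketch-check-string "abc" :every #'alpha-char-p)  ; => "abc"
(sketch-check-string "abc1" :every #'alpha-char-p) ; => NIL
(sketch-check-string "abc1" :some #'digit-char-p)  ; => "abc1"
(sketch-check-string "Yes" :match "YES")           ; => NIL
(sketch-check-string "Yes" :imatch "YES")          ; => "Yes"
```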

Symbols

(SYMBOL-TYPE <:package literal>)
:package
The symbol will be interned into the specified package, rather than the current package.

Combinators

By default, when a type fails, NIL is returned. You can propose alternate types using or and friends.

(detaint "7" '(or integer 0))
=> 7
(detaint "asdf" '(or integer 0))
=> 0
(detaint "7" '(if integer t nil))
=> T
(detaint "a" '(if integer t nil))
=> NIL
(detaint "a" '(when integer t))
=> NIL
(detaint "7" '(when integer t))
=> 7
(detaint "7" '(or (or (integer :min 42)
                      (integer :max 4))
                  0))
=> 0
(detaint "44" '(or (or (integer :min 42)
                       (integer :max 4))
                   0))
=> 44

or and any return the result of the first construct that yields a true value.

if takes the form (if test then else): when test yields a true value, the result of then is returned; otherwise the result of else is returned.
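
The "first true value wins" rule can be sketched with ordinary functions. Here each alternative is either a function of the input string or a literal fallback; SKETCH-OR, SKETCH-IF and SKETCH-INTEGER are hypothetical names, and the real combinators take type specifiers rather than functions.

```lisp
;; Illustrative sketch; not the library's implementation.
(defun sketch-integer (string)
  "Toy parser: an integer read from the front of STRING, or NIL."
  (values (parse-integer string :junk-allowed t)))

(defun sketch-or (string &rest alternatives)
  "Try ALTERNATIVES in order and return the first true result."
  (loop for alternative in alternatives
        thereis (if (functionp alternative)
                    (funcall alternative string)
                    alternative)))

(defun sketch-if (string test then else)
  "Return THEN when TEST accepts STRING, ELSE otherwise."
  (if (funcall test string) then else))

(sketch-or "7" #'sketch-integer 0)         ; => 7
(sketch-or "asdf" #'sketch-integer 0)      ; => 0
(sketch-if "7" #'sketch-integer t nil)     ; => T
(sketch-if "a" #'sketch-integer t nil)     ; => NIL
```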

Matchers

Sometimes you only want to parse something if it follows a certain pattern. Currently the only matcher is example, but regular expressions will come soon, thanks to Edi Weitz's CL-PPCRE library.

If you need to match a simple function, just use (string :pred function) along with when.

(detaint "77" '(with-match (example "99") integer))
=> 77
(detaint "7" '(with-match (example "99") integer))
=> NIL
(detaint "7777" '(with-match (example "09") integer))
=> 7777

The third argument (integer in the above examples) is a proper specifier, so you can put in or constructs, more matchers, and so on.

Example Matches

The example matcher is quite simple. You provide an example of what you'd like to match, and it matches it.

There are a few wildcard characters.

x
Matches any lowercase letter
X
Matches any uppercase letter
i or I
Matches any letter
* or _
Matches any character
1-9
Matches any digit
0
Starts a variable-length number ("09" will match "7", "77", and "77777777", whereas "99" would only match "77")

All other characters match only themselves.
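
To make the wildcard rules concrete, here is a small matcher written from the table above. It is a sketch, not the library's matcher: the function names are hypothetical, and it assumes that a 0 together with the example digits right after it stands for a run of one or more digits.

```lisp
;; Illustrative sketch; not the library's implementation.
(defun sketch-wildcard-match-p (pattern-char c)
  "True if character C is acceptable for the example character PATTERN-CHAR."
  (case pattern-char
    (#\x (lower-case-p c))
    (#\X (upper-case-p c))
    ((#\i #\I) (alpha-char-p c))
    ((#\* #\_) t)
    ((#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) (digit-char-p c))
    (t (char= pattern-char c))))          ; everything else matches itself

(defun sketch-example-match-p (example string)
  "True if the whole of STRING matches EXAMPLE."
  (labels ((digits-end (text start)
             ;; Index just past the run of digits beginning at START.
             (or (position-if-not #'digit-char-p text :start start)
                 (length text)))
           (match (ei si)
             (cond ((= ei (length example))
                    (= si (length string)))
                   ((char= (char example ei) #\0)
                    ;; Variable-length number: try the longest run of input
                    ;; digits first, then shorter ones, down to one digit.
                    (loop with next-ei = (digits-end example ei)
                          for end from (digits-end string si) above si
                          thereis (match next-ei end)))
                   ((= si (length string)) nil)
                   (t
                    (and (sketch-wildcard-match-p (char example ei)
                                                  (char string si))
                         (match (1+ ei) (1+ si)))))))
    (match 0 0)))

(sketch-example-match-p "99" "77")    ; => T
(sketch-example-match-p "99" "7")     ; => NIL
(sketch-example-match-p "09" "7777")  ; => T
```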

Filters

Sometimes you want to filter the string before doing anything with it.

(detaint "a7a" '(with-filter (drop a) string))
=> "7"

The third argument (string in the above example) is, like that of with-match, a proper specifier, so you can put in or constructs, or even more filters.

The two filters available now are gather (synonym: pass) and drop (synonyms: remove, fail). These filters work character-wise. I hope to add some regexp filters using Edi Weitz's CL-PPCRE soon.

gather causes the new value to consist of only those characters which match its pattern.

drop causes the new value to consist of only those characters which do NOT match its pattern.
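
In standard Common Lisp terms, gather and drop behave like REMOVE-IF-NOT and REMOVE-IF applied with a character predicate. A minimal sketch, with hypothetical names and a plain function in place of a pattern:

```lisp
;; Illustrative sketch; not the library's implementation.
(defun sketch-gather (predicate string)
  "Keep only the characters of STRING that satisfy PREDICATE."
  (remove-if-not predicate string))

(defun sketch-drop (predicate string)
  "Keep only the characters of STRING that do NOT satisfy PREDICATE."
  (remove-if predicate string))

(sketch-drop (lambda (c) (char= c #\a)) "a7a")  ; => "7"
(sketch-gather #'digit-char-p "a7a")            ; => "7"
(sketch-gather #'alpha-char-p "a7a")            ; => "aa"
```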

Gather/Drop filters

Character literals are specified either as #\c -style characters, one-letter symbols, or one-letter strings.

; valid characters:
#\f
#\F
|f|
f
"f"
"F"

While (gather "f") is useful for getting all the fs in a string, it's not all that interesting. Enter the constructs:

(any a b c)
(not (and "F" "f"))
any, or, and union
Match any construct in their argument list.
every, and, and intersection
Match only if every construct in their argument list matches.
iany, ior, and iunion
Same as any, but case insensitive.
ievery, iand, and iintersection
Same as every, but case insensitive.
difference
Matches if the first construct matches but none of the following do (think of setting up a range, then removing ranges from it)
not
Inverts the match of its construct.
pred
Passes the current character to a function. That function returns a true or false value.
not-pred
Like (not (pred ____))
range
Matches any character from its first argument through its second (e.g. (or (range #\a #\z) (range #\A #\Z)) for all of the alphabet)
not-range
Not in the range.
nocase and icase
Sets all sub-matches to case insensitive
case
Sets all sub-matches to case sensitive
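
One way to read these constructs is as a tiny language that compiles down to a predicate on a single character. The sketch below handles only a subset (literal characters and one-letter strings, any/or/union, every/and/intersection, not, and range) and assumes ranges are inclusive; SKETCH-COMPILE-CONSTRUCT is a hypothetical name, and one-letter symbols and the case-insensitive variants are left out.

```lisp
;; Illustrative sketch; covers a subset of the constructs and is not
;; the library's implementation.
(defun sketch-compile-construct (form)
  "Turn FORM into a function of one character returning true or false."
  (flet ((op-in-p (op names)
           ;; Compare by name so the sketch works from any package.
           (member op names :test #'string=)))
    (etypecase form
      (character (lambda (c) (char= c form)))
      (string (let ((literal (char form 0)))
                (lambda (c) (char= c literal))))
      (cons
       (let ((op (first form))
             (subforms (rest form)))
         (cond
           ((op-in-p op '("ANY" "OR" "UNION"))
            (let ((preds (mapcar #'sketch-compile-construct subforms)))
              (lambda (c) (some (lambda (p) (funcall p c)) preds))))
           ((op-in-p op '("EVERY" "AND" "INTERSECTION"))
            (let ((preds (mapcar #'sketch-compile-construct subforms)))
              (lambda (c) (every (lambda (p) (funcall p c)) preds))))
           ((op-in-p op '("NOT"))
            (let ((pred (sketch-compile-construct (first subforms))))
              (lambda (c) (not (funcall pred c)))))
           ((op-in-p op '("RANGE"))
            (destructuring-bind (low high) subforms
              (lambda (c) (char<= low c high))))
           (t (error "Unsupported construct: ~S" form))))))))

;; Gather the letters, then drop everything that is not a digit.
(remove-if-not (sketch-compile-construct
                '(or (range #\a #\z) (range #\A #\Z)))
               "a7B!")
;; => "aB"
(remove-if (sketch-compile-construct '(not (range #\0 #\9))) "a7a")
;; => "7"
```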

Enjoy!

I hope this makes your tainting and untainting more pleasant. At the time this was written, I had a lot of things to do with web apps, so I planned on adding more constructs as I needed them. If there's been no change in this in a while, I guess you can figure out what happened.