[araneida: removing cl-taint stuff to put in own repo and pull from there Alan-Shields@omrf.ouhsc.edu**20060105004932 When CL-TAINT was put into its own respository, it turned out that taint.lisp and doc/taint.html were so entangled that they could not be pulled out by themselves. So, as a result, I copied them out and am now re-importing. Here's hoping this won't cause TOO many problems in the future. ] { hunk ./doc/taint.html 1 -Araneida - Taint - - - -

CL-TAINT

-

CL-TAINT is a library written by Alan Shields. It is used to easily wrap values -so that they are only accessed in a controlled manner. This is very useful for web applications, -where you should not trust the input from the user. - -

Basic Tainting and Untainting

- -
-(taint "7")
-=>
-#<TAINTED-VALUE "7" {B682029}>
-(* (length *) (length *))
-ERROR: NOT A SEQUENCE
-
-(* (untaint #'parse-integer *) (untaint #'parse-integer *))
-=> 49
-
- -

Once a value is tainted, it can't be used until you untaint it. You untaint -the value by calling UNTAINT with a function that will be run on the internal value, the result -of which shall be returned. - -

This helps ensure that the value is only accessed through that function. - -

Detaint and With-Detaint

-

I find it useful to think of accepting external values in terms of declarative statements, so -I created DETAINT and it's ease-of-use macro WITH-DETAINT. Keep in mind that DETAINT currently only -works with strings as input. However, it will work with both tainted and un-tainted strings. - -

-(detaint "7" 'integer)
-=> 7
-(detaint "dslkd" 'integer)
-=> nil
-(detaint "" '(or integer 0))
-=> 0
-
- -

If you have to untaint several variables at once, it's far more convenient to do it like so: -

-(with-detaint ((integer x y z)
-               ((or integer 0) a b c))
-  (list x y z a b c))
-
- -

It's much like a type declaration. - -

Simple Types

-

The most simple usage of DETAINT is with the simple types. These types -attempt to coerce the input into the target type. - -

Integers

-
-(detaint "7" 'integer)
-=> 7
-(detaint "a7a" 'integer)
-=> 7
-(detaint "a7a" 'strict-integer)
-=> NIL
-
- -

integer and loose-integer will find the first contiguous sequence -of numbers in the input and parse those. If no numbers are found, NIL will be returned. - -

strict-integer will only parse a string entirely consisting of numbers. If that -does not hold, NIL will be returned. - -

Strings

-
-(detaint "Yes" 'string)
-=> "Yes"
-(detaint "Yes" 'nestring)
-=> "Yes"
-(detaint "" 'string)
-=> ""
-(detaint "" 'nestring)
-=> NIL
-
- -

string will return the input string. -

nestring will return the input string if it is non-empty, NIL otherwise. - -

Symbols

-
-(detaint "Yes" 'symbol)
-=> YES
-(detaint "Yes" 'exact-symbol)
-=> |Yes|
-(detaint "Yes" 'keyword)
-=> :YES
-
- -

All of the symbol types (except the keyword ones, obviously) intern the string into the current package. -

symbol, keyword are uppercased. -

exact-symbol, msymbol, exact-keyword, mkeyword are unaltered. - - -

Advanced Types

-

You can add some clauses to the simple types, restricting which inputs are acceptable. - -

For example: -

-(detaint "7" '(integer :min 6))
-=> 7
-(detaint "7" '(integer :min 8))
-=> NIL
-(detaint "7" '(integer :pred oddp))
-=> 7
-(detaint "7" '(integer :pred evenp))
-=> NIL
-
- -

Integers

-
-(INTEGER-TYPE <:min literal> <:max literal> <:pred literal>)
-
- -
-
:min
Specifies the minimum value acceptable
-
:max
Specifies the maximum value acceptable
-
:pred
A function of one argument which shall receive the parsed integer (if no integer is found, the function will not be called). If -the function returns a false value, the parse will be rejected, and NIL will be returned.
-
- -

Strings

-
-(STRING-TYPE <:min-length literal> <:max-length literal> <:every literal> <:some literal> <:pred literal> <:match literal> <:imatch literal>)
-
- -
-
:min-length
The minimum length acceptable
-
:max-length
The maximum length acceptable
-
:every
Every character of the string will be passed to the function, one at a time. If any return value is false, NIL will result.
-
:some
Every character of the string will be passed to the function, one at a time. If every return value is false, NIL will result.
-
:pred
The string will be passed to the function as a whole. If the return value is false, NIL will result.
-
:match
The string must exactly match the specified string.
-
:imatch
The string must case-insensitively match the specified string.
-
- -

Symbols

-
-(SYMBOL-TYPE <:package literal>)
-
- -
-
:package
The symbol will be interned into the specified package, rather than the current package.
-
- -

Combinators

-

By default, when a type fails, NIL is returned. You can propose alternate types using or and friends. - -

-(detaint "7" '(or integer 0))
-=> 7
-(detaint "asdf" '(or integer 0))
-=> 0
-(detaint "7" '(if integer t nil))
-=> T
-(detaint "a" '(if integer t nil))
-=> NIL
-(detaint "a" '(when integer t))
-=> NIL
-(detaint "7" '(when integer t))
-=> 7
-(detaint "7" '(or (or (integer :min 42)
-                      (integer :max 4))
-                  0))
-=> 0
-(detaint "44" '(or (or (integer :min 42)
-                       (integer :max 4))
-                   0))
-=> 44
-
- -

or and any return the first value that returns a true value. -

if returns the third construct if the second is true, and the fourth if it is not. - -

Matchers

-

Sometimes you only want to parse something if it follows a certain pattern. -Currently the only matcher is example, but regular expressions will come soon, thanks to Edi's PCRE library. - -

If you need to match a simple function, just use (string :pred function) along with when. - -

-(detaint "77" '(with-match (example "99") integer))
-=> 77
-(detaint "7" '(with-match (example "99") integer))
-=> NIL
-(detaint "7777" '(with-match (example "09") integer))
-=> 7777
-
- -

The third argument (integer in the above examples) is a proper specifier, so you can put in or constructs, -more matchers, etc, etc, etc. - -

Example Matches

-

The example matcher is quite simple. You provide an example of what you'd like to match, and it matches it. - -

There are a few wildcard characters. - -

-
x
Matches any lowercase letter
-
X
Matches any uppercase letter
-
i or I
Matches any letter
-
* or _
Matches any character
-
1-9
Matches any number
-
0
Starts a variable-length number ("09" will match "7", "77", and "77777777". Whereas "99" would only match "77")
-
- -

All other characters match only themselves. - -

Filters

-

Sometimes you want to filter the string before doing anything with it. - -

-(detaint "a7a" '(with-filter (drop a) string))
-=> "7"
-
- -

The third argument (string in the above example) is, like match, a proper place. So you can put or, or even more filters. - -

The two filters available now are: gather (synonyms: pass), and drop (synonyms: remove, fail). -These filters work character-wise. I hope to add some regexp filters using Edi's PCRE soon. - -

gather causes the new value to consist of only those characters which match its pattern. -

drop causes the new value to consist of only those characters which do NOT match its pattern. - -

Gather/Drop filters

-

Character literals are specified either as #\c -style characters, one-letter symbols, or one-letter strings. - -

-; valid characters:
-#\f
-#\F
-|f|
-f
-"f"
-"F"
-
- -

While (gather "f") is useful for getting all the fs in a string, it's not all that interesting. Enter the constructs: - -

-(any a b c)
-(not (and "F" "f"))
-
- -
-
any, or, and union
Match any construct in their argument list.
-
every, and, and intersection
Match only if every construct in their argument list matches.
-
iany, ior, and iunion
Same as any, but case insensitive.
-
ievery, iand, and iintersection
Same as every, but case insensitive.
-
difference
-
Matches if the first construct matches but none of the following do (think of setting up a range then removing ranges from that)
-
not
Inverses the match of its construct.
-
pred
Passes the current character to a function. That function returns a true or false value.
-
not-pred
Like (not (pred ____))
-
range
Matches from second -> third (ie (and (range #\a #\z) (range #\A #\Z)) for all of the alphabet)
-
not-range
Not in the range.
-
nocase and icase
Set all sub-matches to case-insensitive
-
case
Sets all sub-matches to case sensitive
-
- -

Enjoy!

-

I hope this makes your tainting and untainting more pleasant. At the time this was written, I had a lot of things to do with -web apps, so I planned on adding more constructs as I needed them. If there's been no change in this in a while, I guess you can -figure out what happened. - - - rmfile ./doc/taint.html hunk ./taint.lisp 1 -(defpackage :cl-taint - (:use :common-lisp #+fiveam :it.bese.fiveam) - (:export :taint :untaint :tainted-p :detaint :with-detaint) - (:documentation "Simple package to taint and untaint values. -The basic idea is to wrap a value in a struct, making the value unusable directly. -This means you have to explicitly untaint it to get back a useful value. - -Usage follows this pattern: your functions that return data from the outside -(such as parameters from a web client) return their value via (taint value). -Later, when the function is desired, you use (untaint #'parse-integer tainted-value), -WITH-DETAINT, DETAINT, or the like.")) - -(in-package cl-taint) - -(defstruct tainted-value - (value nil)) -(defmethod print-object ((obj tainted-value) stream) - (if *print-escape* - (print-unreadable-object (obj stream :type t :identity t) - (format stream "~S" (tainted-value-value obj))) - (princ "TAINTED" stream))) - -(defun taint (value) - "Taints a value -Wraps a value in a lambda so that it no longer makes sense by itself." - (make-tainted-value :value value)) - -(defun untaint (func tainted-value) - "Untaints a value, returning it -Takes a function of one argument which will be passed the tainted value." - (funcall func (tainted-value-value tainted-value))) - -(defun tainted-p (tainted-value) - "Returns T if argument is tainted." - (typep tainted-value 'tainted-value)) - - -;;;; Below here lies the code for detaint - -(defun string->integer (string &key strict min max pred) - "Parses an integer from string. -If strict, then every character in the string must be an integer. -If not strict, then the first integer sequence in the string will be parsed. -If min is specified, the number must be at least min -If max is specified, the number must be at least max -If pred is specified, the (funcall pred number) must be true - -If any of these do not hold, nil is returned" - (declare (type string string) - (type (or null function) pred)) - (flet ((number-char-p (c) - (member c '(#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\0)))) - (when (or (and (not strict) - (some #'number-char-p string)) - (and strict - (every #'number-char-p string))) - (let ((int (if strict - (parse-integer string) - (parse-integer string :start (position-if #'number-char-p string) :junk-allowed t)))) - (when int - (if (and (or (not min) - (>= int min)) - (or (not max) - (<= int max)) - (or (not (functionp pred)) - (funcall pred int))) - int - nil)))))) - -(defun string->string (string &key min-length max-length match imatch every some pred) - "Checks a string for certain attributes. -If min-length, then string must be at least that long -If max-length, then string must be no more than that long -If match, then string must match match -If imatch, then string must case-insensitively match match -If every, then every character in the string must return true for (funcall every c) -If some, then at least one character in the string must return true for (funcall some c) -If pred, then (funcall pred string) must return true - -If any of these do not hold, nil is returned" - (declare (type string string) - (type (or null integer) min-length max-length) - (type (or null function) every some pred)) - (if (and (or (not min-length) - (>= (length string) min-length)) - (or (not max-length) - (<= (length string) max-length)) - (or (not match) - (equal string match)) - (or (not imatch) - (string-equal string imatch)) - (or (not (functionp every)) - (every every string)) - (or (not (functionp some)) - (some some string)) - (or (not (functionp pred)) - (funcall pred string))) - string - nil)) - -(defun string->symbol (string &key package uppercase) - "Turns a string into a symbol in :package, optionally uppercasing it" - (multiple-value-bind (symbol package) ; if we return multiple values, detaint assumes that this was not handled! - (intern (if uppercase - (string-upcase string) - string) - package) - (declare (ignore package)) - symbol)) - -(defun local-symbol (symbol) - "Guarantees that the symbol will be in the current package" - (intern (symbol-name symbol) :cl-taint)) - -(defun unwrap-value (value) - "Unconditionally get the value, whether tainted or not" - (if (tainted-p value) - (untaint #'identity value) - value)) - -(defun handle-integer-type (value args strict) - (when value - (destructuring-bind (&key min max pred) args - (string->integer value :strict strict :min min :max max :pred (when pred (symbol-function pred)))))) - -(defun handle-string-type (value args) - (when value - (destructuring-bind (&key min-length max-length every some pred match imatch) args - (string->string value - :min-length min-length - :max-length max-length - :match match - :imatch imatch - :every (when every (symbol-function every)) - :some (when some (symbol-function some)) - :pred (when pred (symbol-function pred)))))) - -(defun handle-nestring-type (value args) - (declare (ignore args)) - (when value - (if (equal value "") - nil - value))) - -(defun handle-symbol-type (value args keywordp uppercasep) - (when value - (destructuring-bind (&key package) args - (string->symbol value - :package (if keywordp - :keyword - (if package package *package*)) - :uppercase uppercasep)))) - -(defun coerce-char (char) - (cond - ((characterp char) char) - ((symbolp char) (if (= (length (symbol-name char)) 1) - (elt (symbol-name char) 0) - (error "~S is not a valid character specifier for a filter (bad length)" char))) - ((stringp char) (if (= (length char) 1) - (elt char 0) - (error "~S is not a valid character specifier for a filter (bad length)" char))) - ((integerp char) (elt (princ-to-string char) 0)))) - -(defun filter-specifier-matches-p (specifier c case-sensitive) - (cond - ((atom specifier) (if case-sensitive - (eql (coerce-char specifier) c) - (char-equal (coerce-char specifier) c))) - ((consp specifier) - (ecase (local-symbol (car specifier)) - ((any or union) (some (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) - (cdr specifier))) - ((iany ior iunion) (some (lambda (sub-spec) (filter-specifier-matches-p sub-spec c nil)) - (cdr specifier))) - ((every and intersection) (every (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) - (cdr specifier))) - ((ievery iand iintersection) (every (lambda (sub-spec) (filter-specifier-matches-p sub-spec c nil)) - (cdr specifier))) - (difference (and (notany (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) - (cddr specifier)) - (filter-specifier-matches-p (cadr specifier) c case-sensitive))) - (not (not (filter-specifier-matches-p (cadr specifier) c case-sensitive))) - - ((nocase icase) (filter-specifier-matches-p (cadr specifier) c nil)) - (case (filter-specifier-matches-p (cadr specifier) c t)) - - (pred (funcall (symbol-function (cadr specifier)) c)) - (not-pred (not (funcall (symbol-function (cadr specifier)) c))) - - (range (destructuring-bind (start end) (cdr specifier) - (let ((c-code (char-code (if case-sensitive c (char-upcase c)))) - (start-code (char-code (if case-sensitive (coerce-char start) (char-upcase (coerce-char start))))) - (end-code (char-code (if case-sensitive (coerce-char end) (char-upcase (coerce-char end)))))) - (and (>= c-code start-code) - (<= c-code end-code))))) - (not-range (not (filter-specifier-matches-p (cons 'range (cdr specifier)) c case-sensitive))))))) - -(defun gather-filter (specifier value) - (coerce (remove-if (complement (lambda (c) - (filter-specifier-matches-p specifier c t))) - (coerce value 'cons)) - 'string)) - -(defun remove-filter (specifier value) - (coerce (remove-if (lambda (c) - (filter-specifier-matches-p specifier c t)) - (coerce value 'cons)) - 'string)) - -(defun %example-match (input pattern in-variable-length-integer) - (declare (type list input pattern) - (type boolean in-variable-length-integer)) - (if (or (null input) - (null pattern)) - (if (and (null input) - (null pattern)) - t - nil) - (cond - (in-variable-length-integer (if (digit-char-p (car input)) - (%example-match (cdr input) pattern t) - (%example-match input pattern nil))) - (t - (case (car pattern) - (#\x (when (lower-case-p (car input)) - (%example-match (cdr input) (cdr pattern) nil))) - (#\X (when (upper-case-p (car input)) - (%example-match (cdr input) (cdr pattern) nil))) - ((#\i #\I) (when (alpha-char-p (car input)) - (%example-match (cdr input) (cdr pattern) nil))) - ((#\* #\_) (%example-match (cdr input) (cdr pattern) nil)) - ((#\1 #\2 #\3 #\4 #\5 - #\6 #\7 #\8 #\9) (when (digit-char-p (car input)) - (%example-match (cdr input) (cdr pattern) nil))) - (#\0 (when (digit-char-p (car input)) - (%example-match (cdr input) - (loop for sub on (cdr pattern) - when (not (digit-char-p (car sub))) - return sub) - t))) - (otherwise (when (eql (car pattern) (car input)) - (%example-match (cdr input) (cdr pattern) nil)))))))) - -(defun example-match (input pattern) - (%example-match (coerce input 'cons) - (coerce pattern 'cons) - nil)) - -(defun handle-filter-pass (value args) - (when value - (destructuring-bind ((filter-type specifier) sub-type) args - (ecase (local-symbol filter-type) - ((gather pass) (detaint (gather-filter specifier value) sub-type)) - ((drop remove fail) (detaint (remove-filter specifier value) sub-type)))))) - -(defun handle-match-pass (value args) - (when value - (destructuring-bind ((match-type specifier) sub-type) args - (ecase (local-symbol match-type) - (example (when (example-match value specifier) - (detaint value sub-type))))))) - -(defun handle-type (value type args) - (case (local-symbol type) - ((integer loose-integer) (handle-integer-type value args nil)) - (strict-integer (handle-integer-type value args t)) - - (string (handle-string-type value args)) - (nestring (handle-nestring-type value args)) - - (symbol (handle-symbol-type value args nil t)) - ((exact-symbol msymbol) (handle-symbol-type value args nil t)) - (keyword (handle-symbol-type value args t t)) - ((exact-keyword mkeyword) (handle-symbol-type value args t nil)) - - (with-filter (handle-filter-pass value args)) - (with-match (handle-match-pass value args)) - (otherwise (values nil t)))) - - -(defun detaint (value type) - "A declarative form of untaint. -Passed a string (can deal with tainted and untainted strings) and a declarative form describing -and constrainting the result. - -Usually used through with-detaint. - -DETAINT is (fully?) documented in doc/taint.html, but here is an EBNF-like description of valid -constructs - -(DETAINT possibly-tainted-value detaint-declaration) - -detaint-declaration: type - | statement - | literal - -type: simple-type - | complex-type - -integer-specifier: integer | loose-integer | strict-integer -string-specifier: string | nestring -symbol-specifier: symbol | exact-symbol | msymbol | keyword | exact-keyword | mkeyword - -simple-type: integer-specifier - | string-specifier - | symbol-specifier - -complex-type: (integer-specifier <:min literal> <:max literal> <:pred literal>) - | (string-specifier <:min-length literal> <:max-length literal> <:every literal> <:some literal> <:pred literal> <:match literal> <:imatch literal>) - | (symbol-specifier <:package literal>) - -statement: ([or|any] *) - | (if detaint-declaration detaint-declaration ) - | (when detaint-declaration detaint-declaration) - | (with-match match-specifier detaint-declaration) - | (with-filter filter-specifier detaint-declaration) - -match-specifier: (example example-match-string) - -filter-specifier: ([gather|pass] character-filter) - | ([drop|remove|fail] character-filter) - -character-filter: char-literal - | ([any|or|union] *) - | ([iany|ior|iunion] *) - | ([every|and|intersection] *) - | ([ievery|iand|iiintersection] *) - | (difference character-filter *) - | (not character-filter) - | (case character-filter) - | ([nocase|icase] character-filter) - | (pred function) - | (not-pred function) - | (range char-literal char-literal) - | (not-range char-literal char-literal) - -char-literal: character | one-letter-symbol | one-letter-string - -...as you can see, there's a little bit to it." - (declare (type (or string tainted-value null) value)) - (let ((value (unwrap-value value))) - (cond - ((atom type) - (cond - ((symbolp type) (multiple-value-bind (ret-value unhandled) (handle-type value type nil) - (if unhandled - type - ret-value))) - (t type))) - ((consp type) - (case (local-symbol (car type)) - ((or any) (loop for try in (cdr type) - when (detaint value try) - return it)) - (if (destructuring-bind (if then &optional else) (cdr type) - (if (detaint value if) - (detaint value then) - (detaint value else)))) - (when (destructuring-bind (if then) (cdr type) - (when (detaint value if) - (detaint value then)))) - (otherwise (multiple-value-bind (ret-value unhandled) (handle-type value (local-symbol (car type)) (cdr type)) - (if unhandled - (error "Unknown detaint construct ~S" type) - ret-value)))))))) - -(defmacro with-detaint (detaints &body body) - "Execute body with detainted variables. -Usage will look like: -(with-detaint ((integer x y z) - (string string1 string2) - ((or strict-integer - string) integer-or-string)) - (format t \"x: ~S y: ~S z: ~S string1: ~S string2: ~S integer-or-string: ~S\" x y z string1 string2 integer-or-string)) - -See the documentation for detaint to see all of the constructs that you can use. -The test suite in test-server.lisp has many examples." - `(let ,(mapcan (lambda (detaint-statement) - (mapcar (lambda (var) - `(,var (detaint ,var ',(car detaint-statement)))) - (cdr detaint-statement))) - detaints) - ,@body)) rmfile ./taint.lisp }