[cl-taint: moved over from Araneida Alan-Shields **20051223231356] { addfile ./cl-taint.asd hunk ./cl-taint.asd 1 - +;;; -*- Lisp -*- +(defsystem :cl-taint + :name "cl-taint" + :version "0.1" + :maintainer "Alan Shields " + :components ((:file "taint") + (:html-file "taint-doc"))) addfile ./taint-doc.html hunk ./taint-doc.html 1 +Araneida - Taint + + + +

CL-TAINT

+

CL-TAINT is a library written by Alan Shields. It is used to easily wrap values +so that they are only accessed in a controlled manner. This is very useful for web applications, +where you should not trust the input from the user. + +

Basic Tainting and Untainting

+ +
+(taint "7")
+=>
+#<TAINTED-VALUE "7" {B682029}>
+(* (length *) (length *))
+ERROR: NOT A SEQUENCE
+
+(* (untaint #'parse-integer *) (untaint #'parse-integer *))
+=> 49
+
+ +

Once a value is tainted, it can't be used until you untaint it. You untaint +the value by calling UNTAINT with a function that will be run on the internal value, the result +of which shall be returned. + +

This helps ensure that the value is only accessed through that function. + +

Detaint and With-Detaint

+

I find it useful to think of accepting external values in terms of declarative statements, so +I created DETAINT and it's ease-of-use macro WITH-DETAINT. Keep in mind that DETAINT currently only +works with strings as input. However, it will work with both tainted and un-tainted strings. + +

+(detaint "7" 'integer)
+=> 7
+(detaint "dslkd" 'integer)
+=> nil
+(detaint "" '(or integer 0))
+=> 0
+
+ +

If you have to untaint several variables at once, it's far more convenient to do it like so: +

+(with-detaint ((integer x y z)
+               ((or integer 0) a b c))
+  (list x y z a b c))
+
+ +

It's much like a type declaration. + +

Simple Types

+

The most simple usage of DETAINT is with the simple types. These types +attempt to coerce the input into the target type. + +

Integers

+
+(detaint "7" 'integer)
+=> 7
+(detaint "a7a" 'integer)
+=> 7
+(detaint "a7a" 'strict-integer)
+=> NIL
+
+ +

integer and loose-integer will find the first contiguous sequence +of numbers in the input and parse those. If no numbers are found, NIL will be returned. + +

strict-integer will only parse a string entirely consisting of numbers. If that +does not hold, NIL will be returned. + +

Strings

+
+(detaint "Yes" 'string)
+=> "Yes"
+(detaint "Yes" 'nestring)
+=> "Yes"
+(detaint "" 'string)
+=> ""
+(detaint "" 'nestring)
+=> NIL
+
+ +

string will return the input string. +

nestring will return the input string if it is non-empty, NIL otherwise. + +

Symbols

+
+(detaint "Yes" 'symbol)
+=> YES
+(detaint "Yes" 'exact-symbol)
+=> |Yes|
+(detaint "Yes" 'keyword)
+=> :YES
+
+ +

All of the symbol types (except the keyword ones, obviously) intern the string into the current package. +

symbol, keyword are uppercased. +

exact-symbol, msymbol, exact-keyword, mkeyword are unaltered. + + +

Advanced Types

+

You can add some clauses to the simple types, restricting which inputs are acceptable. + +

For example: +

+(detaint "7" '(integer :min 6))
+=> 7
+(detaint "7" '(integer :min 8))
+=> NIL
+(detaint "7" '(integer :pred oddp))
+=> 7
+(detaint "7" '(integer :pred evenp))
+=> NIL
+
+ +

Integers

+
+(INTEGER-TYPE <:min literal> <:max literal> <:pred literal>)
+
+ +
+
:min
Specifies the minimum value acceptable
+
:max
Specifies the maximum value acceptable
+
:pred
A function of one argument which shall receive the parsed integer (if no integer is found, the function will not be called). If +the function returns a false value, the parse will be rejected, and NIL will be returned.
+
+ +

Strings

+
+(STRING-TYPE <:min-length literal> <:max-length literal> <:every literal> <:some literal> <:pred literal> <:match literal> <:imatch literal>)
+
+ +
+
:min-length
The minimum length acceptable
+
:max-length
The maximum length acceptable
+
:every
Every character of the string will be passed to the function, one at a time. If any return value is false, NIL will result.
+
:some
Every character of the string will be passed to the function, one at a time. If every return value is false, NIL will result.
+
:pred
The string will be passed to the function as a whole. If the return value is false, NIL will result.
+
:match
The string must exactly match the specified string.
+
:imatch
The string must case-insensitively match the specified string.
+
+ +

Symbols

+
+(SYMBOL-TYPE <:package literal>)
+
+ +
+
:package
The symbol will be interned into the specified package, rather than the current package.
+
+ +

Combinators

+

By default, when a type fails, NIL is returned. You can propose alternate types using or and friends. + +

+(detaint "7" '(or integer 0))
+=> 7
+(detaint "asdf" '(or integer 0))
+=> 0
+(detaint "7" '(if integer t nil))
+=> T
+(detaint "a" '(if integer t nil))
+=> NIL
+(detaint "a" '(when integer t))
+=> NIL
+(detaint "7" '(when integer t))
+=> 7
+(detaint "7" '(or (or (integer :min 42)
+                      (integer :max 4))
+                  0))
+=> 0
+(detaint "44" '(or (or (integer :min 42)
+                       (integer :max 4))
+                   0))
+=> 44
+
+ +

or and any return the first value that returns a true value. +

if returns the third construct if the second is true, and the fourth if it is not. + +

Matchers

+

Sometimes you only want to parse something if it follows a certain pattern. +Currently the only matcher is example, but regular expressions will come soon, thanks to Edi's PCRE library. + +

If you need to match a simple function, just use (string :pred function) along with when. + +

+(detaint "77" '(with-match (example "99") integer))
+=> 77
+(detaint "7" '(with-match (example "99") integer))
+=> NIL
+(detaint "7777" '(with-match (example "09") integer))
+=> 7777
+
+ +

The third argument (integer in the above examples) is a proper specifier, so you can put in or constructs, +more matchers, etc, etc, etc. + +

Example Matches

+

The example matcher is quite simple. You provide an example of what you'd like to match, and it matches it. + +

There are a few wildcard characters. + +

+
x
Matches any lowercase letter
+
X
Matches any uppercase letter
+
i or I
Matches any letter
+
* or _
Matches any character
+
1-9
Matches any number
+
0
Starts a variable-length number ("09" will match "7", "77", and "77777777". Whereas "99" would only match "77")
+
+ +

All other characters match only themselves. + +

Filters

+

Sometimes you want to filter the string before doing anything with it. + +

+(detaint "a7a" '(with-filter (drop a) string))
+=> "7"
+
+ +

The third argument (string in the above example) is, like match, a proper place. So you can put or, or even more filters. + +

The two filters available now are: gather (synonyms: pass), and drop (synonyms: remove, fail). +These filters work character-wise. I hope to add some regexp filters using Edi's PCRE soon. + +

gather causes the new value to consist of only those characters which match its pattern. +

drop causes the new value to consist of only those characters which do NOT match its pattern. + +

Gather/Drop filters

+

Character literals are specified either as #\c -style characters, one-letter symbols, or one-letter strings. + +

+; valid characters:
+#\f
+#\F
+|f|
+f
+"f"
+"F"
+
+ +

While (gather "f") is useful for getting all the fs in a string, it's not all that interesting. Enter the constructs: + +

+(any a b c)
+(not (and "F" "f"))
+
+ +
+
any, or, and union
Match any construct in their argument list.
+
every, and, and intersection
Match only if every construct in their argument list matches.
+
iany, ior, and iunion
Same as any, but case insensitive.
+
ievery, iand, and iintersection
Same as every, but case insensitive.
+
difference
+
Matches if the first construct matches but none of the following do (think of setting up a range then removing ranges from that)
+
not
Inverses the match of its construct.
+
pred
Passes the current character to a function. That function returns a true or false value.
+
not-pred
Like (not (pred ____))
+
range
Matches from second -> third (ie (and (range #\a #\z) (range #\A #\Z)) for all of the alphabet)
+
not-range
Not in the range.
+
nocase and icase
Set all sub-matches to case-insensitive
+
case
Sets all sub-matches to case sensitive
+
+ +

Enjoy!

+

I hope this makes your tainting and untainting more pleasant. At the time this was written, I had a lot of things to do with +web apps, so I planned on adding more constructs as I needed them. If there's been no change in this in a while, I guess you can +figure out what happened. + + + addfile ./taint.lisp hunk ./taint.lisp 1 +(defpackage :cl-taint + (:use :common-lisp #+fiveam :it.bese.fiveam) + (:export :taint :untaint :tainted-p :detaint :with-detaint) + (:documentation "Simple package to taint and untaint values. +The basic idea is to wrap a value in a struct, making the value unusable directly. +This means you have to explicitly untaint it to get back a useful value. + +Usage follows this pattern: your functions that return data from the outside +(such as parameters from a web client) return their value via (taint value). +Later, when the function is desired, you use (untaint #'parse-integer tainted-value), +WITH-DETAINT, DETAINT, or the like.")) + +(in-package cl-taint) + +(defstruct tainted-value + (value nil)) +(defmethod print-object ((obj tainted-value) stream) + (if *print-escape* + (print-unreadable-object (obj stream :type t :identity t) + (format stream "~S" (tainted-value-value obj))) + (princ "TAINTED" stream))) + +(defun taint (value) + "Taints a value +Wraps a value in a lambda so that it no longer makes sense by itself." + (make-tainted-value :value value)) + +(defun untaint (func tainted-value) + "Untaints a value, returning it +Takes a function of one argument which will be passed the tainted value." + (funcall func (tainted-value-value tainted-value))) + +(defun tainted-p (tainted-value) + "Returns T if argument is tainted." + (typep tainted-value 'tainted-value)) + + +;;;; Below here lies the code for detaint + +(defun string->integer (string &key strict min max pred) + "Parses an integer from string. +If strict, then every character in the string must be an integer. +If not strict, then the first integer sequence in the string will be parsed. +If min is specified, the number must be at least min +If max is specified, the number must be at least max +If pred is specified, the (funcall pred number) must be true + +If any of these do not hold, nil is returned" + (declare (type string string) + (type (or null function) pred)) + (flet ((number-char-p (c) + (member c '(#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\0)))) + (when (or (and (not strict) + (some #'number-char-p string)) + (and strict + (every #'number-char-p string))) + (let ((int (if strict + (parse-integer string) + (parse-integer string :start (position-if #'number-char-p string) :junk-allowed t)))) + (when int + (if (and (or (not min) + (>= int min)) + (or (not max) + (<= int max)) + (or (not (functionp pred)) + (funcall pred int))) + int + nil)))))) + +(defun string->string (string &key min-length max-length match imatch every some pred) + "Checks a string for certain attributes. +If min-length, then string must be at least that long +If max-length, then string must be no more than that long +If match, then string must match match +If imatch, then string must case-insensitively match match +If every, then every character in the string must return true for (funcall every c) +If some, then at least one character in the string must return true for (funcall some c) +If pred, then (funcall pred string) must return true + +If any of these do not hold, nil is returned" + (declare (type string string) + (type (or null integer) min-length max-length) + (type (or null function) every some pred)) + (if (and (or (not min-length) + (>= (length string) min-length)) + (or (not max-length) + (<= (length string) max-length)) + (or (not match) + (equal string match)) + (or (not imatch) + (string-equal string imatch)) + (or (not (functionp every)) + (every every string)) + (or (not (functionp some)) + (some some string)) + (or (not (functionp pred)) + (funcall pred string))) + string + nil)) + +(defun string->symbol (string &key package uppercase) + "Turns a string into a symbol in :package, optionally uppercasing it" + (multiple-value-bind (symbol package) ; if we return multiple values, detaint assumes that this was not handled! + (intern (if uppercase + (string-upcase string) + string) + package) + (declare (ignore package)) + symbol)) + +(defun local-symbol (symbol) + "Guarantees that the symbol will be in the current package" + (intern (symbol-name symbol) :cl-taint)) + +(defun unwrap-value (value) + "Unconditionally get the value, whether tainted or not" + (if (tainted-p value) + (untaint #'identity value) + value)) + +(defun handle-integer-type (value args strict) + (when value + (destructuring-bind (&key min max pred) args + (string->integer value :strict strict :min min :max max :pred (when pred (symbol-function pred)))))) + +(defun handle-string-type (value args) + (when value + (destructuring-bind (&key min-length max-length every some pred match imatch) args + (string->string value + :min-length min-length + :max-length max-length + :match match + :imatch imatch + :every (when every (symbol-function every)) + :some (when some (symbol-function some)) + :pred (when pred (symbol-function pred)))))) + +(defun handle-nestring-type (value args) + (declare (ignore args)) + (when value + (if (equal value "") + nil + value))) + +(defun handle-symbol-type (value args keywordp uppercasep) + (when value + (destructuring-bind (&key package) args + (string->symbol value + :package (if keywordp + :keyword + (if package package *package*)) + :uppercase uppercasep)))) + +(defun coerce-char (char) + (cond + ((characterp char) char) + ((symbolp char) (if (= (length (symbol-name char)) 1) + (elt (symbol-name char) 0) + (error "~S is not a valid character specifier for a filter (bad length)" char))) + ((stringp char) (if (= (length char) 1) + (elt char 0) + (error "~S is not a valid character specifier for a filter (bad length)" char))) + ((integerp char) (elt (princ-to-string char) 0)))) + +(defun filter-specifier-matches-p (specifier c case-sensitive) + (cond + ((atom specifier) (if case-sensitive + (eql (coerce-char specifier) c) + (char-equal (coerce-char specifier) c))) + ((consp specifier) + (ecase (local-symbol (car specifier)) + ((any or union) (some (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) + (cdr specifier))) + ((iany ior iunion) (some (lambda (sub-spec) (filter-specifier-matches-p sub-spec c nil)) + (cdr specifier))) + ((every and intersection) (every (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) + (cdr specifier))) + ((ievery iand iintersection) (every (lambda (sub-spec) (filter-specifier-matches-p sub-spec c nil)) + (cdr specifier))) + (difference (and (notany (lambda (sub-spec) (filter-specifier-matches-p sub-spec c case-sensitive)) + (cddr specifier)) + (filter-specifier-matches-p (cadr specifier) c case-sensitive))) + (not (not (filter-specifier-matches-p (cadr specifier) c case-sensitive))) + + ((nocase icase) (filter-specifier-matches-p (cadr specifier) c nil)) + (case (filter-specifier-matches-p (cadr specifier) c t)) + + (pred (funcall (symbol-function (cadr specifier)) c)) + (not-pred (not (funcall (symbol-function (cadr specifier)) c))) + + (range (destructuring-bind (start end) (cdr specifier) + (let ((c-code (char-code (if case-sensitive c (char-upcase c)))) + (start-code (char-code (if case-sensitive (coerce-char start) (char-upcase (coerce-char start))))) + (end-code (char-code (if case-sensitive (coerce-char end) (char-upcase (coerce-char end)))))) + (and (>= c-code start-code) + (<= c-code end-code))))) + (not-range (not (filter-specifier-matches-p (cons 'range (cdr specifier)) c case-sensitive))))))) + +(defun gather-filter (specifier value) + (coerce (remove-if (complement (lambda (c) + (filter-specifier-matches-p specifier c t))) + (coerce value 'cons)) + 'string)) + +(defun remove-filter (specifier value) + (coerce (remove-if (lambda (c) + (filter-specifier-matches-p specifier c t)) + (coerce value 'cons)) + 'string)) + +(defun %example-match (input pattern in-variable-length-integer) + (declare (type list input pattern) + (type boolean in-variable-length-integer)) + (if (or (null input) + (null pattern)) + (if (and (null input) + (null pattern)) + t + nil) + (cond + (in-variable-length-integer (if (digit-char-p (car input)) + (%example-match (cdr input) pattern t) + (%example-match input pattern nil))) + (t + (case (car pattern) + (#\x (when (lower-case-p (car input)) + (%example-match (cdr input) (cdr pattern) nil))) + (#\X (when (upper-case-p (car input)) + (%example-match (cdr input) (cdr pattern) nil))) + ((#\i #\I) (when (alpha-char-p (car input)) + (%example-match (cdr input) (cdr pattern) nil))) + ((#\* #\_) (%example-match (cdr input) (cdr pattern) nil)) + ((#\1 #\2 #\3 #\4 #\5 + #\6 #\7 #\8 #\9) (when (digit-char-p (car input)) + (%example-match (cdr input) (cdr pattern) nil))) + (#\0 (when (digit-char-p (car input)) + (%example-match (cdr input) + (loop for sub on (cdr pattern) + when (not (digit-char-p (car sub))) + return sub) + t))) + (otherwise (when (eql (car pattern) (car input)) + (%example-match (cdr input) (cdr pattern) nil)))))))) + +(defun example-match (input pattern) + (%example-match (coerce input 'cons) + (coerce pattern 'cons) + nil)) + +(defun handle-filter-pass (value args) + (when value + (destructuring-bind ((filter-type specifier) sub-type) args + (ecase (local-symbol filter-type) + ((gather pass) (detaint (gather-filter specifier value) sub-type)) + ((drop remove fail) (detaint (remove-filter specifier value) sub-type)))))) + +(defun handle-match-pass (value args) + (when value + (destructuring-bind ((match-type specifier) sub-type) args + (ecase (local-symbol match-type) + (example (when (example-match value specifier) + (detaint value sub-type))))))) + +(defun handle-type (value type args) + (case (local-symbol type) + ((integer loose-integer) (handle-integer-type value args nil)) + (strict-integer (handle-integer-type value args t)) + + (string (handle-string-type value args)) + (nestring (handle-nestring-type value args)) + + (symbol (handle-symbol-type value args nil t)) + ((exact-symbol msymbol) (handle-symbol-type value args nil t)) + (keyword (handle-symbol-type value args t t)) + ((exact-keyword mkeyword) (handle-symbol-type value args t nil)) + + (with-filter (handle-filter-pass value args)) + (with-match (handle-match-pass value args)) + (otherwise (values nil t)))) + + +(defun detaint (value type) + "A declarative form of untaint. +Passed a string (can deal with tainted and untainted strings) and a declarative form describing +and constrainting the result. + +Usually used through with-detaint. + +DETAINT is (fully?) documented in doc/taint.html, but here is an EBNF-like description of valid +constructs + +(DETAINT possibly-tainted-value detaint-declaration) + +detaint-declaration: type + | statement + | literal + +type: simple-type + | complex-type + +integer-specifier: integer | loose-integer | strict-integer +string-specifier: string | nestring +symbol-specifier: symbol | exact-symbol | msymbol | keyword | exact-keyword | mkeyword + +simple-type: integer-specifier + | string-specifier + | symbol-specifier + +complex-type: (integer-specifier <:min literal> <:max literal> <:pred literal>) + | (string-specifier <:min-length literal> <:max-length literal> <:every literal> <:some literal> <:pred literal> <:match literal> <:imatch literal>) + | (symbol-specifier <:package literal>) + +statement: ([or|any] *) + | (if detaint-declaration detaint-declaration ) + | (when detaint-declaration detaint-declaration) + | (with-match match-specifier detaint-declaration) + | (with-filter filter-specifier detaint-declaration) + +match-specifier: (example example-match-string) + +filter-specifier: ([gather|pass] character-filter) + | ([drop|remove|fail] character-filter) + +character-filter: char-literal + | ([any|or|union] *) + | ([iany|ior|iunion] *) + | ([every|and|intersection] *) + | ([ievery|iand|iiintersection] *) + | (difference character-filter *) + | (not character-filter) + | (case character-filter) + | ([nocase|icase] character-filter) + | (pred function) + | (not-pred function) + | (range char-literal char-literal) + | (not-range char-literal char-literal) + +char-literal: character | one-letter-symbol | one-letter-string + +...as you can see, there's a little bit to it." + (declare (type (or string tainted-value null) value)) + (let ((value (unwrap-value value))) + (cond + ((atom type) + (cond + ((symbolp type) (multiple-value-bind (ret-value unhandled) (handle-type value type nil) + (if unhandled + type + ret-value))) + (t type))) + ((consp type) + (case (local-symbol (car type)) + ((or any) (loop for try in (cdr type) + when (detaint value try) + return it)) + (if (destructuring-bind (if then &optional else) (cdr type) + (if (detaint value if) + (detaint value then) + (detaint value else)))) + (when (destructuring-bind (if then) (cdr type) + (when (detaint value if) + (detaint value then)))) + (otherwise (multiple-value-bind (ret-value unhandled) (handle-type value (local-symbol (car type)) (cdr type)) + (if unhandled + (error "Unknown detaint construct ~S" type) + ret-value)))))))) + +(defmacro with-detaint (detaints &body body) + "Execute body with detainted variables. +Usage will look like: +(with-detaint ((integer x y z) + (string string1 string2) + ((or strict-integer + string) integer-or-string)) + (format t \"x: ~S y: ~S z: ~S string1: ~S string2: ~S integer-or-string: ~S\" x y z string1 string2 integer-or-string)) + +See the documentation for detaint to see all of the constructs that you can use. +The test suite in test-server.lisp has many examples." + `(let ,(mapcan (lambda (detaint-statement) + (mapcar (lambda (var) + `(,var (detaint ,var ',(car detaint-statement)))) + (cdr detaint-statement))) + detaints) + ,@body)) }