2008: The patches from Boris Smilga to add CLOS based deserialization is in an unfinished state. Need to add testcases and perhaps rethink the deserialization API a bit, we awant user configurable deserialization, and testcases. See mail conversation below, On Dec 28, 2007 5:38 PM, Henrik Hjelte < henrik@evahjelte.com> wrote: > I have started to integrate your additions, but I had some problems. > > The testcases failed a lot, over 20 or thirty or so. I believe that > the testcases are the most important part of cl-json, they run the > official > json tests, and test I think all the other features I have > implemented. And > they serve as documentation, defining the system when refactoring. > The point is, if you have a feature that you want to be preserved, > make sure > it has a testcase otherwise I or someone else might refactor it away. > So ideally, I would like new testcases that are examples that explain > the new features in the clos deserializer. And, of course make all old > testcases that are relevant run ok for the new code. In many > cases what is needed is to changes hardcoded lists to a function that > returns either arrays or lists depending on *json-array-type*, I > think I will > do that after sending this mail.. Most of the failures stem from two causes: either (1) alists are expected where the new decoder produces objects, or (2) the new encoder adds non-expected prototype metadata to JSON objects. The first cause can be eliminated by wrapping the call in a WITH-OLD- DECODER-SEMANTICS block, the second by setting *PROTOTYPE-NAME* to NIL. So, running (IN-PACKAGE JSON-TEST) (WITH-OLD-DECODER-SEMANTICS (LET ((*PROTOTYPE-NAME* NIL)) (RUN! 'JSON))) got me down to two failures, both of them in TEST*JSON-SYMBOLS- PACKAGE*. These were due to the fact that WITH-OLD-DECODER-SEMANTICS works essentially by supplying a canonical prototype, so that the decoder has a preconception of what kind of object to create—and, as it was, where to intern names. The last part was clearly wrong: there should be no such preconception, as the user might have set *JSON- SYMBOLS-PACKAGE* to a non-default value, and it is this value rather than the one supplied in the prototype that should be obeyed. I have fixed this (and also some more bugs which I have noticed in the process); please check the attached patches. > Configurable: I really would want to have it optional exactly how to > decode and encode objects. You are probably right in that your > solution is a good default behaviour, but it will not be perfect in > every situation for every user. So if possible, I would like to > have the > decoding and encoding configurable. So backwards compatibility > does not really mean backwards, it means compatibility between > different setups. For example, when doing testcases or a simple > json-bind > the old alist setup is good, for a more advanced setup your > code is great. For a secure setup you will probably want to have > access > control to what objects you are allowed to create and so on. One > solution will never be sufficient for everyone. I totally agree, but this implies a rather thorough change of design. In my original modifications, I strived to change the existing program as little as possible (for reasons which it would be quite tedious to explain here and now). > I thought that the setup with object-factory variables and funcalls > would be sufficient as an interface for configurability, > but I guess it was not since it didn't work out for you. On the other > hand, now a lot of the code in read-json-object for example is > intertwined with the clos-object creation stuff. For example, special > parsing of "prototype". So, I think we need an updated way to > handle this. > > My original idea repeated: > First *json-object-factory* to return an object X. > Call *json-object-factory-add-key-value* with X to repeatedly > handle key and values > that were read. > At the end of the object reading, call *json-object-factory-return* > with X to get the > actual created lisp thing (object or whatever). > > I haven't yet looked deeply enough to understand why this interface > was insufficient > for your implementation (see read-json-object). <...> This I can set forth. READ-JSON-OBJECT works by reading in the key string and the colon separator, and then recursively calling READ- JSON to consume characters from the input and construct the object which would be the corresponding value. With my modifications, when the key "prototype" (or whatever the name shoud be) is encountered on the input, the decoder is told that the value we are going to construct shall be the prototype object. If we were not able to communicate this beforehand, we'd have to do much post-processing of the factory object: look up the prototype (note that, at this point, we would not know the package and could not yet intern the keys, so that the matching would be done on strings, degrading the performance), convert it to some internal format (note that we would not be able to predict accurately enough what it would have been decoded to, as that may be influenced by user-side configuration), remove the prototype and key from the factory, and only then could we create the object. I judged this logic suboptimal when compared to prototype-prescient handling. > <...> One idea is of course to change this programming style > to a set of generic functions on json-object-dispatchers (or maybe > json-object-factory > or detailed-json-implementation or some good name). But AFAIK that > is just a syntactic difference, is there something left out in the > interface > that was a problem? Then I suggest we change the interface to > something that works, and then we can keep > object-implementations out of the core decoder/encoder. Any ideas? I need to think about it (have got a vague idea, but nothing to verbalize as yet). We are having New Year festivities for the next week down here in Moscow, let me get back to the matter after that. I'm also going to make a set of test cases updated for the CLOS semantics. (To make things worse, I have discovered that some implementations do not provide CLOS interface to structure classes— most importantly, (PROGN (DEFSTRUCT FOO) (LET ((X (MAKE-FOO))) (MAKE- INSTANCE (CLASS-OF X)))) produces an error. Consequently, we have to devise a separate logic to de/serialize structures.) > I have pushed your patches to the offical json darcs repo, moved > some code > around to make the old setup work. In the process I had to change > some things > in mainly json-factory-make-object regarding interning, this > probably breaks > things for your implementation. Er... Do you mean http://common-lisp.net/project/cl-json/darcs/cl- json? I'm afraid darcs does not see any new changes to pull from there. I would be happy to get the updates. Best regards, and happy New Year. -- B.Smilga. *** Rationale *** §1. A significant problem I perceive with CL-JSON in its present state is that the operations ENCODE-JSON and DECODE-JSON are not reasonably inverse. E.g.: (ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "{\"foo\":null,\"bar \":null}")) => "[[\"foo\"],[\"bar\"]]" and (ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "{}")) => "null" §2. This is the most painful kind of example, because the JSON object having even a single non-null field results in a proper value. Thus, the type of a JSON object emitted by an application that uses ENCODE-JSON can depend not only on the type of the source Lisp object but on its internal structure as well. This is not to say that ENCODE-JSON fusing Lisp vectors with lists and symbols and characters with strings, or DECODE-JSON fusing false with null are particularly pleasant; still, in theses cases we are able to predict typing across JSON interface. Note, however, that (ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "[]")) would also yield "null". This has to be dealt with as well. §3. The root of the problem is the design decision of representing JSON objects as Lisp alists, and the solution I propose is to revert that decision and represent JSON objects as CLOS objects. This is quite feasible if we make use of meta-object protocol capabilities implemented in a more or less standardized way across most contemporary Common Lisps. In the simple form, decoding a well-formed JSON object should result in the creation of an anonymous class (i.e. an instance of STANDARD-CLASS) with such slots as found in the JSON object. Next, we create an instance of the anonymous class and populate its slots. Conversely, encoding a CLOS object Z should be done by iterating over the value of (CLASS-SLOTS (CLASS-OF Z)) and emitting a name:value pair for each slot bound in Z. §4. Attached herewith is a series of patches implementing the idea. However, I have ventured a step further by allowing JSON objects to include metadata which specify that CLOS objects are to belong (or else inherit) to non-anonymous CLOS classes. The metadata are passed as the field of the object whose name is ``prototype''. (More accurately, the name is determined by the symbol which is the value of the special variable *PROTOTYPE-NAME*: the value (FUNCALL *SYMBOL-TO-STRING-FN* *PROTOTYPE-NAME*) must be STRING= to the name of the field. I have judged the default value 'PROTOTYPE more or less developer-friendly, as the field ``prototype'' in JavaScript objects is already reserved for special purposes and so is unlikely to conflict with anyone's user-level naming schemes. Below, I invariably call this field the prototype field.) The value of the prototype field should be an object, called the prototype, in which the following fields are meaningful: "lispClass":C, "lispSuperclasses":[C1,...,Cn] and "lispPackage":P (all other fields are ignored). The following rules are employed when decoding a JSON object X to a CLOS object Z: (D1) If the prototype of X has a "lispClass":C field, Z shall be an instance of the class identified by the name C, and all fields of X which have no correspondence among the slots of the class C are discarded. (D2) If the prototype of X has a "lispSuperclasses":[C1,...,Cn] field, Z shall be an instance of an anonymous class C whose superclasses are identified by the names C1,...,Cn. The fields of X which have no correspondence among any of the slots of the classes C1,...,Cn are defined in C as direct slots. As a special case, if n=1 and all of X's fields (omitting the prototype) are defined as slots in C1 then Z shall be an instance of C1. (D3) If the prototype has both a "lispClass":C and a "lispSuperclasses":[C1,...,Cn] fields then the rule D1 applies and the latter field is ignored. (D4) If the prototype has a "lispPackage":P field, then the names of the classes in both other fields and the names of the fields in X are interned in the package specified by P instead of the default package KEYWORD. Of course, all names (including P itself) are converted from camel case before using them in Lisp. (D5) If the class of the resulting object Z provides for a slot whose name is the value of the special variable *PROTOTYPE-NAME* (JSON::PROTOTYPE by default) then that slot shall be bound to the object which internally represents the prototype of X (an instance of the class JSON::PROTOTYPE, qv. the code). NB: this slot is never created by the decoder on its own authority but always inherited from the class or superclasses specified. (D6) If the prototype has a "lispClass":"cons" field and such "lispPackage":P field that the interned class name is COMMON-LISP:CONS, the rule D1 does not apply. Instead, Z shall be an alist whose conses' cars are the names and whose conses' cdrs are the respective values of the fields in X. The field names are interned in P, and the prototype field itself is omitted. (D7) If the prototype has a "lispClass":"hashTable" field and such "lispPackage":P field that the interned class name is COMMON-LISP:HASH-TABLE, the rule D1 does not apply. Instead, Z shall be a hash table whose keys are the names and whose values are the respective values of the fields in X. The field names are interned in P, and the prototype field itself is omitted. (D8) The value of null for any of the three fields of a prototype is equivalent to the field being absent. X lacking or having a null prototype is equivalent to the prototype having all null fields. §5. Conversely, when a CLOS object Z is encoded, the encoded JSON object X shall include a prototype reconstructed from Z per following rules: (E1) If the class of Z has a name, the prototype shall have a "lispClass":C field, where C is that name (it is assumed here and below that all names are converted to camel case). (E2) If the class of Z does not have a name, the prototype shall have a "lispSuperclasses":[C1,...,Cn] field, where C1,...,Cn are the names of that class's superclasses. (E3) If Z is an alist, the prototype shall have a "lispClass":"cons" field. (E4) If Z is a hash table, the prototype shall have a "lispClass":"hashTable" field. (E5) The prototype shall have a "lispPackage":P field, where P is the name of a package such that any one of the names C or C1,...,Cn, and any one of the names of the slots bound in Z, is a symbol (either direct or inherited) in the package. (Mutatis mutandis if Z is an alist or a hash table.) If there is no such package, P shall be an unspecific package name, and the program shall signal a warning condition. §6. Below are some examples of the modified decoder / encoder. I have marked the printout with double bars, and the result with =>. This is from an SBCL session; I have also tested the examples in OpenMCL. (IN-PACKAGE JSON) => # (DECODE-JSON-FROM-STRING "{\"foo\":null,\"bar\":null}") => #<# {1268D979}> (DESCRIBE *) || #<# {1268D979}> || is an instance of class #. || The following slots have :INSTANCE allocation: || BAR NIL || FOO NIL (DESCRIBE (CLASS-OF **)) || # is a class. It is an instance of || STANDARD-CLASS. || It has no name (the name is NIL). || The direct superclasses are: (STANDARD-OBJECT), and the direct subclasses || are: (). || The class is finalized; its class precedence list is: || (# STANDARD-OBJECT SB-PCL::SLOT- OBJECT T). || There are 0 methods specialized for this class. (ENCODE-JSON-TO-STRING ***) => "{\"bar\":null,\"foo\":null,\"prototype\":{\"lispClass\":null, \"lispSuperclasses\":[\"standardObject\"],\"lispPackage\":\"json\"}}" (DESCRIBE (DECODE-JSON-FROM-STRING "{}")) || # || is an instance of class #. (DEFSTRUCT FOO BAR BAZ) => FOO (ENCODE-JSON-TO-STRING (MAKE-FOO :BAR 10 :BAZ 20)) => "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispClass\":\"foo\", \"lispSuperclasses\":null,\"lispPackage\":\"json\"}}" (DECODE-JSON-FROM-STRING *) => #S(FOO :BAR 10 :BAZ 20) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\": \"foo\",\"lispPackage\":\"json\"}}") => #S(FOO :BAR 10 :BAZ 20) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\": \"cons\",\"lispPackage\":\"json\"}}") => ((BAR . 10) (BAZ . 20) (QUUX . 50)) (MAPHASH (LAMBDA (K V) (PRINT K) (PRINT V)) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\": \"hashTable\",\"lispPackage\":\"json\"}}")) || || BAR || 10 || BAZ || 20 || QUUX || 50 => NIL §7. The following additional names are exported from the package JSON: *PROTOTYPE-NAME* special variable (default: JSON::PROTOTYPE) A symbol which determines the name of the prototype field in a JSON object and the name of a slot in a CLOS object which may hold prototype information. As a special case, if *PROTOTYPE-NAME* is NIL the encoder does not add a prototype field when encoding an object that misses a prototype slot or key, and the decoder does not search for or treat in a special manner a prototype field in the input. The latter behaviour results in the ``simple semantics'' of §3: (LET ((*PROTOTYPE-NAME* NIL)) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispClass\":\"foo\", \"lispPackage\":\"json\"}}")) => #<# {125290A9}> (DESCRIBE *) || #<# {125290A9}> || is an instance of class #. || The following slots have :INSTANCE allocation: || BAR 10 || BAZ 20 || PROTOTYPE #<# {12524219}> (DESCRIBE (SLOT-VALUE ** 'PROTOTYPE)) || #<# {12524219}> || is an instance of class #. || The following slots have :INSTANCE allocation: || LISP-CLASS "foo" || LISP-PACKAGE "json" *JSON-ARRAY-TYPE* special variable (default: VECTOR) A type specifier which determines what type to coerce JSON arrays to. The default value has been chosen to cope with the kind of problem mentioned in §2. Thus, unless *JSON-ARRAY-TYPE* is set to LIST the value of (DECODE-JSON-FROM-STRING "[]") shall be #(), yielding "[]" when re-encoded. WITH-OLD-DECODER-SEMANTICS macro Ensures backward compatibility with the old CL-JSON decoder. It can be called as (WITH-OLD-DECODER-SEMANTICS FORM*) where FORMs are an implicit PROGN. FORMs are executed in an environment where JSON objects are invariably decoded to alists, JSON arrays to lists, array keys interned in the package KEYWORD, and array fields called ``prototype'' (or whatever is specified by *PROTOTYPE-NAME*) receive no special treatment. E.g.: (WITH-OLD-DECODER-SEMANTICS (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispSuperclasses\": [\"foo\",\"bar\"],\"lispPackage\":\"json\"}}")) => ((:BAR . 10) (:BAZ . 20) (:PROTOTYPE (:LISP-SUPERCLASSES "foo" "bar") (:LISP-PACKAGE . "json")))