Diffstat (limited to 'parts/django/docs/topics/serialization.txt')
-rw-r--r-- | parts/django/docs/topics/serialization.txt | 402 |
1 files changed, 0 insertions, 402 deletions
diff --git a/parts/django/docs/topics/serialization.txt b/parts/django/docs/topics/serialization.txt
deleted file mode 100644
index c8acc85..0000000
--- a/parts/django/docs/topics/serialization.txt
+++ /dev/null
@@ -1,402 +0,0 @@
-==========================
-Serializing Django objects
-==========================
-
-Django's serialization framework provides a mechanism for "translating" Django
-objects into other formats. Usually these other formats will be text-based and
-used for sending Django objects over a wire, but it's possible for a
-serializer to handle any format (text-based or not).
-
-.. seealso::
-
-    If you just want to get some data from your tables into a serialized
-    form, you could use the :djadmin:`dumpdata` management command.
-
-Serializing data
-----------------
-
-At the highest level, serializing data is a very simple operation::
-
-    from django.core import serializers
-    data = serializers.serialize("xml", SomeModel.objects.all())
-
-The arguments to the ``serialize`` function are the format to serialize the data
-to (see `Serialization formats`_) and a :class:`~django.db.models.QuerySet` to
-serialize. (Actually, the second argument can be any iterator that yields Django
-objects, but it'll almost always be a QuerySet).
-
-You can also use a serializer object directly::
-
-    XMLSerializer = serializers.get_serializer("xml")
-    xml_serializer = XMLSerializer()
-    xml_serializer.serialize(queryset)
-    data = xml_serializer.getvalue()
-
-This is useful if you want to serialize data directly to a file-like object
-(which includes an :class:`~django.http.HttpResponse`)::
-
-    out = open("file.xml", "w")
-    xml_serializer.serialize(SomeModel.objects.all(), stream=out)
-
-Subset of fields
-~~~~~~~~~~~~~~~~
-
-If you only want a subset of fields to be serialized, you can
-specify a ``fields`` argument to the serializer::
-
-    from django.core import serializers
-    data = serializers.serialize('xml', SomeModel.objects.all(), fields=('name','size'))
-
-In this example, only the ``name`` and ``size`` attributes of each model will
-be serialized.
-
-.. note::
-
-    Depending on your model, you may find that it is not possible to
-    deserialize a model that only serializes a subset of its fields. If a
-    serialized object doesn't specify all the fields that are required by a
-    model, the deserializer will not be able to save deserialized instances.
-
-Inherited Models
-~~~~~~~~~~~~~~~~
-
-If you have a model that is defined using an :ref:`abstract base class
-<abstract-base-classes>`, you don't have to do anything special to serialize
-that model. Just call the serializer on the object (or objects) that you want to
-serialize, and the output will be a complete representation of the serialized
-object.
-
-However, if you have a model that uses :ref:`multi-table inheritance
-<multi-table-inheritance>`, you also need to serialize all of the base classes
-for the model. This is because only the fields that are locally defined on the
-model will be serialized. For example, consider the following models::
-
-    class Place(models.Model):
-        name = models.CharField(max_length=50)
-
-    class Restaurant(Place):
-        serves_hot_dogs = models.BooleanField()
-
-If you only serialize the Restaurant model::
-
-    data = serializers.serialize('xml', Restaurant.objects.all())
-
-the fields on the serialized output will only contain the `serves_hot_dogs`
-attribute. The `name` attribute of the base class will be ignored.
-
-In order to fully serialize your Restaurant instances, you will need to
-serialize the Place models as well::
-
-    all_objects = list(Restaurant.objects.all()) + list(Place.objects.all())
-    data = serializers.serialize('xml', all_objects)
-
-Deserializing data
-------------------
-
-Deserializing data is also a fairly simple operation::
-
-    for obj in serializers.deserialize("xml", data):
-        do_something_with(obj)
-
-As you can see, the ``deserialize`` function takes the same format argument as
-``serialize``, a string or stream of data, and returns an iterator.
-
-However, here it gets slightly complicated. The objects returned by the
-``deserialize`` iterator *aren't* simple Django objects. Instead, they are
-special ``DeserializedObject`` instances that wrap a created -- but unsaved --
-object and any associated relationship data.
-
-Calling ``DeserializedObject.save()`` saves the object to the database.
-
-This ensures that deserializing is a non-destructive operation even if the
-data in your serialized representation doesn't match what's currently in the
-database. Usually, working with these ``DeserializedObject`` instances looks
-something like::
-
-    for deserialized_object in serializers.deserialize("xml", data):
-        if object_should_be_saved(deserialized_object):
-            deserialized_object.save()
-
-In other words, the usual use is to examine the deserialized objects to make
-sure that they are "appropriate" for saving before doing so. Of course, if you
-trust your data source you could just save the object and move on.
-
-The Django object itself can be inspected as ``deserialized_object.object``.
-
-.. _serialization-formats:
-
-Serialization formats
----------------------
-
-Django supports a number of serialization formats, some of which require you
-to install third-party Python modules:
-
-    ==========  ==============================================================
-    Identifier  Information
-    ==========  ==============================================================
-    ``xml``     Serializes to and from a simple XML dialect.
-
-    ``json``    Serializes to and from JSON_ (using a version of simplejson_
-                bundled with Django).
-
-    ``yaml``    Serializes to YAML (YAML Ain't a Markup Language). This
-                serializer is only available if PyYAML_ is installed.
-    ==========  ==============================================================
-
-.. _json: http://json.org/
-.. _simplejson: http://undefined.org/python/#simplejson
-.. _PyYAML: http://www.pyyaml.org/
-
-Notes for specific serialization formats
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-json
-^^^^
-
-If you're using UTF-8 (or any other non-ASCII encoding) data with the JSON
-serializer, you must pass ``ensure_ascii=False`` as a parameter to the
-``serialize()`` call. Otherwise, the output won't be encoded correctly.
-
-For example::
-
-    json_serializer = serializers.get_serializer("json")()
-    json_serializer.serialize(queryset, ensure_ascii=False, stream=response)
-
-The Django source code includes the simplejson_ module. However, if you're
-using Python 2.6 or later (which includes a builtin version of the module), Django will
-use the builtin ``json`` module automatically. If you have a system installed
-version that includes the C-based speedup extension, or your system version is
-more recent than the version shipped with Django (currently, 2.0.7), the
-system version will be used instead of the version included with Django.
-
-Be aware that if you're serializing using that module directly, not all Django
-output can be passed unmodified to simplejson. In particular, :ref:`lazy
-translation objects <lazy-translations>` need a `special encoder`_ written for
-them. Something like this will work::
-
-    from django.utils.functional import Promise
-    from django.utils.encoding import force_unicode
-
-    class LazyEncoder(simplejson.JSONEncoder):
-        def default(self, obj):
-            if isinstance(obj, Promise):
-                return force_unicode(obj)
-            return super(LazyEncoder, self).default(obj)
-
-.. _special encoder: http://svn.red-bean.com/bob/simplejson/tags/simplejson-1.7/docs/index.html
-
-.. _topics-serialization-natural-keys:
-
-Natural keys
-------------
-
-.. versionadded:: 1.2
-
-    The ability to use natural keys when serializing/deserializing data was
-    added in the 1.2 release.
-
-The default serialization strategy for foreign keys and many-to-many
-relations is to serialize the value of the primary key(s) of the
-objects in the relation. This strategy works well for most types of
-object, but it can cause difficulty in some circumstances.
-
-Consider the case of a list of objects that have foreign key on
-:class:`ContentType`. If you're going to serialize an object that
-refers to a content type, you need to have a way to refer to that
-content type. Content Types are automatically created by Django as
-part of the database synchronization process, so you don't need to
-include content types in a fixture or other serialized data. As a
-result, the primary key of any given content type isn't easy to
-predict - it will depend on how and when :djadmin:`syncdb` was
-executed to create the content types.
-
-There is also the matter of convenience. An integer id isn't always
-the most convenient way to refer to an object; sometimes, a
-more natural reference would be helpful.
-
-It is for these reasons that Django provides *natural keys*. A natural
-key is a tuple of values that can be used to uniquely identify an
-object instance without using the primary key value.
-
-Deserialization of natural keys
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Consider the following two models::
-
-    from django.db import models
-
-    class Person(models.Model):
-        first_name = models.CharField(max_length=100)
-        last_name = models.CharField(max_length=100)
-
-        birthdate = models.DateField()
-
-        class Meta:
-            unique_together = (('first_name', 'last_name'),)
-
-    class Book(models.Model):
-        name = models.CharField(max_length=100)
-        author = models.ForeignKey(Person)
-
-Ordinarily, serialized data for ``Book`` would use an integer to refer to
-the author. For example, in JSON, a Book might be serialized as::
-
-    ...
-    {
-        "pk": 1,
-        "model": "store.book",
-        "fields": {
-            "name": "Mostly Harmless",
-            "author": 42
-        }
-    }
-    ...
-
-This isn't a particularly natural way to refer to an author. It
-requires that you know the primary key value for the author; it also
-requires that this primary key value is stable and predictable.
-
-However, if we add natural key handling to Person, the fixture becomes
-much more humane. To add natural key handling, you define a default
-Manager for Person with a ``get_by_natural_key()`` method. In the case
-of a Person, a good natural key might be the pair of first and last
-name::
-
-    from django.db import models
-
-    class PersonManager(models.Manager):
-        def get_by_natural_key(self, first_name, last_name):
-            return self.get(first_name=first_name, last_name=last_name)
-
-    class Person(models.Model):
-        objects = PersonManager()
-
-        first_name = models.CharField(max_length=100)
-        last_name = models.CharField(max_length=100)
-
-        birthdate = models.DateField()
-
-        class Meta:
-            unique_together = (('first_name', 'last_name'),)
-
-Now books can use that natural key to refer to ``Person`` objects::
-
-    ...
-    {
-        "pk": 1,
-        "model": "store.book",
-        "fields": {
-            "name": "Mostly Harmless",
-            "author": ["Douglas", "Adams"]
-        }
-    }
-    ...
-
-When you try to load this serialized data, Django will use the
-``get_by_natural_key()`` method to resolve ``["Douglas", "Adams"]``
-into the primary key of an actual ``Person`` object.
-
-.. note::
-
-    Whatever fields you use for a natural key must be able to uniquely
-    identify an object. This will usually mean that your model will
-    have a uniqueness clause (either unique=True on a single field, or
-    ``unique_together`` over multiple fields) for the field or fields
-    in your natural key. However, uniqueness doesn't need to be
-    enforced at the database level. If you are certain that a set of
-    fields will be effectively unique, you can still use those fields
-    as a natural key.
-
-Serialization of natural keys
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-So how do you get Django to emit a natural key when serializing an object?
-Firstly, you need to add another method -- this time to the model itself::
-
-    class Person(models.Model):
-        objects = PersonManager()
-
-        first_name = models.CharField(max_length=100)
-        last_name = models.CharField(max_length=100)
-
-        birthdate = models.DateField()
-
-        def natural_key(self):
-            return (self.first_name, self.last_name)
-
-        class Meta:
-            unique_together = (('first_name', 'last_name'),)
-
-That method should always return a natural key tuple -- in this
-example, ``(first name, last name)``. Then, when you call
-``serializers.serialize()``, you provide a ``use_natural_keys=True``
-argument::
-
-    >>> serializers.serialize('json', [book1, book2], indent=2, use_natural_keys=True)
-
-When ``use_natural_keys=True`` is specified, Django will use the
-``natural_key()`` method to serialize any reference to objects of the
-type that defines the method.
-
-If you are using :djadmin:`dumpdata` to generate serialized data, you
-use the `--natural` command line flag to generate natural keys.
-
-.. note::
-
-    You don't need to define both ``natural_key()`` and
-    ``get_by_natural_key()``. If you don't want Django to output
-    natural keys during serialization, but you want to retain the
-    ability to load natural keys, then you can opt to not implement
-    the ``natural_key()`` method.
-
-    Conversely, if (for some strange reason) you want Django to output
-    natural keys during serialization, but *not* be able to load those
-    key values, just don't define the ``get_by_natural_key()`` method.
-
-Dependencies during serialization
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Since natural keys rely on database lookups to resolve references, it
-is important that data exists before it is referenced. You can't make
-a `forward reference` with natural keys - the data you are referencing
-must exist before you include a natural key reference to that data.
-
-To accommodate this limitation, calls to :djadmin:`dumpdata` that use
-the :djadminopt:`--natural` option will serialize any model with a
-``natural_key()`` method before it serializes normal key objects.
-
-However, this may not always be enough. If your natural key refers to
-another object (by using a foreign key or natural key to another object
-as part of a natural key), then you need to be able to ensure that
-the objects on which a natural key depends occur in the serialized data
-before the natural key requires them.
-
-To control this ordering, you can define dependencies on your
-``natural_key()`` methods. You do this by setting a ``dependencies``
-attribute on the ``natural_key()`` method itself.
-
-For example, consider the ``Permission`` model in ``contrib.auth``.
-The following is a simplified version of the ``Permission`` model::
-
-    class Permission(models.Model):
-        name = models.CharField(max_length=50)
-        content_type = models.ForeignKey(ContentType)
-        codename = models.CharField(max_length=100)
-        # ...
-        def natural_key(self):
-            return (self.codename,) + self.content_type.natural_key()
-
-The natural key for a ``Permission`` is a combination of the codename for the
-``Permission``, and the ``ContentType`` to which the ``Permission`` applies. This means
-that ``ContentType`` must be serialized before ``Permission``. To define this
-dependency, we add one extra line::
-
-    class Permission(models.Model):
-        # ...
-        def natural_key(self):
-            return (self.codename,) + self.content_type.natural_key()
-        natural_key.dependencies = ['contenttypes.contenttype']
-
-This definition ensures that ``ContentType`` models are serialized before
-``Permission`` models. In turn, any object referencing ``Permission`` will
-be serialized after both ``ContentType`` and ``Permission``.
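
The ordering rule described at the end of the removed document (a model is emitted only after everything listed in its ``natural_key.dependencies``) can be illustrated with a small depth-first sort. This is a minimal sketch in plain Python, not Django's own implementation: the function name ``sort_by_dependencies``, the ``declared`` mapping, and the ``auth.group`` entry are hypothetical stand-ins chosen for illustration.

```python
def sort_by_dependencies(dependencies):
    """Return model labels ordered so each label follows its dependencies.

    ``dependencies`` is a hypothetical mapping of model label -> list of
    labels, standing in for what ``natural_key.dependencies`` declares.
    """
    ordered = []
    visiting = set()

    def visit(label):
        if label in ordered:
            return  # already emitted
        if label in visiting:
            raise ValueError("circular natural key dependency at %r" % label)
        visiting.add(label)
        # Emit every dependency before the model that needs it.
        for dep in dependencies.get(label, []):
            visit(dep)
        visiting.discard(label)
        ordered.append(label)

    # Sort the starting labels so the result is deterministic.
    for label in sorted(dependencies):
        visit(label)
    return ordered


# Assumed example data: Permission depends on ContentType (as in the text);
# the group entry is an illustrative model that references Permission.
declared = {
    "auth.permission": ["contenttypes.contenttype"],
    "contenttypes.contenttype": [],
    "auth.group": ["auth.permission"],
}
order = sort_by_dependencies(declared)
print(order)
```

With this input, ``contenttypes.contenttype`` comes before ``auth.permission``, which in turn comes before the model that references it. A cycle in the declared dependencies raises an error instead of producing an order.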