Diffstat (limited to 'parts/django/docs/topics/db')
-rw-r--r-- | parts/django/docs/topics/db/aggregation.txt | 378 | ||||
-rw-r--r-- | parts/django/docs/topics/db/index.txt | 18 | ||||
-rw-r--r-- | parts/django/docs/topics/db/managers.txt | 376 | ||||
-rw-r--r-- | parts/django/docs/topics/db/models.txt | 1234 | ||||
-rw-r--r-- | parts/django/docs/topics/db/multi-db.txt | 574 | ||||
-rw-r--r-- | parts/django/docs/topics/db/optimization.txt | 260 | ||||
-rw-r--r-- | parts/django/docs/topics/db/queries.txt | 1110 | ||||
-rw-r--r-- | parts/django/docs/topics/db/sql.txt | 279 | ||||
-rw-r--r-- | parts/django/docs/topics/db/transactions.txt | 328 |
9 files changed, 4557 insertions, 0 deletions
diff --git a/parts/django/docs/topics/db/aggregation.txt b/parts/django/docs/topics/db/aggregation.txt new file mode 100644 index 0000000..eb21021 --- /dev/null +++ b/parts/django/docs/topics/db/aggregation.txt @@ -0,0 +1,378 @@ +=========== +Aggregation +=========== + +.. versionadded:: 1.1 + +.. currentmodule:: django.db.models + +The topic guide on :doc:`Django's database-abstraction API </topics/db/queries>` +described the way that you can use Django queries that create, +retrieve, update and delete individual objects. However, sometimes you will +need to retrieve values that are derived by summarizing or *aggregating* a +collection of objects. This topic guide describes the ways that aggregate values +can be generated and returned using Django queries. + +Throughout this guide, we'll refer to the following models. These models are +used to track the inventory for a series of online bookstores: + +.. _queryset-model-example: + +.. code-block:: python + + class Author(models.Model): + name = models.CharField(max_length=100) + age = models.IntegerField() + friends = models.ManyToManyField('self', blank=True) + + class Publisher(models.Model): + name = models.CharField(max_length=300) + num_awards = models.IntegerField() + + class Book(models.Model): + isbn = models.CharField(max_length=9) + name = models.CharField(max_length=300) + pages = models.IntegerField() + price = models.DecimalField(max_digits=10, decimal_places=2) + rating = models.FloatField() + authors = models.ManyToManyField(Author) + publisher = models.ForeignKey(Publisher) + pubdate = models.DateField() + + class Store(models.Model): + name = models.CharField(max_length=300) + books = models.ManyToManyField(Book) + + +Generating aggregates over a QuerySet +===================================== + +Django provides two ways to generate aggregates. The first way is to generate +summary values over an entire ``QuerySet``. 
For example, say you wanted to +calculate the average price of all books available for sale. Django's query +syntax provides a means for describing the set of all books:: + + >>> Book.objects.all() + +What we need is a way to calculate summary values over the objects that +belong to this ``QuerySet``. This is done by appending an ``aggregate()`` +clause onto the ``QuerySet``:: + + >>> from django.db.models import Avg + >>> Book.objects.all().aggregate(Avg('price')) + {'price__avg': 34.35} + +The ``all()`` is redundant in this example, so this could be simplified to:: + + >>> Book.objects.aggregate(Avg('price')) + {'price__avg': 34.35} + +The argument to the ``aggregate()`` clause describes the aggregate value that +we want to compute - in this case, the average of the ``price`` field on the +``Book`` model. A list of the aggregate functions that are available can be +found in the :ref:`QuerySet reference <aggregation-functions>`. + +``aggregate()`` is a terminal clause for a ``QuerySet`` that, when invoked, +returns a dictionary of name-value pairs. The name is an identifier for the +aggregate value; the value is the computed aggregate. The name is +automatically generated from the name of the field and the aggregate function. +If you want to manually specify a name for the aggregate value, you can do so +by providing that name when you specify the aggregate clause:: + + >>> Book.objects.aggregate(average_price=Avg('price')) + {'average_price': 34.35} + +If you want to generate more than one aggregate, you just add another +argument to the ``aggregate()`` clause. 
So, if we also wanted to know +the maximum and minimum price of all books, we would issue the query:: + + >>> from django.db.models import Avg, Max, Min, Count + >>> Book.objects.aggregate(Avg('price'), Max('price'), Min('price')) + {'price__avg': 34.35, 'price__max': Decimal('81.20'), 'price__min': Decimal('12.99')} + +Generating aggregates for each item in a QuerySet +================================================= + +The second way to generate summary values is to generate an independent +summary for each object in a ``QuerySet``. For example, if you are retrieving +a list of books, you may want to know how many authors contributed to +each book. Each Book has a many-to-many relationship with the Author; we +want to summarize this relationship for each book in the ``QuerySet``. + +Per-object summaries can be generated using the ``annotate()`` clause. +When an ``annotate()`` clause is specified, each object in the ``QuerySet`` +will be annotated with the specified values. + +The syntax for these annotations is identical to that used for the +``aggregate()`` clause. Each argument to ``annotate()`` describes an +aggregate that is to be calculated. For example, to annotate Books with +the number of authors:: + + # Build an annotated queryset + >>> q = Book.objects.annotate(Count('authors')) + # Interrogate the first object in the queryset + >>> q[0] + <Book: The Definitive Guide to Django> + >>> q[0].authors__count + 2 + # Interrogate the second object in the queryset + >>> q[1] + <Book: Practical Django Projects> + >>> q[1].authors__count + 1 + +As with ``aggregate()``, the name for the annotation is automatically derived +from the name of the aggregate function and the name of the field being +aggregated. 
You can override this default name by providing an alias when you +specify the annotation:: + + >>> q = Book.objects.annotate(num_authors=Count('authors')) + >>> q[0].num_authors + 2 + >>> q[1].num_authors + 1 + +Unlike ``aggregate()``, ``annotate()`` is *not* a terminal clause. The output +of the ``annotate()`` clause is a ``QuerySet``; this ``QuerySet`` can be +modified using any other ``QuerySet`` operation, including ``filter()``, +``order_by``, or even additional calls to ``annotate()``. + +Joins and aggregates +==================== + +So far, we have dealt with aggregates over fields that belong to the +model being queried. However, sometimes the value you want to aggregate +will belong to a model that is related to the model you are querying. + +When specifying the field to be aggregated in an aggregate function, Django +will allow you to use the same :ref:`double underscore notation +<field-lookups-intro>` that is used when referring to related fields in +filters. Django will then handle any table joins that are required to retrieve +and aggregate the related value. + +For example, to find the price range of books offered in each store, +you could use the annotation:: + + >>> Store.objects.annotate(min_price=Min('books__price'), max_price=Max('books__price')) + +This tells Django to retrieve the Store model, join (through the +many-to-many relationship) with the Book model, and aggregate on the +price field of the book model to produce a minimum and maximum value. + +The same rules apply to the ``aggregate()`` clause. If you wanted to +know the lowest and highest price of any book that is available for sale +in a store, you could use the aggregate:: + + >>> Store.objects.aggregate(min_price=Min('books__price'), max_price=Max('books__price')) + +Join chains can be as deep as you require. 
For example, to extract the +age of the youngest author of any book available for sale, you could +issue the query:: + + >>> Store.objects.aggregate(youngest_age=Min('books__authors__age')) + +Aggregations and other QuerySet clauses +======================================= + +``filter()`` and ``exclude()`` +------------------------------ + +Aggregates can also participate in filters. Any ``filter()`` (or +``exclude()``) applied to normal model fields will have the effect of +constraining the objects that are considered for aggregation. + +When used with an ``annotate()`` clause, a filter has the effect of +constraining the objects for which an annotation is calculated. For example, +you can generate an annotated list of all books that have a title starting +with "Django" using the query:: + + >>> Book.objects.filter(name__startswith="Django").annotate(num_authors=Count('authors')) + +When used with an ``aggregate()`` clause, a filter has the effect of +constraining the objects over which the aggregate is calculated. +For example, you can generate the average price of all books with a +title that starts with "Django" using the query:: + + >>> Book.objects.filter(name__startswith="Django").aggregate(Avg('price')) + +Filtering on annotations +~~~~~~~~~~~~~~~~~~~~~~~~ + +Annotated values can also be filtered. The alias for the annotation can be +used in ``filter()`` and ``exclude()`` clauses in the same way as any other +model field. + +For example, to generate a list of books that have more than one author, +you can issue the query:: + + >>> Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1) + +This query generates an annotated result set, and then generates a filter +based upon that annotation. 
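The "annotate, then filter on the annotation" pattern above can be sketched in plain Python. This is only a rough illustration of the semantics, not Django code: the in-memory `books` list and the placeholder author names are assumptions made for the example, and no query is issued.

```python
# Plain-Python sketch of annotate(num_authors=Count('authors')).filter(num_authors__gt=1).
# The sample data is hypothetical; no Django API is used here.
books = [
    {"name": "The Definitive Guide to Django", "authors": ["author A", "author B"]},
    {"name": "Practical Django Projects", "authors": ["author C"]},
]

# Step 1: "annotate" every book with the number of its authors.
annotated = [dict(book, num_authors=len(book["authors"])) for book in books]

# Step 2: "filter" on the annotation, keeping books with more than one author.
multi_author = [book for book in annotated if book["num_authors"] > 1]

print([book["name"] for book in multi_author])
```

In the real ORM both steps are expressed on the ``QuerySet`` and run in the database; the sketch only mirrors the order of operations.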
+ +Order of ``annotate()`` and ``filter()`` clauses +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When developing a complex query that involves both ``annotate()`` and +``filter()`` clauses, particular attention should be paid to the order +in which the clauses are applied to the ``QuerySet``. + +When an ``annotate()`` clause is applied to a query, the annotation is +computed over the state of the query up to the point where the annotation +is requested. The practical implication of this is that ``filter()`` and +``annotate()`` are not commutative operations -- that is, there is a +difference between the query:: + + >>> Publisher.objects.annotate(num_books=Count('book')).filter(book__rating__gt=3.0) + +and the query:: + + >>> Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book')) + +Both queries will return a list of Publishers that have at least one good +book (i.e., a book with a rating exceeding 3.0). However, the annotation in +the first query will provide the total number of all books published by the +publisher; the second query will only include good books in the annotated +count. In the first query, the annotation precedes the filter, so the +filter has no effect on the annotation. In the second query, the filter +precedes the annotation, and as a result, the filter constrains the objects +considered when calculating the annotation. + +``order_by()`` +-------------- + +Annotations can be used as a basis for ordering. When you +define an ``order_by()`` clause, the aggregates you provide can reference +any alias defined as part of an ``annotate()`` clause in the query. 
+ +For example, to order a ``QuerySet`` of books by the number of authors +that have contributed to the book, you could use the following query:: + + >>> Book.objects.annotate(num_authors=Count('authors')).order_by('num_authors') + +``values()`` +------------ + +Ordinarily, annotations are generated on a per-object basis - an annotated +``QuerySet`` will return one result for each object in the original +``QuerySet``. However, when a ``values()`` clause is used to constrain the +columns that are returned in the result set, the method for evaluating +annotations is slightly different. Instead of returning an annotated result +for each result in the original ``QuerySet``, the original results are +grouped according to the unique combinations of the fields specified in the +``values()`` clause. An annotation is then provided for each unique group; +the annotation is computed over all members of the group. + +For example, consider an author query that attempts to find out the average +rating of books written by each author:: + + >>> Author.objects.annotate(average_rating=Avg('book__rating')) + +This will return one result for each author in the database, annotated with +their average book rating. + +However, the result will be slightly different if you use a ``values()`` clause:: + + >>> Author.objects.values('name').annotate(average_rating=Avg('book__rating')) + +In this example, the authors will be grouped by name, so you will only get +an annotated result for each *unique* author name. This means if you have +two authors with the same name, their results will be merged into a single +result in the output of the query; the average will be computed as the +average over the books written by both authors. + +Order of ``annotate()`` and ``values()`` clauses +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As with the ``filter()`` clause, the order in which ``annotate()`` and +``values()`` clauses are applied to a query is significant. 
If the +``values()`` clause precedes the ``annotate()``, the annotation will be +computed using the grouping described by the ``values()`` clause. + +However, if the ``annotate()`` clause precedes the ``values()`` clause, +the annotations will be generated over the entire query set. In this case, +the ``values()`` clause only constrains the fields that are generated on +output. + +For example, if we reverse the order of the ``values()`` and ``annotate()`` +clause from our previous example:: + + >>> Author.objects.annotate(average_rating=Avg('book__rating')).values('name', 'average_rating') + +This will now yield one unique result for each author; however, only +the author's name and the ``average_rating`` annotation will be returned +in the output data. + +You should also note that ``average_rating`` has been explicitly included +in the list of values to be returned. This is required because of the +ordering of the ``values()`` and ``annotate()`` clause. + +If the ``values()`` clause precedes the ``annotate()`` clause, any annotations +will be automatically added to the result set. However, if the ``values()`` +clause is applied after the ``annotate()`` clause, you need to explicitly +include the aggregate column. + +Interaction with default ordering or ``order_by()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Fields that are mentioned in the ``order_by()`` part of a queryset (or which +are used in the default ordering on a model) are used when selecting the +output data, even if they are not otherwise specified in the ``values()`` +call. These extra fields are used to group "like" results together and they +can make otherwise identical result rows appear to be separate. This shows up, +particularly, when counting things. 
+ +By way of example, suppose you have a model like this:: + + class Item(models.Model): + name = models.CharField(max_length=10) + data = models.IntegerField() + + class Meta: + ordering = ["name"] + +The important part here is the default ordering on the ``name`` field. If you +want to count how many times each distinct ``data`` value appears, you might +try this:: + + # Warning: not quite correct! + Item.objects.values("data").annotate(Count("id")) + +...which will group the ``Item`` objects by their common ``data`` values and +then count the number of ``id`` values in each group. Except that it won't +quite work. The default ordering by ``name`` will also play a part in the +grouping, so this query will group by distinct ``(data, name)`` pairs, which +isn't what you want. Instead, you should construct this queryset:: + + Item.objects.values("data").annotate(Count("id")).order_by() + +...clearing any ordering in the query. You could also order by, say, ``data`` +without any harmful effects, since that is already playing a role in the +query. + +This behavior is the same as that noted in the queryset documentation for +:meth:`~django.db.models.QuerySet.distinct` and the general rule is the same: +normally you won't want extra columns playing a part in the result, so clear +out the ordering, or at least make sure it's restricted only to those fields +you also select in a ``values()`` call. + +.. note:: + You might reasonably ask why Django doesn't remove the extraneous columns + for you. The main reason is consistency with ``distinct()`` and other + places: Django **never** removes ordering constraints that you have + specified (and we can't change those other methods' behavior, as that + would violate our :doc:`/misc/api-stability` policy). + +Aggregating annotations +----------------------- + +You can also generate an aggregate on the result of an annotation. 
When you +define an ``aggregate()`` clause, the aggregates you provide can reference +any alias defined as part of an ``annotate()`` clause in the query. + +For example, if you wanted to calculate the average number of authors per +book you first annotate the set of books with the author count, then +aggregate that author count, referencing the annotation field:: + + >>> Book.objects.annotate(num_authors=Count('authors')).aggregate(Avg('num_authors')) + {'num_authors__avg': 1.66} diff --git a/parts/django/docs/topics/db/index.txt b/parts/django/docs/topics/db/index.txt new file mode 100644 index 0000000..c49f158 --- /dev/null +++ b/parts/django/docs/topics/db/index.txt @@ -0,0 +1,18 @@ +Models and databases +==================== + +A model is the single, definitive source of data about your data. It contains +the essential fields and behaviors of the data you're storing. Generally, each +model maps to a single database table. + +.. toctree:: + :maxdepth: 1 + + models + queries + aggregation + managers + sql + transactions + multi-db + optimization diff --git a/parts/django/docs/topics/db/managers.txt b/parts/django/docs/topics/db/managers.txt new file mode 100644 index 0000000..5ebe0b1 --- /dev/null +++ b/parts/django/docs/topics/db/managers.txt @@ -0,0 +1,376 @@ +======== +Managers +======== + +.. currentmodule:: django.db.models + +.. class:: Manager() + +A ``Manager`` is the interface through which database query operations are +provided to Django models. At least one ``Manager`` exists for every model in +a Django application. + +The way ``Manager`` classes work is documented in :doc:`/topics/db/queries`; +this document specifically touches on model options that customize ``Manager`` +behavior. + +.. _manager-names: + +Manager names +============= + +By default, Django adds a ``Manager`` with the name ``objects`` to every Django +model class. 
However, if you want to use ``objects`` as a field name, or if you +want to use a name other than ``objects`` for the ``Manager``, you can rename +it on a per-model basis. To rename the ``Manager`` for a given class, define a +class attribute of type ``models.Manager()`` on that model. For example:: + + from django.db import models + + class Person(models.Model): + #... + people = models.Manager() + +Using this example model, ``Person.objects`` will generate an +``AttributeError`` exception, but ``Person.people.all()`` will provide a list +of all ``Person`` objects. + +.. _custom-managers: + +Custom Managers +=============== + +You can use a custom ``Manager`` in a particular model by extending the base +``Manager`` class and instantiating your custom ``Manager`` in your model. + +There are two reasons you might want to customize a ``Manager``: to add extra +``Manager`` methods, and/or to modify the initial ``QuerySet`` the ``Manager`` +returns. + +Adding extra Manager methods +---------------------------- + +Adding extra ``Manager`` methods is the preferred way to add "table-level" +functionality to your models. (For "row-level" functionality -- i.e., functions +that act on a single instance of a model object -- use :ref:`Model methods +<model-methods>`, not custom ``Manager`` methods.) + +A custom ``Manager`` method can return anything you want. It doesn't have to +return a ``QuerySet``. 
+ +For example, this custom ``Manager`` offers a method ``with_counts()``, which +returns a list of all ``OpinionPoll`` objects, each with an extra +``num_responses`` attribute that is the result of an aggregate query:: + + class PollManager(models.Manager): + def with_counts(self): + from django.db import connection + cursor = connection.cursor() + cursor.execute(""" + SELECT p.id, p.question, p.poll_date, COUNT(*) + FROM polls_opinionpoll p, polls_response r + WHERE p.id = r.poll_id + GROUP BY 1, 2, 3 + ORDER BY 3 DESC""") + result_list = [] + for row in cursor.fetchall(): + p = self.model(id=row[0], question=row[1], poll_date=row[2]) + p.num_responses = row[3] + result_list.append(p) + return result_list + + class OpinionPoll(models.Model): + question = models.CharField(max_length=200) + poll_date = models.DateField() + objects = PollManager() + + class Response(models.Model): + poll = models.ForeignKey(OpinionPoll) + person_name = models.CharField(max_length=50) + response = models.TextField() + +With this example, you'd use ``OpinionPoll.objects.with_counts()`` to return +that list of ``OpinionPoll`` objects with ``num_responses`` attributes. + +Another thing to note about this example is that ``Manager`` methods can +access ``self.model`` to get the model class to which they're attached. + +Modifying initial Manager QuerySets +----------------------------------- + +A ``Manager``'s base ``QuerySet`` returns all objects in the system. For +example, using this model:: + + class Book(models.Model): + title = models.CharField(max_length=100) + author = models.CharField(max_length=50) + +...the statement ``Book.objects.all()`` will return all books in the database. + +You can override a ``Manager``\'s base ``QuerySet`` by overriding the +``Manager.get_query_set()`` method. ``get_query_set()`` should return a +``QuerySet`` with the properties you require. 
+ +For example, the following model has *two* ``Manager``\s -- one that returns +all objects, and one that returns only the books by Roald Dahl:: + + # First, define the Manager subclass. + class DahlBookManager(models.Manager): + def get_query_set(self): + return super(DahlBookManager, self).get_query_set().filter(author='Roald Dahl') + + # Then hook it into the Book model explicitly. + class Book(models.Model): + title = models.CharField(max_length=100) + author = models.CharField(max_length=50) + + objects = models.Manager() # The default manager. + dahl_objects = DahlBookManager() # The Dahl-specific manager. + +With this sample model, ``Book.objects.all()`` will return all books in the +database, but ``Book.dahl_objects.all()`` will only return the ones written by +Roald Dahl. + +Of course, because ``get_query_set()`` returns a ``QuerySet`` object, you can +use ``filter()``, ``exclude()`` and all the other ``QuerySet`` methods on it. +So these statements are all legal:: + + Book.dahl_objects.all() + Book.dahl_objects.filter(title='Matilda') + Book.dahl_objects.count() + +This example also pointed out another interesting technique: using multiple +managers on the same model. You can attach as many ``Manager()`` instances to +a model as you'd like. This is an easy way to define common "filters" for your +models. 
+ +For example:: + + class MaleManager(models.Manager): + def get_query_set(self): + return super(MaleManager, self).get_query_set().filter(sex='M') + + class FemaleManager(models.Manager): + def get_query_set(self): + return super(FemaleManager, self).get_query_set().filter(sex='F') + + class Person(models.Model): + first_name = models.CharField(max_length=50) + last_name = models.CharField(max_length=50) + sex = models.CharField(max_length=1, choices=(('M', 'Male'), ('F', 'Female'))) + people = models.Manager() + men = MaleManager() + women = FemaleManager() + +This example allows you to request ``Person.men.all()``, ``Person.women.all()``, +and ``Person.people.all()``, yielding predictable results. + +If you use custom ``Manager`` objects, take note that the first ``Manager`` +Django encounters (in the order in which they're defined in the model) has a +special status. Django interprets the first ``Manager`` defined in a class as +the "default" ``Manager``, and several parts of Django +(including :djadmin:`dumpdata`) will use that ``Manager`` +exclusively for that model. As a result, it's a good idea to be careful in +your choice of default manager in order to avoid a situation where overriding +``get_query_set()`` results in an inability to retrieve objects you'd like to +work with. + +.. _managers-for-related-objects: + +Using managers for related object access +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By default, Django uses an instance of a "plain" manager class when accessing +related objects (i.e. ``choice.poll``), not the default manager on the related +object. This is because Django needs to be able to retrieve the related +object, even if it would otherwise be filtered out (and hence be inaccessible) +by the default manager. 
+ +If the normal plain manager class (:class:`django.db.models.Manager`) is not +appropriate for your circumstances, you can force Django to use the same class +as the default manager for your model by setting the ``use_for_related_fields`` +attribute on the manager class. This is documented fully below_. + +.. _below: manager-types_ + +.. _custom-managers-and-inheritance: + +Custom managers and model inheritance +------------------------------------- + +Class inheritance and model managers aren't quite a perfect match for each +other. Managers are often specific to the classes they are defined on and +inheriting them in subclasses isn't necessarily a good idea. Also, because the +first manager declared is the *default manager*, it is important to allow that +to be controlled. So here's how Django handles custom managers and +:ref:`model inheritance <model-inheritance>`: + + 1. Managers defined on non-abstract base classes are *not* inherited by + child classes. If you want to reuse a manager from a non-abstract base, + redeclare it explicitly on the child class. These sorts of managers are + likely to be fairly specific to the class they are defined on, so + inheriting them can often lead to unexpected results (particularly as + far as the default manager goes). Therefore, they aren't passed onto + child classes. + + 2. Managers from abstract base classes are always inherited by the child + class, using Python's normal name resolution order (names on the child + class override all others; then come names on the first parent class, + and so on). Abstract base classes are designed to capture information + and behavior that is common to their child classes. Defining common + managers is an appropriate part of this common information. + + 3. The default manager on a class is either the first manager declared on + the class, if that exists, or the default manager of the first abstract + base class in the parent hierarchy, if that exists. 
If no default + manager is explicitly declared, Django's normal default manager is + used. + +These rules provide the necessary flexibility if you want to install a +collection of custom managers on a group of models, via an abstract base +class, but still customize the default manager. For example, suppose you have +this base class:: + + class AbstractBase(models.Model): + ... + objects = CustomManager() + + class Meta: + abstract = True + +If you use this directly in a subclass, ``objects`` will be the default +manager if you declare no managers in the base class:: + + class ChildA(AbstractBase): + ... + # This class has CustomManager as the default manager. + +If you want to inherit from ``AbstractBase``, but provide a different default +manager, you can provide the default manager on the child class:: + + class ChildB(AbstractBase): + ... + # An explicit default manager. + default_manager = OtherManager() + +Here, ``default_manager`` is the default. The ``objects`` manager is +still available, since it's inherited. It just isn't used as the default. + +Finally for this example, suppose you want to add extra managers to the child +class, but still use the default from ``AbstractBase``. You can't add the new +manager directly in the child class, as that would override the default and you would +have to also explicitly include all the managers from the abstract base class. +The solution is to put the extra managers in another base class and introduce +it into the inheritance hierarchy *after* the defaults:: + + class ExtraManager(models.Model): + extra_manager = OtherManager() + + class Meta: + abstract = True + + class ChildC(AbstractBase, ExtraManager): + ... + # Default manager is CustomManager, but OtherManager is + # also available via the "extra_manager" attribute. + +.. 
_manager-types: + +Controlling Automatic Manager Types +=================================== + +This document has already mentioned a couple of places where Django creates a +manager class for you: `default managers`_ and the "plain" manager used to +`access related objects`_. There are other places in the implementation of +Django where temporary plain managers are needed. Those automatically created +managers will normally be instances of the :class:`django.db.models.Manager` +class. + +.. _default managers: manager-names_ +.. _access related objects: managers-for-related-objects_ + +Throughout this section, we will use the term "automatic manager" to mean a +manager that Django creates for you -- either as a default manager on a model +with no managers, or to use temporarily when accessing related objects. + +Sometimes this default class won't be the right choice. One example is in the +:mod:`django.contrib.gis` application that ships with Django itself. All ``gis`` +models must use a special manager class (:class:`~django.contrib.gis.db.models.GeoManager`) +because they need a special queryset (:class:`~django.contrib.gis.db.models.GeoQuerySet`) +to be used for interacting with the database. It turns out that models which require +a special manager like this need to use the same manager class wherever an automatic +manager is created. + +Django provides a way for custom manager developers to say that their manager +class should be used for automatic managers whenever it is the default manager +on a model. This is done by setting the ``use_for_related_fields`` attribute on +the manager class:: + + class MyManager(models.Manager): + use_for_related_fields = True + + ... + +If this attribute is set on the *default* manager for a model (only the +default manager is considered in these situations), Django will use that class +whenever it needs to automatically create a manager for the class. Otherwise, +it will use :class:`django.db.models.Manager`. + +.. 
admonition:: Historical Note + + Given the purpose for which it's used, the name of this attribute + (``use_for_related_fields``) might seem a little odd. Originally, the + attribute only controlled the type of manager used for related field + access, which is where the name came from. Although it became clear that the concept + was more broadly useful, the name hasn't been changed. This is primarily + so that existing code will :doc:`continue to work </misc/api-stability>` in + future Django versions. + +Writing Correct Managers For Use In Automatic Manager Instances +--------------------------------------------------------------- + +As already suggested by the ``django.contrib.gis`` example, above, the +``use_for_related_fields`` feature is primarily for managers that need to +return a custom ``QuerySet`` subclass. In providing this functionality in your +manager, there are a couple of things to remember. + +Do not filter away any results in this type of manager subclass +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +One reason an automatic manager is used is to access objects that are referred +to from some other model. In those situations, Django has to be able to see +all the objects for the model it is fetching, so that *anything* which is +referred to can be retrieved. + +If you override the ``get_query_set()`` method and filter out any rows, Django +will return incorrect results. Don't do that. A manager that filters results +in ``get_query_set()`` is not appropriate for use as an automatic manager. + +Set ``use_for_related_fields`` when you define the class +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``use_for_related_fields`` attribute must be set on the manager *class*, +not on an *instance* of the class. The earlier example shows the +correct way to set it, whereas the following will not work:: + + # BAD: Incorrect code + class MyManager(models.Manager): + ... + + # Sets the attribute on an instance of MyManager. 
Django will + # ignore this setting. + mgr = MyManager() + mgr.use_for_related_fields = True + + class MyModel(models.Model): + ... + objects = mgr + + # End of incorrect code. + +You also shouldn't change the attribute on the class object after it has been +used in a model, since the attribute's value is processed when the model class +is created and not subsequently reread. Set the attribute on the manager class +when it is first defined, as in the initial example of this section and +everything will work smoothly. + diff --git a/parts/django/docs/topics/db/models.txt b/parts/django/docs/topics/db/models.txt new file mode 100644 index 0000000..2a19cbd --- /dev/null +++ b/parts/django/docs/topics/db/models.txt @@ -0,0 +1,1234 @@ +====== +Models +====== + +.. module:: django.db.models + +A model is the single, definitive source of data about your data. It contains +the essential fields and behaviors of the data you're storing. Generally, each +model maps to a single database table. + +The basics: + + * Each model is a Python class that subclasses + :class:`django.db.models.Model`. + + * Each attribute of the model represents a database field. + + * With all of this, Django gives you an automatically-generated + database-access API; see :doc:`/topics/db/queries`. + +.. seealso:: + + A companion to this document is the `official repository of model + examples`_. (In the Django source distribution, these examples are in the + ``tests/modeltests`` directory.) + + .. _official repository of model examples: http://www.djangoproject.com/documentation/models/ + +Quick example +============= + +This example model defines a ``Person``, which has a ``first_name`` and +``last_name``:: + + from django.db import models + + class Person(models.Model): + first_name = models.CharField(max_length=30) + last_name = models.CharField(max_length=30) + +``first_name`` and ``last_name`` are fields_ of the model. 
Each field is +specified as a class attribute, and each attribute maps to a database column. + +The above ``Person`` model would create a database table like this: + +.. code-block:: sql + + CREATE TABLE myapp_person ( + "id" serial NOT NULL PRIMARY KEY, + "first_name" varchar(30) NOT NULL, + "last_name" varchar(30) NOT NULL + ); + +Some technical notes: + + * The name of the table, ``myapp_person``, is automatically derived from + some model metadata but can be overridden. See :ref:`table-names` for more + details. + + * An ``id`` field is added automatically, but this behavior can be + overridden. See :ref:`automatic-primary-key-fields`. + + * The ``CREATE TABLE`` SQL in this example is formatted using PostgreSQL + syntax, but it's worth noting Django uses SQL tailored to the database + backend specified in your :doc:`settings file </topics/settings>`. + +Using models +============ + +Once you have defined your models, you need to tell Django you're going to *use* +those models. Do this by editing your settings file and changing the +:setting:`INSTALLED_APPS` setting to add the name of the module that contains +your ``models.py``. + +For example, if the models for your application live in the module +``mysite.myapp.models`` (the package structure that is created for an +application by the :djadmin:`manage.py startapp <startapp>` script), +:setting:`INSTALLED_APPS` should read, in part:: + + INSTALLED_APPS = ( + #... + 'mysite.myapp', + #... + ) + +When you add new apps to :setting:`INSTALLED_APPS`, be sure to run +:djadmin:`manage.py syncdb <syncdb>`. + +Fields +====== + +The most important part of a model -- and the only required part of a model -- +is the list of database fields it defines. Fields are specified by class +attributes.
+ +Example:: + + class Musician(models.Model): + first_name = models.CharField(max_length=50) + last_name = models.CharField(max_length=50) + instrument = models.CharField(max_length=100) + + class Album(models.Model): + artist = models.ForeignKey(Musician) + name = models.CharField(max_length=100) + release_date = models.DateField() + num_stars = models.IntegerField() + +Field types +----------- + +Each field in your model should be an instance of the appropriate +:class:`~django.db.models.Field` class. Django uses the field class types to +determine a few things: + + * The database column type (e.g. ``INTEGER``, ``VARCHAR``). + + * The :doc:`widget </ref/forms/widgets>` to use in Django's admin interface, + if you care to use it (e.g. ``<input type="text">``, ``<select>``). + + * The minimal validation requirements, used in Django's admin and in + automatically-generated forms. + +Django ships with dozens of built-in field types; you can find the complete list +in the :ref:`model field reference <model-field-types>`. You can easily write +your own fields if Django's built-in ones don't do the trick; see +:doc:`/howto/custom-model-fields`. + +Field options +------------- + +Each field takes a certain set of field-specific arguments (documented in the +:ref:`model field reference <model-field-types>`). For example, +:class:`~django.db.models.CharField` (and its subclasses) require a +:attr:`~django.db.models.CharField.max_length` argument which specifies the size +of the ``VARCHAR`` database field used to store the data. + +There's also a set of common arguments available to all field types. All are +optional. They're fully explained in the :ref:`reference +<common-model-field-options>`, but here's a quick summary of the most often-used +ones: + + :attr:`~Field.null` + If ``True``, Django will store empty values as ``NULL`` in the database. + Default is ``False``. + + :attr:`~Field.blank` + If ``True``, the field is allowed to be blank. Default is ``False``. 
+ + Note that this is different than :attr:`~Field.null`. + :attr:`~Field.null` is purely database-related, whereas + :attr:`~Field.blank` is validation-related. If a field has + :attr:`blank=True <Field.blank>`, validation on Django's admin site will + allow entry of an empty value. If a field has :attr:`blank=False + <Field.blank>`, the field will be required. + + :attr:`~Field.choices` + An iterable (e.g., a list or tuple) of 2-tuples to use as choices for + this field. If this is given, Django's admin will use a select box + instead of the standard text field and will limit choices to the choices + given. + + A choices list looks like this:: + + YEAR_IN_SCHOOL_CHOICES = ( + (u'FR', u'Freshman'), + (u'SO', u'Sophomore'), + (u'JR', u'Junior'), + (u'SR', u'Senior'), + (u'GR', u'Graduate'), + ) + + The first element in each tuple is the value that will be stored in the + database, the second element will be displayed by the admin interface, + or in a ModelChoiceField. Given an instance of a model object, the + display value for a choices field can be accessed using the + ``get_FOO_display`` method. For example:: + + from django.db import models + + class Person(models.Model): + GENDER_CHOICES = ( + (u'M', u'Male'), + (u'F', u'Female'), + ) + name = models.CharField(max_length=60) + gender = models.CharField(max_length=2, choices=GENDER_CHOICES) + + :: + + >>> p = Person(name="Fred Flinstone", gender="M") + >>> p.save() + >>> p.gender + u'M' + >>> p.get_gender_display() + u'Male' + + :attr:`~Field.default` + The default value for the field. This can be a value or a callable + object. If callable it will be called every time a new object is + created. + + :attr:`~Field.help_text` + Extra "help" text to be displayed under the field on the object's admin + form. It's useful for documentation even if your object doesn't have an + admin form. + + :attr:`~Field.primary_key` + If ``True``, this field is the primary key for the model. 
+ + If you don't specify :attr:`primary_key=True <Field.primary_key>` for + any fields in your model, Django will automatically add an + :class:`~django.db.models.AutoField` to hold the primary key, so you don't need to set + :attr:`primary_key=True <Field.primary_key>` on any of your fields + unless you want to override the default primary-key behavior. For more, + see :ref:`automatic-primary-key-fields`. + + :attr:`~Field.unique` + If ``True``, this field must be unique throughout the table. + +Again, these are just short descriptions of the most common field options. Full +details can be found in the :ref:`common model field option reference +<common-model-field-options>`. + +.. _automatic-primary-key-fields: + +Automatic primary key fields +---------------------------- + +By default, Django gives each model the following field:: + + id = models.AutoField(primary_key=True) + +This is an auto-incrementing primary key. + +If you'd like to specify a custom primary key, just specify +:attr:`primary_key=True <Field.primary_key>` on one of your fields. If Django +sees you've explicitly set :attr:`Field.primary_key`, it won't add the automatic +``id`` column. + +Each model requires exactly one field to have :attr:`primary_key=True +<Field.primary_key>` (either explicitly declared or automatically added). + +.. _verbose-field-names: + +Verbose field names +------------------- + +Each field type, except for :class:`~django.db.models.ForeignKey`, +:class:`~django.db.models.ManyToManyField` and +:class:`~django.db.models.OneToOneField`, takes an optional first positional +argument -- a verbose name. If the verbose name isn't given, Django will +automatically create it using the field's attribute name, converting underscores +to spaces.
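The default naming rule described here amounts to a simple string substitution. The following is a minimal sketch of that rule only; the helper name ``default_verbose_name`` is hypothetical and is not part of Django's API:

```python
def default_verbose_name(attribute_name):
    """Derive a human-readable name from a field's attribute name.

    Illustrative helper (an assumption, not a Django function): it only
    mirrors the documented rule of converting underscores to spaces.
    """
    return attribute_name.replace('_', ' ')


print(default_verbose_name('first_name'))  # prints: first name
```

A field declared as ``first_name = models.CharField(max_length=30)`` would therefore be labelled ``"first name"`` unless a verbose name is given explicitly.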
+ +In this example, the verbose name is ``"person's first name"``:: + + first_name = models.CharField("person's first name", max_length=30) + +In this example, the verbose name is ``"first name"``:: + + first_name = models.CharField(max_length=30) + +:class:`~django.db.models.ForeignKey`, +:class:`~django.db.models.ManyToManyField` and +:class:`~django.db.models.OneToOneField` require the first argument to be a +model class, so use the :attr:`~Field.verbose_name` keyword argument:: + + poll = models.ForeignKey(Poll, verbose_name="the related poll") + sites = models.ManyToManyField(Site, verbose_name="list of sites") + place = models.OneToOneField(Place, verbose_name="related place") + +The convention is not to capitalize the first letter of the +:attr:`~Field.verbose_name`. Django will automatically capitalize the first +letter where it needs to. + +Relationships +------------- + +Clearly, the power of relational databases lies in relating tables to each +other. Django offers ways to define the three most common types of database +relationships: many-to-one, many-to-many and one-to-one. + +Many-to-one relationships +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To define a many-to-one relationship, use :class:`~django.db.models.ForeignKey`. +You use it just like any other :class:`~django.db.models.Field` type: by +including it as a class attribute of your model. + +:class:`~django.db.models.ForeignKey` requires a positional argument: the class +to which the model is related. + +For example, if a ``Car`` model has a ``Manufacturer`` -- that is, a +``Manufacturer`` makes multiple cars but each ``Car`` only has one +``Manufacturer`` -- use the following definitions:: + + class Manufacturer(models.Model): + # ... + + class Car(models.Model): + manufacturer = models.ForeignKey(Manufacturer) + # ... 
+ +You can also create :ref:`recursive relationships <recursive-relationships>` (an +object with a many-to-one relationship to itself) and :ref:`relationships to +models not yet defined <lazy-relationships>`; see :ref:`the model field +reference <ref-foreignkey>` for details. + +It's suggested, but not required, that the name of a +:class:`~django.db.models.ForeignKey` field (``manufacturer`` in the example +above) be the name of the model, lowercase. You can, of course, call the field +whatever you want. For example:: + + class Car(models.Model): + company_that_makes_it = models.ForeignKey(Manufacturer) + # ... + +.. seealso:: + + See the `Many-to-one relationship model example`_ for a full example. + +.. _Many-to-one relationship model example: http://www.djangoproject.com/documentation/models/many_to_one/ + +:class:`~django.db.models.ForeignKey` fields also accept a number of extra +arguments which are explained in :ref:`the model field reference +<foreign-key-arguments>`. These options help define how the relationship should +work; all are optional. + +Many-to-many relationships +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To define a many-to-many relationship, use +:class:`~django.db.models.ManyToManyField`. You use it just like any other +:class:`~django.db.models.Field` type: by including it as a class attribute of +your model. + +:class:`~django.db.models.ManyToManyField` requires a positional argument: the +class to which the model is related. + +For example, if a ``Pizza`` has multiple ``Topping`` objects -- that is, a +``Topping`` can be on multiple pizzas and each ``Pizza`` has multiple toppings +-- here's how you'd represent that:: + + class Topping(models.Model): + # ... + + class Pizza(models.Model): + # ... 
+ toppings = models.ManyToManyField(Topping) + +As with :class:`~django.db.models.ForeignKey`, you can also create +:ref:`recursive relationships <recursive-relationships>` (an object with a +many-to-many relationship to itself) and :ref:`relationships to models not yet +defined <lazy-relationships>`; see :ref:`the model field reference +<ref-manytomany>` for details. + +It's suggested, but not required, that the name of a +:class:`~django.db.models.ManyToManyField` (``toppings`` in the example above) +be a plural describing the set of related model objects. + +It doesn't matter which model gets the +:class:`~django.db.models.ManyToManyField`, but you only need it in one of the +models -- not in both. + +Generally, :class:`~django.db.models.ManyToManyField` instances should go in the +object that's going to be edited in the admin interface, if you're using +Django's admin. In the above example, ``toppings`` is in ``Pizza`` (rather than +``Topping`` having a ``pizzas`` :class:`~django.db.models.ManyToManyField` ) +because it's more natural to think about a pizza having toppings than a +topping being on multiple pizzas. The way it's set up above, the ``Pizza`` admin +form would let users select the toppings. + +.. seealso:: + + See the `Many-to-many relationship model example`_ for a full example. + +.. _Many-to-many relationship model example: http://www.djangoproject.com/documentation/models/many_to_many/ + +:class:`~django.db.models.ManyToManyField` fields also accept a number of extra +arguments which are explained in :ref:`the model field reference +<manytomany-arguments>`. These options help define how the relationship should +work; all are optional. + +.. _intermediary-manytomany: + +Extra fields on many-to-many relationships +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. 
versionadded:: 1.0 + +When you're only dealing with simple many-to-many relationships such as +mixing and matching pizzas and toppings, a standard :class:`~django.db.models.ManyToManyField` is all you need. However, sometimes +you may need to associate data with the relationship between two models. + +For example, consider the case of an application tracking the musical groups +which musicians belong to. There is a many-to-many relationship between a person +and the groups of which they are a member, so you could use a +:class:`~django.db.models.ManyToManyField` to represent this relationship. +However, there is a lot of detail about the membership that you might want to +collect, such as the date at which the person joined the group. + +For these situations, Django allows you to specify the model that will be used +to govern the many-to-many relationship. You can then put extra fields on the +intermediate model. The intermediate model is associated with the +:class:`~django.db.models.ManyToManyField` using the +:attr:`through <ManyToManyField.through>` argument to point to the model +that will act as an intermediary. For our musician example, the code would look +something like this:: + + class Person(models.Model): + name = models.CharField(max_length=128) + + def __unicode__(self): + return self.name + + class Group(models.Model): + name = models.CharField(max_length=128) + members = models.ManyToManyField(Person, through='Membership') + + def __unicode__(self): + return self.name + + class Membership(models.Model): + person = models.ForeignKey(Person) + group = models.ForeignKey(Group) + date_joined = models.DateField() + invite_reason = models.CharField(max_length=64) + +When you set up the intermediary model, you explicitly specify foreign +keys to the models that are involved in the ManyToMany relation. This +explicit declaration defines how the two models are related. 
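The role of the intermediate model can be pictured without a database: each membership is a record that points at one person and one group and carries the extra data. The following is a plain-Python analogy only, assuming Python 3 dataclasses. The classes and the ``members_of`` and ``joined_after`` helpers are illustrative and are not Django code:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Group:
    name: str


@dataclass(frozen=True)
class Membership:
    # The two "foreign keys" that tie one person to one group...
    person: Person
    group: Group
    # ...plus the extra data that belongs to the relationship itself.
    date_joined: date
    invite_reason: str


def members_of(group, memberships):
    """Return the people linked to ``group`` through a membership record."""
    return [m.person for m in memberships if m.group == group]


def joined_after(group, memberships, cutoff):
    """Return members of ``group`` whose membership started after ``cutoff``."""
    return [m.person for m in memberships
            if m.group == group and m.date_joined > cutoff]


ringo = Person("Ringo Starr")
paul = Person("Paul McCartney")
beatles = Group("The Beatles")
memberships = [
    Membership(ringo, beatles, date(1962, 8, 16), "Needed a new drummer."),
    Membership(paul, beatles, date(1960, 8, 1), "Wanted to form a band."),
]

print([p.name for p in members_of(beatles, memberships)])
print([p.name for p in joined_after(beatles, memberships, date(1961, 1, 1))])
```

In this picture, a relationship exists only where a ``Membership`` record exists, and the extra fields live on that record rather than on either side of the relation.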
+ +There are a few restrictions on the intermediate model: + + * Your intermediate model must contain one - and *only* one - foreign key + to the target model (this would be ``Person`` in our example). If you + have more than one foreign key, a validation error will be raised. + + * Your intermediate model must contain one - and *only* one - foreign key + to the source model (this would be ``Group`` in our example). If you + have more than one foreign key, a validation error will be raised. + + * The only exception to this is a model which has a many-to-many + relationship to itself, through an intermediary model. In this + case, two foreign keys to the same model are permitted, but they + will be treated as the two (different) sides of the many-to-many + relation. + + * When defining a many-to-many relationship from a model to + itself, using an intermediary model, you *must* use + :attr:`symmetrical=False <ManyToManyField.symmetrical>` (see + :ref:`the model field reference <manytomany-arguments>`). + +Now that you have set up your :class:`~django.db.models.ManyToManyField` to use +your intermediary model (``Membership``, in this case), you're ready to start +creating some many-to-many relationships. You do this by creating instances of +the intermediate model:: + + >>> ringo = Person.objects.create(name="Ringo Starr") + >>> paul = Person.objects.create(name="Paul McCartney") + >>> beatles = Group.objects.create(name="The Beatles") + >>> m1 = Membership(person=ringo, group=beatles, + ... date_joined=date(1962, 8, 16), + ... invite_reason= "Needed a new drummer.") + >>> m1.save() + >>> beatles.members.all() + [<Person: Ringo Starr>] + >>> ringo.group_set.all() + [<Group: The Beatles>] + >>> m2 = Membership.objects.create(person=paul, group=beatles, + ... date_joined=date(1960, 8, 1), + ... 
invite_reason= "Wanted to form a band.") + >>> beatles.members.all() + [<Person: Ringo Starr>, <Person: Paul McCartney>] + +Unlike normal many-to-many fields, you *can't* use ``add``, ``create``, +or assignment (i.e., ``beatles.members = [...]``) to create relationships:: + + # THIS WILL NOT WORK + >>> beatles.members.add(john) + # NEITHER WILL THIS + >>> beatles.members.create(name="George Harrison") + # AND NEITHER WILL THIS + >>> beatles.members = [john, paul, ringo, george] + +Why? You can't just create a relationship between a ``Person`` and a ``Group`` +- you need to specify all the detail for the relationship required by the +``Membership`` model. The simple ``add``, ``create`` and assignment calls +don't provide a way to specify this extra detail. As a result, they are +disabled for many-to-many relationships that use an intermediate model. +The only way to create this type of relationship is to create instances of the +intermediate model. + +The :meth:`~django.db.models.fields.related.RelatedManager.remove` method is +disabled for similar reasons. However, the +:meth:`~django.db.models.fields.related.RelatedManager.clear` method can be +used to remove all many-to-many relationships for an instance:: + + # Beatles have broken up + >>> beatles.members.clear() + +Once you have established the many-to-many relationships by creating instances +of your intermediate model, you can issue queries. Just as with normal +many-to-many relationships, you can query using the attributes of the +many-to-many-related model:: + + # Find all the groups with a member whose name starts with 'Paul' + >>> Group.objects.filter(members__name__startswith='Paul') + [<Group: The Beatles>] + +As you are using an intermediate model, you can also query on its attributes:: + + # Find all the members of the Beatles that joined after 1 Jan 1961 + >>> Person.objects.filter( + ... group__name='The Beatles', + ... 
membership__date_joined__gt=date(1961,1,1)) + [<Person: Ringo Starr>] + + +One-to-one relationships +~~~~~~~~~~~~~~~~~~~~~~~~ + +To define a one-to-one relationship, use +:class:`~django.db.models.OneToOneField`. You use it just like any other +``Field`` type: by including it as a class attribute of your model. + +This is most useful on the primary key of an object when that object "extends" +another object in some way. + +:class:`~django.db.models.OneToOneField` requires a positional argument: the +class to which the model is related. + +For example, if you were building a database of "places", you would +store fairly standard details such as address and phone number in the +database. Then, if you wanted to build a database of restaurants on +top of the places, instead of repeating yourself and replicating those +fields in the ``Restaurant`` model, you could make ``Restaurant`` have +a :class:`~django.db.models.OneToOneField` to ``Place`` (because a +restaurant "is a" place; in fact, to handle this you'd typically use +:ref:`inheritance <model-inheritance>`, which involves an implicit +one-to-one relation). + +As with :class:`~django.db.models.ForeignKey`, a +:ref:`recursive relationship <recursive-relationships>` +can be defined and +:ref:`references to as-yet undefined models <lazy-relationships>` +can be made; see :ref:`the model field reference <ref-onetoone>` for details. + +.. seealso:: + + See the `One-to-one relationship model example`_ for a full example. + +.. _One-to-one relationship model example: http://www.djangoproject.com/documentation/models/one_to_one/ + +.. versionadded:: 1.0 + +:class:`~django.db.models.OneToOneField` fields also accept one optional argument +described in the :ref:`model field reference <ref-onetoone>`. + +:class:`~django.db.models.OneToOneField` classes used to automatically become +the primary key on a model.
This is no longer true (although you can manually +pass in the :attr:`~django.db.models.Field.primary_key` argument if you like). +Thus, it's now possible to have multiple fields of type +:class:`~django.db.models.OneToOneField` on a single model. + +Models across files +------------------- + +It's perfectly OK to relate a model to one from another app. To do this, +import the related model at the top of the module that holds your model. Then, +just refer to the other model class wherever needed. For example:: + + from geography.models import ZipCode + + class Restaurant(models.Model): + # ... + zip_code = models.ForeignKey(ZipCode) + +Field name restrictions +----------------------- + +Django places only two restrictions on model field names: + + 1. A field name cannot be a Python reserved word, because that would result + in a Python syntax error. For example:: + + class Example(models.Model): + pass = models.IntegerField() # 'pass' is a reserved word! + + 2. A field name cannot contain more than one underscore in a row, due to + the way Django's query lookup syntax works. For example:: + + class Example(models.Model): + foo__bar = models.IntegerField() # 'foo__bar' has two underscores! + +These limitations can be worked around, though, because your field name doesn't +necessarily have to match your database column name. See the +:attr:`~Field.db_column` option. + +SQL reserved words, such as ``join``, ``where`` or ``select``, *are* allowed as +model field names, because Django escapes all database table names and column +names in every underlying SQL query. It uses the quoting syntax of your +particular database engine. + +Custom field types +------------------ + +.. versionadded:: 1.0 + +If one of the existing model fields cannot be used to fit your purposes, or if +you wish to take advantage of some less common database column types, you can +create your own field class.
Full coverage of creating your own fields is +provided in :doc:`/howto/custom-model-fields`. + +.. _meta-options: + +Meta options +============ + +Give your model metadata by using an inner ``class Meta``, like so:: + + class Ox(models.Model): + horn_length = models.IntegerField() + + class Meta: + ordering = ["horn_length"] + verbose_name_plural = "oxen" + +Model metadata is "anything that's not a field", such as ordering options +(:attr:`~Options.ordering`), database table name (:attr:`~Options.db_table`), or +human-readable singular and plural names (:attr:`~Options.verbose_name` and +:attr:`~Options.verbose_name_plural`). None are required, and adding ``class +Meta`` to a model is completely optional. + +A complete list of all possible ``Meta`` options can be found in the :doc:`model +option reference </ref/models/options>`. + +.. _model-methods: + +Model methods +============= + +Define custom methods on a model to add custom "row-level" functionality to your +objects. Whereas :class:`~django.db.models.Manager` methods are intended to do +"table-wide" things, model methods should act on a particular model instance. + +This is a valuable technique for keeping business logic in one place -- the +model. + +For example, this model has a few custom methods:: + + from django.contrib.localflavor.us.models import USStateField + + class Person(models.Model): + first_name = models.CharField(max_length=50) + last_name = models.CharField(max_length=50) + birth_date = models.DateField() + address = models.CharField(max_length=100) + city = models.CharField(max_length=50) + state = USStateField() # Yes, this is America-centric... + + def baby_boomer_status(self): + "Returns the person's baby-boomer status." 
+ import datetime + if datetime.date(1945, 8, 1) <= self.birth_date <= datetime.date(1964, 12, 31): + return "Baby boomer" + if self.birth_date < datetime.date(1945, 8, 1): + return "Pre-boomer" + return "Post-boomer" + + def is_midwestern(self): + "Returns True if this person is from the Midwest." + return self.state in ('IL', 'WI', 'MI', 'IN', 'OH', 'IA', 'MO') + + def _get_full_name(self): + "Returns the person's full name." + return '%s %s' % (self.first_name, self.last_name) + full_name = property(_get_full_name) + +The last method in this example is a :term:`property`. `Read more about +properties`_. + +.. _Read more about properties: http://www.python.org/download/releases/2.2/descrintro/#property + +The :doc:`model instance reference </ref/models/instances>` has a complete list +of :ref:`methods automatically given to each model <model-instance-methods>`. +You can override most of these -- see `overriding predefined model methods`_, +below -- but there are a couple that you'll almost always want to define: + + :meth:`~Model.__unicode__` + A Python "magic method" that returns a unicode "representation" of any + object. This is what Python and Django will use whenever a model + instance needs to be coerced and displayed as a plain string. Most + notably, this happens when you display an object in an interactive + console or in the admin. + + You'll always want to define this method; the default isn't very helpful + at all. + + :meth:`~Model.get_absolute_url` + This tells Django how to calculate the URL for an object. Django uses + this in its admin interface, and any time it needs to figure out a URL + for an object. + + Any object that has a URL that uniquely identifies it should define this + method. + +.. _overriding-model-methods: + +Overriding predefined model methods +----------------------------------- + +There's another set of :ref:`model methods <model-instance-methods>` that +encapsulate a bunch of database behavior that you'll want to customize. 
In +particular you'll often want to change the way :meth:`~Model.save` and +:meth:`~Model.delete` work. + +You're free to override these methods (and any other model method) to alter +behavior. + +A classic use-case for overriding the built-in methods is if you want something +to happen whenever you save an object. For example (see +:meth:`~Model.save` for documentation of the parameters it accepts):: + + class Blog(models.Model): + name = models.CharField(max_length=100) + tagline = models.TextField() + + def save(self, *args, **kwargs): + do_something() + super(Blog, self).save(*args, **kwargs) # Call the "real" save() method. + do_something_else() + +You can also prevent saving:: + + class Blog(models.Model): + name = models.CharField(max_length=100) + tagline = models.TextField() + + def save(self, *args, **kwargs): + if self.name == "Yoko Ono's blog": + return # Yoko shall never have her own blog! + else: + super(Blog, self).save(*args, **kwargs) # Call the "real" save() method. + +It's important to remember to call the superclass method -- that's +that ``super(Blog, self).save(*args, **kwargs)`` business -- to ensure +that the object still gets saved into the database. If you forget to +call the superclass method, the default behavior won't happen and the +database won't get touched. + +It's also important that you pass through the arguments that can be +passed to the model method -- that's what the ``*args, **kwargs`` bit +does. Django will, from time to time, extend the capabilities of +built-in model methods, adding new arguments. If you use ``*args, +**kwargs`` in your method definitions, you are guaranteed that your +code will automatically support those arguments when they are added. + +Executing custom SQL +-------------------- + +Another common pattern is writing custom SQL statements in model methods and +module-level methods. For more details on using raw SQL, see the documentation +on :doc:`using raw SQL</topics/db/sql>`. + +.. 
_model-inheritance: + +Model inheritance +================= + +.. versionadded:: 1.0 + +Model inheritance in Django works almost identically to the way normal +class inheritance works in Python. The only decision you have to make +is whether you want the parent models to be models in their own right +(with their own database tables), or if the parents are just holders +of common information that will only be visible through the child +models. + +There are three styles of inheritance that are possible in Django. + + 1. Often, you will just want to use the parent class to hold information that + you don't want to have to type out for each child model. This class isn't + going to ever be used in isolation, so :ref:`abstract-base-classes` are + what you're after. + 2. If you're subclassing an existing model (perhaps something from another + application entirely) and want each model to have its own database table, + :ref:`multi-table-inheritance` is the way to go. + 3. Finally, if you only want to modify the Python-level behaviour of a model, + without changing the model's fields in any way, you can use + :ref:`proxy-models`. + +.. _abstract-base-classes: + +Abstract base classes +--------------------- + +Abstract base classes are useful when you want to put some common +information into a number of other models. You write your base class +and put ``abstract=True`` in the :ref:`Meta <meta-options>` +class. This model will then not be used to create any database +table. Instead, when it is used as a base class for other models, its +fields will be added to those of the child class. It is an error to +have fields in the abstract base class with the same name as those in +the child (and Django will raise an exception).
+ +An example:: + + class CommonInfo(models.Model): + name = models.CharField(max_length=100) + age = models.PositiveIntegerField() + + class Meta: + abstract = True + + class Student(CommonInfo): + home_group = models.CharField(max_length=5) + +The ``Student`` model will have three fields: ``name``, ``age`` and +``home_group``. The ``CommonInfo`` model cannot be used as a normal Django +model, since it is an abstract base class. It does not generate a database +table or have a manager, and cannot be instantiated or saved directly. + +For many uses, this type of model inheritance will be exactly what you want. +It provides a way to factor out common information at the Python level, whilst +still only creating one database table per child model at the database level. + +``Meta`` inheritance +~~~~~~~~~~~~~~~~~~~~ + +When an abstract base class is created, Django makes any :ref:`Meta <meta-options>` +inner class you declared in the base class available as an +attribute. If a child class does not declare its own :ref:`Meta <meta-options>` +class, it will inherit the parent's :ref:`Meta <meta-options>`. If the child wants to +extend the parent's :ref:`Meta <meta-options>` class, it can subclass it. For example:: + + class CommonInfo(models.Model): + ... + class Meta: + abstract = True + ordering = ['name'] + + class Student(CommonInfo): + ... + class Meta(CommonInfo.Meta): + db_table = 'student_info' + +Django does make one adjustment to the :ref:`Meta <meta-options>` class of an abstract base +class: before installing the :ref:`Meta <meta-options>` attribute, it sets ``abstract=False``. +This means that children of abstract base classes don't automatically become +abstract classes themselves. Of course, you can make an abstract base class +that inherits from another abstract base class. You just need to remember to +explicitly set ``abstract=True`` each time. 
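The ``abstract`` adjustment can be illustrated with a toy metaclass. This is a sketch under stated assumptions, not Django's implementation: ``ToyModelBase`` and the ``is_abstract`` attribute are hypothetical names, the classes are plain Python 3 classes rather than models, and only the rule that ``abstract`` is reset before children see the ``Meta`` class is imitated:

```python
class ToyModelBase(type):
    """Toy metaclass imitating one rule only: ``abstract`` is not inherited."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        meta = getattr(cls, 'Meta', None)
        # Record whether this class is abstract, based on the Meta it sees.
        cls.is_abstract = bool(getattr(meta, 'abstract', False))
        if cls.is_abstract:
            # Reset the flag so that children reusing or subclassing this
            # Meta are concrete unless they set abstract = True again.
            meta.abstract = False
        return cls


class CommonInfo(metaclass=ToyModelBase):
    class Meta:
        abstract = True
        ordering = ['name']


class Student(CommonInfo):
    # No Meta of its own: it reuses CommonInfo.Meta, whose flag was reset.
    pass


class Staff(CommonInfo):
    # Abstract again, because abstract = True is set explicitly.
    class Meta(CommonInfo.Meta):
        abstract = True


class Teacher(Staff):
    # Inherits ordering from CommonInfo.Meta but is not abstract.
    class Meta(Staff.Meta):
        db_table = 'teacher_info'


print(CommonInfo.is_abstract, Student.is_abstract,
      Staff.is_abstract, Teacher.is_abstract)
```

In this sketch, ``Student`` and ``Teacher`` end up concrete while ``Staff`` stays abstract only because it sets ``abstract = True`` itself. Other options such as ``ordering`` are inherited through ordinary class inheritance of ``Meta``.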
+ +Some attributes won't make sense to include in the :ref:`Meta <meta-options>` class of an +abstract base class. For example, including ``db_table`` would mean that all +the child classes (the ones that don't specify their own :ref:`Meta <meta-options>`) would use +the same database table, which is almost certainly not what you want. + +.. _abstract-related-name: + +Be careful with ``related_name`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you are using the :attr:`~django.db.models.ForeignKey.related_name` attribute on a ``ForeignKey`` or +``ManyToManyField``, you must always specify a *unique* reverse name for the +field. This would normally cause a problem in abstract base classes, since the +fields on this class are included into each of the child classes, with exactly +the same values for the attributes (including :attr:`~django.db.models.ForeignKey.related_name`) each time. + +.. versionchanged:: 1.2 + +To work around this problem, when you are using :attr:`~django.db.models.ForeignKey.related_name` in an +abstract base class (only), part of the name should contain +``'%(app_label)s'`` and ``'%(class)s'``. + +- ``'%(class)s'`` is replaced by the lower-cased name of the child class + that the field is used in. +- ``'%(app_label)s'`` is replaced by the lower-cased name of the app the child + class is contained within. Each installed application name must be unique + and the model class names within each app must also be unique, therefore the + resulting name will end up being different. 
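The two placeholders are ordinary Python ``%``-style mapping keys, so the substitution can be sketched without Django. The helper below, ``expand_related_name``, is a hypothetical name used only for illustration, and the app labels and class names passed to it are made up; it is a minimal sketch of the substitution described above, not Django's implementation.

```python
def expand_related_name(template, app_label, class_name):
    """Fill in the related_name placeholders (toy helper, not a Django API)."""
    return template % {
        'app_label': app_label.lower(),  # lower-cased app name
        'class': class_name.lower(),     # lower-cased child class name
    }


template = "%(app_label)s_%(class)s_related"

# The same template yields a different reverse name for each child class.
names = [
    expand_related_name(template, 'common', 'ChildA'),
    expand_related_name(template, 'common', 'ChildB'),
    expand_related_name(template, 'rare', 'ChildB'),
]
```

Because each (app label, class name) pair is unique, the three expanded names differ even though the template string is identical.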
+ +For example, given an app ``common/models.py``:: + + class Base(models.Model): + m2m = models.ManyToManyField(OtherModel, related_name="%(app_label)s_%(class)s_related") + + class Meta: + abstract = True + + class ChildA(Base): + pass + + class ChildB(Base): + pass + +Along with another app ``rare/models.py``:: + + from common.models import Base + + class ChildB(Base): + pass + +The reverse name of the ``common.ChildA.m2m`` field will be +``common_childa_related``, whilst the reverse name of the +``common.ChildB.m2m`` field will be ``common_childb_related``, and finally the +reverse name of the ``rare.ChildB.m2m`` field will be ``rare_childb_related``. +It is up to you how you use the ``'%(class)s'`` and ``'%(app_label)s'`` portions +to construct your related name, but if you forget to use them, Django will raise +errors when you validate your models (or run :djadmin:`syncdb`). + +If you don't specify a :attr:`~django.db.models.ForeignKey.related_name` +attribute for a field in an abstract base class, the default reverse name will +be the name of the child class followed by ``'_set'``, just as it normally +would be if you'd declared the field directly on the child class. For example, +in the above code, if the :attr:`~django.db.models.ForeignKey.related_name` +attribute were omitted, the reverse name for the ``m2m`` field would be +``childa_set`` in the ``ChildA`` case and ``childb_set`` for the ``ChildB`` +field. + +.. _multi-table-inheritance: + +Multi-table inheritance +----------------------- + +The second type of model inheritance supported by Django is when each model in +the hierarchy is a model all by itself. Each model corresponds to its own +database table and can be queried and created individually. The inheritance +relationship introduces links between the child model and each of its parents +(via an automatically-created :class:`~django.db.models.fields.OneToOneField`).
+For example:: + + class Place(models.Model): + name = models.CharField(max_length=50) + address = models.CharField(max_length=80) + + class Restaurant(Place): + serves_hot_dogs = models.BooleanField() + serves_pizza = models.BooleanField() + +All of the fields of ``Place`` will also be available in ``Restaurant``, +although the data will reside in a different database table. So these are both +possible:: + + >>> Place.objects.filter(name="Bob's Cafe") + >>> Restaurant.objects.filter(name="Bob's Cafe") + +If you have a ``Place`` that is also a ``Restaurant``, you can get from the +``Place`` object to the ``Restaurant`` object by using the lower-case version +of the model name:: + + >>> p = Place.objects.get(id=12) + # If p is a Restaurant object, this will give the child class: + >>> p.restaurant + <Restaurant: ...> + +However, if ``p`` in the above example was *not* a ``Restaurant`` (it had been +created directly as a ``Place`` object or was the parent of some other class), +referring to ``p.restaurant`` would raise a Restaurant.DoesNotExist exception. + +``Meta`` and multi-table inheritance +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In the multi-table inheritance situation, it doesn't make sense for a child +class to inherit from its parent's :ref:`Meta <meta-options>` class. All the :ref:`Meta <meta-options>` options +have already been applied to the parent class and applying them again would +normally only lead to contradictory behavior (this is in contrast with the +abstract base class case, where the base class doesn't exist in its own +right). + +So a child model does not have access to its parent's :ref:`Meta +<meta-options>` class. However, there are a few limited cases where the child +inherits behavior from the parent: if the child does not specify an +:attr:`~django.db.models.Options.ordering` attribute or a +:attr:`~django.db.models.Options.get_latest_by` attribute, it will inherit +these from its parent. 
+ +If the parent has an ordering and you don't want the child to have any natural +ordering, you can explicitly disable it:: + + class ChildModel(ParentModel): + ... + class Meta: + # Remove parent's ordering effect + ordering = [] + +Inheritance and reverse relations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Because multi-table inheritance uses an implicit +:class:`~django.db.models.OneToOneField` to link the child and +the parent, it's possible to move from the parent down to the child, +as in the above example. However, this uses up the name that is the +default :attr:`~django.db.models.ForeignKey.related_name` value for +:class:`~django.db.models.ForeignKey` and +:class:`~django.db.models.ManyToManyField` relations. If you +are putting those types of relations on a subclass of another model, +you **must** specify the +:attr:`~django.db.models.ForeignKey.related_name` attribute on each +such field. If you forget, Django will raise an error when you run +:djadmin:`validate` or :djadmin:`syncdb`. + +For example, using the above ``Place`` class again, let's create another +subclass with a :class:`~django.db.models.ManyToManyField`:: + + class Supplier(Place): + # Must specify related_name on all relations. + customers = models.ManyToManyField(Restaurant, related_name='provider') + + +Specifying the parent link field +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As mentioned, Django will automatically create a +:class:`~django.db.models.OneToOneField` linking your child +class back to any non-abstract parent models. If you want to control the +name of the attribute linking back to the parent, you can create your +own :class:`~django.db.models.OneToOneField` and set +:attr:`parent_link=True <django.db.models.OneToOneField.parent_link>` +to indicate that your field is the link back to the parent class. + +.. _proxy-models: + +Proxy models +------------ + +..
versionadded:: 1.1 + +When using :ref:`multi-table inheritance <multi-table-inheritance>`, a new +database table is created for each subclass of a model. This is usually the +desired behavior, since the subclass needs a place to store any additional +data fields that are not present on the base class. Sometimes, however, you +only want to change the Python behavior of a model -- perhaps to change the +default manager, or add a new method. + +This is what proxy model inheritance is for: creating a *proxy* for the +original model. You can create, delete and update instances of the proxy model +and all the data will be saved as if you were using the original (non-proxied) +model. The difference is that you can change things like the default model +ordering or the default manager in the proxy, without having to alter the +original. + +Proxy models are declared like normal models. You tell Django that it's a +proxy model by setting the :attr:`~django.db.models.Options.proxy` attribute of +the ``Meta`` class to ``True``. + +For example, suppose you want to add a method to the standard +:class:`~django.contrib.auth.models.User` model that will be used in your +templates. You can do it like this:: + + from django.contrib.auth.models import User + + class MyUser(User): + class Meta: + proxy = True + + def do_something(self): + ... + +The ``MyUser`` class operates on the same database table as its parent +:class:`~django.contrib.auth.models.User` class. In particular, any new +instances of :class:`~django.contrib.auth.models.User` will also be accessible +through ``MyUser``, and vice-versa:: + + >>> u = User.objects.create(username="foobar") + >>> MyUser.objects.get(username="foobar") + <MyUser: foobar> + +You could also use a proxy model to define a different default ordering on a +model. 
The standard :class:`~django.contrib.auth.models.User` model has no +ordering defined on it (intentionally; sorting is expensive and we don't want +to do it all the time when we fetch users). You might want to regularly order +by the ``username`` attribute when you use the proxy. This is easy:: + + class OrderedUser(User): + class Meta: + ordering = ["username"] + proxy = True + +Now normal :class:`~django.contrib.auth.models.User` queries will be unordered +and ``OrderedUser`` queries will be ordered by ``username``. + +QuerySets still return the model that was requested +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There is no way to have Django return, say, a ``MyUser`` object whenever you +query for :class:`~django.contrib.auth.models.User` objects. A queryset for +``User`` objects will return those types of objects. The whole point of proxy +objects is that code relying on the original ``User`` will use those and your +own code can use the extensions you included (that no other code is relying on +anyway). It is not a way to replace the ``User`` (or any other) model +everywhere with something of your own creation. + +Base class restrictions +~~~~~~~~~~~~~~~~~~~~~~~ + +A proxy model must inherit from exactly one non-abstract model class. You +can't inherit from multiple non-abstract models as the proxy model doesn't +provide any connection between the rows in the different database tables. A +proxy model can inherit from any number of abstract model classes, providing +they do *not* define any model fields. + +Proxy models inherit any ``Meta`` options that they don't define from their +non-abstract model parent (the model they are proxying for). + +Proxy model managers +~~~~~~~~~~~~~~~~~~~~ + +If you don't specify any model managers on a proxy model, it inherits the +managers from its model parents. If you define a manager on the proxy model, +it will become the default, although any managers defined on the parent +classes will still be available. 
+ +Continuing our example from above, you could change the default manager used +when you query the ``User`` model like this:: + + class NewManager(models.Manager): + ... + + class MyUser(User): + objects = NewManager() + + class Meta: + proxy = True + +If you wanted to add a new manager to the Proxy, without replacing the +existing default, you can use the techniques described in the :ref:`custom +manager <custom-managers-and-inheritance>` documentation: create a base class +containing the new managers and inherit that after the primary base class:: + + # Create an abstract class for the new manager. + class ExtraManagers(models.Model): + secondary = NewManager() + + class Meta: + abstract = True + + class MyUser(User, ExtraManagers): + class Meta: + proxy = True + +You probably won't need to do this very often, but, when you do, it's +possible. + +.. _proxy-vs-unmanaged-models: + +Differences between proxy inheritance and unmanaged models +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Proxy model inheritance might look fairly similar to creating an unmanaged +model, using the :attr:`~django.db.models.Options.managed` attribute on a +model's ``Meta`` class. The two alternatives are not quite the same and it's +worth considering which one you should use. + +One difference is that you can (and, in fact, must unless you want an empty +model) specify model fields on models with ``Meta.managed=False``. You could, +with careful setting of :attr:`Meta.db_table +<django.db.models.Options.db_table>` create an unmanaged model that shadowed +an existing model and add Python methods to it. However, that would be very +repetitive and fragile as you need to keep both copies synchronized if you +make any changes. + +The other difference that is more important for proxy models, is how model +managers are handled. Proxy models are intended to behave exactly like the +model they are proxying for. 
So they inherit the parent model's managers, +including the default manager. In the normal multi-table model inheritance +case, children do not inherit managers from their parents as the custom +managers aren't always appropriate when extra fields are involved. The +:ref:`manager documentation <custom-managers-and-inheritance>` has more +details about this latter case. + +When these two features were implemented, attempts were made to squash them +into a single option. It turned out that interactions with inheritance, in +general, and managers, in particular, made the API very complicated and +potentially difficult to understand and use. It turned out that two options +were needed in any case, so the current separation arose. + +So, the general rules are: + + 1. If you are mirroring an existing model or database table and don't want + all the original database table columns, use ``Meta.managed=False``. + That option is normally useful for modeling database views and tables + not under the control of Django. + 2. If you are wanting to change the Python-only behavior of a model, but + keep all the same fields as in the original, use ``Meta.proxy=True``. + This sets things up so that the proxy model is an exact copy of the + storage structure of the original model when data is saved. + +Multiple inheritance +-------------------- + +Just as with Python's subclassing, it's possible for a Django model to inherit +from multiple parent models. Keep in mind that normal Python name resolution +rules apply. The first base class that a particular name (e.g. :ref:`Meta +<meta-options>`) appears in will be the one that is used; for example, this +means that if multiple parents contain a :ref:`Meta <meta-options>` class, +only the first one is going to be used, and all others will be ignored. + +Generally, you won't need to inherit from multiple parents. 
The main use-case +where this is useful is for "mix-in" classes: adding a particular extra +field or method to every class that inherits the mix-in. Try to keep your +inheritance hierarchies as simple and straightforward as possible so that you +won't have to struggle to work out where a particular piece of information is +coming from. + +Field name "hiding" is not permitted +------------------------------------- + +In normal Python class inheritance, it is permissible for a child class to +override any attribute from the parent class. In Django, this is not permitted +for attributes that are :class:`~django.db.models.fields.Field` instances (at +least, not at the moment). If a base class has a field called ``author``, you +cannot create another model field called ``author`` in any class that inherits +from that base class. + +Overriding fields in a parent model leads to difficulties in areas such as +initialising new instances (specifying which field is being initialized in +``Model.__init__``) and serialization. These are features which normal Python +class inheritance doesn't have to deal with in quite the same way, so the +difference between Django model inheritance and Python class inheritance isn't +arbitrary. + +This restriction only applies to attributes which are +:class:`~django.db.models.fields.Field` instances. Normal Python attributes +can be overridden if you wish. It also only applies to the name of the +attribute as Python sees it: if you are manually specifying the database +column name, you can have the same column name appearing in both a child and +an ancestor model for multi-table inheritance (they are columns in two +different database tables). + +Django will raise a :exc:`~django.core.exceptions.FieldError` if you override +any model field in any ancestor model. 
diff --git a/parts/django/docs/topics/db/multi-db.txt b/parts/django/docs/topics/db/multi-db.txt new file mode 100644 index 0000000..1a939b0 --- /dev/null +++ b/parts/django/docs/topics/db/multi-db.txt @@ -0,0 +1,574 @@ +================== +Multiple databases +================== + +.. versionadded:: 1.2 + +This topic guide describes Django's support for interacting with +multiple databases. Most of the rest of Django's documentation assumes +you are interacting with a single database. If you want to interact +with multiple databases, you'll need to take some additional steps. + +Defining your databases +======================= + +The first step to using more than one database with Django is to tell +Django about the database servers you'll be using. This is done using +the :setting:`DATABASES` setting. This setting maps database aliases, +which are a way to refer to a specific database throughout Django, to +a dictionary of settings for that specific connection. The settings in +the inner dictionaries are described fully in the :setting:`DATABASES` +documentation. + +Databases can have any alias you choose. However, the alias +``default`` has special significance. Django uses the database with +the alias of ``default`` when no other database has been selected. If +you don't have a ``default`` database, you need to be careful to +always specify the database that you want to use. + +The following is an example ``settings.py`` snippet defining two +databases -- a default PostgreSQL database and a MySQL database called +``users``: + +.. 
code-block:: python + + DATABASES = { + 'default': { + 'NAME': 'app_data', + 'ENGINE': 'django.db.backends.postgresql_psycopg2', + 'USER': 'postgres_user', + 'PASSWORD': 's3krit' + }, + 'users': { + 'NAME': 'user_data', + 'ENGINE': 'django.db.backends.mysql', + 'USER': 'mysql_user', + 'PASSWORD': 'priv4te' + } + } + +If you attempt to access a database that you haven't defined in your +:setting:`DATABASES` setting, Django will raise a +``django.db.utils.ConnectionDoesNotExist`` exception. + +Synchronizing your databases +============================ + +The :djadmin:`syncdb` management command operates on one database at a +time. By default, it operates on the ``default`` database, but by +providing a :djadminopt:`--database` argument, you can tell syncdb to +synchronize a different database. So, to synchronize all models onto +all databases in our example, you would need to call:: + + $ ./manage.py syncdb + $ ./manage.py syncdb --database=users + +If you don't want every application to be synchronized onto a +particular database, you can define a :ref:`database +router<topics-db-multi-db-routing>` that implements a policy +constraining the availability of particular models. + +Alternatively, if you want fine-grained control of synchronization, +you can pipe all or part of the output of :djadmin:`sqlall` for a +particular application directly into your database prompt, like this:: + + $ ./manage.py sqlall sales | ./manage.py dbshell + +Using other management commands +------------------------------- + +The other ``django-admin.py`` commands that interact with the database +operate in the same way as :djadmin:`syncdb` -- they only ever operate +on one database at a time, using :djadminopt:`--database` to control +the database used. + +.. _topics-db-multi-db-routing: + +Automatic database routing +========================== + +The easiest way to use multiple databases is to set up a database +routing scheme. 
The default routing scheme ensures that objects remain +'sticky' to their original database (i.e., an object retrieved from +the ``foo`` database will be saved on the same database). The default +routing scheme also ensures that if a database isn't specified, all queries +fall back to the ``default`` database. + +You don't have to do anything to activate the default routing scheme +-- it is provided 'out of the box' on every Django project. However, +if you want to implement more interesting database allocation +behaviors, you can define and install your own database routers. + +Database routers +---------------- + +A database router is a class that provides up to four methods: + +.. method:: db_for_read(model, **hints) + + Suggest the database that should be used for read operations for + objects of type ``model``. + + If a database operation is able to provide any additional + information that might assist in selecting a database, it will be + provided in the ``hints`` dictionary. Details on valid hints are + provided :ref:`below <topics-db-multi-db-hints>`. + + Returns None if there is no suggestion. + +.. method:: db_for_write(model, **hints) + + Suggest the database that should be used for writes of objects of + type ``model``. + + If a database operation is able to provide any additional + information that might assist in selecting a database, it will be + provided in the ``hints`` dictionary. Details on valid hints are + provided :ref:`below <topics-db-multi-db-hints>`. + + Returns None if there is no suggestion. + +.. method:: allow_relation(obj1, obj2, **hints) + + Return True if a relation between obj1 and obj2 should be + allowed, False if the relation should be prevented, or None if + the router has no opinion. This is purely a validation operation, + used by foreign key and many to many operations to determine if a + relation should be allowed between two objects. + +..
method:: allow_syncdb(db, model) + + Determine if the ``model`` should be synchronized onto the + database with alias ``db``. Return True if the model should be + synchronized, False if it should not be synchronized, or None if + the router has no opinion. This method can be used to determine + the availability of a model on a given database. + +A router doesn't have to provide *all* these methods; it may omit one or +more of them. If one of the methods is omitted, Django will skip that +router when performing the relevant check. + +.. _topics-db-multi-db-hints: + +Hints +~~~~~ + +The hints received by the database router can be used to decide which +database should receive a given request. + +At present, the only hint that will be provided is ``instance``, an +object instance that is related to the read or write operation that is +underway. This might be the instance that is being saved, or it might +be an instance that is being added in a many-to-many relation. In some +cases, no instance hint will be provided at all. The router checks for +the existence of an instance hint, and determines if that hint should be +used to alter routing behavior. + +Using routers +------------- + +Database routers are installed using the :setting:`DATABASE_ROUTERS` +setting. This setting defines a list of class names, each specifying a +router that should be used by the master router +(``django.db.router``). + +The master router is used by Django's database operations to allocate +database usage. Whenever a query needs to know which database to use, +it calls the master router, providing a model and a hint (if +available). Django then tries each router in turn until a database +suggestion can be found. If no suggestion can be found, it tries the +current ``_state.db`` of the hint instance. If a hint instance wasn't +provided, or the instance doesn't currently have database state, the +master router will allocate the ``default`` database. + +An example +---------- + +..
admonition:: Example purposes only! + + This example is intended as a demonstration of how the router + infrastructure can be used to alter database usage. It + intentionally ignores some complex issues in order to + demonstrate how routers are used. + + This example won't work if any of the models in ``myapp`` contain + relationships to models outside of the ``other`` database. + :ref:`Cross-database relationships <no_cross_database_relations>` + introduce referential integrity problems that Django can't + currently handle. + + The master/slave configuration described is also flawed -- it + doesn't provide any solution for handling replication lag (i.e., + query inconsistencies introduced because of the time taken for a + write to propagate to the slaves). It also doesn't consider the + interaction of transactions with the database utilization strategy. + +So - what does this mean in practice? Say you want ``myapp`` to +exist on the ``other`` database, and you want all other models in a +master/slave relationship between the databases ``master``, ``slave1`` and +``slave2``. 
To implement this, you would need two routers:: + + import random + + class MyAppRouter(object): + """A router to control all database operations on models in + the myapp application""" + + def db_for_read(self, model, **hints): + "Point all operations on myapp models to 'other'" + if model._meta.app_label == 'myapp': + return 'other' + return None + + def db_for_write(self, model, **hints): + "Point all operations on myapp models to 'other'" + if model._meta.app_label == 'myapp': + return 'other' + return None + + def allow_relation(self, obj1, obj2, **hints): + "Allow any relation if a model in myapp is involved" + if obj1._meta.app_label == 'myapp' or obj2._meta.app_label == 'myapp': + return True + return None + + def allow_syncdb(self, db, model): + "Make sure the myapp app only appears on the 'other' db" + if db == 'other': + return model._meta.app_label == 'myapp' + elif model._meta.app_label == 'myapp': + return False + return None + + class MasterSlaveRouter(object): + """A router that sets up a simple master/slave configuration""" + + def db_for_read(self, model, **hints): + "Point all read operations to a random slave" + return random.choice(['slave1','slave2']) + + def db_for_write(self, model, **hints): + "Point all write operations to the master" + return 'master' + + def allow_relation(self, obj1, obj2, **hints): + "Allow any relation between two objects in the db pool" + db_list = ('master','slave1','slave2') + if obj1._state.db in db_list and obj2._state.db in db_list: + return True + return None + + def allow_syncdb(self, db, model): + "Explicitly put all models on all databases." + return True + +Then, in your settings file, add the following (substituting ``path.to.`` with +the actual Python path to the module where you define the routers):: + + DATABASE_ROUTERS = ['path.to.MyAppRouter', 'path.to.MasterSlaveRouter'] + +The order in which routers are processed is significant.
Routers will +be queried in the order they are listed in the +:setting:`DATABASE_ROUTERS` setting. In this example, the +``MyAppRouter`` is processed before the ``MasterSlaveRouter``, and as a +result, decisions concerning the models in ``myapp`` are processed +before any other decision is made. If the :setting:`DATABASE_ROUTERS` +setting listed the two routers in the other order, +``MasterSlaveRouter.allow_syncdb()`` would be processed first. The +catch-all nature of the MasterSlaveRouter implementation would mean +that all models would be available on all databases. + +With this setup installed, let's run some Django code:: + + >>> # This retrieval will be performed on the 'credentials' database + >>> fred = User.objects.get(username='fred') + >>> fred.first_name = 'Frederick' + + >>> # This save will also be directed to 'credentials' + >>> fred.save() + + >>> # This retrieval will be randomly allocated to a slave database + >>> dna = Person.objects.get(name='Douglas Adams') + + >>> # A new object has no database allocation when created + >>> mh = Book(title='Mostly Harmless') + + >>> # This assignment will consult the router, and set mh onto + >>> # the same database as the author object + >>> mh.author = dna + + >>> # This save will force the 'mh' instance onto the master database... + >>> mh.save() + + >>> # ... but if we re-retrieve the object, it will come back on a slave + >>> mh = Book.objects.get(title='Mostly Harmless') + + +Manually selecting a database +============================= + +Django also provides an API that allows you to maintain complete control +over database usage in your code. A manually specified database allocation +will take priority over a database allocated by a router. + +Manually selecting a database for a ``QuerySet`` +------------------------------------------------ + +You can select the database for a ``QuerySet`` at any point in the +``QuerySet`` "chain."
Just call ``using()`` on the ``QuerySet`` to get +another ``QuerySet`` that uses the specified database. + +``using()`` takes a single argument: the alias of the database on +which you want to run the query. For example:: + + >>> # This will run on the 'default' database. + >>> Author.objects.all() + + >>> # So will this. + >>> Author.objects.using('default').all() + + >>> # This will run on the 'other' database. + >>> Author.objects.using('other').all() + +Selecting a database for ``save()`` +----------------------------------- + +Use the ``using`` keyword to ``Model.save()`` to specify to which +database the data should be saved. + +For example, to save an object to the ``legacy_users`` database, you'd +use this:: + + >>> my_object.save(using='legacy_users') + +If you don't specify ``using``, the ``save()`` method will save into +the default database allocated by the routers. + +Moving an object from one database to another +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you've saved an instance to one database, it might be tempting to +use ``save(using=...)`` as a way to migrate the instance to a new +database. However, if you don't take appropriate steps, this could +have some unexpected consequences. + +Consider the following example:: + + >>> p = Person(name='Fred') + >>> p.save(using='first') # (statement 1) + >>> p.save(using='second') # (statement 2) + +In statement 1, a new ``Person`` object is saved to the ``first`` +database. At this time, ``p`` doesn't have a primary key, so Django +issues a SQL ``INSERT`` statement. This creates a primary key, and +Django assigns that primary key to ``p``. + +When the save occurs in statement 2, ``p`` already has a primary key +value, and Django will attempt to use that primary key on the new +database. If the primary key value isn't in use in the ``second`` +database, then you won't have any problems -- the object will be +copied to the new database. 
+ +However, if the primary key of ``p`` is already in use on the +``second`` database, the existing object in the ``second`` database +will be overridden when ``p`` is saved. + +You can avoid this in two ways. First, you can clear the primary key +of the instance. If an object has no primary key, Django will treat it +as a new object, avoiding any loss of data on the ``second`` +database:: + + >>> p = Person(name='Fred') + >>> p.save(using='first') + >>> p.pk = None # Clear the primary key. + >>> p.save(using='second') # Write a completely new object. + +The second option is to use the ``force_insert`` option to ``save()`` +to ensure that Django does a SQL ``INSERT``:: + + >>> p = Person(name='Fred') + >>> p.save(using='first') + >>> p.save(using='second', force_insert=True) + +This will ensure that the person named ``Fred`` will have the same +primary key on both databases. If that primary key is already in use +when you try to save onto the ``second`` database, an error will be +raised. + +Selecting a database to delete from +----------------------------------- + +By default, a call to delete an existing object will be executed on +the same database that was used to retrieve the object in the first +place:: + + >>> u = User.objects.using('legacy_users').get(username='fred') + >>> u.delete() # will delete from the `legacy_users` database + +To specify the database from which a model will be deleted, pass a +``using`` keyword argument to the ``Model.delete()`` method. This +argument works just like the ``using`` keyword argument to ``save()``. + +For example, if you're migrating a user from the ``legacy_users`` +database to the ``new_users`` database, you might use these commands:: + + >>> user_obj.save(using='new_users') + >>> user_obj.delete(using='legacy_users') + +Using managers with multiple databases +-------------------------------------- + +Use the ``db_manager()`` method on managers to give managers access to +a non-default database. 
+ +For example, say you have a custom manager method that touches the +database -- ``User.objects.create_user()``. Because ``create_user()`` +is a manager method, not a ``QuerySet`` method, you can't do +``User.objects.using('new_users').create_user()``. (The +``create_user()`` method is only available on ``User.objects``, the +manager, not on ``QuerySet`` objects derived from the manager.) The +solution is to use ``db_manager()``, like this:: + + User.objects.db_manager('new_users').create_user(...) + +``db_manager()`` returns a copy of the manager bound to the database you specify. + +Using ``get_query_set()`` with multiple databases +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you're overriding ``get_query_set()`` on your manager, be sure to +either call the method on the parent (using ``super()``) or do the +appropriate handling of the ``_db`` attribute on the manager (a string +containing the name of the database to use). + +For example, if you want to return a custom ``QuerySet`` class from +the ``get_query_set`` method, you could do this:: + + class MyManager(models.Manager): + def get_query_set(self): + qs = CustomQuerySet(self.model) + if self._db is not None: + qs = qs.using(self._db) + return qs + +Exposing multiple databases in Django's admin interface +======================================================= + +Django's admin doesn't have any explicit support for multiple +databases. If you want to provide an admin interface for a model on a +database other than the one specified by your router chain, you'll +need to write custom :class:`~django.contrib.admin.ModelAdmin` classes +that will direct the admin to use a specific database for content. + +``ModelAdmin`` objects have four methods that require customization for +multiple-database support:: + + class MultiDBModelAdmin(admin.ModelAdmin): + # A handy constant for the name of the alternate database. 
+ using = 'other' + + def save_model(self, request, obj, form, change): + # Tell Django to save objects to the 'other' database. + obj.save(using=self.using) + + def queryset(self, request): + # Tell Django to look for objects on the 'other' database. + return super(MultiDBModelAdmin, self).queryset(request).using(self.using) + + def formfield_for_foreignkey(self, db_field, request=None, **kwargs): + # Tell Django to populate ForeignKey widgets using a query + # on the 'other' database. + return super(MultiDBModelAdmin, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs) + + def formfield_for_manytomany(self, db_field, request=None, **kwargs): + # Tell Django to populate ManyToMany widgets using a query + # on the 'other' database. + return super(MultiDBModelAdmin, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs) + +The implementation provided here implements a multi-database strategy +where all objects of a given type are stored on a specific database +(e.g., all ``User`` objects are in the ``other`` database). If your +usage of multiple databases is more complex, your ``ModelAdmin`` will +need to reflect that strategy. + +Inlines can be handled in a similar fashion. They require three customized methods:: + + class MultiDBTabularInline(admin.TabularInline): + using = 'other' + + def queryset(self, request): + # Tell Django to look for inline objects on the 'other' database. + return super(MultiDBTabularInline, self).queryset(request).using(self.using) + + def formfield_for_foreignkey(self, db_field, request=None, **kwargs): + # Tell Django to populate ForeignKey widgets using a query + # on the 'other' database. + return super(MultiDBTabularInline, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs) + + def formfield_for_manytomany(self, db_field, request=None, **kwargs): + # Tell Django to populate ManyToMany widgets using a query + # on the 'other' database. 
+ return super(MultiDBTabularInline, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs) + +Once you've written your model admin definitions, they can be +registered with any ``AdminSite`` instance:: + + from django.contrib import admin + + # Specialize the multi-db admin objects for use with specific models. + class BookInline(MultiDBTabularInline): + model = Book + + class PublisherAdmin(MultiDBModelAdmin): + inlines = [BookInline] + + admin.site.register(Author, MultiDBModelAdmin) + admin.site.register(Publisher, PublisherAdmin) + + othersite = admin.AdminSite('othersite') + othersite.register(Publisher, MultiDBModelAdmin) + +This example sets up two admin sites. On the first site, the +``Author`` and ``Publisher`` objects are exposed; ``Publisher`` +objects have a tabular inline showing books published by that +publisher. The second site exposes just publishers, without the +inlines. + +Using raw cursors with multiple databases +========================================= + +If you are using more than one database you can use +``django.db.connections`` to obtain the connection (and cursor) for a +specific database. ``django.db.connections`` is a dictionary-like +object that allows you to retrieve a specific connection using its +alias:: + + from django.db import connections + cursor = connections['my_db_alias'].cursor() + +Limitations of multiple databases +================================= + +.. _no_cross_database_relations: + +Cross-database relations +------------------------ + +Django doesn't currently provide any support for foreign key or +many-to-many relationships spanning multiple databases. If you +have used a router to partition models to different databases, +any foreign key and many-to-many relationships defined by those +models must be internal to a single database. + +This is because of referential integrity. 
In order to maintain a +relationship between two objects, Django needs to know that the +primary key of the related object is valid. If the primary key is +stored on a separate database, it's not possible to easily evaluate +the validity of a primary key. + +If you're using Postgres, Oracle, or MySQL with InnoDB, this is +enforced at the database integrity level -- database level key +constraints prevent the creation of relations that can't be validated. + +However, if you're using SQLite or MySQL with MyISAM tables, there is +no enforced referential integrity; as a result, you may be able to +'fake' cross database foreign keys. However, this configuration is not +officially supported by Django. diff --git a/parts/django/docs/topics/db/optimization.txt b/parts/django/docs/topics/db/optimization.txt new file mode 100644 index 0000000..7d51052 --- /dev/null +++ b/parts/django/docs/topics/db/optimization.txt @@ -0,0 +1,260 @@ +============================ +Database access optimization +============================ + +Django's database layer provides various ways to help developers get the most +out of their databases. This document gathers together links to the relevant +documentation, and adds various tips, organized under a number of headings that +outline the steps to take when attempting to optimize your database usage. + +Profile first +============= + +As general programming practice, this goes without saying. Find out :ref:`what +queries you are doing and what they are costing you +<faq-see-raw-sql-queries>`. You may also want to use an external project like +django-debug-toolbar_, or a tool that monitors your database directly. + +Remember that you may be optimizing for speed or memory or both, depending on +your requirements. Sometimes optimizing for one will be detrimental to the +other, but sometimes they will help each other. 
Also, work that is done by the +database process might not have the same cost (to you) as the same amount of +work done in your Python process. It is up to you to decide what your +priorities are, where the balance must lie, and profile all of these as required +since this will depend on your application and server. + +With everything that follows, remember to profile after every change to ensure +that the change is a benefit, and a big enough benefit given the decrease in +readability of your code. **All** of the suggestions below come with the caveat +that in your circumstances the general principle might not apply, or might even +be reversed. + +.. _django-debug-toolbar: http://robhudson.github.com/django-debug-toolbar/ + +Use standard DB optimization techniques +======================================= + +...including: + +* Indexes. This is a number one priority, *after* you have determined from + profiling what indexes should be added. Use + :attr:`django.db.models.Field.db_index` to add these from Django. + +* Appropriate use of field types. + +We will assume you have done the obvious things above. The rest of this document +focuses on how to use Django in such a way that you are not doing unnecessary +work. This document also does not address other optimization techniques that +apply to all expensive operations, such as :doc:`general purpose caching +</topics/cache>`. + +Understand QuerySets +==================== + +Understanding :doc:`QuerySets </ref/models/querysets>` is vital to getting good +performance with simple code. In particular: + +Understand QuerySet evaluation +------------------------------ + +To avoid performance problems, it is important to understand: + +* that :ref:`QuerySets are lazy <querysets-are-lazy>`. + +* when :ref:`they are evaluated <when-querysets-are-evaluated>`. + +* how :ref:`the data is held in memory <caching-and-querysets>`. 
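The three points above (laziness, evaluation, and in-memory caching) can be sketched with a toy class in plain Python. This is not Django's ``QuerySet``; `FakeTable` and `LazyQuery` are hypothetical names, and the query counter exists only to make the behaviour visible.

```python
class FakeTable:
    """Hypothetical stand-in for a database table that counts queries."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.queries = 0

    def fetch(self, predicates):
        # Each call represents one round trip to the database.
        self.queries += 1
        return [r for r in self.rows if all(p(r) for p in predicates)]


class LazyQuery:
    """Toy query object: refining is free, evaluation happens once."""

    def __init__(self, table, predicates=()):
        self.table = table
        self.predicates = tuple(predicates)
        self._cache = None  # filled on first evaluation

    def filter(self, predicate):
        # Refining builds a new query object; nothing is fetched here.
        return LazyQuery(self.table, self.predicates + (predicate,))

    def __iter__(self):
        if self._cache is None:
            self._cache = self.table.fetch(self.predicates)
        return iter(self._cache)


table = FakeTable([1, 2, 3, 4, 5, 6])
q = LazyQuery(table).filter(lambda n: n > 2).filter(lambda n: n % 2 == 0)
queries_before = table.queries  # no fetch has happened yet
first = list(q)                 # evaluation: one fetch
second = list(q)                # served from the cache, no second fetch
```

The design mirrors the list above: building and refining the query costs nothing, iteration triggers the single fetch, and later iterations reuse the stored results.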
+ +Understand cached attributes +---------------------------- + +As well as caching of the whole ``QuerySet``, there is caching of the result of +attributes on ORM objects. In general, attributes that are not callable will be +cached. For example, assuming the :ref:`example Weblog models +<queryset-model-example>`:: + + >>> entry = Entry.objects.get(id=1) + >>> entry.blog # Blog object is retrieved at this point + >>> entry.blog # cached version, no DB access + +But in general, callable attributes cause DB lookups every time:: + + >>> entry = Entry.objects.get(id=1) + >>> entry.authors.all() # query performed + >>> entry.authors.all() # query performed again + +Be careful when reading template code - the template system does not allow use +of parentheses, but will call callables automatically, hiding the above +distinction. + +Be careful with your own custom properties - it is up to you to implement +caching. + +Use the ``with`` template tag +----------------------------- + +To make use of the caching behaviour of ``QuerySet``, you may need to use the +:ttag:`with` template tag. + +Use ``iterator()`` +------------------ + +When you have a lot of objects, the caching behaviour of the ``QuerySet`` can +cause a large amount of memory to be used. In this case, +:meth:`~django.db.models.QuerySet.iterator()` may help. + +Do database work in the database rather than in Python +====================================================== + +For instance: + +* At the most basic level, use :ref:`filter and exclude <queryset-api>` to do + filtering in the database. + +* Use :ref:`F() object query expressions <query-expressions>` to do filtering + against other fields within the same model. + +* Use :doc:`annotate to do aggregation in the database </topics/db/aggregation>`. 
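As a rough illustration of the list above, the following sketch uses Python's standard ``sqlite3`` module rather than the ORM. The `entry` table and its sample rows are assumptions made up for the example; the column names are borrowed from the Weblog ``Entry`` model. The ``WHERE`` clause compares two columns of the same row, which is the kind of test an ``F()`` expression describes, and ``SUM`` stands in for aggregation done by the database.

```python
import sqlite3

# In-memory database with a hypothetical 'entry' table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (headline TEXT, n_comments INTEGER, n_pingbacks INTEGER)"
)
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [("What now", 5, 2), ("Lennon", 1, 4), ("What next", 3, 3)],
)

# Filtering in Python: every row is transferred, then most are discarded.
in_python = [
    h
    for h, c, p in conn.execute(
        "SELECT headline, n_comments, n_pingbacks FROM entry"
    )
    if c > p
]

# Filtering in the database: only matching rows are returned.
in_db = [
    h
    for (h,) in conn.execute(
        "SELECT headline FROM entry WHERE n_comments > n_pingbacks"
    )
]

# Aggregating in the database: a single value comes back.
(total_comments,) = conn.execute("SELECT SUM(n_comments) FROM entry").fetchone()
```

Both approaches give the same answer here; the difference is how much data has to leave the database to get it.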
+ +If these aren't enough to generate the SQL you need: + +Use ``QuerySet.extra()`` +------------------------ + +A less portable but more powerful method is +:meth:`~django.db.models.QuerySet.extra()`, which allows some SQL to be +explicitly added to the query. If that still isn't powerful enough: + +Use raw SQL +----------- + +Write your own :doc:`custom SQL to retrieve data or populate models +</topics/db/sql>`. Use ``django.db.connection.queries`` to find out what Django +is writing for you and start from there. + +Retrieve everything at once if you know you will need it +======================================================== + +Hitting the database multiple times for different parts of a single 'set' of +data that you will need all parts of is, in general, less efficient than +retrieving it all in one query. This is particularly important if you have a +query that is executed in a loop, and could therefore end up doing many database +queries, when only one was needed. So: + +Use ``QuerySet.select_related()`` +--------------------------------- + +Understand :ref:`QuerySet.select_related() <select-related>` thoroughly, and use it: + +* in view code, + +* and in :doc:`managers and default managers </topics/db/managers>` where + appropriate. Be aware when your manager is and is not used; sometimes this is + tricky so don't make assumptions. + +Don't retrieve things you don't need +==================================== + +Use ``QuerySet.values()`` and ``values_list()`` +----------------------------------------------- + +When you just want a ``dict`` or ``list`` of values, and don't need ORM model +objects, make appropriate usage of :meth:`~django.db.models.QuerySet.values()`. +These can be useful for replacing model objects in template code - as long as +the dicts you supply have the same attributes as those used in the template, +you are fine. 
+ +Use ``QuerySet.defer()`` and ``only()`` +--------------------------------------- + +Use :meth:`~django.db.models.QuerySet.defer()` and +:meth:`~django.db.models.QuerySet.only()` if there are database columns you +know that you won't need (or won't need in most cases) to avoid loading +them. Note that if you *do* use the deferred columns, the ORM will have to go and get them in a +separate query, making this a pessimization if you use it inappropriately. + +Use QuerySet.count() +-------------------- + +...if you only want the count, rather than doing ``len(queryset)``. + +Use QuerySet.exists() +--------------------- + +...if you only want to find out if at least one result exists, rather than ``if +queryset``. + +But: + +Don't overuse ``count()`` and ``exists()`` +------------------------------------------ + +If you are going to need other data from the QuerySet, just evaluate it. + +For example, assuming an Email class that has a ``body`` attribute and a +many-to-many relation to User, the following template code is optimal: + +.. code-block:: html+django + + {% if display_inbox %} + {% with user.emails.all as emails %} + {% if emails %} + <p>You have {{ emails|length }} email(s)</p> + {% for email in emails %} + <p>{{ email.body }}</p> + {% endfor %} + {% else %} + <p>No messages today.</p> + {% endif %} + {% endwith %} + {% endif %} + + +It is optimal because: + + 1. Since QuerySets are lazy, this does no database queries if 'display_inbox' is False. + + #. Use of ``with`` means that we store ``user.emails.all`` in a variable for + later use, allowing its cache to be re-used. + + #. The line ``{% if emails %}`` causes ``QuerySet.__nonzero__()`` to be called, + which causes the ``user.emails.all()`` query to be run on the database, and + at least the first row to be turned into an ORM object. If there aren't + any results, it will return False, otherwise True. + + #. 
The use of ``{{ emails|length }}`` calls ``QuerySet.__len__()``, filling + out the rest of the cache without doing another query. + + #. The ``for`` loop iterates over the already filled cache. + +In total, this code does either one or zero database queries. The only +deliberate optimization performed is the use of the ``with`` tag. Using +``QuerySet.exists()`` or ``QuerySet.count()`` at any point would cause +additional queries. + +Use ``QuerySet.update()`` and ``delete()`` +------------------------------------------ + +Rather than retrieve a load of objects, set some values, and save them +individually, use a bulk SQL UPDATE statement, via :ref:`QuerySet.update() +<topics-db-queries-update>`. Similarly, do :ref:`bulk deletes +<topics-db-queries-delete>` where possible. + +Note, however, that these bulk update methods cannot call the ``save()`` or +``delete()`` methods of individual instances, which means that any custom +behaviour you have added for these methods will not be executed, including +anything driven from the normal database object :doc:`signals </ref/signals>`. + +Use foreign key values directly +------------------------------- + +If you only need a foreign key value, use the foreign key value that is already on +the object you've got, rather than getting the whole related object and taking +its primary key. That is, do:: + + entry.blog_id + +instead of:: + + entry.blog.id + diff --git a/parts/django/docs/topics/db/queries.txt b/parts/django/docs/topics/db/queries.txt new file mode 100644 index 0000000..923b1e4 --- /dev/null +++ b/parts/django/docs/topics/db/queries.txt @@ -0,0 +1,1110 @@ +============== +Making queries +============== + +.. currentmodule:: django.db.models + +Once you've created your :doc:`data models </topics/db/models>`, Django +automatically gives you a database-abstraction API that lets you create, +retrieve, update and delete objects. This document explains how to use this +API. 
Refer to the :doc:`data model reference </ref/models/index>` for full +details of all the various model lookup options. + +Throughout this guide (and in the reference), we'll refer to the following +models, which comprise a Weblog application: + +.. _queryset-model-example: + +.. code-block:: python + + class Blog(models.Model): + name = models.CharField(max_length=100) + tagline = models.TextField() + + def __unicode__(self): + return self.name + + class Author(models.Model): + name = models.CharField(max_length=50) + email = models.EmailField() + + def __unicode__(self): + return self.name + + class Entry(models.Model): + blog = models.ForeignKey(Blog) + headline = models.CharField(max_length=255) + body_text = models.TextField() + pub_date = models.DateTimeField() + authors = models.ManyToManyField(Author) + n_comments = models.IntegerField() + n_pingbacks = models.IntegerField() + rating = models.IntegerField() + + def __unicode__(self): + return self.headline + +Creating objects +================ + +To represent database-table data in Python objects, Django uses an intuitive +system: A model class represents a database table, and an instance of that +class represents a particular record in the database table. + +To create an object, instantiate it using keyword arguments to the model class, +then call ``save()`` to save it to the database. + +You import the model class from wherever it lives on the Python path, as you +may expect. (We point this out here because previous Django versions required +funky model importing.) + +Assuming models live in a file ``mysite/blog/models.py``, here's an example:: + + >>> from blog.models import Blog + >>> b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.') + >>> b.save() + +This performs an ``INSERT`` SQL statement behind the scenes. Django doesn't hit +the database until you explicitly call ``save()``. + +The ``save()`` method has no return value. + +.. 
seealso:: + + ``save()`` takes a number of advanced options not described here. + See the documentation for ``save()`` for complete details. + + To create an object and save it all in one step, see the ``create()`` + method. + +Saving changes to objects +========================= + +To save changes to an object that's already in the database, use ``save()``. + +Given a ``Blog`` instance ``b5`` that has already been saved to the database, +this example changes its name and updates its record in the database:: + + >>> b5.name = 'New name' + >>> b5.save() + +This performs an ``UPDATE`` SQL statement behind the scenes. Django doesn't hit +the database until you explicitly call ``save()``. + +Saving ``ForeignKey`` and ``ManyToManyField`` fields +---------------------------------------------------- + +Updating a ``ForeignKey`` field works exactly the same way as saving a normal +field; simply assign an object of the right type to the field in question. +This example updates the ``blog`` attribute of an ``Entry`` instance ``entry``:: + + >>> from blog.models import Entry + >>> entry = Entry.objects.get(pk=1) + >>> cheese_blog = Blog.objects.get(name="Cheddar Talk") + >>> entry.blog = cheese_blog + >>> entry.save() + +Updating a ``ManyToManyField`` works a little differently; use the ``add()`` +method on the field to add a record to the relation. This example adds the +``Author`` instance ``joe`` to the ``entry`` object:: + + >>> from blog.models import Author + >>> joe = Author.objects.create(name="Joe") + >>> entry.authors.add(joe) + +Django will complain if you try to assign or add an object of the wrong type. + +Retrieving objects +================== + +To retrieve objects from your database, you construct a ``QuerySet`` via a +``Manager`` on your model class. + +A ``QuerySet`` represents a collection of objects from your database. It can +have zero, one or many *filters* -- criteria that narrow down the collection +based on given parameters. 
In SQL terms, a ``QuerySet`` equates to a ``SELECT`` +statement, and a filter is a limiting clause such as ``WHERE`` or ``LIMIT``. + +You get a ``QuerySet`` by using your model's ``Manager``. Each model has at +least one ``Manager``, and it's called ``objects`` by default. Access it +directly via the model class, like so:: + + >>> Blog.objects + <django.db.models.manager.Manager object at ...> + >>> b = Blog(name='Foo', tagline='Bar') + >>> b.objects + Traceback: + ... + AttributeError: "Manager isn't accessible via Blog instances." + +.. note:: + + ``Managers`` are accessible only via model classes, rather than from model + instances, to enforce a separation between "table-level" operations and + "record-level" operations. + +The ``Manager`` is the main source of ``QuerySets`` for a model. It acts as a +"root" ``QuerySet`` that describes all objects in the model's database table. +For example, ``Blog.objects`` is the initial ``QuerySet`` that contains all +``Blog`` objects in the database. + +Retrieving all objects +---------------------- + +The simplest way to retrieve objects from a table is to get all of them. +To do this, use the ``all()`` method on a ``Manager``:: + + >>> all_entries = Entry.objects.all() + +The ``all()`` method returns a ``QuerySet`` of all the objects in the database. + +(If ``Entry.objects`` is a ``QuerySet``, why can't we just do ``Entry.objects``? +That's because ``Entry.objects``, the root ``QuerySet``, is a special case +that cannot be evaluated. The ``all()`` method returns a ``QuerySet`` that +*can* be evaluated.) + + +Retrieving specific objects with filters +---------------------------------------- + +The root ``QuerySet`` provided by the ``Manager`` describes all objects in the +database table. Usually, though, you'll need to select only a subset of the +complete set of objects. + +To create such a subset, you refine the initial ``QuerySet``, adding filter +conditions. 
The two most common ways to refine a ``QuerySet`` are: + + ``filter(**kwargs)`` + Returns a new ``QuerySet`` containing objects that match the given + lookup parameters. + + ``exclude(**kwargs)`` + Returns a new ``QuerySet`` containing objects that do *not* match the + given lookup parameters. + +The lookup parameters (``**kwargs`` in the above function definitions) should +be in the format described in `Field lookups`_ below. + +For example, to get a ``QuerySet`` of blog entries from the year 2006, use +``filter()`` like so:: + + Entry.objects.filter(pub_date__year=2006) + +We don't have to add an ``all()``, as in ``Entry.objects.all().filter(...)``. That +would still work, but you only need ``all()`` when you want all objects from the +root ``QuerySet``. + +.. _chaining-filters: + +Chaining filters +~~~~~~~~~~~~~~~~ + +The result of refining a ``QuerySet`` is itself a ``QuerySet``, so it's +possible to chain refinements together. For example:: + + >>> from datetime import datetime + >>> Entry.objects.filter( + ... headline__startswith='What' + ... ).exclude( + ... pub_date__gte=datetime.now() + ... ).filter( + ... pub_date__gte=datetime(2005, 1, 1) + ... ) + +This takes the initial ``QuerySet`` of all entries in the database, adds a +filter, then an exclusion, then another filter. The final result is a +``QuerySet`` containing all entries with a headline that starts with "What", +that were published between January 1, 2005, and the current day. + +.. _filtered-querysets-are-unique: + +Filtered QuerySets are unique +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Each time you refine a ``QuerySet``, you get a brand-new ``QuerySet`` that is +in no way bound to the previous ``QuerySet``. Each refinement creates a +separate and distinct ``QuerySet`` that can be stored, used and reused. + +Example:: + + >>> q1 = Entry.objects.filter(headline__startswith="What") + >>> q2 = q1.exclude(pub_date__gte=datetime.now()) + >>> q3 = q1.filter(pub_date__gte=datetime.now()) + +These three ``QuerySets`` are separate. 
The first is a base ``QuerySet`` +containing all entries whose headline starts with "What". The second +is a subset of the first, with an additional criterion that excludes records +whose ``pub_date`` is now or later. The third is a subset of the first, +with an additional criterion that selects only the records whose ``pub_date`` is +now or later. The initial ``QuerySet`` (``q1``) is unaffected by the +refinement process. + +.. _querysets-are-lazy: + +QuerySets are lazy +~~~~~~~~~~~~~~~~~~ + +``QuerySets`` are lazy -- the act of creating a ``QuerySet`` doesn't involve any +database activity. You can stack filters together all day long, and Django won't +actually run the query until the ``QuerySet`` is *evaluated*. Take a look at +this example:: + + >>> q = Entry.objects.filter(headline__startswith="What") + >>> q = q.filter(pub_date__lte=datetime.now()) + >>> q = q.exclude(body_text__icontains="food") + >>> print q + +Though this looks like three database hits, in fact it hits the database only +once, at the last line (``print q``). In general, the results of a ``QuerySet`` +aren't fetched from the database until you "ask" for them. When you do, the +``QuerySet`` is *evaluated* by accessing the database. For more details on +exactly when evaluation takes place, see :ref:`when-querysets-are-evaluated`. + + +.. _retrieving-single-object-with-get: + +Retrieving a single object with get +----------------------------------- + +``.filter()`` will always give you a ``QuerySet``, even if only a single +object matches the query - in this case, it will be a ``QuerySet`` containing +a single element. + +If you know there is only one object that matches your query, you can use +the ``get()`` method on a ``Manager``, which returns the object directly:: + + >>> one_entry = Entry.objects.get(pk=1) + +You can use any query expression with ``get()``, just like with ``filter()`` - +again, see `Field lookups`_ below. 
+ +Note that there is a difference between using ``.get()``, and using +``.filter()`` with a slice of ``[0]``. If there are no results that match the +query, ``.get()`` will raise a ``DoesNotExist`` exception. This exception is an +attribute of the model class that the query is being performed on - so in the +code above, if there is no ``Entry`` object with a primary key of 1, Django will +raise ``Entry.DoesNotExist``. + +Similarly, Django will complain if more than one item matches the ``get()`` +query. In this case, it will raise ``MultipleObjectsReturned``, which again is +an attribute of the model class itself. + + +Other QuerySet methods +---------------------- + +Most of the time you'll use ``all()``, ``get()``, ``filter()`` and ``exclude()`` +when you need to look up objects from the database. However, that's far from all +there is; see the :ref:`QuerySet API Reference <queryset-api>` for a complete +list of all the various ``QuerySet`` methods. + +.. _limiting-querysets: + +Limiting QuerySets +------------------ + +Use a subset of Python's array-slicing syntax to limit your ``QuerySet`` to a +certain number of results. This is the equivalent of SQL's ``LIMIT`` and +``OFFSET`` clauses. + +For example, this returns the first 5 objects (``LIMIT 5``):: + + >>> Entry.objects.all()[:5] + +This returns the sixth through tenth objects (``OFFSET 5 LIMIT 5``):: + + >>> Entry.objects.all()[5:10] + +Negative indexing (i.e. ``Entry.objects.all()[-1]``) is not supported. + +Generally, slicing a ``QuerySet`` returns a new ``QuerySet`` -- it doesn't +evaluate the query. An exception is if you use the "step" parameter of Python +slice syntax. For example, this would actually execute the query in order to +return a list of every *second* object of the first 10:: + + >>> Entry.objects.all()[:10:2] + +To retrieve a *single* object rather than a list +(e.g. ``SELECT foo FROM bar LIMIT 1``), use a simple index instead of a +slice. 
For example, this returns the first ``Entry`` in the database, after +ordering entries alphabetically by headline:: + + >>> Entry.objects.order_by('headline')[0] + +This is roughly equivalent to:: + + >>> Entry.objects.order_by('headline')[0:1].get() + +Note, however, that the first of these will raise ``IndexError`` while the +second will raise ``DoesNotExist`` if no objects match the given criteria. See +:meth:`~django.db.models.QuerySet.get` for more details. + +.. _field-lookups-intro: + +Field lookups +------------- + +Field lookups are how you specify the meat of an SQL ``WHERE`` clause. They're +specified as keyword arguments to the ``QuerySet`` methods ``filter()``, +``exclude()`` and ``get()``. + +Basic lookup keyword arguments take the form ``field__lookuptype=value``. +(That's a double underscore.) For example:: + + >>> Entry.objects.filter(pub_date__lte='2006-01-01') + +translates (roughly) into the following SQL:: + + SELECT * FROM blog_entry WHERE pub_date <= '2006-01-01'; + +.. admonition:: How this is possible + + Python has the ability to define functions that accept arbitrary name-value + arguments whose names and values are evaluated at runtime. For more + information, see `Keyword Arguments`_ in the official Python tutorial. + + .. _`Keyword Arguments`: http://docs.python.org/tutorial/controlflow.html#keyword-arguments + +If you pass an invalid keyword argument, a lookup function will raise +``TypeError``. + +The database API supports about two dozen lookup types; a complete reference +can be found in the :ref:`field lookup reference <field-lookups>`. To give you a taste of what's available, here are some of the more common lookups +you'll probably use: + + :lookup:`exact` + An "exact" match. For example:: + + >>> Entry.objects.get(headline__exact="Man bites dog") + + Would generate SQL along these lines: + + .. code-block:: sql + + SELECT ... 
WHERE headline = 'Man bites dog'; + + If you don't provide a lookup type -- that is, if your keyword argument + doesn't contain a double underscore -- the lookup type is assumed to be + ``exact``. + + For example, the following two statements are equivalent:: + + >>> Blog.objects.get(id__exact=14) # Explicit form + >>> Blog.objects.get(id=14) # __exact is implied + + This is for convenience, because ``exact`` lookups are the common case. + + :lookup:`iexact` + A case-insensitive match. So, the query:: + + >>> Blog.objects.get(name__iexact="beatles blog") + + Would match a ``Blog`` titled "Beatles Blog", "beatles blog", or even + "BeAtlES blOG". + + :lookup:`contains` + Case-sensitive containment test. For example:: + + Entry.objects.get(headline__contains='Lennon') + + Roughly translates to this SQL: + + .. code-block:: sql + + SELECT ... WHERE headline LIKE '%Lennon%'; + + Note this will match the headline ``'Today Lennon honored'`` but not + ``'today lennon honored'``. + + There's also a case-insensitive version, :lookup:`icontains`. + + :lookup:`startswith`, :lookup:`endswith` + Starts-with and ends-with search, respectively. There are also + case-insensitive versions called :lookup:`istartswith` and + :lookup:`iendswith`. + +Again, this only scratches the surface. A complete reference can be found in the +:ref:`field lookup reference <field-lookups>`. + +Lookups that span relationships +------------------------------- + +Django offers a powerful and intuitive way to "follow" relationships in +lookups, taking care of the SQL ``JOIN``\s for you automatically, behind the +scenes. To span a relationship, just use the field name of related fields +across models, separated by double underscores, until you get to the field you +want. + +This example retrieves all ``Entry`` objects with a ``Blog`` whose ``name`` +is ``'Beatles Blog'``:: + + >>> Entry.objects.filter(blog__name__exact='Beatles Blog') + +This spanning can be as deep as you'd like. 
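To make the ``JOIN`` concrete, here is a rough sketch using Python's standard ``sqlite3`` module of the kind of SQL a lookup such as ``blog__name__exact='Beatles Blog'`` corresponds to. The table names `blog_blog` and `blog_entry`, the `blog_id` column, and the sample rows are assumptions for illustration; the SQL Django actually generates will differ in its details.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring Blog and Entry.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE blog_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES blog_blog (id),
        headline TEXT
    );
    INSERT INTO blog_blog VALUES (1, 'Beatles Blog'), (2, 'Cheddar Talk');
    INSERT INTO blog_entry VALUES
        (1, 1, 'Lennon honored'),
        (2, 2, 'Best cheddar'),
        (3, 1, 'Abbey Road');
    """
)

# Follow the foreign key from entry to blog and filter on the blog's name.
rows = conn.execute(
    """
    SELECT e.headline
    FROM blog_entry AS e
    INNER JOIN blog_blog AS b ON e.blog_id = b.id
    WHERE b.name = ?
    ORDER BY e.id
    """,
    ("Beatles Blog",),
).fetchall()
headlines = [headline for (headline,) in rows]
```

Each additional double-underscore step in a lookup adds another join of this kind.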
+ +It works backwards, too. To refer to a "reverse" relationship, just use the +lowercase name of the model. + +This example retrieves all ``Blog`` objects which have at least one ``Entry`` +whose ``headline`` contains ``'Lennon'``:: + + >>> Blog.objects.filter(entry__headline__contains='Lennon') + +If you are filtering across multiple relationships and one of the intermediate +models doesn't have a value that meets the filter condition, Django will treat +it as if there is an empty (all values are ``NULL``), but valid, object there. +All this means is that no error will be raised. For example, in this filter:: + + Blog.objects.filter(entry__authors__name='Lennon') + +(if there was a related ``Author`` model), if there was no ``author`` +associated with an entry, it would be treated as if there was also no ``name`` +attached, rather than raising an error because of the missing ``author``. +Usually this is exactly what you want to have happen. The only case where it +might be confusing is if you are using ``isnull``. Thus:: + + Blog.objects.filter(entry__authors__name__isnull=True) + +will return ``Blog`` objects that have an empty ``name`` on the ``author`` and +also those which have an empty ``author`` on the ``entry``. If you don't want +those latter objects, you could write:: + + Blog.objects.filter(entry__authors__isnull=False, + entry__authors__name__isnull=True) + +Spanning multi-valued relationships +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. versionadded:: 1.0 + +When you are filtering an object based on a ``ManyToManyField`` or a reverse +``ForeignKey``, there are two different sorts of filter you may be +interested in. Consider the ``Blog``/``Entry`` relationship (``Blog`` to +``Entry`` is a one-to-many relation). We might be interested in finding blogs +that have an entry which has both *"Lennon"* in the headline and was published +in 2008. 
Or we might want to find blogs that have an entry with *"Lennon"* in +the headline as well as an entry that was published in 2008. Since there are +multiple entries associated with a single ``Blog``, both of these queries are +possible and make sense in some situations. + +The same type of situation arises with a ``ManyToManyField``. For example, if +an ``Entry`` has a ``ManyToManyField`` called ``tags``, we might want to find +entries linked to tags called *"music"* and *"bands"* or we might want an +entry that contains a tag with a name of *"music"* and a status of *"public"*. + +To handle both of these situations, Django has a consistent way of processing +``filter()`` and ``exclude()`` calls. Everything inside a single ``filter()`` +call is applied simultaneously to filter out items matching all those +requirements. Successive ``filter()`` calls further restrict the set of +objects, but for multi-valued relations, they apply to any object linked to +the primary model, not necessarily those objects that were selected by an +earlier ``filter()`` call. + +That may sound a bit confusing, so hopefully an example will clarify. To +select all blogs that contain entries with both *"Lennon"* in the headline +and that were published in 2008 (the same entry satisfying both conditions), +we would write:: + + Blog.objects.filter(entry__headline__contains='Lennon', + entry__pub_date__year=2008) + +To select all blogs that contain an entry with *"Lennon"* in the headline +**as well as** an entry that was published in 2008, we would write:: + + Blog.objects.filter(entry__headline__contains='Lennon').filter( + entry__pub_date__year=2008) + +In this second example, the first filter restricted the queryset to all those +blogs linked to that particular type of entry. The second filter restricted +the set of blogs *further* to those that are also linked to the second type of +entry. 
The entries selected by the second filter may or may not be the same as +the entries in the first filter. We are filtering the ``Blog`` items with each +filter statement, not the ``Entry`` items. + +All of this behavior also applies to ``exclude()``: all the conditions in a +single ``exclude()`` statement apply to a single instance (if those conditions +are talking about the same multi-valued relation). Conditions in subsequent +``filter()`` or ``exclude()`` calls that refer to the same relation may end up +filtering on different linked objects. + +.. _query-expressions: + +Filters can reference fields on the model +----------------------------------------- + +.. versionadded:: 1.1 + +In the examples given so far, we have constructed filters that compare +the value of a model field with a constant. But what if you want to compare +the value of a model field with another field on the same model? + +Django provides the ``F()`` object to allow such comparisons. Instances +of ``F()`` act as a reference to a model field within a query. These +references can then be used in query filters to compare the values of two +different fields on the same model instance. + +For example, to find a list of all blog entries that have had more comments +than pingbacks, we construct an ``F()`` object to reference the pingback count, +and use that ``F()`` object in the query:: + + >>> from django.db.models import F + >>> Entry.objects.filter(n_comments__gt=F('n_pingbacks')) + +Django supports the use of addition, subtraction, multiplication, +division and modulo arithmetic with ``F()`` objects, both with constants +and with other ``F()`` objects. 
To find all the blog entries with more than +*twice* as many comments as pingbacks, we modify the query:: + + >>> Entry.objects.filter(n_comments__gt=F('n_pingbacks') * 2) + +To find all the entries where the rating of the entry is less than the +sum of the pingback count and comment count, we would issue the +query:: + + >>> Entry.objects.filter(rating__lt=F('n_comments') + F('n_pingbacks')) + +You can also use the double underscore notation to span relationships in +an ``F()`` object. An ``F()`` object with a double underscore will introduce +any joins needed to access the related object. For example, to retrieve all +the entries where the author's name is the same as the blog name, we could +issue the query:: + + >>> Entry.objects.filter(authors__name=F('blog__name')) + +The pk lookup shortcut +---------------------- + +For convenience, Django provides a ``pk`` lookup shortcut, which stands for +"primary key". + +In the example ``Blog`` model, the primary key is the ``id`` field, so these +three statements are equivalent:: + + >>> Blog.objects.get(id__exact=14) # Explicit form + >>> Blog.objects.get(id=14) # __exact is implied + >>> Blog.objects.get(pk=14) # pk implies id__exact + +The use of ``pk`` isn't limited to ``__exact`` queries -- any query term +can be combined with ``pk`` to perform a query on the primary key of a model:: + + # Get blogs with id 1, 4 and 7 + >>> Blog.objects.filter(pk__in=[1,4,7]) + + # Get all blogs with id > 14 + >>> Blog.objects.filter(pk__gt=14) + +``pk`` lookups also work across joins. 
For example, these three statements are +equivalent:: + + >>> Entry.objects.filter(blog__id__exact=3) # Explicit form + >>> Entry.objects.filter(blog__id=3) # __exact is implied + >>> Entry.objects.filter(blog__pk=3) # __pk implies __id__exact + +Escaping percent signs and underscores in LIKE statements +--------------------------------------------------------- + +The field lookups that equate to ``LIKE`` SQL statements (``iexact``, +``contains``, ``icontains``, ``startswith``, ``istartswith``, ``endswith`` +and ``iendswith``) will automatically escape the two special characters used in +``LIKE`` statements -- the percent sign and the underscore. (In a ``LIKE`` +statement, the percent sign signifies a multiple-character wildcard and the +underscore signifies a single-character wildcard.) + +This means things should work intuitively, so the abstraction doesn't leak. +For example, to retrieve all the entries that contain a percent sign, just use +the percent sign as any other character:: + + >>> Entry.objects.filter(headline__contains='%') + +Django takes care of the quoting for you; the resulting SQL will look something +like this: + +.. code-block:: sql + + SELECT ... WHERE headline LIKE '%\%%'; + +Same goes for underscores. Both percentage signs and underscores are handled +for you transparently. + +.. _caching-and-querysets: + +Caching and QuerySets +--------------------- + +Each ``QuerySet`` contains a cache, to minimize database access. It's important +to understand how it works, in order to write the most efficient code. + +In a newly created ``QuerySet``, the cache is empty. The first time a +``QuerySet`` is evaluated -- and, hence, a database query happens -- Django +saves the query results in the ``QuerySet``'s cache and returns the results +that have been explicitly requested (e.g., the next element, if the +``QuerySet`` is being iterated over). Subsequent evaluations of the +``QuerySet`` reuse the cached results. 
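The caching behavior described above can be sketched without a database. The class below is a deliberately simplified stand-in and not Django's ``QuerySet``: ``CachedResults`` and ``fake_query`` are hypothetical names, and the sketch loads every row on first evaluation, which is an assumption made for brevity.

```python
class CachedResults(object):
    """Toy stand-in for a QuerySet's result cache (not Django code)."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable standing in for the database query
        self._cache = None    # empty until the first evaluation

    def __iter__(self):
        if self._cache is None:
            # First evaluation: run the "query" and remember the rows.
            self._cache = list(self._fetch())
        # Every evaluation, including the first, reads from the cache.
        return iter(self._cache)


calls = []


def fake_query():
    # Record each time the pretend database is hit.
    calls.append("query")
    return ["Man bites dog", "Lennon honored"]


results = CachedResults(fake_query)
first = [headline for headline in results]           # runs fake_query once
second = [headline.upper() for headline in results]  # served from the cache
print(len(calls))
```

Iterating the same object twice triggers only one call to ``fake_query``; creating two separate ``CachedResults`` objects would trigger two.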
+ +Keep this caching behavior in mind, because it may bite you if you don't use +your ``QuerySet``\s correctly. For example, the following will create two +``QuerySet``\s, evaluate them, and throw them away:: + + >>> print [e.headline for e in Entry.objects.all()] + >>> print [e.pub_date for e in Entry.objects.all()] + +That means the same database query will be executed twice, effectively doubling +your database load. Also, there's a possibility the two lists may not include +the same database records, because an ``Entry`` may have been added or deleted +in the split second between the two requests. + +To avoid this problem, simply save the ``QuerySet`` and reuse it:: + + >>> queryset = Entry.objects.all() + >>> print [p.headline for p in queryset] # Evaluate the query set. + >>> print [p.pub_date for p in queryset] # Re-use the cache from the evaluation. + +.. _complex-lookups-with-q: + +Complex lookups with Q objects +============================== + +Keyword argument queries -- in ``filter()``, etc. -- are "AND"ed together. If +you need to execute more complex queries (for example, queries with ``OR`` +statements), you can use ``Q`` objects. + +A ``Q`` object (``django.db.models.Q``) is an object used to encapsulate a +collection of keyword arguments. These keyword arguments are specified as in +"Field lookups" above. + +For example, this ``Q`` object encapsulates a single ``LIKE`` query:: + + Q(question__startswith='What') + +``Q`` objects can be combined using the ``&`` and ``|`` operators. When an +operator is used on two ``Q`` objects, it yields a new ``Q`` object. 
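To make the "operator yields a new object" idea concrete, here is a minimal sketch using plain Python operator overloading. ``ToyQ`` is a hypothetical class written for this illustration; it filters in-memory dictionaries and says nothing about how ``django.db.models.Q`` builds SQL.

```python
class ToyQ(object):
    """Toy predicate combinator; not django.db.models.Q."""

    def __init__(self, test):
        self.test = test  # callable taking a row and returning True/False

    def __and__(self, other):
        # ``a & b`` builds a brand-new ToyQ; a and b are left unchanged.
        return ToyQ(lambda row: self.test(row) and other.test(row))

    def __or__(self, other):
        # ``a | b`` likewise returns a new ToyQ.
        return ToyQ(lambda row: self.test(row) or other.test(row))


starts_who = ToyQ(lambda row: row["question"].startswith("Who"))
starts_what = ToyQ(lambda row: row["question"].startswith("What"))
either = starts_who | starts_what

rows = [
    {"question": "Who is it?"},
    {"question": "What now?"},
    {"question": "Why?"},
]
matches = [row["question"] for row in rows if either.test(row)]
print(matches)
```

Because each operator returns a fresh object, combined conditions can themselves be combined again, which is what allows arbitrarily nested expressions.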
+ +For example, this statement yields a single ``Q`` object that represents the +"OR" of two ``"question__startswith"`` queries:: + + Q(question__startswith='Who') | Q(question__startswith='What') + +This is equivalent to the following SQL ``WHERE`` clause:: + + WHERE question LIKE 'Who%' OR question LIKE 'What%' + +You can compose statements of arbitrary complexity by combining ``Q`` objects +with the ``&`` and ``|`` operators and use parenthetical grouping. Also, ``Q`` +objects can be negated using the ``~`` operator, allowing for combined lookups +that combine both a normal query and a negated (``NOT``) query:: + + Q(question__startswith='Who') | ~Q(pub_date__year=2005) + +Each lookup function that takes keyword-arguments (e.g. ``filter()``, +``exclude()``, ``get()``) can also be passed one or more ``Q`` objects as +positional (not-named) arguments. If you provide multiple ``Q`` object +arguments to a lookup function, the arguments will be "AND"ed together. For +example:: + + Poll.objects.get( + Q(question__startswith='Who'), + Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)) + ) + +... roughly translates into the SQL:: + + SELECT * from polls WHERE question LIKE 'Who%' + AND (pub_date = '2005-05-02' OR pub_date = '2005-05-06') + +Lookup functions can mix the use of ``Q`` objects and keyword arguments. All +arguments provided to a lookup function (be they keyword arguments or ``Q`` +objects) are "AND"ed together. However, if a ``Q`` object is provided, it must +precede the definition of any keyword arguments. For example:: + + Poll.objects.get( + Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)), + question__startswith='Who') + +... would be a valid query, equivalent to the previous example; but:: + + # INVALID QUERY + Poll.objects.get( + question__startswith='Who', + Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6))) + +... would not be valid. + +.. 
seealso:: + + The `OR lookups examples`_ in the Django unit tests show some possible uses + of ``Q``. + + .. _OR lookups examples: http://code.djangoproject.com/browser/django/trunk/tests/modeltests/or_lookups/tests.py + +Comparing objects +================= + +To compare two model instances, just use the standard Python comparison operator, +the double equals sign: ``==``. Behind the scenes, that compares the primary +key values of two models. + +Using the ``Entry`` example above, the following two statements are equivalent:: + + >>> some_entry == other_entry + >>> some_entry.id == other_entry.id + +If a model's primary key isn't called ``id``, no problem. Comparisons will +always use the primary key, whatever it's called. For example, if a model's +primary key field is called ``name``, these two statements are equivalent:: + + >>> some_obj == other_obj + >>> some_obj.name == other_obj.name + +.. _topics-db-queries-delete: + +Deleting objects +================ + +The delete method, conveniently, is named ``delete()``. This method immediately +deletes the object and has no return value. Example:: + + e.delete() + +You can also delete objects in bulk. Every ``QuerySet`` has a ``delete()`` +method, which deletes all members of that ``QuerySet``. + +For example, this deletes all ``Entry`` objects with a ``pub_date`` year of +2005:: + + Entry.objects.filter(pub_date__year=2005).delete() + +Keep in mind that this will, whenever possible, be executed purely in +SQL, and so the ``delete()`` methods of individual object instances +will not necessarily be called during the process. If you've provided +a custom ``delete()`` method on a model class and want to ensure that +it is called, you will need to "manually" delete instances of that +model (e.g., by iterating over a ``QuerySet`` and calling ``delete()`` +on each object individually) rather than using the bulk ``delete()`` +method of a ``QuerySet``. 
+ +When Django deletes an object, it emulates the behavior of the SQL +constraint ``ON DELETE CASCADE`` -- in other words, any objects which +had foreign keys pointing at the object to be deleted will be deleted +along with it. For example:: + + b = Blog.objects.get(pk=1) + # This will delete the Blog and all of its Entry objects. + b.delete() + +Note that ``delete()`` is the only ``QuerySet`` method that is not exposed on a +``Manager`` itself. This is a safety mechanism to prevent you from accidentally +requesting ``Entry.objects.delete()``, and deleting *all* the entries. If you +*do* want to delete all the objects, then you have to explicitly request a +complete query set:: + + Entry.objects.all().delete() + +.. _topics-db-queries-update: + +Updating multiple objects at once +================================= + +.. versionadded:: 1.0 + +Sometimes you want to set a field to a particular value for all the objects in +a ``QuerySet``. You can do this with the ``update()`` method. For example:: + + # Update all the headlines with pub_date in 2007. + Entry.objects.filter(pub_date__year=2007).update(headline='Everything is the same') + +You can only set non-relation fields and ``ForeignKey`` fields using this +method. To update a non-relation field, provide the new value as a constant. +To update ``ForeignKey`` fields, set the new value to be the new model +instance you want to point to. For example:: + + >>> b = Blog.objects.get(pk=1) + + # Change every Entry so that it belongs to this Blog. + >>> Entry.objects.all().update(blog=b) + +The ``update()`` method is applied instantly and returns the number of rows +affected by the query. The only restriction on the ``QuerySet`` that is +updated is that it can only access one database table, the model's main +table. You can filter based on related fields, but you can only update columns +in the model's main table. Example:: + + >>> b = Blog.objects.get(pk=1) + + # Update all the headlines belonging to this Blog. 
+ >>> Entry.objects.select_related().filter(blog=b).update(headline='Everything is the same') + +Be aware that the ``update()`` method is converted directly to an SQL +statement. It is a bulk operation for direct updates. It doesn't run any +``save()`` methods on your models, or emit the ``pre_save`` or ``post_save`` +signals (which are a consequence of calling ``save()``). If you want to save +every item in a ``QuerySet`` and make sure that the ``save()`` method is +called on each instance, you don't need any special function to handle that. +Just loop over them and call ``save()``:: + + for item in my_queryset: + item.save() + +.. versionadded:: 1.1 + +Calls to update can also use :ref:`F() objects <query-expressions>` to update +one field based on the value of another field in the model. This is especially +useful for incrementing counters based upon their current value. For example, to +increment the pingback count for every entry in the blog:: + + >>> Entry.objects.all().update(n_pingbacks=F('n_pingbacks') + 1) + +However, unlike ``F()`` objects in filter and exclude clauses, you can't +introduce joins when you use ``F()`` objects in an update -- you can only +reference fields local to the model being updated. If you attempt to introduce +a join with an ``F()`` object, a ``FieldError`` will be raised:: + + # THIS WILL RAISE A FieldError + >>> Entry.objects.update(headline=F('blog__name')) + +Related objects +=============== + +When you define a relationship in a model (i.e., a ``ForeignKey``, +``OneToOneField``, or ``ManyToManyField``), instances of that model will have +a convenient API to access the related object(s). + +Using the models at the top of this page, for example, an ``Entry`` object ``e`` +can get its associated ``Blog`` object by accessing the ``blog`` attribute: +``e.blog``. + +(Behind the scenes, this functionality is implemented by Python descriptors_. +This shouldn't really matter to you, but we point it out here for the curious.) 
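For the curious, the descriptor idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions and not Django's implementation: ``RelatedObject``, ``ToyEntry`` and ``load_blog`` are hypothetical names, and the "database" is just a function that records each lookup.

```python
class RelatedObject(object):
    """Toy descriptor: load a related object on first access, then cache it."""

    def __init__(self, name, loader):
        self.cache_name = "_%s_cache" % name
        self.loader = loader  # callable standing in for a database lookup

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        if self.cache_name not in instance.__dict__:
            # First access on this instance: perform the lookup and cache it.
            instance.__dict__[self.cache_name] = self.loader(instance)
        return instance.__dict__[self.cache_name]


lookups = []


def load_blog(entry):
    # Record each pretend database hit.
    lookups.append(entry.blog_id)
    return {"id": entry.blog_id, "name": "Beatles Blog"}


class ToyEntry(object):
    blog = RelatedObject("blog", load_blog)

    def __init__(self, blog_id):
        self.blog_id = blog_id


e = ToyEntry(blog_id=1)
e.blog  # triggers load_blog
e.blog  # served from the per-instance cache
print(len(lookups))
```

The attribute access ``e.blog`` looks like a plain attribute read, yet the descriptor decides when a lookup is actually performed.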
+ +Django also creates API accessors for the "other" side of the relationship -- +the link from the related model to the model that defines the relationship. +For example, a ``Blog`` object ``b`` has access to a list of all related +``Entry`` objects via the ``entry_set`` attribute: ``b.entry_set.all()``. + +All examples in this section use the sample ``Blog``, ``Author`` and ``Entry`` +models defined at the top of this page. + +.. _descriptors: http://users.rcn.com/python/download/Descriptor.htm + +One-to-many relationships +------------------------- + +Forward +~~~~~~~ + +If a model has a ``ForeignKey``, instances of that model will have access to +the related (foreign) object via a simple attribute of the model. + +Example:: + + >>> e = Entry.objects.get(id=2) + >>> e.blog # Returns the related Blog object. + +You can get and set via a foreign-key attribute. As you may expect, changes to +the foreign key aren't saved to the database until you call ``save()``. +Example:: + + >>> e = Entry.objects.get(id=2) + >>> e.blog = some_blog + >>> e.save() + +If a ``ForeignKey`` field has ``null=True`` set (i.e., it allows ``NULL`` +values), you can assign ``None`` to it. Example:: + + >>> e = Entry.objects.get(id=2) + >>> e.blog = None + >>> e.save() # "UPDATE blog_entry SET blog_id = NULL ...;" + +Forward access to one-to-many relationships is cached the first time the +related object is accessed. Subsequent accesses to the foreign key on the same +object instance are cached. Example:: + + >>> e = Entry.objects.get(id=2) + >>> print e.blog # Hits the database to retrieve the associated Blog. + >>> print e.blog # Doesn't hit the database; uses cached version. + +Note that the ``select_related()`` ``QuerySet`` method recursively prepopulates +the cache of all one-to-many relationships ahead of time. Example:: + + >>> e = Entry.objects.select_related().get(id=2) + >>> print e.blog # Doesn't hit the database; uses cached version. 
+ >>> print e.blog # Doesn't hit the database; uses cached version. + +.. _backwards-related-objects: + +Following relationships "backward" +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If a model has a ``ForeignKey``, instances of the foreign-key model will have +access to a ``Manager`` that returns all instances of the first model. By +default, this ``Manager`` is named ``FOO_set``, where ``FOO`` is the source +model name, lowercased. This ``Manager`` returns ``QuerySets``, which can be +filtered and manipulated as described in the "Retrieving objects" section +above. + +Example:: + + >>> b = Blog.objects.get(id=1) + >>> b.entry_set.all() # Returns all Entry objects related to Blog. + + # b.entry_set is a Manager that returns QuerySets. + >>> b.entry_set.filter(headline__contains='Lennon') + >>> b.entry_set.count() + +You can override the ``FOO_set`` name by setting the ``related_name`` +parameter in the ``ForeignKey()`` definition. For example, if the ``Entry`` +model was altered to ``blog = ForeignKey(Blog, related_name='entries')``, the +above example code would look like this:: + + >>> b = Blog.objects.get(id=1) + >>> b.entries.all() # Returns all Entry objects related to Blog. + + # b.entries is a Manager that returns QuerySets. + >>> b.entries.filter(headline__contains='Lennon') + >>> b.entries.count() + +You cannot access a reverse ``ForeignKey`` ``Manager`` from the class; it must +be accessed from an instance:: + + >>> Blog.entry_set + Traceback: + ... + AttributeError: "Manager must be accessed via instance". + +In addition to the ``QuerySet`` methods defined in "Retrieving objects" above, +the ``ForeignKey`` ``Manager`` has additional methods used to handle the set of +related objects. A synopsis of each is below, and complete details can be found +in the :doc:`related objects reference </ref/models/relations>`. + +``add(obj1, obj2, ...)`` + Adds the specified model objects to the related object set. 
+ +``create(**kwargs)`` + Creates a new object, saves it and puts it in the related object set. + Returns the newly created object. + +``remove(obj1, obj2, ...)`` + Removes the specified model objects from the related object set. + +``clear()`` + Removes all objects from the related object set. + +To assign the members of a related set in one fell swoop, just assign to it +from any iterable object. The iterable can contain object instances, or just +a list of primary key values. For example:: + + b = Blog.objects.get(id=1) + b.entry_set = [e1, e2] + +In this example, ``e1`` and ``e2`` can be full Entry instances, or integer +primary key values. + +If the ``clear()`` method is available, any pre-existing objects will be +removed from the ``entry_set`` before all objects in the iterable (in this +case, a list) are added to the set. If the ``clear()`` method is *not* +available, all objects in the iterable will be added without removing any +existing elements. + +Each "reverse" operation described in this section has an immediate effect on +the database. Every addition, creation and deletion is immediately and +automatically saved to the database. + +Many-to-many relationships +-------------------------- + +Both ends of a many-to-many relationship get automatic API access to the other +end. The API works just as a "backward" one-to-many relationship, above. + +The only difference is in the attribute naming: The model that defines the +``ManyToManyField`` uses the attribute name of that field itself, whereas the +"reverse" model uses the lowercased model name of the original model, plus +``'_set'`` (just like reverse one-to-many relationships). + +An example makes this easier to understand:: + + e = Entry.objects.get(id=3) + e.authors.all() # Returns all Author objects for this Entry. + e.authors.count() + e.authors.filter(name__contains='John') + + a = Author.objects.get(id=5) + a.entry_set.all() # Returns all Entry objects for this Author. 
+ +Like ``ForeignKey``, ``ManyToManyField`` can specify ``related_name``. In the +above example, if the ``ManyToManyField`` in ``Entry`` had specified +``related_name='entries'``, then each ``Author`` instance would have an +``entries`` attribute instead of ``entry_set``. + +One-to-one relationships +------------------------ + +One-to-one relationships are very similar to many-to-one relationships. If you +define a :class:`~django.db.models.OneToOneField` on your model, instances of +that model will have access to the related object via a simple attribute of the +model. + +For example:: + + class EntryDetail(models.Model): + entry = models.OneToOneField(Entry) + details = models.TextField() + + ed = EntryDetail.objects.get(id=2) + ed.entry # Returns the related Entry object. + +The difference comes in "reverse" queries. The related model in a one-to-one +relationship also has access to a :class:`~django.db.models.Manager` object, but +that :class:`~django.db.models.Manager` represents a single object, rather than +a collection of objects:: + + e = Entry.objects.get(id=2) + e.entrydetail # returns the related EntryDetail object + +If no object has been assigned to this relationship, Django will raise +a ``DoesNotExist`` exception. + +Instances can be assigned to the reverse relationship in the same way as +you would assign the forward relationship:: + + e.entrydetail = ed + +How are the backward relationships possible? +-------------------------------------------- + +Other object-relational mappers require you to define relationships on both +sides. The Django developers believe this is a violation of the DRY (Don't +Repeat Yourself) principle, so Django only requires you to define the +relationship on one end. + +But how is this possible, given that a model class doesn't know which other +model classes are related to it until those other model classes are loaded? + +The answer lies in the :setting:`INSTALLED_APPS` setting. 
The first time any model is +loaded, Django iterates over every model in :setting:`INSTALLED_APPS` and creates the +backward relationships in memory as needed. Essentially, one of the functions +of :setting:`INSTALLED_APPS` is to tell Django the entire model domain. + +Queries over related objects +---------------------------- + +Queries involving related objects follow the same rules as queries involving +normal value fields. When specifying the value for a query to match, you may +use either an object instance itself, or the primary key value for the object. + +For example, if you have a Blog object ``b`` with ``id=5``, the following +three queries would be identical:: + + Entry.objects.filter(blog=b) # Query using object instance + Entry.objects.filter(blog=b.id) # Query using id from instance + Entry.objects.filter(blog=5) # Query using id directly + +Falling back to raw SQL +======================= + +If you find yourself needing to write an SQL query that is too complex for +Django's database-mapper to handle, you can fall back on writing SQL by hand. +Django has a couple of options for writing raw SQL queries; see +:doc:`/topics/db/sql`. + +Finally, it's important to note that the Django database layer is merely an +interface to your database. You can access your database via other tools, +programming languages or database frameworks; there's nothing Django-specific +about your database. diff --git a/parts/django/docs/topics/db/sql.txt b/parts/django/docs/topics/db/sql.txt new file mode 100644 index 0000000..cac9a72 --- /dev/null +++ b/parts/django/docs/topics/db/sql.txt @@ -0,0 +1,279 @@ +========================== +Performing raw SQL queries +========================== + +.. currentmodule:: django.db.models + +When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you +can fall back to writing raw SQL. 
Django gives you two ways of performing raw +SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and +return model instances`__, or you can avoid the model layer entirely and +`execute custom SQL directly`__. + +__ `performing raw queries`_ +__ `executing custom SQL directly`_ + +Performing raw queries +====================== + +.. versionadded:: 1.2 + +The ``raw()`` manager method can be used to perform raw SQL queries that +return model instances: + +.. method:: Manager.raw(raw_query, params=None, translations=None) + +This method takes a raw SQL query, executes it, and returns a +:class:`~django.db.models.query.RawQuerySet` instance. This +:class:`~django.db.models.query.RawQuerySet` instance can be iterated +over just like a normal ``QuerySet`` to provide object instances. + +This is best illustrated with an example. Suppose you've got the following model:: + + class Person(models.Model): + first_name = models.CharField(...) + last_name = models.CharField(...) + birth_date = models.DateField(...) + +You could then execute custom SQL like so:: + + >>> for p in Person.objects.raw('SELECT * FROM myapp_person'): + ... print p + John Smith + Jane Jones + +.. admonition:: Model table names + + Where'd the name of the ``Person`` table come from in that example? + + By default, Django figures out a database table name by joining the + model's "app label" -- the name you used in ``manage.py startapp`` -- to + the model's class name, with an underscore between them. In the example + we've assumed that the ``Person`` model lives in an app named ``myapp``, + so its table would be ``myapp_person``. + + For more details check out the documentation for the + :attr:`~Options.db_table` option, which also lets you manually set the + database table name. + +Of course, this example isn't very exciting -- it's exactly the same as +running ``Person.objects.all()``. However, ``raw()`` has a bunch of other +options that make it very powerful. 
+ +Mapping query fields to model fields +------------------------------------ + +``raw()`` automatically maps fields in the query to fields on the model. + +The order of fields in your query doesn't matter. In other words, both +of the following queries work identically:: + + >>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person') + ... + >>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person') + ... + +Matching is done by name. This means that you can use SQL's ``AS`` clauses to +map fields in the query to model fields. So if you had some other table that +had ``Person`` data in it, you could easily map it into ``Person`` instances:: + + >>> Person.objects.raw('''SELECT first AS first_name, + ... last AS last_name, + ... bd AS birth_date, + ... pk AS id + ... FROM some_other_table''') + +As long as the names match, the model instances will be created correctly. + +Alternatively, you can map fields in the query to model fields using the +``translations`` argument to ``raw()``. This is a dictionary mapping names of +fields in the query to names of fields on the model. For example, the above +query could also be written:: + + >>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'} + >>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map) + +Index lookups +------------- + +``raw()`` supports indexing, so if you need only the first result you can +write:: + + >>> first_person = Person.objects.raw('SELECT * FROM myapp_person')[0] + +However, the indexing and slicing are not performed at the database level. 
If +you have a large number of ``Person`` objects in your database, it is more +efficient to limit the query at the SQL level:: + + >>> first_person = Person.objects.raw('SELECT * FROM myapp_person LIMIT 1')[0] + +Deferring model fields +---------------------- + +Fields may also be left out:: + + >>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person') + +The ``Person`` objects returned by this query will be deferred model instances +(see :meth:`~django.db.models.QuerySet.defer()`). This means that the fields +that are omitted from the query will be loaded on demand. For example:: + + >>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'): + ... print p.first_name, # This will be retrieved by the original query + ... print p.last_name # This will be retrieved on demand + ... + John Smith + Jane Jones + +From outward appearances, this looks like the query has retrieved both +the first name and last name. However, this example actually issued three +queries. Only the first names were retrieved by the ``raw()`` query -- the +last names were both retrieved on demand when they were printed. + +There is only one field that you can't leave out -- the primary key +field. Django uses the primary key to identify model instances, so it +must always be included in a raw query. An ``InvalidQuery`` exception +will be raised if you forget to include the primary key. + +Adding annotations +------------------ + +You can also execute queries containing fields that aren't defined on the +model. For example, we could use `PostgreSQL's age() function`__ to get a list +of people with their ages calculated by the database:: + + >>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person') + >>> for p in people: + ... print "%s is %s." % (p.first_name, p.age) + John is 37. + Jane is 42. + ... 
+ +__ http://www.postgresql.org/docs/8.4/static/functions-datetime.html + +Passing parameters into ``raw()`` +--------------------------------- + +If you need to perform parameterized queries, you can use the ``params`` +argument to ``raw()``:: + + >>> lname = 'Doe' + >>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname]) + +``params`` is a list of parameters. You'll use ``%s`` placeholders in the +query string (regardless of your database engine); they'll be replaced with +parameters from the ``params`` list. + +.. warning:: + + **Do not use string formatting on raw queries!** + + It's tempting to write the above query as:: + + >>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname + >>> Person.objects.raw(query) + + **Don't.** + + Using the ``params`` list completely protects you from `SQL injection + attacks`__, a common exploit where attackers inject arbitrary SQL into + your database. If you use string interpolation, sooner or later you'll + fall victim to SQL injection. As long as you remember to always use the + ``params`` list you'll be protected. + +__ http://en.wikipedia.org/wiki/SQL_injection + +Executing custom SQL directly +============================= + +Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to +perform queries that don't map cleanly to models, or directly execute +``UPDATE``, ``INSERT``, or ``DELETE`` queries. + +In these cases, you can always access the database directly, routing around +the model layer entirely. + +The object ``django.db.connection`` represents the +default database connection, and ``django.db.transaction`` represents the +default database transaction. To use the database connection, call +``connection.cursor()`` to get a cursor object. Then, call +``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()`` +or ``cursor.fetchall()`` to return the resulting rows. 
After performing a +data-changing operation, you should then call +``transaction.commit_unless_managed()`` to ensure your changes are committed +to the database. If your query is purely a data retrieval operation, no commit +is required. For example:: + + def my_custom_sql(self): + from django.db import connection, transaction + cursor = connection.cursor() + + # Data modifying operation - commit required + cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz]) + transaction.commit_unless_managed() + + # Data retrieval operation - no commit required + cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz]) + row = cursor.fetchone() + + return row + +If you are using more than one database you can use +``django.db.connections`` to obtain the connection (and cursor) for a +specific database. ``django.db.connections`` is a dictionary-like +object that allows you to retrieve a specific connection using its +alias:: + + from django.db import connections + cursor = connections['my_db_alias'].cursor() + +.. _transactions-and-raw-sql: + +Transactions and raw SQL +------------------------ +If you are using transaction decorators (such as ``commit_on_success``) to +wrap your views and provide transaction control, you don't have to make a +manual call to ``transaction.commit_unless_managed()`` -- you can manually +commit if you want to, but you aren't required to, since the decorator will +commit for you. However, if you don't manually commit your changes, you will +need to manually mark the transaction as dirty, using +``transaction.set_dirty()``:: + + @commit_on_success + def my_custom_sql_view(request, value): + from django.db import connection, transaction + cursor = connection.cursor() + + # Data modifying operation + cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [value]) + + # Since we modified data, mark the transaction as dirty + transaction.set_dirty() + + # Data retrieval operation.
This doesn't dirty the transaction, + # so no call to set_dirty() is required. + cursor.execute("SELECT foo FROM bar WHERE baz = %s", [value]) + row = cursor.fetchone() + + return render_to_response('template.html', {'row': row}) + +The call to ``set_dirty()`` is made automatically when you use the Django ORM +to make data modifying database calls. However, when you use raw SQL, Django +has no way of knowing if your SQL modifies data or not. The manual call to +``set_dirty()`` ensures that Django knows that there are modifications that +must be committed. + +Connections and cursors +----------------------- + +``connection`` and ``cursor`` mostly implement the standard `Python DB-API`_ +(except when it comes to :doc:`transaction handling </topics/db/transactions>`). +If you're not familiar with the Python DB-API, note that the SQL statement in +``cursor.execute()`` uses placeholders, ``"%s"``, rather than adding parameters +directly within the SQL. If you use this technique, the underlying database +library will automatically add quotes and escaping to your parameter(s) as +necessary. (Also note that Django expects the ``"%s"`` placeholder, *not* the +``"?"`` placeholder, which is used by the SQLite Python bindings. This is for +the sake of consistency and sanity.) + +.. _Python DB-API: http://www.python.org/dev/peps/pep-0249/ diff --git a/parts/django/docs/topics/db/transactions.txt b/parts/django/docs/topics/db/transactions.txt new file mode 100644 index 0000000..be9d9a8 --- /dev/null +++ b/parts/django/docs/topics/db/transactions.txt @@ -0,0 +1,328 @@ +============================== +Managing database transactions +============================== + +.. currentmodule:: django.db + +Django gives you a few ways to control how database transactions are managed, +if you're using a database that supports transactions. 
+ +Django's default transaction behavior +===================================== + +Django's default behavior is to run with an open transaction which it +commits automatically when any built-in, data-altering model function is +called. For example, if you call ``model.save()`` or ``model.delete()``, the +change will be committed immediately. + +This is much like the auto-commit setting for most databases. As soon as you +perform an action that needs to write to the database, Django produces the +``INSERT``/``UPDATE``/``DELETE`` statements and then does the ``COMMIT``. +There's no implicit ``ROLLBACK``. + +Tying transactions to HTTP requests +=================================== + +The recommended way to handle transactions in Web requests is to tie them to +the request and response phases via Django's ``TransactionMiddleware``. + +It works like this: When a request starts, Django starts a transaction. If the +response is produced without problems, Django commits any pending transactions. +If the view function produces an exception, Django rolls back any pending +transactions. + +To activate this feature, just add the ``TransactionMiddleware`` middleware to +your :setting:`MIDDLEWARE_CLASSES` setting:: + + MIDDLEWARE_CLASSES = ( + 'django.middleware.cache.UpdateCacheMiddleware', + 'django.contrib.sessions.middleware.SessionMiddleware', + 'django.middleware.common.CommonMiddleware', + 'django.middleware.transaction.TransactionMiddleware', + 'django.middleware.cache.FetchFromCacheMiddleware', + ) + +The order is quite important. The transaction middleware applies not only to +view functions, but also to all middleware modules that come after it. So if +you use the session middleware after the transaction middleware, session +creation will be part of the transaction.
+ +The various cache middlewares are an exception: +:class:`~django.middleware.cache.CacheMiddleware`, +:class:`~django.middleware.cache.UpdateCacheMiddleware`, and +:class:`~django.middleware.cache.FetchFromCacheMiddleware` are never affected. +Even when using database caching, Django's cache backend uses its own +database cursor (which is mapped to its own database connection internally). + +Controlling transaction management in views +=========================================== + +For most people, implicit request-based transactions work wonderfully. However, +if you need more fine-grained control over how transactions are managed, you +can use Python decorators to change the way transactions are handled by a +particular view function. All of the decorators take an optional ``using`` +parameter which should be the alias for the database connection to which the +behavior applies. If no alias is specified then the ``"default"`` database +is used. + +.. note:: + + Although the examples below use view functions, these + decorators can be applied to non-view functions as well. + +.. _topics-db-transactions-autocommit: + +``django.db.transaction.autocommit`` +------------------------------------ + +Use the ``autocommit`` decorator to switch a view function to Django's default +commit behavior, regardless of the global transaction setting. + +Example:: + + from django.db import transaction + + @transaction.autocommit + def viewfunc(request): + .... + + @transaction.autocommit(using="my_other_database") + def viewfunc2(request): + .... + +Within ``viewfunc()``, transactions will be committed as soon as you call +``model.save()``, ``model.delete()``, or any other function that writes to the +database. ``viewfunc2()`` will have this same behavior, but for the +``"my_other_database"`` connection.
+ +``django.db.transaction.commit_on_success`` +------------------------------------------- + +Use the ``commit_on_success`` decorator to use a single transaction for +all the work done in a function:: + + from django.db import transaction + + @transaction.commit_on_success + def viewfunc(request): + .... + + @transaction.commit_on_success(using="my_other_database") + def viewfunc2(request): + .... + +If the function returns successfully, then Django will commit all work done +within the function at that point. If the function raises an exception, though, +Django will roll back the transaction. + +``django.db.transaction.commit_manually`` +----------------------------------------- + +Use the ``commit_manually`` decorator if you need full control over +transactions. It tells Django you'll be managing the transaction on your own. + +If your view changes data and doesn't ``commit()`` or ``rollback()``, Django +will raise a ``TransactionManagementError`` exception. + +Manual transaction management looks like this:: + + from django.db import transaction + + @transaction.commit_manually + def viewfunc(request): + ... + # You can commit/rollback however and whenever you want + transaction.commit() + ... + + # But you've got to remember to do it yourself! + try: + ... + except: + transaction.rollback() + else: + transaction.commit() + + @transaction.commit_manually(using="my_other_database") + def viewfunc2(request): + .... + +.. admonition:: An important note to users of earlier Django releases: + + The database ``connection.commit()`` and ``connection.rollback()`` methods + (called ``db.commit()`` and ``db.rollback()`` in 0.91 and earlier) no + longer exist. They've been replaced by ``transaction.commit()`` and + ``transaction.rollback()``. 
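The commit-on-success behavior described above can be illustrated outside Django. The following is a minimal sketch using Python's standard ``sqlite3`` module, not Django's implementation: the ``commit_on_success`` helper defined here is a hypothetical stand-in that takes a DB-API connection, and the ``bar`` table and ``add_rows`` function are made up for the example. It shows only the semantics: commit if the wrapped function returns, roll back if it raises.

```python
import functools
import sqlite3


def commit_on_success(conn):
    """Hypothetical helper (not Django's): run the wrapped function
    inside a single transaction on the given DB-API connection."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception:
                conn.rollback()  # undo all work done inside func
                raise
            conn.commit()  # keep all work done inside func
            return result
        return wrapper
    return decorator


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER)")
conn.commit()


@commit_on_success(conn)
def add_rows(values, fail=False):
    # sqlite3 uses "?" placeholders; Django's cursor expects "%s".
    for value in values:
        conn.execute("INSERT INTO bar (foo) VALUES (?)", [value])
    if fail:
        raise ValueError("simulated failure")


def count_rows():
    return conn.execute("SELECT COUNT(*) FROM bar").fetchone()[0]


add_rows([1, 2])                 # returns normally, so both rows are committed
try:
    add_rows([3, 4], fail=True)  # raises, so both inserts are rolled back
except ValueError:
    pass
```

After both calls, ``count_rows()`` reports only the two rows from the successful call. With ``commit_manually``, by contrast, the explicit ``commit()`` or ``rollback()`` would live inside the function body instead of in a wrapper.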
+ +How to globally deactivate transaction management +================================================= + +Control freaks can totally disable all transaction management by setting +``DISABLE_TRANSACTION_MANAGEMENT`` to ``True`` in the Django settings file. + +If you do this, Django won't provide any automatic transaction management +whatsoever. Middleware will no longer implicitly commit transactions, and +you'll need to manage transactions yourself. This even requires you to commit +changes done by middleware somewhere else. + +Thus, this is best used in situations where you want to run your own +transaction-controlling middleware or do something really strange. In almost +all situations, you'll be better off using the default behavior, or the +transaction middleware, and only modify selected functions as needed. + +.. _topics-db-transactions-savepoints: + +Savepoints +========== + +A savepoint is a marker within a transaction that enables you to roll back +part of a transaction, rather than the full transaction. Savepoints are +available to the PostgreSQL 8 and Oracle backends. Other backends will +provide the savepoint functions, but they are empty operations - they won't +actually do anything. + +Savepoints aren't especially useful if you are using the default +``autocommit`` behaviour of Django. However, if you are using +``commit_on_success`` or ``commit_manually``, each open transaction will build +up a series of database operations, awaiting a commit or rollback. If you +issue a rollback, the entire transaction is rolled back. Savepoints provide +the ability to perform a fine-grained rollback, rather than the full rollback +that would be performed by ``transaction.rollback()``. + +Each of these functions takes a ``using`` argument which should be the name of +a database for which the behavior applies. If no ``using`` argument is +provided then the ``"default"`` database is used. + +Savepoints are controlled by three methods on the transaction object: + +..
method:: transaction.savepoint(using=None) + + Creates a new savepoint. This marks a point in the transaction that + is known to be in a "good" state. + + Returns the savepoint ID (sid). + +.. method:: transaction.savepoint_commit(sid, using=None) + + Updates the savepoint to include any operations that have been performed + since the savepoint was created, or since the last commit. + +.. method:: transaction.savepoint_rollback(sid, using=None) + + Rolls the transaction back to the last point at which the savepoint was + committed. + +The following example demonstrates the use of savepoints:: + + from django.db import transaction + + @transaction.commit_manually + def viewfunc(request): + + a.save() + # open transaction now contains a.save() + sid = transaction.savepoint() + + b.save() + # open transaction now contains a.save() and b.save() + + if want_to_keep_b: + transaction.savepoint_commit(sid) + # open transaction still contains a.save() and b.save() + else: + transaction.savepoint_rollback(sid) + # open transaction now contains only a.save() + + transaction.commit() + +Transactions in MySQL +===================== + +If you're using MySQL, your tables may or may not support transactions; it +depends on your MySQL version and the table types you're using. (By +"table types," we mean something like "InnoDB" or "MyISAM".) MySQL transaction +peculiarities are outside the scope of this article, but the MySQL site has +`information on MySQL transactions`_. + +If your MySQL setup does *not* support transactions, then Django will function +in auto-commit mode: Statements will be executed and committed as soon as +they're called. If your MySQL setup *does* support transactions, Django will +handle transactions as explained in this document. + +.. 
_information on MySQL transactions: http://dev.mysql.com/doc/refman/5.0/en/sql-syntax-transactions.html + +Handling exceptions within PostgreSQL transactions +================================================== + +When a call to a PostgreSQL cursor raises an exception (typically +``IntegrityError``), all subsequent SQL in the same transaction will fail with +the error "current transaction is aborted, queries ignored until end of +transaction block". Whilst simple use of ``save()`` is unlikely to raise an +exception in PostgreSQL, there are more advanced usage patterns which +might, such as saving objects with unique fields, saving using the +force_insert/force_update flag, or invoking custom SQL. + +There are several ways to recover from this sort of error. + +Transaction rollback +-------------------- + +The first option is to roll back the entire transaction. For example:: + + a.save() # Succeeds, but may be undone by transaction rollback + try: + b.save() # Could throw exception + except IntegrityError: + transaction.rollback() + c.save() # Succeeds, but a.save() may have been undone + +Calling ``transaction.rollback()`` rolls back the entire transaction. Any +uncommitted database operations will be lost. In this example, the changes +made by ``a.save()`` would be lost, even though that operation raised no error +itself. + +Savepoint rollback +------------------ + +If you are using PostgreSQL 8 or later, you can use :ref:`savepoints +<topics-db-transactions-savepoints>` to control the extent of a rollback. +Before performing a database operation that could fail, you can set or update +the savepoint; that way, if the operation fails, you can roll back the single +offending operation, rather than the entire transaction. 
For example:: + + a.save() # Succeeds, and never undone by savepoint rollback + try: + sid = transaction.savepoint() + b.save() # Could throw exception + transaction.savepoint_commit(sid) + except IntegrityError: + transaction.savepoint_rollback(sid) + c.save() # Succeeds, and a.save() is never undone + +In this example, ``a.save()`` will not be undone in the case where +``b.save()`` raises an exception. + +Database-level autocommit +------------------------- + +.. versionadded:: 1.1 + +With PostgreSQL 8.2 or later, there is an advanced option to run PostgreSQL +with :doc:`database-level autocommit </ref/databases>`. If you use this option, +there is no constantly open transaction, so it is always possible to continue +after catching an exception. For example:: + + a.save() # succeeds + try: + b.save() # Could throw exception + except IntegrityError: + pass + c.save() # succeeds + +.. note:: + + This is not the same as the :ref:`autocommit decorator + <topics-db-transactions-autocommit>`. When using database level autocommit + there is no database transaction at all. The ``autocommit`` decorator + still uses transactions, automatically committing each transaction when + a database modifying operation occurs. |
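The savepoint rollback pattern from the earlier section can also be observed at the SQL level. This is a rough sketch using Python's standard ``sqlite3`` module with explicit ``SAVEPOINT`` statements, as an assumption for illustration; it does not use Django's ``transaction.savepoint()`` API, and the ``item`` table and savepoint name are made up. It shows that rolling back to a savepoint discards only the work done after it.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so BEGIN / SAVEPOINT / COMMIT are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT UNIQUE)")

conn.execute("BEGIN")
conn.execute("INSERT INTO item (name) VALUES (?)", ["a"])  # like a.save()

conn.execute("SAVEPOINT before_b")  # mark a known-good point
try:
    conn.execute("INSERT INTO item (name) VALUES (?)", ["b"])  # succeeds
    conn.execute("INSERT INTO item (name) VALUES (?)", ["a"])  # violates UNIQUE
    conn.execute("RELEASE SAVEPOINT before_b")  # not reached in this run
except sqlite3.IntegrityError:
    # Undo only the work done since the savepoint ("b"); "a" is kept.
    conn.execute("ROLLBACK TO SAVEPOINT before_b")

conn.execute("INSERT INTO item (name) VALUES (?)", ["c"])  # like c.save()
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT name FROM item ORDER BY name")]
```

Here ``names`` ends up containing ``a`` and ``c`` but not ``b``: the insert made after the savepoint is undone, while the one made before it survives the error.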