Diffstat (limited to 'parts/django/docs/topics/db/optimization.txt')
-rw-r--r-- parts/django/docs/topics/db/optimization.txt 260
1 files changed, 0 insertions, 260 deletions
diff --git a/parts/django/docs/topics/db/optimization.txt b/parts/django/docs/topics/db/optimization.txt
deleted file mode 100644
index 7d51052..0000000
--- a/parts/django/docs/topics/db/optimization.txt
+++ /dev/null
@@ -1,260 +0,0 @@
-============================
-Database access optimization
-============================
-
-Django's database layer provides various ways to help developers get the most
-out of their databases. This document gathers together links to the relevant
-documentation, and adds various tips, organized under a number of headings that
-outline the steps to take when attempting to optimize your database usage.
-
-Profile first
-=============
-
-As general programming practice, this goes without saying. Find out :ref:`what
-queries you are doing and what they are costing you
-<faq-see-raw-sql-queries>`. You may also want to use an external project like
-django-debug-toolbar_, or a tool that monitors your database directly.
-
-Remember that you may be optimizing for speed or memory or both, depending on
-your requirements. Sometimes optimizing for one will be detrimental to the
-other, but sometimes they will help each other. Also, work that is done by the
-database process might not have the same cost (to you) as the same amount of
-work done in your Python process. It is up to you to decide what your
-priorities are, where the balance must lie, and profile all of these as required
-since this will depend on your application and server.
-
-With everything that follows, remember to profile after every change to ensure
-that the change is a benefit, and a big enough benefit given the decrease in
-readability of your code. **All** of the suggestions below come with the caveat
-that in your circumstances the general principle might not apply, or might even
-be reversed.
-
-.. _django-debug-toolbar: http://robhudson.github.com/django-debug-toolbar/
-
-Use standard DB optimization techniques
-=======================================
-
-...including:
-
-* Indexes. This is a number one priority, *after* you have determined from
-  profiling what indexes should be added. Use
-  :attr:`django.db.models.Field.db_index` to add these from Django.
-
-* Appropriate use of field types.
-
-We will assume you have done the obvious things above. The rest of this document
-focuses on how to use Django in such a way that you are not doing unnecessary
-work. This document also does not address other optimization techniques that
-apply to all expensive operations, such as :doc:`general purpose caching
-</topics/cache>`.
-
-Understand QuerySets
-====================
-
-Understanding :doc:`QuerySets </ref/models/querysets>` is vital to getting good
-performance with simple code. In particular:
-
-Understand QuerySet evaluation
-------------------------------
-
-To avoid performance problems, it is important to understand:
-
-* that :ref:`QuerySets are lazy <querysets-are-lazy>`.
-
-* when :ref:`they are evaluated <when-querysets-are-evaluated>`.
-
-* how :ref:`the data is held in memory <caching-and-querysets>`.
-
-Understand cached attributes
-----------------------------
-
-As well as caching of the whole ``QuerySet``, there is caching of the result of
-attributes on ORM objects. In general, attributes that are not callable will be
-cached. For example, assuming the :ref:`example Weblog models
-<queryset-model-example>`::
-
-    >>> entry = Entry.objects.get(id=1)
-    >>> entry.blog   # Blog object is retrieved at this point
-    >>> entry.blog   # cached version, no DB access
-
-But in general, callable attributes cause DB lookups every time::
-
-    >>> entry = Entry.objects.get(id=1)
-    >>> entry.authors.all()   # query performed
-    >>> entry.authors.all()   # query performed again
-
-Be careful when reading template code - the template system does not allow use
-of parentheses, but will call callables automatically, hiding the above
-distinction.
-
-Be careful with your own custom properties - it is up to you to implement
-caching.
-
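As a minimal sketch of caching a custom property, the following uses
``functools.cached_property`` from the Python standard library (Python 3.8 or
later). The ``Entry`` class and its ``_count_comments`` helper are hypothetical
stand-ins rather than Django models; the helper only counts how often the
supposedly expensive work runs.

```python
from functools import cached_property


class Entry:
    """Hypothetical plain-Python stand-in for a model with a costly property."""

    def __init__(self):
        self.lookups = 0  # how many times the expensive work actually ran

    def _count_comments(self):
        # Stand-in for a database query; assumed to be expensive.
        self.lookups += 1
        return 42

    @property
    def comment_count_uncached(self):
        # Recomputed on every attribute access.
        return self._count_comments()

    @cached_property
    def comment_count(self):
        # Computed on first access, then stored on the instance.
        return self._count_comments()


entry = Entry()
entry.comment_count_uncached
entry.comment_count_uncached
uncached_lookups = entry.lookups  # two accesses, two lookups

entry.comment_count
entry.comment_count
cached_lookups = entry.lookups - uncached_lookups  # two accesses, one lookup
```

The cached value lives as long as the instance does, so this approach suits
values that should not change during a request. Deleting the attribute
(``del entry.comment_count``) clears the cached value.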
-Use the ``with`` template tag
------------------------------
-
-To make use of the caching behaviour of ``QuerySet``, you may need to use the
-:ttag:`with` template tag.
-
-Use ``iterator()``
-------------------
-
-When you have a lot of objects, the caching behaviour of the ``QuerySet`` can
-cause a large amount of memory to be used. In this case,
-:meth:`~django.db.models.QuerySet.iterator()` may help.
-
-Do database work in the database rather than in Python
-======================================================
-
-For instance:
-
-* At the most basic level, use :ref:`filter and exclude <queryset-api>` to do
-  filtering in the database.
-
-* Use :ref:`F() object query expressions <query-expressions>` to do filtering
-  against other fields within the same model.
-
-* Use :doc:`annotate to do aggregation in the database </topics/db/aggregation>`.
-
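The difference between doing the work in Python and doing it in the database
can be illustrated without Django, using the standard library ``sqlite3``
module. The ``entry`` table and its columns are assumptions loosely modelled on
the example Weblog models. The second query compares two columns of the same
row, which is the kind of SQL an ``F()`` expression is meant to produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry ("
    "id INTEGER PRIMARY KEY, n_comments INTEGER, n_pingbacks INTEGER)"
)
conn.executemany(
    "INSERT INTO entry (n_comments, n_pingbacks) VALUES (?, ?)",
    [(5, 2), (1, 7), (9, 3), (0, 0)],
)

# Filtering in Python: every row crosses the database boundary first.
all_rows = conn.execute(
    "SELECT id, n_comments, n_pingbacks FROM entry ORDER BY id"
).fetchall()
in_python = [row[0] for row in all_rows if row[1] > row[2]]

# Filtering in the database: only the matching rows are returned.
in_database = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM entry WHERE n_comments > n_pingbacks ORDER BY id"
    )
]

# Aggregating in the database: a single row comes back.
(total_comments,) = conn.execute("SELECT SUM(n_comments) FROM entry").fetchone()
```

Both approaches give the same answer here; the difference is how much data has
to be transferred and turned into Python objects to get it.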
-If these aren't enough to generate the SQL you need:
-
-Use ``QuerySet.extra()``
-------------------------
-
-A less portable but more powerful method is
-:meth:`~django.db.models.QuerySet.extra()`, which allows some SQL to be
-explicitly added to the query. If that still isn't powerful enough:
-
-Use raw SQL
------------
-
-Write your own :doc:`custom SQL to retrieve data or populate models
-</topics/db/sql>`. Use ``django.db.connection.queries`` to find out what Django
-is writing for you and start from there.
-
-Retrieve everything at once if you know you will need it
-========================================================
-
-Hitting the database multiple times for different parts of a single 'set' of
-data, all of which you will need, is in general less efficient than retrieving
-it all in one query. This is particularly important if you have a query that is
-executed in a loop, and could therefore end up doing many database queries when
-only one was needed. So:
-
-Use ``QuerySet.select_related()``
----------------------------------
-
-Understand :ref:`QuerySet.select_related() <select-related>` thoroughly, and use it:
-
-* in view code,
-
-* and in :doc:`managers and default managers </topics/db/managers>` where
-  appropriate. Be aware when your manager is and is not used; sometimes this is
-  tricky so don't make assumptions.
-
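The cost of a query executed in a loop can be sketched with the standard
library ``sqlite3`` module. The ``blog`` and ``entry`` tables are assumptions
modelled on the example Weblog models, and the ``JOIN`` only approximates the
kind of query ``select_related()`` is meant to produce; the exact SQL Django
generates may differ. A trace callback records each ``SELECT`` that runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        headline TEXT,
        blog_id INTEGER REFERENCES blog (id)
    );
    INSERT INTO blog (id, name) VALUES (1, 'Beatles Blog'), (2, 'Cheddar Talk');
    INSERT INTO entry (headline, blog_id) VALUES
        ('First', 1), ('Second', 1), ('Third', 2);
    """
)

executed = []  # every SELECT statement the connection runs


def record_select(statement):
    # Ignore anything that is not a SELECT (for example transaction control).
    if statement.lstrip().upper().startswith("SELECT"):
        executed.append(statement)


conn.set_trace_callback(record_select)

# One query for the entries, then one more per entry to fetch its blog.
looped = []
entries = conn.execute("SELECT headline, blog_id FROM entry ORDER BY id").fetchall()
for headline, blog_id in entries:
    (blog_name,) = conn.execute(
        "SELECT name FROM blog WHERE id = ?", (blog_id,)
    ).fetchone()
    looped.append((headline, blog_name))
queries_in_loop = len(executed)

# A single query that follows the foreign key with a JOIN.
executed.clear()
joined = conn.execute(
    "SELECT entry.headline, blog.name FROM entry "
    "JOIN blog ON blog.id = entry.blog_id ORDER BY entry.id"
).fetchall()
queries_with_join = len(executed)
```

With three entries the loop issues four queries and the join issues one, and
the gap grows with the number of rows.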
-Don't retrieve things you don't need
-====================================
-
-Use ``QuerySet.values()`` and ``values_list()``
------------------------------------------------
-
-When you just want a ``dict`` or ``list`` of values, and don't need ORM model
-objects, make appropriate usage of :meth:`~django.db.models.QuerySet.values()`.
-These can be useful for replacing model objects in template code - as long as
-the dicts you supply have the same attributes as those used in the template,
-you are fine.
-
-Use ``QuerySet.defer()`` and ``only()``
----------------------------------------
-
-Use :meth:`~django.db.models.QuerySet.defer()` and
-:meth:`~django.db.models.QuerySet.only()` if there are database columns you
-know that you won't need (or won't need in most cases) to avoid loading
-them. Note that if you *do* later access the deferred columns, the ORM will
-have to fetch them in a separate query, making this a pessimization if you use
-it inappropriately.
-
-Use ``QuerySet.count()``
-------------------------
-
-...if you only want the count, rather than doing ``len(queryset)``.
-
-Use ``QuerySet.exists()``
--------------------------
-
-...if you only want to find out if at least one result exists, rather than ``if
-queryset``.
-
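To see why these help, here is a rough sketch using the standard library
``sqlite3`` module. The ``email`` table is an assumption, and the SQL only
approximates what ``count()`` and ``exists()`` ask the database to do; the
statements Django actually generates depend on the backend and version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO email (body) VALUES (?)",
    [("message %d" % i,) for i in range(1000)],
)

# Roughly what len(queryset) implies: fetch every row, then count in Python.
rows = conn.execute("SELECT id, body FROM email").fetchall()
count_in_python = len(rows)

# Roughly what count() implies: the database counts and returns one number.
(count_in_database,) = conn.execute("SELECT COUNT(*) FROM email").fetchone()

# Roughly what exists() implies: stop at the first matching row.
has_any = conn.execute("SELECT 1 FROM email LIMIT 1").fetchone() is not None
```

The first approach transfers a thousand rows to answer a question that needs
one number; the other two transfer a single row each.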
-But:
-
-Don't overuse ``count()`` and ``exists()``
-------------------------------------------
-
-If you are going to need other data from the QuerySet, just evaluate it.
-
-For example, assuming an Email class that has a ``body`` attribute and a
-many-to-many relation to User, the following template code is optimal:
-
-.. code-block:: html+django
-
-   {% if display_inbox %}
-     {% with user.emails.all as emails %}
-       {% if emails %}
-         <p>You have {{ emails|length }} email(s)</p>
-         {% for email in emails %}
-           <p>{{ email.body }}</p>
-         {% endfor %}
-       {% else %}
-         <p>No messages today.</p>
-       {% endif %}
-     {% endwith %}
-   {% endif %}
-
-
-It is optimal because:
-
- 1. Since QuerySets are lazy, this does no database queries if
-    'display_inbox' is False.
-
- #. Use of ``with`` means that we store ``user.emails.all`` in a variable for
-    later use, allowing its cache to be re-used.
-
- #. The line ``{% if emails %}`` causes ``QuerySet.__nonzero__()`` to be called,
-    which causes the ``user.emails.all()`` query to be run on the database, and
-    at least the first row to be turned into an ORM object. If there aren't
-    any results, it will return False, otherwise True.
-
- #. The use of ``{{ emails|length }}`` calls ``QuerySet.__len__()``, filling
-    out the rest of the cache without doing another query.
-
- #. The ``for`` loop iterates over the already filled cache.
-
-In total, this code does either one or zero database queries. The only
-deliberate optimization performed is the use of the ``with`` tag. Using
-``QuerySet.exists()`` or ``QuerySet.count()`` at any point would cause
-additional queries.
-
-Use ``QuerySet.update()`` and ``delete()``
-------------------------------------------
-
-Rather than retrieve a load of objects, set some values, and save them
-individually, use a bulk SQL UPDATE statement, via :ref:`QuerySet.update()
-<topics-db-queries-update>`. Similarly, do :ref:`bulk deletes
-<topics-db-queries-delete>` where possible.
-
-Note, however, that these bulk update methods cannot call the ``save()`` or
-``delete()`` methods of individual instances, which means that any custom
-behaviour you have added for these methods will not be executed, including
-anything driven from the normal database object :doc:`signals </ref/signals>`.
-
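The saving from a bulk statement can be sketched with the standard library
``sqlite3`` module. The ``entry`` table and its columns are assumptions, and
the SQL is only an approximation of what a bulk update or delete sends to the
database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, headline TEXT, n_comments INTEGER)"
)
conn.executemany(
    "INSERT INTO entry (headline, n_comments) VALUES (?, ?)",
    [("Entry %d" % i, i) for i in range(100)],
)

# Row by row: fetch the ids, then issue one UPDATE per row.
ids = [
    row[0]
    for row in conn.execute("SELECT id FROM entry WHERE n_comments < 50").fetchall()
]
for entry_id in ids:
    conn.execute("UPDATE entry SET n_comments = 0 WHERE id = ?", (entry_id,))
statements_row_by_row = 1 + len(ids)  # one SELECT plus one UPDATE per row

# Bulk: a single statement does the same kind of work inside the database.
cursor = conn.execute("UPDATE entry SET headline = 'archived' WHERE n_comments = 0")
rows_updated_in_bulk = cursor.rowcount

# Bulk delete follows the same pattern.
cursor = conn.execute("DELETE FROM entry WHERE headline = 'archived'")
rows_deleted_in_bulk = cursor.rowcount
(remaining,) = conn.execute("SELECT COUNT(*) FROM entry").fetchone()
```

The row-by-row version needs one statement per affected row, while each bulk
version needs exactly one regardless of how many rows match.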
-Use foreign key values directly
--------------------------------
-
-If you only need a foreign key value, use the foreign key value that is already on
-the object you've got, rather than getting the whole related object and taking
-its primary key. That is, do::
-
-    entry.blog_id
-
-instead of::
-
-    entry.blog.id
-