author | Nishanth Amuluru | 2011-01-08 11:20:57 +0530 |
---|---|---|
committer | Nishanth Amuluru | 2011-01-08 11:20:57 +0530 |
commit | 65411d01d448ff0cd4abd14eee14cf60b5f8fc20 (patch) | |
tree | b4c404363c4c63a61d6e2f8bd26c5b057c1fb09d /parts/django/docs/topics/db/optimization.txt | |
parent | 2e35094d43b4cc6974172e1febf76abb50f086ec (diff) | |
download | pytask-65411d01d448ff0cd4abd14eee14cf60b5f8fc20.tar.gz pytask-65411d01d448ff0cd4abd14eee14cf60b5f8fc20.tar.bz2 pytask-65411d01d448ff0cd4abd14eee14cf60b5f8fc20.zip |
Added buildout stuff and made changes accordingly
--HG--
rename : profile/management/__init__.py => eggs/djangorecipe-0.20-py2.6.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/djangorecipe-0.20-py2.6.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => eggs/infrae.subversion-1.4.5-py2.6.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/infrae.subversion-1.4.5-py2.6.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => eggs/mercurial-1.7.3-py2.6-linux-x86_64.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/mercurial-1.7.3-py2.6-linux-x86_64.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => eggs/py-1.4.0-py2.6.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/py-1.4.0-py2.6.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => eggs/zc.buildout-1.5.2-py2.6.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/zc.buildout-1.5.2-py2.6.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => eggs/zc.recipe.egg-1.3.2-py2.6.egg/EGG-INFO/dependency_links.txt
rename : profile/management/__init__.py => eggs/zc.recipe.egg-1.3.2-py2.6.egg/EGG-INFO/not-zip-safe
rename : profile/management/__init__.py => parts/django/Django.egg-info/dependency_links.txt
rename : taskapp/models.py => parts/django/django/conf/app_template/models.py
rename : taskapp/tests.py => parts/django/django/conf/app_template/tests.py
rename : taskapp/views.py => parts/django/django/conf/app_template/views.py
rename : taskapp/views.py => parts/django/django/contrib/gis/tests/geo3d/views.py
rename : profile/management/__init__.py => parts/django/tests/modeltests/delete/__init__.py
rename : profile/management/__init__.py => parts/django/tests/modeltests/files/__init__.py
rename : profile/management/__init__.py => parts/django/tests/modeltests/invalid_models/__init__.py
rename : profile/management/__init__.py => parts/django/tests/modeltests/m2m_signals/__init__.py
rename : profile/management/__init__.py => parts/django/tests/modeltests/model_package/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/bash_completion/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/bash_completion/management/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/bash_completion/management/commands/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/bash_completion/models.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/delete_regress/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/file_storage/__init__.py
rename : profile/management/__init__.py => parts/django/tests/regressiontests/max_lengths/__init__.py
rename : profile/forms.py => pytask/profile/forms.py
rename : profile/management/__init__.py => pytask/profile/management/__init__.py
rename : profile/management/commands/seed_db.py => pytask/profile/management/commands/seed_db.py
rename : profile/models.py => pytask/profile/models.py
rename : profile/templatetags/user_tags.py => pytask/profile/templatetags/user_tags.py
rename : taskapp/tests.py => pytask/profile/tests.py
rename : profile/urls.py => pytask/profile/urls.py
rename : profile/utils.py => pytask/profile/utils.py
rename : profile/views.py => pytask/profile/views.py
rename : static/css/base.css => pytask/static/css/base.css
rename : taskapp/tests.py => pytask/taskapp/tests.py
rename : taskapp/views.py => pytask/taskapp/views.py
rename : templates/base.html => pytask/templates/base.html
rename : templates/profile/browse_notifications.html => pytask/templates/profile/browse_notifications.html
rename : templates/profile/edit.html => pytask/templates/profile/edit.html
rename : templates/profile/view.html => pytask/templates/profile/view.html
rename : templates/profile/view_notification.html => pytask/templates/profile/view_notification.html
rename : templates/registration/activate.html => pytask/templates/registration/activate.html
rename : templates/registration/activation_email.txt => pytask/templates/registration/activation_email.txt
rename : templates/registration/activation_email_subject.txt => pytask/templates/registration/activation_email_subject.txt
rename : templates/registration/logged_out.html => pytask/templates/registration/logged_out.html
rename : templates/registration/login.html => pytask/templates/registration/login.html
rename : templates/registration/logout.html => pytask/templates/registration/logout.html
rename : templates/registration/password_change_done.html => pytask/templates/registration/password_change_done.html
rename : templates/registration/password_change_form.html => pytask/templates/registration/password_change_form.html
rename : templates/registration/password_reset_complete.html => pytask/templates/registration/password_reset_complete.html
rename : templates/registration/password_reset_confirm.html => pytask/templates/registration/password_reset_confirm.html
rename : templates/registration/password_reset_done.html => pytask/templates/registration/password_reset_done.html
rename : templates/registration/password_reset_email.html => pytask/templates/registration/password_reset_email.html
rename : templates/registration/password_reset_form.html => pytask/templates/registration/password_reset_form.html
rename : templates/registration/registration_complete.html => pytask/templates/registration/registration_complete.html
rename : templates/registration/registration_form.html => pytask/templates/registration/registration_form.html
rename : utils.py => pytask/utils.py
Diffstat (limited to 'parts/django/docs/topics/db/optimization.txt')
-rw-r--r-- | parts/django/docs/topics/db/optimization.txt | 260 |
1 files changed, 260 insertions, 0 deletions
diff --git a/parts/django/docs/topics/db/optimization.txt b/parts/django/docs/topics/db/optimization.txt
new file mode 100644
index 0000000..7d51052
--- /dev/null
+++ b/parts/django/docs/topics/db/optimization.txt
@@ -0,0 +1,260 @@
+============================
+Database access optimization
+============================
+
+Django's database layer provides various ways to help developers get the most
+out of their databases. This document gathers together links to the relevant
+documentation, and adds various tips, organized under a number of headings that
+outline the steps to take when attempting to optimize your database usage.
+
+Profile first
+=============
+
+As general programming practice, this goes without saying. Find out :ref:`what
+queries you are doing and what they are costing you
+<faq-see-raw-sql-queries>`. You may also want to use an external project like
+django-debug-toolbar_, or a tool that monitors your database directly.
+
+Remember that you may be optimizing for speed or memory or both, depending on
+your requirements. Sometimes optimizing for one will be detrimental to the
+other, but sometimes they will help each other. Also, work that is done by the
+database process might not have the same cost (to you) as the same amount of
+work done in your Python process. It is up to you to decide what your
+priorities are, where the balance must lie, and profile all of these as required
+since this will depend on your application and server.
+
+With everything that follows, remember to profile after every change to ensure
+that the change is a benefit, and a big enough benefit given the decrease in
+readability of your code. **All** of the suggestions below come with the caveat
+that in your circumstances the general principle might not apply, or might even
+be reversed.
+
+.. _django-debug-toolbar: http://robhudson.github.com/django-debug-toolbar/
+
+Use standard DB optimization techniques
+=======================================
+
+...including:
+
+* Indexes. This is a number one priority, *after* you have determined from
+  profiling what indexes should be added. Use
+  :attr:`django.db.models.Field.db_index` to add these from Django.
+
+* Appropriate use of field types.
+
+We will assume you have done the obvious things above. The rest of this document
+focuses on how to use Django in such a way that you are not doing unnecessary
+work. This document also does not address other optimization techniques that
+apply to all expensive operations, such as :doc:`general purpose caching
+</topics/cache>`.
+
+Understand QuerySets
+====================
+
+Understanding :doc:`QuerySets </ref/models/querysets>` is vital to getting good
+performance with simple code. In particular:
+
+Understand QuerySet evaluation
+------------------------------
+
+To avoid performance problems, it is important to understand:
+
+* that :ref:`QuerySets are lazy <querysets-are-lazy>`.
+
+* when :ref:`they are evaluated <when-querysets-are-evaluated>`.
+
+* how :ref:`the data is held in memory <caching-and-querysets>`.
+
+Understand cached attributes
+----------------------------
+
+As well as caching of the whole ``QuerySet``, there is caching of the result of
+attributes on ORM objects. In general, attributes that are not callable will be
+cached. For example, assuming the :ref:`example Weblog models
+<queryset-model-example>`::
+
+    >>> entry = Entry.objects.get(id=1)
+    >>> entry.blog # Blog object is retrieved at this point
+    >>> entry.blog # cached version, no DB access
+
+But in general, callable attributes cause DB lookups every time::
+
+    >>> entry = Entry.objects.get(id=1)
+    >>> entry.authors.all() # query performed
+    >>> entry.authors.all() # query performed again
+
+Be careful when reading template code - the template system does not allow use
+of parentheses, but will call callables automatically, hiding the above
+distinction.
+
+Be careful with your own custom properties - it is up to you to implement
+caching.
+
+Use the ``with`` template tag
+-----------------------------
+
+To make use of the caching behaviour of ``QuerySet``, you may need to use the
+:ttag:`with` template tag.
+
+Use ``iterator()``
+------------------
+
+When you have a lot of objects, the caching behaviour of the ``QuerySet`` can
+cause a large amount of memory to be used. In this case,
+:meth:`~django.db.models.QuerySet.iterator()` may help.
+
+Do database work in the database rather than in Python
+======================================================
+
+For instance:
+
+* At the most basic level, use :ref:`filter and exclude <queryset-api>` to do
+  filtering in the database.
+
+* Use :ref:`F() object query expressions <query-expressions>` to do filtering
+  against other fields within the same model.
+
+* Use :doc:`annotate to do aggregation in the database </topics/db/aggregation>`.
+
+If these aren't enough to generate the SQL you need:
+
+Use ``QuerySet.extra()``
+------------------------
+
+A less portable but more powerful method is
+:meth:`~django.db.models.QuerySet.extra()`, which allows some SQL to be
+explicitly added to the query. If that still isn't powerful enough:
+
+Use raw SQL
+-----------
+
+Write your own :doc:`custom SQL to retrieve data or populate models
+</topics/db/sql>`. Use ``django.db.connection.queries`` to find out what Django
+is writing for you and start from there.
+
+Retrieve everything at once if you know you will need it
+========================================================
+
+Hitting the database multiple times for different parts of a single 'set' of
+data that you will need all parts of is, in general, less efficient than
+retrieving it all in one query. This is particularly important if you have a
+query that is executed in a loop, and could therefore end up doing many database
+queries, when only one was needed. So:
+
+Use ``QuerySet.select_related()``
+---------------------------------
+
+Understand :ref:`QuerySet.select_related() <select-related>` thoroughly, and use it:
+
+* in view code,
+
+* and in :doc:`managers and default managers </topics/db/managers>` where
+  appropriate. Be aware when your manager is and is not used; sometimes this is
+  tricky so don't make assumptions.
+
+Don't retrieve things you don't need
+====================================
+
+Use ``QuerySet.values()`` and ``values_list()``
+-----------------------------------------------
+
+When you just want a ``dict`` or ``list`` of values, and don't need ORM model
+objects, make appropriate usage of :meth:`~django.db.models.QuerySet.values()`.
+These can be useful for replacing model objects in template code - as long as
+the dicts you supply have the same attributes as those used in the template,
+you are fine.
+
+Use ``QuerySet.defer()`` and ``only()``
+---------------------------------------
+
+Use :meth:`~django.db.models.QuerySet.defer()` and
+:meth:`~django.db.models.QuerySet.only()` if there are database columns you
+know that you won't need (or won't need in most cases) to avoid loading
+them. Note that if you *do* use them, the ORM will have to go and get them in a
+separate query, making this a pessimization if you use it inappropriately.
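The trade-off described for ``defer()`` and ``only()`` can be illustrated with a toy model. The sketch below does not use Django. It relies on the standard-library ``sqlite3`` module, and the ``entry`` table, its columns, the ``run_query`` helper and the ``DeferredRow`` class are all hypothetical names made up for this illustration. It mimics loading one column up front and fetching a skipped column on first access, so the query counts can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, headline TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO entry (id, headline, body) VALUES (?, ?, ?)",
    [(i, "Headline %d" % i, "long body text " * 100) for i in range(1, 6)],
)

queries = []  # log of every SELECT issued through run_query()


def run_query(sql, params=()):
    """Run a SELECT and record it, so queries can be counted afterwards."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


class DeferredRow(object):
    """Hypothetical stand-in for a model instance with deferred columns."""

    def __init__(self, pk, loaded):
        self.pk = pk
        self._loaded = dict(loaded)  # column name -> value fetched up front

    def get(self, column):
        if column not in self._loaded:
            # Deferred column: costs one extra query for this row.
            # `column` is always a fixed name from this script, never user input.
            rows = run_query("SELECT %s FROM entry WHERE id = ?" % column, (self.pk,))
            self._loaded[column] = rows[0][0]
        return self._loaded[column]


def load_headlines_only():
    """Load rows without the large body column (the assumed 'only' case)."""
    rows = run_query("SELECT id, headline FROM entry ORDER BY id")
    return [DeferredRow(pk, {"headline": headline}) for pk, headline in rows]


# Case 1: only the loaded column is used, so a single query is enough.
entries = load_headlines_only()
headlines = [entry.get("headline") for entry in entries]
queries_when_unused = len(queries)

# Case 2: the skipped column is needed after all: one extra query per row.
del queries[:]
entries = load_headlines_only()
bodies = [entry.get("body") for entry in entries]
queries_when_used = len(queries)

print(queries_when_unused, queries_when_used)
```

In this toy setup the first case issues one query and the second issues one query plus one per row, which is the kind of pessimization the text warns about. Real numbers depend on your models and database, so profile before and after.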
+
+Use QuerySet.count()
+--------------------
+
+...if you only want the count, rather than doing ``len(queryset)``.
+
+Use QuerySet.exists()
+---------------------
+
+...if you only want to find out if at least one result exists, rather than ``if
+queryset``.
+
+But:
+
+Don't overuse ``count()`` and ``exists()``
+------------------------------------------
+
+If you are going to need other data from the QuerySet, just evaluate it.
+
+For example, assuming an Email class that has a ``body`` attribute and a
+many-to-many relation to User, the following template code is optimal:
+
+.. code-block:: html+django
+
+   {% if display_inbox %}
+     {% with user.emails.all as emails %}
+       {% if emails %}
+         <p>You have {{ emails|length }} email(s)</p>
+         {% for email in emails %}
+           <p>{{ email.body }}</p>
+         {% endfor %}
+       {% else %}
+         <p>No messages today.</p>
+       {% endif %}
+     {% endwith %}
+   {% endif %}
+
+
+It is optimal because:
+
+ 1. Since QuerySets are lazy, this does no database queries if 'display_inbox' is False.
+
+ #. Use of ``with`` means that we store ``user.emails.all`` in a variable for
+    later use, allowing its cache to be re-used.
+
+ #. The line ``{% if emails %}`` causes ``QuerySet.__nonzero__()`` to be called,
+    which causes the ``user.emails.all()`` query to be run on the database, and
+    at least the first row to be turned into an ORM object. If there aren't
+    any results, it will return False, otherwise True.
+
+ #. The use of ``{{ emails|length }}`` calls ``QuerySet.__len__()``, filling
+    out the rest of the cache without doing another query.
+
+ #. The ``for`` loop iterates over the already filled cache.
+
+In total, this code does either one or zero database queries. The only
+deliberate optimization performed is the use of the ``with`` tag. Using
+``QuerySet.exists()`` or ``QuerySet.count()`` at any point would cause
+additional queries.
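The five steps listed above can be traced with a miniature model of a lazy, caching result set. This is only a sketch: ``LazyResults`` and ``fetch_emails`` are hypothetical names, this is not Django's ``QuerySet`` implementation, and the assumption that ``count()`` and ``exists()`` issue their own query whenever the cache is still empty is a simplification chosen for the example.

```python
class LazyResults(object):
    """Hypothetical miniature of a lazy, caching result set."""

    def __init__(self, fetch_rows, log):
        self._fetch_rows = fetch_rows  # callable standing in for the SELECT
        self._log = log                # list recording each simulated query
        self._cache = None             # filled on first evaluation

    def _fill_cache(self):
        if self._cache is None:
            self._log.append("SELECT rows")
            self._cache = list(self._fetch_rows())

    def __bool__(self):                # what {% if emails %} triggers
        self._fill_cache()
        return bool(self._cache)

    __nonzero__ = __bool__             # Python 2 spelling, as named in the text

    def __len__(self):                 # what {{ emails|length }} triggers
        self._fill_cache()
        return len(self._cache)

    def __iter__(self):                # what {% for email in emails %} triggers
        self._fill_cache()
        return iter(self._cache)

    def count(self):
        if self._cache is not None:
            return len(self._cache)
        self._log.append("SELECT COUNT(*)")  # separate query, cache stays empty
        return len(list(self._fetch_rows()))

    def exists(self):
        if self._cache is not None:
            return bool(self._cache)
        self._log.append("SELECT 1 LIMIT 1")  # separate query, cache stays empty
        return any(True for _ in self._fetch_rows())


def fetch_emails():
    """Pretend database table with two email bodies."""
    return ["first body", "second body"]


# Pattern from the template: truth test, length, then iteration.
log_a = []
emails = LazyResults(fetch_emails, log_a)
if emails:
    n_a = len(emails)
    bodies_a = [body for body in emails]

# Same output, but using exists() and count() before evaluating.
log_b = []
emails = LazyResults(fetch_emails, log_b)
if emails.exists():
    n_b = emails.count()
    bodies_b = [body for body in emails]

print(len(log_a), len(log_b))
```

Under these assumptions the first pattern records a single simulated query, while the second records three for the same output.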
+
+Use ``QuerySet.update()`` and ``delete()``
+------------------------------------------
+
+Rather than retrieve a load of objects, set some values, and save them
+individually, use a bulk SQL UPDATE statement, via :ref:`QuerySet.update()
+<topics-db-queries-update>`. Similarly, do :ref:`bulk deletes
+<topics-db-queries-delete>` where possible.
+
+Note, however, that these bulk update methods cannot call the ``save()`` or
+``delete()`` methods of individual instances, which means that any custom
+behaviour you have added for these methods will not be executed, including
+anything driven from the normal database object :doc:`signals </ref/signals>`.
+
+Use foreign key values directly
+-------------------------------
+
+If you only need a foreign key value, use the foreign key value that is already on
+the object you've got, rather than getting the whole related object and taking
+its primary key. i.e. do::
+
+    entry.blog_id
+
+instead of::
+
+    entry.blog.id
+
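The last two tips can be illustrated at the SQL level. The sketch below uses the standard-library ``sqlite3`` module rather than the Django ORM. The ``blog`` and ``entry`` tables, the ``published`` column and the ``execute`` helper are assumptions made for this example. It counts the statements needed for a row-by-row update versus one bulk UPDATE, and for reading a foreign key value from the row already in hand versus looking up the related row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE entry ("
    "id INTEGER PRIMARY KEY, "
    "blog_id INTEGER REFERENCES blog(id), "
    "published INTEGER)"
)
conn.execute("INSERT INTO blog (id, name) VALUES (7, 'Example blog')")
conn.executemany(
    "INSERT INTO entry (id, blog_id, published) VALUES (?, 7, 0)",
    [(i,) for i in range(1, 101)],
)

statements = []  # log of statements sent through execute()


def execute(sql, params=()):
    """Run one statement and record it, so round trips can be counted."""
    statements.append(sql)
    return conn.execute(sql, params)


# Row by row: fetch the ids, then issue one UPDATE per row.
ids = [row[0] for row in execute("SELECT id FROM entry WHERE published = 0")]
for entry_id in ids:
    execute("UPDATE entry SET published = 1 WHERE id = ?", (entry_id,))
row_by_row_statements = len(statements)

# Reset the data (not logged), then do the same work in one bulk UPDATE.
conn.execute("UPDATE entry SET published = 0")
del statements[:]
cursor = execute("UPDATE entry SET published = 1 WHERE published = 0")
bulk_rows = cursor.rowcount
bulk_statements = len(statements)

# Foreign key value: the entry row already carries blog_id.
del statements[:]
entry_row = execute("SELECT id, blog_id FROM entry WHERE id = 1").fetchone()
blog_id_direct = entry_row[1]            # comparable to entry.blog_id
fk_direct_statements = len(statements)

# Fetching the related row just to read its primary key costs another query.
blog_row = execute("SELECT id, name FROM blog WHERE id = ?", (entry_row[1],)).fetchone()
blog_id_via_lookup = blog_row[0]         # comparable to entry.blog.id
fk_lookup_statements = len(statements)

print(row_by_row_statements, bulk_statements, fk_direct_statements, fk_lookup_statements)
```

Both approaches produce the same data in each pair. The difference is only in how many statements reach the database, which is what the two sections above aim to reduce.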