1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
|
==================
Multiple databases
==================
.. versionadded:: 1.2
This topic guide describes Django's support for interacting with
multiple databases. Most of the rest of Django's documentation assumes
you are interacting with a single database. If you want to interact
with multiple databases, you'll need to take some additional steps.
Defining your databases
=======================
The first step to using more than one database with Django is to tell
Django about the database servers you'll be using. This is done using
the :setting:`DATABASES` setting. This setting maps database aliases,
which are a way to refer to a specific database throughout Django, to
a dictionary of settings for that specific connection. The settings in
the inner dictionaries are described fully in the :setting:`DATABASES`
documentation.
Databases can have any alias you choose. However, the alias
``default`` has special significance. Django uses the database with
the alias of ``default`` when no other database has been selected. If
you don't have a ``default`` database, you need to be careful to
always specify the database that you want to use.
The following is an example ``settings.py`` snippet defining two
databases -- a default PostgreSQL database and a MySQL database called
``users``:
.. code-block:: python
DATABASES = {
'default': {
'NAME': 'app_data',
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'USER': 'postgres_user',
'PASSWORD': 's3krit'
},
'users': {
'NAME': 'user_data',
'ENGINE': 'django.db.backends.mysql',
'USER': 'mysql_user',
'PASSWORD': 'priv4te'
}
}
If you attempt to access a database that you haven't defined in your
:setting:`DATABASES` setting, Django will raise a
``django.db.utils.ConnectionDoesNotExist`` exception.
Synchronizing your databases
============================
The :djadmin:`syncdb` management command operates on one database at a
time. By default, it operates on the ``default`` database, but by
providing a :djadminopt:`--database` argument, you can tell syncdb to
synchronize a different database. So, to synchronize all models onto
all databases in our example, you would need to call::
$ ./manage.py syncdb
$ ./manage.py syncdb --database=users
If you don't want every application to be synchronized onto a
particular database, you can define a :ref:`database
router<topics-db-multi-db-routing>` that implements a policy
constraining the availability of particular models.
Alternatively, if you want fine-grained control of synchronization,
you can pipe all or part of the output of :djadmin:`sqlall` for a
particular application directly into your database prompt, like this::
$ ./manage.py sqlall sales | ./manage.py dbshell
Using other management commands
-------------------------------
The other ``django-admin.py`` commands that interact with the database
operate in the same way as :djadmin:`syncdb` -- they only ever operate
on one database at a time, using :djadminopt:`--database` to control
the database used.
.. _topics-db-multi-db-routing:
Automatic database routing
==========================
The easiest way to use multiple databases is to set up a database
routing scheme. The default routing scheme ensures that objects remain
'sticky' to their original database (i.e., an object retrieved from
the ``foo`` database will be saved on the same database). The default
routing scheme ensures that if a database isn't specified, all queries
fall back to the ``default`` database.
You don't have to do anything to activate the default routing scheme
-- it is provided 'out of the box' on every Django project. However,
if you want to implement more interesting database allocation
behaviors, you can define and install your own database routers.
Database routers
----------------
A database Router is a class that provides up to four methods:
.. method:: db_for_read(model, **hints)
Suggest the database that should be used for read operations for
objects of type ``model``.
If a database operation is able to provide any additional
information that might assist in selecting a database, it will be
provided in the ``hints`` dictionary. Details on valid hints are
provided :ref:`below <topics-db-multi-db-hints>`.
Returns None if there is no suggestion.
.. method:: db_for_write(model, **hints)
Suggest the database that should be used for writes of objects of
type Model.
If a database operation is able to provide any additional
information that might assist in selecting a database, it will be
provided in the ``hints`` dictionary. Details on valid hints are
provided :ref:`below <topics-db-multi-db-hints>`.
Returns None if there is no suggestion.
.. method:: allow_relation(obj1, obj2, **hints)
Return True if a relation between obj1 and obj2 should be
allowed, False if the relation should be prevented, or None if
the router has no opinion. This is purely a validation operation,
used by foreign key and many to many operations to determine if a
relation should be allowed between two objects.
.. method:: allow_syncdb(db, model)
Determine if the ``model`` should be synchronized onto the
database with alias ``db``. Return True if the model should be
synchronized, False if it should not be synchronized, or None if
the router has no opinion. This method can be used to determine
the availability of a model on a given database.
A router doesn't have to provide *all* these methods - it omit one or
more of them. If one of the methods is omitted, Django will skip that
router when performing the relevant check.
.. _topics-db-multi-db-hints:
Hints
~~~~~
The hints received by the database router can be used to decide which
database should receive a given request.
At present, the only hint that will be provided is ``instance``, an
object instance that is related to the read or write operation that is
underway. This might be the instance that is being saved, or it might
be an instance that is being added in a many-to-many relation. In some
cases, no instance hint will be provided at all. The router checks for
the existence of an instance hint, and determine if that hint should be
used to alter routing behavior.
Using routers
-------------
Database routers are installed using the :setting:`DATABASE_ROUTERS`
setting. This setting defines a list of class names, each specifying a
router that should be used by the master router
(``django.db.router``).
The master router is used by Django's database operations to allocate
database usage. Whenever a query needs to know which database to use,
it calls the master router, providing a model and a hint (if
available). Django then tries each router in turn until a database
suggestion can be found. If no suggestion can be found, it tries the
current ``_state.db`` of the hint instance. If a hint instance wasn't
provided, or the instance doesn't currently have database state, the
master router will allocate the ``default`` database.
An example
----------
.. admonition:: Example purposes only!
This example is intended as a demonstration of how the router
infrastructure can be used to alter database usage. It
intentionally ignores some complex issues in order to
demonstrate how routers are used.
This example won't work if any of the models in ``myapp`` contain
relationships to models outside of the ``other`` database.
:ref:`Cross-database relationships <no_cross_database_relations>`
introduce referential integrity problems that Django can't
currently handle.
The master/slave configuration described is also flawed -- it
doesn't provide any solution for handling replication lag (i.e.,
query inconsistencies introduced because of the time taken for a
write to propagate to the slaves). It also doesn't consider the
interaction of transactions with the database utilization strategy.
So - what does this mean in practice? Say you want ``myapp`` to
exist on the ``other`` database, and you want all other models in a
master/slave relationship between the databases ``master``, ``slave1`` and
``slave2``. To implement this, you would need 2 routers::
class MyAppRouter(object):
"""A router to control all database operations on models in
the myapp application"""
def db_for_read(self, model, **hints):
"Point all operations on myapp models to 'other'"
if model._meta.app_label == 'myapp':
return 'other'
return None
def db_for_write(self, model, **hints):
"Point all operations on myapp models to 'other'"
if model._meta.app_label == 'myapp':
return 'other'
return None
def allow_relation(self, obj1, obj2, **hints):
"Allow any relation if a model in myapp is involved"
if obj1._meta.app_label == 'myapp' or obj2._meta.app_label == 'myapp':
return True
return None
def allow_syncdb(self, db, model):
"Make sure the myapp app only appears on the 'other' db"
if db == 'other':
return model._meta.app_label == 'myapp'
elif model._meta.app_label == 'myapp':
return False
return None
class MasterSlaveRouter(object):
"""A router that sets up a simple master/slave configuration"""
def db_for_read(self, model, **hints):
"Point all read operations to a random slave"
return random.choice(['slave1','slave2'])
def db_for_write(self, model, **hints):
"Point all write operations to the master"
return 'master'
def allow_relation(self, obj1, obj2, **hints):
"Allow any relation between two objects in the db pool"
db_list = ('master','slave1','slave2')
if obj1._state.db in db_list and obj2._state.db in db_list:
return True
return None
def allow_syncdb(self, db, model):
"Explicitly put all models on all databases."
return True
Then, in your settings file, add the following (substituting ``path.to.`` with
the actual python path to the module where you define the routers)::
DATABASE_ROUTERS = ['path.to.MyAppRouter', 'path.to.MasterSlaveRouter']
The order in which routers are processed is significant. Routers will
be queried in the order the are listed in the
:setting:`DATABASE_ROUTERS` setting . In this example, the
``MyAppRouter`` is processed before the ``MasterSlaveRouter``, and as a
result, decisions concerning the models in ``myapp`` are processed
before any other decision is made. If the :setting:`DATABASE_ROUTERS`
setting listed the two routers in the other order,
``MasterSlaveRouter.allow_syncdb()`` would be processed first. The
catch-all nature of the MasterSlaveRouter implementation would mean
that all models would be available on all databases.
With this setup installed, lets run some Django code::
>>> # This retrieval will be performed on the 'credentials' database
>>> fred = User.objects.get(username='fred')
>>> fred.first_name = 'Frederick'
>>> # This save will also be directed to 'credentials'
>>> fred.save()
>>> # These retrieval will be randomly allocated to a slave database
>>> dna = Person.objects.get(name='Douglas Adams')
>>> # A new object has no database allocation when created
>>> mh = Book(title='Mostly Harmless')
>>> # This assignment will consult the router, and set mh onto
>>> # the same database as the author object
>>> mh.author = dna
>>> # This save will force the 'mh' instance onto the master database...
>>> mh.save()
>>> # ... but if we re-retrieve the object, it will come back on a slave
>>> mh = Book.objects.get(title='Mostly Harmless')
Manually selecting a database
=============================
Django also provides an API that allows you to maintain complete control
over database usage in your code. A manually specified database allocation
will take priority over a database allocated by a router.
Manually selecting a database for a ``QuerySet``
------------------------------------------------
You can select the database for a ``QuerySet`` at any point in the
``QuerySet`` "chain." Just call ``using()`` on the ``QuerySet`` to get
another ``QuerySet`` that uses the specified database.
``using()`` takes a single argument: the alias of the database on
which you want to run the query. For example::
>>> # This will run on the 'default' database.
>>> Author.objects.all()
>>> # So will this.
>>> Author.objects.using('default').all()
>>> # This will run on the 'other' database.
>>> Author.objects.using('other').all()
Selecting a database for ``save()``
-----------------------------------
Use the ``using`` keyword to ``Model.save()`` to specify to which
database the data should be saved.
For example, to save an object to the ``legacy_users`` database, you'd
use this::
>>> my_object.save(using='legacy_users')
If you don't specify ``using``, the ``save()`` method will save into
the default database allocated by the routers.
Moving an object from one database to another
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you've saved an instance to one database, it might be tempting to
use ``save(using=...)`` as a way to migrate the instance to a new
database. However, if you don't take appropriate steps, this could
have some unexpected consequences.
Consider the following example::
>>> p = Person(name='Fred')
>>> p.save(using='first') # (statement 1)
>>> p.save(using='second') # (statement 2)
In statement 1, a new ``Person`` object is saved to the ``first``
database. At this time, ``p`` doesn't have a primary key, so Django
issues a SQL ``INSERT`` statement. This creates a primary key, and
Django assigns that primary key to ``p``.
When the save occurs in statement 2, ``p`` already has a primary key
value, and Django will attempt to use that primary key on the new
database. If the primary key value isn't in use in the ``second``
database, then you won't have any problems -- the object will be
copied to the new database.
However, if the primary key of ``p`` is already in use on the
``second`` database, the existing object in the ``second`` database
will be overridden when ``p`` is saved.
You can avoid this in two ways. First, you can clear the primary key
of the instance. If an object has no primary key, Django will treat it
as a new object, avoiding any loss of data on the ``second``
database::
>>> p = Person(name='Fred')
>>> p.save(using='first')
>>> p.pk = None # Clear the primary key.
>>> p.save(using='second') # Write a completely new object.
The second option is to use the ``force_insert`` option to ``save()``
to ensure that Django does a SQL ``INSERT``::
>>> p = Person(name='Fred')
>>> p.save(using='first')
>>> p.save(using='second', force_insert=True)
This will ensure that the person named ``Fred`` will have the same
primary key on both databases. If that primary key is already in use
when you try to save onto the ``second`` database, an error will be
raised.
Selecting a database to delete from
-----------------------------------
By default, a call to delete an existing object will be executed on
the same database that was used to retrieve the object in the first
place::
>>> u = User.objects.using('legacy_users').get(username='fred')
>>> u.delete() # will delete from the `legacy_users` database
To specify the database from which a model will be deleted, pass a
``using`` keyword argument to the ``Model.delete()`` method. This
argument works just like the ``using`` keyword argument to ``save()``.
For example, if you're migrating a user from the ``legacy_users``
database to the ``new_users`` database, you might use these commands::
>>> user_obj.save(using='new_users')
>>> user_obj.delete(using='legacy_users')
Using managers with multiple databases
--------------------------------------
Use the ``db_manager()`` method on managers to give managers access to
a non-default database.
For example, say you have a custom manager method that touches the
database -- ``User.objects.create_user()``. Because ``create_user()``
is a manager method, not a ``QuerySet`` method, you can't do
``User.objects.using('new_users').create_user()``. (The
``create_user()`` method is only available on ``User.objects``, the
manager, not on ``QuerySet`` objects derived from the manager.) The
solution is to use ``db_manager()``, like this::
User.objects.db_manager('new_users').create_user(...)
``db_manager()`` returns a copy of the manager bound to the database you specify.
Using ``get_query_set()`` with multiple databases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you're overriding ``get_query_set()`` on your manager, be sure to
either call the method on the parent (using ``super()``) or do the
appropriate handling of the ``_db`` attribute on the manager (a string
containing the name of the database to use).
For example, if you want to return a custom ``QuerySet`` class from
the ``get_query_set`` method, you could do this::
class MyManager(models.Manager):
def get_query_set(self):
qs = CustomQuerySet(self.model)
if self._db is not None:
qs = qs.using(self._db)
return qs
Exposing multiple databases in Django's admin interface
=======================================================
Django's admin doesn't have any explicit support for multiple
databases. If you want to provide an admin interface for a model on a
database other than that that specified by your router chain, you'll
need to write custom :class:`~django.contrib.admin.ModelAdmin` classes
that will direct the admin to use a specific database for content.
``ModelAdmin`` objects have four methods that require customization for
multiple-database support::
class MultiDBModelAdmin(admin.ModelAdmin):
# A handy constant for the name of the alternate database.
using = 'other'
def save_model(self, request, obj, form, change):
# Tell Django to save objects to the 'other' database.
obj.save(using=self.using)
def queryset(self, request):
# Tell Django to look for objects on the 'other' database.
return super(MultiDBModelAdmin, self).queryset(request).using(self.using)
def formfield_for_foreignkey(self, db_field, request=None, **kwargs):
# Tell Django to populate ForeignKey widgets using a query
# on the 'other' database.
return super(MultiDBModelAdmin, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs)
def formfield_for_manytomany(self, db_field, request=None, **kwargs):
# Tell Django to populate ManyToMany widgets using a query
# on the 'other' database.
return super(MultiDBModelAdmin, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs)
The implementation provided here implements a multi-database strategy
where all objects of a given type are stored on a specific database
(e.g., all ``User`` objects are in the ``other`` database). If your
usage of multiple databases is more complex, your ``ModelAdmin`` will
need to reflect that strategy.
Inlines can be handled in a similar fashion. They require three customized methods::
class MultiDBTabularInline(admin.TabularInline):
using = 'other'
def queryset(self, request):
# Tell Django to look for inline objects on the 'other' database.
return super(MultiDBTabularInline, self).queryset(request).using(self.using)
def formfield_for_foreignkey(self, db_field, request=None, **kwargs):
# Tell Django to populate ForeignKey widgets using a query
# on the 'other' database.
return super(MultiDBTabularInline, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs)
def formfield_for_manytomany(self, db_field, request=None, **kwargs):
# Tell Django to populate ManyToMany widgets using a query
# on the 'other' database.
return super(MultiDBTabularInline, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs)
Once you've written your model admin definitions, they can be
registered with any ``Admin`` instance::
from django.contrib import admin
# Specialize the multi-db admin objects for use with specific models.
class BookInline(MultiDBTabularInline):
model = Book
class PublisherAdmin(MultiDBModelAdmin):
inlines = [BookInline]
admin.site.register(Author, MultiDBModelAdmin)
admin.site.register(Publisher, PublisherAdmin)
othersite = admin.Site('othersite')
othersite.register(Publisher, MultiDBModelAdmin)
This example sets up two admin sites. On the first site, the
``Author`` and ``Publisher`` objects are exposed; ``Publisher``
objects have an tabular inline showing books published by that
publisher. The second site exposes just publishers, without the
inlines.
Using raw cursors with multiple databases
=========================================
If you are using more than one database you can use
``django.db.connections`` to obtain the connection (and cursor) for a
specific database. ``django.db.connections`` is a dictionary-like
object that allows you to retrieve a specific connection using it's
alias::
from django.db import connections
cursor = connections['my_db_alias'].cursor()
Limitations of multiple databases
=================================
.. _no_cross_database_relations:
Cross-database relations
------------------------
Django doesn't currently provide any support for foreign key or
many-to-many relationships spanning multiple databases. If you
have used a router to partition models to different databases,
any foreign key and many-to-many relationships defined by those
models must be internal to a single database.
This is because of referential integrity. In order to maintain a
relationship between two objects, Django needs to know that the
primary key of the related object is valid. If the primary key is
stored on a separate database, it's not possible to easily evaluate
the validity of a primary key.
If you're using Postgres, Oracle, or MySQL with InnoDB, this is
enforced at the database integrity level -- database level key
constraints prevent the creation of relations that can't be validated.
However, if you're using SQLite or MySQL with MyISAM tables, there is
no enforced referential integrity; as a result, you may be able to
'fake' cross database foreign keys. However, this configuration is not
officially supported by Django.
|