Django Select_Related, Prefetch_Related

JunePyo Suh · May 25, 2020

select_related and prefetch_related can significantly improve our query performance.

According to the Django documentation:

select_related(*fields) returns a QuerySet that will "follow" foreign-key relationships, selecting additional related-object data when it executes its query. This is a performance booster which results in a single more complex query but means later use of foreign-key relationships won’t require database queries.

prefetch_related(*lookups) returns a QuerySet that will automatically retrieve, in a single batch, related objects for each of the specified lookups.

"Select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one."

"Prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related."

Accessing a forward or reverse relation without these methods can add significant overhead: every access to a related object that has not been loaded yet triggers its own database query, so looping over N objects can cost N additional queries (the N+1 query problem).
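To make that overhead concrete, here is a minimal sketch that counts queries using Python's built-in sqlite3 module rather than the Django ORM. The category and product tables, the run() helper, and the sample rows are all hypothetical stand-ins; the point is only to compare per-row lookups with the single joined query that select_related aims for.

```python
import sqlite3

# Hypothetical schema standing in for Product -> main_category (a foreign key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT,
        main_category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'dairy'), (2, 'bakery');
    INSERT INTO product VALUES (1, 'milk', 1), (2, 'bread', 2), (3, 'cheese', 1);
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Run one SELECT and count it, as a stand-in for a query log."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Naive access: one query for the products, then one more per product.
naive_pairs = []
for _product_id, name, category_id in run(
    "SELECT id, name, main_category_id FROM product ORDER BY id"
):
    (category_name,) = run("SELECT name FROM category WHERE id = ?", (category_id,))[0]
    naive_pairs.append((name, category_name))
naive_queries = counter["queries"]

# Joined access (the query shape select_related aims for): one query in total.
counter["queries"] = 0
joined_pairs = run(
    "SELECT product.name, category.name FROM product "
    "JOIN category ON product.main_category_id = category.id "
    "ORDER BY product.id"
)
joined_queries = counter["queries"]

print(naive_queries, joined_queries)  # 4 1
```

With three products the naive loop issues four queries and the join issues one; the gap grows linearly with the number of rows.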

With select_related and prefetch_related, Django reads the related data up front and caches it on the model instances of the evaluated QuerySet. Later accesses to those relations read from that cache instead of querying the database, as long as the access does not build a different query (for example, by chaining filter() onto a prefetched relation).
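The "read first, then serve from memory" idea can be sketched outside Django as well. The example below again uses sqlite3 with hypothetical product and product_allergen tables; it is only an approximation of what prefetch_related does (one batched query for the related rows, then grouping in Python), not Django's actual implementation.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema standing in for Product <- ProductAllergen (reverse foreign key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_allergen (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(id),
        allergen TEXT
    );
    INSERT INTO product VALUES (1, 'milk'), (2, 'bread');
    INSERT INTO product_allergen VALUES
        (1, 1, 'lactose'), (2, 2, 'gluten'), (3, 2, 'sesame');
""")

# Query 1: the main objects.
products = conn.execute("SELECT id, name FROM product ORDER BY id").fetchall()

# Query 2: every related row for those objects, fetched in one batch.
product_ids = [product_id for product_id, _name in products]
placeholders = ", ".join("?" for _ in product_ids)
related_rows = conn.execute(
    "SELECT product_id, allergen FROM product_allergen "
    f"WHERE product_id IN ({placeholders}) ORDER BY id",
    product_ids,
).fetchall()

# The "join" happens in Python: group related rows by parent id into a cache.
allergen_cache = defaultdict(list)
for product_id, allergen in related_rows:
    allergen_cache[product_id].append(allergen)

# Later lookups read the in-memory cache; no further queries are issued.
allergens_by_name = {name: allergen_cache[product_id] for product_id, name in products}
print(allergens_by_name)
```

However many products there are, this approach needs two queries: one for the parents and one for all of their children.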

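Both methods return QuerySets rather than model instances. A QuerySet is lazy and chainable: it can still be filtered, ordered, or sliced, and it runs no query until it is evaluated. A plain list of already-loaded objects could not be refined any further at the database level.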
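As a rough illustration of why a lazy, chainable return value is useful, here is a toy sketch in plain Python. LazyQuery, fetch_all, and fetch_log are hypothetical names invented for this example; this is not how Django's QuerySet is implemented, only the same general idea of deferring the fetch until the result is actually needed.

```python
fetch_log = []


def fetch_all():
    """Pretend to hit the database; each call is recorded in fetch_log."""
    fetch_log.append("fetch")
    return [1, 2, 3, 4, 5, 6]


class LazyQuery:
    """Toy stand-in for a lazy, chainable query object (not Django's QuerySet)."""

    def __init__(self, fetch, predicates=()):
        self._fetch = fetch
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Chaining returns a new object and fetches nothing.
        return LazyQuery(self._fetch, self._predicates + (predicate,))

    def __iter__(self):
        # The fetch happens only when the results are iterated.
        rows = self._fetch()
        return iter([row for row in rows if all(p(row) for p in self._predicates)])


query = LazyQuery(fetch_all).filter(lambda n: n % 2 == 0).filter(lambda n: n > 2)
fetches_before = len(fetch_log)  # nothing fetched while chaining

rows = list(query)  # evaluation triggers exactly one fetch
fetches_after = len(fetch_log)

print(fetches_before, rows, fetches_after)  # 0 [4, 6] 1
```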

Example code:

>>> a = Product.objects.select_related('main_category', 'sub_category', 'nutrient').prefetch_related('productallergen_set')
>>> a
<QuerySet [<Product: Product object (1)>, <Product: Product object (2)>]>

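In the above example, main_category, sub_category, and nutrient are loaded by the JOIN, and productallergen_set is served from the prefetch cache when it is accessed with all(). Chaining another method such as select_related() or filter() onto the related manager builds a new query and bypasses that cache, so the two calls below return the same rows, but only all() avoids an extra database hit. To also load the allergen of each ProductAllergen up front, a nested lookup such as prefetch_related('productallergen_set__allergen') can be used on the original query.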

>>> a[0].productallergen_set.select_related('product','allergen')
<QuerySet [<ProductAllergen: ProductAllergen object (1)>, <ProductAllergen: ProductAllergen object (2)>, <ProductAllergen: ProductAllergen object (3)>]>
>>> a[0].productallergen_set.all()
<QuerySet [<ProductAllergen: ProductAllergen object (1)>, <ProductAllergen: ProductAllergen object (2)>, <ProductAllergen: ProductAllergen object (3)>]>
