Storage CDN
All assets uploaded to Supabase Storage are cached on a Content Delivery Network (CDN) to improve the latency for users all around the world. CDNs are a geographically distributed set of servers or nodes which caches content from an origin server. For Supabase Storage, the origin is the storage server running in the same region as your project. Aside from performance, CDNs also help with security and availability by mitigating Distributed Denial of Service and other application attacks.
Basic CDN#
Our basic CDN caches objects based on the cache time set when uploading objects.
Example#
Let's walk through an example of how a CDN helps with performance.
A new bucket is created for a Supabase project launched in Singapore. All requests to the Supabase Storage API are routed to the CDN first.
A user from the United States requests an object and is routed to the U.S. CDN. At this point, that CDN node does not have the object in its cache and pings the origin server in Singapore.
Another user, also in the United States, requests the same object and is served directly from the CDN cache in the United States instead of routing the request back to Singapore.
Cache duration#
By default, assets are cached both in the CDN and in the user's browser for 1 hour. After this, the CDN nodes ping the storage server to see if an object has been updated. You can modify this cache time when you are uploading or updating an object by modifying the cacheControl
parameter.
If you expect the object to not change at a given URL, setting a longer cache duration is preferable.
If you need to update the version of the object stored in the CDN, there are various cache-busting techniques you can use. The most common way to do this is to add a version query parameter in the URL.
For example, you can use a URL like /storage/v1/object/sign/profile-pictures/cat.jpg?token=eyJh...&version=1
in your applications and set a long cache time of 1 year.
When you want to update the cat picture, you can increment the version query parameter in the URL. The CDN treats /storage/v1/object/sign/profile-pictures/cat.jpg?token=eyJh...&version=2
as a new object and pings the origin for the updated version.
If your asset is updated frequently, we recommend that you re-upload the new asset in a different key, this way you'll always have the latest changes available immediately.
Note that CDNs might still evict your object from their cache if it has not been requested for a while from a specific region. For example, if no user from United States requests your object, it will be removed from the CDN cache even if you set a very long cache control duration.
The cache status of a particular request is sent in the cf-cache-status
header. A cache status of MISS
indicates that the CDN node did not have the object in its cache and had to ping the origin to get it. A cache status of HIT
indicates that the object was sent directly from the CDN.
Smart CDN Caching#
note
Smart CDN caching is automatically enabled for Pro plan and above.
With Smart CDN caching enabled, the asset metadata in your database is synchronized to the edge. This automatically revalidates the cache when the asset is changed or deleted.
Additionally, the smart CDN has a higher cache hit ratio as the origin server is shielded from asset requests that haven't changed when using different query strings in the URL.
Cache duration#
When Smart CDN is enabled, the asset is cached on the CDN for as long as possible. You can still control how long assets are stored in the browser using the cacheControl option when uploading a file. Smart CDN caching works with all types of storage operations including signed URLs.
When a file is updated or deleted, the CDN cache is automatically invalidated to reflect the change (including transformed images). It can take up to 60 seconds for the CDN cache to be invalidated as the asset metadata has to propagate across all the data-centers around the globe.
When an asset is invalidated at the CDN level, browsers may not update its cache.
If your asset is updated frequently, we recommend that you re-upload the new asset in a different key, this way you'll always have the latest changes available without waiting the propagation delay. If you expect your asset to be deleted, we recommend setting a low browser TTL value using the cacheControl
option when using smart CDN caching, the default is 1 hour which generally is a good default.
Public vs Private Buckets#
Objects in public buckets do not require any Authorization to access objects. This leads to a better cache hit rate compared to private buckets. For private buckets, permissions for accessing each object is checked on a per user level. For example, if two different users access the same object in a private bucket from the same region, it results in a cache miss for both the users since they might have different security policies attached to them. On the other hand, if two different users access the same object in a public bucket from the same region, it results in a cache hit for the second user.
Debugging Cache Hits#
The storage report gives a breakdown of your project's cache hit rate.
The Logs Explorer can also be used to debug cache misses.
note
Cache hits can be determined via the metadata.response.headers.cf_cache_status
key. Any value corresponding to either HIT
, STALE
, REVALIDATED
, or UPDATING
is considered a cache hit.
The following example query will show the top cache misses from the edge_logs
:
_17select_17 r.path as path,_17 r.search as search,_17 count(id) as count_17from_17 edge_logs as f_17 cross join unnest(f.metadata) as m_17 cross join unnest(m.request) as r_17 cross join unnest(m.response) as res_17 cross join unnest(res.headers) as h_17where_17 starts_with(r.path, '/storage/v1/object')_17 and r.method = 'GET'_17 and h.cf_cache_status in ('MISS', 'NONE/UNKNOWN', 'EXPIRED', 'BYPASS', 'DYNAMIC')_17group by path, search_17order by count desc_17limit 50;
Try out this query in the Logs Explorer here.
Your cache hit ratio over time can then be determined using the following query:
_12select_12 timestamp_trunc(timestamp, hour) as timestamp,_12 countif(h.cf_cache_status in ('HIT', 'STALE', 'REVALIDATED', 'UPDATING')) / count(f.id) as ratio_12from_12 edge_logs as f_12 cross join unnest(f.metadata) as m_12 cross join unnest(m.request) as r_12 cross join unnest(m.response) as res_12 cross join unnest(res.headers) as h_12where starts_with(r.path, '/storage/v1/object') and r.method = 'GET'_12group by timestamp_12order by timestamp desc;
Try out this query in the Logs Explorer here.