How to delete items in bulk from Cosmos DB
I use Cosmos DB because it's a great NoSQL engine: its smooth connectivity and integration make it a pleasure to work with.
One of the common tasks I faced recently was removing irrelevant (outdated) data. Housekeeping tasks are very straightforward: you mark data as obsolete and then remove it.
Let's take a look at the implementation options we have:
Custom logic
As an example: you create a cron job that runs at a fixed interval, checks for expired data, and deletes it. It can be as simple as it sounds. But when there is a large amount of data to delete, things get complicated.
Compared to a relational database, the Cosmos DB SQL API doesn't support delete by query, which leaves only one option: deleting records one by one. That is not the best approach for scalability.
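To make the one-by-one approach concrete, here is a minimal sketch of what such a cleanup job might look like with the .NET SDK v3 (Microsoft.Azure.Cosmos). The ExpiresAt field (a Unix timestamp in seconds), the /PartitionKey path, and the ExpiredItemCleaner class are assumptions made for illustration; they are not part of the original setup.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class ExpiredItemCleaner
{
    // Minimal projection: only the fields needed to address an item for deletion.
    private sealed class ItemKey
    {
        public string id { get; set; }
        public string PartitionKey { get; set; }
    }

    // Deletes every item whose (hypothetical) ExpiresAt timestamp is in the past
    // and returns the number of deleted items.
    public static async Task<int> DeleteExpiredAsync(Container container, DateTimeOffset now)
    {
        QueryDefinition query = new QueryDefinition(
                "SELECT c.id, c.PartitionKey FROM c WHERE c.ExpiresAt < @now")
            .WithParameter("@now", now.ToUnixTimeSeconds());

        int deleted = 0;
        using FeedIterator<ItemKey> iterator = container.GetItemQueryIterator<ItemKey>(query);
        while (iterator.HasMoreResults)
        {
            FeedResponse<ItemKey> page = await iterator.ReadNextAsync();
            foreach (ItemKey key in page)
            {
                // One request (and one RU charge) per item: this is the scalability concern.
                await container.DeleteItemAsync<ItemKey>(key.id, new PartitionKey(key.PartitionKey));
                deleted++;
            }
        }
        return deleted;
    }
}
```

The query itself is cheap to write, but every expired item still costs a separate delete request, so the run time and RU bill grow linearly with the amount of data to remove.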
Pros
- With the right implementation, it can be ported to another engine with minimal rewriting of the connectivity layer
- Logic lives in the backend, separate from the database (if that matches the target architecture)
Cons
- Development and testing effort
- Performance can suffer when a large amount of data has to be removed
- Code to maintain
Cosmos DB stored procedure
Similar to the previous option, but executed inside Cosmos DB. It is probably a better solution than the previous one, but it still requires some development effort and will cost you RUs.
Pros
- Handled by the database engine
Cons
- Development and testing effort
- Performance can suffer when a large amount of data has to be removed
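As a rough sketch of the client side of this option, the snippet below calls a stored procedure repeatedly until it reports that nothing is left to delete. It assumes a stored procedure named bulkDelete is already registered on the container, that it accepts a query string, and that it returns an object with deleted and continuation fields; the name, the result shape, and the ExpiresAt field are all assumptions, not something the original setup defines.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Scripts;

public static class StoredProcedureCleaner
{
    // Result shape the hypothetical "bulkDelete" stored procedure is assumed to return.
    public sealed class BulkDeleteResult
    {
        public int Deleted { get; set; }
        public bool Continuation { get; set; }
    }

    // Runs the stored procedure against one partition until it reports completion.
    public static async Task<int> DeleteExpiredInPartitionAsync(
        Container container, string partitionKeyValue, long nowUnixSeconds)
    {
        // ExpiresAt is a hypothetical field holding a Unix timestamp in seconds.
        string query = $"SELECT c._self FROM c WHERE c.ExpiresAt < {nowUnixSeconds}";

        int total = 0;
        BulkDeleteResult result;
        do
        {
            StoredProcedureExecuteResponse<BulkDeleteResult> response =
                await container.Scripts.ExecuteStoredProcedureAsync<BulkDeleteResult>(
                    "bulkDelete",
                    new PartitionKey(partitionKeyValue),
                    new dynamic[] { query });

            result = response.Resource;
            total += result.Deleted;
        } while (result.Continuation); // the procedure ran out of time; call it again

        return total;
    }
}
```

A stored procedure runs within a single logical partition and has a bounded execution time, so the caller still has to loop per partition key and per continuation. That is where the development and testing effort listed above comes from.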
Implementing Cosmos DB TTL
There is a built-in mechanism in the Cosmos DB SQL API, called Time to Live (TTL), that takes care of expired data.
However, keep in mind that TTL doesn't come for free: Cosmos DB uses left-over RUs in the background to delete records. Even so, this is definitely preferable to the options mentioned above.
Pros
- Minimal development effort
- Easy and clear solution
- No code to maintain
Cons
- A heavy workload on Cosmos DB might delay deletion, since the mechanism uses left-over RUs
The setup
- Enable Time to Live in the container settings
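- Add a TTL property to your model. Cosmos DB only recognizes the lowercase JSON names "id" and "ttl", so map the properties explicitly
    // Assumes the SDK's default Newtonsoft.Json serializer.
    using Newtonsoft.Json;

    public class PersistentModel
    {
        [JsonProperty("id")]
        public string Id { get; set; }
        public string Name { get; set; }
        public string PartitionKey { get; set; }
        // Time to live in seconds, counted from the item's last modification.
        [JsonProperty("ttl")]
        public int Ttl { get; set; }
    }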
- Save the record (here vt, a PersistentModel instance) with its TTL calculated in seconds
ItemResponse<PersistentModel> createResponse = await container.CreateItemAsync(vt, new PartitionKey(vt.PartitionKey));
And voilà! What's left is to verify that records are deleted once their TTL expires.
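The first and last setup steps can also be done in code. The sketch below enables TTL when creating the container and converts an absolute expiry time into the TTL value in seconds. The container name "items", the /PartitionKey path, and the TtlSetup helper are assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class TtlSetup
{
    // Creates the container (if it does not exist yet) with TTL enabled.
    // DefaultTimeToLive = -1 switches TTL on without a container-wide default,
    // so only items that carry their own "ttl" value expire.
    // Note: this does not change the settings of a container that already exists.
    public static async Task<Container> EnsureContainerAsync(Database database)
    {
        var properties = new ContainerProperties("items", "/PartitionKey")
        {
            DefaultTimeToLive = -1
        };
        ContainerResponse response = await database.CreateContainerIfNotExistsAsync(properties);
        return response.Container;
    }

    // Converts an absolute expiry time into a TTL in whole seconds.
    // An item-level ttl must be a positive integer (or -1), so the result is clamped to at least 1.
    public static int TtlSecondsUntil(DateTimeOffset expiresAt, DateTimeOffset now)
    {
        double seconds = Math.Ceiling((expiresAt - now).TotalSeconds);
        return (int)Math.Max(1, Math.Min(seconds, int.MaxValue));
    }
}
```

Before saving, the model's Ttl could then be set with something like vt.Ttl = TtlSetup.TtlSecondsUntil(DateTimeOffset.UtcNow.AddDays(30), DateTimeOffset.UtcNow). Cosmos DB counts the TTL from the item's last modification time, so updating an item restarts its clock.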