Cache Performance

Monitor and optimize AI response caching efficiency

Hit Rate: 92.3% (+3.2%)
Cache Size: 758 MB (76% used)
Avg Response: 45 ms (-78%)
Cost Savings: $2,340 (+12%)

Cache Hit Rate
[Chart: hit rate vs. miss rate]

Response Time Impact
[Chart: P50, P95, and P99 response times, cached vs. non-cached]

20x Faster Response Times

Cached responses are served about 20x faster than non-cached ones, which improves user experience and reduces compute costs.
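A speedup figure like the 20x above can be derived by comparing the same latency percentile for cached and non-cached requests. The sketch below shows one possible way to do that with a nearest-rank percentile. It is an assumption, not the dashboard's actual method: the `percentile` and `speedup` helpers and the sample latencies are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so that p=0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def speedup(cached_ms, uncached_ms, p=50):
    """Ratio of non-cached to cached latency at percentile p."""
    return percentile(uncached_ms, p) / percentile(cached_ms, p)


# Hypothetical latency samples in milliseconds, for illustration only.
cached_ms = [40, 45, 50]
uncached_ms = [800, 900, 1000]

print(f"P50 speedup: {speedup(cached_ms, uncached_ms):.1f}x")
```

Comparing P95 or P99 in the same way shows whether the cache also helps tail latency, which a single average can hide.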

Cache Distribution

Total: 758 MB

Text Completions: 342 MB (45%)
Embeddings: 189 MB (25%)
Image Generation: 114 MB (15%)
Code Generation: 76 MB (10%)
Other: 38 MB (5%)
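Each category's share is its size divided by the sum of all category sizes. The sketch below reproduces that arithmetic from the sizes listed above. The `shares` helper is a hypothetical name. Note that the listed sizes add up to 759 MB, 1 MB more than the displayed total, presumably because each size is rounded.

```python
# Category sizes in MB, as listed in the distribution above.
sizes_mb = {
    "Text Completions": 342,
    "Embeddings": 189,
    "Image Generation": 114,
    "Code Generation": 76,
    "Other": 38,
}


def shares(sizes):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(sizes.values())
    return {name: round(100 * size / total) for name, size in sizes.items()}


for name, pct in shares(sizes_mb).items():
    print(f"{name}: {pct}%")
```

With these inputs the shares come out to 45%, 25%, 15%, 10%, and 5%, so Other accounts for the remaining 5%.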

Eviction Policies

LRU (Least Recently Used): 234 evictions, 92%
TTL-based (Time To Live): 89 evictions, 87%
LFU (Least Frequently Used): 156 evictions, 79%

Enable auto-eviction when cache is 90% full
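As an illustration of how threshold-based auto-eviction might behave, here is a minimal sketch of a byte-budgeted LRU cache that evicts least recently used entries once usage exceeds 90% of capacity. This is an assumption about the mechanism, not the product's implementation. The `LRUCache` class, its parameters, and the sizes in the usage example are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-budgeted LRU cache that evicts once usage exceeds a threshold."""

    def __init__(self, capacity_bytes, evict_threshold=0.90):
        self.capacity_bytes = capacity_bytes
        self.evict_threshold = evict_threshold  # fraction of capacity, assumed 90%
        self.used_bytes = 0
        self.evictions = 0
        self._entries = OrderedDict()  # key -> (value, size_bytes), oldest first

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][0]

    def put(self, key, value, size_bytes):
        """Insert or replace an entry, then evict until usage is under the threshold."""
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)[1]
        self._entries[key] = (value, size_bytes)
        self.used_bytes += size_bytes
        limit = self.capacity_bytes * self.evict_threshold
        # Evict least recently used entries, but never the one just inserted.
        while self.used_bytes > limit and len(self._entries) > 1:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
            self.evictions += 1


# Usage with made-up sizes: capacity 100 bytes, so eviction starts above 90 bytes.
cache = LRUCache(capacity_bytes=100)
cache.put("a", "first", 40)
cache.put("b", "second", 40)
cache.get("a")               # "a" is now more recently used than "b"
cache.put("c", "third", 20)  # usage reaches 100 bytes, so "b" is evicted
```

A TTL-based or LFU policy would differ only in how the eviction victim is chosen: by expiry time or by hit count rather than by recency.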

Recent Cache Entries

emb_vec_search_products (active)
Embeddings • 2.4 MB • 8,934 hits • 2 mins ago

completion_customer_support (active)
Text Completions • 156 KB • 3,421 hits • 5 mins ago

img_gen_product_thumbnails (stale)
Image Generation • 45.2 MB • 1,203 hits • 18 mins ago

code_gen_api_endpoints (expiring)
Code Generation • 892 KB • 567 hits • 1 hour ago