Merged
Changes from 11 commits
13 changes: 13 additions & 0 deletions docs/api/cache.rst
@@ -15,6 +15,19 @@ SemanticCache
:inherited-members:


LangCacheSemanticCache
======================

.. _langcache_semantic_cache_api:

.. currentmodule:: redisvl.extensions.cache.llm

.. autoclass:: LangCacheSemanticCache
:show-inheritance:
:members:
:inherited-members:


Cache Schema Classes
====================

6 changes: 5 additions & 1 deletion docs/concepts/extensions.md
@@ -31,7 +31,11 @@ Too strict, and you miss valid cache hits. Too loose, and you return wrong answers.

In applications serving multiple users or contexts, you often want separate cache spaces. Filters let you scope cache lookups—for example, caching per-user or per-conversation so one user's cached answers don't leak to another.

**Learn more:** {doc}`/user_guide/03_llmcache` covers semantic caching in detail.
### Redis vs LangCache managed service

`SemanticCache` stores data in your Redis deployment and uses RedisVL’s search index under the hood—you control sizing, networking, and advanced filtering with {doc}`FilterExpression </api/filter>`.

If you prefer a hosted semantic cache operated as a service, use `LangCacheSemanticCache` (install `redisvl[langcache]`). It calls the LangCache API endpoint instead of Redis directly. The two classes are similar but do not share all the same features: {doc}`/user_guide/03_llmcache` covers `SemanticCache` in detail, and {doc}`/user_guide/13_langcache_semantic_cache` covers `LangCacheSemanticCache`.

## Embeddings Cache

165 changes: 102 additions & 63 deletions docs/user_guide/03_llmcache.ipynb

Large diffs are not rendered by default.

289 changes: 289 additions & 0 deletions docs/user_guide/13_langcache_semantic_cache.ipynb
@@ -0,0 +1,289 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use LangCache as the LLM Cache Backend\n",
"\n",
"This guide shows how to use RedisVL's `LangCacheSemanticCache`, a thin wrapper around the [LangCache](https://redis.io/langcache/) managed semantic cache service. You get the same high-level `check` / `store` workflow as `SemanticCache`, backed by LangCache's HTTP API instead of a Redis index you manage yourself.\n",
"\n",
"For more on semantic caching, see [Extensions](../concepts/extensions.md); for the self-managed `SemanticCache` class, see the [LLM cache notebook](03_llmcache.ipynb). API entries for both classes live in the [LLM cache API](../api/cache.rst).\n",
"\n",
"## Prerequisites\n",
"\n",
"Before you begin, ensure you have:\n",
"- Installed RedisVL with the LangCache extra: `pip install redisvl[langcache]`\n",
"- Python 3.9+ (same as RedisVL)\n",
"- A LangCache service with a **cache ID** and **API key**. You can set up a LangCache service in Redis Cloud [here](https://cloud.redis.io/#/)\n",
"- Optionally: **attributes** configured on your LangCache cache if you plan to pass `metadata` / `attributes` from RedisVL\n",
"\n",
"## What You'll Learn\n",
"\n",
"By the end of this guide, you will be able to:\n",
"- Choose between `SemanticCache` and `LangCacheSemanticCache` for your deployment\n",
"- Initialize `LangCacheSemanticCache` with credentials and TTL defaults\n",
"- Implement read-through caching (`check` \u2192 LLM \u2192 `store`)\n",
"- Use LangCache attributes for scoping and deletion\n",
"- Override TTL per store, use async APIs, and run delete operations\n",
"- Understand current limitations compared to `SemanticCache`\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Choose `SemanticCache` or `LangCacheSemanticCache`\n",
"\n",
"| | `SemanticCache` | `LangCacheSemanticCache` |\n",
"|---|------------------|--------------------------|\n",
"| **Where data lives** | Your Redis deployment; RedisVL creates and queries a search index | LangCache managed service (hosted API) |\n",
"| **Best when** | You control Redis, need full RedisVL query/filter features, or co-locate cache with app data | You want a managed semantic cache without operating Redis or the index |\n",
"| **Vector search by raw embedding** | Supported (`vector=` on `check`) | **Not supported** \u2014 search is prompt-based via the LangCache API |\n",
"| **Filter expressions** | `FilterExpression` on `check` | **Not supported** \u2014 use LangCache **attributes** (pre-configured on the cache) |\n",
"| **Partial entry updates** | Supported where the backend allows | **`update` / `aupdate` raise** \u2014 delete and re-store instead |\n",
"\n",
"> **Note:** `SemanticCache` is covered in depth in the [llmcache notebook](03_llmcache.ipynb) guide.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install the LangCache extra\n",
"\n",
"The `redisvl[langcache]` extra installs compatible `langcache` dependencies:\n",
"\n",
"```bash\n",
"pip install redisvl[langcache]\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"%pip install redisvl[langcache]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize `LangCacheSemanticCache`\n",
"\n",
"Create `LangCacheSemanticCache` with your LangCache credentials. The default `server_url` points at the managed LangCache API; override it if your provider gives a different endpoint.\n",
"\n",
"The following example reads credentials from environment variables (recommended for applications). Replace placeholder values when experimenting locally.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"import os\n",
"\n",
"from redisvl.extensions.cache.llm import LangCacheSemanticCache\n",
"\n",
"CACHE_ID = os.environ.get(\"LANGCACHE_CACHE_ID\", \"YOUR_CACHE_ID\")\n",
"API_KEY = os.environ.get(\"LANGCACHE_API_KEY\", \"YOUR_API_KEY\")\n",
"\n",
"cache = LangCacheSemanticCache(\n",
" name=\"my_app_cache\",\n",
" server_url=\"https://aws-us-east-1.langcache.redis.io\",\n",
" cache_id=CACHE_ID,\n",
" api_key=API_KEY,\n",
" ttl=3600, # default TTL for entries, in seconds (optional)\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"| Parameter | Purpose |\n",
"|-----------|---------|\n",
"| `cache_id`, `api_key` | Required. Identify your LangCache cache and authenticate. |\n",
"| `server_url` | LangCache API base URL (default matches typical managed deployments). |\n",
"| `ttl` | Default time-to-live for stored entries, in seconds; can be overridden per `store` call. |\n",
"| `use_exact_search` / `use_semantic_search` | Enable exact and/or semantic matching (at least one must be `True`). |\n",
"| `distance_threshold` (on `check`) | Works with `distance_scale`: `\"normalized\"` (0\u20131 distance) or `\"redis\"` (cosine-style 0\u20132). |\n"
]
},
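{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "The following sketch tightens matching with `distance_threshold`, assuming the `cache` object from above. Note that passing `distance_scale` directly to `check` is an assumption about the signature based on the table; in your version of RedisVL it may instead be a constructor option.\n"
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
  "# NBVAL_SKIP\n",
  "# Stricter matching: only return hits within a normalized distance of 0.1.\n",
  "hits = cache.check(\n",
  "    prompt=\"What is Redis?\",\n",
  "    distance_threshold=0.1,\n",
  "    distance_scale=\"normalized\",  # 0-1 distance, per the table above (assumed kwarg)\n",
  "    num_results=1,\n",
  ")\n"
 ]
},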
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Attributes and metadata\n",
"\n",
"LangCache **attributes** are key/value metadata attached to entries. They can be used when **searching** (`check` / `acheck` via the `attributes` argument) and when **deleting** (`delete_by_attributes` / `adelete_by_attributes`).\n",
"\n",
"**You must define the same attribute names (and types) in the LangCache console or API for your cache before RedisVL can use them.** If you pass `metadata` to `store` or `attributes` to `check` but the cache has no attributes configured, the LangCache API returns an error; RedisVL surfaces a clear `RuntimeError` explaining that attributes need to be configured or removed from the call.\n",
"\n",
"String values are encoded for the API and decoded when reading hits so special characters remain usable.\n"
]
},
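{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "A minimal sketch of storing an entry with metadata. It assumes the `cache` object from above and that a `tenant_id` attribute is already configured on your LangCache cache; the attribute name here is purely illustrative.\n"
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
  "# NBVAL_SKIP\n",
  "# The metadata keys must match attributes configured on the LangCache cache;\n",
  "# otherwise the API returns an error (surfaced by RedisVL as a RuntimeError).\n",
  "cache.store(\n",
  "    prompt=\"What is Redis?\",\n",
  "    response=\"Redis is an in-memory data store.\",\n",
  "    metadata={\"tenant_id\": \"acme\"},\n",
  ")\n"
 ]
},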
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read-through caching pattern\n",
"\n",
"Typical flow: try `check`, call the LLM on a miss, then `store` the result.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"def call_your_llm(prompt: str) -> str:\n",
" \"\"\"Replace with your LLM client (OpenAI, Anthropic, etc.).\"\"\"\n",
" return f\"Answer for: {prompt}\"\n",
"\n",
"\n",
"def answer(user_prompt: str) -> str:\n",
" hits = cache.check(prompt=user_prompt, num_results=1)\n",
" if hits:\n",
" return hits[0][\"response\"]\n",
"\n",
" response = call_your_llm(user_prompt)\n",
" cache.store(prompt=user_prompt, response=response)\n",
" return response\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optional scoping with attributes (only if those attributes are configured on LangCache):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"user_prompt = \"Example prompt\"\n",
"\n",
"hits = cache.check(\n",
" prompt=user_prompt,\n",
" attributes={\"tenant_id\": \"acme\", \"model\": \"gpt-4o\"},\n",
" num_results=1,\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### TTL\n",
"\n",
"- **Constructor** `ttl=` sets the default lifetime for new entries (seconds).\n",
"- **Per call**, pass `ttl=` to `store` / `astore` to override the default for that entry.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"prompt = \"What is Redis?\"\n",
"response = \"Redis is an in-memory data store.\"\n",
"\n",
"cache.store(prompt=prompt, response=response, ttl=300) # this entry expires in 5 minutes\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Async usage\n",
"Use the `a`-prefixed methods with `asyncio` \u2014 for example `acheck`, `astore`, `adelete`, `adelete_by_id`, `adelete_by_attributes`, `aclear`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# NBVAL_SKIP\n",
"async def call_your_llm_async(prompt: str) -> str:\n",
" return f\"Async answer for: {prompt}\"\n",
"\n",
"\n",
"async def answer_async(user_prompt: str) -> str:\n",
" hits = await cache.acheck(prompt=user_prompt, num_results=1)\n",
" if hits:\n",
" return hits[0][\"response\"]\n",
"\n",
" response = await call_your_llm_async(user_prompt)\n",
" await cache.astore(prompt=user_prompt, response=response)\n",
" return response\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete operations\n",
"\n",
"| Method | What it does |\n",
"|--------|----------------|\n",
"| `delete()` / `adelete()` | **Flush** the entire cache (all entries). Aliases: `clear()` / `aclear()`. |\n",
"| `delete_by_id(entry_id)` / `adelete_by_id` | Remove one entry by LangCache entry ID (returned from `store`). |\n",
"| `delete_by_attributes` / `adelete_by_attributes` | Remove entries matching the given attribute map (non-empty dict required). |\n"
]
},
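{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "The methods in the table above can be sketched as follows. This assumes the `cache` object from earlier; the `attributes` keyword on `delete_by_attributes` and the illustrative attribute name are assumptions.\n"
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
  "# NBVAL_SKIP\n",
  "# store() returns the LangCache entry ID for the new entry\n",
  "entry_id = cache.store(prompt=\"Temp prompt\", response=\"Temp response\")\n",
  "\n",
  "# Remove that single entry by its entry ID\n",
  "cache.delete_by_id(entry_id)\n",
  "\n",
  "# Remove all entries matching a configured attribute map (non-empty dict)\n",
  "cache.delete_by_attributes(attributes={\"tenant_id\": \"acme\"})\n",
  "\n",
  "# Flush the entire cache\n",
  "cache.clear()\n"
 ]
},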
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Current limitations\n",
"\n",
"The wrapper follows the LangCache API. The following RedisVL features either do not apply or are explicitly unsupported:\n",
"\n",
"- **No direct vector search** \u2014 Passing `vector=` to `check` / `acheck` logs a warning and does not search by embedding.\n",
"- **No `filter_expression`** \u2014 RedisVL filter expressions are not translated; use LangCache attributes only.\n",
"- **No `update()` / `aupdate()`** \u2014 The LangCache API does not update individual entries; these methods raise `NotImplementedError`. Delete the entry (or store a new pair) instead.\n",
"- **`filters` on `store`** \u2014 Not supported by LangCache; a warning is logged if provided.\n",
"\n",
"> **Tip:** See the **LangCacheSemanticCache** section in the [LLM cache API](../api/cache.rst) for parameter and method listings.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "redisvl-dev",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
3 changes: 3 additions & 0 deletions docs/user_guide/how_to_guides/index.md
@@ -8,6 +8,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific goals.
:::{grid-item-card} 🤖 LLM Extensions

- [Cache LLM Responses](../03_llmcache.ipynb) -- semantic caching to reduce costs and latency
- [Use LangCache as the LLM cache](../13_langcache_semantic_cache.ipynb) -- managed cache service with LangCache
- [Manage LLM Message History](../07_message_history.ipynb) -- persistent chat history with relevancy retrieval
- [Route Queries with SemanticRouter](../08_semantic_router.ipynb) -- classify intents and route queries
:::
@@ -49,6 +50,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific goals.
| I want to... | Guide |
|--------------|-------|
| Cache LLM responses | [Cache LLM Responses](../03_llmcache.ipynb) |
| Use LangCache (managed) for LLM caching | [Use LangCache as the LLM cache](../13_langcache_semantic_cache.ipynb) |
| Store chat history | [Manage LLM Message History](../07_message_history.ipynb) |
| Route queries by intent | [Route Queries with SemanticRouter](../08_semantic_router.ipynb) |
| Filter results by multiple criteria | [Query and Filter Data](../02_complex_filtering.ipynb) |
@@ -66,6 +68,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific goals.
:hidden:

Cache LLM Responses <../03_llmcache>
Use LangCache as the LLM cache <../13_langcache_semantic_cache>
Manage LLM Message History <../07_message_history>
Route Queries with SemanticRouter <../08_semantic_router>
Query and Filter Data <../02_complex_filtering>
8 changes: 4 additions & 4 deletions redisvl/extensions/cache/llm/langcache.py
@@ -286,8 +286,8 @@ def check(
filter_expression (Optional[FilterExpression]): Not supported.
distance_threshold (Optional[float]): Maximum distance threshold.
Converted to similarity_threshold according to distance_scale:
- If "redis": uses norm_cosine_distance(distance_threshold) ([0,2] → [0,1])
- If "normalized": uses (1.0 - distance_threshold) ([0,1] → [0,1])
If "redis", uses norm_cosine_distance(distance_threshold) ([0,2] -> [0,1]).
If "normalized", uses (1.0 - distance_threshold) ([0,1] -> [0,1]).
attributes (Optional[Dict[str, Any]]): LangCache attributes to filter by.
Note: Attributes must be pre-configured in your LangCache instance.

@@ -360,8 +360,8 @@ async def acheck(
filter_expression (Optional[FilterExpression]): Not supported.
distance_threshold (Optional[float]): Maximum distance threshold.
Converted to similarity_threshold according to distance_scale:
- If "redis": uses norm_cosine_distance(distance_threshold) ([0,2] -> [0,1])
- If "normalized": uses (1.0 - distance_threshold) ([0,1] -> [0,1])
If "redis", uses norm_cosine_distance(distance_threshold) ([0,2] -> [0,1]).
If "normalized", uses (1.0 - distance_threshold) ([0,1] -> [0,1]).
attributes (Optional[Dict[str, Any]]): LangCache attributes to filter by.
Note: Attributes must be pre-configured in your LangCache instance.
