**React ChatBotify RAG API** is a lightweight project that serves as an LLM proxy for the [**Google Gemini API**](https://ai.google.dev/gemini-api/docs). Notably, it is curated to surface and apply knowledge specific to React ChatBotify.
The project includes a **Retrieval Augmented Generation (RAG)** system, which allows you to upload Markdown documents that are then chunked, embedded, and stored in a [**ChromaDB**](https://www.trychroma.com/) vector database. When a query is received, the system retrieves relevant chunks to augment the context provided to the Language Model (LLM), enabling more informed and context-aware responses. A key feature of this RAG implementation is that it retrieves and uses the full content of the original parent documents from which relevant chunks were found, providing richer context to the LLM.
All functionalities, including both query and management endpoints, are exposed via [**Swagger docs**](https://swagger.io/docs/) under the `/api/v1/docs` endpoint.
Note that this project is a fork of [**LLM Proxy**](https://github.com/tjtanjin/llm-proxy), a simpler alternative that acts purely as a proxy, with no support for RAG.
### Features
React ChatBotify RAG API offers the following features:
**Query Endpoints:**
- POST `/api/v1/gemini/models/{model}:generateContent`: Proxies requests to Google Gemini's content generation API.
- POST `/api/v1/gemini/models/{model}:streamGenerateContent`: Proxies requests to Google Gemini's streaming content generation API.
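For illustration, a non-streaming request through the proxy might be sketched as below. The `gemini-1.5-flash` model name and port `8080` are assumptions; the payload follows Google's `generateContent` request schema:

```bash
# send a generateContent request through the proxy (model name is an example)
curl -X POST -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "What is React ChatBotify?"}]}]}' \
  http://localhost:8080/api/v1/gemini/models/gemini-1.5-flash:generateContent
```

The streaming variant is identical except that it targets the `:streamGenerateContent` path and returns a streamed response.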
**Management Endpoints:**
- POST `/api/v1/rag/manage/documents`: Creates a new document in the RAG system.
- GET `/api/v1/rag/manage/documents/{documentId}`: Retrieves a document by its ID.
- PUT `/api/v1/rag/manage/documents/{documentId}`: Updates an existing document.
- DELETE `/api/v1/rag/manage/documents/{documentId}`: Deletes a document by its ID.
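The management endpoints are protected: as in this project's earlier usage examples, requests are assumed to carry an `X-API-KEY` header matching the `RAG_MANAGEMENT_API_KEY` in your `.env` file. A document upload can then be sketched as follows (the key and file path are placeholders):

```bash
# upload a Markdown document to the RAG system (multipart/form-data)
curl -X POST -H "X-API-KEY: your_secure_api_key_here" \
  -F "documentId=my_test_doc_01" \
  -F "markdownFile=@/path/to/your/document.md" \
  http://localhost:8080/api/v1/rag/manage/documents
```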
**Retrieval Augmented Generation (RAG) System:**
- Document Management:
- `/api/v1/docs`: Interactive Swagger UI for exploring and testing all API endpoints.
### Technologies
Technologies used by React ChatBotify RAG API are listed below:
3. Copy the `.env.template` file (found under the `config/env/` folder) to a new file named `.env`:
```bash
cp ./config/env/.env.template ./config/env/.env
```
4. Edit the `.env` file and provide the necessary values as described in the template.
5. Run `npm run start`.
6. Visit `http://localhost:${PORT}/api/v1/docs` for the Swagger docs page.
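Taken together, the local setup can be sketched as a single shell session. Note that `npm install` is the standard dependency step and is an assumption here; the project may use a different workflow:

```bash
cp ./config/env/.env.template ./config/env/.env  # create your .env
# (edit ./config/env/.env and fill in the required values)
npm install      # install dependencies
npm run start    # start the server
```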
### Docker Deployment
The recommended way to deploy the project, including the RAG service and ChromaDB, is using Docker Compose.
1. **Ensure `.env` is configured:** Follow the steps in "Environment Configuration" above.
This command will:
* Build the `rag-api` service image.
* Pull the `chromadb/chroma` image for ChromaDB.
* Pull the `mongodb` image for MongoDB.
* Start all services.
* Create persistent volumes for ChromaDB and MongoDB data.
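Under standard Docker Compose usage, the command that produces the effects listed above would presumably be the following (this exact invocation is an assumption; run it from the repository root, where `docker-compose.yml` lives):

```bash
# build images and start all services in the background
docker compose up -d --build
```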
3. **Accessing the Service:**
* The RAG API will be available at `http://localhost:${PORT}` (e.g., `http://localhost:8080`).
* ChromaDB's API (if needed for direct interaction, though usually not required) will be available at `http://localhost:8001` (as mapped in `docker-compose.yml`).
* API documentation (Swagger UI) is available at `http://localhost:${PORT}/api/v1/docs`.
There is currently no developer guide for the project, though one will be written soon. In the meantime, if you're keen to make improvements, the codebase is small enough to explore directly.
Alternatively, you may reach out via [**discord**](https://discord.gg/6R4DK4G5Zh) or simply raise bugs or suggestions by opening an [**issue**](https://github.com/React-ChatBotify/rag-api/issues).
### Others
For any questions regarding the implementation of the project, you may reach out on [**discord**](https://discord.gg/6R4DK4G5Zh) or drop an email to: cjtanjin@gmail.com.