Commit 3da7ce5

Merge pull request #14053 from nextcloud/backport/14047/stable33
[stable33] chore(AI/ContextChat): update scaling docs for parsing
2 parents: 5923652 + 2b435bb

1 file changed

Lines changed: 20 additions & 18 deletions

admin_manual/ai/app_context_chat.rst

@@ -6,14 +6,14 @@ App: Context Chat
 
 Context Chat is an :ref:`assistant<ai-app-assistant>` feature that is implemented via an ensemble of two apps:
 
-* the *context_chat* app, written purely in PHP
-* the *context_chat_backend* ExternalApp written in Python
+* the ``context_chat`` app, written purely in PHP
+* the ``context_chat_backend`` ExternalApp written in Python
 
 Together they provide the ContextChat *text processing* and *search* tasks accessible via the :ref:`Nextcloud Assistant app<ai-app-assistant>`.
 
-The *context_chat* and *context_chat_backend* apps will use the Free text-to-text task processing providers like OpenAI integration, LLM2, etc. and such a provider is required on a fresh install, or it can be configured to run open source models entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
+The ``context_chat`` and ``context_chat_backend`` apps will use the configured text-to-text task processing provider, which is required on a fresh install. It can be configured to run open source models entirely on-premises, see the list of providers :ref:`here <tp-consumer-apps>` in the "Backend apps" section.
 
-This app supports input and output in the same languages that the currently configured Free text-to-text task processing provider supports.
+This app supports input and output in the same languages that the currently configured text-to-text task processing provider supports.
 
 Requirements
 ------------
@@ -26,18 +26,16 @@ Requirements
 * GPU Setup Sizing
 
 * A NVIDIA GPU with at least 2GB VRAM
-* The requirements for the Free text-to-text providers should be checked separately
-* llm2's requirements can be found :ref:`here <ai-app-llm2>`
-* integration_openai does not have any additional GPU requirements
+* The requirements for the text-to-text providers should be checked separately for each app :ref:`here <tp-consumer-apps>` in the "Backend apps" section, as they can vary greatly based on the model used and whether the provider is hosted locally or remotely.
 * At least 8GB of system RAM
 * 2 GB + additional 500MB for each concurrent request made to the backend if configuration parameters are changed
 
 * CPU Setup Sizing
 
 * At least 12GB of system RAM
-* 2 GB + additional 500MB for each request made to the backend if the Free text-to-text provider is not on the same machine
+* 2 GB + additional 500MB for each additional concurrent query request
 * 8 GB is recommended in the above case for the default settings
-* This app makes use of the configured free text-to-text task processing provider instead of running its own language model by default, you will thus need 4+ cores for the embedding model only
+* This app makes use of the configured text-to-text task processing provider instead of running its own language model by default, thus 4+ cores for the embedding model is needed
 
 * A dedicated machine is recommended
 
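Note: as a rough illustration of the memory rule in this hunk (not an official sizing figure), a CPU-only backend serving four additional concurrent query requests would need about 2 GB + 4 × 0.5 GB = 4 GB of RAM for itself, on top of what the embedding model and the rest of the system consume.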
@@ -51,19 +49,19 @@ Installation
 
 1. Make sure the :ref:`Nextcloud Assistant app<ai-app-assistant>` is installed
 2. Setup a :ref:`Deploy Daemon <ai-app_api>` in AppAPI Admin settings
-3. Install the *context_chat_backend* ExApp via the "Apps" page in Nextcloud, or by executing (checkout the readme at https://github.com/nextcloud/context_chat_backend for manual install steps)
+3. Install the ``context_chat_backend`` ExApp via the "Apps" page in Nextcloud, or by executing (checkout the readme at https://github.com/nextcloud/context_chat_backend for manual install steps)
 
 .. code-block::
 
     occ app_api:app:register context_chat_backend
 
-4. Install the *context_chat* app via the "Apps" page in Nextcloud, or by executing
+4. Install the ``context_chat`` app via the "Apps" page in Nextcloud, or by executing
 
 .. code-block::
 
     occ app:enable context_chat
 
-5. Install a text generation backend like :ref:`llm2 <ai-app-llm2>` or `integration_openai <https://github.com/nextcloud/integration_openai>`_ via the "Apps" page in Nextcloud
+5. Install a text-to-text provider (text generation provider) via the "Apps" page in Nextcloud. A list of providers can be found :ref:`here <tp-consumer-apps>` in the "Backend apps" section.
 
 6. Optionally but recommended, setup background workers for faster pickup of tasks. See :ref:`the relevant section in AI Overview<ai-overview_improve-ai-task-pickup-speed>` for more information.
 
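Note: for step 6 in this hunk, the linked AI Overview section covers setting up background workers. A minimal sketch of one such worker invocation is shown below, assuming ``occ`` is run as the web server user from the Nextcloud directory; the exact job class and the recommended number of parallel workers should be confirmed against that section rather than taken from this sketch.

    # start one task-processing worker; run several in parallel (e.g. via systemd) for faster task pickup
    sudo -u www-data php occ background-job:worker 'OC\TaskProcessing\SynchronousBackgroundJob'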
@@ -104,23 +102,27 @@ Synchronous indexing
 Scaling
 -------
 
-There are three major parts that influence the performance of the system:
+Listed below are the major parts of the system that can be scaled independently to improve performance:
 
-1. **The text-to-text task processing provider (like OpenAI and LocalAI integration, LLM2, etc.)**
+1. The text-to-text task processing provider (from among the list of providers :ref:`here <tp-consumer-apps>` in the "Backend apps" section)
 
 The text-to-text task processing provider can be scaled by using a hosted service using the `OpenAI and LocalAI integration (via OpenAI API) <https://apps.nextcloud.com/apps/integration_openai>`_ like OpenAI or by hosting your own model on powerful hardware.
 
-2. **The vector DB performance**
+2. The vector DB performance
 
 | The vector DB performance can be scaled by using a dedicated or cluster setup for PostgreSQL with the pgvector extension.
 | The connection string of the external vector DB can be set using the environment variable ``EXTERNAL_DB`` during deployment in the "Deploy Options".
 
-3. **The embedding model performance**
+3. The embedding model performance
 
 | The embedding model performance can be scaled by using a hosted embedding service, locally or remotely hosted. It should be able to serve an OpenAI-compatible API.
 | The embedding service URL can be set using the environment variable ``CC_EM_BASE_URL`` during deployment in the "Deploy Options". Other options like the model name, api key, or username and password can be set using the environment variables ``CC_EM_MODEL_NAME``, ``CC_EM_API_KEY``, ``CC_EM_USERNAME``, and ``CC_EM_PASSWORD`` respectively.
 
-If context_chat_backend is already deployed, you can change these environment variables by redeploying it with the new values.
+One part of the system that cannot be scaled yet is the parsing of the documents to extract text.
+This is currently done in a single instance of the ``context_chat_backend`` ExApp.
+It is a CPU-bound task so having a powerful CPU will help speed up the parsing process.
+
+If ``context_chat_backend`` is already deployed, you can change these environment variables by redeploying it with the new values.
 
 1. Go to Apps page -> search for "Context Chat Backend"
 2. Disable and remove the app taking care the data is not removed
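Note: ``EXTERNAL_DB`` and the ``CC_EM_*`` variables referenced in this hunk are plain environment variables set in the ExApp's "Deploy Options" when deploying ``context_chat_backend``. A minimal sketch of such values follows; the host names, credentials, database name, and model name are placeholders chosen for illustration, not defaults shipped with the app.

    # hypothetical deploy-option values, shown for illustration only
    EXTERNAL_DB=postgresql://ccuser:secret@pgvector-host:5432/context_chat   # PostgreSQL + pgvector connection string
    CC_EM_BASE_URL=http://embedding-host:8080/v1                             # OpenAI-compatible embedding endpoint
    CC_EM_MODEL_NAME=example-embedding-model
    CC_EM_API_KEY=changeme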
@@ -131,7 +133,7 @@ If context_chat_backend is already deployed, you can change these environment va
 App store
 ---------
 
-You can also find the *context_chat* app in our app store, where you can write a review: `<https://apps.nextcloud.com/apps/context_chat>`_
+You can also find the ``context_chat`` app in our app store, where you can write a review: `<https://apps.nextcloud.com/apps/context_chat>`_
 
 Repository
 ----------
