AgentQnA - add support for remote server (#1900)

Signed-off-by: alexsin368 <alex.sin@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Author: alexsin368
Date: 2025-05-13 20:12:57 -07:00
Committed by: GitHub
Parent: 26d07019d0
Commit: fb53c536a3
2 changed files with 71 additions and 12 deletions


@@ -99,7 +99,7 @@ flowchart LR
#### First, clone the `GenAIExamples` repo.
```bash
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
@@ -109,7 +109,7 @@ git clone https://github.com/opea-project/GenAIExamples.git
##### For proxy environments only
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
@@ -118,14 +118,24 @@ export no_proxy="Your_No_Proxy"
##### For using open-source LLMs
Set up a [HuggingFace](https://huggingface.co/) account and generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
Then set one environment variable with the token and another with the directory where models are downloaded:
```bash
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # to avoid redownloading models
```
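Optionally, the token can be sanity-checked before launching anything; this assumes the `huggingface_hub` CLI is installed (`pip install huggingface_hub`):
```bash
# Optional: log in with the token and confirm it is valid
huggingface-cli login --token $HUGGINGFACEHUB_API_TOKEN
huggingface-cli whoami
```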
##### [Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference
To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).
To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
Then set the `OPENAI_API_KEY` environment variable to the key:
```bash
export OPENAI_API_KEY=<your-openai-key>
```
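As an optional sanity check, the key can be verified with a request to the OpenAI-compatible `/v1/models` route; the URL below targets OpenAI and is a stand-in, so substitute the remote server's base URL when using Enterprise Inference:
```bash
# Optional: list the models visible to this key.
# For a remote server, replace the host with its base URL
# (set later as LLM_ENDPOINT_URL), keeping the /v1/models path.
curl -s -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models
```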
@@ -133,16 +143,18 @@ export OPENAI_API_KEY=<your-openai-key>
##### Gaudi
```bash
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
```
##### Xeon
```bash
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
```
### 2. Launch the multi-agent system
Docker Compose makes it convenient to launch the whole system, which includes microservices for the LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. Three docker compose files are provided so users can pick and choose: a retrieval tool other than the `DocIndexRetriever` example in our GenAIExamples repo can be used, and the telemetry containers can be left out.
@@ -184,14 +196,37 @@ docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/
#### Launch on Xeon
On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.
```bash
export OPENAI_API_KEY=<your-openai-key>
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
```
##### OpenAI Models
The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
```bash
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
```
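Before proceeding, it can help to confirm the services came up; run this from the same directory with the same `-f` files as the launch command:
```bash
# Optional: list the launched services and their status
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml ps
```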
##### Models on Remote Server
When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.
###### Notes
- `OPENAI_API_KEY` is already set in a previous step.
- `model` overrides the value set for this environment variable in `set_env.sh`.
- `LLM_ENDPOINT_URL` is the base URL provided by the owner of the on-prem machine or the cloud service provider. It follows the format `https://<DNS>`, for example: `https://api.inference.example.com`.
```bash
export model=<name-of-model-card>
export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
```
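With either launch method, a quick smoke test of the agent is possible once the containers are up. This sketch assumes the supervisor agent's OpenAI-compatible API on port 9090 (the same endpoint the UI connects to in a later step) serves the standard chat-completions route:
```bash
# Hypothetical smoke test; the /v1/chat/completions route and the
# arbitrary model name are assumptions based on OpenAI compatibility.
curl -s http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "opea-agent", "messages": [{"role": "user", "content": "Hello"}]}'
```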
### 3. Ingest Data into the vector database
The `run_ingest_data.sh` script uses an example jsonl file to ingest example documents into a vector database. Other ways to ingest data, and the other document types supported, can be found in the OPEA dataprep microservice in the opea-project/GenAIComps repo.
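For orientation only, the script effectively posts the example documents to the dataprep service. A hand-rolled call might look like the hypothetical sketch below, where the port, route, and file name are assumptions that can vary by release, so prefer the script:
```bash
# Hypothetical direct ingestion call; port 6007, the /v1/dataprep route,
# and the file name are assumptions -- run_ingest_data.sh is the supported path.
curl -X POST http://localhost:6007/v1/dataprep \
  -H "Content-Type: multipart/form-data" \
  -F "files=@example_docs.jsonl"
```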
@@ -208,12 +243,18 @@ bash run_ingest_data.sh
The UI microservice is launched in the previous step with the other microservices.
To see the UI, open a web browser to `http://${ip_address}:5173`. Note that `ip_address` here is the host IP of the UI microservice.
1. Click on the arrow above `Get started`. Create an admin account with a name, email, and password.
2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user's initial, go to `Admin Settings`->`Connections`. Under `Manage OpenAI API Connections`, click on the `+` to add a connection. Fill in these fields:
- **URL**: `http://${ip_address}:9090/v1` (do not forget the `/v1`)
- **Key**: any value
- **Model IDs**: any name, e.g. `opea-agent`, then press `+` to add it
Click "Save".
![opea-agent-setting](assets/img/opea-agent-setting.png)
3. Test the OPEA agent in the UI. Return to `New Chat` and ensure the model (e.g. `opea-agent`) is selected near the upper left. Enter any prompt to interact with the agent.
![opea-agent-test](assets/img/opea-agent-test.png)

compose_remote.yaml

@@ -0,0 +1,18 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# Override layer: points each agent service at a remote
# OpenAI-compatible endpoint instead of a locally hosted model.
services:
  worker-rag-agent:
    environment:
      llm_endpoint_url: ${LLM_ENDPOINT_URL}
      api_key: ${OPENAI_API_KEY}
  worker-sql-agent:
    environment:
      llm_endpoint_url: ${LLM_ENDPOINT_URL}
      api_key: ${OPENAI_API_KEY}
  supervisor-react-agent:
    environment:
      llm_endpoint_url: ${LLM_ENDPOINT_URL}
      api_key: ${OPENAI_API_KEY}
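After launching with this override in place, one way to confirm the variables reached a container (service name taken from the file above; run with the same `-f` files as the launch command):
```bash
# Verify the override variables inside the worker RAG agent container
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml \
  -f compose_openai.yaml -f compose_remote.yaml \
  exec worker-rag-agent env | grep -i -E 'llm_endpoint|api_key'
```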