Adding files to deploy AgentQnA application on ROCm vLLM (#1613)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
This commit is contained in:
committed by
GitHub
parent
46a29cc253
commit
5cc047ce34
@@ -1,101 +1,334 @@
|
||||
# Single node on-prem deployment with Docker Compose on AMD GPU
|
||||
# Build Mega Service of AgentQnA on AMD ROCm GPU
|
||||
|
||||
This example showcases a hierarchical multi-agent system for question-answering applications. We deploy the example on Xeon. For LLMs, we use OpenAI models via API calls. For instructions on using open-source LLMs, please refer to the deployment guide [here](../../../../README.md).
|
||||
## Build Docker Images
|
||||
|
||||
## Deployment with docker
|
||||
### 1. Build Docker Image
|
||||
|
||||
1. First, clone this repo.
|
||||
```
|
||||
export WORKDIR=<your-work-directory>
|
||||
cd $WORKDIR
|
||||
git clone https://github.com/opea-project/GenAIExamples.git
|
||||
```
|
||||
2. Set up environment for this example </br>
|
||||
- #### Create application install directory and go to it:
|
||||
|
||||
```
|
||||
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
|
||||
export host_ip=$(hostname -I | awk '{print $1}')
|
||||
# if you are in a proxy environment, also set the proxy-related environment variables
|
||||
export http_proxy="Your_HTTP_Proxy"
|
||||
export https_proxy="Your_HTTPs_Proxy"
|
||||
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
|
||||
export no_proxy="Your_No_Proxy"
|
||||
```bash
|
||||
mkdir ~/agentqna-install && cd agentqna-install
|
||||
```
|
||||
|
||||
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
|
||||
#OPANAI_API_KEY if you want to use OpenAI models
|
||||
export OPENAI_API_KEY=<your-openai-key>
|
||||
# Set AMD GPU settings
|
||||
export AGENTQNA_CARD_ID="card1"
|
||||
export AGENTQNA_RENDER_ID="renderD136"
|
||||
```
|
||||
- #### Clone the repository GenAIExamples (the default repository branch "main" is used here):
|
||||
|
||||
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIExamples.git
|
||||
```
|
||||
|
||||
First, launch the mega-service.
|
||||
If you need to use a specific branch/tag of the GenAIExamples repository, then (v1.3 replace with its own value):
|
||||
|
||||
```
|
||||
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
|
||||
bash launch_retrieval_tool.sh
|
||||
```
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIExamples.git && cd GenAIExamples && git checkout v1.3
|
||||
```
|
||||
|
||||
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
|
||||
We remind you that when using a specific version of the code, you need to use the README from this version:
|
||||
|
||||
```
|
||||
bash run_ingest_data.sh
|
||||
```
|
||||
- #### Go to build directory:
|
||||
|
||||
4. Launch Tool service
|
||||
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
|
||||
```
|
||||
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
|
||||
```
|
||||
5. Launch `Agent` service
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_image_build
|
||||
```
|
||||
|
||||
```
|
||||
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
bash launch_agent_service_tgi_rocm.sh
|
||||
```
|
||||
- Cleaning up the GenAIComps repository if it was previously cloned in this directory.
|
||||
This is necessary if the build was performed earlier and the GenAIComps folder exists and is not empty:
|
||||
|
||||
6. [Optional] Build `Agent` docker image if pulling images failed.
|
||||
```bash
|
||||
echo Y | rm -R GenAIComps
|
||||
```
|
||||
|
||||
```
|
||||
git clone https://github.com/opea-project/GenAIComps.git
|
||||
cd GenAIComps
|
||||
docker build -t opea/agent:latest -f comps/agent/src/Dockerfile .
|
||||
```
|
||||
- #### Clone the repository GenAIComps (the default repository branch "main" is used here):
|
||||
|
||||
## Validate services
|
||||
|
||||
First look at logs of the agent docker containers:
|
||||
|
||||
```
|
||||
# worker agent
|
||||
docker logs rag-agent-endpoint
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIComps.git
|
||||
```
|
||||
|
||||
```
|
||||
# supervisor agent
|
||||
docker logs react-agent-endpoint
|
||||
We remind you that when using a specific version of the code, you need to use the README from this version.
|
||||
|
||||
- #### Setting the list of images for the build (from the build file.yaml)
|
||||
|
||||
If you want to deploy a vLLM-based or TGI-based application, then the set of services is installed as follows:
|
||||
|
||||
#### vLLM-based application
|
||||
|
||||
```bash
|
||||
service_list="vllm-rocm agent agent-ui"
|
||||
```
|
||||
|
||||
#### TGI-based application
|
||||
|
||||
```bash
|
||||
service_list="agent agent-ui"
|
||||
```
|
||||
|
||||
- #### Optional. Pull TGI Docker Image (Do this if you want to use TGI)
|
||||
|
||||
```bash
|
||||
docker pull ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
|
||||
```
|
||||
|
||||
- #### Build Docker Images
|
||||
|
||||
```bash
|
||||
docker compose -f build.yaml build ${service_list} --no-cache
|
||||
```
|
||||
|
||||
- #### Build DocIndexRetriever Docker Images
|
||||
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/DocIndexRetriever/docker_image_build/
|
||||
git clone https://github.com/opea-project/GenAIComps.git
|
||||
service_list="doc-index-retriever dataprep embedding retriever reranking"
|
||||
docker compose -f build.yaml build ${service_list} --no-cache
|
||||
```
|
||||
|
||||
- #### Pull DocIndexRetriever Docker Images
|
||||
|
||||
```bash
|
||||
docker pull redis/redis-stack:7.2.0-v9
|
||||
docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
```
|
||||
|
||||
After the build, we check the list of images with the command:
|
||||
|
||||
```bash
|
||||
docker image ls
|
||||
```
|
||||
|
||||
The list of images should include:
|
||||
|
||||
##### vLLM-based application:
|
||||
|
||||
- opea/vllm-rocm:latest
|
||||
- opea/agent:latest
|
||||
- redis/redis-stack:7.2.0-v9
|
||||
- ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
- opea/embedding:latest
|
||||
- opea/retriever:latest
|
||||
- opea/reranking:latest
|
||||
- opea/doc-index-retriever:latest
|
||||
|
||||
##### TGI-based application:
|
||||
|
||||
- ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
|
||||
- opea/agent:latest
|
||||
- redis/redis-stack:7.2.0-v9
|
||||
- ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
- opea/embedding:latest
|
||||
- opea/retriever:latest
|
||||
- opea/reranking:latest
|
||||
- opea/doc-index-retriever:latest
|
||||
|
||||
---
|
||||
|
||||
## Deploy the AgentQnA Application
|
||||
|
||||
### Docker Compose Configuration for AMD GPUs
|
||||
|
||||
To enable GPU support for AMD GPUs, the following configuration is added to the Docker Compose file:
|
||||
|
||||
- compose_vllm.yaml - for vLLM-based application
|
||||
- compose.yaml - for TGI-based
|
||||
|
||||
```yaml
|
||||
shm_size: 1g
|
||||
devices:
|
||||
- /dev/kfd:/dev/kfd
|
||||
- /dev/dri:/dev/dri
|
||||
cap_add:
|
||||
- SYS_PTRACE
|
||||
group_add:
|
||||
- video
|
||||
security_opt:
|
||||
- seccomp:unconfined
|
||||
```
|
||||
|
||||
You should see something like "HTTP server setup successful" if the docker containers are started successfully.</p>
|
||||
This configuration forwards all available GPUs to the container. To use a specific GPU, specify its `cardN` and `renderN` device IDs. For example:
|
||||
|
||||
Second, validate worker agent:
|
||||
|
||||
```
|
||||
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
|
||||
"query": "Most recent album by Taylor Swift"
|
||||
}'
|
||||
```yaml
|
||||
shm_size: 1g
|
||||
devices:
|
||||
- /dev/kfd:/dev/kfd
|
||||
- /dev/dri/card0:/dev/dri/card0
|
||||
- /dev/dri/render128:/dev/dri/render128
|
||||
cap_add:
|
||||
- SYS_PTRACE
|
||||
group_add:
|
||||
- video
|
||||
security_opt:
|
||||
- seccomp:unconfined
|
||||
```
|
||||
|
||||
Third, validate supervisor agent:
|
||||
**How to Identify GPU Device IDs:**
|
||||
Use AMD GPU driver utilities to determine the correct `cardN` and `renderN` IDs for your GPU.
|
||||
|
||||
```
|
||||
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
|
||||
"query": "Most recent album by Taylor Swift"
|
||||
}'
|
||||
### Set deploy environment variables
|
||||
|
||||
#### Setting variables in the operating system environment:
|
||||
|
||||
```bash
|
||||
### Replace the string 'server_address' with your local server IP address
|
||||
export host_ip='server_address'
|
||||
### Replace the string 'your_huggingfacehub_token' with your HuggingFacehub repository access token.
|
||||
export HUGGINGFACEHUB_API_TOKEN='your_huggingfacehub_token'
|
||||
### Replace the string 'your_langchain_api_key' with your LANGCHAIN API KEY.
|
||||
export LANGCHAIN_API_KEY='your_langchain_api_key'
|
||||
export LANGCHAIN_TRACING_V2=""
|
||||
```
|
||||
|
||||
## How to register your own tools with agent
|
||||
### Start the services:
|
||||
|
||||
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/src/README.md).
|
||||
#### If you use vLLM
|
||||
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
bash launch_agent_service_vllm_rocm.sh
|
||||
```
|
||||
|
||||
#### If you use TGI
|
||||
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
bash launch_agent_service_tgi_rocm.sh
|
||||
```
|
||||
|
||||
All containers should be running and should not restart:
|
||||
|
||||
##### If you use vLLM:
|
||||
|
||||
- dataprep-redis-server
|
||||
- doc-index-retriever-server
|
||||
- embedding-server
|
||||
- rag-agent-endpoint
|
||||
- react-agent-endpoint
|
||||
- redis-vector-db
|
||||
- reranking-tei-xeon-server
|
||||
- retriever-redis-server
|
||||
- sql-agent-endpoint
|
||||
- tei-embedding-server
|
||||
- tei-reranking-server
|
||||
- vllm-service
|
||||
|
||||
##### If you use TGI:
|
||||
|
||||
- agentqna-tgi-service
|
||||
- whisper-service
|
||||
- speecht5-service
|
||||
- agentqna-backend-server
|
||||
- agentqna-ui-server
|
||||
|
||||
---
|
||||
|
||||
## Validate the Services
|
||||
|
||||
### 1. Validate the vLLM/TGI Service
|
||||
|
||||
#### If you use vLLM:
|
||||
|
||||
```bash
|
||||
DATA='{"model": "Intel/neural-chat-7b-v3-3t", '\
|
||||
'"messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens": 256}'
|
||||
|
||||
curl http://${HOST_IP}:${AUDIOQNA_VLLM_SERVICE_PORT}/v1/chat/completions \
|
||||
-X POST \
|
||||
-d "$DATA" \
|
||||
-H 'Content-Type: application/json'
|
||||
```
|
||||
|
||||
Checking the response from the service. The response should be similar to JSON:
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "chatcmpl-142f34ef35b64a8db3deedd170fed951",
|
||||
"object": "chat.completion",
|
||||
"created": 1742270316,
|
||||
"model": "Intel/neural-chat-7b-v3-3",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": "",
|
||||
"tool_calls": []
|
||||
},
|
||||
"logprobs": null,
|
||||
"finish_reason": "length",
|
||||
"stop_reason": null
|
||||
}
|
||||
],
|
||||
"usage": { "prompt_tokens": 66, "total_tokens": 322, "completion_tokens": 256, "prompt_tokens_details": null },
|
||||
"prompt_logprobs": null
|
||||
}
|
||||
```
|
||||
|
||||
If the service response has a meaningful response in the value of the "choices.message.content" key,
|
||||
then we consider the vLLM service to be successfully launched
|
||||
|
||||
#### If you use TGI:
|
||||
|
||||
```bash
|
||||
DATA='{"inputs":"What is Deep Learning?",'\
|
||||
'"parameters":{"max_new_tokens":256,"do_sample": true}}'
|
||||
|
||||
curl http://${HOST_IP}:${AUDIOQNA_TGI_SERVICE_PORT}/generate \
|
||||
-X POST \
|
||||
-d "$DATA" \
|
||||
-H 'Content-Type: application/json'
|
||||
```
|
||||
|
||||
Checking the response from the service. The response should be similar to JSON:
|
||||
|
||||
```json
|
||||
{
|
||||
"generated_text": " "
|
||||
}
|
||||
```
|
||||
|
||||
If the service response has a meaningful response in the value of the "generated_text" key,
|
||||
then we consider the TGI service to be successfully launched
|
||||
|
||||
### 2. Validate MegaServices
|
||||
|
||||
Test the AgentQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the
|
||||
base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen
|
||||
to the response, decode the base64 string and save it as a .wav file.
|
||||
|
||||
```bash
|
||||
# voice can be "default" or "male"
|
||||
curl http://${host_ip}:3008/v1/agentqna \
|
||||
-X POST \
|
||||
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \
|
||||
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
|
||||
```
|
||||
|
||||
### 3. Validate MicroServices
|
||||
|
||||
```bash
|
||||
# whisper service
|
||||
curl http://${host_ip}:7066/v1/asr \
|
||||
-X POST \
|
||||
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
|
||||
-H 'Content-Type: application/json'
|
||||
|
||||
# speecht5 service
|
||||
curl http://${host_ip}:7055/v1/tts \
|
||||
-X POST \
|
||||
-d '{"text": "Who are you?"}' \
|
||||
-H 'Content-Type: application/json'
|
||||
```
|
||||
|
||||
### 4. Stop application
|
||||
|
||||
#### If you use vLLM
|
||||
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
docker compose -f compose_vllm.yaml down
|
||||
```
|
||||
|
||||
#### If you use TGI
|
||||
|
||||
```bash
|
||||
cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
docker compose -f compose.yaml down
|
||||
```
|
||||
|
||||
@@ -1,26 +1,24 @@
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
# Copyright (C) 2025 Advanced Micro Devices, Inc.
|
||||
|
||||
services:
|
||||
agent-tgi-server:
|
||||
image: ${AGENTQNA_TGI_IMAGE}
|
||||
container_name: agent-tgi-server
|
||||
tgi-service:
|
||||
image: ghcr.io/huggingface/text-generation-inference:3.0.0-rocm
|
||||
container_name: tgi-service
|
||||
ports:
|
||||
- "${AGENTQNA_TGI_SERVICE_PORT-8085}:80"
|
||||
- "${TGI_SERVICE_PORT-8085}:80"
|
||||
volumes:
|
||||
- ${HF_CACHE_DIR:-/var/opea/agent-service/}:/data
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
TGI_LLM_ENDPOINT: "http://${HOST_IP}:${AGENTQNA_TGI_SERVICE_PORT}"
|
||||
TGI_LLM_ENDPOINT: "http://${ip_address}:${TGI_SERVICE_PORT}"
|
||||
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
shm_size: 1g
|
||||
shm_size: 32g
|
||||
devices:
|
||||
- /dev/kfd:/dev/kfd
|
||||
- /dev/dri/${AGENTQNA_CARD_ID}:/dev/dri/${AGENTQNA_CARD_ID}
|
||||
- /dev/dri/${AGENTQNA_RENDER_ID}:/dev/dri/${AGENTQNA_RENDER_ID}
|
||||
- /dev/dri:/dev/dri
|
||||
cap_add:
|
||||
- SYS_PTRACE
|
||||
group_add:
|
||||
@@ -34,14 +32,14 @@ services:
|
||||
image: opea/agent:latest
|
||||
container_name: rag-agent-endpoint
|
||||
volumes:
|
||||
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
|
||||
- ${TOOLSET_PATH}:/home/user/tools/
|
||||
- "${TOOLSET_PATH}:/home/user/tools/"
|
||||
ports:
|
||||
- "9095:9095"
|
||||
- "${WORKER_RAG_AGENT_PORT:-9095}:9095"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: rag_agent_llama
|
||||
with_memory: false
|
||||
recursion_limit: ${recursion_limit_worker}
|
||||
llm_engine: tgi
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
@@ -61,21 +59,49 @@ services:
|
||||
LANGCHAIN_PROJECT: "opea-worker-agent-service"
|
||||
port: 9095
|
||||
|
||||
worker-sql-agent:
|
||||
image: opea/agent:latest
|
||||
container_name: sql-agent-endpoint
|
||||
volumes:
|
||||
- "${WORKDIR}/tests/Chinook_Sqlite.sqlite:/home/user/chinook-db/Chinook_Sqlite.sqlite:rw"
|
||||
ports:
|
||||
- "${WORKER_SQL_AGENT_PORT:-9096}:9096"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: sql_agent_llama
|
||||
with_memory: false
|
||||
db_name: ${db_name}
|
||||
db_path: ${db_path}
|
||||
use_hints: false
|
||||
recursion_limit: ${recursion_limit_worker}
|
||||
llm_engine: vllm
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
llm_endpoint_url: ${LLM_ENDPOINT_URL}
|
||||
model: ${LLM_MODEL_ID}
|
||||
temperature: ${temperature}
|
||||
max_new_tokens: ${max_new_tokens}
|
||||
stream: false
|
||||
require_human_feedback: false
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
port: 9096
|
||||
|
||||
supervisor-react-agent:
|
||||
image: opea/agent:latest
|
||||
container_name: react-agent-endpoint
|
||||
depends_on:
|
||||
- agent-tgi-server
|
||||
- worker-rag-agent
|
||||
volumes:
|
||||
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
|
||||
- ${TOOLSET_PATH}:/home/user/tools/
|
||||
- "${TOOLSET_PATH}:/home/user/tools/"
|
||||
ports:
|
||||
- "${AGENTQNA_FRONTEND_PORT}:9090"
|
||||
- "${SUPERVISOR_REACT_AGENT_PORT:-9090}:9090"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: react_langgraph
|
||||
strategy: react_llama
|
||||
with_memory: true
|
||||
recursion_limit: ${recursion_limit_supervisor}
|
||||
llm_engine: tgi
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
@@ -83,7 +109,7 @@ services:
|
||||
model: ${LLM_MODEL_ID}
|
||||
temperature: ${temperature}
|
||||
max_new_tokens: ${max_new_tokens}
|
||||
stream: false
|
||||
stream: true
|
||||
tools: /home/user/tools/supervisor_agent_tools.yaml
|
||||
require_human_feedback: false
|
||||
no_proxy: ${no_proxy}
|
||||
@@ -92,6 +118,7 @@ services:
|
||||
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
|
||||
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
|
||||
LANGCHAIN_PROJECT: "opea-supervisor-agent-service"
|
||||
CRAG_SERVER: $CRAG_SERVER
|
||||
WORKER_AGENT_URL: $WORKER_AGENT_URL
|
||||
CRAG_SERVER: ${CRAG_SERVER}
|
||||
WORKER_AGENT_URL: ${WORKER_AGENT_URL}
|
||||
SQL_AGENT_URL: ${SQL_AGENT_URL}
|
||||
port: 9090
|
||||
|
||||
128
AgentQnA/docker_compose/amd/gpu/rocm/compose_vllm.yaml
Normal file
128
AgentQnA/docker_compose/amd/gpu/rocm/compose_vllm.yaml
Normal file
@@ -0,0 +1,128 @@
|
||||
# Copyright (C) 2025 Advanced Micro Devices, Inc.
|
||||
|
||||
services:
|
||||
vllm-service:
|
||||
image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}
|
||||
container_name: vllm-service
|
||||
ports:
|
||||
- "${VLLM_SERVICE_PORT:-8081}:8011"
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
HF_HUB_DISABLE_PROGRESS_BARS: 1
|
||||
HF_HUB_ENABLE_HF_TRANSFER: 0
|
||||
WILM_USE_TRITON_FLASH_ATTENTION: 0
|
||||
PYTORCH_JIT: 0
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 20G
|
||||
devices:
|
||||
- /dev/kfd:/dev/kfd
|
||||
- /dev/dri/:/dev/dri/
|
||||
cap_add:
|
||||
- SYS_PTRACE
|
||||
group_add:
|
||||
- video
|
||||
security_opt:
|
||||
- seccomp:unconfined
|
||||
- apparmor=unconfined
|
||||
command: "--model ${VLLM_LLM_MODEL_ID} --swap-space 16 --disable-log-requests --dtype float16 --tensor-parallel-size 4 --host 0.0.0.0 --port 8011 --num-scheduler-steps 1 --distributed-executor-backend \"mp\""
|
||||
ipc: host
|
||||
|
||||
worker-rag-agent:
|
||||
image: opea/agent:latest
|
||||
container_name: rag-agent-endpoint
|
||||
volumes:
|
||||
- ${TOOLSET_PATH}:/home/user/tools/
|
||||
ports:
|
||||
- "${WORKER_RAG_AGENT_PORT:-9095}:9095"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: rag_agent_llama
|
||||
with_memory: false
|
||||
recursion_limit: ${recursion_limit_worker}
|
||||
llm_engine: vllm
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
llm_endpoint_url: ${LLM_ENDPOINT_URL}
|
||||
model: ${LLM_MODEL_ID}
|
||||
temperature: ${temperature}
|
||||
max_new_tokens: ${max_new_tokens}
|
||||
stream: false
|
||||
tools: /home/user/tools/worker_agent_tools.yaml
|
||||
require_human_feedback: false
|
||||
RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
|
||||
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
|
||||
LANGCHAIN_PROJECT: "opea-worker-agent-service"
|
||||
port: 9095
|
||||
|
||||
worker-sql-agent:
|
||||
image: opea/agent:latest
|
||||
container_name: sql-agent-endpoint
|
||||
volumes:
|
||||
- "${WORKDIR}/tests/Chinook_Sqlite.sqlite:/home/user/chinook-db/Chinook_Sqlite.sqlite:rw"
|
||||
ports:
|
||||
- "${WORKER_SQL_AGENT_PORT:-9096}:9096"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: sql_agent_llama
|
||||
with_memory: false
|
||||
db_name: ${db_name}
|
||||
db_path: ${db_path}
|
||||
use_hints: false
|
||||
recursion_limit: ${recursion_limit_worker}
|
||||
llm_engine: vllm
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
llm_endpoint_url: ${LLM_ENDPOINT_URL}
|
||||
model: ${LLM_MODEL_ID}
|
||||
temperature: ${temperature}
|
||||
max_new_tokens: ${max_new_tokens}
|
||||
stream: false
|
||||
require_human_feedback: false
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
port: 9096
|
||||
|
||||
supervisor-react-agent:
|
||||
image: opea/agent:latest
|
||||
container_name: react-agent-endpoint
|
||||
depends_on:
|
||||
- worker-rag-agent
|
||||
volumes:
|
||||
- ${TOOLSET_PATH}:/home/user/tools/
|
||||
ports:
|
||||
- "${SUPERVISOR_REACT_AGENT_PORT:-9090}:9090"
|
||||
ipc: host
|
||||
environment:
|
||||
ip_address: ${ip_address}
|
||||
strategy: react_llama
|
||||
with_memory: true
|
||||
recursion_limit: ${recursion_limit_supervisor}
|
||||
llm_engine: vllm
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
llm_endpoint_url: ${LLM_ENDPOINT_URL}
|
||||
model: ${LLM_MODEL_ID}
|
||||
temperature: ${temperature}
|
||||
max_new_tokens: ${max_new_tokens}
|
||||
stream: true
|
||||
tools: /home/user/tools/supervisor_agent_tools.yaml
|
||||
require_human_feedback: false
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
|
||||
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
|
||||
LANGCHAIN_PROJECT: "opea-supervisor-agent-service"
|
||||
CRAG_SERVER: ${CRAG_SERVER}
|
||||
WORKER_AGENT_URL: ${WORKER_AGENT_URL}
|
||||
SQL_AGENT_URL: ${SQL_AGENT_URL}
|
||||
port: 9090
|
||||
@@ -1,47 +1,87 @@
|
||||
# Copyright (C) 2024 Advanced Micro Devices, Inc.
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
WORKPATH=$(dirname "$PWD")/..
|
||||
export ip_address=${host_ip}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
|
||||
export AGENTQNA_TGI_IMAGE=ghcr.io/huggingface/text-generation-inference:2.4.1-rocm
|
||||
export AGENTQNA_TGI_SERVICE_PORT="8085"
|
||||
# Before start script:
|
||||
# export host_ip="your_host_ip_or_host_name"
|
||||
# export HUGGINGFACEHUB_API_TOKEN="your_huggingface_api_token"
|
||||
# export LANGCHAIN_API_KEY="your_langchain_api_key"
|
||||
# export LANGCHAIN_TRACING_V2=""
|
||||
|
||||
# LLM related environment variables
|
||||
export AGENTQNA_CARD_ID="card1"
|
||||
export AGENTQNA_RENDER_ID="renderD136"
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
ls $HF_CACHE_DIR
|
||||
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
|
||||
#export NUM_SHARDS=4
|
||||
export LLM_ENDPOINT_URL="http://${ip_address}:${AGENTQNA_TGI_SERVICE_PORT}"
|
||||
# Set server hostname or IP address
|
||||
export ip_address=${host_ip}
|
||||
|
||||
# Set services IP ports
|
||||
export TGI_SERVICE_PORT="18110"
|
||||
export WORKER_RAG_AGENT_PORT="18111"
|
||||
export WORKER_SQL_AGENT_PORT="18112"
|
||||
export SUPERVISOR_REACT_AGENT_PORT="18113"
|
||||
export CRAG_SERVER_PORT="18114"
|
||||
|
||||
export WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=${WORKPATH}/../../../
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
|
||||
export HF_CACHE_DIR="./data"
|
||||
export MODEL_CACHE="./data"
|
||||
export TOOLSET_PATH=${WORKPATH}/../../../tools/
|
||||
export recursion_limit_worker=12
|
||||
export LLM_ENDPOINT_URL=http://${ip_address}:${TGI_SERVICE_PORT}
|
||||
export temperature=0.01
|
||||
export max_new_tokens=512
|
||||
|
||||
# agent related environment variables
|
||||
export AGENTQNA_WORKER_AGENT_SERVICE_PORT="9095"
|
||||
export TOOLSET_PATH=/home/huggingface/datamonsters/amd-opea/GenAIExamples/AgentQnA/tools/
|
||||
echo "TOOLSET_PATH=${TOOLSET_PATH}"
|
||||
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
|
||||
export LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
|
||||
export LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2}
|
||||
export db_name=Chinook
|
||||
export db_path="sqlite:////home/user/chinook-db/Chinook_Sqlite.sqlite"
|
||||
export recursion_limit_worker=12
|
||||
export recursion_limit_supervisor=10
|
||||
export WORKER_AGENT_URL="http://${ip_address}:${AGENTQNA_WORKER_AGENT_SERVICE_PORT}/v1/chat/completions"
|
||||
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
|
||||
export CRAG_SERVER=http://${ip_address}:18881
|
||||
|
||||
export AGENTQNA_FRONTEND_PORT="9090"
|
||||
|
||||
#retrieval_tool
|
||||
export CRAG_SERVER=http://${ip_address}:${CRAG_SERVER_PORT}
|
||||
export WORKER_AGENT_URL="http://${ip_address}:${WORKER_RAG_AGENT_PORT}/v1/chat/completions"
|
||||
export SQL_AGENT_URL="http://${ip_address}:${WORKER_SQL_AGENT_PORT}/v1/chat/completions"
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export no_proxy=${no_proxy}
|
||||
export http_proxy=${http_proxy}
|
||||
export https_proxy=${https_proxy}
|
||||
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
|
||||
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
|
||||
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
|
||||
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
|
||||
export REDIS_URL="redis://${host_ip}:26379"
|
||||
export REDIS_URL="redis://${host_ip}:6379"
|
||||
export INDEX_NAME="rag-redis"
|
||||
export RERANK_TYPE="tei"
|
||||
export MEGA_SERVICE_HOST_IP=${host_ip}
|
||||
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
|
||||
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
|
||||
export RERANK_SERVICE_HOST_IP=${host_ip}
|
||||
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
|
||||
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete"
|
||||
|
||||
echo ${WORKER_RAG_AGENT_PORT} > ${WORKPATH}/WORKER_RAG_AGENT_PORT_tmp
|
||||
echo ${WORKER_SQL_AGENT_PORT} > ${WORKPATH}/WORKER_SQL_AGENT_PORT_tmp
|
||||
echo ${SUPERVISOR_REACT_AGENT_PORT} > ${WORKPATH}/SUPERVISOR_REACT_AGENT_PORT_tmp
|
||||
echo ${CRAG_SERVER_PORT} > ${WORKPATH}/CRAG_SERVER_PORT_tmp
|
||||
|
||||
echo "Downloading chinook data..."
|
||||
echo Y | rm -R chinook-database
|
||||
git clone https://github.com/lerocha/chinook-database.git
|
||||
echo Y | rm -R ../../../../../AgentQnA/tests/Chinook_Sqlite.sqlite
|
||||
cp chinook-database/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite ../../../../../AgentQnA/tests
|
||||
|
||||
docker compose -f ../../../../../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml up -d
|
||||
docker compose -f compose.yaml up -d
|
||||
|
||||
n=0
|
||||
until [[ "$n" -ge 100 ]]; do
|
||||
docker logs tgi-service > ${WORKPATH}/tgi_service_start.log
|
||||
if grep -q Connected ${WORKPATH}/tgi_service_start.log; then
|
||||
break
|
||||
fi
|
||||
sleep 10s
|
||||
n=$((n+1))
|
||||
done
|
||||
|
||||
echo "Starting CRAG server"
|
||||
docker run -d --runtime=runc --name=kdd-cup-24-crag-service -p=${CRAG_SERVER_PORT}:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
|
||||
|
||||
@@ -0,0 +1,88 @@
|
||||
# Copyright (C) 2024 Advanced Micro Devices, Inc.
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
# Before start script:
|
||||
# export host_ip="your_host_ip_or_host_name"
|
||||
# export HUGGINGFACEHUB_API_TOKEN="your_huggingface_api_token"
|
||||
# export LANGCHAIN_API_KEY="your_langchain_api_key"
|
||||
# export LANGCHAIN_TRACING_V2=""
|
||||
|
||||
# Set server hostname or IP address
|
||||
export ip_address=${host_ip}
|
||||
|
||||
# Set services IP ports
|
||||
export VLLM_SERVICE_PORT="18110"
|
||||
export WORKER_RAG_AGENT_PORT="18111"
|
||||
export WORKER_SQL_AGENT_PORT="18112"
|
||||
export SUPERVISOR_REACT_AGENT_PORT="18113"
|
||||
export CRAG_SERVER_PORT="18114"
|
||||
|
||||
export WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=${WORKPATH}/../../../
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export VLLM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
|
||||
export HF_CACHE_DIR="./data"
|
||||
export MODEL_CACHE="./data"
|
||||
export TOOLSET_PATH=${WORKPATH}/../../../tools/
|
||||
export recursion_limit_worker=12
|
||||
export LLM_ENDPOINT_URL=http://${ip_address}:${VLLM_SERVICE_PORT}
|
||||
export LLM_MODEL_ID=${VLLM_LLM_MODEL_ID}
|
||||
export temperature=0.01
|
||||
export max_new_tokens=512
|
||||
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
|
||||
export LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
|
||||
export LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2}
|
||||
export db_name=Chinook
|
||||
export db_path="sqlite:////home/user/chinook-db/Chinook_Sqlite.sqlite"
|
||||
export recursion_limit_worker=12
|
||||
export recursion_limit_supervisor=10
|
||||
export CRAG_SERVER=http://${ip_address}:${CRAG_SERVER_PORT}
|
||||
export WORKER_AGENT_URL="http://${ip_address}:${WORKER_RAG_AGENT_PORT}/v1/chat/completions"
|
||||
export SQL_AGENT_URL="http://${ip_address}:${WORKER_SQL_AGENT_PORT}/v1/chat/completions"
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export no_proxy=${no_proxy}
|
||||
export http_proxy=${http_proxy}
|
||||
export https_proxy=${https_proxy}
|
||||
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
|
||||
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
|
||||
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
|
||||
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
|
||||
export REDIS_URL="redis://${host_ip}:6379"
|
||||
export INDEX_NAME="rag-redis"
|
||||
export RERANK_TYPE="tei"
|
||||
export MEGA_SERVICE_HOST_IP=${host_ip}
|
||||
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
|
||||
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
|
||||
export RERANK_SERVICE_HOST_IP=${host_ip}
|
||||
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
|
||||
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete"
|
||||
|
||||
echo ${WORKER_RAG_AGENT_PORT} > ${WORKPATH}/WORKER_RAG_AGENT_PORT_tmp
|
||||
echo ${WORKER_SQL_AGENT_PORT} > ${WORKPATH}/WORKER_SQL_AGENT_PORT_tmp
|
||||
echo ${SUPERVISOR_REACT_AGENT_PORT} > ${WORKPATH}/SUPERVISOR_REACT_AGENT_PORT_tmp
|
||||
echo ${CRAG_SERVER_PORT} > ${WORKPATH}/CRAG_SERVER_PORT_tmp
|
||||
|
||||
echo "Downloading chinook data..."
|
||||
echo Y | rm -R chinook-database
|
||||
git clone https://github.com/lerocha/chinook-database.git
|
||||
echo Y | rm -R ../../../../../AgentQnA/tests/Chinook_Sqlite.sqlite
|
||||
cp chinook-database/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite ../../../../../AgentQnA/tests
|
||||
|
||||
docker compose -f ../../../../../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml up -d
|
||||
docker compose -f compose_vllm.yaml up -d
|
||||
|
||||
n=0
|
||||
until [[ "$n" -ge 500 ]]; do
|
||||
docker logs vllm-service >& "${WORKPATH}"/vllm-service_start.log
|
||||
if grep -q "Application startup complete" "${WORKPATH}"/vllm-service_start.log; then
|
||||
break
|
||||
fi
|
||||
sleep 20s
|
||||
n=$((n+1))
|
||||
done
|
||||
|
||||
echo "Starting CRAG server"
|
||||
docker run -d --runtime=runc --name=kdd-cup-24-crag-service -p=${CRAG_SERVER_PORT}:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
|
||||
@@ -14,7 +14,7 @@ export AGENTQNA_CARD_ID="card1"
|
||||
export AGENTQNA_RENDER_ID="renderD136"
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
ls $HF_CACHE_DIR
|
||||
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
|
||||
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
|
||||
export NUM_SHARDS=4
|
||||
export LLM_ENDPOINT_URL="http://${ip_address}:${AGENTQNA_TGI_SERVICE_PORT}"
|
||||
export temperature=0.01
|
||||
@@ -44,3 +44,19 @@ export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
|
||||
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete"
|
||||
|
||||
echo "Removing chinook data..."
|
||||
echo Y | rm -R chinook-database
|
||||
if [ -d "chinook-database" ]; then
|
||||
rm -rf chinook-database
|
||||
fi
|
||||
echo "Chinook data removed!"
|
||||
|
||||
echo "Stopping CRAG server"
|
||||
docker rm kdd-cup-24-crag-service --force
|
||||
|
||||
echo "Stopping Agent services"
|
||||
docker compose -f compose.yaml down
|
||||
|
||||
echo "Stopping Retrieval services"
|
||||
docker compose -f ../../../../../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml down
|
||||
@@ -0,0 +1,84 @@
|
||||
# Copyright (C) 2024 Advanced Micro Devices, Inc.
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
|
||||
# Before start script:
|
||||
# export host_ip="your_host_ip_or_host_name"
|
||||
# export HUGGINGFACEHUB_API_TOKEN="your_huggingface_api_token"
|
||||
# export LANGCHAIN_API_KEY="your_langchain_api_key"
|
||||
# export LANGCHAIN_TRACING_V2=""
|
||||
|
||||
# Set server hostname or IP address
|
||||
export ip_address=${host_ip}
|
||||
|
||||
# Set services IP ports
|
||||
export VLLM_SERVICE_PORT="18110"
|
||||
export WORKER_RAG_AGENT_PORT="18111"
|
||||
export WORKER_SQL_AGENT_PORT="18112"
|
||||
export SUPERVISOR_REACT_AGENT_PORT="18113"
|
||||
export CRAG_SERVER_PORT="18114"
|
||||
|
||||
export WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=${WORKPATH}/../../../
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export VLLM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
|
||||
export HF_CACHE_DIR="./data"
|
||||
export MODEL_CACHE="./data"
|
||||
export TOOLSET_PATH=${WORKPATH}/../../../tools/
|
||||
export recursion_limit_worker=12
|
||||
export LLM_ENDPOINT_URL=http://${ip_address}:${VLLM_SERVICE_PORT}
|
||||
export LLM_MODEL_ID=${VLLM_LLM_MODEL_ID}
|
||||
export temperature=0.01
|
||||
export max_new_tokens=512
|
||||
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
|
||||
export LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
|
||||
export LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2}
|
||||
export db_name=Chinook
|
||||
export db_path="sqlite:////home/user/chinook-db/Chinook_Sqlite.sqlite"
|
||||
export recursion_limit_worker=12
|
||||
export recursion_limit_supervisor=10
|
||||
export CRAG_SERVER=http://${ip_address}:${CRAG_SERVER_PORT}
|
||||
export WORKER_AGENT_URL="http://${ip_address}:${WORKER_RAG_AGENT_PORT}/v1/chat/completions"
|
||||
export SQL_AGENT_URL="http://${ip_address}:${WORKER_SQL_AGENT_PORT}/v1/chat/completions"
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export no_proxy=${no_proxy}
|
||||
export http_proxy=${http_proxy}
|
||||
export https_proxy=${https_proxy}
|
||||
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
|
||||
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
|
||||
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
|
||||
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
|
||||
export REDIS_URL="redis://${host_ip}:6379"
|
||||
export INDEX_NAME="rag-redis"
|
||||
export RERANK_TYPE="tei"
|
||||
export MEGA_SERVICE_HOST_IP=${host_ip}
|
||||
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
|
||||
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
|
||||
export RERANK_SERVICE_HOST_IP=${host_ip}
|
||||
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
|
||||
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete"
|
||||
|
||||
echo ${WORKER_RAG_AGENT_PORT} > ${WORKPATH}/WORKER_RAG_AGENT_PORT_tmp
|
||||
echo ${WORKER_SQL_AGENT_PORT} > ${WORKPATH}/WORKER_SQL_AGENT_PORT_tmp
|
||||
echo ${SUPERVISOR_REACT_AGENT_PORT} > ${WORKPATH}/SUPERVISOR_REACT_AGENT_PORT_tmp
|
||||
echo ${CRAG_SERVER_PORT} > ${WORKPATH}/CRAG_SERVER_PORT_tmp
|
||||
|
||||
echo "Removing chinook data..."
|
||||
echo Y | rm -R chinook-database
|
||||
if [ -d "chinook-database" ]; then
|
||||
rm -rf chinook-database
|
||||
fi
|
||||
echo "Chinook data removed!"
|
||||
|
||||
echo "Stopping CRAG server"
|
||||
docker rm kdd-cup-24-crag-service --force
|
||||
|
||||
echo "Stopping Agent services"
|
||||
docker compose -f compose_vllm.yaml down
|
||||
|
||||
echo "Stopping Retrieval services"
|
||||
docker compose -f ../../../../../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml down
|
||||
@@ -17,3 +17,12 @@ services:
|
||||
dockerfile: ./docker/Dockerfile
|
||||
extends: agent
|
||||
image: ${REGISTRY:-opea}/agent-ui:${TAG:-latest}
|
||||
vllm-rocm:
|
||||
build:
|
||||
args:
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
no_proxy: ${no_proxy}
|
||||
context: GenAIComps
|
||||
dockerfile: comps/third_parties/vllm/src/Dockerfile.amd_gpu
|
||||
image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}
|
||||
|
||||
64
AgentQnA/tests/step1_build_images_rocm_vllm.sh
Normal file
64
AgentQnA/tests/step1_build_images_rocm_vllm.sh
Normal file
@@ -0,0 +1,64 @@
|
||||
#!/bin/bash
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
export WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=${WORKPATH}/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
|
||||
|
||||
function get_genai_comps() {
|
||||
if [ ! -d "GenAIComps" ] ; then
|
||||
git clone --depth 1 --branch ${opea_branch:-"main"} https://github.com/opea-project/GenAIComps.git
|
||||
fi
|
||||
}
|
||||
|
||||
|
||||
function build_docker_images_for_retrieval_tool(){
|
||||
cd $WORKPATH/../DocIndexRetriever/docker_image_build/
|
||||
get_genai_comps
|
||||
echo "Build all the images with --no-cache..."
|
||||
service_list="doc-index-retriever dataprep embedding retriever reranking"
|
||||
docker compose -f build.yaml build ${service_list} --no-cache
|
||||
docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
|
||||
docker images && sleep 3s
|
||||
}
|
||||
|
||||
function build_agent_docker_image() {
|
||||
cd $WORKPATH/docker_image_build/
|
||||
get_genai_comps
|
||||
echo "Build agent image with --no-cache..."
|
||||
docker compose -f build.yaml build --no-cache
|
||||
|
||||
docker images && sleep 3s
|
||||
}
|
||||
|
||||
#function build_vllm_docker_image() {
|
||||
# echo "Building the vllm docker image"
|
||||
# cd $WORKPATH/
|
||||
# docker build --no-cache -t opea/llm-vllm-rocm:ci -f Dockerfile-vllm-rocm .
|
||||
#
|
||||
# docker images && sleep 3s
|
||||
#}
|
||||
|
||||
|
||||
function main() {
|
||||
echo "==================== Build docker images for retrieval tool ===================="
|
||||
build_docker_images_for_retrieval_tool
|
||||
echo "==================== Build docker images for retrieval tool completed ===================="
|
||||
|
||||
echo "==================== Build agent docker image ===================="
|
||||
build_agent_docker_image
|
||||
echo "==================== Build agent docker image completed ===================="
|
||||
|
||||
# echo "==================== Build vllm docker image ===================="
|
||||
# build_vllm_docker_image
|
||||
# echo "==================== Build vllm docker image completed ===================="
|
||||
|
||||
docker image ls | grep vllm
|
||||
}
|
||||
|
||||
main
|
||||
49
AgentQnA/tests/step2_start_retrieval_tool_rocm_vllm.sh
Normal file
49
AgentQnA/tests/step2_start_retrieval_tool_rocm_vllm.sh
Normal file
@@ -0,0 +1,49 @@
|
||||
#!/bin/bash
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=$WORKPATH/../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export host_ip=${ip_address}
|
||||
|
||||
export HF_CACHE_DIR=$WORKPATH/hf_cache
|
||||
if [ ! -d "$HF_CACHE_DIR" ]; then
|
||||
echo "Creating HF_CACHE directory"
|
||||
mkdir -p "$HF_CACHE_DIR"
|
||||
fi
|
||||
|
||||
function start_retrieval_tool() {
|
||||
echo "Starting Retrieval tool"
|
||||
cd $WORKPATH/../DocIndexRetriever/docker_compose/intel/cpu/xeon
|
||||
host_ip=$(hostname -I | awk '{print $1}')
|
||||
export HF_CACHE_DIR=${HF_CACHE_DIR}
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export no_proxy=${no_proxy}
|
||||
export http_proxy=${http_proxy}
|
||||
export https_proxy=${https_proxy}
|
||||
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
|
||||
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
|
||||
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
|
||||
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
|
||||
export REDIS_URL="redis://${host_ip}:6379"
|
||||
export INDEX_NAME="rag-redis"
|
||||
export RERANK_TYPE="tei"
|
||||
export MEGA_SERVICE_HOST_IP=${host_ip}
|
||||
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
|
||||
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
|
||||
export RERANK_SERVICE_HOST_IP=${host_ip}
|
||||
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
|
||||
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
|
||||
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get"
|
||||
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete"
|
||||
|
||||
docker compose -f compose.yaml up -d
|
||||
}
|
||||
|
||||
echo "==================== Start retrieval tool ===================="
|
||||
start_retrieval_tool
|
||||
sleep 20 # needed for downloading the models
|
||||
echo "==================== Retrieval tool started ===================="
|
||||
@@ -0,0 +1,68 @@
|
||||
#!/bin/bash
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
export WORKDIR=$WORKPATH/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export host_ip=$ip_address
|
||||
echo "ip_address=${ip_address}"
|
||||
|
||||
|
||||
function validate() {
|
||||
local CONTENT="$1"
|
||||
local EXPECTED_RESULT="$2"
|
||||
local SERVICE_NAME="$3"
|
||||
|
||||
if echo "$CONTENT" | grep -q "$EXPECTED_RESULT"; then
|
||||
echo "[ $SERVICE_NAME ] Content is as expected: $CONTENT"
|
||||
echo 0
|
||||
else
|
||||
echo "[ $SERVICE_NAME ] Content does not match the expected result: $CONTENT"
|
||||
echo 1
|
||||
fi
|
||||
}
|
||||
|
||||
function ingest_data_and_validate() {
|
||||
echo "Ingesting data"
|
||||
cd $WORKPATH/retrieval_tool/
|
||||
echo $PWD
|
||||
local CONTENT=$(bash run_ingest_data.sh)
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Data preparation succeeded" "dataprep-redis-server")
|
||||
echo "$EXIT_CODE"
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
echo "return value is $EXIT_CODE"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs dataprep-redis-server
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
function validate_retrieval_tool() {
|
||||
echo "----------------Test retrieval tool ----------------"
|
||||
local CONTENT=$(http_proxy="" curl http://${ip_address}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
|
||||
"text": "Who sang Thriller"
|
||||
}')
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "retrieval-tool")
|
||||
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs retrievaltool-xeon-backend-server
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
|
||||
function main(){
|
||||
|
||||
echo "==================== Ingest data ===================="
|
||||
ingest_data_and_validate
|
||||
echo "==================== Data ingestion completed ===================="
|
||||
|
||||
echo "==================== Validate retrieval tool ===================="
|
||||
validate_retrieval_tool
|
||||
echo "==================== Retrieval tool validated ===================="
|
||||
}
|
||||
|
||||
main
|
||||
120
AgentQnA/tests/step4_launch_and_validate_agent_rocm_vllm.sh
Normal file
120
AgentQnA/tests/step4_launch_and_validate_agent_rocm_vllm.sh
Normal file
@@ -0,0 +1,120 @@
|
||||
#!/bin/bash
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
export LOG_PATH=${WORKPATH}
|
||||
export WORKDIR=$WORKPATH/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export host_ip=${ip_address}
|
||||
export TOOLSET_PATH=$WORKPATH/tools/
|
||||
|
||||
export HF_CACHE_DIR=$WORKPATH/data2/huggingface
|
||||
if [ ! -d "$HF_CACHE_DIR" ]; then
|
||||
HF_CACHE_DIR=$WORKDIR/hf_cache
|
||||
mkdir -p "$HF_CACHE_DIR"
|
||||
fi
|
||||
|
||||
function download_chinook_data(){
|
||||
echo "Downloading chinook data..."
|
||||
cd $WORKDIR
|
||||
git clone https://github.com/lerocha/chinook-database.git
|
||||
cp chinook-database/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite ${WORKPATH}/tests/
|
||||
}
|
||||
|
||||
function start_agent_and_api_server() {
|
||||
echo "Starting Agent services"
|
||||
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
bash launch_agent_service_vllm_rocm.sh
|
||||
}
|
||||
|
||||
function validate() {
|
||||
local CONTENT="$1"
|
||||
local EXPECTED_RESULT="$2"
|
||||
local SERVICE_NAME="$3"
|
||||
|
||||
if echo "$CONTENT" | grep -q "$EXPECTED_RESULT"; then
|
||||
echo "[ $SERVICE_NAME ] Content is as expected: $CONTENT"
|
||||
echo 0
|
||||
else
|
||||
echo "[ $SERVICE_NAME ] Content does not match the expected result: $CONTENT"
|
||||
echo 1
|
||||
fi
|
||||
}
|
||||
|
||||
function validate_agent_service() {
|
||||
# # test worker rag agent
|
||||
echo "======================Testing worker rag agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/WORKER_RAG_AGENT_PORT_tmp)
|
||||
prompt="Tell me about Michael Jackson song Thriller"
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
|
||||
# echo $CONTENT
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
|
||||
echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs rag-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# test worker sql agent
|
||||
echo "======================Testing worker sql agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/WORKER_SQL_AGENT_PORT_tmp)
|
||||
prompt="How many employees are there in the company?"
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
|
||||
local EXIT_CODE=$(validate "$CONTENT" "8" "sql-agent-endpoint")
|
||||
echo $CONTENT
|
||||
# echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs sql-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# test supervisor react agent
|
||||
echo "======================Testing supervisor react agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/SUPERVISOR_REACT_AGENT_PORT_tmp)
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream)
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Iron" "react-agent-endpoint")
|
||||
# echo $CONTENT
|
||||
echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs react-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
}
|
||||
|
||||
function remove_chinook_data(){
|
||||
echo "Removing chinook data..."
|
||||
cd $WORKDIR
|
||||
if [ -d "chinook-database" ]; then
|
||||
rm -rf chinook-database
|
||||
fi
|
||||
echo "Chinook data removed!"
|
||||
}
|
||||
|
||||
function main() {
|
||||
echo "==================== Prepare data ===================="
|
||||
download_chinook_data
|
||||
echo "==================== Data prepare done ===================="
|
||||
|
||||
echo "==================== Start agent ===================="
|
||||
start_agent_and_api_server
|
||||
echo "==================== Agent started ===================="
|
||||
|
||||
echo "==================== Validate agent service ===================="
|
||||
validate_agent_service
|
||||
echo "==================== Agent service validated ===================="
|
||||
}
|
||||
|
||||
|
||||
remove_chinook_data
|
||||
|
||||
main
|
||||
|
||||
remove_chinook_data
|
||||
@@ -2,26 +2,30 @@
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -ex
|
||||
set -e
|
||||
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
export LOG_PATH=${WORKPATH}
|
||||
export WORKDIR=$WORKPATH/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export host_ip=${ip_address}
|
||||
export TOOLSET_PATH=$WORKPATH/tools/
|
||||
|
||||
export HF_CACHE_DIR=${model_cache:-"$WORKDIR/hf_cache"}
|
||||
export HF_CACHE_DIR=$WORKPATH/data2/huggingface
|
||||
if [ ! -d "$HF_CACHE_DIR" ]; then
|
||||
HF_CACHE_DIR=$WORKDIR/hf_cache
|
||||
mkdir -p "$HF_CACHE_DIR"
|
||||
fi
|
||||
ls $HF_CACHE_DIR
|
||||
|
||||
function download_chinook_data(){
|
||||
echo "Downloading chinook data..."
|
||||
cd $WORKDIR
|
||||
git clone https://github.com/lerocha/chinook-database.git
|
||||
cp chinook-database/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite ${WORKPATH}/tests/
|
||||
}
|
||||
|
||||
function start_agent_and_api_server() {
|
||||
echo "Starting CRAG server"
|
||||
docker run -d --runtime=runc --name=kdd-cup-24-crag-service -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
|
||||
|
||||
echo "Starting Agent services"
|
||||
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
|
||||
bash launch_agent_service_tgi_rocm.sh
|
||||
@@ -42,28 +46,63 @@ function validate() {
|
||||
}
|
||||
|
||||
function validate_agent_service() {
|
||||
echo "----------------Test agent ----------------"
|
||||
local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
|
||||
"query": "Tell me about Michael Jackson song thriller"
|
||||
}')
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
|
||||
docker logs rag-agent-endpoint
|
||||
# # test worker rag agent
|
||||
echo "======================Testing worker rag agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/WORKER_RAG_AGENT_PORT_tmp)
|
||||
prompt="Tell me about Michael Jackson song Thriller"
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
|
||||
# echo $CONTENT
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
|
||||
echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs rag-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
|
||||
"query": "Tell me about Michael Jackson song thriller"
|
||||
}')
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
|
||||
docker logs react-agent-endpoint
|
||||
# test worker sql agent
|
||||
echo "======================Testing worker sql agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/WORKER_SQL_AGENT_PORT_tmp)
|
||||
prompt="How many employees are there in the company?"
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
|
||||
local EXIT_CODE=$(validate "$CONTENT" "8" "sql-agent-endpoint")
|
||||
echo $CONTENT
|
||||
# echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs sql-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# test supervisor react agent
|
||||
echo "======================Testing supervisor react agent======================"
|
||||
export agent_port=$(cat ${WORKPATH}/docker_compose/amd/gpu/SUPERVISOR_REACT_AGENT_PORT_tmp)
|
||||
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream)
|
||||
local EXIT_CODE=$(validate "$CONTENT" "Iron" "react-agent-endpoint")
|
||||
# echo $CONTENT
|
||||
echo $EXIT_CODE
|
||||
local EXIT_CODE="${EXIT_CODE:0-1}"
|
||||
if [ "$EXIT_CODE" == "1" ]; then
|
||||
docker logs react-agent-endpoint
|
||||
exit 1
|
||||
fi
|
||||
|
||||
}
|
||||
|
||||
function remove_chinook_data(){
|
||||
echo "Removing chinook data..."
|
||||
cd $WORKDIR
|
||||
if [ -d "chinook-database" ]; then
|
||||
rm -rf chinook-database
|
||||
fi
|
||||
echo "Chinook data removed!"
|
||||
}
|
||||
|
||||
function main() {
|
||||
echo "==================== Prepare data ===================="
|
||||
download_chinook_data
|
||||
echo "==================== Data prepare done ===================="
|
||||
|
||||
echo "==================== Start agent ===================="
|
||||
start_agent_and_api_server
|
||||
echo "==================== Agent started ===================="
|
||||
@@ -73,4 +112,9 @@ function main() {
|
||||
echo "==================== Agent service validated ===================="
|
||||
}
|
||||
|
||||
|
||||
remove_chinook_data
|
||||
|
||||
main
|
||||
|
||||
remove_chinook_data
|
||||
|
||||
@@ -5,11 +5,13 @@
|
||||
set -xe
|
||||
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
ls $WORKPATH
|
||||
export WORKDIR=$WORKPATH/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
|
||||
export TOOLSET_PATH=$WORKPATH/tools/
|
||||
export MODEL_CACHE="./data"
|
||||
|
||||
function stop_crag() {
|
||||
cid=$(docker ps -aq --filter "name=kdd-cup-24-crag-service")
|
||||
@@ -19,13 +21,7 @@ function stop_crag() {
|
||||
|
||||
function stop_agent_docker() {
|
||||
cd $WORKPATH/docker_compose/amd/gpu/rocm
|
||||
# docker compose -f compose.yaml down
|
||||
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
|
||||
for container_name in $container_list; do
|
||||
cid=$(docker ps -aq --filter "name=$container_name")
|
||||
echo "Stopping container $container_name"
|
||||
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
|
||||
done
|
||||
bash stop_agent_service_tgi_rocm.sh
|
||||
}
|
||||
|
||||
function stop_retrieval_tool() {
|
||||
|
||||
66
AgentQnA/tests/test_compose_vllm_on_rocm.sh
Normal file
66
AgentQnA/tests/test_compose_vllm_on_rocm.sh
Normal file
@@ -0,0 +1,66 @@
|
||||
#!/bin/bash
|
||||
# Copyright (C) 2024 Advanced Micro Devices, Inc.
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
WORKPATH=$(dirname "$PWD")
|
||||
export LOG_PATH=${WORKPATH}
|
||||
export WORKDIR=${WORKPATH}/../../
|
||||
echo "WORKDIR=${WORKDIR}"
|
||||
export ip_address=$(hostname -I | awk '{print $1}')
|
||||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
|
||||
export TOOLSET_PATH=$WORKPATH/tools/
|
||||
export MODEL_CACHE="./data"
|
||||
|
||||
function stop_crag() {
|
||||
cid=$(docker ps -aq --filter "name=kdd-cup-24-crag-service")
|
||||
echo "Stopping container kdd-cup-24-crag-service with cid $cid"
|
||||
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
|
||||
}
|
||||
|
||||
function stop_agent_docker() {
|
||||
cd $WORKPATH/docker_compose/amd/gpu/rocm
|
||||
bash stop_agent_service_vllm_rocm.sh
|
||||
}
|
||||
|
||||
function stop_retrieval_tool() {
|
||||
echo "Stopping Retrieval tool"
|
||||
local RETRIEVAL_TOOL_PATH=$WORKDIR/GenAIExamples/DocIndexRetriever
|
||||
cd $RETRIEVAL_TOOL_PATH/docker_compose/intel/cpu/xeon/
|
||||
docker compose -f compose.yaml down
|
||||
}
|
||||
|
||||
echo "workpath: $WORKPATH"
|
||||
echo "=================== Stop containers ===================="
|
||||
stop_crag
|
||||
stop_agent_docker
|
||||
stop_retrieval_tool
|
||||
|
||||
cd $WORKPATH/tests
|
||||
|
||||
echo "=================== #1 Building docker images===================="
|
||||
bash step1_build_images_rocm_vllm.sh
|
||||
echo "=================== #1 Building docker images completed===================="
|
||||
|
||||
echo "=================== #2 Start retrieval tool===================="
|
||||
bash step2_start_retrieval_tool_rocm_vllm.sh
|
||||
echo "=================== #2 Retrieval tool started===================="
|
||||
|
||||
echo "=================== #3 Ingest data and validate retrieval===================="
|
||||
bash step3_ingest_data_and_validate_retrieval_rocm_vllm.sh
|
||||
echo "=================== #3 Data ingestion and validation completed===================="
|
||||
|
||||
echo "=================== #4 Start agent and API server===================="
|
||||
bash step4_launch_and_validate_agent_rocm_vllm.sh
|
||||
echo "=================== #4 Agent test passed ===================="
|
||||
|
||||
echo "=================== #5 Stop agent and API server===================="
|
||||
stop_crag
|
||||
stop_agent_docker
|
||||
stop_retrieval_tool
|
||||
echo "=================== #5 Agent and API server stopped===================="
|
||||
|
||||
echo y | docker system prune
|
||||
|
||||
echo "ALL DONE!!"
|
||||
Reference in New Issue
Block a user