Merge FaqGen into ChatQnA (#1654)
1. Delete FaqGen 2. Refactor FaqGen into ChatQnA, serve as a LLM selection. 3. Combine all ChatQnA related Dockerfile into one Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
This commit is contained in:
@@ -1,6 +1,6 @@
|
||||
# Build Mega Service of ChatQnA on Xeon
|
||||
|
||||
This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`.
|
||||
This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`,`llm` and `faqgen`.
|
||||
|
||||
The default pipeline deploys with vLLM as the LLM serving component and leverages rerank component. It also provides options of not using rerank in the pipeline and using TGI backend for LLM microservice, please refer to [start-all-the-services-docker-containers](#start-all-the-services-docker-containers) section in this page. Besides, refer to [Build with Pinecone VectorDB](./README_pinecone.md) and [Build with Qdrant VectorDB](./README_qdrant.md) for other deployment variants.
|
||||
|
||||
@@ -30,7 +30,7 @@ To set up environment variables for deploying ChatQnA services, follow these ste
|
||||
export http_proxy="Your_HTTP_Proxy"
|
||||
export https_proxy="Your_HTTPs_Proxy"
|
||||
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
|
||||
export no_proxy="Your_No_Proxy",chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service
|
||||
export no_proxy="Your_No_Proxy",chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service,llm-faqgen
|
||||
```
|
||||
|
||||
3. Set up other environment variables:
|
||||
@@ -141,29 +141,27 @@ docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_p
|
||||
cd ..
|
||||
```
|
||||
|
||||
### 3. Build MegaService Docker Image
|
||||
### 3. Build FaqGen LLM Image (Optional)
|
||||
|
||||
1. MegaService with Rerank
|
||||
If you want to enable FAQ generation LLM in the pipeline, please use the below command:
|
||||
|
||||
To construct the Mega Service with Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIComps.git
|
||||
cd GenAIComps
|
||||
docker build -t opea/llm-faqgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/src/faq-generation/Dockerfile .
|
||||
```
|
||||
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIExamples.git
|
||||
cd GenAIExamples/ChatQnA
|
||||
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
|
||||
```
|
||||
### 4. Build MegaService Docker Image
|
||||
|
||||
2. MegaService without Rerank
|
||||
To construct the Mega Service with Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:
|
||||
|
||||
To construct the Mega Service without Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna_without_rerank.py` Python script. Build MegaService Docker image via below command:
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIExamples.git
|
||||
cd GenAIExamples/ChatQnA
|
||||
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
|
||||
```
|
||||
|
||||
```bash
|
||||
git clone https://github.com/opea-project/GenAIExamples.git
|
||||
cd GenAIExamples/ChatQnA
|
||||
docker build --no-cache -t opea/chatqna-without-rerank:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.without_rerank .
|
||||
```
|
||||
|
||||
### 4. Build UI Docker Image
|
||||
### 5. Build UI Docker Image
|
||||
|
||||
Build frontend Docker image via below command:
|
||||
|
||||
@@ -172,7 +170,7 @@ cd GenAIExamples/ChatQnA/ui
|
||||
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
|
||||
```
|
||||
|
||||
### 5. Build Conversational React UI Docker Image (Optional)
|
||||
### 6. Build Conversational React UI Docker Image (Optional)
|
||||
|
||||
Build frontend Docker image that enables Conversational experience with ChatQnA megaservice via below command:
|
||||
|
||||
@@ -183,7 +181,7 @@ cd GenAIExamples/ChatQnA/ui
|
||||
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
|
||||
```
|
||||
|
||||
### 6. Build Nginx Docker Image
|
||||
### 7. Build Nginx Docker Image
|
||||
|
||||
```bash
|
||||
cd GenAIComps
|
||||
@@ -194,10 +192,14 @@ Then run the command `docker images`, you will have the following 5 Docker Image
|
||||
|
||||
1. `opea/dataprep:latest`
|
||||
2. `opea/retriever:latest`
|
||||
3. `opea/chatqna:latest` or `opea/chatqna-without-rerank:latest`
|
||||
3. `opea/chatqna:latest`
|
||||
4. `opea/chatqna-ui:latest`
|
||||
5. `opea/nginx:latest`
|
||||
|
||||
If FaqGen related docker image is built, you will find one more image:
|
||||
|
||||
- `opea/llm-faqgen:latest`
|
||||
|
||||
## 🚀 Start Microservices
|
||||
|
||||
### Required Models
|
||||
@@ -287,6 +289,8 @@ docker compose -f compose.yaml up -d
|
||||
docker compose -f compose_without_rerank.yaml up -d
|
||||
# Start ChatQnA with Rerank Pipeline and Open Telemetry Tracing
|
||||
docker compose -f compose.yaml -f compose.telemetry.yaml up -d
|
||||
# Start ChatQnA with FaqGen Pipeline
|
||||
docker compose -f compose_faqgen.yaml up -d
|
||||
```
|
||||
|
||||
If use TGI as the LLM serving backend.
|
||||
@@ -295,6 +299,8 @@ If use TGI as the LLM serving backend.
|
||||
docker compose -f compose_tgi.yaml up -d
|
||||
# Start ChatQnA with Open Telemetry Tracing
|
||||
docker compose -f compose_tgi.yaml -f compose_tgi.telemetry.yaml up -d
|
||||
# Start ChatQnA with FaqGen Pipeline
|
||||
docker compose -f compose_faqgen_tgi.yaml up -d
|
||||
```
|
||||
|
||||
### Validate Microservices
|
||||
@@ -369,7 +375,16 @@ For details on how to verify the correctness of the response, refer to [how-to-v
|
||||
-H 'Content-Type: application/json'
|
||||
```
|
||||
|
||||
5. MegaService
|
||||
5. FaqGen LLM Microservice (if enabled)
|
||||
|
||||
```bash
|
||||
curl http://${host_ip}:${LLM_SERVICE_PORT}/v1/faqgen \
|
||||
-X POST \
|
||||
-d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
|
||||
-H 'Content-Type: application/json'
|
||||
```
|
||||
|
||||
6. MegaService
|
||||
|
||||
```bash
|
||||
curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
|
||||
@@ -377,7 +392,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
|
||||
}'
|
||||
```
|
||||
|
||||
6. Nginx Service
|
||||
7. Nginx Service
|
||||
|
||||
```bash
|
||||
curl http://${host_ip}:${NGINX_PORT}/v1/chatqna \
|
||||
@@ -385,7 +400,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
|
||||
-d '{"messages": "What is the revenue of Nike in 2023?"}'
|
||||
```
|
||||
|
||||
7. Dataprep Microservice(Optional)
|
||||
8. Dataprep Microservice(Optional)
|
||||
|
||||
If you want to update the default knowledge base, you can use the following commands:
|
||||
|
||||
|
||||
187
ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen.yaml
Normal file
187
ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen.yaml
Normal file
@@ -0,0 +1,187 @@
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
services:
|
||||
redis-vector-db:
|
||||
image: redis/redis-stack:7.2.0-v9
|
||||
container_name: redis-vector-db
|
||||
ports:
|
||||
- "${CHATQNA_REDIS_VECTOR_PORT:-6379}:6379"
|
||||
- "${CHATQNA_REDIS_VECTOR_INSIGHT_PORT:-8001}:8001"
|
||||
dataprep-redis-service:
|
||||
image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
|
||||
container_name: dataprep-redis-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
- tei-embedding-service
|
||||
ports:
|
||||
- "6007:5000"
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
REDIS_URL: redis://redis-vector-db:6379
|
||||
REDIS_HOST: redis-vector-db
|
||||
INDEX_NAME: ${INDEX_NAME}
|
||||
TEI_ENDPOINT: http://tei-embedding-service:80
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
tei-embedding-service:
|
||||
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
container_name: tei-embedding-server
|
||||
ports:
|
||||
- "6006:80"
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 1g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
|
||||
retriever:
|
||||
image: ${REGISTRY:-opea}/retriever:${TAG:-latest}
|
||||
container_name: retriever-redis-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
ports:
|
||||
- "7000:7000"
|
||||
ipc: host
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
REDIS_URL: redis://redis-vector-db:6379
|
||||
REDIS_HOST: redis-vector-db
|
||||
INDEX_NAME: ${INDEX_NAME}
|
||||
TEI_EMBEDDING_ENDPOINT: http://tei-embedding-service:80
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
LOGFLAG: ${LOGFLAG}
|
||||
RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
|
||||
restart: unless-stopped
|
||||
tei-reranking-service:
|
||||
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
container_name: tei-reranking-server
|
||||
ports:
|
||||
- "8808:80"
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 1g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
HF_HUB_DISABLE_PROGRESS_BARS: 1
|
||||
HF_HUB_ENABLE_HF_TRANSFER: 0
|
||||
command: --model-id ${RERANK_MODEL_ID} --auto-truncate
|
||||
vllm-service:
|
||||
image: ${REGISTRY:-opea}/vllm:${TAG:-latest}
|
||||
container_name: vllm-server
|
||||
ports:
|
||||
- ${LLM_ENDPOINT_PORT:-9009}:80
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/root/.cache/huggingface/hub"
|
||||
shm_size: 128g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
HF_TOKEN: ${HF_TOKEN}
|
||||
LLM_MODEL_ID: ${LLM_MODEL_ID}
|
||||
VLLM_TORCH_PROFILER_DIR: "${VLLM_TORCH_PROFILER_DIR:-/mnt}"
|
||||
host_ip: ${host_ip}
|
||||
LLM_ENDPOINT_PORT: ${LLM_ENDPOINT_PORT}
|
||||
VLLM_SKIP_WARMUP: ${VLLM_SKIP_WARMUP:-false}
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "curl -f http://${host_ip}:${LLM_ENDPOINT_PORT}/health || exit 1"]
|
||||
interval: 10s
|
||||
timeout: 10s
|
||||
retries: 100
|
||||
command: --model $LLM_MODEL_ID --host 0.0.0.0 --port 80
|
||||
llm-faqgen:
|
||||
image: ${REGISTRY:-opea}/llm-faqgen:${TAG:-latest}
|
||||
container_name: llm-faqgen-server
|
||||
depends_on:
|
||||
vllm-service:
|
||||
condition: service_healthy
|
||||
ports:
|
||||
- ${LLM_SERVER_PORT:-9000}:9000
|
||||
ipc: host
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
LLM_ENDPOINT: ${LLM_ENDPOINT}
|
||||
LLM_MODEL_ID: ${LLM_MODEL_ID}
|
||||
HF_TOKEN: ${HF_TOKEN}
|
||||
FAQGen_COMPONENT_NAME: ${FAQGen_COMPONENT_NAME:-OpeaFaqGenvLLM}
|
||||
LOGFLAG: ${LOGFLAG:-False}
|
||||
restart: unless-stopped
|
||||
chatqna-xeon-backend-server:
|
||||
image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
|
||||
container_name: chatqna-xeon-backend-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
- tei-embedding-service
|
||||
- retriever
|
||||
- tei-reranking-service
|
||||
- vllm-service
|
||||
- llm-faqgen
|
||||
ports:
|
||||
- ${CHATQNA_BACKEND_PORT:-8888}:8888
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
- MEGA_SERVICE_HOST_IP=chatqna-xeon-backend-server
|
||||
- EMBEDDING_SERVER_HOST_IP=tei-embedding-service
|
||||
- EMBEDDING_SERVER_PORT=80
|
||||
- RETRIEVER_SERVICE_HOST_IP=retriever
|
||||
- RETRIEVER_SERVICE_PORT=7000
|
||||
- RERANK_SERVER_HOST_IP=tei-reranking-service
|
||||
- RERANK_SERVER_PORT=80
|
||||
- LLM_SERVER_HOST_IP=llm-faqgen
|
||||
- LLM_SERVER_PORT=9000
|
||||
- LLM_MODEL=${LLM_MODEL_ID}
|
||||
- LOGFLAG=${LOGFLAG}
|
||||
- CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_FAQGEN}
|
||||
ipc: host
|
||||
restart: always
|
||||
chatqna-xeon-ui-server:
|
||||
image: ${REGISTRY:-opea}/chatqna-ui:${TAG:-latest}
|
||||
container_name: chatqna-xeon-ui-server
|
||||
depends_on:
|
||||
- chatqna-xeon-backend-server
|
||||
ports:
|
||||
- ${CHATQNA_FRONTEND_SERVICE_PORT:-5173}:5173
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
ipc: host
|
||||
restart: always
|
||||
chatqna-xeon-nginx-server:
|
||||
image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
|
||||
container_name: chatqna-xeon-nginx-server
|
||||
depends_on:
|
||||
- chatqna-xeon-backend-server
|
||||
- chatqna-xeon-ui-server
|
||||
ports:
|
||||
- "${NGINX_PORT:-80}:80"
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
- FRONTEND_SERVICE_IP=chatqna-xeon-ui-server
|
||||
- FRONTEND_SERVICE_PORT=5173
|
||||
- BACKEND_SERVICE_NAME=chatqna
|
||||
- BACKEND_SERVICE_IP=chatqna-xeon-backend-server
|
||||
- BACKEND_SERVICE_PORT=8888
|
||||
- DATAPREP_SERVICE_IP=dataprep-redis-service
|
||||
- DATAPREP_SERVICE_PORT=5000
|
||||
ipc: host
|
||||
restart: always
|
||||
|
||||
networks:
|
||||
default:
|
||||
driver: bridge
|
||||
187
ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen_tgi.yaml
Normal file
187
ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen_tgi.yaml
Normal file
@@ -0,0 +1,187 @@
|
||||
# Copyright (C) 2024 Intel Corporation
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
services:
|
||||
redis-vector-db:
|
||||
image: redis/redis-stack:7.2.0-v9
|
||||
container_name: redis-vector-db
|
||||
ports:
|
||||
- "${CHATQNA_REDIS_VECTOR_PORT:-6379}:6379"
|
||||
- "${CHATQNA_REDIS_VECTOR_INSIGHT_PORT:-8001}:8001"
|
||||
dataprep-redis-service:
|
||||
image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
|
||||
container_name: dataprep-redis-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
- tei-embedding-service
|
||||
ports:
|
||||
- "6007:5000"
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
REDIS_URL: redis://redis-vector-db:6379
|
||||
REDIS_HOST: redis-vector-db
|
||||
INDEX_NAME: ${INDEX_NAME}
|
||||
TEI_ENDPOINT: http://tei-embedding-service:80
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
tei-embedding-service:
|
||||
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
container_name: tei-embedding-server
|
||||
ports:
|
||||
- "6006:80"
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 1g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
|
||||
retriever:
|
||||
image: ${REGISTRY:-opea}/retriever:${TAG:-latest}
|
||||
container_name: retriever-redis-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
ports:
|
||||
- "7000:7000"
|
||||
ipc: host
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
REDIS_URL: redis://redis-vector-db:6379
|
||||
REDIS_HOST: redis-vector-db
|
||||
INDEX_NAME: ${INDEX_NAME}
|
||||
TEI_EMBEDDING_ENDPOINT: http://tei-embedding-service:80
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
LOGFLAG: ${LOGFLAG}
|
||||
RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
|
||||
restart: unless-stopped
|
||||
tei-reranking-service:
|
||||
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
|
||||
container_name: tei-reranking-server
|
||||
ports:
|
||||
- "8808:80"
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 1g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
|
||||
HF_HUB_DISABLE_PROGRESS_BARS: 1
|
||||
HF_HUB_ENABLE_HF_TRANSFER: 0
|
||||
command: --model-id ${RERANK_MODEL_ID} --auto-truncate
|
||||
tgi-service:
|
||||
image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
|
||||
container_name: tgi-server
|
||||
ports:
|
||||
- ${LLM_ENDPOINT_PORT:-9009}:80
|
||||
volumes:
|
||||
- "${MODEL_CACHE:-./data}:/data"
|
||||
shm_size: 1g
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN}
|
||||
HF_TOKEN: ${HF_TOKEN}
|
||||
host_ip: ${host_ip}
|
||||
LLM_ENDPOINT_PORT: ${LLM_ENDPOINT_PORT}
|
||||
HF_HUB_DISABLE_PROGRESS_BARS: 1
|
||||
HF_HUB_ENABLE_HF_TRANSFER: 0
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "curl -f http://${host_ip}:${LLM_ENDPOINT_PORT}/health || exit 1"]
|
||||
interval: 10s
|
||||
timeout: 10s
|
||||
retries: 100
|
||||
command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0
|
||||
llm-faqgen:
|
||||
image: ${REGISTRY:-opea}/llm-faqgen:${TAG:-latest}
|
||||
container_name: llm-faqgen-server
|
||||
depends_on:
|
||||
tgi-service:
|
||||
condition: service_healthy
|
||||
ports:
|
||||
- ${LLM_SERVER_PORT:-9000}:9000
|
||||
ipc: host
|
||||
environment:
|
||||
no_proxy: ${no_proxy}
|
||||
http_proxy: ${http_proxy}
|
||||
https_proxy: ${https_proxy}
|
||||
LLM_ENDPOINT: ${LLM_ENDPOINT}
|
||||
LLM_MODEL_ID: ${LLM_MODEL_ID}
|
||||
HF_TOKEN: ${HF_TOKEN}
|
||||
FAQGen_COMPONENT_NAME: ${FAQGen_COMPONENT_NAME:-OpeaFaqGenTgi}
|
||||
LOGFLAG: ${LOGFLAG:-False}
|
||||
restart: unless-stopped
|
||||
chatqna-xeon-backend-server:
|
||||
image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
|
||||
container_name: chatqna-xeon-backend-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
- tei-embedding-service
|
||||
- retriever
|
||||
- tei-reranking-service
|
||||
- tgi-service
|
||||
- llm-faqgen
|
||||
ports:
|
||||
- ${CHATQNA_BACKEND_PORT:-8888}:8888
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
- MEGA_SERVICE_HOST_IP=chatqna-xeon-backend-server
|
||||
- EMBEDDING_SERVER_HOST_IP=tei-embedding-service
|
||||
- EMBEDDING_SERVER_PORT=80
|
||||
- RETRIEVER_SERVICE_HOST_IP=retriever
|
||||
- RETRIEVER_SERVICE_PORT=7000
|
||||
- RERANK_SERVER_HOST_IP=tei-reranking-service
|
||||
- RERANK_SERVER_PORT=80
|
||||
- LLM_SERVER_HOST_IP=llm-faqgen
|
||||
- LLM_SERVER_PORT=9000
|
||||
- LLM_MODEL=${LLM_MODEL_ID}
|
||||
- LOGFLAG=${LOGFLAG}
|
||||
- CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_FAQGEN}
|
||||
ipc: host
|
||||
restart: always
|
||||
chatqna-xeon-ui-server:
|
||||
image: ${REGISTRY:-opea}/chatqna-ui:${TAG:-latest}
|
||||
container_name: chatqna-xeon-ui-server
|
||||
depends_on:
|
||||
- chatqna-xeon-backend-server
|
||||
ports:
|
||||
- ${CHATQNA_FRONTEND_SERVICE_PORT:-5173}:5173
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
ipc: host
|
||||
restart: always
|
||||
chatqna-xeon-nginx-server:
|
||||
image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
|
||||
container_name: chatqna-xeon-nginx-server
|
||||
depends_on:
|
||||
- chatqna-xeon-backend-server
|
||||
- chatqna-xeon-ui-server
|
||||
ports:
|
||||
- "${NGINX_PORT:-80}:80"
|
||||
environment:
|
||||
- no_proxy=${no_proxy}
|
||||
- https_proxy=${https_proxy}
|
||||
- http_proxy=${http_proxy}
|
||||
- FRONTEND_SERVICE_IP=chatqna-xeon-ui-server
|
||||
- FRONTEND_SERVICE_PORT=5173
|
||||
- BACKEND_SERVICE_NAME=chatqna
|
||||
- BACKEND_SERVICE_IP=chatqna-xeon-backend-server
|
||||
- BACKEND_SERVICE_PORT=8888
|
||||
- DATAPREP_SERVICE_IP=dataprep-redis-service
|
||||
- DATAPREP_SERVICE_PORT=5000
|
||||
ipc: host
|
||||
restart: always
|
||||
|
||||
networks:
|
||||
default:
|
||||
driver: bridge
|
||||
@@ -75,7 +75,7 @@ services:
|
||||
VLLM_TORCH_PROFILER_DIR: "/mnt"
|
||||
command: --model $LLM_MODEL_ID --host 0.0.0.0 --port 80
|
||||
chatqna-xeon-backend-server:
|
||||
image: ${REGISTRY:-opea}/chatqna-without-rerank:${TAG:-latest}
|
||||
image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
|
||||
container_name: chatqna-xeon-backend-server
|
||||
depends_on:
|
||||
- redis-vector-db
|
||||
@@ -97,6 +97,7 @@ services:
|
||||
- LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
|
||||
- LLM_MODEL=${LLM_MODEL_ID}
|
||||
- LOGFLAG=${LOGFLAG}
|
||||
- CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_NO_RERANK}
|
||||
ipc: host
|
||||
restart: always
|
||||
chatqna-xeon-ui-server:
|
||||
|
||||
@@ -20,3 +20,13 @@ export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317
|
||||
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces
|
||||
# Set no proxy
|
||||
export no_proxy="$no_proxy,chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service,jaeger,prometheus,grafana,node-exporter,$JAEGER_IP"
|
||||
|
||||
export LLM_ENDPOINT_PORT=8010
|
||||
export LLM_SERVER_PORT=9000
|
||||
export CHATQNA_BACKEND_PORT=8888
|
||||
export CHATQNA_REDIS_VECTOR_PORT=6379
|
||||
export CHATQNA_REDIS_VECTOR_INSIGHT_PORT=8001
|
||||
export CHATQNA_FRONTEND_SERVICE_PORT=5173
|
||||
export NGINX_PORT=80
|
||||
export FAQGen_COMPONENT_NAME="OpeaFaqGenvLLM"
|
||||
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
|
||||
|
||||
Reference in New Issue
Block a user