Refine README of Examples (#420)
* update chatqna readme and set env script
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* update for comments
* add consume
* modify details
* update codegen readme
* add patch modifications
* update ui options
* update codetrans readme
* update docsum & searchqna readme

---------

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@@ -18,17 +18,81 @@ This ChatQnA use case performs RAG using LangChain, Redis VectorDB and Text Gene
The ChatQnA service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.

Currently we support two ways of deploying ChatQnA services with docker compose:

1. Start the services using the docker images on `docker hub`:

```bash
docker pull opea/chatqna:latest
```

Two types of UI are supported now; choose the one you prefer and pull the corresponding docker image.

If you choose the conversational UI, follow the [instructions](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/docker/gaudi#-launch-the-conversational-ui-optional) and modify the [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
docker pull opea/chatqna-ui:latest
# or
docker pull opea/chatqna-conversation-ui:latest
```

2. Start the services using the docker images `built from source`: [Guide](./docker)
## Setup Environment Variable

To set up environment variables for deploying ChatQnA services, follow these steps:

1. Set the required environment variables (see the sketch after this list for one way to detect `host_ip` automatically):

```bash
export host_ip="External_Public_IP"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up the other environment variables:

```bash
source ./docker/set_env.sh
```
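If you are unsure which address to use for `host_ip`, a minimal sketch (assuming a Linux host; pick a different interface if the first address is not the externally reachable one) is:

```bash
# Use the first address reported by the host as host_ip.
export host_ip=$(hostname -I | awk '{print $1}')
echo "host_ip=${host_ip}"
```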
## Deploy ChatQnA on Gaudi

Refer to the [Gaudi Guide](./docker/gaudi/README.md) for instructions on deploying ChatQnA on Gaudi.

If your version of `Habana Driver` is < 1.16.0 (check with `hl-smi`), run the following commands directly to start the ChatQnA services. Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/ChatQnA/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` is >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.

## Deploy ChatQnA on Xeon

Refer to the [Xeon Guide](./docker/xeon/README.md) for instructions on deploying ChatQnA on Xeon.

Please find the corresponding [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
cd GenAIExamples/ChatQnA/docker/xeon/
docker compose -f docker_compose.yaml up -d
```

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

## Deploy ChatQnA on NVIDIA GPU

Refer to the [NVIDIA GPU Guide](./docker/gpu/README.md) for instructions on deploying ChatQnA on NVIDIA GPU.

```bash
cd GenAIExamples/ChatQnA/docker/gpu/
docker compose -f docker_compose.yaml up -d
```

Refer to the [NVIDIA GPU Guide](./docker/gpu/README.md) for more instructions on building docker images from source.

## Deploy ChatQnA into Kubernetes on Xeon & Gaudi
@@ -37,3 +101,37 @@ Refer to the [Kubernetes Guide](./kubernetes/manifests/README.md) for instructio
## Deploy ChatQnA on AI PC

Refer to the [AI PC Guide](./docker/aipc/README.md) for instructions on deploying ChatQnA on AI PC.

# Consume ChatQnA Service

There are two ways of consuming the ChatQnA service:

1. Use a cURL command in the terminal (see the sketch after this list for ingesting your own documents first):

```bash
curl http://${host_ip}:8888/v1/chatqna \
    -H "Content-Type: application/json" \
    -d '{
        "messages": "What is the revenue of Nike in 2023?"
    }'
```

2. Access it via the frontend.

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`.

By default, the UI runs on port 5173 internally.

If you choose the conversational UI, use this URL instead: `http://{host_ip}:5174`.
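For a RAG flow you will usually want to ingest your own documents before asking questions about them. A minimal sketch, assuming the dataprep microservice exported as `DATAPREP_SERVICE_ENDPOINT` in `set_env.sh` accepts multipart file uploads and that `./your_document.pdf` is a placeholder for a local file:

```bash
# Upload a local document into the vector store through the dataprep service.
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
    -H "Content-Type: multipart/form-data" \
    -F "files=@./your_document.pdf"
```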
# Troubleshooting

1. If you get errors like "Access Denied", please [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/docker/xeon#validate-microservices) first. A simple example:

```bash
http_proxy="" curl ${host_ip}:6006/embed -X POST -d '{"inputs":"What is Deep Learning?"}' -H 'Content-Type: application/json'
```

2. (Docker only) If all microservices work well, check the port ${host_ip}:8888; the port may be allocated by another user (see the port check after this list). You can change it in `docker_compose.yaml`.

3. (Docker only) If you get errors like "The container name is in use", please change the container name in `docker_compose.yaml`.
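To see whether port 8888 is really taken before editing the compose file, one quick check (assuming `ss` from iproute2 is available on the host) is:

```bash
# Show the process, if any, already listening on port 8888.
ss -ltnp | grep ':8888' || echo "port 8888 is free"
```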
ChatQnA/docker/set_env.sh (new file, 23 lines)
@@ -0,0 +1,23 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:8090"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export REDIS_URL="redis://${host_ip}:6379"
export INDEX_NAME="rag-redis"
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
@@ -22,12 +22,99 @@ The workflow falls into the following architecture:
The CodeGen service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.

Currently we support two ways of deploying CodeGen services with docker compose:

1. Start the services using the docker image on `docker hub`:

```bash
docker pull opea/codegen:latest
```

2. Start the services using the docker images `built from source`: [Guide](./docker)

## Setup Environment Variable

To set up environment variables for deploying CodeGen services, follow these steps:

1. Set the required environment variables:

```bash
export host_ip="External_Public_IP"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables (the default model can be overridden afterwards; see the sketch after this list):

```bash
source ./docker/set_env.sh
```
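The default CodeGen model is set in `set_env.sh`. If you want to experiment with a different code model, a minimal sketch is to override `LLM_MODEL_ID` after sourcing the script; the model named below is only an illustrative choice and must be TGI-compatible and fit your hardware:

```bash
source ./docker/set_env.sh
# Illustrative override; substitute any TGI-compatible code generation model.
export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
```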
## Deploy CodeGen using Docker

### Deploy CodeGen on Gaudi

Refer to the [Gaudi Guide](./docker/gaudi/README.md) for instructions on deploying CodeGen on Gaudi.

If your version of `Habana Driver` is < 1.16.0 (check with `hl-smi`), run the following commands directly to start the CodeGen services. Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/CodeGen/docker/gaudi
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` is >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.

### Deploy CodeGen on Xeon

Refer to the [Xeon Guide](./docker/xeon/README.md) for instructions on deploying CodeGen on Xeon. Please find the corresponding [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
cd GenAIExamples/CodeGen/docker/xeon
docker compose -f docker_compose.yaml up -d
```

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

## Deploy CodeGen using Kubernetes

Refer to the [Kubernetes Guide](./kubernetes/manifests/README.md) for instructions on deploying CodeGen into Kubernetes on Xeon & Gaudi.

# Consume CodeGen Service

There are two ways of consuming the CodeGen service:

1. Use a cURL command in the terminal:

```bash
curl http://${host_ip}:7778/v1/codegen \
    -H "Content-Type: application/json" \
    -d '{"messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}'
```

2. Access it via the frontend.

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`.

By default, the UI runs on port 5173 internally.

# Troubleshooting

1. If you get errors like "Access Denied", please [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/CodeGen/docker/xeon#validate-microservices) first. A simple example (see the log-checking sketch after this list if it fails):

```bash
http_proxy="" curl http://${host_ip}:8028/generate \
    -X POST \
    -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \
    -H 'Content-Type: application/json'
```

2. (Docker only) If all microservices work well, check the port ${host_ip}:7778; the port may be allocated by another user. You can change it in `docker_compose.yaml`.

3. (Docker only) If you get errors like "The container name is in use", please change the container name in `docker_compose.yaml`.
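If the `generate` check above fails, it often helps to confirm that the containers are actually up and to inspect their logs for model-download or out-of-memory errors. A minimal sketch, run from the directory containing the `docker_compose.yaml` you deployed with:

```bash
# List the deployed containers and their state.
docker compose -f docker_compose.yaml ps
# Follow the most recent log lines of all services.
docker compose -f docker_compose.yaml logs -f --tail=100
```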
CodeGen/docker/set_env.sh (new file, 11 lines)
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7778/v1/codegen"
@@ -12,12 +12,99 @@ This Code Translation use case uses Text Generation Inference on Intel Gaudi2 or
The Code Translation service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.

Currently we support two ways of deploying Code Translation services with docker compose:

1. Start the services using the docker image on `docker hub`:

```bash
docker pull opea/codetrans:latest
```

2. Start the services using the docker images `built from source`: [Guide](./docker)

## Setup Environment Variable

To set up environment variables for deploying Code Translation services, follow these steps:

1. Set the required environment variables:

```bash
export host_ip="External_Public_IP"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

```bash
source ./docker/set_env.sh
```
## Deploy with Docker

### Deploy Code Translation on Gaudi

To deploy Code Translation on Gaudi, please refer to the [Gaudi Guide](./docker/gaudi/README.md).

If your version of `Habana Driver` is < 1.16.0 (check with `hl-smi`), run the following commands directly to start the Code Translation services. Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/CodeTrans/docker/gaudi
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` is >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.

### Deploy Code Translation on Xeon

To deploy Code Translation on Xeon, please refer to the [Xeon Guide](./docker/xeon/README.md). Please find the corresponding [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
cd GenAIExamples/CodeTrans/docker/xeon
docker compose -f docker_compose.yaml up -d
```

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

## Deploy with Kubernetes

Please refer to the [Code Translation Kubernetes Guide](./kubernetes/README.md).

# Consume Code Translation Service

There are two ways of consuming the Code Translation service:

1. Use a cURL command in the terminal (see the sketch after this list for sending a local source file):

```bash
curl http://${host_ip}:7777/v1/codetrans \
    -H "Content-Type: application/json" \
    -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}'
```

2. Access it via the frontend.

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`.

By default, the UI runs on port 5173 internally.
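To translate a longer file without hand-escaping newlines and quotes, a minimal sketch (assuming `jq` is installed and `main.go` is a hypothetical local file) is to let `jq` build the JSON body:

```bash
# Build the request body from a local file so quoting and newlines are handled by jq.
jq -n --arg code "$(cat main.go)" \
   '{language_from: "Golang", language_to: "Python", source_code: $code}' |
curl http://${host_ip}:7777/v1/codetrans -H "Content-Type: application/json" --data-binary @-
```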
# Troubleshooting

1. If you get errors like "Access Denied", please [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/CodeTrans/docker/xeon#validate-microservices) first. A simple example:

```bash
http_proxy="" curl http://${host_ip}:8008/generate \
    -X POST \
    -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_new_tokens":17, "do_sample": true}}' \
    -H 'Content-Type: application/json'
```

2. (Docker only) If all microservices work well, check the port ${host_ip}:7777; the port may be allocated by another user. You can change it in `docker_compose.yaml`.

3. (Docker only) If you get errors like "The container name is in use", please change the container name in `docker_compose.yaml`.
CodeTrans/docker/set_env.sh (new file, 11 lines)
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export LLM_MODEL_ID="HuggingFaceH4/mistral-7b-grok"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:7777/v1/codetrans"
@@ -15,12 +15,99 @@ The architecture for document summarization will be illustrated/described below:
The Document Summarization service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors. Based on whether you want to use Docker or Kubernetes, please follow the instructions below.

Currently we support two ways of deploying Document Summarization services with docker compose:

1. Start the services using the docker image on `docker hub`:

```bash
docker pull opea/docsum:latest
```

2. Start the services using the docker images `built from source`: [Guide](./docker)

## Setup Environment Variable

To set up environment variables for deploying Document Summarization services, follow these steps:

1. Set the required environment variables:

```bash
export host_ip="External_Public_IP"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

```bash
source ./docker/set_env.sh
```
## Deploy using Docker

### Deploy on Gaudi

Refer to the [Gaudi Guide](./docker/gaudi/README.md) for instructions on deploying Document Summarization on Gaudi.

If your version of `Habana Driver` is < 1.16.0 (check with `hl-smi`), run the following commands directly to start the DocSum services. Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/DocSum/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` is >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.

### Deploy on Xeon

Refer to the [Xeon Guide](./docker/xeon/README.md) for instructions on deploying Document Summarization on Xeon. Please find the corresponding [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
cd GenAIExamples/DocSum/docker/xeon/
docker compose -f docker_compose.yaml up -d
```

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

## Deploy using Kubernetes

Please refer to the [Kubernetes deployment guide](./kubernetes/README.md).

# Consume Document Summarization Service

There are two ways of consuming the Document Summarization service:

1. Use a cURL command in the terminal (see the sketch after this list for sending a longer document from a file):

```bash
curl http://${host_ip}:8888/v1/docsum \
    -H "Content-Type: application/json" \
    -d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'
```

2. Access it via the frontend.

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`.

By default, the UI runs on port 5173 internally.
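For longer documents, hand-escaping the text into the `-d` argument quickly becomes awkward. A minimal sketch is to write the request body to a file first (the file name is a placeholder) and let curl send it unchanged:

```bash
# Create the request body in a file, then post it as-is.
cat > request.json <<'EOF'
{"messages": "Paste the document text to be summarized here."}
EOF
curl http://${host_ip}:8888/v1/docsum \
    -H "Content-Type: application/json" \
    --data-binary @request.json
```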
# Troubleshooting

1. If you get errors like "Access Denied", please [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/DocSum/docker/xeon#validate-microservices) first. A simple example:

```bash
http_proxy="" curl http://${host_ip}:8008/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
    -H 'Content-Type: application/json'
```

2. (Docker only) If all microservices work well, check the port ${host_ip}:8888; the port may be allocated by another user. You can change it in `docker_compose.yaml`.

3. (Docker only) If you get errors like "The container name is in use", please change the container name in `docker_compose.yaml`.
DocSum/docker/set_env.sh (new file, 11 lines)
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/docsum"
@@ -24,10 +24,98 @@ The workflow falls into the following architecture:
The SearchQnA service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.

Currently we support two ways of deploying SearchQnA services with docker compose:

1. Start the services using the docker image on `docker hub`:

```bash
docker pull opea/searchqna:latest
```

2. Start the services using the docker images `built from source`: [Guide](./docker)

## Setup Environment Variable

To set up environment variables for deploying SearchQnA services, follow these steps:

1. Set the required environment variables (see the sketch after this list for a quick check of the Google Search credentials):

```bash
export host_ip="External_Public_IP"
export no_proxy="Your_No_Proxy"
export GOOGLE_CSE_ID="Your_CSE_ID"
export GOOGLE_API_KEY="Your_Google_API_Key"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

```bash
source ./docker/set_env.sh
```
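SearchQnA relies on the Google Programmable Search Engine, so it is worth confirming that `GOOGLE_API_KEY` and `GOOGLE_CSE_ID` actually work before bringing up the services. A minimal sketch (the query term is arbitrary):

```bash
# A valid key/CSE pair returns a JSON document containing an "items" array of search results.
curl -s "https://www.googleapis.com/customsearch/v1?key=${GOOGLE_API_KEY}&cx=${GOOGLE_CSE_ID}&q=deep+learning" | head -n 20
```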
## Deploy SearchQnA on Gaudi

Refer to the [Gaudi Guide](./docker/gaudi/README.md) for instructions on deploying SearchQnA on Gaudi.

If your version of `Habana Driver` is < 1.16.0 (check with `hl-smi`), run the following commands directly to start the SearchQnA services. Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/SearchQnA/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` is >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.

## Deploy SearchQnA on Xeon

Refer to the [Xeon Guide](./docker/xeon/README.md) for instructions on deploying SearchQnA on Xeon. Please find the corresponding [docker_compose.yaml](./docker/xeon/docker_compose.yaml).

```bash
cd GenAIExamples/SearchQnA/docker/xeon/
docker compose -f docker_compose.yaml up -d
```

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

# Consume SearchQnA Service

There are two ways of consuming the SearchQnA service:

1. Use a cURL command in the terminal:

```bash
curl http://${host_ip}:3008/v1/searchqna \
    -H "Content-Type: application/json" \
    -d '{
        "messages": "What is the latest news? Give me also the source link.",
        "stream": "True"
    }'
```

2. Access it via the frontend.

To access the frontend, open the following URL in your browser: `http://{host_ip}:5173`.

By default, the UI runs on port 5173 internally.

# Troubleshooting

1. If you get errors like "Access Denied", please [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/docker/xeon#validate-microservices) first. A simple example:

```bash
http_proxy="" curl http://${host_ip}:3001/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```

2. (Docker only) If all microservices work well, check the port ${host_ip}:3008; the port may be allocated by another user. You can change it in `docker_compose.yaml`.

3. (Docker only) If you get errors like "The container name is in use", please change the container name in `docker_compose.yaml`.
SearchQnA/docker/set_env.sh (new file, 24 lines)
@@ -0,0 +1,24 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export EMBEDDING_MODEL_ID=BAAI/bge-base-en-v1.5
export TEI_EMBEDDING_ENDPOINT=http://${host_ip}:3001
export RERANK_MODEL_ID=BAAI/bge-reranker-base
export TEI_RERANKING_ENDPOINT=http://${host_ip}:3004

export TGI_LLM_ENDPOINT=http://${host_ip}:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3

export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export WEB_RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}

export EMBEDDING_SERVICE_PORT=3002
export WEB_RETRIEVER_SERVICE_PORT=3003
export RERANK_SERVICE_PORT=3005
export LLM_SERVICE_PORT=3007