Docsum (#1095)
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: XinyaoWa <xinyao.wang@intel.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
@@ -12,17 +12,46 @@ After launching your instance, you can connect to it using SSH (for Linux instan

## 🚀 Build Docker Images

First, build the Docker images locally and install the corresponding Python package.
### 1. Build MicroService Docker Image

### 1. Build LLM Image
First, build the Docker images locally and install the corresponding Python package.

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/llm-docsum-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/summarization/tgi/langchain/Dockerfile .
```

Then run `docker images`; you should see the following four Docker images:
#### Whisper Service

The Whisper Service converts audio files to text. Follow these steps to build and run the service:

```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/dependency/Dockerfile .
```

#### Audio to text Service

The Audio to text Service is another service for converting audio to text. Follow these steps to build and run the service:

```bash
docker build -t opea/dataprep-audio2text:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimedia2text/audio2text/Dockerfile .
```

#### Video to Audio Service

The Video to Audio Service extracts audio from video files. Follow these steps to build and run the service:

```bash
docker build -t opea/dataprep-video2audio:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimedia2text/video2audio/Dockerfile .
```

#### Multimedia to Text Service

The Multimedia to Text Service transforms multimedia data to text data. Follow these steps to build and run the service:

```bash
docker build -t opea/dataprep-multimedia2text:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimedia2text/Dockerfile .
```
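
The four multimedia builds above differ only in the image tag and the Dockerfile path. As a minimal sketch, the helper below prints the corresponding commands instead of running them, so they can be reviewed first. `build_cmd` is a hypothetical name, not part of GenAIComps; the tag and path pairs are copied from the steps above.

```bash
# Hypothetical helper (not part of GenAIComps): print the docker build
# command for one image instead of running it.
build_cmd() {
  # $1 = image tag, $2 = Dockerfile path relative to the GenAIComps root
  printf 'docker build -t %s --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f %s .\n' "$1" "$2"
}

# Tag and Dockerfile pairs taken from the build steps above.
while read -r tag dockerfile; do
  build_cmd "$tag" "$dockerfile"
done <<'EOF'
opea/whisper:latest comps/asr/whisper/dependency/Dockerfile
opea/dataprep-audio2text:latest comps/dataprep/multimedia2text/audio2text/Dockerfile
opea/dataprep-video2audio:latest comps/dataprep/multimedia2text/video2audio/Dockerfile
opea/dataprep-multimedia2text:latest comps/dataprep/multimedia2text/Dockerfile
EOF
```

Because the proxy variables are printed unexpanded, piping the output to `sh` from the GenAIComps root would run the builds with whatever proxy settings are exported at that point.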

### 2. Build MegaService Docker Image

@@ -36,6 +65,10 @@ docker build -t opea/docsum:latest --build-arg https_proxy=$https_proxy --build-

### 3. Build UI Docker Image

Several UI options are provided. If you need to work with multimedia documents, .doc, or .pdf files, the Gradio UI is recommended.

#### Svelte UI

Build the frontend Docker image with the following command:

```bash
@@ -43,13 +76,16 @@ cd GenAIExamples/DocSum/ui
docker build -t opea/docsum-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile .
```

Then run `docker images`; you should see the following Docker images:
#### Gradio UI

1. `opea/llm-docsum-tgi:latest`
2. `opea/docsum:latest`
3. `opea/docsum-ui:latest`
Build the Gradio UI frontend Docker image using the following command:

### 4. Build React UI Docker Image
```bash
cd GenAIExamples/DocSum/ui
docker build -t opea/docsum-gradio-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile.gradio .
```

#### React UI

Build the frontend Docker image with the following command:

@@ -61,45 +97,62 @@ docker build -t opea/docsum-react-ui:latest --build-arg BACKEND_SERVICE_ENDPOINT
docker build -t opea/docsum-react-ui:latest --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
```

Then run `docker images`; you should see the following Docker images:

1. `opea/llm-docsum-tgi:latest`
2. `opea/docsum:latest`
3. `opea/docsum-ui:latest`
4. `opea/docsum-react-ui:latest`

## 🚀 Start Microservices and MegaService

### Required Models

The default model is "Intel/neural-chat-7b-v3-3". To use a different model, change "LLM_MODEL_ID" in the environment variable settings below.
If you use gated models, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the "HUGGINGFACEHUB_API_TOKEN" environment variable.

### Setup Environment Variables

Since `compose.yaml` consumes several environment variables, set them in advance as shown below.
The default model is "Intel/neural-chat-7b-v3-3". Change the "LLM_MODEL_ID" environment variable in the commands below if you want to use another model.

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/docsum"
```

Note: Replace `host_ip` with your external IP address; do not use localhost.
When using gated models, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the "HUGGINGFACEHUB_API_TOKEN" environment variable.

### Setup Environment Variable

To set up environment variables for deploying Document Summarization services, follow these steps:

1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

```bash
source GenAIExamples/DocSum/docker_compose/set_env.sh
```
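
Before starting the containers, it can help to confirm that the variables from step 1 are actually set and that `host_ip` is not a loopback address. The function below is a minimal sketch of such a check; `check_env` is a hypothetical helper and is not part of `set_env.sh`.

```bash
# Hypothetical pre-flight check (not part of set_env.sh): report variables
# from step 1 that are unset or empty, and reject a loopback host_ip.
check_env() {
  missing=0
  for name in host_ip no_proxy HUGGINGFACEHUB_API_TOKEN; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  case "${host_ip:-}" in
    localhost|127.*)
      echo "host_ip must be an external address, not ${host_ip}"
      missing=1
      ;;
  esac
  return "$missing"
}
```

Calling `check_env` in the same shell that sourced `set_env.sh` returns a non-zero status and prints one line per problem, so it can guard the `docker compose` step.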

### Start Microservice Docker Containers

```bash
cd GenAIExamples/DocSum/docker_compose/intel/cpu/xeon
docker compose up -d
docker compose -f compose.yaml up -d
```

You will have the following Docker Images:

1. `opea/docsum-ui:latest`
2. `opea/docsum:latest`
3. `opea/llm-docsum-tgi:latest`
4. `opea/whisper:latest`
5. `opea/dataprep-audio2text:latest`
6. `opea/dataprep-multimedia2text:latest`
7. `opea/dataprep-video2audio:latest`
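
The list above can be checked mechanically. The sketch below reads `repository:tag` lines on standard input and prints every expected image that is absent; `missing_images` is a hypothetical helper name, and the image names are the seven listed above.

```bash
# Hypothetical check: read "repository:tag" lines on stdin and print every
# expected image that is absent. The exit status is non-zero if any are missing.
missing_images() {
  present=$(cat)
  status=0
  for image in \
    opea/docsum-ui:latest \
    opea/docsum:latest \
    opea/llm-docsum-tgi:latest \
    opea/whisper:latest \
    opea/dataprep-audio2text:latest \
    opea/dataprep-multimedia2text:latest \
    opea/dataprep-video2audio:latest
  do
    # -x matches whole lines only, -F treats the name as a fixed string.
    if ! printf '%s\n' "$present" | grep -qxF "$image"; then
      echo "$image"
      status=1
    fi
  done
  return "$status"
}
```

One way to feed it is `docker images --format '{{.Repository}}:{{.Tag}}' | missing_images`, which prints nothing when all seven images are present.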

### Validate Microservices

1. TGI Service
@@ -120,31 +173,143 @@ docker compose up -d
  -H 'Content-Type: application/json'
```

3. MegaService
3. Whisper Microservice

```bash
curl http://${host_ip}:7066/v1/asr \
  -X POST \
  -d '{"audio":"UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
  -H 'Content-Type: application/json'
```

Expected output:

```bash
{"asr_result":"you"}
```
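
The `audio` field in the request above carries base64-encoded audio data. As a quick sanity check, the sketch below decodes the sample string and inspects its header bytes. It assumes a `base64` command that accepts `-d` (GNU coreutils and recent macOS versions).

```bash
sample='UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA'

# Bytes 0-3 hold the container tag and bytes 8-11 the format tag.
container=$(printf '%s' "$sample" | base64 -d | head -c 4)
format=$(printf '%s' "$sample" | base64 -d | tail -c +9 | head -c 4)

echo "container=$container format=$format"
# prints: container=RIFF format=WAVE
```

The tags show that the sample is a very short WAV file, which is why the same string is reused in the other audio examples.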

4. Audio2Text Microservice

```bash
curl http://${host_ip}:9099/v1/audio/transcriptions \
  -X POST \
  -d '{"byte_str":"UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
  -H 'Content-Type: application/json'
```

Expected output:

```bash
{"downstream_black_list":[],"id":"--> this will be different id number for each run <--","query":"you"}
```

5. Multimedia to text Microservice

```bash
curl http://${host_ip}:7079/v1/multimedia2text \
  -X POST \
  -d '{"audio":"UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
  -H 'Content-Type: application/json'
```

Expected output:

```bash
{"downstream_black_list":[],"id":"--> this will be different id number for each run <--","query":"you"}
```

6. MegaService

Text:

```bash
curl -X POST http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: application/json" \
  -d '{"type": "text", "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'

# Use English mode (default).
curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=text" \
  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
  -F "max_tokens=32" \
  -F "language=en" \
  -F "stream=false"
  -F "stream=true"

# Use Chinese mode.
curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=text" \
  -F "messages=2024年9月26日,北京——今日,英特尔正式发布英特尔® 至强® 6性能核处理器(代号Granite Rapids),为AI、数据分析、科学计算等计算密集型业务提供卓越性能。" \
  -F "max_tokens=32" \
  -F "language=zh" \
  -F "stream=true"

# Upload file
curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=text" \
  -F "messages=" \
  -F "files=@/path to your file (.txt, .docx, .pdf)" \
  -F "max_tokens=32" \
  -F "language=en" \
  -F "stream=true"
```

Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service.
> Audio and video file uploads are not supported in DocSum via curl requests; please use the Gradio UI.

Audio:

```bash
curl -X POST http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: application/json" \
  -d '{"type": "audio", "messages": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}'

curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=audio" \
  -F "messages=UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA" \
  -F "max_tokens=32" \
  -F "language=en" \
  -F "stream=true"
```

Video:

```bash
curl -X POST http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: application/json" \
  -d '{"type": "video", "messages": "convert your video to base64 data type"}'

curl http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=video" \
  -F "messages=convert your video to base64 data type" \
  -F "max_tokens=32" \
  -F "language=en" \
  -F "stream=true"
```
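
The `messages` placeholder in the video example stands for the base64 encoding of the media file. The function below is a minimal sketch of building the JSON body from a file; `docsum_payload` is a hypothetical helper name, and the field names are the ones used in the JSON requests above.

```bash
# Hypothetical helper: emit the JSON body for /v1/docsum from a media file.
# $1 = type ("audio" or "video"), $2 = path to the file
docsum_payload() {
  # Strip newlines because some base64 implementations wrap their output.
  encoded=$(base64 < "$2" | tr -d '\n')
  printf '{"type": "%s", "messages": "%s"}' "$1" "$encoded"
}
```

The body can then be streamed to curl on standard input, for example `docsum_payload video clip.mp4 | curl -X POST http://${host_ip}:8888/v1/docsum -H "Content-Type: application/json" -d @-`, where `clip.mp4` is a placeholder file name.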

## 🚀 Launch the UI

Open this URL `http://{host_ip}:5173` in your browser to access the Svelte-based frontend.
Several UI options are provided. If you need to work with multimedia documents, .doc, or .pdf files, the Gradio UI is recommended.

Open this URL `http://{host_ip}:5174` in your browser to access the React-based frontend.
### Gradio UI

Open this URL `http://{host_ip}:5173` in your browser to access the Gradio-based frontend.

### Svelte UI

Open this URL `http://{host_ip}:5173` in your browser to access the Svelte-based frontend.

### React UI (Optional)

Open this URL `http://{host_ip}:5174` in your browser to access the React-based frontend.

To access the React-based frontend, modify the UI service in the `compose.yaml` file. Replace the `docsum-xeon-ui-server` service with the `docsum-xeon-react-ui-server` service as per the config below:

```yaml
@@ -17,7 +17,8 @@ services:
      - "./data:/data"
    shm_size: 1g
    command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0
  llm:

  llm-docsum-tgi:
    image: ${REGISTRY:-opea}/llm-docsum-tgi:${TAG:-latest}
    container_name: llm-docsum-server
    depends_on:
@@ -32,12 +33,56 @@ services:
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped

  whisper:
    image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
    container_name: whisper-service
    ports:
      - "7066:7066"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped

  dataprep-audio2text:
    image: ${REGISTRY:-opea}/dataprep-audio2text:${TAG:-latest}
    container_name: dataprep-audio2text-service
    ports:
      - "9099:9099"
    ipc: host
    environment:
      A2T_ENDPOINT: ${A2T_ENDPOINT}

  dataprep-video2audio:
    image: ${REGISTRY:-opea}/dataprep-video2audio:${TAG:-latest}
    container_name: dataprep-video2audio-service
    ports:
      - "7078:7078"
    ipc: host
    environment:
      V2A_ENDPOINT: ${V2A_ENDPOINT}

  dataprep-multimedia2text:
    image: ${REGISTRY:-opea}/dataprep-multimedia2text:${TAG:-latest}
    container_name: dataprep-multimedia2text
    ports:
      - "7079:7079"
    ipc: host
    environment:
      V2A_ENDPOINT: ${V2A_ENDPOINT}
      A2T_ENDPOINT: ${A2T_ENDPOINT}

  docsum-xeon-backend-server:
    image: ${REGISTRY:-opea}/docsum:${TAG:-latest}
    container_name: docsum-xeon-backend-server
    depends_on:
      - tgi-service
      - llm
      - llm-docsum-tgi
      - dataprep-multimedia2text
      - dataprep-video2audio
      - dataprep-audio2text
    ports:
      - "8888:8888"
    environment:
@@ -45,10 +90,12 @@ services:
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
      - DATA_SERVICE_HOST_IP=${DATA_SERVICE_HOST_IP}
      - LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP}
    ipc: host
    restart: always
  docsum-xeon-ui-server:

  docsum-ui:
    image: ${REGISTRY:-opea}/docsum-ui:${TAG:-latest}
    container_name: docsum-xeon-ui-server
    depends_on:
@@ -59,6 +106,7 @@ services:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - BACKEND_SERVICE_ENDPOINT=${BACKEND_SERVICE_ENDPOINT}
      - DOC_BASE_URL=${BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always