Set no wrapper ChatQnA as default (#891)

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lvliang-intel
2024-10-11 13:30:45 +08:00
committed by GitHub
parent b71a12d424
commit 619d941047
66 changed files with 649 additions and 4796 deletions


@@ -97,61 +97,20 @@ After launching your instance, you can connect to it using SSH (for Linux instan
First of all, you need to build the Docker images locally and install their Python package.
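As a sketch of the package install, assuming GenAIComps ships a standard pip-installable layout (the editable-install flag is an assumption, not from the original steps):
```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
pip install -e .  # assumption: standard setup.py/pyproject layout
```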
### 1. Build Embedding Image
```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build --no-cache -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/tei/langchain/Dockerfile .
```
### 2. Build Retriever Image
### 1. Build Retriever Image
```bash
docker build --no-cache -t opea/retriever-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/redis/langchain/Dockerfile .
```
### 3. Build Rerank Image
> Skip for ChatQnA without Rerank pipeline
```bash
docker build --no-cache -t opea/reranking-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/tei/Dockerfile .
```
### 4. Build LLM Image
#### Use TGI as backend
```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
#### Use vLLM as backend
Build the vLLM Docker image.
```bash
git clone https://github.com/vllm-project/vllm.git
cd ./vllm/
docker build --no-cache -t opea/vllm:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.cpu .
cd ..
```
Build the microservice image.
```bash
docker build --no-cache -t opea/llm-vllm:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/vllm/langchain/Dockerfile .
```
### 5. Build Dataprep Image
### 2. Build Dataprep Image
```bash
docker build --no-cache -t opea/dataprep-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain/Dockerfile .
cd ..
```
### 6. Build MegaService Docker Image
### 3. Build MegaService Docker Image
1. MegaService with Rerank
@@ -173,7 +132,7 @@ cd ..
docker build --no-cache -t opea/chatqna-without-rerank:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.without_rerank .
```
### 7. Build UI Docker Image
### 4. Build UI Docker Image
Build the frontend Docker image via the command below:
@@ -182,7 +141,7 @@ cd GenAIExamples/ChatQnA/ui
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
### 8. Build Conversational React UI Docker Image (Optional)
### 5. Build Conversational React UI Docker Image (Optional)
Build the frontend Docker image that enables a conversational experience with the ChatQnA megaservice via the command below:
@@ -193,23 +152,20 @@ cd GenAIExamples/ChatQnA/ui
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
```
### 9. Build Nginx Docker Image
### 6. Build Nginx Docker Image
```bash
cd GenAIComps
docker build -t opea/nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/nginx/Dockerfile .
```
Then run the command `docker images`; you should see the following 8 Docker images:
Then run the command `docker images`; you should see the following 5 Docker images (a quick filter is sketched after the list):
1. `opea/dataprep-redis:latest`
2. `opea/embedding-tei:latest`
3. `opea/retriever-redis:latest`
4. `opea/reranking-tei:latest`
5. `opea/llm-tgi:latest` or `opea/llm-vllm:latest`
6. `opea/chatqna:latest` or `opea/chatqna-without-rerank:latest`
7. `opea/chatqna-ui:latest`
8. `opea/nginx:latest`
2. `opea/retriever-redis:latest`
3. `opea/chatqna:latest` or `opea/chatqna-without-rerank:latest`
4. `opea/chatqna-ui:latest`
5. `opea/nginx:latest`
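To confirm the builds succeeded, you can filter the local image list for the OPEA images; this check is illustrative, not part of the original steps:
```bash
# List only locally built OPEA images with their tags and sizes.
docker images | grep '^opea/'
```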
## 🚀 Start Microservices
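As a minimal sketch of this step, assuming a `compose.yaml` for your target hardware in the current directory and the usual environment variables (names here are assumptions, adjust to your deployment):
```bash
# Sketch only: variable names and compose file name are assumptions.
export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN="your_hf_token"  # needed for gated models
docker compose -f compose.yaml up -d
```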
@@ -315,16 +271,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
-H 'Content-Type: application/json'
```
2. Embedding Microservice
```bash
curl http://${host_ip}:6000/v1/embeddings \
-X POST \
-d '{"text":"hello"}' \
-H 'Content-Type: application/json'
```
3. Retriever Microservice
2. Retriever Microservice
To consume the retriever microservice, you need to generate a mock embedding vector with a Python script; the length of the embedding vector
is determined by the embedding model in use.
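As a sketch, a 768-dimensional mock vector can be generated like this; the dimension and the variable name are illustrative, so match the length to your embedding model:
```bash
# Illustrative only: build a 768-dim mock embedding (768 fits bge-base-style models).
export your_embedding=$(python3 -c "import random; print([random.uniform(-1, 1) for _ in range(768)])")
```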
@@ -340,7 +287,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
-H 'Content-Type: application/json'
```
4. TEI Reranking Service
3. TEI Reranking Service
> Skip for ChatQnA without Rerank pipeline
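As a sketch against TEI's `/rerank` route, where the port is an assumption taken from typical compose files:
```bash
# Sketch: TEI reranking smoke test; adjust the port (8808 here) to your compose file.
curl http://${host_ip}:8808/rerank \
  -X POST \
  -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
  -H 'Content-Type: application/json'
```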
@@ -351,18 +298,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
-H 'Content-Type: application/json'
```
5. Reranking Microservice
> Skip for ChatQnA without Rerank pipeline
```bash
curl http://${host_ip}:8000/v1/reranking \
-X POST \
-d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \
-H 'Content-Type: application/json'
```
6. LLM backend Service
4. LLM backend Service
On first startup, this service takes extra time to download the model files; once the download completes, the service will be ready.
@@ -395,31 +331,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
-d '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_tokens": 32, "temperature": 0}'
```
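For the TGI backend, a comparable smoke test can target its native `/generate` route; the port here is a guess, so match it to your compose file:
```bash
# Sketch: TGI's /generate endpoint; 9009 is an assumed mapped port, verify with `docker ps`.
curl http://${host_ip}:9009/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17}}' \
  -H 'Content-Type: application/json'
```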
7. LLM Microservice
This service depends on the LLM backend service above. On first startup it can take a long time to become ready, so wait for the backend to finish loading the model.
```bash
# TGI service
curl http://${host_ip}:9000/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
-H 'Content-Type: application/json'
```
For parameters in TGI mode, please refer to the [HuggingFace InferenceClient API](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.text_generation) (note that we rename `max_new_tokens` to `max_tokens`).
```bash
# vLLM Service
curl http://${host_ip}:9000/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_p":1,"temperature":0.7,"frequency_penalty":0,"presence_penalty":0, "streaming":false}' \
-H 'Content-Type: application/json'
```
For parameters in vLLM mode, refer to the [LangChain VLLMOpenAI API](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.vllm.VLLMOpenAI.html).
8. MegaService
5. MegaService
```bash
curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
@@ -427,7 +339,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
}'
```
9. Nginx Service
6. Nginx Service
```bash
curl http://${host_ip}:${NGINX_PORT}/v1/chatqna \
@@ -435,7 +347,7 @@ For details on how to verify the correctness of the response, refer to [how-to-v
-d '{"messages": "What is the revenue of Nike in 2023?"}'
```
10. Dataprep Microservice (Optional)
7. Dataprep Microservice (Optional)
If you want to update the default knowledge base, you can use the following commands:
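As a sketch, ingesting a local file through the dataprep microservice typically looks like the following, where the port, route, and file name are assumptions based on typical ChatQnA deployments:
```bash
# Sketch: upload a document to the dataprep service; port 6007 and the file are illustrative.
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_document.pdf"
```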