From afc3341156aed7f6aeab3a61e2ba889bb2fbda76 Mon Sep 17 00:00:00 2001
From: Letong Han <106566639+letonghan@users.noreply.github.com>
Date: Tue, 3 Sep 2024 15:06:50 +0800
Subject: [PATCH] Refine ChatQnA README for TGI (#715)

* update chatqna readme for tgi

Signed-off-by: letonghan <letong.han@intel.com>

* update log block

Signed-off-by: letonghan <letong.han@intel.com>

---------

Signed-off-by: letonghan <letong.han@intel.com>
---
 ChatQnA/README.md                    | 13 +++++++++++++
 ChatQnA/docker/gaudi/README.md       | 16 ++++++++++++++++
 ChatQnA/docker/gpu/README.md         | 16 ++++++++++++++++
 ChatQnA/docker/xeon/README.md        | 16 ++++++++++++++--
 ChatQnA/docker/xeon/README_qdrant.md | 16 ++++++++++++++++
 5 files changed, 75 insertions(+), 2 deletions(-)
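
Note: the readiness check this patch adds (`docker logs ... | grep Connected`) is a one-shot command, so readers have to rerun it by hand until the line appears. A minimal sketch of a polling wrapper follows, assuming the container keeps logging `Connected` as in the sample output. The function name `wait_for_tgi`, its arguments, and the 120-second default (taken from the "up to 2 minutes" remark) are hypothetical and not part of ChatQnA.

```bash
#!/usr/bin/env bash
# Sketch only: poll a container's logs until the TGI router reports "Connected".
# wait_for_tgi and its defaults are illustrative assumptions, not ChatQnA tooling.
wait_for_tgi() {
  local container="$1"        # container name or ID, e.g. tgi-service
  local timeout="${2:-120}"   # seconds to wait; assumed from "up to 2 minutes"
  local interval="${3:-5}"    # seconds between polls; must be at least 1
  local waited=0

  while [ "$waited" -lt "$timeout" ]; do
    # Merge stderr into stdout so the match works whichever stream is used.
    if docker logs "$container" 2>&1 | grep "Connected" >/dev/null; then
      echo "TGI service is ready"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done

  echo "Timed out after ${timeout}s waiting for ${container}" >&2
  return 1
}
```

Possible usage: `wait_for_tgi tgi-service && curl http://${host_ip}:8888/v1/chatqna ...`, so the request is sent only after the readiness line has been seen. A non-zero return lets a calling script stop instead of sending requests to a service that is still loading.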

diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 0d4798810..5d3f93e8f 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com
 
 ## Consume ChatQnA Service
 
+Before consuming the ChatQnA service, make sure the TGI/vLLM service is ready (startup can take up to 2 minutes).
+
+```bash
+# TGI example
+docker logs tgi-service | grep Connected
+```
+
+Wait until the command prints a line like the one below before consuming the ChatQnA service.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
 Two ways of consuming ChatQnA Service:
 
 1. Use cURL command on terminal
diff --git a/ChatQnA/docker/gaudi/README.md b/ChatQnA/docker/gaudi/README.md
index 053484f77..717988c6b 100644
--- a/ChatQnA/docker/gaudi/README.md
+++ b/ChatQnA/docker/gaudi/README.md
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. LLM backend Service
 
+On first startup, this service takes longer because it must download the model files. The service is ready once the download finishes.
+
+Run the command below to check whether the LLM service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate the service.
+
 ```bash
 #TGI Service
 curl http://${host_ip}:8005/generate \
diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md
index 48c287fb5..f559230b6 100644
--- a/ChatQnA/docker/gpu/README.md
+++ b/ChatQnA/docker/gpu/README.md
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. TGI Service
 
+On first startup, this service takes longer because it must download the model files. The service is ready once the download finishes.
+
+Run the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:8008/generate \
   -X POST \
diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md
index dc8735928..675e74cea 100644
--- a/ChatQnA/docker/xeon/README.md
+++ b/ChatQnA/docker/xeon/README.md
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\
 
 6. LLM backend Service
 
-In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
+On first startup, this service takes longer because it must download the model files. The service is ready once the download finishes.
 
-Use `docker logs CONTAINER_ID` to check if the download is finished.
+Run the command below to check whether the LLM service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate the service.
 
 ```bash
 # TGI service
diff --git a/ChatQnA/docker/xeon/README_qdrant.md b/ChatQnA/docker/xeon/README_qdrant.md
index f103d5a73..a03b563b2 100644
--- a/ChatQnA/docker/xeon/README_qdrant.md
+++ b/ChatQnA/docker/xeon/README_qdrant.md
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\
 
 6. TGI Service
 
+On first startup, this service takes longer because it must download the model files. The service is ready once the download finishes.
+
+Run the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:6042/generate \
   -X POST \