Refine readme of CodeGen (#1797)

Signed-off-by: Yao, Qing <qing.yao@intel.com>
2025-04-21 17:49:15 +08:00
parent 608dc963c9
commit 262ad7d6ec
8 changed files with 1062 additions and 1243 deletions
--- a/CodeGen/kubernetes/gmc/README.md
+++ b/CodeGen/kubernetes/gmc/README.md
@@ -1,40 +1,130 @@
-# Deploy CodeGen in a Kubernetes Cluster
+# Deploy CodeGen using Kubernetes Microservices Connector (GMC)

-This document outlines the deployment process for a Code Generation (CodeGen) application that utilizes the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice components on Intel Xeon servers and Gaudi machines.
+This document guides you through deploying the CodeGen application on a Kubernetes cluster using the OPEA Microservices Connector (GMC).

-Install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in Section "Getting Started" at [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector/README.md). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying install.
+## Table of Contents

-If you have only Intel Xeon machines you could use the codegen_xeon.yaml file or if you have a Gaudi cluster you could use codegen_gaudi.yaml
-In the below example we illustrate on Xeon.
+- [Purpose](#purpose)
+- [Prerequisites](#prerequisites)
+- [Deployment Steps](#deployment-steps)
+  - [1. Choose Configuration](#1-choose-configuration)
+  - [2. Prepare Namespace and Deploy](#2-prepare-namespace-and-deploy)
+  - [3. Verify Pod Status](#3-verify-pod-status)
+- [Validation Steps](#validation-steps)
+  - [1. Deploy Test Client](#1-deploy-test-client)
+  - [2. Get Service URL](#2-get-service-url)
+  - [3. Send Test Request](#3-send-test-request)
+- [Cleanup](#cleanup)

-## Deploy the RAG application
+## Purpose

-1. Create the desired namespace if it does not already exist and deploy the application
-   ```bash
-   export APP_NAMESPACE=CT
-   kubectl create ns $APP_NAMESPACE
-   sed -i "s|namespace: codegen|namespace: $APP_NAMESPACE|g"  ./codegen_xeon.yaml
-   kubectl apply -f ./codegen_xeon.yaml
-   ```
+To deploy the multi-component CodeGen application on Kubernetes, leveraging GMC to manage the connections and routing between microservices based on a declarative configuration file.

-2. Check if the application is up and ready
-   ```bash
-   kubectl get pods -n $APP_NAMESPACE
-   ```
+## Prerequisites

-3. Deploy a client pod for testing
-   ```bash
-   kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity
-   ```
+- A running Kubernetes cluster.
+- `kubectl` installed and configured to interact with your cluster.
+- [GenAI Microservice Connector (GMC)](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector/README.md) installed in your cluster. Follow the GMC installation guide if you haven't already.
+- Access to the container images specified in the GMC configuration files (`codegen_xeon.yaml` or `codegen_gaudi.yaml`). These may be on Docker Hub or a private registry.

-4. Check that client pod is ready
-   ```bash
-   kubectl get pods -n $APP_NAMESPACE
-   ```
+## Deployment Steps

-5. Send request to application
-   ```bash
-   export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
-   export accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='codegen')].status.accessUrl}")
-   kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl -s --no-buffer $accessUrl -X POST -d '{"query": "def print_hello_world():"}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_codegen.log
-   ```
+### 1. Choose Configuration
+
+Two GMC configuration files are provided based on the target hardware for the LLM serving component:
+
+-   `codegen_xeon.yaml`: Deploys CodeGen using CPU-optimized components (suitable for Intel Xeon clusters).
+-   `codegen_gaudi.yaml`: Deploys CodeGen using Gaudi-optimized LLM serving components (suitable for clusters with Intel Gaudi nodes).
+
+Select the file appropriate for your cluster hardware. The following steps use `codegen_xeon.yaml` as an example.
+
+### 2. Prepare Namespace and Deploy
+
+Choose a namespace for the deployment (e.g., `codegen-app`).
+
+```bash
+# Set the desired namespace
+export APP_NAMESPACE=codegen-app
+
+# Create the namespace if it doesn't exist
+kubectl create ns $APP_NAMESPACE || true
+
+# (Optional) Update the namespace within the chosen YAML file if it's not parameterized
+# sed -i "s|namespace: codegen|namespace: $APP_NAMESPACE|g" ./codegen_xeon.yaml
+
+# Apply the GMC configuration file to the chosen namespace
+kubectl apply -f ./codegen_xeon.yaml -n $APP_NAMESPACE
+```
+*Note: If the YAML file uses a hardcoded namespace, ensure you either modify the file or deploy to that specific namespace.*
+
+### 3. Verify Pod Status
+
+Check that all the pods defined in the GMC configuration are successfully created and running.
+
+```bash
+kubectl get pods -n $APP_NAMESPACE
+```
+Wait until all pods are in the `Running` state. This might take some time for image pulling and initialization.
+
+## Validation Steps
+
+### 1. Deploy Test Client
+
+Deploy a simple pod within the same namespace to use as a client for sending requests.
+
+```bash
+kubectl create deployment client-test -n $APP_NAMESPACE --image=curlimages/curl -- sleep infinity
+```
+Verify the client pod is running:
+```bash
+kubectl get pods -n $APP_NAMESPACE -l app=client-test
+```
+
+### 2. Get Service URL
+
+Retrieve the access URL exposed by the GMC for the CodeGen application.
+
+```bash
+# Get the client pod name
+export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath='{.items[0].metadata.name}')
+
+# Get the access URL provided by the 'codegen' GMC resource
+# Adjust 'codegen' if the metadata.name in your YAML is different
+export ACCESS_URL=$(kubectl get gmc codegen -n $APP_NAMESPACE -o jsonpath='{.status.accessUrl}')
+
+# Display the URL (optional)
+echo "Access URL: $ACCESS_URL"
+```
+*Note: The `accessUrl` typically points to the internal Kubernetes service endpoint for the gateway service defined in the GMC configuration.*
+
+### 3. Send Test Request
+
+Use the test client pod to send a `curl` request to the CodeGen service endpoint.
+
+```bash
+# Define the payload
+PAYLOAD='{"messages": "def print_hello_world():"}'
+
+# Execute curl from the client pod
+kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl -s --no-buffer "$ACCESS_URL" \
+  -X POST \
+  -d "$PAYLOAD" \
+  -H 'Content-Type: application/json'
+```
+
+**Expected Output:** A stream of JSON data containing the generated code, similar to the Docker Compose validation, ending with a `[DONE]` marker if streaming is enabled.
+
+## Cleanup
+
+To remove the deployed application and the test client:
+
+```bash
+# Delete the GMC deployment
+kubectl delete -f ./codegen_xeon.yaml -n $APP_NAMESPACE
+
+# Delete the test client deployment
+kubectl delete deployment client-test -n $APP_NAMESPACE
+
+# Optionally delete the namespace if no longer needed
+# kubectl delete ns $APP_NAMESPACE
+```