Add helm deployment instructions for codegen (#1351)

Signed-off-by: Dolpher Du <dolpher.du@intel.com>
2025-01-08 13:20:32 +08:00
parent 23117871c2
commit 5638075d65
15 changed files with 73 additions and 1482 deletions
--- a/CodeGen/kubernetes/gmc/README.md
+++ b/CodeGen/kubernetes/gmc/README.md
@@ -0,0 +1,40 @@
+# Deploy CodeGen in a Kubernetes Cluster
+
+This document outlines the deployment process for a Code Generation (CodeGen) application that utilizes the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice components on Intel Xeon servers and Gaudi machines.
+
+If GMC is not yet installed in your Kubernetes cluster, install it by following the "Getting Started" section of [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector/README.md). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying installation.
+
+If you have only Intel Xeon machines, use the `codegen_xeon.yaml` file; if you have a Gaudi cluster, use `codegen_gaudi.yaml`.
+The example below uses Xeon.
+
+## Deploy the CodeGen application
+
+1. Create the desired namespace if it does not already exist and deploy the application
+   ```bash
+   export APP_NAMESPACE=ct  # namespace names must be lowercase (RFC 1123 label)
+   kubectl create ns $APP_NAMESPACE
+   sed -i "s|namespace: codegen|namespace: $APP_NAMESPACE|g"  ./codegen_xeon.yaml
+   kubectl apply -f ./codegen_xeon.yaml
+   ```
+
+2. Check if the application is up and ready
+   ```bash
+   kubectl get pods -n $APP_NAMESPACE
+   ```
+
+3. Deploy a client pod for testing
+   ```bash
+   kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity
+   ```
+
+4. Check that the client pod is ready
+   ```bash
+   kubectl get pods -n $APP_NAMESPACE
+   ```
+
+5. Send a request to the application
+   ```bash
+   export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
+   export accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='codegen')].status.accessUrl}")
+   kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl -s --no-buffer "$accessUrl" -X POST -d '{"query": "def print_hello_world():"}' -H 'Content-Type: application/json' > "${LOG_PATH:-.}/gmc_codegen.log"
+   ```
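Note on step 1 of the README above: `sed -i` rewrites `codegen_xeon.yaml` in place, so a second run with a different namespace no longer finds `namespace: codegen` to replace. The sketch below is one possible alternative, not part of this commit: it validates the namespace name and renders the manifest to stdout instead. The function names `is_valid_namespace` and `render_manifest` are hypothetical.

```bash
# Hypothetical helpers; the names are illustrative and not part of GMC.

# Succeeds only if $1 is a valid RFC 1123 label: lowercase ASCII letters,
# digits and '-', not starting or ending with '-', at most 63 characters.
# Runs in a subshell so the LC_ALL override does not leak to the caller.
is_valid_namespace() (
  LC_ALL=C
  ns=$1
  [ -n "$ns" ] && [ "${#ns}" -le 63 ] || exit 1
  case $ns in
    -* | *- | *[!a-z0-9-]*) exit 1 ;;
  esac
)

# Prints the manifest at $1 with "namespace: codegen" rewritten to
# "namespace: $2". The file on disk is left untouched.
render_manifest() {
  is_valid_namespace "$2" || { echo "invalid namespace: $2" >&2; return 1; }
  sed "s|namespace: codegen|namespace: $2|g" "$1"
}

# Possible usage (requires a cluster, so it is left commented out):
#   render_manifest ./codegen_xeon.yaml "$APP_NAMESPACE" | kubectl apply -f -
```

Because a valid namespace name cannot contain `|`, `&` or `\`, the validation also keeps the `sed` replacement string safe.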
--- a/CodeGen/kubernetes/gmc/codegen_gaudi.yaml
+++ b/CodeGen/kubernetes/gmc/codegen_gaudi.yaml
@@ -0,0 +1,34 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: gmc.opea.io/v1alpha3
+kind: GMConnector
+metadata:
+  labels:
+    app.kubernetes.io/name: gmconnector
+    app.kubernetes.io/managed-by: kustomize
+    gmc/platform: gaudi
+  name: codegen
+  namespace: codegen
+spec:
+  routerConfig:
+    name: router
+    serviceName: router-service
+  nodes:
+    root:
+      routerType: Sequence
+      steps:
+      - name: Llm
+        data: $response
+        internalService:
+          serviceName: llm-service
+          config:
+            endpoint: /v1/chat/completions
+            TGI_LLM_ENDPOINT: tgi-gaudi-svc
+      - name: TgiGaudi
+        internalService:
+          serviceName: tgi-gaudi-svc
+          config:
+            MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
+            endpoint: /generate
+          isDownstreamService: true
--- a/CodeGen/kubernetes/gmc/codegen_xeon.yaml
+++ b/CodeGen/kubernetes/gmc/codegen_xeon.yaml
@@ -0,0 +1,34 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: gmc.opea.io/v1alpha3
+kind: GMConnector
+metadata:
+  labels:
+    app.kubernetes.io/name: gmconnector
+    app.kubernetes.io/managed-by: kustomize
+    gmc/platform: xeon
+  name: codegen
+  namespace: codegen
+spec:
+  routerConfig:
+    name: router
+    serviceName: router-service
+  nodes:
+    root:
+      routerType: Sequence
+      steps:
+      - name: Llm
+        data: $response
+        internalService:
+          serviceName: llm-service
+          config:
+            endpoint: /v1/chat/completions
+            TGI_LLM_ENDPOINT: tgi-service
+      - name: Tgi
+        internalService:
+          serviceName: tgi-service
+          config:
+            MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
+            endpoint: /generate
+          isDownstreamService: true
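In both manifests above, the `TGI_LLM_ENDPOINT` value of the `Llm` step equals the `serviceName` of the downstream TGI step (`tgi-service` on Xeon, `tgi-gaudi-svc` on Gaudi). Assuming that pairing is intentional, a minimal sketch of a pre-deployment check could look like the following. The function name `check_tgi_wiring` is hypothetical, and the text matching assumes the simple one-key-per-line layout used in these files rather than general YAML.

```bash
# Hypothetical helper; not part of GMC.
# Succeeds if the first TGI_LLM_ENDPOINT value in the manifest at $1
# also appears as the value of some serviceName key in the same file.
check_tgi_wiring() {
  endpoint=$(sed -n 's/^ *TGI_LLM_ENDPOINT: *//p' "$1" | head -n 1)
  [ -n "$endpoint" ] || return 1
  # Extract every serviceName value and look for an exact whole-line match.
  sed -n 's/^ *serviceName: *//p' "$1" | grep -Fxq -- "$endpoint"
}

# Possible usage:
#   check_tgi_wiring ./codegen_gaudi.yaml || echo "TGI endpoint does not match any service" >&2
```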