Yao Qing 262ad7d6ec Refine readme of CodeGen (#1797)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
2025-04-21 17:49:15 +08:00

# Deploy CodeGen on Kubernetes using Helm
This guide explains how to deploy the CodeGen application to a Kubernetes cluster using the official OPEA Helm chart.
## Table of Contents
- [Purpose](#purpose)
- [Prerequisites](#prerequisites)
- [Deployment Steps](#deployment-steps)
  - [1. Set Hugging Face Token](#1-set-hugging-face-token)
  - [2. Choose Hardware Configuration](#2-choose-hardware-configuration)
  - [3. Install Helm Chart](#3-install-helm-chart)
- [Verify Deployment](#verify-deployment)
- [Accessing the Service](#accessing-the-service)
- [Customization](#customization)
- [Uninstalling the Chart](#uninstalling-the-chart)
## Purpose
To provide a standardized and configurable method for deploying the CodeGen application and its microservice dependencies onto Kubernetes using Helm.
## Prerequisites
- A running Kubernetes cluster.
- `kubectl` installed and configured to interact with your cluster.
- Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) if needed.
- Network access from your cluster nodes to pull the chart and container images (for example from `ghcr.io/opea-project`) and to download models from Hugging Face.
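The Helm version requirement can be checked before proceeding. The following is a minimal sketch, assuming `helm version --short` prints a string such as `v3.15.2+g1a500d5`; the function name `helm_version_ok` is illustrative and not part of Helm or the chart.

```bash
# Succeed only if the installed Helm is version 3.15 or newer.
# Hypothetical helper; assumes output of the form v<major>.<minor>.<patch>[+build].
helm_version_ok() {
  local ver major minor
  ver=$(helm version --short 2>/dev/null) || return 1
  ver=${ver#v}                 # strip the leading "v"
  major=${ver%%.*}             # text before the first dot
  minor=${ver#*.}              # text after the first dot...
  minor=${minor%%.*}           # ...up to the second dot
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 15 ]; }
}
```

For example, `helm_version_ok || echo "Helm 3.15 or newer is required"` prints a warning when the check fails.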
## Deployment Steps
### 1. Set Hugging Face Token
The chart requires your Hugging Face Hub API token to download models. Set it as an environment variable:
```bash
export HFTOKEN="your-huggingface-api-token-here"
```
Replace `your-huggingface-api-token-here` with your actual token.
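An empty token only surfaces later, when the model download fails inside the cluster, so a script may want to check the variable up front. This is a small sketch; the function name `require_hf_token` is an illustrative assumption, not something the chart provides.

```bash
# Fail early if HFTOKEN is unset or empty.
# Hypothetical helper for use in a deployment script.
require_hf_token() {
  if [ -z "${HFTOKEN:-}" ]; then
    echo "HFTOKEN is not set; export it before running helm install" >&2
    return 1
  fi
}
```

Calling `require_hf_token || exit 1` at the top of a script stops it before any Helm command runs.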
### 2. Choose Hardware Configuration
The CodeGen Helm chart supports different hardware configurations using values files:
- **Intel Xeon CPU:** Use `cpu-values.yaml` (located within the chart structure, or provide your own). This is suitable for general Kubernetes clusters without specific accelerators.
- **Intel Gaudi HPU:** Use `gaudi-values.yaml` (located within the chart structure, or provide your own). This requires nodes with Gaudi devices and the appropriate Kubernetes device plugins configured.
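In a deployment script, this choice can be reduced to a small lookup. The sketch below maps a hardware label to the matching values file name; the function name and the accepted labels (`xeon`, `cpu`, `gaudi`, `hpu`) are illustrative assumptions, while the file names are the ones listed above.

```bash
# Print the values file name for a hardware target.
# Hypothetical helper; labels are illustrative, not defined by the chart.
values_file_for() {
  case "$1" in
    xeon|cpu)  echo "cpu-values.yaml" ;;
    gaudi|hpu) echo "gaudi-values.yaml" ;;
    *)         echo "unknown hardware target: $1" >&2; return 1 ;;
  esac
}
```

The result can then be passed to Helm, for example `-f "$(values_file_for gaudi)"`.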
### 3. Install Helm Chart
Install the CodeGen chart from the OPEA OCI registry, providing your Hugging Face token and selecting the appropriate values file.
**Deploy on Xeon (CPU):**
```bash
helm install codegen oci://ghcr.io/opea-project/charts/codegen \
  --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" \
  -f cpu-values.yaml \
  --namespace codegen --create-namespace
```
*Note: `-f cpu-values.yaml` expects a file named `cpu-values.yaml` in the current directory (Helm's `-f` flag takes a local path or a URL). When installing from the OCI registry, download the values file first, or set the parameters directly with `--set`.*
**Deploy on Gaudi (HPU):**
```bash
helm install codegen oci://ghcr.io/opea-project/charts/codegen \
  --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" \
  -f gaudi-values.yaml \
  --namespace codegen --create-namespace
```
*Note: The same caveat applies to `-f gaudi-values.yaml`. Ensure your cluster meets the Gaudi prerequisites.*
*Both commands install the chart into the `codegen` namespace, creating it if necessary. Change `--namespace` if desired.*
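One way to obtain the values files locally is to pull and unpack the chart. The sketch below uses `helm pull` with its `--untar` and `--untardir` flags; the function name `fetch_codegen_chart` is illustrative, and the assumption that the hardware-specific values files sit at the top level of the unpacked chart should be checked against the actual chart contents.

```bash
# Download and unpack the CodeGen chart, then list the values files found in it.
# Hypothetical helper; the destination directory defaults to ./charts.
fetch_codegen_chart() {
  local dest="${1:-./charts}"
  helm pull oci://ghcr.io/opea-project/charts/codegen --untar --untardir "$dest" || return 1
  # Assumption: values files are located directly under <dest>/codegen.
  find "$dest/codegen" -maxdepth 1 -name '*values.yaml' | sort
}
```

If the listing includes the file you need, pass its path to `helm install` with `-f`.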
## Verify Deployment
Check the status of the pods created by the Helm release:
```bash
kubectl get pods -n codegen
```
Wait for all pods (e.g., codegen-gateway, codegen-llm, codegen-embedding, redis) to reach the `Running` state. If a pod fails to start or keeps restarting, check its logs:
```bash
kubectl logs -n codegen <pod-name>
```
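To spot stragglers without reading the full pod table, the status column can be filtered. This is a minimal sketch, assuming the default `kubectl get pods` column order (NAME, READY, STATUS, ...); the function name `pods_not_running` is illustrative.

```bash
# Print the names of pods whose STATUS is neither Running nor Completed.
# Hypothetical helper; the namespace defaults to codegen.
pods_not_running() {
  local ns="${1:-codegen}"
  kubectl get pods -n "$ns" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

An empty result means every pod has reached `Running` (or finished); any name printed is a candidate for `kubectl logs`.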
## Accessing the Service
The Helm chart exposes the CodeGen gateway through a Kubernetes `Service` of type `ClusterIP` or `LoadBalancer`, depending on the chart's values.
- **If `ClusterIP`:** Access is typically internal to the cluster or requires port-forwarding:
```bash
# Find the service name (e.g., codegen-gateway)
kubectl get svc -n codegen
# Forward local port 7778 to the service port (usually 7778)
kubectl port-forward svc/<service-name> -n codegen 7778:7778
# Access via curl on localhost:7778
curl http://localhost:7778/v1/codegen -H "Content-Type: application/json" -d '{"messages": "Test"}'
```
- **If `LoadBalancer`:** Obtain the external IP address assigned by your cloud provider:
```bash
kubectl get svc -n codegen <service-name> -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# Access using the external IP and service port (e.g., 7778)
export EXTERNAL_IP=$(kubectl get svc -n codegen <service-name> -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://${EXTERNAL_IP}:7778/v1/codegen -H "Content-Type: application/json" -d '{"messages": "Test"}'
```
Refer to the chart's documentation or `values.yaml` for specifics on service exposure. The UI service might also be exposed similarly (check for a UI-related service).
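Some providers report a hostname rather than an IP address for a load balancer, in which case the `.ip` field queried above is empty. The sketch below falls back to the `.hostname` field; the function name `lb_address` is illustrative, and the service name still has to be looked up with `kubectl get svc`.

```bash
# Print the external address of a LoadBalancer service, preferring the IP
# and falling back to the hostname. Returns non-zero if neither is set yet.
# Hypothetical helper; the namespace defaults to codegen.
lb_address() {
  local svc="$1" ns="${2:-codegen}" addr
  addr=$(kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$addr" ]; then
    addr=$(kubectl get svc "$svc" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  fi
  [ -n "$addr" ] && echo "$addr"
}
```

For example, `curl "http://$(lb_address <service-name>):7778/v1/codegen" ...` works with either kind of address.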
## Customization
You can customize the deployment by:
- Modifying the `cpu-values.yaml` or `gaudi-values.yaml` file before installation.
- Overriding parameters using the `--set` flag during `helm install`. Example:
```bash
# Add further --set overrides or -f <your-custom-values.yaml> as needed
helm install codegen oci://ghcr.io/opea-project/charts/codegen \
  --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" \
  --namespace codegen --create-namespace
```
See the [OPEA Helm Charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme) for details on the configuration options available in the charts.
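To see which parameters can be overridden, `helm show values` prints a chart's default values. The sketch below reduces that output to the top-level keys as a starting point; the function name `codegen_top_level_keys` is illustrative, and the filter assumes top-level keys start in the first column of the YAML.

```bash
# List the top-level keys of the chart's default values.
# Hypothetical helper; relies on unindented "key:" lines in the YAML output.
codegen_top_level_keys() {
  helm show values oci://ghcr.io/opea-project/charts/codegen |
    grep -E '^[A-Za-z0-9_-]+:' | cut -d: -f1
}
```

Each key printed corresponds to a section that can be targeted with `--set <key>.<param>=<value>` or in a custom values file.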
## Uninstalling the Chart
To remove the CodeGen deployment installed by Helm:
```bash
helm uninstall codegen -n codegen
```
Optionally, delete the namespace once it is empty and no longer needed (uncomment the command to run it):
```bash
# kubectl delete ns codegen
```
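Before deleting the namespace, it can help to confirm that nothing is left in it. This is a rough sketch: `kubectl get all,pvc` covers common workload resources and persistent volume claims but not every resource type (Secrets and ConfigMaps, for instance, are not included), and the function name `namespace_is_empty` is illustrative.

```bash
# Succeed if no workloads or PVCs remain in the namespace.
# Hypothetical helper; the namespace defaults to codegen.
namespace_is_empty() {
  local ns="${1:-codegen}" count
  count=$(kubectl get all,pvc -n "$ns" --no-headers 2>/dev/null | wc -l | tr -d ' ')
  [ "$count" -eq 0 ]
}
```

For example, `namespace_is_empty codegen && kubectl delete ns codegen` deletes the namespace only when the check passes.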