GenAIExamples/CodeGen/kubernetes/manifests
chen, suyue 7eb402e95b Revert hf_token setting (#226)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-05-30 23:12:03 +08:00

Deploy CodeGen in Kubernetes Cluster

[NOTE] The following value must be set before you can deploy: HUGGINGFACEHUB_API_TOKEN. You can also customize "MODEL_ID" and "model-volume" in the manifest.

Deploy On Xeon

cd GenAIExamples/CodeGen/kubernetes/manifests/xeon
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codegen.yaml
kubectl apply -f codegen.yaml

Deploy On Gaudi

cd GenAIExamples/CodeGen/kubernetes/manifests/gaudi
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codegen.yaml
kubectl apply -f codegen.yaml
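
Because `sed -i` edits codegen.yaml in place, the placeholder is gone after the first run, and repeating the command with a different token silently changes nothing. One way around this, shown here only as a sketch, is to render the token into a copy and apply the copy. The function name `render_manifest` and the output file name are hypothetical, and the sketch assumes the token contains no `|` or `&` characters.

```shell
# Hypothetical helper (not part of the repository): write a copy of the
# manifest with the token filled in, leaving the original file untouched.
render_manifest() {
    src="$1"    # original manifest, e.g. codegen.yaml
    dst="$2"    # rendered copy that will be applied
    token="$3"  # Hugging Face token
    placeholder="insert-your-huggingface-token-here"

    if [ -z "$token" ]; then
        echo "error: token is empty" >&2
        return 1
    fi
    # Refuse to continue if the placeholder is missing, for example because
    # an earlier in-place sed already consumed it.
    if ! grep -q "$placeholder" "$src"; then
        echo "error: placeholder not found in $src" >&2
        return 1
    fi
    # '|' is used as the sed delimiter so that a '/' in the value is harmless.
    sed "s|${placeholder}|${token}|g" "$src" > "$dst"
}
```

A possible invocation would be `render_manifest codegen.yaml codegen.rendered.yaml "${HUGGINGFACEHUB_API_TOKEN}" && kubectl apply -f codegen.rendered.yaml`.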

Verify Services

Make sure all the pods are in the Running state, and restart the codegen-xxxx pod if necessary.

kubectl get pods
curl http://codegen:6666/v1/codegen -H "Content-Type: application/json" -d '{
     "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
     }'
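
The host name `codegen` in the URL above is a cluster-internal name, so that curl call normally only works from a pod inside the cluster. The following sketch waits for the pods and then sends the same kind of request from outside the cluster through a port-forward. It assumes a Service named `codegen` exposing port 6666 in the current namespace (inferred from the URL, not confirmed by the manifests); the function name `check_codegen` is hypothetical.

```shell
# Sketch only: wait for readiness, then query the service via a local tunnel.
# Assumption: a Service called "codegen" listens on port 6666.
check_codegen() {
    payload="$1"  # JSON request body, e.g. '{"messages": "..."}'

    # Block until every pod in the namespace reports Ready (10 minute limit).
    kubectl wait --for=condition=Ready pod --all --timeout=600s || return 1

    # Forward local port 6666 to the service in the background.
    kubectl port-forward svc/codegen 6666:6666 >/dev/null 2>&1 &
    pf_pid=$!
    sleep 2  # give the tunnel a moment to come up

    curl -sS http://localhost:6666/v1/codegen \
        -H "Content-Type: application/json" \
        -d "$payload"
    status=$?

    # Tear down the tunnel and report curl's exit status.
    kill "$pf_pid" 2>/dev/null
    return "$status"
}
```

For example, `check_codegen '{"messages": "Write a function that reverses a string."}'` would send a short prompt once all pods are ready.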