Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates (#613)

* Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml and doc updates

Signed-off-by: dmsuehir <dina.s.jones@intel.com>
This commit is contained in:
Dina Suehiro Jones
2024-08-23 01:04:57 -07:00
committed by GitHub
parent 4f3be23efa
commit c25063f4bb
2 changed files with 8 additions and 2 deletions

@@ -6,7 +6,8 @@
> You can also customize the "MODEL_ID" if needed.
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGEn workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGen workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node.
> Alternatively, you can change the `codegen.yaml` to use a different type of volume, such as a persistent volume claim.
## Deploy On Xeon
@@ -30,10 +31,13 @@ kubectl apply -f codegen.yaml
To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGEn service for access.
Then run the command `kubectl port-forward svc/codegen 7778:7778` to expose the CodeGen service for access.
Open another terminal and run the following command to verify the service is working:
> Note that it may take a couple of minutes for the service to be ready. If the `curl` command below fails, you
> can check the logs of the codegen-tgi pod to see its status or check for errors.
```
kubectl get pods
curl http://localhost:7778/v1/codegen -H "Content-Type: application/json" -d '{

@@ -271,6 +271,8 @@ spec:
resources:
limits:
habana.ai/gaudi: 1
memory: 64Gi
hugepages-2Mi: 500Mi
volumes:
- name: model-volume
hostPath: