
Deploy AudioQnA in a Kubernetes Cluster

[NOTE] You must set HUGGINGFACEHUB_API_TOKEN before you can deploy. You can also customize "MODEL_ID" and "model-volume".

Deploy On Xeon

cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml

Deploy On Gaudi

cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml
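
For either target, it can help to confirm that the sed step actually replaced the token placeholder, since a manifest applied with the placeholder still in it will deploy but cannot authenticate to Hugging Face. The following is a minimal sketch, not part of the repository; the function name placeholder_replaced is hypothetical.

```shell
# Hypothetical helper: succeeds only if the manifest exists and no longer
# contains the token placeholder that the sed command is meant to replace.
placeholder_replaced() {
  manifest="$1"
  # grep -q exits 0 while the placeholder is still present, so invert it.
  [ -f "$manifest" ] && ! grep -q "insert-your-huggingface-token-here" "$manifest"
}

# Usage, run from the manifest directory after the sed step:
if [ -f audioqna.yaml ]; then
  if placeholder_replaced audioqna.yaml; then
    echo "token placeholder replaced"
  else
    echo "placeholder still present; re-run the sed command" >&2
  fi
fi
```

Running this check before kubectl apply is cheaper than diagnosing a failed model download after the pods have started.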

Verify Services

Make sure all the pods are running, and restart the audioqna-xxxx pod if necessary.

kubectl get pods
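
Instead of polling kubectl get pods by hand, the wait and the restart can be wrapped in two small functions. This is a sketch under assumptions: the function names and the 600s default timeout are hypothetical, the manifest is assumed to have been applied to the current namespace, and the audioqna pod is assumed to be managed by a controller that recreates it after deletion (which the generated audioqna-xxxx name suggests).

```shell
# Hypothetical helpers; both assume kubectl points at the namespace
# that audioqna.yaml was applied to.

# Block until every pod in the namespace reports Ready, or time out.
wait_for_pods() {
  timeout="${1:-600s}"   # assumed default; model downloads can be slow
  kubectl wait --for=condition=Ready pod --all --timeout="$timeout"
}

# Delete one pod by its full name; its controller should recreate it.
restart_pod() {
  kubectl delete pod "$1"
}

# Usage:
#   wait_for_pods 900s
#   restart_pod <full-name-of-audioqna-pod>
```

Note that kubectl wait with --all also considers unrelated pods in the namespace, so it may time out if one of them never becomes Ready.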

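Then send a test request, with host_ip set to an address at which the AudioQnA service is reachable on port 3008:

curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens": 64}' \
  -H 'Content-Type: application/json'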
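
The "audio" value in the request above appears to be a base64-encoded WAV file (the sample string decodes to a RIFF/WAVE header). To query the service with your own recording, a request body could be built along these lines. This is a sketch, not part of the repository: the function name audio_payload and the file name question.wav are hypothetical.

```shell
# Hypothetical helper: print a JSON request body for a given WAV file.
audio_payload() {
  wav_file="$1"
  max_tokens="${2:-64}"   # 64 matches the sample request
  # Encode the file and strip line wraps so the string stays valid inside JSON.
  audio_b64=$(base64 < "$wav_file" | tr -d '\n')
  printf '{"audio": "%s", "max_tokens": %s}' "$audio_b64" "$max_tokens"
}

# Usage (question.wav is a placeholder file name):
#   curl http://${host_ip}:3008/v1/audioqna \
#     -X POST \
#     -d "$(audio_payload question.wav)" \
#     -H 'Content-Type: application/json'
```

Piping through tr rather than relying on a no-wrap flag keeps the helper working across base64 implementations that differ in their options.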