lvliang-intel 0306c620b5 Update TGI CPU image to latest official release 2.4.0 (#1035)

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

2024-11-04 11:28:43 +08:00

Deploy AudioQnA in a Kubernetes Cluster

[NOTE] Set HUGGINGFACEHUB_API_TOKEN before you deploy. You can also customize "MODEL_ID" and "model-volume".

Deploy On Xeon

cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml

Deploy On Gaudi

cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml
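
Both deployment paths rely on the same sed substitution, so a silent miss (for example a typo in the placeholder) would deploy a manifest without a valid token. The following is a minimal sketch of a checked substitution, not part of the original instructions. The function name `substitute_token` is hypothetical, and it assumes GNU sed (for `-i`) and a token that contains neither `|` nor `&`.

```shell
# Hypothetical helper: replace the token placeholder in a manifest and
# confirm that no placeholder is left behind.
substitute_token() {
  manifest="$1"
  token="$2"
  # '|' is used as the sed delimiter so a '/' in the token cannot break the expression.
  sed -i "s|insert-your-huggingface-token-here|${token}|g" "$manifest"
  # Fail loudly if the placeholder is still present after the substitution.
  if grep -q "insert-your-huggingface-token-here" "$manifest"; then
    echo "placeholder still present in $manifest" >&2
    return 1
  fi
}
```

It could be called as `substitute_token audioqna.yaml "${HUGGINGFACEHUB_API_TOKEN}"` before running `kubectl apply -f audioqna.yaml`.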

Verify Services

Make sure all the pods are running, and restart the audioqna-xxxx pod if necessary.

kubectl get pods
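
To spot pods that still need attention, the listing above can be filtered by its STATUS column. This is a rough sketch under the assumption that the default `kubectl get pods` column order (NAME, READY, STATUS, ...) is in use; the function name `pods_not_running` is hypothetical.

```shell
# Hypothetical helper: print the names of pods whose STATUS column is
# neither Running nor Completed.
# Reads the output of `kubectl get pods --no-headers` on stdin.
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods --no-headers | pods_not_running
```

An empty result suggests every pod has reached the Running or Completed state. If a pod is listed and is managed by a Deployment, deleting it with `kubectl delete pod <pod-name>` normally causes a replacement pod to be created.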

curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
  -H 'Content-Type: application/json'
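
The "audio" value in the request above is a base64 string that decodes to a short WAV (RIFF) header. To try the endpoint with your own recording, a payload in the same shape could be built as sketched below. The function name `make_audioqna_payload` and the file name `sample.wav` are hypothetical, and which audio formats the service accepts beyond WAV is not stated here.

```shell
# Hypothetical helper: build a JSON request body from an audio file.
# $1: path to the audio file, $2: max_tokens (defaults to 64).
make_audioqna_payload() {
  audio_file="$1"
  max_tokens="${2:-64}"
  # Encode the file and strip newlines so the string is valid inside JSON.
  audio_b64=$(base64 < "$audio_file" | tr -d '\n')
  printf '{"audio": "%s", "max_tokens": %s}' "$audio_b64" "$max_tokens"
}

# Usage (requires a running service and a real recording):
#   curl http://${host_ip}:3008/v1/audioqna -X POST \
#     -d "$(make_audioqna_payload sample.wav)" \
#     -H 'Content-Type: application/json'
```

Piping through `tr -d '\n'` avoids relying on the line-wrapping flag of `base64`, which differs between GNU and BSD implementations.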