Add kubernetes support for VisualQnA (#578)
* Add kubernetes support for VisualQnA

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* update gmc file

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* update pic

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
57
VisualQnA/kubernetes/README.md
Normal file
@@ -0,0 +1,57 @@
# Deploy VisualQnA in a Kubernetes Cluster

This document outlines the deployment process for a Visual Question Answering (VisualQnA) application that uses the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice components on Intel Xeon servers and Gaudi machines.

If you have not already done so, install GMC in your Kubernetes cluster by following the steps in the "Getting Started" section of [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector#readme). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying installation.

If you have only Intel Xeon machines, use the `visualqna_xeon.yaml` file; if you have a Gaudi cluster, use `visualqna_gaudi.yaml`.
The example below uses Xeon.

## Deploy the VisualQnA application

1. Create the desired namespace if it does not already exist and deploy the application

   ```bash
   # Namespace names must be lowercase RFC 1123 labels
   export APP_NAMESPACE=ct
   kubectl create ns $APP_NAMESPACE
   sed -i "s|namespace: visualqna|namespace: $APP_NAMESPACE|g" ./visualqna_xeon.yaml
   kubectl apply -f ./visualqna_xeon.yaml
   ```

2. Check if the application is up and ready

   ```bash
   kubectl get pods -n $APP_NAMESPACE
   ```

3. Deploy a client pod for testing

   ```bash
   kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity
   ```

4. Check that the client pod is ready

   ```bash
   kubectl get pods -n $APP_NAMESPACE
   ```

5. Send a request to the application

   ```bash
   export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath='{.items..metadata.name}')
   export accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='visualqna')].status.accessUrl}")
   # LOG_PATH must point to an existing, writable directory
   kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -X POST -d '{"messages": [
         {
           "role": "user",
           "content": [
             {
               "type": "text",
               "text": "What'\''s in this image?"
             },
             {
               "type": "image_url",
               "image_url": {
                 "url": "https://www.ilankelman.org/stopsigns/australia.jpg"
               }
             }
           ]
         }
       ],
       "max_tokens": 128}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_visualqna.log
   ```
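
Steps 2 and 4 above check readiness by listing pods and reading the output by eye. The following is a minimal sketch of a blocking alternative built on `kubectl wait`. The helper name `wait_for_pods` and the 300-second default timeout are illustrative assumptions, not part of the original instructions.

```bash
# Hypothetical helper (not part of the original README): block until every
# pod in a namespace reports the Ready condition, or fail after a timeout.
wait_for_pods() {
  namespace="$1"
  timeout="${2:-300s}"   # assumed default; adjust to your cluster
  kubectl wait --for=condition=Ready pod --all -n "$namespace" --timeout="$timeout"
}

# Usage (requires a cluster):
#   wait_for_pods "$APP_NAMESPACE" 600s
```

The function returns the exit status of `kubectl wait`, so a script can stop before step 5 if the pods never become ready.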
51
VisualQnA/kubernetes/manifests/README.md
Normal file
@@ -0,0 +1,51 @@
# Deploy VisualQnA in a Kubernetes Cluster

> [NOTE]
> You can also customize the "LVM_MODEL_ID" if needed.

> Make sure the directory `/mnt/opea-models` exists on the node where the visualqna workload runs; it stores the cached model. Otherwise, modify the `model-volume` entry in `visualqna.yaml` to point to a directory that exists on the node.

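The manifests in this commit mount the model cache as a `hostPath` volume with `type: Directory`, so the TGI pod cannot mount it if the directory is missing. A small sketch of a pre-flight check to run on the node follows. The function name `ensure_model_dir` is hypothetical, and creating `/mnt/opea-models` may require root privileges.

```bash
# Hypothetical pre-flight helper: create the model cache directory if absent.
ensure_model_dir() {
  dir="${1:-/mnt/opea-models}"   # default taken from the note above
  if [ -d "$dir" ]; then
    echo "exists: $dir"
  else
    mkdir -p "$dir" && echo "created: $dir"
  fi
}

# Usage (on the node that will run the workload):
#   ensure_model_dir /mnt/opea-models
```
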
## Deploy On Xeon

```
cd GenAIExamples/VisualQnA/kubernetes/manifests/xeon
kubectl apply -f visualqna.yaml
```

## Deploy On Gaudi

```
cd GenAIExamples/VisualQnA/kubernetes/manifests/gaudi
kubectl apply -f visualqna.yaml
```

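The note near the top of this README mentions customizing the model. Below is a sketch of one way to do that before applying a manifest. It assumes the model is selected by the `MODEL_ID` key in the `visualqna-tgi-config` ConfigMap, which is the key the manifests in this commit define. `set_model_id` is a hypothetical helper name, it relies on GNU `sed -i`, and the replacement model must be one the TGI image can serve.

```bash
# Hypothetical helper: rewrite the quoted MODEL_ID value in a manifest file.
set_model_id() {
  file="$1"
  model="$2"
  # '|' is the sed delimiter because model IDs contain '/'.
  # The leading-indent capture keeps the YAML indentation unchanged.
  sed -i "s|^\( *\)MODEL_ID: \".*\"|\1MODEL_ID: \"$model\"|" "$file"
}

# Usage (placeholder model ID, replace with your own):
#   set_model_id visualqna.yaml "<your-model-id>"
#   kubectl apply -f visualqna.yaml
```
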
## Verify Services

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

Then run the command `kubectl port-forward svc/visualqna 8888:8888` to expose the visualqna service for access.

Open another terminal and run the following command to verify that the service is working:

```console
curl http://localhost:8888/v1/visualqna \
  -H 'Content-Type: application/json' \
  -d '{"messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://www.ilankelman.org/stopsigns/australia.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 128}'
```
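
The inline JSON above needs the `'\''` sequence to get an apostrophe through the single-quoted string. The sketch below is an alternative that writes the same request body to a file with a quoted here-document and sends it with curl's `-d @file` form. The function name `write_visualqna_payload` and the file name `payload.json` are illustrative.

```bash
# Hypothetical helper: write the request body from the example above to a file.
write_visualqna_payload() {
  out="${1:-payload.json}"
  # The quoted delimiter ('EOF') disables shell expansion, so the JSON,
  # including the apostrophe, is written verbatim.
  cat > "$out" <<'EOF'
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url",
         "image_url": {"url": "https://www.ilankelman.org/stopsigns/australia.jpg"}}
      ]
    }
  ],
  "max_tokens": 128
}
EOF
}

# Usage (with the port-forward from above still running):
#   write_visualqna_payload payload.json
#   curl http://localhost:8888/v1/visualqna \
#     -H 'Content-Type: application/json' \
#     -d @payload.json
```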
298
VisualQnA/kubernetes/manifests/gaudi/visualqna.yaml
Normal file
@@ -0,0 +1,298 @@
---
# Source: visualqna/charts/lvm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: visualqna-lvm-uservice-config
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
data:
  LVM_ENDPOINT: "http://visualqna-tgi"
  HF_HOME: "/tmp/.cache/huggingface"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
---
# Source: visualqna/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: visualqna-tgi-config
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
data:
  MODEL_ID: "llava-hf/llava-v1.6-mistral-7b-hf"
  PORT: "8399"
  MAX_INPUT_TOKENS: "4096"
  MAX_TOTAL_TOKENS: "8192"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
  HABANA_LOGS: "/tmp/habana_logs"
  NUMBA_CACHE_DIR: "/tmp"
  TRANSFORMERS_CACHE: "/tmp/transformers_cache"
  HF_HOME: "/tmp/.cache/huggingface"
---
# Source: visualqna/charts/lvm-uservice/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna-lvm-uservice
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 9399
      targetPort: 9399
      protocol: TCP
      name: lvm-uservice
  selector:
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna-tgi
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 8399
      protocol: TCP
      name: tgi
  selector:
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna
  labels:
    helm.sh/chart: visualqna-0.8.0
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8888
      targetPort: 8888
      protocol: TCP
      name: visualqna
  selector:
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/charts/lvm-uservice/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna-lvm-uservice
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: lvm-uservice
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: lvm-uservice
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        {}
      containers:
        - name: visualqna
          envFrom:
            - configMapRef:
                name: visualqna-lvm-uservice-config
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: false
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          image: "opea/lvm-tgi:latest"
          imagePullPolicy: IfNotPresent
          ports:
            - name: lvm-uservice
              containerPort: 9399
              protocol: TCP
          volumeMounts:
            - mountPath: /tmp
              name: tmp
          resources:
            {}
      volumes:
        - name: tmp
          emptyDir: {}
---
# Source: visualqna/charts/tgi/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna-tgi
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: tgi
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: tgi
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        {}
      containers:
        - name: tgi
          envFrom:
            - configMapRef:
                name: visualqna-tgi-config
          securityContext:
            {}
          image: "opea/llava-tgi:latest"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /data
              name: model-volume
            - mountPath: /tmp
              name: tmp
          ports:
            - name: http
              containerPort: 8399
              protocol: TCP
          resources:
            limits:
              habana.ai/gaudi: 1
      volumes:
        - name: model-volume
          hostPath:
            path: /mnt/opea-models
            type: Directory
        - name: tmp
          emptyDir: {}
---
# Source: visualqna/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna
  labels:
    helm.sh/chart: visualqna-0.8.0
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: visualqna
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: visualqna
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        null
      containers:
        - name: visualqna
          env:
            - name: LVM_SERVICE_HOST_IP
              value: visualqna-lvm-uservice
            #- name: MEGA_SERVICE_PORT
            #  value: 8888
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          image: "opea/visualqna:latest"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /tmp
              name: tmp
          ports:
            - name: visualqna
              containerPort: 8888
              protocol: TCP
          resources:
            null
      volumes:
        - name: tmp
          emptyDir: {}
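
The manifest above defines three Deployments: `visualqna-lvm-uservice`, `visualqna-tgi`, and `visualqna`. Below is a sketch of waiting for all of them after `kubectl apply`, assuming they were applied to the current namespace. The helper name `wait_for_visualqna_rollout` and the 600-second default timeout are illustrative assumptions.

```bash
# Hypothetical helper: wait for each Deployment named in the manifest above.
wait_for_visualqna_rollout() {
  timeout="${1:-600s}"   # assumed value; the TGI pod may need time to fetch the model
  for deploy in visualqna-lvm-uservice visualqna-tgi visualqna; do
    # Stop at the first Deployment that fails to roll out in time.
    kubectl rollout status "deployment/$deploy" --timeout="$timeout" || return 1
  done
}

# Usage (requires a cluster):
#   kubectl apply -f visualqna.yaml && wait_for_visualqna_rollout 900s
```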
298
VisualQnA/kubernetes/manifests/xeon/visualqna.yaml
Normal file
@@ -0,0 +1,298 @@
---
# Source: visualqna/charts/lvm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: visualqna-lvm-uservice-config
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
data:
  LVM_ENDPOINT: "http://visualqna-tgi"
  HF_HOME: "/tmp/.cache/huggingface"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
---
# Source: visualqna/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: visualqna-tgi-config
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
data:
  MODEL_ID: "llava-hf/llava-v1.6-mistral-7b-hf"
  PORT: "8399"
  MAX_INPUT_TOKENS: "4096"
  MAX_TOTAL_TOKENS: "8192"
  CUDA_GRAPHS: "0"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
  HABANA_LOGS: "/tmp/habana_logs"
  NUMBA_CACHE_DIR: "/tmp"
  TRANSFORMERS_CACHE: "/tmp/transformers_cache"
  HF_HOME: "/tmp/.cache/huggingface"
---
# Source: visualqna/charts/lvm-uservice/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna-lvm-uservice
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 9399
      targetPort: 9399
      protocol: TCP
      name: lvm-uservice
  selector:
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna-tgi
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 8399
      protocol: TCP
      name: tgi
  selector:
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: visualqna
  labels:
    helm.sh/chart: visualqna-0.8.0
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8888
      targetPort: 8888
      protocol: TCP
      name: visualqna
  selector:
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
---
# Source: visualqna/charts/lvm-uservice/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna-lvm-uservice
  labels:
    helm.sh/chart: lvm-uservice-0.8.0
    app.kubernetes.io/name: lvm-uservice
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: lvm-uservice
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: lvm-uservice
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        {}
      containers:
        - name: visualqna
          envFrom:
            - configMapRef:
                name: visualqna-lvm-uservice-config
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: false
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          image: "opea/lvm-tgi:latest"
          imagePullPolicy: IfNotPresent
          ports:
            - name: lvm-uservice
              containerPort: 9399
              protocol: TCP
          volumeMounts:
            - mountPath: /tmp
              name: tmp
          resources:
            {}
      volumes:
        - name: tmp
          emptyDir: {}
---
# Source: visualqna/charts/tgi/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna-tgi
  labels:
    helm.sh/chart: tgi-0.8.0
    app.kubernetes.io/name: tgi
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "2.1.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: tgi
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: tgi
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        {}
      containers:
        - name: tgi
          envFrom:
            - configMapRef:
                name: visualqna-tgi-config
          securityContext:
            {}
          image: "ghcr.io/huggingface/text-generation-inference:2.2.0"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /data
              name: model-volume
            - mountPath: /tmp
              name: tmp
          ports:
            - name: http
              containerPort: 8399
              protocol: TCP
          resources:
            {}
      volumes:
        - name: model-volume
          hostPath:
            path: /mnt/opea-models
            type: Directory
        - name: tmp
          emptyDir: {}
---
# Source: visualqna/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: visualqna
  labels:
    helm.sh/chart: visualqna-0.8.0
    app.kubernetes.io/name: visualqna
    app.kubernetes.io/instance: visualqna
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: visualqna
      app.kubernetes.io/instance: visualqna
  template:
    metadata:
      labels:
        app.kubernetes.io/name: visualqna
        app.kubernetes.io/instance: visualqna
    spec:
      securityContext:
        null
      containers:
        - name: visualqna
          env:
            - name: LVM_SERVICE_HOST_IP
              value: visualqna-lvm-uservice
            #- name: MEGA_SERVICE_PORT
            #  value: 8888
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          image: "opea/visualqna:latest"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /tmp
              name: tmp
          ports:
            - name: visualqna
              containerPort: 8888
              protocol: TCP
          resources:
            null
      volumes:
        - name: tmp
          emptyDir: {}
34
VisualQnA/kubernetes/visualqna_gaudi.yaml
Normal file
@@ -0,0 +1,34 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: gaudi
  name: visualqna
  namespace: visualqna
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Lvm
          data: $response
          internalService:
            serviceName: visualqna-service
            config:
              endpoint: /v1/lvm
              LVM_ENDPOINT: visualqna-tgi-svc
        - name: TgiGaudi
          internalService:
            serviceName: visualqna-tgi-svc
            config:
              MODEL_ID: llava-hf/llava-v1.6-mistral-7b-hf
              endpoint: /generate
            isDownstreamService: true
34
VisualQnA/kubernetes/visualqna_xeon.yaml
Normal file
@@ -0,0 +1,34 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: visualqna
  namespace: visualqna
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Lvm
          data: $response
          internalService:
            serviceName: visualqna-service
            config:
              endpoint: /v1/lvm
              LVM_ENDPOINT: visualqna-tgi-svc
        - name: Tgi
          internalService:
            serviceName: visualqna-tgi-svc
            config:
              MODEL_ID: llava-hf/llava-v1.6-mistral-7b-hf
              endpoint: /generate
            isDownstreamService: true
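
The two GMConnector files differ only in the `gmc/platform` label and the name of the TGI step. The sketch below selects the file by platform and substitutes the namespace on the fly, so the checked-in YAML is not modified in place by `sed -i`. `render_gmc` and its arguments are hypothetical, and the function assumes it is run from `VisualQnA/kubernetes`.

```bash
# Hypothetical helper: print the GMConnector YAML for one platform with the
# namespace replaced, leaving the file on disk untouched.
render_gmc() {
  platform="$1"     # "xeon" or "gaudi"
  namespace="$2"
  case "$platform" in
    xeon|gaudi) ;;
    *) echo "unknown platform: $platform" >&2; return 1 ;;
  esac
  sed "s|namespace: visualqna|namespace: $namespace|g" "./visualqna_${platform}.yaml"
}

# Usage (requires a cluster with GMC installed):
#   render_gmc xeon "$APP_NAMESPACE" | kubectl apply -f -
```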