Add kubernetes support for VisualQnA (#578)

* Add kubernetes support for VisualQnA Signed-off-by: lvliang-intel <liang1.lv@intel.com> * update gmc file Signed-off-by: lvliang-intel <liang1.lv@intel.com> * update pic Signed-off-by: lvliang-intel <liang1.lv@intel.com> --------- Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-08-13 17:14:03 +08:00
parent 80e3e2a2d3
commit 4f7fc39d66
9 changed files with 784 additions and 7 deletions
--- a/VisualQnA/kubernetes/README.md
+++ b/VisualQnA/kubernetes/README.md
@@ -0,0 +1,57 @@
+# Deploy VisualQnA in a Kubernetes Cluster
+
+This document outlines the deployment process for a Visual Question Answering (VisualQnA) application that utilizes the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice components on Intel Xeon servers and Gaudi machines.
+
+Please install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in Section "Getting Started" at [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector#readme). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying install.
+
+If you have only Intel Xeon machines you could use the visualqna_xeon.yaml file or if you have a Gaudi cluster you could use visualqna_gaudi.yaml
+In the below example we illustrate on Xeon.
+
+## Deploy the VisualQnA application
+
+1. Create the desired namespace if it does not already exist and deploy the application
+   ```bash
+   export APP_NAMESPACE=CT
+   kubectl create ns $APP_NAMESPACE
+   sed -i "s|namespace: visualqna|namespace: $APP_NAMESPACE|g"  ./visualqna_xeon.yaml
+   kubectl apply -f ./visualqna_xeon.yaml
+   ```
+
+2. Check if the application is up and ready
+   ```bash
+   kubectl get pods -n $APP_NAMESPACE
+   ```
+
+3. Deploy a client pod for testing
+   ```bash
+   kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity
+   ```
+
+4. Check that client pod is ready
+   ```bash
+   kubectl get pods -n $APP_NAMESPACE
+   ```
+
+5. Send request to application
+   ```bash
+   export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
+   export accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='visualqna')].status.accessUrl}")
+   kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -X POST -d '{"messages": [
+      {
+        "role": "user",
+        "content": [
+          {
+            "type": "text",
+            "text": "What'\''s in this image?"
+          },
+          {
+            "type": "image_url",
+            "image_url": {
+              "url": "https://www.ilankelman.org/stopsigns/australia.jpg"
+            }
+          }
+        ]
+      }
+    ],
+    "max_tokens": 128}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_visualqna.log
+   ```
--- a/VisualQnA/kubernetes/manifests/README.md
+++ b/VisualQnA/kubernetes/manifests/README.md
@@ -0,0 +1,51 @@
+# Deploy VisualQnA in Kubernetes Cluster
+
+> [NOTE]
+> You can also customize the "LVM_MODEL_ID" if needed.
+
+> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the visualqna workload is running. Otherwise, you need to modify the `visualqna.yaml` file to change the `model-volume` to a directory that exists on the node.
+
+## Deploy On Xeon
+
+```
+cd GenAIExamples/visualqna/kubernetes/manifests/xeon
+kubectl apply -f visualqna.yaml
+```
+
+## Deploy On Gaudi
+
+```
+cd GenAIExamples/visualqna/kubernetes/manifests/gaudi
+kubectl apply -f visualqna.yaml
+```
+
+## Verify Services
+
+To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
+
+Then run the command `kubectl port-forward svc/visualqna 8888:8888` to expose the visualqna service for access.
+
+Open another terminal and run the following command to verify the service if working:
+
+```console
+curl http://localhost:8888/v1/visualqna \
+    -H 'Content-Type: application/json' \
+    -d '{"messages": [
+      {
+        "role": "user",
+        "content": [
+          {
+            "type": "text",
+            "text": "What'\''s in this image?"
+          },
+          {
+            "type": "image_url",
+            "image_url": {
+              "url": "https://www.ilankelman.org/stopsigns/australia.jpg"
+            }
+          }
+        ]
+      }
+    ],
+    "max_tokens": 128}'
+```
--- a/VisualQnA/kubernetes/manifests/gaudi/visualqna.yaml
+++ b/VisualQnA/kubernetes/manifests/gaudi/visualqna.yaml
@@ -0,0 +1,298 @@
+---
+# Source: visualqna/charts/lvm-uservice/templates/configmap.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: visualqna-lvm-uservice-config
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+data:
+  LVM_ENDPOINT: "http://visualqna-tgi"
+  HF_HOME: "/tmp/.cache/huggingface"
+  http_proxy: ""
+  https_proxy: ""
+  no_proxy: ""
+---
+# Source: visualqna/charts/tgi/templates/configmap.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: visualqna-tgi-config
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+data:
+  MODEL_ID: "llava-hf/llava-v1.6-mistral-7b-hf"
+  PORT: "8399"
+  MAX_INPUT_TOKENS: "4096"
+  MAX_TOTAL_TOKENS: "8192"
+  http_proxy: ""
+  https_proxy: ""
+  no_proxy: ""
+  HABANA_LOGS: "/tmp/habana_logs"
+  NUMBA_CACHE_DIR: "/tmp"
+  TRANSFORMERS_CACHE: "/tmp/transformers_cache"
+  HF_HOME: "/tmp/.cache/huggingface"
+---
+# Source: visualqna/charts/lvm-uservice/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna-lvm-uservice
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 9399
+      targetPort: 9399
+      protocol: TCP
+      name: lvm-uservice
+  selector:
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna-tgi
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: 8399
+      protocol: TCP
+      name: tgi
+  selector:
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna
+  labels:
+    helm.sh/chart: visualqna-0.8.0
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 8888
+      targetPort: 8888
+      protocol: TCP
+      name: visualqna
+  selector:
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/charts/lvm-uservice/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna-lvm-uservice
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: lvm-uservice
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: lvm-uservice
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        {}
+      containers:
+        - name: visualqna
+          envFrom:
+            - configMapRef:
+                name: visualqna-lvm-uservice-config
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            readOnlyRootFilesystem: false
+            runAsNonRoot: true
+            runAsUser: 1000
+            seccompProfile:
+              type: RuntimeDefault
+          image: "opea/lvm-tgi:latest"
+          imagePullPolicy: IfNotPresent
+          ports:
+            - name: lvm-uservice
+              containerPort: 9399
+              protocol: TCP
+          volumeMounts:
+            - mountPath: /tmp
+              name: tmp
+          resources:
+            {}
+      volumes:
+        - name: tmp
+          emptyDir: {}
+---
+# Source: visualqna/charts/tgi/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna-tgi
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: tgi
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: tgi
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        {}
+      containers:
+        - name: tgi
+          envFrom:
+            - configMapRef:
+                name: visualqna-tgi-config
+          securityContext:
+            {}
+          image: "opea/llava-tgi:latest"
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - mountPath: /data
+              name: model-volume
+            - mountPath: /tmp
+              name: tmp
+          ports:
+            - name: http
+              containerPort: 8399
+              protocol: TCP
+          resources:
+            limits:
+              habana.ai/gaudi: 1
+      volumes:
+        - name: model-volume
+          hostPath:
+            path: /mnt/opea-models
+            type: Directory
+        - name: tmp
+          emptyDir: {}
+---
+# Source: visualqna/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna
+  labels:
+    helm.sh/chart: visualqna-0.8.0
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: visualqna
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: visualqna
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        null
+      containers:
+        - name: visualqna
+          env:
+            - name: LVM_SERVICE_HOST_IP
+              value: visualqna-lvm-uservice
+            #- name: MEGA_SERVICE_PORT
+            #  value: 8888
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            readOnlyRootFilesystem: true
+            runAsNonRoot: true
+            runAsUser: 1000
+            seccompProfile:
+              type: RuntimeDefault
+          image: "opea/visualqna:latest"
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - mountPath: /tmp
+              name: tmp
+          ports:
+            - name: visualqna
+              containerPort: 8888
+              protocol: TCP
+          resources:
+            null
+      volumes:
+        - name: tmp
+          emptyDir: {}
--- a/VisualQnA/kubernetes/manifests/xeon/visualqna.yaml
+++ b/VisualQnA/kubernetes/manifests/xeon/visualqna.yaml
@@ -0,0 +1,298 @@
+---
+# Source: visualqna/charts/lvm-uservice/templates/configmap.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: visualqna-lvm-uservice-config
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+data:
+  LVM_ENDPOINT: "http://visualqna-tgi"
+  HF_HOME: "/tmp/.cache/huggingface"
+  http_proxy: ""
+  https_proxy: ""
+  no_proxy: ""
+---
+# Source: visualqna/charts/tgi/templates/configmap.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: visualqna-tgi-config
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+data:
+  MODEL_ID: "llava-hf/llava-v1.6-mistral-7b-hf"
+  PORT: "8399"
+  MAX_INPUT_TOKENS: "4096"
+  MAX_TOTAL_TOKENS: "8192"
+  CUDA_GRAPHS: "0"
+  http_proxy: ""
+  https_proxy: ""
+  no_proxy: ""
+  HABANA_LOGS: "/tmp/habana_logs"
+  NUMBA_CACHE_DIR: "/tmp"
+  TRANSFORMERS_CACHE: "/tmp/transformers_cache"
+  HF_HOME: "/tmp/.cache/huggingface"
+---
+# Source: visualqna/charts/lvm-uservice/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna-lvm-uservice
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 9399
+      targetPort: 9399
+      protocol: TCP
+      name: lvm-uservice
+  selector:
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna-tgi
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: 8399
+      protocol: TCP
+      name: tgi
+  selector:
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: visualqna
+  labels:
+    helm.sh/chart: visualqna-0.8.0
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 8888
+      targetPort: 8888
+      protocol: TCP
+      name: visualqna
+  selector:
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+---
+# Source: visualqna/charts/lvm-uservice/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna-lvm-uservice
+  labels:
+    helm.sh/chart: lvm-uservice-0.8.0
+    app.kubernetes.io/name: lvm-uservice
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: lvm-uservice
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: lvm-uservice
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        {}
+      containers:
+        - name: visualqna
+          envFrom:
+            - configMapRef:
+                name: visualqna-lvm-uservice-config
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            readOnlyRootFilesystem: false
+            runAsNonRoot: true
+            runAsUser: 1000
+            seccompProfile:
+              type: RuntimeDefault
+          image: "opea/lvm-tgi:latest"
+          imagePullPolicy: IfNotPresent
+          ports:
+            - name: lvm-uservice
+              containerPort: 9399
+              protocol: TCP
+          volumeMounts:
+            - mountPath: /tmp
+              name: tmp
+          resources:
+            {}
+      volumes:
+        - name: tmp
+          emptyDir: {}
+---
+# Source: visualqna/charts/tgi/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna-tgi
+  labels:
+    helm.sh/chart: tgi-0.8.0
+    app.kubernetes.io/name: tgi
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "2.1.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: tgi
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: tgi
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        {}
+      containers:
+        - name: tgi
+          envFrom:
+            - configMapRef:
+                name: visualqna-tgi-config
+          securityContext:
+            {}
+          image: "ghcr.io/huggingface/text-generation-inference:2.2.0"
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - mountPath: /data
+              name: model-volume
+            - mountPath: /tmp
+              name: tmp
+          ports:
+            - name: http
+              containerPort: 8399
+              protocol: TCP
+          resources:
+            {}
+      volumes:
+        - name: model-volume
+          hostPath:
+            path: /mnt/opea-models
+            type: Directory
+        - name: tmp
+          emptyDir: {}
+---
+# Source: visualqna/templates/deployment.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: visualqna
+  labels:
+    helm.sh/chart: visualqna-0.8.0
+    app.kubernetes.io/name: visualqna
+    app.kubernetes.io/instance: visualqna
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: visualqna
+      app.kubernetes.io/instance: visualqna
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: visualqna
+        app.kubernetes.io/instance: visualqna
+    spec:
+      securityContext:
+        null
+      containers:
+        - name: visualqna
+          env:
+            - name: LVM_SERVICE_HOST_IP
+              value: visualqna-lvm-uservice
+            #- name: MEGA_SERVICE_PORT
+            #  value: 8888
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            readOnlyRootFilesystem: true
+            runAsNonRoot: true
+            runAsUser: 1000
+            seccompProfile:
+              type: RuntimeDefault
+          image: "opea/visualqna:latest"
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - mountPath: /tmp
+              name: tmp
+          ports:
+            - name: visualqna
+              containerPort: 8888
+              protocol: TCP
+          resources:
+            null
+      volumes:
+        - name: tmp
+          emptyDir: {}
--- a/VisualQnA/kubernetes/visualqna_gaudi.yaml
+++ b/VisualQnA/kubernetes/visualqna_gaudi.yaml
@@ -0,0 +1,34 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: gmc.opea.io/v1alpha3
+kind: GMConnector
+metadata:
+  labels:
+    app.kubernetes.io/name: gmconnector
+    app.kubernetes.io/managed-by: kustomize
+    gmc/platform: gaudi
+  name: visualqna
+  namespace: visualqna
+spec:
+  routerConfig:
+    name: router
+    serviceName: router-service
+  nodes:
+    root:
+      routerType: Sequence
+      steps:
+      - name: Lvm
+        data: $response
+        internalService:
+          serviceName: visualqna-service
+          config:
+            endpoint: /v1/lvm
+            LVM_ENDPOINT: visualqna-tgi-svc
+      - name: TgiGaudi
+        internalService:
+          serviceName: visualqna-tgi-svc
+          config:
+            MODEL_ID: llava-hf/llava-v1.6-mistral-7b-hf
+            endpoint: /generate
+          isDownstreamService: true
--- a/VisualQnA/kubernetes/visualqna_xeon.yaml
+++ b/VisualQnA/kubernetes/visualqna_xeon.yaml
@@ -0,0 +1,34 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: gmc.opea.io/v1alpha3
+kind: GMConnector
+metadata:
+  labels:
+    app.kubernetes.io/name: gmconnector
+    app.kubernetes.io/managed-by: kustomize
+    gmc/platform: xeon
+  name: visualqna
+  namespace: visualqna
+spec:
+  routerConfig:
+    name: router
+    serviceName: router-service
+  nodes:
+    root:
+      routerType: Sequence
+      steps:
+      - name: Lvm
+        data: $response
+        internalService:
+          serviceName: visualqna-service
+          config:
+            endpoint: /v1/lvm
+            LVM_ENDPOINT: visualqna-tgi-svc
+      - name: Tgi
+        internalService:
+          serviceName: visualqna-tgi-svc
+          config:
+            MODEL_ID: llava-hf/llava-v1.6-mistral-7b-hf
+            endpoint: /generate
+          isDownstreamService: true