Remove kubernetes manifest related code and tests (#1466)

Remove deprecated kubernetes manifest related code and tests.
A Kubernetes implementation for those examples, based on Helm charts, is targeted for the next release.

Signed-off-by: chensuyue <suyue.chen@intel.com>
chen, suyue
2025-01-24 15:23:12 +08:00
committed by GitHub
parent 9a1118730b
commit 259099d19f
17 changed files with 1 additions and 3739 deletions


@@ -54,6 +54,6 @@ jobs:
${{ env.changed_files }}
- Please verify if the helm charts and manifests need to be changed accordingly.
+ Please verify if the helm charts need to be changed accordingly.
> This issue was created automatically by CI.


@@ -1,64 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
set -xe
USER_ID=$(whoami)
MOUNT_DIR=/home/$USER_ID/.cache/huggingface/hub
IMAGE_REPO=${IMAGE_REPO:-opea}
IMAGE_TAG=${IMAGE_TAG:-latest}
ROLLOUT_TIMEOUT_SECONDS="1800s"
KUBECTL_TIMEOUT_SECONDS="60s"
function init_chatqna() {
# replace the mount dir "path: /mnt/opea-models" with "path: $MOUNT_DIR"
find ../../kubernetes/intel/*/*/manifest -name '*.yaml' -type f -exec sed -i "s#path: /mnt/opea-models#path: $MOUNT_DIR#g" {} \;
# replace microservice image tag
find ../../kubernetes/intel/*/*/manifest -name '*.yaml' -type f -exec sed -i "s#image: \"opea/\(.*\):latest#image: \"opea/\1:${IMAGE_TAG}#g" {} \;
# replace the repository "image: opea/*" with "image: $IMAGE_REPO/"
find ../../kubernetes/intel/*/*/manifest -name '*.yaml' -type f -exec sed -i "s#image: \"opea/*#image: \"${IMAGE_REPO}/#g" {} \;
# set huggingface token
find ../../kubernetes/intel/*/*/manifest -name '*.yaml' -type f -exec sed -i "s#insert-your-huggingface-token-here#$(cat /home/$USER_ID/.cache/huggingface/token)#g" {} \;
}
function get_end_point() {
# $1 is service name, $2 is namespace
ip_address=$(kubectl get svc $1 -n $2 -o jsonpath='{.spec.clusterIP}')
port=$(kubectl get svc $1 -n $2 -o jsonpath='{.spec.ports[0].port}')
echo "$ip_address:$port"
}
function _cleanup_ns() {
local ns=$1
if kubectl get ns $ns; then
if ! kubectl delete ns $ns --timeout=$KUBECTL_TIMEOUT_SECONDS; then
kubectl delete pods --namespace $ns --force --grace-period=0 --all
kubectl delete ns $ns --force --grace-period=0 --timeout=$KUBECTL_TIMEOUT_SECONDS
fi
fi
}
if [ $# -eq 0 ]; then
echo "Usage: $0 <function_name>"
exit 1
fi
case "$1" in
init_ChatQnA)
init_chatqna
;;
get_end_point)
service=$2
NAMESPACE=$3
get_end_point $service $NAMESPACE
;;
_cleanup_ns)
NAMESPACE=$2
_cleanup_ns $NAMESPACE
;;
*)
echo "Unknown function: $1"
;;
esac
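
The two image-related `sed` rewrites in `init_chatqna` can be tried in isolation. The sketch below is a minimal, hypothetical filter (`rewrite_image_line` is not part of the removed script) that applies the same tag and repository substitutions to text on stdin, assuming manifest lines of the form `image: "opea/<name>:latest"`. It uses `opea/` rather than `opea/*` in the second pattern, which matches the same lines.

```shell
#!/bin/bash
# Hypothetical filter mirroring the image rewrites in init_chatqna.
# Reads manifest text on stdin; $1 is the target repository, $2 the target tag.
rewrite_image_line() {
    local repo=$1 tag=$2
    # First swap the ":latest" tag, then swap the "opea" repository prefix.
    sed -e "s#image: \"opea/\(.*\):latest#image: \"opea/\1:${tag}#g" \
        -e "s#image: \"opea/#image: \"${repo}/#g"
}

# Example values (myregistry, v1.2) are placeholders for illustration only.
echo 'image: "opea/codegen:latest"' | rewrite_image_line myregistry v1.2
```

Lines that do not reference an `opea/` image, such as `image: "mongo:7.0.11"`, pass through unchanged.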


@@ -180,4 +180,3 @@ Utilizes the open-source platform **Keycloak** for single sign-on identity and a
- **[Keycloak Configuration Guide](./docker_compose/intel/cpu/xeon/keycloak_setup_guide.md)**: Instructions to set up Keycloak for identity and access management.
- **[Xeon Guide](./docker_compose/intel/cpu/xeon/README.md)**: Instructions to build Docker images from source and run the application via Docker Compose.
- **[Xeon Kubernetes Guide](./kubernetes/intel/README.md)**: Instructions to deploy the application via Kubernetes.


@@ -1,111 +0,0 @@
# 🚀 Deploy ProductivitySuite with ReactUI
This document outlines the steps for deploying ProductivitySuite on a Kubernetes cluster using the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline components and ReactUI, a React-based user interface.
ProductivitySuite consists of the following pipelines/examples and components:
```
- productivity-suite-react-ui
- chatqna
- codegen
- docsum
- faqgen
- dataprep via redis
- chat-history
- prompt-registry
- mongo
- keycloak
```
---
## ⚠️ Prerequisites for Deploying ProductivitySuite with ReactUI
To begin with, ensure that you have the following prerequisites in place:
1. ☸ Kubernetes installation: Make sure that you have Kubernetes installed.
2. 🐳 Images: Make sure you have all the images ready for the examples and components listed above. You may refer to the [README](../../docker_compose/intel/cpu/xeon/README.md) for steps to build the images.
3. 🔧 Configuration Values: Set the following values in all the YAML files before proceeding with the deployment:
Download and set up yq for YAML processing:
```
sudo wget -qO /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
sudo chmod a+x /usr/local/bin/yq
cd GenAIExamples/ProductivitySuite/kubernetes/intel/cpu/xeon/manifest/
. ../utils
```
a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace):
```
# Set HUGGINGFACEHUB_API_TOKEN as follows:
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
set_hf_token $HUGGINGFACEHUB_API_TOKEN
```
b. Set the proxies based on your network configuration
```
# Look for the http_proxy, https_proxy and no_proxy keys and fill in the values in all the YAML files with your system proxy configuration.
set_http_proxy $http_proxy
set_https_proxy $https_proxy
set_no_proxy $no_proxy
```
c. Set all the backend service endpoints for the React UI service
```
# Set up all the backend service endpoints in productivity_suite_reactui.yaml for the UI to consume.
# Look for ENDPOINT in the YAML and insert the URL of each required backend service.
set_services_endpoint
```
4. MODEL_ID and model-volume **(OPTIONAL)**: You may also customize "MODEL_ID" to use a different model, and model-volume to change the volume that is mounted.
```
sudo mkdir -p /mnt/opea-models
sudo chmod -R a+xwr /mnt/opea-models
set_model_id
```
5. MODEL_MIRROR **(OPTIONAL)**: Set a Hugging Face mirror if you cannot access the Hugging Face website directly from your country. In the PRC, you can set it to https://hf-mirror.com.
```
set_model_mirror
```
6. After finishing the steps above, you can review your changes and proceed with deploying the YAML files.
```
git diff
```
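
As a rough illustration of the kind of substitution these configuration steps perform, the sketch below replaces the `insert-your-huggingface-token-here` placeholder across YAML files with `sed`. The function name `replace_hf_placeholder`, the sample file, and the example token are hypothetical; the actual `set_hf_token` helper sourced from `../utils` may work differently. The sketch assumes GNU `sed` (for `sed -i`).

```shell
#!/bin/bash
# Hypothetical helper, NOT the set_hf_token function from ../utils.
# Replaces the placeholder token in every *.yaml file under a directory.
replace_hf_placeholder() {
    local dir=$1 token=$2
    # '#' is the sed delimiter, so the token must not contain '#' or '&'.
    find "$dir" -name '*.yaml' -type f \
        -exec sed -i "s#insert-your-huggingface-token-here#${token}#g" {} \;
}

# Demo on a throwaway directory so no real manifest is touched.
demo_dir=$(mktemp -d)
printf 'data:\n  HF_TOKEN: "insert-your-huggingface-token-here"\n' > "$demo_dir/sample.yaml"
replace_hf_placeholder "$demo_dir" "hf_example_token"
cat "$demo_dir/sample.yaml"
```

Running the substitution against a copy first, then inspecting the result with `git diff` or `cat`, keeps the original manifests recoverable if a value is wrong.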
---
## 🌐 Deploying ProductivitySuite
Use the YAML files in the xeon folder to deploy ProductivitySuite with ReactUI.
```
cd GenAIExamples/ProductivitySuite/kubernetes/intel/cpu/xeon/manifest/
kubectl apply -f .
```
---
## 🔐 User Management via Keycloak Configuration
Please refer to **[keycloak_setup_guide](../../docker_compose/intel/cpu/xeon/keycloak_setup_guide.md)** for details on Keycloak configuration.
---
## ✅ Verify Services
To verify the installation, run `kubectl get pod` and make sure all pods are running.
To view all the available services, run `kubectl get svc`; this shows the ports that need to be used as backend service endpoints in productivity_suite_reactui.yaml.
You may use `kubectl port-forward service/<service_name> <forwarded_port>:<service_port>` to forward the port of any service if necessary.
```
# For example, 'kubectl get svc | grep productivity'
productivity-suite-react-ui ClusterIP 10.96.3.236 <none> 80/TCP
# By default, the productivity-suite-react-ui service exposes port 80; forward it to 5174 with:
kubectl port-forward service/productivity-suite-react-ui 5174:80
```
Alternatively, look up the service port by label and forward it:
```
label='app.kubernetes.io/name=react-ui'
port=$(kubectl -n ${ns:-default} get svc -l ${label} -o jsonpath='{.items[0].spec.ports[0].port}')
kubectl port-forward service/productivity-suite-react-ui 5174:$port
```
You can then open the ProductivitySuite React UI at http://localhost:5174 in a browser.
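
The port lookup described above can also be sketched without a live cluster. The helpers below are hypothetical (`svc_port_from_row` and `port_forward_cmd` do not exist in this repository): one extracts the service port from a row of `kubectl get svc` output, assuming the default column layout where the fifth column is PORT(S), and the other prints the resulting `kubectl port-forward` command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: extract the service port from one row of `kubectl get svc` output.
# Column 5 is PORT(S), e.g. "80/TCP" or "8080:31503/TCP"; the first port is returned.
svc_port_from_row() {
    awk '{ split($5, proto, "/"); split(proto[1], ports, ":"); print ports[1] }'
}

# Hypothetical helper: print (rather than run) the port-forward command for review.
port_forward_cmd() {
    local service=$1 local_port=$2 service_port=$3
    echo "kubectl port-forward service/${service} ${local_port}:${service_port}"
}

# Sample row taken from the example output in this guide.
row='productivity-suite-react-ui ClusterIP 10.96.3.236 <none> 80/TCP'
port=$(echo "$row" | svc_port_from_row)
port_forward_cmd productivity-suite-react-ui 5174 "$port"
```

Printing the command first makes it easy to confirm the local and service ports are in the right order before forwarding.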


@@ -1,75 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: ConfigMap
metadata:
name: chat-history-config
data:
http_proxy: ""
https_proxy: ""
no_proxy: ""
MONGO_HOST: "mongo"
MONGO_PORT: "27017"
DB_NAME: "OPEA"
COLLECTION_NAME: "ChatHistory"
---
apiVersion: v1
kind: Service
metadata:
name: chat-history
labels:
helm.sh/chart: chat-history-0.1.0
app.kubernetes.io/name: chat-history
app.kubernetes.io/instance: chat-history
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 6012
targetPort: 6012
protocol: TCP
name: chat-history
selector:
app.kubernetes.io/name: chat-history
app.kubernetes.io/instance: chat-history
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: chat-history
labels:
helm.sh/chart: chat-history-0.1.0
app.kubernetes.io/name: chat-history
app.kubernetes.io/instance: chat-history
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: chat-history
app.kubernetes.io/instance: chat-history
template:
metadata:
labels:
app.kubernetes.io/name: chat-history
app.kubernetes.io/instance: chat-history
spec:
securityContext: null
containers:
- name: chat-history
envFrom:
- configMapRef:
name: chat-history-config
securityContext: null
image: "opea/chathistory-mongo-server:latest"
imagePullPolicy: IfNotPresent
ports:
- name: chat-history
containerPort: 6012
protocol: TCP
resources: null
---


@@ -1,333 +0,0 @@
---
# Source: codegen/charts/llm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: codegen-llm-uservice-config
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
data:
TGI_LLM_ENDPOINT: "http://codegen-tgi"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
HF_HOME: "/tmp/.cache/huggingface"
http_proxy: ""
https_proxy: ""
no_proxy: ""
LANGCHAIN_TRACING_V2: "false"
LANGCHAIN_API_KEY: insert-your-langchain-key-here
LANGCHAIN_PROJECT: "opea-llm-uservice"
---
# Source: codegen/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: codegen-tgi-config
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.4"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "meta-llama/CodeLlama-7b-hf"
PORT: "2080"
HUGGING_FACE_HUB_TOKEN: "insert-your-huggingface-token-here"
HF_TOKEN: "insert-your-huggingface-token-here"
MAX_INPUT_TOKENS: "1024"
MAX_TOTAL_TOKENS: "4096"
http_proxy: ""
https_proxy: ""
no_proxy: ""
HABANA_LOGS: "/tmp/habana_logs"
NUMBA_CACHE_DIR: "/tmp"
TRANSFORMERS_CACHE: "/tmp/transformers_cache"
HF_HOME: "/tmp/.cache/huggingface"
---
# Source: codegen/charts/llm-uservice/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: codegen-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
---
# Source: codegen/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: codegen-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.4"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 2080
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
---
# Source: codegen/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: codegen
labels:
helm.sh/chart: codegen-0.8.0
app.kubernetes.io/name: codegen
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 7778
targetPort: 7778
protocol: TCP
name: codegen
selector:
app.kubernetes.io/name: codegen
app.kubernetes.io/instance: codegen
---
# Source: codegen/charts/llm-uservice/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: codegen-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codegen
spec:
securityContext:
{}
containers:
- name: codegen
envFrom:
- configMapRef:
name: codegen-llm-uservice-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/llm-textgen:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
volumeMounts:
- mountPath: /tmp
name: tmp
startupProbe:
exec:
command:
- curl
- http://codegen-tgi
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 120
resources:
{}
volumes:
- name: tmp
emptyDir: {}
---
# Source: codegen/charts/tgi/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: codegen-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.4"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codegen
spec:
securityContext:
{}
containers:
- name: tgi
envFrom:
- configMapRef:
name: codegen-tgi-config
securityContext:
{}
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /tmp
name: tmp
ports:
- name: http
containerPort: 2080
protocol: TCP
resources:
{}
volumes:
- name: model-volume
hostPath:
path: /mnt/opea-models
type: Directory
- name: tmp
emptyDir: {}
---
# Source: codegen/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: codegen
labels:
helm.sh/chart: codegen-0.8.0
app.kubernetes.io/name: codegen
app.kubernetes.io/instance: codegen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: codegen
app.kubernetes.io/instance: codegen
template:
metadata:
labels:
app.kubernetes.io/name: codegen
app.kubernetes.io/instance: codegen
spec:
securityContext:
null
containers:
- name: codegen
env:
- name: LLM_SERVICE_HOST_IP
value: codegen-llm-uservice
- name: http_proxy
value: ""
- name: https_proxy
value: ""
- name: no_proxy
value: ""
#- name: MEGA_SERVICE_PORT
# value: 7778
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/codegen:latest"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: codegen
containerPort: 7778
protocol: TCP
# startupProbe:
# httpGet:
# host: codegen-llm-uservice
# port: 9000
# path: /
# initialDelaySeconds: 5
# periodSeconds: 5
# failureThreshold: 120
# livenessProbe:
# httpGet:
# path: /
# port: 7778
# readinessProbe:
# httpGet:
# path: /
# port: 7778
resources:
null
volumes:
- name: tmp
emptyDir: {}


@@ -1,317 +0,0 @@
---
# Source: docsum/charts/llm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: docsum-llm-uservice-config
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
data:
TGI_LLM_ENDPOINT: "http://docsum-tgi"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
HF_HOME: "/tmp/.cache/huggingface"
http_proxy: ""
https_proxy: ""
no_proxy: ""
LANGCHAIN_TRACING_V2: "false"
LANGCHAIN_API_KEY: insert-your-langchain-key-here
LANGCHAIN_PROJECT: "opea-llm-uservice"
---
# Source: docsum/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: docsum-tgi-config
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "Intel/neural-chat-7b-v3-3"
PORT: "2080"
HUGGING_FACE_HUB_TOKEN: "insert-your-huggingface-token-here"
HF_TOKEN: "insert-your-huggingface-token-here"
MAX_INPUT_TOKENS: "1024"
MAX_TOTAL_TOKENS: "4096"
http_proxy: ""
https_proxy: ""
no_proxy: ""
HABANA_LOGS: "/tmp/habana_logs"
NUMBA_CACHE_DIR: "/tmp"
TRANSFORMERS_CACHE: "/tmp/transformers_cache"
HF_HOME: "/tmp/.cache/huggingface"
---
# Source: docsum/charts/llm-uservice/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: docsum-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
---
# Source: docsum/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: docsum-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 2080
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
---
# Source: docsum/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: docsum
labels:
helm.sh/chart: docsum-0.8.0
app.kubernetes.io/name: docsum
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 8888
targetPort: 8888
protocol: TCP
name: docsum
selector:
app.kubernetes.io/name: docsum
app.kubernetes.io/instance: docsum
---
# Source: docsum/charts/llm-uservice/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: docsum-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: docsum
spec:
securityContext:
{}
containers:
- name: docsum
envFrom:
- configMapRef:
name: docsum-llm-uservice-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/llm-docsum-tgi:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
volumeMounts:
- mountPath: /tmp
name: tmp
startupProbe:
exec:
command:
- curl
- http://docsum-tgi
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 120
resources:
{}
volumes:
- name: tmp
emptyDir: {}
---
# Source: docsum/charts/tgi/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: docsum-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: docsum
spec:
securityContext:
{}
containers:
- name: tgi
envFrom:
- configMapRef:
name: docsum-tgi-config
securityContext:
{}
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /tmp
name: tmp
ports:
- name: http
containerPort: 2080
protocol: TCP
resources:
{}
volumes:
- name: model-volume
hostPath:
path: /mnt/opea-models
type: Directory
- name: tmp
emptyDir: {}
---
# Source: docsum/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: docsum
labels:
helm.sh/chart: docsum-0.8.0
app.kubernetes.io/name: docsum
app.kubernetes.io/instance: docsum
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: docsum
app.kubernetes.io/instance: docsum
template:
metadata:
labels:
app.kubernetes.io/name: docsum
app.kubernetes.io/instance: docsum
spec:
securityContext:
null
containers:
- name: docsum
env:
- name: LLM_SERVICE_HOST_IP
value: docsum-llm-uservice
- name: http_proxy
value: ""
- name: https_proxy
value: ""
- name: no_proxy
value: ""
#- name: MEGA_SERVICE_PORT
# value: 8888
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/docsum:latest"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: docsum
containerPort: 8888
protocol: TCP
resources:
null
volumes:
- name: tmp
emptyDir: {}


@@ -1,243 +0,0 @@
---
# Source: faqgen/charts/llm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: faqgen-llm-uservice-config
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
data:
TGI_LLM_ENDPOINT: "http://faqgen-tgi:80"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
---
# Source: faqgen/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: faqgen-tgi-config
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "Intel/neural-chat-7b-v3-3"
PORT: "80"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
---
# Source: faqgen/charts/llm-uservice/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: faqgen-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 80
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
---
apiVersion: v1
kind: Service
metadata:
name: faqgen-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
---
apiVersion: v1
kind: Service
metadata:
name: faqgen
labels:
helm.sh/chart: faqgen-0.8.0
app.kubernetes.io/name: faqgen
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 8888
targetPort: 8888
protocol: TCP
name: faqgen
selector:
app.kubernetes.io/name: faqgen
app.kubernetes.io/instance: faqgen
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: faqgen-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: faqgen
spec:
securityContext: {}
containers:
- name: tgi
envFrom:
- configMapRef:
name: faqgen-tgi-config
securityContext: {}
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
ports:
- name: http
containerPort: 80
protocol: TCP
resources: {}
volumes:
- name: model-volume
hostPath:
path: /mnt/opea-models
type: Directory
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: faqgen-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: faqgen
spec:
securityContext: {}
containers:
- name: faqgen
envFrom:
- configMapRef:
name: faqgen-llm-uservice-config
securityContext: {}
image: "opea/llm-faqgen:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
startupProbe:
exec:
command:
- curl
- http://faqgen-tgi:80
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 120
resources: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: faqgen
labels:
helm.sh/chart: faqgen-0.8.0
app.kubernetes.io/name: faqgen
app.kubernetes.io/instance: faqgen
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: faqgen
app.kubernetes.io/instance: faqgen
template:
metadata:
labels:
app.kubernetes.io/name: faqgen
app.kubernetes.io/instance: faqgen
spec:
securityContext: null
containers:
- name: faqgen
env:
- name: LLM_SERVICE_HOST_IP
value: faqgen-llm-uservice
- name: http_proxy
value: ""
- name: https_proxy
value: ""
- name: no_proxy
value: ""
securityContext: null
image: "opea/faqgen:latest"
imagePullPolicy: IfNotPresent
ports:
- name: faqgen
containerPort: 8888
protocol: TCP
resources: null


@@ -1,66 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: keycloak
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: keycloak
template:
metadata:
labels:
app: keycloak
spec:
containers:
- args:
- start-dev
env:
- name: KEYCLOAK_ADMIN
value: admin
- name: KEYCLOAK_ADMIN_PASSWORD
value: admin
- name: KC_PROXY
value: edge
image: quay.io/keycloak/keycloak:25.0.2
imagePullPolicy: IfNotPresent
name: keycloak
ports:
- containerPort: 8080
name: http
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /realms/master
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
name: keycloak
spec:
allocateLoadBalancerNodePorts: true
ports:
- name: http
nodePort: 31503
port: 8080
protocol: TCP
targetPort: 8080
selector:
app: keycloak
type: LoadBalancer


@@ -1,71 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mongo-config
data:
http_proxy: ""
https_proxy: ""
no_proxy: ""
---
apiVersion: v1
kind: Service
metadata:
name: mongo
labels:
helm.sh/chart: mongo-0.1.0
app.kubernetes.io/name: mongo
app.kubernetes.io/instance: mongo
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 27017
targetPort: 27017
protocol: TCP
name: mongo
selector:
app.kubernetes.io/name: mongo
app.kubernetes.io/instance: mongo
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: mongo
labels:
helm.sh/chart: mongo-0.1.0
app.kubernetes.io/name: mongo
app.kubernetes.io/instance: mongo
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: mongo
app.kubernetes.io/instance: mongo
template:
metadata:
labels:
app.kubernetes.io/name: mongo
app.kubernetes.io/instance: mongo
spec:
securityContext: null
containers:
- name: mongo
envFrom:
- configMapRef:
name: mongo-config
securityContext: null
image: "mongo:7.0.11"
imagePullPolicy: IfNotPresent
ports:
- name: mongo
containerPort: 27017
protocol: TCP
resources: null
command: ["mongod", "--bind_ip", "0.0.0.0", "--quiet", "--logpath", "/dev/null"]


@@ -1,91 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: Service
metadata:
name: productivity-suite-react-ui
labels:
helm.sh/chart: productivity-suite-react-ui-0.1.0
app.kubernetes.io/name: react-ui
app.kubernetes.io/instance: productivity-suite
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 80
protocol: TCP
name: react-ui
selector:
app.kubernetes.io/name: react-ui
app.kubernetes.io/instance: productivity-suite
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: productivity-suite-react-ui
labels:
helm.sh/chart: productivity-suite-react-ui-0.1.0
app.kubernetes.io/name: react-ui
app.kubernetes.io/instance: productivity-suite
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: react-ui
app.kubernetes.io/instance: productivity-suite
template:
metadata:
labels:
app.kubernetes.io/name: react-ui
app.kubernetes.io/instance: productivity-suite
spec:
securityContext: null
containers:
- name: productivity-suite-react-ui
env:
- name: http_proxy
value: ""
- name: https_proxy
value: ""
- name: no_proxy
value: ""
- name: APP_BACKEND_SERVICE_ENDPOINT_CHATQNA
value: ""
- name: APP_BACKEND_SERVICE_ENDPOINT_CODEGEN
value: ""
- name: APP_BACKEND_SERVICE_ENDPOINT_DOCSUM
value: ""
- name: APP_BACKEND_SERVICE_ENDPOINT_FAQGEN
value: ""
- name: APP_DATAPREP_SERVICE_ENDPOINT
value: ""
- name: APP_DATAPREP_GET_FILE_ENDPOINT
value: ""
- name: APP_DATAPREP_DELETE_FILE_ENDPOINT
value: ""
- name: APP_CHAT_HISTORY_CREATE_ENDPOINT
value: ""
- name: APP_CHAT_HISTORY_DELETE_ENDPOINT
value: ""
- name: APP_CHAT_HISTORY_GET_ENDPOINT
value: ""
- name: APP_PROMPT_SERVICE_GET_ENDPOINT
value: ""
- name: APP_PROMPT_SERVICE_CREATE_ENDPOINT
value: ""
- name: APP_KEYCLOAK_SERVICE_ENDPOINT
value: ""
securityContext: null
image: "opea/productivity-suite-react-ui-server:latest"
imagePullPolicy: IfNotPresent
ports:
- name: react-ui
containerPort: 80
protocol: TCP
resources: null


@@ -1,75 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: ConfigMap
metadata:
name: prompt-registry-config
data:
http_proxy: ""
https_proxy: ""
no_proxy: ""
MONGO_HOST: "mongo"
MONGO_PORT: "27017"
DB_NAME: "OPEA"
COLLECTION_NAME: "Prompt"
---
apiVersion: v1
kind: Service
metadata:
name: prompt-registry
labels:
helm.sh/chart: prompt-registry-0.1.0
app.kubernetes.io/name: prompt-registry
app.kubernetes.io/instance: prompt-registry
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 6018
targetPort: 6018
protocol: TCP
name: prompt-registry
selector:
app.kubernetes.io/name: prompt-registry
app.kubernetes.io/instance: prompt-registry
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: prompt-registry
labels:
helm.sh/chart: prompt-registry-0.1.0
app.kubernetes.io/name: prompt-registry
app.kubernetes.io/instance: prompt-registry
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: prompt-registry
app.kubernetes.io/instance: prompt-registry
template:
metadata:
labels:
app.kubernetes.io/name: prompt-registry
app.kubernetes.io/instance: prompt-registry
spec:
securityContext: null
containers:
- name: prompt-registry
envFrom:
- configMapRef:
name: prompt-registry-config
securityContext: null
image: "opea/promptregistry-mongo-server:latest"
imagePullPolicy: IfNotPresent
ports:
- name: prompt-registry
containerPort: 6018
protocol: TCP
resources: null
---


@@ -1,157 +0,0 @@
set_model_id() {
if [ -z "$1" ] || [ -z "$2" ]; then
yq -o json '.| select(.data | has("MODEL_ID"))| {"ConfigMap": .metadata.name, "MODEL_ID": .data.MODEL_ID}' *.yaml
echo "usage:"
echo " set_model_id \${ConfigMap} \${MODEL_ID}"
return
fi
conf=$1
file=${1%%-*}
sed -i '/name: '"${conf}"'/,/---/s|\(MODEL_ID:\).*|\1 "'"${2}"'"|' ${file}.yaml
}
set_model_mirror() {
if [ -z "$1" ] ; then
yq -o json '.| select(.data | has("MODEL_ID"))| {"ConfigMap": .metadata.name, "MODEL_MIRROR": .data.HF_ENDPOINT}' *.yaml
echo "usage:"
echo " set_model_mirror \${MODEL_MIRROR}"
return
fi
cm=$(yq -r -o json '.| select(.data | has("MODEL_ID"))| .metadata.name' *.yaml)
mirror=$1
for i in $cm; do
conf=$i
file=${i%%-*}
echo "ConfigMap: $conf set mirror as $mirror"
has_mirror=$(yq -r -o json '.| select(.metadata.name == "'"${conf}"'")| .data.HF_ENDPOINT' ${file}.yaml)
if [ "$has_mirror" == "null" ]; then
sed -i '/name: '"${conf}"'/,/---/s|\(data:\)|\1\n HF_ENDPOINT: "'"${mirror}"'"|' ${file}.yaml
else
sed -i '/name: '"${conf}"'/,/---/s|\(HF_ENDPOINT:\).*|\1 "'"${1}"'"|' ${file}.yaml
fi
done
}
set_hf_token() {
if [ -z "$1" ] ; then
echo "usage:"
echo " set_hf_token \${HF_TOKEN}"
return
fi
sed -i "s/\(HF_TOKEN:\).*/\1 \"${1}\"/g" *.yaml
sed -i "s/\(HUGGINGFACEHUB_API_TOKEN:\).*/\1 \"${1}\"/g" *.yaml
sed -i "s/\(HUGGING_FACE_HUB_TOKEN:\).*/\1 \"${1}\"/g" *.yaml
}
set_https_proxy() {
if [ -z "$1" ] ; then
echo "usage:"
echo " set_https_proxy \${https_proxy}"
return
fi
https_proxy=$1
sed -i -e "s|\(https_proxy:\)\s*\"\"|\1 \"$https_proxy\"|g" *.yaml
sed -i '/https_proxy/{n;s|\(value:\)\s.*""|\1 "'"$https_proxy"'"|g}' *.yaml
}
set_http_proxy() {
if [ -z "$1" ] ; then
echo "usage:"
echo " set_http_proxy \${http_proxy}"
return
fi
http_proxy=$1
sed -i -e "s|\(http_proxy:\)\s*\"\"|\1 \"$http_proxy\"|g" *.yaml
sed -i '/http_proxy/{n;s|\(value:\)\s.*""|\1 "'"$http_proxy"'"|g}' *.yaml
}
set_no_proxy() {
if [ -z "$1" ] ; then
echo "usage:"
echo " set_no_proxy \${no_proxy}"
return
fi
no_proxy=$1
sed -i -e "s|\(no_proxy:\)\s*\"\"|\1 \"$no_proxy\"|g" *.yaml
sed -i '/no_proxy/{n;s|\(value:\)\s.*""|\1 "'"$no_proxy"'"|g}' *.yaml
}
set_backend_service_endpoint() {
for i in $(grep -oP "(?<=APP_BACKEND_SERVICE_ENDPOINT_).*" *.yaml); do
echo $i
name=${i##*:}
file=${name,,}.yaml
svc=$(yq -o json '. | select(.metadata.name == "'"${name,,}"'" and .kind=="Service")' $file)
port=$(jq .spec.ports[0].port <<< $svc)
url=http://${name,,}.${ns:-default}.svc.cluster.local:${port}
echo $url
sed -i -e '/APP_BACKEND_SERVICE_ENDPOINT_'"$name"'/{n;s|\(value:\)\s.*|\1 "'"$url"'"|}' productivity_suite_reactui.yaml
done
}
set_dataprep_service_endpoint() {
name=chatqna-data-prep
file=chatqna.yaml
svc=$(yq -o json '. | select(.metadata.name == "'"$name"'" and .kind=="Service")' $file)
port=$(jq .spec.ports[0].port <<< $svc)
url=http://${name}.${ns:-default}.svc.cluster.local:${port}
echo $url
for i in $(grep -oP "(?<=APP_)DATAPREP.*(?=_ENDPOINT)" *.yaml); do
echo $i
curd=${i##*:};
sed -i -e '/'"$curd"'/{n;s|\(value:\)\s.*|\1 "'"$url"'"|}' productivity_suite_reactui.yaml;
done
}
set_chat_history_endpoint() {
for i in $(grep -oP "(?<=APP_)CHAT_HISTORY.*(?=_ENDPOINT)" *.yaml); do
echo $i;
curd=${i##*:};
name=${curd%_*};
file=${name,,}.yaml;
name=${name/_/-};
svc=$(yq -o json '. | select(.metadata.name == "'"${name,,}"'" and .kind=="Service")' $file)
port=$(jq .spec.ports[0].port <<< $svc)
url=http://${name,,}.${ns:-default}.svc.cluster.local:${port};
echo $url;
sed -i -e '/'"$curd"'/{n;s|\(value:\)\s.*|\1 "'"$url"'"|}' productivity_suite_reactui.yaml;
done
}
set_prompt_service_endpoint() {
for i in $(grep -oP "(?<=APP_)PROMPT_SERVICE.*(?=_ENDPOINT)" *.yaml); do
echo $i;
curd=${i##*:};
curdr=${curd/SERVICE/REGISTRY};
name=${curdr%_*};
file=${name,,}.yaml;
name=${name/_/-};
svc=$(yq -o json '. | select(.metadata.name == "'"${name,,}"'" and .kind=="Service")' $file)
port=$(jq .spec.ports[0].port <<< $svc)
url=http://${name,,}.${ns:-default}.svc.cluster.local:${port};
echo $url;
sed -i -e '/'"$curd"'/{n;s|\(value:\)\s.*|\1 "'"$url"'"|}' productivity_suite_reactui.yaml ;
done
}
set_keycloak_service_endpoint() {
name=keycloak
file=keycloak_install.yaml
svc=$(yq -o json '. | select(.metadata.name == "'"$name"'" and .kind=="Service")' $file)
port=$(jq .spec.ports[0].port <<< $svc)
url=http://${name}.${ns:-default}.svc.cluster.local:${port}
echo $url
sed -i -e '/APP_KEYCLOAK_SERVICE_ENDPOINT/{n;s|\(value:\)\s.*|\1 "'"$url"'"|}' productivity_suite_reactui.yaml
}
set_services_endpoint() {
set_backend_service_endpoint
set_keycloak_service_endpoint
set_chat_history_endpoint
set_prompt_service_endpoint
set_dataprep_service_endpoint
}


@@ -1,41 +0,0 @@
# Deploy Translation in Kubernetes Cluster
> [!NOTE]
> The following values must be set before you can deploy:
> HUGGINGFACEHUB_API_TOKEN
>
> You can also customize the "MODEL_ID" if needed.
>
> Make sure the directory `/mnt/opea-models` exists on the node where the Translation workload runs; it stores the cached model. Otherwise, edit `translation.yaml` and point the `model-volume` at a directory that exists on the node.
## Deploy On Xeon
```
cd GenAIExamples/Translation/kubernetes/intel/cpu/xeon/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
kubectl apply -f translation.yaml
```
## Deploy On Gaudi
```
cd GenAIExamples/Translation/kubernetes/intel/hpu/gaudi/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
kubectl apply -f translation.yaml
```
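The `sed` command above uses `/` as its delimiter, so a token value containing `/` or `&` would break or corrupt the substitution. A minimal sketch of a more defensive variant follows. It assumes a hypothetical helper named `apply_hf_token` (not part of the repository) and GNU `sed`, as used elsewhere in this guide. It escapes the characters that are special in a `sed` replacement and fails if the placeholder is still present afterwards:

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): replace the token
# placeholder in a manifest and fail if any placeholder is left behind.
apply_hf_token() {
  local manifest=$1 token=$2
  local placeholder="insert-your-huggingface-token-here"
  if [ -z "$manifest" ] || [ -z "$token" ]; then
    echo "usage: apply_hf_token <manifest.yaml> <token>" >&2
    return 2
  fi
  # Escape the characters that are special on the replacement side of
  # sed's s command when "|" is the delimiter: backslash, "&" and "|".
  local escaped
  escaped=$(printf '%s' "$token" | sed -e 's/[\\&|]/\\&/g')
  sed -i "s|${placeholder}|${escaped}|g" "$manifest"
  # grep -q exits 0 when the placeholder is still present, which is an error here.
  if grep -q "$placeholder" "$manifest"; then
    echo "placeholder still present in $manifest" >&2
    return 1
  fi
}
```

With the function sourced, `apply_hf_token translation.yaml "$HUGGINGFACEHUB_API_TOKEN" && kubectl apply -f translation.yaml` would apply the manifest only if the substitution succeeded.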
## Verify Services
To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
Then run the command `kubectl port-forward svc/translation 8888:8888` to expose the Translation service for access.
Open another terminal and run the following command to verify that the service is working:
```console
curl http://localhost:8888/v1/translation \
-H 'Content-Type: application/json' \
-d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```
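The `translation.yaml` manifests give the TGI container a startup probe with `failureThreshold: 120` and `periodSeconds: 5`, so the backend may take several minutes to become ready and the first request can fail. A rough sketch of a retry wrapper for the `curl` call, assuming a hypothetical helper named `retry` (not part of the repository):

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): run a command until it
# succeeds, up to a maximum number of attempts, sleeping between attempts.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    # Skip the final sleep once the attempts are exhausted.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (requires the port-forward from the previous step to be running):
# retry 60 10 curl -sf http://localhost:8888/v1/translation \
#   -H 'Content-Type: application/json' \
#   -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

The attempt count and delay in the example are illustrative values, not tuned recommendations; `curl -f` makes HTTP error responses count as failures so that the loop keeps retrying.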


@@ -1,495 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-tgi-config
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
data:
MODEL_ID: "haoranxu/ALMA-13B"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
HABANA_LOGS: "/tmp/habana_logs"
NUMBA_CACHE_DIR: "/tmp"
HF_HOME: "/tmp/.cache/huggingface"
CUDA_GRAPHS: "0"
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-llm-uservice-config
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
data:
TGI_LLM_ENDPOINT: "http://translation-tgi"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
LOGFLAG: ""
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-ui-config
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
data:
BASE_URL: "/v1/translation"
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
data:
default.conf: |+
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
server {
listen 80;
listen [::]:80;
location /home {
alias /usr/share/nginx/html/index.html;
}
location / {
proxy_pass http://translation-ui:5173;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
location /v1/translation {
proxy_pass http://translation:8888;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
kind: ConfigMap
metadata:
name: translation-nginx-config
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-ui
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 5173
targetPort: ui
protocol: TCP
name: ui
selector:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-llm-uservice
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-tgi
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 2080
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
---
apiVersion: v1
kind: Service
metadata:
name: translation-nginx
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
type: NodePort
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 8888
targetPort: 8888
protocol: TCP
name: translation
selector:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-ui
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
securityContext:
{}
containers:
- name: translation-ui
envFrom:
- configMapRef:
name: translation-ui-config
securityContext:
{}
image: "opea/translation-ui:latest"
imagePullPolicy: IfNotPresent
ports:
- name: ui
containerPort: 80
protocol: TCP
resources:
{}
volumeMounts:
- mountPath: /tmp
name: tmp
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-llm-uservice
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
spec:
securityContext:
{}
containers:
- name: translation
envFrom:
- configMapRef:
name: translation-llm-uservice-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/llm-textgen:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
volumeMounts:
- mountPath: /tmp
name: tmp
livenessProbe:
failureThreshold: 24
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
readinessProbe:
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
startupProbe:
failureThreshold: 120
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
resources:
{}
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-tgi
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
spec:
# use explicit replica counts only if HorizontalPodAutoscaler is disabled
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
spec:
securityContext:
{}
containers:
- name: tgi
envFrom:
- configMapRef:
name: translation-tgi-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /tmp
name: tmp
ports:
- name: http
containerPort: 2080
protocol: TCP
livenessProbe:
failureThreshold: 24
initialDelaySeconds: 5
periodSeconds: 5
tcpSocket:
port: http
readinessProbe:
initialDelaySeconds: 5
periodSeconds: 5
tcpSocket:
port: http
startupProbe:
failureThreshold: 120
initialDelaySeconds: 5
periodSeconds: 5
tcpSocket:
port: http
resources:
{}
volumes:
- name: model-volume
emptyDir: {}
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
app: translation
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
template:
metadata:
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
spec:
securityContext:
null
containers:
- name: translation
env:
- name: LLM_SERVICE_HOST_IP
value: translation-llm-uservice
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/translation:latest"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: translation
containerPort: 8888
protocol: TCP
resources:
null
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-nginx
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
app: translation-nginx
spec:
selector:
matchLabels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
template:
metadata:
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
spec:
containers:
- image: nginx:1.27.1
imagePullPolicy: IfNotPresent
name: nginx
volumeMounts:
- mountPath: /etc/nginx/conf.d
name: nginx-config-volume
securityContext: {}
volumes:
- configMap:
defaultMode: 420
name: translation-nginx-config
name: nginx-config-volume


@@ -1,497 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-tgi-config
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
data:
MODEL_ID: "haoranxu/ALMA-13B"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
HABANA_LOGS: "/tmp/habana_logs"
NUMBA_CACHE_DIR: "/tmp"
HF_HOME: "/tmp/.cache/huggingface"
MAX_INPUT_LENGTH: "1024"
MAX_TOTAL_TOKENS: "2048"
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-llm-uservice-config
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
data:
TGI_LLM_ENDPOINT: "http://translation-tgi"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
https_proxy: ""
no_proxy: ""
LOGFLAG: ""
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: ConfigMap
metadata:
name: translation-ui-config
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
data:
BASE_URL: "/v1/translation"
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
data:
default.conf: |+
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
server {
listen 80;
listen [::]:80;
location /home {
alias /usr/share/nginx/html/index.html;
}
location / {
proxy_pass http://translation-ui:5173;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
location /v1/translation {
proxy_pass http://translation:8888;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
kind: ConfigMap
metadata:
name: translation-nginx-config
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-ui
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 5173
targetPort: ui
protocol: TCP
name: ui
selector:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-llm-uservice
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation-tgi
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 2080
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
---
apiVersion: v1
kind: Service
metadata:
name: translation-nginx
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
type: NodePort
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: v1
kind: Service
metadata:
name: translation
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
type: ClusterIP
ports:
- port: 8888
targetPort: 8888
protocol: TCP
name: translation
selector:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-ui
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: translation-ui
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
securityContext:
{}
containers:
- name: translation-ui
envFrom:
- configMapRef:
name: translation-ui-config
securityContext:
{}
image: "opea/translation-ui:latest"
imagePullPolicy: IfNotPresent
ports:
- name: ui
containerPort: 80
protocol: TCP
resources:
{}
volumeMounts:
- mountPath: /tmp
name: tmp
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-llm-uservice
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: translation
spec:
securityContext:
{}
containers:
- name: translation
envFrom:
- configMapRef:
name: translation-llm-uservice-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/llm-textgen:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
volumeMounts:
- mountPath: /tmp
name: tmp
livenessProbe:
failureThreshold: 24
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
readinessProbe:
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
startupProbe:
failureThreshold: 120
httpGet:
path: v1/health_check
port: llm-uservice
initialDelaySeconds: 5
periodSeconds: 5
resources:
{}
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-tgi
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "2.1.0"
spec:
# use explicit replica counts only if HorizontalPodAutoscaler is disabled
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: translation
spec:
securityContext:
{}
containers:
- name: tgi
envFrom:
- configMapRef:
name: translation-tgi-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "ghcr.io/huggingface/tgi-gaudi:2.0.6"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /tmp
name: tmp
ports:
- name: http
containerPort: 2080
protocol: TCP
livenessProbe:
failureThreshold: 24
initialDelaySeconds: 5
periodSeconds: 5
tcpSocket:
port: http
readinessProbe:
initialDelaySeconds: 5
periodSeconds: 5
tcpSocket:
port: http
startupProbe:
failureThreshold: 120
initialDelaySeconds: 20
periodSeconds: 5
tcpSocket:
port: http
resources:
limits:
habana.ai/gaudi: 1
volumes:
- name: model-volume
emptyDir: {}
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
app: translation
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
template:
metadata:
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation
spec:
securityContext:
null
containers:
- name: translation
env:
- name: LLM_SERVICE_HOST_IP
value: translation-llm-uservice
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/translation:latest"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: translation
containerPort: 8888
protocol: TCP
resources:
null
volumes:
- name: tmp
emptyDir: {}
---
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: apps/v1
kind: Deployment
metadata:
name: translation-nginx
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app.kubernetes.io/version: "v1.0"
app: translation-nginx
spec:
selector:
matchLabels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
template:
metadata:
labels:
app.kubernetes.io/name: translation
app.kubernetes.io/instance: translation
app: translation-nginx
spec:
containers:
- image: nginx:1.27.1
imagePullPolicy: IfNotPresent
name: nginx
volumeMounts:
- mountPath: /etc/nginx/conf.d
name: nginx-config-volume
securityContext: {}
volumes:
- configMap:
defaultMode: 420
name: translation-nginx-config
name: nginx-config-volume