Compare commits

..

32 Commits

Author SHA1 Message Date
ZePan110
ec9db1a3e1 Merge branch 'main' into nightly-cancel 2025-05-06 16:35:38 +08:00
lkk
ff66600ab4 Fix ui dockerfile. (#1909)
Signed-off-by: lkk <33276950+lkk12014402@users.noreply.github.com>
2025-05-06 16:34:16 +08:00
ZePan110
faf6250590 Fix 1.
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-05-06 16:17:10 +08:00
ZePan110
5375332fb3 Fix security issues for helm test workflow (#1908)
Signed-off-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-06 15:54:43 +08:00
Omar Khleif
df33800945 CodeGen Gradio UI Enhancements (#1904)
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
2025-05-06 13:41:21 +08:00
Ying Hu
40e44dfcd6 Update README.md of ChatQnA for broken URL (#1907)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2025-05-06 13:21:31 +08:00
ZePan110
9259ba41a5 Remove invalid codeowner. (#1896)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-30 13:24:42 +08:00
ZePan110
5c7f5718ed Restore context in EdgeCraftRAG build.yaml. (#1895)
Restore context in EdgeCraftRAG build.yaml to avoid the issue of Dockerfiles not being found.

Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-30 11:09:21 +08:00
lkk
d334f5c8fd build cpu agent ui docker image. (#1894) 2025-04-29 23:58:52 +08:00
ZePan110
670d9f3d18 Fix security issue. (#1892)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-29 19:44:48 +08:00
Zhu Yongbo
555c4100b3 Install cpu version for components (#1888)
Signed-off-by: Yongbozzz <yongbo.zhu@intel.com>
2025-04-29 10:08:23 +08:00
ZePan110
04d527d3b0 Integrate set_env to ut scripts for CodeTrans. (#1868)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-28 13:53:50 +08:00
ZePan110
13c4749ca3 Fix security issue (#1884)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-28 13:52:50 +08:00
ZePan110
99b62ae49e Integrate DocSum set_env to ut scripts. (#1860)
Integrate DocSum set_env to ut scripts.
Add README.md for DocSum and InstructionTuning UT scripts.

Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-28 13:35:05 +08:00
chen, suyue
c546d96e98 downgrade tei version from 1.6 to 1.5, fix the chatqna perf regression (#1886)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-04-25 23:00:36 +08:00
chen, suyue
be5933ad85 Update benchmark scripts (#1883)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-04-25 17:05:48 +08:00
rbrugaro
18b4f39f27 README fixes Finance Example (#1882)
Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com>
Co-authored-by: Ying Hu <ying.hu@intel.com>
2025-04-24 23:58:08 -07:00
chyundunovDatamonsters
ef9290f245 DocSum - refactoring README.md for deploy application on ROCm (#1881)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
2025-04-25 13:36:40 +08:00
chyundunovDatamonsters
3b0bcb80a8 DocSum - Adding files to deploy an application in the K8S environment using Helm (#1758)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Co-authored-by: Chingis Yundunov <YundunovCN@sibedge.com>
Co-authored-by: Artem Astafev <a.astafev@datamonsters.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2025-04-25 13:33:08 +08:00
Artem Astafev
ccc145ea1a Refine README.MD for SearchQnA on AMD ROCm platform (#1876)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
2025-04-25 10:16:03 +08:00
chyundunovDatamonsters
bb7a675665 ChatQnA - refactoring README.md for deploy application on ROCm (#1857)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Co-authored-by: Chingis Yundunov <YundunovCN@sibedge.com>
Co-authored-by: Artem Astafev <a.astafev@datamonsters.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-04-25 08:52:24 +08:00
chen, suyue
f90a6d2a8e [CICD enhance] EdgeCraftRAG run CI with latest base image, group logs in GHA outputs. (#1877)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-04-24 16:18:44 +08:00
chyundunovDatamonsters
1fdab591d9 CodeTrans - refactoring README.md for deploy application on ROCm with Docker Compose (#1875)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
2025-04-24 15:28:57 +08:00
chen, suyue
13ea13862a Remove proxy in CodeTrans test (#1874)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-04-24 13:47:56 +08:00
ZePan110
1787d1ee98 Update image links. (#1866)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-04-24 13:34:41 +08:00
Artem Astafev
db4bf1a4c3 Refine README.MD for AMD ROCm docker compose deployment (#1856)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
2025-04-24 11:00:51 +08:00
chen, suyue
f7002fcb70 Set opea_branch for CD test (#1870)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-04-24 09:49:20 +08:00
Artem Astafev
c39c875211 Fix compose file and functional tests for Avatarchatbot on AMD ROCm platform (#1872)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
2025-04-23 22:58:25 +08:00
Artem Astafev
c2e9a259fe Refine AudioQnA README.MD for AMD ROCm docker compose deployment (#1862)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
2025-04-23 13:55:01 +08:00
Omar Khleif
48eaf9c1c9 Added CodeGen Gradio README link to Docker Images List (#1864)
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
2025-04-22 15:28:49 -07:00
Ervin Castelino
a39824f142 Update README.md of DBQnA (#1855)
Co-authored-by: Ying Hu <ying.hu@intel.com>
2025-04-22 15:56:37 -04:00
Dina Suehiro Jones
e10e6dd002 Fixes for MultimodalQnA with the Milvus vector db (#1859)
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2025-04-21 16:05:11 -07:00
31 changed files with 518 additions and 278 deletions

.github/CODEOWNERS (10 changed lines)
View File

@@ -4,13 +4,13 @@
/AudioQnA/ sihan.chen@intel.com wenjiao.yue@intel.com
/AvatarChatbot/ chun.tao@intel.com kaokao.lv@intel.com
/ChatQnA/ liang1.lv@intel.com letong.han@intel.com
/CodeGen/ liang1.lv@intel.com xinyao.wang@intel.com
/CodeTrans/ sihan.chen@intel.com xinyao.wang@intel.com
/CodeGen/ liang1.lv@intel.com
/CodeTrans/ sihan.chen@intel.com
/DBQnA/ supriya.krishnamurthi@intel.com liang1.lv@intel.com
/DocIndexRetriever/ kaokao.lv@intel.com chendi.xue@intel.com
/DocSum/ letong.han@intel.com xinyao.wang@intel.com
/DocSum/ letong.han@intel.com
/EdgeCraftRAG/ yongbo.zhu@intel.com mingyuan.qi@intel.com
/FaqGen/ yogesh.pandey@intel.com xinyao.wang@intel.com
/FaqGen/ yogesh.pandey@intel.com
/GraphRAG/ rita.brugarolas.brufau@intel.com abolfazl.shahbazi@intel.com
/InstructionTuning/ xinyu.ye@intel.com kaokao.lv@intel.com
/MultimodalQnA/ melanie.h.buehler@intel.com tiep.le@intel.com
@@ -19,5 +19,5 @@
/SearchQnA/ sihan.chen@intel.com letong.han@intel.com
/Text2Image/ wenjiao.yue@intel.com xinyu.ye@intel.com
/Translation/ liang1.lv@intel.com sihan.chen@intel.com
/VideoQnA/ huiling.bao@intel.com xinyao.wang@intel.com
/VideoQnA/ huiling.bao@intel.com
/VisualQnA/ liang1.lv@intel.com sihan.chen@intel.com

View File

@@ -2,7 +2,9 @@
# SPDX-License-Identifier: Apache-2.0
name: Helm Chart E2e Test For Call
permissions: read-all
permissions:
contents: read
on:
workflow_call:
inputs:
@@ -135,16 +137,28 @@ jobs:
env:
example: ${{ inputs.example }}
run: |
CHART_NAME="${example,,}" # CodeGen
echo "CHART_NAME=$CHART_NAME" >> $GITHUB_ENV
echo "RELEASE_NAME=${CHART_NAME}$(date +%Y%m%d%H%M%S)" >> $GITHUB_ENV
echo "NAMESPACE=${CHART_NAME}-$(head -c 4 /dev/urandom | xxd -p)" >> $GITHUB_ENV
echo "ROLLOUT_TIMEOUT_SECONDS=600s" >> $GITHUB_ENV
echo "TEST_TIMEOUT_SECONDS=600s" >> $GITHUB_ENV
echo "KUBECTL_TIMEOUT_SECONDS=60s" >> $GITHUB_ENV
echo "should_cleanup=false" >> $GITHUB_ENV
echo "skip_validate=false" >> $GITHUB_ENV
echo "CHART_FOLDER=${example}/kubernetes/helm" >> $GITHUB_ENV
if [[ ! "$example" =~ ^[a-zA-Z]{1,20}$ ]] || [[ "$example" =~ \.\. ]] || [[ "$example" == -* || "$example" == *- ]]; then
echo "Error: Invalid input - only lowercase alphanumeric and internal hyphens allowed"
exit 1
fi
# SAFE_PREFIX="kb-"
CHART_NAME="${SAFE_PREFIX}$(echo "$example" | tr '[:upper:]' '[:lower:]')"
RAND_SUFFIX=$(openssl rand -hex 2 | tr -dc 'a-f0-9')
cat <<EOF >> $GITHUB_ENV
CHART_NAME=${CHART_NAME}
RELEASE_NAME=${CHART_NAME}-$(date +%s)
NAMESPACE=ns-${CHART_NAME}-${RAND_SUFFIX}
ROLLOUT_TIMEOUT_SECONDS=600s
TEST_TIMEOUT_SECONDS=600s
KUBECTL_TIMEOUT_SECONDS=60s
should_cleanup=false
skip_validate=false
CHART_FOLDER=${example}/kubernetes/helm
EOF
echo "Generated safe variables:" >> $GITHUB_STEP_SUMMARY
echo "- CHART_NAME: ${CHART_NAME}" >> $GITHUB_STEP_SUMMARY
- name: Helm install
id: install
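
The revised step above validates the `example` input before deriving any names from it, then writes the derived variables to `$GITHUB_ENV` through a single heredoc instead of repeated `echo` lines. A minimal standalone sketch of that pattern, runnable outside Actions by pointing `GITHUB_ENV` at a scratch file (the `kb-` prefix and variable names follow the hunk above; everything else is illustrative):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Outside GitHub Actions there is no $GITHUB_ENV, so fall back to a scratch file.
GITHUB_ENV="${GITHUB_ENV:-/tmp/github_env}"
example="${1:-CodeGen}"

# Reject anything that is not a short alphabetic name before using it in chart/namespace names.
if [[ ! "$example" =~ ^[a-zA-Z]{1,20}$ ]]; then
  echo "Error: invalid example name '$example'" >&2
  exit 1
fi

SAFE_PREFIX="kb-"
CHART_NAME="${SAFE_PREFIX}$(echo "$example" | tr '[:upper:]' '[:lower:]')"
RAND_SUFFIX="$(openssl rand -hex 2)"

# One heredoc append replaces the per-variable echo >> $GITHUB_ENV lines.
cat <<EOF >> "$GITHUB_ENV"
CHART_NAME=${CHART_NAME}
RELEASE_NAME=${CHART_NAME}-$(date +%s)
NAMESPACE=ns-${CHART_NAME}-${RAND_SUFFIX}
EOF

cat "$GITHUB_ENV"
```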

View File

@@ -9,7 +9,7 @@ on:
workflow_dispatch:
env:
EXAMPLES: ${{ vars.NIGHTLY_RELEASE_EXAMPLES }}
EXAMPLES: CodeGen,CodeTrans #${{ vars.NIGHTLY_RELEASE_EXAMPLES }}
TAG: "latest"
PUBLISH_TAGS: "latest"
@@ -75,7 +75,7 @@ jobs:
publish:
needs: [get-build-matrix, get-image-list, build-images]
if: always()
if: ${{ success() }}
strategy:
matrix:
image: ${{ fromJSON(needs.get-image-list.outputs.matrix) }}

View File

@@ -19,6 +19,9 @@ concurrency:
jobs:
job1:
name: Get-Test-Matrix
permissions:
contents: read
pull-requests: read
runs-on: ubuntu-latest
outputs:
run_matrix: ${{ steps.get-test-matrix.outputs.run_matrix }}

View File

@@ -96,20 +96,21 @@ flowchart LR
The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware.
| Category | Deployment Option | Description |
| ----------------------- | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| On-premise Deployments | Docker compose | [ChatQnA deployment on Xeon](./docker_compose/intel/cpu/xeon) |
| | | [ChatQnA deployment on AI PC](./docker_compose/intel/cpu/aipc) |
| | | [ChatQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi) |
| | | [ChatQnA deployment on Nvidia GPU](./docker_compose/nvidia/gpu) |
| | | [ChatQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) |
| | Kubernetes | [Helm Charts](./kubernetes/helm) |
| Cloud Service Providers | AWS | [Terraform deployment on 4th Gen Intel Xeon with Intel AMX using meta-llama/Meta-Llama-3-8B-Instruct ](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) |
| | | [Terraform deployment on 4th Gen Intel Xeon with Intel AMX using TII Falcon2-11B](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon11B) |
| | GCP | [Terraform deployment on 5th Gen Intel Xeon with Intel AMX(support Confidential AI by using Intel® TDX](https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) |
| | Azure | [Terraform deployment on 4th/5th Gen Intel Xeon with Intel AMX & Intel TDX](https://github.com/intel/terraform-intel-azure-linux-vm/tree/main/examples/azure-gen-ai-xeon-opea-chatqna-tdx) |
| | Intel Tiber AI Cloud | Coming Soon |
| | Any Xeon based Ubuntu system | [ChatQnA Ansible Module for Ubuntu 20.04](https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon) .Use this if you are not using Terraform and have provisioned your system either manually or with another tool, including directly on bare metal. |
| Category | Deployment Option | Description |
| ------------------------------------------------------------------------------------------------------------------------------ | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| On-premise Deployments | Docker compose | [ChatQnA deployment on Xeon](./docker_compose/intel/cpu/xeon/README.md) |
| | | [ChatQnA deployment on AI PC](./docker_compose/intel/cpu/aipc/README.md) |
| | | [ChatQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi/README.md) |
| | | [ChatQnA deployment on Nvidia GPU](./docker_compose/nvidia/gpu/README.md) |
| | | [ChatQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm/README.md) |
| Cloud Platforms Deployment on AWS, GCP, Azure, IBM Cloud,Oracle Cloud, [Intel® Tiber™ AI Cloud](https://ai.cloud.intel.com/) | Docker Compose | [Getting Started Guide: Deploy the ChatQnA application across multiple cloud platforms](https://github.com/opea-project/docs/tree/main/getting-started/README.md) |
| | Kubernetes | [Helm Charts](./kubernetes/helm/README.md) |
| Automated Terraform Deployment on Cloud Service Providers | AWS | [Terraform deployment on 4th Gen Intel Xeon with Intel AMX using meta-llama/Meta-Llama-3-8B-Instruct ](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) |
| | | [Terraform deployment on 4th Gen Intel Xeon with Intel AMX using TII Falcon2-11B](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon11B) |
| | GCP | [Terraform deployment on 5th Gen Intel Xeon with Intel AMX(support Confidential AI by using Intel® TDX](https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) |
| | Azure | [Terraform deployment on 4th/5th Gen Intel Xeon with Intel AMX & Intel TDX](https://github.com/intel/terraform-intel-azure-linux-vm/tree/main/examples/azure-gen-ai-xeon-opea-chatqna-tdx) |
| | Intel Tiber AI Cloud | Coming Soon |
| | Any Xeon based Ubuntu system | [ChatQnA Ansible Module for Ubuntu 20.04](https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon). Use this if you are not using Terraform and have provisioned your system either manually or with another tool, including directly on bare metal. |
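
The Docker Compose rows in the updated table now link to per-platform README files. As a hedged sketch of how one of those on-premise options is typically brought up, assuming the Xeon directory follows the same `set_env.sh` plus `docker compose` convention used by the other examples in this comparison:

```bash
# Hypothetical quick start for the Xeon Docker Compose option listed above;
# the authoritative steps and variable names are in the linked README.
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
export host_ip=$(hostname -I | awk '{print $1}')

cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon
source set_env.sh          # assumed location of the shared env helper
docker compose up -d       # start the ChatQnA services in the background
docker compose ps          # confirm the containers came up
```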
## Monitor and Tracing

View File

@@ -46,13 +46,24 @@ dataprep_get_indices_endpoint = f"{DATAPREP_ENDPOINT}/indices"
# Define the functions that will be used in the app
def add_to_history(prompt, history):
history.append([prompt["text"], ""])
return history, ""
def conversation_history(prompt, index, use_agent, history):
print(f"Generating code for prompt: {prompt} using index: {index} and use_agent is {use_agent}")
history.append([prompt, ""])
response_generator = generate_code(prompt, index, use_agent)
history = add_to_history(prompt, history)[0]
response_generator = generate_code(prompt["text"], index, use_agent)
for token in response_generator:
history[-1][-1] += token
yield history
yield history, ""
def clear_history():
return ""
def upload_media(media, index=None, chunk_size=1500, chunk_overlap=100):
@@ -287,19 +298,32 @@ def get_file_names(files):
# Define UI components
with gr.Blocks() as ui:
with gr.Tab("Code Generation"):
gr.Markdown("### Generate Code from Natural Language")
chatbot = gr.Chatbot(label="Chat History")
prompt_input = gr.Textbox(label="Enter your query")
with gr.Column():
with gr.Row(equal_height=True):
with gr.Row():
with gr.Column(scale=2):
database_dropdown = gr.Dropdown(choices=get_indices(), label="Select Index", value="None", scale=10)
db_refresh_button = gr.Button("Refresh Dropdown", scale=0.1)
db_refresh_button.click(update_indices_dropdown, outputs=database_dropdown)
use_agent = gr.Checkbox(label="Use Agent", container=False)
generate_button = gr.Button("Generate Code")
generate_button.click(
conversation_history, inputs=[prompt_input, database_dropdown, use_agent, chatbot], outputs=chatbot
with gr.Column(scale=9):
gr.Markdown("### Generate Code from Natural Language")
chatbot = gr.Chatbot(label="Chat History")
with gr.Row(equal_height=True):
with gr.Column(scale=8):
prompt_input = gr.MultimodalTextbox(
show_label=False, interactive=True, placeholder="Enter your query", sources=[]
)
with gr.Column(scale=1, min_width=150):
with gr.Row(elem_id="buttons") as button_row:
clear_btn = gr.Button(value="🗑️ Clear", interactive=True)
clear_btn.click(clear_history, None, chatbot)
prompt_input.submit(add_to_history, inputs=[prompt_input, chatbot], outputs=[chatbot, prompt_input])
prompt_input.submit(
conversation_history,
inputs=[prompt_input, database_dropdown, use_agent, chatbot],
outputs=[chatbot, prompt_input],
)
with gr.Tab("Resource Management"):
@@ -315,7 +339,7 @@ with gr.Blocks() as ui:
)
with gr.Column(scale=3):
file_upload = gr.File(label="Upload Files", file_count="multiple")
url_input = gr.Textbox(label="Media to be ingested (Append URL's in a new line)")
url_input = gr.Textbox(label="Media to be ingested. Append URL's in a new line (Shift + Enter)")
upload_button = gr.Button("Upload", variant="primary")
upload_status = gr.Textbox(label="Upload Status")
file_upload.change(get_file_names, inputs=file_upload, outputs=url_input)

View File

@@ -8,14 +8,14 @@
# which can be used to connect to the server from the Internet. It must be specified in the EXTERNAL_HOST_IP variable.
# If the server is used only on the internal network or has a direct external address,
# specify it in HOST_IP and in EXTERNAL_HOST_IP.
export HOST_IP=''
export EXTERNAL_HOST_IP=''
export HOST_IP=${ip_address}
export EXTERNAL_HOST_IP=${ip_address}
### Model ID
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
### The port of the TGI service. On this port, the TGI service will accept connections
export CODETRANS_TGI_SERVICE_PORT=18156
export CODETRANS_TGI_SERVICE_PORT=8008
### The endpoint of the TGI service to which requests to this service will be sent (formed from previously set variables)
export CODETRANS_TGI_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_TGI_SERVICE_PORT}"
@@ -24,7 +24,7 @@ export CODETRANS_TGI_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_TGI_SERVICE_POR
export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
### The port of the LLM service. On this port, the LLM service will accept connections
export CODETRANS_LLM_SERVICE_PORT=18157
export CODETRANS_LLM_SERVICE_PORT=9000
### The IP address or domain name of the server for CodeTrans MegaService
export CODETRANS_MEGA_SERVICE_HOST_IP=${HOST_IP}
@@ -36,7 +36,7 @@ export CODETRANS_LLM_SERVICE_HOST_IP=${HOST_IP}
export CODETRANS_FRONTEND_SERVICE_IP=${HOST_IP}
### The port of the frontend service
export CODETRANS_FRONTEND_SERVICE_PORT=18155
export CODETRANS_FRONTEND_SERVICE_PORT=5173
### Name of GenAI service for route requests to application
export CODETRANS_BACKEND_SERVICE_NAME=codetrans
@@ -45,10 +45,10 @@ export CODETRANS_BACKEND_SERVICE_NAME=codetrans
export CODETRANS_BACKEND_SERVICE_IP=${HOST_IP}
### The port of the backend service
export CODETRANS_BACKEND_SERVICE_PORT=18154
export CODETRANS_BACKEND_SERVICE_PORT=7777
### The port of the Nginx reverse proxy for application
export CODETRANS_NGINX_PORT=18153
export CODETRANS_NGINX_PORT=8088
### Endpoint of the backend service
export CODETRANS_BACKEND_SERVICE_URL="http://${EXTERNAL_HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
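
With the ports above moved to the standard defaults, the backend can be smoke-tested directly after sourcing the script. A short sketch; `CODETRANS_BACKEND_SERVICE_URL` comes from the file above, while the JSON fields are an assumption about the `/v1/codetrans` request shape, not the definitive API contract:

```bash
# Quick check that the CodeTrans backend answers on the endpoint exported above.
source set_env.sh
curl -s "${CODETRANS_BACKEND_SERVICE_URL}" \
  -H "Content-Type: application/json" \
  -d '{"language_from": "Golang", "language_to": "Python", "source_code": "package main"}' \
  | head -c 500
```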

View File

@@ -8,14 +8,14 @@
# which can be used to connect to the server from the Internet. It must be specified in the EXTERNAL_HOST_IP variable.
# If the server is used only on the internal network or has a direct external address,
# specify it in HOST_IP and in EXTERNAL_HOST_IP.
export HOST_IP=''
export EXTERNAL_HOST_IP=''
export HOST_IP=${ip_address}
export EXTERNAL_HOST_IP=${ip_address}
### Model ID
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
### The port of the TGI service. On this port, the TGI service will accept connections
export CODETRANS_VLLM_SERVICE_PORT=18156
export CODETRANS_VLLM_SERVICE_PORT=8008
### The endpoint of the TGI service to which requests to this service will be sent (formed from previously set variables)
export CODETRANS_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_VLLM_SERVICE_PORT}"
@@ -24,7 +24,7 @@ export CODETRANS_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_VLLM_SERVICE_PORT}"
export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
### The port of the LLM service. On this port, the LLM service will accept connections
export CODETRANS_LLM_SERVICE_PORT=18157
export CODETRANS_LLM_SERVICE_PORT=9000
### The IP address or domain name of the server for CodeTrans MegaService
export CODETRANS_MEGA_SERVICE_HOST_IP=${HOST_IP}
@@ -36,7 +36,7 @@ export CODETRANS_LLM_SERVICE_HOST_IP=${HOST_IP}
export CODETRANS_FRONTEND_SERVICE_IP=${HOST_IP}
### The port of the frontend service
export CODETRANS_FRONTEND_SERVICE_PORT=18155
export CODETRANS_FRONTEND_SERVICE_PORT=5173
### Name of GenAI service for route requests to application
export CODETRANS_BACKEND_SERVICE_NAME=codetrans
@@ -45,10 +45,10 @@ export CODETRANS_BACKEND_SERVICE_NAME=codetrans
export CODETRANS_BACKEND_SERVICE_IP=${HOST_IP}
### The port of the backend service
export CODETRANS_BACKEND_SERVICE_PORT=18154
export CODETRANS_BACKEND_SERVICE_PORT=7777
### The port of the Nginx reverse proxy for application
export CODETRANS_NGINX_PORT=18153
export CODETRANS_NGINX_PORT=8088
### Endpoint of the backend service
export CODETRANS_BACKEND_SERVICE_URL="http://${EXTERNAL_HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"

CodeTrans/tests/README.md (new file, 45 lines)
View File

@@ -0,0 +1,45 @@
# CodeTrans E2E test scripts
## Set the required environment variable
```bash
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
## Run test
On Intel Xeon with TGI:
```bash
bash test_compose_tgi_on_xeon.sh
```
On Intel Xeon with vLLM:
```bash
bash test_compose_on_xeon.sh
```
On Intel Gaudi with TGI:
```bash
bash test_compose_tgi_on_gaudi.sh
```
On Intel Gaudi with vLLM:
```bash
bash test_compose_on_gaudi.sh
```
On AMD ROCm with TGI:
```bash
bash test_compose_on_rocm.sh
```
On AMD ROCm with vLLM:
```bash
bash test_compose_vllm_on_rocm.sh
```

View File

@@ -42,25 +42,12 @@ function build_docker_images() {
}
function start_services() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
export LLM_ENDPOINT="http://${ip_address}:8008"
export LLM_COMPONENT_NAME="OpeaTextGenService"
export NUM_CARDS=1
export BLOCK_SIZE=128
export MAX_NUM_SEQS=256
export MAX_SEQ_LEN_TO_CAPTURE=2048
cd $WORKPATH/docker_compose
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:7777/v1/codetrans"
export FRONTEND_SERVICE_IP=${ip_address}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=codetrans
export BACKEND_SERVICE_IP=${ip_address}
export BACKEND_SERVICE_PORT=7777
export NGINX_PORT=80
export host_ip=${ip_address}
source set_env.sh
cd intel/hpu/gaudi
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
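
Several test scripts in this comparison, including the hunk above, replace a long per-script export list with a single `source set_env.sh` from the shared `docker_compose` directory. A minimal sketch of the resulting shape, with illustrative paths and values, showing why the tests and the documented defaults stay in sync:

```bash
# --- docker_compose/set_env.sh (illustrative excerpt): one place for the defaults ---
#   export host_ip=$(hostname -I | awk '{print $1}')
#   export LLM_ENDPOINT_PORT=8008
#   export BACKEND_SERVICE_PORT=7777
#   export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/codetrans"

# --- tests/test_compose_*.sh: the test only sources the shared file and starts the stack ---
function start_services() {
    cd "$WORKPATH/docker_compose"
    export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
    source set_env.sh            # pull in the shared defaults instead of re-exporting them
    cd intel/hpu/gaudi           # then switch to the platform-specific compose directory
    docker compose up -d
}
```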

View File

@@ -42,21 +42,7 @@ function build_docker_images() {
function start_services() {
cd $WORKPATH/docker_compose/amd/gpu/rocm/
export CODETRANS_TGI_SERVICE_PORT=8008
export CODETRANS_LLM_SERVICE_PORT=9000
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export CODETRANS_TGI_LLM_ENDPOINT="http://${ip_address}:${CODETRANS_TGI_SERVICE_PORT}"
export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export CODETRANS_MEGA_SERVICE_HOST_IP=${ip_address}
export CODETRANS_LLM_SERVICE_HOST_IP=${ip_address}
export CODETRANS_FRONTEND_SERVICE_IP=${ip_address}
export CODETRANS_FRONTEND_SERVICE_PORT=5173
export CODETRANS_BACKEND_SERVICE_NAME=codetrans
export CODETRANS_BACKEND_SERVICE_IP=${ip_address}
export CODETRANS_BACKEND_SERVICE_PORT=7777
export CODETRANS_NGINX_PORT=8088
export CODETRANS_BACKEND_SERVICE_URL="http://${ip_address}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
export HOST_IP=${ip_address}
source set_env.sh
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

View File

@@ -44,21 +44,13 @@ function build_docker_images() {
}
function start_services() {
cd $WORKPATH/docker_compose/intel/cpu/xeon/
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
export LLM_ENDPOINT="http://${ip_address}:8008"
export LLM_COMPONENT_NAME="OpeaTextGenService"
cd $WORKPATH/docker_compose
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:7777/v1/codetrans"
export FRONTEND_SERVICE_IP=${ip_address}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=codetrans
export BACKEND_SERVICE_IP=${ip_address}
export BACKEND_SERVICE_PORT=7777
export NGINX_PORT=80
export host_ip=${ip_address}
source set_env.sh
cd intel/cpu/xeon/
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

View File

@@ -40,21 +40,13 @@ function build_docker_images() {
}
function start_services() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi/
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
export LLM_ENDPOINT="http://${ip_address}:8008"
export LLM_COMPONENT_NAME="OpeaTextGenService"
cd $WORKPATH/docker_compose
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:7777/v1/codetrans"
export FRONTEND_SERVICE_IP=${ip_address}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=codetrans
export BACKEND_SERVICE_IP=${ip_address}
export BACKEND_SERVICE_PORT=7777
export NGINX_PORT=80
export host_ip=${ip_address}
source set_env.sh
cd intel/hpu/gaudi/
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

View File

@@ -40,21 +40,13 @@ function build_docker_images() {
}
function start_services() {
cd $WORKPATH/docker_compose/intel/cpu/xeon/
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
export LLM_ENDPOINT="http://${ip_address}:8008"
export LLM_COMPONENT_NAME="OpeaTextGenService"
cd $WORKPATH/docker_compose
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:7777/v1/codetrans"
export FRONTEND_SERVICE_IP=${ip_address}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=codetrans
export BACKEND_SERVICE_IP=${ip_address}
export BACKEND_SERVICE_PORT=7777
export NGINX_PORT=80
export host_ip=${ip_address}
source set_env.sh
cd intel/cpu/xeon/
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

View File

@@ -40,22 +40,7 @@ function build_docker_images() {
function start_services() {
cd $WORKPATH/docker_compose/amd/gpu/rocm/
export HOST_IP=${ip_address}
export CODETRANS_VLLM_SERVICE_PORT=8008
export CODETRANS_LLM_SERVICE_PORT=9000
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export CODETRANS_LLM_ENDPOINT="http://${ip_address}:${CODETRANS_VLLM_SERVICE_PORT}"
export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export CODETRANS_MEGA_SERVICE_HOST_IP=${ip_address}
export CODETRANS_LLM_SERVICE_HOST_IP=${ip_address}
export CODETRANS_FRONTEND_SERVICE_IP=${ip_address}
export CODETRANS_FRONTEND_SERVICE_PORT=5173
export CODETRANS_BACKEND_SERVICE_NAME=codetrans
export CODETRANS_BACKEND_SERVICE_IP=${ip_address}
export CODETRANS_BACKEND_SERVICE_PORT=7777
export CODETRANS_NGINX_PORT=8088
export CODETRANS_BACKEND_SERVICE_URL="http://${ip_address}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
export HOST_IP=${ip_address}
source set_env_vllm.sh
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

View File

@@ -3,7 +3,7 @@
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
export HOST_IP=''
export HOST_IP=${ip_address}
export DOCSUM_MAX_INPUT_TOKENS="2048"
export DOCSUM_MAX_TOTAL_TOKENS="4096"
export DOCSUM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"

View File

@@ -3,7 +3,7 @@
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
export HOST_IP=''
export HOST_IP=${ip_address}
export DOCSUM_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export DOCSUM_MAX_INPUT_TOKENS=2048
export DOCSUM_MAX_TOTAL_TOKENS=4096

View File

@@ -10,7 +10,7 @@ export no_proxy="${no_proxy},${host_ip}" # Example: no_proxy="localhost, 127.0.0
export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_ENDPOINT_PORT=8008
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
@@ -20,10 +20,12 @@ export MAX_TOTAL_TOKENS=2048
export LLM_PORT=9000
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
export DocSum_COMPONENT_NAME="OpeaDocSumvLLM" # OpeaDocSumTgi
export FRONTEND_SERVICE_PORT=5173
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
export LOGFLAG=True
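
With the variables above exported, the DocSum megaservice can be exercised the same way the test scripts later in this comparison do, via a multipart POST to the backend endpoint. A short sketch; `BACKEND_SERVICE_ENDPOINT` is defined just above, and the sample text is only a placeholder:

```bash
# Smoke test of the DocSum backend; mirrors the validate_megaservice curl in the tests below.
curl -s -o /dev/null -w "%{http_code}\n" -X POST \
  -H 'Content-Type: multipart/form-data' \
  -F "type=text" \
  -F "messages=Text Embeddings Inference (TEI) is a toolkit for serving open source text embedding models." \
  "${BACKEND_SERVICE_ENDPOINT}"
```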

View File

@@ -16,3 +16,150 @@ helm install docsum oci://ghcr.io/opea-project/charts/docsum --set global.HUGGI
export HFTOKEN="insert-your-huggingface-token-here"
helm install docsum oci://ghcr.io/opea-project/charts/docsum --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
```
## Deploy on AMD ROCm using Helm charts from the binary Helm repository
```bash
mkdir ~/docsum-k8s-install && cd ~/docsum-k8s-install
```
### Cloning repos
```bash
git clone https://github.com/opea-project/GenAIExamples.git
```
### Go to the installation directory
```bash
cd GenAIExamples/DocSum/kubernetes/helm
```
### Setting system variables
```bash
export HFTOKEN="your_huggingface_token"
export MODELDIR="/mnt/opea-models"
export MODELNAME="Intel/neural-chat-7b-v3-3"
```
### Setting variables in Values files
#### If ROCm vLLM used
```bash
nano ~/docsum-k8s-install/GenAIExamples/DocSum/kubernetes/helm/rocm-values.yaml
```
- HIP_VISIBLE_DEVICES - this variable specifies the ID of the GPU that you want to use.
You can specify either one or several comma-separated ones - "0" or "0,1,2,3"
- TENSOR_PARALLEL_SIZE - must match the number of GPUs used
- resources:
limits:
amd.com/gpu: "1" - replace "1" with the number of GPUs used
#### If ROCm TGI used
```bash
nano ~/docsum-k8s-install/GenAIExamples/DocSum/kubernetes/helm/rocm-tgi-values.yaml
```
- HIP_VISIBLE_DEVICES - this variable specifies the ID of the GPU that you want to use.
You can specify either one or several comma-separated ones - "0" or "0,1,2,3"
- extraCmdArgs: [ "--num-shard","1" ] - replace "1" with the number of GPUs used
- resources:
limits:
amd.com/gpu: "1" - replace "1" with the number of GPUs used
### Installing the Helm Chart
#### If ROCm vLLM used
```bash
helm upgrade --install docsum oci://ghcr.io/opea-project/charts/docsum \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values rocm-values.yaml
```
#### If ROCm TGI used
```bash
helm upgrade --install docsum oci://ghcr.io/opea-project/charts/docsum \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values rocm-tgi-values.yaml
```
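
After either `helm upgrade --install` above completes, it is worth confirming the release rolled out before sending requests. A minimal sketch; the release name `docsum` comes from the commands above, while the `app.kubernetes.io/instance` label is only an assumed Helm convention:

```bash
# Wait for the release's pods to become ready, then inspect them.
kubectl wait --for=condition=Ready pod \
  -l app.kubernetes.io/instance=docsum --timeout=600s
kubectl get pods -l app.kubernetes.io/instance=docsum -o wide
helm status docsum
```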
## Deploy on AMD ROCm using Helm charts from Git repositories
### Creating working dirs
```bash
mkdir ~/docsum-k8s-install && cd ~/docsum-k8s-install
```
### Cloning repos
```bash
git clone https://github.com/opea-project/GenAIExamples.git
git clone https://github.com/opea-project/GenAIInfra.git
```
### Go to the installation directory
```bash
cd GenAIExamples/DocSum/kubernetes/helm
```
### Setting system variables
```bash
export HFTOKEN="your_huggingface_token"
export MODELDIR="/mnt/opea-models"
export MODELNAME="Intel/neural-chat-7b-v3-3"
```
### Setting variables in Values files
#### If ROCm vLLM used
```bash
nano ~/docsum-k8s-install/GenAIExamples/DocSum/kubernetes/helm/rocm-values.yaml
```
- HIP_VISIBLE_DEVICES - this variable specifies the ID of the GPU that you want to use.
You can specify either one or several comma-separated ones - "0" or "0,1,2,3"
- TENSOR_PARALLEL_SIZE - must match the number of GPUs used
- resources:
limits:
amd.com/gpu: "1" - replace "1" with the number of GPUs used
#### If ROCm TGI used
```bash
nano ~/docsum-k8s-install/GenAIExamples/DocSum/kubernetes/helm/rocm-tgi-values.yaml
```
- HIP_VISIBLE_DEVICES - this variable specifies the ID of the GPU that you want to use.
You can specify either one or several comma-separated ones - "0" or "0,1,2,3"
- extraCmdArgs: [ "--num-shard","1" ] - replace "1" with the number of GPUs used
- resources:
limits:
amd.com/gpu: "1" - replace "1" with the number of GPUs used
### Installing the Helm Chart
#### If ROCm vLLM used
```bash
cd ~/docsum-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update docsum
helm upgrade --install docsum docsum \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/DocSum/kubernetes/helm/rocm-values.yaml
```
#### If ROCm TGI used
```bash
cd ~/docsum-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update docsum
helm upgrade --install docsum docsum \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/DocSum/kubernetes/helm/rocm-tgi-values.yaml
```

View File

@@ -0,0 +1,45 @@
# Copyright (C) 2025 Advanced Micro Devices, Inc.
tgi:
enabled: true
accelDevice: "rocm"
image:
repository: ghcr.io/huggingface/text-generation-inference
tag: "2.4.1-rocm"
MAX_INPUT_LENGTH: "1024"
MAX_TOTAL_TOKENS: "2048"
USE_FLASH_ATTENTION: "false"
FLASH_ATTENTION_RECOMPUTE: "false"
HIP_VISIBLE_DEVICES: "0"
MAX_BATCH_SIZE: "4"
extraCmdArgs: [ "--num-shard","1" ]
resources:
limits:
amd.com/gpu: "1"
requests:
cpu: 1
memory: 16Gi
securityContext:
readOnlyRootFilesystem: false
runAsNonRoot: false
runAsUser: 0
capabilities:
add:
- SYS_PTRACE
readinessProbe:
initialDelaySeconds: 60
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 120
startupProbe:
initialDelaySeconds: 60
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 120
llm-uservice:
DOCSUM_BACKEND: "TGI"
retryTimeoutSeconds: 720
vllm:
enabled: false

View File

@@ -0,0 +1,40 @@
# Copyright (C) 2025 Advanced Micro Devices, Inc.
tgi:
enabled: false
llm-uservice:
DOCSUM_BACKEND: "vLLM"
retryTimeoutSeconds: 720
vllm:
enabled: true
accelDevice: "rocm"
image:
repository: opea/vllm-rocm
tag: latest
env:
HIP_VISIBLE_DEVICES: "0"
TENSOR_PARALLEL_SIZE: "1"
HF_HUB_DISABLE_PROGRESS_BARS: "1"
HF_HUB_ENABLE_HF_TRANSFER: "0"
VLLM_USE_TRITON_FLASH_ATTN: "0"
VLLM_WORKER_MULTIPROC_METHOD: "spawn"
PYTORCH_JIT: "0"
HF_HOME: "/data"
extraCmd:
command: [ "python3", "/workspace/api_server.py" ]
extraCmdArgs: [ "--swap-space", "16",
"--disable-log-requests",
"--dtype", "float16",
"--num-scheduler-steps", "1",
"--distributed-executor-backend", "mp" ]
resources:
limits:
amd.com/gpu: "1"
startupProbe:
failureThreshold: 180
securityContext:
readOnlyRootFilesystem: false
runAsNonRoot: false
runAsUser: 0
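
The values file above switches the chart to the ROCm vLLM backend. Once deployed, its OpenAI-compatible endpoint (the same `/v1/chat/completions` path the test scripts below validate) can be probed through a port-forward. A sketch with an assumed service name and port; adjust both to what the chart actually creates:

```bash
# Forward the assumed vLLM service locally and send a minimal chat completion request.
kubectl port-forward svc/docsum-vllm 8008:80 &
PF_PID=$!
sleep 3
curl -s http://localhost:8008/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Intel/neural-chat-7b-v3-3", "messages": [{"role": "user", "content": "Summarize: OPEA DocSum deploys a document summarization service."}], "max_tokens": 64}'
kill "$PF_PID"   # stop the background port-forward
```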

DocSum/tests/README.md (new file, 45 lines)
View File

@@ -0,0 +1,45 @@
# DocSum E2E test scripts
## Set the required environment variable
```bash
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
## Run test
On Intel Xeon with vLLM:
```bash
bash test_compose_on_xeon.sh
```
On Intel Xeon with TGI:
```bash
bash test_compose_tgi_on_xeon.sh
```
On Intel Gaudi with vLLM:
```bash
bash test_compose_on_gaudi.sh
```
On Intel Gaudi with TGI:
```bash
bash test_compose_tgi_on_gaudi.sh
```
On AMD ROCm with TGI:
```bash
bash test_compose_on_rocm.sh
```
On AMD ROCm with vLLM:
```bash
bash test_compose_vllm_on_rocm.sh
```

View File

@@ -10,35 +10,22 @@ export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}')
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export no_proxy="${no_proxy},${host_ip}"
export MODEL_CACHE=${model_cache:-"./data"}
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_ENDPOINT_PORT=8008
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
source $WORKPATH/docker_compose/set_env.sh
export MODEL_CACHE=${model_cache:-"./data"}
export NUM_CARDS=1
export BLOCK_SIZE=128
export MAX_NUM_SEQS=256
export MAX_SEQ_LEN_TO_CAPTURE=2048
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export LLM_PORT=9000
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
export DocSum_COMPONENT_NAME="OpeaDocSumvLLM"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
export LOGFLAG=True
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
# Get the root folder of the current script
ROOT_FOLDER=$(dirname "$(readlink -f "$0")")

View File

@@ -14,21 +14,8 @@ export MODEL_CACHE=${model_cache:-"./data"}
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')
export HOST_IP=${ip_address}
export host_ip=${ip_address}
export DOCSUM_MAX_INPUT_TOKENS="2048"
export DOCSUM_MAX_TOTAL_TOKENS="4096"
export DOCSUM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export DOCSUM_TGI_SERVICE_PORT="8008"
export DOCSUM_TGI_LLM_ENDPOINT="http://${HOST_IP}:${DOCSUM_TGI_SERVICE_PORT}"
export DOCSUM_HUGGINGFACEHUB_API_TOKEN=''
export DOCSUM_WHISPER_PORT="7066"
export ASR_SERVICE_HOST_IP="${HOST_IP}"
export DOCSUM_LLM_SERVER_PORT="9000"
export DOCSUM_BACKEND_SERVER_PORT="18072"
export DOCSUM_FRONTEND_PORT="18073"
export BACKEND_SERVICE_ENDPOINT="http://${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
source $WORKPATH/docker_compose/amd/gpu/rocm/set_env.sh
function build_docker_images() {
opea_branch=${opea_branch:-"main"}
@@ -129,7 +116,7 @@ function validate_microservices() {
# whisper microservice
ulimit -s 65536
validate_services \
"${host_ip}:${DOCSUM_WHISPER_PORT}/v1/asr" \
"${HOST_IP}:${DOCSUM_WHISPER_PORT}/v1/asr" \
'{"asr_result":"well"}' \
"whisper-service" \
"whisper-service" \
@@ -137,7 +124,7 @@ function validate_microservices() {
# tgi for llm service
validate_services \
"${host_ip}:${DOCSUM_TGI_SERVICE_PORT}/generate" \
"${HOST_IP}:${DOCSUM_TGI_SERVICE_PORT}/generate" \
"generated_text" \
"docsum-tgi-service" \
"docsum-tgi-service" \
@@ -145,7 +132,7 @@ function validate_microservices() {
# llm microservice
validate_services \
"${host_ip}:${DOCSUM_LLM_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_LLM_SERVER_PORT}/v1/docsum" \
"text" \
"docsum-llm-server" \
"docsum-llm-server" \
@@ -158,7 +145,7 @@ function validate_megaservice() {
local DOCKER_NAME="docsum-backend-server"
local EXPECTED_RESULT="[DONE]"
local INPUT_DATA="messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."
local URL="${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
local URL="${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
local DATA_TYPE="type=text"
local HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST -F "$DATA_TYPE" -F "$INPUT_DATA" -H 'Content-Type: multipart/form-data' "$URL")
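
The calls above all go through a `validate_services` helper, and `validate_megaservice` checks the HTTP status of a POST the same way. A simplified standalone sketch of that check (the real helper's signature and payloads live in the test scripts; the JSON body here is illustrative):

```bash
# Minimal stand-in for the validate_services pattern: POST to a URL, require HTTP 200,
# and look for an expected marker in the response body.
function check_service() {
    local url="$1" expected="$2" payload="$3"
    local response http_status
    response=$(curl -s -w $'\n%{http_code}' -X POST -H "Content-Type: application/json" -d "$payload" "$url")
    http_status="${response##*$'\n'}"
    if [[ "$http_status" != "200" ]] || ! grep -q "$expected" <<< "$response"; then
        echo "[FAIL] $url returned HTTP $http_status" >&2
        return 1
    fi
    echo "[PASS] $url"
}

# Illustrative call against the llm microservice endpoint exercised above.
check_service "${HOST_IP}:${DOCSUM_LLM_SERVER_PORT}/v1/docsum" "text" \
    '{"messages": "Summarize this short paragraph.", "max_tokens": 32}'
```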
@@ -188,7 +175,7 @@ function validate_megaservice_json() {
echo ""
echo ">>> Checking text data with Content-Type: application/json"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \
@@ -196,7 +183,7 @@ function validate_megaservice_json() {
echo ">>> Checking audio data"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \
@@ -204,7 +191,7 @@ function validate_megaservice_json() {
echo ">>> Checking video data"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \

View File

@@ -10,30 +10,18 @@ export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}')
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export no_proxy="${no_proxy},${host_ip}"
export MODEL_CACHE=${model_cache:-"./data"}
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_ENDPOINT_PORT=8008
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export LLM_PORT=9000
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
export DocSum_COMPONENT_NAME="OpeaDocSumvLLM"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
export LOGFLAG=True
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
source $WORKPATH/docker_compose/set_env.sh
export MODEL_CACHE=${model_cache:-"./data"}
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
# Get the root folder of the current script
ROOT_FOLDER=$(dirname "$(readlink -f "$0")")

View File

@@ -9,32 +9,20 @@ IMAGE_TAG=${IMAGE_TAG:-"latest"}
export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}')
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export no_proxy="${no_proxy},${host_ip}"
export MODEL_CACHE=${model_cache:-"./data"}
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_ENDPOINT_PORT=8008
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export LLM_PORT=9000
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
export DocSum_COMPONENT_NAME="OpeaDocSumTgi"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
export LOGFLAG=True
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
source $WORKPATH/docker_compose/set_env.sh
export MODEL_CACHE=${model_cache:-"./data"}
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export DocSum_COMPONENT_NAME="OpeaDocSumTgi"
# Get the root folder of the current script
ROOT_FOLDER=$(dirname "$(readlink -f "$0")")

View File

@@ -9,31 +9,20 @@ IMAGE_TAG=${IMAGE_TAG:-"latest"}
export http_proxy=$http_proxy
export https_proxy=$https_proxy
export host_ip=$(hostname -I | awk '{print $1}')
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export no_proxy="${no_proxy},${host_ip}"
export MODEL_CACHE=${model_cache:-"./data"}
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_ENDPOINT_PORT=8008
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export LLM_PORT=9000
export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
export DocSum_COMPONENT_NAME="OpeaDocSumTgi"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_PORT=8888
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
export LOGFLAG=True
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
source $WORKPATH/docker_compose/set_env.sh
export MODEL_CACHE=${model_cache:-"./data"}
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
export DocSum_COMPONENT_NAME="OpeaDocSumTgi"
# Get the root folder of the current script
ROOT_FOLDER=$(dirname "$(readlink -f "$0")")

View File

@@ -16,21 +16,7 @@ WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')
export host_ip=${ip_address}
export HOST_IP=${ip_address}
export EXTERNAL_HOST_IP=${ip_address}
export DOCSUM_HUGGINGFACEHUB_API_TOKEN="${HUGGINGFACEHUB_API_TOKEN}"
export DOCSUM_MAX_INPUT_TOKENS=2048
export DOCSUM_MAX_TOTAL_TOKENS=4096
export DOCSUM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export DOCSUM_VLLM_SERVICE_PORT="8008"
export DOCSUM_LLM_ENDPOINT="http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT}"
export DOCSUM_WHISPER_PORT="7066"
export ASR_SERVICE_HOST_IP="${HOST_IP}"
export DOCSUM_LLM_SERVER_PORT="9000"
export DOCSUM_BACKEND_SERVER_PORT="18072"
export DOCSUM_FRONTEND_PORT="18073"
export BACKEND_SERVICE_ENDPOINT="http://${EXTERNAL_HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
source $WORKPATH/docker_compose/amd/gpu/rocm/set_env_vllm.sh
function build_docker_images() {
opea_branch=${opea_branch:-"main"}
@@ -130,7 +116,7 @@ function validate_microservices() {
# whisper microservice
ulimit -s 65536
validate_services \
"${host_ip}:${DOCSUM_WHISPER_PORT}/v1/asr" \
"${HOST_IP}:${DOCSUM_WHISPER_PORT}/v1/asr" \
'{"asr_result":"well"}' \
"whisper-service" \
"whisper-service" \
@@ -138,7 +124,7 @@ function validate_microservices() {
# vLLM service
validate_services \
"${host_ip}:${DOCSUM_VLLM_SERVICE_PORT}/v1/chat/completions" \
"${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT}/v1/chat/completions" \
"content" \
"docsum-vllm-service" \
"docsum-vllm-service" \
@@ -146,7 +132,7 @@ function validate_microservices() {
# llm microservice
validate_services \
"${host_ip}:${DOCSUM_LLM_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_LLM_SERVER_PORT}/v1/docsum" \
"text" \
"docsum-llm-server" \
"docsum-llm-server" \
@@ -159,7 +145,7 @@ function validate_megaservice() {
local DOCKER_NAME="docsum-backend-server"
local EXPECTED_RESULT="[DONE]"
local INPUT_DATA="messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."
local URL="${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
local URL="${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
local DATA_TYPE="type=text"
local HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST -F "$DATA_TYPE" -F "$INPUT_DATA" -H 'Content-Type: multipart/form-data' "$URL")
@@ -189,7 +175,7 @@ function validate_megaservice_json() {
echo ""
echo ">>> Checking text data with Content-Type: application/json"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \
@@ -197,7 +183,7 @@ function validate_megaservice_json() {
echo ">>> Checking audio data"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \
@@ -205,7 +191,7 @@ function validate_megaservice_json() {
echo ">>> Checking video data"
validate_services \
"${host_ip}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum" \
"[DONE]" \
"docsum-backend-server" \
"docsum-backend-server" \

View File

@@ -14,16 +14,19 @@ services:
image: ${REGISTRY:-opea}/edgecraftrag:${TAG:-latest}
edgecraftrag-server:
build:
context: ../
dockerfile: ./Dockerfile.server
extends: edgecraftrag
image: ${REGISTRY:-opea}/edgecraftrag-server:${TAG:-latest}
edgecraftrag-ui:
build:
context: ../
dockerfile: ./ui/docker/Dockerfile.ui
extends: edgecraftrag
image: ${REGISTRY:-opea}/edgecraftrag-ui:${TAG:-latest}
edgecraftrag-ui-gradio:
build:
context: ../
dockerfile: ./ui/docker/Dockerfile.gradio
extends: edgecraftrag
image: ${REGISTRY:-opea}/edgecraftrag-ui-gradio:${TAG:-latest}
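
The build stanzas above restore `context: ../` next to each `dockerfile:` path, which is the fix described in the "Restore context in EdgeCraftRAG build.yaml" commit in the list at the top. A hedged sketch of rebuilding one of these images against that file; the `docker_image_build/build.yaml` location is an assumption about the repository layout:

```bash
# Rebuild the server image so the Dockerfile resolves relative to the restored ../ context.
cd GenAIExamples/EdgeCraftRAG/docker_image_build
docker compose -f build.yaml build edgecraftrag-server
docker images | grep edgecraftrag-server   # confirm the freshly built image is present
```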

View File

@@ -0,0 +1,15 @@
# Translation E2E test scripts
## Set the required environment variable
```bash
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
## Run test
On Intel Xeon:
```bash
bash test_compose_on_xeon.sh
```

View File

@@ -1,3 +0,0 @@
VERSION_MAJOR 1
VERSION_MINOR 3
VERSION_PATCH 0