Razvan Liviu Varzaru
ebb7c24ca8
Add ChatQnA docker-compose example on Intel Xeon using MariaDB Vector ( #1916 )
...
Signed-off-by: Razvan-Liviu Varzaru <razvan@mariadb.org >
Co-authored-by: Liang Lv <liang1.lv@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-08 21:08:15 -07:00
Sun, Xuehao
b467a13ec3
daily update vLLM&vLLM-fork version ( #1914 )
...
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com >
2025-05-08 10:34:36 +08:00
chen, suyue
c546d96e98
downgrade tei version from 1.6 to 1.5, fix the chatqna perf regression ( #1886 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-04-25 23:00:36 +08:00
chen, suyue
13ea13862a
Remove proxy in CodeTrans test ( #1874 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-04-24 13:47:56 +08:00
xiguiw
4fc19c7d73
Update TEI docker images to CPU-1.6 ( #1791 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-04-17 15:00:06 +08:00
Liang Lv
71fe886ce9
Replaced TGI with vLLM for guardrail serving ( #1815 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-04-16 17:06:11 +08:00
chen, suyue
1095d88c5f
Group log lines in GHA outputs for more readable logs. ( #1821 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-04-16 13:17:53 +08:00
ZePan110
5f4b3a6d12
Adaptation to vllm v0.8.3 build paths ( #1761 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-04-09 13:20:02 +08:00
ZePan110
42735d0d7d
Fix vllm and vllm-fork tags ( #1766 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-04-07 22:58:50 +08:00
chen, suyue
c48cd651e4
[CICD enhance] ChatQnA run CI with latest base image, group logs in GHA outputs. ( #1736 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-04-03 22:03:20 +08:00
chyundunovDatamonsters
c50dfb2510
Adding files to deploy ChatQnA application on ROCm vLLM ( #1560 )
...
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com >
2025-04-03 17:19:26 +08:00
Xiaotian Chen
1bd56af994
Update TGI image versions ( #1625 )
...
Signed-off-by: xiaotia3 <xiaotian.chen@intel.com >
2025-04-01 11:27:51 +08:00
xiguiw
87baeb833d
Update TEI docker image to 1.6 ( #1650 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-03-27 09:40:22 +08:00
XinyaoWa
6d24c1c77a
Merge FaqGen into ChatQnA ( #1654 )
...
1. Delete FaqGen
2. Refactor FaqGen into ChatQnA, where it serves as an LLM selection.
3. Combine all ChatQnA-related Dockerfiles into one.
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
2025-03-20 17:40:00 +08:00
chen, suyue
43d0a18270
Enhance ChatQnA test scripts ( #1643 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-03-10 17:36:26 +08:00
chen, suyue
4cab86260f
Use the latest HabanaAI/vllm-fork release tag to build vllm-gaudi image ( #1635 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
Co-authored-by: Liang Lv <liang1.lv@intel.com >
2025-03-07 20:40:32 +08:00
ZePan110
6ead1b12db
Enable ChatQnA model cache for docker compose test. ( #1605 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-03-05 11:30:04 +08:00
chen, suyue
8f8d3af7c3
open chatqna frontend test ( #1594 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-03-04 10:41:22 +08:00
Spycsh
ce38a84372
Revert chatqna async and enhance tests ( #1598 )
...
align with opea-project/GenAIComps#1354
2025-03-03 23:03:44 +08:00
Eze Lanza (Eze)
fba0de45d2
ChatQnA Docker compose file for Milvus as vdb ( #1548 )
...
Signed-off-by: Ezequiel Lanza <ezequiel.lanza@gmail.com >
Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com >
Signed-off-by: chensuyue <suyue.chen@intel.com >
Signed-off-by: Spycsh <sihan.chen@intel.com >
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
Signed-off-by: ZePan110 <ze.pan@intel.com >
Signed-off-by: dependabot[bot] <support@github.com >
Signed-off-by: minmin-intel <minmin.hou@intel.com >
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com >
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com >
Signed-off-by: letonghan <letong.han@intel.com >
Signed-off-by: alexsin368 <alex.sin@intel.com >
Signed-off-by: WenjiaoYue <wenjiao.yue@intel.com >
Co-authored-by: Ezequiel Lanza <emlanza@CDQ242RKJDmac.local >
Co-authored-by: Kendall González León <kendallgonzalez@hotmail.es >
Co-authored-by: chen, suyue <suyue.chen@intel.com >
Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com >
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com >
Co-authored-by: jotpalch <49465120+jotpalch@users.noreply.github.com >
Co-authored-by: ZePan110 <ze.pan@intel.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: minmin-intel <minmin.hou@intel.com >
Co-authored-by: Ying Hu <ying.hu@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com >
Co-authored-by: Liang Lv <liang1.lv@intel.com >
Co-authored-by: Artem Astafev <a.astafev@datamonsters.com >
Co-authored-by: XinyaoWa <xinyao.wang@intel.com >
Co-authored-by: alexsin368 <109180236+alexsin368@users.noreply.github.com >
Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com >
2025-02-28 22:40:31 +08:00
Artem Astafev
6abf7652e8
Fix ChatQnA ROCm compose Readme file and absolute path for ROCM CI test ( #1159 )
...
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com >
2025-02-27 15:26:45 +08:00
ZePan110
caec354324
Fix trivy issue ( #1569 )
...
Fix docker image security issue
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-02-20 14:41:52 +08:00
ZePan110
799881a3fa
Remove perf test code from test scripts. ( #1510 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-02-18 16:23:49 +08:00
xiguiw
0c0edffc5b
update vLLM CPU to the latest stable version ( #1546 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
Co-authored-by: chen, suyue <suyue.chen@intel.com >
2025-02-17 08:26:25 +08:00
XinyaoWa
87ff149f61
Remove vllm hpu triton version fix ( #1515 )
...
vllm-fork has fixed the triton version issue, so remove the duplicated code: https://github.com/HabanaAI/vllm-fork/blob/habana_main/requirements-hpu.txt
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Co-authored-by: chen, suyue <suyue.chen@intel.com >
2025-02-12 09:24:38 +08:00
Louie Tsai
47069ac70c
fix a test script issue due to name change for telemetry yaml files ( #1516 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-02-11 17:58:42 +08:00
Louie Tsai
ad5523bac7
Enable OpenTelemetry Tracing for ChatQnA on Xeon and Gaudi by docker compose merge feature ( #1488 )
...
Signed-off-by: Louie, Tsai <louie.tsai@intel.com >
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-02-10 22:58:50 -08:00
Liang Lv
9adf7a6af0
Add support for latest deepseek models on Gaudi ( #1491 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-02-05 08:30:04 +08:00
chen, suyue
259099d19f
Remove kubernetes manifest related code and tests ( #1466 )
...
Remove deprecated kubernetes manifest related code and tests.
The k8s implementation for those examples, based on helm charts, is targeted for the next release.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-24 15:23:12 +08:00
chen, suyue
9a1118730b
Freeze the triton version in vllm-gaudi image to 3.1.0 ( #1463 )
...
The new triton version 3.2.0 does not work with vllm-gaudi. Freeze the triton version in vllm-gaudi image to 3.1.0.
Issue created for vllm-fork: HabanaAI/vllm-fork#732
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-24 09:50:59 +08:00
dolpher
ee0e5cc8d9
Sync value files from GenAIInfra ( #1428 )
...
All gaudi values updated with extra flags.
Added helm support for 2 new examples Text2Image and SearchQnA. Minor fix for llm-uservice.
Signed-off-by: Dolpher Du <dolpher.du@intel.com >
2025-01-22 17:44:11 +08:00
chen, suyue
927698e23e
Simplify git clone code in CI test ( #1434 )
...
1. Simplify git clone code in CI test.
2. Replace git clone branch in Dockerfile.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-21 23:00:08 +08:00
Wang, Kai Lawrence
284db982be
[ROCm] Fix the hf-token setting for TGI and TEI in ChatQnA ( #1432 )
...
This PR corrects the env variable names that the chatqna example on the ROCm platform passes to the TGI and TEI docker containers. TGI parses either HF_TOKEN or HUGGING_FACE_HUB_TOKEN, while TEI parses HF_API_TOKEN.
TGI: https://github.com/huggingface/text-generation-inference/blob/main/router/src/server.rs#L1700C1-L1702C15
TEI: https://github.com/huggingface/text-embeddings-inference/blob/main/router/src/main.rs#L112
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-21 14:22:39 +08:00
Wang, Kai Lawrence
3d3ac59bfb
[ChatQnA] Update the default LLM to llama3-8B on cpu/gpu/hpu ( #1430 )
...
Update the default LLM to llama3-8B on cpu/nvgpu/amdgpu/gaudi for docker-compose deployment to avoid the potential model serving issue and the missing chat-template issue seen with neural-chat-7b.
Slow serving issue of neural-chat-7b on ICX: #1420
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-20 22:47:56 +08:00
Liang Lv
0f7e5a37ac
Adapt code for dataprep microservice refactor ( #1408 )
...
https://github.com/opea-project/GenAIComps/pull/1153
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-01-20 20:37:03 +08:00
xiguiw
2d5898244c
Enhance health check in GenAIExample docker-compose ( #1410 )
...
Fix service launch issue
1. Update Gaudi TGI image from 2.0.6 to 2.3.1
2. Change the hpu-gaudi TGI health check condition.
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-01-20 20:13:13 +08:00
chen, suyue
6bfd156573
Clean up test scripts and enhance git clone ( #1417 )
...
1. Clean up test code in scripts.
2. Simplify git clone code.
3. Replace git clone branch in Dockerfile.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-20 16:34:28 +08:00
Wang, Kai Lawrence
742cb6ddd3
[ChatQnA] Switch to vLLM as default llm backend on Xeon ( #1403 )
...
Switch from TGI to vLLM as the default LLM serving backend on Xeon for the ChatQnA example to improve performance.
https://github.com/opea-project/GenAIExamples/issues/1213
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-17 20:48:19 +08:00
Wang, Kai Lawrence
00e9da9ced
[ChatQnA] Switch to vLLM as default llm backend on Gaudi ( #1404 )
...
Switch from TGI to vLLM as the default LLM serving backend on Gaudi for the ChatQnA example to improve performance.
https://github.com/opea-project/GenAIExamples/issues/1213
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-17 20:46:38 +08:00
XinyaoWa
301b5e9a69
Fix vllm hpu to a stable release ( #1398 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
2025-01-16 16:35:32 +08:00
Letong Han
4cabd55778
Refactor Retrievers related Examples ( #1387 )
...
Delete redundant retrievers docker image in docker_images_list.md.
Refactor Retrievers related Examples READMEs.
Change all comps/retrievers/xxx/xxx/Dockerfile paths to comps/retrievers/src/Dockerfile.
Fix the Examples CI issues of PR opea-project/GenAIComps#1138.
Signed-off-by: letonghan <letong.han@intel.com >
2025-01-16 14:21:48 +08:00
XinyaoWa
7d218b9f36
Remove vllm hpu commit id limit ( #1386 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-14 11:05:32 +08:00
dolpher
c795ef2203
Add helm deployment instructions for GenAIExamples ( #1373 )
...
Add helm deployment instructions for ChatQnA, AgentQnA, AudioQnA, CodeTrans, DocSum, FaqGen and VisualQnA
Signed-off-by: Dolpher Du <dolpher.du@intel.com >
2025-01-10 09:55:31 +08:00
Louie Tsai
81022355a7
Enable OpenTelemetry Tracing for ChatQnA TGI serving on Gaudi ( #1316 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-01-08 17:20:13 -08:00
Liang Lv
b3c405a5f6
Adapt example code for guardrails refactor ( #1360 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-08 14:35:23 +08:00
chen, suyue
23117871c2
remove chatqna-conversation-ui build in CI test ( #1361 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-08 12:09:33 +08:00
XinyaoWa
464e2d3125
Rename streaming to stream to align with OpenAI API ( #1332 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
2025-01-06 13:25:55 +08:00
chen, suyue
5c7a5bd850
Update Code and README for GenAIComps Refactor ( #1285 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
Signed-off-by: chensuyue <suyue.chen@intel.com >
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Signed-off-by: letonghan <letong.han@intel.com >
Signed-off-by: ZePan110 <ze.pan@intel.com >
Signed-off-by: WenjiaoYue <wenjiao.yue@intel.com >
2025-01-02 20:03:26 +08:00
Wang, Kai Lawrence
4c01e14642
[ChatQnA] Remove enforce-eager to enable HPU graphs for better vLLM perf ( #1210 )
...
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2024-12-10 13:19:15 +08:00
ZePan110
340796bbae
Split ChatQnA manifest test ( #1190 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-04 15:17:46 +08:00