Louie Tsai
e8cdf7d668
[ChatQnA] update to the latest Grafana Dashboard ( #1728 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-04-03 12:14:55 -07:00
chen, suyue
c48cd651e4
[CICD enhance] ChatQnA run CI with latest base image, group logs in GHA outputs. ( #1736 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-04-03 22:03:20 +08:00
chyundunovDatamonsters
c50dfb2510
Adding files to deploy ChatQnA application on ROCm vLLM ( #1560 )
...
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
2025-04-03 17:19:26 +08:00
Louie Tsai
8fe2d5d0be
Update README.md to have Table of contents ( #1721 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-04-01 10:31:05 -07:00
Xiaotian Chen
1bd56af994
Update TGI image versions ( #1625 )
...
Signed-off-by: xiaotia3 <xiaotian.chen@intel.com>
2025-04-01 11:27:51 +08:00
xiguiw
87baeb833d
Update TEI docker image to 1.6 ( #1650 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-03-27 09:40:22 +08:00
Louie Tsai
0736912c69
Change Gaudi node exporter from the default one to 41612 ( #1702 )
...
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-03-20 21:38:24 -07:00
XinyaoWa
6d24c1c77a
Merge FaqGen into ChatQnA ( #1654 )
...
1. Delete FaqGen
2. Refactor FaqGen into ChatQnA, serving as an LLM selection.
3. Combine all ChatQnA-related Dockerfiles into one
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2025-03-20 17:40:00 +08:00
James Edwards
527b146a80
Add final README.md and set_env.sh script for quickstart review. Previous pull request was 1595. ( #1662 )
...
Signed-off-by: Edwards, James A <jaedwards@habana.ai>
Co-authored-by: Edwards, James A <jaedwards@habana.ai>
2025-03-14 16:05:01 -07:00
Louie Tsai
671dff7f51
[ChatQnA] Enable Prometheus and Grafana with telemetry docker compose file. ( #1623 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-03-13 23:18:29 -07:00
Li Gang
0701b8cfff
[ChatQnA][docker] Check health of Redis to avoid dataprep failure ( #1591 )
...
Signed-off-by: Li Gang <gang.g.li@intel.com>
2025-03-13 10:52:33 +08:00
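The health gating this commit adds is done in compose via `healthcheck` plus `depends_on: condition: service_healthy`; the same gate can be sketched in plain shell. A minimal sketch only, assuming `redis-cli` is on the PATH; the function name and retry counts are illustrative:

```shell
# Mimic compose's service_healthy gate: poll Redis until it answers PONG,
# so a dataprep step does not start against a Redis that is still coming up.
wait_for_redis() {
  tries=0
  until redis-cli -h "${REDIS_HOST:-localhost}" ping 2>/dev/null | grep -q PONG; do
    tries=$((tries + 1))
    if [ "$tries" -ge 30 ]; then
      echo "redis not healthy after 30 attempts" >&2
      return 1
    fi
    sleep 2
  done
  echo "redis healthy"
}
```

In the actual compose file, the equivalent check would be a `test: ["CMD", "redis-cli", "ping"]` healthcheck on the Redis service.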
Eero Tamminen
4269669f73
Use GenAIComp base image to simplify Dockerfiles & reduce image sizes - part 2 ( #1638 )
...
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
2025-03-13 08:23:07 +08:00
chen, suyue
43d0a18270
Enhance ChatQnA test scripts ( #1643 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-03-10 17:36:26 +08:00
Wang, Kai Lawrence
5362321d3a
Fix vllm model cache directory ( #1642 )
...
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2025-03-10 13:40:42 +08:00
chen, suyue
4cab86260f
Use the latest HabanaAI/vllm-fork release tag to build vllm-gaudi image ( #1635 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: Liang Lv <liang1.lv@intel.com>
2025-03-07 20:40:32 +08:00
wangleflex
694207f76b
[ChatQnA] Show spinner after query to improve user experience ( #1003 ) ( #1628 )
...
Signed-off-by: Wang, Le3 <le3.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-03-07 17:08:53 +08:00
ZePan110
785ffb9a1e
Update compose.yaml for ChatQnA ( #1621 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-03-07 09:19:39 +08:00
ZePan110
6ead1b12db
Enable ChatQnA model cache for docker compose test. ( #1605 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-03-05 11:30:04 +08:00
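Enabling a model cache for compose tests generally amounts to reusing one host directory across runs so models are downloaded only once. A sketch under assumptions: the `MODEL_CACHE` variable name and default path are illustrative and may differ from what the repo's compose files use:

```shell
# Reuse a single host directory for Hugging Face model downloads across test runs.
export MODEL_CACHE="${MODEL_CACHE:-$HOME/.cache/huggingface}"
mkdir -p "$MODEL_CACHE"
# A compose service would then mount it, e.g.  volumes: ["${MODEL_CACHE}:/data"]
echo "$MODEL_CACHE"
```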
chen, suyue
8f8d3af7c3
Open ChatQnA frontend test ( #1594 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-03-04 10:41:22 +08:00
Spycsh
ce38a84372
Revert chatqna async and enhance tests ( #1598 )
...
Align with opea-project/GenAIComps#1354
2025-03-03 23:03:44 +08:00
Eze Lanza (Eze)
fba0de45d2
ChatQnA Docker compose file for Milvus as vdb ( #1548 )
...
Signed-off-by: Ezequiel Lanza <ezequiel.lanza@gmail.com>
Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Spycsh <sihan.chen@intel.com>
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
Signed-off-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: alexsin368 <alex.sin@intel.com>
Signed-off-by: WenjiaoYue <wenjiao.yue@intel.com>
Co-authored-by: Ezequiel Lanza <emlanza@CDQ242RKJDmac.local>
Co-authored-by: Kendall González León <kendallgonzalez@hotmail.es>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com>
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com>
Co-authored-by: jotpalch <49465120+jotpalch@users.noreply.github.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: Ying Hu <ying.hu@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Liang Lv <liang1.lv@intel.com>
Co-authored-by: Artem Astafev <a.astafev@datamonsters.com>
Co-authored-by: XinyaoWa <xinyao.wang@intel.com>
Co-authored-by: alexsin368 <109180236+alexsin368@users.noreply.github.com>
Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com>
2025-02-28 22:40:31 +08:00
chen, suyue
3d8009aa91
Fix benchmark scripts ( #1517 )
...
- Align benchmark default config:
1. Update default helm charts version.
2. Add `# mandatory` comment.
3. Update default model ID for LLM.
- Fix deploy issues:
1. Support different `replicaCount` for the w/ and w/o rerank tests.
2. Add `max_num_seqs` for vllm.
3. Add resource settings for tune mode.
- Fix benchmark issues:
1. Update `user_queries` and `concurrency` settings.
2. Remove invalid parameters.
3. Fix `dataset` and `prompt` settings, and ingest the dataset into the db.
4. Fix the benchmark hang issue with large user queries; updating `"processes": 16` fixes this issue.
5. Update the eval_path setting logic.
- Optimize benchmark readme.
- Optimize the log path to make the logs more readable.
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
2025-02-28 10:30:54 +08:00
XinyaoWa
78f8ae524d
Fix async in chatqna bug ( #1589 )
...
Align async with comps; related PR: opea-project/GenAIComps#1300
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2025-02-27 23:32:29 +08:00
Artem Astafev
6abf7652e8
Fix ChatQnA ROCm compose README file and absolute path for ROCm CI test ( #1159 )
...
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
2025-02-27 15:26:45 +08:00
Eero Tamminen
23a77df302
Fix "OpenAI" & "response" spelling ( #1561 )
2025-02-25 12:45:21 +08:00
Ying Hu
852bc7027c
Update README.md of AIPC quick start ( #1578 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-02-23 17:38:27 +08:00
ZePan110
caec354324
Fix trivy issue ( #1569 )
...
Fix docker image security issue
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-02-20 14:41:52 +08:00
xiguiw
d482554a6b
Fix mismatched environment variable ( #1575 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-02-19 19:24:10 +08:00
xiguiw
2ae6871fc5
Simplify ChatQnA AIPC user setting ( #1573 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-02-19 16:30:02 +08:00
ZePan110
799881a3fa
Remove perf test code from test scripts. ( #1510 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-02-18 16:23:49 +08:00
xiguiw
0c0edffc5b
Update vLLM CPU to the latest stable version ( #1546 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2025-02-17 08:26:25 +08:00
Kendall González León
80dd86f122
Fix an issue in the main README.md of ChatQnA. ( #1551 )
...
Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com>
2025-02-14 17:00:44 +08:00
Louie Tsai
970b869838
Add a new section on changing the LLM model (e.g., deepseek) based on the validated model table in the LLM microservice ( #1501 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Co-authored-by: Wang, Kai Lawrence <109344418+wangkl2@users.noreply.github.com>
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com>
2025-02-12 09:34:56 +08:00
XinyaoWa
87ff149f61
Remove vllm hpu triton version fix ( #1515 )
...
vllm-fork has fixed the triton version issue, so remove the duplicated code: https://github.com/HabanaAI/vllm-fork/blob/habana_main/requirements-hpu.txt
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2025-02-12 09:24:38 +08:00
chen, suyue
81b02bb947
Revert "HUGGINGFACEHUB_API_TOKEN environment is change to HF_TOKEN (#… ( #1521 )
...
Revert this PR since its test was not triggered properly due to the false merge of a WIP CI PR, 44a689b0bf , which blocked the CI test.
This change will be resubmitted in another PR.
2025-02-11 18:36:12 +08:00
Louie Tsai
47069ac70c
Fix a test script issue due to the name change of the telemetry yaml files ( #1516 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-02-11 17:58:42 +08:00
Louie Tsai
ad5523bac7
Enable OpenTelemetry Tracing for ChatQnA on Xeon and Gaudi via the docker compose merge feature ( #1488 )
...
Signed-off-by: Louie, Tsai <louie.tsai@intel.com>
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-02-10 22:58:50 -08:00
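The "compose merge feature" this commit relies on is docker compose's layering of multiple `-f` files, where later files extend or override earlier ones. A sketch with illustrative file names; the actual telemetry compose file name in the repo may differ:

```shell
# Build the merged compose command: the telemetry overlay would add
# Prometheus/Grafana/OTLP services on top of the base deployment.
base="compose.yaml"
telemetry="compose.telemetry.yaml"
cmd="docker compose -f $base -f $telemetry"
echo "$cmd up -d"
# Running "$cmd config" prints the merged configuration for inspection.
```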
xiguiw
45d5da2ddd
HUGGINGFACEHUB_API_TOKEN environment variable is changed to HF_TOKEN ( #1503 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-02-09 20:33:06 +08:00
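A backward-compatible way to handle such a rename is to prefer the new variable and fall back to the legacy one when unset; a minimal sketch (the token value is a placeholder, not a real token):

```shell
# Legacy name still set in an old environment, example value only.
HUGGINGFACEHUB_API_TOKEN="hf_legacy_example"
unset HF_TOKEN
# Prefer HF_TOKEN; fall back to the legacy HUGGINGFACEHUB_API_TOKEN.
export HF_TOKEN="${HF_TOKEN:-$HUGGINGFACEHUB_API_TOKEN}"
echo "$HF_TOKEN"
```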
Louie Tsai
4c41a5db83
Update README.md for OPEA OTLP tracing ( #1406 )
...
Signed-off-by: louie-tsai <louie.tsai@intel.com>
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
2025-02-05 13:03:15 -08:00
Liang Lv
9adf7a6af0
Add support for latest deepseek models on Gaudi ( #1491 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-02-05 08:30:04 +08:00
bjzhjing
ed163087ba
Provide unified scalable deployment and benchmarking support for exam… ( #1315 )
...
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-24 22:27:49 +08:00
chen, suyue
259099d19f
Remove kubernetes manifest related code and tests ( #1466 )
...
Remove deprecated kubernetes manifest related code and tests.
The k8s implementation for these examples, based on helm charts, will target the next release.
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-01-24 15:23:12 +08:00
chen, suyue
9a1118730b
Freeze the triton version in vllm-gaudi image to 3.1.0 ( #1463 )
...
The new triton version 3.2.0 can't work with vllm-gaudi. Freeze the triton version in vllm-gaudi image to 3.1.0.
Issue created for vllm-fork: HabanaAI/vllm-fork#732
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-01-24 09:50:59 +08:00
dolpher
9b0f98be8b
Update ChatQnA helm chart README. ( #1459 )
...
Signed-off-by: Dolpher Du <dolpher.du@intel.com>
2025-01-23 10:54:39 +08:00
Ervin Castelino
27fdbcab58
[chore/chatqna] Missing protocol in curl command ( #1447 )
...
This PR fixes the missing protocol in the curl command mentioned in the ChatQnA README for tei-embedding-service.
2025-01-22 21:41:47 +08:00
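The fix amounts to giving the service URL an explicit scheme; a sketch with illustrative host and port values (check the compose file for the real tei-embedding-service port):

```shell
host_ip="localhost"   # illustrative
tei_port="6006"       # illustrative; the real port comes from the compose file
# Explicit http:// scheme, which the original curl command was missing.
url="http://${host_ip}:${tei_port}/embed"
echo "$url"
# curl "$url" -X POST -d '{"inputs":"What is Deep Learning?"}' -H 'Content-Type: application/json'
```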
dolpher
ee0e5cc8d9
Sync value files from GenAIInfra ( #1428 )
...
All gaudi values updated with extra flags.
Added helm support for two new examples, Text2Image and SearchQnA. Minor fix for llm-uservice.
Signed-off-by: Dolpher Du <dolpher.du@intel.com>
2025-01-22 17:44:11 +08:00
WenjiaoYue
b721c256f9
Fix Domain Access Issue in Latest Vite Version ( #1444 )
...
Fix the restriction on using domain names when users run the latest version of Vite.
With the new version of Vite, the UI cannot be accessed via domain names due to Vite's new rules. This fix adds the corresponding parameters per Vite's new rules, ensuring that users can access the frontend via domain names when building the UI.
Fixes #1441
Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com>
2025-01-21 23:28:37 +08:00
chen, suyue
927698e23e
Simplify git clone code in CI test ( #1434 )
...
1. Simplify git clone code in CI test.
2. Replace git clone branch in Dockerfile.
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-01-21 23:00:08 +08:00
Wang, Kai Lawrence
284db982be
[ROCm] Fix the hf-token setting for TGI and TEI in ChatQnA ( #1432 )
...
This PR corrects the env variable names in the chatqna example on the ROCm platform that are passed to the docker containers of TGI and TEI. Either HF_TOKEN or HUGGING_FACE_HUB_TOKEN can be parsed by TGI, while HF_API_TOKEN is parsed by TEI.
TGI: https://github.com/huggingface/text-generation-inference/blob/main/router/src/server.rs#L1700C1-L1702C15
TEI: https://github.com/huggingface/text-embeddings-inference/blob/main/router/src/main.rs#L112
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2025-01-21 14:22:39 +08:00
Wang, Kai Lawrence
3d3ac59bfb
[ChatQnA] Update the default LLM to llama3-8B on cpu/gpu/hpu ( #1430 )
...
Update the default LLM to llama3-8B on cpu/nvgpu/amdgpu/gaudi for docker-compose deployment to avoid the potential model-serving issue and the missing chat-template issue with neural-chat-7b.
Slow serving issue of neural-chat-7b on ICX: #1420
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2025-01-20 22:47:56 +08:00