ZePan110
799881a3fa
Remove perf test code from test scripts. ( #1510 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-02-18 16:23:49 +08:00
xiguiw
0c0edffc5b
update vLLM CPU to the latest stable version ( #1546 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
Co-authored-by: chen, suyue <suyue.chen@intel.com >
2025-02-17 08:26:25 +08:00
Kendall González León
80dd86f122
Make a fix in the main README.md of the ChatQnA. ( #1551 )
...
Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com >
2025-02-14 17:00:44 +08:00
Louie Tsai
970b869838
Add a new section to change LLM model such as deepseek based on validated model table in LLM microservice ( #1501 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
Co-authored-by: Wang, Kai Lawrence <109344418+wangkl2@users.noreply.github.com >
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com >
2025-02-12 09:34:56 +08:00
XinyaoWa
87ff149f61
Remove vllm hpu triton version fix ( #1515 )
...
vllm-fork has fixed the triton version issue, so remove the duplicated code: https://github.com/HabanaAI/vllm-fork/blob/habana_main/requirements-hpu.txt
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Co-authored-by: chen, suyue <suyue.chen@intel.com >
2025-02-12 09:24:38 +08:00
chen, suyue
81b02bb947
Revert "HUGGINGFACEHUB_API_TOKEN environment is change to HF_TOKEN (#… ( #1521 )
...
Revert this PR since the test was not triggered properly due to the mistaken merge of a WIP CI PR, 44a689b0bf, which blocked the CI test.
This change will be submitted in another PR.
2025-02-11 18:36:12 +08:00
Louie Tsai
47069ac70c
fix a test script issue due to name change for telemetry yaml files ( #1516 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-02-11 17:58:42 +08:00
Louie Tsai
ad5523bac7
Enable OpenTelemetry Tracing for ChatQnA on Xeon and Gaudi by docker compose merge feature ( #1488 )
...
Signed-off-by: Louie, Tsai <louie.tsai@intel.com >
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-02-10 22:58:50 -08:00
xiguiw
45d5da2ddd
HUGGINGFACEHUB_API_TOKEN environment is changed to HF_TOKEN ( #1503 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-02-09 20:33:06 +08:00
Louie Tsai
4c41a5db83
Update README.md for OPEA OTLP tracing ( #1406 )
...
Signed-off-by: louie-tsai <louie.tsai@intel.com >
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com >
2025-02-05 13:03:15 -08:00
Liang Lv
9adf7a6af0
Add support for latest deepseek models on Gaudi ( #1491 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-02-05 08:30:04 +08:00
bjzhjing
ed163087ba
Provide unified scalable deployment and benchmarking support for exam… ( #1315 )
...
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com >
Signed-off-by: letonghan <letong.han@intel.com >
Co-authored-by: letonghan <letong.han@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-24 22:27:49 +08:00
chen, suyue
259099d19f
Remove kubernetes manifest related code and tests ( #1466 )
...
Remove deprecated kubernetes manifest related code and tests.
The k8s implementation for those examples, based on helm charts, is targeted for the next release.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-24 15:23:12 +08:00
chen, suyue
9a1118730b
Freeze the triton version in vllm-gaudi image to 3.1.0 ( #1463 )
...
The new triton version 3.2.0 does not work with vllm-gaudi. Freeze the triton version in the vllm-gaudi image to 3.1.0.
Issue created for vllm-fork: HabanaAI/vllm-fork#732
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-24 09:50:59 +08:00
dolpher
9b0f98be8b
Update ChatQnA helm chart README. ( #1459 )
...
Signed-off-by: Dolpher Du <dolpher.du@intel.com >
2025-01-23 10:54:39 +08:00
Ervin Castelino
27fdbcab58
[chore/chatqna] Missing protocol in curl command ( #1447 )
...
This PR fixes the missing protocol in the curl command given in the ChatQnA README for tei-embedding-service.
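As a rough illustration of the class of problem fixed here, the sketch below prefixes a URL with a protocol when it has none. The helper name `with_protocol`, the default of `http`, and the example host, port, and path are hypothetical and are not taken from the README or the PR.

```python
def with_protocol(url: str, default: str = "http") -> str:
    """Return `url` unchanged if it already names a protocol, else prefix `default://`."""
    # A URL such as "localhost:6006/embed" has no "://" separator, so a client
    # has to guess the protocol; adding it explicitly removes the ambiguity.
    return url if "://" in url else f"{default}://{url}"


# Hypothetical endpoint, for illustration only.
fixed = with_protocol("localhost:6006/embed")
```

A URL that already carries a protocol, such as `https://example.com`, passes through unchanged.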
2025-01-22 21:41:47 +08:00
dolpher
ee0e5cc8d9
Sync value files from GenAIInfra ( #1428 )
...
All gaudi values updated with extra flags.
Added helm support for 2 new examples Text2Image and SearchQnA. Minor fix for llm-uservice.
Signed-off-by: Dolpher Du <dolpher.du@intel.com >
2025-01-22 17:44:11 +08:00
WenjiaoYue
b721c256f9
Fix Domain Access Issue in Latest Vite Version ( #1444 )
...
Fix the restriction on domain-name access that appears with the latest version of Vite.
With the new version of Vite, the UI cannot be accessed via a domain name because of Vite's new rules. This fix adds the parameters those rules require, so that users can reach the frontend via a domain name when building the UI.
Fixes #1441
Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com >
2025-01-21 23:28:37 +08:00
chen, suyue
927698e23e
Simplify git clone code in CI test ( #1434 )
...
1. Simplify git clone code in CI test.
2. Replace git clone branch in Dockerfile.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-21 23:00:08 +08:00
Wang, Kai Lawrence
284db982be
[ROCm] Fix the hf-token setting for TGI and TEI in ChatQnA ( #1432 )
...
This PR corrects the names of the env variables that the ChatQnA example on the ROCm platform passes to the TGI and TEI docker containers. TGI parses either HF_TOKEN or HUGGING_FACE_HUB_TOKEN, while TEI parses HF_API_TOKEN.
TGI: https://github.com/huggingface/text-generation-inference/blob/main/router/src/server.rs#L1700C1-L1702C15
TEI: https://github.com/huggingface/text-embeddings-inference/blob/main/router/src/main.rs#L112
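The naming difference described above can be sketched as a small mapping from one token to the variable names each container reads. This is only an illustration under stated assumptions: the function name `container_token_env` and the keys `"tgi"` and `"tei"` are hypothetical, and the actual change was made in the docker-compose files, not in Python.

```python
def container_token_env(hf_token: str) -> dict[str, dict[str, str]]:
    """Map a single Hugging Face token to the env variable names each service parses."""
    return {
        # Per the commit message, TGI parses either of these two names.
        "tgi": {"HF_TOKEN": hf_token, "HUGGING_FACE_HUB_TOKEN": hf_token},
        # Per the commit message, TEI parses this name.
        "tei": {"HF_API_TOKEN": hf_token},
    }


# Placeholder value, not a real token.
env = container_token_env("example-token")
```

Keeping the mapping in one place makes it harder to pass a name that one of the two services silently ignores.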
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-21 14:22:39 +08:00
Wang, Kai Lawrence
3d3ac59bfb
[ChatQnA] Update the default LLM to llama3-8B on cpu/gpu/hpu ( #1430 )
...
Update the default LLM to llama3-8B on cpu/nvgpu/amdgpu/gaudi for docker-compose deployment to avoid the potential model-serving and missing chat-template issues seen with neural-chat-7b.
Slow serving issue of neural-chat-7b on ICX: #1420
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-20 22:47:56 +08:00
chen, suyue
7a54064d65
remove Dockerfile.wrapper ( #1429 )
...
Remove Dockerfile.wrapper: it is no longer used and no test covers it, so removing it avoids regressions.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-20 20:49:18 +08:00
Liang Lv
0f7e5a37ac
Adapt code for dataprep microservice refactor ( #1408 )
...
https://github.com/opea-project/GenAIComps/pull/1153
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-01-20 20:37:03 +08:00
xiguiw
2d5898244c
Enhance health check in GenAIExample docker-compose ( #1410 )
...
Fix service launch issue
1. Update Gaudi TGI image from 2.0.6 to 2.3.1
2. Change the hpu-gaudi TGI health check condition.
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-01-20 20:13:13 +08:00
chen, suyue
6bfd156573
Clean up test scripts and enhance git clone ( #1417 )
...
1. Clean up test code in scripts.
2. Simplify git clone code.
3. Replace git clone branch in Dockerfile.
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-20 16:34:28 +08:00
Wang, Kai Lawrence
742cb6ddd3
[ChatQnA] Switch to vLLM as default llm backend on Xeon ( #1403 )
...
Switch from TGI to vLLM as the default LLM serving backend on Xeon for the ChatQnA example to improve performance.
https://github.com/opea-project/GenAIExamples/issues/1213
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-17 20:48:19 +08:00
Wang, Kai Lawrence
00e9da9ced
[ChatQnA] Switch to vLLM as default llm backend on Gaudi ( #1404 )
...
Switch from TGI to vLLM as the default LLM serving backend on Gaudi for the ChatQnA example to improve performance.
https://github.com/opea-project/GenAIExamples/issues/1213
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com >
2025-01-17 20:46:38 +08:00
XinyaoWa
301b5e9a69
Fix vllm hpu to a stable release ( #1398 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
2025-01-16 16:35:32 +08:00
Letong Han
4cabd55778
Refactor Retrievers related Examples ( #1387 )
...
Delete the redundant retrievers docker image in docker_images_list.md.
Refactor the READMEs of the Retrievers-related Examples.
Change all comps/retrievers/xxx/xxx/Dockerfile paths to comps/retrievers/src/Dockerfile.
Fix the Examples CI issues of PR opea-project/GenAIComps#1138 .
Signed-off-by: letonghan <letong.han@intel.com >
2025-01-16 14:21:48 +08:00
xiguiw
698a06edbf
[DOC] Fix document issue ( #1395 )
...
Signed-off-by: Wang, Xigui <xigui.wang@intel.com >
2025-01-16 11:30:07 +08:00
Eero Tamminen
0eae391fda
Use staged builds to minimize final image sizes ( #1031 )
...
Use staged image builds so that final images do not contain redundant items such as:
- Git tool and its deps
- Git repo history
- Test directories
Fixes : #225
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com >
2025-01-16 11:14:47 +08:00
XinyaoWa
7d218b9f36
Remove vllm hpu commit id limit ( #1386 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-14 11:05:32 +08:00
Ying Hu
91ff520baa
Update README.md to add K8S cluster link for Gaudi ( #1380 )
2025-01-13 09:33:58 +08:00
Liang Lv
3ca78867eb
Update example code for embedding dependency moving to 3rd_party ( #1368 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
2025-01-10 15:36:58 +08:00
dolpher
c795ef2203
Add helm deployment instructions for GenAIExamples ( #1373 )
...
Add helm deployment instructions for ChatQnA, AgentQnA, AudioQnA, CodeTrans, DocSum, FaqGen and VisualQnA
Signed-off-by: Dolpher Du <dolpher.du@intel.com >
2025-01-10 09:55:31 +08:00
Louie Tsai
81022355a7
Enable OpenTelemetry Tracing for ChatQnA TGI serving on Gaudi ( #1316 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com >
2025-01-08 17:20:13 -08:00
Jaswanth Karani
ddacb7e86d
fixed build issue ( #1367 )
2025-01-08 22:19:23 +08:00
Liang Lv
b3c405a5f6
Adapt example code for guardrails refactor ( #1360 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-08 14:35:23 +08:00
chen, suyue
23117871c2
remove chatqna-conversation-ui build in CI test ( #1361 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-08 12:09:33 +08:00
WenjiaoYue
9970605460
Adapt refactor comps ( #1340 )
...
Signed-off-by: WenjiaoYue
2025-01-08 10:36:24 +08:00
Pranav Singh
d2b49bbc82
[ChatQNA] Fix K8s Deployment for CPU/HPU ( #1274 )
...
Signed-off-by: Pranav Singh <pranav.singh@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-07 13:45:09 +08:00
ZePan110
ed2b8ed983
Exclude dockerfile under tests and exclude check Dockerfile under tests. ( #1354 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-07 09:05:01 +08:00
ZePan110
aa5c91d7ee
Check duplicated dockerfile ( #1289 )
...
Signed-off-by: ZePan110 <ze.pan@intel.com >
2025-01-06 17:30:12 +08:00
XinyaoWa
464e2d3125
Rename streaming to stream to align with OpenAI API ( #1332 )
...
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
2025-01-06 13:25:55 +08:00
chen, suyue
1f29eca288
fix chatqna benchmark without rerank config issue ( #1341 )
...
Signed-off-by: chensuyue <suyue.chen@intel.com >
2025-01-06 09:16:20 +08:00
chen, suyue
5c7a5bd850
Update Code and README for GenAIComps Refactor ( #1285 )
...
Signed-off-by: lvliang-intel <liang1.lv@intel.com >
Signed-off-by: chensuyue <suyue.chen@intel.com >
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com >
Signed-off-by: letonghan <letong.han@intel.com >
Signed-off-by: ZePan110 <ze.pan@intel.com >
Signed-off-by: WenjiaoYue
2025-01-02 20:03:26 +08:00
Ying Hu
597f17b979
Update set_env.sh to fix LOGFLAG warning ( #1319 )
2024-12-30 10:54:26 +08:00
Daniel De León
b27b48c488
Add microservice resources to no_proxy in the main ChatQnA README ( #1269 )
...
Signed-off-by: Daniel Deleon <daniel.de.leon@intel.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com >
2024-12-27 16:14:28 +08:00
bjzhjing
7d9b34cf5e
Chatqna/benchmark: Remove the deprecated directory ( #1261 )
...
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com >
2024-12-19 10:51:01 +08:00
lkk
2af1ea0f8e
remove examples gateway. ( #1243 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-13 15:16:11 +08:00