Commit Graph

46 Commits

Author SHA1 Message Date
xiguiw
45d5da2ddd HUGGINGFACEHUB_API_TOKEN environment is change to HF_TOKEN (#1503)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-02-09 20:33:06 +08:00
Liang Lv
9adf7a6af0 Add support for latest deepseek models on Gaudi (#1491)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-02-05 08:30:04 +08:00
Wang, Kai Lawrence
3d3ac59bfb [ChatQnA] Update the default LLM to llama3-8B on cpu/gpu/hpu (#1430)
Update the default LLM to llama3-8B on cpu/nvgpu/amdgpu/gaudi for docker-compose deployment to avoid the potential model serving issue or the missing chat-template issue using neural-chat-7b.

Slow serving issue of neural-chat-7b on ICX: #1420
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2025-01-20 22:47:56 +08:00
Liang Lv
0f7e5a37ac Adapt code for dataprep microservice refactor (#1408)
https://github.com/opea-project/GenAIComps/pull/1153

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-20 20:37:03 +08:00
xiguiw
2d5898244c Enchance health check in GenAIExample docker-compose (#1410)
Fix service launch issue

1. Update Gaudi TGI image from 2.0.6 to 2.3.1
2. Change the hpu-gaudi TGI health check condition.

Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2025-01-20 20:13:13 +08:00
Wang, Kai Lawrence
00e9da9ced [ChatQnA] Switch to vLLM as default llm backend on Gaudi (#1404)
Switching from TGI to vLLM as the default LLM serving backend on Gaudi for the ChatQnA example to enhance the perf. 

https://github.com/opea-project/GenAIExamples/issues/1213
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2025-01-17 20:46:38 +08:00
Letong Han
4cabd55778 Refactor Retrievers related Examples (#1387)
Delete redundant retrievers docker image in docker_images_list.md.
Refactor Retrievers related Examples READMEs.
Change all of the comps/retrievers/xxx/xxx/Dockerfile path into comps/retrievers/src/Dockerfile.

Fix the Examples CI issues of PR opea-project/GenAIComps#1138.
Signed-off-by: letonghan <letong.han@intel.com>
2025-01-16 14:21:48 +08:00
Liang Lv
3ca78867eb Update example code for embedding dependency moving to 3rd_party (#1368)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-10 15:36:58 +08:00
Louie Tsai
81022355a7 Enable OpenTelemetry Tracing for ChatQnA TGI serving on Gaudi (#1316)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-01-08 17:20:13 -08:00
Liang Lv
b3c405a5f6 Adapt example code for guardrails refactor (#1360)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
2025-01-08 14:35:23 +08:00
ZePan110
ed2b8ed983 Exclude dockerfile under tests and exclude check Dockerfile under tests. (#1354)
Signed-off-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-07 09:05:01 +08:00
ZePan110
aa5c91d7ee Check duplicated dockerfile (#1289)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-01-06 17:30:12 +08:00
chen, suyue
5c7a5bd850 Update Code and README for GenAIComps Refactor (#1285)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: WenjiaoYue <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>
2025-01-02 20:03:26 +08:00
Ying Hu
597f17b979 Update set_env.sh to fix LOGFLAG warning (#1319) 2024-12-30 10:54:26 +08:00
Wang, Kai Lawrence
4c01e14642 [ChatQnA] Remove enforce-eager to enable HPU graphs for better vLLM perf (#1210)
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2024-12-10 13:19:15 +08:00
Sihan Chen
907b30b7fe Refactor service names (#1199) 2024-11-28 10:01:31 +08:00
Wang, Kai Lawrence
ac470421d0 Update the llm backend ports (#1172)
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2024-11-22 09:20:09 +08:00
ZePan110
8808b51e42 Rename image name XXX-hpu to XXX-gaudi (#1154)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-19 22:18:41 +08:00
Louie Tsai
152adf8012 maintain a version info for docker_compose yaml files among release (#1141)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-11-17 22:39:41 -08:00
Louie Tsai
00d9bb6128 Enable vLLM Profiling for ChatQnA on Gaudi (#1128)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-11-14 15:46:33 -08:00
lvliang-intel
9ff7df9202 Use fixed version of TEI Gaudi for stability (#1101)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Malini Bhandaru <malini.bhandaru@intel.com>
2024-11-13 10:45:50 -08:00
lvliang-intel
1ff85f6a85 Upgrade TGI Gaudi version to v2.0.6 (#1088)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-12 14:38:22 +08:00
Letong Han
aa314f6757 [Readme] Update ChatQnA Readme for LLM Endpoint (#1086)
Signed-off-by: letonghan <letong.han@intel.com>
2024-11-11 13:53:06 +08:00
XinyaoWa
40386d9bd6 remove vllm-on-ray (#1084)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-08 13:01:48 +08:00
lvliang-intel
4635a927fa Make embedding run on CPU for aligning with Gaudi performance benchmark (#1057)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-07 17:39:34 +08:00
XinyaoWa
e9b164505e align vllm hpu version to latest vllm-fork (#1061)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-07 14:08:56 +08:00
xiguiw
a0921f127f [Doc] Fix broken build instruction (#1063)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2024-11-05 13:35:12 +08:00
Louie Tsai
a10b4a1f1d Address request from Issue#971 (#1018) 2024-10-23 23:57:52 -07:00
lvliang-intel
0eedbbfce0 Update aipc ollama docker compose and readme (#984)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-10-22 10:30:47 +08:00
lvliang-intel
9438d392b4 Update README for some minor issues (#1000)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-22 10:30:18 +08:00
lvliang-intel
3c164f3aa2 Make rerank run on gaudi for hpu docker compose (#980)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-18 21:49:36 +08:00
lvliang-intel
256b58c07e Replace environment variables with service name for ChatQnA (#977)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-18 11:31:24 +08:00
ylg
37c74b232c Update ChatQnA yaml and set retriever's TEI_EMBEDDING_ENDPOINT (#953)
Signed-off-by: longguang.yue <bigclouds@163.com>
2024-10-17 16:58:47 +08:00
lvliang-intel
c930bea172 Add missing nginx microservice and fix frontend test (#951)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-16 13:29:31 +08:00
lvliang-intel
619d941047 Set no wrapper ChatQnA as default (#891)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-11 13:30:45 +08:00
lvliang-intel
3fb60608b3 Use official tei gaudi image and update tgi gaudi version (#810)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-23 17:52:56 +08:00
Letong Han
c35fe0b429 [Doc] Update ChatQnA README for Nginx Docker Image (#862)
Signed-off-by: letonghan <letong.han@intel.com>
2024-09-23 12:25:30 +09:00
Letong Han
7eaab93d0b [Doc] Refine ChatQnA README (#855)
Signed-off-by: letonghan <letong.han@intel.com>
2024-09-20 11:20:20 +08:00
Neo Zhang Jianyu
bc817700b9 refactor the network port setting for AWS (#849)
Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com>
2024-09-19 21:58:56 +08:00
Letong Han
6c364487d3 [ChatQnA] Add Nginx in Docker Compose and README (#850)
Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-19 20:39:58 +08:00
XinyaoWa
2f03a3a894 Align parameters for "max_token, repetition_penalty,presence_penalty,frequency_penalty" (#726)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-19 14:15:25 +08:00
kevinintel
3b70fb0d42 Refine the quick start of ChatQnA (#828)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-18 22:23:22 +08:00
lvliang-intel
bceacdc804 Fix README issues (#817)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-18 09:50:17 +08:00
XinyaoWa
d2bab99835 refine readme for reorg (#782)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-11 14:57:29 +08:00
feng-intel
63406dc050 Yaml: add comments to specify gaudi device ids. (#753)
Signed-off-by: fengding <feng1.ding@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-11 12:02:18 +08:00
XinyaoWa
d73129cbf0 Refactor folder to support different vendors (#743)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-09-10 23:27:19 +08:00