GenAIExamples

Author	SHA1	Message	Date
Louie Tsai	e8cdf7d668	[ChatQnA] update to the latest Grafana Dashboard (#1728 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-04-03 12:14:55 -07:00
chen, suyue	c48cd651e4	[CICD enhance] ChatQnA run CI with latest base image, group logs in GHA outputs. (#1736 ) Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-04-03 22:03:20 +08:00
chyundunovDatamonsters	c50dfb2510	Adding files to deploy ChatQnA application on ROCm vLLM (#1560 ) Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>	2025-04-03 17:19:26 +08:00
Louie Tsai	8fe2d5d0be	Update README.md to have Table for contents (#1721 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-04-01 10:31:05 -07:00
Xiaotian Chen	1bd56af994	Update TGI image versions (#1625 ) Signed-off-by: xiaotia3 <xiaotian.chen@intel.com>	2025-04-01 11:27:51 +08:00
xiguiw	87baeb833d	Update TEI docker image to 1.6 (#1650 ) Signed-off-by: Wang, Xigui <xigui.wang@intel.com>	2025-03-27 09:40:22 +08:00
Louie Tsai	0736912c69	change gaudi node exporter from default one to 41612 (#1702 ) Signed-off-by: Louie Tsai <louie.tsai@intel.com> Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-03-20 21:38:24 -07:00
XinyaoWa	6d24c1c77a	Merge FaqGen into ChatQnA (#1654 ) 1. Delete FaqGen 2. Refactor FaqGen into ChatQnA, serve as a LLM selection. 3. Combine all ChatQnA related Dockerfile into one Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>	2025-03-20 17:40:00 +08:00
James Edwards	527b146a80	Add final README.md and set_env.sh script for quickstart review. Previous pull request was 1595. (#1662 ) Signed-off-by: Edwards, James A <jaedwards@habana.ai> Co-authored-by: Edwards, James A <jaedwards@habana.ai>	2025-03-14 16:05:01 -07:00
Louie Tsai	671dff7f51	[ChatQnA] Enable Prometheus and Grafana with telemetry docker compose file. (#1623 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-03-13 23:18:29 -07:00
Li Gang	0701b8cfff	[ChatQnA][docker]Check healthy of redis to avoid dataprep failure (#1591 ) Signed-off-by: Li Gang <gang.g.li@intel.com>	2025-03-13 10:52:33 +08:00
Eero Tamminen	4269669f73	Use GenAIComp base image to simplify Dockerfiles & reduce image sizes - part 2 (#1638 ) Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>	2025-03-13 08:23:07 +08:00
chen, suyue	43d0a18270	Enhance ChatQnA test scripts (#1643 ) Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-03-10 17:36:26 +08:00
Wang, Kai Lawrence	5362321d3a	Fix vllm model cache directory (#1642 ) Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>	2025-03-10 13:40:42 +08:00
chen, suyue	4cab86260f	Use the latest HabanaAI/vllm-fork release tag to build vllm-gaudi image (#1635 ) Signed-off-by: chensuyue <suyue.chen@intel.com> Co-authored-by: Liang Lv <liang1.lv@intel.com>	2025-03-07 20:40:32 +08:00
wangleflex	694207f76b	[ChatQnA] Show spinner after query to improve user experience (#1003 ) (#1628 ) Signed-off-by: Wang,Le3 <le3.wang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-03-07 17:08:53 +08:00
ZePan110	785ffb9a1e	Update compose.yaml for ChatQnA (#1621 ) Update compose.yaml for ChatQnA Signed-off-by: ZePan110 <ze.pan@intel.com>	2025-03-07 09:19:39 +08:00
ZePan110	6ead1b12db	Enable ChatQnA model cache for docker compose test. (#1605 ) Enable ChatQnA model cache for docker compose test. Signed-off-by: ZePan110 <ze.pan@intel.com>	2025-03-05 11:30:04 +08:00
chen, suyue	8f8d3af7c3	open chatqna frontend test (#1594 ) Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-03-04 10:41:22 +08:00
Spycsh	ce38a84372	Revert chatqna async and enhance tests (#1598 ) align with opea-project/GenAIComps#1354	2025-03-03 23:03:44 +08:00
Eze Lanza (Eze)	fba0de45d2	ChatQnA Docker compose file for Milvus as vdb (#1548 ) Signed-off-by: Ezequiel Lanza <ezequiel.lanza@gmail.com> Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com> Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Spycsh <sihan.chen@intel.com> Signed-off-by: Wang, Xigui <xigui.wang@intel.com> Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: minmin-intel <minmin.hou@intel.com> Signed-off-by: Artem Astafev <a.astafev@datamonsters.com> Signed-off-by: Xinyao Wang <xinyao.wang@intel.com> Signed-off-by: Cathy Zhang <cathy.zhang@intel.com> Signed-off-by: letonghan <letong.han@intel.com> Signed-off-by: alexsin368 <alex.sin@intel.com> Signed-off-by: WenjiaoYue <wenjiao.yue@intel.com> Co-authored-by: Ezequiel Lanza <emlanza@CDQ242RKJDmac.local> Co-authored-by: Kendall González León <kendallgonzalez@hotmail.es> Co-authored-by: chen, suyue <suyue.chen@intel.com> Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com> Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com> Co-authored-by: jotpalch <49465120+jotpalch@users.noreply.github.com> Co-authored-by: ZePan110 <ze.pan@intel.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: minmin-intel <minmin.hou@intel.com> Co-authored-by: Ying Hu <ying.hu@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Artem Astafev <a.astafev@datamonsters.com> Co-authored-by: XinyaoWa <xinyao.wang@intel.com> Co-authored-by: alexsin368 <109180236+alexsin368@users.noreply.github.com> Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com>	2025-02-28 22:40:31 +08:00
chen, suyue	3d8009aa91	Fix benchmark scripts (#1517 ) - Align benchmark default config: 1. Update default helm charts version. 2. Add `# mandatory` comment. 3. Update default model ID for LLM. - Fix deploy issue: 1. Support different `replicaCount` for w/ w/o rerank test. 2. Add `max_num_seqs` for vllm. 3. Add resource setting for tune mode. - Fix Benchmark issue: 1. Update `user_queries` and `concurrency` setting. 2. Remove invalid parameters. 3. Fix `dataset` and `prompt` setting. And dataset ingest into db. 5. Fix the benchmark hang issue with large user queries. Update `"processes": 16` will fix this issue. 6. Update the eval_path setting logical. - Optimize benchmark readme. - Optimize the log path to make the logs more readable. Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Cathy Zhang <cathy.zhang@intel.com> Signed-off-by: letonghan <letong.han@intel.com>	2025-02-28 10:30:54 +08:00
XinyaoWa	78f8ae524d	Fix async in chatqna bug (#1589 ) Align async with comps: related PR: opea-project/GenAIComps#1300 Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>	2025-02-27 23:32:29 +08:00
Artem Astafev	6abf7652e8	Fix ChatQnA ROCm compose Readme file and absolute path for ROCM CI test (#1159 ) Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>	2025-02-27 15:26:45 +08:00
Eero Tamminen	23a77df302	Fix "OpenAI" & "response" spelling (#1561 )	2025-02-25 12:45:21 +08:00
Ying Hu	852bc7027c	Update README.md of AIPC quick start (#1578 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-02-23 17:38:27 +08:00
ZePan110	caec354324	Fix trivy issue (#1569 ) Fix docker image security issue Signed-off-by: ZePan110 <ze.pan@intel.com>	2025-02-20 14:41:52 +08:00
xiguiw	d482554a6b	Fix mismatched environment variable (#1575 ) Signed-off-by: Wang, Xigui <xigui.wang@intel.com>	2025-02-19 19:24:10 +08:00
xiguiw	2ae6871fc5	Simplify ChatQnA AIPC user setting (#1573 ) Signed-off-by: Wang, Xigui <xigui.wang@intel.com>	2025-02-19 16:30:02 +08:00
ZePan110	799881a3fa	Remove perf test code from test scripts. (#1510 ) Signed-off-by: ZePan110 <ze.pan@intel.com>	2025-02-18 16:23:49 +08:00
xiguiw	0c0edffc5b	update vLLM CPU to the latest stable version (#1546 ) Signed-off-by: Wang, Xigui <xigui.wang@intel.com> Co-authored-by: chen, suyue <suyue.chen@intel.com>	2025-02-17 08:26:25 +08:00
Kendall González León	80dd86f122	Make a fix in the main README.md of the ChatQnA. (#1551 ) Signed-off-by: Kendall González León <kendall.gonzalez.leon@intel.com>	2025-02-14 17:00:44 +08:00
Louie Tsai	970b869838	Add a new section to change LLM model such as deepseek based on validated model table in LLM microservice (#1501 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com> Co-authored-by: Wang, Kai Lawrence <109344418+wangkl2@users.noreply.github.com> Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com>	2025-02-12 09:34:56 +08:00
XinyaoWa	87ff149f61	Remove vllm hpu triton version fix (#1515 ) vllm-fork has fix triton version issue, remove duplicated code https://github.com/HabanaAI/vllm-fork/blob/habana_main/requirements-hpu.txt Signed-off-by: Xinyao Wang <xinyao.wang@intel.com> Co-authored-by: chen, suyue <suyue.chen@intel.com>	2025-02-12 09:24:38 +08:00
chen, suyue	81b02bb947	Revert "HUGGINGFACEHUB_API_TOKEN environment is change to HF_TOKEN (#… (#1521 ) Revert this PR since the test is not triggered properly due to the false merge of a WIP CI PR, `44a689b0bf`, which block the CI test. This change will be submitted in another PR.	2025-02-11 18:36:12 +08:00
Louie Tsai	47069ac70c	fix a test script issue due to name change for telemetry yaml files (#1516 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-02-11 17:58:42 +08:00
Louie Tsai	ad5523bac7	Enable OpenTelemetry Tracing for ChatQnA on Xeon and Gaudi by docker compose merge feature (#1488 ) Signed-off-by: Louie, Tsai <louie.tsai@intel.com> Signed-off-by: Tsai, Louie <louie.tsai@intel.com>	2025-02-10 22:58:50 -08:00
xiguiw	45d5da2ddd	HUGGINGFACEHUB_API_TOKEN environment is change to HF_TOKEN (#1503 ) Signed-off-by: Wang, Xigui <xigui.wang@intel.com>	2025-02-09 20:33:06 +08:00
Louie Tsai	4c41a5db83	Update README.md for OPEA OTLP tracing (#1406 ) Signed-off-by: louie-tsai <louie.tsai@intel.com> Signed-off-by: Tsai, Louie <louie.tsai@intel.com> Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>	2025-02-05 13:03:15 -08:00
Liang Lv	9adf7a6af0	Add support for latest deepseek models on Gaudi (#1491 ) Signed-off-by: lvliang-intel <liang1.lv@intel.com>	2025-02-05 08:30:04 +08:00
bjzhjing	ed163087ba	Provide unified scalable deployment and benchmarking support for exam… (#1315 ) Signed-off-by: Cathy Zhang <cathy.zhang@intel.com> Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: letonghan <letong.han@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-01-24 22:27:49 +08:00
chen, suyue	259099d19f	Remove kubernetes manifest related code and tests (#1466 ) Remove deprecated kubernetes manifest related code and tests. k8s implementation for those examples based on helm charts will target for next release. Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-01-24 15:23:12 +08:00
chen, suyue	9a1118730b	Freeze the triton version in vllm-gaudi image to 3.1.0 (#1463 ) The new triton version 3.2.0 can't work with vllm-gaudi. Freeze the triton version in vllm-gaudi image to 3.1.0. Issue create for vllm-fork: HabanaAI/vllm-fork#732 Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-01-24 09:50:59 +08:00
dolpher	9b0f98be8b	Update ChatQnA helm chart README. (#1459 ) Signed-off-by: Dolpher Du <dolpher.du@intel.com>	2025-01-23 10:54:39 +08:00
Ervin Castelino	27fdbcab58	[chore/chatqna] Missing protocol in curl command (#1447 ) This PR fixes the missing protocol in the curl command mentioned in chatqna readme for tei-embedding-service.	2025-01-22 21:41:47 +08:00
dolpher	ee0e5cc8d9	Sync value files from GenAIInfra (#1428 ) All gaudi values updated with extra flags. Added helm support for 2 new examples Text2Image and SearchQnA. Minor fix for llm-uservice. Signed-off-by: Dolpher Du <dolpher.du@intel.com>	2025-01-22 17:44:11 +08:00
WenjiaoYue	b721c256f9	Fix Domain Access Issue in Latest Vite Version (#1444 ) Fix the restriction on using domain names when users are using the latest version of Vite When users use the new version of Vite, the UI cannot be accessed via domain names due to Vite's new rules. This fix adds the corresponding parameters according to Vite's new rules, ensuring that users can access the frontend via domain names when building the UI. Fixes #1441 Co-authored-by: WenjiaoYue <wenjiao.yue@intel.com>	2025-01-21 23:28:37 +08:00
chen, suyue	927698e23e	Simplify git clone code in CI test (#1434 ) 1. Simplify git clone code in CI test. 2. Replace git clone branch in Dockerfile. Signed-off-by: chensuyue <suyue.chen@intel.com>	2025-01-21 23:00:08 +08:00
Wang, Kai Lawrence	284db982be	[ROCm] Fix the hf-token setting for TGI and TEI in ChatQnA (#1432 ) This PR is to correct the env variable names in chatqna example on ROCm platform passing to the docker container of TGI and TEI. For tgi, either HF_TOKEN and HUGGING_FACE_HUB_TOKEN could be parsed in TGI while HF_API_TOKEN can be parsed in TEI. TGI: https://github.com/huggingface/text-generation-inference/blob/main/router/src/server.rs#L1700C1-L1702C15 TEI: https://github.com/huggingface/text-embeddings-inference/blob/main/router/src/main.rs#L112 Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>	2025-01-21 14:22:39 +08:00
Wang, Kai Lawrence	3d3ac59bfb	[ChatQnA] Update the default LLM to llama3-8B on cpu/gpu/hpu (#1430 ) Update the default LLM to llama3-8B on cpu/nvgpu/amdgpu/gaudi for docker-compose deployment to avoid the potential model serving issue or the missing chat-template issue using neural-chat-7b. Slow serving issue of neural-chat-7b on ICX: #1420 Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>	2025-01-20 22:47:56 +08:00

443 Commits