Commit Graph

535 Commits

Author SHA1 Message Date
dolpher
1cc4d2119d Add kubernetes deployment for GenAIComps (#1104)
* Add kubernetes deployment for GenAIComps

---------

Signed-off-by: Dolpher Du <dolpher.du@intel.com>
2025-01-13 15:42:33 +08:00
XinyaoWa
88f93733b0 Refactor llm Docsum (#1101)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2025-01-13 15:24:43 +08:00
lkk
3a7ccb0a75 add tool choices for agent. (#1126)
2025-01-13 14:42:31 +08:00
Liang Lv
fe24decd72 Fix docker compose health check issue (#1133)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
2025-01-13 13:54:30 +08:00
Sihan Chen
feef30b0ea Refactor lvms (#1096)
Co-authored-by: ZePan110 <ze.pan@intel.com>
2025-01-13 13:06:59 +08:00
XinyaoWa
ea72c943bd Refactor FaqGen (#1093)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-13 11:30:59 +08:00
Yao Qing
3f23bf582a Remove version restrictions in animations (#1132)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
2025-01-10 11:00:15 -08:00
XinyuYe-Intel
9349478601 Make naming compatible to the defined style (#1129)
* make naming style compatible to the defined style.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-01-10 15:31:58 +08:00
Liang Lv
b91911a543 Refine embedding naming and move dependency to 3rd_party (#1125)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-10 14:44:25 +08:00
Yao Qing
4f9f95574b Rename folder name integration to integrations in image2image and animation (#1130)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
2025-01-10 10:35:21 +08:00
XinyuYe-Intel
efd95780fd Finetuning code refactor (#1081)
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
2025-01-09 11:50:15 +08:00
XinyuYe-Intel
2587a2978a Text2image code refactor (#1054)
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
2025-01-09 11:45:15 +08:00
Liang Lv
179b5da06b Refactor prompt registry microservice (#1124)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-09 11:26:03 +08:00
Liang Lv
ec66b91c51 Feedback management microservice refactor (#1057)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-09 11:18:15 +08:00
Sihan Chen
962e097893 Refactor web retriever (#1102)
2025-01-08 15:24:08 +08:00
Liang Lv
631b570481 Refactor guardrails microservice (#1116)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2025-01-08 13:29:23 +08:00
lkk
650be0d660 fix stream issue. (#1120)
2025-01-08 10:40:27 +08:00
WenjiaoYue
267cad1f44 Refactor reranking (#1113)
Signed-off-by: WenjiaoYue <redacted>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
2025-01-08 10:19:04 +08:00
Liang Lv
bf09739585 Refine Component Interface (#1106)
* Refine component interface

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* update env

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* add health check

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* update multimodal embedding

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* update import

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* refine other components

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix dataprep issue

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* fix tts issue

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* fix ci issues

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tts response issue

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2025-01-07 09:24:47 +08:00
lkk
cf90932fef refine agent directories. (#1109)
2025-01-06 17:35:39 +08:00
ZePan110
b933b66f15 Check duplicated dockerfile (#1073)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2025-01-06 17:27:59 +08:00
XinyaoWa
679e6664d4 Rename streaming to stream to align with OpenAI API (#1098)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2025-01-06 13:25:47 +08:00
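Commit 679e6664d4 above renames `streaming` to `stream`, the name the OpenAI Chat Completions API uses for its boolean streaming parameter. The following is a minimal sketch of how a caller might accept both spellings during such a transition. It assumes the rename applies to a request field; the function name `normalize_stream_flag` and the idea of keeping a compatibility shim are hypothetical and are not taken from the repository.

```python
from typing import Any, Dict


def normalize_stream_flag(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `payload` that uses `stream` instead of legacy `streaming`.

    Hypothetical helper for illustration only; not part of GenAIComps.
    """
    normalized = dict(payload)  # shallow copy, the caller's dict is left untouched
    if "streaming" in normalized:
        legacy_value = normalized.pop("streaming")
        # If the caller already set `stream`, that explicit value wins.
        normalized.setdefault("stream", legacy_value)
    return normalized


if __name__ == "__main__":
    print(normalize_stream_flag({"messages": [], "streaming": True}))
```

Letting an explicit `stream` value take precedence keeps the new name authoritative while older clients continue to work.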
chen, suyue
f57e30dde6 GenAIComps microservices refactor (#1072)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: WenjiaoYue <redacted>
2025-01-02 16:31:01 +08:00
Yao Qing
2cfd014b3b Refactor text2sql based on ERAG (#1080)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
2025-01-02 10:09:10 +08:00
XinyuYe-Intel
90a86345c5 Image2video code refactor (#1075)
* image2video code refactor.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spell error.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* Update opea_image2video_microservice.py

* changed naming

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

---------

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-31 13:18:15 +08:00
Sihan Chen
a19c222636 Refactor asr/tts components (#1083)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-31 12:03:10 +08:00
Yao Qing
1040875055 Refactor image2image (#1076)
* Refactor image2image

Signed-off-by: Yao, Qing <qing.yao@intel.com>

---------

Signed-off-by: Yao, Qing <qing.yao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-27 17:05:57 +08:00
Yao Qing
a7888ab299 Refactor Animation based on ERAG (#1079)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
2024-12-27 15:26:49 +08:00
Sihan Chen
f006a3ee6c remove dataprep/multimedia2text (#1065)
2024-12-26 14:42:33 +08:00
Cameron Morin
8d6b4b0ac7 Add opensearch integration for OPEA (#1024)
* Add opensearch integration for OPEA

Signed-off-by: Cameron Morin <cammorin@amazon.com>

* Update docker compose yaml workflows files

Signed-off-by: Cameron Morin <cammorin@amazon.com>

* Fix empty files

Signed-off-by: Cameron Morin <cammorin@amazon.com>

* Address PR comments

Signed-off-by: Cameron Morin <cammorin@amazon.com>

---------

Signed-off-by: Cameron Morin <cammorin@amazon.com>
2024-12-26 11:09:59 +08:00
XinyaoWa
45d0002057 DocSum Long Context add auto mode (#1046)
* docsum refine mode prompt update

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* docsum vllm requirement update

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* docsum add auto mode

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix readme

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-20 11:03:54 +08:00
minmin-intel
717c3c1025 Add SQL agent strategy (#1039)
* initial code for sql agent llama

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* add test for sql agent

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update sql agent test

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix bugs and use vllm to test sql agent

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* add tag-bench test and google search tool

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* test sql agent with hints

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix bugs for sql agent with hints and update test

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add readme for sql agent and fix ci bugs

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add sql agent using openai models

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bugs in sql agent openai

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* make wait time longer for sql agent microservice to be ready

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update readme

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix test bug

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip planexec with vllm due to vllm-gaudi bug

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* debug ut issue

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use vllm for all uts

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* debug ci issue

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* change vllm port

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update ut

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* remove tgi server

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* align vllm port

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* remove unnecessary files and fix bugs

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* connect to db with full uri

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update readme and use vllm mainstream

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* rm unnecessary log

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update readme

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update test script

Signed-off-by: minmin-intel <minmin.hou@intel.com>

---------

Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-18 21:35:58 +08:00
Sihan Chen
70c151d895 Fix wrong endpoint for tei embedding gaudi wrapper (#1043)
* fix doc index retriever embed issue on gaudi

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* align test router with examples

* align readme

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-18 16:58:39 +08:00
Letong Han
a6cdd17242 fix multimodalqna issue (#1042)
Signed-off-by: letonghan <letong.han@intel.com>
2024-12-17 16:57:45 +08:00
XinyuYe-Intel
39ae2643a2 Update Dockerfile (#1040)
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-12-17 14:17:54 +08:00
XinyaoWa
5aba3b25cf Support Long context for DocSum (#981)
* docsum four

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* support 4 modes for docsum

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine for docsum tgi

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* add docsum for ut and vllm

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ut bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix ut bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* set default value

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

---------

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-17 14:09:49 +08:00
Letong Han
f3aaaebf5a [Reorg] Remove redundant file in retrievers/redis (#1016)
Signed-off-by: letonghan <letong.han@intel.com>
2024-12-17 12:01:13 +08:00
lkk
ce1faf6ae1 refine tgi doc with default openai format. (#1037)
2024-12-17 10:43:08 +08:00
lkk
c955e5e498 update tei embedding format. (#1035)
2024-12-16 14:54:32 +08:00
minmin-intel
46835f95da Revert "Add SQL agent strategy (#975)" (#1030)
This reverts commit c36c5032dc.

Co-authored-by: lkk <33276950+lkk12014402@users.noreply.github.com>
2024-12-14 12:09:14 +08:00
XinyaoWa
48ed589822 vllm comps support openai API ChatCompletionRequest (#1032)
* vllm support openai API

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* test_llms_text-generation_vllm_langchain_on_intel_hpu.sh

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix time

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bug

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

---------

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-13 17:56:24 +08:00
lkk
f5efaf1f18 remove examples gateway. (#979)
* remove examples gateway.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove gateway.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine service code.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update http_service.py

* remove gateway ut.

* remove gateway ut.

* fix conflict service name.

* Update http_service.py

* add handle message ut.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove `multiprocessing.Process` start server code.

* fix ut.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove multiprocessing and enhance ut for coverage.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-12-13 09:31:11 +08:00
minmin-intel
c36c5032dc Add SQL agent strategy (#975)
* initial code for sql agent llama

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* add test for sql agent

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update sql agent test

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix bugs and use vllm to test sql agent

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* add tag-bench test and google search tool

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* test sql agent with hints

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix bugs for sql agent with hints and update test

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add readme for sql agent and fix ci bugs

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add sql agent using openai models

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bugs in sql agent openai

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* make wait time longer for sql agent microservice to be ready

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update readme

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* fix test bug

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip planexec with vllm due to vllm-gaudi bug

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* debug ut issue

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use vllm for all uts

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* debug ci issue

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* change vllm port

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* update ut

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* remove tgi server

Signed-off-by: minmin-intel <minmin.hou@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* align vllm port

Signed-off-by: minmin-intel <minmin.hou@intel.com>

---------

Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-11 10:55:55 -08:00
Letong Han
6acefae785 [LLM] Modify Params to Support Falcon3 Model (#1027)
* modify params to support falcon3 model

---------

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Zhenzhong Xu <zhenzhong.xu@intel.com>
2024-12-11 11:35:05 +08:00
Liang Lv
c409ef9fcc Add Component base class for code refactoring (#983)
* Add Component base class

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add controller class

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add ut

Signed-off-by: lvliang-intel <liang1.lv@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-10 13:20:16 +08:00
Wang, Kai Lawrence
ddd372d3e4 Remove enforce-eager to enable HPU graphs for better vLLM perf (#954)
* remove enforce-eager to enable HPU graphs

Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>

* Increase the llm max timeout in ci for fully warmup

Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>

---------

Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2024-12-10 13:19:56 +08:00
kkrishTa
5ed041bded Feature/elasticsearch vector store integration - Infosys (#972)
* Feature/elastic

Elasticsearch vectorstore, dataprep and retriever

---------

Co-authored-by: Adarsh <reachaadi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Liang Lv <liang1.lv@intel.com>
2024-12-10 09:40:44 +08:00
Yao Qing
fbf3017afb Revert mosec embedding microservice to use synchronous interface. (#971)
* Revert mosec embedding microservice to use synchronous interface.

Signed-off-by: Yao, Qing <qing.yao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add dependency.

Signed-off-by: Yao, Qing <qing.yao@intel.com>

---------

Signed-off-by: Yao, Qing <qing.yao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-06 13:18:51 +08:00
Eero Tamminen
5663e16821 Exclude yield/reply time from first token latency metric (#973)
While the metrics are OK for a small number of requests, when the
megaservice is handling many (hundreds of) _parallel_ requests, it
reported a clearly (~10%) larger first token latency than the one
observed by the client receiving the tokens from the megaservice.

Taking the timestamp before the token is yielded means that the reported
first token latency can be slightly shorter than it actually is. However,
testing with ChatQnA shows the latencies to be clearly closer to the ones
seen by the client (within a couple of percent) and typically smaller
(i.e. logical).

PS. Doing the metrics timing after yielding the token meant that the time
for sending the reply to the client, and waiting for that to complete,
was also included in the token time. I suspect that with a lot of parallel
requests, processing had often switched to other megaservice request
processing threads, and getting control back to the yielding thread for
timing could be delayed much longer than sending the response to the
client took.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
2024-12-06 11:08:57 +08:00
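The timing change described in commit 5663e16821 above can be illustrated with a small generator. This is a sketch under stated assumptions, not the megaservice code: `stream_with_latency`, `FakeClock`, and `measure` are hypothetical names, and a fake clock stands in for real model and network delays so the effect is deterministic. The point it shows is that a timestamp taken after `yield` only runs once the consumer asks for the next token, so it also counts the time the consumer spent handling (for example, sending) the first one.

```python
import time
from typing import Callable, Iterable, Iterator, List


def stream_with_latency(
    tokens: Iterable[str],
    record: Callable[[float], None],
    time_before_yield: bool = True,
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[str]:
    """Yield tokens and report first-token latency exactly once via record()."""
    start = clock()
    pending = True  # first-token latency not recorded yet
    for token in tokens:
        if pending and time_before_yield:
            # Timestamp before handing the token over: reply time is excluded.
            record(clock() - start)
            pending = False
        yield token
        if pending:
            # Timestamp after control returns to the generator, i.e. after
            # the consumer has finished handling the first token.
            record(clock() - start)
            pending = False


class FakeClock:
    """Deterministic clock used only for this illustration."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def measure(time_before_yield: bool) -> List[float]:
    clock = FakeClock()
    recorded: List[float] = []

    def produce() -> Iterator[str]:
        for token in ["a", "b", "c"]:
            clock.advance(1.0)  # pretend the model needs 1.0 s per token
            yield token

    stream = stream_with_latency(produce(), recorded.append, time_before_yield, clock)
    for _ in stream:
        clock.advance(0.5)  # pretend sending the reply takes 0.5 s
    return recorded


if __name__ == "__main__":
    print("before yield:", measure(True))
    print("after yield: ", measure(False))
```

With the fake clock, the before-yield measurement reports only the simulated model time, while the after-yield measurement also includes the simulated send time, which mirrors the overestimate the commit message describes.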
dependabot[bot]
3328ea3ab2 Bump aiohttp from 3.10.10 to 3.10.11 in /comps/animation/wav2lip (#966)
Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.10 to 3.10.11.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.10...v3.10.11)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-04 09:36:53 +08:00