Compare commits

...

159 Commits

Author SHA1 Message Date
NeuralChatBot
bbb4e231d0 Freeze OPEA images tag
Signed-off-by: NeuralChatBot <grp_neural_chat_bot@intel.com>
2024-11-21 14:24:16 +00:00
bjzhjing
da10068964 Adjustments for helm release change (#1173)
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
(cherry picked from commit ef2047b070)
2024-11-21 16:57:30 +08:00
Letong Han
188b568467 Fix Translation Manifest CI with MODEL_ID (#1169)
Signed-off-by: letonghan <letong.han@intel.com>
(cherry picked from commit 94231584aa)
2024-11-21 16:57:29 +08:00
minmin-intel
9e9af9766f Fix DocIndexRetriever CI error on Xeon (#1167)
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
(cherry picked from commit c5177c5e2f)
2024-11-21 16:57:28 +08:00
chen, suyue
cc108b5a18 Fix DBQnA image build (#1165)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-20 10:56:49 +08:00
chen, suyue
f70d9c3853 chatqna benchmark for v1.1 release (#1120)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2024-11-19 22:57:25 +08:00
ZePan110
8808b51e42 Rename image name XXX-hpu to XXX-gaudi (#1154)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-19 22:18:41 +08:00
chen, suyue
17d4b0c97f freeze nodejs version in CI test (#1162)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-19 13:22:56 +08:00
Sun, Xuehao
3a03d31f8f Update manual-freeze-tag workflow (#1161)
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
2024-11-19 11:00:36 +08:00
dependabot[bot]
179fd84362 Bump gradio from 4.44.0 to 5.5.0 in /DocSum/ui/gradio (#1157)
Signed-off-by: dependabot[bot] <support@github.com>
2024-11-18 23:50:56 +08:00
chen, suyue
9ba034b22d fix the docker image name for release image build (#1152)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-18 23:48:01 +08:00
jotpalch
c3e6f43ece Fix command in README for deploying ChatQnA application (#1156) 2024-11-18 22:59:22 +08:00
Theresa
1ac756a1c7 Rename the GraphRAG UI image (#1155)
Signed-off-by: ichbinblau <theresa.shan@intel.com>
2024-11-18 20:07:22 +08:00
sgurunat
56f770cb28 ChatQnA with Remote Inference Endpoints (Kubernetes) (#1149)
Signed-off-by: sgurunat <gurunath.s@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-18 20:06:17 +08:00
XinyaoWa
0cdeb946e4 DocSum Manifest support multimedia (#1158)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-18 18:46:01 +08:00
Artem Astafev
5648839411 Add compose example for FaqGen AMD ROCm (#1126)
Signed-off-by: artem-astafev <a.astafev@datamonsters.com>
2024-11-18 17:38:21 +08:00
Mustafa
eb91d1f054 Docsum (#1095)
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
Signed-off-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: XinyaoWa <xinyao.wang@intel.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-18 17:15:42 +08:00
Wang, Kai Lawrence
2587179224 Add instructions of modifying reranking docker image for NVGPU (#1133)
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-18 15:37:32 +08:00
chyundunovDatamonsters
7e62175c2e Adding files to deploy CodeTrans application on AMD GPU (#1138)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
2024-11-18 14:58:38 +08:00
Louie Tsai
152adf8012 maintain a version info for docker_compose yaml files among release (#1141)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-11-17 22:39:41 -08:00
chyundunovDatamonsters
83172e9a99 Adding files to deploy CodeGen application on AMD GPU (#1130)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-18 14:36:23 +08:00
Liang Lv
fb514bb8ba Add chatqna wrapper for multiple model selection (#1144)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Ying Hu <ying.hu@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-18 10:48:09 +08:00
Artem Astafev
b1bb6db52d Add compose example for DocSum amd rocm deployment (#1125)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-18 09:09:12 +08:00
rui2zhang
7949045176 EdgeCraftRAG: Add E2E test cases for EdgeCraftRAG - local LLM and vllm (#1137)
Signed-off-by: Zhang, Rui <rui2.zhang@intel.com>
Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mingyuan Qi <mingyuan.qi@intel.com>
2024-11-17 18:22:32 +08:00
Lianhao Lu
cbe952ec5e Fail CI manifest test if response content is not expected (#1145)
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
2024-11-17 12:46:31 +08:00
chen, suyue
3b1a9fe9e1 optimize hardware list for test (#1151)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-15 22:46:02 +08:00
chen, suyue
e66d7fe381 fix typo involved in ci workflow (#1150)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-15 21:19:29 +08:00
Artem Astafev
6d3a017609 Add compose example for ChatQnA AMD ROCm deployment (#1122)
Signed-off-by: Artem Astafev <a.astafev@datamonsters.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-15 17:24:06 +08:00
Ying Hu
dbf4ba03fa Update AgentQnA README.md for refactor doc structure (#1146)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-15 16:30:13 +08:00
XinyaoWa
4f96d9e605 vllm hpu fix version for bug fix (#1142)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-15 15:12:53 +08:00
Ying Hu
a8f4245384 Update README.md for usage experience (#1135)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-11-15 14:23:12 +08:00
Mingyuan Qi
096a37aacc EdgeCraftRAG: Fix multiple issues (#1143)
Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-15 14:01:27 +08:00
rbrugaro
6f8fa6a689 Grag ex1.1 (#1123)
Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com>
Signed-off-by: theresa <theresa.shan@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: theresa <theresa.shan@intel.com>
2024-11-15 13:17:06 +08:00
Letong Han
39f68d5d6b Fix SearchQnA CI Issue (#1134)
Signed-off-by: letonghan <letong.han@intel.com>
2024-11-15 10:01:27 +08:00
Louie Tsai
00d9bb6128 Enable vLLM Profiling for ChatQnA on Gaudi (#1128)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-11-14 15:46:33 -08:00
Abolfazl Shahbazi
59b624c677 Fix minor documentation build issue (#1139)
Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
2024-11-14 15:29:50 -08:00
chen, suyue
2b2c7ee2f5 upgrade setuptools version to fix CVE-2024-6345 (#999)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-14 14:57:16 +08:00
Hoong Tee, Yeoh
6b9a27dd83 DBQnA: Include workflow in README (#956)
Signed-off-by: Yeoh, Hoong Tee <hoong.tee.yeoh@intel.com>
2024-11-14 14:05:28 +08:00
Yi Yao
5720cd45c0 Add benchmark launcher for AudioQnA (#981)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-14 13:58:51 +08:00
XinyaoWa
73879d3cec fix faq ui bug (#1118)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-14 10:00:30 +08:00
Lucas Melo
7c9ed04132 ChatQnA - Add Terraform and Ansible Modules information (#970)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: lucasmelogithub <lucas.melo@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Malini Bhandaru <malini.bhandaru@intel.com>
2024-11-13 11:42:12 -08:00
lvliang-intel
9ff7df9202 Use fixed version of TEI Gaudi for stability (#1101)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Malini Bhandaru <malini.bhandaru@intel.com>
2024-11-13 10:45:50 -08:00
Abolfazl Shahbazi
b5f95f735e Fix missing end of file chars (#1106)
Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-13 09:40:53 -08:00
chen, suyue
393367e9f1 Fix left issue of tgi version update (#1121)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-13 15:42:42 +08:00
Louie Tsai
7adbba6add Enable vLLM Profiling for ChatQnA (#1124) 2024-11-13 11:26:31 +08:00
pallavijaini0525
0d52c2f003 Pinecone update to Readme and docker compose for ChatQnA (#540)
Signed-off-by: pallavi jaini <pallavi.jaini@intel.com>
Signed-off-by: AI Workloads <aigoldrush1@g2-r3-2.iind.intel.com>
Signed-off-by: Pallavi Jaini <pallavi,jaini@intel.com>
Signed-off-by: Pallavi Jaini <pallavi.jaini@intel.com>
Signed-off-by: root <root@test-pjaini.535545281608.us-region-2.idcservice.net>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: AI Workloads <aigoldrush1@g2-r3-2.iind.intel.com>
Co-authored-by: Pallavi Jaini <pallavi,jaini@intel.com>
Co-authored-by: root <root@test-pjaini.535545281608.us-region-2.idcservice.net>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-13 09:32:37 +08:00
lvliang-intel
1ff85f6a85 Upgrade TGI Gaudi version to v2.0.6 (#1088)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-12 14:38:22 +08:00
bjzhjing
f7a7f8aa3f Fix typo (#1117)
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2024-11-12 09:54:05 +08:00
lvliang-intel
e3187be819 Update ChatQnA manifests using always pull image policy (#1100)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-11-11 14:37:14 +08:00
Sihan Chen
abd9d12937 Fix non stream case (#1115)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-11 14:18:42 +08:00
bjzhjing
a7353bbaa4 Refine performance directory (#1017)
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2024-11-11 13:58:46 +08:00
Letong Han
aa314f6757 [Readme] Update ChatQnA Readme for LLM Endpoint (#1086)
Signed-off-by: letonghan <letong.han@intel.com>
2024-11-11 13:53:06 +08:00
WenjiaoYue
3744bb8c1b Fix docSum ui error in accessing parsed files (#1079)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Signed-off-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com>
Co-authored-by: XinyaoWa <xinyao.wang@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
2024-11-11 09:10:12 +08:00
chen, suyue
82801d0121 image build bug fix (#1105)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-08 23:54:32 +08:00
Wang, Kai Lawrence
f7026773b8 [ChatQnA] Fix the no_proxy setting for gpu example (#1078)
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2024-11-08 22:27:51 +08:00
Hoong Tee, Yeoh
edc09ece5c ProductivitySuite: Fix typo in README (#1083)
Signed-off-by: Yeoh, Hoong Tee <hoong.tee.yeoh@intel.com>
2024-11-08 22:26:32 +08:00
dependabot[bot]
dfed2aead2 Bump gradio from 5.0.0 to 5.5.0 in /MultimodalQnA/ui/gradio (#1080)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-08 22:24:36 +08:00
ZePan110
049517f977 Improve the robustness of links check workflow (#1096)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-08 22:19:52 +08:00
Neo Zhang Jianyu
ee83a6d5b4 opt CI to skip none MD and RST files (#1098)
Signed-off-by: ZhangJianyu <zhang.jianyu@outlook.com>
2024-11-08 22:07:17 +08:00
WenjiaoYue
e2bdd19fd4 update faqGen ui response (#1091)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
2024-11-08 21:29:52 +08:00
Zhu Yongbo
c9088eb824 Add EdgeCraftRag as a GenAIExample (#1072)
Signed-off-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Zhu, Yongbo <yongbo.zhu@intel.com>
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
2024-11-08 21:07:24 +08:00
XinyaoWa
9c3023a12e Fix faq ut bug (#1097)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-08 16:27:00 +08:00
Melanie Hart Buehler
bbc95bb708 MultimodalQnA Image and Audio Support Phase 1 (#1071)
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Co-authored-by: Omar Khleif <omar.khleif@intel.com>
Co-authored-by: dmsuehir <dina.s.jones@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
2024-11-08 15:54:49 +08:00
ZePan110
dd9623d3d5 Add new image repo clone. (#1093)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-08 15:27:42 +08:00
XinyaoWa
4c27a3d30c Align faqgen to form input (#1089)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-08 13:32:26 +08:00
XinyaoWa
40386d9bd6 remove vllm-on-ray (#1084)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-08 13:01:48 +08:00
Neo Zhang Jianyu
fe97e88c7a Add CI case to check online doc building, not update online doc (#1087)
Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com>
2024-11-08 11:57:01 +08:00
Hoong Tee, Yeoh
11d8b24c8a ProductivitySuite: Update TGI CPU image version to 2.4.0 (#1062)
Signed-off-by: Yeoh, Hoong Tee <hoong.tee.yeoh@intel.com>
2024-11-08 09:50:11 +08:00
lvliang-intel
4635a927fa Make embedding run on CPU for aligning with Gaudi performance benchmark (#1057)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-11-07 17:39:34 +08:00
ZePan110
1da44d99a1 Remove debug outputs (#1085)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-07 14:11:46 +08:00
XinyaoWa
e9b164505e align vllm hpu version to latest vllm-fork (#1061)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-07 14:08:56 +08:00
Arthur Leung
6263b517b9 [Doc] Add steps to deploy opea services using minikube (#1058)
Signed-off-by: Arthur Leung <arcyleung@gmail.com>
Co-authored-by: Arthur Leung <arcyleung@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-07 13:57:34 +08:00
chen, suyue
2de7c0ba89 Enhance CI hardware list detect (#1077)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-07 09:38:19 +08:00
Wang, Kai Lawrence
944ae47948 [ChatQnA] Fix the service connection issue on GPU and modify the emb backend (#1059)
Signed-off-by: Wang, Kai Lawrence <kai.lawrence.wang@intel.com>
2024-11-06 10:22:21 +08:00
Neo Zhang Jianyu
2d9aeb3715 fix wrong format which break online doc build (#1073)
Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com>
2024-11-05 17:01:40 +08:00
xiguiw
a0921f127f [Doc] Fix broken build instruction (#1063)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2024-11-05 13:35:12 +08:00
chen, suyue
cf86aceb18 Update nightly image build jobs (#1070)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-11-05 09:14:44 +08:00
chen, suyue
c2b7bd25d9 Use docker stop instead of docker compose stop to avoid container clean up issue (#1068)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-04 22:54:19 +08:00
chen, suyue
78331ee678 Add nightly image build and publish action (#1067)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-04 17:22:56 +08:00
ZePan110
7f7ad0e256 Inject commit for the release docker image (#1060)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-11-04 17:08:15 +08:00
lvliang-intel
0306c620b5 Update TGI CPU image to latest official release 2.4.0 (#1035)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-04 11:28:43 +08:00
lkk
3372b9d480 update accuracy embedding endpoint for no wrapper (#1056)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-04 09:18:49 +08:00
minmin-intel
5eb3d2869f Update AgentQnA example for v1.1 release (#885)
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-04 09:17:19 +08:00
Yi Yao
ced68e1834 Add performance benchmark scripts for 4 use cases. (#1052)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-03 12:41:02 +08:00
JoshuaL3000
bf5c391e47 Add Workflow Executor Example (#892)
Signed-off-by: JoshuaL3000 <joshua.jian.ern.liew@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-31 20:50:20 -05:00
XinyaoWa
c65d7d40fb fix vllm output in chatqna (#1038)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-11-01 09:26:57 +08:00
chen, suyue
9d124161e0 update action for CI (#1050)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-31 14:54:04 +08:00
chen, suyue
0f5a9c4a5e Fix ChatQnA manifest test issue on Xeon (#1044)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-31 14:23:17 +08:00
rbrugaro
a65640b4a5 Graph rag (#1007)
Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-30 08:52:25 -07:00
lvliang-intel
7197286a14 Fix ChatQnA manifest default port issue (#1033)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-30 11:52:04 +08:00
Chun Tao
960805a57b Adding audio and image/video files needed for loading the Gradio UI, and update the UI Python function (#1034)
Signed-off-by: Chun Tao <chun.tao@intel.com>
Signed-off-by: rbrugaro <rita.brugarolas.brufau@intel.com>
Signed-off-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Signed-off-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: rbrugaro <rita.brugarolas.brufau@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: kevinintel <hanwen.chang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-10-30 10:05:02 +08:00
Louie Tsai
002f0e2b11 Update VisualQnA README.md for its workflow (#912)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-30 09:27:22 +08:00
XinyaoWa
fde5996192 fix FaqGen accuracy scripts bug (#1039)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-10-29 16:34:11 +08:00
Lianhao Lu
bc47930ce1 manifest CI: repopulate the failure from inner test script (#1032)
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
2024-10-28 11:51:24 +08:00
Yao Qing
2332d22950 [Codegen] Replace codegen default Model to Qwen/Qwen2.5-Coder-7B-Instruct. (#1013)
Signed-off-by: Yao, Qing <qing.yao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-28 09:18:01 +08:00
XinyaoWa
a2afce1675 update codetrans default model (#1015)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-28 09:11:54 +08:00
WenjiaoYue
89f4c5fb41 update upload response format and add streaming method in front_end (#1019)
Signed-off-by: Yue, Wenjiao <wenjiao.yue@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-25 15:46:56 +08:00
lvliang-intel
98f66405ac Update docsum test command line format (#1027)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-10-25 15:39:05 +08:00
Louie Tsai
90c2d49050 Update CodeTrans README.md for workflow (#908)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-25 12:39:18 +08:00
xiguiw
95b58b51fa Fix AIPC docker container network issue (#1021)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2024-10-25 10:46:57 +08:00
chen, suyue
d3ce6f5357 add new secrets for CI test (#1023)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-24 18:10:22 +08:00
Louie Tsai
a10b4a1f1d Address request from Issue#971 (#1018) 2024-10-23 23:57:52 -07:00
XinyuYe-Intel
085d859a70 Add example for text2image (#920)
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-24 11:43:44 +08:00
chen, suyue
15cc457cea fix action path in CI workflow (#1016)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-23 17:40:08 +08:00
Chun Tao
cfffb4c005 Initiate "AvatarChatbot" (audio) example (#923)
Signed-off-by: Chun Tao <chun.tao@intel.com>
Signed-off-by: rbrugaro <rita.brugarolas.brufau@intel.com>
Signed-off-by: ZePan110 <ze.pan@intel.com>
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Signed-off-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: rbrugaro <rita.brugarolas.brufau@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: kevinintel <hanwen.chang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-10-23 14:58:17 +08:00
Chun Tao
41955f65ad Add a sample UI image for CodeGen's TGI monitoring (#1009)
Signed-off-by: Chun Tao <chun.tao@intel.com>
2024-10-23 14:38:12 +08:00
RuijingGuo
def39cfcdc setup ollama service in aipc docker compose (#1008)
Signed-off-by: Guo Ruijing <ruijing.guo@intel.com>
2024-10-23 14:22:48 +08:00
Louie Tsai
35a4fef70d Update Translation README.md for workflow (#907)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-23 11:35:15 +08:00
Louie Tsai
a3f9811f7e Update DocIndexRetriever README.md for workflow (#939)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-22 14:44:36 +08:00
lvliang-intel
0eedbbfce0 Update aipc ollama docker compose and readme (#984)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
2024-10-22 10:30:47 +08:00
lvliang-intel
9438d392b4 Update README for some minor issues (#1000)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-22 10:30:18 +08:00
Louie Tsai
1929dfd3a0 Update VideoQnA README.md for workflow (#906)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-21 13:56:45 -07:00
ZePan110
c7e33647ad Fix script name errors. (#997)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-10-21 11:44:50 +08:00
Dina Suehiro Jones
184e9a43b8 Update AudioQnA README to add a couple usage details (#948)
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Co-authored-by: Sihan Chen <39623753+Spycsh@users.noreply.github.com>
2024-10-21 10:22:22 +08:00
Sihan Chen
658867fce4 Add multi-language AudioQnA on Xeon (#982)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-21 09:58:14 +08:00
chen, suyue
620ef76d16 open manifest test in CI when dockerfile changed (#985)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-20 21:58:52 +08:00
Louie Tsai
23b820e740 Update Agent README.md for workflow (#950)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-18 23:58:04 +08:00
lvliang-intel
3c164f3aa2 Make rerank run on gaudi for hpu docker compose (#980)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-18 21:49:36 +08:00
CharleneHu-42
7669c42085 Update ChatQnA README to add benchmark launcher (#958)
Signed-off-by: CharleneHu-42 <yabai.hu@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Yi Yao <yi.a.yao@intel.com>
2024-10-18 13:33:20 +08:00
lvliang-intel
256b58c07e Replace environment variables with service name for ChatQnA (#977)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-18 11:31:24 +08:00
jiahuit1
3c3a5bed67 Remove deprecated images in docker_images_list.md (#979)
Signed-off-by: jiahuit1 <jia1.hui.tan@intel.com>
2024-10-18 11:21:46 +08:00
ylg
37c74b232c Update ChatQnA yaml and set retriever's TEI_EMBEDDING_ENDPOINT (#953)
Signed-off-by: longguang.yue <bigclouds@163.com>
2024-10-17 16:58:47 +08:00
Sihan Chen
4a265abb73 Fix top_n rerank docs (#976) 2024-10-17 15:49:16 +08:00
Sihan Chen
b0487fe92b fix chatqna accuracy issue with incorrect penalty (#974) 2024-10-17 15:48:44 +08:00
chen, suyue
d486bbbe10 Fix issue find in image build (#978)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-17 15:01:11 +08:00
XinyaoWa
b0f7c9cfc2 Support Chinese for Docsum (#960)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
2024-10-17 14:58:21 +08:00
chen, suyue
eeced9b31c Enhance CI/CD image build (#961)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-17 14:33:58 +08:00
WenjiaoYue
b377c2b8f8 Update manifest ui containerPort (#952)
Signed-off-by: Yue, Wenjiao <wenjiao.yue@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-17 09:42:55 +08:00
chen, suyue
5dae713793 add PINECONE_KEY_LANGCHAIN_TEST for CI test (#959)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-16 15:53:20 +08:00
lvliang-intel
c930bea172 Add missing nginx microservice and fix frontend test (#951)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-16 13:29:31 +08:00
Louie Tsai
0edff26ee5 Update Productivity README.md for workflow (#940)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-16 10:27:42 +08:00
lvliang-intel
778afb50ac Clean no wrapper image in performance benchmark manifests (#955)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
2024-10-15 18:21:53 +08:00
Louie Tsai
40800b0848 Update MultiModal README.md for workflow (#905)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-15 11:00:14 +08:00
dependabot[bot]
f2f6c09a0f Bump gradio from 4.44.0 to 5.0.0 in /MultimodalQnA/ui/gradio (#932)
Signed-off-by: dependabot[bot] <support@github.com>
2024-10-15 10:30:02 +08:00
WenjiaoYue
c6fc92d37c Add Text2Image UI, UI tests, Readme, and Docker support (#927)
Signed-off-by: Yue, Wenjiao <wenjiao.yue@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-14 13:36:33 +08:00
Supriya-Krishnamurthi
c0643b71e8 Adding DBQnA example in GenAIExamples (#894)
Signed-off-by: supriya-krishnamurthi <supriya.krishnamurthi@intel.com>
Signed-off-by: Yogesh <yogeshpandey@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Yogesh <yogeshpandey@intel.com>
Co-authored-by: Hoong Tee, Yeoh <hoong.tee.yeoh@intel.com>
Co-authored-by: Yogesh Pandey <yogesh.pandey@intel.com>
2024-10-14 13:36:00 +08:00
lkk
088ab98f31 update examples accuracy (#941)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-14 13:20:50 +08:00
Sun, Xuehao
441f8cc6ba Freeze docformatter in pre-commit (#937)
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
2024-10-14 09:30:23 +08:00
xiguiw
b056ce6617 [Doc] Update ChatQnA AIPC README (#935)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-12 11:04:53 +08:00
xiguiw
773c32b38b Fix AIPC retriever and UI error (#933)
Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
2024-10-11 13:35:27 +08:00
lvliang-intel
619d941047 Set no wrapper ChatQnA as default (#891)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-11 13:30:45 +08:00
Abolfazl Shahbazi
b71a12d424 Remove 'vim' from Dockerfiles (#924)
Signed-off-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
2024-10-10 18:24:31 -07:00
Louie Tsai
12469c92d8 Update CodeGen README for its workflow (#911)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-10 08:47:56 -07:00
Louie Tsai
fbde15b40d Update DocSum README.md for its workflow (#904)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-10 08:46:41 -07:00
feng-intel
ae10712fe8 doc: Update ChatQnA/benchmark/performance doc (#930) 2024-10-10 16:30:40 +08:00
ZePan110
373fa88033 Fix the issue of exiting due to inability to find hyperlinks (#929)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-10-10 14:34:26 +08:00
pallavijaini0525
e2f9037344 Added the K8s yaml for vLLM support (#917)
Signed-off-by: desaidhr <dhruv.desai@intel.com>
Co-authored-by: desaidhr <dhruv.desai@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-10 11:08:07 +08:00
shaohef
afc39fa4c0 Simplify the deployment ProductivitySuite on kubernetes (#919)
Signed-off-by: Shaohe Feng <shaohe.feng@intel.com>
Co-authored-by: Hoong Tee, Yeoh <hoong.tee.yeoh@intel.com>
2024-10-10 09:23:54 +08:00
ZePan110
e1c476c185 Add missing content (#914)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-10-10 09:08:44 +08:00
kevinintel
77920613dc Update CODEOWNERS (#918) 2024-10-10 07:17:08 +08:00
ZePan110
7dec00176e Optimize path and link validity check. (#866)
Signed-off-by: ZePan110 <ze.pan@intel.com>
2024-10-09 10:03:32 +08:00
Louie Tsai
bf28c7f098 Update SearchQnA README.md for its workflow (#913)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-08 08:50:28 -07:00
Louie Tsai
63bad29794 Update AudioQnA README.md for its workflow (#903) 2024-10-08 08:49:55 -07:00
chen, suyue
36d3ef2b17 fix image name (#909)
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-08 20:48:07 +08:00
Louie Tsai
0c6b044139 Update FaqGen README.md for its workflow (#910)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2024-10-08 20:47:26 +08:00
ZePan110
d23cd799e9 Update docker image list. (#893)
Signed-off-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: kevinintel <hanwen.chang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-08 14:05:28 +08:00
rbrugaro
644c3a67ce instruction finetune README improvement (#897)
Signed-off-by: rbrugaro <rita.brugarolas.brufau@intel.com>
2024-10-08 14:04:47 +08:00
Hoong Tee, Yeoh
ffecd182db [ProductivitySuite]: Update service port number (#879)
Signed-off-by: Yeoh, Hoong Tee <hoong.tee.yeoh@intel.com>
2024-09-30 22:01:09 -07:00
Zhenzhong1
d16c80e493 [ChatQnA] manage your own ChatQnA pipelines. (#878)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-30 17:01:44 +09:00
765 changed files with 55926 additions and 7334 deletions

.github/CODEOWNERS vendored (8 changed lines)

@@ -1,13 +1,17 @@
/AgentQnA/ xuhui.ren@intel.com
/AgentQnA/ kaokao.lv@intel.com
/AudioQnA/ sihan.chen@intel.com
/ChatQnA/ liang1.lv@intel.com
/CodeGen/ liang1.lv@intel.com
/CodeTrans/ sihan.chen@intel.com
/DocSum/ letong.han@intel.com
/DocIndexRetriever/ xuhui.ren@intel.com chendi.xue@intel.com
/DocIndexRetriever/ kaokao.lv@intel.com chendi.xue@intel.com
/InstructionTuning xinyu.ye@intel.com
/RerankFinetuning xinyu.ye@intel.com
/MultimodalQnA tiep.le@intel.com
/FaqGen/ xinyao.wang@intel.com
/SearchQnA/ sihan.chen@intel.com
/Translation/ liang1.lv@intel.com
/VisualQnA/ liang1.lv@intel.com
/ProductivitySuite/ hoong.tee.yeoh@intel.com
/VideoQnA huiling.bao@intel.com
/*/ liang1.lv@intel.com


@@ -0,0 +1,2 @@
ModelIn
modelin


@@ -1,2 +1,2 @@
Copyright (C) 2024 Intel Corporation
SPDX-License-Identifier: Apache-2.0
SPDX-License-Identifier: Apache-2.0


@@ -12,6 +12,10 @@ on:
example:
required: true
type: string
services:
default: ""
required: false
type: string
tag:
default: "latest"
required: false
@@ -36,6 +40,11 @@ on:
default: "main"
required: false
type: string
inject_commit:
default: false
required: false
type: string
jobs:
####################################################################################################
# Image Build
@@ -68,6 +77,10 @@ jobs:
git clone https://github.com/vllm-project/vllm.git
cd vllm && git rev-parse HEAD && cd ../
fi
if [[ $(grep -c "vllm-gaudi:" ${docker_compose_path}) != 0 ]]; then
git clone https://github.com/HabanaAI/vllm-fork.git
cd vllm-fork && git checkout 3c39626 && cd ../
fi
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps && git checkout ${{ inputs.opea_branch }} && git rev-parse HEAD && cd ../
@@ -77,7 +90,9 @@ jobs:
with:
work_dir: ${{ github.workspace }}/${{ inputs.example }}/docker_image_build
docker_compose_path: ${{ github.workspace }}/${{ inputs.example }}/docker_image_build/build.yaml
service_list: ${{ inputs.services }}
registry: ${OPEA_IMAGE_REPO}opea
inject_commit: ${{ inputs.inject_commit }}
tag: ${{ inputs.tag }}
####################################################################################################
@@ -105,7 +120,6 @@ jobs:
example: ${{ inputs.example }}
hardware: ${{ inputs.node }}
tag: ${{ inputs.tag }}
context: "CD"
secrets: inherit
####################################################################################################


@@ -14,7 +14,7 @@ on:
test_mode:
required: false
type: string
default: 'docker_compose'
default: 'compose'
outputs:
run_matrix:
description: "The matrix string"


@@ -20,11 +20,6 @@ on:
description: "Tag to apply to images, default is latest"
required: false
type: string
context:
default: "CI"
description: "CI or CD"
required: false
type: string
jobs:
manifest-test:
@@ -51,7 +46,7 @@ jobs:
- name: Set variables
run: |
echo "IMAGE_REPO=$OPEA_IMAGE_REPO" >> $GITHUB_ENV
echo "IMAGE_REPO=${OPEA_IMAGE_REPO}opea" >> $GITHUB_ENV
echo "IMAGE_TAG=${{ inputs.tag }}" >> $GITHUB_ENV
lower_example=$(echo "${{ inputs.example }}" | tr '[:upper:]' '[:lower:]')
echo "NAMESPACE=$lower_example-$(tr -dc a-z0-9 </dev/urandom | head -c 16)" >> $GITHUB_ENV
@@ -60,7 +55,6 @@ jobs:
echo "continue_test=true" >> $GITHUB_ENV
echo "should_cleanup=false" >> $GITHUB_ENV
echo "skip_validate=true" >> $GITHUB_ENV
echo "CONTEXT=${{ inputs.context }}" >> $GITHUB_ENV
echo "NAMESPACE=$NAMESPACE"
- name: Kubectl install
@@ -96,10 +90,16 @@ jobs:
echo "Validate ${{ inputs.example }} successful!"
else
echo "Validate ${{ inputs.example }} failure!!!"
.github/workflows/scripts/k8s-utils.sh dump_all_pod_logs $NAMESPACE
echo "Check the logs in 'Dump logs when e2e test failed' step!!!"
exit 1
fi
fi
- name: Dump logs when e2e test failed
if: failure()
run: |
.github/workflows/scripts/k8s-utils.sh dump_all_pod_logs $NAMESPACE
- name: Kubectl uninstall
if: always()
run: |


@@ -118,6 +118,9 @@ jobs:
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
PINECONE_KEY: ${{ secrets.PINECONE_KEY }}
PINECONE_KEY_LANGCHAIN_TEST: ${{ secrets.PINECONE_KEY_LANGCHAIN_TEST }}
SDK_BASE_URL: ${{ secrets.SDK_BASE_URL }}
SERVING_TOKEN: ${{ secrets.SERVING_TOKEN }}
IMAGE_REPO: ${{ inputs.registry }}
IMAGE_TAG: ${{ inputs.tag }}
example: ${{ inputs.example }}
@@ -138,7 +141,11 @@ jobs:
flag=${flag#test_}
yaml_file=$(find . -type f -wholename "*${{ inputs.hardware }}/${flag}.yaml")
echo $yaml_file
docker compose -f $yaml_file stop && docker compose -f $yaml_file rm -f || true
container_list=$(cat $yaml_file | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
done
docker system prune -f
docker rmi $(docker images --filter reference="*:5000/*/*" -q) || true


@@ -0,0 +1,35 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Check Online Document Building
permissions: {}
on:
pull_request:
branches: [main]
paths:
- "**.md"
- "**.rst"
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
path: GenAIExamples
- name: Checkout docs
uses: actions/checkout@v4
with:
repository: opea-project/docs
path: docs
- name: Build Online Document
shell: bash
run: |
echo "build online doc"
cd docs
bash scripts/build.sh
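
A rough local equivalent of this check, for reproducing it outside CI (a sketch; the directory layout simply mirrors the checkout paths the workflow uses, and the build script comes from opea-project/docs):

```bash
# Sketch: reproduce the online-doc build check locally
git clone https://github.com/opea-project/GenAIExamples.git
git clone https://github.com/opea-project/docs.git
cd docs
bash scripts/build.sh   # same command the CI job runs; fails if the docs cannot be built
```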


@@ -50,6 +50,11 @@ on:
description: 'OPEA branch for image build'
required: false
type: string
inject_commit:
default: true
description: "inject commit to docker images true or false"
required: false
type: string
permissions: read-all
jobs:
@@ -101,4 +106,5 @@ jobs:
test_k8s: ${{ fromJSON(inputs.test_k8s) }}
test_gmc: ${{ fromJSON(inputs.test_gmc) }}
opea_branch: ${{ inputs.opea_branch }}
inject_commit: ${{ inputs.inject_commit }}
secrets: inherit


@@ -1,13 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Freeze OPEA images release tag in readme on manual event
name: Freeze OPEA images release tag
on:
workflow_dispatch:
inputs:
tag:
default: "latest"
default: "1.1.0"
description: "Tag to apply to images"
required: true
type: string
@@ -23,10 +23,6 @@ jobs:
fetch-depth: 0
ref: ${{ github.ref }}
- uses: actions/setup-python@v5
with:
python-version: "3.10"
- name: Set up Git
run: |
git config --global user.name "NeuralChatBot"
@@ -35,9 +31,10 @@ jobs:
- name: Run script
run: |
find . -name "*.md" | xargs sed -i "s|^docker\ compose|TAG=${{ github.event.inputs.tag }}\ docker\ compose|g"
find . -type f -name "*.yaml" \( -path "*/benchmark/*" -o -path "*/kubernetes/*" \) | xargs sed -i -E 's/(opea\/[A-Za-z0-9\-]*:)latest/\1${{ github.event.inputs.tag }}/g'
find . -type f -name "*.md" \( -path "*/benchmark/*" -o -path "*/kubernetes/*" \) | xargs sed -i -E 's/(opea\/[A-Za-z0-9\-]*:)latest/\1${{ github.event.inputs.tag }}/g'
IFS='.' read -r major minor patch <<< "${{ github.event.inputs.tag }}"
echo "VERSION_MAJOR ${major}" > version.txt
echo "VERSION_MINOR ${minor}" >> version.txt
echo "VERSION_PATCH ${patch}" >> version.txt
- name: Commit changes
run: |
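
The sed substitutions in this workflow pin `opea/<image>:latest` references to the release tag across the `*.yaml` and `*.md` files under `benchmark/` and `kubernetes/`. A minimal worked example of the same substitution, with an illustrative input line and tag:

```bash
# Illustrative input line; the workflow applies this over files found by the find commands above
tag="1.1.0"
echo "image: opea/chatqna-ui:latest" | sed -E "s/(opea\/[A-Za-z0-9\-]*:)latest/\1${tag}/"
# prints: image: opea/chatqna-ui:1.1.0
```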


@@ -0,0 +1,66 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Build specific images on manual event
on:
workflow_dispatch:
inputs:
nodes:
default: "gaudi,xeon"
description: "Hardware to run test"
required: true
type: string
example:
default: "ChatQnA"
description: 'Build images belong to which example?'
required: true
type: string
services:
default: "chatqna,chatqna-without-rerank"
description: 'Service list to build'
required: true
type: string
tag:
default: "latest"
description: "Tag to apply to images"
required: true
type: string
opea_branch:
default: "main"
description: 'OPEA branch for image build'
required: false
type: string
inject_commit:
default: true
description: "inject commit to docker images true or false"
required: false
type: string
jobs:
get-test-matrix:
runs-on: ubuntu-latest
outputs:
nodes: ${{ steps.get-matrix.outputs.nodes }}
steps:
- name: Create Matrix
id: get-matrix
run: |
nodes=($(echo ${{ inputs.nodes }} | tr ',' ' '))
nodes_json=$(printf '%s\n' "${nodes[@]}" | sort -u | jq -R '.' | jq -sc '.')
echo "nodes=$nodes_json" >> $GITHUB_OUTPUT
image-build:
needs: get-test-matrix
strategy:
matrix:
node: ${{ fromJson(needs.get-test-matrix.outputs.nodes) }}
fail-fast: false
uses: ./.github/workflows/_example-workflow.yml
with:
node: ${{ matrix.node }}
example: ${{ inputs.example }}
services: ${{ inputs.services }}
tag: ${{ inputs.tag }}
opea_branch: ${{ inputs.opea_branch }}
inject_commit: ${{ inputs.inject_commit }}
secrets: inherit
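
Because this workflow is triggered by `workflow_dispatch`, it can also be started from the GitHub CLI. A sketch (the workflow file name is not visible in this diff, so `manual-image-build.yml` below is a placeholder):

```bash
# Hypothetical file name; substitute the actual workflow file under .github/workflows/
gh workflow run manual-image-build.yml \
  -f nodes="gaudi,xeon" \
  -f example="ChatQnA" \
  -f services="chatqna,chatqna-without-rerank" \
  -f tag="latest" \
  -f inject_commit="true"
```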


@@ -0,0 +1,70 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Nightly build/publish latest docker images
on:
schedule:
- cron: "30 13 * * *" # UTC time
workflow_dispatch:
env:
EXAMPLES: "AgentQnA,AudioQnA,ChatQnA,CodeGen,CodeTrans,DocIndexRetriever,DocSum,FaqGen,InstructionTuning,MultimodalQnA,ProductivitySuite,RerankFinetuning,SearchQnA,Translation,VideoQnA,VisualQnA"
TAG: "latest"
PUBLISH_TAGS: "latest"
jobs:
get-build-matrix:
runs-on: ubuntu-latest
outputs:
examples_json: ${{ steps.get-matrix.outputs.examples_json }}
EXAMPLES: ${{ steps.get-matrix.outputs.EXAMPLES }}
TAG: ${{ steps.get-matrix.outputs.TAG }}
PUBLISH_TAGS: ${{ steps.get-matrix.outputs.PUBLISH_TAGS }}
steps:
- name: Create Matrix
id: get-matrix
run: |
examples=($(echo ${EXAMPLES} | tr ',' ' '))
examples_json=$(printf '%s\n' "${examples[@]}" | sort -u | jq -R '.' | jq -sc '.')
echo "examples_json=$examples_json" >> $GITHUB_OUTPUT
echo "EXAMPLES=$EXAMPLES" >> $GITHUB_OUTPUT
echo "TAG=$TAG" >> $GITHUB_OUTPUT
echo "PUBLISH_TAGS=$PUBLISH_TAGS" >> $GITHUB_OUTPUT
build:
needs: get-build-matrix
strategy:
matrix:
example: ${{ fromJSON(needs.get-build-matrix.outputs.examples_json) }}
fail-fast: false
uses: ./.github/workflows/_example-workflow.yml
with:
node: gaudi
example: ${{ matrix.example }}
secrets: inherit
get-image-list:
needs: get-build-matrix
uses: ./.github/workflows/_get-image-list.yml
with:
examples: ${{ needs.get-build-matrix.outputs.EXAMPLES }}
publish:
needs: [get-build-matrix, get-image-list, build]
strategy:
matrix:
image: ${{ fromJSON(needs.get-image-list.outputs.matrix) }}
runs-on: "docker-build-gaudi"
steps:
- uses: docker/login-action@v3.2.0
with:
username: ${{ secrets.DOCKERHUB_USER }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Image Publish
uses: opea-project/validation/actions/image-publish@main
with:
local_image_ref: ${OPEA_IMAGE_REPO}opea/${{ matrix.image }}:${{ needs.get-build-matrix.outputs.TAG }}
image_name: opea/${{ matrix.image }}
publish_tags: ${{ needs.get-build-matrix.outputs.PUBLISH_TAGS }}
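
Both this nightly job and the manual-build workflow above turn a comma-separated list into a JSON matrix with the same shell and jq pattern. A worked example:

```bash
# "gaudi,xeon" -> ["gaudi","xeon"]
nodes=($(echo "gaudi,xeon" | tr ',' ' '))
printf '%s\n' "${nodes[@]}" | sort -u | jq -R '.' | jq -sc '.'
# prints: ["gaudi","xeon"]
```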


@@ -1,50 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Check Requirements
on: [pull_request]
jobs:
check-requirements:
runs-on: ubuntu-latest
steps:
- name: Checkout PR branch
uses: actions/checkout@v4
- name: Save PR requirements
run: |
find . -name "requirements.txt" -exec cat {} \; | \
grep -v '^\s*#' | \
grep -v '^\s*$' | \
grep -v '^\s*-' | \
sed 's/^\s*//' | \
awk -F'[>=<]' '{print $1}' | \
sort -u > pr-requirements.txt
cat pr-requirements.txt
- name: Checkout main branch
uses: actions/checkout@v4
with:
ref: main
path: main-branch
- name: Save main branch requirements
run: |
find ./main-branch -name "requirements.txt" -exec cat {} \; | \
grep -v '^\s*#' | \
grep -v '^\s*$' | \
grep -v '^\s*-' | \
sed 's/^\s*//' | \
awk -F'[>=<]' '{print $1}' | \
sort -u > main-requirements.txt
cat main-requirements.txt
- name: Compare requirements
run: |
comm -23 pr-requirements.txt main-requirements.txt > added-packages.txt
if [ -s added-packages.txt ]; then
echo "New packages found in PR:" && cat added-packages.txt
else
echo "No new packages found😊."
fi


@@ -12,7 +12,7 @@ on:
- "**/tests/test_gmc**"
- "!**.md"
- "!**.txt"
- "!**/kubernetes/**/manifests/**"
- "!**/kubernetes/**/manifest/**"
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}


@@ -8,7 +8,9 @@ on:
branches: ["main", "*rc"]
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- "**/kubernetes/**/manifests/**"
- "**/Dockerfile**"
- "**.py"
- "**/kubernetes/**/manifest/**"
- "**/tests/test_manifest**"
- "!**.md"
- "!**.txt"


@@ -50,28 +50,40 @@ jobs:
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check the Validity of Hyperlinks
run: |
cd ${{github.workspace}}
fail="FALSE"
url_lines=$(grep -Eo '\]\(http[s]?://[^)]+\)' --include='*.md' -r .)
if [ -n "$url_lines" ]; then
for url_line in $url_lines; do
url=$(echo "$url_line"|cut -d '(' -f2 | cut -d ')' -f1|sed 's/\.git$//')
path=$(echo "$url_line"|cut -d':' -f1 | cut -d'/' -f2-)
response=$(curl -L -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
response_retry=$(curl -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response_retry" -eq 200 ]; then
echo "*****Retry successfully*****"
else
echo "Invalid link from ${{github.workspace}}/$path: $url"
fail="TRUE"
fi
merged_commit=$(git log -1 --format='%H')
changed_files="$(git diff --name-status --diff-filter=ARM ${{ github.event.pull_request.base.sha }} ${merged_commit} | awk '/\.md$/ {print $NF}')"
if [ -n "$changed_files" ]; then
for changed_file in $changed_files; do
# echo $changed_file
url_lines=$(grep -H -Eo '\]\(http[s]?://[^)]+\)' "$changed_file" | grep -Ev 'GenAIExamples/blob/main') || true
if [ -n "$url_lines" ]; then
for url_line in $url_lines; do
# echo $url_line
url=$(echo "$url_line"|cut -d '(' -f2 | cut -d ')' -f1|sed 's/\.git$//')
path=$(echo "$url_line"|cut -d':' -f1 | cut -d'/' -f2-)
response=$(curl -L -s -o /dev/null -w "%{http_code}" "$url")|| true
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
response_retry=$(curl -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response_retry" -eq 200 ]; then
echo "*****Retry successfully*****"
else
echo "Invalid link from ${{github.workspace}}/$path: $url"
fail="TRUE"
fi
fi
done
fi
done
else
echo "No changed .md file."
fi
if [[ "$fail" == "TRUE" ]]; then
@@ -89,6 +101,8 @@ jobs:
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checking Relative Path Validity
run: |
@@ -102,33 +116,34 @@ jobs:
branch="https://github.com/opea-project/GenAIExamples/blob/${{ github.event.pull_request.head.ref }}"
fi
link_head="https://github.com/opea-project/GenAIExamples/blob/main"
merged_commit=$(git log -1 --format='%H')
changed_files="$(git diff --name-status --diff-filter=ARM ${{ github.event.pull_request.base.sha }} ${merged_commit} | awk '/\.md$/ {print $NF}')"
png_lines=$(grep -Eo '\]\([^)]+\)' --include='*.md' -r .|grep -Ev 'http')
if [ -n "$png_lines" ]; then
for png_line in $png_lines; do
refer_path=$(echo "$png_line"|cut -d':' -f1 | cut -d'/' -f2-)
png_path=$(echo "$png_line"|cut -d '(' -f2 | cut -d ')' -f1)
if [[ "${png_path:0:1}" == "/" ]]; then
check_path=${{github.workspace}}$png_path
elif [[ "${png_path:0:1}" == "#" ]]; then
check_path=${{github.workspace}}/$refer_path$png_path
check_path=$png_path
elif [[ "$png_path" == *#* ]]; then
relative_path=$(echo "$png_path" | cut -d '#' -f1)
if [ -n "$relative_path" ]; then
check_path=$(dirname "$refer_path")/$relative_path
png_path=$(echo "$png_path" | awk -F'#' '{print "#" $2}')
else
check_path=$refer_path
fi
else
check_path=${{github.workspace}}/$(dirname "$refer_path")/$png_path
check_path=$(dirname "$refer_path")/$png_path
fi
real_path=$(realpath $check_path)
if [ $? -ne 0 ]; then
echo "Path $png_path in file ${{github.workspace}}/$refer_path does not exist"
fail="TRUE"
else
url=$link_head$(echo "$real_path" | sed 's|.*/GenAIExamples||')
response=$(curl -I -L -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
response_retry=$(curl -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response_retry" -eq 200 ]; then
echo "*****Retry successfully*****"
else
echo "Retry failed. Check branch ${{ github.event.pull_request.head.ref }}"
url_dev=$branch$(echo "$real_path" | sed 's|.*/GenAIExamples||')
if [ -e "$check_path" ]; then
real_path=$(realpath $check_path)
if [[ "$png_line" == *#* ]]; then
if [ -n "changed_files" ] && echo "$changed_files" | grep -q "^${refer_path}$"; then
url_dev=$branch$(echo "$real_path" | sed 's|.*/GenAIExamples||')$png_path
response=$(curl -I -L -s -o /dev/null -w "%{http_code}" "$url_dev")
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
@@ -140,10 +155,13 @@ jobs:
fail="TRUE"
fi
else
echo "Check branch ${{ github.event.pull_request.head.ref }} successfully."
echo "Validation succeed $png_line"
fi
fi
fi
else
echo "${{github.workspace}}/$refer_path:$png_path does not exist"
fail="TRUE"
fi
done
fi


@@ -8,7 +8,8 @@ on:
branches: [ 'main' ]
paths:
- "**.py"
- "**Dockerfile"
- "**Dockerfile*"
- "**docker_image_build/build.yaml"
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-on-push
@@ -18,17 +19,15 @@ jobs:
job1:
uses: ./.github/workflows/_get-test-matrix.yml
with:
test_mode: "docker_image_build/build.yaml"
test_mode: "docker_image_build"
image-build:
needs: job1
strategy:
matrix:
example: ${{ fromJSON(needs.job1.outputs.run_matrix).include.*.example }}
node: ["gaudi","xeon"]
matrix: ${{ fromJSON(needs.job1.outputs.run_matrix) }}
fail-fast: false
uses: ./.github/workflows/_example-workflow.yml
with:
node: ${{ matrix.node }}
node: ${{ matrix.hardware }}
example: ${{ matrix.example }}
secrets: inherit


@@ -9,12 +9,20 @@ set -e
changed_files=$changed_files
test_mode=$test_mode
run_matrix="{\"include\":["
hardware_list="xeon gaudi" # current support hardware list
examples=$(printf '%s\n' "${changed_files[@]}" | grep '/' | cut -d'/' -f1 | sort -u)
for example in ${examples}; do
cd $WORKSPACE/$example
if [[ ! $(find . -type f | grep ${test_mode}) ]]; then continue; fi
cd tests
ls -l
if [[ "$test_mode" == "docker_image_build" ]]; then
find_name="test_manifest_on_*.sh"
else
find_name="test_${test_mode}*_on_*.sh"
fi
hardware_list=$(find . -type f -name "${find_name}" | cut -d/ -f2 | cut -d. -f1 | awk -F'_on_' '{print $2}'| sort -u)
echo -e "Test supported hardware list: \n${hardware_list}"
run_hardware=""
if [[ $(printf '%s\n' "${changed_files[@]}" | grep ${example} | cut -d'/' -f2 | grep -E '*.py|Dockerfile*|ui|docker_image_build' ) ]]; then
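
The supported hardware list here is derived from the test script names. A worked example with an illustrative file name:

```bash
# tests/test_compose_on_gaudi.sh -> hardware "gaudi"
echo "./test_compose_on_gaudi.sh" | cut -d/ -f2 | cut -d. -f1 | awk -F'_on_' '{print $2}'
# prints: gaudi
```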

.gitignore vendored (2 changed lines)

@@ -5,4 +5,4 @@
**/playwright/.cache/
**/test-results/
__pycache__/
__pycache__/


@@ -79,7 +79,7 @@ repos:
- id: isort
- repo: https://github.com/PyCQA/docformatter
rev: v1.7.5
rev: 06907d0
hooks:
- id: docformatter
args: [


@@ -1 +1 @@
**/kubernetes/
**/kubernetes/

.set_env.sh Normal file (16 changed lines)

@@ -0,0 +1,16 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
#To anounce the version of the codes, please create a version.txt and have following format.
#VERSION_MAJOR 1
#VERSION_MINOR 0
#VERSION_PATCH 0
VERSION_FILE="version.txt"
if [ -f $VERSION_FILE ]; then
VER_OPEA_MAJOR=$(grep "VERSION_MAJOR" $VERSION_FILE | cut -d " " -f 2)
VER_OPEA_MINOR=$(grep "VERSION_MINOR" $VERSION_FILE | cut -d " " -f 2)
VER_OPEA_PATCH=$(grep "VERSION_PATCH" $VERSION_FILE | cut -d " " -f 2)
export TAG=$VER_OPEA_MAJOR.$VER_OPEA_MINOR
echo OPEA Version:$TAG
fi
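
A sketch of how this helper would be used (the version.txt contents here are illustrative):

```bash
# Create a version file, then source the helper to export TAG
printf 'VERSION_MAJOR 1\nVERSION_MINOR 1\nVERSION_PATCH 0\n' > version.txt
source .set_env.sh   # prints "OPEA Version:1.1" and exports TAG=1.1
# compose files or commands that read ${TAG} would then pick up the pinned version
```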


@@ -5,6 +5,73 @@
This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram is shown below. The supervisor agent interfaces with the user and dispatch tasks to the worker agent and other tools to gather information and come up with answers. The worker agent uses the retrieval tool to generate answers to the queries posted by the supervisor agent. Other tools used by the supervisor agent may include APIs to interface knowledge graphs, SQL databases, external knowledge bases, etc.
![Architecture Overview](assets/agent_qna_arch.png)
The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
```mermaid
---
config:
flowchart:
nodeSpacing: 400
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 50px
---
flowchart LR
%% Colors %%
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
%% Subgraphs %%
subgraph DocIndexRetriever-MegaService["DocIndexRetriever MegaService "]
direction LR
EM([Embedding MicroService]):::blue
RET([Retrieval MicroService]):::blue
RER([Rerank MicroService]):::blue
end
subgraph UserInput[" User Input "]
direction LR
a([User Input Query]):::orchid
Ingest([Ingest data]):::orchid
end
AG_REACT([Agent MicroService - react]):::blue
AG_RAG([Agent MicroService - rag]):::blue
LLM_gen{{LLM Service <br>}}
DP([Data Preparation MicroService]):::blue
TEI_RER{{Reranking service<br>}}
TEI_EM{{Embedding service <br>}}
VDB{{Vector DB<br><br>}}
R_RET{{Retriever service <br>}}
%% Questions interaction
direction LR
a[User Input Query] --> AG_REACT
AG_REACT --> AG_RAG
AG_RAG --> DocIndexRetriever-MegaService
EM ==> RET
RET ==> RER
Ingest[Ingest data] --> DP
%% Embedding service flow
direction LR
AG_RAG <-.-> LLM_gen
AG_REACT <-.-> LLM_gen
EM <-.-> TEI_EM
RET <-.-> R_RET
RER <-.-> TEI_RER
direction TB
%% Vector DB interaction
R_RET <-.-> VDB
DP <-.-> VDB
```
### Why Agent for question answering?
1. Improve relevancy of retrieved context.
@@ -14,72 +81,122 @@ This example showcases a hierarchical multi-agent system for question-answering
3. Hierarchical agent can further improve performance.
Expert worker agents, such as retrieval agent, knowledge graph agent, SQL agent, etc., can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information together to provide a comprehensive answer.
### Roadmap
## Deployment with docker
- v0.9: Worker agent uses open-source websearch tool (duckduckgo), agents use OpenAI GPT-4o-mini as llm backend.
- v1.0: Worker agent uses OPEA retrieval megaservice as tool.
- v1.0 or later: agents use open-source llm backend.
- v1.1 or later: add safeguards
1. Build agent docker image [Optional]
## Getting started
> [!NOTE]
> the step is optional. The docker images will be automatically pulled when running the docker compose commands. This step is only needed if pulling images failed.
1. Build agent docker image </br>
First, clone the opea GenAIComps repo
First, clone the opea GenAIComps repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIComps.git
```
Then build the agent docker image. Both the supervisor agent and the worker agent will use the same docker image, but when we launch the two agents we will specify different strategies and register different tools.
```
cd GenAIComps
docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
```
2. Set up environment for this example </br>
First, clone this repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIComps.git
git clone https://github.com/opea-project/GenAIExamples.git
```
Then build the agent docker image. Both the supervisor agent and the worker agent will use the same docker image, but when we launch the two agents we will specify different strategies and register different tools.
Second, set up env vars.
```
cd GenAIComps
docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# for using open-source llms
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # so that models do not need to be re-downloaded every time
# optional: OPENAI_API_KEY if you want to use OpenAI models
export OPENAI_API_KEY=<your-openai-key>
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
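To spot-check that ingestion worked, you can query the retrieval mega-service directly. The endpoint and payload below match the `RETRIEVAL_TOOL_URL` and the `search_knowledge_base` tool used elsewhere in this example; adjust the host and port if your deployment differs.
```
# Sanity check (illustrative): the response should contain a "documents",
# "reranked_docs", or "text" field with retrieved context.
curl http://${host_ip}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
    "messages": "Most recent album by Taylor Swift"
}'
```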
4. Launch other tools. </br>
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
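You can confirm the mock API container is up before moving on, for example:
```
# The CRAG mock API container should show up as running
docker ps --filter "ancestor=docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0"
```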
5. Launch agent services</br>
We provide two options for `llm_engine` of the agents: 1. open-source LLMs, 2. OpenAI models via API calls.
Deploy the agents on Gaudi or Xeon, as shown in the tabs below.
::::{tab-set}
:::{tab-item} Gaudi
:sync: Gaudi
To use open-source LLMs on Gaudi2, run commands below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
bash launch_agent_service_tgi_gaudi.sh
```
:::
:::{tab-item} Xeon
:sync: Xeon
To use OpenAI models, run commands below.
```
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# optional: OPENAI_API_KEY if you want to use OpenAI models
export OPENAI_API_KEY=<your-openai-key>
```
The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use OpenAI GPT-4o-mini as the LLM, and we plan to add support for llama3.1-70B-instruct (served by TGI-Gaudi) in a subsequent release.
To use the OpenAI LLM, run the commands below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```
:::
::::
## Validate services
First, look at the logs of the agent docker containers:
```
# worker agent
docker logs rag-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
@@ -88,7 +205,7 @@ You should see something like "HTTP server setup successful" if the docker conta
Second, validate worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
@@ -96,11 +213,11 @@ curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: app
Third, validate supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
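If you only want the final answer, the JSON response includes a `text` field (the same field the test script in `AgentQnA/tests/test.py` reads). A quick way to extract it, assuming `jq` is installed:
```
curl -s http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
    "query": "Most recent album by Taylor Swift"
}' | jq -r '.text'
```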
## How to register your own tools with agent
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).

View File

@@ -0,0 +1,100 @@
# Single node on-prem deployment with Docker Compose on Xeon Scalable processors
This example showcases a hierarchical multi-agent system for question-answering applications. We deploy the example on Xeon. For LLMs, we use OpenAI models via API calls. For instructions on using open-source LLMs, please refer to the deployment guide [here](../../../../README.md).
## Deployment with docker
1. First, clone this repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```
2. Set up environment for this example </br>
```
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# OPENAI_API_KEY is needed if you want to use OpenAI models
export OPENAI_API_KEY=<your-openai-key>
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
4. Launch Tool service
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
5. Launch `Agent` service
The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use OpenAI GPT-4o-mini as the LLM here, and llama3.1-70B-instruct (served by TGI-Gaudi) in the Gaudi example. To use the OpenAI LLM, run the commands below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```
6. [Optional] Build `Agent` docker image if pulling images failed.
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/agent-langchain:latest -f comps/agent/langchain/Dockerfile .
```
## Validate services
First, look at the logs of the agent docker containers:
```
# worker agent
docker logs rag-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
You should see something like "HTTP server setup successful" if the docker containers are started successfully.
Second, validate worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
Third, validate supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
## How to register your own tools with agent
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).

View File

@@ -2,11 +2,10 @@
# SPDX-License-Identifier: Apache-2.0
services:
worker-docgrader-agent:
worker-rag-agent:
image: opea/agent-langchain:latest
container_name: docgrader-agent-endpoint
container_name: rag-agent-endpoint
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9095:9095"
@@ -36,8 +35,9 @@ services:
supervisor-react-agent:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- worker-rag-agent
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9090:9090"

View File

@@ -1,13 +1,16 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
export ip_address=$(hostname -I | awk '{print $1}')
export recursion_limit_worker=12
export recursion_limit_supervisor=10
export model="gpt-4o-mini-2024-07-18"
export temperature=0
export max_new_tokens=512
export max_new_tokens=4096
export OPENAI_API_KEY=${OPENAI_API_KEY}
export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"

View File

@@ -0,0 +1,105 @@
# Single node on-prem deployment of AgentQnA on Gaudi
This example showcases a hierarchical multi-agent system for question-answering applications. We deploy the example on Gaudi using open-source LLMs.
For more details, please refer to the deployment guide [here](../../../../README.md).
## Deployment with docker
1. First, clone this repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```
2. Set up environment for this example </br>
```
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# for using open-source llms
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
# Example: export HF_CACHE_DIR=$WORKDIR so that models do not need to be re-downloaded every time
export HF_CACHE_DIR=<directory-where-llms-are-downloaded>
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
4. Launch Tool service
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
5. Launch `Agent` service
To use open-source LLMs on Gaudi2, run commands below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
bash launch_agent_service_tgi_gaudi.sh
```
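The TGI Gaudi server can take a while to download and shard the 70B model, so check that it is ready before validating the agents. The launch script waits for the same log message; a manual check, assuming the default `tgi-server` container name:
```
# TGI is ready once its log reports that the server connected successfully
docker logs tgi-server 2>&1 | grep -i connected
```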
6. [Optional] Build `Agent` docker image if pulling images failed.
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/agent-langchain:latest -f comps/agent/langchain/Dockerfile .
```
## Validate services
First, look at the logs of the agent docker containers:
```
# worker agent
docker logs rag-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
You should see something like "HTTP server setup successful" if the docker containers are started successfully.
Second, validate worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
Third, validate supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
## How to register your own tools with agent
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).

View File

@@ -2,37 +2,9 @@
# SPDX-License-Identifier: Apache-2.0
services:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
container_name: tgi-server
ports:
- "8085:80"
volumes:
- ${HF_CACHE_DIR}:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
PT_HPU_ENABLE_LAZY_COLLECTIVES: true
ENABLE_HPU_GRAPH: true
LIMIT_HPU_GRAPH: true
USE_FLASH_ATTENTION: true
FLASH_ATTENTION_RECOMPUTE: true
runtime: habana
cap_add:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}
worker-docgrader-agent:
worker-rag-agent:
image: opea/agent-langchain:latest
container_name: docgrader-agent-endpoint
depends_on:
- tgi-server
container_name: rag-agent-endpoint
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
@@ -41,7 +13,7 @@ services:
ipc: host
environment:
ip_address: ${ip_address}
strategy: rag_agent
strategy: rag_agent_llama
recursion_limit: ${recursion_limit_worker}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
@@ -66,8 +38,7 @@ services:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- tgi-server
- worker-docgrader-agent
- worker-rag-agent
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
@@ -76,7 +47,7 @@ services:
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_langgraph
strategy: react_llama
recursion_limit: ${recursion_limit_supervisor}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}

View File

@@ -1,6 +1,9 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null
WORKPATH=$(dirname "$PWD")/..
# export WORKDIR=$WORKPATH/../../
echo "WORKDIR=${WORKDIR}"
@@ -15,7 +18,7 @@ export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
export NUM_SHARDS=4
export LLM_ENDPOINT_URL="http://${ip_address}:8085"
export temperature=0.01
export max_new_tokens=512
export max_new_tokens=4096
# agent related environment variables
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
@@ -27,17 +30,3 @@ export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:8080
docker compose -f compose.yaml up -d
sleep 5s
echo "Waiting tgi gaudi ready"
n=0
until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
docker logs tgi-server &> tgi-gaudi-service.log
n=$((n+1))
if grep -q Connected tgi-gaudi-service.log; then
break
fi
sleep 5s
done
sleep 5s
echo "Service started successfully"

View File

@@ -0,0 +1,25 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# LLM related environment variables
export HF_CACHE_DIR=${HF_CACHE_DIR}
ls $HF_CACHE_DIR
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
export NUM_SHARDS=4
docker compose -f tgi_gaudi.yaml up -d
sleep 5s
echo "Waiting tgi gaudi ready"
n=0
until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
docker logs tgi-server &> tgi-gaudi-service.log
n=$((n+1))
if grep -q Connected tgi-gaudi-service.log; then
break
fi
sleep 5s
done
sleep 5s
echo "Service started successfully"

View File

@@ -0,0 +1,30 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
services:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-server
ports:
- "8085:80"
volumes:
- ${HF_CACHE_DIR}:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
PT_HPU_ENABLE_LAZY_COLLECTIVES: true
ENABLE_HPU_GRAPH: true
LIMIT_HPU_GRAPH: true
USE_FLASH_ATTENTION: true
FLASH_ATTENTION_RECOMPUTE: true
runtime: habana
cap_add:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}

View File

@@ -17,6 +17,12 @@ if [ ! -d "$HF_CACHE_DIR" ]; then
fi
ls $HF_CACHE_DIR
function start_tgi(){
echo "Starting tgi-gaudi server"
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
}
function start_agent_and_api_server() {
echo "Starting CRAG server"
@@ -25,6 +31,7 @@ function start_agent_and_api_server() {
echo "Starting Agent services"
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_agent_service_tgi_gaudi.sh
sleep 10
}
function validate() {
@@ -43,18 +50,22 @@ function validate() {
function validate_agent_service() {
echo "----------------Test agent ----------------"
local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Tell me about Michael Jackson song thriller"
}')
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
docker logs docgrader-agent-endpoint
# local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
# "query": "Tell me about Michael Jackson song thriller"
# }')
export agent_port="9095"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
docker logs rag-agent-endpoint
if [ "$EXIT_CODE" == "1" ]; then
exit 1
fi
local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Tell me about Michael Jackson song thriller"
}')
# local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
# "query": "Tell me about Michael Jackson song thriller"
# }')
export agent_port="9090"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
docker logs react-agent-endpoint
if [ "$EXIT_CODE" == "1" ]; then
@@ -64,6 +75,10 @@ function validate_agent_service() {
}
function main() {
echo "==================== Start TGI ===================="
start_tgi
echo "==================== TGI started ===================="
echo "==================== Start agent ===================="
start_agent_and_api_server
echo "==================== Agent started ===================="

AgentQnA/tests/test.py (new file, 25 lines)
View File

@@ -0,0 +1,25 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import os
import requests
def generate_answer_agent_api(url, prompt):
proxies = {"http": ""}
payload = {
"query": prompt,
}
response = requests.post(url, json=payload, proxies=proxies)
answer = response.json()["text"]
return answer
if __name__ == "__main__":
ip_address = os.getenv("ip_address", "localhost")
agent_port = os.getenv("agent_port", "9095")
url = f"http://{ip_address}:{agent_port}/v1/chat/completions"
prompt = "Tell me about Michael Jackson song thriller"
answer = generate_answer_agent_api(url, prompt)
print(answer)

View File

@@ -19,7 +19,6 @@ function stop_crag() {
function stop_agent_docker() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi/
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
@@ -28,11 +27,21 @@ function stop_agent_docker() {
done
}
function stop_tgi(){
cd $WORKPATH/docker_compose/intel/hpu/gaudi/
container_list=$(cat tgi_gaudi.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
echo "Stopping container $container_name"
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
done
}
function stop_retrieval_tool() {
echo "Stopping Retrieval tool"
local RETRIEVAL_TOOL_PATH=$WORKPATH/../DocIndexRetriever
cd $RETRIEVAL_TOOL_PATH/docker_compose/intel/cpu/xeon/
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
@@ -43,25 +52,26 @@ function stop_retrieval_tool() {
echo "workpath: $WORKPATH"
echo "=================== Stop containers ===================="
stop_crag
stop_tgi
stop_agent_docker
stop_retrieval_tool
cd $WORKPATH/tests
echo "=================== #1 Building docker images===================="
bash 1_build_images.sh
bash step1_build_images.sh
echo "=================== #1 Building docker images completed===================="
echo "=================== #2 Start retrieval tool===================="
bash 2_start_retrieval_tool.sh
bash step2_start_retrieval_tool.sh
echo "=================== #2 Retrieval tool started===================="
echo "=================== #3 Ingest data and validate retrieval===================="
bash 3_ingest_data_and_validate_retrieval.sh
bash step3_ingest_data_and_validate_retrieval.sh
echo "=================== #3 Data ingestion and validation completed===================="
echo "=================== #4 Start agent and API server===================="
bash 4_launch_and_validate_agent_tgi.sh
bash step4_launch_and_validate_agent_tgi.sh
echo "=================== #4 Agent test passed ===================="
echo "=================== #5 Stop agent and API server===================="
@@ -70,4 +80,6 @@ stop_agent_docker
stop_retrieval_tool
echo "=================== #5 Agent and API server stopped===================="
echo y | docker system prune
echo "ALL DONE!"

View File

@@ -25,7 +25,7 @@ get_billboard_rank_date:
args_schema:
rank:
type: int
description: song name
description: the rank of interest, for example 1 for top 1
date:
type: str
description: date

View File

@@ -12,16 +12,31 @@ def search_knowledge_base(query: str) -> str:
print(url)
proxies = {"http": ""}
payload = {
"text": query,
"messages": query,
}
response = requests.post(url, json=payload, proxies=proxies)
print(response)
docs = response.json()["documents"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc
else:
context += "\n" + doc
print(context)
return context
if "documents" in response.json():
docs = response.json()["documents"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc
else:
context += "\n" + doc
# print(context)
return context
elif "text" in response.json():
return response.json()["text"]
elif "reranked_docs" in response.json():
docs = response.json()["reranked_docs"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc["text"]
else:
context += "\n" + doc["text"]
# print(context)
return context
else:
return "Error parsing response from the knowledge base."

View File

@@ -18,7 +18,7 @@ WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip && \
RUN pip install --no-cache-dir --upgrade pip setuptools && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./audioqna.py /home/user/audioqna.py

View File

@@ -0,0 +1,32 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
FROM python:3.11-slim
RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
libgl1-mesa-glx \
libjemalloc-dev \
git
RUN useradd -m -s /bin/bash user && \
mkdir -p /home/user && \
chown -R user /home/user/
WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip setuptools && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./audioqna_multilang.py /home/user/audioqna_multilang.py
ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps
USER user
WORKDIR /home/user
ENTRYPOINT ["python", "audioqna_multilang.py"]

View File

@@ -2,6 +2,63 @@
AudioQnA is an example that demonstrates the integration of Generative AI (GenAI) models for performing question-answering (QnA) on audio files, with the added functionality of Text-to-Speech (TTS) for generating spoken responses. The example showcases how to convert audio input to text using Automatic Speech Recognition (ASR), generate answers to user queries using a language model, and then convert those answers back to speech using Text-to-Speech (TTS).
The AudioQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
```mermaid
---
config:
flowchart:
nodeSpacing: 400
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 50px
---
flowchart LR
%% Colors %%
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
style AudioQnA-MegaService stroke:#000000
%% Subgraphs %%
subgraph AudioQnA-MegaService["AudioQnA MegaService "]
direction LR
ASR([ASR MicroService]):::blue
LLM([LLM MicroService]):::blue
TTS([TTS MicroService]):::blue
end
subgraph UserInterface[" User Interface "]
direction LR
a([User Input Query]):::orchid
UI([UI server<br>]):::orchid
end
WSP_SRV{{whisper service<br>}}
SPC_SRV{{speecht5 service <br>}}
LLM_gen{{LLM Service <br>}}
GW([AudioQnA GateWay<br>]):::orange
%% Questions interaction
direction LR
a[User Audio Query] --> UI
UI --> GW
GW <==> AudioQnA-MegaService
ASR ==> LLM
LLM ==> TTS
%% Embedding service flow
direction LR
ASR <-.-> WSP_SRV
LLM <-.-> LLM_gen
TTS <-.-> SPC_SRV
```
## Deploy AudioQnA Service
The AudioQnA service can be deployed on either Intel Gaudi2 or Intel Xeon Scalable Processor.

View File

@@ -0,0 +1,98 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import asyncio
import base64
import os
from comps import AudioQnAGateway, MicroService, ServiceOrchestrator, ServiceType
MEGA_SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0")
MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))
WHISPER_SERVER_HOST_IP = os.getenv("WHISPER_SERVER_HOST_IP", "0.0.0.0")
WHISPER_SERVER_PORT = int(os.getenv("WHISPER_SERVER_PORT", 7066))
GPT_SOVITS_SERVER_HOST_IP = os.getenv("GPT_SOVITS_SERVER_HOST_IP", "0.0.0.0")
GPT_SOVITS_SERVER_PORT = int(os.getenv("GPT_SOVITS_SERVER_PORT", 9088))
LLM_SERVER_HOST_IP = os.getenv("LLM_SERVER_HOST_IP", "0.0.0.0")
LLM_SERVER_PORT = int(os.getenv("LLM_SERVER_PORT", 8888))
def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
print(inputs)
if self.services[cur_node].service_type == ServiceType.ASR:
# {'byte_str': 'UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA'}
inputs["audio"] = inputs["byte_str"]
del inputs["byte_str"]
elif self.services[cur_node].service_type == ServiceType.LLM:
# convert TGI/vLLM to unified OpenAI /v1/chat/completions format
next_inputs = {}
next_inputs["model"] = "tgi" # specifically clarify the fake model to make the format unified
next_inputs["messages"] = [{"role": "user", "content": inputs["asr_result"]}]
next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
next_inputs["top_p"] = llm_parameters_dict["top_p"]
next_inputs["stream"] = inputs["streaming"] # False as default
next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
# next_inputs["presence_penalty"] = inputs["presence_penalty"]
# next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
next_inputs["temperature"] = inputs["temperature"]
inputs = next_inputs
elif self.services[cur_node].service_type == ServiceType.TTS:
next_inputs = {}
next_inputs["text"] = inputs["choices"][0]["message"]["content"]
next_inputs["text_language"] = kwargs["tts_text_language"] if "tts_text_language" in kwargs else "zh"
inputs = next_inputs
return inputs
def align_outputs(self, data, cur_node, inputs, runtime_graph, llm_parameters_dict, **kwargs):
if self.services[cur_node].service_type == ServiceType.TTS:
audio_base64 = base64.b64encode(data).decode("utf-8")
return {"byte_str": audio_base64}
return data
class AudioQnAService:
def __init__(self, host="0.0.0.0", port=8000):
self.host = host
self.port = port
ServiceOrchestrator.align_inputs = align_inputs
ServiceOrchestrator.align_outputs = align_outputs
self.megaservice = ServiceOrchestrator()
def add_remote_service(self):
asr = MicroService(
name="asr",
host=WHISPER_SERVER_HOST_IP,
port=WHISPER_SERVER_PORT,
# endpoint="/v1/audio/transcriptions",
endpoint="/v1/asr",
use_remote_service=True,
service_type=ServiceType.ASR,
)
llm = MicroService(
name="llm",
host=LLM_SERVER_HOST_IP,
port=LLM_SERVER_PORT,
endpoint="/v1/chat/completions",
use_remote_service=True,
service_type=ServiceType.LLM,
)
tts = MicroService(
name="tts",
host=GPT_SOVITS_SERVER_HOST_IP,
port=GPT_SOVITS_SERVER_PORT,
# endpoint="/v1/audio/speech",
endpoint="/",
use_remote_service=True,
service_type=ServiceType.TTS,
)
self.megaservice.add(asr).add(llm).add(tts)
self.megaservice.flow_to(asr, llm)
self.megaservice.flow_to(llm, tts)
self.gateway = AudioQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)
if __name__ == "__main__":
audioqna = AudioQnAService(host=MEGA_SERVICE_HOST_IP, port=MEGA_SERVICE_PORT)
audioqna.add_remote_service()

View File

@@ -1,4 +1,4 @@
# AudioQnA accuracy Evaluation
# AudioQnA Accuracy
AudioQnA is an example that demonstrates the integration of Generative AI (GenAI) models for performing question-answering (QnA) on audio scenes, which involves Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). The following is the pipeline for evaluating ASR accuracy.
@@ -36,9 +36,9 @@ Evaluate the performance with the LLM:
```py
# validate the offline model
# python offline_evaluate.py
# python offline_eval.py
# validate the online asr microservice accuracy
python online_evaluate.py
python online_eval.py
```
### Performance Result

View File

@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
python online_eval.py

View File

@@ -0,0 +1,77 @@
# AudioQnA Benchmarking
This folder contains a collection of scripts for inference benchmarking, leveraging [GenAIEval](https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/README.md), a comprehensive benchmarking tool that provides throughput analysis for assessing inference performance.
By following this guide, you can run benchmarks on your deployment and share the results with the OPEA community.
## Purpose
We aim to run these benchmarks and share them with the OPEA community for three primary reasons:
- To offer insights on inference throughput in real-world scenarios, helping you choose the best service or deployment for your needs.
- To establish a baseline for validating optimization solutions across different implementations, providing clear guidance on which methods are most effective for your use case.
- To inspire the community to build upon our benchmarks, allowing us to better quantify new solutions in conjunction with current leading LLMs, serving frameworks, etc.
## Metrics
The benchmark will report the below metrics, including:
- Number of Concurrent Requests
- End-to-End Latency: P50, P90, P99 (in milliseconds)
- End-to-End First Token Latency: P50, P90, P99 (in milliseconds)
- Average Next Token Latency (in milliseconds)
- Average Token Latency (in milliseconds)
- Requests Per Second (RPS)
- Output Tokens Per Second
- Input Tokens Per Second
Results will be displayed in the terminal and saved as a CSV file named `1_stats.csv` for easy export to spreadsheets.
## Getting Started
We recommend using Kubernetes to deploy the AudioQnA service, as it offers benefits such as load balancing and improved scalability. However, you can also deploy the service using Docker if that better suits your needs.
### Prerequisites
- Install Kubernetes by following [this guide](https://github.com/opea-project/docs/blob/main/guide/installation/k8s_install/k8s_install_kubespray.md).
- Every node has direct internet access
- Set up kubectl on the master node with access to the Kubernetes cluster.
- Install Python 3.8+ on the master node for running GenAIEval.
- Ensure all nodes have a local /mnt/models folder, which will be mounted by the pods.
- Ensure that the container's ulimit can accommodate the number of concurrent requests.
```bash
# The way to modify the containered ulimit:
sudo systemctl edit containerd
# Add two lines:
[Service]
LimitNOFILE=65536:1048576
sudo systemctl daemon-reload; sudo systemctl restart containerd
```
## Test Steps
Please deploy the AudioQnA service before benchmarking.
### Run Benchmark Test
Before the benchmark, we can configure the number of test queries and test output directory by:
```bash
export USER_QUERIES="[128, 128, 128, 128]"
export TEST_OUTPUT_DIR="/tmp/benchmark_output"
```
And then run the benchmark by:
```bash
bash benchmark.sh -n <node_count>
```
The argument `-n` refers to the number of test nodes.
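If you deployed AudioQnA with Docker instead of Kubernetes, pass the service address and port instead of a node count (options as defined in `benchmark.sh`):
```bash
bash benchmark.sh -d docker -i <service_ip> -p 8888
```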
### Data collection
All test results will be written to the folder `/tmp/benchmark_output`, as configured by the environment variable `TEST_OUTPUT_DIR` in the previous steps.
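For a quick look at the collected results from the terminal (a minimal sketch; file names other than `1_stats.csv` may vary with the GenAIEval version, and the CSV location inside the output directory is an assumption):
```bash
# List the result files and preview the summary CSV
ls ${TEST_OUTPUT_DIR:-/tmp/benchmark_output}
column -s, -t < ${TEST_OUTPUT_DIR:-/tmp/benchmark_output}/1_stats.csv | head
```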

View File

@@ -0,0 +1,99 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
deployment_type="k8s"
node_number=1
service_port=8888
query_per_node=128
benchmark_tool_path="$(pwd)/GenAIEval"
usage() {
echo "Usage: $0 [-d deployment_type] [-n node_number] [-i service_ip] [-p service_port]"
echo " -d deployment_type AudioQnA deployment type, select between k8s and docker (default: k8s)"
echo " -n node_number Test node number, required only for k8s deployment_type, (default: 1)"
echo " -i service_ip AudioQnA service ip, required only for docker deployment_type"
echo " -p service_port AudioQnA service port, required only for docker deployment_type, (default: 8888)"
exit 1
}
while getopts ":d:n:i:p:" opt; do
case ${opt} in
d )
deployment_type=$OPTARG
;;
n )
node_number=$OPTARG
;;
i )
service_ip=$OPTARG
;;
p )
service_port=$OPTARG
;;
\? )
echo "Invalid option: -$OPTARG" 1>&2
usage
;;
: )
echo "Invalid option: -$OPTARG requires an argument" 1>&2
usage
;;
esac
done
if [[ "$deployment_type" == "docker" && -z "$service_ip" ]]; then
echo "Error: service_ip is required for docker deployment_type" 1>&2
usage
fi
if [[ "$deployment_type" == "k8s" && ( -n "$service_ip" || -n "$service_port" ) ]]; then
echo "Warning: service_ip and service_port are ignored for k8s deployment_type" 1>&2
fi
function main() {
if [[ ! -d ${benchmark_tool_path} ]]; then
echo "Benchmark tool not found, setting up..."
setup_env
fi
run_benchmark
}
function setup_env() {
git clone https://github.com/opea-project/GenAIEval.git
pushd ${benchmark_tool_path}
python3 -m venv stress_venv
source stress_venv/bin/activate
pip install -r requirements.txt
popd
}
function run_benchmark() {
source ${benchmark_tool_path}/stress_venv/bin/activate
export DEPLOYMENT_TYPE=${deployment_type}
export SERVICE_IP=${service_ip:-"None"}
export SERVICE_PORT=${service_port:-"None"}
if [[ -z $USER_QUERIES ]]; then
user_query=$((query_per_node*node_number))
export USER_QUERIES="[${user_query}, ${user_query}, ${user_query}, ${user_query}]"
echo "USER_QUERIES not configured, setting to: ${USER_QUERIES}."
fi
export WARMUP=$(echo $USER_QUERIES | sed -e 's/[][]//g' -e 's/,.*//')
if [[ -z $WARMUP ]]; then export WARMUP=0; fi
if [[ -z $TEST_OUTPUT_DIR ]]; then
if [[ $DEPLOYMENT_TYPE == "k8s" ]]; then
export TEST_OUTPUT_DIR="${benchmark_tool_path}/evals/benchmark/benchmark_output/node_${node_number}"
else
export TEST_OUTPUT_DIR="${benchmark_tool_path}/evals/benchmark/benchmark_output/docker"
fi
echo "TEST_OUTPUT_DIR not configured, setting to: ${TEST_OUTPUT_DIR}."
fi
envsubst < ./benchmark.yaml > ${benchmark_tool_path}/evals/benchmark/benchmark.yaml
cd ${benchmark_tool_path}/evals/benchmark
python benchmark.py
}
main

View File

@@ -0,0 +1,52 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
test_suite_config: # Overall configuration settings for the test suite
examples: ["audioqna"] # The specific test cases being tested, e.g., chatqna, codegen, codetrans, faqgen, audioqna, visualqna
deployment_type: "k8s" # Default is "k8s", can also be "docker"
service_ip: None # Leave as None for k8s, specify for Docker
service_port: None # Leave as None for k8s, specify for Docker
warm_ups: 0 # Number of test requests for warm-up
run_time: 60m # The max total run time for the test suite
seed: # The seed for all RNGs
user_queries: [1, 2, 4, 8, 16, 32, 64, 128] # Number of test requests at each concurrency level
query_timeout: 120 # Number of seconds to wait for a simulated user to complete any executing task before exiting. 120 sec by default.
random_prompt: false # Use random prompts if true, fixed prompts if false
collect_service_metric: false # Collect service metrics if true, do not collect service metrics if false
data_visualization: false # Generate data visualization if true, do not generate data visualization if false
llm_model: "Intel/neural-chat-7b-v3-3" # The LLM model used for the test
test_output_dir: "/tmp/benchmark_output" # The directory to store the test output
load_shape: # Tenant concurrency pattern
name: constant # poisson or constant(locust default load shape)
params: # Loadshape-specific parameters
constant: # Constant load shape specific parameters, activate only if load_shape is constant
concurrent_level: 4 # If user_queries is specified, concurrent_level is target number of requests per user. If not, it is the number of simulated users
poisson: # Poisson load shape specific parameters, activate only if load_shape is poisson
arrival-rate: 1.0 # Request arrival rate
namespace: "" # Fill the user-defined namespace. Otherwise, it will be default.
test_cases:
audioqna:
asr:
run_test: true
service_name: "asr-svc" # Replace with your service name
llm:
run_test: true
service_name: "llm-svc" # Replace with your service name
parameters:
model_name: "Intel/neural-chat-7b-v3-3"
max_new_tokens: 128
temperature: 0.01
top_k: 10
top_p: 0.95
repetition_penalty: 1.03
streaming: true
llmserve:
run_test: true
service_name: "llm-svc" # Replace with your service name
tts:
run_test: true
service_name: "tts-svc" # Replace with your service name
e2e:
run_test: true
service_name: "audioqna-backend-server-svc" # Replace with your service name

View File

@@ -127,9 +127,13 @@ curl http://${host_ip}:3002/v1/audio/speech \
## 🚀 Test MegaService
Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the
base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen
to the response, decode the base64 string and save it as a .wav file.
```bash
curl http://${host_ip}:3008/v1/audioqna \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
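To test with your own recording instead of the canned sample, base64-encode a .wav file and build the payload yourself (a sketch; assumes GNU `base64` for the `-w 0` option):
```bash
# Encode a local .wav file and send it to the megaservice
B64=$(base64 -w 0 input.wav)
curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -d "{\"audio\": \"${B64}\", \"max_tokens\":64}" \
  -H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```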

View File

@@ -41,7 +41,7 @@ services:
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
container_name: tgi-service
ports:
- "3006:80"

View File

@@ -0,0 +1,64 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
services:
whisper-service:
image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
container_name: whisper-service
ports:
- "7066:7066"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
command: --language "zh"
gpt-sovits-service:
image: ${REGISTRY:-opea}/gpt-sovits:${TAG:-latest}
container_name: gpt-sovits-service
ports:
- "9880:9880"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
container_name: tgi-service
ports:
- "3006:80"
volumes:
- "./data:/data"
shm_size: 1g
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0
audioqna-xeon-backend-server:
image: ${REGISTRY:-opea}/audioqna-multilang:${TAG:-latest}
container_name: audioqna-xeon-backend-server
ports:
- "3008:8888"
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
- LLM_SERVER_HOST_IP=${LLM_SERVER_HOST_IP}
- LLM_SERVER_PORT=${LLM_SERVER_PORT}
- LLM_MODEL_ID=${LLM_MODEL_ID}
- WHISPER_SERVER_HOST_IP=${WHISPER_SERVER_HOST_IP}
- WHISPER_SERVER_PORT=${WHISPER_SERVER_PORT}
- GPT_SOVITS_SERVER_HOST_IP=${GPT_SOVITS_SERVER_HOST_IP}
- GPT_SOVITS_SERVER_PORT=${GPT_SOVITS_SERVER_PORT}
ipc: host
restart: always
networks:
default:
driver: bridge

View File

@@ -0,0 +1,7 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null

View File

@@ -79,6 +79,8 @@ export LLM_SERVICE_PORT=3007
## 🚀 Start the MegaService
> **_NOTE:_** Users will need at least three Gaudi cards for AudioQnA.
```bash
cd GenAIExamples/AudioQnA/docker_compose/intel/hpu/gaudi/
docker compose up -d
@@ -127,9 +129,13 @@ curl http://${host_ip}:3002/v1/audio/speech \
## 🚀 Test MegaService
Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the
base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen
to the response, decode the base64 string and save it as a .wav file.
```bash
curl http://${host_ip}:3008/v1/audioqna \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```

View File

@@ -51,7 +51,7 @@ services:
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "3006:80"

View File

@@ -0,0 +1,7 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null

View File

@@ -53,3 +53,9 @@ services:
dockerfile: comps/tts/speecht5/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/tts:${TAG:-latest}
gpt-sovits:
build:
context: GenAIComps
dockerfile: comps/tts/gpt-sovits/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/gpt-sovits:${TAG:-latest}

View File

@@ -7,14 +7,14 @@
## Deploy On Xeon
```
cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifests
cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml
```
## Deploy On Gaudi
```
cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifests
cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifest
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml

View File

@@ -25,7 +25,7 @@ The AudioQnA uses the below prebuilt images if you choose a Xeon deployment
Should you desire to use the Gaudi accelerator, two alternate images are used for the embedding and llm services.
For Gaudi:
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.5
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.6
- whisper-gaudi: opea/whisper-gaudi:latest
- speecht5-gaudi: opea/speecht5-gaudi:latest

View File

@@ -247,7 +247,7 @@ spec:
- envFrom:
- configMapRef:
name: audio-qna-config
image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
name: llm-dependency-deploy-demo
securityContext:
capabilities:

View File

@@ -271,7 +271,7 @@ spec:
- envFrom:
- configMapRef:
name: audio-qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
name: llm-dependency-deploy-demo
securityContext:
capabilities:

View File

@@ -22,7 +22,7 @@ function build_docker_images() {
service_list="audioqna whisper-gaudi asr llm-tgi speecht5-gaudi tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker images && sleep 1s
}
@@ -100,7 +100,7 @@ function validate_megaservice() {
#
# sed -i "s/localhost/$ip_address/g" playwright.config.ts
#
## conda install -c conda-forge nodejs -y
## conda install -c conda-forge nodejs=22.6.0 -y
# npm install && npm ci && npx playwright install --with-deps
# node -v && npm -v && pip list
#

View File

@@ -22,7 +22,7 @@ function build_docker_images() {
service_list="audioqna whisper asr llm-tgi speecht5 tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker images && sleep 1s
}
@@ -90,7 +90,7 @@ function validate_megaservice() {
#
# sed -i "s/localhost/$ip_address/g" playwright.config.ts
#
## conda install -c conda-forge nodejs -y
## conda install -c conda-forge nodejs=22.6.0 -y
# npm install && npm ci && npx playwright install --with-deps
# node -v && npm -v && pip list
#

View File

@@ -23,4 +23,4 @@ RUN npm run build
EXPOSE 5173
# Run the front-end application in preview mode
CMD ["npm", "run", "preview", "--", "--port", "5173", "--host", "0.0.0.0"]
CMD ["npm", "run", "preview", "--", "--port", "5173", "--host", "0.0.0.0"]

View File

@@ -79,4 +79,4 @@ a.btn {
.w-12\/12 {
width: 100%
}
}

View File

@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#3300FF" stop-opacity="0.2" />
</linearGradient>
</defs>
</svg>
</svg>


View File

@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#f3f4f6" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


View File

@@ -76,4 +76,4 @@
<stop offset="1" stop-color="#9CFFED" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


View File

@@ -76,4 +76,4 @@
<stop offset="1" stop-color="#6141E1" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


View File

@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#3300FF" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


View File

@@ -3,4 +3,4 @@
<path
d="M512 1024a512 512 0 1 1 512-512 512 512 0 0 1-512 512z m0-896a384 384 0 1 0 384 384A384 384 0 0 0 512 128z m128 576h-256a64 64 0 0 1-64-64v-256a64 64 0 0 1 64-64h256a64 64 0 0 1 64 64v256a64 64 0 0 1-64 64z"
fill="#d81e06" p-id="3104"></path>
</svg>
</svg>


View File

@@ -1 +1 @@
<svg t="1713431562066" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="6399" width="32" height="32"><path d="M592 768h-160c-26.6 0-48-21.4-48-48V384h-175.4c-35.6 0-53.4-43-28.2-68.2L484.6 11.4c15-15 39.6-15 54.6 0l304.4 304.4c25.2 25.2 7.4 68.2-28.2 68.2H640v336c0 26.6-21.4 48-48 48z m432-16v224c0 26.6-21.4 48-48 48H48c-26.6 0-48-21.4-48-48V752c0-26.6 21.4-48 48-48h272v16c0 61.8 50.2 112 112 112h160c61.8 0 112-50.2 112-112v-16h272c26.6 0 48 21.4 48 48z m-248 176c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z m128 0c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z" p-id="6400" fill="#ffffff"></path></svg>
<svg t="1713431562066" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="6399" width="32" height="32"><path d="M592 768h-160c-26.6 0-48-21.4-48-48V384h-175.4c-35.6 0-53.4-43-28.2-68.2L484.6 11.4c15-15 39.6-15 54.6 0l304.4 304.4c25.2 25.2 7.4 68.2-28.2 68.2H640v336c0 26.6-21.4 48-48 48z m432-16v224c0 26.6-21.4 48-48 48H48c-26.6 0-48-21.4-48-48V752c0-26.6 21.4-48 48-48h272v16c0 61.8 50.2 112 112 112h160c61.8 0 112-50.2 112-112v-16h272c26.6 0 48 21.4 48 48z m-248 176c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z m128 0c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z" p-id="6400" fill="#ffffff"></path></svg>


View File

@@ -6,4 +6,4 @@
<path
d="M864 479.776 864 352c0-17.664-14.304-32-32-32s-32 14.336-32 32l0 127.776c0 160.16-129.184 290.464-288 290.464-158.784 0-288-130.304-288-290.464L224 352c0-17.664-14.336-32-32-32s-32 14.336-32 32l0 127.776c0 184.608 140.864 336.48 320 352.832L480 896 288 896c-17.664 0-32 14.304-32 32s14.336 32 32 32l448 0c17.696 0 32-14.304 32-32s-14.304-32-32-32l-192 0 0-63.36C723.136 816.256 864 664.384 864 479.776z"
fill="#707070" p-id="2962"></path>
</svg>
</svg>


File diff suppressed because one or more lines are too long


File diff suppressed because one or more lines are too long


AvatarChatbot/.gitignore (new file, 8 lines)
View File

@@ -0,0 +1,8 @@
*.safetensors
*.bin
*.model
*.log
docker_compose/intel/cpu/xeon/data
docker_compose/intel/hpu/gaudi/data
inputs/
outputs/

View File

@@ -17,13 +17,12 @@ RUN useradd -m -s /bin/bash user && \
WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt && \
pip install --no-cache-dir langchain_core
COPY ./chatqna_no_wrapper.py /home/user/chatqna_no_wrapper.py
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./avatarchatbot.py /home/user/avatarchatbot.py
ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps
@@ -31,4 +30,4 @@ USER user
WORKDIR /home/user
ENTRYPOINT ["python", "chatqna_no_wrapper.py", "--without-rerank"]
ENTRYPOINT ["python", "avatarchatbot.py"]

AvatarChatbot/README.md (new file, 105 lines)
View File

@@ -0,0 +1,105 @@
# AvatarChatbot Application
The AvatarChatbot service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable processors.
## AI Avatar Workflow
The AI Avatar example is implemented using both megaservices and the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different megaservices and microservices for this example.
```mermaid
---
config:
flowchart:
nodeSpacing: 100
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 42px
---
flowchart LR
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef thistle fill:#D8BFD8,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
style AvatarChatbot-Megaservice stroke:#000000
subgraph AvatarChatbot-Megaservice["AvatarChatbot Megaservice"]
direction LR
ASR([ASR Microservice]):::blue
LLM([LLM Microservice]):::blue
TTS([TTS Microservice]):::blue
animation([Animation Microservice]):::blue
end
subgraph UserInterface["User Interface"]
direction LR
invis1[ ]:::invisible
USER1([User Audio Query]):::orchid
USER2([User Image/Video Query]):::orchid
UI([UI server<br>]):::orchid
end
GW([AvatarChatbot GateWay<br>]):::orange
subgraph .
direction LR
X([OPEA Microservice]):::blue
Y{{Open Source Service}}:::thistle
Z([OPEA Gateway]):::orange
Z1([UI]):::orchid
end
WHISPER{{Whisper service}}:::thistle
TGI{{LLM service}}:::thistle
T5{{Speecht5 service}}:::thistle
WAV2LIP{{Wav2Lip service}}:::thistle
%% Connections %%
direction LR
USER1 -->|1| UI
UI -->|2| GW
GW <==>|3| AvatarChatbot-Megaservice
ASR ==>|4| LLM ==>|5| TTS ==>|6| animation
direction TB
ASR <-.->|3'| WHISPER
LLM <-.->|4'| TGI
TTS <-.->|5'| T5
animation <-.->|6'| WAV2LIP
USER2 -->|1| UI
UI <-.->|6'| WAV2LIP
```
## Deploy AvatarChatbot Service
The AvatarChatbot service can be deployed on either Intel Gaudi2 AI Accelerator or Intel Xeon Scalable Processor.
### Deploy AvatarChatbot on Gaudi
Refer to the [Gaudi Guide](./docker_compose/intel/hpu/gaudi/README.md) for instructions on deploying AvatarChatbot on Gaudi, and on setting up a UI for the application.
### Deploy AvatarChatbot on Xeon
Refer to the [Xeon Guide](./docker_compose/intel/cpu/xeon/README.md) for instructions on deploying AvatarChatbot on Xeon.
## Supported Models
### ASR
The default model is [openai/whisper-small](https://huggingface.co/openai/whisper-small). It also supports all models in the Whisper family, such as `openai/whisper-large-v3`, `openai/whisper-medium`, `openai/whisper-base`, `openai/whisper-tiny`, etc.
To replace the model, please edit the `compose.yaml` and add the `command` line to pass the name of the model you want to use:
```yaml
services:
whisper-service:
...
command: --model_name_or_path openai/whisper-tiny
```
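After editing `compose.yaml`, recreate the service so the new model takes effect (using the `whisper-service` name from the compose snippet above):
```bash
docker compose up -d whisper-service
```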
### TTS
The default model is [microsoft/SpeechT5](https://huggingface.co/microsoft/speecht5_tts). We currently do not support replacing the model. More models under the commercial license will be added in the future.
### Animation
The default model is [Rudrabha/Wav2Lip](https://github.com/Rudrabha/Wav2Lip) and [TencentARC/GFPGAN](https://github.com/TencentARC/GFPGAN). We currently do not support replacing the model. More models under the commercial license such as [OpenTalker/SadTalker](https://github.com/OpenTalker/SadTalker) will be added in the future.

Multiple binary asset files (sample images, audio, and video, roughly 20 KiB to 2.5 MiB each) and several large text files were added in this change; their diffs are not shown.

View File

@@ -0,0 +1,93 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import asyncio
import os
import sys

from comps import AvatarChatbotGateway, MicroService, ServiceOrchestrator, ServiceType

MEGA_SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0")
MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))
ASR_SERVICE_HOST_IP = os.getenv("ASR_SERVICE_HOST_IP", "0.0.0.0")
ASR_SERVICE_PORT = int(os.getenv("ASR_SERVICE_PORT", 9099))
LLM_SERVICE_HOST_IP = os.getenv("LLM_SERVICE_HOST_IP", "0.0.0.0")
LLM_SERVICE_PORT = int(os.getenv("LLM_SERVICE_PORT", 9000))
TTS_SERVICE_HOST_IP = os.getenv("TTS_SERVICE_HOST_IP", "0.0.0.0")
TTS_SERVICE_PORT = int(os.getenv("TTS_SERVICE_PORT", 9088))
ANIMATION_SERVICE_HOST_IP = os.getenv("ANIMATION_SERVICE_HOST_IP", "0.0.0.0")
ANIMATION_SERVICE_PORT = int(os.getenv("ANIMATION_SERVICE_PORT", 9066))


def check_env_vars(env_var_list):
    for var in env_var_list:
        if os.getenv(var) is None:
            print(f"Error: The environment variable '{var}' is not set.")
            sys.exit(1)  # Exit the program with a non-zero status code
    print("All environment variables are set.")


class AvatarChatbotService:
    def __init__(self, host="0.0.0.0", port=8000):
        self.host = host
        self.port = port
        self.megaservice = ServiceOrchestrator()

    def add_remote_service(self):
        asr = MicroService(
            name="asr",
            host=ASR_SERVICE_HOST_IP,
            port=ASR_SERVICE_PORT,
            endpoint="/v1/audio/transcriptions",
            use_remote_service=True,
            service_type=ServiceType.ASR,
        )
        llm = MicroService(
            name="llm",
            host=LLM_SERVICE_HOST_IP,
            port=LLM_SERVICE_PORT,
            endpoint="/v1/chat/completions",
            use_remote_service=True,
            service_type=ServiceType.LLM,
        )
        tts = MicroService(
            name="tts",
            host=TTS_SERVICE_HOST_IP,
            port=TTS_SERVICE_PORT,
            endpoint="/v1/audio/speech",
            use_remote_service=True,
            service_type=ServiceType.TTS,
        )
        animation = MicroService(
            name="animation",
            host=ANIMATION_SERVICE_HOST_IP,
            port=ANIMATION_SERVICE_PORT,
            endpoint="/v1/animation",
            use_remote_service=True,
            service_type=ServiceType.ANIMATION,
        )
        self.megaservice.add(asr).add(llm).add(tts).add(animation)
        self.megaservice.flow_to(asr, llm)
        self.megaservice.flow_to(llm, tts)
        self.megaservice.flow_to(tts, animation)
        self.gateway = AvatarChatbotGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)


if __name__ == "__main__":
    check_env_vars(
        [
            "MEGA_SERVICE_HOST_IP",
            "MEGA_SERVICE_PORT",
            "ASR_SERVICE_HOST_IP",
            "ASR_SERVICE_PORT",
            "LLM_SERVICE_HOST_IP",
            "LLM_SERVICE_PORT",
            "TTS_SERVICE_HOST_IP",
            "TTS_SERVICE_PORT",
            "ANIMATION_SERVICE_HOST_IP",
            "ANIMATION_SERVICE_PORT",
        ]
    )

    avatarchatbot = AvatarChatbotService(host=MEGA_SERVICE_HOST_IP, port=MEGA_SERVICE_PORT)
    avatarchatbot.add_remote_service()

@@ -0,0 +1,210 @@
# Build Mega Service of AvatarChatbot on Xeon
This document outlines the deployment process for an AvatarChatbot application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Xeon server.
## 🚀 Build Docker images
### 1. Clone the GenAIComps Source Code
```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```
### 2. Build ASR Image
```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/dependency/Dockerfile .
docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/Dockerfile .
```
### 3. Build LLM Image
```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
### 4. Build TTS Image
```bash
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/dependency/Dockerfile .
docker build -t opea/tts:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/Dockerfile .
```
### 5. Build Animation Image
```bash
docker build -t opea/wav2lip:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/animation/wav2lip/dependency/Dockerfile .
docker build -t opea/animation:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/animation/wav2lip/Dockerfile .
```
### 6. Build MegaService Docker Image
To construct the MegaService, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `avatarchatbot.py` Python script. Build the MegaService Docker image with the command below:
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AvatarChatbot/
docker build --no-cache -t opea/avatarchatbot:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
Then run `docker images`; the following images should be listed (a quick check follows the list):
1. `opea/whisper:latest`
2. `opea/asr:latest`
3. `opea/llm-tgi:latest`
4. `opea/speecht5:latest`
5. `opea/tts:latest`
6. `opea/wav2lip:latest`
7. `opea/animation:latest`
8. `opea/avatarchatbot:latest`
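To quickly confirm that all eight images are present, you can filter the output of `docker images` (a minimal check, assuming the default `opea` namespace and `latest` tag):
```bash
docker images | grep -E 'opea/(whisper|asr|llm-tgi|speecht5|tts|wav2lip|animation|avatarchatbot)'
```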
## 🚀 Set the environment variables
Before starting the services with `docker compose`, make sure the following environment variables are set correctly.
```bash
export HUGGINGFACEHUB_API_TOKEN=<your_hf_token>
export host_ip=$(hostname -I | awk '{print $1}')
export TGI_LLM_ENDPOINT=http://$host_ip:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export ASR_ENDPOINT=http://$host_ip:7066
export TTS_ENDPOINT=http://$host_ip:7055
export WAV2LIP_ENDPOINT=http://$host_ip:7860
export MEGA_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ANIMATION_SERVICE_HOST_IP=${host_ip}
export MEGA_SERVICE_PORT=8888
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
export ANIMATION_SERVICE_PORT=3008
```
- Xeon CPU
```bash
export DEVICE="cpu"
export WAV2LIP_PORT=7860
export INFERENCE_MODE='wav2lip_only'
export CHECKPOINT_PATH='/usr/local/lib/python3.11/site-packages/Wav2Lip/checkpoints/wav2lip_gan.pth'
export FACE="assets/img/avatar1.jpg"
# export AUDIO='assets/audio/eg3_ref.wav' # the audio file path is optional; the base64 string in the POST request is used as input when AUDIO is 'None'
export AUDIO='None'
export FACESIZE=96
export OUTFILE="/outputs/result.mp4"
export GFPGAN_MODEL_VERSION=1.4 # latest version, can roll back to v1.3 if needed
export UPSCALE_FACTOR=1
export FPS=10
```
## 🚀 Start the MegaService
```bash
cd GenAIExamples/AvatarChatbot/docker_compose/intel/cpu/xeon/
docker compose -f compose.yaml up -d
```
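Before testing the endpoints, it is worth confirming that all containers came up cleanly. A minimal check (service names are taken from the `compose.yaml` in this directory):
```bash
# List the containers started by this compose file and their status
docker compose -f compose.yaml ps

# Optionally follow the backend logs while running the tests below
docker compose -f compose.yaml logs -f avatarchatbot-xeon-backend-server
```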
## 🚀 Test MicroServices
```bash
# whisper service
curl http://${host_ip}:7066/v1/asr \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# asr microservice
curl http://${host_ip}:3001/v1/audio/transcriptions \
-X POST \
-d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# tgi service
curl http://${host_ip}:3006/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
# llm microservice
curl http://${host_ip}:3007/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":false}' \
-H 'Content-Type: application/json'
# speecht5 service
curl http://${host_ip}:7055/v1/tts \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
# tts microservice
curl http://${host_ip}:3002/v1/audio/speech \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
# wav2lip service
cd ../../../..
curl http://${host_ip}:7860/v1/wav2lip \
-X POST \
-d @assets/audio/sample_minecraft.json \
-H 'Content-Type: application/json'
# animation microservice
curl http://${host_ip}:3008/v1/animation \
-X POST \
-d @assets/audio/sample_question.json \
-H "Content-Type: application/json"
```
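The short base64 string used in the whisper and ASR requests above encodes a tiny sample WAV clip. To test with your own recording, a payload can be generated from any WAV file, for example (assuming GNU coreutils `base64`; the file name `question.wav` is illustrative):
```bash
B64_AUDIO=$(base64 -w 0 question.wav)
curl http://${host_ip}:3001/v1/audio/transcriptions \
  -X POST \
  -d "{\"byte_str\": \"${B64_AUDIO}\"}" \
  -H 'Content-Type: application/json'
```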
## 🚀 Test MegaService
```bash
curl http://${host_ip}:3009/v1/avatarchatbot \
-X POST \
-d @assets/audio/sample_whoareyou.json \
-H 'Content-Type: application/json'
```
If the megaservice is running properly, you should see the following output:
```bash
"/outputs/result.mp4"
```
The output file will be saved in the current working directory, as `${PWD}` is mapped to `/outputs` inside the wav2lip-service Docker container.
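For example, once the call returns you can verify the generated video directly on the host (assuming the default `OUTFILE=/outputs/result.mp4` set above):
```bash
ls -lh ${PWD}/result.mp4
```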
## Gradio UI
```bash
cd $WORKPATH/GenAIExamples/AvatarChatbot
python3 ui/gradio/app_gradio_demo_avatarchatbot.py
```
The UI can be viewed at http://${host_ip}:7861
<img src="../../../../assets/img/UI.png" alt="UI Example" width="60%">
In the current version (v1.0), the avatar figure image/video and the DL model choice must be set via environment variables before starting the AvatarChatbot backend service and launching the UI. Only the audio question can be customized in the UI.
\*\* Changing the avatar figure between runs will be supported in v2.0.
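For example, to switch to a different avatar figure before the next run (the image path is illustrative; any face image under `assets/img` can be used), update the variable and recreate the wav2lip service so it picks up the new value:
```bash
export FACE="assets/img/avatar2.jpg"
docker compose -f docker_compose/intel/cpu/xeon/compose.yaml up -d --force-recreate wav2lip-service
```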
## Troubleshooting
```bash
cd GenAIExamples/AvatarChatbot/tests
export IMAGE_REPO="opea"
export IMAGE_TAG="latest"
export HUGGINGFACEHUB_API_TOKEN=<your_hf_token>
bash test_avatarchatbot_on_xeon.sh
```

@@ -0,0 +1,138 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
services:
  whisper-service:
    image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
    container_name: whisper-service
    ports:
      - "7066:7066"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped
  asr:
    image: ${REGISTRY:-opea}/asr:${TAG:-latest}
    container_name: asr-service
    ports:
      - "3001:9099"
    ipc: host
    environment:
      ASR_ENDPOINT: ${ASR_ENDPOINT}
  speecht5-service:
    image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
    container_name: speecht5-service
    ports:
      - "7055:7055"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped
  tts:
    image: ${REGISTRY:-opea}/tts:${TAG:-latest}
    container_name: tts-service
    ports:
      - "3002:9088"
    ipc: host
    environment:
      TTS_ENDPOINT: ${TTS_ENDPOINT}
  tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
    container_name: tgi-service
    ports:
      - "3006:80"
    volumes:
      - "./data:/data"
    shm_size: 1g
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
    command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0
  llm:
    image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
    container_name: llm-tgi-server
    depends_on:
      - tgi-service
    ports:
      - "3007:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  wav2lip-service:
    image: ${REGISTRY:-opea}/wav2lip:${TAG:-latest}
    container_name: wav2lip-service
    ports:
      - "7860:7860"
    ipc: host
    volumes:
      - ${PWD}:/outputs
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      DEVICE: ${DEVICE}
      INFERENCE_MODE: ${INFERENCE_MODE}
      CHECKPOINT_PATH: ${CHECKPOINT_PATH}
      FACE: ${FACE}
      AUDIO: ${AUDIO}
      FACESIZE: ${FACESIZE}
      OUTFILE: ${OUTFILE}
      GFPGAN_MODEL_VERSION: ${GFPGAN_MODEL_VERSION}
      UPSCALE_FACTOR: ${UPSCALE_FACTOR}
      FPS: ${FPS}
      WAV2LIP_PORT: ${WAV2LIP_PORT}
    restart: unless-stopped
  animation:
    image: ${REGISTRY:-opea}/animation:${TAG:-latest}
    container_name: animation-server
    ports:
      - "3008:9066"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      WAV2LIP_ENDPOINT: ${WAV2LIP_ENDPOINT}
    restart: unless-stopped
  avatarchatbot-xeon-backend-server:
    image: ${REGISTRY:-opea}/avatarchatbot:${TAG:-latest}
    container_name: avatarchatbot-xeon-backend-server
    depends_on:
      - asr
      - llm
      - tts
      - animation
    ports:
      - "3009:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
      - MEGA_SERVICE_PORT=${MEGA_SERVICE_PORT}
      - ASR_SERVICE_HOST_IP=${ASR_SERVICE_HOST_IP}
      - ASR_SERVICE_PORT=${ASR_SERVICE_PORT}
      - LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP}
      - LLM_SERVICE_PORT=${LLM_SERVICE_PORT}
      - TTS_SERVICE_HOST_IP=${TTS_SERVICE_HOST_IP}
      - TTS_SERVICE_PORT=${TTS_SERVICE_PORT}
      - ANIMATION_SERVICE_HOST_IP=${ANIMATION_SERVICE_HOST_IP}
      - ANIMATION_SERVICE_PORT=${ANIMATION_SERVICE_PORT}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
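Since every OPEA image above defaults to the `${REGISTRY:-opea}` namespace and the `${TAG:-latest}` tag, locally built or privately hosted images can be selected at deploy time without editing this file, for example (the registry name is illustrative):
```bash
REGISTRY=my-registry.example.com/opea TAG=v1.1 docker compose -f compose.yaml up -d
```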

Some files were not shown because too many files have changed in this diff.