* make naming style compatible with the defined style.
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* docsum refine mode prompt update
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* docsum vllm requirement update
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* docsum add auto mode
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* fix bug
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* fix bug
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* fix readme
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* initial code for sql agent llama
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add test for sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update sql agent test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs and use vllm to test sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add tag-bench test and google search tool
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* test sql agent with hints
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs for sql agent with hints and update test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add readme for sql agent and fix ci bugs
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add sql agent using openai models
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix bugs in sql agent openai
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* make wait time longer for sql agent microservice to be ready
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update readme
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix test bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* skip planexec with vllm due to vllm-gaudi bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* debug ut issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* use vllm for all uts
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* debug ci issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* change vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update ut
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* remove tgi server
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* align vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* remove unnecessary files and fix bugs
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* connect to db with full uri
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update readme and use vllm mainstream
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* rm unnecessary log
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update readme
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update test script
Signed-off-by: minmin-intel <minmin.hou@intel.com>
---------
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix doc index retriever embed issue on gaudi
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* align test router with examples
* align readme
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* remove examples gateway.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove gateway.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine service code.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update http_service.py
* remove gateway ut.
* remove gateway ut.
* fix conflict service name.
* Update http_service.py
* add handle message ut.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove `multiprocessing.Process` start server code.
* fix ut.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove multiprocessing and enhance ut for coverage.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
* initial code for sql agent llama
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add test for sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update sql agent test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs and use vllm to test sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add tag-bench test and google search tool
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* test sql agent with hints
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs for sql agent with hints and update test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add readme for sql agent and fix ci bugs
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add sql agent using openai models
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix bugs in sql agent openai
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* make wait time longer for sql agent microservice to be ready
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update readme
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix test bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* skip planexec with vllm due to vllm-gaudi bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* debug ut issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* use vllm for all uts
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* debug ci issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* change vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update ut
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* remove tgi server
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* align vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
---------
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Component base class
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add controller class
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add ut
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
While the metrics are OK for a small number of requests, when the
megaservice is handling many (hundreds of) _parallel_ requests, it was
reporting a clearly (~10%) larger first token latency than the one seen
by the client receiving the tokens from the megaservice.
Taking the timestamp before the token is yielded means that the reported
first token latency can be slightly shorter than it actually is. However,
testing with ChatQnA shows the latencies to be clearly closer to the ones
seen by the client (within a couple of percent) and typically smaller
(i.e. logical).
PS. Doing the metrics timing after yielding the token meant that the
time for sending the reply to the client, and waiting for that to
complete, was also included in the token time. I suspect that with a lot
of parallel requests, processing had often switched to other megaservice
request processing threads, and getting control back to the yielding
thread for timing could be delayed much longer than sending the response
to the client took.
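
The difference between the two timing points can be illustrated with a
minimal sketch. This is not the megaservice code: `timed_before_yield`,
`timed_after_yield` and `slow_consumer` are hypothetical names, and the
consumer's `time.sleep` merely stands in for the time spent sending the
reply and servicing other requests.

```python
import time
from typing import Callable, Iterable, Iterator, List


def timed_before_yield(tokens: Iterable[str], record: Callable[[float], None]) -> Iterator[str]:
    """Record first token latency BEFORE handing the token to the consumer."""
    start = time.monotonic()
    first = True
    for token in tokens:
        if first:
            # Timestamp is taken while this generator still has control.
            record(time.monotonic() - start)
            first = False
        yield token


def timed_after_yield(tokens: Iterable[str], record: Callable[[float], None]) -> Iterator[str]:
    """Record first token latency AFTER the yield (the old behaviour described above)."""
    start = time.monotonic()
    first = True
    for token in tokens:
        yield token
        if first:
            # Control only returns here once the consumer asks for the next
            # token, so the consumer's own processing time is included.
            record(time.monotonic() - start)
            first = False


def slow_consumer(stream: Iterator[str], delay: float) -> List[str]:
    """Hypothetical client: spends `delay` seconds handling each token."""
    received = []
    for token in stream:
        time.sleep(delay)
        received.append(token)
    return received


if __name__ == "__main__":
    before: List[float] = []
    after: List[float] = []
    slow_consumer(timed_before_yield(["a", "b"], before.append), 0.05)
    slow_consumer(timed_after_yield(["a", "b"], after.append), 0.05)
    print(f"before yield: {before[0]:.4f}s, after yield: {after[0]:.4f}s")
```

In this sketch the "after yield" measurement absorbs the consumer's delay,
while the "before yield" measurement does not, which matches the direction
of the discrepancy described above.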
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
* openai compatible for asr/tts
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add dep
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>