* remove examples gateway.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove gateway.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine service code.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update http_service.py
* remove gateway ut.
* remove gateway ut.
* fix conflict service name.
* Update http_service.py
* add handle message ut.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove `multiprocessing.Process` start server code.
* fix ut.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove multiprocessing and enhance ut for coverage.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
* initial code for sql agent llama
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add test for sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update sql agent test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs and use vllm to test sql agent
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* add tag-bench test and google search tool
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* test sql agent with hints
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix bugs for sql agent with hints and update test
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add readme for sql agent and fix ci bugs
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add sql agent using openai models
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix bugs in sql agent openai
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* make wait time longer for sql agent microservice to be ready
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update readme
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* fix test bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* skip planexec with vllm due to vllm-gaudi bug
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* debug ut issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* use vllm for all uts
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* debug ci issue
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* change vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* update ut
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* remove tgi server
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* align vllm port
Signed-off-by: minmin-intel <minmin.hou@intel.com>
---------
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Component base class
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add controller class
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add ut
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
While the metrics are OK for a small number of requests, when the
megaservice is handling many (hundreds of) _parallel_ requests, it
reported a clearly (~10%) larger first token latency than the one seen
by the client receiving the tokens from the megaservice.
Taking the timestamp before the token is yielded means that the
reported first token latency can be slightly shorter than it actually
is. However, testing with ChatQnA shows the latencies to be clearly
closer to the ones seen by the client (within a couple of percent) and
typically smaller (i.e. logical).
PS. Doing the metrics timing after yielding the token meant that the
time for sending the reply to the client, and waiting for that to
complete, was also included in the token time. I suspect that with a
lot of parallel requests, processing had often switched to other
megaservice request processing threads, and getting control back to
the yielding thread for timing could be delayed much longer than
sending the response to the client took.
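The difference between the two timing placements can be illustrated with a minimal sketch. This is not the megaservice code: `stream_timed_before_yield`, `stream_timed_after_yield` and the `latencies` list are hypothetical names, and the consumer-side `time.sleep()` only stands in for the time spent sending the reply or waiting to be rescheduled.

```python
import time
from typing import Iterable, Iterator, List


def stream_timed_before_yield(tokens: Iterable[str], latencies: List[float]) -> Iterator[str]:
    """Record first-token latency BEFORE handing the token to the consumer."""
    start = time.monotonic()
    first = True
    for token in tokens:
        if first:
            # Timestamp is taken before the yield, so consumer-side delays
            # (sending the reply, waiting to be resumed) are not counted.
            latencies.append(time.monotonic() - start)
            first = False
        yield token


def stream_timed_after_yield(tokens: Iterable[str], latencies: List[float]) -> Iterator[str]:
    """Record first-token latency AFTER the yield (the placement being replaced)."""
    start = time.monotonic()
    first = True
    for token in tokens:
        yield token
        if first:
            # Execution resumes here only when the consumer asks for the next
            # token, so the consumer's delay is included in the measurement.
            latencies.append(time.monotonic() - start)
            first = False


def consume_slowly(stream: Iterator[str], delay: float) -> List[str]:
    """Hypothetical consumer that is busy for `delay` seconds after each token."""
    received = []
    for token in stream:
        received.append(token)
        time.sleep(delay)
    return received


before: List[float] = []
after: List[float] = []
out_before = consume_slowly(stream_timed_before_yield(["a", "b"], before), 0.05)
out_after = consume_slowly(stream_timed_after_yield(["a", "b"], after), 0.05)
```

In this sketch the value in `after` includes the consumer's delay while the value in `before` does not, which mirrors the over-reporting described above.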
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
* openai compatible for asr/tts
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add dep
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Update requirements to pin protobuf version and fix grpc conflict, and limit vdms version
Signed-off-by: Lacewell, Chaunte W <chaunte.w.lacewell@intel.com>
* Update fix by removing grpcio pin and pinning opentelemetry-proto to 1.23.0
Signed-off-by: Lacewell, Chaunte W <chaunte.w.lacewell@intel.com>
---------
Signed-off-by: Lacewell, Chaunte W <chaunte.w.lacewell@intel.com>
* fix retriever and reranker to process chat completion request
Signed-off-by: minmin-intel <minmin.hou@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Pass down model id for ChatQnA
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update logic
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* update gateway
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
* update the gateway
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
* update the gateway
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Mustafa <mustafa.cetin@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix history content from agent memory.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Embedding TEI Langchain compatible with OpenAI API
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* TextDoc support list
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* support tei llama index openai compatible API
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* support mosec langchain openai compatible API
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* update UT for embedding tests
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix ut bug
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* support embedding predictionguard openai compatible API
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* support embedding multimodal clip OpenAI compatible API
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* fix bug
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* enable debug mode for embedding UT
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
* Drop dump_outputs() method that obfuscates the code
The dump_outputs() method in ServiceOrchestrator:
* Is not a real method (does not use self)
* Adds a member to a dict instead of "dump"ing (dropping or outputting) something
* Obfuscates how the schedule() method return value is constructed, and
* Makes the calling code unnecessarily longer
The similar method in "ServiceOrchestratorWithYaml" is reasonable except
for the name, but drop that too for consistency.
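The points above can be illustrated with a simplified sketch. It is not the actual ServiceOrchestrator code: the class names, the `responses` argument and the body of `schedule()` are assumptions made only to show how a helper that merely assigns a dict entry hides the construction of the return value.

```python
from typing import Any, Dict


class OrchestratorWithHelper:
    """Hypothetical 'before' shape: a helper that only assigns a dict entry."""

    def dump_outputs(self, node: str, response: Any, result_dict: Dict[str, Any]) -> None:
        # Does not use self and does not "dump" anything; it just stores a value.
        result_dict[node] = response

    def schedule(self, responses: Dict[str, Any]) -> Dict[str, Any]:
        result_dict: Dict[str, Any] = {}
        for node, response in responses.items():
            self.dump_outputs(node, response, result_dict)
        return result_dict


class OrchestratorInlined:
    """Hypothetical 'after' shape: the assignment is written where it happens."""

    def schedule(self, responses: Dict[str, Any]) -> Dict[str, Any]:
        result_dict: Dict[str, Any] = {}
        for node, response in responses.items():
            # The return value's construction is visible at the call site.
            result_dict[node] = response
        return result_dict


sample = {"embedding": [0.1, 0.2], "llm": "hello"}
with_helper = OrchestratorWithHelper().schedule(sample)
inlined = OrchestratorInlined().schedule(sample)
```

Both variants return the same dict; the inlined one is shorter and shows directly what schedule() returns.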
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
* Apply pylint simplification suggestion to execute()
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
---------
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: Sihan Chen <39623753+Spycsh@users.noreply.github.com>
* Multiple models support for langchain vLLM text-generation
Signed-off-by: sgurunat <gurunath.s@intel.com>
* Add authentication support for langchain vLLM text-generation remote endpoints
Signed-off-by: sgurunat <gurunath.s@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: sgurunat <gurunath.s@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>