Go to file

Eero Tamminen a6998a1dbd Add E2E Promeheus metrics to applications (#845 )

* Fix typos in BaseStatistics method names

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Add HttpService "inprogress" (pending) request count metrics

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Add E2E Prometheus metrics to ServiceOrchestrator

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Fix: support metrics with multiple ServiceOrchestrator instances

Unlike apps, CI tests create multiple of them.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Fix: require named MicroService -> HTTPService instances

Creating multiple MicroService()s creates multiple HTTPService()s
which creates multiple Prometheus fastapi instrumentor instances.

While latter handled that fine for ChatQnA and normal HTTP metrics,
that was not the case for its "inprogress" metrics in CI.

Therefore MicroService constructor name argument is now mandatory, so
that it can be used to make "inprogress" metrics for HTTPService
instances unique.

PS. instrumentor requires HTTPService instance specific Starlette
instance, so it cannot be made singleton.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Fix: update test_token_generator()

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

2024-11-04 09:58:23 +08:00

.github

GraphRAG with llama-index (#793 )

2024-10-29 22:45:44 -07:00

comps

Add E2E Promeheus metrics to applications (#845 )

2024-11-04 09:58:23 +08:00

tests

Add E2E Promeheus metrics to applications (#845 )

2024-11-04 09:58:23 +08:00

.gitattributes

Fix eol and add OLLAMA_MODEL param (#309 )

2024-07-18 22:10:27 +08:00

.gitignore

Prediction Guard Guardrails components (#677 )

2024-09-23 16:14:57 +08:00

.pre-commit-config.yaml

fix: Update pre-commit prettier mirror to running prettier in local #765 (#779 )

2024-11-03 10:36:03 -05:00

codecov.yml

Add codecov (#178 )

2024-06-14 13:37:01 +08:00

LEGAL_INFORMATION.md

doc: fix missing references to README.md (#721 )

2024-09-23 07:55:52 +08:00

LICENSE

Initial commit

2024-04-19 18:43:47 +08:00

pyproject.toml

Remove unused imports across all comps (#762 )

2024-10-11 22:11:59 +08:00

README.md

doc: fix missing references to README.md (#721 )

2024-09-23 07:55:52 +08:00

requirements.txt

Support file upload summary for DocSum microservice (#823 )

2024-10-24 14:14:48 +08:00

setup.py

Support export megaservice yaml to docker compose file (#642 )

2024-09-11 22:26:43 +08:00

third-party-programs.txt

bump release version into v0.6 (#128 )

2024-05-31 18:12:40 +08:00

README.md

Generative AI Components (GenAIComps)

Build Enterprise-grade Generative AI Applications with Microservice Architecture

This initiative empowers the development of high-quality Generative AI applications for enterprises via microservices, simplifying the scaling and deployment process for production. It abstracts away infrastructure complexities, facilitating the seamless development and deployment of Enterprise AI services.

GenAIComps

GenAIComps provides a suite of microservices, leveraging a service composer to assemble a mega-service tailored for real-world Enterprise AI applications. All the microservices are containerized, allowing cloud native deployment. Checkout how the microservices are used in GenAIExamples.

Installation

Install from Pypi

pip install opea-comps

Build from Source

git clone https://github.com/opea-project/GenAIComps
cd GenAIComps
pip install -e .

MicroService

Microservices are akin to building blocks, offering the fundamental services for constructing RAG (Retrieval-Augmented Generation) applications.

Each Microservice is designed to perform a specific function or task within the application architecture. By breaking down the system into smaller, self-contained services, Microservices promote modularity, flexibility, and scalability.

This modular approach allows developers to independently develop, deploy, and scale individual components of the application, making it easier to maintain and evolve over time. Additionally, Microservices facilitate fault isolation, as issues in one service are less likely to impact the entire system.

The initially supported Microservices are described in the below table. More Microservices are on the way.

MicroService	Framework	Model	Serving	HW	Description
Embedding	LangChain/LlamaIndex	BAAI/bge-base-en-v1.5	TEI-Gaudi	Gaudi2	Embedding on Gaudi2
Embedding	LangChain/LlamaIndex	BAAI/bge-base-en-v1.5	TEI	Xeon	Embedding on Xeon CPU
Retriever	LangChain/LlamaIndex	BAAI/bge-base-en-v1.5	TEI	Xeon	Retriever on Xeon CPU
Reranking	LangChain/LlamaIndex	BAAI/bge-reranker-base	TEI-Gaudi	Gaudi2	Reranking on Gaudi2
Reranking	LangChain/LlamaIndex	BBAAI/bge-reranker-base	TEI	Xeon	Reranking on Xeon CPU
ASR	NA	openai/whisper-small	NA	Gaudi2	Audio-Speech-Recognition on Gaudi2
ASR	NA	openai/whisper-small	NA	Xeon	Audio-Speech-RecognitionS on Xeon CPU
TTS	NA	microsoft/speecht5_tts	NA	Gaudi2	Text-To-Speech on Gaudi2
TTS	NA	microsoft/speecht5_tts	NA	Xeon	Text-To-Speech on Xeon CPU
Dataprep	Qdrant	sentence-transformers/all-MiniLM-L6-v2	NA	Gaudi2	Dataprep on Gaudi2
Dataprep	Qdrant	sentence-transformers/all-MiniLM-L6-v2	NA	Xeon	Dataprep on Xeon CPU
Dataprep	Redis	BAAI/bge-base-en-v1.5	NA	Gaudi2	Dataprep on Gaudi2
Dataprep	Redis	BAAI/bge-base-en-v1.5	NA	Xeon	Dataprep on Xeon CPU
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	TGI Gaudi	Gaudi2	LLM on Gaudi2
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	TGI	Xeon	LLM on Xeon CPU
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	Ray Serve	Gaudi2	LLM on Gaudi2
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	Ray Serve	Xeon	LLM on Xeon CPU
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	vLLM	Gaudi2	LLM on Gaudi2
LLM	LangChain/LlamaIndex	Intel/neural-chat-7b-v3-3	vLLM	Xeon	LLM on Xeon CPU

A Microservices can be created by using the decorator register_microservice. Taking the embedding microservice as an example:

from langchain_community.embeddings import HuggingFaceHubEmbeddings

from comps import register_microservice, EmbedDoc, ServiceType, TextDoc


@register_microservice(
    name="opea_service@embedding_tgi_gaudi",
    service_type=ServiceType.EMBEDDING,
    endpoint="/v1/embeddings",
    host="0.0.0.0",
    port=6000,
    input_datatype=TextDoc,
    output_datatype=EmbedDoc,
)
def embedding(input: TextDoc) -> EmbedDoc:
    embed_vector = embeddings.embed_query(input.text)
    res = EmbedDoc(text=input.text, embedding=embed_vector)
    return res

MegaService

A Megaservice is a higher-level architectural construct composed of one or more Microservices, providing the capability to assemble end-to-end applications. Unlike individual Microservices, which focus on specific tasks or functions, a Megaservice orchestrates multiple Microservices to deliver a comprehensive solution.

Megaservices encapsulate complex business logic and workflow orchestration, coordinating the interactions between various Microservices to fulfill specific application requirements. This approach enables the creation of modular yet integrated applications, where each Microservice contributes to the overall functionality of the Megaservice.

Here is a simple example of building Megaservice:

from comps import MicroService, ServiceOrchestrator

EMBEDDING_SERVICE_HOST_IP = os.getenv("EMBEDDING_SERVICE_HOST_IP", "0.0.0.0")
EMBEDDING_SERVICE_PORT = os.getenv("EMBEDDING_SERVICE_PORT", 6000)
LLM_SERVICE_HOST_IP = os.getenv("LLM_SERVICE_HOST_IP", "0.0.0.0")
LLM_SERVICE_PORT = os.getenv("LLM_SERVICE_PORT", 9000)


class ExampleService:
    def __init__(self, host="0.0.0.0", port=8000):
        self.host = host
        self.port = port
        self.megaservice = ServiceOrchestrator()

    def add_remote_service(self):
        embedding = MicroService(
            name="embedding",
            host=EMBEDDING_SERVICE_HOST_IP,
            port=EMBEDDING_SERVICE_PORT,
            endpoint="/v1/embeddings",
            use_remote_service=True,
            service_type=ServiceType.EMBEDDING,
        )
        llm = MicroService(
            name="llm",
            host=LLM_SERVICE_HOST_IP,
            port=LLM_SERVICE_PORT,
            endpoint="/v1/chat/completions",
            use_remote_service=True,
            service_type=ServiceType.LLM,
        )
        self.megaservice.add(embedding).add(llm)
        self.megaservice.flow_to(embedding, llm)

Gateway

The Gateway serves as the interface for users to access the Megaservice, providing customized access based on user requirements. It acts as the entry point for incoming requests, routing them to the appropriate Microservices within the Megaservice architecture.

Gateways support API definition, API versioning, rate limiting, and request transformation, allowing for fine-grained control over how users interact with the underlying Microservices. By abstracting the complexity of the underlying infrastructure, Gateways provide a seamless and user-friendly experience for interacting with the Megaservice.

For example, the Gateway for ChatQnA can be built like this:

from comps import ChatQnAGateway

self.gateway = ChatQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)

Contributing to OPEA

Welcome to the OPEA open-source community! We are thrilled to have you here and excited about the potential contributions you can bring to the OPEA platform. Whether you are fixing bugs, adding new GenAI components, improving documentation, or sharing your unique use cases, your contributions are invaluable.

Together, we can make OPEA the go-to platform for enterprise AI solutions. Let's work together to push the boundaries of what's possible and create a future where AI is accessible, efficient, and impactful for everyone.

Please check the Contributing guidelines for a detailed guide on how to contribute a GenAI example and all the ways you can contribute!

Thank you for being a part of this journey. We can't wait to see what we can achieve together!

Additional Content

Languages

Shell 34%

Python 24.1%

TypeScript 16.1%

Svelte 14.2%

Vue 4.7%

Other 6.9%