diff --git a/CodeTrans/README.md b/CodeTrans/README.md
index 78527cb94..b5c05e9fc 100644
--- a/CodeTrans/README.md
+++ b/CodeTrans/README.md
@@ -22,12 +22,11 @@ This Code Translation use case demonstrates Text Generation Inference across mul
 
 The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware.
 
-| Category               | Deployment Option    | Description                                                       |
-| ---------------------- | -------------------- | ----------------------------------------------------------------- |
-| On-premise Deployments | Docker compose       | [CodeTrans deployment on Xeon](./docker_compose/intel/cpu/xeon)   |
-|                        |                      | [CodeTrans deployment on Gaudi](./docker_compose/intel/hpu/gaudi) |
-|                        |                      | [CodeTrans deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) |
-|                        | Kubernetes           | [Helm Charts](./kubernetes/helm)                                  |
-|                        |                      | [GMC](./kubernetes/gmc)                                           |
-|                        | Azure                | Work-in-progress                                                  |
-|                        | Intel Tiber AI Cloud | Work-in-progress                                                  |
+| Category               | Deployment Option    | Description                                                                 |
+| ---------------------- | -------------------- | --------------------------------------------------------------------------- |
+| On-premise Deployments | Docker compose       | [CodeTrans deployment on Xeon](./docker_compose/intel/cpu/xeon/README.md)   |
+|                        |                      | [CodeTrans deployment on Gaudi](./docker_compose/intel/hpu/gaudi/README.md) |
+|                        |                      | [CodeTrans deployment on AMD ROCm](./docker_compose/amd/gpu/rocm/README.md) |
+|                        | Kubernetes           | [Helm Charts](./kubernetes/helm/README.md)                                  |
+|                        | Azure                | Work-in-progress                                                            |
+|                        | Intel Tiber AI Cloud | Work-in-progress                                                            |
diff --git a/CodeTrans/README_miscellaneous.md b/CodeTrans/README_miscellaneous.md
index b0c5a11f4..482659d15 100644
--- a/CodeTrans/README_miscellaneous.md
+++ b/CodeTrans/README_miscellaneous.md
@@ -44,3 +44,38 @@ Some HuggingFace resources, such as some models, are only accessible if the deve
 2. (Docker only) If all microservices work well, check the port ${host_ip}:7777, the port may be allocated by other users, you can modify the `compose.yaml`.
 
 3. (Docker only) If you get errors like "The container name is in use", change container name in `compose.yaml`.
+
+## Monitoring OPEA Services with Prometheus and Grafana Dashboard
+
+An OPEA microservice deployment can easily be monitored through Grafana dashboards using data collected via Prometheus. Follow the [README](https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/grafana/README.md) to set up the Prometheus and Grafana servers and import dashboards to monitor the OPEA services.
+
+![example dashboards](./assets/img/example_dashboards.png)
+![tgi dashboard](./assets/img/tgi_dashboard.png)
+
+## Tracing with OpenTelemetry and Jaeger
+
+> NOTE: This feature is disabled by default. Use the `compose.telemetry.yaml` file to enable it.
+
+OPEA microservices and [TGI](https://huggingface.co/docs/text-generation-inference/en/index)/[TEI](https://huggingface.co/docs/text-embeddings-inference/en/index) serving can easily be traced through [Jaeger](https://www.jaegertracing.io/) dashboards in conjunction with the [OpenTelemetry](https://opentelemetry.io/) tracing feature. Follow the [README](https://github.com/opea-project/GenAIComps/tree/main/comps/cores/telemetry#tracing) to trace additional functions if needed.
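+
+For example, the telemetry overlay can typically be layered on top of the base Compose file as shown below; the exact invocation and compose file locations depend on the chosen deployment directory, so treat this as a sketch rather than a verbatim command.
+
+```bash
+# Sketch: apply compose.telemetry.yaml on top of compose.yaml so the services
+# export OpenTelemetry traces to Jaeger.
+docker compose -f compose.yaml -f compose.telemetry.yaml up -d
+```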
+
+Tracing data is exported to Jaeger at http://{EXTERNAL_IP}:4318/v1/traces.
+Users can also get the external IP via the command below.
+
+```bash
+ip route get 8.8.8.8 | grep -oP 'src \K[^ ]+'
+```
+
+Access the Jaeger dashboard UI at http://{EXTERNAL_IP}:16686.
+
+For TGI serving on Gaudi, users can see different services such as `opea`, TEI, and TGI.
+![Screenshot from 2024-12-27 11-58-18](https://github.com/user-attachments/assets/6126fa70-e830-4780-bd3f-83cb6eff064e)
+
+Here is a screenshot of the trace for one TGI serving request.
+![Screenshot from 2024-12-27 11-26-25](https://github.com/user-attachments/assets/3a7c51c6-f422-41eb-8e82-c3df52cd48b8)
+
+There are also OPEA-related traces. Users can understand the time breakdown of each service request by looking into each `opea:schedule` operation.
+![image](https://github.com/user-attachments/assets/6137068b-b374-4ff8-b345-993343c0c25f)
+
+There can be asynchronous functions such as `llm/MicroService_asyn_generate`; users need to check the trace of such a function in another operation like
+`opea:llm_generate_stream`.
+![image](https://github.com/user-attachments/assets/a973d283-198f-4ce2-a7eb-58515b77503e)
diff --git a/CodeTrans/assets/img/code_trans_architecture.png b/CodeTrans/assets/img/code_trans_architecture.png
index 09a7ffccd..5327de86f 100644
Binary files a/CodeTrans/assets/img/code_trans_architecture.png and b/CodeTrans/assets/img/code_trans_architecture.png differ
diff --git a/CodeTrans/assets/img/example_dashboards.png b/CodeTrans/assets/img/example_dashboards.png
new file mode 100644
index 000000000..24abb6533
Binary files /dev/null and b/CodeTrans/assets/img/example_dashboards.png differ
diff --git a/CodeTrans/assets/img/tgi_dashboard.png b/CodeTrans/assets/img/tgi_dashboard.png
new file mode 100644
index 000000000..8fcd3ac56
Binary files /dev/null and b/CodeTrans/assets/img/tgi_dashboard.png differ