Refine Main README (#502)

* update examples readme

Signed-off-by: letonghan <letong.han@intel.com>

* update architecture img

Signed-off-by: letonghan <letong.han@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update img name

Signed-off-by: letonghan <letong.han@intel.com>

* update readme & fix dockerfile issue

Signed-off-by: letonghan <letong.han@intel.com>

* add k8s doc links

Signed-off-by: letonghan <letong.han@intel.com>

---------

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Letong Han
2024-08-05 09:47:15 +08:00
committed by GitHub
parent 4259240407
commit 08eb2699b7
2 changed files with 304 additions and 216 deletions

README.md

@@ -2,240 +2,108 @@
# Generative AI Examples
This project provides a collective list of Generative AI (GenAI) and Retrieval-Augmented Generation (RAG) examples, such as a question answering chatbot (ChatQnA), code generation (CodeGen), document summarization (DocSum), etc.
[![version](https://img.shields.io/badge/release-0.6-green)](https://github.com/opea-project/GenAIExamples/releases)
[![version](https://img.shields.io/badge/release-0.8-green)](https://github.com/opea-project/GenAIExamples/releases)
[![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
---
<div align="left">
## GenAI Examples
## Introduction
All the examples are well-validated on Intel platforms. In addition, these examples are:
GenAIComps-based Generative AI examples offer streamlined deployment, testing, and scalability. All examples are fully compatible with Docker and Kubernetes, and support a wide range of hardware platforms such as Gaudi, Xeon, and others.
- <b>Easy to use</b>. Use ecosystem-compliant APIs to build end-to-end GenAI examples.
## Architecture
- <b>Easy to customize</b>. Customize the example with a different framework, LLM, embedding model, serving engine, etc.
GenAIComps is a service-based tool that includes microservice components such as llm, embedding, reranking, and so on. Using these components, various examples in GenAIExamples can be constructed, including ChatQnA, DocSum, etc.
- <b>Easy to deploy</b>. Deploy the GenAI examples with high performance on Intel platforms.
GenAIInfra, part of the OPEA containerization and cloud-native suite, enables quick and efficient deployment of GenAIExamples in the cloud.
> **Note**:
> The support matrix below lists the validated configurations. Feel free to customize per your needs.
GenAIEvals measures service performance metrics such as throughput, latency, and accuracy for GenAIExamples. This feature helps users compare performance across various hardware configurations easily.
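To make the composition concrete, here is a minimal sketch of calling two GenAIComps component microservices directly with `curl`. The host, ports, and request bodies are assumptions for illustration only; each component's own README documents its actual interface.

```bash
# Hypothetical direct calls to two component microservices that an example such as
# ChatQnA wires together. Ports and payload fields are assumed for illustration.

# Embedding microservice (assumed to listen on port 6000):
curl -s http://localhost:6000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"text": "What is OPEA?"}'

# LLM text-generation microservice (assumed to listen on port 9000):
curl -s http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"query": "What is retrieval augmented generation?"}'
```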
### ChatQnA
## Getting Started
[ChatQnA](./ChatQnA/README.md) is an example of a chatbot for question answering through retrieval augmented generation (RAG).
GenAIExamples offers flexible deployment options that cater to different user needs, enabling efficient use and deployment in various environments. Here's a brief overview of the two primary methods documented here: Docker Compose and Kubernetes.
1. <b>Docker Compose</b>: Check the released Docker images in the [docker image list](./docker_images_list.md) for detailed information.
2. <b>Kubernetes</b>: Follow the steps at [K8s Install](https://github.com/opea-project/docs/tree/main/guide/installation/k8s_install) and [GMC Install](https://github.com/opea-project/docs/blob/main/guide/installation/gmc_install/gmc_install.md) to set up Kubernetes and the GenAI environment (see the command sketch below).
Users can choose the most suitable approach based on ease of setup, scalability needs, and the environment in which they are operating.
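A minimal command sketch of the two paths, using ChatQnA on Xeon as an example. The compose file name and manifest path are assumptions; follow the per-example README for the exact files and required environment variables (e.g. model IDs and proxy settings).

```bash
# Docker Compose path (sketch): clone the examples repo and start one example's services.
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/docker/xeon
docker compose -f compose.yaml up -d   # compose file name may differ per release
docker compose ps                      # check that all containers are running
cd -                                   # return to the starting directory

# Kubernetes path (sketch): after the K8s and GMC setup linked above, apply the example's manifests.
kubectl apply -f GenAIExamples/ChatQnA/kubernetes/manifests/   # manifest path is illustrative
kubectl get pods                       # wait until all pods report Running
```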
### Deployment
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Embedding</td>
<td>Vector Database</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/Intel/neural-chat-7b-v3-3">NeuralChat-7B</a></td>
<td><a href="https://huggingface.co/BAAI/bge-base-en">BGE-Base</a></td>
<td><a href="https://redis.io/">Redis</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a> <a href="https://github.com/huggingface/text-embeddings-inference">TEI</a></td>
<td>Xeon/Gaudi2/GPU</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/Intel/neural-chat-7b-v3-3">NeuralChat-7B</a></td>
<td><a href="https://huggingface.co/BAAI/bge-base-en">BGE-Base</a></td>
<td><a href="https://www.trychroma.com/">Chroma</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a> <a href="https://github.com/huggingface/text-embeddings-inference">TEI</td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/mistralai/Mistral-7B-v0.1">Mistral-7B</a></td>
<td><a href="https://huggingface.co/BAAI/bge-base-en">BGE-Base</a></td>
<td><a href="https://redis.io/">Redis</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a> <a href="https://github.com/huggingface/text-embeddings-inference">TEI</td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/mistralai/Mistral-7B-v0.1">Mistral-7B</a></td>
<td><a href="https://huggingface.co/BAAI/bge-base-en">BGE-Base</a></td>
<td><a href="https://qdrant.tech/">Qdrant</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a> <a href="https://github.com/huggingface/text-embeddings-inference">TEI</td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/Qwen/Qwen2-7B">Qwen2-7B</a></td>
<td><a href="https://huggingface.co/BAAI/bge-base-en">BGE-Base</a></td>
<td><a href="https://redis.io/">Redis</a></td>
          <td><a href="https://github.com/huggingface/text-embeddings-inference">TEI</a></td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
</tbody>
<tr>
<th rowspan="3" style="text-align:center;">Use Cases</th>
<th colspan="4" style="text-align:center;">Deployment</th>
</tr>
<tr>
<td colspan="2" style="text-align:center;">Docker Compose</td>
<td rowspan="2" style="text-align:center;">Kubernetes</td>
</tr>
<tr>
<td style="text-align:center;">Xeon</td>
<td style="text-align:center;">Gaudi</td>
</tr>
<tr>
<td style="text-align:center;">ChatQnA</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">CodeGen</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeGen/kubernetes/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">CodeTrans</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/CodeTrans/kubernetes/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">DocSum</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/DocSum/kubernetes/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">SearchQnA</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/SearchQnA/kubernetes/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">FaqGen</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/FaqGen/kubernetes/manifests/README.md">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">Translation</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/Translation/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/Translation/docker/gaudi/README.md">Gaudi Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/tree/main/Translation/kubernetes">K8s Link</a></td>
</tr>
<tr>
<td style="text-align:center;">AudioQnA</td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/xeon/README.md">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/gaudi/README.md">Gaudi Link</a></td>
<td>Not supported yet</td>
</tr>
<tr>
<td style="text-align:center;">VisualQnA</td>
<td><a href="https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA">Xeon Link</a></td>
<td><a href="https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA">Gaudi Link</a></td>
<td>Not supported yet</td>
</tr>
</table>
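Whichever deployment path is used, a quick way to confirm an example is up is to query its gateway over HTTP. The port and payload below are illustrative assumptions for ChatQnA; the exact endpoint and request format are documented in each example's README.

```bash
# Hypothetical smoke test against a ChatQnA gateway assumed to be exposed on localhost:8888.
curl -s http://localhost:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the OPEA project?"}'
```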
### CodeGen
## Supported Examples
[CodeGen](./CodeGen/README.md) is an example of a copilot designed for code generation in Visual Studio Code.
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/meta-llama/CodeLlama-7b-hf">meta-llama/CodeLlama-7b-hf</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Copilot</td>
</tr>
</tbody>
</table>
### CodeTrans
[CodeTrans](./CodeTrans/README.md) is an example of a chatbot for converting code from one programming language to another while maintaining the same functionality.
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/HuggingFaceH4/mistral-7b-grok">HuggingFaceH4/mistral-7b-grok</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Code Translation</td>
</tr>
</tbody>
</table>
### DocSum
[DocSum](./DocSum/README.md) is an example of a chatbot for summarizing the content of documents or reports.
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/Intel/neural-chat-7b-v3-3">NeuralChat-7B</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/mistralai/Mistral-7B-v0.1">Mistral-7B</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
</tbody>
</table>
### Language Translation
[Language Translation](./Translation/README.md) is an example of a chatbot for converting text in a source language to equivalent text in a target language.
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/haoranxu/ALMA-13B">haoranxu/ALMA-13B</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Language Translation</td>
</tr>
</tbody>
</table>
### SearchQnA
[SearchQnA](./SearchQnA/README.md) is an example of a chatbot that uses a search engine to enhance QA quality.
<table>
<tbody>
<tr>
<td>Framework</td>
<td>LLM</td>
<td>Serving</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/Intel/neural-chat-7b-v3-3">NeuralChat-7B</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
<tr>
<td><a href="https://www.langchain.com">LangChain</a>/<a href="https://www.llamaindex.ai">LlamaIndex</a></td>
<td><a href="https://huggingface.co/mistralai/Mistral-7B-v0.1">Mistral-7B</a></td>
<td><a href="https://github.com/huggingface/text-generation-inference">TGI</a></td>
<td>Xeon/Gaudi2</td>
<td>Chatbot</td>
</tr>
</tbody>
</table>
### VisualQnA
[VisualQnA](./VisualQnA/README.md) is an example of a chatbot for question answering based on images.
<table>
<tbody>
<tr>
<td>LLM</td>
<td>HW</td>
<td>Description</td>
</tr>
<tr>
<td><a href="https://huggingface.co/llava-hf/llava-1.5-7b-hf">LLaVA-1.5-7B</a></td>
<td>Gaudi2</td>
<td>Chatbot</td>
</tr>
</tbody>
</table>
> **_NOTE:_** The `Language Translation`, `SearchQnA`, `VisualQnA` and other use cases not listed here are in active development. The code structure of these use cases is subject to change.
Check [here](./supported_examples.md) for detailed information on supported examples, models, hardware, etc.
## Additional Content