Update README with new examples (#808)

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lvliang-intel
2024-09-14 09:41:51 +08:00
committed by GitHub
parent b84c98983d
commit 2d28bebac6
3 changed files with 62 additions and 16 deletions

@@ -60,8 +60,53 @@ This document introduces the supported examples of GenAIExamples. The supported
[VisualQnA](./VisualQnA/README.md) is an example of a chatbot for question answering based on images.
| LLM | HW | Description |
| --------------------------------------------------------------- | ------ | ----------- |
| [LLaVA-1.5-7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf) | Gaudi2 | Chatbot |
| LVM | HW | Description |
| --------------------------------------------------------------------------------------------- | ------ | ----------- |
| [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) | Gaudi2 | Chatbot |
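
For local experimentation outside the deployed VisualQnA service, the LVM listed above can be exercised directly with Hugging Face `transformers`. The snippet below is a minimal sketch, assuming `transformers>=4.39`, `torch`, `Pillow`, and a local image file `example.jpg`; it is not the example's pipeline code.

```python
# Minimal local sketch of visual question answering with llava-v1.6-mistral-7b-hf.
# Assumes transformers>=4.39, torch, Pillow, and a local image "example.jpg".
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example.jpg")
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```
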
> **_NOTE:_** The `Language Translation`, `SearchQnA`, `VisualQnA` and other use cases not listed here are in active development. The code structure of these use cases is subject to change.
### VideoQnA
[VideoQnA](./VideoQnA/README.md) is an example of a chatbot for question answering based on videos. It retrieves videos based on a user-provided prompt, using only the video embeddings to perform vector similarity search in Intel's VDMS vector database, and performs all operations on Intel Xeon CPUs. The pipeline supports long-form videos and time-based search.
The default embedding and LVM models are listed below:
| Service | Model | HW | Description |
| --------- | ----------------------------------------------------------------------------------- | ---- | ------------------------ |
| Embedding | [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32) | Xeon | Video embeddings service |
| LVM | [DAMO-NLP-SG/Video-LLaMA](https://huggingface.co/DAMO-NLP-SG/VideoLLaMA2-7B) | Xeon | LVM service |
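
The core retrieval idea, matching a text query against precomputed CLIP embeddings of video frames, can be illustrated with a small in-memory sketch. The actual example stores embeddings in VDMS; the in-memory search, frame list, and file names below are assumptions for illustration only.

```python
# Conceptual sketch of text-to-video retrieval with CLIP embeddings.
# The real VideoQnA pipeline stores embeddings in Intel's VDMS vector database;
# here an in-memory cosine-similarity search stands in for it (assumption).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical frames sampled from indexed video clips.
frame_paths = ["clip_0001.jpg", "clip_0002.jpg", "clip_0003.jpg"]
images = [Image.open(p) for p in frame_paths]

with torch.no_grad():
    frame_emb = model.get_image_features(**processor(images=images, return_tensors="pt"))
    query_emb = model.get_text_features(
        **processor(text=["a man riding a bicycle"], return_tensors="pt", padding=True)
    )

# Cosine similarity between the query and each frame embedding.
frame_emb = frame_emb / frame_emb.norm(dim=-1, keepdim=True)
query_emb = query_emb / query_emb.norm(dim=-1, keepdim=True)
scores = (query_emb @ frame_emb.T).squeeze(0)
print("best matching clip:", frame_paths[int(scores.argmax())])
```
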
### RerankFinetuning
The Rerank Finetuning example trains a rerank model on a dataset to improve its capability in a specific field.
The default base model is listed below:
| Service | Base Model | HW | Description |
| ----------------- | ------------------------------------------------------------------------- | ---- | ------------------------------- |
| Rerank Finetuning | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Xeon | Rerank model finetuning service |
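
Reranker finetuning data is usually organized as (query, positive passages, negative passages) triples. The JSON-lines layout below is a commonly used convention (for example in FlagEmbedding's reranker finetuning scripts); the field names and file name are assumptions, not a contract of this example — check the RerankFinetuning README for the exact expected format.

```python
# Sketch of a typical reranker finetuning record layout (query/pos/neg JSONL).
# Field names follow a common convention; the actual example's expected
# format may differ -- see the RerankFinetuning README.
import json

records = [
    {
        "query": "What is the capital of France?",
        "pos": ["Paris is the capital and largest city of France."],
        "neg": [
            "Berlin is the capital of Germany.",
            "Madrid is the capital of Spain.",
        ],
    },
]

with open("rerank_train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```
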
### InstructionTuning
The Instruction Tuning example is designed to further train large language models (LLMs) on a dataset of (instruction, output) pairs using supervised learning. This process bridges the gap between the LLM's original objective of next-word prediction and the user's objective of having the model follow human instructions accurately. By leveraging instruction tuning, this example enhances the LLM's ability to understand and execute specific tasks, improving the model's alignment with user instructions and its overall performance.
The default base model is listed below:
| Service | Base Model | HW | Description |
| ----------------- | ------------------------------------------------------------------------------------- | ---------- | ------------------------------------ |
| InstructionTuning | [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) | Xeon/Gaudi | LLM model Instruction Tuning service |
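
At its core, instruction tuning turns each (instruction, output) pair into a single training sequence in which the loss is computed only on the output tokens. The sketch below shows that label-masking step with a Hugging Face tokenizer; the prompt template and the use of the (gated) Llama-2 tokenizer are assumptions for illustration, not the example's exact preprocessing.

```python
# Sketch: build supervised labels from one (instruction, output) pair,
# masking the prompt tokens with -100 so the loss covers only the output.
# The prompt template here is an illustrative assumption.
from transformers import AutoTokenizer

# Gated model; requires Hugging Face access approval.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

instruction = "Summarize the following sentence: The quick brown fox jumps over the lazy dog."
output = "A fox jumps over a dog."

prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
output_ids = tokenizer(output + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

input_ids = prompt_ids + output_ids
labels = [-100] * len(prompt_ids) + output_ids  # ignore prompt tokens in the loss
```
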
### DocIndexRetriever
The DocRetriever example demonstrates how to match user queries with free-text records using various retrieval methods. It plays a key role in Retrieval-Augmented Generation (RAG) systems by dynamically fetching relevant information from external sources, ensuring responses are factual and up-to-date. Powered by vector databases, DocRetriever enables efficient, semantic retrieval by storing data as vectors and quickly identifying the most relevant documents based on similarity.
| Framework | Embedding | Vector Database | Serving | HW | Description |
| ------------------------------------------------------------------------------ | --------------------------------------------------- | -------------------------- | --------------------------------------------------------------- | ----------- | -------------------------- |
| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Document Retrieval Service |
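
The retrieval step boils down to embedding the query with the TEI service and ranking stored document vectors by similarity. The sketch below calls a TEI `/embed` endpoint directly and does the ranking in NumPy; the endpoint URL/port and the in-memory store are assumptions for illustration — the deployed example uses Redis as the vector database.

```python
# Conceptual sketch of semantic retrieval: embed texts via a TEI endpoint,
# then rank documents by cosine similarity. The endpoint URL/port is an
# assumption; the deployed example stores vectors in Redis instead of memory.
import numpy as np
import requests

TEI_URL = "http://localhost:8090/embed"  # assumed TEI service address

def embed(texts):
    resp = requests.post(TEI_URL, json={"inputs": texts})
    resp.raise_for_status()
    return np.array(resp.json())

docs = [
    "OPEA provides composable building blocks for GenAI pipelines.",
    "Redis can be used as a vector database for semantic search.",
]
doc_vecs = embed(docs)
query_vec = embed(["Which database stores the document vectors?"])[0]

# Cosine similarity ranking.
scores = doc_vecs @ query_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(scores.argmax())])
```
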
### AgentQnA
The AgentQnA example demonstrates a hierarchical, multi-agent system designed for question-answering tasks. A supervisor agent interacts directly with the user, delegating tasks to a worker agent and utilizing various tools to gather information and generate answers. The worker agent primarily uses a retrieval tool to respond to the supervisor's queries. Additionally, the supervisor can access other tools, such as APIs to query knowledge graphs, SQL databases, or external knowledge bases, to enhance the accuracy and relevance of its responses.
The worker agent uses an open-source web search tool (DuckDuckGo), and the agents use OpenAI GPT-4o-mini as the LLM backend.
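
The hierarchical control flow can be summarized in a few lines: the supervisor inspects the question, decides which tool to call (the worker's retrieval tool, web search, or another API), and composes the final answer. Everything below — function names, the routing rule, and the stub tools — is a simplified, hypothetical sketch of that flow, not the example's actual agent code.

```python
# Hypothetical sketch of the supervisor/worker delegation flow in AgentQnA.
# Tool names, routing logic, and return values are illustrative stubs only;
# the real example drives these decisions with an LLM (e.g. GPT-4o-mini).
def worker_retrieval_tool(query: str) -> str:
    """Worker agent: answer from the internal knowledge base (stub)."""
    return f"[retrieved passages for: {query}]"

def web_search_tool(query: str) -> str:
    """Supervisor-side web search tool, e.g. DuckDuckGo (stub)."""
    return f"[web results for: {query}]"

def supervisor(question: str) -> str:
    # A real supervisor would let the LLM pick the tool; a toy rule stands in here.
    if "latest" in question.lower():
        evidence = web_search_tool(question)
    else:
        evidence = worker_retrieval_tool(question)
    return f"Answer to '{question}' based on {evidence}"

print(supervisor("What is OPEA?"))
```
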
> **_NOTE:_** This example is in active development. The code structure of these use cases is subject to change.