diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README.md b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
index 9f20c03a4..c71a866cf 100644
--- a/ChatQnA/docker_compose/intel/cpu/xeon/README.md
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
@@ -34,10 +34,22 @@ To set up environment variables for deploying ChatQnA services, follow these steps:
    ```
 
 3. Set up other environment variables:
+
    ```bash
    source ./set_env.sh
    ```
 
+4. Change the model for LLM serving
+
+   By default, Meta-Llama-3-8B-Instruct is used for LLM serving; it can be changed to any other validated LLM model.
+   Pick a [validated LLM model](https://github.com/opea-project/GenAIComps/tree/main/comps/llms/src/text-generation#validated-llm-models) from the table.
+   To change the default model defined in set_env.sh, either export LLM_MODEL_ID with the new model ID or modify set_env.sh, and then repeat step 3.
+   For example, switch to Llama-2-7b-chat-hf using the following command:
+
+   ```bash
+   export LLM_MODEL_ID="meta-llama/Llama-2-7b-chat-hf"
+   ```
+
 ## Quick Start: 2.Run Docker Compose
 
 ```bash
diff --git a/ChatQnA/docker_compose/intel/hpu/gaudi/README.md b/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
index 63cd94ab4..bd5c63490 100644
--- a/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
+++ b/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -39,6 +39,25 @@ To set up environment variables for deploying ChatQnA services, follow these steps:
    source ./set_env.sh
    ```
 
+4. Change the model for LLM serving
+
+   By default, Meta-Llama-3-8B-Instruct is used for LLM serving; it can be changed to any other validated LLM model.
+   Pick a [validated LLM model](https://github.com/opea-project/GenAIComps/tree/main/comps/llms/src/text-generation#validated-llm-models) from the table.
+   To change the default model defined in set_env.sh, either export LLM_MODEL_ID with the new model ID or modify set_env.sh, and then repeat step 3.
+   For example, switch to DeepSeek-R1-Distill-Qwen-32B using the following command:
+
+   ```bash
+   export LLM_MODEL_ID="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
+   ```
+
+   Also check the [required Gaudi cards for different models](https://github.com/opea-project/GenAIComps/tree/main/comps/llms/src/text-generation#system-requirements-for-llm-models) when choosing a new model.
+   It might be necessary to increase the number of Gaudi cards by exporting NUM_CARDS with the required number or by modifying set_env.sh, and then repeating step 3. For example, increase the number of Gaudi cards for
+   DeepSeek-R1-Distill-Qwen-32B using the following command:
+
+   ```bash
+   export NUM_CARDS=4
+   ```
+
 ## Quick Start: 2.Run Docker Compose
 
 ```bash
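
The Xeon instructions added above offer two routes for overriding the model: exporting LLM_MODEL_ID or editing set_env.sh. A minimal sketch of the editing route, assuming GNU sed and assuming set_env.sh defines the variable with a plain `export LLM_MODEL_ID=...` line (verify the exact line in your copy of set_env.sh before running):

```bash
# Hypothetical: persist the model override by rewriting the export line in
# set_env.sh, then re-source the script (step 3) so the change takes effect.
# Assumes GNU sed (`sed -i` without a suffix); on BSD/macOS use `sed -i ''`.
sed -i 's|^export LLM_MODEL_ID=.*|export LLM_MODEL_ID="meta-llama/Llama-2-7b-chat-hf"|' set_env.sh
source ./set_env.sh
echo "LLM serving model: ${LLM_MODEL_ID}"
```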
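
For the Gaudi instructions, the model and card-count overrides combine into one sequence. A sketch of the full flow described above, assuming set_env.sh respects values already set in the environment (for example via `${NUM_CARDS:-1}`-style defaults) rather than unconditionally reassigning them:

```bash
# Override both the served model and the Gaudi card count, then repeat
# step 3 (re-source set_env.sh) as the README instructs.
export LLM_MODEL_ID="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
export NUM_CARDS=4
source ./set_env.sh
echo "Serving ${LLM_MODEL_ID} on ${NUM_CARDS} Gaudi card(s)"
```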