[Codegen] Refine readme to prompt users on how to change the model. (#695)

* [Codegen] Refine readme to prompt users on how to change the model.

Signed-off-by: Yao, Qing <qing.yao@intel.com>

* [Codegen] Add section Required Model.

Signed-off-by: Yao, Qing <qing.yao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yao, Qing <qing.yao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Author: Yao Qing
Date: 2024-08-29 22:17:03 +08:00
Committed by: GitHub
parent cc84847082
commit 814164dc4f
2 changed files with 13 additions and 4 deletions

@@ -32,6 +32,17 @@ Currently we support two ways of deploying CodeGen services with docker compose:
2. Start services using the docker images built from source. See the [Gaudi Guide](./docker/gaudi/README.md) or [Xeon Guide](./docker/xeon/README.md) for more information.
### Required Models
By default, the LLM model is set to the value listed below:
| Environment Variable | Model |
| -------------------- | ------------------------------------------------------------------------------- |
| LLM_MODEL_ID | [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) |
[meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) is a gated model that requires submitting an access request through Hugging Face, so you may want to replace it with an openly available one. Change `LLM_MODEL_ID` to suit your needs, for example [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat) or [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct).
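
For example, a minimal sketch of switching to a non-gated model (assuming the compose files pick `LLM_MODEL_ID` up from your shell environment):

```bash
# Hypothetical override: swap the gated default for an openly available model.
export LLM_MODEL_ID="Qwen/CodeQwen1.5-7B-Chat"
```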
### Set Up Environment Variables
To set up environment variables for deploying CodeGen services, follow these steps:
@@ -55,10 +66,6 @@ To set up environment variables for deploying CodeGen services, follow these steps:
3. Set up other environment variables:
> Note: By default, the [`docker/set_env.sh`](docker/set_env.sh) file will configure your environment
> variables to use [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf). This
> is a gated model that requires submitting an access request through Hugging Face.
```bash
source ./docker/set_env.sh
```
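
If you are overriding the default model, note that `set_env.sh` may assign `LLM_MODEL_ID` itself; a hedged sketch of keeping your override (assuming `docker compose` reads variables from the current shell) is to re-export it after sourcing:

```bash
# Load the defaults first, then override the model so the script's
# assignment (if any) does not clobber your choice.
source ./docker/set_env.sh
export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
```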

@@ -14,7 +14,9 @@
```bash
cd GenAIExamples/CodeGen/kubernetes/manifests/xeon
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
export MODEL_ID="meta-llama/CodeLlama-7b-hf"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codegen.yaml
sed -i "s/meta-llama\/CodeLlama-7b-hf/${MODEL_ID}/g" codegen.yaml
kubectl apply -f codegen.yaml
```
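
After applying the manifest, you can verify the rollout with standard `kubectl` commands; the pod name below is a placeholder, not taken from `codegen.yaml`:

```bash
# Watch the CodeGen pods until they reach Running status.
kubectl get pods -w
# Tail the logs of the LLM serving pod (replace the placeholder with the real name).
kubectl logs -f <codegen-llm-pod-name>
```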