[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
refine script for hardcodes variables and test codes
2025-01-24 02:30:47 +00:00 · 2025-01-24 10:30:14 +08:00 · 2025-01-23 15:13:19 +08:00 · 2025-01-23 15:12:08 +08:00 · 2025-01-23 06:44:39 +00:00 · 2025-01-23 14:44:09 +08:00
3 changed files with 1233 additions and 0 deletions
--- a/ChatQnA/chatqna.yaml
+++ b/ChatQnA/chatqna.yaml
@@ -0,0 +1,90 @@
 # Copyright (C) 2025 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 deploy:
  device: gaudi
  version: 1.1.0
  modelUseHostPath: /mnt/models
  HUGGINGFACEHUB_API_TOKEN: ""
  node: [1, 2, 4]
  namespace: "default"
  cards_per_node: 8
  services:
    backend:
      instance_num: [2, 2, 4]
      cores_per_instance: ""
      memory_capacity: ""
    teirerank:
      enabled: True
      model_id: ""
      instance_num: [1, 1, 1]
      cards_per_instance: 1
    tei:
      model_id: ""
      instance_num: [1, 2, 4]
      cores_per_instance: ""
      memory_capacity: ""
    llm:
      engine: tgi
      model_id: ""
      instance_num: [7, 15, 31]
      max_batch_size: [1, 2, 4, 8]
      max_input_length: ""
      max_total_tokens: ""
      max_batch_total_tokens: ""
      max_batch_prefill_tokens: ""
      cards_per_instance: 1
    data-prep:
      instance_num: [1, 1, 1]
      cores_per_instance: ""
      memory_capacity: ""
    retriever-usvc:
      instance_num: [2, 2, 4]
      cores_per_instance: ""
      memory_capacity: ""
    redis-vector-db:
      instance_num: [1, 1, 1]
      cores_per_instance: ""
      memory_capacity: ""
    chatqna-ui:
      instance_num: [1, 1, 1]
    nginx:
      instance_num: [1, 1, 1]
 benchmark:
  # http request behavior related fields
  concurrency:               [1, 2, 4]
  totoal_query_num:          [2048, 4096]
  duration:                  [5, 10] # unit: minutes
  query_num_per_concurrency: [4, 8, 16]
  possion:                   True
  possion_arrival_rate:      1.0
  warmup_iterations:         10
  seed:                      1024
  # dataset related fields
  dataset:                   pub_med10 # [dummy_english, dummy_chinese, pub_med100] predefined keywords for supported datasets
  user_queries:              [1, 2, 4]
  query_token_size:          128                   # if specified, queries are sent with this fixed token size
  # advanced settings in each component that affect performance
  dataprep:                  # not targeted this time
    chunk_size:              [1024]
    chunk_overlap:           [1000]
  retriever:                   # not targeted this time
    algo:                    IVF
    fetch_k:                 2
    k:                       1
  rerank:
    top_n:                   2
  llm:
    max_token_size:          128   # specify the output token size
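Note on the `deploy` section of chatqna.yaml: `node` and every `instance_num` are three-element lists, which suggests that entry i of each `instance_num` goes with entry i of `node`. That pairing is an assumption here, since the body of deploy_and_benchmark.py is not shown in this diff. A minimal sketch of the pairing, with the values copied from the YAML into a plain dict and a hypothetical helper `plan_for_nodes` (not a function from the PR):

```python
# Subset of the `deploy` section of ChatQnA/chatqna.yaml, copied into a dict
# so the sketch needs no third-party packages.
DEPLOY = {
    "node": [1, 2, 4],
    "cards_per_node": 8,
    "services": {
        "backend": {"instance_num": [2, 2, 4]},
        "teirerank": {"instance_num": [1, 1, 1], "cards_per_instance": 1},
        "tei": {"instance_num": [1, 2, 4]},
        "llm": {"instance_num": [7, 15, 31], "cards_per_instance": 1},
    },
}


def plan_for_nodes(deploy):
    """Hypothetical helper: pair node[i] with instance_num[i] of each service.

    The index-wise pairing is an assumed reading of the config, not confirmed
    by the script in this PR.
    """
    services = deploy["services"]
    plans = []
    for i, nodes in enumerate(deploy["node"]):
        replicas = {name: svc["instance_num"][i] for name, svc in services.items()}
        # Only services that declare cards_per_instance consume accelerator cards.
        cards_used = sum(
            svc["instance_num"][i] * svc.get("cards_per_instance", 0)
            for svc in services.values()
        )
        plans.append(
            {
                "nodes": nodes,
                "replicas": replicas,
                "cards_used": cards_used,
                "cards_available": nodes * deploy["cards_per_node"],
            }
        )
    return plans


for plan in plan_for_nodes(DEPLOY):
    print(plan["nodes"], "node(s):", plan["cards_used"], "of", plan["cards_available"], "cards")
```

Under this reading, the `llm` and `teirerank` instances together use exactly `node * cards_per_node` cards in each of the three configurations (7 + 1, 15 + 1, 31 + 1).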
--- a/deploy_and_benchmark.py
+++ b/deploy_and_benchmark.py
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,9 @@
 kubernetes
 locust
 numpy
 opea-eval
 pytest
 pyyaml
 requests
 sseclient-py
 transformers
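Note on the `benchmark` section of chatqna.yaml: it sets `possion: True`, `possion_arrival_rate: 1.0`, and `seed: 1024`. One common way to turn such settings into a request schedule is to draw exponentially distributed gaps between requests, which yields Poisson arrivals. The sketch below uses only the Python standard library; the function name `arrival_offsets` is hypothetical, and treating the rate as requests per second is an assumption, since the diff does not state the unit or show how the benchmark consumes these fields.

```python
import random


def arrival_offsets(num_requests, arrival_rate, seed):
    """Return cumulative send times for a Poisson arrival process.

    Gaps between consecutive requests are drawn from an exponential
    distribution with mean 1 / arrival_rate. The unit (seconds) is assumed.
    """
    rng = random.Random(seed)  # private RNG so the schedule is reproducible
    now = 0.0
    offsets = []
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)
        offsets.append(now)
    return offsets


# Values taken from the benchmark section: arrival rate 1.0, seed 1024.
schedule = arrival_offsets(num_requests=8, arrival_rate=1.0, seed=1024)
for offset in schedule:
    print(f"send at t = {offset:.3f}")
```

Fixing the seed means two runs with the same settings produce the same schedule, which helps when comparing deployments.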
Author	SHA1	Message	Date
pre-commit-ci[bot]	97d277cd1d	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-24 02:30:47 +00:00
letonghan	3f918422c9	refine script for hardcodes variables and test codes Signed-off-by: letonghan <letong.han@intel.com>	2025-01-24 10:30:14 +08:00
letonghan	53e15bfb79	fix merge conflict Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 15:13:19 +08:00
letonghan	bbe649c44c	fix preci issues of variable names conflicts Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 15:12:08 +08:00
pre-commit-ci[bot]	6e26d4615a	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-23 06:44:39 +00:00
letonghan	500fcdb975	fix merge conflicts Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 14:44:09 +08:00
letonghan	4825420f04	Merge branch 'main' of https://github.com/opea-project/GenAIExamples into refactor_benchmark	2025-01-23 14:42:10 +08:00
letonghan	78a1efd7f0	refactor python script into deploy_and_benchmark.py Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 14:41:11 +08:00
Letong Han	9b9314b062	Merge branch 'main' into refactor_benchmark	2025-01-21 15:06:19 +08:00
pre-commit-ci[bot]	8b85e8c793	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-21 07:05:57 +00:00
letonghan	eba1c300b3	Support ChatQnA benchmark pipeline on pubmed dataset. Add file benchmark.py, benchmark.yaml, and benchmark_requirements.txt. Related PR in GenAIEval: https://github.com/opea-project/GenAIEval/pull/228 Signed-off-by: letonghan <letong.han@intel.com>	2025-01-21 15:02:30 +08:00