[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
refine script for hardcodes variables and test codes
3 changed files with 1233 additions and 0 deletions
--- a/ChatQnA/chatqna.yaml
+++ b/ChatQnA/chatqna.yaml
@@ -0,0 +1,90 @@
+# Copyright (C) 2025 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+deploy:
+  device: gaudi
+  version: 1.1.0
+  modelUseHostPath: /mnt/models
+  HUGGINGFACEHUB_API_TOKEN: ""
+  node: [1, 2, 4]
+  namespace: "default"
+  cards_per_node: 8
+
+  services:
+    backend:
+      instance_num: [2, 2, 4]
+      cores_per_instance: ""
+      memory_capacity: ""
+
+    teirerank:
+      enabled: True
+      model_id: ""
+      instance_num: [1, 1, 1]
+      cards_per_instance: 1
+
+    tei:
+      model_id: ""
+      instance_num: [1, 2, 4]
+      cores_per_instance: ""
+      memory_capacity: ""
+
+    llm:
+      engine: tgi
+      model_id: ""
+      instance_num: [7, 15, 31]
+      max_batch_size: [1, 2, 4, 8]
+      max_input_length: ""
+      max_total_tokens: ""
+      max_batch_total_tokens: ""
+      max_batch_prefill_tokens: ""
+      cards_per_instance: 1
+
+    data-prep:
+      instance_num: [1, 1, 1]
+      cores_per_instance: ""
+      memory_capacity: ""
+
+    retriever-usvc:
+      instance_num: [2, 2, 4]
+      cores_per_instance: ""
+      memory_capacity: ""
+
+    redis-vector-db:
+      instance_num: [1, 1, 1]
+      cores_per_instance: ""
+      memory_capacity: ""
+
+    chatqna-ui:
+      instance_num: [1, 1, 1]
+
+    nginx:
+      instance_num: [1, 1, 1]
+
+benchmark:
+  # HTTP request behavior fields
+  concurrency:               [1, 2, 4]
+  totoal_query_num:          [2048, 4096]
+  duration:                  [5, 10] # unit: minutes
+  query_num_per_concurrency: [4, 8, 16]
+  possion:                   True
+  possion_arrival_rate:      1.0
+  warmup_iterations:         10
+  seed:                      1024
+
+  # dataset related fields
+  dataset:                   pub_med10 # predefined keywords for supported datasets: [dummy_english, dummy_chinese, pub_med100]
+  user_queries:              [1, 2, 4]
+  query_token_size:          128                   # if specified, queries are sent with this fixed token size
+
+  # advanced settings in each component that impact performance
+  dataprep:                  # not targeted this time
+    chunk_size:              [1024]
+    chunk_overlap:           [1000]
+  retriever:                 # not targeted this time
+    algo:                    IVF
+    fetch_k:                 2
+    k:                       1
+  rerank:
+    top_n:                   2
+  llm:
+    max_token_size:          128   # specify the output token size
--- a/deploy_and_benchmark.py
+++ b/deploy_and_benchmark.py
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,9 @@
+kubernetes
+locust
+numpy
+opea-eval
+pytest
+pyyaml
+requests
+sseclient-py
+transformers
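Note on the `deploy` section of ChatQnA/chatqna.yaml: the list-valued fields appear to line up positionally with `node: [1, 2, 4]`, so the i-th entry of each `instance_num` would apply to the i-th cluster size. That reading is an assumption, not something the diff states, but the numbers are consistent with it: with `cards_per_node: 8`, the `llm` and `teirerank` instances together request exactly the available cards at every cluster size. A minimal sketch of that arithmetic, with the values copied by hand from the YAML and hypothetical helper names (`cards_requested`, `card_budget`):

```python
# Values copied from the `deploy` section of ChatQnA/chatqna.yaml.
nodes = [1, 2, 4]
cards_per_node = 8

# Only the services that declare `cards_per_instance` are counted.
accelerated = {
    "teirerank": {"instance_num": [1, 1, 1], "cards_per_instance": 1},
    "llm": {"instance_num": [7, 15, 31], "cards_per_instance": 1},
}


def cards_requested(services, index):
    """Cards requested by all services for the index-th node count.

    Assumption: list fields are aligned positionally with `node`.
    """
    return sum(
        svc["instance_num"][index] * svc["cards_per_instance"]
        for svc in services.values()
    )


def card_budget(nodes, cards_per_node, services):
    """Return (node_count, requested, available) for each cluster size."""
    return [
        (n, cards_requested(services, i), n * cards_per_node)
        for i, n in enumerate(nodes)
    ]


for n, requested, available in card_budget(nodes, cards_per_node, accelerated):
    status = "ok" if requested <= available else "over budget"
    print(f"{n} node(s): {requested}/{available} cards requested ({status})")
```

Under this reading the configuration requests 8/8, 16/16, and 32/32 cards, so changing one `instance_num` entry without adjusting the others would either leave cards idle or exceed the budget.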
Author	SHA1	Message	Date
pre-commit-ci[bot]	97d277cd1d	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-24 02:30:47 +00:00
letonghan	3f918422c9	refine script for hardcodes variables and test codes Signed-off-by: letonghan <letong.han@intel.com>	2025-01-24 10:30:14 +08:00
letonghan	53e15bfb79	fix merge conflict Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 15:13:19 +08:00
letonghan	bbe649c44c	fix preci issues of variable names conflicts Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 15:12:08 +08:00
pre-commit-ci[bot]	6e26d4615a	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-23 06:44:39 +00:00
letonghan	500fcdb975	fix merge conflicts Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 14:44:09 +08:00
letonghan	4825420f04	Merge branch 'main' of https://github.com/opea-project/GenAIExamples into refactor_benchmark	2025-01-23 14:42:10 +08:00
letonghan	78a1efd7f0	refactor python script into deploy_and_benchmark.py Signed-off-by: letonghan <letong.han@intel.com>	2025-01-23 14:41:11 +08:00
Letong Han	9b9314b062	Merge branch 'main' into refactor_benchmark	2025-01-21 15:06:19 +08:00
pre-commit-ci[bot]	8b85e8c793	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-01-21 07:05:57 +00:00
letonghan	eba1c300b3	Support ChatQnA benchmark pipeline on pubmed dataset. Add file benchmark.py, benchmark.yaml, and benchmark_requirements.txt. Related PR in GenAIEval: https://github.com/opea-project/GenAIEval/pull/228 Signed-off-by: letonghan <letong.han@intel.com>	2025-01-21 15:02:30 +08:00
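Note on the `benchmark` section of ChatQnA/chatqna.yaml: the `possion`, `possion_arrival_rate`, and `seed` fields suggest that requests can be issued as a seeded Poisson arrival process. How deploy_and_benchmark.py actually consumes them is not shown in this diff, so the following is only a sketch of the general technique using the Python standard library. The function name `poisson_arrival_times` is hypothetical, and treating the rate as requests per second is an assumption.

```python
import random


def poisson_arrival_times(rate, count, seed):
    """Return `count` cumulative arrival times for a Poisson process.

    Inter-arrival gaps of a Poisson process with `rate` events per unit
    time are exponentially distributed with mean 1 / rate.
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    now = 0.0
    times = []
    for _ in range(count):
        now += rng.expovariate(rate)  # draw the next inter-arrival gap
        times.append(now)
    return times


# Values taken from the YAML: possion_arrival_rate = 1.0, seed = 1024.
# Assumption: the rate is expressed in requests per second.
arrivals = poisson_arrival_times(rate=1.0, count=5, seed=1024)
print([round(t, 2) for t in arrivals])
```

Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps the schedule reproducible even if other code also draws random numbers.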