Compare commits

..

39 Commits

Author SHA1 Message Date
Zhenzhong1
ebd2ab0222 Update tuned_single_gaudi_with_rerank.yaml 2024-11-06 16:00:41 +08:00
Zhenzhong1
2f1f80bbae fixed the issue of cm 2024-10-29 03:03:21 -07:00
Zhenzhong1
5158b5e822 updateto vllm images 2024-10-29 02:44:27 -07:00
Zhenzhong1
1c3f55602a added vllm 2024-10-29 02:13:06 -07:00
Zhenzhong1
bb4c1dbc44 Update configmap.yaml 2024-10-28 19:36:32 +08:00
Zhenzhong1
16018085b0 added some envs 2024-10-25 09:22:36 +03:00
Zhenzhong1
93bbd5131f updated oob manifests 2024-10-24 05:11:23 +03:00
chensuyue
4f32f867ec update cpu core into 80
Signed-off-by: chensuyue <suyue.chen@intel.com>
2024-10-23 14:49:04 +08:00
Zhenzhong1
4f183c2a0d restore README 2024-10-23 14:33:10 +08:00
Zhenzhong1
1046aad26f removed benchmark template 2024-10-23 09:30:03 +03:00
pre-commit-ci[bot]
2876677214 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-22 09:11:46 +00:00
Zhenzhong1
a9536321a0 added the tuned tgi params 2024-10-22 12:11:22 +03:00
Zhenzhong1
24de14e58a fixed the audioqna benchmark path 2024-10-22 11:29:30 +03:00
pre-commit-ci[bot]
065222f29b [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-22 08:19:15 +00:00
Zhenzhong1
3f596d9747 update README 2024-10-22 11:18:49 +03:00
pre-commit-ci[bot]
9da0c09b18 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-22 08:10:20 +00:00
Zhenzhong1
b9c646a2b8 update README 2024-10-22 11:09:50 +03:00
Zhenzhong1
27e9832af4 fixed visualqna issues 2024-10-22 10:40:05 +03:00
Zhenzhong1
f3cbcadfa2 fixed visualqna image issues & tgi params issues 2024-10-22 10:26:44 +03:00
Zhenzhong1
e21ee76f24 updated tgiparams 2024-10-22 09:15:11 +03:00
pre-commit-ci[bot]
8effe7a4eb [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-22 05:38:44 +00:00
Zhenzhong1
0d3876d6fa removed multiple yamls 2024-10-22 08:38:15 +03:00
Zhenzhong1
bb46f5b355 added visual qna & update deployment template 2024-10-22 05:45:00 +03:00
Zhenzhong1
bcaffd7db4 added more cases 2024-10-21 12:21:02 +03:00
Zhenzhong1
124143ea40 removed values.yaml 2024-10-21 12:10:59 +03:00
Zhenzhong1
6dc4bb5c79 refactoered image 2024-10-21 11:54:18 +03:00
pre-commit-ci[bot]
d290bd811f [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-21 08:14:22 +00:00
Zhenzhong Xu
d68ce801e4 refactored AudioQNA 2024-10-21 11:12:29 +03:00
Zhenzhong Xu
048b4e1df9 refactored AudioQNA 2024-10-21 11:06:37 +03:00
Zhenzhong Xu
fdb8a33a6e refactored GaqGen 2024-10-21 10:48:16 +03:00
Zhenzhong Xu
4e1237d410 refactored GaqGen 2024-10-21 10:46:12 +03:00
Zhenzhong Xu
58ff7d9518 moved HUGGINGFACEHUB_API_TOKEN 2024-10-21 10:41:20 +03:00
Zhenzhong Xu
9ee1a7410b rename 2024-10-21 10:31:27 +03:00
Zhenzhong Xu
24166615d7 removed spec 2024-10-21 09:01:00 +03:00
Zhenzhong Xu
a0b2263fd3 updated customize deployment template 2024-10-21 08:49:38 +03:00
Zhenzhong Xu
5c2f3f0301 move image & replicas path 2024-10-21 07:04:31 +03:00
Zhenzhong Xu
a70775d3d6 updated chatqna helmcharts image name 2024-10-21 06:54:27 +03:00
Zhenzhong Xu
3dd5475773 updated chatqna helmcharts 2024-10-21 06:40:46 +03:00
Zhenzhong1
d6b04b3405 benchmark helmcharts (#995)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-21 11:13:24 +08:00
746 changed files with 15731 additions and 30634 deletions

38 .github/CODEOWNERS vendored
View File

@@ -1,23 +1,17 @@
/.github/ suyue.chen@intel.com ze.pan@intel.com
/AgentQnA/ kaokao.lv@intel.com minmin.hou@intel.com
/AudioQnA/ sihan.chen@intel.com wenjiao.yue@intel.com
/AvatarChatbot/ chun.tao@intel.com kaokao.lv@intel.com
/ChatQnA/ liang1.lv@intel.com letong.han@intel.com
/CodeGen/ liang1.lv@intel.com xinyao.wang@intel.com
/CodeTrans/ sihan.chen@intel.com xinyao.wang@intel.com
/DBQnA/ supriya.krishnamurthi@intel.com liang1.lv@intel.com
/AgentQnA/ kaokao.lv@intel.com
/AudioQnA/ sihan.chen@intel.com
/ChatQnA/ liang1.lv@intel.com
/CodeGen/ liang1.lv@intel.com
/CodeTrans/ sihan.chen@intel.com
/DocSum/ letong.han@intel.com
/DocIndexRetriever/ kaokao.lv@intel.com chendi.xue@intel.com
/DocSum/ letong.han@intel.com xinyao.wang@intel.com
/EdgeCraftRAG/ yongbo.zhu@intel.com mingyuan.qi@intel.com
/FaqGen/ yogesh.pandey@intel.com xinyao.wang@intel.com
/GraphRAG/ rita.brugarolas.brufau@intel.com abolfazl.shahbazi@intel.com
/InstructionTuning/ xinyu.ye@intel.com kaokao.lv@intel.com
/MultimodalQnA/ melanie.h.buehler@intel.com tiep.le@intel.com
/ProductivitySuite/ jaswanth.karani@intel.com hoong.tee.yeoh@intel.com
/RerankFinetuning/ xinyu.ye@intel.com kaokao.lv@intel.com
/SearchQnA/ sihan.chen@intel.com letong.han@intel.com
/Text2Image/ wenjiao.yue@intel.com xinyu.ye@intel.com
/Translation/ liang1.lv@intel.com sihan.chen@intel.com
/VideoQnA/ huiling.bao@intel.com xinyao.wang@intel.com
/VisualQnA/ liang1.lv@intel.com sihan.chen@intel.com
/*/ liang1.lv@intel.com feng.tian@intel.com suyue.chen@intel.com
/InstructionTuning xinyu.ye@intel.com
/RerankFinetuning xinyu.ye@intel.com
/MultimodalQnA tiep.le@intel.com
/FaqGen/ xinyao.wang@intel.com
/SearchQnA/ sihan.chen@intel.com
/Translation/ liang1.lv@intel.com
/VisualQnA/ liang1.lv@intel.com
/ProductivitySuite/ hoong.tee.yeoh@intel.com
/VideoQnA huiling.bao@intel.com
/*/ liang1.lv@intel.com

View File

@@ -4,7 +4,6 @@
name: Report Bug
description: Used to report bug
title: "[Bug]"
labels: ["bug"]
body:
- type: dropdown
id: priority

View File

@@ -4,7 +4,6 @@
name: Report Feature
description: Used to report feature
title: "[Feature]"
labels: ["feature"]
body:
- type: dropdown
id: priority

View File

@@ -1,2 +0,0 @@
ModelIn
modelin

View File

@@ -1,2 +1,2 @@
Copyright (C) 2024 Intel Corporation
SPDX-License-Identifier: Apache-2.0
SPDX-License-Identifier: Apache-2.0

View File

@@ -40,11 +40,6 @@ on:
default: "main"
required: false
type: string
inject_commit:
default: false
required: false
type: string
jobs:
####################################################################################################
# Image Build
@@ -77,10 +72,6 @@ jobs:
git clone https://github.com/vllm-project/vllm.git
cd vllm && git rev-parse HEAD && cd ../
fi
if [[ $(grep -c "vllm-gaudi:" ${docker_compose_path}) != 0 ]]; then
git clone https://github.com/HabanaAI/vllm-fork.git
cd vllm-fork && git checkout 3c39626 && cd ../
fi
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps && git checkout ${{ inputs.opea_branch }} && git rev-parse HEAD && cd ../
@@ -92,7 +83,6 @@ jobs:
docker_compose_path: ${{ github.workspace }}/${{ inputs.example }}/docker_image_build/build.yaml
service_list: ${{ inputs.services }}
registry: ${OPEA_IMAGE_REPO}opea
inject_commit: ${{ inputs.inject_commit }}
tag: ${{ inputs.tag }}
####################################################################################################

View File

@@ -14,7 +14,7 @@ on:
test_mode:
required: false
type: string
default: 'compose'
default: 'docker_compose'
outputs:
run_matrix:
description: "The matrix string"
@@ -42,12 +42,6 @@ jobs:
ref: ${{ env.CHECKOUT_REF }}
fetch-depth: 0
- name: Check Dangerous Command Injection
if: github.event_name == 'pull_request' || github.event_name == 'pull_request_target'
uses: opea-project/validation/actions/check-cmd@main
with:
work_dir: ${{ github.workspace }}
- name: Get test matrix
id: get-test-matrix
run: |

View File

@@ -67,6 +67,36 @@ jobs:
make docker.build
make docker.push
- name: Scan gmcmanager
if: ${{ inputs.node == 'gaudi' }}
uses: opea-project/validation/actions/trivy-scan@main
with:
image-ref: ${{ env.DOCKER_REGISTRY }}/gmcmanager:${{ env.VERSION }}
output: gmcmanager-scan.txt
- name: Upload gmcmanager scan result
if: ${{ inputs.node == 'gaudi' }}
uses: actions/upload-artifact@v4.3.4
with:
name: gmcmanager-scan
path: gmcmanager-scan.txt
overwrite: true
- name: Scan gmcrouter
if: ${{ inputs.node == 'gaudi' }}
uses: opea-project/validation/actions/trivy-scan@main
with:
image-ref: ${{ env.DOCKER_REGISTRY }}/gmcrouter:${{ env.VERSION }}
output: gmcrouter-scan.txt
- name: Upload gmcrouter scan result
if: ${{ inputs.node == 'gaudi' }}
uses: actions/upload-artifact@v4.3.4
with:
name: gmcrouter-scan
path: gmcrouter-scan.txt
overwrite: true
- name: Clean up images
if: always()
run: |

View File

@@ -22,72 +22,7 @@ on:
type: string
jobs:
get-test-case:
runs-on: ubuntu-latest
outputs:
test_cases: ${{ steps.test-case-matrix.outputs.test_cases }}
CHECKOUT_REF: ${{ steps.get-checkout-ref.outputs.CHECKOUT_REF }}
steps:
- name: Get checkout ref
id: get-checkout-ref
run: |
if [ "${{ github.event_name }}" == "pull_request" ] || [ "${{ github.event_name }}" == "pull_request_target" ]; then
CHECKOUT_REF=refs/pull/${{ github.event.number }}/merge
else
CHECKOUT_REF=${{ github.ref }}
fi
echo "CHECKOUT_REF=${CHECKOUT_REF}" >> $GITHUB_OUTPUT
echo "checkout ref ${CHECKOUT_REF}"
- name: Checkout Repo
uses: actions/checkout@v4
with:
ref: ${{ steps.get-checkout-ref.outputs.CHECKOUT_REF }}
fetch-depth: 0
- name: Get test matrix
shell: bash
id: test-case-matrix
run: |
example_l=$(echo ${{ inputs.example }} | tr '[:upper:]' '[:lower:]')
cd ${{ github.workspace }}/${{ inputs.example }}/tests
run_test_cases=""
default_test_case=$(find . -type f -name "test_manifest_on_${{ inputs.hardware }}.sh" | cut -d/ -f2)
if [ "$default_test_case" ]; then run_test_cases="$default_test_case"; fi
other_test_cases=$(find . -type f -name "test_manifest_*_on_${{ inputs.hardware }}.sh" | cut -d/ -f2)
echo "default_test_case=$default_test_case"
echo "other_test_cases=$other_test_cases"
if [ "${{ inputs.tag }}" == "ci" ]; then
base_commit=$(curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
"https://api.github.com/repos/opea-project/GenAIExamples/commits?sha=${{ github.event.pull_request.base.ref }}" | jq -r '.[0].sha')
merged_commit=$(git log -1 --format='%H')
changed_files="$(git diff --name-only ${base_commit} ${merged_commit} | grep -vE '${{ inputs.diff_excluded_files }}')" || true
fi
for test_case in $other_test_cases; do
if [ "${{ inputs.tag }}" == "ci" ]; then
flag=${test_case%_on_*}
flag=${flag#test_compose_}
if [[ $(printf '%s\n' "${changed_files[@]}" | grep ${{ inputs.example }} | grep ${flag}) ]]; then
run_test_cases="$run_test_cases $test_case"
fi
else
run_test_cases="$run_test_cases $test_case"
fi
done
test_cases=$(echo $run_test_cases | tr ' ' '\n' | sort -u | jq -R '.' | jq -sc '.')
echo "test_cases=$test_cases"
echo "test_cases=$test_cases" >> $GITHUB_OUTPUT
manifest-test:
needs: [get-test-case]
strategy:
matrix:
test_case: ${{ fromJSON(needs.get-test-case.outputs.test_cases) }}
fail-fast: false
runs-on: "k8s-${{ inputs.hardware }}"
continue-on-error: true
steps:
@@ -110,14 +45,11 @@ jobs:
fetch-depth: 0
- name: Set variables
env:
test_case: ${{ matrix.test_case }}
run: |
echo "IMAGE_REPO=${OPEA_IMAGE_REPO}opea" >> $GITHUB_ENV
echo "IMAGE_TAG=${{ inputs.tag }}" >> $GITHUB_ENV
lower_example=$(echo "${{ inputs.example }}" | tr '[:upper:]' '[:lower:]')
name=$(echo "$test_case" | cut -d/ -f2 | cut -d'_' -f3- |cut -d'_' -f1 | grep -v 'on' | sed 's/^/-/')
echo "NAMESPACE=$lower_example$name-$(tr -dc a-z0-9 </dev/urandom | head -c 16)" >> $GITHUB_ENV
echo "NAMESPACE=$lower_example-$(tr -dc a-z0-9 </dev/urandom | head -c 16)" >> $GITHUB_ENV
echo "ROLLOUT_TIMEOUT_SECONDS=1800s" >> $GITHUB_ENV
echo "KUBECTL_TIMEOUT_SECONDS=60s" >> $GITHUB_ENV
echo "continue_test=true" >> $GITHUB_ENV
@@ -127,19 +59,15 @@ jobs:
- name: Kubectl install
id: install
env:
test_case: ${{ matrix.test_case }}
run: |
set -x
echo "test_case=$test_case"
if [[ ! -f ${{ github.workspace }}/${{ inputs.example }}/tests/${test_case} ]]; then
if [[ ! -f ${{ github.workspace }}/${{ inputs.example }}/tests/test_manifest_on_${{ inputs.hardware }}.sh ]]; then
echo "No test script found, exist test!"
exit 0
else
${{ github.workspace }}/${{ inputs.example }}/tests/${test_case} init_${{ inputs.example }}
${{ github.workspace }}/${{ inputs.example }}/tests/test_manifest_on_${{ inputs.hardware }}.sh init_${{ inputs.example }}
echo "should_cleanup=true" >> $GITHUB_ENV
kubectl create ns $NAMESPACE
${{ github.workspace }}/${{ inputs.example }}/tests/${test_case} install_${{ inputs.example }} $NAMESPACE
${{ github.workspace }}/${{ inputs.example }}/tests/test_manifest_on_${{ inputs.hardware }}.sh install_${{ inputs.example }} $NAMESPACE
echo "Testing ${{ inputs.example }}, waiting for pod ready..."
if kubectl rollout status deployment --namespace "$NAMESPACE" --timeout "$ROLLOUT_TIMEOUT_SECONDS"; then
echo "Testing manifests ${{ inputs.example }}, waiting for pod ready done!"
@@ -154,26 +82,18 @@ jobs:
- name: Validate e2e test
if: always()
env:
test_case: ${{ matrix.test_case }}
run: |
if $skip_validate; then
echo "Skip validate"
else
if ${{ github.workspace }}/${{ inputs.example }}/tests/${test_case} validate_${{ inputs.example }} $NAMESPACE ; then
echo "Validate ${test_case} successful!"
if ${{ github.workspace }}/${{ inputs.example }}/tests/test_manifest_on_${{ inputs.hardware }}.sh validate_${{ inputs.example }} $NAMESPACE ; then
echo "Validate ${{ inputs.example }} successful!"
else
echo "Validate ${test_case} failure!!!"
echo "Check the logs in 'Dump logs when e2e test failed' step!!!"
exit 1
echo "Validate ${{ inputs.example }} failure!!!"
.github/workflows/scripts/k8s-utils.sh dump_all_pod_logs $NAMESPACE
fi
fi
- name: Dump logs when e2e test failed
if: failure()
run: |
.github/workflows/scripts/k8s-utils.sh dump_all_pod_logs $NAMESPACE
- name: Kubectl uninstall
if: always()
run: |

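As a reference for the matrix logic removed above: the `get-test-case` job deduplicated the discovered test scripts and emitted them as a compact JSON array via `sort -u | jq`. A minimal sketch of that transformation, with illustrative file names:

```
# Illustrative input: two discovered scripts, one listed twice
run_test_cases="test_manifest_on_gaudi.sh test_manifest_vllm_on_gaudi.sh test_manifest_on_gaudi.sh"
# Deduplicate and emit a compact JSON array, as the removed get-test-case step did
test_cases=$(echo $run_test_cases | tr ' ' '\n' | sort -u | jq -R '.' | jq -sc '.')
echo "$test_cases"
# -> ["test_manifest_on_gaudi.sh","test_manifest_vllm_on_gaudi.sh"]
```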
View File

@@ -111,17 +111,6 @@ jobs:
ref: ${{ needs.get-test-case.outputs.CHECKOUT_REF }}
fetch-depth: 0
- name: Clean up container before test
shell: bash
run: |
docker ps
cd ${{ github.workspace }}/${{ inputs.example }}
export test_case=${{ matrix.test_case }}
export hardware=${{ inputs.hardware }}
bash ${{ github.workspace }}/.github/workflows/scripts/docker_compose_clean_up.sh "containers"
bash ${{ github.workspace }}/.github/workflows/scripts/docker_compose_clean_up.sh "ports"
docker ps
- name: Run test
shell: bash
env:
@@ -130,11 +119,8 @@ jobs:
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
PINECONE_KEY: ${{ secrets.PINECONE_KEY }}
PINECONE_KEY_LANGCHAIN_TEST: ${{ secrets.PINECONE_KEY_LANGCHAIN_TEST }}
SDK_BASE_URL: ${{ secrets.SDK_BASE_URL }}
SERVING_TOKEN: ${{ secrets.SERVING_TOKEN }}
IMAGE_REPO: ${{ inputs.registry }}
IMAGE_TAG: ${{ inputs.tag }}
opea_branch: "refactor_comps"
example: ${{ inputs.example }}
hardware: ${{ inputs.hardware }}
test_case: ${{ matrix.test_case }}
@@ -143,14 +129,17 @@ jobs:
if [[ "$IMAGE_REPO" == "" ]]; then export IMAGE_REPO="${OPEA_IMAGE_REPO}opea"; fi
if [ -f ${test_case} ]; then timeout 30m bash ${test_case}; else echo "Test script {${test_case}} not found, skip test!"; fi
- name: Clean up container after test
- name: Clean up container
shell: bash
if: cancelled() || failure()
run: |
cd ${{ github.workspace }}/${{ inputs.example }}
export test_case=${{ matrix.test_case }}
export hardware=${{ inputs.hardware }}
bash ${{ github.workspace }}/.github/workflows/scripts/docker_compose_clean_up.sh "containers"
cd ${{ github.workspace }}/${{ inputs.example }}/docker_compose
test_case=${{ matrix.test_case }}
flag=${test_case%_on_*}
flag=${flag#test_}
yaml_file=$(find . -type f -wholename "*${{ inputs.hardware }}/${flag}.yaml")
echo $yaml_file
docker compose -f $yaml_file stop && docker compose -f $yaml_file rm -f || true
docker system prune -f
docker rmi $(docker images --filter reference="*:5000/*/*" -q) || true
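For clarity, a small sketch of how the updated clean-up step derives the compose file from the test script name; the values below are illustrative:

```
# Illustrative mapping from a test script name to its compose yaml
test_case=test_compose_on_gaudi.sh
hardware=gaudi
flag=${test_case%_on_*}   # -> test_compose
flag=${flag#test_}        # -> compose
find . -type f -wholename "*${hardware}/${flag}.yaml"   # matches e.g. ./intel/hpu/gaudi/compose.yaml
```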

View File

@@ -1,35 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Check Online Document Building
permissions: {}
on:
pull_request:
branches: [main]
paths:
- "**.md"
- "**.rst"
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
path: GenAIExamples
- name: Checkout docs
uses: actions/checkout@v4
with:
repository: opea-project/docs
path: docs
- name: Build Online Document
shell: bash
run: |
echo "build online doc"
cd docs
bash scripts/build.sh

View File

@@ -1,31 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Clean up container on manual event
on:
workflow_dispatch:
inputs:
node:
default: "rocm"
description: "Hardware to clean"
required: true
type: string
clean_list:
default: ""
description: "docker command to clean"
required: false
type: string
jobs:
clean:
runs-on: "${{ inputs.node }}"
steps:
- name: Clean up container
run: |
docker ps
if [ "${{ inputs.clean_list }}" ]; then
echo "----------stop and remove containers----------"
docker stop ${{ inputs.clean_list }} && docker rm ${{ inputs.clean_list }}
echo "----------container removed----------"
docker ps
fi

View File

@@ -12,7 +12,7 @@ on:
type: string
examples:
default: "ChatQnA"
description: 'List of examples to test [AgentQnA,AudioQnA,ChatQnA,CodeGen,CodeTrans,DocIndexRetriever,DocSum,FaqGen,InstructionTuning,MultimodalQnA,ProductivitySuite,RerankFinetuning,SearchQnA,Translation,VideoQnA,VisualQnA,AvatarChatbot,Text2Image,WorkflowExecAgent,DBQnA,EdgeCraftRAG,GraphRAG]'
description: 'List of examples to test [AudioQnA,ChatQnA,CodeGen,CodeTrans,DocSum,FaqGen,SearchQnA,Translation]'
required: true
type: string
tag:
@@ -50,11 +50,6 @@ on:
description: 'OPEA branch for image build'
required: false
type: string
inject_commit:
default: false
description: "inject commit to docker images true or false"
required: false
type: string
permissions: read-all
jobs:
@@ -106,5 +101,4 @@ jobs:
test_k8s: ${{ fromJSON(inputs.test_k8s) }}
test_gmc: ${{ fromJSON(inputs.test_gmc) }}
opea_branch: ${{ inputs.opea_branch }}
inject_commit: ${{ inputs.inject_commit }}
secrets: inherit

View File

@@ -1,13 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Freeze OPEA images release tag
name: Freeze OPEA images release tag in readme on manual event
on:
workflow_dispatch:
inputs:
tag:
default: "1.1.0"
default: "latest"
description: "Tag to apply to images"
required: true
type: string
@@ -23,6 +23,10 @@ jobs:
fetch-depth: 0
ref: ${{ github.ref }}
- uses: actions/setup-python@v5
with:
python-version: "3.10"
- name: Set up Git
run: |
git config --global user.name "NeuralChatBot"
@@ -31,10 +35,9 @@ jobs:
- name: Run script
run: |
IFS='.' read -r major minor patch <<< "${{ github.event.inputs.tag }}"
echo "VERSION_MAJOR ${major}" > version.txt
echo "VERSION_MINOR ${minor}" >> version.txt
echo "VERSION_PATCH ${patch}" >> version.txt
find . -name "*.md" | xargs sed -i "s|^docker\ compose|TAG=${{ github.event.inputs.tag }}\ docker\ compose|g"
find . -type f -name "*.yaml" \( -path "*/benchmark/*" -o -path "*/kubernetes/*" \) | xargs sed -i -E 's/(opea\/[A-Za-z0-9\-]*:)latest/\1${{ github.event.inputs.tag }}/g'
find . -type f -name "*.md" \( -path "*/benchmark/*" -o -path "*/kubernetes/*" \) | xargs sed -i -E 's/(opea\/[A-Za-z0-9\-]*:)latest/\1${{ github.event.inputs.tag }}/g'
- name: Commit changes
run: |

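For reference, a minimal sketch of what the substitutions in the "Run script" step do to documentation and manifest lines; the image name, command, and tag below are illustrative:

```
tag=1.1
# Prefix `docker compose` commands in README files with the release tag
echo 'docker compose -f compose.yaml up -d' | sed "s|^docker\ compose|TAG=${tag}\ docker\ compose|g"
# -> TAG=1.1 docker compose -f compose.yaml up -d

# Pin `latest` image references in benchmark/kubernetes yaml and md files
echo 'image: opea/chatqna:latest' | sed -E "s/(opea\/[A-Za-z0-9\-]*:)latest/\1${tag}/g"
# -> image: opea/chatqna:1.1
```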
View File

@@ -12,7 +12,7 @@ on:
type: string
example:
default: "ChatQnA"
description: 'Build images belong to which example? [AgentQnA,AudioQnA,ChatQnA,CodeGen,CodeTrans,DocIndexRetriever,DocSum,FaqGen,InstructionTuning,MultimodalQnA,ProductivitySuite,RerankFinetuning,SearchQnA,Translation,VideoQnA,VisualQnA,AvatarChatbot,Text2Image,WorkflowExecAgent,DBQnA,EdgeCraftRAG,GraphRAG]'
description: 'Build images belong to which example?'
required: true
type: string
services:
@@ -30,12 +30,6 @@ on:
description: 'OPEA branch for image build'
required: false
type: string
inject_commit:
default: false
description: "inject commit to docker images true or false"
required: false
type: string
jobs:
get-test-matrix:
runs-on: ubuntu-latest
@@ -62,5 +56,4 @@ jobs:
services: ${{ inputs.services }}
tag: ${{ inputs.tag }}
opea_branch: ${{ inputs.opea_branch }}
inject_commit: ${{ inputs.inject_commit }}
secrets: inherit

View File

@@ -1,59 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Clean up Local Registry on manual event
on:
workflow_dispatch:
inputs:
nodes:
default: "gaudi,xeon"
description: "Hardware to clean up"
required: true
type: string
env:
EXAMPLES: ${{ vars.NIGHTLY_RELEASE_EXAMPLES }}
jobs:
get-build-matrix:
runs-on: ubuntu-latest
outputs:
examples: ${{ steps.get-matrix.outputs.examples }}
nodes: ${{ steps.get-matrix.outputs.nodes }}
steps:
- name: Create Matrix
id: get-matrix
run: |
examples=($(echo ${EXAMPLES} | tr ',' ' '))
examples_json=$(printf '%s\n' "${examples[@]}" | sort -u | jq -R '.' | jq -sc '.')
echo "examples=$examples_json" >> $GITHUB_OUTPUT
nodes=($(echo ${{ inputs.nodes }} | tr ',' ' '))
nodes_json=$(printf '%s\n' "${nodes[@]}" | sort -u | jq -R '.' | jq -sc '.')
echo "nodes=$nodes_json" >> $GITHUB_OUTPUT
clean-up:
needs: get-build-matrix
strategy:
matrix:
node: ${{ fromJson(needs.get-build-matrix.outputs.nodes) }}
fail-fast: false
runs-on: "docker-build-${{ matrix.node }}"
steps:
- name: Clean Up Local Registry
run: |
echo "Cleaning up local registry on ${{ matrix.node }}"
bash /home/sdp/workspace/fully_registry_cleanup.sh
docker ps | grep registry
build:
needs: [get-build-matrix, clean-up]
strategy:
matrix:
example: ${{ fromJson(needs.get-build-matrix.outputs.examples) }}
node: ${{ fromJson(needs.get-build-matrix.outputs.nodes) }}
fail-fast: false
uses: ./.github/workflows/_example-workflow.yml
with:
node: ${{ matrix.node }}
example: ${{ matrix.example }}
secrets: inherit

View File

@@ -1,71 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Nightly build/publish latest docker images
on:
schedule:
- cron: "30 14 * * *" # UTC time
workflow_dispatch:
env:
EXAMPLES: ${{ vars.NIGHTLY_RELEASE_EXAMPLES }}
TAG: "latest"
PUBLISH_TAGS: "latest"
jobs:
get-build-matrix:
runs-on: ubuntu-latest
outputs:
examples_json: ${{ steps.get-matrix.outputs.examples_json }}
EXAMPLES: ${{ steps.get-matrix.outputs.EXAMPLES }}
TAG: ${{ steps.get-matrix.outputs.TAG }}
PUBLISH_TAGS: ${{ steps.get-matrix.outputs.PUBLISH_TAGS }}
steps:
- name: Create Matrix
id: get-matrix
run: |
examples=($(echo ${EXAMPLES} | tr ',' ' '))
examples_json=$(printf '%s\n' "${examples[@]}" | sort -u | jq -R '.' | jq -sc '.')
echo "examples_json=$examples_json" >> $GITHUB_OUTPUT
echo "EXAMPLES=$EXAMPLES" >> $GITHUB_OUTPUT
echo "TAG=$TAG" >> $GITHUB_OUTPUT
echo "PUBLISH_TAGS=$PUBLISH_TAGS" >> $GITHUB_OUTPUT
build-and-test:
needs: get-build-matrix
strategy:
matrix:
example: ${{ fromJSON(needs.get-build-matrix.outputs.examples_json) }}
fail-fast: false
uses: ./.github/workflows/_example-workflow.yml
with:
node: gaudi
example: ${{ matrix.example }}
test_compose: true
secrets: inherit
get-image-list:
needs: get-build-matrix
uses: ./.github/workflows/_get-image-list.yml
with:
examples: ${{ needs.get-build-matrix.outputs.EXAMPLES }}
publish:
needs: [get-build-matrix, get-image-list, build-and-test]
strategy:
matrix:
image: ${{ fromJSON(needs.get-image-list.outputs.matrix) }}
runs-on: "docker-build-gaudi"
steps:
- uses: docker/login-action@v3.2.0
with:
username: ${{ secrets.DOCKERHUB_USER }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Image Publish
uses: opea-project/validation/actions/image-publish@main
with:
local_image_ref: ${OPEA_IMAGE_REPO}opea/${{ matrix.image }}:${{ needs.get-build-matrix.outputs.TAG }}
image_name: opea/${{ matrix.image }}
publish_tags: ${{ needs.get-build-matrix.outputs.PUBLISH_TAGS }}

View File

@@ -1,40 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Check Duplicated Images
on:
pull_request:
branches: [main, genaicomps_refactor]
types: [opened, reopened, ready_for_review, synchronize]
paths:
- "**/docker_image_build/*.yaml"
- ".github/workflows/pr-check-duplicated-image.yml"
- ".github/workflows/scripts/check_duplicated_image.py"
workflow_dispatch:
# If there is a new commit, the previous jobs will be canceled
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
check-duplicated-image:
runs-on: ubuntu-latest
steps:
- name: Clean Up Working Directory
run: sudo rm -rf ${{github.workspace}}/*
- name: Checkout Repo
uses: actions/checkout@v4
- name: Check all the docker image build files
run: |
pip install PyYAML
cd ${{github.workspace}}
build_files=""
for f in `find . -path "*/docker_image_build/build.yaml"`; do
build_files="$build_files $f"
done
python3 .github/workflows/scripts/check_duplicated_image.py $build_files
shell: bash

View File

@@ -34,11 +34,6 @@ jobs:
- name: Checkout Repo
uses: actions/checkout@v4
- name: Check Dangerous Command Injection
uses: opea-project/validation/actions/check-cmd@main
with:
work_dir: ${{ github.workspace }}
- name: Docker Build
run: |
docker build -f ${{ github.workspace }}/.github/workflows/docker/${{ env.DOCKER_FILE_NAME }}.dockerfile -t ${{ env.REPO_NAME }}:${{ env.REPO_TAG }} .

View File

@@ -2,7 +2,7 @@
# SPDX-License-Identifier: Apache-2.0
name: "Dependency Review"
on: [pull_request_target]
on: [pull_request]
permissions:
contents: read

View File

@@ -4,8 +4,8 @@
name: E2E test with docker compose
on:
pull_request:
branches: ["main", "*rc", "genaicomps_refactor"]
pull_request_target:
branches: ["main", "*rc"]
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- "**/Dockerfile**"

View File

@@ -1,110 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Compose file and dockerfile path checking
on:
pull_request:
branches: [main, genaicomps_refactor]
types: [opened, reopened, ready_for_review, synchronize]
jobs:
check-dockerfile-paths-in-README:
runs-on: ubuntu-latest
steps:
- name: Clean Up Working Directory
run: sudo rm -rf ${{github.workspace}}/*
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
- name: Clone Repo GenAIComps
run: |
cd ..
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps && git checkout refactor_comps
- name: Check for Missing Dockerfile Paths in GenAIComps
run: |
cd ${{github.workspace}}
miss="FALSE"
while IFS=: read -r file line content; do
dockerfile_path=$(echo "$content" | awk -F '-f ' '{print $2}' | awk '{print $1}')
if [[ ! -f "../GenAIComps/${dockerfile_path}" ]]; then
miss="TRUE"
echo "Missing Dockerfile: GenAIComps/${dockerfile_path} (Referenced in GenAIExamples/${file}:${line})"
fi
done < <(grep -Ern 'docker build .* -f comps/.+/Dockerfile' --include='*.md' .)
if [[ "$miss" == "TRUE" ]]; then
exit 1
fi
shell: bash
check-Dockerfile-in-build-yamls:
runs-on: ubuntu-latest
steps:
- name: Clean Up Working Directory
run: sudo rm -rf ${{github.workspace}}/*
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check Dockerfile path included in image build yaml
if: always()
run: |
set -e
shopt -s globstar
no_add="FALSE"
cd ${{github.workspace}}
Dockerfiles=$(realpath $(find ./ -name '*Dockerfile*'))
if [ -n "$Dockerfiles" ]; then
for dockerfile in $Dockerfiles; do
service=$(echo "$dockerfile" | awk -F '/GenAIExamples/' '{print $2}' | awk -F '/' '{print $2}')
cd ${{github.workspace}}/$service/docker_image_build
all_paths=$(realpath $(awk ' /context:/ { context = $2 } /dockerfile:/ { dockerfile = $2; combined = context "/" dockerfile; gsub(/\/+/, "/", combined); if (index(context, ".") > 0) {print combined}}' build.yaml) 2> /dev/null || true )
if ! echo "$all_paths" | grep -q "$dockerfile"; then
echo "AR: Update $dockerfile to GenAIExamples/$service/docker_image_build/build.yaml. The yaml is used for release images build."
no_add="TRUE"
fi
done
fi
if [[ "$no_add" == "TRUE" ]]; then
exit 1
fi
check-image-and-service-names-in-build-yaml:
runs-on: ubuntu-latest
steps:
- name: Clean Up Working Directory
run: sudo rm -rf ${{github.workspace}}/*
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
- name: Check name agreement in build.yaml
run: |
pip install ruamel.yaml
cd ${{github.workspace}}
consistency="TRUE"
build_yamls=$(find . -name 'build.yaml')
for build_yaml in $build_yamls; do
message=$(python3 .github/workflows/scripts/check-name-agreement.py "$build_yaml")
if [[ "$message" != *"consistent"* ]]; then
consistency="FALSE"
echo "Inconsistent service name and image name found in file $build_yaml."
echo "$message"
fi
done
if [[ "$consistency" == "FALSE" ]]; then
echo "Please ensure that the service and image names are consistent in build.yaml, otherwise we cannot guarantee that your image will be published correctly."
exit 1
fi
shell: bash

View File

@@ -12,7 +12,7 @@ on:
- "**/tests/test_gmc**"
- "!**.md"
- "!**.txt"
- "!**/kubernetes/**/manifest/**"
- "!**/kubernetes/**/manifests/**"
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

View File

@@ -10,7 +10,7 @@ on:
paths:
- "**/Dockerfile**"
- "**.py"
- "**/kubernetes/**/manifest/**"
- "**/kubernetes/**/manifests/**"
- "**/tests/test_manifest**"
- "!**.md"
- "!**.txt"

View File

@@ -1,14 +1,47 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
name: Check hyperlinks and relative path validity
name: Check Paths and Hyperlinks
on:
pull_request:
branches: [main, genaicomps_refactor]
branches: [main]
types: [opened, reopened, ready_for_review, synchronize]
jobs:
check-dockerfile-paths:
runs-on: ubuntu-latest
steps:
- name: Clean Up Working Directory
run: sudo rm -rf ${{github.workspace}}/*
- name: Checkout Repo GenAIExamples
uses: actions/checkout@v4
- name: Clone Repo GenAIComps
run: |
cd ..
git clone https://github.com/opea-project/GenAIComps.git
- name: Check for Missing Dockerfile Paths in GenAIComps
run: |
cd ${{github.workspace}}
miss="FALSE"
while IFS=: read -r file line content; do
dockerfile_path=$(echo "$content" | awk -F '-f ' '{print $2}' | awk '{print $1}')
if [[ ! -f "../GenAIComps/${dockerfile_path}" ]]; then
miss="TRUE"
echo "Missing Dockerfile: GenAIComps/${dockerfile_path} (Referenced in GenAIExamples/${file}:${line})"
fi
done < <(grep -Ern 'docker build .* -f comps/.+/Dockerfile' --include='*.md' .)
if [[ "$miss" == "TRUE" ]]; then
exit 1
fi
shell: bash
check-the-validity-of-hyperlinks-in-README:
runs-on: ubuntu-latest
steps:
@@ -28,14 +61,14 @@ jobs:
changed_files="$(git diff --name-status --diff-filter=ARM ${{ github.event.pull_request.base.sha }} ${merged_commit} | awk '/\.md$/ {print $NF}')"
if [ -n "$changed_files" ]; then
for changed_file in $changed_files; do
# echo $changed_file
echo $changed_file
url_lines=$(grep -H -Eo '\]\(http[s]?://[^)]+\)' "$changed_file" | grep -Ev 'GenAIExamples/blob/main') || true
if [ -n "$url_lines" ]; then
for url_line in $url_lines; do
# echo $url_line
echo $url_line
url=$(echo "$url_line"|cut -d '(' -f2 | cut -d ')' -f1|sed 's/\.git$//')
path=$(echo "$url_line"|cut -d':' -f1 | cut -d'/' -f2-)
response=$(curl -L -s -o /dev/null -w "%{http_code}" "$url")|| true
response=$(curl -L -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
response_retry=$(curl -s -o /dev/null -w "%{http_code}" "$url")

View File

@@ -8,9 +8,7 @@ on:
branches: [ 'main' ]
paths:
- "**.py"
- "**Dockerfile*"
- "**docker_image_build/build.yaml"
- "**/ui/**"
- "**Dockerfile"
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-on-push
@@ -20,7 +18,7 @@ jobs:
job1:
uses: ./.github/workflows/_get-test-matrix.yml
with:
test_mode: "docker_image_build"
test_mode: "docker_image_build/build.yaml"
image-build:
needs: job1

View File

@@ -1,46 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import argparse
from ruamel.yaml import YAML
def parse_yaml_file(file_path):
yaml = YAML()
with open(file_path, "r") as file:
data = yaml.load(file)
return data
def check_service_image_consistency(data):
inconsistencies = []
for service_name, service_details in data.get("services", {}).items():
image_name = service_details.get("image", "")
# Extract the image name part after the last '/'
image_name_part = image_name.split("/")[-1].split(":")[0]
# Check if the service name is a substring of the image name part
if service_name not in image_name_part:
# Get the line number of the service name
line_number = service_details.lc.line + 1
inconsistencies.append((service_name, image_name, line_number))
return inconsistencies
def main():
parser = argparse.ArgumentParser(description="Check service name and image name consistency in a YAML file.")
parser.add_argument("file_path", type=str, help="The path to the YAML file.")
args = parser.parse_args()
data = parse_yaml_file(args.file_path)
inconsistencies = check_service_image_consistency(data)
if inconsistencies:
for service_name, image_name, line_number in inconsistencies:
print(f"Service name: {service_name}, Image name: {image_name}, Line number: {line_number}")
else:
print("All consistent")
if __name__ == "__main__":
main()
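A usage sketch for the script above (the build.yaml path is illustrative); the check-name-agreement workflow step calls it this way and greps the output for the word "consistent":

```
# Requires: pip install ruamel.yaml
# Prints "All consistent" when every service name is a substring of its image name,
# otherwise one line per mismatch with the offending line number.
python3 .github/workflows/scripts/check-name-agreement.py ChatQnA/docker_image_build/build.yaml
```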

View File

@@ -1,63 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import argparse
import os.path
import subprocess
import sys
import yaml
images = {}
def check_docker_compose_build_definition(file_path):
with open(file_path, "r") as f:
data = yaml.load(f, Loader=yaml.FullLoader)
for service in data["services"]:
if "build" in data["services"][service] and "image" in data["services"][service]:
bash_command = "echo " + data["services"][service]["image"]
image = (
subprocess.run(["bash", "-c", bash_command], check=True, capture_output=True)
.stdout.decode("utf-8")
.strip()
)
build = data["services"][service]["build"]
context = build.get("context", "")
dockerfile = os.path.normpath(
os.path.join(os.path.dirname(file_path), context, build.get("dockerfile", ""))
)
if not os.path.isfile(dockerfile):
# dockerfile does not exist in the current repo context; assume it's in a 3rd-party context
dockerfile = os.path.normpath(os.path.join(context, build.get("dockerfile", "")))
item = {"file_path": file_path, "service": service, "dockerfile": dockerfile}
if image in images and dockerfile != images[image]["dockerfile"]:
print("ERROR: !!! Found Conflicts !!!")
print(f"Image: {image}, Dockerfile: {dockerfile}, defined in Service: {service}, File: {file_path}")
print(
f"Image: {image}, Dockerfile: {images[image]['dockerfile']}, defined in Service: {images[image]['service']}, File: {images[image]['file_path']}"
)
sys.exit(1)
else:
# print(f"Add Image: {image} Dockerfile: {dockerfile}")
images[image] = item
def parse_arg():
parser = argparse.ArgumentParser(
description="Check for conflicts in image build definition in docker-compose.yml files"
)
parser.add_argument("files", nargs="+", help="list of files to be checked")
return parser.parse_args()
def main():
args = parse_arg()
for file_path in args.files:
check_docker_compose_build_definition(file_path)
print("SUCCESS: No Conlicts Found.")
return 0
if __name__ == "__main__":
main()
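A usage sketch mirroring the pr-check-duplicated-image workflow above; the build.yaml files are discovered rather than hard-coded:

```
pip install PyYAML
build_files=$(find . -path "*/docker_image_build/build.yaml")
# Exits with status 1 and prints "ERROR: !!! Found Conflicts !!!" when the same image
# name is built from two different Dockerfiles; otherwise prints the SUCCESS message.
python3 .github/workflows/scripts/check_duplicated_image.py $build_files
```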

View File

@@ -1,42 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# The test machine is used by several opea projects, so the test scripts can't use `docker compose down` to clean up
# all the containers, ports and networks directly.
# So we need to use the following script to minimize the impact of the clean up.
test_case=${test_case:-"test_compose_on_gaudi.sh"}
hardware=${hardware:-"gaudi"}
flag=${test_case%_on_*}
flag=${flag#test_}
yaml_file=$(find . -type f -wholename "*${hardware}/${flag}.yaml")
echo $yaml_file
case "$1" in
containers)
echo "Stop and remove all containers used by the services in $yaml_file ..."
containers=$(cat $yaml_file | grep container_name | cut -d':' -f2)
for container_name in $containers; do
cid=$(docker ps -aq --filter "name=$container_name")
if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
done
;;
ports)
echo "Release all ports used by the services in $yaml_file ..."
pip install jq yq
ports=$(yq '.services[].ports[] | split(":")[0]' $yaml_file | grep -o '[0-9a-zA-Z_-]\+')
echo "$ports"
for port in $ports; do
if [[ $port =~ [a-zA-Z_-] ]]; then
port=$(grep -E "export $port=" tests/$test_case | cut -d'=' -f2)
fi
echo $port
cid=$(docker ps --filter "publish=${port}" --format "{{.ID}}")
if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
done
;;
*)
echo "Unknown function: $1"
;;
esac
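A minimal invocation sketch, matching how the removed "Clean up container before test" step called this script; the example name and test case are illustrative:

```
cd GenAIExamples/ChatQnA                    # run from the example directory
export test_case=test_compose_on_gaudi.sh   # used to resolve named port variables from tests/
export hardware=gaudi                       # selects the */gaudi/*.yaml compose file
bash ../.github/workflows/scripts/docker_compose_clean_up.sh "containers"
bash ../.github/workflows/scripts/docker_compose_clean_up.sh "ports"
```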

View File

@@ -9,20 +9,12 @@ set -e
changed_files=$changed_files
test_mode=$test_mode
run_matrix="{\"include\":["
hardware_list="xeon gaudi" # current support hardware list
examples=$(printf '%s\n' "${changed_files[@]}" | grep '/' | cut -d'/' -f1 | sort -u)
for example in ${examples}; do
cd $WORKSPACE/$example
if [[ ! $(find . -type f | grep ${test_mode}) ]]; then continue; fi
cd tests
ls -l
if [[ "$test_mode" == "docker_image_build" ]]; then
find_name="test_manifest_on_*.sh"
else
find_name="test_${test_mode}*_on_*.sh"
fi
hardware_list=$(find . -type f -name "${find_name}" | cut -d/ -f2 | cut -d. -f1 | awk -F'_on_' '{print $2}'| sort -u)
echo -e "Test supported hardware list: \n${hardware_list}"
run_hardware=""
if [[ $(printf '%s\n' "${changed_files[@]}" | grep ${example} | cut -d'/' -f2 | grep -E '*.py|Dockerfile*|ui|docker_image_build' ) ]]; then

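For reference, a sketch of the hardware-list discovery added above, run from an example's tests directory with illustrative file names and test_mode=compose:

```
# Given tests/test_compose_on_xeon.sh and tests/test_compose_on_gaudi.sh:
find . -type f -name "test_compose*_on_*.sh" | cut -d/ -f2 | cut -d. -f1 | awk -F'_on_' '{print $2}' | sort -u
# -> gaudi
#    xeon
```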
2 .gitignore vendored
View File

@@ -5,4 +5,4 @@
**/playwright/.cache/
**/test-results/
__pycache__/
__pycache__/

View File

@@ -1 +1 @@
**/kubernetes/
**/kubernetes/

View File

@@ -1,16 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
# To announce the version of the code, please create a version.txt with the following format.
#VERSION_MAJOR 1
#VERSION_MINOR 0
#VERSION_PATCH 0
VERSION_FILE="version.txt"
if [ -f $VERSION_FILE ]; then
VER_OPEA_MAJOR=$(grep "VERSION_MAJOR" $VERSION_FILE | cut -d " " -f 2)
VER_OPEA_MINOR=$(grep "VERSION_MINOR" $VERSION_FILE | cut -d " " -f 2)
VER_OPEA_PATCH=$(grep "VERSION_PATCH" $VERSION_FILE | cut -d " " -f 2)
export TAG=$VER_OPEA_MAJOR.$VER_OPEA_MINOR
echo OPEA Version:$TAG
fi
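An illustrative version.txt matching the format described in the header comments, and the tag the script derives from it:

```
cat > version.txt <<'EOF'
VERSION_MAJOR 1
VERSION_MINOR 1
VERSION_PATCH 0
EOF
# Sourcing the script above would then export TAG=1.1 and print "OPEA Version:1.1"
# (the patch number is read but not used in TAG).
```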

View File

@@ -81,122 +81,72 @@ flowchart LR
3. Hierarchical agent can further improve performance.
Expert worker agents, such as retrieval agent, knowledge graph agent, SQL agent, etc., can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information together to provide a comprehensive answer.
## Deployment with docker
### Roadmap
1. Build agent docker image [Optional]
- v0.9: Worker agent uses open-source websearch tool (duckduckgo), agents use OpenAI GPT-4o-mini as llm backend.
- v1.0: Worker agent uses OPEA retrieval megaservice as tool.
- v1.0 or later: agents use open-source llm backend.
- v1.1 or later: add safeguards
> [!NOTE]
> This step is optional. The docker images will be pulled automatically when running the docker compose commands; this step is only needed if pulling the images fails.
## Getting started
First, clone the opea GenAIComps repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIComps.git
```
Then build the agent docker image. Both the supervisor agent and the worker agent will use the same docker image, but when we launch the two agents we will specify different strategies and register different tools.
```
cd GenAIComps
docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
```
2. Set up environment for this example </br>
First, clone this repo.
1. Build agent docker image </br>
First, clone the opea GenAIComps repo
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
git clone https://github.com/opea-project/GenAIComps.git
```
Second, set up env vars.
Then build the agent docker image. Both the supervisor agent and the worker agent will use the same docker image, but when we launch the two agents we will specify different strategies and register different tools.
```
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# for using open-source llms
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # so that there is no need to re-download every time
# optional: OPENAI_API_KEY if you want to use OpenAI models
export OPENAI_API_KEY=<your-openai-key>
cd GenAIComps
docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
4. Launch other tools. </br>
2. Launch tool services </br>
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
5. Launch agent services</br>
We provide two options for `llm_engine` of the agents: 1. open-source LLMs, 2. OpenAI models via API calls.
Deploy it on Gaudi or Xeon respectively
::::{tab-set}
:::{tab-item} Gaudi
:sync: Gaudi
To use open-source LLMs on Gaudi2, run commands below.
3. Set up environment for this example </br>
First, clone this repo
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
bash launch_agent_service_tgi_gaudi.sh
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```
:::
:::{tab-item} Xeon
:sync: Xeon
To use OpenAI models, run commands below.
Second, set up env vars
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# optional: OPENAI_API_KEY
export OPENAI_API_KEY=<your-openai-key>
```
4. Launch agent services</br>
The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use OpenAI GPT-4o-mini as the LLM, and we plan to add support for llama3.1-70B-instruct (served by TGI-Gaudi) in a subsequent release.
To use the OpenAI LLM, run the command below.
```
cd docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```
:::
::::
## Validate services
First look at logs of the agent docker containers:
```
# worker agent
docker logs rag-agent-endpoint
docker logs docgrader-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
@@ -205,7 +155,7 @@ You should see something like "HTTP server setup successful" if the docker conta
Second, validate worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
@@ -213,11 +163,11 @@ curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: applic
Third, validate supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
## How to register your own tools with agent
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md#5-customize-agent-strategy).

View File

@@ -1,97 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
services:
agent-tgi-server:
image: ${AGENTQNA_TGI_IMAGE}
container_name: agent-tgi-server
ports:
- "${AGENTQNA_TGI_SERVICE_PORT-8085}:80"
volumes:
- /var/opea/agent-service/:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: "http://${HOST_IP}:${AGENTQNA_TGI_SERVICE_PORT}"
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/${AGENTQNA_CARD_ID}:/dev/dri/${AGENTQNA_CARD_ID}
- /dev/dri/${AGENTQNA_RENDER_ID}:/dev/dri/${AGENTQNA_RENDER_ID}
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192
worker-rag-agent:
image: opea/agent-langchain:latest
container_name: rag-agent-endpoint
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9095:9095"
ipc: host
environment:
ip_address: ${ip_address}
strategy: rag_agent_llama
recursion_limit: ${recursion_limit_worker}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
llm_endpoint_url: ${LLM_ENDPOINT_URL}
model: ${LLM_MODEL_ID}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/worker_agent_tools.yaml
require_human_feedback: false
RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-worker-agent-service"
port: 9095
supervisor-react-agent:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- agent-tgi-server
- worker-rag-agent
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "${AGENTQNA_FRONTEND_PORT}:9090"
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_langgraph
recursion_limit: ${recursion_limit_supervisor}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
llm_endpoint_url: ${LLM_ENDPOINT_URL}
model: ${LLM_MODEL_ID}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/supervisor_agent_tools.yaml
require_human_feedback: false
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-supervisor-agent-service"
CRAG_SERVER: $CRAG_SERVER
WORKER_AGENT_URL: $WORKER_AGENT_URL
port: 9090

View File

@@ -1,47 +0,0 @@
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
WORKPATH=$(dirname "$PWD")/..
export ip_address=${host_ip}
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export AGENTQNA_TGI_IMAGE=ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
export AGENTQNA_TGI_SERVICE_PORT="8085"
# LLM related environment variables
export AGENTQNA_CARD_ID="card1"
export AGENTQNA_RENDER_ID="renderD136"
export HF_CACHE_DIR=${HF_CACHE_DIR}
ls $HF_CACHE_DIR
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
#export NUM_SHARDS=4
export LLM_ENDPOINT_URL="http://${ip_address}:${AGENTQNA_TGI_SERVICE_PORT}"
export temperature=0.01
export max_new_tokens=512
# agent related environment variables
export AGENTQNA_WORKER_AGENT_SERVICE_PORT="9095"
export TOOLSET_PATH=/home/huggingface/datamonsters/amd-opea/GenAIExamples/AgentQnA/tools/
echo "TOOLSET_PATH=${TOOLSET_PATH}"
export recursion_limit_worker=12
export recursion_limit_supervisor=10
export WORKER_AGENT_URL="http://${ip_address}:${AGENTQNA_WORKER_AGENT_SERVICE_PORT}/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:18881
export AGENTQNA_FRONTEND_PORT="9090"
#retrieval_tool
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export REDIS_URL="redis://${host_ip}:26379"
export INDEX_NAME="rag-redis"
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete_file"
docker compose -f compose.yaml up -d

View File

@@ -1,46 +0,0 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
WORKPATH=$(dirname "$PWD")/..
export ip_address=${host_ip}
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export AGENTQNA_TGI_IMAGE=ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
export AGENTQNA_TGI_SERVICE_PORT="19001"
# LLM related environment variables
export AGENTQNA_CARD_ID="card1"
export AGENTQNA_RENDER_ID="renderD136"
export HF_CACHE_DIR=${HF_CACHE_DIR}
ls $HF_CACHE_DIR
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export NUM_SHARDS=4
export LLM_ENDPOINT_URL="http://${ip_address}:${AGENTQNA_TGI_SERVICE_PORT}"
export temperature=0.01
export max_new_tokens=512
# agent related environment variables
export AGENTQNA_WORKER_AGENT_SERVICE_PORT="9095"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
export recursion_limit_worker=12
export recursion_limit_supervisor=10
export WORKER_AGENT_URL="http://${ip_address}:${AGENTQNA_WORKER_AGENT_SERVICE_PORT}/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:18881
export AGENTQNA_FRONTEND_PORT="15557"
#retrieval_tool
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export REDIS_URL="redis://${host_ip}:26379"
export INDEX_NAME="rag-redis"
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete_file"

View File

@@ -1,100 +0,0 @@
# Single node on-prem deployment with Docker Compose on Xeon Scalable processors
This example showcases a hierarchical multi-agent system for question-answering applications. We deploy the example on Xeon. For LLMs, we use OpenAI models via API calls. For instructions on using open-source LLMs, please refer to the deployment guide [here](../../../../README.md).
## Deployment with docker
1. First, clone this repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```
2. Set up environment for this example </br>
```
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
#OPENAI_API_KEY if you want to use OpenAI models
export OPENAI_API_KEY=<your-openai-key>
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
4. Launch Tool service
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
5. Launch `Agent` service
The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use OpenAI GPT-4o-mini as the LLM, and llama3.1-70B-instruct (served by TGI-Gaudi) in the Gaudi example. To use the OpenAI LLM, run the command below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```
6. [Optional] Build the `Agent` docker image if pulling images fails.
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/agent-langchain:latest -f comps/agent/langchain/Dockerfile .
```
## Validate services
First look at logs of the agent docker containers:
```
# worker agent
docker logs rag-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
You should see something like "HTTP server setup successful" if the docker containers are started successfully.</p>
Second, validate worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
Third, validate supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
## How to register your own tools with agent
You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).

View File

@@ -2,10 +2,11 @@
# SPDX-License-Identifier: Apache-2.0
services:
worker-rag-agent:
worker-docgrader-agent:
image: opea/agent-langchain:latest
container_name: rag-agent-endpoint
container_name: docgrader-agent-endpoint
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9095:9095"
@@ -35,9 +36,8 @@ services:
supervisor-react-agent:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- worker-rag-agent
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9090:9090"

View File

@@ -1,16 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
export ip_address=$(hostname -I | awk '{print $1}')
export recursion_limit_worker=12
export recursion_limit_supervisor=10
export model="gpt-4o-mini-2024-07-18"
export temperature=0
export max_new_tokens=4096
export max_new_tokens=512
export OPENAI_API_KEY=${OPENAI_API_KEY}
export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"

View File

@@ -1,105 +0,0 @@
# Single node on-prem deployment of AgentQnA on Gaudi
This example showcases a hierarchical multi-agent system for question-answering applications. We deploy the example on Gaudi using open-source LLMs.
For more details, please refer to the deployment guide [here](../../../../README.md).
## Deployment with docker
1. First, clone this repo.
```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```
2. Set up environment for this example </br>
```
# Example: host_ip="192.168.1.1" or export host_ip="External_Public_IP"
export host_ip=$(hostname -I | awk '{print $1}')
# if you are in a proxy environment, also set the proxy-related environment variables
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# for using open-source llms
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
# Example: export HF_CACHE_DIR=$WORKDIR so that there is no need to re-download every time
export HF_CACHE_DIR=<directory-where-llms-are-downloaded>
```
3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
First, launch the mega-service.
```
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
bash launch_retrieval_tool.sh
```
Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
```
bash run_ingest_data.sh
```
4. Launch Tool service
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
5. Launch `Agent` service
To use open-source LLMs on Gaudi2, run the commands below.
```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
bash launch_agent_service_tgi_gaudi.sh
```
6. [Optional] Build the `Agent` Docker image if pulling images fails.
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/agent-langchain:latest -f comps/agent/langchain/Dockerfile .
```
## Validate services
First, look at the logs of the agent Docker containers:
```
# worker agent
docker logs rag-agent-endpoint
```
```
# supervisor agent
docker logs react-agent-endpoint
```
You should see something like "HTTP server setup successful" if the Docker containers started successfully.
Second, validate the worker agent:
```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
Third, validate the supervisor agent:
```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```
## How to register your own tools with the agent
Take a look at the tool YAML and Python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).

View File

@@ -2,9 +2,37 @@
# SPDX-License-Identifier: Apache-2.0
services:
worker-rag-agent:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
container_name: tgi-server
ports:
- "8085:80"
volumes:
- ${HF_CACHE_DIR}:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
PT_HPU_ENABLE_LAZY_COLLECTIVES: true
ENABLE_HPU_GRAPH: true
LIMIT_HPU_GRAPH: true
USE_FLASH_ATTENTION: true
FLASH_ATTENTION_RECOMPUTE: true
runtime: habana
cap_add:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}
worker-docgrader-agent:
image: opea/agent-langchain:latest
container_name: rag-agent-endpoint
container_name: docgrader-agent-endpoint
depends_on:
- tgi-server
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
@@ -13,7 +41,7 @@ services:
ipc: host
environment:
ip_address: ${ip_address}
strategy: rag_agent_llama
strategy: rag_agent
recursion_limit: ${recursion_limit_worker}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
@@ -38,7 +66,8 @@ services:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- worker-rag-agent
- tgi-server
- worker-docgrader-agent
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
@@ -47,7 +76,7 @@ services:
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_llama
strategy: react_langgraph
recursion_limit: ${recursion_limit_supervisor}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}

View File

@@ -1,9 +1,6 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
pushd "../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null
WORKPATH=$(dirname "$PWD")/..
# export WORKDIR=$WORKPATH/../../
echo "WORKDIR=${WORKDIR}"
@@ -18,7 +15,7 @@ export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
export NUM_SHARDS=4
export LLM_ENDPOINT_URL="http://${ip_address}:8085"
export temperature=0.01
export max_new_tokens=4096
export max_new_tokens=512
# agent related environment variables
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
@@ -30,3 +27,17 @@ export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:8080
docker compose -f compose.yaml up -d
sleep 5s
echo "Waiting tgi gaudi ready"
n=0
until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
docker logs tgi-server &> tgi-gaudi-service.log
n=$((n+1))
if grep -q Connected tgi-gaudi-service.log; then
break
fi
sleep 5s
done
sleep 5s
echo "Service started successfully"

View File

@@ -1,25 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# LLM related environment variables
export HF_CACHE_DIR=${HF_CACHE_DIR}
ls $HF_CACHE_DIR
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
export NUM_SHARDS=4
docker compose -f tgi_gaudi.yaml up -d
sleep 5s
echo "Waiting tgi gaudi ready"
n=0
until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
docker logs tgi-server &> tgi-gaudi-service.log
n=$((n+1))
if grep -q Connected tgi-gaudi-service.log; then
break
fi
sleep 5s
done
sleep 5s
echo "Service started successfully"

View File

@@ -1,30 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
services:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-server
ports:
- "8085:80"
volumes:
- ${HF_CACHE_DIR}:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
PT_HPU_ENABLE_LAZY_COLLECTIVES: true
ENABLE_HPU_GRAPH: true
LIMIT_HPU_GRAPH: true
USE_FLASH_ATTENTION: true
FLASH_ATTENTION_RECOMPUTE: true
runtime: habana
cap_add:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}

View File

@@ -2,7 +2,7 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
set -ex
set -e
WORKPATH=$(dirname "$PWD")
export WORKDIR=$WORKPATH/../../
@@ -23,8 +23,8 @@ function start_agent_and_api_server() {
docker run -d --runtime=runc --name=kdd-cup-24-crag-service -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
echo "Starting Agent services"
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
bash launch_agent_service_tgi_rocm.sh
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_agent_service_tgi_gaudi.sh
}
function validate() {
@@ -47,7 +47,7 @@ function validate_agent_service() {
"query": "Tell me about Michael Jackson song thriller"
}')
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
docker logs rag-agent-endpoint
docker logs docgrader-agent-endpoint
if [ "$EXIT_CODE" == "1" ]; then
exit 1
fi

View File

@@ -1,91 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
set -e
WORKPATH=$(dirname "$PWD")
export WORKDIR=$WORKPATH/../../
echo "WORKDIR=${WORKDIR}"
export ip_address=$(hostname -I | awk '{print $1}')
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export HF_CACHE_DIR=$WORKDIR/hf_cache
if [ ! -d "$HF_CACHE_DIR" ]; then
mkdir -p "$HF_CACHE_DIR"
fi
ls $HF_CACHE_DIR
function start_tgi(){
echo "Starting tgi-gaudi server"
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_tgi_gaudi.sh
}
function start_agent_and_api_server() {
echo "Starting CRAG server"
docker run -d --runtime=runc --name=kdd-cup-24-crag-service -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
echo "Starting Agent services"
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
bash launch_agent_service_tgi_gaudi.sh
sleep 10
}
function validate() {
local CONTENT="$1"
local EXPECTED_RESULT="$2"
local SERVICE_NAME="$3"
if echo "$CONTENT" | grep -q "$EXPECTED_RESULT"; then
echo "[ $SERVICE_NAME ] Content is as expected: $CONTENT"
echo 0
else
echo "[ $SERVICE_NAME ] Content does not match the expected result: $CONTENT"
echo 1
fi
}
function validate_agent_service() {
echo "----------------Test agent ----------------"
# local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
# "query": "Tell me about Michael Jackson song thriller"
# }')
export agent_port="9095"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
docker logs rag-agent-endpoint
if [ "$EXIT_CODE" == "1" ]; then
exit 1
fi
# local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
# "query": "Tell me about Michael Jackson song thriller"
# }')
export agent_port="9090"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
docker logs react-agent-endpoint
if [ "$EXIT_CODE" == "1" ]; then
exit 1
fi
}
function main() {
echo "==================== Start TGI ===================="
start_tgi
echo "==================== TGI started ===================="
echo "==================== Start agent ===================="
start_agent_and_api_server
echo "==================== Agent started ===================="
echo "==================== Validate agent service ===================="
validate_agent_service
echo "==================== Agent service validated ===================="
}
main

View File

@@ -1,25 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import os
import requests
def generate_answer_agent_api(url, prompt):
proxies = {"http": ""}
payload = {
"query": prompt,
}
response = requests.post(url, json=payload, proxies=proxies)
answer = response.json()["text"]
return answer
if __name__ == "__main__":
ip_address = os.getenv("ip_address", "localhost")
agent_port = os.getenv("agent_port", "9095")
url = f"http://{ip_address}:{agent_port}/v1/chat/completions"
prompt = "Tell me about Michael Jackson song thriller"
answer = generate_answer_agent_api(url, prompt)
print(answer)

View File

@@ -19,6 +19,7 @@ function stop_crag() {
function stop_agent_docker() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi/
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
@@ -27,21 +28,11 @@ function stop_agent_docker() {
done
}
function stop_tgi(){
cd $WORKPATH/docker_compose/intel/hpu/gaudi/
container_list=$(cat tgi_gaudi.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
echo "Stopping container $container_name"
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
done
}
function stop_retrieval_tool() {
echo "Stopping Retrieval tool"
local RETRIEVAL_TOOL_PATH=$WORKPATH/../DocIndexRetriever
cd $RETRIEVAL_TOOL_PATH/docker_compose/intel/cpu/xeon/
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
@@ -52,26 +43,25 @@ function stop_retrieval_tool() {
echo "workpath: $WORKPATH"
echo "=================== Stop containers ===================="
stop_crag
stop_tgi
stop_agent_docker
stop_retrieval_tool
cd $WORKPATH/tests
echo "=================== #1 Building docker images===================="
bash step1_build_images.sh
bash 1_build_images.sh
echo "=================== #1 Building docker images completed===================="
echo "=================== #2 Start retrieval tool===================="
bash step2_start_retrieval_tool.sh
bash 2_start_retrieval_tool.sh
echo "=================== #2 Retrieval tool started===================="
echo "=================== #3 Ingest data and validate retrieval===================="
bash step3_ingest_data_and_validate_retrieval.sh
bash 3_ingest_data_and_validate_retrieval.sh
echo "=================== #3 Data ingestion and validation completed===================="
echo "=================== #4 Start agent and API server===================="
bash step4_launch_and_validate_agent_tgi.sh
bash 4_launch_and_validate_agent_tgi.sh
echo "=================== #4 Agent test passed ===================="
echo "=================== #5 Stop agent and API server===================="
@@ -80,6 +70,4 @@ stop_agent_docker
stop_retrieval_tool
echo "=================== #5 Agent and API server stopped===================="
echo y | docker system prune
echo "ALL DONE!"

View File

@@ -1,75 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
set -e
WORKPATH=$(dirname "$PWD")
export WORKDIR=$WORKPATH/../../
echo "WORKDIR=${WORKDIR}"
export ip_address=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
function stop_crag() {
cid=$(docker ps -aq --filter "name=kdd-cup-24-crag-service")
echo "Stopping container kdd-cup-24-crag-service with cid $cid"
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
}
function stop_agent_docker() {
cd $WORKPATH/docker_compose/amd/gpu/rocm
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
echo "Stopping container $container_name"
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
done
}
function stop_retrieval_tool() {
echo "Stopping Retrieval tool"
local RETRIEVAL_TOOL_PATH=$WORKPATH/../DocIndexRetriever
cd $RETRIEVAL_TOOL_PATH/docker_compose/intel/cpu/xeon/
# docker compose -f compose.yaml down
container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2)
for container_name in $container_list; do
cid=$(docker ps -aq --filter "name=$container_name")
echo "Stopping container $container_name"
if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
done
}
echo "workpath: $WORKPATH"
echo "=================== Stop containers ===================="
stop_crag
stop_agent_docker
stop_retrieval_tool
cd $WORKPATH/tests
echo "=================== #1 Building docker images===================="
bash step1_build_images.sh
echo "=================== #1 Building docker images completed===================="
echo "=================== #2 Start retrieval tool===================="
bash step2_start_retrieval_tool.sh
echo "=================== #2 Retrieval tool started===================="
echo "=================== #3 Ingest data and validate retrieval===================="
bash step3_ingest_data_and_validate_retrieval.sh
echo "=================== #3 Data ingestion and validation completed===================="
echo "=================== #4 Start agent and API server===================="
bash step4a_launch_and_validate_agent_tgi_on_rocm.sh
echo "=================== #4 Agent test passed ===================="
echo "=================== #5 Stop agent and API server===================="
stop_crag
stop_agent_docker
stop_retrieval_tool
echo "=================== #5 Agent and API server stopped===================="
echo y | docker system prune
echo "ALL DONE!"

View File

@@ -25,7 +25,7 @@ get_billboard_rank_date:
args_schema:
rank:
type: int
description: the rank of interest, for example 1 for top 1
description: song name
date:
type: str
description: date

View File

@@ -12,31 +12,16 @@ def search_knowledge_base(query: str) -> str:
print(url)
proxies = {"http": ""}
payload = {
"messages": query,
"text": query,
}
response = requests.post(url, json=payload, proxies=proxies)
print(response)
if "documents" in response.json():
docs = response.json()["documents"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc
else:
context += "\n" + doc
# print(context)
return context
elif "text" in response.json():
return response.json()["text"]
elif "reranked_docs" in response.json():
docs = response.json()["reranked_docs"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc["text"]
else:
context += "\n" + doc["text"]
# print(context)
return context
else:
return "Error parsing response from the knowledge base."
docs = response.json()["documents"]
context = ""
for i, doc in enumerate(docs):
if i == 0:
context = doc
else:
context += "\n" + doc
print(context)
return context

View File

@@ -16,8 +16,9 @@ RUN useradd -m -s /bin/bash user && \
WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip setuptools && \
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./audioqna.py /home/user/audioqna.py

View File

@@ -18,7 +18,7 @@ WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip setuptools && \
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./audioqna_multilang.py /home/user/audioqna_multilang.py

View File

@@ -1,133 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import asyncio
import os
from comps import MegaServiceEndpoint, MicroService, ServiceOrchestrator, ServiceRoleType, ServiceType
from comps.cores.proto.api_protocol import AudioChatCompletionRequest, ChatCompletionResponse
from comps.cores.proto.docarray import LLMParams
from fastapi import Request
from comps import AudioQnAGateway, MicroService, ServiceOrchestrator, ServiceType
MEGA_SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0")
MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))
WHISPER_SERVER_HOST_IP = os.getenv("WHISPER_SERVER_HOST_IP", "0.0.0.0")
WHISPER_SERVER_PORT = int(os.getenv("WHISPER_SERVER_PORT", 7066))
SPEECHT5_SERVER_HOST_IP = os.getenv("SPEECHT5_SERVER_HOST_IP", "0.0.0.0")
SPEECHT5_SERVER_PORT = int(os.getenv("SPEECHT5_SERVER_PORT", 7055))
LLM_SERVER_HOST_IP = os.getenv("LLM_SERVER_HOST_IP", "0.0.0.0")
LLM_SERVER_PORT = int(os.getenv("LLM_SERVER_PORT", 3006))
def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
if self.services[cur_node].service_type == ServiceType.LLM:
# convert TGI/vLLM to unified OpenAI /v1/chat/completions format
next_inputs = {}
next_inputs["model"] = "tgi" # specifically clarify the fake model to make the format unified
next_inputs["messages"] = [{"role": "user", "content": inputs["asr_result"]}]
next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
next_inputs["top_p"] = llm_parameters_dict["top_p"]
next_inputs["stream"] = inputs["streaming"] # False as default
next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
# next_inputs["presence_penalty"] = inputs["presence_penalty"]
# next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
next_inputs["temperature"] = inputs["temperature"]
inputs = next_inputs
elif self.services[cur_node].service_type == ServiceType.TTS:
next_inputs = {}
next_inputs["text"] = inputs["choices"][0]["message"]["content"]
next_inputs["voice"] = kwargs["voice"]
inputs = next_inputs
return inputs
def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
if self.services[cur_node].service_type == ServiceType.TTS:
new_inputs = {}
new_inputs["text"] = inputs["choices"][0]["text"]
return new_inputs
else:
return inputs
ASR_SERVICE_HOST_IP = os.getenv("ASR_SERVICE_HOST_IP", "0.0.0.0")
ASR_SERVICE_PORT = int(os.getenv("ASR_SERVICE_PORT", 9099))
LLM_SERVICE_HOST_IP = os.getenv("LLM_SERVICE_HOST_IP", "0.0.0.0")
LLM_SERVICE_PORT = int(os.getenv("LLM_SERVICE_PORT", 9000))
TTS_SERVICE_HOST_IP = os.getenv("TTS_SERVICE_HOST_IP", "0.0.0.0")
TTS_SERVICE_PORT = int(os.getenv("TTS_SERVICE_PORT", 9088))
class AudioQnAService:
def __init__(self, host="0.0.0.0", port=8000):
self.host = host
self.port = port
ServiceOrchestrator.align_inputs = align_inputs
self.megaservice = ServiceOrchestrator()
self.endpoint = str(MegaServiceEndpoint.AUDIO_QNA)
def add_remote_service(self):
asr = MicroService(
name="asr",
host=WHISPER_SERVER_HOST_IP,
port=WHISPER_SERVER_PORT,
endpoint="/v1/asr",
host=ASR_SERVICE_HOST_IP,
port=ASR_SERVICE_PORT,
endpoint="/v1/audio/transcriptions",
use_remote_service=True,
service_type=ServiceType.ASR,
)
llm = MicroService(
name="llm",
host=LLM_SERVER_HOST_IP,
port=LLM_SERVER_PORT,
host=LLM_SERVICE_HOST_IP,
port=LLM_SERVICE_PORT,
endpoint="/v1/chat/completions",
use_remote_service=True,
service_type=ServiceType.LLM,
)
tts = MicroService(
name="tts",
host=SPEECHT5_SERVER_HOST_IP,
port=SPEECHT5_SERVER_PORT,
endpoint="/v1/tts",
host=TTS_SERVICE_HOST_IP,
port=TTS_SERVICE_PORT,
endpoint="/v1/audio/speech",
use_remote_service=True,
service_type=ServiceType.TTS,
)
self.megaservice.add(asr).add(llm).add(tts)
self.megaservice.flow_to(asr, llm)
self.megaservice.flow_to(llm, tts)
async def handle_request(self, request: Request):
data = await request.json()
chat_request = AudioChatCompletionRequest.parse_obj(data)
parameters = LLMParams(
# relatively lower max_tokens for audio conversation
max_tokens=chat_request.max_tokens if chat_request.max_tokens else 128,
top_k=chat_request.top_k if chat_request.top_k else 10,
top_p=chat_request.top_p if chat_request.top_p else 0.95,
temperature=chat_request.temperature if chat_request.temperature else 0.01,
frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
streaming=False, # TODO add streaming LLM output as input to TTS
)
result_dict, runtime_graph = await self.megaservice.schedule(
initial_inputs={"audio": chat_request.audio},
llm_parameters=parameters,
voice=chat_request.voice if hasattr(chat_request, "voice") else "default",
)
last_node = runtime_graph.all_leaves()[-1]
response = result_dict[last_node]["tts_result"]
return response
def start(self):
self.service = MicroService(
self.__class__.__name__,
service_role=ServiceRoleType.MEGASERVICE,
host=self.host,
port=self.port,
endpoint=self.endpoint,
input_datatype=AudioChatCompletionRequest,
output_datatype=ChatCompletionResponse,
)
self.service.add_route(self.endpoint, self.handle_request, methods=["POST"])
self.service.start()
self.gateway = AudioQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)
if __name__ == "__main__":
audioqna = AudioQnAService(port=MEGA_SERVICE_PORT)
audioqna = AudioQnAService(host=MEGA_SERVICE_HOST_IP, port=MEGA_SERVICE_PORT)
audioqna.add_remote_service()
audioqna.start()

View File

@@ -1,14 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import asyncio
import base64
import os
from comps import MegaServiceEndpoint, MicroService, ServiceOrchestrator, ServiceRoleType, ServiceType
from comps.cores.proto.api_protocol import AudioChatCompletionRequest, ChatCompletionResponse
from comps.cores.proto.docarray import LLMParams
from fastapi import Request
from comps import AudioQnAGateway, MicroService, ServiceOrchestrator, ServiceType
MEGA_SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0")
MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))
WHISPER_SERVER_HOST_IP = os.getenv("WHISPER_SERVER_HOST_IP", "0.0.0.0")
@@ -20,8 +19,12 @@ LLM_SERVER_PORT = int(os.getenv("LLM_SERVER_PORT", 8888))
def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
if self.services[cur_node].service_type == ServiceType.LLM:
print(inputs)
if self.services[cur_node].service_type == ServiceType.ASR:
# {'byte_str': 'UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA'}
inputs["audio"] = inputs["byte_str"]
del inputs["byte_str"]
elif self.services[cur_node].service_type == ServiceType.LLM:
# convert TGI/vLLM to unified OpenAI /v1/chat/completions format
next_inputs = {}
next_inputs["model"] = "tgi" # specifically clarify the fake model to make the format unified
@@ -57,8 +60,6 @@ class AudioQnAService:
ServiceOrchestrator.align_outputs = align_outputs
self.megaservice = ServiceOrchestrator()
self.endpoint = str(MegaServiceEndpoint.AUDIO_QNA)
def add_remote_service(self):
asr = MicroService(
name="asr",
@@ -89,46 +90,9 @@ class AudioQnAService:
self.megaservice.add(asr).add(llm).add(tts)
self.megaservice.flow_to(asr, llm)
self.megaservice.flow_to(llm, tts)
async def handle_request(self, request: Request):
data = await request.json()
chat_request = AudioChatCompletionRequest.parse_obj(data)
parameters = LLMParams(
# relatively lower max_tokens for audio conversation
max_tokens=chat_request.max_tokens if chat_request.max_tokens else 128,
top_k=chat_request.top_k if chat_request.top_k else 10,
top_p=chat_request.top_p if chat_request.top_p else 0.95,
temperature=chat_request.temperature if chat_request.temperature else 0.01,
frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
streaming=False, # TODO add streaming LLM output as input to TTS
)
result_dict, runtime_graph = await self.megaservice.schedule(
initial_inputs={"audio": chat_request.audio}, llm_parameters=parameters
)
last_node = runtime_graph.all_leaves()[-1]
response = result_dict[last_node]["byte_str"]
return response
def start(self):
self.service = MicroService(
self.__class__.__name__,
service_role=ServiceRoleType.MEGASERVICE,
host=self.host,
port=self.port,
endpoint=self.endpoint,
input_datatype=AudioChatCompletionRequest,
output_datatype=ChatCompletionResponse,
)
self.service.add_route(self.endpoint, self.handle_request, methods=["POST"])
self.service.start()
self.gateway = AudioQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)
if __name__ == "__main__":
audioqna = AudioQnAService(port=MEGA_SERVICE_PORT)
audioqna = AudioQnAService(host=MEGA_SERVICE_HOST_IP, port=MEGA_SERVICE_PORT)
audioqna.add_remote_service()
audioqna.start()

View File

@@ -14,12 +14,12 @@ We evaluate the WER (Word Error Rate) metric of the ASR microservice.
### Launch ASR microservice
Launch the ASR microservice with the following commands. For more details, please refer to [doc](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src/README.md).
Launch the ASR microservice with the following commands. For more details, please refer to [doc](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/whisper/README.md).
```bash
git clone https://github.com/opea-project/GenAIComps
cd GenAIComps
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/Dockerfile .
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/Dockerfile .
# change the name of model by editing model_name_or_path you want to evaluate
docker run -p 7066:7066 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/whisper:latest --model_name_or_path "openai/whisper-tiny"
```
@@ -36,9 +36,9 @@ Evaluate the performance with the LLM:
```py
# validate the offline model
# python offline_eval.py
# python offline_evaluate.py
# validate the online asr microservice accuracy
python online_eval.py
python online_evaluate.py
```
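For reference, WER is the word-level edit distance between the ASR hypothesis and the reference transcript, normalized by the reference length. A minimal illustrative sketch (not part of the evaluation scripts):
```python
# Minimal WER sketch: Levenshtein distance over word tokens, normalized by
# the reference length. Illustrative only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2 errors / 6 words ≈ 0.33
```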
### Performance Result

View File

@@ -2,4 +2,4 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
python online_eval.py
python online_evaluate.py

View File

@@ -1,77 +0,0 @@
# AudioQnA Benchmarking
This folder contains a collection of scripts that enable inference benchmarking by leveraging a comprehensive benchmarking tool, [GenAIEval](https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/README.md), which provides throughput analysis to assess inference performance.
By following this guide, you can run benchmarks on your deployment and share the results with the OPEA community.
## Purpose
We aim to run these benchmarks and share them with the OPEA community for three primary reasons:
- To offer insights on inference throughput in real-world scenarios, helping you choose the best service or deployment for your needs.
- To establish a baseline for validating optimization solutions across different implementations, providing clear guidance on which methods are most effective for your use case.
- To inspire the community to build upon our benchmarks, allowing us to better quantify new solutions in conjunction with current leading LLMs, serving frameworks, etc.
## Metrics
The benchmark will report the following metrics:
- Number of Concurrent Requests
- End-to-End Latency: P50, P90, P99 (in milliseconds)
- End-to-End First Token Latency: P50, P90, P99 (in milliseconds)
- Average Next Token Latency (in milliseconds)
- Average Token Latency (in milliseconds)
- Requests Per Second (RPS)
- Output Tokens Per Second
- Input Tokens Per Second
Results will be displayed in the terminal and saved as a CSV file named `1_stats.csv` for easy export to spreadsheets.
## Getting Started
We recommend using Kubernetes to deploy the AudioQnA service, as it offers benefits such as load balancing and improved scalability. However, you can also deploy the service using Docker if that better suits your needs.
### Prerequisites
- Install Kubernetes by following [this guide](https://github.com/opea-project/docs/blob/main/guide/installation/k8s_install/k8s_install_kubespray.md).
- Every node has direct internet access
- Set up kubectl on the master node with access to the Kubernetes cluster.
- Install Python 3.8+ on the master node for running GenAIEval.
- Ensure all nodes have a local /mnt/models folder, which will be mounted by the pods.
- Ensure that the container's ulimit can meet the number of requests.
```bash
# The way to modify the containerd ulimit:
sudo systemctl edit containerd
# Add two lines:
[Service]
LimitNOFILE=65536:1048576
sudo systemctl daemon-reload; sudo systemctl restart containerd
```
## Test Steps
Please deploy the AudioQnA service before benchmarking.
### Run Benchmark Test
Before running the benchmark, configure the number of test queries and the test output directory:
```bash
export USER_QUERIES="[128, 128, 128, 128]"
export TEST_OUTPUT_DIR="/tmp/benchmark_output"
```
Then run the benchmark:
```bash
bash benchmark.sh -n <node_count>
```
The argument `-n` refers to the number of test nodes.
### Data collection
All test results will be written to the folder `/tmp/benchmark_output`, as configured by the environment variable `TEST_OUTPUT_DIR` in the previous steps.

View File

@@ -1,99 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
deployment_type="k8s"
node_number=1
service_port=8888
query_per_node=128
benchmark_tool_path="$(pwd)/GenAIEval"
usage() {
echo "Usage: $0 [-d deployment_type] [-n node_number] [-i service_ip] [-p service_port]"
echo " -d deployment_type AudioQnA deployment type, select between k8s and docker (default: k8s)"
echo " -n node_number Test node number, required only for k8s deployment_type, (default: 1)"
echo " -i service_ip AudioQnA service ip, required only for docker deployment_type"
echo " -p service_port AudioQnA service port, required only for docker deployment_type, (default: 8888)"
exit 1
}
while getopts ":d:n:i:p:" opt; do
case ${opt} in
d )
deployment_type=$OPTARG
;;
n )
node_number=$OPTARG
;;
i )
service_ip=$OPTARG
;;
p )
service_port=$OPTARG
;;
\? )
echo "Invalid option: -$OPTARG" 1>&2
usage
;;
: )
echo "Invalid option: -$OPTARG requires an argument" 1>&2
usage
;;
esac
done
if [[ "$deployment_type" == "docker" && -z "$service_ip" ]]; then
echo "Error: service_ip is required for docker deployment_type" 1>&2
usage
fi
if [[ "$deployment_type" == "k8s" && ( -n "$service_ip" || -n "$service_port" ) ]]; then
echo "Warning: service_ip and service_port are ignored for k8s deployment_type" 1>&2
fi
function main() {
if [[ ! -d ${benchmark_tool_path} ]]; then
echo "Benchmark tool not found, setting up..."
setup_env
fi
run_benchmark
}
function setup_env() {
git clone https://github.com/opea-project/GenAIEval.git
pushd ${benchmark_tool_path}
python3 -m venv stress_venv
source stress_venv/bin/activate
pip install -r requirements.txt
popd
}
function run_benchmark() {
source ${benchmark_tool_path}/stress_venv/bin/activate
export DEPLOYMENT_TYPE=${deployment_type}
export SERVICE_IP=${service_ip:-"None"}
export SERVICE_PORT=${service_port:-"None"}
if [[ -z $USER_QUERIES ]]; then
user_query=$((query_per_node*node_number))
export USER_QUERIES="[${user_query}, ${user_query}, ${user_query}, ${user_query}]"
echo "USER_QUERIES not configured, setting to: ${USER_QUERIES}."
fi
export WARMUP=$(echo $USER_QUERIES | sed -e 's/[][]//g' -e 's/,.*//')
if [[ -z $WARMUP ]]; then export WARMUP=0; fi
if [[ -z $TEST_OUTPUT_DIR ]]; then
if [[ $DEPLOYMENT_TYPE == "k8s" ]]; then
export TEST_OUTPUT_DIR="${benchmark_tool_path}/evals/benchmark/benchmark_output/node_${node_number}"
else
export TEST_OUTPUT_DIR="${benchmark_tool_path}/evals/benchmark/benchmark_output/docker"
fi
echo "TEST_OUTPUT_DIR not configured, setting to: ${TEST_OUTPUT_DIR}."
fi
envsubst < ./benchmark.yaml > ${benchmark_tool_path}/evals/benchmark/benchmark.yaml
cd ${benchmark_tool_path}/evals/benchmark
python benchmark.py
}
main

View File

@@ -1,52 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
test_suite_config: # Overall configuration settings for the test suite
examples: ["audioqna"] # The specific test cases being tested, e.g., chatqna, codegen, codetrans, faqgen, audioqna, visualqna
deployment_type: "k8s" # Default is "k8s", can also be "docker"
service_ip: None # Leave as None for k8s, specify for Docker
service_port: None # Leave as None for k8s, specify for Docker
warm_ups: 0 # Number of test requests for warm-up
run_time: 60m # The max total run time for the test suite
seed: # The seed for all RNGs
user_queries: [1, 2, 4, 8, 16, 32, 64, 128] # Number of test requests at each concurrency level
query_timeout: 120 # Number of seconds to wait for a simulated user to complete any executing task before exiting. 120 sec by default.
random_prompt: false # Use random prompts if true, fixed prompts if false
collect_service_metric: false # Collect service metrics if true, do not collect service metrics if false
data_visualization: false # Generate data visualization if true, do not generate data visualization if false
llm_model: "Intel/neural-chat-7b-v3-3" # The LLM model used for the test
test_output_dir: "/tmp/benchmark_output" # The directory to store the test output
load_shape: # Tenant concurrency pattern
name: constant # poisson or constant(locust default load shape)
params: # Loadshape-specific parameters
constant: # Constant load shape specific parameters, activate only if load_shape is constant
concurrent_level: 4 # If user_queries is specified, concurrent_level is target number of requests per user. If not, it is the number of simulated users
poisson: # Poisson load shape specific parameters, activate only if load_shape is poisson
arrival-rate: 1.0 # Request arrival rate
namespace: "" # Fill the user-defined namespace. Otherwise, it will be default.
test_cases:
audioqna:
asr:
run_test: true
service_name: "asr-svc" # Replace with your service name
llm:
run_test: true
service_name: "llm-svc" # Replace with your service name
parameters:
model_name: "Intel/neural-chat-7b-v3-3"
max_new_tokens: 128
temperature: 0.01
top_k: 10
top_p: 0.95
repetition_penalty: 1.03
streaming: true
llmserve:
run_test: true
service_name: "llm-svc" # Replace with your service name
tts:
run_test: true
service_name: "tts-svc" # Replace with your service name
e2e:
run_test: true
service_name: "audioqna-backend-server-svc" # Replace with your service name

View File

@@ -1,137 +0,0 @@
# Build Mega Service of AudioQnA on AMD ROCm GPU
This document outlines the deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice
pipeline on a server with an AMD ROCm GPU platform.
## 🚀 Build Docker images
### 1. Source Code install GenAIComps
```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```
### 2. Build ASR Image
```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile .
```
### 3. Build LLM Image
For the ROCm compose example, the AMD-optimized image hosted in the Hugging Face repo will be used for the TGI service: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm (https://github.com/huggingface/text-generation-inference)
### 4. Build TTS Image
```bash
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/src/integrations/dependency/speecht5/Dockerfile .
```
### 5. Build MegaService Docker Image
To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. Build the MegaService Docker image using the command below:
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AudioQnA/
docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
Then run the command `docker images`; you should have the following images ready:
1. `opea/whisper:latest`
2. `opea/speecht5:latest`
3. `opea/audioqna:latest`
## 🚀 Set the environment variables
Before starting the services with `docker compose`, you have to recheck the following environment variables.
```bash
export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=<your HF token>
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna
```
Or use the set_env.sh file to set up the environment variables.
Note: Please replace host_ip with your external IP address; do not use localhost.
Note: In order to limit access to a subset of GPUs, please pass each device individually using one or more device entries of the form /dev/dri/renderD<node>, where <node> is the render node index, starting from 128. (https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus)
Example of setting isolation for 1 GPU
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
Example of setting isolation for 2 GPUs
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD129:/dev/dri/renderD129
Please find more information about accessing and restricting AMD GPUs at this link: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus
## 🚀 Start the MegaService
```bash
cd GenAIExamples/AudioQnA/docker_compose/amd/gpu/rocm/
docker compose up -d
```
In the following cases, you could build the Docker image from source yourself.
- The Docker image failed to download.
- You want to use a specific version of the Docker image.
Please refer to the 'Build Docker images' section above.
## 🚀 Consume the AudioQnA Service
Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the
base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen
to the response, decode the base64 string and save it as a .wav file.
```bash
# voice can be "default" or "male"
curl http://${host_ip}:3008/v1/audioqna \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
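The same round trip can be done from Python. This is a minimal sketch with hypothetical file names; the endpoint matches `BACKEND_SERVICE_ENDPOINT` above:
```python
# Minimal sketch of the AudioQnA round trip: encode a local .wav question to
# base64, post it to the megaservice, and decode the spoken reply to a .wav.
import base64
import os

import requests

host_ip = os.getenv("host_ip", "localhost")
url = f"http://{host_ip}:3008/v1/audioqna"

with open("question.wav", "rb") as f:  # hypothetical input file
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(url, json={"audio": audio_b64, "max_tokens": 64, "voice": "default"}, timeout=300)
resp.raise_for_status()

# The service returns a JSON-quoted base64 string (mirrors the sed + base64 -d
# pipeline in the curl example above).
with open("output.wav", "wb") as f:
    f.write(base64.b64decode(resp.text.strip('"')))
```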
## 🚀 Test MicroServices
```bash
# whisper service
curl http://${host_ip}:7066/v1/asr \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# tgi service
curl http://${host_ip}:3006/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
# speecht5 service
curl http://${host_ip}:7055/v1/tts \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
```

View File

@@ -1,85 +0,0 @@
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
services:
whisper-service:
image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
container_name: whisper-service
ports:
- "7066:7066"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
speecht5-service:
image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
container_name: speecht5-service
ports:
- "7055:7055"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
container_name: tgi-service
ports:
- "3006:80"
volumes:
- "./data:/data"
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/card1:/dev/dri/card1
- /dev/dri/renderD136:/dev/dri/renderD136
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
host_ip: ${host_ip}
healthcheck:
test: ["CMD-SHELL", "curl -f http://$host_ip:3006/health || exit 1"]
interval: 10s
timeout: 10s
retries: 100
command: --model-id ${LLM_MODEL_ID}
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
ipc: host
audioqna-backend-server:
image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
container_name: audioqna-xeon-backend-server
depends_on:
- whisper-service
- tgi-service
- speecht5-service
ports:
- "3008:8888"
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
- WHISPER_SERVER_HOST_IP=${WHISPER_SERVER_HOST_IP}
- WHISPER_SERVER_PORT=${WHISPER_SERVER_PORT}
- LLM_SERVER_HOST_IP=${LLM_SERVER_HOST_IP}
- LLM_SERVER_PORT=${LLM_SERVER_PORT}
- SPEECHT5_SERVER_HOST_IP=${SPEECHT5_SERVER_HOST_IP}
- SPEECHT5_SERVER_PORT=${SPEECHT5_SERVER_PORT}
ipc: host
restart: always
networks:
default:
driver: bridge

View File

@@ -1,24 +0,0 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
# export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')
export host_ip="192.165.1.21"
export HUGGINGFACEHUB_API_TOKEN=${YOUR_HUGGINGFACEHUB_API_TOKEN}
# <token>
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna

View File

@@ -14,20 +14,27 @@ cd GenAIComps
### 2. Build ASR Image
```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile .
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/dependency/Dockerfile .
docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/Dockerfile .
```
### 3. Build LLM Image
The Intel Xeon optimized image hosted in the Hugging Face repo will be used for the TGI service: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu (https://github.com/huggingface/text-generation-inference)
```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
### 4. Build TTS Image
```bash
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/src/integrations/dependency/speecht5/Dockerfile .
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/dependency/Dockerfile .
docker build -t opea/tts:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/Dockerfile .
```
### 5. Build MegaService Docker Image
### 6. Build MegaService Docker Image
To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. Build the MegaService Docker image using the command below:
@@ -40,8 +47,11 @@ docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_p
Then run the command `docker images`; you should have the following images ready:
1. `opea/whisper:latest`
2. `opea/speecht5:latest`
3. `opea/audioqna:latest`
2. `opea/asr:latest`
3. `opea/llm-tgi:latest`
4. `opea/speecht5:latest`
5. `opea/tts:latest`
6. `opea/audioqna:latest`
## 🚀 Set the environment variables
@@ -51,24 +61,22 @@ Before starting the services with `docker compose`, you have to recheck the foll
export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=<your HF token>
export TGI_LLM_ENDPOINT=http://$host_ip:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export ASR_ENDPOINT=http://$host_ip:7066
export TTS_ENDPOINT=http://$host_ip:7055
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
```
Or use the set_env.sh file to set up the environment variables.
Note: Please replace host_ip with your external IP address; do not use localhost.
## 🚀 Start the MegaService
```bash
@@ -85,18 +93,36 @@ curl http://${host_ip}:7066/v1/asr \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# asr microservice
curl http://${host_ip}:3001/v1/audio/transcriptions \
-X POST \
-d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# tgi service
curl http://${host_ip}:3006/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
# llm microservice
curl http://${host_ip}:3007/v1/chat/completions\
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":false}' \
-H 'Content-Type: application/json'
# speecht5 service
curl http://${host_ip}:7055/v1/tts \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
# tts microservice
curl http://${host_ip}:3002/v1/audio/speech \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
```
## 🚀 Test MegaService
@@ -106,9 +132,8 @@ base64 string to the megaservice endpoint. The megaservice will return a spoken
to the response, decode the base64 string and save it as a .wav file.
```bash
# voice can be "default" or "male"
curl http://${host_ip}:3008/v1/audioqna \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```

View File

@@ -13,6 +13,14 @@ services:
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
asr:
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
container_name: asr-service
ports:
- "3001:9099"
ipc: host
environment:
ASR_ENDPOINT: ${ASR_ENDPOINT}
speecht5-service:
image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
container_name: speecht5-service
@@ -24,8 +32,16 @@ services:
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
tts:
image: ${REGISTRY:-opea}/tts:${TAG:-latest}
container_name: tts-service
ports:
- "3002:9088"
ipc: host
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
container_name: tgi-service
ports:
- "3006:80"
@@ -37,20 +53,29 @@ services:
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
host_ip: ${host_ip}
healthcheck:
test: ["CMD-SHELL", "curl -f http://$host_ip:3006/health || exit 1"]
interval: 10s
timeout: 10s
retries: 100
command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0
llm:
image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
container_name: llm-tgi-server
depends_on:
- tgi-service
ports:
- "3007:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
restart: unless-stopped
audioqna-xeon-backend-server:
image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
container_name: audioqna-xeon-backend-server
depends_on:
- whisper-service
- tgi-service
- speecht5-service
- asr
- llm
- tts
ports:
- "3008:8888"
environment:
@@ -58,26 +83,12 @@ services:
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
- WHISPER_SERVER_HOST_IP=${WHISPER_SERVER_HOST_IP}
- WHISPER_SERVER_PORT=${WHISPER_SERVER_PORT}
- LLM_SERVER_HOST_IP=${LLM_SERVER_HOST_IP}
- LLM_SERVER_PORT=${LLM_SERVER_PORT}
- SPEECHT5_SERVER_HOST_IP=${SPEECHT5_SERVER_HOST_IP}
- SPEECHT5_SERVER_PORT=${SPEECHT5_SERVER_PORT}
ipc: host
restart: always
audioqna-xeon-ui-server:
image: ${REGISTRY:-opea}/audioqna-ui:${TAG:-latest}
container_name: audioqna-xeon-ui-server
depends_on:
- audioqna-xeon-backend-server
ports:
- "5173:5173"
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- CHAT_URL=${BACKEND_SERVICE_ENDPOINT}
- ASR_SERVICE_HOST_IP=${ASR_SERVICE_HOST_IP}
- ASR_SERVICE_PORT=${ASR_SERVICE_PORT}
- LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP}
- LLM_SERVICE_PORT=${LLM_SERVICE_PORT}
- TTS_SERVICE_HOST_IP=${TTS_SERVICE_HOST_IP}
- TTS_SERVICE_PORT=${TTS_SERVICE_PORT}
ipc: host
restart: always

View File

@@ -26,7 +26,7 @@ services:
https_proxy: ${https_proxy}
restart: unless-stopped
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
container_name: tgi-service
ports:
- "3006:80"

View File

@@ -1,22 +0,0 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# export host_ip=<your External Public IP>
export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
# <token>
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna

View File

@@ -14,20 +14,27 @@ cd GenAIComps
### 2. Build ASR Image
```bash
docker build -t opea/whisper-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile.intel_hpu .
docker build -t opea/whisper-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/dependency/Dockerfile.intel_hpu .
docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/Dockerfile .
```
### 3. Build LLM Image
The Intel Gaudi optimized image hosted in the Hugging Face repo will be used for the TGI service: ghcr.io/huggingface/tgi-gaudi:2.0.6 (https://github.com/huggingface/tgi-gaudi)
```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
### 4. Build TTS Image
```bash
docker build -t opea/speecht5-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/src/integrations/dependency/speecht5/Dockerfile.intel_hpu .
docker build -t opea/speecht5-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/dependency/Dockerfile.intel_hpu .
docker build -t opea/tts:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/Dockerfile .
```
### 5. Build MegaService Docker Image
### 6. Build MegaService Docker Image
To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. Build the MegaService Docker image using the command below:
@@ -40,8 +47,11 @@ docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_p
Then run the command `docker images`; you should have the following images ready:
1. `opea/whisper-gaudi:latest`
2. `opea/speecht5-gaudi:latest`
3. `opea/audioqna:latest`
2. `opea/asr:latest`
3. `opea/llm-tgi:latest`
4. `opea/speecht5-gaudi:latest`
5. `opea/tts:latest`
6. `opea/audioqna:latest`
## 🚀 Set the environment variables
@@ -51,18 +61,20 @@ Before starting the services with `docker compose`, you have to recheck the foll
export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=<your HF token>
export TGI_LLM_ENDPOINT=http://$host_ip:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export ASR_ENDPOINT=http://$host_ip:7066
export TTS_ENDPOINT=http://$host_ip:7055
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
```
## 🚀 Start the MegaService
@@ -83,18 +95,36 @@ curl http://${host_ip}:7066/v1/asr \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# asr microservice
curl http://${host_ip}:3001/v1/audio/transcriptions \
-X POST \
-d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'
# tgi service
curl http://${host_ip}:3006/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
# llm microservice
curl http://${host_ip}:3007/v1/chat/completions\
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":false}' \
-H 'Content-Type: application/json'
# speecht5 service
curl http://${host_ip}:7055/v1/tts \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
# tts microservice
curl http://${host_ip}:3002/v1/audio/speech \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'
```
## 🚀 Test MegaService
@@ -104,9 +134,8 @@ base64 string to the megaservice endpoint. The megaservice will return a spoken
to the response, decode the base64 string and save it as a .wav file.
```bash
# voice can be "default" or "male"
curl http://${host_ip}:3008/v1/audioqna \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
-H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
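To confirm the decoded response is a valid audio payload, inspect the file type (a minimal check; a RIFF/WAVE header indicates a well-formed result):
```bash
# A RIFF (WAVE) header indicates the megaservice returned playable audio
file output.wav
```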


@@ -18,6 +18,14 @@ services:
cap_add:
- SYS_NICE
restart: unless-stopped
asr:
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
container_name: asr-service
ports:
- "3001:9099"
ipc: host
environment:
ASR_ENDPOINT: ${ASR_ENDPOINT}
speecht5-service:
image: ${REGISTRY:-opea}/speecht5-gaudi:${TAG:-latest}
container_name: speecht5-service
@@ -34,8 +42,16 @@ services:
cap_add:
- SYS_NICE
restart: unless-stopped
tts:
image: ${REGISTRY:-opea}/tts:${TAG:-latest}
container_name: tts-service
ports:
- "3002:9088"
ipc: host
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
container_name: tgi-gaudi-server
ports:
- "3006:80"
@@ -58,19 +74,29 @@ services:
cap_add:
- SYS_NICE
ipc: host
healthcheck:
test: ["CMD-SHELL", "sleep 500 && exit 0"]
interval: 1s
timeout: 505s
retries: 1
command: --model-id ${LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
llm:
image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
container_name: llm-tgi-gaudi-server
depends_on:
- tgi-service
ports:
- "3007:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
restart: unless-stopped
audioqna-gaudi-backend-server:
image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
container_name: audioqna-gaudi-backend-server
depends_on:
- whisper-service
- tgi-service
- speecht5-service
- asr
- llm
- tts
ports:
- "3008:8888"
environment:
@@ -78,26 +104,12 @@ services:
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
- WHISPER_SERVER_HOST_IP=${WHISPER_SERVER_HOST_IP}
- WHISPER_SERVER_PORT=${WHISPER_SERVER_PORT}
- LLM_SERVER_HOST_IP=${LLM_SERVER_HOST_IP}
- LLM_SERVER_PORT=${LLM_SERVER_PORT}
- SPEECHT5_SERVER_HOST_IP=${SPEECHT5_SERVER_HOST_IP}
- SPEECHT5_SERVER_PORT=${SPEECHT5_SERVER_PORT}
ipc: host
restart: always
audioqna-gaudi-ui-server:
image: ${REGISTRY:-opea}/audioqna-ui:${TAG:-latest}
container_name: audioqna-gaudi-ui-server
depends_on:
- audioqna-gaudi-backend-server
ports:
- "5173:5173"
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- CHAT_URL=${BACKEND_SERVICE_ENDPOINT}
- ASR_SERVICE_HOST_IP=${ASR_SERVICE_HOST_IP}
- ASR_SERVICE_PORT=${ASR_SERVICE_PORT}
- LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP}
- LLM_SERVICE_PORT=${LLM_SERVICE_PORT}
- TTS_SERVICE_HOST_IP=${TTS_SERVICE_HOST_IP}
- TTS_SERVICE_PORT=${TTS_SERVICE_PORT}
ipc: host
restart: always


@@ -1,22 +0,0 @@
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# export host_ip=<your External Public IP>
export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
# <token>
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export MEGA_SERVICE_HOST_IP=${host_ip}
export WHISPER_SERVER_HOST_IP=${host_ip}
export SPEECHT5_SERVER_HOST_IP=${host_ip}
export LLM_SERVER_HOST_IP=${host_ip}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna


@@ -11,63 +11,51 @@ services:
context: ../
dockerfile: ./Dockerfile
image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
audioqna-ui:
build:
context: ../ui
dockerfile: ./docker/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/audioqna-ui:${TAG:-latest}
audioqna-multilang:
build:
context: ../
dockerfile: ./Dockerfile.multilang
extends: audioqna
image: ${REGISTRY:-opea}/audioqna-multilang:${TAG:-latest}
whisper-gaudi:
build:
context: GenAIComps
dockerfile: comps/asr/src/integrations/dependency/whisper/Dockerfile.intel_hpu
dockerfile: comps/asr/whisper/dependency/Dockerfile.intel_hpu
extends: audioqna
image: ${REGISTRY:-opea}/whisper-gaudi:${TAG:-latest}
whisper:
build:
context: GenAIComps
dockerfile: comps/asr/src/integrations/dependency/whisper/Dockerfile
dockerfile: comps/asr/whisper/dependency/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
asr:
build:
context: GenAIComps
dockerfile: comps/asr/src/Dockerfile
dockerfile: comps/asr/whisper/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
llm-tgi:
build:
context: GenAIComps
dockerfile: comps/llms/src/text-generation/Dockerfile
dockerfile: comps/llms/text-generation/tgi/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
speecht5-gaudi:
build:
context: GenAIComps
dockerfile: comps/tts/src/integrations/dependency/speecht5/Dockerfile.intel_hpu
dockerfile: comps/tts/speecht5/dependency/Dockerfile.intel_hpu
extends: audioqna
image: ${REGISTRY:-opea}/speecht5-gaudi:${TAG:-latest}
speecht5:
build:
context: GenAIComps
dockerfile: comps/tts/src/integrations/dependency/speecht5/Dockerfile
dockerfile: comps/tts/speecht5/dependency/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
tts:
build:
context: GenAIComps
dockerfile: comps/tts/src/Dockerfile
dockerfile: comps/tts/speecht5/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/tts:${TAG:-latest}
gpt-sovits:
build:
context: GenAIComps
dockerfile: comps/tts/src/integrations/dependency/gpt-sovits/Dockerfile
dockerfile: comps/tts/gpt-sovits/Dockerfile
extends: audioqna
image: ${REGISTRY:-opea}/gpt-sovits:${TAG:-latest}


@@ -7,14 +7,14 @@
## Deploy On Xeon
```
cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifest
cd GenAIExamples/AudioQnA/kubernetes/intel/cpu/xeon/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml
```
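After applying the manifest, the rollout can be verified before sending requests (a minimal check, assuming the `default` namespace used by the manifest):
```bash
# All AudioQnA pods should eventually report Running and Ready
kubectl get pods -n default
kubectl get svc -n default
```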
## Deploy On Gaudi
```
cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifest
cd GenAIExamples/AudioQnA/kubernetes/intel/hpu/gaudi/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" audioqna.yaml
kubectl apply -f audioqna.yaml


@@ -25,7 +25,7 @@ The AudioQnA uses the below prebuilt images if you choose a Xeon deployment
Should you desire to use the Gaudi accelerator, alternate images are used for the ASR (whisper), TTS (speecht5), and LLM (TGI) services.
For Gaudi:
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.6
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.5
- whisper-gaudi: opea/whisper-gaudi:latest
- speecht5-gaudi: opea/speecht5-gaudi:latest


@@ -7,17 +7,69 @@ metadata:
name: audio-qna-config
namespace: default
data:
ASR_ENDPOINT: http://whisper-svc.default.svc.cluster.local:7066
TTS_ENDPOINT: http://speecht5-svc.default.svc.cluster.local:7055
LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
TGI_LLM_ENDPOINT: http://llm-dependency-svc.default.svc.cluster.local:3006
MEGA_SERVICE_HOST_IP: audioqna-backend-server-svc
ASR_SERVICE_HOST_IP: asr-svc
ASR_SERVICE_PORT: "3001"
LLM_SERVICE_HOST_IP: llm-svc
LLM_SERVICE_PORT: "3007"
TTS_SERVICE_HOST_IP: tts-svc
TTS_SERVICE_PORT: "3002"
WHISPER_SERVER_HOST_IP: whisper-svc
WHISPER_SERVER_PORT: 7066
SPEECHT5_SERVER_HOST_IP: speecht5-svc
SPEECHT5_SERVER_PORT: 7055
LLM_SERVER_HOST_IP: llm-svc
LLM_SERVER_PORT: 3006
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: asr-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: asr-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: asr-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: asr-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/asr:latest
imagePullPolicy: IfNotPresent
name: asr-deploy
args: null
ports:
- containerPort: 9099
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: asr-svc
spec:
type: ClusterIP
selector:
app: asr-deploy
ports:
- name: service
port: 3001
targetPort: 9099
---
apiVersion: apps/v1
@@ -70,6 +122,57 @@ spec:
port: 7066
targetPort: 7066
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: tts-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: tts-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: tts-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: tts-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/tts:latest
imagePullPolicy: IfNotPresent
name: tts-deploy
args: null
ports:
- containerPort: 9088
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: tts-svc
spec:
type: ClusterIP
selector:
app: tts-deploy
ports:
- name: service
port: 3002
targetPort: 9088
---
apiVersion: apps/v1
kind: Deployment
@@ -144,7 +247,7 @@ spec:
- envFrom:
- configMapRef:
name: audio-qna-config
image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
name: llm-dependency-deploy-demo
securityContext:
capabilities:
@@ -188,6 +291,57 @@ spec:
port: 3006
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: llm-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: llm-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: llm-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: llm-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/llm-tgi:latest
imagePullPolicy: IfNotPresent
name: llm-deploy
args: null
ports:
- containerPort: 9000
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: llm-svc
spec:
type: ClusterIP
selector:
app: llm-deploy
ports:
- name: service
port: 3007
targetPort: 9000
---
apiVersion: apps/v1
kind: Deployment


@@ -7,17 +7,69 @@ metadata:
name: audio-qna-config
namespace: default
data:
ASR_ENDPOINT: http://whisper-svc.default.svc.cluster.local:7066
TTS_ENDPOINT: http://speecht5-svc.default.svc.cluster.local:7055
LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
TGI_LLM_ENDPOINT: http://llm-dependency-svc.default.svc.cluster.local:3006
MEGA_SERVICE_HOST_IP: audioqna-backend-server-svc
ASR_SERVICE_HOST_IP: asr-svc
ASR_SERVICE_PORT: "3001"
LLM_SERVICE_HOST_IP: llm-svc
LLM_SERVICE_PORT: "3007"
TTS_SERVICE_HOST_IP: tts-svc
TTS_SERVICE_PORT: "3002"
WHISPER_SERVER_HOST_IP: whisper-svc
WHISPER_SERVER_PORT: 7066
SPEECHT5_SERVER_HOST_IP: speecht5-svc
SPEECHT5_SERVER_PORT: 7055
LLM_SERVER_HOST_IP: llm-svc
LLM_SERVER_PORT: 3006
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: asr-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: asr-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: asr-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: asr-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/asr:latest
imagePullPolicy: IfNotPresent
name: asr-deploy
args: null
ports:
- containerPort: 9099
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: asr-svc
spec:
type: ClusterIP
selector:
app: asr-deploy
ports:
- name: service
port: 3001
targetPort: 9099
---
apiVersion: apps/v1
@@ -82,6 +134,57 @@ spec:
port: 7066
targetPort: 7066
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: tts-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: tts-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: tts-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: tts-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/tts:latest
imagePullPolicy: IfNotPresent
name: tts-deploy
args: null
ports:
- containerPort: 9088
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: tts-svc
spec:
type: ClusterIP
selector:
app: tts-deploy
ports:
- name: service
port: 3002
targetPort: 9088
---
apiVersion: apps/v1
kind: Deployment
@@ -168,7 +271,7 @@ spec:
- envFrom:
- configMapRef:
name: audio-qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
name: llm-dependency-deploy-demo
securityContext:
capabilities:
@@ -240,6 +343,57 @@ spec:
port: 3006
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: llm-deploy
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: llm-deploy
template:
metadata:
annotations:
sidecar.istio.io/rewriteAppHTTPProbers: 'true'
labels:
app: llm-deploy
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: llm-deploy
hostIPC: true
containers:
- envFrom:
- configMapRef:
name: audio-qna-config
image: opea/llm-tgi:latest
imagePullPolicy: IfNotPresent
name: llm-deploy
args: null
ports:
- containerPort: 9000
serviceAccountName: default
---
kind: Service
apiVersion: v1
metadata:
name: llm-svc
spec:
type: ClusterIP
selector:
app: llm-deploy
ports:
- name: service
port: 3007
targetPort: 9000
---
apiVersion: apps/v1
kind: Deployment


@@ -2,7 +2,7 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
set -xe
set -e
IMAGE_REPO=${IMAGE_REPO:-"opea"}
IMAGE_TAG=${IMAGE_TAG:-"latest"}
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
@@ -19,48 +19,71 @@ function build_docker_images() {
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="audioqna audioqna-ui whisper-gaudi speecht5-gaudi"
service_list="audioqna whisper-gaudi asr llm-tgi speecht5-gaudi tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker images && sleep 1s
}
function start_services() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export TGI_LLM_ENDPOINT=http://$ip_address:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export ASR_ENDPOINT=http://$ip_address:7066
export TTS_ENDPOINT=http://$ip_address:7055
export MEGA_SERVICE_HOST_IP=${ip_address}
export WHISPER_SERVER_HOST_IP=${ip_address}
export SPEECHT5_SERVER_HOST_IP=${ip_address}
export LLM_SERVER_HOST_IP=${ip_address}
export ASR_SERVICE_HOST_IP=${ip_address}
export TTS_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
export BACKEND_SERVICE_ENDPOINT=http://${ip_address}:3008/v1/audioqna
# sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
# Start Docker Containers
docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
sleep 20s
n=0
until [[ "$n" -ge 100 ]]; do
docker logs tgi-gaudi-server > $LOG_PATH/tgi_service_start.log
if grep -q Connected $LOG_PATH/tgi_service_start.log; then
break
fi
sleep 5s
n=$((n+1))
done
n=0
until [[ "$n" -ge 100 ]]; do
docker logs whisper-service > $LOG_PATH/whisper_service_start.log
if grep -q "Uvicorn server setup on port" $LOG_PATH/whisper_service_start.log; then
break
fi
sleep 5s
n=$((n+1))
done
}
function validate_megaservice() {
response=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
# always print the log
docker logs whisper-service > $LOG_PATH/whisper-service.log
docker logs speecht5-service > $LOG_PATH/tts-service.log
docker logs tgi-gaudi-server > $LOG_PATH/tgi-gaudi-server.log
docker logs audioqna-gaudi-backend-server > $LOG_PATH/audioqna-gaudi-backend-server.log
echo "$response" | sed 's/^"//;s/"$//' | base64 -d > speech.mp3
if [[ $(file speech.mp3) == *"RIFF"* ]]; then
result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
echo "result is === $result"
if [[ $result == *"AAA"* ]]; then
echo "Result correct."
else
docker logs whisper-service > $LOG_PATH/whisper-service.log
docker logs asr-service > $LOG_PATH/asr-service.log
docker logs speecht5-service > $LOG_PATH/tts-service.log
docker logs tts-service > $LOG_PATH/tts-service.log
docker logs tgi-gaudi-server > $LOG_PATH/tgi-gaudi-server.log
docker logs llm-tgi-gaudi-server > $LOG_PATH/llm-tgi-gaudi-server.log
echo "Result wrong."
exit 1
fi
@@ -77,7 +100,7 @@ function validate_megaservice() {
#
# sed -i "s/localhost/$ip_address/g" playwright.config.ts
#
## conda install -c conda-forge nodejs=22.6.0 -y
## conda install -c conda-forge nodejs -y
# npm install && npm ci && npx playwright install --with-deps
# node -v && npm -v && pip list
#
@@ -103,6 +126,7 @@ function main() {
if [[ "$IMAGE_REPO" == "opea" ]]; then build_docker_images; fi
start_services
# validate_microservices
validate_megaservice
# validate_frontend


@@ -1,116 +0,0 @@
#!/bin/bash
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0
set -xe
IMAGE_REPO=${IMAGE_REPO:-"opea"}
IMAGE_TAG=${IMAGE_TAG:-"latest"}
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
export REGISTRY=${IMAGE_REPO}
export TAG=${IMAGE_TAG}
WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')
export PATH="~/miniconda3/bin:$PATH"
function build_docker_images() {
cd $WORKPATH/docker_image_build
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="audioqna audioqna-ui whisper speecht5"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
echo "docker pull ghcr.io/huggingface/text-generation-inference:2.3.1-rocm"
docker pull ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
docker images && sleep 1s
}
function start_services() {
cd $WORKPATH/docker_compose/amd/gpu/rocm/
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export MEGA_SERVICE_HOST_IP=${ip_address}
export WHISPER_SERVER_HOST_IP=${ip_address}
export SPEECHT5_SERVER_HOST_IP=${ip_address}
export LLM_SERVER_HOST_IP=${ip_address}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${ip_address}:3008/v1/audioqna
# sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
# Start Docker Containers
docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
sleep 24s
}
function validate_megaservice() {
response=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
# always print the log
docker logs whisper-service > $LOG_PATH/whisper-service.log
docker logs speecht5-service > $LOG_PATH/tts-service.log
docker logs tgi-service > $LOG_PATH/tgi-service.log
docker logs audioqna-xeon-backend-server > $LOG_PATH/audioqna-xeon-backend-server.log
echo "$response" | sed 's/^"//;s/"$//' | base64 -d > speech.mp3
if [[ $(file speech.mp3) == *"RIFF"* ]]; then
echo "Result correct."
else
echo "Result wrong."
exit 1
fi
}
#function validate_frontend() {
# Frontend tests are currently disabled
# cd $WORKPATH/ui/svelte
# local conda_env_name="OPEA_e2e"
# export PATH=${HOME}/miniforge3/bin/:$PATH
## conda remove -n ${conda_env_name} --all -y
## conda create -n ${conda_env_name} python=3.12 -y
# source activate ${conda_env_name}
#
# sed -i "s/localhost/$ip_address/g" playwright.config.ts
#
## conda install -c conda-forge nodejs -y
# npm install && npm ci && npx playwright install --with-deps
# node -v && npm -v && pip list
#
# exit_status=0
# npx playwright test || exit_status=$?
#
# if [ $exit_status -ne 0 ]; then
# echo "[TEST INFO]: ---------frontend test failed---------"
# exit $exit_status
# else
# echo "[TEST INFO]: ---------frontend test passed---------"
# fi
#}
function stop_docker() {
cd $WORKPATH/docker_compose/amd/gpu/rocm/
docker compose stop && docker compose rm -f
}
function main() {
stop_docker
if [[ "$IMAGE_REPO" == "opea" ]]; then build_docker_images; fi
start_services
validate_megaservice
# Frontend tests are currently disabled
# validate_frontend
stop_docker
echo y | docker system prune
}
main


@@ -2,7 +2,7 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
set -xe
set -e
IMAGE_REPO=${IMAGE_REPO:-"opea"}
IMAGE_TAG=${IMAGE_TAG:-"latest"}
echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
@@ -19,49 +19,61 @@ function build_docker_images() {
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="audioqna audioqna-ui whisper speecht5"
service_list="audioqna whisper asr llm-tgi speecht5 tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker images && sleep 1s
}
function start_services() {
cd $WORKPATH/docker_compose/intel/cpu/xeon/
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export TGI_LLM_ENDPOINT=http://$ip_address:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
export ASR_ENDPOINT=http://$ip_address:7066
export TTS_ENDPOINT=http://$ip_address:7055
export MEGA_SERVICE_HOST_IP=${ip_address}
export WHISPER_SERVER_HOST_IP=${ip_address}
export SPEECHT5_SERVER_HOST_IP=${ip_address}
export LLM_SERVER_HOST_IP=${ip_address}
export ASR_SERVICE_HOST_IP=${ip_address}
export TTS_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export WHISPER_SERVER_PORT=7066
export SPEECHT5_SERVER_PORT=7055
export LLM_SERVER_PORT=3006
export BACKEND_SERVICE_ENDPOINT=http://${ip_address}:3008/v1/audioqna
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
# sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
# Start Docker Containers
docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
sleep 20s
n=0
until [[ "$n" -ge 100 ]]; do
docker logs tgi-service > $LOG_PATH/tgi_service_start.log
if grep -q Connected $LOG_PATH/tgi_service_start.log; then
break
fi
sleep 5s
n=$((n+1))
done
}
function validate_megaservice() {
response=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
# always print the log
docker logs whisper-service > $LOG_PATH/whisper-service.log
docker logs speecht5-service > $LOG_PATH/tts-service.log
docker logs tgi-service > $LOG_PATH/tgi-service.log
docker logs audioqna-xeon-backend-server > $LOG_PATH/audioqna-xeon-backend-server.log
echo "$response" | sed 's/^"//;s/"$//' | base64 -d > speech.mp3
if [[ $(file speech.mp3) == *"RIFF"* ]]; then
result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
echo $result
if [[ $result == *"AAA"* ]]; then
echo "Result correct."
else
docker logs whisper-service > $LOG_PATH/whisper-service.log
docker logs asr-service > $LOG_PATH/asr-service.log
docker logs speecht5-service > $LOG_PATH/tts-service.log
docker logs tts-service > $LOG_PATH/tts-service.log
docker logs tgi-service > $LOG_PATH/tgi-service.log
docker logs llm-tgi-server > $LOG_PATH/llm-tgi-server.log
docker logs audioqna-xeon-backend-server > $LOG_PATH/audioqna-xeon-backend-server.log
echo "Result wrong."
exit 1
fi
@@ -78,7 +90,7 @@ function validate_megaservice() {
#
# sed -i "s/localhost/$ip_address/g" playwright.config.ts
#
## conda install -c conda-forge nodejs=22.6.0 -y
## conda install -c conda-forge nodejs -y
# npm install && npm ci && npx playwright install --with-deps
# node -v && npm -v && pip list
#


@@ -23,4 +23,4 @@ RUN npm run build
EXPOSE 5173
# Run the front-end application in preview mode
CMD ["npm", "run", "preview", "--", "--port", "5173", "--host", "0.0.0.0"]
CMD ["npm", "run", "preview", "--", "--port", "5173", "--host", "0.0.0.0"]


@@ -1,5 +1,5 @@
{
"name": "audio-qna",
"name": "sveltekit-auth-example",
"version": "0.0.1",
"private": true,
"scripts": {
@@ -11,38 +11,38 @@
"lint": "prettier --check . && eslint .",
"format": "prettier --write ."
},
"peerDependencies": {
"svelte": "^4.0.0"
},
"devDependencies": {
"@fortawesome/free-solid-svg-icons": "6.2.0",
"@playwright/test": "^1.45.2",
"@sveltejs/adapter-auto": "^3.0.0",
"@sveltejs/kit": "^2.0.0",
"@sveltejs/vite-plugin-svelte": "^3.0.0",
"@sveltejs/adapter-auto": "1.0.0-next.75",
"@sveltejs/kit": "^1.30.4",
"@tailwindcss/typography": "0.5.7",
"@types/debug": "4.1.7",
"@typescript-eslint/eslint-plugin": "^5.27.0",
"@typescript-eslint/parser": "^5.27.0",
"autoprefixer": "^10.4.16",
"autoprefixer": "^10.4.7",
"daisyui": "^3.5.0",
"debug": "4.3.4",
"eslint": "^8.16.0",
"eslint-config-prettier": "^8.3.0",
"eslint-plugin-neverthrow": "1.1.4",
"eslint-plugin-svelte3": "^4.0.0",
"neverthrow": "5.0.0",
"pocketbase": "0.7.0",
"postcss": "^8.4.31",
"postcss": "^8.4.23",
"postcss-load-config": "^4.0.1",
"postcss-preset-env": "^8.3.2",
"prettier": "^2.8.8",
"prettier-plugin-svelte": "^2.7.0",
"prettier-plugin-tailwindcss": "^0.3.0",
"svelte": "^4.2.7",
"svelte-check": "^3.6.0",
"svelte": "^3.59.1",
"svelte-check": "^2.7.1",
"svelte-fa": "3.0.3",
"tailwindcss": "^3.3.6",
"svelte-preprocess": "^4.10.7",
"tailwindcss": "^3.1.5",
"ts-pattern": "4.0.5",
"tslib": "^2.4.1",
"typescript": "^5.0.0",
"vite": "^5.0.11"
"tslib": "^2.3.1",
"typescript": "^4.7.4",
"vite": "^4.3.9"
},
"type": "module",
"dependencies": {


@@ -79,4 +79,4 @@ a.btn {
.w-12\/12 {
width: 100%
}
}


@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#3300FF" stop-opacity="0.2" />
</linearGradient>
</defs>
</svg>
</svg>


@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#f3f4f6" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


@@ -76,4 +76,4 @@
<stop offset="1" stop-color="#9CFFED" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


@@ -76,4 +76,4 @@
<stop offset="1" stop-color="#6141E1" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


@@ -89,4 +89,4 @@
<stop offset="1" stop-color="#3300FF" stop-opacity="0" />
</linearGradient>
</defs>
</svg>
</svg>


@@ -3,4 +3,4 @@
<path
d="M512 1024a512 512 0 1 1 512-512 512 512 0 0 1-512 512z m0-896a384 384 0 1 0 384 384A384 384 0 0 0 512 128z m128 576h-256a64 64 0 0 1-64-64v-256a64 64 0 0 1 64-64h256a64 64 0 0 1 64 64v256a64 64 0 0 1-64 64z"
fill="#d81e06" p-id="3104"></path>
</svg>
</svg>


@@ -1 +1 @@
<svg t="1713431562066" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="6399" width="32" height="32"><path d="M592 768h-160c-26.6 0-48-21.4-48-48V384h-175.4c-35.6 0-53.4-43-28.2-68.2L484.6 11.4c15-15 39.6-15 54.6 0l304.4 304.4c25.2 25.2 7.4 68.2-28.2 68.2H640v336c0 26.6-21.4 48-48 48z m432-16v224c0 26.6-21.4 48-48 48H48c-26.6 0-48-21.4-48-48V752c0-26.6 21.4-48 48-48h272v16c0 61.8 50.2 112 112 112h160c61.8 0 112-50.2 112-112v-16h272c26.6 0 48 21.4 48 48z m-248 176c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z m128 0c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z" p-id="6400" fill="#ffffff"></path></svg>
<svg t="1713431562066" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="6399" width="32" height="32"><path d="M592 768h-160c-26.6 0-48-21.4-48-48V384h-175.4c-35.6 0-53.4-43-28.2-68.2L484.6 11.4c15-15 39.6-15 54.6 0l304.4 304.4c25.2 25.2 7.4 68.2-28.2 68.2H640v336c0 26.6-21.4 48-48 48z m432-16v224c0 26.6-21.4 48-48 48H48c-26.6 0-48-21.4-48-48V752c0-26.6 21.4-48 48-48h272v16c0 61.8 50.2 112 112 112h160c61.8 0 112-50.2 112-112v-16h272c26.6 0 48 21.4 48 48z m-248 176c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z m128 0c0-22-18-40-40-40s-40 18-40 40 18 40 40 40 40-18 40-40z" p-id="6400" fill="#ffffff"></path></svg>


@@ -6,4 +6,4 @@
<path
d="M864 479.776 864 352c0-17.664-14.304-32-32-32s-32 14.336-32 32l0 127.776c0 160.16-129.184 290.464-288 290.464-158.784 0-288-130.304-288-290.464L224 352c0-17.664-14.336-32-32-32s-32 14.336-32 32l0 127.776c0 184.608 140.864 336.48 320 352.832L480 896 288 896c-17.664 0-32 14.304-32 32s14.336 32 32 32l448 0c17.696 0 32-14.304 32-32s-14.304-32-32-32l-192 0 0-63.36C723.136 816.256 864 664.384 864 479.776z"
fill="#707070" p-id="2962"></path>
</svg>
</svg>


@@ -1,8 +0,0 @@
*.safetensors
*.bin
*.model
*.log
docker_compose/intel/cpu/xeon/data
docker_compose/intel/hpu/gaudi/data
inputs/
outputs/


@@ -1,33 +0,0 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
FROM python:3.11-slim
RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
libgl1-mesa-glx \
libjemalloc-dev \
vim \
git
RUN useradd -m -s /bin/bash user && \
mkdir -p /home/user && \
chown -R user /home/user/
WORKDIR /home/user/
RUN git clone https://github.com/opea-project/GenAIComps.git
WORKDIR /home/user/GenAIComps
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
COPY ./avatarchatbot.py /home/user/avatarchatbot.py
ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps
USER user
WORKDIR /home/user
ENTRYPOINT ["python", "avatarchatbot.py"]


@@ -1,105 +0,0 @@
# AvatarChatbot Application
The AvatarChatbot service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.
## AI Avatar Workflow
The AI Avatar example is implemented using both megaservices and the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different megaservices and microservices for this example.
```mermaid
---
config:
flowchart:
nodeSpacing: 100
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 42px
---
flowchart LR
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef thistle fill:#D8BFD8,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
style AvatarChatbot-Megaservice stroke:#000000
subgraph AvatarChatbot-Megaservice["AvatarChatbot Megaservice"]
direction LR
ASR([ASR Microservice]):::blue
LLM([LLM Microservice]):::blue
TTS([TTS Microservice]):::blue
animation([Animation Microservice]):::blue
end
subgraph UserInterface["User Interface"]
direction LR
invis1[ ]:::invisible
USER1([User Audio Query]):::orchid
USER2([User Image/Video Query]):::orchid
UI([UI server<br>]):::orchid
end
GW([AvatarChatbot GateWay<br>]):::orange
subgraph .
direction LR
X([OPEA Microservice]):::blue
Y{{Open Source Service}}:::thistle
Z([OPEA Gateway]):::orange
Z1([UI]):::orchid
end
WHISPER{{Whisper service}}:::thistle
TGI{{LLM service}}:::thistle
T5{{Speecht5 service}}:::thistle
WAV2LIP{{Wav2Lip service}}:::thistle
%% Connections %%
direction LR
USER1 -->|1| UI
UI -->|2| GW
GW <==>|3| AvatarChatbot-Megaservice
ASR ==>|4| LLM ==>|5| TTS ==>|6| animation
direction TB
ASR <-.->|3'| WHISPER
LLM <-.->|4'| TGI
TTS <-.->|5'| T5
animation <-.->|6'| WAV2LIP
USER2 -->|1| UI
UI <-.->|6'| WAV2LIP
```
## Deploy AvatarChatbot Service
The AvatarChatbot service can be deployed on either Intel Gaudi2 AI Accelerator or Intel Xeon Scalable Processor.
### Deploy AvatarChatbot on Gaudi
Refer to the [Gaudi Guide](./docker_compose/intel/hpu/gaudi/README.md) for instructions on deploying AvatarChatbot on Gaudi and on setting up a UI for the application.
### Deploy AvatarChatbot on Xeon
Refer to the [Xeon Guide](./docker_compose/intel/cpu/xeon/README.md) for instructions on deploying AvatarChatbot on Xeon.
## Supported Models
### ASR
The default model is [openai/whisper-small](https://huggingface.co/openai/whisper-small). It also supports all models in the Whisper family, such as `openai/whisper-large-v3`, `openai/whisper-medium`, `openai/whisper-base`, `openai/whisper-tiny`, etc.
To replace the model, please edit the `compose.yaml` and add the `command` line to pass the name of the model you want to use:
```yaml
services:
whisper-service:
...
command: --model_name_or_path openai/whisper-tiny
```
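After editing `compose.yaml`, only the affected service needs to be recreated (a minimal sketch, assuming the service name `whisper-service` from the snippet above):
```bash
# Recreate just the whisper service so it picks up the new model argument
docker compose up -d whisper-service
```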
### TTS
The default model is [microsoft/SpeechT5](https://huggingface.co/microsoft/speecht5_tts). We currently do not support replacing the model. More models under the commercial license will be added in the future.
### Animation
The default model is [Rudrabha/Wav2Lip](https://github.com/Rudrabha/Wav2Lip) and [TencentARC/GFPGAN](https://github.com/TencentARC/GFPGAN). We currently do not support replacing the model. More models under the commercial license such as [OpenTalker/SadTalker](https://github.com/OpenTalker/SadTalker) will be added in the future.
