diff --git a/CodeGen/README.md b/CodeGen/README.md index d79bcc19d..b50669e68 100644 --- a/CodeGen/README.md +++ b/CodeGen/README.md @@ -18,154 +18,14 @@ CodeGen architecture shows below: ![architecture](https://i.imgur.com/G9ozwFX.png) -# Environment Setup +# Deploy CodeGen Service -To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Intel Gaudi2, please follow these steps: +The CodeGen service can be effortlessly deployed on either Intel Gaudi2 or Intel XEON Scalable Processors. -## Prepare Gaudi Image +## Deploy CodeGen on Gaudi -Getting started is straightforward with the official Docker container. Simply pull the image using: +Refer to the [Gaudi Guide](./microservice/gaudi/README.md) for instructions on deploying CodeGen on Gaudi. -```bash -docker pull ghcr.io/huggingface/tgi-gaudi:1.2.1 -``` +## Deploy CodeGen on Xeon -Alternatively, you can build the Docker image yourself with: - -```bash -bash ./serving/tgi_gaudi/build_docker.sh -``` - -## Launch TGI Gaudi Service - -### Launch a local server instance on 1 Gaudi card: - -```bash -bash ./serving/tgi_gaudi/launch_tgi_service.sh -``` - -### Launch a local server instance on 4 Gaudi cards: - -```bash -bash ./serving/tgi_gaudi/launch_tgi_service.sh 4 9000 "deepseek-ai/deepseek-coder-33b-instruct" -``` - -### Customize TGI Gaudi Service - -The ./tgi_gaudi/launch_tgi_service.sh script accepts three parameters: - -- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1. -- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080. -- model_name: The model name utilized for LLM, with the default set to "m-a-p/OpenCodeInterpreter-DS-6.7B". - -You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable `TGI_ENDPOINT`: - -```bash -export TGI_ENDPOINT="http://xxx.xxx.xxx.xxx:8080" -``` - -## Launch Copilot Docker - -### Build Copilot Docker Image (Optional) - -```bash -cd codegen -bash ./build_docker.sh -cd .. -``` - -### Launch Copilot Docker - -```bash -docker run -it -e http_proxy=${http_proxy} -e https_proxy=${https_proxy} --net=host --ipc=host -v /var/run/docker.sock:/var/run/docker.sock intel/gen-ai-examples:copilot bash -``` - -# Start Copilot Server - -## Start the Backend Service - -Make sure TGI-Gaudi service is running and also make sure data is populated into Redis. Launch the backend service: - -Please follow this link [huggingface token](https://huggingface.co/docs/hub/security-tokens) to get the access token and export `HUGGINGFACEHUB_API_TOKEN` environment with the token. - -```bash -export HUGGINGFACEHUB_API_TOKEN= -nohup python server.py & -``` - -The Copilot backend defaults to listening on port 8000, but you can adjust the port number as needed. - -# Install Copilot VSCode extension from Plugin Marketplace - -Install `Neural Copilot` in VSCode as below. - -![Install-screenshot](https://i.imgur.com/cnHRAdD.png) - -# How to use - -## Service URL setting - -Please adjust the service URL in the extension settings based on the endpoint of the code generation backend service. - -![Setting-screenshot](https://i.imgur.com/4hjvKPu.png) -![Setting-screenshot](https://i.imgur.com/AQZuzqd.png) - -## Customize - -The Copilot enables users to input their corresponding sensitive information and tokens in the user settings according to their own needs. 
This customization enhances the accuracy and output content to better meet individual requirements. - -![Customize](https://i.imgur.com/PkObak9.png) - -## Code Suggestion - -To trigger inline completion, you'll need to type # {your keyword} (start with your programming language's comment keyword, like // in C++ and # in python). Make sure Inline Suggest is enabled from the VS Code Settings. -For example: - -![code suggestion](https://i.imgur.com/sH5UoTO.png) - -To provide programmers with a smooth experience, the Copilot supports multiple ways to trigger inline code suggestions. If you are interested in the details, they are summarized as follows: - -- Generate code from single-line comments: The simplest way introduced before. -- Generate code from consecutive single-line comments: - -![codegen from single-line comments](https://i.imgur.com/GZsQywX.png) - -- Generate code from multi-line comments, which will not be triggered until there is at least one `space` outside the multi-line comment): - -![codegen from multi-line comments](https://i.imgur.com/PzhiWrG.png) - -- Automatically complete multi-line comments: - -![auto complete](https://i.imgur.com/cJO3PQ0.jpg) - -## Chat with AI assistant - -You can start a conversation with the AI programming assistant by clicking on the robot icon in the plugin bar on the left: - -![icon](https://i.imgur.com/f7rzfCQ.png) - -Then you can see the conversation window on the left, where you can chat with AI assistant: - -![dialog](https://i.imgur.com/aiYzU60.png) - -There are 4 areas worth noting: - -- Enter and submit your question -- Your previous questions -- Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support streaming output) -- Copy or replace code with one click (Note that you need to select the code in the editor first and then click "replace", otherwise the code will be inserted) - -You can also select the code in the editor and ask AI assistant question about it. -For example: - -- Select code - -![select code](https://i.imgur.com/grvrtY6.png) - -- Ask question and get answer - -![qna](https://i.imgur.com/8Kdpld7.png) - -# - -SCRIPT USAGE NOTICE:  By downloading and using any script file included with the associated software package (such as files with .bat, .cmd, or .JS extensions, Docker files, or any other type of file that, when executed, automatically downloads and/or installs files onto your system) (the “Script File”), it is your obligation to review the Script File to understand what files (e.g.,  other software, AI models, AI Datasets) the Script File will download to your system (“Downloaded Files”). Furthermore, by downloading and using the Downloaded Files, even if they are installed through a silent install, you agree to any and all terms and conditions associated with such files, including but not limited to, license terms, notices, or disclaimers. +Refer to the [Xeon Guide](./microservice/xeon/README.md) for instructions on deploying CodeGen on Xeon. diff --git a/CodeGen/deprecated/README.md b/CodeGen/deprecated/README.md new file mode 100644 index 000000000..d79bcc19d --- /dev/null +++ b/CodeGen/deprecated/README.md @@ -0,0 +1,171 @@ +# Code Generation + +Code-generating LLMs are specialized AI models designed for the task of generating computer code. Such models undergo training with datasets that encompass repositories, specialized documentation, programming code, relevant web content, and other related data. 
They possess a deep understanding of various programming languages, coding patterns, and software development concepts. Code LLMs are engineered to assist developers and programmers. When these LLMs are seamlessly integrated into the developer's Integrated Development Environment (IDE), they possess a comprehensive understanding of the coding context, which includes elements such as comments, function names, and variable names. This contextual awareness empowers them to provide more refined and contextually relevant coding suggestions. + +Capabilities of LLMs in Coding: + +- Code Generation: streamline coding through Code Generation, enabling non-programmers to describe tasks for code creation. +- Code Completion: accelerate coding by suggesting contextually relevant snippets as developers type. +- Code Translation and Modernization: translate and modernize code across multiple programming languages, aiding interoperability and updating legacy projects. +- Code summarization: extract key insights from codebases, improving readability and developer productivity. +- Code Refactoring: offer suggestions for code refactoring, enhancing code performance and efficiency. +- AI-Assisted Testing: assist in creating test cases, ensuring code robustness and accelerating development cycles. +- Error Detection and Debugging: detect errors in code and provide detailed descriptions and potential fixes, expediting debugging processes. + +In this example, we present a Code Copilot application to showcase how code generation can be executed on the Intel Gaudi2 platform. This CodeGen use case involves code generation utilizing open source models such as "m-a-p/OpenCodeInterpreter-DS-6.7B", "deepseek-ai/deepseek-coder-33b-instruct" and Text Generation Inference on Intel Gaudi2. + +CodeGen architecture shows below: + +![architecture](https://i.imgur.com/G9ozwFX.png) + +# Environment Setup + +To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Intel Gaudi2, please follow these steps: + +## Prepare Gaudi Image + +Getting started is straightforward with the official Docker container. Simply pull the image using: + +```bash +docker pull ghcr.io/huggingface/tgi-gaudi:1.2.1 +``` + +Alternatively, you can build the Docker image yourself with: + +```bash +bash ./serving/tgi_gaudi/build_docker.sh +``` + +## Launch TGI Gaudi Service + +### Launch a local server instance on 1 Gaudi card: + +```bash +bash ./serving/tgi_gaudi/launch_tgi_service.sh +``` + +### Launch a local server instance on 4 Gaudi cards: + +```bash +bash ./serving/tgi_gaudi/launch_tgi_service.sh 4 9000 "deepseek-ai/deepseek-coder-33b-instruct" +``` + +### Customize TGI Gaudi Service + +The ./tgi_gaudi/launch_tgi_service.sh script accepts three parameters: + +- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1. +- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080. +- model_name: The model name utilized for LLM, with the default set to "m-a-p/OpenCodeInterpreter-DS-6.7B". + +You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable `TGI_ENDPOINT`: + +```bash +export TGI_ENDPOINT="http://xxx.xxx.xxx.xxx:8080" +``` + +## Launch Copilot Docker + +### Build Copilot Docker Image (Optional) + +```bash +cd codegen +bash ./build_docker.sh +cd .. 
+``` + +### Launch Copilot Docker + +```bash +docker run -it -e http_proxy=${http_proxy} -e https_proxy=${https_proxy} --net=host --ipc=host -v /var/run/docker.sock:/var/run/docker.sock intel/gen-ai-examples:copilot bash +``` + +# Start Copilot Server + +## Start the Backend Service + +Make sure TGI-Gaudi service is running and also make sure data is populated into Redis. Launch the backend service: + +Please follow this link [huggingface token](https://huggingface.co/docs/hub/security-tokens) to get the access token and export `HUGGINGFACEHUB_API_TOKEN` environment with the token. + +```bash +export HUGGINGFACEHUB_API_TOKEN= +nohup python server.py & +``` + +The Copilot backend defaults to listening on port 8000, but you can adjust the port number as needed. + +# Install Copilot VSCode extension from Plugin Marketplace + +Install `Neural Copilot` in VSCode as below. + +![Install-screenshot](https://i.imgur.com/cnHRAdD.png) + +# How to use + +## Service URL setting + +Please adjust the service URL in the extension settings based on the endpoint of the code generation backend service. + +![Setting-screenshot](https://i.imgur.com/4hjvKPu.png) +![Setting-screenshot](https://i.imgur.com/AQZuzqd.png) + +## Customize + +The Copilot enables users to input their corresponding sensitive information and tokens in the user settings according to their own needs. This customization enhances the accuracy and output content to better meet individual requirements. + +![Customize](https://i.imgur.com/PkObak9.png) + +## Code Suggestion + +To trigger inline completion, you'll need to type # {your keyword} (start with your programming language's comment keyword, like // in C++ and # in python). Make sure Inline Suggest is enabled from the VS Code Settings. +For example: + +![code suggestion](https://i.imgur.com/sH5UoTO.png) + +To provide programmers with a smooth experience, the Copilot supports multiple ways to trigger inline code suggestions. If you are interested in the details, they are summarized as follows: + +- Generate code from single-line comments: The simplest way introduced before. +- Generate code from consecutive single-line comments: + +![codegen from single-line comments](https://i.imgur.com/GZsQywX.png) + +- Generate code from multi-line comments, which will not be triggered until there is at least one `space` outside the multi-line comment): + +![codegen from multi-line comments](https://i.imgur.com/PzhiWrG.png) + +- Automatically complete multi-line comments: + +![auto complete](https://i.imgur.com/cJO3PQ0.jpg) + +## Chat with AI assistant + +You can start a conversation with the AI programming assistant by clicking on the robot icon in the plugin bar on the left: + +![icon](https://i.imgur.com/f7rzfCQ.png) + +Then you can see the conversation window on the left, where you can chat with AI assistant: + +![dialog](https://i.imgur.com/aiYzU60.png) + +There are 4 areas worth noting: + +- Enter and submit your question +- Your previous questions +- Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support streaming output) +- Copy or replace code with one click (Note that you need to select the code in the editor first and then click "replace", otherwise the code will be inserted) + +You can also select the code in the editor and ask AI assistant question about it. 
+For example: + +- Select code + +![select code](https://i.imgur.com/grvrtY6.png) + +- Ask question and get answer + +![qna](https://i.imgur.com/8Kdpld7.png) + +# + +SCRIPT USAGE NOTICE:  By downloading and using any script file included with the associated software package (such as files with .bat, .cmd, or .JS extensions, Docker files, or any other type of file that, when executed, automatically downloads and/or installs files onto your system) (the “Script File”), it is your obligation to review the Script File to understand what files (e.g.,  other software, AI models, AI Datasets) the Script File will download to your system (“Downloaded Files”). Furthermore, by downloading and using the Downloaded Files, even if they are installed through a silent install, you agree to any and all terms and conditions associated with such files, including but not limited to, license terms, notices, or disclaimers. diff --git a/CodeGen/codegen/Dockerfile b/CodeGen/deprecated/codegen/Dockerfile similarity index 100% rename from CodeGen/codegen/Dockerfile rename to CodeGen/deprecated/codegen/Dockerfile diff --git a/CodeGen/codegen/build_docker.sh b/CodeGen/deprecated/codegen/build_docker.sh similarity index 100% rename from CodeGen/codegen/build_docker.sh rename to CodeGen/deprecated/codegen/build_docker.sh diff --git a/CodeGen/codegen/codegen-app/openai_protocol.py b/CodeGen/deprecated/codegen/codegen-app/openai_protocol.py similarity index 100% rename from CodeGen/codegen/codegen-app/openai_protocol.py rename to CodeGen/deprecated/codegen/codegen-app/openai_protocol.py diff --git a/CodeGen/codegen/codegen-app/server.py b/CodeGen/deprecated/codegen/codegen-app/server.py similarity index 100% rename from CodeGen/codegen/codegen-app/server.py rename to CodeGen/deprecated/codegen/codegen-app/server.py diff --git a/CodeGen/codegen/requirements.txt b/CodeGen/deprecated/codegen/requirements.txt similarity index 100% rename from CodeGen/codegen/requirements.txt rename to CodeGen/deprecated/codegen/requirements.txt diff --git a/CodeGen/serving/tgi_gaudi/build_docker.sh b/CodeGen/deprecated/serving/tgi_gaudi/build_docker.sh similarity index 100% rename from CodeGen/serving/tgi_gaudi/build_docker.sh rename to CodeGen/deprecated/serving/tgi_gaudi/build_docker.sh diff --git a/CodeGen/serving/tgi_gaudi/launch_tgi_service.sh b/CodeGen/deprecated/serving/tgi_gaudi/launch_tgi_service.sh similarity index 100% rename from CodeGen/serving/tgi_gaudi/launch_tgi_service.sh rename to CodeGen/deprecated/serving/tgi_gaudi/launch_tgi_service.sh diff --git a/CodeGen/tests/test_codegen_inference.sh b/CodeGen/deprecated/tests/test_codegen_inference.sh similarity index 100% rename from CodeGen/tests/test_codegen_inference.sh rename to CodeGen/deprecated/tests/test_codegen_inference.sh diff --git a/CodeGen/microservice/gaudi/README.md b/CodeGen/microservice/gaudi/README.md new file mode 100644 index 000000000..e06a35157 --- /dev/null +++ b/CodeGen/microservice/gaudi/README.md @@ -0,0 +1,185 @@ +# Build MegaService of CodeGen on Gaudi + +This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Gaudi server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to Docker Hub, it will simplify the deployment process for this service. 
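At a glance, the guide below boils down to four steps: build the component images, export the required environment variables, bring the stack up with Docker Compose, and validate the pipeline with `curl`. The sketch below condenses that flow; it assumes the environment variables from the "Setup Environment Variables" section are already exported, and the example request body is illustrative.

```bash
# Condensed deployment flow (each step is detailed in the sections below)
git clone https://github.com/opea-project/GenAIComps.git        # sources for the component images
# ... build the llm, MegaService and UI images as described under "Build Docker Images" ...
docker compose -f docker_compose.yaml up -d                     # start TGI, llm, MegaService and UI
curl http://${host_ip}:6666/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"model": "ise-uiuc/Magicoder-S-DS-6.7B", "messages": "Write a Python function that reverses a string."}'
```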
+ +## 🚀 Build Docker Images + +First of all, you need to build Docker Images locally. This step can be ignored after the Docker images published to Docker hub. + +### 1. Git clone GenAIComps + +```bash +git clone https://github.com/opea-project/GenAIComps.git +cd GenAIComps +``` + +### 2. Build LLM Image + +```bash +docker build -t opea/gen-ai-comps:llm-tgi-gaudi-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/langchain/docker/Dockerfile . +``` + +### 3. Build MegaService Docker Image + +To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `codegen.py` Python script. Build the MegaService Docker image using the command below: + +```bash +git clone https://github.com/opea-project/GenAIExamples +cd GenAIExamples/CodeGen/microservice/gaudi/ +docker build -t opea/gen-ai-comps:codegen-megaservice-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile . +``` + +### 4. Build UI Docker Image + +Construct the frontend Docker image using the command below: + +```bash +cd GenAIExamples/CodeGen/ui/ +docker build -t opea/gen-ai-comps:codegen-ui-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile . +``` + +Then run the command `docker images`, you will have the following 3 Docker Images: + +1. `opea/gen-ai-comps:llm-tgi-server` +2. `opea/gen-ai-comps:codegen-megaservice-server` +3. `opea/gen-ai-comps:codegen-ui-server` + +## 🚀 Start MicroServices and MegaService + +### Setup Environment Variables + +Since the `docker_compose.yaml` will consume some environment variables, you need to setup them in advance as below. + +```bash +export http_proxy=${your_http_proxy} +export https_proxy=${your_http_proxy} +export LLM_MODEL_ID="ise-uiuc/Magicoder-S-DS-6.7B" +export TGI_LLM_ENDPOINT="http://${host_ip}:8028" +export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token} +export MEGA_SERVICE_HOST_IP=${host_ip} +export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:6666/v1/codegen" +``` + +Note: Please replace with `host_ip` with you external IP address, do not use localhost. + +### Start all the services Docker Containers + +```bash +docker compose -f docker_compose.yaml up -d +``` + +### Validate MicroServices and MegaService + +1. TGI Service + +```bash +curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' +``` + +2. LLM Microservice + +```bash +curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' +``` + +3. MegaService + +```bash +curl http://${host_ip}:6666/v1/codegen -H "Content-Type: application/json" -d '{ + "model": "ise-uiuc/Magicoder-S-DS-6.7B", + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. 
If the request is invalid, raise an exception." + }' +``` + +## 🚀 Launch the UI + +To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below: + +```yaml + chaqna-gaudi-ui-server: + image: opea/gen-ai-comps:codegen-ui-server + ... + ports: + - "80:5173" +``` + +![project-screenshot](https://imgur.com/d1SmaRb.png) + +## Install Copilot VSCode extension from Plugin Marketplace as the frontend + +In addition to the Svelte UI, users can also install the Copilot VSCode extension from the Plugin Marketplace as the frontend. + +Install `Neural Copilot` in VSCode as below. + +![Install-screenshot](https://i.imgur.com/cnHRAdD.png) + +### How to use + +#### Service URL setting + +Please adjust the service URL in the extension settings based on the endpoint of the code generation backend service. + +![Setting-screenshot](https://i.imgur.com/4hjvKPu.png) +![Setting-screenshot](https://i.imgur.com/AQZuzqd.png) + +#### Customize + +The Copilot enables users to input their corresponding sensitive information and tokens in the user settings according to their own needs. This customization enhances the accuracy and output content to better meet individual requirements. + +![Customize](https://i.imgur.com/PkObak9.png) + +#### Code Suggestion + +To trigger inline completion, you'll need to type # {your keyword} (start with your programming language's comment keyword, like // in C++ and # in python). Make sure Inline Suggest is enabled from the VS Code Settings. +For example: + +![code suggestion](https://i.imgur.com/sH5UoTO.png) + +To provide programmers with a smooth experience, the Copilot supports multiple ways to trigger inline code suggestions. If you are interested in the details, they are summarized as follows: + +- Generate code from single-line comments: The simplest way introduced before. +- Generate code from consecutive single-line comments: + +![codegen from single-line comments](https://i.imgur.com/GZsQywX.png) + +- Generate code from multi-line comments, which will not be triggered until there is at least one `space` outside the multi-line comment): + +![codegen from multi-line comments](https://i.imgur.com/PzhiWrG.png) + +- Automatically complete multi-line comments: + +![auto complete](https://i.imgur.com/cJO3PQ0.jpg) + +### Chat with AI assistant + +You can start a conversation with the AI programming assistant by clicking on the robot icon in the plugin bar on the left: + +![icon](https://i.imgur.com/f7rzfCQ.png) + +Then you can see the conversation window on the left, where you can chat with AI assistant: + +![dialog](https://i.imgur.com/aiYzU60.png) + +There are 4 areas worth noting: + +- Enter and submit your question +- Your previous questions +- Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support streaming output) +- Copy or replace code with one click (Note that you need to select the code in the editor first and then click "replace", otherwise the code will be inserted) + +You can also select the code in the editor and ask AI assistant question about it. 
+For example: + +- Select code + +![select code](https://i.imgur.com/grvrtY6.png) + +- Ask question and get answer + +![qna](https://i.imgur.com/8Kdpld7.png) diff --git a/CodeGen/microservice/gaudi/codegen.py b/CodeGen/microservice/gaudi/codegen.py new file mode 100644 index 000000000..7f3e0d3d4 --- /dev/null +++ b/CodeGen/microservice/gaudi/codegen.py @@ -0,0 +1,51 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import asyncio +import os + +from comps import CodeGenGateway, MicroService, ServiceOrchestrator, ServiceType + +SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0") + + +class ChatQnAService: + def __init__(self, port=8000): + self.port = port + self.megaservice = ServiceOrchestrator() + + def add_remote_service(self): + llm = MicroService( + name="llm", + host=SERVICE_HOST_IP, + port=9000, + endpoint="/v1/chat/completions", + use_remote_service=True, + service_type=ServiceType.LLM, + ) + self.megaservice.add(llm) + self.gateway = CodeGenGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port) + + async def schedule(self): + await self.megaservice.schedule( + initial_inputs={"text": "Write a function that checks if a year is a leap year in Python."} + ) + result_dict = self.megaservice.result_dict + print(result_dict) + + +if __name__ == "__main__": + chatqna = ChatQnAService(port=6666) + chatqna.add_remote_service() + asyncio.run(chatqna.schedule()) diff --git a/CodeGen/microservice/gaudi/docker/Dockerfile b/CodeGen/microservice/gaudi/docker/Dockerfile new file mode 100644 index 000000000..45305043c --- /dev/null +++ b/CodeGen/microservice/gaudi/docker/Dockerfile @@ -0,0 +1,44 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
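# Overview: this image packages the CodeGen MegaService. It installs system
# dependencies, clones GenAIComps and installs its Python requirements, copies
# codegen.py into the image, and starts it as the non-root "user" account.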
+ + +FROM python:3.11-slim + +ENV LANG C.UTF-8 + +RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \ + libgl1-mesa-glx \ + libjemalloc-dev \ + vim \ + git + +RUN useradd -m -s /bin/bash user && \ + mkdir -p /home/user && \ + chown -R user /home/user/ + +RUN cd /home/user/ && \ + git clone https://github.com/opea-project/GenAIComps.git + +RUN cd /home/user/GenAIComps && pip install --no-cache-dir --upgrade pip && \ + pip install -r /home/user/GenAIComps/requirements.txt + +COPY ../codegen.py /home/user/codegen.py + +ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps + +USER user + +WORKDIR /home/user + +ENTRYPOINT ["python", "codegen.py"] diff --git a/CodeGen/microservice/gaudi/docker_compose.yaml b/CodeGen/microservice/gaudi/docker_compose.yaml new file mode 100644 index 000000000..3633ffc6b --- /dev/null +++ b/CodeGen/microservice/gaudi/docker_compose.yaml @@ -0,0 +1,76 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +version: "3.8" + +services: + tgi_service: + image: ghcr.io/huggingface/tgi-gaudi:1.2.1 + container_name: tgi-gaudi-server + ports: + - "8028:80" + volumes: + - "./data:/data" + environment: + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + RUNTIME: habana + HABANA_VISIBLE_DEVICES: all + OMPI_MCA_btl_vader_single_copy_mechanism: none + command: --model-id ${LLM_MODEL_ID} + llm: + image: opea/gen-ai-comps:llm-tgi-gaudi-server + container_name: llm-tgi-gaudi-server + depends_on: + - tgi_service + ports: + - "9000:9000" + ipc: host + environment: + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + restart: unless-stopped + codegen-gaudi-backend-server: + image: opea/gen-ai-comps:codegen-megaservice-server + container_name: codegen-gaudi-backend-server + depends_on: + - llm + ports: + - "6666:6666" + environment: + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP} + ipc: host + restart: always + codegen-gaudi-ui-server: + image: opea/gen-ai-comps:codegen-ui-server + container_name: codegen-gaudi-ui-server + depends_on: + - codegen-gaudi-backend-server + ports: + - "5173:5173" + environment: + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - BASIC_URL=${BACKEND_SERVICE_ENDPOINT} + ipc: host + restart: always + +networks: + default: + driver: bridge diff --git a/CodeGen/microservice/xeon/README.md b/CodeGen/microservice/xeon/README.md new file mode 100644 index 000000000..578d3b6f7 --- /dev/null +++ b/CodeGen/microservice/xeon/README.md @@ -0,0 +1,191 @@ +# Build Mega Service of CodeGen on Xeon + +This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon server. 
The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to Docker Hub soon, it will simplify the deployment process for this service. + +## 🚀 Apply Xeon Server on AWS + +To apply a Xeon server on AWS, start by creating an AWS account if you don't have one already. Then, head to the [EC2 Console](https://console.aws.amazon.com/ec2/v2/home) to begin the process. Within the EC2 service, select the Amazon EC2 M7i or M7i-flex instance type to leverage the power of 4th Generation Intel Xeon Scalable processors. These instances are optimized for high-performance computing and demanding workloads. + +For detailed information about these instance types, you can refer to this [link](https://aws.amazon.com/ec2/instance-types/m7i/). Once you've chosen the appropriate instance type, proceed with configuring your instance settings, including network configurations, security groups, and storage options. + +After launching your instance, you can connect to it using SSH (for Linux instances) or Remote Desktop Protocol (RDP) (for Windows instances). From there, you'll have full access to your Xeon server, allowing you to install, configure, and manage your applications as needed. + +## 🚀 Build Docker Images + +First of all, you need to build Docker Images locally and install the python package of it. + +```bash +git clone https://github.com/opea-project/GenAIComps.git +cd GenAIComps +``` + +### 1. Build LLM Image + +```bash +docker build -t opea/gen-ai-comps:llm-tgi-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/langchain/docker/Dockerfile . +``` + +### 2. Build MegaService Docker Image + +To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `codegen.py` Python script. Build MegaService Docker image via below command: + +```bash +git clone https://github.com/opea-project/GenAIExamples +cd GenAIExamples/CodeGen/microservice/xeon/ +docker build -t opea/gen-ai-comps:codegen-megaservice-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile . +``` + +### 6. Build UI Docker Image + +Build frontend Docker image via below command: + +```bash +cd GenAIExamples/CodeGen/ui/ +docker build -t opea/gen-ai-comps:codegen-ui-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile . +``` + +Then run the command `docker images`, you will have the following 3 Docker Images: + +1. `opea/gen-ai-comps:llm-tgi-server` +2. `opea/gen-ai-comps:codegen-megaservice-server` +3. `opea/gen-ai-comps:codegen-ui-server` + +## 🚀 Start Microservices + +### Setup Environment Variables + +Since the `docker_compose.yaml` will consume some environment variables, you need to setup them in advance as below. + +```bash +export http_proxy=${your_http_proxy} +export https_proxy=${your_http_proxy} +export LLM_MODEL_ID="ise-uiuc/Magicoder-S-DS-6.7B" +export TGI_LLM_ENDPOINT="http://${host_ip}:8028" +export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token} +export MEGA_SERVICE_HOST_IP=${host_ip} +export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:6666/v1/codegen" +``` + +Note: Please replace with `host_ip` with you external IP address, do not use localhost. + +### Start all the services Docker Containers + +```bash +docker compose -f docker_compose.yaml up -d +``` + +### Validate Microservices + +1. 
TGI Service + +```bash +curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' +``` + +2. LLM Microservice + +```bash +curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' +``` + +3. MegaService + +```bash +curl http://${host_ip}:6666/v1/codegen -H "Content-Type: application/json" -d '{ + "model": "ise-uiuc/Magicoder-S-DS-6.7B", + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." + }' +``` + +## 🚀 Launch the UI + +To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below: + +```yaml + chaqna-gaudi-ui-server: + image: opea/gen-ai-comps:codegen-ui-server + ... + ports: + - "80:5173" +``` + +![project-screenshot](https://imgur.com/d1SmaRb.png) + +## Install Copilot VSCode extension from Plugin Marketplace as the frontend + +In addition to the Svelte UI, users can also install the Copilot VSCode extension from the Plugin Marketplace as the frontend. + +Install `Neural Copilot` in VSCode as below. + +![Install-screenshot](https://i.imgur.com/cnHRAdD.png) + +### How to use + +#### Service URL setting + +Please adjust the service URL in the extension settings based on the endpoint of the code generation backend service. + +![Setting-screenshot](https://i.imgur.com/4hjvKPu.png) +![Setting-screenshot](https://i.imgur.com/AQZuzqd.png) + +#### Customize + +The Copilot enables users to input their corresponding sensitive information and tokens in the user settings according to their own needs. This customization enhances the accuracy and output content to better meet individual requirements. + +![Customize](https://i.imgur.com/PkObak9.png) + +#### Code Suggestion + +To trigger inline completion, you'll need to type # {your keyword} (start with your programming language's comment keyword, like // in C++ and # in python). Make sure Inline Suggest is enabled from the VS Code Settings. +For example: + +![code suggestion](https://i.imgur.com/sH5UoTO.png) + +To provide programmers with a smooth experience, the Copilot supports multiple ways to trigger inline code suggestions. If you are interested in the details, they are summarized as follows: + +- Generate code from single-line comments: The simplest way introduced before. 
+- Generate code from consecutive single-line comments: + +![codegen from single-line comments](https://i.imgur.com/GZsQywX.png) + +- Generate code from multi-line comments, which will not be triggered until there is at least one `space` outside the multi-line comment): + +![codegen from multi-line comments](https://i.imgur.com/PzhiWrG.png) + +- Automatically complete multi-line comments: + +![auto complete](https://i.imgur.com/cJO3PQ0.jpg) + +### Chat with AI assistant + +You can start a conversation with the AI programming assistant by clicking on the robot icon in the plugin bar on the left: + +![icon](https://i.imgur.com/f7rzfCQ.png) + +Then you can see the conversation window on the left, where you can chat with AI assistant: + +![dialog](https://i.imgur.com/aiYzU60.png) + +There are 4 areas worth noting: + +- Enter and submit your question +- Your previous questions +- Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support streaming output) +- Copy or replace code with one click (Note that you need to select the code in the editor first and then click "replace", otherwise the code will be inserted) + +You can also select the code in the editor and ask AI assistant question about it. +For example: + +- Select code + +![select code](https://i.imgur.com/grvrtY6.png) + +- Ask question and get answer + +![qna](https://i.imgur.com/8Kdpld7.png) diff --git a/CodeGen/microservice/xeon/codegen.py b/CodeGen/microservice/xeon/codegen.py new file mode 100644 index 000000000..7f3e0d3d4 --- /dev/null +++ b/CodeGen/microservice/xeon/codegen.py @@ -0,0 +1,51 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
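# Overview: this script registers the remote `llm` microservice (port 9000,
# /v1/chat/completions) with a ServiceOrchestrator, exposes the pipeline through
# CodeGenGateway (port 6666 when run as __main__), and schedules a sample
# leap-year prompt through the pipeline on startup.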
+ +import asyncio +import os + +from comps import CodeGenGateway, MicroService, ServiceOrchestrator, ServiceType + +SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0") + + +class ChatQnAService: + def __init__(self, port=8000): + self.port = port + self.megaservice = ServiceOrchestrator() + + def add_remote_service(self): + llm = MicroService( + name="llm", + host=SERVICE_HOST_IP, + port=9000, + endpoint="/v1/chat/completions", + use_remote_service=True, + service_type=ServiceType.LLM, + ) + self.megaservice.add(llm) + self.gateway = CodeGenGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port) + + async def schedule(self): + await self.megaservice.schedule( + initial_inputs={"text": "Write a function that checks if a year is a leap year in Python."} + ) + result_dict = self.megaservice.result_dict + print(result_dict) + + +if __name__ == "__main__": + chatqna = ChatQnAService(port=6666) + chatqna.add_remote_service() + asyncio.run(chatqna.schedule()) diff --git a/CodeGen/microservice/xeon/docker/Dockerfile b/CodeGen/microservice/xeon/docker/Dockerfile new file mode 100644 index 000000000..45305043c --- /dev/null +++ b/CodeGen/microservice/xeon/docker/Dockerfile @@ -0,0 +1,44 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +FROM python:3.11-slim + +ENV LANG C.UTF-8 + +RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \ + libgl1-mesa-glx \ + libjemalloc-dev \ + vim \ + git + +RUN useradd -m -s /bin/bash user && \ + mkdir -p /home/user && \ + chown -R user /home/user/ + +RUN cd /home/user/ && \ + git clone https://github.com/opea-project/GenAIComps.git + +RUN cd /home/user/GenAIComps && pip install --no-cache-dir --upgrade pip && \ + pip install -r /home/user/GenAIComps/requirements.txt + +COPY ../codegen.py /home/user/codegen.py + +ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps + +USER user + +WORKDIR /home/user + +ENTRYPOINT ["python", "codegen.py"] diff --git a/CodeGen/microservice/xeon/docker_compose.yaml b/CodeGen/microservice/xeon/docker_compose.yaml new file mode 100644 index 000000000..6c5a0300b --- /dev/null +++ b/CodeGen/microservice/xeon/docker_compose.yaml @@ -0,0 +1,74 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
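# Service topology (host ports): tgi_service serves ${LLM_MODEL_ID} via TGI on 8028,
# llm wraps TGI as a microservice on 9000, codegen-xeon-backend-server runs the
# MegaService on 6666, and codegen-xeon-ui-server serves the Svelte UI on 5173.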
+ +version: "3.8" + +services: + tgi_service: + image: ghcr.io/huggingface/text-generation-inference:1.4 + container_name: tgi-service + ports: + - "8028:80" + volumes: + - "./data:/data" + shm_size: 1g + environment: + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + command: --model-id ${LLM_MODEL_ID} + llm: + image: opea/gen-ai-comps:llm-tgi-server + container_name: llm-tgi-server + depends_on: + - tgi_service + ports: + - "9000:9000" + ipc: host + environment: + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + restart: unless-stopped + codegen-xeon-backend-server: + image: opea/gen-ai-comps:codegen-megaservice-server + container_name: codegen-xeon-backend-server + depends_on: + - llm + ports: + - "6666:6666" + environment: + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP} + ipc: host + restart: always + codegen-xeon-ui-server: + image: opea/gen-ai-comps:codegen-ui-server + container_name: codegen-xeon-ui-server + depends_on: + - codegen-xeon-backend-server + ports: + - "5173:5173" + environment: + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - BASIC_URL=${BACKEND_SERVICE_ENDPOINT} + ipc: host + restart: always + +networks: + default: + driver: bridge diff --git a/CodeGen/tests/test_codegen_on_gaudi.sh b/CodeGen/tests/test_codegen_on_gaudi.sh new file mode 100644 index 000000000..708ee85ec --- /dev/null +++ b/CodeGen/tests/test_codegen_on_gaudi.sh @@ -0,0 +1,135 @@ +# Copyright (c) 2024 Intel Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +set -x + +WORKPATH=$(dirname "$PWD") +LOG_PATH="$WORKPATH/tests" +ip_name=$(echo $(hostname) | tr '[a-z]-' '[A-Z]_')_$(echo 'IP') +ip_address=$(eval echo '$'$ip_name) + +function build_docker_images() { + cd $WORKPATH + git clone https://github.com/opea-project/GenAIComps.git + cd GenAIComps + + docker build -t opea/gen-ai-comps:llm-tgi-gaudi-server -f comps/llms/langchain/docker/Dockerfile . + + docker pull ghcr.io/huggingface/tgi-gaudi:1.2.1 + + cd $WORKPATH/microservice/gaudi + docker build --no-cache -t opea/gen-ai-comps:codegen-megaservice-server -f docker/Dockerfile . + + cd $WORKPATH/ui + docker build --no-cache -t opea/gen-ai-comps:codegen-ui-server -f docker/Dockerfile . 
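    # List local images so the test log records what was just built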
+ + docker images +} + +function start_services() { + cd $WORKPATH/microservice/gaudi + + export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3" + export TGI_LLM_ENDPOINT="http://${ip_address}:8028" + export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} + export MEGA_SERVICE_HOST_IP=${ip_address} + export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:6666/v1/codegen" + + # Start Docker Containers + # TODO: Replace the container name with a test-specific name + docker compose -f docker_compose.yaml up -d + + sleep 1m # Waits 1 minutes +} + +function validate_microservices() { + # Check if the microservices are running correctly. + # TODO: Any results check required?? + + export PATH="${HOME}/miniconda3/bin:$PATH" + + curl http://${ip_address}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":1024, "do_sample": true}}' \ + -H 'Content-Type: application/json' > ${LOG_PATH}/generate.log + exit_code=$? + if [ $exit_code -ne 0 ]; then + echo "Microservice failed, please check the logs in artifacts!" + docker logs tgi-gaudi-server >> ${LOG_PATH}/generate.log + exit 1 + fi + sleep 5s + + curl http://${ip_address}:9000/v1/chat/completions \ + -X POST \ + -d '{"text":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}' \ + -H 'Content-Type: application/json' > ${LOG_PATH}/completions.log + exit_code=$? + if [ $exit_code -ne 0 ]; then + echo "Microservice failed, please check the logs in artifacts!" + docker logs llm-tgi-gaudi-server >> ${LOG_PATH}/completions.log + exit 1 + fi + sleep 5s +} + +function validate_megaservice() { + # Curl the Mega Service + curl http://${ip_address}:6666/v1/codegen -H "Content-Type: application/json" -d '{ + "model": "ise-uiuc/Magicoder-S-DS-6.7B", + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}' > ${LOG_PATH}/curl_megaservice.log + + echo "Checking response results, make sure the output is reasonable. " + local status=true + if [[ -f $LOG_PATH/curl_megaservice.log ]] && \ + [[ $(grep -c "billion" $LOG_PATH/curl_megaservice.log) != 0 ]]; then + status=true + fi + + if [ $status == false ]; then + echo "Response check failed, please check the logs in artifacts!" + exit 1 + else + echo "Response check succeed!" + fi + + echo "Checking response format, make sure the output format is acceptable for UI." + # TODO +} + +function stop_docker() { + cd $WORKPATH/microservice/gaudi + container_list=$(cat docker_compose.yaml | grep container_name | cut -d':' -f2) + for container_name in $container_list; do + cid=$(docker ps -aq --filter "name=$container_name") + if [[ ! 
-z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi + done +} + +function main() { + + stop_docker + + build_docker_images + start_services + + validate_microservices + validate_megaservice + + stop_docker + echo y | docker system prune + +} + +main diff --git a/CodeGen/tests/test_codegen_on_xeon.sh b/CodeGen/tests/test_codegen_on_xeon.sh new file mode 100644 index 000000000..14141ceab --- /dev/null +++ b/CodeGen/tests/test_codegen_on_xeon.sh @@ -0,0 +1,112 @@ +#!/bin/bash +# Copyright (C) 2024 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +set -xe + +WORKPATH=$(dirname "$PWD") +LOG_PATH="$WORKPATH/tests" +ip_name=$(echo $(hostname) | tr '[a-z]-' '[A-Z]_')_$(echo 'IP') +ip_address=$(eval echo '$'$ip_name) + +function build_docker_images() { + cd $WORKPATH + git clone https://github.com/opea-project/GenAIComps.git + cd GenAIComps + + docker build -t opea/gen-ai-comps:llm-tgi-server -f comps/llms/langchain/docker/Dockerfile . + + cd $WORKPATH/microservice/xeon + docker build --no-cache -t opea/gen-ai-comps:codegen-megaservice-server -f docker/Dockerfile . + + cd $WORKPATH/ui + docker build --no-cache -t opea/gen-ai-comps:codegen-ui-server -f docker/Dockerfile . + + docker images +} + +function start_services() { + cd $WORKPATH/microservice/xeon + + export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3" + export TGI_LLM_ENDPOINT="http://${ip_address}:8028" + export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} + export MEGA_SERVICE_HOST_IP=${ip_address} + export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:6666/v1/codegen" + + # Start Docker Containers + # TODO: Replace the container name with a test-specific name + docker compose -f docker_compose.yaml up -d + + sleep 1m # Waits 1 minutes +} + +function validate_microservices() { + # Check if the microservices are running correctly. + # TODO: Any results check required?? + + export PATH="${HOME}/miniconda3/bin:$PATH" + + curl http://${ip_address}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' > ${LOG_PATH}/generate.log + sleep 5s + + curl http://${ip_address}:9000/v1/chat/completions \ + -X POST \ + -d '{"text":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}' \ + -H 'Content-Type: application/json' > ${LOG_PATH}/completions.log + sleep 5s +} + +function validate_megaservice() { + # Curl the Mega Service + curl http://${ip_address}:6666/v1/codegen -H "Content-Type: application/json" -d '{ + "model": "ise-uiuc/Magicoder-S-DS-6.7B", + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}' > ${LOG_PATH}/curl_megaservice.log + + echo "Checking response results, make sure the output is reasonable. " + local status=true + if [[ -f $LOG_PATH/curl_megaservice.log ]] && \ + [[ $(grep -c "billion" $LOG_PATH/curl_megaservice.log) != 0 ]]; then + status=true + fi + + if [ $status == false ]; then + echo "Response check failed, please check the logs in artifacts!" + exit 1 + else + echo "Response check succeed!" 
+ fi + + echo "Checking response format, make sure the output format is acceptable for UI." + # TODO + +} + +function stop_docker() { + cd $WORKPATH/microservice/xeon + container_list=$(cat docker_compose.yaml | grep container_name | cut -d':' -f2) + for container_name in $container_list; do + cid=$(docker ps -aq --filter "name=$container_name") + if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi + done +} + +function main() { + + stop_docker + + build_docker_images + start_services + + validate_microservices + validate_megaservice + + stop_docker + echo y | docker system prune + +} + +main diff --git a/CodeGen/ui/docker/Dockerfile b/CodeGen/ui/docker/Dockerfile new file mode 100644 index 000000000..2a9b7deed --- /dev/null +++ b/CodeGen/ui/docker/Dockerfile @@ -0,0 +1,20 @@ +# Use node 20.11.1 as the base image +FROM node:20.11.1 + +# Update package manager and install Git +RUN apt-get update -y && apt-get install -y git + +# Clone the front-end code repository +RUN git clone https://github.com/opea-project/GenAIExamples.git /home/user/GenAIExamples + +# Set the working directory +WORKDIR /home/user/GenAIExamples/CodeGen/ui/svelte + +# Install front-end dependencies +RUN npm install + +# Expose the port of the front-end application +EXPOSE 5173 + +# Run the front-end application +CMD ["npm", "run", "dev", "--", "--host=0.0.0.0"]
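# Example usage (illustrative, mirroring the codegen-ui-server service in the
# docker_compose.yaml files): build the image from CodeGen/ui/ and point BASIC_URL
# at a running MegaService /v1/codegen endpoint.
#   docker build -t opea/gen-ai-comps:codegen-ui-server -f ./docker/Dockerfile .
#   docker run -d -p 5173:5173 -e BASIC_URL="http://${host_ip}:6666/v1/codegen" opea/gen-ai-comps:codegen-ui-server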