add initial examples

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
This commit is contained in:
lvliang-intel
2024-03-21 10:17:09 +08:00
parent bc7c18f68d
commit fabff168ff
147 changed files with 23216 additions and 0 deletions

ChatQnA/README.md Normal file

@@ -0,0 +1,155 @@
This ChatQnA use case performs RAG using LangChain, the Redis vector database, and Text Generation Inference (TGI) on Intel Gaudi2. The Intel Gaudi2 accelerator supports both training and inference for deep learning models, in particular for LLMs. Please visit [Habana AI products](https://habana.ai/products) for more details.
# Environment Setup
To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Habana Gaudi/Gaudi2, please follow these steps:
## Build TGI Gaudi Docker Image
```bash
bash ./serving/tgi_gaudi/build_docker.sh
```
## Launch TGI Gaudi Service
### Launch a local server instance on 1 Gaudi card:
```bash
bash ./serving/tgi_gaudi/launch_tgi_service.sh
```
For gated models such as `LLAMA-2`, you will have to pass `-e HUGGING_FACE_HUB_TOKEN=<token>` to the `docker run` command above with a valid Hugging Face Hub read token.
Please follow this link [huggingface token](https://huggingface.co/docs/hub/security-tokens) to get an access token and export the `HUGGINGFACEHUB_API_TOKEN` environment variable with the token.
```bash
export HUGGINGFACEHUB_API_TOKEN=<token>
```
### Launch a local server instance on 8 Gaudi cards:
```bash
bash ./serving/tgi_gaudi/launch_tgi_service.sh 8
```
### Customize TGI Gaudi Service
The `./serving/tgi_gaudi/launch_tgi_service.sh` script accepts three parameters:
- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1.
- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080.
- model_name: The model name utilized for LLM, with the default set to "Intel/neural-chat-7b-v3-3".
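For example, a two-card launch on port 8080 with the default model might look like this (the values below are purely illustrative):
```bash
# num_cards=2, port_number=8080, model_name=Intel/neural-chat-7b-v3-3
bash ./serving/tgi_gaudi/launch_tgi_service.sh 2 8080 "Intel/neural-chat-7b-v3-3"
```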
You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable `TGI_ENDPOINT`:
```bash
export TGI_ENDPOINT="http://xxx.xxx.xxx.xxx:8080"
```
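Once the TGI Gaudi service is running, you can optionally send it a quick test request. The snippet below is only a sanity-check sketch and assumes the endpoint is reachable at localhost on port 8080:
```bash
# Verify the TGI Gaudi endpoint responds (adjust host/port to your deployment)
curl http://localhost:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":32}}'
```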
## Enable TGI Gaudi FP8 for higher throughput
TGI Gaudi uses BFLOAT16 optimization by default. If you aim to achieve higher throughput, you can enable FP8 quantization on TGI Gaudi. According to our test results, FP8 quantization yields approximately a 1.8x performance gain compared to BFLOAT16. Please follow the steps below to enable FP8 quantization.
### Prepare Metadata for FP8 Quantization
Enter the TGI Gaudi Docker container, then run the following commands:
```bash
git clone https://github.com/huggingface/optimum-habana.git
cd optimum-habana/examples/text-generation
pip install -r requirements_lm_eval.txt
QUANT_CONFIG=./quantization_config/maxabs_measure.json python ../gaudi_spawn.py run_lm_eval.py -o acc_7b_bs1_measure.txt --model_name_or_path meta-llama/Llama-2-7b-hf --attn_softmax_bf16 --use_hpu_graphs --trim_logits --use_kv_cache --reuse_cache --bf16 --batch_size 1
QUANT_CONFIG=./quantization_config/maxabs_quant.json python ../gaudi_spawn.py run_lm_eval.py -o acc_7b_bs1_quant.txt --model_name_or_path meta-llama/Llama-2-7b-hf --attn_softmax_bf16 --use_hpu_graphs --trim_logits --use_kv_cache --reuse_cache --bf16 --batch_size 1 --fp8
```
After the above commands finish, the quantization metadata is generated. Copy the metadata directory `./hqt_output/` and the quantization JSON file to the host (under …/data). Please adapt the commands below with your container ID and directory path.
```bash
docker cp 262e04bbe466:/usr/src/optimum-habana/examples/text-generation/hqt_output data/
docker cp 262e04bbe466:/usr/src/optimum-habana/examples/text-generation/quantization_config/maxabs_quant.json data/
```
### Restart the TGI Gaudi server with all the metadata mapped
```bash
docker run -d -p 8080:80 -e QUANT_CONFIG=/data/maxabs_quant.json -e HUGGING_FACE_HUB_TOKEN=<your HuggingFace token> -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES="4,5,6" -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host tgi_gaudi --model-id meta-llama/Llama-2-7b-hf
```
TGI Gaudi will now serve the model with FP8 by default. Please note that currently only Llama 2 and Mistral models support FP8 quantization.
## Launch Redis
```bash
docker pull redis/redis-stack:latest
docker compose -f langchain/docker/docker-compose-redis.yml up -d
```
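Before ingesting data, you can optionally confirm that the vector database is up. The container name `redis-vector-db` below comes from the Redis compose file; the `redis-cli ping` call is a simple liveness check and should print `PONG`:
```bash
# Optional liveness check for the Redis vector database container
docker ps --filter name=redis-vector-db
docker exec redis-vector-db redis-cli ping
```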
## Launch LangChain Docker
### Build LangChain Docker Image
```bash
cd langchain/docker/
bash ./build_docker.sh
```
### Launch LangChain Docker
Update the `HUGGINGFACEHUB_API_TOKEN` environment variable with your Hugging Face token in `docker-compose-langchain.yml`.
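For example, assuming `HUGGINGFACEHUB_API_TOKEN` is already exported in your shell and the file still contains the `<update-your-hugging-face-token>` placeholder, you could substitute it in place with a one-liner like the sketch below (editing the file by hand works just as well):
```bash
# Replace the token placeholder in the compose file with the exported value (illustrative)
sed -i "s|<update-your-hugging-face-token>|${HUGGINGFACEHUB_API_TOKEN}|" docker-compose-langchain.yml
```
Then launch the container: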
```bash
docker compose -f docker-compose-langchain.yml up -d
cd ../../
```
## Ingest data into Redis
Each time the Redis container is launched, the data needs to be ingested into the container. Follow these steps:
```bash
docker exec -it qna-rag-redis-server bash
cd /ws
python ingest.py
```
Note: `ingest.py` will download the embedding model; please set the proxy environment variables first if necessary.
# Start LangChain Server
## Start the Backend Service
Make sure the TGI Gaudi service is running and the data has been populated into Redis, then launch the backend service:
```bash
docker exec -it qna-rag-redis-server bash
nohup python app/server.py &
```
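Once the backend is up, you can optionally verify the `/v1/rag/chat` route directly. The sketch below assumes the service listens on port 8000 of the host (the default in `app/server.py`) and queries the default knowledge base:
```bash
# Query the RAG backend directly (adjust the host/port to your deployment)
curl http://localhost:8000/v1/rag/chat \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"query":"What is the revenue of Nike in 2023?","knowledge_base_id":"default"}'
```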
## Start the Frontend Service
Navigate to the "ui" folder and execute the following commands to start the frontend GUI:
```bash
cd ui
sudo apt-get install npm && \
npm install -g n && \
n stable && \
hash -r && \
npm install -g npm@latest
```
For CentOS, please use the following commands instead:
```bash
curl -sL https://rpm.nodesource.com/setup_20.x | sudo bash -
sudo yum install -y nodejs
```
Update the `DOC_BASE_URL` environment variable in the `.env` file by replacing the IP address '127.0.0.1' with the actual IP address.
Run the following command to install the required dependencies:
```bash
npm install
```
Start the development server by executing the following command:
```bash
nohup npm run dev &
```
This will initiate the frontend service and launch the application.


@@ -0,0 +1 @@
Will update soon.


@@ -0,0 +1,34 @@
import requests
import json
import argparse
import concurrent.futures
import random
def extract_qText(json_data):
    try:
        # Pick a random question from the local devtest.json dataset
        with open('devtest.json') as file:
            data = json.load(file)
        json_data = json.loads(json_data)
        json_data["inputs"] = data[random.randint(0, len(data) - 1)]["qText"]
        return json.dumps(json_data)
    except (json.JSONDecodeError, KeyError, IndexError):
        return None
def send_request(url, json_data):
headers = {'Content-Type': 'application/json'}
response = requests.post(url, data=json_data, headers=headers)
print(f"Question: {json_data} Response: {response.status_code} - {response.text}")
def main(url, json_data, concurrency):
with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as executor:
future_to_url = {executor.submit(send_request, url, extract_qText(json_data)): url for _ in range(concurrency*2)}
for future in concurrent.futures.as_completed(future_to_url):
_ = future_to_url[future]
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Concurrent client to send POST requests")
parser.add_argument("--url", type=str, default="http://localhost:12345", help="URL to send requests to")
parser.add_argument("--json_data", type=str, default='{"inputs":"Which NFL team won the Super Bowl in the 2010 season?","parameters":{"do_sample": true}}', help="JSON data to send")
parser.add_argument("--concurrency", type=int, default=100, help="Concurrency level")
args = parser.parse_args()
main(args.url, args.json_data, args.concurrency)
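# Example usage (endpoint and script name are illustrative; a devtest.json file with
# "qText" entries must be present in the working directory):
#   python client.py --url http://localhost:8080/generate --concurrency 16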

File diff suppressed because one or more lines are too long


@@ -0,0 +1,7 @@
HUGGING_FACE_HUB_TOKEN=<your-hf-token>
volume=./data
model=meta-llama/Llama-2-13b-chat-hf
MAX_TOTAL_TOKENS=2000
ENABLE_HPU_GRAPH=True
PT_HPU_ENABLE_LAZY_COLLECTIVES=true
OMPI_MCA_btl_vader_single_copy_mechanism=none


@@ -0,0 +1,9 @@
## Launch 8 models on 8 separate Gaudi2 cards:
Add your Hugging Face access token in `.env`. <br/>
Optionally, change the model name and the linked volume directory used to store the downloaded model.<br/><br/>
Run the following command in your terminal to launch the nginx load balancer and 8 instances of tgi_gaudi containers (one for each Gaudi card):
```
docker compose -f docker-compose.yml up -d
```
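After the stack is up, nginx listens on port 80 and distributes requests across the eight TGI instances. A quick check could look like the sketch below (host and prompt are just examples):
```
curl http://localhost:80/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":32}}'
```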


@@ -0,0 +1,135 @@
version: '3'
services:
gaudi0:
image: tgi_gaudi
runtime: habana
ports:
- "8081:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=0
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi1:
image: tgi_gaudi
runtime: habana
ports:
- "8082:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=1
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi2:
image: tgi_gaudi
runtime: habana
ports:
- "8083:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=2
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi3:
image: tgi_gaudi
runtime: habana
ports:
- "8084:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=3
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi4:
image: tgi_gaudi
runtime: habana
ports:
- "8085:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=4
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi5:
image: tgi_gaudi
runtime: habana
ports:
- "8086:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=5
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi6:
image: tgi_gaudi
runtime: habana
ports:
- "8087:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=6
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
gaudi7:
image: tgi_gaudi
runtime: habana
ports:
- "8088:80"
env_file:
- .env
environment:
- HABANA_VISIBLE_DEVICES=7
volumes:
- $volume:/data
cap_add:
- sys_nice
ipc: "host"
command: ["--model-id", "$model"]
nginx:
build: ./nginx
ports:
- "80:80"
depends_on:
- gaudi0
- gaudi1
- gaudi2
- gaudi3
- gaudi4
- gaudi5
- gaudi6
- gaudi7


@@ -0,0 +1,11 @@
# FROM nginx
# RUN rm /etc/nginx/conf.d/default.conf
# COPY nginx.conf /etc/nginx/conf.d/default.conf
FROM nginx:latest
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]


@@ -0,0 +1,23 @@
upstream backend {
least_conn;
server gaudi0:80 max_fails=3 fail_timeout=30s;
server gaudi1:80 max_fails=3 fail_timeout=30s;
server gaudi2:80 max_fails=3 fail_timeout=30s;
server gaudi3:80 max_fails=3 fail_timeout=30s;
server gaudi4:80 max_fails=3 fail_timeout=30s;
server gaudi5:80 max_fails=3 fail_timeout=30s;
server gaudi6:80 max_fails=3 fail_timeout=30s;
server gaudi7:80 max_fails=3 fail_timeout=30s;
}
server {
listen 80;
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}


@@ -0,0 +1,37 @@
FROM langchain/langchain
ARG http_proxy
ARG https_proxy
ENV http_proxy=$http_proxy
ENV https_proxy=$https_proxy
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y \
libgl1-mesa-glx \
libjemalloc-dev
RUN pip install --upgrade pip \
sentence-transformers \
redis \
unstructured \
unstructured[pdf] \
langchain-cli \
pydantic==1.10.13 \
langchain==0.1.12 \
poetry \
pymupdf \
easyocr \
langchain_benchmarks \
pyarrow \
jupyter \
intel-extension-for-pytorch \
intel-openmp
ENV PYTHONPATH=/ws:/qna-app/app
COPY qna-app /qna-app
COPY qna-app-no-rag /qna-app-no-rag
WORKDIR /qna-app
ENTRYPOINT ["/usr/bin/sleep", "infinity"]


@@ -0,0 +1,3 @@
#!/bin/bash
docker build . -t qna-rag-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy


@@ -0,0 +1,18 @@
version: '3'
services:
qna-rag-redis-server:
image: qna-rag-redis:latest
container_name: qna-rag-redis-server
environment:
- "REDIS_PORT=6379"
- "EMBED_MODEL=BAAI/bge-base-en-v1.5"
- "REDIS_SCHEMA=schema_dim_768.yml"
- "HUGGINGFACEHUB_API_TOKEN=<update-your-hugging-face-token>"
ulimits:
memlock:
soft: -1 # Set memlock to unlimited (no soft or hard limit)
hard: -1
volumes:
- ../redis:/ws
- ../test:/test
network_mode: "host"


@@ -0,0 +1,8 @@
version: '3'
services:
redis-vector-db:
image: redis/redis-stack:latest
container_name: redis-vector-db
ports:
- "6379:6379"
- "8001:8001"


@@ -0,0 +1,21 @@
FROM python:3.11-slim
RUN pip install poetry==1.6.1
RUN poetry config virtualenvs.create false
WORKDIR /code
COPY ./pyproject.toml ./README.md ./poetry.lock* ./
COPY ./package[s] ./packages
RUN poetry install --no-interaction --no-ansi --no-root
COPY ./app ./app
RUN poetry install --no-interaction --no-ansi
EXPOSE 8080
CMD exec uvicorn app.server:app --host 0.0.0.0 --port 8080


@@ -0,0 +1,79 @@
# my-app
## Installation
Install the LangChain CLI if you haven't yet
```bash
pip install -U langchain-cli
```
## Adding packages
```bash
# adding packages from
# https://github.com/langchain-ai/langchain/tree/master/templates
langchain app add $PROJECT_NAME
# adding custom GitHub repo packages
langchain app add --repo $OWNER/$REPO
# or with whole git string (supports other git providers):
# langchain app add git+https://github.com/hwchase17/chain-of-verification
# with a custom api mount point (defaults to `/{package_name}`)
langchain app add $PROJECT_NAME --api_path=/my/custom/path/rag
```
Note: you remove packages by their api path
```bash
langchain app remove my/custom/path/rag
```
## Setup LangSmith (Optional)
LangSmith will help us trace, monitor and debug LangChain applications.
LangSmith is currently in private beta, you can sign up [here](https://smith.langchain.com/).
If you don't have access, you can skip this section
```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # if not specified, defaults to "default"
```
## Launch LangServe
```bash
langchain serve
```
## Running in Docker
This project folder includes a Dockerfile that allows you to easily build and host your LangServe app.
### Building the Image
To build the image, you simply:
```shell
docker build . -t my-langserve-app
```
If you tag your image with something other than `my-langserve-app`,
note it for use in the next step.
### Running the Image Locally
To run the image, you'll need to include any environment variables
necessary for your application.
In the below example, we inject the `OPENAI_API_KEY` environment
variable with the value set in my local environment
(`$OPENAI_API_KEY`)
We also expose port 8080 with the `-p 8080:8080` option.
```shell
docker run -e OPENAI_API_KEY=$OPENAI_API_KEY -p 8080:8080 my-langserve-app
```


@@ -0,0 +1,51 @@
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# ========= Raw Q&A template prompt =========
template = """
Use the following pieces of context from retrieved
dataset to answer the question. Do not make up an answer if there is no
context provided to help answer it. Include the 'source' and 'start_index'
from the metadata included in the context you used to answer the question
Context:
---------
{context}
---------
Question: {question}
---------
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
# ========= contextualize prompt =========
contextualize_q_system_prompt = """Given a chat history and the latest user question \
which might reference context in the chat history, formulate a standalone question \
which can be understood without the chat history. Do NOT answer the question, \
just reformulate it if needed and otherwise return it as is."""
contextualize_q_prompt = ChatPromptTemplate.from_messages(
[
("system", contextualize_q_system_prompt),
MessagesPlaceholder(variable_name="chat_history"),
("human", "{question}"),
]
)
# ========= Q&A with history prompt =========
qa_system_prompt = """You are an assistant for question-answering tasks. \
Use the following pieces of retrieved context to answer the question. \
If you don't know the answer, just say that you don't know. \
Use three sentences maximum and keep the answer concise.\
{context}"""
qa_prompt = ChatPromptTemplate.from_messages(
[
("system", qa_system_prompt),
MessagesPlaceholder(variable_name="chat_history"),
("human", "{question}"),
]
)


@@ -0,0 +1,252 @@
import os
from fastapi import FastAPI, APIRouter, Request, UploadFile, File
from fastapi.responses import RedirectResponse, StreamingResponse, JSONResponse
from langserve import add_routes
from rag_redis.chain import chain as qna_rag_redis_chain
from starlette.middleware.cors import CORSMiddleware
from langchain_community.llms import HuggingFaceEndpoint
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.vectorstores import Redis
from langchain_core.output_parsers import StrOutputParser
from langchain_core.messages import HumanMessage
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from rag_redis.config import EMBED_MODEL, INDEX_NAME, REDIS_URL, INDEX_SCHEMA
from utils import (
create_retriever_from_files, reload_retriever, create_kb_folder,
get_current_beijing_time, create_retriever_from_links
)
from prompts import contextualize_q_prompt, qa_prompt
app = FastAPI()
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"])
class RAGAPIRouter(APIRouter):
def __init__(self, upload_dir, entrypoint) -> None:
super().__init__()
self.upload_dir = upload_dir
self.entrypoint = entrypoint
print(f"[rag - router] Initializing API Router, params:\n \
upload_dir={upload_dir}, entrypoint={entrypoint}")
# Define LLM
self.llm = HuggingFaceEndpoint(
endpoint_url=entrypoint,
max_new_tokens=512,
top_k=10,
top_p=0.95,
typical_p=0.95,
temperature=0.01,
repetition_penalty=1.03,
streaming=True,
)
print("[rag - router] LLM initialized.")
# Define LLM Chain
self.embeddings = HuggingFaceBgeEmbeddings(model_name=EMBED_MODEL)
rds = Redis.from_existing_index(
self.embeddings,
index_name=INDEX_NAME,
redis_url=REDIS_URL,
schema=INDEX_SCHEMA,
)
retriever = rds.as_retriever(search_type="mmr")
# Define contextualize chain
self.contextualize_q_chain = contextualize_q_prompt | self.llm | StrOutputParser()
# Define LLM chain
self.llm_chain = (
RunnablePassthrough.assign(
context=self.contextualized_question | retriever
)
| qa_prompt
| self.llm
)
print("[rag - router] LLM chain initialized.")
# Define chat history
self.chat_history = []
def contextualized_question(self, input: dict):
if input.get("chat_history"):
return self.contextualize_q_chain
else:
return input["question"]
def handle_rag_chat(self, query: str):
response = self.llm_chain.invoke({"question": query, "chat_history": self.chat_history})
result = response.split("</s>")[0]
self.chat_history.extend([HumanMessage(content=query), response])
return result
upload_dir = os.getenv("RAG_UPLOAD_DIR", "./upload_dir")
tgi_endpoint = os.getenv("TGI_ENDPOINT", "http://localhost:8080")
router = RAGAPIRouter(upload_dir, tgi_endpoint)
@router.post("/v1/rag/chat")
async def rag_chat(request: Request):
params = await request.json()
print(f"[rag - chat] POST request: /v1/rag/chat, params:{params}")
query = params['query']
kb_id = params.get("knowledge_base_id", "default")
print(f"[rag - chat] history: {router.chat_history}")
if kb_id == "default":
print(f"[rag - chat] use default knowledge base")
retriever = reload_retriever(router.embeddings, INDEX_NAME)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
elif kb_id.startswith("kb"):
new_index_name = INDEX_NAME + kb_id
print(f"[rag - chat] use knowledge base {kb_id}, index name is {new_index_name}")
retriever = reload_retriever(router.embeddings, new_index_name)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
else:
return JSONResponse(status_code=400, content={"message":"Wrong knowledge base id."})
return router.handle_rag_chat(query=query)
@router.post("/v1/rag/chat_stream")
async def rag_chat_stream(request: Request):
params = await request.json()
print(f"[rag - chat_stream] POST request: /v1/rag/chat_stream, params:{params}")
query = params['query']
kb_id = params.get("knowledge_base_id", "default")
print(f"[rag - chat_stream] history: {router.chat_history}")
if kb_id == "default":
retriever = reload_retriever(router.embeddings, INDEX_NAME)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
elif kb_id.startswith("kb"):
new_index_name = INDEX_NAME + kb_id
retriever = reload_retriever(router.embeddings, new_index_name)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
else:
return JSONResponse(status_code=400, content={"message":"Wrong knowledge base id."})
def stream_generator():
for text in router.llm_chain.stream({"question": query, "chat_history": router.chat_history}):
# print(f"[rag - chat_stream] text: {text}")
if text == " ":
yield f"data: @#$\n\n"
continue
if text.isspace():
continue
if "\n" in text:
yield f"data: <br/>\n\n"
new_text = text.replace(" ", "@#$")
yield f"data: {new_text}\n\n"
yield f"data: [DONE]\n\n"
return StreamingResponse(stream_generator(), media_type="text/event-stream")
@router.post("/v1/rag/create")
async def rag_create(file: UploadFile = File(...)):
filename = file.filename
if '/' in filename:
filename = filename.split('/')[-1]
print(f"[rag - create] POST request: /v1/rag/create, filename:{filename}")
kb_id, user_upload_dir, user_persist_dir = create_kb_folder(router.upload_dir)
# save file to local path
cur_time = get_current_beijing_time()
save_file_name = str(user_upload_dir) + '/' + cur_time + '-' + filename
with open(save_file_name, 'wb') as fout:
content = await file.read()
fout.write(content)
print(f"[rag - create] file saved to local path: {save_file_name}")
# create new retriever
try:
# get retrieval instance and reload db with new knowledge base
print("[rag - create] starting to create local db...")
index_name = INDEX_NAME + kb_id
retriever = create_retriever_from_files(save_file_name, router.embeddings, index_name)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
print(f"[rag - create] kb created successfully")
except Exception as e:
print(f"[rag - create] create knowledge base failed! {e}")
return JSONResponse(status_code=500, content={"message":"Fail to create new knowledge base."})
return {"knowledge_base_id": kb_id}
@router.post("/v1/rag/upload_link")
async def rag_upload_link(request: Request):
params = await request.json()
link_list = params['link_list']
print(f"[rag - upload_link] POST request: /v1/rag/upload_link, link list:{link_list}")
kb_id, user_upload_dir, user_persist_dir = create_kb_folder(router.upload_dir)
# create new retriever
try:
print("[rag - upload_link] starting to create local db...")
index_name = INDEX_NAME + kb_id
retriever = create_retriever_from_links(router.embeddings, link_list, index_name)
router.llm_chain = (
RunnablePassthrough.assign(
context=router.contextualized_question | retriever
)
| qa_prompt
| router.llm
)
print(f"[rag - upload_link] kb created successfully")
except Exception as e:
print(f"[rag - upload_link] create knowledge base failed! {e}")
return JSONResponse(status_code=500, content={"message":"Fail to create new knowledge base."})
return {"knowledge_base_id": kb_id}
app.include_router(router)
@app.get("/")
async def redirect_root_to_docs():
return RedirectResponse("/docs")
add_routes(app, qna_rag_redis_chain, path="/rag-redis")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)


@@ -0,0 +1,327 @@
import os
import re
import uuid
import requests
import unicodedata
import multiprocessing
from pathlib import Path
from bs4 import BeautifulSoup
from urllib.parse import urlparse, urlunparse
from datetime import timedelta, timezone, datetime
from langchain_community.document_loaders import UnstructuredFileLoader
from langchain_community.vectorstores import Redis
from langchain_core.documents import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from rag_redis.config import INDEX_SCHEMA, REDIS_URL
def get_current_beijing_time():
SHA_TZ = timezone(
timedelta(hours=8),
name='Asia/Shanghai'
)
utc_now = datetime.utcnow().replace(tzinfo=timezone.utc)
beijing_time = utc_now.astimezone(SHA_TZ).strftime("%Y-%m-%d-%H:%M:%S")
return beijing_time
def create_kb_folder(upload_dir):
kb_id = f"kb_{str(uuid.uuid1())[:8]}"
path_prefix = upload_dir
    # create a local folder for retrieval
cur_path = Path(path_prefix) / kb_id
os.makedirs(path_prefix, exist_ok=True)
cur_path.mkdir(parents=True, exist_ok=True)
user_upload_dir = Path(path_prefix) / f"{kb_id}/upload_dir"
user_persist_dir = Path(path_prefix) / f"{kb_id}/persist_dir"
user_upload_dir.mkdir(parents=True, exist_ok=True)
user_persist_dir.mkdir(parents=True, exist_ok=True)
print(f"[rag - create kb folder] upload path: {user_upload_dir}, persist path: {user_persist_dir}")
return kb_id, str(user_upload_dir), str(user_persist_dir)
class Crawler:
def __init__(self, pool=None):
if pool:
assert isinstance(pool, (str, list, tuple)), 'url pool should be str, list or tuple'
self.pool = pool
self.headers = {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng, \
*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, \
like Gecko) Chrome/113.0.0.0 Safari/537.36'
}
self.fetched_pool = set()
def get_sublinks(self, soup):
sublinks = []
for links in soup.find_all('a'):
sublinks.append(str(links.get('href')))
return sublinks
def get_hyperlink(self, soup, base_url):
sublinks = []
for links in soup.find_all('a'):
link = str(links.get('href'))
if link.startswith('#') or link is None or link == 'None':
continue
suffix = link.split('/')[-1]
if '.' in suffix and suffix.split('.')[-1] not in ['html', 'htmld']:
continue
link_parse = urlparse(link)
base_url_parse = urlparse(base_url)
if link_parse.path == '':
continue
if link_parse.netloc != '':
# keep crawler works in the same domain
if link_parse.netloc != base_url_parse.netloc:
continue
sublinks.append(link)
else:
sublinks.append(urlunparse((base_url_parse.scheme,
base_url_parse.netloc,
link_parse.path,
link_parse.params,
link_parse.query,
link_parse.fragment)))
return sublinks
def fetch(self, url, headers=None, max_times=5):
if not headers:
headers = self.headers
while max_times:
            if not url.startswith('http://') and not url.startswith('https://'):
url = 'http://' + url
print('start fetch %s...', url)
try:
response = requests.get(url, headers=headers, verify=True)
if response.status_code != 200:
print('fail to fetch %s, response status code: %s', url, response.status_code)
else:
return response
except Exception as e:
print('fail to fetch %s, caused by %s', url, e)
raise Exception(e)
max_times -= 1
return None
def process_work(self, sub_url, work):
response = self.fetch(sub_url)
if response is None:
return []
self.fetched_pool.add(sub_url)
soup = self.parse(response.text)
base_url = self.get_base_url(sub_url)
sublinks = self.get_hyperlink(soup, base_url)
if work:
work(sub_url, soup)
return sublinks
def crawl(self, pool, work=None, max_depth=10, workers=10):
url_pool = set()
for url in pool:
base_url = self.get_base_url(url)
response = self.fetch(url)
soup = self.parse(response.text)
sublinks = self.get_hyperlink(soup, base_url)
self.fetched_pool.add(url)
url_pool.update(sublinks)
depth = 0
while len(url_pool) > 0 and depth < max_depth:
print('current depth %s...', depth)
mp = multiprocessing.Pool(processes=workers)
results = []
for sub_url in url_pool:
if sub_url not in self.fetched_pool:
results.append(mp.apply_async(self.process_work, (sub_url, work)))
mp.close()
mp.join()
url_pool = set()
for result in results:
sublinks = result.get()
url_pool.update(sublinks)
depth += 1
def parse(self, html_doc):
soup = BeautifulSoup(html_doc, 'lxml')
return soup
def download(self, url, file_name):
print('download %s into %s...', url, file_name)
try:
r = requests.get(url, stream=True, headers=self.headers, verify=True)
f = open(file_name, "wb")
for chunk in r.iter_content(chunk_size=512):
if chunk:
f.write(chunk)
except Exception as e:
print('fail to download %s, caused by %s', url, e)
def get_base_url(self, url):
result = urlparse(url)
return urlunparse((result.scheme, result.netloc, '', '', '', ''))
def clean_text(self, text):
text = text.strip().replace('\r', '\n')
text = re.sub(' +', ' ', text)
text = re.sub('\n+', '\n', text)
text = text.split('\n')
return '\n'.join([i for i in text if i and i != ' '])
def uni_pro(text):
"""Check if the character is ASCII or falls in the category of non-spacing marks."""
normalized_text = unicodedata.normalize('NFKD', text)
filtered_text = ''
for char in normalized_text:
if ord(char) < 128 or unicodedata.category(char) == 'Mn':
filtered_text += char
return filtered_text
def load_html_data(url):
crawler = Crawler()
res = crawler.fetch(url)
if res == None:
return None
soup = crawler.parse(res.text)
all_text = crawler.clean_text(soup.select_one('body').text)
main_content = ''
for element_name in ['main', 'container']:
main_block = None
if soup.select(f'.{element_name}'):
main_block = soup.select(f'.{element_name}')
elif soup.select(f'#{element_name}'):
main_block = soup.select(f'#{element_name}')
if main_block:
for element in main_block:
text = crawler.clean_text(element.text)
if text not in main_content:
main_content += f'\n{text}'
main_content = crawler.clean_text(main_content)
main_content = main_content.replace('\n', '')
main_content = main_content.replace('\n\n', '')
main_content = uni_pro(main_content)
main_content = re.sub(r'\s+', ' ', main_content)
# {'text': all_text, 'main_content': main_content}
return main_content
def get_chuck_data(content, max_length, min_length, input):
"""Process the context to make it maintain a suitable length for the generation."""
sentences = re.split('(?<=[!.?])', content)
paragraphs = []
current_length = 0
count = 0
current_paragraph = ""
for sub_sen in sentences:
count +=1
sentence_length = len(sub_sen)
if current_length + sentence_length <= max_length:
current_paragraph += sub_sen
current_length += sentence_length
if count == len(sentences) and len(current_paragraph.strip())>min_length:
paragraphs.append([current_paragraph.strip() ,input])
else:
paragraphs.append([current_paragraph.strip() ,input])
current_paragraph = sub_sen
current_length = sentence_length
return paragraphs
def parse_html(input):
"""
Parse the uploaded file.
"""
chucks = []
for link in input:
if re.match(r'^https?:/{2}\w.+$', link):
content = load_html_data(link)
if content == None:
continue
chuck = [[content.strip(), link]]
chucks += chuck
else:
print("The given link/str {} cannot be parsed.".format(link))
return chucks
def document_transfer(data_collection):
"Transfer the raw document into langchain supported format."
documents = []
for data, meta in data_collection:
doc_id = str(uuid.uuid4())
metadata = {"source": meta, "identify_id":doc_id}
doc = Document(page_content=data, metadata=metadata)
documents.append(doc)
return documents
def create_retriever_from_files(doc, embeddings, index_name: str):
print(f"[rag - create retriever] create with index: {index_name}")
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1500, chunk_overlap=100, add_start_index=True
)
loader = UnstructuredFileLoader(doc, mode="single", strategy="fast")
chunks = loader.load_and_split(text_splitter)
rds = Redis.from_texts(
texts=[chunk.page_content for chunk in chunks],
metadatas=[chunk.metadata for chunk in chunks],
embedding=embeddings,
index_name=index_name,
redis_url=REDIS_URL,
index_schema=INDEX_SCHEMA,
)
retriever = rds.as_retriever(search_type="mmr")
return retriever
def create_retriever_from_links(embeddings, link_list: list, index_name):
data_collection = parse_html(link_list)
texts = []
metadatas = []
for data, meta in data_collection:
doc_id = str(uuid.uuid4())
metadata = {"source": meta, "identify_id":doc_id}
texts.append(data)
metadatas.append(metadata)
rds = Redis.from_texts(
texts=texts,
metadatas=metadatas,
embedding=embeddings,
index_name=index_name,
redis_url=REDIS_URL,
index_schema=INDEX_SCHEMA,
)
retriever = rds.as_retriever(search_type="mmr")
return retriever
def reload_retriever(embeddings, index_name):
print(f"[rag - reload retriever] reload with index: {index_name}")
rds = Redis.from_existing_index(
embeddings,
index_name=index_name,
redis_url=REDIS_URL,
schema=INDEX_SCHEMA,
)
retriever = rds.as_retriever(search_type="mmr")
return retriever


@@ -0,0 +1,23 @@
[tool.poetry]
name = "my-app"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]
readme = "README.md"
packages = [
{ include = "app" },
]
[tool.poetry.dependencies]
python = "^3.11"
uvicorn = "^0.23.2"
langserve = {extras = ["server"], version = ">=0.0.30"}
pydantic = "<2"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.15"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"


@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 LangChain, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,82 @@
import os
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis
from rag_redis.config import EMBED_MODEL, INDEX_NAME, INDEX_SCHEMA, REDIS_URL
from PIL import Image
import numpy as np
import io
def pdf_loader(file_path):
try:
import fitz # noqa:F401
import easyocr
except ImportError:
raise ImportError(
"`PyMuPDF` or 'easyocr' package is not found, please install it with "
"`pip install pymupdf or pip install easyocr.`"
)
doc = fitz.open(file_path)
reader = easyocr.Reader(['en'])
result =''
for i in range(doc.page_count):
page = doc.load_page(i)
pagetext = page.get_text().strip()
if pagetext:
result=result+pagetext
if len(doc.get_page_images(i)) > 0 :
for img in doc.get_page_images(i):
if img:
pageimg=''
xref = img[0]
img_data = doc.extract_image(xref)
img_bytes = img_data['image']
pil_image = Image.open(io.BytesIO(img_bytes))
img = np.array(pil_image)
img_result = reader.readtext(img, paragraph=True, detail=0)
pageimg=pageimg + ', '.join(img_result).strip()
if pageimg.endswith('!') or pageimg.endswith('?') or pageimg.endswith('.'):
pass
else:
pageimg=pageimg+'.'
result=result+pageimg
return result
def ingest_documents():
"""
Ingest PDF to Redis from the data/ directory that
contains Edgar 10k filings data for Nike.
"""
# Load list of pdfs
company_name = "Nike"
data_path = "data/"
doc_path = [os.path.join(data_path, file) for file in os.listdir(data_path)][0]
print("Parsing 10k filing doc for NIKE", doc_path) # noqa: T201
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1500, chunk_overlap=100, add_start_index=True
)
content = pdf_loader(doc_path)
chunks = text_splitter.split_text(content)
print("Done preprocessing. Created ", len(chunks), " chunks of the original pdf") # noqa: T201
# Create vectorstore
embedder = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
_ = Redis.from_texts(
# appending this little bit can sometimes help with semantic retrieval
# especially with multiple companies
texts=[f"Company: {company_name}. " + chunk for chunk in chunks],
embedding=embedder,
index_name=INDEX_NAME,
index_schema=INDEX_SCHEMA,
redis_url=REDIS_URL,
)
if __name__ == "__main__":
ingest_documents()


@@ -0,0 +1,31 @@
from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import UnstructuredFileLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis
from langchain_community.document_loaders import TextLoader
from rag_redis.config import EMBED_MODEL, INDEX_NAME, INDEX_SCHEMA, REDIS_URL
loader = DirectoryLoader('/ws/txt_files', glob="**/*.txt", show_progress=True, use_multithreading=True, loader_cls=TextLoader)
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1500, chunk_overlap=100, add_start_index=True
)
chunks = loader.load_and_split(text_splitter)
print("Done preprocessing. Created", len(chunks), "chunks of the original data") # noqa: T201
# Create vectorstore
embedder = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
company_name = "Intel"
_ = Redis.from_texts(
# appending this little bit can sometimes help with semantic retrieval
# especially with multiple companies
texts=[f"Company: {company_name}. " + chunk.page_content for chunk in chunks],
metadatas=[chunk.metadata for chunk in chunks],
embedding=embedder,
index_name=INDEX_NAME,
index_schema=INDEX_SCHEMA,
redis_url=REDIS_URL,
)


@@ -0,0 +1,82 @@
import os
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis
from rag_redis.config import EMBED_MODEL, INDEX_NAME, INDEX_SCHEMA, REDIS_URL
from PIL import Image
import numpy as np
import io
def pdf_loader(file_path):
try:
import fitz # noqa:F401
import easyocr
except ImportError:
raise ImportError(
"`PyMuPDF` or 'easyocr' package is not found, please install it with "
"`pip install pymupdf or pip install easyocr.`"
)
doc = fitz.open(file_path)
reader = easyocr.Reader(['en'])
result =''
for i in range(doc.page_count):
page = doc.load_page(i)
pagetext = page.get_text().strip()
if pagetext:
result=result+pagetext
if len(doc.get_page_images(i)) > 0 :
for img in doc.get_page_images(i):
if img:
pageimg=''
xref = img[0]
img_data = doc.extract_image(xref)
img_bytes = img_data['image']
pil_image = Image.open(io.BytesIO(img_bytes))
img = np.array(pil_image)
img_result = reader.readtext(img, paragraph=True, detail=0)
pageimg=pageimg + ', '.join(img_result).strip()
if pageimg.endswith('!') or pageimg.endswith('?') or pageimg.endswith('.'):
pass
else:
pageimg=pageimg+'.'
result=result+pageimg
return result
def ingest_documents():
"""
Ingest PDF to Redis from the data/ directory that
contains Intel manuals.
"""
# Load list of pdfs
company_name = "Intel"
data_path = "data_intel/"
doc_path = [os.path.join(data_path, file) for file in os.listdir(data_path)][0]
print("Parsing Intel architecture manuals", doc_path) # noqa: T201
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1500, chunk_overlap=100, add_start_index=True
)
content = pdf_loader(doc_path)
chunks = text_splitter.split_text(content)
print("Done preprocessing. Created", len(chunks), "chunks of the original pdf") # noqa: T201
# Create vectorstore
embedder = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
_ = Redis.from_texts(
# appending this little bit can sometimes help with semantic retrieval
# especially with multiple companies
texts=[f"Company: {company_name}. " + chunk for chunk in chunks],
embedding=embedder,
index_name=INDEX_NAME,
index_schema=INDEX_SCHEMA,
redis_url=REDIS_URL,
)
if __name__ == "__main__":
ingest_documents()


@@ -0,0 +1,88 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "681a5d1e",
"metadata": {},
"source": [
"## Connect to RAG App\n",
"\n",
"Assuming you are already running this server:\n",
"```bash\n",
"langserve start\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "d774be2a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Nike's revenue in 2023 was $51.2 billion. \n",
"\n",
"Source: 'data/nke-10k-2023.pdf', Start Index: '146100'\n"
]
}
],
"source": [
"from langserve.client import RemoteRunnable\n",
"\n",
"rag_redis = RemoteRunnable(\"http://localhost:8000/rag-redis\")\n",
"\n",
"print(rag_redis.invoke(\"What was Nike's revenue in 2023?\"))"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "07ae0005",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"As of May 31, 2023, Nike had approximately 83,700 employees worldwide. This information can be found in the first piece of context provided. (source: data/nke-10k-2023.pdf, start_index: 32532)\n"
]
}
],
"source": [
"print(rag_redis.invoke(\"How many employees work at Nike?\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a6b9f00",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,82 @@
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_community.llms import HuggingFaceEndpoint
import intel_extension_for_pytorch as ipex
import torch
from rag_redis.config import (
EMBED_MODEL,
INDEX_NAME,
INDEX_SCHEMA,
REDIS_URL,
TGI_ENDPOINT,
)
# Make this look better in the docs.
class Question(BaseModel):
__root__: str
# Init Embeddings
embedder = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
embedder.client= ipex.optimize(embedder.client.eval(), dtype=torch.bfloat16)
#Setup semantic cache for LLM
from langchain.cache import RedisSemanticCache
from langchain.globals import set_llm_cache
set_llm_cache(RedisSemanticCache(
embedding=embedder,
redis_url=REDIS_URL
))
# Connect to pre-loaded vectorstore
# run the ingest.py script to populate this
vectorstore = Redis.from_existing_index(
embedding=embedder, index_name=INDEX_NAME, schema=INDEX_SCHEMA, redis_url=REDIS_URL
)
# TODO allow user to change parameters
retriever = vectorstore.as_retriever(search_type="mmr")
# Define our prompt
template = """
Use the following pieces of context from retrieved
dataset to answer the question. Do not make up an answer if there is no
context provided to help answer it. Include the 'source' and 'start_index'
from the metadata included in the context you used to answer the question
Context:
---------
{context}
---------
Question: {question}
---------
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
# RAG Chain
model = HuggingFaceEndpoint(
endpoint_url=TGI_ENDPOINT,
max_new_tokens=512,
top_k=10,
top_p=0.95,
typical_p=0.95,
temperature=0.01,
repetition_penalty=1.03,
streaming=True,
truncate=1024
)
chain = (
RunnableParallel({"context": retriever, "question": RunnablePassthrough()})
| prompt
| model
| StrOutputParser()
).with_types(input_type=Question)


@@ -0,0 +1,59 @@
from langchain_community.chat_models import ChatOpenAI
from langchain_community.llms import Ollama
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_community.llms import HuggingFaceEndpoint
from langchain.callbacks import streaming_stdout
import intel_extension_for_pytorch as ipex
import torch
from rag_redis.config import (
EMBED_MODEL,
INDEX_NAME,
INDEX_SCHEMA,
REDIS_URL,
TGI_ENDPOINT_NO_RAG,
)
# Make this look better in the docs.
class Question(BaseModel):
__root__: str
# Define our prompt
template = """
Answer the question
---------
Question: {question}
---------
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
# RAG Chain
callbacks = [streaming_stdout.StreamingStdOutCallbackHandler()]
model = HuggingFaceEndpoint(
endpoint_url=TGI_ENDPOINT_NO_RAG,
max_new_tokens=512,
top_k=10,
top_p=0.95,
typical_p=0.95,
temperature=0.01,
repetition_penalty=1.03,
streaming=True,
truncate=1024
)
chain = (
RunnableParallel({"question": RunnablePassthrough()})
| prompt
| model
| StrOutputParser()
).with_types(input_type=Question)


@@ -0,0 +1,81 @@
import os
def get_boolean_env_var(var_name, default_value=False):
"""Retrieve the boolean value of an environment variable.
Args:
var_name (str): The name of the environment variable to retrieve.
default_value (bool): The default value to return if the variable
is not found.
Returns:
bool: The value of the environment variable, interpreted as a boolean.
"""
true_values = {"true", "1", "t", "y", "yes"}
false_values = {"false", "0", "f", "n", "no"}
# Retrieve the environment variable's value
value = os.getenv(var_name, "").lower()
# Decide the boolean value based on the content of the string
if value in true_values:
return True
elif value in false_values:
return False
else:
return default_value
# Check for openai API key
#if "OPENAI_API_KEY" not in os.environ:
# raise Exception("Must provide an OPENAI_API_KEY as an env var.")
# Whether or not to enable langchain debugging
DEBUG = get_boolean_env_var("DEBUG", False)
# Set DEBUG env var to "true" if you wish to enable LC debugging module
if DEBUG:
import langchain
langchain.debug = True
# Embedding model
EMBED_MODEL = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
# Redis Connection Information
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = int(os.getenv("REDIS_PORT", 6379))
def format_redis_conn_from_env():
redis_url = os.getenv("REDIS_URL", None)
if redis_url:
return redis_url
else:
using_ssl = get_boolean_env_var("REDIS_SSL", False)
start = "rediss://" if using_ssl else "redis://"
# if using RBAC
password = os.getenv("REDIS_PASSWORD", None)
username = os.getenv("REDIS_USERNAME", "default")
if password is not None:
start += f"{username}:{password}@"
return start + f"{REDIS_HOST}:{REDIS_PORT}"
REDIS_URL = format_redis_conn_from_env()
# Vector Index Configuration
INDEX_NAME = os.getenv("INDEX_NAME", "rag-redis")
current_file_path = os.path.abspath(__file__)
parent_dir = os.path.dirname(current_file_path)
REDIS_SCHEMA = os.getenv("REDIS_SCHEMA", "schema.yml")
schema_path = os.path.join(parent_dir, REDIS_SCHEMA)
INDEX_SCHEMA = schema_path
TGI_ENDPOINT = os.getenv("TGI_ENDPOINT", "http://localhost:8080")
TGI_ENDPOINT_NO_RAG = os.getenv("TGI_ENDPOINT_NO_RAG", "http://localhost:8081")


@@ -0,0 +1,11 @@
text:
- name: content
- name: source
numeric:
- name: start_index
vector:
- name: content_vector
algorithm: HNSW
datatype: FLOAT32
dims: 384
distance_metric: COSINE


@@ -0,0 +1,11 @@
text:
- name: content
- name: source
numeric:
- name: start_index
vector:
- name: content_vector
algorithm: HNSW
datatype: FLOAT32
dims: 1024
distance_metric: COSINE


@@ -0,0 +1,11 @@
text:
- name: content
- name: source
numeric:
- name: start_index
vector:
- name: content_vector
algorithm: HNSW
datatype: FLOAT32
dims: 768
distance_metric: COSINE


@@ -0,0 +1,15 @@
text:
- name: content
- name: changefreq
- name: description
- name: language
- name: loc
- name: priority
- name: source
- name: title
vector:
- name: content_vector
algorithm: HNSW
datatype: FLOAT32
dims: 768
distance_metric: COSINE


@@ -0,0 +1,18 @@
## Performance measurements of chain with LangSmith
Prerequisite: Sign up for LangSmith [https://www.langchain.com/langsmith] and get the API token <br />
### Steps to run perf measurements
1. Build the langchain-rag container with the most up-to-date Dockerfile
2. Start the TGI server on a system with Gaudi
3. Start the Redis container with docker-compose-redis.yml
4. Add your Hugging Face access token in docker-compose-langchain.yml and start the langchain-rag-server container
5. Enter the langchain-rag-server container and start the Jupyter notebook server (you can specify the needed IP address; Jupyter will run on port 8888):
```
docker exec -it langchain-rag-server bash
cd /test
jupyter notebook --allow-root --ip=X.X.X.X
```
6. Launch Jupyter notebook in your browser and open the tgi_gaudi.ipynb notebook
7. Add your LangSmith API key in the first cell of the notebook [os.environ["LANGCHAIN_API_KEY"] = "add-your-langsmith-key" # Your API key]; alternatively, export the variables before launching Jupyter as shown below
8. Clear all the cells and run all the cells
9. The output of the last cell, which calls client.run_on_dataset(), runs the LangChain Q&A test and captures measurements on the LangSmith server. The URL for accessing the test results can be obtained from the output of that command
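As an alternative to editing the notebook cell in step 7, the same settings can be exported in the container shell before launching Jupyter (variable names follow the LangSmith section earlier in this repository; values are placeholders):
```
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-langsmith-api-key>
export LANGCHAIN_PROJECT=<your-project>  # optional, defaults to "default"
```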

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -0,0 +1 @@
Will update soon.


@@ -0,0 +1,5 @@
#!/bin/bash
git clone https://github.com/huggingface/tgi-gaudi.git
cd ./tgi-gaudi/
docker build -t tgi_gaudi . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy


@@ -0,0 +1,36 @@
#!/bin/bash
# Set default values
default_port=8080
default_model="Intel/neural-chat-7b-v3-3"
default_num_cards=1
# Check if all required arguments are provided
if [ "$#" -lt 0 ] || [ "$#" -gt 3 ]; then
echo "Usage: $0 [num_cards] [port_number] [model_name]"
exit 1
fi
# Assign arguments to variables
num_cards=${1:-$default_num_cards}
port_number=${2:-$default_port}
model_name=${3:-$default_model}
# Check if num_cards is within the valid range (1-8)
if [ "$num_cards" -lt 1 ] || [ "$num_cards" -gt 8 ]; then
echo "Error: num_cards must be between 1 and 8."
exit 1
fi
# Set the volume variable
volume=$PWD/data
# Build the Docker run command based on the number of cards
if [ "$num_cards" -eq 1 ]; then
docker_cmd="docker run -p $port_number:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy tgi_gaudi --model-id $model_name"
else
docker_cmd="docker run -p $port_number:80 -v $volume:/data --runtime=habana -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy tgi_gaudi --model-id $model_name --sharded true --num-shard $num_cards"
fi
# Execute the Docker run command
eval $docker_cmd


@@ -0,0 +1 @@
Will update soon.

ChatQnA/ui/.editorconfig Normal file

@@ -0,0 +1,10 @@
[*]
indent_style = tab
[package.json]
indent_style = space
indent_size = 2
[*.md]
indent_style = space
indent_size = 2

ChatQnA/ui/.env Normal file

@@ -0,0 +1 @@
DOC_BASE_URL = 'http://xxx.xxx.xxx.xxx:8000/v1/rag'

ChatQnA/ui/.eslintignore Normal file

@@ -0,0 +1,13 @@
.DS_Store
node_modules
/build
/.svelte-kit
/package
.env
.env.*
!.env.example
# Ignore files for PNPM, NPM and YARN
pnpm-lock.yaml
package-lock.json
yarn.lock

ChatQnA/ui/.eslintrc.cjs Normal file

@@ -0,0 +1,24 @@
module.exports = {
root: true,
parser: "@typescript-eslint/parser",
extends: [
"eslint:recommended",
"plugin:@typescript-eslint/recommended",
"prettier",
],
plugins: ["svelte3", "@typescript-eslint", "neverthrow"],
ignorePatterns: ["*.cjs"],
overrides: [{ files: ["*.svelte"], processor: "svelte3/svelte3" }],
settings: {
"svelte3/typescript": () => require("typescript"),
},
parserOptions: {
sourceType: "module",
ecmaVersion: 2020,
},
env: {
browser: true,
es2017: true,
node: true,
},
};


@@ -0,0 +1,13 @@
.DS_Store
node_modules
/build
/.svelte-kit
/package
.env
.env.*
!.env.example
# Ignore files for PNPM, NPM and YARN
pnpm-lock.yaml
package-lock.json
yarn.lock

ChatQnA/ui/.prettierrc Normal file

@@ -0,0 +1,13 @@
{
"pluginSearchDirs": [
"."
],
"overrides": [
{
"files": "*.svelte",
"options": {
"parser": "svelte"
}
}
]
}

ChatQnA/ui/README.md Normal file

@@ -0,0 +1,34 @@
<h1 align="center" id="title"> ChatQnA Customized UI</h1>
### 📸 Project Screenshots
![project-screenshot](https://i.imgur.com/26zMnEr.png)
![project-screenshot](https://i.imgur.com/fZbOiTk.png)
![project-screenshot](https://i.imgur.com/FnY3MuU.png)
<h2>🧐 Features</h2>
Here are some of the project's features:
- Start a Text Chat: Initiate a text chat by typing your input; the dialogue content can also be customized based on uploaded files.
- Upload File: Choose between uploading a local file or pasting a remote link, then chat against the uploaded knowledge base.
- Clear: Clear the record of the current dialog box without retaining the contents of the dialog box.
- Chat History: Historical chat records are retained after refreshing, making it easier for users to review the context.
- Scroll to Bottom / Top: The chat automatically scrolls to the bottom. Users can also click the top icon to jump to the top of the chat record.
- End-to-End Time: Shows the time spent on the current conversation.
<h2>🛠️ Get it Running:</h2>
1. Clone the repo.
2. `cd` into this folder.
3. Modify the required .env variables.
```
DOC_BASE_URL = ''
```
4. Execute `npm install` to install the corresponding dependencies.
5. Execute `npm run dev` in both environments (see the consolidated sketch below).
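Putting the steps above together, a typical setup could look like the following sketch; the repository path and the backend address are placeholders for your environment:
```
cd ChatQnA/ui
# Point the UI at your backend server (format follows the provided .env file)
echo "DOC_BASE_URL = 'http://<backend-host>:8000/v1/rag'" > .env
npm install
npm run dev
```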

ChatQnA/ui/package-lock.json generated Normal file

File diff suppressed because it is too large

ChatQnA/ui/package.json Normal file

@@ -0,0 +1,58 @@
{
"name": "sveltekit-auth-example",
"version": "0.0.1",
"private": true,
"scripts": {
"dev": "vite dev --port 80 --host 0.0.0.0",
"build": "vite build",
"preview": "vite preview",
"check": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json",
"check:watch": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json --watch",
"lint": "prettier --check . && eslint .",
"format": "prettier --write ."
},
"devDependencies": {
"@fortawesome/free-solid-svg-icons": "6.2.0",
"@sveltejs/adapter-auto": "1.0.0-next.75",
"@sveltejs/kit": "^1.20.1",
"@tailwindcss/typography": "0.5.7",
"@types/debug": "4.1.7",
"@typescript-eslint/eslint-plugin": "^5.27.0",
"@typescript-eslint/parser": "^5.27.0",
"autoprefixer": "^10.4.7",
"daisyui": "3.5.1",
"date-picker-svelte": "^2.6.0",
"debug": "4.3.4",
"eslint": "^8.16.0",
"eslint-config-prettier": "^8.3.0",
"eslint-plugin-neverthrow": "1.1.4",
"eslint-plugin-svelte3": "^4.0.0",
"flowbite-svelte": "^0.44.4",
"postcss": "^8.4.23",
"postcss-load-config": "^4.0.1",
"postcss-preset-env": "^8.3.2",
"prettier": "^2.8.8",
"prettier-plugin-svelte": "^2.7.0",
"prettier-plugin-tailwindcss": "^0.3.0",
"svelte": "^3.59.1",
"svelte-check": "^2.7.1",
"svelte-fa": "3.0.3",
"svelte-preprocess": "^4.10.7",
"tailwindcss": "^3.1.5",
"tslib": "^2.3.1",
"typescript": "^4.7.4",
"vite": "^4.3.9"
},
"type": "module",
"dependencies": {
"date-fns": "^2.30.0",
"driver.js": "^1.3.0",
"flowbite-svelte-icons": "^1.4.0",
"fuse.js": "^6.6.2",
"lodash": "^4.17.21",
"ramda": "^0.29.0",
"sse.js": "^0.6.1",
"svelte-notifications": "^0.9.98",
"svrollbar": "^0.12.0"
}
}


@@ -0,0 +1,13 @@
const tailwindcss = require('tailwindcss');
const autoprefixer = require('autoprefixer');
const config = {
plugins: [
//Some plugins, like tailwindcss/nesting, need to run before Tailwind,
tailwindcss(),
//But others, like autoprefixer, need to run after,
autoprefixer
]
};
module.exports = config;

5
ChatQnA/ui/src/app.d.ts vendored Normal file
View File

@@ -0,0 +1,5 @@
// See: https://kit.svelte.dev/docs/types#app
// import { Result} from "neverthrow";
interface Window {
deviceType: string;
}

14
ChatQnA/ui/src/app.html Normal file
View File

@@ -0,0 +1,14 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<link rel="icon" href="%sveltekit.assets%/favicon.png" />
<meta name="viewport" content="width=device-width" />
%sveltekit.head%
</head>
<body>
<div class="h-full w-full">
%sveltekit.body%
</div>
</body>
</html>

View File

@@ -0,0 +1,86 @@
/* Write your global styles here, in PostCSS syntax */
@tailwind base;
@tailwind components;
@tailwind utilities;
html, body {
height: 100%;
}
.btn {
@apply flex-nowrap;
}
a.btn {
@apply no-underline;
}
.input {
@apply text-base;
}
.bg-dark-blue {
background-color: #004a86;
}
.bg-light-blue {
background-color: #0068b5;
}
.bg-turquoise {
background-color: #00a3f6;
}
.bg-header {
background-color: #ffffff;
}
.bg-button {
background-color: #0068b5;
}
.bg-title {
background-color: #f7f7f7;
}
.text-header {
color: #0068b5;
}
.text-button {
color: #252e47;
}
.text-title-color {
color: rgb(38,38,38);
}
.font-intel {
font-family: "intel-clear","tahoma",Helvetica,"helvetica",Arial,sans-serif;
}
.font-title-intel {
font-family: "intel-one","intel-clear",Helvetica,Arial,sans-serif;
}
.bg-footer {
background-color: #e7e7e7;
}
.bg-light-green {
background-color: #d7f3a1;
}
.bg-purple {
background-color: #653171;
}
.bg-dark-blue {
background-color: #224678;
}
.border-input-color {
border-color: #605e5c;
}
.w-12\/12 {
width: 100%
}

View File

@@ -0,0 +1,14 @@
<script lang="ts">
import { createEventDispatcher } from "svelte";
let dispatch = createEventDispatcher();
</script>
<!-- svelte-ignore a11y-click-events-have-key-events -->
<svg
class="absolute top-0 right-0 hover:opacity-70"
on:click={() => {
dispatch('DeleteAvatar') }}
viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" width="20" height="20">
<path d="M512 832c-176.448 0-320-143.552-320-320S335.552 192 512 192s320 143.552 320 320-143.552 320-320 320m0-704C300.256 128 128 300.256 128 512s172.256 384 384 384 384-172.256 384-384S723.744 128 512 128" fill="#bbbbbb"></path><path d="M649.824 361.376a31.968 31.968 0 0 0-45.248 0L505.6 460.352l-98.976-98.976a31.968 31.968 0 1 0-45.248 45.248l98.976 98.976-98.976 98.976a32 32 0 0 0 45.248 45.248l98.976-98.976 98.976 98.976a31.904 31.904 0 0 0 45.248 0 31.968 31.968 0 0 0 0-45.248L550.848 505.6l98.976-98.976a31.968 31.968 0 0 0 0-45.248" fill="#bbbbbb"></path>
</svg>

View File

@@ -0,0 +1,28 @@
<!-- <svg
width="35"
height="35"
viewBox="0 0 48 48"
fill="none"
xmlns="http://www.w3.org/2000/svg"
>
<g clip-path="url(#clip0_16_93)">
<rect x="0.5" y="0.238312" width="47" height="47" fill="#0068B5" />
<path
d="M39.51 0.238312H8.49C4.0955 0.238312 0.5 3.83381 0.5 8.22831V39.2483C0.5 43.6428 4.0955 47.2383 8.49 47.2383H39.51C43.9045 47.2383 47.5 43.6428 47.5 39.2483V8.22831C47.5 3.83381 43.9045 0.238312 39.51 0.238312ZM44.915 39.2483C44.915 42.2328 42.4945 44.6533 39.51 44.6533H8.49C5.5055 44.6533 3.085 42.2328 3.085 39.2483V8.22831C3.085 5.24381 5.5055 2.82331 8.49 2.82331H39.51C42.4945 2.82331 44.915 5.24381 44.915 8.22831V39.2483Z"
fill="#0068B5"
/>
<path
d="M9.52393 21.3178H11.7094L11.7094 29.3548H9.52393V21.3178ZM20.3574 22.2108C20.1694 21.9523 19.8874 21.7408 19.4879 21.5763C19.1119 21.4118 18.6889 21.3178 18.2424 21.3178C17.2084 21.3178 16.3389 21.7643 15.6574 22.6338V21.4823H13.7304V29.3078H15.7984V25.7593C15.7984 24.8898 15.8454 24.2788 15.9629 23.9498C16.0569 23.6208 16.2684 23.3623 16.5504 23.1743C16.8324 22.9863 17.1614 22.8688 17.5139 22.8688C17.7959 22.8688 18.0309 22.9393 18.2424 23.0803C18.4304 23.2213 18.5949 23.4093 18.6654 23.6678C18.7594 23.9263 18.8064 24.4668 18.8064 25.3128V29.3078H20.8744V24.4433C20.8744 23.8323 20.8274 23.3858 20.7569 23.0568C20.6864 22.7513 20.5689 22.4693 20.3574 22.2108ZM25.7389 27.8038C25.5979 27.8038 25.4804 27.7803 25.3864 27.7098C25.2924 27.6393 25.2219 27.5453 25.1984 27.4513C25.1749 27.3573 25.1514 26.9813 25.1514 26.3233V23.1508H26.5614V21.5058H25.1514V18.7563L23.0834 19.9548V21.5058V23.1508V26.5583C23.0834 27.2868 23.1069 27.7803 23.1539 28.0153C23.2009 28.3443 23.2949 28.6263 23.4359 28.8143C23.5769 29.0023 23.7884 29.1668 24.0939 29.3078C24.3994 29.4253 24.7284 29.4958 25.1044 29.4958C25.7154 29.4958 26.2559 29.4018 26.7494 29.1903L26.5614 27.5923C26.2089 27.7333 25.9269 27.8038 25.7389 27.8038ZM33.7524 22.4928C33.0709 21.7173 32.1544 21.3413 31.0029 21.3413C29.9689 21.3413 29.0994 21.7173 28.4414 22.4458C27.7599 23.1743 27.4309 24.1848 27.4309 25.5008C27.4309 26.5818 27.6894 27.4748 28.2064 28.2033C28.8644 29.0963 29.8749 29.5428 31.2379 29.5428C32.1074 29.5428 32.8124 29.3548 33.3764 28.9553C33.9404 28.5558 34.3634 27.9918 34.6219 27.2163L32.5539 26.8638C32.4364 27.2633 32.2719 27.5453 32.0604 27.7098C31.8489 27.8743 31.5669 27.9683 31.2379 27.9683C30.7679 27.9683 30.3684 27.8038 30.0394 27.4513C29.7104 27.0988 29.5459 26.6288 29.5459 26.0178H34.7394C34.7394 24.4433 34.4339 23.2448 33.7524 22.4928ZM29.5694 24.7488C29.5694 24.1848 29.7104 23.7383 29.9924 23.4093C30.2979 23.0803 30.6504 22.9158 31.1204 22.9158C31.5434 22.9158 31.8959 23.0803 32.2014 23.3858C32.5069 23.6913 32.6479 24.1613 32.6714 24.7488H29.5694ZM36.4079 18.5448H38.4759V29.3548H36.4079V18.5448Z"
fill="white"
/>
<path
d="M9.52393 18.5448H11.7094L11.7094 20.5654H9.52393V18.5448ZM39.2058 53.1889C59.7131 70.5741 37.9465 53.1367 37.547 52.9722C60.5267 71.228 41.5876 53.1889 41.1411 53.1889C40.1071 53.1889 54.2638 57.2959 53.5823 58.1654L44.3775 54.0099L42.8 56.0803L44.9335 56.0763L43.617 55.1029L49.2888 57.4321C49.2888 56.5626 69.0838 68.5409 41.665 52.9722C67.9574 69.2353 48.7539 58.3534 49.0359 58.1654C49.3179 57.9774 72.2331 77.3305 48.0529 59.0448C73.8431 77.373 40.6532 52.2185 40.8647 52.3595C64.5928 69.3279 66.2469 69.734 44.0477 53.3531C68.4587 70.8049 45.1808 54.42 45.1808 55.266L49.6436 57.6191L50.8176 56.2254L46.645 54.7317C46.645 54.1207 47.0599 55.184 46.9894 54.855C46.9189 54.5495 63.0924 72.6928 39.2058 53.1889ZM45.3834 56.0442C45.2424 56.0442 60.49 64.1373 43.0764 53.1889C59.6606 67.1938 58.0346 62.1756 40.8647 50.7007C58.8678 64.6804 43.7296 53.3942 43.7296 52.7362L43.617 55.1029L43.3529 52.3595L44.7353 53.7418L43.0764 53.1889L44.244 54.855L46.1176 55.6771L42.8 57.336L45.5647 53.1889L41.9705 49.5948L46.1176 55.1029L46.3941 55.6771C46.3941 56.4056 44.3403 54.3363 44.3873 54.5713C65.2775 66.4664 68.0297 70.4029 45.348 56.6803C69.965 73.7705 43.9793 55.5361 44.2848 55.6771C44.5903 55.7946 60.4832 66.2088 41.9705 53.7418C42.5815 53.7418 44.8545 53.1837 45.348 52.9722L43.7511 52.3595C43.3986 52.5005 45.5714 56.0442 45.3834 56.0442ZM44.0342 56.5108C43.3527 55.7353 45.3338 56.783 44.1823 56.783C43.1483 56.783 44.9043 55.6048 44.2463 56.3333C43.5648 57.0618 43.7511 51.0435 43.7511 52.3595C43.7511 53.4405 43.6653 53.0133 44.1823 53.7418C44.8403 54.6348 41.7134 54.2598 43.0764 54.2598C43.9459 54.2598 43.4702 56.9103 44.0342 56.5108C44.5982 56.1113 44.1288 57.5428 44.3873 56.7673L43.7511 56.2254C55.3795 71.8986 44.3938 54.9384 44.1823 55.1029C43.9708 55.2674 44.0801 54.2598 43.7511 54.2598C56.2643 69.3767 58.4567 71.4935 44.1823 55.1029C57.894 68.7712 44.3873 57.3783 44.3873 56.7673L44.1823 56.945C44.1823 55.3705 44.7157 57.2628 44.0342 56.5108ZM44.3873 54.5713C44.3873 54.0073 43.7522 56.8398 44.0342 56.5108C44.3397 56.1818 43.495 56.2254 43.965 56.2254C44.388 56.2254 55.4258 75.7185 43.7511 56.2254C44.0566 56.5309 44.1588 56.1955 44.1823 56.783L44.3873 54.5713Z"
fill="#00C7FD"
/>
</g>
<defs>
<clipPath id="clip0_16_93">
<rect x="0.5" y="0.238312" width="47" height="47" fill="white" />
</clipPath>
</defs>
</svg> -->

View File

@@ -0,0 +1,52 @@
<script lang="ts">
export let overrideClasses = "";
const classes = overrideClasses ? overrideClasses : `w-5 h-5 text-gray-400`;
</script>
<!-- <svg
class={classes}
width="10"
height="10"
fill="none"
viewBox="0 0 18 18"
style="min-width: 18px; min-height: 18px;"
><g
><path
fill="#3369FF"
d="M15.71 8.019 3.835 1.368a1.125 1.125 0 0 0-1.61 1.36l2.04 5.71h5.298a.562.562 0 1 1 0 1.125H4.264l-2.04 5.71a1.128 1.128 0 0 0 1.058 1.506c.194 0 .384-.05.552-.146l11.877-6.65a1.125 1.125 0 0 0 0-1.964Z"
/></g
></svg
> -->
<!--
<svg
class={classes}
xmlns="http://www.w3.org/2000/svg"
fill="none"
viewBox="0 0 24 24"
stroke-width="1.5"
stroke="currentColor"
>
<path
stroke-linecap="round"
stroke-linejoin="round"
d="M6 12L3.269 3.126A59.768 59.768 0 0121.485 12 59.77 59.77 0 013.27 20.876L5.999 12zm0 0h7.5"
/>
</svg> -->
<svg
t="1708926517502"
class={classes}
viewBox="0 0 1024 1024"
version="1.1"
xmlns="http://www.w3.org/2000/svg"
p-id="4586"
id="mx_n_1708926517503"
width="200"
height="200"
><path
d="M0 1024l106.496-474.112 588.8-36.864-588.8-39.936-106.496-473.088 1024 512z"
p-id="4587"
fill="#0068b5"
/></svg
>

View File

@@ -0,0 +1,10 @@
<!-- <svg
viewBox="0 0 1024 1024"
version="1.1"
xmlns="http://www.w3.org/2000/svg"
width="32"
height="32"
>
<path d="M512 512c93.866667 0 170.666667-76.8 170.666667-170.666667 0-93.866667-76.8-170.666667-170.666667-170.666667C418.133333 170.666667 341.333333 247.466667 341.333333 341.333333 341.333333 435.2 418.133333 512 512 512zM512 597.333333c-115.2 0-341.333333 55.466667-341.333333 170.666667l0 85.333333 682.666667 0 0-85.333333C853.333333 652.8 627.2 597.333333 512 597.333333z" p-id="4050" fill="#ffffff"></path></svg> -->
<svg t="1708914168912" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="1581" width="200" height="200"><path d="M447.13 46.545h101.818v930.91H447.13V46.545z" fill="#0068b5" p-id="1582" data-spm-anchor-id="a313x.search_index.0.i0.12a13a81x9rPe6" class="selected"></path></svg>


View File

@@ -0,0 +1,87 @@
.driverjs-theme {
background: transparent;
color: #fff;
box-shadow: none;
padding: 0;
}
.driver-popover-arrow {
border: 10px solid transparent;
animation: blink 1s 3 steps(1);
}
@keyframes blink {
0% { opacity: 1; }
50% { opacity: 0.2; }
100% { opacity: 1; }
}
.driver-popover.driverjs-theme .driver-popover-arrow-side-left.driver-popover-arrow {
border-left-color: #174ed1;
}
.driver-popover.driverjs-theme .driver-popover-arrow-side-right.driver-popover-arrow {
border-right-color: #174ed1;
}
.driver-popover.driverjs-theme .driver-popover-arrow-side-top.driver-popover-arrow {
border-top-color: #174ed1;
}
.driver-popover.driverjs-theme .driver-popover-arrow-side-bottom.driver-popover-arrow {
border-bottom-color: #174ed1;
}
.driver-popover-footer {
background: transparent;
color: #fff;
}
.driver-popover-title {
border-top-left-radius: 5px;
border-top-right-radius: 5px;
}
.driver-popover-title, .driver-popover-description {
display: block;
padding: 15px 15px 7px 15px;
background: #174ed1;
border: none;
}
.driver-popover-close-btn {
color: #fff
}
.driver-popover-footer button:hover, .driver-popover-footer button:focus {
background: #174ed1;
color: #fff;
}
.driver-popover-description {
padding: 5px 15px;
border-bottom-left-radius: 5px;
border-bottom-right-radius: 5px;
}
.driver-popover-title[style*=block]+.driver-popover-description {
margin: 0;
}
.driver-popover-progress-text {
color: #fff;
}
.driver-popover-footer button {
background: #174ed1;
border: 2px #174ed1 dashed;
color: #fff;
border-radius: 50%;
text-shadow: none;
}
.driver-popover-close-btn:hover, .driver-popover-close-btn:focus {
color: #fff;
}
.driver-popover-navigation-btns button+button {
margin-left: 10px;
}

View File

@@ -0,0 +1,16 @@
<svg
class="h-4 w-4 text-white rtl:rotate-180 dark:text-white-800"
aria-hidden="true"
xmlns="http://www.w3.org/2000/svg"
fill="none"
viewBox="0 0 6 10"
>
<path
stroke="currentColor"
stroke-linecap="round"
stroke-linejoin="round"
stroke-width="2"
d="m1 9 4-4-4-4"
/>
</svg>


View File

@@ -0,0 +1,15 @@
<svg
class="h-4 w-4 text-white rtl:rotate-180 dark:text-white-800"
aria-hidden="true"
xmlns="http://www.w3.org/2000/svg"
fill="none"
viewBox="0 0 6 10"
>
<path
stroke="currentColor"
stroke-linecap="round"
stroke-linejoin="round"
stroke-width="2"
d="M5 1 1 5l4 4"
/>
</svg>


View File

@@ -0,0 +1 @@
<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1699596229588" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="20460" xmlns:xlink="http://www.w3.org/1999/xlink" width="32" height="32"><path d="M576 128a96 96 0 0 1 96 96v128h-224a96 96 0 0 0-95.84 90.368L352 448v224H224a96 96 0 0 1-96-96V224a96 96 0 0 1 96-96h352z" fill="#CCD9FF" p-id="20461"></path><path d="M576 96a128 128 0 0 1 128 128v128h-64V224a64 64 0 0 0-59.2-63.84L576 160H224a64 64 0 0 0-64 64v352a64 64 0 0 0 64 64h128v64H224a128 128 0 0 1-128-128V224a128 128 0 0 1 128-128z" fill="#3671FD" p-id="20462"></path><path d="M800 320H448a128 128 0 0 0-128 128v352a128 128 0 0 0 128 128h352a128 128 0 0 0 128-128V448a128 128 0 0 0-128-128z m-352 64h352a64 64 0 0 1 64 64v352a64 64 0 0 1-64 64H448a64 64 0 0 1-64-64V448a64 64 0 0 1 64-64z" fill="#3671FD" p-id="20463"></path><path d="M128 736a32 32 0 0 1 32 32 96 96 0 0 0 90.368 95.84L256 864a32 32 0 0 1 0 64 160 160 0 0 1-160-160 32 32 0 0 1 32-32z" fill="#FE9C23" p-id="20464"></path></svg>


File diff suppressed because one or more lines are too long


View File

@@ -0,0 +1,52 @@
<script lang="ts">
import MessageAvatar from "$lib/modules/chat/MessageAvatar.svelte";
import type { Message } from "$lib/shared/constant/Interface";
import MessageTimer from "./MessageTimer.svelte";
import { createEventDispatcher } from "svelte";
let dispatch = createEventDispatcher();
export let msg: Message;
export let time: string = "";
console.log("msg", msg);
</script>
<div
class={msg.role === 0
? "flex w-full gap-3"
: "flex w-full items-center gap-3"}
>
<div
class={msg.role === 0
? "flex aspect-square w-[3px] items-center justify-center rounded bg-[#0597ff] max-sm:hidden"
: "flex aspect-square h-10 w-[3px] items-center justify-center rounded bg-[#000] max-sm:hidden"}
>
<MessageAvatar role={msg.role} />
</div>
<div class="group relative items-center">
<div>
<p
class=" max-w-[60vw] items-center whitespace-pre-line break-keep text-[0.8rem] leading-5 sm:max-w-[50rem]"
>
{@html msg.content}
</p>
</div>
</div>
</div>
{#if time}
<div>
<MessageTimer
{time}
on:handleTop={() => {
dispatch("scrollTop");
}}
/>
</div>
{/if}
<style>
.wrap-style {
word-wrap: break-word;
word-break: break-all;
}
</style>

View File

@@ -0,0 +1,14 @@
<script lang="ts">
import AssistantIcon from "$lib/assets/chat/svelte/Assistant.svelte";
import PersonOutlined from "$lib/assets/chat/svelte/PersonOutlined.svelte";
import { MessageRole } from "$lib/shared/constant/Interface";
export let role: MessageRole;
</script>
{#if role === MessageRole.User}
<PersonOutlined />
{:else}
<AssistantIcon />
{/if}

View File

@@ -0,0 +1,51 @@
<script lang="ts">
export let time: string;
import { createEventDispatcher } from "svelte";
let dispatch = createEventDispatcher();
</script>
<div class="ml-2 flex flex-col">
<div class="my-4 flex items-center justify-end gap-2 space-x-2">
<div class="ml-2 w-min cursor-pointer" data-state="closed">
<!-- svelte-ignore a11y-click-events-have-key-events -->
<svg
xmlns="http://www.w3.org/2000/svg"
xml:space="preserve"
viewBox="0 0 21.6 21.6"
width="24"
height="24"
class="w-5 fill-[#0597ff] hover:fill-[#0597ff]"
on:click={() => {
dispatch("handleTop");
}}
><path
d="M2.2 3.6V.8h17.2v2.8zm7.2 17.2V10.4L5.8 14l-1.9-1.9 6.9-6.9 6.9 6.9-1.9 1.9-3.6-3.6v10.4z"
/></svg
>
</div>
<div
class="inline-block w-0.5 self-stretch bg-gray-300 opacity-100 dark:opacity-50"
/>
<div class="w-min cursor-pointer" data-state="closed">
<svg
xmlns="http://www.w3.org/2000/svg"
xml:space="preserve"
viewBox="0 0 21.6 21.6"
width="24"
height="24"
class="w-5 fill-[#0597ff] hover:fill-[#0597ff]"
><path d="M12.3 17.1V7.6H7.6v2.8h1.9v6.7H6.4v2.7h8.8v-2.7z" /><circle
cx="10.8"
cy="3.6"
r="1.9"
/></svg
>
</div>
<div class="flex items-center space-x-1 text-base text-gray-800">
<strong>End to End Time: </strong>
<p>{time}s</p>
</div>
</div>
<div class="ml-2 flex flex-col" />
</div>

View File

@@ -0,0 +1,32 @@
<script lang="ts">
import { onMount } from "svelte";
import { page } from "$app/stores";
import { browser } from "$app/environment";
import { open } from "$lib/shared/stores/common/Store";
import Scrollbar from "$lib/shared/components/scrollbar/Scrollbar.svelte";
let root: HTMLElement
onMount(() => {
document.getElementsByTagName("body").item(0)!.removeAttribute("tabindex");
// root.style.height = document.documentElement.clientHeight + 'px'
});
if (browser) {
page.subscribe(() => {
// close side navigation when route changes
if (window.innerWidth > 768) {
$open = true;
}
});
}
</script>
<div bind:this={root} class='h-full overflow-hidden relative'>
<div class="h-full flex items-start">
<div class='relative flex flex-col h-full pl-0 w-full bg-white'>
<Scrollbar className="h-0 grow " classLayout="h-full" alwaysVisible={false}>
<slot />
</Scrollbar>
</div>
</div>
</div>

View File

@@ -0,0 +1,24 @@
import { env } from "$env/dynamic/public";
import { SSE } from "sse.js";
const DOC_BASE_URL = env.DOC_BASE_URL;
export async function fetchTextStream(
query: string,
knowledge_base_id: string,
) {
let payload = {};
let url = "";
payload = {
query: query,
knowledge_base_id: knowledge_base_id,
};
url = `${DOC_BASE_URL}/chat_stream`;
return new SSE(url, {
headers: { "Content-Type": "application/json" },
payload: JSON.stringify(payload),
});
}
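For reference, the request issued by `fetchTextStream` can be reproduced from the command line. The sketch below uses a hypothetical backend address and the default knowledge base id; the backend replies with a server-sent event stream terminated by a `[DONE]` message.

```bash
# Hypothetical backend address; DOC_BASE_URL must point at the chat backend
DOC_BASE_URL="http://your-backend-host:8000"

curl "${DOC_BASE_URL}/chat_stream" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"query": "What is this document about?", "knowledge_base_id": "default"}'
```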

View File

@@ -0,0 +1,44 @@
import { env } from "$env/dynamic/public";
const DOC_BASE_URL = env.DOC_BASE_URL;
export async function fetchKnowledgeBaseId(file: Blob, fileName: string) {
const url = `${DOC_BASE_URL}/create`;
const formData = new FormData();
formData.append("file", file, fileName);
const init: RequestInit = {
method: "POST",
body: formData,
};
try {
const response = await fetch(url, init);
if (!response.ok) throw response.status;
return await response.json();
} catch (error) {
console.error("network error: ", error);
return undefined;
}
}
export async function fetchKnowledgeBaseIdByPaste(pasteUrlList: any, urlType: string | undefined) {
const url = `${DOC_BASE_URL}/upload_link`;
const data = {
link_list: pasteUrlList,
};
const init: RequestInit = {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(data),
};
try {
const response = await fetch(url, init);
if (!response.ok) throw response.status;
return await response.json();
} catch (error) {
console.error("network error: ", error);
return undefined;
}
}
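The two upload helpers above can likewise be exercised directly with curl. This is a sketch under the same hypothetical `DOC_BASE_URL`; the file name and link are illustrative.

```bash
DOC_BASE_URL="http://your-backend-host:8000"

# Upload a local file as multipart form data (mirrors fetchKnowledgeBaseId)
curl "${DOC_BASE_URL}/create" \
  -X POST \
  -F "file=@./example.pdf"

# Register one or more remote links (mirrors fetchKnowledgeBaseIdByPaste)
curl "${DOC_BASE_URL}/upload_link" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"link_list": ["https://example.com/page"]}'
```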

View File

@@ -0,0 +1,43 @@
export function scrollToBottom(scrollToDiv: HTMLElement) {
if (scrollToDiv) {
setTimeout(
() =>
scrollToDiv.scroll({
behavior: "auto",
top: scrollToDiv.scrollHeight,
}),
100
);
}
}
export function scrollToTop(scrollToDiv: HTMLElement) {
if (scrollToDiv) {
setTimeout(
() =>
scrollToDiv.scroll({
behavior: "auto",
top: 0,
}),
100
);
}
}
export function getCurrentTimeStamp() {
return Math.floor(new Date().getTime())
}
export function fromTimeStampToTime(timeStamp: number) {
return new Date(timeStamp * 1000).toTimeString().slice(0, 8)
}
export function formatTime(seconds: number) {
const hours = String(Math.floor(seconds / 3600)).padStart(2, '0');
const minutes = String(Math.floor((seconds % 3600) / 60)).padStart(2, '0');
const remainingSeconds = String(seconds % 60).padStart(2, '0');
return `${hours}:${minutes}:${remainingSeconds}`;
}

View File

@@ -0,0 +1,140 @@
<script lang="ts">
import Scrollbar from "$lib/shared/components/scrollbar/Scrollbar.svelte";
import ChatMessage from "$lib/modules/chat/ChatMessage.svelte";
import "driver.js/dist/driver.css";
import "$lib/assets/layout/css/driver.css";
import Previous from "$lib/assets/upload/previous.svelte";
import Next from "$lib/assets/upload/next.svelte";
import { scrollToBottom } from "$lib/shared/Utils";
import { onMount } from "svelte";
let scrollToDiv: HTMLDivElement;
export let items;
export let label: string;
export let scrollName: string;
onMount(async () => {
scrollToDiv = document
.querySelector(scrollName)
?.querySelector(".svlr-viewport")!;
console.log(
"scrollToDiv",
scrollName,
document,
document.querySelector("chat-scrollbar1")
);
});
// gallery
let currentIndex = 0;
function nextItem() {
currentIndex = (currentIndex + 1) % items.length;
console.log("nextItem", currentIndex);
}
function prevItem() {
currentIndex = (currentIndex - 1 + items.length) % items.length;
console.log("prevItem", currentIndex);
}
$: currentItem = items[currentIndex];
$: {
if (items) {
scrollToBottom(scrollToDiv);
}
}
// gallery
</script>
<div
id="custom-controls-gallery"
class="relative mb-8 h-0 w-full w-full grow px-2 {scrollName}"
data-carousel="slide"
>
<!-- Carousel wrapper -->
<!-- Display current item -->
{#if currentItem}
<Scrollbar
classLayout="flex flex-col gap-5"
className=" h-0 w-full grow px-2 mt-3 ml-10"
>
{#each currentItem.content as message, i}
<ChatMessage msg={message} />
{/each}
</Scrollbar>
<!-- Loading text -->
{/if}
<div class="radius absolute left-0 p-2">
<!-- Display end to end time -->
<label for="" class="mr-2 text-xs font-bold text-blue-700">{label} </label>
</div>
{#if currentItem.time !== "0s"}
<div class="radius absolute right-0 p-2">
<!-- Display end to end time -->
<label for="" class="mr-2 text-xs font-bold text-blue-700"
>End to End Time:
</label>
<label for="" class="text-xs">{currentItem.time}</label>
</div>
{/if}
<div class="flex items-center justify-between">
<div class="justify-left ml-2 flex items-center">
<!-- Previous button -->
<button
type="button"
class="group absolute start-0 top-0 z-30 flex h-full
cursor-pointer items-center justify-center
focus:outline-none"
on:click={prevItem}
>
<span
class="group-focus:ring-gray dark:group-hover:bg-[#000]-800/60 dark:group-focus:ring-[#000]-800/70 inline-flex h-7
w-7 items-center justify-center
rounded-full bg-[#000]/10
group-hover:bg-[#000]/50 group-focus:bg-[#000]/50
group-focus:outline-none
group-focus:ring-4 dark:bg-gray-800/30"
>
<Previous />
<span class="sr-only">Previous</span>
</span>
</button>
<!-- Next button -->
<button
type="button"
class="group absolute end-0 top-0 z-30 flex h-full cursor-pointer items-center justify-center focus:outline-none"
on:click={nextItem}
>
<span
class="group-focus:ring-gray dark:group-hover:bg-[#000]-800/60 dark:group-focus:ring-[#000]-800/70 inline-flex h-7
w-7 items-center justify-center
rounded-full bg-[#000]/10
group-hover:bg-[#000]/50 group-focus:bg-[#000]/50
group-focus:outline-none
group-focus:ring-4 dark:bg-gray-800/30"
>
<Next />
<span class="sr-only">Next</span>
</span>
</button>
</div>
</div>
</div>
<style>
.row::-webkit-scrollbar {
display: none;
}
.row {
scrollbar-width: none;
}
.row {
-ms-overflow-style: none;
}
</style>

View File

@@ -0,0 +1,32 @@
<div
class="mb-6 flex items-center justify-center self-center bg-black text-sm text-gray-500"
/>
<div class="flex items-center justify-center gap-3">
<div class="relative inline-flex">
<div class="h-2 w-2 rounded-full bg-blue-600" />
<div
class="absolute left-0 top-0 h-2 w-2 animate-[ping_1s_infinite_100ms] rounded-full bg-blue-600"
/>
<div
class="duration-800 absolute left-0 top-0 h-2 w-2 animate-pulse rounded-full bg-blue-600"
/>
</div>
<div class="relative inline-flex">
<div class="h-2 w-2 rounded-full bg-blue-600" />
<div
class="absolute left-0 top-0 h-2 w-2 animate-[ping_1s_infinite_300ms] rounded-full bg-blue-600"
/>
<div
class="absolute left-0 top-0 h-2 w-2 animate-pulse rounded-full bg-blue-600"
/>
</div>
<div class="relative inline-flex">
<div class="h-2 w-2 rounded-full bg-blue-600" />
<div
class="absolute left-0 top-0 h-2 w-2 animate-[ping_1s_infinite_500ms] rounded-full bg-blue-600"
/>
<div
class="absolute left-0 top-0 h-2 w-2 animate-pulse rounded-full bg-blue-600"
/>
</div>
</div>

View File

@@ -0,0 +1,32 @@
<script lang="ts">
import { Svroller } from "svrollbar";
export let className: string = "";
export let classLayout: string = "";
export let alwaysVisible = true;
</script>
<div class={className}>
<Svroller height="100%" width="100%" {alwaysVisible}>
<div class={classLayout}>
<slot></slot>
</div>
</Svroller>
</div>
<style>
:global(.svlr-contents) {
height: 100%;
}
.row::-webkit-scrollbar {
display: none;
}
.row {
scrollbar-width: none;
}
.row {
-ms-overflow-style: none;
}
</style>

View File

@@ -0,0 +1,33 @@
<script lang="ts">
import { Button, Helper, Input, Label, Modal } from "flowbite-svelte";
import { createEventDispatcher } from "svelte";
const dispatch = createEventDispatcher();
let formModal = false;
let urlValue = "";
function handelPasteURL() {
const pasteUrlList = urlValue.split(";").map((url) => url.trim());
dispatch("paste", { pasteUrlList });
formModal = false;
}
</script>
<Label class="space-y-1">
<div class="grid grid-cols-3">
<Input
class="col-span-2 rounded-none rounded-l-lg focus:border-blue-700 focus:ring-blue-700"
type="text"
name="text"
placeholder="URL"
bind:value={urlValue}
/>
<Button
type="submit"
class="w-full rounded-none rounded-r-lg bg-blue-700"
on:click={() => handelPasteURL()}>Confirm</Button
>
</div>
<Helper>Use semicolons (;) to separate multiple URLs.</Helper>
</Label>

View File

@@ -0,0 +1,32 @@
<script lang="ts">
import { Fileupload, Label } from "flowbite-svelte";
import { createEventDispatcher } from "svelte";
const dispatch = createEventDispatcher();
let value;
function handleInput(event: Event) {
const file = (event.target as HTMLInputElement).files![0];
if (!file) return;
const reader = new FileReader();
reader.onloadend = () => {
if (!reader.result) return;
const src = reader.result.toString();
dispatch("upload", { src: src, fileName: file.name });
};
reader.readAsDataURL(file);
}
</script>
<div>
<Label class="space-y-2 mb-2">
<Fileupload
bind:value
on:change={handleInput}
class="focus:border-blue-700 foucs:ring-0"
/>
</Label>
</div>

View File

@@ -0,0 +1,151 @@
<script lang="ts">
import { Drawer, Button, CloseButton, Tabs, TabItem } from "flowbite-svelte";
import { InfoCircleSolid } from "flowbite-svelte-icons";
import { sineIn } from "svelte/easing";
import UploadFile from "./upload-knowledge.svelte";
import PasteURL from "./PasteKnowledge.svelte";
import {
knowledge1,
knowledgeName,
} from "$lib/shared/stores/common/Store";
import DeleteIcon from "$lib/assets/avatar/svelte/Delete.svelte";
import { getNotificationsContext } from "svelte-notifications";
import {
fetchKnowledgeBaseId,
fetchKnowledgeBaseIdByPaste,
} from "$lib/network/upload/Network";
const { addNotification } = getNotificationsContext();
console.log("allKnowledges", $knowledgeName);
let hidden6 = true;
let selectKnowledge = -1;
let transitionParamsRight = {
x: 320,
duration: 200,
easing: sineIn,
};
async function handleKnowledgePaste(
e: CustomEvent<{ pasteUrlList: string[] }>
) {
let knowledge_id = "";
// let knowledge_id2 = "";
try {
const pasteUrlList = e.detail.pasteUrlList;
const res = await fetchKnowledgeBaseIdByPaste(pasteUrlList, "url1");
// sihan
knowledge_id = res.knowledge_base_id ? res.knowledge_base_id : "default";
} catch {
knowledge_id = "default";
}
knowledge1.set({ id: knowledge_id });
knowledgeName.set('knowledge_base');
addNotification({
text: "Uploaded successfully",
position: "top-left",
type: "success",
removeAfter: 3000,
});
}
async function handleKnowledgeUpload(e: CustomEvent<any>) {
let knowledge_id = "";
// let knowledge_id2 = "";
try {
const blob = await fetch(e.detail.src).then((r) => r.blob());
const fileName = e.detail.fileName;
// letong
const res = await fetchKnowledgeBaseId(blob, fileName);
// sihan
knowledge_id = res.knowledge_base_id ? res.knowledge_base_id : "default";
// knowledge_id2 = res2.knowledge_base_id ? res2.knowledge_base_id : "default";
console.log("knowledge_id", knowledge_id);
} catch {
knowledge_id = "default";
// knowledge_id2 = "default";
}
knowledge1.set({ id: knowledge_id });
knowledgeName.set(e.detail.fileName);
addNotification({
text: "Uploaded successfully",
position: "top-left",
type: "success",
removeAfter: 3000,
});
}
function handleKnowledgeDelete() {
knowledge1.set({ id: "default" });
knowledgeName.set("");
}
</script>
<div class="text-center">
<Button
on:click={() => (hidden6 = false)}
class="bg-transparent focus-within:ring-gray-300 hover:bg-transparent focus:ring-0"
>
<svg
aria-hidden="true"
class="h-7 w-7 text-blue-700"
fill="none"
stroke="currentColor"
viewBox="0 0 24 24"
xmlns="http://www.w3.org/2000/svg"
><path
stroke-linecap="round"
stroke-linejoin="round"
stroke-width="2"
d="M7 16a4 4 0 01-.88-7.903A5 5 0 1115.9 6L16 6a5 5 0 011 9.9M15 13l-3-3m0 0l-3 3m3-3v12"
/></svg
>
</Button>
</div>
<Drawer
backdrop={false}
placement="right"
transitionType="fly"
transitionParams={transitionParamsRight}
bind:hidden={hidden6}
class=" shadow border-2 border-r-0 border-b-0"
id="sidebar6"
>
<div class="flex items-center">
<h5
id="drawer-label"
class="mb-4 inline-flex items-center text-base font-semibold text-gray-500 dark:text-gray-400"
>
<InfoCircleSolid class="me-2.5 h-4 w-4" />Data Source
</h5>
<CloseButton
on:click={() => (hidden6 = true)}
class="mb-4 dark:text-white"
/>
</div>
<p class="mb-6 text-sm text-gray-500 dark:text-gray-400">
Please upload your local file or paste a remote file link, and Chat will
respond based on the content of the uploaded file.
</p>
<Tabs
style="full"
defaultClass="flex rounded-lg divide-x rtl:divide-x-reverse divide-gray-200 shadow dark:divide-gray-700 foucs:ring-0"
>
<TabItem class="w-full" open>
<span slot="title">Upload File</span>
<UploadFile on:upload={handleKnowledgeUpload} />
</TabItem>
<TabItem class="w-full">
<span slot="title">Paste Link</span>
<PasteURL on:paste={handleKnowledgePaste} />
</TabItem>
</Tabs>
{#if ($knowledgeName) && ($knowledgeName !== "")}
<div class="relative">
<p class="border-b p-6 pb-2">{$knowledgeName}</p>
<DeleteIcon on:DeleteAvatar={() => handleKnowledgeDelete()} />
</div>
{/if}
</Drawer>

View File

@@ -0,0 +1,25 @@
export enum MessageRole {
Assistant, User
}
export enum MessageType {
Text, SingleAudio, AudioList, SingleImage, ImageList, singleVideo
}
type Map<T> = T extends MessageType.Text | MessageType.SingleAudio ? string :
T extends MessageType.AudioList ? string[] :
T extends MessageType.SingleImage ? { imgSrc: string; imgId: string; } :
{ imgSrc: string; imgId: string; }[];
export interface Message {
role: MessageRole,
type: MessageType,
content: Map<Message['type']>,
time: number,
}
export enum LOCAL_STORAGE_KEY {
STORAGE_CHAT_KEY = 'chatMessages',
STORAGE_TIME_KEY = 'initTime',
}

View File

@@ -0,0 +1,25 @@
import { writable } from "svelte/store";
export let open = writable(true);
export let knowledgeAccess = writable(true);
export let showTemplate = writable(false);
export let showSidePage = writable(false);
export let droppedObj = writable({});
export let isLoading = writable(false);
export let newUploadNum = writable(0);
export let ifStoreMsg = writable(true);
export const resetControl = writable(false);
export const knowledge1 = writable<{
id: string;
}>();
export const knowledgeName = writable("");

View File

@@ -0,0 +1,32 @@
<script>
import "tailwindcss/tailwind.css";
import "../app.postcss";
import Notifications from "svelte-notifications";
import Layout from "$lib/modules/frame/Layout.svelte";
import { onMount } from "svelte";
onMount(() => {
window.deviceType = window.innerWidth > 640 ? "pc" : "mobile";
window.onresize = () => {
window.deviceType = window.innerWidth > 640 ? "pc" : "mobile";
};
window.addEventListener("load", function () {
setTimeout(function () {
// This hides the address bar:
window.scrollTo(0, 1);
}, 0);
});
});
</script>
<Notifications>
<Layout>
<div class="flex h-full flex-col">
<div class="h-0 grow bg-white lg:rounded-tl-3xl">
<slot />
</div>
</div>
</Layout>
</Notifications>

View File

@@ -0,0 +1,249 @@
<script lang="ts">
export let data;
import { ifStoreMsg, knowledge1 } from "$lib/shared/stores/common/Store";
import { onMount } from "svelte";
import {
LOCAL_STORAGE_KEY,
MessageRole,
MessageType,
type Message,
} from "$lib/shared/constant/Interface";
import {
fromTimeStampToTime,
getCurrentTimeStamp,
scrollToBottom,
scrollToTop,
} from "$lib/shared/Utils";
import { fetchTextStream } from "$lib/network/chat/Network";
import LoadingAnimation from "$lib/shared/components/loading/Loading.svelte";
import { browser } from "$app/environment";
import "driver.js/dist/driver.css";
import "$lib/assets/layout/css/driver.css";
import UploadFile from "$lib/shared/components/upload/uploadFile.svelte";
import PaperAirplane from "$lib/assets/chat/svelte/PaperAirplane.svelte";
import Gallery from "$lib/shared/components/chat/gallery.svelte";
import Scrollbar from "$lib/shared/components/scrollbar/Scrollbar.svelte";
import ChatMessage from "$lib/modules/chat/ChatMessage.svelte";
let query: string = "";
let loading: boolean = false;
let scrollToDiv: HTMLDivElement;
// ·········
let chatMessages: Message[] = data.chatMsg ? data.chatMsg : [];
console.log("chatMessages", chatMessages);
// ··············
$: knowledge_1 = $knowledge1?.id ? $knowledge1.id : "default";
onMount(async () => {
scrollToDiv = document
.querySelector(".chat-scrollbar")
?.querySelector(".svlr-viewport")!;
});
function handleTop() {
console.log("top");
scrollToTop(scrollToDiv);
}
function storeMessages() {
console.log('localStorage', chatMessages);
localStorage.setItem(
LOCAL_STORAGE_KEY.STORAGE_CHAT_KEY,
JSON.stringify(chatMessages)
);
}
const callTextStream = async (query: string) => {
const eventSource = await fetchTextStream(query, knowledge_1);
eventSource.addEventListener("message", (e: any) => {
let currentMsg = e.data;
currentMsg = currentMsg.replace("@#$", " ")
console.log("currentMsg", currentMsg);
if (currentMsg == "[DONE]") {
console.log("done getCurrentTimeStamp", getCurrentTimeStamp);
let startTime = chatMessages[chatMessages.length - 1].time;
loading = false;
let totalTime = parseFloat(((getCurrentTimeStamp() - startTime) / 1000).toFixed(2));
console.log("done totalTime", totalTime);
console.log(
"chatMessages[chatMessages.length - 1]",
chatMessages[chatMessages.length - 1]
);
if (chatMessages.length - 1 !== -1) {
chatMessages[chatMessages.length - 1].time = totalTime;
}
console.log("done chatMessages", chatMessages);
storeMessages();
} else {
if (chatMessages[chatMessages.length - 1].role == MessageRole.User) {
console.log("?", getCurrentTimeStamp());
chatMessages = [
...chatMessages,
{
role: MessageRole.Assistant,
type: MessageType.Text,
content: currentMsg,
time: getCurrentTimeStamp(),
},
];
console.log("? chatMessages", chatMessages);
} else {
let content = chatMessages[chatMessages.length - 1].content as string;
chatMessages[chatMessages.length - 1].content =
content + currentMsg;
}
scrollToBottom(scrollToDiv);
}
});
eventSource.stream();
};
const handleTextSubmit = async () => {
console.log("handleTextSubmit");
loading = true;
const newMessage = {
role: MessageRole.User,
type: MessageType.Text,
content: query,
time: 0,
};
chatMessages = [...chatMessages, newMessage];
scrollToBottom(scrollToDiv);
storeMessages();
query = "";
await callTextStream(newMessage.content);
scrollToBottom(scrollToDiv);
storeMessages();
};
function handelClearHistory() {
localStorage.removeItem(LOCAL_STORAGE_KEY.STORAGE_CHAT_KEY);
chatMessages = [];
}
function isEmptyObject(obj: any): boolean {
for (let key in obj) {
if (obj.hasOwnProperty(key)) {
return false;
}
}
return true;
}
</script>
<!-- <DropZone on:drop={handleImageSubmit}> -->
<div
class="h-full items-center gap-5 bg-white sm:flex sm:pb-2 lg:rounded-tl-3xl"
>
<div class="mx-auto flex h-full w-full flex-col sm:mt-0 sm:w-[72%]">
<div class="flex justify-between p-2">
<p class="text-[1.7rem] font-bold tracking-tight">ChatQnA</p>
<UploadFile />
</div>
<div
class="fixed relative flex w-full flex-col items-center justify-between bg-white p-2 pb-0"
>
<div class="relative my-4 flex w-full flex-row justify-center">
<div class="foucs:border-none relative w-full">
<input
class="text-md block w-full border-0 border-b-2 border-gray-300 px-1 py-4
text-gray-900 focus:border-gray-300 focus:ring-0 dark:border-gray-600 dark:bg-gray-700 dark:text-white dark:placeholder-gray-400 dark:focus:border-blue-500 dark:focus:ring-blue-500"
type="text"
placeholder="Enter prompt here"
disabled={loading}
maxlength="1200"
bind:value={query}
on:keydown={(event) => {
if (event.key === "Enter" && !event.shiftKey && query) {
event.preventDefault();
handleTextSubmit();
}
}}
/>
<button
on:click={() => {
if (query) {
handleTextSubmit();
}
}}
type="submit"
class="absolute bottom-2.5 end-2.5 px-4 py-2 text-sm font-medium text-white dark:bg-blue-600 dark:hover:bg-blue-700 dark:focus:ring-blue-800"
><PaperAirplane /></button
>
</div>
</div>
</div>
<!-- clear -->
{#if Array.isArray(chatMessages) && chatMessages.length > 0 && !loading}
<div class="flex w-full justify-between pr-5">
<div class="flex items-center">
<button
class="bg-primary text-primary-foreground hover:bg-primary/90 group flex items-center justify-center space-x-2 p-2"
type="button"
on:click={() => handelClearHistory()}
><svg
xmlns="http://www.w3.org/2000/svg"
viewBox="0 0 20 20"
width="24"
height="24"
class="fill-[#0597ff] group-hover:fill-[#0597ff]"
><path
d="M12.6 12 10 9.4 7.4 12 6 10.6 8.6 8 6 5.4 7.4 4 10 6.6 12.6 4 14 5.4 11.4 8l2.6 2.6zm7.4 8V2q0-.824-.587-1.412A1.93 1.93 0 0 0 18 0H2Q1.176 0 .588.588A1.93 1.93 0 0 0 0 2v12q0 .825.588 1.412Q1.175 16 2 16h14zm-3.15-6H2V2h16v13.125z"
/></svg
><span class="font-medium text-[#0597ff]">CLEAR</span></button
>
</div>
</div>
{/if}
<!-- clear -->
<div class="mx-auto flex h-full w-full flex-col">
<Scrollbar
classLayout="flex flex-col gap-1 mr-4"
className="chat-scrollbar h-0 w-full grow px-2 pt-2 mt-3 mr-5"
>
{#each chatMessages as message, i}
<ChatMessage
on:scrollTop={() => handleTop()}
msg={message}
time={i === 0 || (message.time > 0 && message.time < 100)
? message.time
: ""}
/>
{/each}
</Scrollbar>
<!-- Loading text -->
{#if loading}
<LoadingAnimation />
{/if}
</div>
<!-- gallery -->
</div>
</div>
<style>
.row::-webkit-scrollbar {
display: none;
}
.row {
scrollbar-width: none;
}
.row {
-ms-overflow-style: none;
}
</style>

View File

@@ -0,0 +1,12 @@
import { browser } from '$app/environment';
import { LOCAL_STORAGE_KEY } from '$lib/shared/constant/Interface';
export const load = async () => {
if (browser) {
const chat = localStorage.getItem(LOCAL_STORAGE_KEY.STORAGE_CHAT_KEY);
return {
chatMsg: JSON.parse(chat || '[]')
}
}
};

Binary file not shown.


View File

@@ -0,0 +1,25 @@
import adapter from '@sveltejs/adapter-auto';
import preprocess from 'svelte-preprocess';
import postcssPresetEnv from 'postcss-preset-env';
/** @type {import('@sveltejs/kit').Config} */
const config = {
// Consult https://github.com/sveltejs/svelte-preprocess
// for more information about preprocessors
preprocess: preprocess({
sourceMap: true,
postcss: {
plugins: [postcssPresetEnv({ features: { 'nesting-rules': true } })]
}
}),
kit: {
adapter: adapter(),
env: {
publicPrefix: ''
}
}
};
export default config;

View File

@@ -0,0 +1,30 @@
const config = {
content: ["./src/**/*.{html,js,svelte,ts}",
"./node_modules/flowbite-svelte/**/*.{html,js,svelte,ts}",],
plugins: [require('flowbite/plugin')],
darkMode: 'class',
theme: {
extend: {
colors: {
// flowbite-svelte
primary: {
50: '#FFF5F2',
100: '#FFF1EE',
200: '#FFE4DE',
300: '#FFD5CC',
400: '#FFBCAD',
500: '#FE795D',
600: '#EF562F',
700: '#EB4F27',
800: '#CC4522',
900: '#A5371B'
}
}
}
}
};
module.exports = config;

17
ChatQnA/ui/tsconfig.json Normal file
View File

@@ -0,0 +1,17 @@
{
"extends": "./.svelte-kit/tsconfig.json",
"compilerOptions": {
"allowJs": true,
"checkJs": true,
"esModuleInterop": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"skipLibCheck": true,
"sourceMap": true,
"strict": true
}
// Path aliases are handled by https://kit.svelte.dev/docs/configuration#alias
//
// If you want to overwrite includes/excludes, make sure to copy over the relevant includes/excludes
// from the referenced tsconfig.json - TypeScript does not merge them in
}

10
ChatQnA/ui/vite.config.ts Normal file
View File

@@ -0,0 +1,10 @@
import { sveltekit } from '@sveltejs/kit/vite';
import type { UserConfig } from 'vite';
const config: UserConfig = {
plugins: [sveltekit()],
server: {}
};
export default config;

136
CodeGen/README.md Normal file
View File

@@ -0,0 +1,136 @@
Code generation is a noteworthy application of Large Language Model (LLM) technology. In this example, we present a Copilot application to showcase how code generation can be executed on the Intel Gaudi2 platform. This CodeGen use case performs code generation with open source models such as "m-a-p/OpenCodeInterpreter-DS-6.7B" and "deepseek-ai/deepseek-coder-33b-instruct", served with Text Generation Inference on Intel Gaudi2.
# Environment Setup
To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Intel Gaudi2, please follow these steps:
## Build TGI Gaudi Docker Image
```bash
bash ./tgi_gaudi/build_docker.sh
```
## Launch TGI Gaudi Service
### Launch a local server instance on 1 Gaudi card:
```bash
bash ./tgi_gaudi/launch_tgi_service.sh
```
### Launch a local server instance on 4 Gaudi cards:
```bash
bash ./tgi_gaudi/launch_tgi_service.sh 4 9000 "deepseek-ai/deepseek-coder-33b-instruct"
```
### Customize TGI Gaudi Service
The ./tgi_gaudi/launch_tgi_service.sh script accepts three parameters:
- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1.
- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080.
- model_name: The model name utilized for LLM, with the default set to "m-a-p/OpenCodeInterpreter-DS-6.7B".
You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable `TGI_ENDPOINT`:
```bash
export TGI_ENDPOINT="xxx.xxx.xxx.xxx:8080"
```
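Before wiring the endpoint into the Copilot backend, you can sanity-check that TGI is reachable by sending a small request to its `/generate` route. This is only an illustrative sketch; replace the address with your own endpoint and adjust the prompt and parameters as needed.

```bash
curl ${TGI_ENDPOINT}/generate \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 32}}'
```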
## Launch Copilot Docker
### Build Copilot Docker Image
```bash
cd codegen
bash ./build_docker.sh
cd ..
```
### Launch Copilot Docker
```bash
docker run -it --net=host --ipc=host -v /var/run/docker.sock:/var/run/docker.sock copilot:latest
```
# Start Copilot Server
## Start the Backend Service
Make sure the TGI Gaudi service is running before starting the backend service.
Please follow this link [huggingface token](https://huggingface.co/docs/hub/security-tokens) to get an access token and export the `HUGGINGFACEHUB_API_TOKEN` environment variable with the token, then launch the backend service:
```bash
export HUGGINGFACEHUB_API_TOKEN=<token>
nohup python server.py &
```
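Once the backend is up, a quick way to verify it is to call the non-streaming code generation endpoint defined in `server.py`. The host, port (8000 is the default in `server.py`), and prompt below are illustrative; any request fields that are omitted fall back to the defaults in `ChatCompletionRequest`.

```bash
curl http://localhost:8000/v1/code_generation \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"prompt": "# write a function that reverses a string", "max_new_tokens": 128}'
```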
## Install Copilot VSCode extension offline
Copy the `copilot-0.0.1.vsix` file to your local machine and install it in VS Code as shown below.
![Install-screenshot](https://i.imgur.com/JXQ3rqE.jpg)
We will also release the plugin on the Visual Studio Code Marketplace to make installation easier.
# How to use
## Service URL setting
Please adjust the service URL in the extension settings based on the endpoint of the code generation backend service.
![Setting-screenshot](https://i.imgur.com/4hjvKPu.png)
![Setting-screenshot](https://i.imgur.com/JfJVFV3.png)
## Customize
The Copilot lets users enter their own sensitive information and tokens in the user settings as needed. This customization tailors the accuracy and output content to individual requirements.
![Customize](https://i.imgur.com/PkObak9.png)
## Code suggestion
To trigger inline completion, type `# {your keyword}` (start with your programming language's comment keyword, such as `//` in C++ or `#` in Python). Make sure Inline Suggest is enabled in the VS Code settings.
For example:
![code suggestion](https://i.imgur.com/sH5UoTO.png)
To provide programmers with a smooth experience, the Copilot supports multiple ways to trigger inline code suggestions. If you are interested in the details, they are summarized as follows:
- Generate code from single-line comments: the simplest way, as introduced above.
- Generate code from consecutive single-line comments:
![codegen from single-line comments](https://i.imgur.com/GZsQywX.png)
- Generate code from multi-line comments (this will not be triggered until there is at least one `space` outside the multi-line comment):
![codegen from multi-line comments](https://i.imgur.com/PzhiWrG.png)
- Automatically complete multi-line comments:
![auto complete](https://i.imgur.com/cJO3PQ0.jpg)
## Chat with AI assistant
You can start a conversation with the AI programming assistant by clicking on the robot icon in the plugin bar on the left:
![icon](https://i.imgur.com/f7rzfCQ.png)
Then you can see the conversation window on the left, where you can chat with AI assistant:
![dialog](https://i.imgur.com/aiYzU60.png)
There are four areas worth noting:
- Enter and submit your question
- Your previous questions
- Answers from the AI assistant (code is highlighted according to the programming language it is written in, and streaming output is supported)
- Copy or replace code with one click (note that you need to select the code in the editor first and then click "replace"; otherwise the code will be inserted)
You can also select code in the editor and ask the AI assistant questions about it.
For example:
- Select code
![select code](https://i.imgur.com/grvrtY6.png)
- Ask question and get answer
![qna](https://i.imgur.com/8Kdpld7.png)

View File

@@ -0,0 +1,25 @@
# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM langchain/langchain
RUN apt-get update && apt-get -y install libgl1-mesa-glx
RUN pip install -U langchain-cli pydantic==1.10.13
RUN pip install langchain==0.1.11
RUN pip install shortuuid
RUN pip install huggingface_hub
RUN mkdir -p /ws
ENV PYTHONPATH=/ws
COPY codegen-app /codegen-app
WORKDIR /codegen-app
CMD ["/bin/bash"]

View File

@@ -0,0 +1,17 @@
#!/bin/bash

# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
docker build . -t copilot:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy

View File

@@ -0,0 +1,42 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Code source from FastChat's OpenAI protocol:
https://github.com/lm-sys/FastChat/blob/main/fastchat/protocol/openai_api_protocol.py
"""
from typing import Optional, List, Any, Union
import time
import shortuuid
# pylint: disable=E0611
from pydantic import BaseModel, Field
class ChatCompletionRequest(BaseModel):
prompt: Union[str, List[Any]]
device: Optional[str] = 'cpu'
temperature: Optional[float] = 0.7
top_p: Optional[float] = 1.0
top_k: Optional[int] = 1
repetition_penalty: Optional[float] = 1.0
max_new_tokens: Optional[int] = 128
stream: Optional[bool] = False
class ChatCompletionResponse(BaseModel):
id: str = Field(default_factory=lambda: f"chatcmpl-{shortuuid.random()}")
object: str = "chat.completion"
created: int = Field(default_factory=lambda: int(time.time()))
response: str

View File

@@ -0,0 +1,229 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import requests
import json
import types
from concurrent import futures
from typing import Optional
from fastapi import FastAPI, APIRouter
from fastapi.responses import RedirectResponse, StreamingResponse
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.llms import HuggingFaceEndpoint
from langchain_core.pydantic_v1 import BaseModel
from starlette.middleware.cors import CORSMiddleware
from openai_protocol import ChatCompletionRequest, ChatCompletionResponse
app = FastAPI()
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"])
class CodeGenAPIRouter(APIRouter):
def __init__(self, entrypoint) -> None:
super().__init__()
self.entrypoint = entrypoint
print(f"[codegen - router] Initializing API Router, entrypoint={entrypoint}")
# NOTE: the request handlers below reference these attributes; the defaults
# here are assumptions for single-process (non-DeepSpeed) serving.
self.use_deepspeed = False
self.host = "localhost"
self.port = 8000
self.world_size = 1
# Define LLM
self.llm = HuggingFaceEndpoint(
endpoint_url=entrypoint,
max_new_tokens=512,
top_k=10,
top_p=0.95,
typical_p=0.95,
temperature=0.01,
repetition_penalty=1.03,
streaming=True,
)
print("[codegen - router] LLM initialized.")
def is_generator(self, obj):
return isinstance(obj, types.GeneratorType)
def handle_chat_completion_request(self, request: ChatCompletionRequest):
try:
print(f"Predicting chat completion using prompt '{request.prompt}'")
buffered_texts = ""
if request.stream:
generator = self.llm(request.prompt, callbacks=[StreamingStdOutCallbackHandler()])
if not self.is_generator(generator):
generator = (generator,)
def stream_generator():
nonlocal buffered_texts
for output in generator:
yield f"data: {output}\n\n"
yield f"data: [DONE]\n\n"
return StreamingResponse(stream_generator(), media_type="text/event-stream")
else:
response = self.llm(request.prompt)
except Exception as e:
print(f"An error occurred: {e}")
else:
print("Chat completion finished.")
return ChatCompletionResponse(response=response)
tgi_endpoint = os.getenv("TGI_ENDPOINT", "http://localhost:8080")
router = CodeGenAPIRouter(tgi_endpoint)
app.include_router(router)
def check_completion_request(request: BaseModel) -> Optional[str]:
if request.temperature is not None and request.temperature < 0:
return f"Param Error: {request.temperature} is less than the minimum of 0 --- 'temperature'"
if request.temperature is not None and request.temperature > 2:
return f"Param Error: {request.temperature} is greater than the maximum of 2 --- 'temperature'"
if request.top_p is not None and request.top_p < 0:
return f"Param Error: {request.top_p} is less than the minimum of 0 --- 'top_p'"
if request.top_p is not None and request.top_p > 1:
return f"Param Error: {request.top_p} is greater than the maximum of 1 --- 'top_p'"
if request.top_k is not None and (not isinstance(request.top_k, int)):
return f"Param Error: {request.top_k} is not valid under any of the given schemas --- 'top_k'"
if request.top_k is not None and request.top_k < 1:
return f"Param Error: {request.top_k} is greater than the minimum of 1 --- 'top_k'"
if request.max_new_tokens is not None and (not isinstance(request.max_new_tokens, int)):
return f"Param Error: {request.max_new_tokens} is not valid under any of the given schemas --- 'max_new_tokens'"
return None
def filter_code_format(code):
language_prefixes = {
"go": "```go",
"c": "```c",
"cpp": "```cpp",
"java": "```java",
"python": "```python",
"typescript": "```typescript"
}
suffix = "\n```"
# Find the first occurrence of a language prefix
first_prefix_pos = len(code)
for prefix in language_prefixes.values():
pos = code.find(prefix)
if pos != -1 and pos < first_prefix_pos:
first_prefix_pos = pos + len(prefix) + 1
# Find the first occurrence of the suffix after the first language prefix
first_suffix_pos = code.find(suffix, first_prefix_pos + 1)
# Extract the code block
if first_prefix_pos != -1 and first_suffix_pos != -1:
return code[first_prefix_pos:first_suffix_pos]
elif first_prefix_pos != -1:
return code[first_prefix_pos:]
return code
# router /v1/code_generation only supports non-streaming mode.
@router.post("/v1/code_generation")
async def code_generation_endpoint(chat_request: ChatCompletionRequest):
if router.use_deepspeed:
responses = []
def send_request(port):
try:
url = f'http://{router.host}:{port}/v1/code_generation'
response = requests.post(url, json=chat_request.dict())
response.raise_for_status()
json_response = json.loads(response.content)
cleaned_code = filter_code_format(json_response['response'])
chat_completion_response = ChatCompletionResponse(response=cleaned_code)
responses.append(chat_completion_response)
except requests.exceptions.RequestException as e:
print(f"Error sending/receiving on port {port}: {e}")
with futures.ThreadPoolExecutor(max_workers=router.world_size) as executor:
worker_ports = [router.port + i + 1 for i in range(router.world_size)]
executor.map(send_request, worker_ports)
if responses:
return responses[0]
else:
ret = check_completion_request(chat_request)
if ret is not None:
raise RuntimeError("Invalid parameter.")
return router.handle_chat_completion_request(chat_request)
# router /v1/code_chat supports both non-streaming and streaming mode.
@router.post("/v1/code_chat")
async def code_chat_endpoint(chat_request: ChatCompletionRequest):
if router.use_deepspeed:
if chat_request.stream:
responses = []
def generate_stream(port):
url = f'http://{router.host}:{port}/v1/code_generation'
response = requests.post(url, json=chat_request.dict(), stream=True, timeout=1000)
responses.append(response)
with futures.ThreadPoolExecutor(max_workers=router.world_size) as executor:
worker_ports = [router.port + i + 1 for i in range(router.world_size)]
executor.map(generate_stream, worker_ports)
while not responses:
pass
def generate():
if responses[0]:
for chunk in responses[0].iter_lines(decode_unicode=False, delimiter=b"\0"):
if chunk:
yield f"data: {chunk}\n\n"
yield f"data: [DONE]\n\n"
return StreamingResponse(generate(), media_type="text/event-stream")
else:
responses = []
def send_request(port):
try:
url = f'http://{router.host}:{port}/v1/code_generation'
response = requests.post(url, json=chat_request.dict())
response.raise_for_status()
json_response = json.loads(response.content)
chat_completion_response = ChatCompletionResponse(response=json_response['response'])
responses.append(chat_completion_response)
except requests.exceptions.RequestException as e:
print(f"Error sending/receiving on port {port}: {e}")
with futures.ThreadPoolExecutor(max_workers=router.world_size) as executor:
worker_ports = [router.port + i + 1 for i in range(router.world_size)]
executor.map(send_request, worker_ports)
if responses:
return responses[0]
else:
ret = check_completion_request(chat_request)
if ret is not None:
raise RuntimeError("Invalid parameter.")
return router.handle_chat_completion_request(chat_request)
@app.get("/")
async def redirect_root_to_docs():
return RedirectResponse("/docs")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)

BIN
CodeGen/copilot-0.0.1.vsix Normal file

Binary file not shown.

View File

@@ -0,0 +1,19 @@
#!/bin/bash

# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
git clone https://github.com/huggingface/tgi-gaudi.git
cd ./tgi-gaudi/
docker build -t tgi_gaudi_codegen . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy

View File

@@ -0,0 +1,50 @@
#!/bin/bash

# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Set default values
default_port=8080
default_model="m-a-p/OpenCodeInterpreter-DS-6.7B"
default_num_cards=1
# All arguments are optional; accept at most three positional parameters
if [ "$#" -gt 3 ]; then
    echo "Usage: $0 [num_cards] [port_number] [model_name]"
    exit 1
fi
# Assign arguments to variables
num_cards=${1:-$default_num_cards}
port_number=${2:-$default_port}
model_name=${3:-$default_model}
# Check if num_cards is within the valid range (1-8)
if [ "$num_cards" -lt 1 ] || [ "$num_cards" -gt 8 ]; then
    echo "Error: num_cards must be between 1 and 8."
    exit 1
fi
# Set the volume variable
volume=$PWD/data
# Build the Docker run command based on the number of cards
if [ "$num_cards" -eq 1 ]; then
    docker_cmd="docker run -p $port_number:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$http_proxy tgi_gaudi_codegen --model-id $model_name"
else
    docker_cmd="docker run -p $port_number:80 -v $volume:/data --runtime=habana -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$http_proxy tgi_gaudi_codegen --model-id $model_name --sharded true --num-shard $num_cards"
fi
# Execute the Docker run command
eval $docker_cmd
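As a quick sanity check (the card count, port, and prompt below are illustrative), you can launch the service with explicit arguments and query the standard text-generation-inference `/generate` route once the container is up:

```bash
# Run from the directory containing this script: 2 Gaudi cards, port 8085, default code model
bash ./launch_tgi_service.sh 2 8085 m-a-p/OpenCodeInterpreter-DS-6.7B

# Verify the endpoint responds
curl http://localhost:8085/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 64}}'
```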

101
DocSum/README.md Normal file
View File

@@ -0,0 +1,101 @@
Text summarization is an NLP task that creates a concise and informative summary of a longer text. LLMs can be used to create summaries of news articles, research papers, technical documents, and other types of text. Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content. In this example use case, we use LangChain to apply some summarization strategies and run LLM inference using Text Generation Inference on Intel Gaudi2.
# Environment Setup
To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Habana Gaudi/Gaudi2, please follow these steps:
## Build TGI Gaudi Docker Image
```bash
bash ./serving/tgi_gaudi/build_docker.sh
```
## Launch TGI Gaudi Service
### Launch a local server instance on 1 Gaudi card:
```bash
bash ./serving/tgi_gaudi/launch_tgi_service.sh
```
For gated models such as `LLAMA-2`, you will have to pass -e HUGGING_FACE_HUB_TOKEN=\<token\> to the docker run command above with a valid Hugging Face Hub read token.
Please follow this link [huggingface token](https://huggingface.co/docs/hub/security-tokens) to get an access token and export the `HUGGINGFACEHUB_API_TOKEN` environment variable with the token.
```bash
export HUGGINGFACEHUB_API_TOKEN=<token>
```
### Launch a local server instance on 8 Gaudi cards:
```bash
bash ./serving/tgi_gaudi/launch_tgi_service.sh 8
```
### Customize TGI Gaudi Service
The ./serving/tgi_gaudi/launch_tgi_service.sh script accepts three parameters:
- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1.
- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080.
- model_name: The model name utilized for LLM, with the default set to "Intel/neural-chat-7b-v3-3".
You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable `TGI_ENDPOINT`:
```bash
export TGI_ENDPOINT="http://xxx.xxx.xxx.xxx:8080"
```
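To confirm the TGI Gaudi service is reachable before starting the summarization backend, you can send a test request to its `generate` route (the prompt and token budget below are arbitrary):

```bash
curl ${TGI_ENDPOINT}/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is deep learning?", "parameters": {"max_new_tokens": 32}}'
```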
## Launch Document Summary Docker
### Build Document Summary Docker Image
```bash
cd langchain/docker/
bash ./build_docker.sh
cd ../../
```
### Launch Document Summary Docker
```bash
docker run -it --net=host --ipc=host -v /var/run/docker.sock:/var/run/docker.sock document-summarize:latest
```
# Start Document Summary Server
## Start the Backend Service
Make sure the TGI Gaudi service is running, then launch the backend service:
```bash
export HUGGINGFACEHUB_API_TOKEN=<token>
nohup python app/server.py &
```
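You can then send a quick request to verify the backend is up. The route and port below are assumptions for illustration only; check `app/server.py` for the actual values:

```bash
# Hypothetical route and port; confirm both in app/server.py
curl http://localhost:8000/v1/text_summarize \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"text": "Text Generation Inference on Intel Gaudi2 serves large language models for summarization."}'
```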
## Start the Frontend Service
Navigate to the "ui" folder and execute the following commands to start the frontend GUI:
```bash
cd ui
sudo apt-get install npm && \
npm install -g n && \
n stable && \
hash -r && \
npm install -g npm@latest
```
For CentOS, please use the following commands instead:
```bash
curl -sL https://rpm.nodesource.com/setup_20.x | sudo bash -
sudo yum install -y nodejs
```
Update the `BASIC_URL` environment variable in the `.env` file by replacing the IP address '127.0.0.1' with the actual IP address.
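For example, the placeholder can be replaced in one step (the IP address below is illustrative; substitute your host's address):

```bash
sed -i 's/127.0.0.1/192.168.1.100/g' .env
```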
Run the following command to install the required dependencies:
```bash
npm install
```
Start the development server by executing the following command:
```bash
nohup npm run dev &
```
This will initiate the frontend service and launch the application.

View File

@@ -0,0 +1,35 @@
FROM langchain/langchain
ARG http_proxy
ARG https_proxy
ENV http_proxy=$http_proxy
ENV https_proxy=$https_proxy
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y \
        libgl1-mesa-glx \
        libjemalloc-dev

RUN pip install --upgrade pip \
    sentence-transformers \
    langchain-cli \
    pydantic==1.10.13 \
    langchain==0.1.12 \
    poetry \
    langchain_benchmarks \
    pyarrow \
    jupyter \
    docx2txt \
    pypdf \
    beautifulsoup4 \
    python-multipart \
    intel-extension-for-pytorch \
    intel-openmp
ENV PYTHONPATH=/ws:/summarize-app/app
COPY summarize-app /summarize-app
WORKDIR /summarize-app
CMD ["/bin/bash"]

View File

@@ -0,0 +1,3 @@
#!/bin/bash
docker build . -t document-summarize:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy

Some files were not shown because too many files have changed in this diff