Text summarization is an NLP task that produces a concise, informative summary of a longer text. LLMs can be used to summarize news articles, research papers, technical documents, and other kinds of text. Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) whose content you want to summarize. In this example use case, we use LangChain to apply several summarization strategies and run LLM inference with Text Generation Inference (TGI) on Intel Gaudi2.
Environment Setup
To use 🤗 text-generation-inference on Habana Gaudi/Gaudi2, please follow these steps:
Build TGI Gaudi Docker Image
bash ./serving/tgi_gaudi/build_docker.sh
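If the build succeeds, the image should show up in your local image list; the tgi_gaudi name below is an assumption, so check build_docker.sh for the tag it actually applies:
# List local images and filter for the freshly built TGI Gaudi image (tag assumed)
docker images | grep tgi_gaudi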
Launch TGI Gaudi Service
Launch a local server instance on 1 Gaudi card:
bash ./serving/tgi_gaudi/launch_tgi_service.sh
For gated models such as LLAMA-2, you will have to pass -e HUGGING_FACE_HUB_TOKEN=<token> to the docker run command invoked by the launch script, with a valid Hugging Face Hub read token.
Please follow the Hugging Face guide on user access tokens (https://huggingface.co/docs/hub/security-tokens) to get an access token, and export it as the HUGGINGFACEHUB_API_TOKEN environment variable:
export HUGGINGFACEHUB_API_TOKEN=<token>
Launch a local server instance on 8 Gaudi cards:
bash ./serving/tgi_gaudi/launch_tgi_service.sh 8
Customize TGI Gaudi Service
The ./serving/tgi_gaudi/launch_tgi_service.sh script accepts three parameters:
- num_cards: The number of Gaudi cards to be utilized, ranging from 1 to 8. The default is set to 1.
- port_number: The port number assigned to the TGI Gaudi endpoint, with the default being 8080.
- model_name: The name of the model to serve, with the default set to "Intel/neural-chat-7b-v3-3".
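For example, assuming the script takes these parameters positionally in the order listed above, the following command would serve the default model on all 8 cards behind port 8085 (an illustrative port choice):
bash ./serving/tgi_gaudi/launch_tgi_service.sh 8 8085 "Intel/neural-chat-7b-v3-3"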
You have the flexibility to customize these parameters according to your specific needs. Additionally, you can set the TGI Gaudi endpoint by exporting the environment variable TGI_ENDPOINT:
export TGI_ENDPOINT="http://xxx.xxx.xxx.xxx:8080"
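Once the service is up, you can sanity-check it with TGI's standard /generate route; the prompt and max_new_tokens value below are just illustrative:
# Quick smoke test against the TGI /generate endpoint
curl ${TGI_ENDPOINT}/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":32}}'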
Launch Document Summary Docker
Build Document Summary Docker Image (Optional)
cd langchain/docker/
bash ./build_docker.sh
cd ../../
Launch Document Summary Docker
docker run -it --net=host --ipc=host -e http_proxy=${http_proxy} -e https_proxy=${https_proxy} -v /var/run/docker.sock:/var/run/docker.sock intel/gen-ai-examples:document-summarize bash
Start Document Summary Server
Start the Backend Service
Make sure the TGI Gaudi service is running, then launch the backend service:
export HUGGINGFACEHUB_API_TOKEN=<token>
nohup python app/server.py &
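With the server backgrounded via nohup, its logs are written to nohup.out. You can optionally probe the summarization API; the port, route, and payload key below are assumptions for illustration, so check app/server.py for the actual interface:
# Hypothetical smoke test: port 8000, route /v1/text_summarize, and key "text" are assumed
curl http://localhost:8000/v1/text_summarize \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"text":"Paste a long document here to summarize."}'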
Start the Frontend Service
Navigate to the "ui" folder and execute the following commands to start the frontend GUI:
cd ui
sudo apt-get install npm && \
npm install -g n && \
n stable && \
hash -r && \
npm install -g npm@latest
For CentOS, please use the following commands instead:
curl -sL https://rpm.nodesource.com/setup_20.x | sudo bash -
sudo yum install -y nodejs
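Either way, verify that a recent Node.js and npm are available before continuing:
# Print the installed Node.js and npm versions
node -v && npm -v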
Update the BASIC_URL environment variable in the .env file by replacing the IP address '127.0.0.1' with the actual IP address of the host running the backend service.
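For example, if the backend host's address were 192.168.1.100 (an illustrative placeholder), the update could be made in place with sed:
# Replace the loopback address in .env with the real host IP (placeholder shown)
sed -i 's/127.0.0.1/192.168.1.100/' .env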
Run the following command to install the required dependencies:
npm install
Start the development server by executing the following command:
nohup npm run dev &
This will initiate the frontend service and launch the application.
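Because the process is backgrounded with nohup, the dev server's startup messages, including the local URL it is serving, are captured in nohup.out:
# Inspect the dev server output for the URL to open in a browser
tail nohup.out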