doc: fix heading levels (#690)

Only one H1 for the title is allowed

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
David Kinder
2024-09-13 21:36:41 -04:00
committed by GitHub
parent 3c5fc80570
commit f8f8854f9c
2 changed files with 108 additions and 108 deletions


@@ -6,9 +6,9 @@ For dataprep microservice, we currently provide one framework: `Langchain`.
We organized the folders in the same way, so you can use either framework for dataprep microservice with the following constructions.
# 🚀1. Start Microservice with Python (Option 1)
## 🚀1. Start Microservice with Python (Option 1)
## 1.1 Install Requirements
### 1.1 Install Requirements
Install Single-process version (for 1-10 files processing)
@@ -25,11 +25,11 @@ pip install -r requirements.txt
cd langchain_ray; pip install -r requirements_ray.txt
``` -->
## 1.2 Start VDMS Server
### 1.2 Start VDMS Server
Please refer to this [readme](../../vectorstores/vdms/README.md).
Refer to this [readme](../../vectorstores/vdms/README.md).
## 1.3 Setup Environment Variables
### 1.3 Setup Environment Variables
```bash
export http_proxy=${your_http_proxy}
@@ -40,7 +40,7 @@ export COLLECTION_NAME=${your_collection_name}
export PYTHONPATH=${path_to_comps}
```
## 1.4 Start Document Preparation Microservice for VDMS with Python Script
### 1.4 Start Document Preparation Microservice for VDMS with Python Script
Start document preparation microservice for VDMS with below command.
@@ -56,13 +56,13 @@ python prepare_doc_vdms.py
python prepare_doc_redis_on_ray.py
``` -->
# 🚀2. Start Microservice with Docker (Option 2)
## 🚀2. Start Microservice with Docker (Option 2)
## 2.1 Start VDMS Server
### 2.1 Start VDMS Server
Please refer to this [readme](../../vectorstores/vdms/README.md).
Refer to this [readme](../../vectorstores/vdms/README.md).
## 2.2 Setup Environment Variables
### 2.2 Setup Environment Variables
```bash
export http_proxy=${your_http_proxy}
@@ -76,16 +76,16 @@ export DISTANCE_STRATEGY="L2"
export PYTHONPATH=${path_to_comps}
```
## 2.3 Build Docker Image
### 2.3 Build Docker Image
- Build docker image with langchain
Start single-process version (for 1-10 files processing)
Start single-process version (for 1-10 files processing)
```bash
cd ../../../
docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain/Dockerfile .
```
```bash
cd ../../../
docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain/Dockerfile .
```
<!-- - option 2: Start multi-process version (for >10 files processing)
@@ -93,7 +93,7 @@ docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy -
cd ../../../../
docker build -t opea/dataprep-on-ray-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain_ray/Dockerfile . -->
## 2.4 Run Docker with CLI
### 2.4 Run Docker with CLI
Start single-process version (for 1-10 files processing)
@@ -113,13 +113,13 @@ docker run -d --name="dataprep-vdms-server" -p 6007:6007 --runtime=runc --ipc=ho
-e TIMEOUT_SECONDS=600 opea/dataprep-on-ray-vdms:latest
``` -->
# 🚀3. Status Microservice
## 🚀3. Status Microservice
```bash
docker container logs -f dataprep-vdms-server
```
# 🚀4. Consume Microservice
## 🚀4. Consume Microservice
Once document preparation microservice for VDMS is started, user can use below command to invoke the microservice to convert the document to embedding and save to the database.
@@ -127,61 +127,61 @@ Make sure the file path after `files=@` is correct.
- Single file upload
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.txt" \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.txt" \
http://localhost:6007/v1/dataprep
```
You can specify chunk_size and chunk_size by the following commands.
You can specify `chunk_size` and `chunk_overlap` by the following commands.
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./LLAMA2_page6.pdf" \
-F "chunk_size=1500" \
-F "chunk_overlap=100" \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./LLAMA2_page6.pdf" \
-F "chunk_size=1500" \
-F "chunk_overlap=100" \
http://localhost:6007/v1/dataprep
```
- Multiple file upload
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.txt" \
-F "files=@./file2.txt" \
-F "files=@./file3.txt" \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.txt" \
-F "files=@./file2.txt" \
-F "files=@./file3.txt" \
http://localhost:6007/v1/dataprep
```
- Links upload (not supported for llama_index now)
- Links upload (not supported for `llama_index` now)
```bash
curl -X POST \
-F 'link_list=["https://www.ces.tech/"]' \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-F 'link_list=["https://www.ces.tech/"]' \
http://localhost:6007/v1/dataprep
```
or
or
```python
import requests
import json
```python
import requests
import json
proxies = {"http": ""}
url = "http://localhost:6007/v1/dataprep"
urls = [
"https://towardsdatascience.com/no-gpu-no-party-fine-tune-bert-for-sentiment-analysis-with-vertex-ai-custom-jobs-d8fc410e908b?source=rss----7f60cf5620c9---4"
]
payload = {"link_list": json.dumps(urls)}
proxies = {"http": ""}
url = "http://localhost:6007/v1/dataprep"
urls = [
"https://towardsdatascience.com/no-gpu-no-party-fine-tune-bert-for-sentiment-analysis-with-vertex-ai-custom-jobs-d8fc410e908b?source=rss----7f60cf5620c9---4"
]
payload = {"link_list": json.dumps(urls)}
try:
resp = requests.post(url=url, data=payload, proxies=proxies)
print(resp.text)
resp.raise_for_status() # Raise an exception for unsuccessful HTTP status codes
print("Request successful!")
except requests.exceptions.RequestException as e:
print("An error occurred:", e)
```
try:
resp = requests.post(url=url, data=payload, proxies=proxies)
print(resp.text)
resp.raise_for_status() # Raise an exception for unsuccessful HTTP status codes
print("Request successful!")
except requests.exceptions.RequestException as e:
print("An error occurred:", e)
```
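The `chunk_size` and `chunk_overlap` upload shown in the curl commands above can also be sketched in Python. This is a minimal sketch, not part of the commit: it assumes the endpoint and form field names from the curl examples, and the helper names `build_form_fields` and `upload_files` are hypothetical. The range checks are client-side assumptions; the service may apply its own rules.

```python
import os
from contextlib import ExitStack
from typing import Dict, List, Optional

# Endpoint taken from the curl examples in the README.
DATAPREP_URL = "http://localhost:6007/v1/dataprep"


def build_form_fields(
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
) -> Dict[str, str]:
    """Return the optional form fields as strings, omitting unset values.

    The range checks are assumptions made for this sketch only.
    """
    fields: Dict[str, str] = {}
    if chunk_size is not None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        fields["chunk_size"] = str(chunk_size)
    if chunk_overlap is not None:
        if chunk_overlap < 0:
            raise ValueError("chunk_overlap must not be negative")
        fields["chunk_overlap"] = str(chunk_overlap)
    return fields


def upload_files(
    paths: List[str],
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
) -> str:
    """POST one or more files as repeated `files` parts (hypothetical helper)."""
    import requests  # imported lazily so build_form_fields works without it

    with ExitStack() as stack:
        # One ("files", (filename, handle)) tuple per file mirrors the
        # repeated -F "files=@..." flags of the curl command.
        files = [
            ("files", (os.path.basename(p), stack.enter_context(open(p, "rb"))))
            for p in paths
        ]
        resp = requests.post(
            DATAPREP_URL,
            files=files,
            data=build_form_fields(chunk_size, chunk_overlap),
        )
    resp.raise_for_status()  # raise on unsuccessful HTTP status codes
    return resp.text
```

With the service running, `upload_files(["./LLAMA2_page6.pdf"], chunk_size=1500, chunk_overlap=100)` would correspond to the second curl command.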


@@ -2,25 +2,25 @@
For dataprep microservice, we currently provide one framework: `Langchain`.
# 🚀1. Start Microservice with Python (Option 1)
## 🚀1. Start Microservice with Python (Option 1)
## 1.1 Install Requirements
### 1.1 Install Requirements
- option 1: Install Single-process version (for 1-10 files processing)
```bash
apt-get update
apt-get install -y default-jre tesseract-ocr libtesseract-dev poppler-utils
pip install -r requirements.txt
```
```bash
apt-get update
apt-get install -y default-jre tesseract-ocr libtesseract-dev poppler-utils
pip install -r requirements.txt
```
## 1.2 Start VDMS Server
### 1.2 Start VDMS Server
```bash
docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
```
## 1.3 Setup Environment Variables
### 1.3 Setup Environment Variables
```bash
export http_proxy=${your_http_proxy}
@@ -33,7 +33,7 @@ export your_hf_api_token="{your_hf_token}"
export PYTHONPATH=${path_to_comps}
```
## 1.4 Start Data Preparation Microservice for VDMS with Python Script
### 1.4 Start Data Preparation Microservice for VDMS with Python Script
Start document preparation microservice for VDMS with below command.
@@ -41,15 +41,15 @@ Start document preparation microservice for VDMS with below command.
python ingest_videos.py
```
# 🚀2. Start Microservice with Docker (Option 2)
## 🚀2. Start Microservice with Docker (Option 2)
## 2.1 Start VDMS Server
### 2.1 Start VDMS Server
```bash
docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
```
## 2.1 Setup Environment Variables
### 2.1 Setup Environment Variables
```bash
export http_proxy=${your_http_proxy}
@@ -61,29 +61,29 @@ export INDEX_NAME="rag-vdms"
export your_hf_api_token="{your_hf_token}"
```
## 2.3 Build Docker Image
### 2.3 Build Docker Image
- Build docker image
```bash
cd ../../../
docker build -t opea/dataprep-vdms:latest --network host --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/multimodal_langchain/Dockerfile .
```bash
cd ../../../
docker build -t opea/dataprep-vdms:latest --network host --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/multimodal_langchain/Dockerfile .
```
```
## 2.4 Run Docker Compose
### 2.4 Run Docker Compose
```bash
docker compose -f comps/dataprep/vdms/multimodal_langchain/docker-compose-dataprep-vdms.yaml up -d
```
# 🚀3. Status Microservice
## 🚀3. Status Microservice
```bash
docker container logs -f dataprep-vdms-server
```
# 🚀4. Consume Microservice
## 🚀4. Consume Microservice
Once data preparation microservice for VDMS is started, user can use below command to invoke the microservice to convert the videos to embedding and save to the database.
@@ -91,34 +91,34 @@ Make sure the file path after `files=@` is correct.
- Single file upload
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.mp4" \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.mp4" \
http://localhost:6007/v1/dataprep
```
- Multiple file upload
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.mp4" \
-F "files=@./file2.mp4" \
-F "files=@./file3.mp4" \
http://localhost:6007/v1/dataprep
```
```bash
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./file1.mp4" \
-F "files=@./file2.mp4" \
-F "files=@./file3.mp4" \
http://localhost:6007/v1/dataprep
```
- List of uploaded files
```bash
curl -X GET http://localhost:6007/v1/dataprep/get_videos
```
```bash
curl -X GET http://localhost:6007/v1/dataprep/get_videos
```
- Download uploaded files
Please use the file name from the list
Use the file name from the list
```bash
curl -X GET http://localhost:6007/v1/dataprep/get_file/${filename}
```
```bash
curl -X GET http://localhost:6007/v1/dataprep/get_file/${filename}
```
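The list and download calls above can be combined in a short Python sketch using only the standard library. It is not part of the commit: the function names are hypothetical, and it assumes that `get_videos` returns a JSON body and that file names should be percent-encoded in the URL path.

```python
import json
import shutil
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl examples in the README.
BASE_URL = "http://localhost:6007/v1/dataprep"


def file_url(filename: str) -> str:
    """Build the get_file URL, percent-encoding the file name (assumption)."""
    return f"{BASE_URL}/get_file/{quote(filename, safe='')}"


def list_videos():
    """Return the parsed get_videos response (assumed to be JSON)."""
    with urlopen(f"{BASE_URL}/get_videos") as resp:
        return json.load(resp)


def download(filename: str, dest: str) -> None:
    """Stream one uploaded file to a local path."""
    with urlopen(file_url(filename)) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

With the service running, `list_videos()` would supply the names to pass to `download`.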