doc: fix heading levels (#690)
Only one H1 for the title is allowed

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
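The rule in the commit message can also be checked mechanically. The following is a minimal sketch and not part of this commit: the name `demote_extra_h1` and its policy (keep the first H1 as the title and, when more than one H1 exists, push every later heading down one level) are assumptions chosen to mirror what the hunks in this diff do by hand.

```python
import re

# ATX heading: 1-6 '#' characters followed by whitespace and text.
HEADING = re.compile(r"^(#{1,6})(\s+.*)$")
# Opening or closing line of a fenced code block.
FENCE = re.compile(r"^\s*(```|~~~)")


def demote_extra_h1(text: str) -> str:
    """Keep the first H1; if others exist, demote every later heading by one level."""
    lines = text.split("\n")
    in_fence = False
    headings = []  # (line index, level) for headings outside code fences
    for i, line in enumerate(lines):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # '# ...' inside a code block is a comment, not a heading
        match = HEADING.match(line)
        if match:
            headings.append((i, len(match.group(1))))

    h1_lines = [i for i, level in headings if level == 1]
    if len(h1_lines) <= 1:
        return text  # already compliant

    title_line = h1_lines[0]
    for i, level in headings:
        if i > title_line and level < 6:
            lines[i] = "#" + lines[i]
    return "\n".join(lines)
```

Shell comments such as `# a comment` inside fenced blocks are left alone because the sketch tracks fence state before testing for headings.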
@@ -6,9 +6,9 @@ For dataprep microservice, we currently provide one framework: `Langchain`.

 We organized the folders in the same way, so you can use either framework for dataprep microservice with the following constructions.

-# 🚀1. Start Microservice with Python (Option 1)
+## 🚀1. Start Microservice with Python (Option 1)

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 Install Single-process version (for 1-10 files processing)

@@ -25,11 +25,11 @@ pip install -r requirements.txt
 cd langchain_ray; pip install -r requirements_ray.txt
 ``` -->

-## 1.2 Start VDMS Server
+### 1.2 Start VDMS Server

-Please refer to this [readme](../../vectorstores/vdms/README.md).
+Refer to this [readme](../../vectorstores/vdms/README.md).

-## 1.3 Setup Environment Variables
+### 1.3 Setup Environment Variables

 ```bash
 export http_proxy=${your_http_proxy}
@@ -40,7 +40,7 @@ export COLLECTION_NAME=${your_collection_name}
 export PYTHONPATH=${path_to_comps}
 ```

-## 1.4 Start Document Preparation Microservice for VDMS with Python Script
+### 1.4 Start Document Preparation Microservice for VDMS with Python Script

 Start document preparation microservice for VDMS with below command.

@@ -56,13 +56,13 @@ python prepare_doc_vdms.py
 python prepare_doc_redis_on_ray.py
 ``` -->

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Start VDMS Server
+### 2.1 Start VDMS Server

-Please refer to this [readme](../../vectorstores/vdms/README.md).
+Refer to this [readme](../../vectorstores/vdms/README.md).

-## 2.2 Setup Environment Variables
+### 2.2 Setup Environment Variables

 ```bash
 export http_proxy=${your_http_proxy}
@@ -76,16 +76,16 @@ export DISTANCE_STRATEGY="L2"
 export PYTHONPATH=${path_to_comps}
 ```

-## 2.3 Build Docker Image
+### 2.3 Build Docker Image

 - Build docker image with langchain

-Start single-process version (for 1-10 files processing)
+Start single-process version (for 1-10 files processing)

-```bash
-cd ../../../
-docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain/Dockerfile .
-```
+```bash
+cd ../../../
+docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain/Dockerfile .
+```

 <!-- - option 2: Start multi-process version (for >10 files processing)

@@ -93,7 +93,7 @@ docker build -t opea/dataprep-vdms:latest --build-arg https_proxy=$https_proxy -
 cd ../../../../
 docker build -t opea/dataprep-on-ray-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/langchain_ray/Dockerfile . -->

-## 2.4 Run Docker with CLI
+### 2.4 Run Docker with CLI

 Start single-process version (for 1-10 files processing)

@@ -113,13 +113,13 @@ docker run -d --name="dataprep-vdms-server" -p 6007:6007 --runtime=runc --ipc=ho
 -e TIMEOUT_SECONDS=600 opea/dataprep-on-ray-vdms:latest
 ``` -->

-# 🚀3. Status Microservice
+## 🚀3. Status Microservice

 ```bash
 docker container logs -f dataprep-vdms-server
 ```

-# 🚀4. Consume Microservice
+## 🚀4. Consume Microservice

 Once document preparation microservice for VDMS is started, user can use below command to invoke the microservice to convert the document to embedding and save to the database.

@@ -127,61 +127,61 @@ Make sure the file path after `files=@` is correct.

 - Single file upload

-```bash
-curl -X POST \
--H "Content-Type: multipart/form-data" \
--F "files=@./file1.txt" \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-H "Content-Type: multipart/form-data" \
+-F "files=@./file1.txt" \
+http://localhost:6007/v1/dataprep
+```

-You can specify chunk_size and chunk_size by the following commands.
+You can specify `chunk_size` and `chunk_overlap` by the following commands.

-```bash
-curl -X POST \
--H "Content-Type: multipart/form-data" \
--F "files=@./LLAMA2_page6.pdf" \
--F "chunk_size=1500" \
--F "chunk_overlap=100" \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-H "Content-Type: multipart/form-data" \
+-F "files=@./LLAMA2_page6.pdf" \
+-F "chunk_size=1500" \
+-F "chunk_overlap=100" \
+http://localhost:6007/v1/dataprep
+```

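The splitter the service uses is not shown in this diff, so the following is only an illustration of what `chunk_size` and `chunk_overlap` usually mean for a character-based splitter. `split_text` is a hypothetical helper, not the microservice's code, and the real service may split on separators rather than at fixed offsets.

```python
def split_text(text, chunk_size=1500, chunk_overlap=100):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

With the values from the curl command, each window advances by 1400 characters, so consecutive chunks share 100 characters of context.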
 - Multiple file upload

-```bash
-curl -X POST \
--H "Content-Type: multipart/form-data" \
--F "files=@./file1.txt" \
--F "files=@./file2.txt" \
--F "files=@./file3.txt" \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-H "Content-Type: multipart/form-data" \
+-F "files=@./file1.txt" \
+-F "files=@./file2.txt" \
+-F "files=@./file3.txt" \
+http://localhost:6007/v1/dataprep
+```

-- Links upload (not supported for llama_index now)
+- Links upload (not supported for `llama_index` now)

-```bash
-curl -X POST \
--F 'link_list=["https://www.ces.tech/"]' \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-F 'link_list=["https://www.ces.tech/"]' \
+http://localhost:6007/v1/dataprep
+```

-or
+or

-```python
-import requests
-import json
+```python
+import requests
+import json

-proxies = {"http": ""}
-url = "http://localhost:6007/v1/dataprep"
-urls = [
-    "https://towardsdatascience.com/no-gpu-no-party-fine-tune-bert-for-sentiment-analysis-with-vertex-ai-custom-jobs-d8fc410e908b?source=rss----7f60cf5620c9---4"
-]
-payload = {"link_list": json.dumps(urls)}
+proxies = {"http": ""}
+url = "http://localhost:6007/v1/dataprep"
+urls = [
+    "https://towardsdatascience.com/no-gpu-no-party-fine-tune-bert-for-sentiment-analysis-with-vertex-ai-custom-jobs-d8fc410e908b?source=rss----7f60cf5620c9---4"
+]
+payload = {"link_list": json.dumps(urls)}

-try:
-    resp = requests.post(url=url, data=payload, proxies=proxies)
-    print(resp.text)
-    resp.raise_for_status()  # Raise an exception for unsuccessful HTTP status codes
-    print("Request successful!")
-except requests.exceptions.RequestException as e:
-    print("An error occurred:", e)
-```
+try:
+    resp = requests.post(url=url, data=payload, proxies=proxies)
+    print(resp.text)
+    resp.raise_for_status()  # Raise an exception for unsuccessful HTTP status codes
+    print("Request successful!")
+except requests.exceptions.RequestException as e:
+    print("An error occurred:", e)
+```

@@ -2,25 +2,25 @@

 For dataprep microservice, we currently provide one framework: `Langchain`.

-# 🚀1. Start Microservice with Python (Option 1)
+## 🚀1. Start Microservice with Python (Option 1)

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 - option 1: Install Single-process version (for 1-10 files processing)

-```bash
-apt-get update
-apt-get install -y default-jre tesseract-ocr libtesseract-dev poppler-utils
-pip install -r requirements.txt
-```
+```bash
+apt-get update
+apt-get install -y default-jre tesseract-ocr libtesseract-dev poppler-utils
+pip install -r requirements.txt
+```

-## 1.2 Start VDMS Server
+### 1.2 Start VDMS Server

 ```bash
 docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
 ```

-## 1.3 Setup Environment Variables
+### 1.3 Setup Environment Variables

 ```bash
 export http_proxy=${your_http_proxy}
@@ -33,7 +33,7 @@ export your_hf_api_token="{your_hf_token}"
 export PYTHONPATH=${path_to_comps}
 ```

-## 1.4 Start Data Preparation Microservice for VDMS with Python Script
+### 1.4 Start Data Preparation Microservice for VDMS with Python Script

 Start document preparation microservice for VDMS with below command.

@@ -41,15 +41,15 @@ Start document preparation microservice for VDMS with below command.
 python ingest_videos.py
 ```

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Start VDMS Server
+### 2.1 Start VDMS Server

 ```bash
 docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
 ```

-## 2.1 Setup Environment Variables
+### 2.1 Setup Environment Variables

 ```bash
 export http_proxy=${your_http_proxy}
@@ -61,29 +61,29 @@ export INDEX_NAME="rag-vdms"
 export your_hf_api_token="{your_hf_token}"
 ```

-## 2.3 Build Docker Image
+### 2.3 Build Docker Image

 - Build docker image

-```bash
-cd ../../../
-docker build -t opea/dataprep-vdms:latest --network host --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/multimodal_langchain/Dockerfile .
+```bash
+cd ../../../
+docker build -t opea/dataprep-vdms:latest --network host --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/vdms/multimodal_langchain/Dockerfile .

-```
+```

-## 2.4 Run Docker Compose
+### 2.4 Run Docker Compose

 ```bash
 docker compose -f comps/dataprep/vdms/multimodal_langchain/docker-compose-dataprep-vdms.yaml up -d
 ```

-# 🚀3. Status Microservice
+## 🚀3. Status Microservice

 ```bash
 docker container logs -f dataprep-vdms-server
 ```

-# 🚀4. Consume Microservice
+## 🚀4. Consume Microservice

 Once data preparation microservice for VDMS is started, user can use below command to invoke the microservice to convert the videos to embedding and save to the database.

@@ -91,34 +91,34 @@ Make sure the file path after `files=@` is correct.

 - Single file upload

-```bash
-curl -X POST \
--H "Content-Type: multipart/form-data" \
--F "files=@./file1.mp4" \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-H "Content-Type: multipart/form-data" \
+-F "files=@./file1.mp4" \
+http://localhost:6007/v1/dataprep
+```

 - Multiple file upload

-```bash
-curl -X POST \
--H "Content-Type: multipart/form-data" \
--F "files=@./file1.mp4" \
--F "files=@./file2.mp4" \
--F "files=@./file3.mp4" \
-http://localhost:6007/v1/dataprep
-```
+```bash
+curl -X POST \
+-H "Content-Type: multipart/form-data" \
+-F "files=@./file1.mp4" \
+-F "files=@./file2.mp4" \
+-F "files=@./file3.mp4" \
+http://localhost:6007/v1/dataprep
+```

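The multi-file upload can also be issued from Python. This is a sketch under assumptions: it presumes the third-party `requests` package is installed and that the endpoint accepts the repeated `files` form field exactly as in the curl command. `build_files_field`, `upload_videos`, and the 60-second timeout are hypothetical choices, not part of the README.

```python
from pathlib import Path

# Endpoint taken from the curl examples in this README.
DATAPREP_URL = "http://localhost:6007/v1/dataprep"


def build_files_field(paths):
    """Build the repeated multipart `files` parts accepted by requests.post(files=...)."""
    parts = []
    for path in map(Path, paths):
        if not path.is_file():
            raise FileNotFoundError(f"no such file: {path}")
        # Reads each file fully into memory; acceptable for a sketch,
        # but large videos may be better passed as open file objects.
        parts.append(("files", (path.name, path.read_bytes())))
    return parts


def upload_videos(paths, url=DATAPREP_URL, timeout=60):
    """POST all files in one multipart request and return the response body."""
    import requests  # third-party; imported here so the helper above needs only the stdlib

    resp = requests.post(url, files=build_files_field(paths), timeout=timeout)
    resp.raise_for_status()  # raise on unsuccessful HTTP status codes
    return resp.text
```

Calling `upload_videos(["./file1.mp4", "./file2.mp4", "./file3.mp4"])` would send the same three parts as the curl command; `requests` sets the multipart `Content-Type` header and boundary itself, so no explicit header is passed.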
 - List of uploaded files

-```bash
-curl -X GET http://localhost:6007/v1/dataprep/get_videos
-```
+```bash
+curl -X GET http://localhost:6007/v1/dataprep/get_videos
+```

 - Download uploaded files

-Please use the file name from the list
+Use the file name from the list

-```bash
-curl -X GET http://localhost:6007/v1/dataprep/get_file/${filename}
-```
+```bash
+curl -X GET http://localhost:6007/v1/dataprep/get_file/${filename}
+```
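The list and download calls above can be combined in a small standard-library client. This is a sketch only: the response format of `get_videos` is not shown in this diff, so `list_videos` returns the raw body without parsing it, and percent-encoding the file name in `file_url` is an assumption about how the server expects names with special characters. All three function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Base endpoint taken from the curl examples in this README.
BASE_URL = "http://localhost:6007/v1/dataprep"


def file_url(filename, base_url=BASE_URL):
    """Build the get_file URL, percent-encoding the file name for use in a path."""
    return f"{base_url}/get_file/{urllib.parse.quote(filename, safe='')}"


def list_videos(base_url=BASE_URL, timeout=30):
    """Return the raw response body of the get_videos endpoint."""
    with urllib.request.urlopen(f"{base_url}/get_videos", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def download_file(filename, dest, base_url=BASE_URL, timeout=30):
    """Download one uploaded file to the local path dest and return that path."""
    url = file_url(filename, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        out.write(resp.read())
    return dest
```

A typical flow would call `list_videos()` first, pick a name from its output, and pass that name to `download_file`.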