dify/api/services at 626e71cb3b82d43b1c40d64b5341124b0c1e92e6 - dify - Moonshot Source Control

jandres/dify

mirror of https://github.com/langgenius/dify.git synced 2026-01-07 23:04:12 +00:00

Frederick2313072 626e71cb3b feat: implement content-based deduplication for document segments

- Add database index on (dataset_id, index_node_hash) for efficient deduplication queries
- Add deduplication check in SegmentService.create_segment and multi_create_segment methods
- Add deduplication check in DatasetDocumentStore.add_documents method to prevent duplicate embedding processing
- Skip creating segments with identical content hashes across the entire dataset

This prevents duplicate content from being re-processed and re-embedded when uploading documents with repeated content, improving efficiency and reducing unnecessary compute costs.
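
The deduplication described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this commit: the `segments` table, the `content_hash` helper (plain SHA-256 of the text), and the `add_segments` function are all hypothetical stand-ins, and SQLite is used only so the example runs on its own. Only the `(dataset_id, index_node_hash)` index columns come from the commit message.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    """Hypothetical hash helper: SHA-256 of the UTF-8 encoded segment text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_store() -> sqlite3.Connection:
    """Create an in-memory table standing in for the document segments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segments ("
        " id INTEGER PRIMARY KEY,"
        " dataset_id TEXT NOT NULL,"
        " index_node_hash TEXT NOT NULL,"
        " content TEXT NOT NULL)"
    )
    # Composite index so the per-dataset hash lookup does not scan the table.
    conn.execute(
        "CREATE INDEX segment_dataset_hash_idx"
        " ON segments (dataset_id, index_node_hash)"
    )
    return conn


def add_segments(conn: sqlite3.Connection, dataset_id: str, contents: list[str]) -> list[str]:
    """Insert only contents whose hash is new to the dataset; return what was inserted."""
    created = []
    for content in contents:
        digest = content_hash(content)
        duplicate = conn.execute(
            "SELECT 1 FROM segments"
            " WHERE dataset_id = ? AND index_node_hash = ? LIMIT 1",
            (dataset_id, digest),
        ).fetchone()
        if duplicate is not None:
            # Same content already exists in this dataset: skip it, so it is
            # never handed on for embedding again.
            continue
        conn.execute(
            "INSERT INTO segments (dataset_id, index_node_hash, content)"
            " VALUES (?, ?, ?)",
            (dataset_id, digest, content),
        )
        created.append(content)
    conn.commit()
    return created
```

In this sketch the check is scoped to a single dataset, so the same text can still be stored once in each of two different datasets. A check-then-insert like this is not safe against concurrent writers on its own; a unique constraint would be one way to close that gap, though the commit message only mentions a plain index.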

2025-09-20 06:28:14 +08:00

update sql in batch (#24801)

2025-09-10 13:00:17 +08:00

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

refactor: replace print statements with proper logging (#25773)

2025-09-18 20:35:47 +08:00

Chore: correct inconsistent logging and typo (#25945)

2025-09-19 10:36:16 +08:00

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

fix: handle None description in MCP tool transformation (#25872)

2025-09-18 13:11:38 +08:00

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

__init__.py

chore(api/services): apply ruff reformatting (#7599)

2024-08-26 13:43:57 +08:00

account_service.py

fix: remove billing cache when add or delete app or member (#25885)

2025-09-18 12:18:07 +08:00

advanced_prompt_template_service.py

chore: adopt StrEnum and auto() for some string-typed enums (#25129)

2025-09-12 21:14:26 +08:00

agent_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

annotation_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

api_based_extension_service.py

remove bare list, dict, Sequence, None, Any (#25058)

2025-09-06 03:32:23 +08:00

app_dsl_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

app_generate_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

app_model_config_service.py

remove bare list, dict, Sequence, None, Any (#25058)

2025-09-06 03:32:23 +08:00

app_service.py

fix: remove billing cache when add or delete app or member (#25885)

2025-09-18 12:18:07 +08:00

audio_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

billing_service.py

fix: remove billing cache when add or delete app or member (#25885)

2025-09-18 12:18:07 +08:00

clear_free_plan_tenant_expired_logs.py

update sql in batch (#24801)

2025-09-10 13:00:17 +08:00

code_based_extension_service.py

remove bare list, dict, Sequence, None, Any (#25058)

2025-09-06 03:32:23 +08:00

conversation_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

dataset_service.py

feat: implement content-based deduplication for document segments

2025-09-20 06:28:14 +08:00

datasource_provider_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

external_knowledge_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

feature_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

file_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

hit_testing_service.py

remove bare list, dict, Sequence, None, Any (#25058)

2025-09-06 03:32:23 +08:00

knowledge_service.py

feat: mypy for all type check (#10921)

2024-12-24 18:38:51 +08:00

message_service.py

fix: Message => str (#25876)

2025-09-18 17:57:57 +08:00

metadata_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

model_load_balancing_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

model_provider_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

oauth_server.py

feat: oauth provider (#24206)

2025-08-29 14:10:51 +08:00

operation_service.py

chore(api/services): apply ruff reformatting (#7599)

2024-08-26 13:43:57 +08:00

ops_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

recommended_app_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

saved_message_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

tag_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

variable_truncator.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

vector_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

web_conversation_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

webapp_auth_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

website_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

workflow_app_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

workflow_draft_variable_service.py

feat: knowledge pipeline (#25360)

2025-09-18 12:49:10 +08:00

workflow_run_service.py

chore: add ast-grep rule to convert Optional[T] to T | None (#25560)

2025-09-15 13:06:33 +08:00

workflow_service.py

Refactor WorkflowService to handle missing default credentials gracef… (#25960)

2025-09-19 00:45:35 -07:00

workspace_service.py

Fix basedpyright type errors (#25435)

2025-09-10 01:54:26 +08:00