Frederick2313072
626e71cb3b
feat: implement content-based deduplication for document segments
...
- Add database index on (dataset_id, index_node_hash) for efficient deduplication queries
- Add deduplication check in SegmentService.create_segment and multi_create_segment methods
- Add deduplication check in DatasetDocumentStore.add_documents method to prevent duplicate embedding processing
- Skip creating segments with identical content hashes across the entire dataset
This prevents duplicate content from being re-processed and re-embedded when uploading documents with repeated content, improving efficiency and reducing unnecessary compute costs.
2025-09-20 06:28:14 +08:00
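The deduplication described in commit 626e71cb3b can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from the commit: it assumes SHA-256 as the content hash, uses an in-memory SQLite table in place of the real segment table, and the names content_hash, open_store, add_segments, and segment_dataset_hash_idx are hypothetical. Only the column pair (dataset_id, index_node_hash) comes from the commit message.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    """Hypothetical helper: SHA-256 hex digest of the segment text (assumed hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the segment table (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segments ("
        "id INTEGER PRIMARY KEY, "
        "dataset_id TEXT NOT NULL, "
        "index_node_hash TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    # Composite index so the duplicate lookup below can avoid a full table scan.
    conn.execute(
        "CREATE INDEX segment_dataset_hash_idx "
        "ON segments (dataset_id, index_node_hash)"
    )
    return conn


def add_segments(conn: sqlite3.Connection, dataset_id: str, contents: list[str]) -> list[str]:
    """Insert only contents whose hash is not yet in the dataset; return what was inserted."""
    created = []
    for text in contents:
        digest = content_hash(text)
        duplicate = conn.execute(
            "SELECT 1 FROM segments "
            "WHERE dataset_id = ? AND index_node_hash = ? LIMIT 1",
            (dataset_id, digest),
        ).fetchone()
        if duplicate:
            # Same content already exists somewhere in this dataset: skip it,
            # so it is never handed to the (expensive) embedding step again.
            continue
        conn.execute(
            "INSERT INTO segments (dataset_id, index_node_hash, content) "
            "VALUES (?, ?, ?)",
            (dataset_id, digest, text),
        )
        created.append(text)
    return created
```

In this sketch the check is scoped to one dataset, so the same text may still be stored once in each of two different datasets. A check-then-insert like this is not safe under concurrent writers; a unique constraint on the same column pair would be one way to close that gap, though the commit message only mentions a plain index.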
quicksand
680eb7a9f6
fix(datasets): retrieval_model null issue when updating dataset info ( #25907 )
2025-09-18 17:58:06 +08:00
-LAN-
85cda47c70
feat: knowledge pipeline ( #25360 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: twwu <twwu@dify.ai>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: Wu Tianwei <30284043+WTW0313@users.noreply.github.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
Co-authored-by: lyzno1 <yuanyouhuilyz@gmail.com>
Co-authored-by: quicksand <quicksandzn@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: lyzno1 <92089059+lyzno1@users.noreply.github.com>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: Yongtao Huang <yongtaoh2022@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: nite-knite <nkCoding@gmail.com>
Co-authored-by: Hanqing Zhao <sherry9277@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry <xh001x@hotmail.com>
2025-09-18 12:49:10 +08:00
-LAN-
bab4975809
chore: add ast-grep rule to convert Optional[T] to T | None ( #25560 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-15 13:06:33 +08:00
Krito.
a13d7987e0
chore: adopt StrEnum and auto() for some string-typed enums ( #25129 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-12 21:14:26 +08:00
kenwoodjw
c91253d05d
fix segment deletion race condition ( #24408 )
...
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-12 15:29:57 +08:00
Eric Guo
70e4d6be34
Fix 500 in dataset page. ( #25474 )
2025-09-10 15:57:04 +08:00
Asuka Minato
cbc0e639e4
update sql in batch ( #24801 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-09-10 13:00:17 +08:00
-LAN-
08dd3f7b50
Fix basedpyright type errors ( #25435 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-10 01:54:26 +08:00
ttz12345
d2e50a508c
Fix:About the error problem of creating an empty knowledge base interface in service_api ( #25398 )
...
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-09 15:18:31 +08:00
Asuka Minato
16a3e21410
more assert ( #24996 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-08 09:59:43 +08:00
-LAN-
9b8a03b53b
[Chore/Refactor] Improve type annotations in models module ( #25281 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-09-08 09:42:27 +08:00
Asuka Minato
a78339a040
remove bare list, dict, Sequence, None, Any ( #25058 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2025-09-06 03:32:23 +08:00
-LAN-
a2e0f80c01
[Chore/Refactor] Improve type checking configuration ( #25185 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-09-05 08:34:18 +08:00
-LAN-
53c4a8787f
[Chore/Refactor] Improve type safety and resolve type checking issues ( #25104 )
2025-09-04 09:35:32 +08:00
Bowen Liang
7b379e2a61
chore: apply ty checks on api code with script and ci action ( #24653 )
2025-09-02 16:05:13 +08:00
Frederick2313072
5b3cc560d5
fix:hard-coded top-k fallback issue. ( #24879 )
2025-09-01 15:46:37 +08:00
Asuka Minato
24e2b72b71
Update ast-grep pattern for session.query ( #24828 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-31 17:03:51 +08:00
Yongtao Huang
2a29c61041
Refactor: replace count() > 0 check with exists() ( #24583 )
...
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-27 17:46:52 +08:00
Yongtao Huang
826f19e968
Chore : rm dead code detected by pylance ( #24588 )
2025-08-27 13:19:40 +08:00
Yongtao Huang
fa753239ad
Refactor: use logger = logging.getLogger(__name__) in logging ( #24515 )
...
Co-authored-by: Yongtao Huang <99629139+hyongtao-db@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-08-26 18:10:31 +08:00
Muke Wang
044ad5100e
fix: Update doc word count after delete chunks ( #24435 )
...
Co-authored-by: wangmuke <wangmuke@kingsware.cn>
2025-08-25 12:08:34 +08:00
-LAN-
da9af7b547
[Chore/Refactor] Use centralized naive_utc_now for UTC datetime operations ( #24352 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-08-22 23:53:05 +08:00
huangzhuo1949
1caeac56f2
fix: dataset doc-form compatible ( #24177 )
...
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-08-20 23:48:56 +08:00
Zhehao Peng
c0702aacac
Use typing.Literal to replace str places ( #24099 )
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-18 21:34:13 +08:00
Yongtao Huang
76d123fe19
Fix segment query tenant bug and variable naming typo ( #23321 )
...
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-08-03 18:30:09 +08:00
Yongtao Huang
fbf844efd5
Chore: replace deprecated datetime.utcnow() with naive_utc_now() ( #23312 )
...
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-08-03 10:11:47 +08:00
Asuka Minato
79ea94483e
refine some orm types ( #22885 )
2025-07-31 18:43:04 +08:00
NeatGuyCoding
47cc951841
Fix Empty Collection WHERE Filter Issue ( #23086 )
2025-07-29 11:17:50 +08:00
Asuka Minato
a189d293f8
make logging not use f-str, change others to f-str ( #22882 )
2025-07-25 10:32:48 +08:00
Asuka Minato
ef51678c73
orm filter -> where ( #22801 )
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-24 00:57:45 +08:00
Asuka Minato
6d3e198c3c
Mapped column ( #22644 )
...
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 00:39:59 +08:00
Aryan Raj
ce794335e9
Fix/replace datetime patterns with naive utc now ( #22654 )
2025-07-20 11:05:53 +08:00
Khoa
a06af88b26
Feat/api validate model provider ( #21582 )
...
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-27 09:59:44 +08:00
NeatGuyCoding
33f0457a23
fix: wrong token number when using qa_model and answer is updated. ( #21574 )
2025-06-27 09:11:41 +08:00
NeatGuyCoding
6bb82f8ee0
Fix minor comment missing ( #21517 )
2025-06-26 10:06:49 +08:00
NeatGuyCoding
94f8e48647
Refactor update dataset ( fix #21401 ) ( #21402 )
...
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-25 11:44:35 +08:00
NeatGuyCoding
a0a89b562c
Feature:Refactor batch update document status for #21324 ( #21325 )
2025-06-23 09:49:13 +08:00
GuanMu
870e73c03b
Knowledge base API supports status updates #18147 ( #18235 )
2025-06-21 11:18:48 +08:00
Jyong
57f7368a0e
fix notion dataset rule not found ( #21236 )
2025-06-20 20:05:01 +08:00
Jyong
9a18a98b58
fix keyword search top-k not initial ( #21202 )
2025-06-19 11:10:41 +08:00
Bowen Liang
c1a13fa553
chore: replace pseudo-random generators with secrets module ( #20616 )
2025-06-06 10:48:28 +08:00
Mio Inamijima
0ebaba98f0
fix: dataset permission check for partial team members ( #19249 ) ( #20242 )
...
Co-authored-by: MioINAMIJIMA <m.inamijima@optimaize-consulting.com>
2025-05-27 14:33:11 +08:00
GonzaHM
38b1e46241
fix: correct indentation in dataset retrieval model assignment ( #20040 )
2025-05-22 10:05:24 +08:00
Emmanuel Ferdman
582b721160
Resolve Python Logger library warnings ( #19791 )
...
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-05-16 14:31:54 +08:00
非法操作
14cd71ed0a
chore: all model.query replace to db.session.query ( #19521 )
2025-05-12 15:19:41 +08:00
非法操作
b00f94df64
fix: replace all dataset.Model.query to db.session.query(Model) ( #19509 )
2025-05-12 13:52:33 +08:00
Bowen Liang
8537abfff8
chore: avoid repeated type ignore noqa by adding flask_restful and flask_login in mypy import exclusions ( #19224 )
2025-05-06 11:58:49 +08:00
devxing
e912928cce
fix: create child chunk ( #18209 )
...
Co-authored-by: devxing <devxing@gmail.com>
2025-04-16 19:56:21 +08:00
诗浓
4166f73d9d
fix: page/limit param not effective ( #18196 )
2025-04-16 17:26:47 +08:00