Compare commits

...

38 Commits

Author SHA1 Message Date
GareArc
2afd76dbf6 docs(enterprise): add telemetry data dictionary for OTEL signals
- Comprehensive reference for all enterprise telemetry signals
- Documents 3 span types, 10 counters, 6 histograms, 13 log events
- Includes trace correlation model with ASCII diagrams
- Configuration reference for all 8 ENTERPRISE_* variables
- Per-emission-site label tables for metrics
- Full JSON schemas for structured log events
- Content gating behavior and token double-counting warnings
2026-02-10 20:11:56 -08:00
GareArc
a295f65490 feat(telemetry): add missing ID fields for name attributes
- Add dify.credential.id to node execution events
- Add dify.event.id to all telemetry events (APP_CREATED, APP_UPDATED, APP_DELETED, FEEDBACK_CREATED)

This ensures all .name fields have corresponding .id fields for reliable aggregation and deduplication.
2026-02-10 00:31:05 -08:00
GareArc
9d79927207 feat(enterprise-telemetry): wire bearer token auth and configurable insecure flag into OTEL exporter 2026-02-10 00:31:00 -08:00
GareArc
0caf4d6efd feat(enterprise-telemetry): add ENTERPRISE_OTLP_API_KEY config field 2026-02-10 00:31:00 -08:00
GareArc
0dd8288cf2 feat(telemetry): unify token metric label structure with Pydantic enforcement
- Add TokenMetricLabels BaseModel to enforce consistent label structure
- All dify.token.* metrics now use identical 6-label structure:
  * tenant_id, app_id, operation_type, model_provider, model_name, node_type
- Pydantic validation ensures runtime enforcement (extra='forbid', frozen=True)
- Enables filtering by operation_type to avoid double-counting:
  * workflow: aggregated workflow-level tokens
  * node_execution: individual node-level tokens
  * message: direct message tokens
  * rule_generate/code_generate: prompt generation tokens

Previously, inconsistent label cardinality made aggregation impossible:
- WORKFLOW: 3 labels
- NODE_EXECUTION: 6 labels
- MESSAGE: 5 labels
- PROMPT_GENERATION: 5 labels

Now all use the same 6-label structure for consistent querying.
2026-02-10 00:31:00 -08:00
GareArc
7f11898da7 feat: add dedicated app event counters and convert event names to StrEnum
- Add APP_CREATED, APP_UPDATED, APP_DELETED counters to EnterpriseTelemetryCounter
- Create EnterpriseTelemetryEvent StrEnum for type-safe event names
- Update metric_handler to use new app-specific counters with labels (tenant_id, app_id, mode)
- Convert all event_name strings to EnterpriseTelemetryEvent enum values
- Update exporter to create OTEL meters for new app counters (dify.app.created.total, etc.)
- Update tests to verify new counter behavior and enum usage
2026-02-10 00:30:59 -08:00
GareArc
e614ad663b feat(telemetry): add operation_type labels for token metrics
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-10 00:30:59 -08:00
GareArc
3a4d79986f docs(telemetry): clarify enterprise_telemetry queue is EE-only 2026-02-10 00:30:59 -08:00
GareArc
a8bd9e1496 feat(telemetry): add enterprise OTEL telemetry with gateway, traces, metrics, and logs 2026-02-10 00:30:59 -08:00
GareArc
946a3f0e0b test(enterprise-telemetry): add unit tests for OTEL bearer auth and insecure flag 2026-02-09 01:46:05 -08:00
GareArc
ef642b6778 feat(enterprise-telemetry): wire bearer token auth and configurable insecure flag into OTEL exporter 2026-02-09 01:46:05 -08:00
GareArc
51cdf37236 feat(enterprise-telemetry): add ENTERPRISE_OTLP_API_KEY config field 2026-02-09 01:46:04 -08:00
GareArc
d0ff19f697 feat(telemetry): unify token metric label structure with Pydantic enforcement
- Add TokenMetricLabels BaseModel to enforce consistent label structure
- All dify.token.* metrics now use identical 6-label structure:
  * tenant_id, app_id, operation_type, model_provider, model_name, node_type
- Pydantic validation ensures runtime enforcement (extra='forbid', frozen=True)
- Enables filtering by operation_type to avoid double-counting:
  * workflow: aggregated workflow-level tokens
  * node_execution: individual node-level tokens
  * message: direct message tokens
  * rule_generate/code_generate: prompt generation tokens

Previously, inconsistent label cardinality made aggregation impossible:
- WORKFLOW: 3 labels
- NODE_EXECUTION: 6 labels
- MESSAGE: 5 labels
- PROMPT_GENERATION: 5 labels

Now all use the same 6-label structure for consistent querying.
2026-02-06 03:12:29 -08:00
GareArc
8a4d519718 feat: add dedicated app event counters and convert event names to StrEnum
- Add APP_CREATED, APP_UPDATED, APP_DELETED counters to EnterpriseTelemetryCounter
- Create EnterpriseTelemetryEvent StrEnum for type-safe event names
- Update metric_handler to use new app-specific counters with labels (tenant_id, app_id, mode)
- Convert all event_name strings to EnterpriseTelemetryEvent enum values
- Update exporter to create OTEL meters for new app counters (dify.app.created.total, etc.)
- Update tests to verify new counter behavior and enum usage
2026-02-06 02:40:53 -08:00
GareArc
0ce67009c6 feat(telemetry): add enterprise OTEL telemetry with gateway, traces, metrics, and logs 2026-02-06 01:01:11 -08:00
GareArc
81a955db65 docs(telemetry): clarify enterprise_telemetry queue is EE-only 2026-02-05 23:03:51 -08:00
GareArc
f8966ecc1c feat(telemetry): add enterprise OTEL telemetry with gateway, traces, metrics, and logs 2026-02-05 23:02:46 -08:00
wangxiaolei
a297b06aac fix: fix tool type is miss (#32042) 2026-02-06 14:38:15 +08:00
QuantumGhost
e988266f53 chore: update HITL auto deploy workflow (#32040) 2026-02-06 14:15:32 +08:00
longbingljw
d9530f7bb7 fix: make flask upgrade-db fail on error (#32024) 2026-02-06 12:01:31 +08:00
wangxiaolei
b24e6edada fix: fix agent node tool type is not right (#32008)
Infer real tool type via querying relevant database tables.

The root cause for incorrect `type` field is still not clear.
2026-02-06 11:24:39 +08:00
Ryan
59a9cbbf78 chore: remove .codex/skills directory (#32022)
Co-authored-by: Longwei Liu <longweiliu@LongweideMacBook-Air.local>
2026-02-06 10:46:50 +08:00
99
45164ce33e refactor: strip external imports in workflow template transform (#32017) 2026-02-06 10:37:26 +08:00
99
095b3ee234 chore: Remove redundant double space in variable type description (core/variables/variables.py) (#32002)
Some checks failed
autofix.ci / autofix (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/amd64, build-api-amd64) (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/arm64, build-api-arm64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/amd64, build-web-amd64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/arm64, build-web-arm64) (push) Has been cancelled
Build and Push API & Web / create-manifest (api, DIFY_API_IMAGE_NAME, merge-api-images) (push) Has been cancelled
Build and Push API & Web / create-manifest (web, DIFY_WEB_IMAGE_NAME, merge-web-images) (push) Has been cancelled
Main CI Pipeline / Check Changed Files (push) Has been cancelled
Main CI Pipeline / API Tests (push) Has been cancelled
Main CI Pipeline / Web Tests (push) Has been cancelled
Main CI Pipeline / Style Check (push) Has been cancelled
Main CI Pipeline / VDB Tests (push) Has been cancelled
Main CI Pipeline / DB Migration Test (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-02-05 21:44:31 +08:00
QuantumGhost
cb970e54da perf(api): Optimize the response time of AppListApi endpoint (#31999) 2026-02-05 19:05:09 +08:00
Stream
e04f2a0786 feat: use static manifest for pre-caching all plugin manifests before checking updates (#31942)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-05 18:58:17 +08:00
Stephen Zhou
7202a24bcf chore: migrate to eslint-better-tailwind (#31969) 2026-02-05 18:36:08 +09:00
wangxiaolei
be8f265e43 fix: fix uuid_generate_v4 only used in postgresql (#31304) 2026-02-05 17:32:33 +08:00
lif
9e54f086dc fix(web): add rewrite rule to fix Serwist precaching 404 errors (#31770)
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
2026-02-05 15:42:18 +08:00
Joel
8c31b69c8e chore: sticky the applist header in explore page (#31967)
Some checks failed
autofix.ci / autofix (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/amd64, build-api-amd64) (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/arm64, build-api-arm64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/amd64, build-web-amd64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/arm64, build-web-arm64) (push) Has been cancelled
Build and Push API & Web / create-manifest (api, DIFY_API_IMAGE_NAME, merge-api-images) (push) Has been cancelled
Build and Push API & Web / create-manifest (web, DIFY_WEB_IMAGE_NAME, merge-web-images) (push) Has been cancelled
Main CI Pipeline / Check Changed Files (push) Has been cancelled
Main CI Pipeline / API Tests (push) Has been cancelled
Main CI Pipeline / Web Tests (push) Has been cancelled
Main CI Pipeline / Style Check (push) Has been cancelled
Main CI Pipeline / VDB Tests (push) Has been cancelled
Main CI Pipeline / DB Migration Test (push) Has been cancelled
2026-02-05 14:44:51 +08:00
wangxiaolei
b886b3f6c8 fix: fix miss use db.session (#31971) 2026-02-05 14:42:34 +08:00
Stephen Zhou
ef0d18bb61 test: fix test (#31975) 2026-02-05 14:31:21 +08:00
Xiyuan Chen
c56ad8e323 feat: account delete cleanup (#31519)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-04 17:59:41 -08:00
yyh
365f749ed5 fix: remove staleTime/gcTime overrides from trigger query hooks and use orpc contract (#31863)
Some checks failed
autofix.ci / autofix (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/amd64, build-api-amd64) (push) Has been cancelled
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/arm64, build-api-arm64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/amd64, build-web-amd64) (push) Has been cancelled
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/arm64, build-web-arm64) (push) Has been cancelled
Build and Push API & Web / create-manifest (api, DIFY_API_IMAGE_NAME, merge-api-images) (push) Has been cancelled
Build and Push API & Web / create-manifest (web, DIFY_WEB_IMAGE_NAME, merge-web-images) (push) Has been cancelled
Main CI Pipeline / Check Changed Files (push) Has been cancelled
Main CI Pipeline / API Tests (push) Has been cancelled
Main CI Pipeline / Web Tests (push) Has been cancelled
Main CI Pipeline / Style Check (push) Has been cancelled
Main CI Pipeline / VDB Tests (push) Has been cancelled
Main CI Pipeline / DB Migration Test (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-02-04 19:33:32 +08:00
wangxiaolei
f686197589 feat: use latest hash to sync draft (#31924) 2026-02-04 19:32:36 +08:00
Coding On Star
f584be9cf0 chore: update CODEOWNERS to specify test file patterns for base components (#31941)
Co-authored-by: CodingOnStar <hanxujiang@dify.com>
2026-02-04 19:29:57 +08:00
QuantumGhost
3bd228ddb7 chore: bump version in docker-compose and package manager to 1.12.1 (#31947) 2026-02-04 19:29:28 +08:00
wangxiaolei
0dfa59b1db fix: fix delete_draft_variables_batch cycle forever (#31934)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-04 19:10:27 +08:00
185 changed files with 11229 additions and 19560 deletions

View File

@@ -1 +0,0 @@
../../.agents/skills/component-refactoring

View File

@@ -1 +0,0 @@
../../.agents/skills/frontend-code-review

View File

@@ -1 +0,0 @@
../../.agents/skills/frontend-testing

View File

@@ -1 +0,0 @@
../../.agents/skills/orpc-contract-first

7
.github/CODEOWNERS vendored
View File

@@ -24,10 +24,6 @@
/api/services/tools/mcp_tools_manage_service.py @Nov1c444
/api/controllers/mcp/ @Nov1c444
/api/controllers/console/app/mcp_server.py @Nov1c444
# Backend - Tests
/api/tests/ @laipz8200 @QuantumGhost
/api/tests/**/*mcp* @Nov1c444
# Backend - Workflow - Engine (Core graph execution engine)
@@ -238,9 +234,6 @@
# Frontend - Base Components
/web/app/components/base/ @iamjoel @zxhlyh
# Frontend - Base Components Tests
/web/app/components/base/**/__tests__/ @hyoban @CodingOnStar
# Frontend - Utils and Hooks
/web/utils/classnames.ts @iamjoel @zxhlyh
/web/utils/time.ts @iamjoel @zxhlyh

View File

@@ -0,0 +1,314 @@
# Add Missing Telemetry ID Fields
## TL;DR
> **Quick Summary**: Add `dify.credential.id` and `dify.event.id` to enterprise telemetry events where corresponding `.name` fields exist. These IDs are already available in metadata/envelope and only require reading + outputting.
>
> **Deliverables**:
> - Add `dify.credential.id` to node execution events (read from metadata)
> - Add `dify.event.id` to all telemetry events (read from envelope)
>
> **Estimated Effort**: Quick
> **Parallel Execution**: NO - sequential (credential.id first, then event.id)
> **Critical Path**: Task 1 → Task 2
---
## Context
### Original Request
User wants all telemetry data to have corresponding IDs for every `.name` field to enable reliable aggregation and prevent ambiguity when names change or duplicate.
### Interview Summary
**Key Discussions**:
- workspace=tenant: No need for separate `workspace.id` (tenant_id serves this purpose)
- "Directly available": If field exists in metadata/envelope and just needs reading, it counts
- Skip fields requiring upstream propagation or new logic
**Current State**:
-`dify.app.name` + `dify.app_id` (already paired)
-`dify.dataset.names` + `dify.dataset.ids` (already paired)
-`dify.workspace.name` + `dify.tenant_id` (workspace=tenant per user)
-`dify.plugin.name` stores `plugin_unique_identifier` (acts as ID)
-`dify.credential.name` missing `dify.credential.id` (but credential_id in metadata)
-`dify.event.name` missing `dify.event.id` (but event_id in envelope)
- ⏸️ `gen_ai.provider.name` / `gen_ai.tool.name` - IDs not in metadata (deferred)
### Research Findings
- `credential_id` available at: `api/core/app/workflow/layers/persistence.py:491` → stored in node_data metadata → accessible in `ops_trace_manager.py:1164` and `enterprise_trace.py` metadata dict
- `event_id` available at: `api/enterprise/telemetry/contracts.py:67` (TelemetryEnvelope) → accessible in `metric_handler.py` when processing events
---
## Work Objectives
### Core Objective
Add `dify.credential.id` and `dify.event.id` to enterprise telemetry attributes for events that already output the corresponding `.name` fields.
### Concrete Deliverables
- `dify.credential.id` added to node execution events (where `dify.credential.name` exists)
- `dify.event.id` added to all telemetry events processed through metric_handler
### Definition of Done
- [ ] Query telemetry logs for node execution events → `dify.credential.id` present when `dify.credential.name` exists
- [ ] Query telemetry logs for any event → `dify.event.id` present in attributes
- [ ] All existing tests pass
### Must Have
- Credential ID only added when credential exists (don't add null/empty values)
- Event ID always added (every event has envelope.event_id)
### Must NOT Have (Guardrails)
- Do NOT add provider.id or tool.id (not in metadata, would require upstream changes)
- Do NOT change existing field names or remove any fields
- Do NOT add workspace.id (workspace=tenant per user requirement)
---
## Verification Strategy
> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks MUST be verifiable WITHOUT any human action.
### Test Decision
- **Infrastructure exists**: YES (pytest)
- **Automated tests**: Tests-after (verify field presence)
- **Framework**: pytest
### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)
> All verification done by agent via grep/read to confirm attribute output in code.
---
## Execution Strategy
### Parallel Execution Waves
```
Wave 1 (Start Immediately):
└── Task 1: Add dify.credential.id
Wave 2 (After Wave 1):
└── Task 2: Add dify.event.id
Critical Path: Task 1 → Task 2
Sequential execution required (separate files/patterns)
```
### Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2 | None |
| 2 | 1 | None | None |
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1 | task(category="quick", load_skills=[], run_in_background=false) |
| 2 | 2 | task(category="quick", load_skills=[], run_in_background=false) |
---
## TODOs
- [ ] 1. Add dify.credential.id to node execution events
**What to do**:
- Read `credential_id` from metadata in `_node_execution_trace` method
- Add `"dify.credential.id": metadata.get("credential_id")` to attrs dict (only if credential_id exists and is not None)
- Follow same pattern as existing `dify.credential.name` (line 368)
**Must NOT do**:
- Do NOT add credential.id if it's None or empty string
- Do NOT add to events other than node execution
- Do NOT change how credential_id is populated in metadata upstream
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Single file edit, straightforward field addition, clear pattern to follow
- **Skills**: []
- No special skills needed - standard Python dict manipulation
- **Skills Evaluated but Omitted**: N/A
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 1 (sequential)
- **Blocks**: Task 2 (different file, cleaner to do sequentially)
- **Blocked By**: None (can start immediately)
**References** (CRITICAL - Be Exhaustive):
**Pattern References** (existing code to follow):
- `api/enterprise/telemetry/enterprise_trace.py:368` - Pattern for adding credential.name from metadata (line: `"dify.credential.name": metadata.get("credential_name")`)
- `api/enterprise/telemetry/enterprise_trace.py:365-370` - Context where credential fields are added (shows conditional node_execution_id pattern)
**API/Type References** (contracts to implement against):
- `api/core/app/workflow/layers/persistence.py:491-493` - Where credential_id is set in node_data (shows it comes from tool_info metadata)
- `api/core/ops/ops_trace_manager.py:1163-1176` - Where credential_id flows into metadata dict (line 1164: credential_id from node_data)
**Test References** (testing patterns to follow):
- No existing tests for credential fields - verification via code inspection
**Documentation References** (specs and requirements):
- Context from this plan: credential_id must only be added when it exists (not null/empty)
**WHY Each Reference Matters**:
- Line 368 shows exact dict.get() pattern and naming convention (dify.credential.*)
- persistence.py:491 proves credential_id is available in metadata when tool uses credentials
- ops_trace_manager.py:1163-1176 shows metadata construction including credential_id
- These together confirm: metadata already has credential_id, just need to output it
**Acceptance Criteria**:
**Agent-Executed QA Scenarios (MANDATORY — per-scenario, ultra-detailed):**
```
Scenario: credential.id added when credential exists
Tool: Grep + Read
Preconditions: Code changes applied to enterprise_trace.py
Steps:
1. Read api/enterprise/telemetry/enterprise_trace.py lines 360-380
2. Assert: Line exists matching pattern "dify.credential.id.*metadata.get.*credential_id"
3. Assert: credential.id added in same conditional block as credential.name (after line 368)
4. Assert: Pattern uses metadata.get("credential_id") (not direct access)
Expected Result: credential.id field added using safe dict.get pattern
Evidence: Code snippet showing the added line
Scenario: credential.id not added to non-node events
Tool: Grep
Preconditions: Code changes applied
Steps:
1. Grep api/enterprise/telemetry/enterprise_trace.py for "dify.credential.id"
2. Assert: Only ONE occurrence (in _node_execution_trace method)
3. Assert: NOT in _message_trace, _workflow_trace, _tool_trace, _dataset_retrieval_trace
Expected Result: credential.id only in node execution context
Evidence: Grep output showing single occurrence
```
**Evidence to Capture**:
- [ ] Code snippet showing added line with proper indentation
- [ ] Grep confirmation of single occurrence location
**Commit**: YES
- Message: `feat(telemetry): add dify.credential.id to node execution events`
- Files: `api/enterprise/telemetry/enterprise_trace.py`
- Pre-commit: `make lint`
- [ ] 2. Add dify.event.id to all telemetry events
**What to do**:
- In `metric_handler.py`, add `"dify.event.id": envelope.event_id` to attributes for ALL event types
- Add once in common attribute building (before case-specific handling)
- Ensure event_id is added to rehydration_failed, app events, feedback, and all metric-only events
**Must NOT do**:
- Do NOT skip any event type (event.id must be universal)
- Do NOT add event.id to span attributes (only event attributes in metric_handler)
- Do NOT change envelope.event_id generation logic
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Single file, add field in one central location before event dispatch
- **Skills**: []
- Standard Python dict manipulation
- **Skills Evaluated but Omitted**: N/A
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 2 (after Task 1)
- **Blocks**: None (final task)
- **Blocked By**: Task 1 (cleaner to do sequentially, different files)
**References** (CRITICAL - Be Exhaustive):
**Pattern References** (existing code to follow):
- `api/enterprise/telemetry/metric_handler.py:184` - Existing pattern for adding event_id (REHYDRATION_FAILED case: `"dify.event_id": envelope.event_id`)
- `api/enterprise/telemetry/metric_handler.py:183` - Shows tenant_id added before event_id (follow same pattern)
**API/Type References** (contracts to implement against):
- `api/enterprise/telemetry/contracts.py:67` - TelemetryEnvelope schema showing event_id field (always present)
- `api/enterprise/telemetry/gateway.py:138` - Where event_id is generated (str(uuid.uuid4()))
**Test References** (testing patterns to follow):
- No existing tests - verification via code inspection
**Documentation References** (specs and requirements):
- Context from this plan: event.id must be added to ALL events (universal field)
**WHY Each Reference Matters**:
- metric_handler.py:184 shows exact pattern: envelope.event_id is directly accessible
- contracts.py:67 confirms event_id is non-nullable (always present on envelope)
- gateway.py:138 proves every event gets unique event_id at creation time
- Pattern at line 183-184 shows safe approach: add common fields before case switch
**Acceptance Criteria**:
**Agent-Executed QA Scenarios (MANDATORY — per-scenario, ultra-detailed):**
```
Scenario: event.id added before event-specific processing
Tool: Read
Preconditions: Code changes applied to metric_handler.py
Steps:
1. Read api/enterprise/telemetry/metric_handler.py lines 175-195
2. Assert: "dify.event.id": envelope.event_id added before line 189 (before case dispatch)
3. Assert: Added in same block as tenant_id (line 183)
4. Assert: Uses envelope.event_id directly (not envelope.event_id or None)
Expected Result: event.id added early, applies to all event types
Evidence: Code snippet showing placement before case statement
Scenario: event.id present in all event type handlers
Tool: Grep + Read
Preconditions: Code changes applied
Steps:
1. Read metric_handler.py APP_CREATED handler (lines 180-210)
2. Read metric_handler.py FEEDBACK_CREATED handler (lines 315-340)
3. Assert: Both inherit event.id from common attrs (added before case dispatch)
4. Grep for "envelope.event_id" → Assert only ONE new occurrence (in common attrs)
Expected Result: event.id available to all event types via shared attribute dict
Evidence: Code showing single central addition point
```
**Evidence to Capture**:
- [ ] Code snippet showing event.id addition in common attributes section
- [ ] Grep output confirming single addition point
**Commit**: YES
- Message: `feat(telemetry): add dify.event.id to all telemetry events`
- Files: `api/enterprise/telemetry/metric_handler.py`
- Pre-commit: `make lint`
---
## Commit Strategy
| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1 | `feat(telemetry): add dify.credential.id to node execution events` | enterprise_trace.py | make lint |
| 2 | `feat(telemetry): add dify.event.id to all telemetry events` | metric_handler.py | make lint |
---
## Success Criteria
### Verification Commands
```bash
# Verify credential.id added
grep -n "dify.credential.id" api/enterprise/telemetry/enterprise_trace.py
# Expected: One line in _node_execution_trace method using metadata.get("credential_id")
# Verify event.id added
grep -n "dify.event.id" api/enterprise/telemetry/metric_handler.py
# Expected: One line in common attributes section using envelope.event_id
```
### Final Checklist
- [ ] credential.id present in node execution events (when credential exists)
- [ ] event.id present in all telemetry events
- [ ] No new fields added beyond these two
- [ ] Linting passes
- [ ] Code follows existing patterns (dict.get for metadata, direct access for envelope)

View File

@@ -106,10 +106,10 @@ ignore = [
"N803", # invalid-argument-name
]
"tests/*" = [
"F811", # redefined-while-unused
"T201", # allow print in tests,
"S110", # allow ignoring exceptions in tests code (currently)
"F811", # redefined-while-unused
"T201", # allow print in tests,
"S110", # allow ignoring exceptions in tests code (currently)
"PT019", # @patch-injected params look like unused fixtures
]
"controllers/console/explore/trial.py" = ["TID251"]
"controllers/console/human_input_form.py" = ["TID251"]

View File

@@ -122,7 +122,8 @@ These commands assume you start from the repository root.
```bash
cd api
uv run celery -A app.celery worker -P threads -c 2 --loglevel INFO -Q dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention
# Note: enterprise_telemetry queue is only used in Enterprise Edition
uv run celery -A app.celery worker -P threads -c 2 --loglevel INFO -Q dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention,enterprise_telemetry
```
1. Optional: start Celery Beat (scheduled tasks, in a new terminal).

View File

@@ -81,6 +81,7 @@ def initialize_extensions(app: DifyApp):
ext_commands,
ext_compress,
ext_database,
ext_enterprise_telemetry,
ext_fastopenapi,
ext_forward_refs,
ext_hosting_provider,
@@ -131,6 +132,7 @@ def initialize_extensions(app: DifyApp):
ext_commands,
ext_fastopenapi,
ext_otel,
ext_enterprise_telemetry,
ext_request_logging,
ext_session_factory,
]

View File

@@ -8,7 +8,7 @@ from pydantic_settings import BaseSettings, PydanticBaseSettingsSource, Settings
from libs.file_utils import search_file_upwards
from .deploy import DeploymentConfig
from .enterprise import EnterpriseFeatureConfig
from .enterprise import EnterpriseFeatureConfig, EnterpriseTelemetryConfig
from .extra import ExtraServiceConfig
from .feature import FeatureConfig
from .middleware import MiddlewareConfig
@@ -73,6 +73,8 @@ class DifyConfig(
# Enterprise feature configs
# **Before using, please contact business@dify.ai by email to inquire about licensing matters.**
EnterpriseFeatureConfig,
# Enterprise telemetry configs
EnterpriseTelemetryConfig,
):
model_config = SettingsConfigDict(
# read from dotenv format config file

View File

@@ -18,3 +18,50 @@ class EnterpriseFeatureConfig(BaseSettings):
description="Allow customization of the enterprise logo.",
default=False,
)
class EnterpriseTelemetryConfig(BaseSettings):
"""
Configuration for enterprise telemetry.
"""
ENTERPRISE_TELEMETRY_ENABLED: bool = Field(
description="Enable enterprise telemetry collection (also requires ENTERPRISE_ENABLED=true).",
default=False,
)
ENTERPRISE_OTLP_ENDPOINT: str = Field(
description="Enterprise OTEL collector endpoint.",
default="",
)
ENTERPRISE_OTLP_HEADERS: str = Field(
description="Auth headers for OTLP export (key=value,key2=value2).",
default="",
)
ENTERPRISE_OTLP_PROTOCOL: str = Field(
description="OTLP protocol: 'http' or 'grpc' (default: http).",
default="http",
)
ENTERPRISE_OTLP_API_KEY: str = Field(
description="Bearer token for enterprise OTLP export authentication. "
"When set, gRPC exporters automatically use TLS (insecure=False).",
default="",
)
ENTERPRISE_INCLUDE_CONTENT: bool = Field(
description="Include input/output content in traces (privacy toggle).",
default=True,
)
ENTERPRISE_SERVICE_NAME: str = Field(
description="Service name for OTEL resource.",
default="dify",
)
ENTERPRISE_OTEL_SAMPLING_RATE: float = Field(
description="Sampling rate for enterprise traces (0.0 to 1.0, default 1.0 = 100%).",
default=1.0,
)

View File

@@ -1,4 +1,5 @@
from collections.abc import Sequence
from typing import Any
from flask_restx import Resource
from pydantic import BaseModel, Field
@@ -11,12 +12,10 @@ from controllers.console.app.error import (
ProviderQuotaExceededError,
)
from controllers.console.wraps import account_initialization_required, setup_required
from core.app.app_config.entities import ModelConfig
from core.errors.error import ModelCurrentlyNotSupportError, ProviderTokenNotInitError, QuotaExceededError
from core.helper.code_executor.code_node_provider import CodeNodeProvider
from core.helper.code_executor.javascript.javascript_code_provider import JavascriptCodeProvider
from core.helper.code_executor.python3.python3_code_provider import Python3CodeProvider
from core.llm_generator.entities import RuleCodeGeneratePayload, RuleGeneratePayload, RuleStructuredOutputPayload
from core.llm_generator.llm_generator import LLMGenerator
from core.model_runtime.errors.invoke import InvokeError
from extensions.ext_database import db
@@ -27,14 +26,32 @@ from services.workflow_service import WorkflowService
DEFAULT_REF_TEMPLATE_SWAGGER_2_0 = "#/definitions/{model}"
class RuleGeneratePayload(BaseModel):
instruction: str = Field(..., description="Rule generation instruction")
model_config_data: dict[str, Any] = Field(..., alias="model_config", description="Model configuration")
no_variable: bool = Field(default=False, description="Whether to exclude variables")
app_id: str | None = Field(default=None, description="App ID for prompt generation tracing")
class RuleCodeGeneratePayload(RuleGeneratePayload):
code_language: str = Field(default="javascript", description="Programming language for code generation")
class RuleStructuredOutputPayload(BaseModel):
instruction: str = Field(..., description="Structured output generation instruction")
model_config_data: dict[str, Any] = Field(..., alias="model_config", description="Model configuration")
app_id: str | None = Field(default=None, description="App ID for prompt generation tracing")
class InstructionGeneratePayload(BaseModel):
flow_id: str = Field(..., description="Workflow/Flow ID")
node_id: str = Field(default="", description="Node ID for workflow context")
current: str = Field(default="", description="Current instruction text")
language: str = Field(default="javascript", description="Programming language (javascript/python)")
instruction: str = Field(..., description="Instruction for generation")
model_config_data: ModelConfig = Field(..., alias="model_config", description="Model configuration")
model_config_data: dict[str, Any] = Field(..., alias="model_config", description="Model configuration")
ideal_output: str = Field(default="", description="Expected ideal output")
app_id: str | None = Field(default=None, description="App ID for prompt generation tracing")
class InstructionTemplatePayload(BaseModel):
@@ -50,7 +67,6 @@ reg(RuleCodeGeneratePayload)
reg(RuleStructuredOutputPayload)
reg(InstructionGeneratePayload)
reg(InstructionTemplatePayload)
reg(ModelConfig)
@console_ns.route("/rule-generate")
@@ -66,10 +82,17 @@ class RuleGenerateApi(Resource):
@account_initialization_required
def post(self):
args = RuleGeneratePayload.model_validate(console_ns.payload)
_, current_tenant_id = current_account_with_tenant()
account, current_tenant_id = current_account_with_tenant()
try:
rules = LLMGenerator.generate_rule_config(tenant_id=current_tenant_id, args=args)
rules = LLMGenerator.generate_rule_config(
tenant_id=current_tenant_id,
instruction=args.instruction,
model_config=args.model_config_data,
no_variable=args.no_variable,
user_id=account.id,
app_id=args.app_id,
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
except QuotaExceededError:
@@ -95,12 +118,16 @@ class RuleCodeGenerateApi(Resource):
@account_initialization_required
def post(self):
args = RuleCodeGeneratePayload.model_validate(console_ns.payload)
_, current_tenant_id = current_account_with_tenant()
account, current_tenant_id = current_account_with_tenant()
try:
code_result = LLMGenerator.generate_code(
tenant_id=current_tenant_id,
args=args,
instruction=args.instruction,
model_config=args.model_config_data,
code_language=args.code_language,
user_id=account.id,
app_id=args.app_id,
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -127,12 +154,15 @@ class RuleStructuredOutputGenerateApi(Resource):
@account_initialization_required
def post(self):
args = RuleStructuredOutputPayload.model_validate(console_ns.payload)
_, current_tenant_id = current_account_with_tenant()
account, current_tenant_id = current_account_with_tenant()
try:
structured_output = LLMGenerator.generate_structured_output(
tenant_id=current_tenant_id,
args=args,
instruction=args.instruction,
model_config=args.model_config_data,
user_id=account.id,
app_id=args.app_id,
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -159,14 +189,14 @@ class InstructionGenerateApi(Resource):
@account_initialization_required
def post(self):
args = InstructionGeneratePayload.model_validate(console_ns.payload)
_, current_tenant_id = current_account_with_tenant()
account, current_tenant_id = current_account_with_tenant()
app_id = args.app_id or args.flow_id
providers: list[type[CodeNodeProvider]] = [Python3CodeProvider, JavascriptCodeProvider]
code_provider: type[CodeNodeProvider] | None = next(
(p for p in providers if p.is_accept_language(args.language)), None
)
code_template = code_provider.get_default_code() if code_provider else ""
try:
# Generate from nothing for a workflow node
if (args.current in (code_template, "")) and args.node_id != "":
app = db.session.query(App).where(App.id == args.flow_id).first()
if not app:
@@ -183,33 +213,33 @@ class InstructionGenerateApi(Resource):
case "llm":
return LLMGenerator.generate_rule_config(
current_tenant_id,
args=RuleGeneratePayload(
instruction=args.instruction,
model_config=args.model_config_data,
no_variable=True,
),
instruction=args.instruction,
model_config=args.model_config_data,
no_variable=True,
user_id=account.id,
app_id=app_id,
)
case "agent":
return LLMGenerator.generate_rule_config(
current_tenant_id,
args=RuleGeneratePayload(
instruction=args.instruction,
model_config=args.model_config_data,
no_variable=True,
),
instruction=args.instruction,
model_config=args.model_config_data,
no_variable=True,
user_id=account.id,
app_id=app_id,
)
case "code":
return LLMGenerator.generate_code(
tenant_id=current_tenant_id,
args=RuleCodeGeneratePayload(
instruction=args.instruction,
model_config=args.model_config_data,
code_language=args.language,
),
instruction=args.instruction,
model_config=args.model_config_data,
code_language=args.language,
user_id=account.id,
app_id=app_id,
)
case _:
return {"error": f"invalid node type: {node_type}"}
if args.node_id == "" and args.current != "": # For legacy app without a workflow
if args.node_id == "" and args.current != "":
return LLMGenerator.instruction_modify_legacy(
tenant_id=current_tenant_id,
flow_id=args.flow_id,
@@ -217,8 +247,10 @@ class InstructionGenerateApi(Resource):
instruction=args.instruction,
model_config=args.model_config_data,
ideal_output=args.ideal_output,
user_id=account.id,
app_id=app_id,
)
if args.node_id != "" and args.current != "": # For workflow node
if args.node_id != "" and args.current != "":
return LLMGenerator.instruction_modify_workflow(
tenant_id=current_tenant_id,
flow_id=args.flow_id,
@@ -228,6 +260,8 @@ class InstructionGenerateApi(Resource):
model_config=args.model_config_data,
ideal_output=args.ideal_output,
workflow_service=WorkflowService(),
user_id=account.id,
app_id=app_id,
)
return {"error": "incompatible parameters"}, 400
except ProviderTokenNotInitError as ex:

View File

@@ -1,6 +1,7 @@
from typing import Any
from flask import request
from flask_login import current_user
from flask_restx import Resource, fields
from pydantic import BaseModel, Field
from werkzeug.exceptions import BadRequest
@@ -77,7 +78,10 @@ class TraceAppConfigApi(Resource):
try:
result = OpsService.create_tracing_app_config(
app_id=app_id, tracing_provider=args.tracing_provider, tracing_config=args.tracing_config
app_id=app_id,
tracing_provider=args.tracing_provider,
tracing_config=args.tracing_config,
account_id=current_user.id,
)
if not result:
raise TracingConfigIsExist()
@@ -102,7 +106,10 @@ class TraceAppConfigApi(Resource):
try:
result = OpsService.update_tracing_app_config(
app_id=app_id, tracing_provider=args.tracing_provider, tracing_config=args.tracing_config
app_id=app_id,
tracing_provider=args.tracing_provider,
tracing_config=args.tracing_config,
account_id=current_user.id,
)
if not result:
raise TracingConfigNotExist()
@@ -124,7 +131,9 @@ class TraceAppConfigApi(Resource):
args = TraceProviderQuery.model_validate(request.args.to_dict(flat=True)) # type: ignore
try:
result = OpsService.delete_tracing_app_config(app_id=app_id, tracing_provider=args.tracing_provider)
result = OpsService.delete_tracing_app_config(
app_id=app_id, tracing_provider=args.tracing_provider, account_id=current_user.id
)
if not result:
raise TracingConfigNotExist()
return {"result": "success"}, 204

View File

@@ -1,16 +1,15 @@
import logging
from typing import Any, Literal, cast
from typing import Any, cast
from flask import request
from flask_restx import Resource, fields, marshal, marshal_with
from pydantic import BaseModel
from flask_restx import Resource, fields, marshal, marshal_with, reqparse
from werkzeug.exceptions import Forbidden, InternalServerError, NotFound
import services
from controllers.common.fields import Parameters as ParametersResponse
from controllers.common.fields import Site as SiteResponse
from controllers.common.schema import get_or_create_model
from controllers.console import api, console_ns
from controllers.console import api
from controllers.console.app.error import (
AppUnavailableError,
AudioTooLargeError,
@@ -118,56 +117,7 @@ workflow_fields_copy["rag_pipeline_variables"] = fields.List(fields.Nested(pipel
workflow_model = get_or_create_model("TrialWorkflow", workflow_fields_copy)
# Pydantic models for request validation
DEFAULT_REF_TEMPLATE_SWAGGER_2_0 = "#/definitions/{model}"
class WorkflowRunRequest(BaseModel):
inputs: dict
files: list | None = None
class ChatRequest(BaseModel):
inputs: dict
query: str
files: list | None = None
conversation_id: str | None = None
parent_message_id: str | None = None
retriever_from: str = "explore_app"
class TextToSpeechRequest(BaseModel):
message_id: str | None = None
voice: str | None = None
text: str | None = None
streaming: bool | None = None
class CompletionRequest(BaseModel):
inputs: dict
query: str = ""
files: list | None = None
response_mode: Literal["blocking", "streaming"] | None = None
retriever_from: str = "explore_app"
# Register schemas for Swagger documentation
console_ns.schema_model(
WorkflowRunRequest.__name__, WorkflowRunRequest.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0)
)
console_ns.schema_model(
ChatRequest.__name__, ChatRequest.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0)
)
console_ns.schema_model(
TextToSpeechRequest.__name__, TextToSpeechRequest.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0)
)
console_ns.schema_model(
CompletionRequest.__name__, CompletionRequest.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0)
)
class TrialAppWorkflowRunApi(TrialAppResource):
@console_ns.expect(console_ns.models[WorkflowRunRequest.__name__])
def post(self, trial_app):
"""
Run workflow
@@ -179,8 +129,10 @@ class TrialAppWorkflowRunApi(TrialAppResource):
if app_mode != AppMode.WORKFLOW:
raise NotWorkflowAppError()
request_data = WorkflowRunRequest.model_validate(console_ns.payload)
args = request_data.model_dump()
parser = reqparse.RequestParser()
parser.add_argument("inputs", type=dict, required=True, nullable=False, location="json")
parser.add_argument("files", type=list, required=False, location="json")
args = parser.parse_args()
assert current_user is not None
try:
app_id = app_model.id
@@ -231,7 +183,6 @@ class TrialAppWorkflowTaskStopApi(TrialAppResource):
class TrialChatApi(TrialAppResource):
@console_ns.expect(console_ns.models[ChatRequest.__name__])
@trial_feature_enable
def post(self, trial_app):
app_model = trial_app
@@ -239,14 +190,14 @@ class TrialChatApi(TrialAppResource):
if app_mode not in {AppMode.CHAT, AppMode.AGENT_CHAT, AppMode.ADVANCED_CHAT}:
raise NotChatAppError()
request_data = ChatRequest.model_validate(console_ns.payload)
args = request_data.model_dump()
# Validate UUID values if provided
if args.get("conversation_id"):
args["conversation_id"] = uuid_value(args["conversation_id"])
if args.get("parent_message_id"):
args["parent_message_id"] = uuid_value(args["parent_message_id"])
parser = reqparse.RequestParser()
parser.add_argument("inputs", type=dict, required=True, location="json")
parser.add_argument("query", type=str, required=True, location="json")
parser.add_argument("files", type=list, required=False, location="json")
parser.add_argument("conversation_id", type=uuid_value, location="json")
parser.add_argument("parent_message_id", type=uuid_value, required=False, location="json")
parser.add_argument("retriever_from", type=str, required=False, default="explore_app", location="json")
args = parser.parse_args()
args["auto_generate_name"] = False
@@ -369,16 +320,20 @@ class TrialChatAudioApi(TrialAppResource):
class TrialChatTextApi(TrialAppResource):
@console_ns.expect(console_ns.models[TextToSpeechRequest.__name__])
@trial_feature_enable
def post(self, trial_app):
app_model = trial_app
try:
request_data = TextToSpeechRequest.model_validate(console_ns.payload)
parser = reqparse.RequestParser()
parser.add_argument("message_id", type=str, required=False, location="json")
parser.add_argument("voice", type=str, location="json")
parser.add_argument("text", type=str, location="json")
parser.add_argument("streaming", type=bool, location="json")
args = parser.parse_args()
message_id = request_data.message_id
text = request_data.text
voice = request_data.voice
message_id = args.get("message_id", None)
text = args.get("text", None)
voice = args.get("voice", None)
if not isinstance(current_user, Account):
raise ValueError("current_user must be an Account instance")
@@ -416,15 +371,19 @@ class TrialChatTextApi(TrialAppResource):
class TrialCompletionApi(TrialAppResource):
@console_ns.expect(console_ns.models[CompletionRequest.__name__])
@trial_feature_enable
def post(self, trial_app):
app_model = trial_app
if app_model.mode != "completion":
raise NotCompletionAppError()
request_data = CompletionRequest.model_validate(console_ns.payload)
args = request_data.model_dump()
parser = reqparse.RequestParser()
parser.add_argument("inputs", type=dict, required=True, location="json")
parser.add_argument("query", type=str, location="json", default="")
parser.add_argument("files", type=list, required=False, location="json")
parser.add_argument("response_mode", type=str, choices=["blocking", "streaming"], location="json")
parser.add_argument("retriever_from", type=str, required=False, default="explore_app", location="json")
args = parser.parse_args()
streaming = args["response_mode"] == "streaming"
args["auto_generate_name"] = False

View File

@@ -79,7 +79,7 @@ class BaseAgentRunner(AppRunner):
self.model_instance = model_instance
# init callback
self.agent_callback = DifyAgentCallbackHandler()
self.agent_callback = DifyAgentCallbackHandler(tenant_id=tenant_id)
# init dataset tools
hit_callback = DatasetIndexToolCallbackHandler(
queue_manager=queue_manager,

View File

@@ -63,6 +63,8 @@ from core.base.tts import AppGeneratorTTSPublisher, AudioTrunk
from core.model_runtime.entities.llm_entities import LLMUsage
from core.model_runtime.utils.encoders import jsonable_encoder
from core.ops.ops_trace_manager import TraceQueueManager
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from core.workflow.enums import WorkflowExecutionStatus
from core.workflow.nodes import NodeType
from core.workflow.repositories.draft_variable_repository import DraftVariableSaverFactory
@@ -564,7 +566,6 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
**kwargs,
) -> Generator[StreamResponse, None, None]:
"""Handle stop events."""
_ = trace_manager
resolved_state = None
if self._workflow_run_id:
resolved_state = self._resolve_graph_runtime_state(graph_runtime_state)
@@ -579,8 +580,7 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
)
with self._database_session() as session:
# Save message
self._save_message(session=session, graph_runtime_state=resolved_state)
self._save_message(session=session, graph_runtime_state=resolved_state, trace_manager=trace_manager)
yield workflow_finish_resp
elif event.stopped_by in (
@@ -589,8 +589,7 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
):
# When hitting input-moderation or annotation-reply, the workflow will not start
with self._database_session() as session:
# Save message
self._save_message(session=session)
self._save_message(session=session, trace_manager=trace_manager)
yield self._message_end_to_stream_response()
@@ -599,6 +598,7 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
event: QueueAdvancedChatMessageEndEvent,
*,
graph_runtime_state: GraphRuntimeState | None = None,
trace_manager: TraceQueueManager | None = None,
**kwargs,
) -> Generator[StreamResponse, None, None]:
"""Handle advanced chat message end events."""
@@ -616,7 +616,7 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
# Save message
with self._database_session() as session:
self._save_message(session=session, graph_runtime_state=resolved_state)
self._save_message(session=session, graph_runtime_state=resolved_state, trace_manager=trace_manager)
yield self._message_end_to_stream_response()
@@ -770,7 +770,13 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
if self._conversation_name_generate_thread:
logger.debug("Conversation name generation running as daemon thread")
def _save_message(self, *, session: Session, graph_runtime_state: GraphRuntimeState | None = None):
def _save_message(
self,
*,
session: Session,
graph_runtime_state: GraphRuntimeState | None = None,
trace_manager: TraceQueueManager | None = None,
):
message = self._get_message(session=session)
# If there are assistant files, remove markdown image links from answer
@@ -826,6 +832,22 @@ class AdvancedChatAppGenerateTaskPipeline(GraphRuntimeStateSupport):
]
session.add_all(message_files)
if trace_manager:
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.MESSAGE_TRACE,
context=TelemetryContext(
tenant_id=self._application_generate_entity.app_config.tenant_id,
app_id=self._application_generate_entity.app_config.app_id,
),
payload={
"conversation_id": str(message.conversation_id),
"message_id": str(message.id),
},
),
trace_manager=trace_manager,
)
def _seed_graph_runtime_state_from_queue_manager(self) -> None:
"""Bootstrap the cached runtime state from the queue manager when present."""
candidate = self._base_task_pipeline.queue_manager.graph_runtime_state

View File

@@ -147,9 +147,12 @@ class WorkflowAppGenerator(BaseAppGenerator):
inputs: Mapping[str, Any] = args["inputs"]
extras = {
extras: dict[str, Any] = {
**extract_external_trace_id_from_args(args),
}
parent_trace_context = args.get("_parent_trace_context")
if parent_trace_context:
extras["parent_trace_context"] = parent_trace_context
workflow_run_id = str(uuid.uuid4())
# FIXME (Yeuoly): we need to remove the SKIP_PREPARE_USER_INPUTS_KEY from the args
# trigger shouldn't prepare user inputs

View File

@@ -52,10 +52,11 @@ from core.model_runtime.entities.message_entities import (
TextPromptMessageContent,
)
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.ops_trace_manager import TraceQueueManager
from core.prompt.utils.prompt_message_util import PromptMessageUtil
from core.prompt.utils.prompt_template_parser import PromptTemplateParser
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from events.message_event import message_was_created
from extensions.ext_database import db
from libs.datetime_utils import naive_utc_now
@@ -409,10 +410,19 @@ class EasyUIBasedGenerateTaskPipeline(BasedGenerateTaskPipeline):
message.message_metadata = self._task_state.metadata.model_dump_json()
if trace_manager:
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.MESSAGE_TRACE, conversation_id=self._conversation_id, message_id=self._message_id
)
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.MESSAGE_TRACE,
context=TelemetryContext(
tenant_id=self._application_generate_entity.app_config.tenant_id,
app_id=self._application_generate_entity.app_config.app_id,
),
payload={
"conversation_id": self._conversation_id,
"message_id": self._message_id,
},
),
trace_manager=trace_manager,
)
message_was_created.send(

View File

@@ -15,8 +15,7 @@ from datetime import datetime
from typing import Any, Union
from core.app.entities.app_invoke_entities import AdvancedChatAppGenerateEntity, WorkflowAppGenerateEntity
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.ops_trace_manager import TraceQueueManager
from core.workflow.constants import SYSTEM_VARIABLE_NODE_ID
from core.workflow.entities import WorkflowExecution, WorkflowNodeExecution
from core.workflow.enums import (
@@ -373,6 +372,7 @@ class WorkflowPersistenceLayer(GraphEngineLayer):
self._workflow_node_execution_repository.save(domain_execution)
self._workflow_node_execution_repository.save_execution_data(domain_execution)
self._enqueue_node_trace_task(domain_execution)
def _fail_running_node_executions(self, *, error_message: str) -> None:
now = naive_utc_now()
@@ -390,17 +390,131 @@ class WorkflowPersistenceLayer(GraphEngineLayer):
conversation_id = self._system_variables().get(SystemVariableKey.CONVERSATION_ID.value)
external_trace_id = None
parent_trace_context = None
if isinstance(self._application_generate_entity, (WorkflowAppGenerateEntity, AdvancedChatAppGenerateEntity)):
external_trace_id = self._application_generate_entity.extras.get("external_trace_id")
parent_trace_context = self._application_generate_entity.extras.get("parent_trace_context")
trace_task = TraceTask(
TraceTaskName.WORKFLOW_TRACE,
workflow_execution=execution,
conversation_id=conversation_id,
user_id=self._trace_manager.user_id,
external_trace_id=external_trace_id,
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.WORKFLOW_TRACE,
context=TelemetryContext(
tenant_id=self._application_generate_entity.app_config.tenant_id,
user_id=self._trace_manager.user_id,
app_id=self._application_generate_entity.app_config.app_id,
),
payload={
"workflow_execution": execution,
"conversation_id": conversation_id,
"user_id": self._trace_manager.user_id,
"external_trace_id": external_trace_id,
"parent_trace_context": parent_trace_context,
},
),
trace_manager=self._trace_manager,
)
def _enqueue_node_trace_task(self, domain_execution: WorkflowNodeExecution) -> None:
if not self._trace_manager:
return
execution = self._get_workflow_execution()
meta = domain_execution.metadata or {}
parent_trace_context = None
if isinstance(self._application_generate_entity, (WorkflowAppGenerateEntity, AdvancedChatAppGenerateEntity)):
parent_trace_context = self._application_generate_entity.extras.get("parent_trace_context")
node_data: dict[str, Any] = {
"workflow_id": domain_execution.workflow_id,
"workflow_execution_id": execution.id_,
"tenant_id": self._application_generate_entity.app_config.tenant_id,
"app_id": self._application_generate_entity.app_config.app_id,
"node_execution_id": domain_execution.id,
"node_id": domain_execution.node_id,
"node_type": str(domain_execution.node_type.value),
"title": domain_execution.title,
"status": str(domain_execution.status.value),
"error": domain_execution.error,
"elapsed_time": domain_execution.elapsed_time,
"index": domain_execution.index,
"predecessor_node_id": domain_execution.predecessor_node_id,
"created_at": domain_execution.created_at,
"finished_at": domain_execution.finished_at,
"total_tokens": meta.get(WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS, 0),
"prompt_tokens": meta.get(WorkflowNodeExecutionMetadataKey.PROMPT_TOKENS),
"completion_tokens": meta.get(WorkflowNodeExecutionMetadataKey.COMPLETION_TOKENS),
"total_price": meta.get(WorkflowNodeExecutionMetadataKey.TOTAL_PRICE, 0.0),
"currency": meta.get(WorkflowNodeExecutionMetadataKey.CURRENCY),
"tool_name": (meta.get(WorkflowNodeExecutionMetadataKey.TOOL_INFO) or {}).get("tool_name")
if isinstance(meta.get(WorkflowNodeExecutionMetadataKey.TOOL_INFO), dict)
else None,
"iteration_id": meta.get(WorkflowNodeExecutionMetadataKey.ITERATION_ID),
"iteration_index": meta.get(WorkflowNodeExecutionMetadataKey.ITERATION_INDEX),
"loop_id": meta.get(WorkflowNodeExecutionMetadataKey.LOOP_ID),
"loop_index": meta.get(WorkflowNodeExecutionMetadataKey.LOOP_INDEX),
"parallel_id": meta.get(WorkflowNodeExecutionMetadataKey.PARALLEL_ID),
"node_inputs": dict(domain_execution.inputs) if domain_execution.inputs else None,
"node_outputs": dict(domain_execution.outputs) if domain_execution.outputs else None,
"process_data": dict(domain_execution.process_data) if domain_execution.process_data else None,
}
node_data["invoke_from"] = self._application_generate_entity.invoke_from.value
node_data["user_id"] = self._system_variables().get(SystemVariableKey.USER_ID.value)
if domain_execution.node_type.value == "knowledge-retrieval" and domain_execution.outputs:
results = domain_execution.outputs.get("result") or []
dataset_ids: list[str] = []
dataset_names: list[str] = []
for doc in results:
if not isinstance(doc, dict):
continue
doc_meta = doc.get("metadata") or {}
did = doc_meta.get("dataset_id")
dname = doc_meta.get("dataset_name")
if did and did not in dataset_ids:
dataset_ids.append(did)
if dname and dname not in dataset_names:
dataset_names.append(dname)
if dataset_ids:
node_data["dataset_ids"] = dataset_ids
if dataset_names:
node_data["dataset_names"] = dataset_names
tool_info = meta.get(WorkflowNodeExecutionMetadataKey.TOOL_INFO)
if isinstance(tool_info, dict):
plugin_id = tool_info.get("plugin_unique_identifier")
if plugin_id:
node_data["plugin_name"] = plugin_id
credential_id = tool_info.get("credential_id")
if credential_id:
node_data["credential_id"] = credential_id
node_data["credential_provider_type"] = tool_info.get("provider_type")
conversation_id = self._system_variables().get(SystemVariableKey.CONVERSATION_ID.value)
if conversation_id:
node_data["conversation_id"] = conversation_id
if parent_trace_context:
node_data["parent_trace_context"] = parent_trace_context
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id=node_data.get("tenant_id"),
user_id=node_data.get("user_id"),
app_id=node_data.get("app_id"),
),
payload={"node_execution_data": node_data},
),
trace_manager=self._trace_manager,
)
self._trace_manager.add_trace_task(trace_task)
def _system_variables(self) -> Mapping[str, Any]:
runtime_state = self.graph_runtime_state

View File

@@ -4,8 +4,9 @@ from typing import Any, TextIO, Union
from pydantic import BaseModel
from configs import dify_config
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.ops_trace_manager import TraceQueueManager
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from core.tools.entities.tool_entities import ToolInvokeMessage
_TEXT_COLOR_MAPPING = {
@@ -36,13 +37,15 @@ class DifyAgentCallbackHandler(BaseModel):
color: str | None = ""
current_loop: int = 1
tenant_id: str | None = None
def __init__(self, color: str | None = None):
def __init__(self, color: str | None = None, tenant_id: str | None = None):
super().__init__()
"""Initialize callback handler."""
# use a specific color is not specified
self.color = color or "green"
self.current_loop = 1
self.tenant_id = tenant_id
def on_tool_start(
self,
@@ -71,15 +74,23 @@ class DifyAgentCallbackHandler(BaseModel):
print_text("\n")
if trace_manager:
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.TOOL_TRACE,
message_id=message_id,
tool_name=tool_name,
tool_inputs=tool_inputs,
tool_outputs=tool_outputs,
timer=timer,
)
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.TOOL_TRACE,
context=TelemetryContext(
tenant_id=self.tenant_id,
app_id=trace_manager.app_id,
user_id=trace_manager.user_id,
),
payload={
"message_id": message_id,
"tool_name": tool_name,
"tool_inputs": tool_inputs,
"tool_outputs": tool_outputs,
"timer": timer,
},
),
trace_manager=trace_manager,
)
def on_tool_error(self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any):

View File

@@ -6,8 +6,6 @@ from typing import Protocol, cast
import json_repair
from core.app.app_config.entities import ModelConfig
from core.llm_generator.entities import RuleCodeGeneratePayload, RuleGeneratePayload, RuleStructuredOutputPayload
from core.llm_generator.output_parser.rule_config_generator import RuleConfigGeneratorOutputParser
from core.llm_generator.output_parser.suggested_questions_after_answer import SuggestedQuestionsAfterAnswerOutputParser
from core.llm_generator.prompts import (
@@ -27,10 +25,11 @@ from core.model_runtime.entities.llm_entities import LLMResult
from core.model_runtime.entities.message_entities import PromptMessage, SystemPromptMessage, UserPromptMessage
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.errors.invoke import InvokeAuthorizationError, InvokeError
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.entities.trace_entity import OperationType
from core.ops.utils import measure_time
from core.prompt.utils.prompt_template_parser import PromptTemplateParser
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey
from extensions.ext_database import db
from extensions.ext_storage import storage
@@ -73,8 +72,8 @@ class LLMGenerator:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompts), model_parameters={"max_tokens": 500, "temperature": 1}, stream=False
)
answer = response.message.get_text_content()
if answer == "":
answer = cast(str, response.message.content)
if answer is None:
return ""
try:
result_dict = json.loads(answer)
@@ -96,15 +95,17 @@ class LLMGenerator:
name = name[:75] + "..."
# get tracing instance
trace_manager = TraceQueueManager(app_id=app_id)
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.GENERATE_NAME_TRACE,
conversation_id=conversation_id,
generate_conversation_name=name,
inputs=prompt,
timer=timer,
tenant_id=tenant_id,
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.GENERATE_NAME_TRACE,
context=TelemetryContext(tenant_id=tenant_id, app_id=app_id),
payload={
"conversation_id": conversation_id,
"generate_conversation_name": name,
"inputs": prompt,
"timer": timer,
"tenant_id": tenant_id,
},
)
)
@@ -153,19 +154,27 @@ class LLMGenerator:
return questions
@classmethod
def generate_rule_config(cls, tenant_id: str, args: RuleGeneratePayload):
def generate_rule_config(
cls,
tenant_id: str,
instruction: str,
model_config: dict,
no_variable: bool,
user_id: str | None = None,
app_id: str | None = None,
):
output_parser = RuleConfigGeneratorOutputParser()
error = ""
error_step = ""
rule_config = {"prompt": "", "variables": [], "opening_statement": "", "error": ""}
model_parameters = args.model_config_data.completion_params
if args.no_variable:
model_parameters = model_config.get("completion_params", {})
if no_variable:
prompt_template = PromptTemplateParser(WORKFLOW_RULE_CONFIG_PROMPT_GENERATE_TEMPLATE)
prompt_generate = prompt_template.format(
inputs={
"TASK_DESCRIPTION": args.instruction,
"TASK_DESCRIPTION": instruction,
},
remove_template_variables=False,
)
@@ -177,26 +186,45 @@ class LLMGenerator:
model_instance = model_manager.get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
provider=args.model_config_data.provider,
model=args.model_config_data.name,
provider=model_config.get("provider", ""),
model=model_config.get("name", ""),
)
try:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
llm_result = None
with measure_time() as timer:
try:
llm_result = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
rule_config["prompt"] = response.message.get_text_content()
rule_config["prompt"] = cast(str, llm_result.message.content)
except InvokeError as e:
error = str(e)
error_step = "generate rule config"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", args.model_config_data.name)
rule_config["error"] = str(e)
except InvokeError as e:
error = str(e)
error_step = "generate rule config"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", model_config.get("name"))
rule_config["error"] = str(e)
error = str(e)
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
if user_id:
prompt_value = rule_config.get("prompt", "")
generated_output = str(prompt_value) if prompt_value else ""
cls._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.RULE_GENERATE,
instruction=instruction,
generated_output=generated_output,
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error or None,
)
return rule_config
# get rule config prompt, parameter and statement
@@ -211,7 +239,7 @@ class LLMGenerator:
# format the prompt_generate_prompt
prompt_generate_prompt = prompt_template.format(
inputs={
"TASK_DESCRIPTION": args.instruction,
"TASK_DESCRIPTION": instruction,
},
remove_template_variables=False,
)
@@ -222,84 +250,125 @@ class LLMGenerator:
model_instance = model_manager.get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
provider=args.model_config_data.provider,
model=args.model_config_data.name,
provider=model_config.get("provider", ""),
model=model_config.get("name", ""),
)
try:
llm_result = None
with measure_time() as timer:
try:
# the first step to generate the task prompt
prompt_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
try:
# the first step to generate the task prompt
prompt_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
llm_result = prompt_content
except InvokeError as e:
error = str(e)
error_step = "generate prefix prompt"
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
if user_id:
cls._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.RULE_GENERATE,
instruction=instruction,
generated_output="",
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error,
)
return rule_config
rule_config["prompt"] = cast(str, prompt_content.message.content)
if not isinstance(prompt_content.message.content, str):
raise NotImplementedError("prompt content is not a string")
parameter_generate_prompt = parameter_template.format(
inputs={
"INPUT_TEXT": prompt_content.message.content,
},
remove_template_variables=False,
)
except InvokeError as e:
error = str(e)
error_step = "generate prefix prompt"
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
parameter_messages = [UserPromptMessage(content=parameter_generate_prompt)]
return rule_config
rule_config["prompt"] = prompt_content.message.get_text_content()
parameter_generate_prompt = parameter_template.format(
inputs={
"INPUT_TEXT": prompt_content.message.get_text_content(),
},
remove_template_variables=False,
)
parameter_messages = [UserPromptMessage(content=parameter_generate_prompt)]
# the second step to generate the task_parameter and task_statement
statement_generate_prompt = statement_template.format(
inputs={
"TASK_DESCRIPTION": args.instruction,
"INPUT_TEXT": prompt_content.message.get_text_content(),
},
remove_template_variables=False,
)
statement_messages = [UserPromptMessage(content=statement_generate_prompt)]
try:
parameter_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(parameter_messages), model_parameters=model_parameters, stream=False
# the second step to generate the task_parameter and task_statement
statement_generate_prompt = statement_template.format(
inputs={
"TASK_DESCRIPTION": instruction,
"INPUT_TEXT": prompt_content.message.content,
},
remove_template_variables=False,
)
rule_config["variables"] = re.findall(r'"\s*([^"]+)\s*"', parameter_content.message.get_text_content())
except InvokeError as e:
error = str(e)
error_step = "generate variables"
statement_messages = [UserPromptMessage(content=statement_generate_prompt)]
try:
statement_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(statement_messages), model_parameters=model_parameters, stream=False
)
rule_config["opening_statement"] = statement_content.message.get_text_content()
except InvokeError as e:
error = str(e)
error_step = "generate conversation opener"
try:
parameter_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(parameter_messages), model_parameters=model_parameters, stream=False
)
rule_config["variables"] = re.findall(
r'"\s*([^"]+)\s*"', cast(str, parameter_content.message.content)
)
except InvokeError as e:
error = str(e)
error_step = "generate variables"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", args.model_config_data.name)
rule_config["error"] = str(e)
try:
statement_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(statement_messages), model_parameters=model_parameters, stream=False
)
rule_config["opening_statement"] = cast(str, statement_content.message.content)
except InvokeError as e:
error = str(e)
error_step = "generate conversation opener"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", model_config.get("name"))
rule_config["error"] = str(e)
error = str(e)
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
if user_id:
generated_output = rule_config.get("prompt", "")
cls._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.RULE_GENERATE,
instruction=instruction,
generated_output=str(generated_output) if generated_output else "",
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error or None,
)
return rule_config
@classmethod
def generate_code(
cls,
tenant_id: str,
args: RuleCodeGeneratePayload,
instruction: str,
model_config: dict,
code_language: str = "javascript",
user_id: str | None = None,
app_id: str | None = None,
):
if args.code_language == "python":
if code_language == "python":
prompt_template = PromptTemplateParser(PYTHON_CODE_GENERATOR_PROMPT_TEMPLATE)
else:
prompt_template = PromptTemplateParser(JAVASCRIPT_CODE_GENERATOR_PROMPT_TEMPLATE)
prompt = prompt_template.format(
inputs={
"INSTRUCTION": args.instruction,
"CODE_LANGUAGE": args.code_language,
"INSTRUCTION": instruction,
"CODE_LANGUAGE": code_language,
},
remove_template_variables=False,
)
@@ -308,28 +377,49 @@ class LLMGenerator:
model_instance = model_manager.get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
provider=args.model_config_data.provider,
model=args.model_config_data.name,
provider=model_config.get("provider", ""),
model=model_config.get("name", ""),
)
prompt_messages = [UserPromptMessage(content=prompt)]
model_parameters = args.model_config_data.completion_params
try:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
model_parameters = model_config.get("completion_params", {})
llm_result = None
error = None
with measure_time() as timer:
try:
llm_result = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
generated_code = cast(str, llm_result.message.content)
result = {"code": generated_code, "language": code_language, "error": ""}
except InvokeError as e:
error = str(e)
result = {"code": "", "language": code_language, "error": f"Failed to generate code. Error: {error}"}
except Exception as e:
logger.exception(
"Failed to invoke LLM model, model: %s, language: %s", model_config.get("name"), code_language
)
error = str(e)
result = {"code": "", "language": code_language, "error": f"An unexpected error occurred: {str(e)}"}
if user_id:
cls._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.CODE_GENERATE,
instruction=instruction,
generated_output=result.get("code", ""),
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error,
)
generated_code = response.message.get_text_content()
return {"code": generated_code, "language": args.code_language, "error": ""}
except InvokeError as e:
error = str(e)
return {"code": "", "language": args.code_language, "error": f"Failed to generate code. Error: {error}"}
except Exception as e:
logger.exception(
"Failed to invoke LLM model, model: %s, language: %s", args.model_config_data.name, args.code_language
)
return {"code": "", "language": args.code_language, "error": f"An unexpected error occurred: {str(e)}"}
return result
@classmethod
def generate_qa_document(cls, tenant_id: str, query, document_language: str):
@@ -355,49 +445,76 @@ class LLMGenerator:
raise TypeError("Expected LLMResult when stream=False")
response = result
answer = response.message.get_text_content()
answer = cast(str, response.message.content)
return answer.strip()
@classmethod
def generate_structured_output(cls, tenant_id: str, args: RuleStructuredOutputPayload):
def generate_structured_output(
cls, tenant_id: str, instruction: str, model_config: dict, user_id: str | None = None, app_id: str | None = None
):
model_manager = ModelManager()
model_instance = model_manager.get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
provider=args.model_config_data.provider,
model=args.model_config_data.name,
provider=model_config.get("provider", ""),
model=model_config.get("name", ""),
)
prompt_messages = [
SystemPromptMessage(content=SYSTEM_STRUCTURED_OUTPUT_GENERATE),
UserPromptMessage(content=args.instruction),
UserPromptMessage(content=instruction),
]
model_parameters = args.model_config_data.completion_params
model_parameters = model_config.get("model_parameters", {})
try:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
llm_result = None
error = None
result = {"output": "", "error": ""}
with measure_time() as timer:
try:
llm_result = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
raw_content = llm_result.message.content
if not isinstance(raw_content, str):
raise ValueError(f"LLM response content must be a string, got: {type(raw_content)}")
try:
parsed_content = json.loads(raw_content)
except json.JSONDecodeError:
parsed_content = json_repair.loads(raw_content)
if not isinstance(parsed_content, dict | list):
raise ValueError(f"Failed to parse structured output from llm: {raw_content}")
generated_json_schema = json.dumps(parsed_content, indent=2, ensure_ascii=False)
result = {"output": generated_json_schema, "error": ""}
except InvokeError as e:
error = str(e)
result = {"output": "", "error": f"Failed to generate JSON Schema. Error: {error}"}
except Exception as e:
logger.exception("Failed to invoke LLM model, model: %s", model_config.get("name"))
error = str(e)
result = {"output": "", "error": f"An unexpected error occurred: {str(e)}"}
if user_id:
cls._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.STRUCTURED_OUTPUT,
instruction=instruction,
generated_output=result.get("output", ""),
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error,
)
raw_content = response.message.get_text_content()
try:
parsed_content = json.loads(raw_content)
except json.JSONDecodeError:
parsed_content = json_repair.loads(raw_content)
if not isinstance(parsed_content, dict | list):
raise ValueError(f"Failed to parse structured output from llm: {raw_content}")
generated_json_schema = json.dumps(parsed_content, indent=2, ensure_ascii=False)
return {"output": generated_json_schema, "error": ""}
except InvokeError as e:
error = str(e)
return {"output": "", "error": f"Failed to generate JSON Schema. Error: {error}"}
except Exception as e:
logger.exception("Failed to invoke LLM model, model: %s", args.model_config_data.name)
return {"output": "", "error": f"An unexpected error occurred: {str(e)}"}
return result
@staticmethod
def instruction_modify_legacy(
@@ -405,14 +522,16 @@ class LLMGenerator:
flow_id: str,
current: str,
instruction: str,
model_config: ModelConfig,
model_config: dict,
ideal_output: str | None,
user_id: str | None = None,
app_id: str | None = None,
):
last_run: Message | None = (
db.session.query(Message).where(Message.app_id == flow_id).order_by(Message.created_at.desc()).first()
)
if not last_run:
return LLMGenerator.__instruction_modify_common(
result = LLMGenerator.__instruction_modify_common(
tenant_id=tenant_id,
model_config=model_config,
last_run=None,
@@ -421,22 +540,28 @@ class LLMGenerator:
instruction=instruction,
node_type="llm",
ideal_output=ideal_output,
user_id=user_id,
app_id=app_id,
)
last_run_dict = {
"query": last_run.query,
"answer": last_run.answer,
"error": last_run.error,
}
return LLMGenerator.__instruction_modify_common(
tenant_id=tenant_id,
model_config=model_config,
last_run=last_run_dict,
current=current,
error_message=str(last_run.error),
instruction=instruction,
node_type="llm",
ideal_output=ideal_output,
)
else:
last_run_dict = {
"query": last_run.query,
"answer": last_run.answer,
"error": last_run.error,
}
result = LLMGenerator.__instruction_modify_common(
tenant_id=tenant_id,
model_config=model_config,
last_run=last_run_dict,
current=current,
error_message=str(last_run.error),
instruction=instruction,
node_type="llm",
ideal_output=ideal_output,
user_id=user_id,
app_id=app_id,
)
return result
@staticmethod
def instruction_modify_workflow(
@@ -445,9 +570,11 @@ class LLMGenerator:
node_id: str,
current: str,
instruction: str,
model_config: ModelConfig,
model_config: dict,
ideal_output: str | None,
workflow_service: WorkflowServiceInterface,
user_id: str | None = None,
app_id: str | None = None,
):
session = db.session()
@@ -478,6 +605,8 @@ class LLMGenerator:
instruction=instruction,
node_type=node_type,
ideal_output=ideal_output,
user_id=user_id,
app_id=app_id,
)
def agent_log_of(node_execution: WorkflowNodeExecutionModel) -> Sequence:
@@ -511,18 +640,22 @@ class LLMGenerator:
instruction=instruction,
node_type=last_run.node_type,
ideal_output=ideal_output,
user_id=user_id,
app_id=app_id,
)
@staticmethod
def __instruction_modify_common(
tenant_id: str,
model_config: ModelConfig,
model_config: dict,
last_run: dict | None,
current: str | None,
error_message: str | None,
instruction: str,
node_type: str,
ideal_output: str | None,
user_id: str | None = None,
app_id: str | None = None,
):
LAST_RUN = "{{#last_run#}}"
CURRENT = "{{#current#}}"
@@ -537,8 +670,8 @@ class LLMGenerator:
model_instance = ModelManager().get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
provider=model_config.provider,
model=model_config.name,
provider=model_config.get("provider", ""),
model=model_config.get("name", ""),
)
match node_type:
case "llm" | "agent":
@@ -562,24 +695,122 @@ class LLMGenerator:
]
model_parameters = {"temperature": 0.4}
try:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
llm_result = None
error = None
result = {}
with measure_time() as timer:
try:
llm_result = model_instance.invoke_llm(
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
generated_raw = llm_result.message.get_text_content()
first_brace = generated_raw.find("{")
last_brace = generated_raw.rfind("}")
if first_brace == -1 or last_brace == -1 or last_brace < first_brace:
raise ValueError(f"Could not find a valid JSON object in response: {generated_raw}")
json_str = generated_raw[first_brace : last_brace + 1]
data = json_repair.loads(json_str)
if not isinstance(data, dict):
raise TypeError(f"Expected a JSON object, but got {type(data).__name__}")
result = data
except InvokeError as e:
error = str(e)
result = {"error": f"Failed to generate code. Error: {error}"}
except Exception as e:
logger.exception(
"Failed to invoke LLM model, model: %s", json.dumps(model_config.get("name")), exc_info=True
)
error = str(e)
result = {"error": f"An unexpected error occurred: {str(e)}"}
if user_id:
generated_output = ""
if isinstance(result, dict):
for key in ["prompt", "code", "output", "modified"]:
if result.get(key):
generated_output = str(result[key])
break
LLMGenerator._emit_prompt_generation_trace(
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=OperationType.INSTRUCTION_MODIFY,
instruction=instruction,
generated_output=generated_output,
llm_result=llm_result,
model_config=model_config,
timer=timer,
error=error,
)
generated_raw = response.message.get_text_content()
first_brace = generated_raw.find("{")
last_brace = generated_raw.rfind("}")
if first_brace == -1 or last_brace == -1 or last_brace < first_brace:
raise ValueError(f"Could not find a valid JSON object in response: {generated_raw}")
json_str = generated_raw[first_brace : last_brace + 1]
data = json_repair.loads(json_str)
if not isinstance(data, dict):
raise TypeError(f"Expected a JSON object, but got {type(data).__name__}")
return data
except InvokeError as e:
error = str(e)
return {"error": f"Failed to generate code. Error: {error}"}
except Exception as e:
logger.exception("Failed to invoke LLM model, model: %s", json.dumps(model_config.name), exc_info=True)
return {"error": f"An unexpected error occurred: {str(e)}"}
return result
@classmethod
def _emit_prompt_generation_trace(
cls,
tenant_id: str,
user_id: str,
app_id: str | None,
operation_type: OperationType,
instruction: str,
generated_output: str,
llm_result: LLMResult | None,
model_config: dict | None = None,
timer=None,
error: str | None = None,
):
if llm_result:
prompt_tokens = llm_result.usage.prompt_tokens
completion_tokens = llm_result.usage.completion_tokens
total_tokens = llm_result.usage.total_tokens
model_name = llm_result.model
# Extract provider from model_config if available, otherwise fall back to parsing model name
if model_config and model_config.get("provider"):
model_provider = model_config.get("provider", "")
else:
model_provider = model_name.split("/")[0] if "/" in model_name else ""
latency = llm_result.usage.latency
total_price = float(llm_result.usage.total_price) if llm_result.usage.total_price else None
currency = llm_result.usage.currency
else:
prompt_tokens = 0
completion_tokens = 0
total_tokens = 0
model_provider = model_config.get("provider", "") if model_config else ""
model_name = model_config.get("name", "") if model_config else ""
latency = 0.0
if timer:
start_time = timer.get("start")
end_time = timer.get("end")
if start_time and end_time:
latency = (end_time - start_time).total_seconds()
total_price = None
currency = None
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.PROMPT_GENERATION_TRACE,
context=TelemetryContext(tenant_id=tenant_id, user_id=user_id, app_id=app_id),
payload={
"tenant_id": tenant_id,
"user_id": user_id,
"app_id": app_id,
"operation_type": operation_type,
"instruction": instruction,
"generated_output": generated_output,
"prompt_tokens": prompt_tokens,
"completion_tokens": completion_tokens,
"total_tokens": total_tokens,
"model_provider": model_provider,
"model_name": model_name,
"latency": latency,
"total_price": total_price,
"currency": currency,
"timer": timer,
"error": error,
},
)
)

View File

@@ -15,16 +15,23 @@ class TraceContextFilter(logging.Filter):
"""
def filter(self, record: logging.LogRecord) -> bool:
# Get trace context from OpenTelemetry
trace_id, span_id = self._get_otel_context()
# Preserve explicit trace_id set by the caller (e.g. emit_metric_only_event)
existing_trace_id = getattr(record, "trace_id", "")
if not existing_trace_id:
# Get trace context from OpenTelemetry
trace_id, span_id = self._get_otel_context()
# Set trace_id (fallback to ContextVar if no OTEL context)
if trace_id:
record.trace_id = trace_id
# Set trace_id (fallback to ContextVar if no OTEL context)
if trace_id:
record.trace_id = trace_id
else:
record.trace_id = get_trace_id()
record.span_id = span_id or ""
else:
record.trace_id = get_trace_id()
record.span_id = span_id or ""
# Keep existing trace_id; only fill span_id if missing
if not getattr(record, "span_id", ""):
record.span_id = ""
# For backward compatibility, also set req_id
record.req_id = get_request_id()
@@ -55,9 +62,12 @@ class IdentityContextFilter(logging.Filter):
def filter(self, record: logging.LogRecord) -> bool:
identity = self._extract_identity()
record.tenant_id = identity.get("tenant_id", "")
record.user_id = identity.get("user_id", "")
record.user_type = identity.get("user_type", "")
if not getattr(record, "tenant_id", ""):
record.tenant_id = identity.get("tenant_id", "")
if not getattr(record, "user_id", ""):
record.user_id = identity.get("user_id", "")
if not getattr(record, "user_type", ""):
record.user_type = identity.get("user_type", "")
return True
def _extract_identity(self) -> dict[str, str]:

View File

@@ -5,9 +5,10 @@ from typing import Any
from core.app.app_config.entities import AppConfig
from core.moderation.base import ModerationAction, ModerationError
from core.moderation.factory import ModerationFactory
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.ops_trace_manager import TraceQueueManager
from core.ops.utils import measure_time
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
logger = logging.getLogger(__name__)
@@ -49,14 +50,18 @@ class InputModeration:
moderation_result = moderation_factory.moderation_for_inputs(inputs, query)
if trace_manager:
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.MODERATION_TRACE,
message_id=message_id,
moderation_result=moderation_result,
inputs=inputs,
timer=timer,
)
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.MODERATION_TRACE,
context=TelemetryContext(tenant_id=tenant_id, app_id=app_id),
payload={
"message_id": message_id,
"moderation_result": moderation_result,
"inputs": inputs,
"timer": timer,
},
),
trace_manager=trace_manager,
)
if not moderation_result.flagged:

View File

@@ -9,8 +9,8 @@ from pydantic import BaseModel, ConfigDict, field_serializer, field_validator
class BaseTraceInfo(BaseModel):
message_id: str | None = None
message_data: Any | None = None
inputs: Union[str, dict[str, Any], list] | None = None
outputs: Union[str, dict[str, Any], list] | None = None
inputs: Union[str, dict[str, Any], list[Any]] | None = None
outputs: Union[str, dict[str, Any], list[Any]] | None = None
start_time: datetime | None = None
end_time: datetime | None = None
metadata: dict[str, Any]
@@ -18,7 +18,7 @@ class BaseTraceInfo(BaseModel):
@field_validator("inputs", "outputs")
@classmethod
def ensure_type(cls, v):
def ensure_type(cls, v: str | dict[str, Any] | list[Any] | None) -> str | dict[str, Any] | list[Any] | None:
if v is None:
return None
if isinstance(v, str | dict | list):
@@ -48,10 +48,14 @@ class WorkflowTraceInfo(BaseTraceInfo):
workflow_run_version: str
error: str | None = None
total_tokens: int
prompt_tokens: int | None = None
completion_tokens: int | None = None
file_list: list[str]
query: str
metadata: dict[str, Any]
invoked_by: str | None = None
class MessageTraceInfo(BaseTraceInfo):
conversation_model: str
@@ -59,7 +63,7 @@ class MessageTraceInfo(BaseTraceInfo):
answer_tokens: int
total_tokens: int
error: str | None = None
file_list: Union[str, dict[str, Any], list] | None = None
file_list: Union[str, dict[str, Any], list[Any]] | None = None
message_file_data: Any | None = None
conversation_mode: str
gen_ai_server_time_to_first_token: float | None = None
@@ -106,7 +110,7 @@ class ToolTraceInfo(BaseTraceInfo):
tool_config: dict[str, Any]
time_cost: Union[int, float]
tool_parameters: dict[str, Any]
file_url: Union[str, None, list] = None
file_url: Union[str, None, list[str]] = None
class GenerateNameTraceInfo(BaseTraceInfo):
@@ -114,6 +118,79 @@ class GenerateNameTraceInfo(BaseTraceInfo):
tenant_id: str
class PromptGenerationTraceInfo(BaseTraceInfo):
"""Trace information for prompt generation operations (rule-generate, code-generate, etc.)."""
tenant_id: str
user_id: str
app_id: str | None = None
operation_type: str
instruction: str
prompt_tokens: int
completion_tokens: int
total_tokens: int
model_provider: str
model_name: str
latency: float
total_price: float | None = None
currency: str | None = None
error: str | None = None
model_config = ConfigDict(protected_namespaces=())
class WorkflowNodeTraceInfo(BaseTraceInfo):
workflow_id: str
workflow_run_id: str
tenant_id: str
node_execution_id: str
node_id: str
node_type: str
title: str
status: str
error: str | None = None
elapsed_time: float
index: int
predecessor_node_id: str | None = None
total_tokens: int = 0
total_price: float = 0.0
currency: str | None = None
model_provider: str | None = None
model_name: str | None = None
prompt_tokens: int | None = None
completion_tokens: int | None = None
tool_name: str | None = None
iteration_id: str | None = None
iteration_index: int | None = None
loop_id: str | None = None
loop_index: int | None = None
parallel_id: str | None = None
node_inputs: Mapping[str, Any] | None = None
node_outputs: Mapping[str, Any] | None = None
process_data: Mapping[str, Any] | None = None
invoked_by: str | None = None
model_config = ConfigDict(protected_namespaces=())
class DraftNodeExecutionTrace(WorkflowNodeTraceInfo):
pass
class TaskData(BaseModel):
app_id: str
trace_info_type: str
@@ -128,16 +205,38 @@ trace_info_info_map = {
"DatasetRetrievalTraceInfo": DatasetRetrievalTraceInfo,
"ToolTraceInfo": ToolTraceInfo,
"GenerateNameTraceInfo": GenerateNameTraceInfo,
"PromptGenerationTraceInfo": PromptGenerationTraceInfo,
"WorkflowNodeTraceInfo": WorkflowNodeTraceInfo,
"DraftNodeExecutionTrace": DraftNodeExecutionTrace,
}
class OperationType(StrEnum):
"""Operation type for token metric labels.
Used as a metric attribute on ``dify.tokens.input`` / ``dify.tokens.output``
counters so consumers can break down token usage by operation.
"""
WORKFLOW = "workflow"
NODE_EXECUTION = "node_execution"
MESSAGE = "message"
RULE_GENERATE = "rule_generate"
CODE_GENERATE = "code_generate"
STRUCTURED_OUTPUT = "structured_output"
INSTRUCTION_MODIFY = "instruction_modify"
class TraceTaskName(StrEnum):
CONVERSATION_TRACE = "conversation"
WORKFLOW_TRACE = "workflow"
DRAFT_NODE_EXECUTION_TRACE = "draft_node_execution"
MESSAGE_TRACE = "message"
MODERATION_TRACE = "moderation"
SUGGESTED_QUESTION_TRACE = "suggested_question"
DATASET_RETRIEVAL_TRACE = "dataset_retrieval"
TOOL_TRACE = "tool"
GENERATE_NAME_TRACE = "generate_conversation_name"
PROMPT_GENERATION_TRACE = "prompt_generation"
DATASOURCE_TRACE = "datasource"
NODE_EXECUTION_TRACE = "node_execution"

View File

@@ -3,6 +3,7 @@ import os
from datetime import datetime, timedelta
from langfuse import Langfuse
from sqlalchemy import select
from sqlalchemy.orm import sessionmaker
from core.ops.base_trace_instance import BaseTraceInstance
@@ -30,7 +31,7 @@ from core.ops.utils import filter_none_values
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.enums import NodeType
from extensions.ext_database import db
from models import EndUser, WorkflowNodeExecutionTriggeredFrom
from models import EndUser, Message, WorkflowNodeExecutionTriggeredFrom
from models.enums import MessageStatus
logger = logging.getLogger(__name__)
@@ -71,7 +72,50 @@ class LangFuseDataTrace(BaseTraceInstance):
metadata = trace_info.metadata
metadata["workflow_app_log_id"] = trace_info.workflow_app_log_id
if trace_info.message_id:
# Check for parent_trace_context to detect nested workflow
parent_trace_context = trace_info.metadata.get("parent_trace_context")
if parent_trace_context:
# Nested workflow: create span under outer trace
outer_trace_id = parent_trace_context.get("trace_id")
parent_node_execution_id = parent_trace_context.get("parent_node_execution_id")
parent_conversation_id = parent_trace_context.get("parent_conversation_id")
parent_workflow_run_id = parent_trace_context.get("parent_workflow_run_id")
# Resolve outer trace_id: try message_id lookup first, fallback to workflow_run_id
if parent_conversation_id:
session_factory = sessionmaker(bind=db.engine)
with session_factory() as session:
message_data_stmt = select(Message.id).where(
Message.conversation_id == parent_conversation_id,
Message.workflow_run_id == parent_workflow_run_id,
)
resolved_message_id = session.scalar(message_data_stmt)
if resolved_message_id:
outer_trace_id = resolved_message_id
else:
outer_trace_id = parent_workflow_run_id
else:
outer_trace_id = parent_workflow_run_id
# Create inner workflow span under outer trace
workflow_span_data = LangfuseSpan(
id=trace_info.workflow_run_id,
name=TraceTaskName.WORKFLOW_TRACE,
input=dict(trace_info.workflow_run_inputs),
output=dict(trace_info.workflow_run_outputs),
trace_id=outer_trace_id,
parent_observation_id=parent_node_execution_id,
start_time=trace_info.start_time,
end_time=trace_info.end_time,
metadata=metadata,
level=LevelEnum.DEFAULT if trace_info.error == "" else LevelEnum.ERROR,
status_message=trace_info.error or "",
)
self.add_span(langfuse_span_data=workflow_span_data)
# Use outer_trace_id for all node spans/generations
trace_id = outer_trace_id
elif trace_info.message_id:
trace_id = trace_info.trace_id or trace_info.message_id
name = TraceTaskName.MESSAGE_TRACE
trace_data = LangfuseTrace(
@@ -174,6 +218,11 @@ class LangFuseDataTrace(BaseTraceInstance):
}
)
# Determine parent_observation_id for nested workflows
node_parent_observation_id = None
if parent_trace_context or trace_info.message_id:
node_parent_observation_id = trace_info.workflow_run_id
# add generation span
if process_data and process_data.get("model_mode") == "chat":
total_token = metadata.get("total_tokens", 0)
@@ -206,7 +255,7 @@ class LangFuseDataTrace(BaseTraceInstance):
metadata=metadata,
level=(LevelEnum.DEFAULT if status == "succeeded" else LevelEnum.ERROR),
status_message=trace_info.error or "",
parent_observation_id=trace_info.workflow_run_id if trace_info.message_id else None,
parent_observation_id=node_parent_observation_id,
usage=generation_usage,
)
@@ -225,7 +274,7 @@ class LangFuseDataTrace(BaseTraceInstance):
metadata=metadata,
level=(LevelEnum.DEFAULT if status == "succeeded" else LevelEnum.ERROR),
status_message=trace_info.error or "",
parent_observation_id=trace_info.workflow_run_id if trace_info.message_id else None,
parent_observation_id=node_parent_observation_id,
)
self.add_span(langfuse_span_data=span_data)

View File

@@ -6,6 +6,7 @@ from typing import cast
from langsmith import Client
from langsmith.schemas import RunBase
from sqlalchemy import select
from sqlalchemy.orm import sessionmaker
from core.ops.base_trace_instance import BaseTraceInstance
@@ -30,7 +31,7 @@ from core.ops.utils import filter_none_values, generate_dotted_order
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.enums import NodeType, WorkflowNodeExecutionMetadataKey
from extensions.ext_database import db
from models import EndUser, MessageFile, WorkflowNodeExecutionTriggeredFrom
from models import EndUser, Message, MessageFile, WorkflowNodeExecutionTriggeredFrom
logger = logging.getLogger(__name__)
@@ -64,7 +65,35 @@ class LangSmithDataTrace(BaseTraceInstance):
self.generate_name_trace(trace_info)
def workflow_trace(self, trace_info: WorkflowTraceInfo):
trace_id = trace_info.trace_id or trace_info.message_id or trace_info.workflow_run_id
# Check for parent_trace_context for cross-workflow linking
parent_trace_context = trace_info.metadata.get("parent_trace_context")
if parent_trace_context:
# Inner workflow: resolve outer trace_id and link to parent node
outer_trace_id = parent_trace_context.get("parent_workflow_run_id")
# Try to resolve message_id from conversation_id if available
if parent_trace_context.get("parent_conversation_id"):
try:
session_factory = sessionmaker(bind=db.engine)
with session_factory() as session:
message_data_stmt = select(Message.id).where(
Message.conversation_id == parent_trace_context["parent_conversation_id"],
Message.workflow_run_id == parent_trace_context["parent_workflow_run_id"],
)
resolved_message_id = session.scalar(message_data_stmt)
if resolved_message_id:
outer_trace_id = resolved_message_id
except Exception as e:
logger.debug("Failed to resolve message_id from conversation_id: %s", str(e))
trace_id = outer_trace_id
parent_run_id = parent_trace_context.get("parent_node_execution_id")
else:
# Outer workflow: existing behavior
trace_id = trace_info.trace_id or trace_info.message_id or trace_info.workflow_run_id
parent_run_id = trace_info.message_id or None
if trace_info.start_time is None:
trace_info.start_time = datetime.now()
message_dotted_order = (
@@ -78,7 +107,8 @@ class LangSmithDataTrace(BaseTraceInstance):
metadata = trace_info.metadata
metadata["workflow_app_log_id"] = trace_info.workflow_app_log_id
if trace_info.message_id:
# Only create message_run for outer workflows (no parent_trace_context)
if trace_info.message_id and not parent_trace_context:
message_run = LangSmithRunModel(
id=trace_info.message_id,
name=TraceTaskName.MESSAGE_TRACE,
@@ -121,9 +151,9 @@ class LangSmithDataTrace(BaseTraceInstance):
},
error=trace_info.error,
tags=["workflow"],
parent_run_id=trace_info.message_id or None,
parent_run_id=parent_run_id,
trace_id=trace_id,
dotted_order=workflow_dotted_order,
dotted_order=None if parent_trace_context else workflow_dotted_order,
serialized=None,
events=[],
session_id=None,

View File

@@ -21,19 +21,25 @@ from core.ops.entities.config_entity import (
)
from core.ops.entities.trace_entity import (
DatasetRetrievalTraceInfo,
DraftNodeExecutionTrace,
GenerateNameTraceInfo,
MessageTraceInfo,
ModerationTraceInfo,
PromptGenerationTraceInfo,
SuggestedQuestionTraceInfo,
TaskData,
ToolTraceInfo,
TraceTaskName,
WorkflowNodeTraceInfo,
WorkflowTraceInfo,
)
from core.ops.utils import get_message_data
from extensions.ext_database import db
from extensions.ext_storage import storage
from models.account import Tenant
from models.dataset import Dataset
from models.model import App, AppModelConfig, Conversation, Message, MessageFile, TraceAppConfig
from models.tools import ApiToolProvider, BuiltinToolProvider, MCPToolProvider, WorkflowToolProvider
from models.workflow import WorkflowAppLog
from tasks.ops_trace_task import process_trace_tasks
@@ -43,6 +49,44 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
def _lookup_app_and_workspace_names(app_id: str | None, tenant_id: str | None) -> tuple[str, str]:
"""Return (app_name, workspace_name) for the given IDs. Falls back to empty strings."""
app_name = ""
workspace_name = ""
if not app_id and not tenant_id:
return app_name, workspace_name
with Session(db.engine) as session:
if app_id:
name = session.scalar(select(App.name).where(App.id == app_id))
if name:
app_name = name
if tenant_id:
name = session.scalar(select(Tenant.name).where(Tenant.id == tenant_id))
if name:
workspace_name = name
return app_name, workspace_name
_PROVIDER_TYPE_TO_MODEL: dict[str, type] = {
"builtin": BuiltinToolProvider,
"plugin": BuiltinToolProvider,
"api": ApiToolProvider,
"workflow": WorkflowToolProvider,
"mcp": MCPToolProvider,
}
def _lookup_credential_name(credential_id: str | None, provider_type: str | None) -> str:
if not credential_id:
return ""
model_cls = _PROVIDER_TYPE_TO_MODEL.get(provider_type or "")
if not model_cls:
return ""
with Session(db.engine) as session:
name = session.scalar(select(model_cls.name).where(model_cls.id == credential_id))
return str(name) if name else ""
class OpsTraceProviderConfigMap(collections.UserDict[str, dict[str, Any]]):
def __getitem__(self, provider: str) -> dict[str, Any]:
match provider:
@@ -317,6 +361,10 @@ class OpsTraceManager:
if app_id is None:
return None
# Handle storage_id format (tenant-{uuid}) - not a real app_id
if isinstance(app_id, str) and app_id.startswith("tenant-"):
return None
app: App | None = db.session.query(App).where(App.id == app_id).first()
if app is None:
@@ -479,6 +527,56 @@ class TraceTask:
cls._workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(session_maker)
return cls._workflow_run_repo
@classmethod
def _get_user_id_from_metadata(cls, metadata: dict[str, Any]) -> str:
"""Extract user ID from metadata, prioritizing end_user over account.
Returns the actual user ID (end_user or account) who invoked the workflow,
regardless of invoke_from context.
"""
# Priority 1: End user (external users via API/WebApp)
if user_id := metadata.get("from_end_user_id"):
return f"end_user:{user_id}"
# Priority 2: Account user (internal users via console/debugger)
if user_id := metadata.get("from_account_id"):
return f"account:{user_id}"
# Priority 3: User (internal users via console/debugger)
if user_id := metadata.get("user_id"):
return f"user:{user_id}"
return "anonymous"
@classmethod
def _calculate_workflow_token_split(cls, workflow_run_id: str, tenant_id: str) -> tuple[int, int]:
from core.workflow.enums import WorkflowNodeExecutionMetadataKey
from models.workflow import WorkflowNodeExecutionModel
with Session(db.engine) as session:
node_executions = session.scalars(
select(WorkflowNodeExecutionModel).where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.workflow_run_id == workflow_run_id,
)
).all()
total_prompt = 0
total_completion = 0
for node_exec in node_executions:
metadata = node_exec.execution_metadata_dict
prompt = metadata.get(WorkflowNodeExecutionMetadataKey.PROMPT_TOKENS)
if prompt is not None:
total_prompt += prompt
completion = metadata.get(WorkflowNodeExecutionMetadataKey.COMPLETION_TOKENS)
if completion is not None:
total_completion += completion
return (total_prompt, total_completion)
def __init__(
self,
trace_type: Any,
@@ -499,6 +597,8 @@ class TraceTask:
self.app_id = None
self.trace_id = None
self.kwargs = kwargs
if user_id is not None and "user_id" not in self.kwargs:
self.kwargs["user_id"] = user_id
external_trace_id = kwargs.get("external_trace_id")
if external_trace_id:
self.trace_id = external_trace_id
@@ -512,7 +612,7 @@ class TraceTask:
TraceTaskName.WORKFLOW_TRACE: lambda: self.workflow_trace(
workflow_run_id=self.workflow_run_id, conversation_id=self.conversation_id, user_id=self.user_id
),
TraceTaskName.MESSAGE_TRACE: lambda: self.message_trace(message_id=self.message_id),
TraceTaskName.MESSAGE_TRACE: lambda: self.message_trace(message_id=self.message_id, **self.kwargs),
TraceTaskName.MODERATION_TRACE: lambda: self.moderation_trace(
message_id=self.message_id, timer=self.timer, **self.kwargs
),
@@ -528,6 +628,9 @@ class TraceTask:
TraceTaskName.GENERATE_NAME_TRACE: lambda: self.generate_name_trace(
conversation_id=self.conversation_id, timer=self.timer, **self.kwargs
),
TraceTaskName.PROMPT_GENERATION_TRACE: lambda: self.prompt_generation_trace(**self.kwargs),
TraceTaskName.NODE_EXECUTION_TRACE: lambda: self.node_execution_trace(**self.kwargs),
TraceTaskName.DRAFT_NODE_EXECUTION_TRACE: lambda: self.draft_node_execution_trace(**self.kwargs),
}
return preprocess_map.get(self.trace_type, lambda: None)()
@@ -563,6 +666,10 @@ class TraceTask:
total_tokens = workflow_run.total_tokens
prompt_tokens, completion_tokens = self._calculate_workflow_token_split(
workflow_run_id=workflow_run_id, tenant_id=tenant_id
)
file_list = workflow_run_inputs.get("sys.file") or []
query = workflow_run_inputs.get("query") or workflow_run_inputs.get("sys.query") or ""
@@ -583,7 +690,9 @@ class TraceTask:
)
message_id = session.scalar(message_data_stmt)
metadata = {
app_name, workspace_name = _lookup_app_and_workspace_names(workflow_run.app_id, tenant_id)
metadata: dict[str, Any] = {
"workflow_id": workflow_id,
"conversation_id": conversation_id,
"workflow_run_id": workflow_run_id,
@@ -596,8 +705,14 @@ class TraceTask:
"triggered_from": workflow_run.triggered_from,
"user_id": user_id,
"app_id": workflow_run.app_id,
"app_name": app_name,
"workspace_name": workspace_name,
}
parent_trace_context = self.kwargs.get("parent_trace_context")
if parent_trace_context:
metadata["parent_trace_context"] = parent_trace_context
workflow_trace_info = WorkflowTraceInfo(
trace_id=self.trace_id,
workflow_data=workflow_run.to_dict(),
@@ -612,6 +727,8 @@ class TraceTask:
workflow_run_version=workflow_run_version,
error=error,
total_tokens=total_tokens,
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
file_list=file_list,
query=query,
metadata=metadata,
@@ -619,10 +736,11 @@ class TraceTask:
message_id=message_id,
start_time=workflow_run.created_at,
end_time=workflow_run.finished_at,
invoked_by=self._get_user_id_from_metadata(metadata),
)
return workflow_trace_info
def message_trace(self, message_id: str | None):
def message_trace(self, message_id: str | None, **kwargs):
if not message_id:
return {}
message_data = get_message_data(message_id)
@@ -645,6 +763,14 @@ class TraceTask:
streaming_metrics = self._extract_streaming_metrics(message_data)
tenant_id = ""
with Session(db.engine) as session:
tid = session.scalar(select(App.tenant_id).where(App.id == message_data.app_id))
if tid:
tenant_id = str(tid)
app_name, workspace_name = _lookup_app_and_workspace_names(message_data.app_id, tenant_id)
metadata = {
"conversation_id": message_data.conversation_id,
"ls_provider": message_data.model_provider,
@@ -656,7 +782,14 @@ class TraceTask:
"workflow_run_id": message_data.workflow_run_id,
"from_source": message_data.from_source,
"message_id": message_id,
"tenant_id": tenant_id,
"app_id": message_data.app_id,
"user_id": message_data.from_end_user_id or message_data.from_account_id,
"app_name": app_name,
"workspace_name": workspace_name,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
message_tokens = message_data.message_tokens
@@ -698,6 +831,8 @@ class TraceTask:
"preset_response": moderation_result.preset_response,
"query": moderation_result.query,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
# get workflow_app_log_id
workflow_app_log_id = None
@@ -739,6 +874,8 @@ class TraceTask:
"workflow_run_id": message_data.workflow_run_id,
"from_source": message_data.from_source,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
# get workflow_app_log_id
workflow_app_log_id = None
@@ -778,6 +915,36 @@ class TraceTask:
if not message_data:
return {}
tenant_id = ""
with Session(db.engine) as session:
tid = session.scalar(select(App.tenant_id).where(App.id == message_data.app_id))
if tid:
tenant_id = str(tid)
app_name, workspace_name = _lookup_app_and_workspace_names(message_data.app_id, tenant_id)
doc_list = [doc.model_dump() for doc in documents] if documents else []
dataset_ids: set[str] = set()
for doc in doc_list:
doc_meta = doc.get("metadata") or {}
did = doc_meta.get("dataset_id")
if did:
dataset_ids.add(did)
embedding_models: dict[str, dict[str, str]] = {}
if dataset_ids:
with Session(db.engine) as session:
rows = session.execute(
select(Dataset.id, Dataset.embedding_model, Dataset.embedding_model_provider).where(
Dataset.id.in_(list(dataset_ids))
)
).all()
for row in rows:
embedding_models[str(row[0])] = {
"embedding_model": row[1] or "",
"embedding_model_provider": row[2] or "",
}
metadata = {
"message_id": message_id,
"ls_provider": message_data.model_provider,
@@ -788,13 +955,21 @@ class TraceTask:
"agent_based": message_data.agent_based,
"workflow_run_id": message_data.workflow_run_id,
"from_source": message_data.from_source,
"tenant_id": tenant_id,
"app_id": message_data.app_id,
"user_id": message_data.from_end_user_id or message_data.from_account_id,
"app_name": app_name,
"workspace_name": workspace_name,
"embedding_models": embedding_models,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
dataset_retrieval_trace_info = DatasetRetrievalTraceInfo(
trace_id=self.trace_id,
message_id=message_id,
inputs=message_data.query or message_data.inputs,
documents=[doc.model_dump() for doc in documents] if documents else [],
documents=doc_list,
start_time=timer.get("start"),
end_time=timer.get("end"),
metadata=metadata,
@@ -837,6 +1012,10 @@ class TraceTask:
"error": error,
"tool_parameters": tool_parameters,
}
if message_data.workflow_run_id:
metadata["workflow_run_id"] = message_data.workflow_run_id
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
file_url = ""
message_file_data = db.session.query(MessageFile).filter_by(message_id=message_id).first()
@@ -891,6 +1070,8 @@ class TraceTask:
"conversation_id": conversation_id,
"tenant_id": tenant_id,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
generate_name_trace_info = GenerateNameTraceInfo(
trace_id=self.trace_id,
@@ -905,6 +1086,158 @@ class TraceTask:
return generate_name_trace_info
def prompt_generation_trace(self, **kwargs) -> PromptGenerationTraceInfo | dict:
tenant_id = kwargs.get("tenant_id", "")
user_id = kwargs.get("user_id", "")
app_id = kwargs.get("app_id")
operation_type = kwargs.get("operation_type", "")
instruction = kwargs.get("instruction", "")
generated_output = kwargs.get("generated_output", "")
prompt_tokens = kwargs.get("prompt_tokens", 0)
completion_tokens = kwargs.get("completion_tokens", 0)
total_tokens = kwargs.get("total_tokens", 0)
model_provider = kwargs.get("model_provider", "")
model_name = kwargs.get("model_name", "")
latency = kwargs.get("latency", 0.0)
timer = kwargs.get("timer")
start_time = timer.get("start") if timer else None
end_time = timer.get("end") if timer else None
total_price = kwargs.get("total_price")
currency = kwargs.get("currency")
error = kwargs.get("error")
app_name = None
workspace_name = None
if app_id:
app_name, workspace_name = _lookup_app_and_workspace_names(app_id, tenant_id)
metadata = {
"tenant_id": tenant_id,
"user_id": user_id,
"app_id": app_id or "",
"app_name": app_name,
"workspace_name": workspace_name,
"operation_type": operation_type,
"model_provider": model_provider,
"model_name": model_name,
}
if node_execution_id := kwargs.get("node_execution_id"):
metadata["node_execution_id"] = node_execution_id
return PromptGenerationTraceInfo(
trace_id=self.trace_id,
inputs=instruction,
outputs=generated_output,
start_time=start_time,
end_time=end_time,
metadata=metadata,
tenant_id=tenant_id,
user_id=user_id,
app_id=app_id,
operation_type=operation_type,
instruction=instruction,
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
total_tokens=total_tokens,
model_provider=model_provider,
model_name=model_name,
latency=latency,
total_price=total_price,
currency=currency,
error=error,
)
def node_execution_trace(self, **kwargs) -> WorkflowNodeTraceInfo | dict:
node_data: dict = kwargs.get("node_execution_data", {})
if not node_data:
return {}
app_name, workspace_name = _lookup_app_and_workspace_names(node_data.get("app_id"), node_data.get("tenant_id"))
credential_name = _lookup_credential_name(
node_data.get("credential_id"), node_data.get("credential_provider_type")
)
metadata: dict[str, Any] = {
"tenant_id": node_data.get("tenant_id"),
"app_id": node_data.get("app_id"),
"app_name": app_name,
"workspace_name": workspace_name,
"user_id": node_data.get("user_id"),
"dataset_ids": node_data.get("dataset_ids"),
"dataset_names": node_data.get("dataset_names"),
"plugin_name": node_data.get("plugin_name"),
"credential_name": credential_name,
}
parent_trace_context = node_data.get("parent_trace_context")
if parent_trace_context:
metadata["parent_trace_context"] = parent_trace_context
message_id: str | None = None
conversation_id = node_data.get("conversation_id")
workflow_execution_id = node_data.get("workflow_execution_id")
if conversation_id and workflow_execution_id and not parent_trace_context:
with Session(db.engine) as session:
msg_id = session.scalar(
select(Message.id).where(
Message.conversation_id == conversation_id,
Message.workflow_run_id == workflow_execution_id,
)
)
if msg_id:
message_id = str(msg_id)
metadata["message_id"] = message_id
return WorkflowNodeTraceInfo(
trace_id=self.trace_id,
message_id=message_id,
start_time=node_data.get("created_at"),
end_time=node_data.get("finished_at"),
metadata=metadata,
workflow_id=node_data.get("workflow_id", ""),
workflow_run_id=node_data.get("workflow_execution_id", ""),
tenant_id=node_data.get("tenant_id", ""),
node_execution_id=node_data.get("node_execution_id", ""),
node_id=node_data.get("node_id", ""),
node_type=node_data.get("node_type", ""),
title=node_data.get("title", ""),
status=node_data.get("status", ""),
error=node_data.get("error"),
elapsed_time=node_data.get("elapsed_time", 0.0),
index=node_data.get("index", 0),
predecessor_node_id=node_data.get("predecessor_node_id"),
total_tokens=node_data.get("total_tokens", 0),
total_price=node_data.get("total_price", 0.0),
currency=node_data.get("currency"),
model_provider=node_data.get("model_provider"),
model_name=node_data.get("model_name"),
prompt_tokens=node_data.get("prompt_tokens"),
completion_tokens=node_data.get("completion_tokens"),
tool_name=node_data.get("tool_name"),
iteration_id=node_data.get("iteration_id"),
iteration_index=node_data.get("iteration_index"),
loop_id=node_data.get("loop_id"),
loop_index=node_data.get("loop_index"),
parallel_id=node_data.get("parallel_id"),
node_inputs=node_data.get("node_inputs"),
node_outputs=node_data.get("node_outputs"),
process_data=node_data.get("process_data"),
invoked_by=self._get_user_id_from_metadata(metadata),
)
def draft_node_execution_trace(self, **kwargs) -> DraftNodeExecutionTrace | dict:
node_trace = self.node_execution_trace(**kwargs)
if not node_trace or not isinstance(node_trace, WorkflowNodeTraceInfo):
return node_trace
return DraftNodeExecutionTrace(**node_trace.model_dump())
def _extract_streaming_metrics(self, message_data) -> dict:
if not message_data.message_metadata:
return {}
@@ -938,13 +1271,17 @@ class TraceQueueManager:
self.user_id = user_id
self.trace_instance = OpsTraceManager.get_ops_trace_instance(app_id)
self.flask_app = current_app._get_current_object() # type: ignore
from core.telemetry import is_enterprise_telemetry_enabled
self._enterprise_telemetry_enabled = is_enterprise_telemetry_enabled()
if trace_manager_timer is None:
self.start_timer()
def add_trace_task(self, trace_task: TraceTask):
global trace_manager_timer, trace_manager_queue
try:
if self.trace_instance:
if self._enterprise_telemetry_enabled or self.trace_instance:
trace_task.app_id = self.app_id
trace_manager_queue.put(trace_task)
except Exception:
@@ -980,20 +1317,27 @@ class TraceQueueManager:
def send_to_celery(self, tasks: list[TraceTask]):
with self.flask_app.app_context():
for task in tasks:
if task.app_id is None:
continue
storage_id = task.app_id
if storage_id is None:
tenant_id = task.kwargs.get("tenant_id")
if tenant_id:
storage_id = f"tenant-{tenant_id}"
else:
logger.warning("Skipping trace without app_id or tenant_id, trace_type: %s", task.trace_type)
continue
file_id = uuid4().hex
trace_info = task.execute()
task_data = TaskData(
app_id=task.app_id,
app_id=storage_id,
trace_info_type=type(trace_info).__name__,
trace_info=trace_info.model_dump() if trace_info else None,
)
file_path = f"{OPS_FILE_PATH}{task.app_id}/{file_id}.json"
file_path = f"{OPS_FILE_PATH}{storage_id}/{file_id}.json"
storage.save(file_path, task_data.model_dump_json().encode("utf-8"))
file_info = {
"file_id": file_id,
"app_id": task.app_id,
"app_id": storage_id,
}
process_trace_tasks.delay(file_info) # type: ignore

View File

@@ -27,8 +27,7 @@ from core.model_runtime.entities.llm_entities import LLMResult, LLMUsage
from core.model_runtime.entities.message_entities import PromptMessage, PromptMessageRole, PromptMessageTool
from core.model_runtime.entities.model_entities import ModelFeature, ModelType
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.ops_trace_manager import TraceQueueManager
from core.ops.utils import measure_time
from core.prompt.advanced_prompt_transform import AdvancedPromptTransform
from core.prompt.entities.advanced_prompt_entities import ChatModelMessage, CompletionModelPromptTemplate
@@ -56,6 +55,8 @@ from core.rag.retrieval.template_prompts import (
METADATA_FILTER_USER_PROMPT_2,
METADATA_FILTER_USER_PROMPT_3,
)
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from core.tools.signature import sign_upload_file
from core.tools.utils.dataset_retriever.dataset_retriever_base_tool import DatasetRetrieverBaseTool
from extensions.ext_database import db
@@ -728,10 +729,21 @@ class DatasetRetrieval:
self.application_generate_entity.trace_manager if self.application_generate_entity else None
)
if trace_manager:
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.DATASET_RETRIEVAL_TRACE, message_id=message_id, documents=documents, timer=timer
)
app_config = self.application_generate_entity.app_config if self.application_generate_entity else None
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.DATASET_RETRIEVAL_TRACE,
context=TelemetryContext(
tenant_id=app_config.tenant_id if app_config else None,
app_id=app_config.app_id if app_config else None,
),
payload={
"message_id": message_id,
"documents": documents,
"timer": timer,
},
),
trace_manager=trace_manager,
)
def _on_query(

View File

@@ -0,0 +1,60 @@
"""Community telemetry helpers.
Provides ``emit()`` which enqueues trace events into the CE trace pipeline
(``TraceQueueManager`` → ``ops_trace`` Celery queue → Langfuse / LangSmith / etc.).
Enterprise-only traces (node execution, draft node execution, prompt generation)
are silently dropped when enterprise telemetry is disabled.
"""
from __future__ import annotations
from typing import TYPE_CHECKING
from core.ops.entities.trace_entity import TraceTaskName
from core.telemetry.events import TelemetryContext, TelemetryEvent
if TYPE_CHECKING:
from core.ops.ops_trace_manager import TraceQueueManager
_ENTERPRISE_ONLY_TRACES: frozenset[TraceTaskName] = frozenset(
{
TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
TraceTaskName.NODE_EXECUTION_TRACE,
TraceTaskName.PROMPT_GENERATION_TRACE,
}
)
def _is_enterprise_telemetry_enabled() -> bool:
try:
from enterprise.telemetry.exporter import is_enterprise_telemetry_enabled
return is_enterprise_telemetry_enabled()
except Exception:
return False
def emit(event: TelemetryEvent, trace_manager: TraceQueueManager | None = None) -> None:
from core.ops.ops_trace_manager import TraceQueueManager as LocalTraceQueueManager
from core.ops.ops_trace_manager import TraceTask
if event.name in _ENTERPRISE_ONLY_TRACES and not _is_enterprise_telemetry_enabled():
return
queue_manager = trace_manager or LocalTraceQueueManager(
app_id=event.context.app_id,
user_id=event.context.user_id,
)
queue_manager.add_trace_task(TraceTask(event.name, **event.payload))
is_enterprise_telemetry_enabled = _is_enterprise_telemetry_enabled
__all__ = [
"TelemetryContext",
"TelemetryEvent",
"TraceTaskName",
"emit",
"is_enterprise_telemetry_enabled",
]

View File

@@ -0,0 +1,21 @@
from __future__ import annotations
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from core.ops.entities.trace_entity import TraceTaskName
@dataclass(frozen=True)
class TelemetryContext:
tenant_id: str | None = None
user_id: str | None = None
app_id: str | None = None
@dataclass(frozen=True)
class TelemetryEvent:
name: TraceTaskName
context: TelemetryContext
payload: dict[str, Any]

View File

@@ -50,6 +50,7 @@ class WorkflowTool(Tool):
self.workflow_call_depth = workflow_call_depth
self.label = label
self._latest_usage = LLMUsage.empty_usage()
self.parent_trace_context: dict[str, str] | None = None
super().__init__(entity=entity, runtime=runtime)
@@ -90,11 +91,15 @@ class WorkflowTool(Tool):
self._latest_usage = LLMUsage.empty_usage()
args: dict[str, Any] = {"inputs": tool_parameters, "files": files}
if self.parent_trace_context:
args["_parent_trace_context"] = self.parent_trace_context
result = generator.generate(
app_model=app,
workflow=workflow,
user=user,
args={"inputs": tool_parameters, "files": files},
args=args,
invoke_from=self.runtime.invoke_from,
streaming=False,
call_depth=self.workflow_call_depth + 1,

View File

@@ -232,6 +232,8 @@ class WorkflowNodeExecutionMetadataKey(StrEnum):
"""
TOTAL_TOKENS = "total_tokens"
PROMPT_TOKENS = "prompt_tokens"
COMPLETION_TOKENS = "completion_tokens"
TOTAL_PRICE = "total_price"
CURRENCY = "currency"
TOOL_INFO = "tool_info"

View File

@@ -322,6 +322,8 @@ class LLMNode(Node[LLMNodeData]):
outputs=outputs,
metadata={
WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: usage.total_tokens,
WorkflowNodeExecutionMetadataKey.PROMPT_TOKENS: usage.prompt_tokens,
WorkflowNodeExecutionMetadataKey.COMPLETION_TOKENS: usage.completion_tokens,
WorkflowNodeExecutionMetadataKey.TOTAL_PRICE: usage.total_price,
WorkflowNodeExecutionMetadataKey.CURRENCY: usage.currency,
},

View File

@@ -61,6 +61,7 @@ class ToolNode(Node[ToolNodeData]):
"provider_type": self.node_data.provider_type.value,
"provider_id": self.node_data.provider_id,
"plugin_unique_identifier": self.node_data.plugin_unique_identifier,
"credential_id": self.node_data.credential_id,
}
# get tool runtime
@@ -105,6 +106,20 @@ class ToolNode(Node[ToolNodeData]):
# get conversation id
conversation_id = self.graph_runtime_state.variable_pool.get(["sys", SystemVariableKey.CONVERSATION_ID])
from core.tools.workflow_as_tool.tool import WorkflowTool
if isinstance(tool_runtime, WorkflowTool):
workflow_run_id_var = self.graph_runtime_state.variable_pool.get(
["sys", SystemVariableKey.WORKFLOW_EXECUTION_ID]
)
tool_runtime.parent_trace_context = {
"trace_id": str(workflow_run_id_var.text) if workflow_run_id_var else "",
"parent_node_execution_id": self.execution_id,
"parent_workflow_run_id": str(workflow_run_id_var.text) if workflow_run_id_var else "",
"parent_app_id": self.app_id,
"parent_conversation_id": conversation_id.text if conversation_id else None,
}
try:
message_stream = ToolEngine.generic_invoke(
tool=tool_runtime,
@@ -431,6 +446,8 @@ class ToolNode(Node[ToolNodeData]):
}
if isinstance(usage.total_tokens, int) and usage.total_tokens > 0:
metadata[WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS] = usage.total_tokens
metadata[WorkflowNodeExecutionMetadataKey.PROMPT_TOKENS] = usage.prompt_tokens
metadata[WorkflowNodeExecutionMetadataKey.COMPLETION_TOKENS] = usage.completion_tokens
metadata[WorkflowNodeExecutionMetadataKey.TOTAL_PRICE] = usage.total_price
metadata[WorkflowNodeExecutionMetadataKey.CURRENCY] = usage.currency

View File

View File

@@ -0,0 +1,911 @@
# Dify Enterprise Telemetry Data Dictionary
This document provides a comprehensive reference for all telemetry signals emitted by the Dify Enterprise OpenTelemetry (OTEL) exporter. It is intended for platform engineers and data scientists integrating Dify with observability stacks like Prometheus, Grafana, Jaeger, or Honeycomb.
## 1. Overview
Dify Enterprise uses a "slim span + rich companion log" architecture to provide high-fidelity observability without overwhelming trace storage.
- **Traces (Spans)**: Capture the structure, identity, and timing of high-level operations (Workflows and Nodes).
- **Structured Logs**: Provide deep context (inputs, outputs, metadata) for every event, correlated to spans via `trace_id` and `span_id`.
- **Metrics**: Provide 100% accurate counters and histograms for usage, performance, and error tracking.
### Signal Architecture
```mermaid
graph TD
A[Workflow Run] -->|Span| B(dify.workflow.run)
A -->|Log| C(dify.workflow.run detail)
B ---|trace_id| C
D[Node Execution] -->|Span| E(dify.node.execution)
D -->|Log| F(dify.node.execution detail)
E ---|span_id| F
G[Message/Tool/etc] -->|Log| H(dify.* event)
G -->|Metric| I(dify.* counter/histogram)
```
## 2. Configuration
The Enterprise OTEL exporter is configured via environment variables.
| Variable | Description | Default |
|----------|-------------|---------|
| `ENTERPRISE_ENABLED` | Master switch for all enterprise features. | `false` |
| `ENTERPRISE_TELEMETRY_ENABLED` | Master switch for enterprise telemetry. | `false` |
| `ENTERPRISE_OTLP_ENDPOINT` | OTLP collector endpoint (e.g., `http://otel-collector:4318`). | - |
| `ENTERPRISE_OTLP_HEADERS` | Custom headers for OTLP requests (e.g., `x-scope-orgid=tenant1`). | - |
| `ENTERPRISE_OTLP_PROTOCOL` | OTLP transport protocol (`http` or `grpc`). | `http` |
| `ENTERPRISE_OTLP_API_KEY` | Bearer token for authentication. | - |
| `ENTERPRISE_INCLUDE_CONTENT` | Whether to include sensitive content (inputs/outputs) in logs. | `true` |
| `ENTERPRISE_SERVICE_NAME` | Service name reported to OTEL. | `dify` |
| `ENTERPRISE_OTEL_SAMPLING_RATE` | Sampling rate for traces (0.0 to 1.0). Metrics are always 100%. | `1.0` |
## 3. Resource Attributes
These attributes are attached to every signal (Span, Metric, Log) emitted by the exporter.
| Attribute | Description | Example |
|-----------|-------------|---------|
| `service.name` | The name of the service. | `dify` |
| `host.name` | The hostname of the container/machine. | `dify-api-7f8b` |
## 4. Traces (Spans)
Dify emits spans for long-running or structural operations.
### 4.1 `dify.workflow.run`
Represents the full execution of a workflow.
| Attribute | Type | Description |
|-----------|------|-------------|
| `dify.trace_id` | string | The business trace ID (Workflow Run ID). |
| `dify.tenant_id` | string | Tenant identifier. |
| `dify.app_id` | string | Application identifier. |
| `dify.workflow.id` | string | Workflow definition ID. |
| `dify.workflow.run_id` | string | Unique ID for this specific run. |
| `dify.workflow.status` | string | Final status (`succeeded`, `failed`, `stopped`, etc.). |
| `dify.workflow.error` | string | Error message if the run failed. |
| `dify.workflow.elapsed_time` | float | Total execution time in seconds. |
| `dify.invoke_from` | string | Source of the trigger (`api`, `webapp`, `debug`). |
| `dify.conversation.id` | string | Conversation ID (if applicable). |
| `dify.message.id` | string | Message ID (if applicable). |
| `dify.invoked_by` | string | User ID who triggered the run. |
| `dify.parent.trace_id` | string | (Optional) Trace ID of the parent workflow. |
| `dify.parent.workflow.run_id` | string | (Optional) Run ID of the parent workflow. |
| `dify.parent.node.execution_id` | string | (Optional) Execution ID of the parent node. |
| `dify.parent.app.id` | string | (Optional) App ID of the parent workflow. |
### 4.2 `dify.node.execution`
Represents the execution of a single node within a workflow.
| Attribute | Type | Description |
|-----------|------|-------------|
| `dify.trace_id` | string | The business trace ID. |
| `dify.tenant_id` | string | Tenant identifier. |
| `dify.app_id` | string | Application identifier. |
| `dify.workflow.id` | string | Workflow definition ID. |
| `dify.workflow.run_id` | string | Workflow Run ID this node belongs to. |
| `dify.message.id` | string | Message ID (if applicable). |
| `dify.conversation.id` | string | Conversation ID (if applicable). |
| `dify.node.execution_id` | string | Unique ID for this node execution. |
| `dify.node.id` | string | Node ID in the workflow graph. |
| `dify.node.type` | string | Type of node (`llm`, `knowledge-retrieval`, `tool`, etc.). |
| `dify.node.title` | string | Display title of the node. |
| `dify.node.status` | string | Execution status (`succeeded`, `failed`). |
| `dify.node.error` | string | Error message if the node failed. |
| `dify.node.elapsed_time` | float | Execution time in seconds. |
| `dify.node.index` | int | Execution order index. |
| `dify.node.predecessor_node_id` | string | ID of the node that triggered this one. |
| `dify.node.iteration_id` | string | (Optional) ID of the iteration this node belongs to. |
| `dify.node.loop_id` | string | (Optional) ID of the loop this node belongs to. |
| `dify.node.parallel_id` | string | (Optional) ID of the parallel branch. |
| `dify.node.invoked_by` | string | User ID who triggered the execution. |
### 4.3 `dify.node.execution.draft`
Identical to `dify.node.execution`, but emitted during "Preview" or "Debug" runs of a single node.
## 5. Correlation Model
Dify uses deterministic ID generation to ensure signals are correlated across different services and asynchronous tasks.
### ID Generation Rules
- `trace_id`: Derived from the correlation ID (workflow_run_id or node_execution_id for drafts) using `int(UUID(correlation_id))`
- `span_id`: Derived from the source ID using `SHA256(source_id)[:8]`
### Scenario A: Simple Workflow
A single workflow run with multiple nodes. All spans and logs share the same `trace_id` (derived from `workflow_run_id`).
```
trace_id = UUID(workflow_run_id)
├── [root span] dify.workflow.run (span_id = hash(workflow_run_id))
│ ├── [child] dify.node.execution - "Start" (span_id = hash(node_exec_id_1))
│ ├── [child] dify.node.execution - "LLM" (span_id = hash(node_exec_id_2))
│ └── [child] dify.node.execution - "End" (span_id = hash(node_exec_id_3))
```
### Scenario B: Nested Sub-Workflow
A workflow calling another workflow via a Tool or Sub-workflow node. The child workflow's spans are linked to the parent via `parent_span_id`. Both workflows share the same trace_id.
```
trace_id = UUID(outer_workflow_run_id) ← shared across both workflows
├── [root] dify.workflow.run (outer) (span_id = hash(outer_workflow_run_id))
│ ├── dify.node.execution - "Start Node"
│ ├── dify.node.execution - "Tool Node" (triggers sub-workflow)
│ │ └── [child] dify.workflow.run (inner) (span_id = hash(inner_workflow_run_id))
│ │ ├── dify.node.execution - "Inner Start"
│ │ └── dify.node.execution - "Inner End"
│ └── dify.node.execution - "End Node"
```
**Key attributes for nested workflows:**
- Inner workflow's `dify.parent.trace_id` = outer `workflow_run_id`
- Inner workflow's `dify.parent.node.execution_id` = tool node's `execution_id`
- Inner workflow's `dify.parent.workflow.run_id` = outer `workflow_run_id`
- Inner workflow's `dify.parent.app.id` = outer `app_id`
### Scenario C: Draft Node Execution
A single node run in isolation (debugger/preview mode). It creates its own trace where the node span is the root.
```
trace_id = UUID(node_execution_id) ← own trace, NOT part of any workflow
└── dify.node.execution.draft (span_id = hash(node_execution_id))
```
**Key difference:** Draft executions use `node_execution_id` as the correlation_id, so they are NOT children of any workflow trace.
## 6. Counters
All counters are cumulative and emitted at 100% accuracy.
### 6.1 Token Counters
⚠️ **Warning on Double-Counting**: `dify.tokens.total` at the workflow level includes all tokens from its nodes. To get the true total usage for a tenant, you **MUST** filter by `operation_type`.
| Metric | Unit | Description |
|--------|------|-------------|
| `dify.tokens.total` | `{token}` | Total tokens consumed. |
| `dify.tokens.input` | `{token}` | Input (prompt) tokens. |
| `dify.tokens.output` | `{token}` | Output (completion) tokens. |
**Labels for Token Counters:**
- `tenant_id`: Tenant identifier.
- `app_id`: Application identifier.
- `operation_type`: `workflow`, `node_execution`, `message`, `rule_generate`, `code_generate`, `structured_output`, `instruction_modify`.
- `model_provider`: LLM provider (e.g., `openai`).
- `model_name`: LLM model (e.g., `gpt-4`).
- `node_type`: Workflow node type (if `operation_type=node_execution`).
**PromQL Example (Total Input Tokens per Tenant):**
```promql
sum(dify_tokens_input_total{tenant_id="my-tenant", operation_type="workflow"})
```
### 6.2 Request Counters
| Metric | Unit | Description |
|--------|------|-------------|
| `dify.requests.total` | `{request}` | Total number of operations. |
**Labels vary by `type`:**
| `type` | Additional Labels |
|--------|-------------------|
| `workflow` | `tenant_id`, `app_id`, `status`, `invoke_from` |
| `node` | `tenant_id`, `app_id`, `node_type`, `model_provider`, `status` |
| `draft_node` | `tenant_id`, `app_id`, `node_type`, `model_provider`, `status` |
| `message` | `tenant_id`, `app_id`, `model_provider`, `model_name`, `status`, `invoke_from` |
| `tool` | `tenant_id`, `app_id`, `tool_name` |
| `moderation` | `tenant_id`, `app_id` |
| `suggested_question` | `tenant_id`, `app_id` |
| `dataset_retrieval` | `tenant_id`, `app_id` |
| `generate_name` | `tenant_id`, `app_id` |
| `prompt_generation` | `tenant_id`, `app_id`, `operation_type`, `model_provider`, `model_name`, `status` |
### 6.3 Error Counters
| Metric | Unit | Description |
|--------|------|-------------|
| `dify.errors.total` | `{error}` | Total number of failed operations. |
**Labels vary by `type`:**
| `type` | Additional Labels |
|--------|-------------------|
| `workflow` | `tenant_id`, `app_id` |
| `node` | `tenant_id`, `app_id`, `node_type`, `model_provider` |
| `draft_node` | `tenant_id`, `app_id`, `node_type`, `model_provider` |
| `message` | `tenant_id`, `app_id`, `model_provider`, `model_name` |
| `tool` | `tenant_id`, `app_id`, `tool_name` |
| `prompt_generation` | `tenant_id`, `app_id`, `operation_type`, `model_provider`, `model_name` |
### 6.4 Other Counters
| Metric | Unit | Labels |
|--------|------|--------|
| `dify.feedback.total` | `{feedback}` | `tenant_id`, `app_id`, `rating` |
| `dify.dataset.retrievals.total` | `{retrieval}` | `tenant_id`, `app_id`, `dataset_id` |
| `dify.app.created.total` | `{app}` | `tenant_id`, `app_id`, `mode` |
| `dify.app.updated.total` | `{app}` | `tenant_id`, `app_id` |
| `dify.app.deleted.total` | `{app}` | `tenant_id`, `app_id` |
## 7. Histograms
Histograms measure the distribution of durations and latencies.
| Metric | Unit | Labels |
|--------|------|--------|
| `dify.workflow.duration` | `s` | `tenant_id`, `app_id`, `status` |
| `dify.node.duration` | `s` | `tenant_id`, `app_id`, `node_type`, `model_provider`, `plugin_name` (if tool/knowledge) |
| `dify.message.duration` | `s` | `tenant_id`, `app_id`, `model_provider`, `model_name` |
| `dify.message.time_to_first_token` | `s` | `tenant_id`, `app_id`, `model_provider`, `model_name` |
| `dify.tool.duration` | `s` | `tenant_id`, `app_id`, `tool_name` |
| `dify.prompt_generation.duration` | `s` | `tenant_id`, `app_id`, `operation_type`, `model_provider`, `model_name` |
**PromQL Example (P95 Node Duration by Type):**
```promql
histogram_quantile(0.95, sum by (le, node_type) (rate(dify_node_duration_bucket[5m])))
```
## 8. Structured Logs
Dify emits 13 types of structured logs. Logs are categorized as "Span Companion" (accompanying a span) or "Standalone" (metric-only events).
### 8.1 Span Companion Logs
These logs contain the full payload for spans, providing rich detail beyond what is captured in the span attributes.
#### 8.1.1 `dify.workflow.run` Companion Log
**Signal Type:** `span_detail`
This log accompanies the `dify.workflow.run` span and includes ALL span attributes PLUS additional detail attributes.
**Additional Attributes (beyond span attributes):**
| Additional Attribute | Type | Always Present | Description |
|---------------------|------|----------------|-------------|
| `dify.user.id` | string | No | User identifier (when available) |
| `gen_ai.usage.total_tokens` | int | No | Total tokens consumed by this workflow (sum of all nodes) |
| `dify.workflow.version` | string | Yes | Workflow version identifier |
| `dify.workflow.inputs` | string/JSON | Yes | Workflow input parameters (content-gated) |
| `dify.workflow.outputs` | string/JSON | Yes | Workflow output results (content-gated) |
| `dify.workflow.query` | string | No | User query text (content-gated, for chat workflows) |
**Common Log Attributes:**
- `dify.event.name`: `"dify.workflow.run"`
- `dify.event.signal`: `"span_detail"`
- `trace_id`: Correlated OTEL trace ID (32-char hex)
- `span_id`: Correlated OTEL span ID (16-char hex)
- `tenant_id`: Tenant identifier
- `user_id`: User identifier (when available)
#### 8.1.2 `dify.node.execution` and `dify.node.execution.draft` Companion Logs
**Signal Type:** `span_detail`
These logs accompany node execution spans and include ALL span attributes PLUS additional detail attributes.
**Additional Attributes (beyond span attributes):**
| Additional Attribute | Type | Always Present | Description |
|---------------------|------|----------------|-------------|
| `dify.user.id` | string | No | User identifier (when available) |
| `gen_ai.provider.name` | string | No | LLM provider name (for LLM nodes only) |
| `gen_ai.request.model` | string | No | LLM model name (for LLM nodes only) |
| `gen_ai.usage.input_tokens` | int | No | Input tokens for this node (for LLM nodes only) |
| `gen_ai.usage.output_tokens` | int | No | Output tokens for this node (for LLM nodes only) |
| `gen_ai.usage.total_tokens` | int | No | Total tokens for this node (for LLM nodes only) |
| `dify.node.total_price` | float | No | Cost in currency units (for LLM nodes only) |
| `dify.node.currency` | string | No | Currency code (for LLM nodes only) |
| `dify.node.plugin_name` | string | No | Plugin name (for tool/knowledge-retrieval nodes only) |
| `dify.node.plugin_id` | string | No | Plugin identifier (for tool/knowledge-retrieval nodes only) |
| `dify.dataset.id` | string | No | Dataset identifier (for knowledge-retrieval nodes only) |
| `dify.dataset.name` | string | No | Dataset name (for knowledge-retrieval nodes only) |
| `dify.node.inputs` | string/JSON | Yes | Node input data (content-gated) |
| `dify.node.outputs` | string/JSON | Yes | Node output data (content-gated) |
| `dify.node.process_data` | string/JSON | No | Internal processing data (content-gated) |
**Common Log Attributes:**
- `dify.event.name`: `"dify.node.execution"` or `"dify.node.execution.draft"`
- `dify.event.signal`: `"span_detail"`
- `trace_id`: Correlated OTEL trace ID (32-char hex)
- `span_id`: Correlated OTEL span ID (16-char hex)
- `tenant_id`: Tenant identifier
- `user_id`: User identifier (when available)
### 8.2 Standalone Logs
These logs represent events that do not have a structural span. They are emitted as `metric_only` signals.
#### 8.2.1 `dify.message.run`
**Signal Type:** `metric_only`
Emitted for each message execution (LLM interaction).
**Attributes:**
- `dify.event.name`: `"dify.message.run"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID (32-char hex)
- `span_id`: OTEL span ID (16-char hex)
- `tenant_id`: Tenant identifier
- `user_id`: User identifier (when available)
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.conversation.id`: Conversation identifier (when available)
- `dify.workflow.run_id`: Workflow run identifier (when part of workflow)
- `dify.invoke_from`: Source of invocation (`"service-api"`, `"web-app"`, `"debugger"`, `"explore"`)
- `gen_ai.provider.name`: LLM provider name
- `gen_ai.request.model`: LLM model name
- `gen_ai.usage.input_tokens`: Input tokens consumed
- `gen_ai.usage.output_tokens`: Output tokens generated
- `gen_ai.usage.total_tokens`: Total tokens consumed
- `dify.message.status`: Execution status (`"succeeded"`, `"failed"`)
- `dify.message.error`: Error message (when status is `"failed"`)
- `dify.message.duration`: Execution duration in seconds
- `dify.message.time_to_first_token`: Time to first token in seconds
- `dify.message.inputs`: Message inputs (content-gated)
- `dify.message.outputs`: Message outputs (content-gated)
**Example:**
```json
{
"trace_id": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
"span_id": "a1b2c3d4e5f6g7h8",
"attributes": {
"dify.event.name": "dify.message.run",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"user_id": "660e8400-e29b-41d4-a716-446655440001",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.conversation.id": "990e8400-e29b-41d4-a716-446655440004",
"dify.invoke_from": "web-app",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-4",
"gen_ai.usage.input_tokens": 120,
"gen_ai.usage.output_tokens": 85,
"gen_ai.usage.total_tokens": 205,
"dify.message.status": "succeeded",
"dify.message.duration": 2.45,
"dify.message.time_to_first_token": 0.32,
"dify.message.inputs": "{\"query\": \"What is the weather?\"}",
"dify.message.outputs": "{\"answer\": \"The weather is sunny.\"}"
}
}
```
#### 8.2.2 `dify.tool.execution`
**Signal Type:** `metric_only`
Emitted for each tool invocation.
**Attributes:**
- `dify.event.name`: `"dify.tool.execution"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.tool.name`: Tool name
- `dify.tool.duration`: Execution duration in seconds
- `dify.tool.status`: Execution status (`"succeeded"`, `"failed"`)
- `dify.tool.error`: Error message (when failed)
- `dify.tool.inputs`: Tool inputs (content-gated)
- `dify.tool.outputs`: Tool outputs (content-gated)
- `dify.tool.parameters`: Tool parameters (content-gated)
- `dify.tool.config`: Tool configuration (content-gated)
**Example:**
```json
{
"trace_id": "b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7",
"span_id": "b2c3d4e5f6g7h8i9",
"attributes": {
"dify.event.name": "dify.tool.execution",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.tool.name": "weather_api",
"dify.tool.duration": 0.85,
"dify.tool.status": "succeeded",
"dify.tool.inputs": "{\"location\": \"San Francisco\"}",
"dify.tool.outputs": "{\"temperature\": 72, \"condition\": \"sunny\"}",
"dify.tool.parameters": "{\"api_key\": \"***\"}",
"dify.tool.config": "{\"timeout\": 30}"
}
}
```
#### 8.2.3 `dify.moderation.check`
**Signal Type:** `metric_only`
Emitted for content moderation checks.
**Attributes:**
- `dify.event.name`: `"dify.moderation.check"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.moderation.type`: Moderation type (`"input"`, `"output"`)
- `dify.moderation.action`: Action taken (`"pass"`, `"block"`, `"flag"`)
- `dify.moderation.flagged`: Whether content was flagged (boolean)
- `dify.moderation.categories`: Flagged categories (JSON array)
- `dify.moderation.query`: Content being moderated (content-gated)
**Example:**
```json
{
"trace_id": "c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8",
"span_id": "c3d4e5f6g7h8i9j0",
"attributes": {
"dify.event.name": "dify.moderation.check",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.moderation.type": "input",
"dify.moderation.action": "pass",
"dify.moderation.flagged": false,
"dify.moderation.categories": "[]",
"dify.moderation.query": "What is the weather?"
}
}
```
#### 8.2.4 `dify.suggested_question.generation`
**Signal Type:** `metric_only`
Emitted when suggested questions are generated.
**Attributes:**
- `dify.event.name`: `"dify.suggested_question.generation"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.suggested_question.count`: Number of questions generated
- `dify.suggested_question.duration`: Generation duration in seconds
- `dify.suggested_question.status`: Generation status (`"succeeded"`, `"failed"`)
- `dify.suggested_question.error`: Error message (when failed)
- `dify.suggested_question.questions`: Generated questions (content-gated, JSON array)
**Example:**
```json
{
"trace_id": "d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9",
"span_id": "d4e5f6g7h8i9j0k1",
"attributes": {
"dify.event.name": "dify.suggested_question.generation",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.suggested_question.count": 3,
"dify.suggested_question.duration": 1.2,
"dify.suggested_question.status": "succeeded",
"dify.suggested_question.questions": "[\"What about tomorrow?\", \"How about next week?\", \"Is it raining?\"]"
}
}
```
#### 8.2.5 `dify.dataset.retrieval`
**Signal Type:** `metric_only`
Emitted for knowledge base retrieval operations.
**Attributes:**
- `dify.event.name`: `"dify.dataset.retrieval"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.dataset.id`: Dataset identifier
- `dify.dataset.name`: Dataset name
- `dify.retrieval.query`: Search query (content-gated)
- `dify.retrieval.document_count`: Number of documents retrieved
- `dify.retrieval.duration`: Retrieval duration in seconds
- `dify.retrieval.status`: Retrieval status (`"succeeded"`, `"failed"`)
- `dify.retrieval.error`: Error message (when failed)
- `dify.dataset.documents`: Retrieved documents (content-gated, JSON array)
**Example:**
```json
{
"trace_id": "e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0",
"span_id": "e5f6g7h8i9j0k1l2",
"attributes": {
"dify.event.name": "dify.dataset.retrieval",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.dataset.id": "aa0e8400-e29b-41d4-a716-446655440005",
"dify.dataset.name": "Product Documentation",
"dify.retrieval.query": "installation guide",
"dify.retrieval.document_count": 5,
"dify.retrieval.duration": 0.45,
"dify.retrieval.status": "succeeded",
"dify.dataset.documents": "[{\"id\": \"doc1\", \"score\": 0.95}, {\"id\": \"doc2\", \"score\": 0.87}]"
}
}
```
#### 8.2.6 `dify.generate_name.execution`
**Signal Type:** `metric_only`
Emitted when conversation names are auto-generated.
**Attributes:**
- `dify.event.name`: `"dify.generate_name.execution"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.conversation.id`: Conversation identifier
- `dify.generate_name.duration`: Generation duration in seconds
- `dify.generate_name.status`: Generation status (`"succeeded"`, `"failed"`)
- `dify.generate_name.error`: Error message (when failed)
- `dify.generate_name.inputs`: Generation inputs (content-gated)
- `dify.generate_name.outputs`: Generated name (content-gated)
**Example:**
```json
{
"trace_id": "f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1",
"span_id": "f6g7h8i9j0k1l2m3",
"attributes": {
"dify.event.name": "dify.generate_name.execution",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.conversation.id": "990e8400-e29b-41d4-a716-446655440004",
"dify.generate_name.duration": 0.75,
"dify.generate_name.status": "succeeded",
"dify.generate_name.inputs": "{\"first_message\": \"What is the weather?\"}",
"dify.generate_name.outputs": "Weather Inquiry"
}
}
```
#### 8.2.7 `dify.prompt_generation.execution`
**Signal Type:** `metric_only`
Emitted for prompt engineering operations.
**Attributes:**
- `dify.event.name`: `"dify.prompt_generation.execution"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.prompt_generation.operation_type`: Operation type (`"rule_generate"`, `"code_generate"`, `"structured_output"`, `"instruction_modify"`)
- `gen_ai.provider.name`: LLM provider name
- `gen_ai.request.model`: LLM model name
- `gen_ai.usage.input_tokens`: Input tokens consumed
- `gen_ai.usage.output_tokens`: Output tokens generated
- `gen_ai.usage.total_tokens`: Total tokens consumed
- `dify.prompt_generation.duration`: Generation duration in seconds
- `dify.prompt_generation.status`: Generation status (`"succeeded"`, `"failed"`)
- `dify.prompt_generation.error`: Error message (when failed)
- `dify.prompt_generation.instruction`: Prompt instruction (content-gated)
- `dify.prompt_generation.output`: Generated output (content-gated)
**Example:**
```json
{
"trace_id": "g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2",
"span_id": "g7h8i9j0k1l2m3n4",
"attributes": {
"dify.event.name": "dify.prompt_generation.execution",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.prompt_generation.operation_type": "rule_generate",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-4",
"gen_ai.usage.input_tokens": 50,
"gen_ai.usage.output_tokens": 30,
"gen_ai.usage.total_tokens": 80,
"dify.prompt_generation.duration": 1.1,
"dify.prompt_generation.status": "succeeded",
"dify.prompt_generation.instruction": "Generate validation rules",
"dify.prompt_generation.output": "{\"rules\": [\"check_length\", \"validate_format\"]}"
}
}
```
#### 8.2.8 `dify.app.created`
**Signal Type:** `metric_only`
Emitted when an application is created.
**Attributes:**
- `dify.event.name`: `"dify.app.created"`
- `dify.event.signal`: `"metric_only"`
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.app.mode`: Application mode (`"chat"`, `"completion"`, `"agent-chat"`, `"workflow"`)
- `dify.app.created_at`: Creation timestamp (ISO 8601)
**Example:**
```json
{
"attributes": {
"dify.event.name": "dify.app.created",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.app.mode": "workflow",
"dify.app.created_at": "2026-02-10T19:30:00Z"
}
}
```
#### 8.2.9 `dify.app.updated`
**Signal Type:** `metric_only`
Emitted when an application is updated.
**Attributes:**
- `dify.event.name`: `"dify.app.updated"`
- `dify.event.signal`: `"metric_only"`
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.app.updated_at`: Update timestamp (ISO 8601)
**Example:**
```json
{
"attributes": {
"dify.event.name": "dify.app.updated",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.app.updated_at": "2026-02-10T20:15:00Z"
}
}
```
#### 8.2.10 `dify.app.deleted`
**Signal Type:** `metric_only`
Emitted when an application is deleted.
**Attributes:**
- `dify.event.name`: `"dify.app.deleted"`
- `dify.event.signal`: `"metric_only"`
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.app.deleted_at`: Deletion timestamp (ISO 8601)
**Example:**
```json
{
"attributes": {
"dify.event.name": "dify.app.deleted",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.app.deleted_at": "2026-02-10T21:00:00Z"
}
}
```
#### 8.2.11 `dify.feedback.created`
**Signal Type:** `metric_only`
Emitted when user feedback is submitted.
**Attributes:**
- `dify.event.name`: `"dify.feedback.created"`
- `dify.event.signal`: `"metric_only"`
- `trace_id`: OTEL trace ID
- `span_id`: OTEL span ID
- `tenant_id`: Tenant identifier
- `dify.app_id`: Application identifier
- `dify.message.id`: Message identifier
- `dify.feedback.rating`: Rating (`"like"`, `"dislike"`, `null`)
- `dify.feedback.content`: Feedback text (content-gated, omitted when gating enabled)
- `dify.feedback.created_at`: Feedback timestamp (ISO 8601)
**Example:**
```json
{
"trace_id": "h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3",
"span_id": "h8i9j0k1l2m3n4o5",
"attributes": {
"dify.event.name": "dify.feedback.created",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.app_id": "770e8400-e29b-41d4-a716-446655440002",
"dify.message.id": "880e8400-e29b-41d4-a716-446655440003",
"dify.feedback.rating": "like",
"dify.feedback.content": "Very helpful response!",
"dify.feedback.created_at": "2026-02-10T19:45:00Z"
}
}
```
#### 8.2.12 `dify.telemetry.rehydration_failed`
**Signal Type:** `metric_only` (Diagnostic)
Emitted when telemetry payload rehydration fails. This is a diagnostic event for monitoring telemetry system health.
**Attributes:**
- `dify.event.name`: `"dify.telemetry.rehydration_failed"`
- `dify.event.signal`: `"metric_only"`
- `tenant_id`: Tenant identifier
- `dify.telemetry.error`: Error message
- `dify.telemetry.payload_type`: Type of payload that failed (`"workflow"`, `"node"`, `"message"`, etc.)
- `dify.telemetry.correlation_id`: Correlation ID of the failed payload
**Example:**
```json
{
"attributes": {
"dify.event.name": "dify.telemetry.rehydration_failed",
"dify.event.signal": "metric_only",
"tenant_id": "550e8400-e29b-41d4-a716-446655440000",
"dify.telemetry.error": "Workflow run not found in database",
"dify.telemetry.payload_type": "workflow",
"dify.telemetry.correlation_id": "bb0e8400-e29b-41d4-a716-446655440006"
}
}
```
## 9. Content Attributes
When `ENTERPRISE_INCLUDE_CONTENT` is set to `false`, the following attributes are replaced with a reference string (e.g., `ref:workflow_run_id=...`) to prevent sensitive data leakage to the OTEL collector.
| Attribute | Signal |
|-----------|--------|
| `dify.workflow.inputs` | `dify.workflow.run` |
| `dify.workflow.outputs` | `dify.workflow.run` |
| `dify.workflow.query` | `dify.workflow.run` |
| `dify.node.inputs` | `dify.node.execution` |
| `dify.node.outputs` | `dify.node.execution` |
| `dify.node.process_data` | `dify.node.execution` |
| `dify.message.inputs` | `dify.message.run` |
| `dify.message.outputs` | `dify.message.run` |
| `dify.tool.inputs` | `dify.tool.execution` |
| `dify.tool.outputs` | `dify.tool.execution` |
| `dify.tool.parameters` | `dify.tool.execution` |
| `dify.tool.config` | `dify.tool.execution` |
| `dify.moderation.query` | `dify.moderation.check` |
| `dify.suggested_question.questions` | `dify.suggested_question.generation` |
| `dify.retrieval.query` | `dify.dataset.retrieval` |
| `dify.dataset.documents` | `dify.dataset.retrieval` |
| `dify.generate_name.inputs` | `dify.generate_name.execution` |
| `dify.generate_name.outputs` | `dify.generate_name.execution` |
| `dify.prompt_generation.instruction` | `dify.prompt_generation.execution` |
| `dify.prompt_generation.output` | `dify.prompt_generation.execution` |
| `dify.feedback.content` | `dify.feedback.created` |
### Content Gating Behavior
When `ENTERPRISE_INCLUDE_CONTENT=true` (default), content attributes contain the actual data:
```json
{
"dify.workflow.inputs": "{\"query\": \"What is the weather?\", \"location\": \"San Francisco\"}",
"dify.workflow.outputs": "{\"answer\": \"The weather in San Francisco is sunny, 72°F.\"}",
"dify.workflow.query": "What is the weather?"
}
```
When `ENTERPRISE_INCLUDE_CONTENT=false`, content attributes are replaced with reference strings:
```json
{
"dify.workflow.inputs": "ref:workflow_run_id=550e8400-e29b-41d4-a716-446655440000",
"dify.workflow.outputs": "ref:workflow_run_id=550e8400-e29b-41d4-a716-446655440000",
"dify.workflow.query": "ref:workflow_run_id=550e8400-e29b-41d4-a716-446655440000"
}
```
### Reference String Format
Reference strings follow the pattern `ref:{id_type}={uuid}`, where:
- `{id_type}` identifies the entity type
- `{uuid}` is the entity's unique identifier
**Examples:**
```
ref:workflow_run_id=550e8400-e29b-41d4-a716-446655440000
ref:node_execution_id=660e8400-e29b-41d4-a716-446655440001
ref:message_id=770e8400-e29b-41d4-a716-446655440002
ref:conversation_id=880e8400-e29b-41d4-a716-446655440003
ref:trace_id=990e8400-e29b-41d4-a716-446655440004
```
**Usage:** To retrieve the actual content, query your Dify database using the provided UUID. For example, `ref:workflow_run_id=550e8400-e29b-41d4-a716-446655440000` means you can fetch the workflow run record from the `workflow_runs` table using that ID.
**Special Case:** The `dify.feedback.content` attribute is **omitted entirely** (not replaced with a reference string) when content gating is enabled, as feedback text is not stored with a retrievable ID.
## 10. Appendix
### Operation Types (`operation_type`)
- `workflow`
- `node_execution`
- `message`
- `rule_generate`
- `code_generate`
- `structured_output`
- `instruction_modify`
### Node Types (`node_type`)
- `start`, `end`, `answer`, `llm`, `knowledge-retrieval`, `knowledge-index`, `if-else`, `code`, `template-transform`, `question-classifier`, `http-request`, `tool`, `datasource`, `variable-aggregator`, `loop`, `iteration`, `parameter-extractor`, `assigner`, `document-extractor`, `list-operator`, `agent`, `trigger-webhook`, `trigger-schedule`, `trigger-plugin`, `human-input`.
### Workflow Statuses
- `running`, `succeeded`, `failed`, `stopped`, `partial-succeeded`, `paused`.
### Null Value Behavior
The handling of null/missing values differs between spans (traces) and structured logs:
**In Spans (Traces):**
Attributes with `None`/`null` values are **omitted** from the exported span data. This follows OpenTelemetry conventions for efficient trace storage.
Example - a workflow span without optional parent attributes:
```python
# These attributes are NOT included in the span:
# - dify.parent.trace_id (null)
# - dify.parent.workflow.run_id (null)
# - dify.parent.node.execution_id (null)
# - dify.parent.app.id (null)
```
**In Structured Logs:**
Attributes with `null` values appear as `null` in the JSON payload.
Example:
```json
{
"attributes": {
"dify.workflow.error": null,
"dify.conversation.id": null
}
}
```
**Conditional Attributes:**
Many attributes are only present under specific conditions:
- **LLM-related attributes** (`gen_ai.provider.name`, `gen_ai.usage.total_tokens`, etc.) only appear for LLM nodes
- **Plugin attributes** (`dify.node.plugin_name`, `dify.node.plugin_id`) only appear for tool/knowledge-retrieval nodes
- **Dataset attributes** (`dify.dataset.id`, `dify.dataset.name`) only appear for knowledge-retrieval nodes
- **Error attributes** (`dify.workflow.error`, `dify.node.error`) only appear when status is `"failed"`
- **Parent attributes** (`dify.parent.*`) only appear for nested/sub-workflow executions
**Content-Gated Attributes:**
When `ENTERPRISE_INCLUDE_CONTENT=false`, content attributes are replaced with reference strings (not set to `null`). See Section 9 for details.

View File

View File

@@ -0,0 +1,83 @@
"""Telemetry gateway contracts and data structures.
This module defines the envelope format for telemetry events and the routing
configuration that determines how each event type is processed.
"""
from __future__ import annotations
from enum import StrEnum
from typing import Any
from pydantic import BaseModel, field_validator
class TelemetryCase(StrEnum):
"""Enumeration of all known telemetry event cases."""
WORKFLOW_RUN = "workflow_run"
NODE_EXECUTION = "node_execution"
DRAFT_NODE_EXECUTION = "draft_node_execution"
MESSAGE_RUN = "message_run"
TOOL_EXECUTION = "tool_execution"
MODERATION_CHECK = "moderation_check"
SUGGESTED_QUESTION = "suggested_question"
DATASET_RETRIEVAL = "dataset_retrieval"
GENERATE_NAME = "generate_name"
PROMPT_GENERATION = "prompt_generation"
APP_CREATED = "app_created"
APP_UPDATED = "app_updated"
APP_DELETED = "app_deleted"
FEEDBACK_CREATED = "feedback_created"
class SignalType(StrEnum):
"""Signal routing type for telemetry cases."""
TRACE = "trace"
METRIC_LOG = "metric_log"
class CaseRoute(BaseModel):
"""Routing configuration for a telemetry case.
Attributes:
signal_type: The type of signal (trace or metric_log).
ce_eligible: Whether this case is eligible for community edition tracing.
"""
signal_type: SignalType
ce_eligible: bool
class TelemetryEnvelope(BaseModel):
"""Envelope for telemetry events.
Attributes:
case: The telemetry case type.
tenant_id: The tenant identifier.
event_id: Unique event identifier for deduplication.
payload: The main event payload.
payload_fallback: Fallback payload (max 64KB).
metadata: Optional metadata dictionary.
"""
case: TelemetryCase
tenant_id: str
event_id: str
payload: dict[str, Any]
payload_fallback: bytes | None = None
metadata: dict[str, Any] | None = None
@field_validator("payload_fallback")
@classmethod
def validate_payload_fallback_size(cls, v: bytes | None) -> bytes | None:
"""Validate that payload_fallback does not exceed 64KB."""
if v is not None and len(v) > 65536: # 64 * 1024
raise ValueError("payload_fallback must not exceed 64KB")
return v
class Config:
"""Pydantic configuration."""
use_enum_values = False

View File

@@ -0,0 +1,77 @@
from __future__ import annotations
from collections.abc import Mapping
from typing import Any
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from core.workflow.enums import WorkflowNodeExecutionMetadataKey
from models.workflow import WorkflowNodeExecutionModel
def enqueue_draft_node_execution_trace(
*,
execution: WorkflowNodeExecutionModel,
outputs: Mapping[str, Any] | None,
workflow_execution_id: str | None,
user_id: str,
) -> None:
node_data = _build_node_execution_data(
execution=execution,
outputs=outputs,
workflow_execution_id=workflow_execution_id,
)
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id=execution.tenant_id,
user_id=user_id,
app_id=execution.app_id,
),
payload={"node_execution_data": node_data},
)
)
def _build_node_execution_data(
*,
execution: WorkflowNodeExecutionModel,
outputs: Mapping[str, Any] | None,
workflow_execution_id: str | None,
) -> dict[str, Any]:
metadata = execution.execution_metadata_dict
node_outputs = outputs if outputs is not None else execution.outputs_dict
execution_id = workflow_execution_id or execution.workflow_run_id or execution.id
return {
"workflow_id": execution.workflow_id,
"workflow_execution_id": execution_id,
"tenant_id": execution.tenant_id,
"app_id": execution.app_id,
"node_execution_id": execution.id,
"node_id": execution.node_id,
"node_type": execution.node_type,
"title": execution.title,
"status": execution.status,
"error": execution.error,
"elapsed_time": execution.elapsed_time,
"index": execution.index,
"predecessor_node_id": execution.predecessor_node_id,
"created_at": execution.created_at,
"finished_at": execution.finished_at,
"total_tokens": metadata.get(WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS, 0),
"total_price": metadata.get(WorkflowNodeExecutionMetadataKey.TOTAL_PRICE, 0.0),
"currency": metadata.get(WorkflowNodeExecutionMetadataKey.CURRENCY),
"tool_name": (metadata.get(WorkflowNodeExecutionMetadataKey.TOOL_INFO) or {}).get("tool_name")
if isinstance(metadata.get(WorkflowNodeExecutionMetadataKey.TOOL_INFO), dict)
else None,
"iteration_id": metadata.get(WorkflowNodeExecutionMetadataKey.ITERATION_ID),
"iteration_index": metadata.get(WorkflowNodeExecutionMetadataKey.ITERATION_INDEX),
"loop_id": metadata.get(WorkflowNodeExecutionMetadataKey.LOOP_ID),
"loop_index": metadata.get(WorkflowNodeExecutionMetadataKey.LOOP_INDEX),
"parallel_id": metadata.get(WorkflowNodeExecutionMetadataKey.PARALLEL_ID),
"node_inputs": execution.inputs_dict,
"node_outputs": node_outputs,
"process_data": execution.process_data_dict,
}

View File

@@ -0,0 +1,904 @@
"""Enterprise trace handler — duck-typed, NOT a BaseTraceInstance subclass.
Invoked directly in the Celery task, not through OpsTraceManager dispatch.
Only requires a matching ``trace(trace_info)`` method signature.
Signal strategy:
- **Traces (spans)**: workflow run, node execution, draft node execution only.
- **Metrics + structured logs**: all other event types.
Token metric labels (unified structure):
All token metrics (dify.tokens.input, dify.tokens.output, dify.tokens.total) use the
same label set for consistent filtering and aggregation:
- tenant_id: Tenant identifier
- app_id: Application identifier
- operation_type: Source of token usage (workflow | node_execution | message | rule_generate | etc.)
- model_provider: LLM provider name (empty string if not applicable)
- model_name: LLM model name (empty string if not applicable)
- node_type: Workflow node type (empty string if not node_execution)
This unified structure allows filtering by operation_type to separate:
- Workflow-level aggregates (operation_type=workflow)
- Individual node executions (operation_type=node_execution)
- Direct message calls (operation_type=message)
- Prompt generation operations (operation_type=rule_generate, code_generate, etc.)
Without this, tokens are double-counted when querying totals (workflow totals include
node totals, since workflow.total_tokens is the sum of all node tokens).
"""
from __future__ import annotations
import json
import logging
from typing import Any, cast
from opentelemetry.util.types import AttributeValue
from core.ops.entities.trace_entity import (
BaseTraceInfo,
DatasetRetrievalTraceInfo,
DraftNodeExecutionTrace,
GenerateNameTraceInfo,
MessageTraceInfo,
ModerationTraceInfo,
OperationType,
PromptGenerationTraceInfo,
SuggestedQuestionTraceInfo,
ToolTraceInfo,
WorkflowNodeTraceInfo,
WorkflowTraceInfo,
)
from enterprise.telemetry.entities import (
EnterpriseTelemetryCounter,
EnterpriseTelemetryEvent,
EnterpriseTelemetryHistogram,
EnterpriseTelemetrySpan,
TokenMetricLabels,
)
from enterprise.telemetry.telemetry_log import emit_metric_only_event, emit_telemetry_log
logger = logging.getLogger(__name__)
class EnterpriseOtelTrace:
"""Duck-typed enterprise trace handler.
``*_trace`` methods emit spans (workflow/node only) or structured logs
(all other events), plus metrics at 100 % accuracy.
"""
def __init__(self) -> None:
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if exporter is None:
raise RuntimeError("EnterpriseOtelTrace instantiated but exporter is not initialized")
self._exporter = exporter
def trace(self, trace_info: BaseTraceInfo) -> None:
if isinstance(trace_info, WorkflowTraceInfo):
self._workflow_trace(trace_info)
elif isinstance(trace_info, MessageTraceInfo):
self._message_trace(trace_info)
elif isinstance(trace_info, ToolTraceInfo):
self._tool_trace(trace_info)
elif isinstance(trace_info, DraftNodeExecutionTrace):
self._draft_node_execution_trace(trace_info)
elif isinstance(trace_info, WorkflowNodeTraceInfo):
self._node_execution_trace(trace_info)
elif isinstance(trace_info, ModerationTraceInfo):
self._moderation_trace(trace_info)
elif isinstance(trace_info, SuggestedQuestionTraceInfo):
self._suggested_question_trace(trace_info)
elif isinstance(trace_info, DatasetRetrievalTraceInfo):
self._dataset_retrieval_trace(trace_info)
elif isinstance(trace_info, GenerateNameTraceInfo):
self._generate_name_trace(trace_info)
elif isinstance(trace_info, PromptGenerationTraceInfo):
self._prompt_generation_trace(trace_info)
def _common_attrs(self, trace_info: BaseTraceInfo) -> dict[str, Any]:
metadata = self._metadata(trace_info)
tenant_id, app_id, user_id = self._context_ids(trace_info, metadata)
return {
"dify.trace_id": trace_info.trace_id,
"dify.tenant_id": tenant_id,
"dify.app_id": app_id,
"dify.app.name": metadata.get("app_name"),
"dify.workspace.name": metadata.get("workspace_name"),
"gen_ai.user.id": user_id,
"dify.message.id": trace_info.message_id,
}
def _metadata(self, trace_info: BaseTraceInfo) -> dict[str, Any]:
return trace_info.metadata
def _context_ids(
self,
trace_info: BaseTraceInfo,
metadata: dict[str, Any],
) -> tuple[str | None, str | None, str | None]:
tenant_id = getattr(trace_info, "tenant_id", None) or metadata.get("tenant_id")
app_id = getattr(trace_info, "app_id", None) or metadata.get("app_id")
user_id = getattr(trace_info, "user_id", None) or metadata.get("user_id")
return tenant_id, app_id, user_id
def _labels(self, **values: AttributeValue) -> dict[str, AttributeValue]:
return dict(values)
def _safe_payload_value(self, value: Any) -> str | dict[str, Any] | list[object] | None:
if isinstance(value, str):
return value
if isinstance(value, dict):
return cast(dict[str, Any], value)
if isinstance(value, list):
items: list[object] = []
for item in cast(list[object], value):
items.append(item)
return items
return None
def _content_or_ref(self, value: Any, ref: str) -> Any:
if self._exporter.include_content:
return self._maybe_json(value)
return ref
def _maybe_json(self, value: Any) -> str | None:
if value is None:
return None
if isinstance(value, str):
return value
try:
return json.dumps(value, default=str)
except (TypeError, ValueError):
return str(value)
# ------------------------------------------------------------------
# SPAN-emitting handlers (workflow, node execution, draft node)
# ------------------------------------------------------------------
def _workflow_trace(self, info: WorkflowTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
# -- Slim span attrs: identity + structure + status + timing only --
span_attrs: dict[str, Any] = {
"dify.trace_id": info.trace_id,
"dify.tenant_id": tenant_id,
"dify.app_id": app_id,
"dify.workflow.id": info.workflow_id,
"dify.workflow.run_id": info.workflow_run_id,
"dify.workflow.status": info.workflow_run_status,
"dify.workflow.error": info.error,
"dify.workflow.elapsed_time": info.workflow_run_elapsed_time,
"dify.invoke_from": metadata.get("triggered_from"),
"dify.conversation.id": info.conversation_id,
"dify.message.id": info.message_id,
"dify.invoked_by": info.invoked_by,
}
trace_correlation_override: str | None = None
parent_span_id_source: str | None = None
parent_ctx = metadata.get("parent_trace_context")
if isinstance(parent_ctx, dict):
parent_ctx_dict = cast(dict[str, Any], parent_ctx)
span_attrs["dify.parent.trace_id"] = parent_ctx_dict.get("trace_id")
span_attrs["dify.parent.node.execution_id"] = parent_ctx_dict.get("parent_node_execution_id")
span_attrs["dify.parent.workflow.run_id"] = parent_ctx_dict.get("parent_workflow_run_id")
span_attrs["dify.parent.app.id"] = parent_ctx_dict.get("parent_app_id")
trace_override_value = parent_ctx_dict.get("parent_workflow_run_id")
if isinstance(trace_override_value, str):
trace_correlation_override = trace_override_value
parent_span_value = parent_ctx_dict.get("parent_node_execution_id")
if isinstance(parent_span_value, str):
parent_span_id_source = parent_span_value
self._exporter.export_span(
EnterpriseTelemetrySpan.WORKFLOW_RUN,
span_attrs,
correlation_id=info.workflow_run_id,
span_id_source=info.workflow_run_id,
start_time=info.start_time,
end_time=info.end_time,
trace_correlation_override=trace_correlation_override,
parent_span_id_source=parent_span_id_source,
)
# -- Companion log: ALL attrs (span + detail) for full picture --
log_attrs: dict[str, Any] = {**span_attrs}
log_attrs.update(
{
"dify.app.name": metadata.get("app_name"),
"dify.workspace.name": metadata.get("workspace_name"),
"gen_ai.user.id": user_id,
"gen_ai.usage.total_tokens": info.total_tokens,
"dify.workflow.version": info.workflow_run_version,
}
)
ref = f"ref:workflow_run_id={info.workflow_run_id}"
log_attrs["dify.workflow.inputs"] = self._content_or_ref(info.workflow_run_inputs, ref)
log_attrs["dify.workflow.outputs"] = self._content_or_ref(info.workflow_run_outputs, ref)
log_attrs["dify.workflow.query"] = self._content_or_ref(info.query, ref)
emit_telemetry_log(
event_name=EnterpriseTelemetryEvent.WORKFLOW_RUN,
attributes=log_attrs,
signal="span_detail",
trace_id_source=info.workflow_run_id,
span_id_source=info.workflow_run_id,
tenant_id=tenant_id,
user_id=user_id,
)
# -- Metrics --
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
)
token_labels = TokenMetricLabels(
tenant_id=tenant_id or "",
app_id=app_id or "",
operation_type=OperationType.WORKFLOW,
model_provider="",
model_name="",
node_type="",
).to_dict()
self._exporter.increment_counter(EnterpriseTelemetryCounter.TOKENS, info.total_tokens, token_labels)
if info.prompt_tokens is not None and info.prompt_tokens > 0:
self._exporter.increment_counter(EnterpriseTelemetryCounter.INPUT_TOKENS, info.prompt_tokens, token_labels)
if info.completion_tokens is not None and info.completion_tokens > 0:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.OUTPUT_TOKENS, info.completion_tokens, token_labels
)
invoke_from = metadata.get("triggered_from", "")
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="workflow",
status=info.workflow_run_status,
invoke_from=invoke_from,
),
)
self._exporter.record_histogram(
EnterpriseTelemetryHistogram.WORKFLOW_DURATION,
float(info.workflow_run_elapsed_time),
self._labels(
**labels,
status=info.workflow_run_status,
),
)
if info.error:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.ERRORS,
1,
self._labels(
**labels,
type="workflow",
),
)
def _node_execution_trace(self, info: WorkflowNodeTraceInfo) -> None:
self._emit_node_execution_trace(info, EnterpriseTelemetrySpan.NODE_EXECUTION, "node")
def _draft_node_execution_trace(self, info: DraftNodeExecutionTrace) -> None:
self._emit_node_execution_trace(
info,
EnterpriseTelemetrySpan.DRAFT_NODE_EXECUTION,
"draft_node",
correlation_id_override=info.node_execution_id,
trace_correlation_override_param=info.workflow_run_id,
)
def _emit_node_execution_trace(
self,
info: WorkflowNodeTraceInfo,
span_name: EnterpriseTelemetrySpan,
request_type: str,
correlation_id_override: str | None = None,
trace_correlation_override_param: str | None = None,
) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
# -- Slim span attrs: identity + structure + status + timing --
span_attrs: dict[str, Any] = {
"dify.trace_id": info.trace_id,
"dify.tenant_id": tenant_id,
"dify.app_id": app_id,
"dify.workflow.id": info.workflow_id,
"dify.workflow.run_id": info.workflow_run_id,
"dify.message.id": info.message_id,
"dify.conversation.id": metadata.get("conversation_id"),
"dify.node.execution_id": info.node_execution_id,
"dify.node.id": info.node_id,
"dify.node.type": info.node_type,
"dify.node.title": info.title,
"dify.node.status": info.status,
"dify.node.error": info.error,
"dify.node.elapsed_time": info.elapsed_time,
"dify.node.index": info.index,
"dify.node.predecessor_node_id": info.predecessor_node_id,
"dify.node.iteration_id": info.iteration_id,
"dify.node.loop_id": info.loop_id,
"dify.node.parallel_id": info.parallel_id,
"dify.node.invoked_by": info.invoked_by,
}
trace_correlation_override = trace_correlation_override_param
parent_ctx = metadata.get("parent_trace_context")
if isinstance(parent_ctx, dict):
parent_ctx_dict = cast(dict[str, Any], parent_ctx)
override_value = parent_ctx_dict.get("parent_workflow_run_id")
if isinstance(override_value, str):
trace_correlation_override = override_value
effective_correlation_id = correlation_id_override or info.workflow_run_id
self._exporter.export_span(
span_name,
span_attrs,
correlation_id=effective_correlation_id,
span_id_source=info.node_execution_id,
start_time=info.start_time,
end_time=info.end_time,
trace_correlation_override=trace_correlation_override,
)
# -- Companion log: ALL attrs (span + detail) --
log_attrs: dict[str, Any] = {**span_attrs}
log_attrs.update(
{
"dify.app.name": metadata.get("app_name"),
"dify.workspace.name": metadata.get("workspace_name"),
"dify.invoke_from": metadata.get("invoke_from"),
"gen_ai.user.id": user_id,
"gen_ai.usage.total_tokens": info.total_tokens,
"dify.node.total_price": info.total_price,
"dify.node.currency": info.currency,
"gen_ai.provider.name": info.model_provider,
"gen_ai.request.model": info.model_name,
"gen_ai.tool.name": info.tool_name,
"dify.node.iteration_index": info.iteration_index,
"dify.node.loop_index": info.loop_index,
"dify.plugin.name": metadata.get("plugin_name"),
"dify.credential.name": metadata.get("credential_name"),
"dify.credential.id": metadata.get("credential_id"),
"dify.dataset.ids": self._maybe_json(metadata.get("dataset_ids")),
"dify.dataset.names": self._maybe_json(metadata.get("dataset_names")),
}
)
ref = f"ref:node_execution_id={info.node_execution_id}"
log_attrs["dify.node.inputs"] = self._content_or_ref(info.node_inputs, ref)
log_attrs["dify.node.outputs"] = self._content_or_ref(info.node_outputs, ref)
log_attrs["dify.node.process_data"] = self._content_or_ref(info.process_data, ref)
emit_telemetry_log(
event_name=span_name.value,
attributes=log_attrs,
signal="span_detail",
trace_id_source=info.workflow_run_id,
span_id_source=info.node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
# -- Metrics --
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
node_type=info.node_type,
model_provider=info.model_provider or "",
)
if info.total_tokens:
token_labels = TokenMetricLabels(
tenant_id=tenant_id or "",
app_id=app_id or "",
operation_type=OperationType.NODE_EXECUTION,
model_provider=info.model_provider or "",
model_name=info.model_name or "",
node_type=info.node_type,
).to_dict()
self._exporter.increment_counter(EnterpriseTelemetryCounter.TOKENS, info.total_tokens, token_labels)
if info.prompt_tokens is not None and info.prompt_tokens > 0:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.INPUT_TOKENS, info.prompt_tokens, token_labels
)
if info.completion_tokens is not None and info.completion_tokens > 0:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.OUTPUT_TOKENS, info.completion_tokens, token_labels
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type=request_type,
status=info.status,
),
)
duration_labels = dict(labels)
plugin_name = metadata.get("plugin_name")
if plugin_name and info.node_type in {"tool", "knowledge-retrieval"}:
duration_labels["plugin_name"] = plugin_name
self._exporter.record_histogram(EnterpriseTelemetryHistogram.NODE_DURATION, info.elapsed_time, duration_labels)
if info.error:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.ERRORS,
1,
self._labels(
**labels,
type=request_type,
),
)
# ------------------------------------------------------------------
# METRIC-ONLY handlers (structured log + counters/histograms)
# ------------------------------------------------------------------
def _message_trace(self, info: MessageTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs.update(
{
"dify.invoke_from": metadata.get("from_source"),
"dify.conversation.id": metadata.get("conversation_id"),
"dify.conversation.mode": info.conversation_mode,
"gen_ai.provider.name": metadata.get("ls_provider"),
"gen_ai.request.model": metadata.get("ls_model_name"),
"gen_ai.usage.input_tokens": info.message_tokens,
"gen_ai.usage.output_tokens": info.answer_tokens,
"gen_ai.usage.total_tokens": info.total_tokens,
"dify.message.status": metadata.get("status"),
"dify.message.error": info.error,
"dify.message.from_source": metadata.get("from_source"),
"dify.message.from_end_user_id": metadata.get("from_end_user_id"),
"dify.message.from_account_id": metadata.get("from_account_id"),
"dify.streaming": info.is_streaming_request,
"dify.message.time_to_first_token": info.gen_ai_server_time_to_first_token,
"dify.message.streaming_duration": info.llm_streaming_time_to_generate,
"dify.workflow.run_id": metadata.get("workflow_run_id"),
}
)
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
ref = f"ref:message_id={info.message_id}"
inputs = self._safe_payload_value(info.inputs)
outputs = self._safe_payload_value(info.outputs)
attrs["dify.message.inputs"] = self._content_or_ref(inputs, ref)
attrs["dify.message.outputs"] = self._content_or_ref(outputs, ref)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.MESSAGE_RUN,
attributes=attrs,
trace_id_source=metadata.get("workflow_run_id") or str(info.message_id) if info.message_id else None,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
model_provider=metadata.get("ls_provider", ""),
model_name=metadata.get("ls_model_name", ""),
)
token_labels = TokenMetricLabels(
tenant_id=tenant_id or "",
app_id=app_id or "",
operation_type=OperationType.MESSAGE,
model_provider=metadata.get("ls_provider", ""),
model_name=metadata.get("ls_model_name", ""),
node_type="",
).to_dict()
self._exporter.increment_counter(EnterpriseTelemetryCounter.TOKENS, info.total_tokens, token_labels)
if info.message_tokens > 0:
self._exporter.increment_counter(EnterpriseTelemetryCounter.INPUT_TOKENS, info.message_tokens, token_labels)
if info.answer_tokens > 0:
self._exporter.increment_counter(EnterpriseTelemetryCounter.OUTPUT_TOKENS, info.answer_tokens, token_labels)
invoke_from = metadata.get("from_source", "")
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="message",
status=metadata.get("status", ""),
invoke_from=invoke_from,
),
)
if info.start_time and info.end_time:
duration = (info.end_time - info.start_time).total_seconds()
self._exporter.record_histogram(EnterpriseTelemetryHistogram.MESSAGE_DURATION, duration, labels)
if info.gen_ai_server_time_to_first_token is not None:
self._exporter.record_histogram(
EnterpriseTelemetryHistogram.MESSAGE_TTFT, info.gen_ai_server_time_to_first_token, labels
)
if info.error:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.ERRORS,
1,
self._labels(
**labels,
type="message",
),
)
def _tool_trace(self, info: ToolTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs.update(
{
"gen_ai.tool.name": info.tool_name,
"dify.tool.time_cost": info.time_cost,
"dify.tool.error": info.error,
"dify.workflow.run_id": metadata.get("workflow_run_id"),
}
)
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
ref = f"ref:message_id={info.message_id}"
attrs["dify.tool.inputs"] = self._content_or_ref(info.tool_inputs, ref)
attrs["dify.tool.outputs"] = self._content_or_ref(info.tool_outputs, ref)
attrs["dify.tool.parameters"] = self._content_or_ref(info.tool_parameters, ref)
attrs["dify.tool.config"] = self._content_or_ref(info.tool_config, ref)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.TOOL_EXECUTION,
attributes=attrs,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
tool_name=info.tool_name,
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="tool",
),
)
self._exporter.record_histogram(EnterpriseTelemetryHistogram.TOOL_DURATION, float(info.time_cost), labels)
if info.error:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.ERRORS,
1,
self._labels(
**labels,
type="tool",
),
)
def _moderation_trace(self, info: ModerationTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs.update(
{
"dify.moderation.flagged": info.flagged,
"dify.moderation.action": info.action,
"dify.moderation.preset_response": info.preset_response,
"dify.workflow.run_id": metadata.get("workflow_run_id"),
}
)
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
attrs["dify.moderation.query"] = self._content_or_ref(
info.query,
f"ref:message_id={info.message_id}",
)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.MODERATION_CHECK,
attributes=attrs,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="moderation",
),
)
def _suggested_question_trace(self, info: SuggestedQuestionTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs.update(
{
"gen_ai.usage.total_tokens": info.total_tokens,
"dify.suggested_question.status": info.status,
"dify.suggested_question.error": info.error,
"gen_ai.provider.name": info.model_provider,
"gen_ai.request.model": info.model_id,
"dify.suggested_question.count": len(info.suggested_question),
"dify.workflow.run_id": metadata.get("workflow_run_id"),
}
)
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
attrs["dify.suggested_question.questions"] = self._content_or_ref(
info.suggested_question,
f"ref:message_id={info.message_id}",
)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.SUGGESTED_QUESTION_GENERATION,
attributes=attrs,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="suggested_question",
),
)
def _dataset_retrieval_trace(self, info: DatasetRetrievalTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs["dify.dataset.error"] = info.error
attrs["dify.workflow.run_id"] = metadata.get("workflow_run_id")
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
docs: list[dict[str, Any]] = []
documents_any: Any = info.documents
documents_list: list[Any] = cast(list[Any], documents_any) if isinstance(documents_any, list) else []
for entry in documents_list:
if isinstance(entry, dict):
entry_dict: dict[str, Any] = cast(dict[str, Any], entry)
docs.append(entry_dict)
dataset_ids: list[str] = []
dataset_names: list[str] = []
structured_docs: list[dict[str, Any]] = []
for doc in docs:
meta_raw = doc.get("metadata")
meta: dict[str, Any] = cast(dict[str, Any], meta_raw) if isinstance(meta_raw, dict) else {}
did = meta.get("dataset_id")
dname = meta.get("dataset_name")
if did and did not in dataset_ids:
dataset_ids.append(did)
if dname and dname not in dataset_names:
dataset_names.append(dname)
structured_docs.append(
{
"dataset_id": did,
"document_id": meta.get("document_id"),
"segment_id": meta.get("segment_id"),
"score": meta.get("score"),
}
)
attrs["dify.dataset.ids"] = self._maybe_json(dataset_ids)
attrs["dify.dataset.names"] = self._maybe_json(dataset_names)
attrs["dify.retrieval.document_count"] = len(docs)
embedding_models_raw: Any = metadata.get("embedding_models")
embedding_models: dict[str, Any] = (
cast(dict[str, Any], embedding_models_raw) if isinstance(embedding_models_raw, dict) else {}
)
if embedding_models:
providers: list[str] = []
models: list[str] = []
for ds_info in embedding_models.values():
if isinstance(ds_info, dict):
ds_info_dict: dict[str, Any] = cast(dict[str, Any], ds_info)
p = ds_info_dict.get("embedding_model_provider", "")
m = ds_info_dict.get("embedding_model", "")
if p and p not in providers:
providers.append(p)
if m and m not in models:
models.append(m)
attrs["dify.dataset.embedding_providers"] = self._maybe_json(providers)
attrs["dify.dataset.embedding_models"] = self._maybe_json(models)
ref = f"ref:message_id={info.message_id}"
retrieval_inputs = self._safe_payload_value(info.inputs)
attrs["dify.retrieval.query"] = self._content_or_ref(retrieval_inputs, ref)
attrs["dify.dataset.documents"] = self._content_or_ref(structured_docs, ref)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.DATASET_RETRIEVAL,
attributes=attrs,
trace_id_source=metadata.get("workflow_run_id") or str(info.message_id) if info.message_id else None,
span_id_source=node_execution_id or (str(info.message_id) if info.message_id else None),
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="dataset_retrieval",
),
)
for did in dataset_ids:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.DATASET_RETRIEVALS,
1,
self._labels(
**labels,
dataset_id=did,
),
)
def _generate_name_trace(self, info: GenerateNameTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = self._common_attrs(info)
attrs["dify.conversation.id"] = info.conversation_id
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
ref = f"ref:conversation_id={info.conversation_id}"
inputs = self._safe_payload_value(info.inputs)
outputs = self._safe_payload_value(info.outputs)
attrs["dify.generate_name.inputs"] = self._content_or_ref(inputs, ref)
attrs["dify.generate_name.outputs"] = self._content_or_ref(outputs, ref)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.GENERATE_NAME_EXECUTION,
attributes=attrs,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
)
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="generate_name",
),
)
def _prompt_generation_trace(self, info: PromptGenerationTraceInfo) -> None:
metadata = self._metadata(info)
tenant_id, app_id, user_id = self._context_ids(info, metadata)
attrs = {
"dify.trace_id": info.trace_id,
"dify.tenant_id": tenant_id,
"dify.user.id": user_id,
"dify.app.id": app_id or "",
"dify.app.name": metadata.get("app_name"),
"dify.workspace.name": metadata.get("workspace_name"),
"dify.operation.type": info.operation_type,
"gen_ai.provider.name": info.model_provider,
"gen_ai.request.model": info.model_name,
"gen_ai.usage.input_tokens": info.prompt_tokens,
"gen_ai.usage.output_tokens": info.completion_tokens,
"gen_ai.usage.total_tokens": info.total_tokens,
"dify.prompt_generation.latency": info.latency,
"dify.prompt_generation.error": info.error,
}
node_execution_id = metadata.get("node_execution_id")
if node_execution_id:
attrs["dify.node.execution_id"] = node_execution_id
if info.total_price is not None:
attrs["dify.prompt_generation.total_price"] = info.total_price
attrs["dify.prompt_generation.currency"] = info.currency
ref = f"ref:trace_id={info.trace_id}"
outputs = self._safe_payload_value(info.outputs)
attrs["dify.prompt_generation.instruction"] = self._content_or_ref(info.instruction, ref)
attrs["dify.prompt_generation.output"] = self._content_or_ref(outputs, ref)
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.PROMPT_GENERATION_EXECUTION,
attributes=attrs,
span_id_source=node_execution_id,
tenant_id=tenant_id,
user_id=user_id,
)
token_labels = TokenMetricLabels(
tenant_id=tenant_id or "",
app_id=app_id or "",
operation_type=info.operation_type,
model_provider=info.model_provider,
model_name=info.model_name,
node_type="",
).to_dict()
labels = self._labels(
tenant_id=tenant_id or "",
app_id=app_id or "",
operation_type=info.operation_type,
model_provider=info.model_provider,
model_name=info.model_name,
)
self._exporter.increment_counter(EnterpriseTelemetryCounter.TOKENS, info.total_tokens, token_labels)
if info.prompt_tokens > 0:
self._exporter.increment_counter(EnterpriseTelemetryCounter.INPUT_TOKENS, info.prompt_tokens, token_labels)
if info.completion_tokens > 0:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.OUTPUT_TOKENS, info.completion_tokens, token_labels
)
status = "failed" if info.error else "success"
self._exporter.increment_counter(
EnterpriseTelemetryCounter.REQUESTS,
1,
self._labels(
**labels,
type="prompt_generation",
status=status,
),
)
self._exporter.record_histogram(
EnterpriseTelemetryHistogram.PROMPT_GENERATION_DURATION,
info.latency,
labels,
)
if info.error:
self._exporter.increment_counter(
EnterpriseTelemetryCounter.ERRORS,
1,
self._labels(
**labels,
type="prompt_generation",
),
)

View File

@@ -0,0 +1,121 @@
from enum import StrEnum
from typing import cast
from opentelemetry.util.types import AttributeValue
from pydantic import BaseModel, ConfigDict
class EnterpriseTelemetrySpan(StrEnum):
WORKFLOW_RUN = "dify.workflow.run"
NODE_EXECUTION = "dify.node.execution"
DRAFT_NODE_EXECUTION = "dify.node.execution.draft"
class EnterpriseTelemetryEvent(StrEnum):
"""Event names for enterprise telemetry logs."""
APP_CREATED = "dify.app.created"
APP_UPDATED = "dify.app.updated"
APP_DELETED = "dify.app.deleted"
FEEDBACK_CREATED = "dify.feedback.created"
WORKFLOW_RUN = "dify.workflow.run"
MESSAGE_RUN = "dify.message.run"
TOOL_EXECUTION = "dify.tool.execution"
MODERATION_CHECK = "dify.moderation.check"
SUGGESTED_QUESTION_GENERATION = "dify.suggested_question.generation"
DATASET_RETRIEVAL = "dify.dataset.retrieval"
GENERATE_NAME_EXECUTION = "dify.generate_name.execution"
PROMPT_GENERATION_EXECUTION = "dify.prompt_generation.execution"
REHYDRATION_FAILED = "dify.telemetry.rehydration_failed"
class EnterpriseTelemetryCounter(StrEnum):
TOKENS = "tokens"
INPUT_TOKENS = "input_tokens"
OUTPUT_TOKENS = "output_tokens"
REQUESTS = "requests"
ERRORS = "errors"
FEEDBACK = "feedback"
DATASET_RETRIEVALS = "dataset_retrievals"
APP_CREATED = "app_created"
APP_UPDATED = "app_updated"
APP_DELETED = "app_deleted"
class EnterpriseTelemetryHistogram(StrEnum):
WORKFLOW_DURATION = "workflow_duration"
NODE_DURATION = "node_duration"
MESSAGE_DURATION = "message_duration"
MESSAGE_TTFT = "message_ttft"
TOOL_DURATION = "tool_duration"
PROMPT_GENERATION_DURATION = "prompt_generation_duration"
class TokenMetricLabels(BaseModel):
"""Unified label structure for all dify.token.* metrics.
All token counters (dify.tokens.input, dify.tokens.output, dify.tokens.total) MUST
use this exact label set to ensure consistent filtering and aggregation across
different operation types.
Attributes:
tenant_id: Tenant identifier.
app_id: Application identifier.
operation_type: Source of token usage (workflow | node_execution | message |
rule_generate | code_generate | structured_output | instruction_modify).
model_provider: LLM provider name. Empty string if not applicable (e.g., workflow-level).
model_name: LLM model name. Empty string if not applicable (e.g., workflow-level).
node_type: Workflow node type. Empty string unless operation_type=node_execution.
Usage:
labels = TokenMetricLabels(
tenant_id="tenant-123",
app_id="app-456",
operation_type=OperationType.WORKFLOW,
model_provider="",
model_name="",
node_type="",
)
exporter.increment_counter(
EnterpriseTelemetryCounter.INPUT_TOKENS,
100,
labels.to_dict()
)
Design rationale:
Without this unified structure, tokens get double-counted when querying totals
because workflow.total_tokens is already the sum of all node tokens. The
operation_type label allows filtering to separate workflow-level aggregates from
node-level detail, while keeping the same label cardinality for consistent queries.
"""
tenant_id: str
app_id: str
operation_type: str
model_provider: str
model_name: str
node_type: str
model_config = ConfigDict(extra="forbid", frozen=True)
def to_dict(self) -> dict[str, AttributeValue]:
return cast(
dict[str, AttributeValue],
{
"tenant_id": self.tenant_id,
"app_id": self.app_id,
"operation_type": self.operation_type,
"model_provider": self.model_provider,
"model_name": self.model_name,
"node_type": self.node_type,
},
)
__all__ = [
"EnterpriseTelemetryCounter",
"EnterpriseTelemetryEvent",
"EnterpriseTelemetryHistogram",
"EnterpriseTelemetrySpan",
"TokenMetricLabels",
]

View File

@@ -0,0 +1,130 @@
"""Blinker signal handlers for enterprise telemetry.
Registered at import time via ``@signal.connect`` decorators.
Import must happen during ``ext_enterprise_telemetry.init_app()`` to ensure handlers fire.
"""
from __future__ import annotations
import logging
import uuid
from events.app_event import app_was_created, app_was_deleted, app_was_updated
from events.feedback_event import feedback_was_created
logger = logging.getLogger(__name__)
__all__ = [
"_handle_app_created",
"_handle_app_deleted",
"_handle_app_updated",
"_handle_feedback_created",
]
@app_was_created.connect
def _handle_app_created(sender: object, **kwargs: object) -> None:
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
exporter = get_enterprise_exporter()
if not exporter:
return
tenant_id = str(getattr(sender, "tenant_id", "") or "")
payload = {
"app_id": getattr(sender, "id", None),
"mode": getattr(sender, "mode", None),
}
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id=tenant_id,
event_id=str(uuid.uuid4()),
payload=payload,
)
process_enterprise_telemetry.delay(envelope.model_dump_json())
@app_was_deleted.connect
def _handle_app_deleted(sender: object, **kwargs: object) -> None:
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
exporter = get_enterprise_exporter()
if not exporter:
return
tenant_id = str(getattr(sender, "tenant_id", "") or "")
payload = {
"app_id": getattr(sender, "id", None),
}
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_DELETED,
tenant_id=tenant_id,
event_id=str(uuid.uuid4()),
payload=payload,
)
process_enterprise_telemetry.delay(envelope.model_dump_json())
@app_was_updated.connect
def _handle_app_updated(sender: object, **kwargs: object) -> None:
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
exporter = get_enterprise_exporter()
if not exporter:
return
tenant_id = str(getattr(sender, "tenant_id", "") or "")
payload = {
"app_id": getattr(sender, "id", None),
}
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_UPDATED,
tenant_id=tenant_id,
event_id=str(uuid.uuid4()),
payload=payload,
)
process_enterprise_telemetry.delay(envelope.model_dump_json())
@feedback_was_created.connect
def _handle_feedback_created(sender: object, **kwargs: object) -> None:
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
exporter = get_enterprise_exporter()
if not exporter:
return
tenant_id = str(kwargs.get("tenant_id", "") or "")
payload = {
"message_id": getattr(sender, "message_id", None),
"app_id": getattr(sender, "app_id", None),
"conversation_id": getattr(sender, "conversation_id", None),
"from_end_user_id": getattr(sender, "from_end_user_id", None),
"from_account_id": getattr(sender, "from_account_id", None),
"rating": getattr(sender, "rating", None),
"from_source": getattr(sender, "from_source", None),
"content": getattr(sender, "content", None),
}
envelope = TelemetryEnvelope(
case=TelemetryCase.FEEDBACK_CREATED,
tenant_id=tenant_id,
event_id=str(uuid.uuid4()),
payload=payload,
)
process_enterprise_telemetry.delay(envelope.model_dump_json())

View File

@@ -0,0 +1,266 @@
"""Enterprise OTEL exporter — shared by EnterpriseOtelTrace, event handlers, and direct instrumentation.
Uses dedicated TracerProvider and MeterProvider instances (configurable sampling,
independent from ext_otel.py infrastructure).
Initialized once during Flask extension init (single-threaded via ext_enterprise_telemetry.py).
Accessed via ``ext_enterprise_telemetry.get_enterprise_exporter()`` from any thread/process.
"""
import logging
import socket
import uuid
from datetime import datetime
from typing import Any, cast
from opentelemetry import trace
from opentelemetry.context import Context
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter as GRPCMetricExporter
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter as GRPCSpanExporter
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter as HTTPMetricExporter
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter as HTTPSpanExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ParentBasedTraceIdRatio
from opentelemetry.semconv.resource import ResourceAttributes
from opentelemetry.trace import SpanContext, TraceFlags
from opentelemetry.util.types import Attributes, AttributeValue
from configs import dify_config
from enterprise.telemetry.entities import EnterpriseTelemetryCounter, EnterpriseTelemetryHistogram
from enterprise.telemetry.id_generator import (
CorrelationIdGenerator,
compute_deterministic_span_id,
set_correlation_id,
set_span_id_source,
)
logger = logging.getLogger(__name__)
def is_enterprise_telemetry_enabled() -> bool:
return bool(dify_config.ENTERPRISE_ENABLED and dify_config.ENTERPRISE_TELEMETRY_ENABLED)
def _parse_otlp_headers(raw: str) -> dict[str, str]:
"""Parse ``key=value,key2=value2`` into a dict."""
if not raw:
return {}
headers: dict[str, str] = {}
for pair in raw.split(","):
if "=" not in pair:
continue
k, v = pair.split("=", 1)
headers[k.strip()] = v.strip()
return headers
def _datetime_to_ns(dt: datetime) -> int:
"""Convert a datetime to nanoseconds since epoch (OTEL convention)."""
return int(dt.timestamp() * 1_000_000_000)
class _ExporterFactory:
def __init__(self, protocol: str, endpoint: str, headers: dict[str, str], insecure: bool):
self._protocol = protocol
self._endpoint = endpoint
self._headers = headers
self._grpc_headers = tuple(headers.items()) if headers else None
self._http_headers = headers or None
self._insecure = insecure
def create_trace_exporter(self) -> HTTPSpanExporter | GRPCSpanExporter:
if self._protocol == "grpc":
return GRPCSpanExporter(
endpoint=self._endpoint or None,
headers=self._grpc_headers,
insecure=self._insecure,
)
trace_endpoint = f"{self._endpoint}/v1/traces" if self._endpoint else ""
return HTTPSpanExporter(endpoint=trace_endpoint or None, headers=self._http_headers)
def create_metric_exporter(self) -> HTTPMetricExporter | GRPCMetricExporter:
if self._protocol == "grpc":
return GRPCMetricExporter(
endpoint=self._endpoint or None,
headers=self._grpc_headers,
insecure=self._insecure,
)
metric_endpoint = f"{self._endpoint}/v1/metrics" if self._endpoint else ""
return HTTPMetricExporter(endpoint=metric_endpoint or None, headers=self._http_headers)
class EnterpriseExporter:
"""Shared OTEL exporter for all enterprise telemetry.
``export_span`` creates spans with optional real timestamps, deterministic
span/trace IDs, and cross-workflow parent linking.
``increment_counter`` / ``record_histogram`` emit OTEL metrics at 100% accuracy.
"""
def __init__(self, config: object) -> None:
endpoint: str = getattr(config, "ENTERPRISE_OTLP_ENDPOINT", "")
headers_raw: str = getattr(config, "ENTERPRISE_OTLP_HEADERS", "")
protocol: str = (getattr(config, "ENTERPRISE_OTLP_PROTOCOL", "http") or "http").lower()
service_name: str = getattr(config, "ENTERPRISE_SERVICE_NAME", "dify")
sampling_rate: float = getattr(config, "ENTERPRISE_OTEL_SAMPLING_RATE", 1.0)
self.include_content: bool = getattr(config, "ENTERPRISE_INCLUDE_CONTENT", True)
api_key: str = getattr(config, "ENTERPRISE_OTLP_API_KEY", "")
# Auto-detect TLS: when bearer token is configured, use secure channel
insecure: bool = not bool(api_key)
resource = Resource(
attributes={
ResourceAttributes.SERVICE_NAME: service_name,
ResourceAttributes.HOST_NAME: socket.gethostname(),
}
)
sampler = ParentBasedTraceIdRatio(sampling_rate)
id_generator = CorrelationIdGenerator()
self._tracer_provider = TracerProvider(resource=resource, sampler=sampler, id_generator=id_generator)
headers = _parse_otlp_headers(headers_raw)
if api_key:
if "authorization" in headers:
logger.warning(
"ENTERPRISE_OTLP_API_KEY is set but ENTERPRISE_OTLP_HEADERS also contains "
"'authorization'; the API key will take precedence."
)
headers["authorization"] = f"Bearer {api_key}"
factory = _ExporterFactory(protocol, endpoint, headers, insecure=insecure)
trace_exporter = factory.create_trace_exporter()
self._tracer_provider.add_span_processor(BatchSpanProcessor(trace_exporter))
self._tracer = self._tracer_provider.get_tracer("dify.enterprise")
metric_exporter = factory.create_metric_exporter()
self._meter_provider = MeterProvider(
resource=resource,
metric_readers=[PeriodicExportingMetricReader(metric_exporter)],
)
meter = self._meter_provider.get_meter("dify.enterprise")
self._counters = {
EnterpriseTelemetryCounter.TOKENS: meter.create_counter("dify.tokens.total", unit="{token}"),
EnterpriseTelemetryCounter.INPUT_TOKENS: meter.create_counter("dify.tokens.input", unit="{token}"),
EnterpriseTelemetryCounter.OUTPUT_TOKENS: meter.create_counter("dify.tokens.output", unit="{token}"),
EnterpriseTelemetryCounter.REQUESTS: meter.create_counter("dify.requests.total", unit="{request}"),
EnterpriseTelemetryCounter.ERRORS: meter.create_counter("dify.errors.total", unit="{error}"),
EnterpriseTelemetryCounter.FEEDBACK: meter.create_counter("dify.feedback.total", unit="{feedback}"),
EnterpriseTelemetryCounter.DATASET_RETRIEVALS: meter.create_counter(
"dify.dataset.retrievals.total", unit="{retrieval}"
),
EnterpriseTelemetryCounter.APP_CREATED: meter.create_counter("dify.app.created.total", unit="{app}"),
EnterpriseTelemetryCounter.APP_UPDATED: meter.create_counter("dify.app.updated.total", unit="{app}"),
EnterpriseTelemetryCounter.APP_DELETED: meter.create_counter("dify.app.deleted.total", unit="{app}"),
}
self._histograms = {
EnterpriseTelemetryHistogram.WORKFLOW_DURATION: meter.create_histogram("dify.workflow.duration", unit="s"),
EnterpriseTelemetryHistogram.NODE_DURATION: meter.create_histogram("dify.node.duration", unit="s"),
EnterpriseTelemetryHistogram.MESSAGE_DURATION: meter.create_histogram("dify.message.duration", unit="s"),
EnterpriseTelemetryHistogram.MESSAGE_TTFT: meter.create_histogram(
"dify.message.time_to_first_token", unit="s"
),
EnterpriseTelemetryHistogram.TOOL_DURATION: meter.create_histogram("dify.tool.duration", unit="s"),
EnterpriseTelemetryHistogram.PROMPT_GENERATION_DURATION: meter.create_histogram(
"dify.prompt_generation.duration", unit="s"
),
}
def export_span(
self,
name: str,
attributes: dict[str, Any],
correlation_id: str | None = None,
span_id_source: str | None = None,
start_time: datetime | None = None,
end_time: datetime | None = None,
trace_correlation_override: str | None = None,
parent_span_id_source: str | None = None,
) -> None:
"""Export an OTEL span with optional deterministic IDs and real timestamps.
Args:
name: Span operation name.
attributes: Span attributes dict.
correlation_id: Source for trace_id derivation (groups spans in one trace).
span_id_source: Source for deterministic span_id (e.g. workflow_run_id or node_execution_id).
start_time: Real span start time. When None, uses current time.
end_time: Real span end time. When None, span ends immediately.
trace_correlation_override: Override trace_id source (for cross-workflow linking).
When set, trace_id is derived from this instead of ``correlation_id``.
parent_span_id_source: Override parent span_id source (for cross-workflow linking).
When set, parent span_id is derived from this value. When None and
``correlation_id`` is set, parent is the workflow root span.
"""
effective_trace_correlation = trace_correlation_override or correlation_id
set_correlation_id(effective_trace_correlation)
set_span_id_source(span_id_source)
try:
parent_context: Context | None = None
# A span is the "root" of its correlation group when span_id_source == correlation_id
# (i.e. a workflow root span). All other spans are children.
if parent_span_id_source:
# Cross-workflow linking: parent is an explicit span (e.g. tool node in outer workflow)
parent_span_id = compute_deterministic_span_id(parent_span_id_source)
parent_trace_id = int(uuid.UUID(effective_trace_correlation)) if effective_trace_correlation else 0
if parent_trace_id:
parent_span_context = SpanContext(
trace_id=parent_trace_id,
span_id=parent_span_id,
is_remote=True,
trace_flags=TraceFlags(TraceFlags.SAMPLED),
)
parent_context = trace.set_span_in_context(trace.NonRecordingSpan(parent_span_context))
elif correlation_id and correlation_id != span_id_source:
# Child span: parent is the correlation-group root (workflow root span)
parent_span_id = compute_deterministic_span_id(correlation_id)
parent_trace_id = int(uuid.UUID(effective_trace_correlation or correlation_id))
parent_span_context = SpanContext(
trace_id=parent_trace_id,
span_id=parent_span_id,
is_remote=True,
trace_flags=TraceFlags(TraceFlags.SAMPLED),
)
parent_context = trace.set_span_in_context(trace.NonRecordingSpan(parent_span_context))
span_start_time = _datetime_to_ns(start_time) if start_time is not None else None
span_end_on_exit = end_time is None
with self._tracer.start_as_current_span(
name,
context=parent_context,
start_time=span_start_time,
end_on_exit=span_end_on_exit,
) as span:
for key, value in attributes.items():
if value is not None:
span.set_attribute(key, value)
if end_time is not None:
span.end(end_time=_datetime_to_ns(end_time))
except Exception:
logger.exception("Failed to export span %s", name)
finally:
set_correlation_id(None)
set_span_id_source(None)
def increment_counter(
self, name: EnterpriseTelemetryCounter, value: int, labels: dict[str, AttributeValue]
) -> None:
counter = self._counters.get(name)
if counter:
counter.add(value, cast(Attributes, labels))
def record_histogram(
self, name: EnterpriseTelemetryHistogram, value: float, labels: dict[str, AttributeValue]
) -> None:
histogram = self._histograms.get(name)
if histogram:
histogram.record(value, cast(Attributes, labels))
def shutdown(self) -> None:
self._tracer_provider.shutdown()
self._meter_provider.shutdown()

View File

@@ -0,0 +1,199 @@
"""Telemetry gateway routing and dispatch.
Maps ``TelemetryCase`` → ``CaseRoute`` (signal type + CE eligibility)
and dispatches events to either the trace pipeline or the metric/log
Celery queue.
Singleton lifecycle is managed by ``ext_enterprise_telemetry.init_app()``
which creates the instance during single-threaded Flask app startup.
Access via ``ext_enterprise_telemetry.get_gateway()``.
"""
from __future__ import annotations
import json
import logging
import uuid
from typing import TYPE_CHECKING, Any
from core.ops.entities.trace_entity import TraceTaskName
from enterprise.telemetry.contracts import CaseRoute, SignalType, TelemetryCase, TelemetryEnvelope
from extensions.ext_storage import storage
if TYPE_CHECKING:
from core.ops.ops_trace_manager import TraceQueueManager
logger = logging.getLogger(__name__)
PAYLOAD_SIZE_THRESHOLD_BYTES = 1 * 1024 * 1024
CASE_TO_TRACE_TASK: dict[TelemetryCase, TraceTaskName] = {
TelemetryCase.WORKFLOW_RUN: TraceTaskName.WORKFLOW_TRACE,
TelemetryCase.MESSAGE_RUN: TraceTaskName.MESSAGE_TRACE,
TelemetryCase.NODE_EXECUTION: TraceTaskName.NODE_EXECUTION_TRACE,
TelemetryCase.DRAFT_NODE_EXECUTION: TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
TelemetryCase.PROMPT_GENERATION: TraceTaskName.PROMPT_GENERATION_TRACE,
}
CASE_ROUTING: dict[TelemetryCase, CaseRoute] = {
TelemetryCase.WORKFLOW_RUN: CaseRoute(signal_type=SignalType.TRACE, ce_eligible=True),
TelemetryCase.MESSAGE_RUN: CaseRoute(signal_type=SignalType.TRACE, ce_eligible=True),
TelemetryCase.NODE_EXECUTION: CaseRoute(signal_type=SignalType.TRACE, ce_eligible=False),
TelemetryCase.DRAFT_NODE_EXECUTION: CaseRoute(signal_type=SignalType.TRACE, ce_eligible=False),
TelemetryCase.PROMPT_GENERATION: CaseRoute(signal_type=SignalType.TRACE, ce_eligible=False),
TelemetryCase.APP_CREATED: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.APP_UPDATED: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.APP_DELETED: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.FEEDBACK_CREATED: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.TOOL_EXECUTION: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.MODERATION_CHECK: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.SUGGESTED_QUESTION: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.DATASET_RETRIEVAL: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
TelemetryCase.GENERATE_NAME: CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False),
}
def _is_enterprise_telemetry_enabled() -> bool:
try:
from enterprise.telemetry.exporter import is_enterprise_telemetry_enabled
return is_enterprise_telemetry_enabled()
except Exception:
return False
def _should_drop_ee_only_event(route: CaseRoute) -> bool:
"""Return True when the event is enterprise-only and EE telemetry is disabled."""
return not route.ce_eligible and not _is_enterprise_telemetry_enabled()
class TelemetryGateway:
"""Routes telemetry events to the trace pipeline or the metric/log Celery queue.
Stateless — instantiated once during ``ext_enterprise_telemetry.init_app()``
and shared for the lifetime of the process.
"""
def emit(
self,
case: TelemetryCase,
context: dict[str, Any],
payload: dict[str, Any],
trace_manager: TraceQueueManager | None = None,
) -> None:
route = CASE_ROUTING.get(case)
if route is None:
logger.warning("Unknown telemetry case: %s, dropping event", case)
return
if _should_drop_ee_only_event(route):
logger.debug("Dropping EE-only event: case=%s (EE disabled)", case)
return
logger.debug(
"Gateway routing: case=%s, signal_type=%s, ce_eligible=%s",
case,
route.signal_type,
route.ce_eligible,
)
if route.signal_type is SignalType.TRACE:
self._emit_trace(case, context, payload, route, trace_manager)
else:
self._emit_metric_log(case, context, payload)
def _emit_trace(
self,
case: TelemetryCase,
context: dict[str, Any],
payload: dict[str, Any],
route: CaseRoute,
trace_manager: TraceQueueManager | None,
) -> None:
from core.ops.ops_trace_manager import TraceQueueManager as LocalTraceQueueManager
from core.ops.ops_trace_manager import TraceTask
trace_task_name = CASE_TO_TRACE_TASK.get(case)
if trace_task_name is None:
logger.warning("No TraceTaskName mapping for case: %s", case)
return
queue_manager = trace_manager or LocalTraceQueueManager(
app_id=context.get("app_id"),
user_id=context.get("user_id"),
)
queue_manager.add_trace_task(TraceTask(trace_task_name, **payload))
logger.debug("Enqueued trace task: case=%s, app_id=%s", case, context.get("app_id"))
def _emit_metric_log(
self,
case: TelemetryCase,
context: dict[str, Any],
payload: dict[str, Any],
) -> None:
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
tenant_id = context.get("tenant_id", "")
event_id = str(uuid.uuid4())
payload_for_envelope, payload_ref = self._handle_payload_sizing(payload, tenant_id, event_id)
envelope = TelemetryEnvelope(
case=case,
tenant_id=tenant_id,
event_id=event_id,
payload=payload_for_envelope,
metadata={"payload_ref": payload_ref} if payload_ref else None,
)
process_enterprise_telemetry.delay(envelope.model_dump_json())
logger.debug(
"Enqueued metric/log event: case=%s, tenant_id=%s, event_id=%s",
case,
tenant_id,
event_id,
)
def _handle_payload_sizing(
self,
payload: dict[str, Any],
tenant_id: str,
event_id: str,
) -> tuple[dict[str, Any], str | None]:
try:
payload_json = json.dumps(payload)
payload_size = len(payload_json.encode("utf-8"))
except (TypeError, ValueError):
logger.warning("Failed to serialize payload for sizing: event_id=%s", event_id)
return payload, None
if payload_size <= PAYLOAD_SIZE_THRESHOLD_BYTES:
return payload, None
storage_key = f"telemetry/{tenant_id}/{event_id}.json"
try:
storage.save(storage_key, payload_json.encode("utf-8"))
logger.debug("Stored large payload to storage: key=%s, size=%d", storage_key, payload_size)
return {}, storage_key
except Exception:
logger.warning("Failed to store large payload, inlining instead: event_id=%s", event_id, exc_info=True)
return payload, None
def emit(
case: TelemetryCase,
context: dict[str, Any],
payload: dict[str, Any],
trace_manager: TraceQueueManager | None = None,
) -> None:
"""Module-level convenience wrapper.
Fetches the gateway singleton from the extension; no-ops when
enterprise telemetry is disabled (gateway is ``None``).
"""
from extensions.ext_enterprise_telemetry import get_gateway
gateway = get_gateway()
if gateway is not None:
gateway.emit(case, context, payload, trace_manager)

View File

@@ -0,0 +1,76 @@
"""Custom OTEL ID Generator for correlation-based trace/span ID derivation.
Uses contextvars for thread-safe correlation_id -> trace_id mapping.
When a span_id_source is set, the span_id is derived deterministically
from that value, enabling any span to reference another as parent
without depending on span creation order.
"""
import random
import uuid
from contextvars import ContextVar
from typing import cast
from opentelemetry.sdk.trace.id_generator import IdGenerator
_correlation_id_context: ContextVar[str | None] = ContextVar("correlation_id", default=None)
_span_id_source_context: ContextVar[str | None] = ContextVar("span_id_source", default=None)
def set_correlation_id(correlation_id: str | None) -> None:
_correlation_id_context.set(correlation_id)
def get_correlation_id() -> str | None:
return _correlation_id_context.get()
def set_span_id_source(source_id: str | None) -> None:
"""Set the source for deterministic span_id generation.
When set, ``generate_span_id()`` derives the span_id from this value
(lower 64 bits of the UUID). Pass the ``workflow_run_id`` for workflow
root spans or ``node_execution_id`` for node spans.
"""
_span_id_source_context.set(source_id)
def compute_deterministic_span_id(source_id: str) -> int:
"""Derive a deterministic span_id from any UUID string.
Uses the lower 64 bits of the UUID, guaranteeing non-zero output
(OTEL requires span_id != 0).
"""
span_id = cast(int, uuid.UUID(source_id).int) & ((1 << 64) - 1)
return span_id if span_id != 0 else 1
class CorrelationIdGenerator(IdGenerator):
"""ID generator that derives trace_id and optionally span_id from context.
- trace_id: always derived from correlation_id (groups all spans in one trace)
- span_id: derived from span_id_source when set (enables deterministic
parent-child linking), otherwise random
"""
def generate_trace_id(self) -> int:
correlation_id = _correlation_id_context.get()
if correlation_id:
try:
return cast(int, uuid.UUID(correlation_id).int)
except (ValueError, AttributeError):
pass
return random.getrandbits(128)
def generate_span_id(self) -> int:
source = _span_id_source_context.get()
if source:
try:
return compute_deterministic_span_id(source)
except (ValueError, AttributeError):
pass
span_id = random.getrandbits(64)
while span_id == 0:
span_id = random.getrandbits(64)
return span_id

View File

@@ -0,0 +1,377 @@
"""Enterprise metric/log event handler.
This module processes metric and log telemetry events after they've been
dequeued from the enterprise_telemetry Celery queue. It handles case routing,
idempotency checking, and payload rehydration.
"""
from __future__ import annotations
import logging
from typing import Any
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from extensions.ext_redis import redis_client
logger = logging.getLogger(__name__)
class EnterpriseMetricHandler:
"""Handler for enterprise metric and log telemetry events.
Processes envelopes from the enterprise_telemetry queue, routing each
case to the appropriate handler method. Implements idempotency checking
and payload rehydration with fallback.
"""
def _increment_diagnostic_counter(self, counter_name: str, labels: dict[str, str] | None = None) -> None:
"""Increment a diagnostic counter for operational monitoring.
Args:
counter_name: Name of the counter (e.g., 'processed_total', 'deduped_total').
labels: Optional labels for the counter.
"""
try:
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if not exporter:
return
full_counter_name = f"enterprise_telemetry.handler.{counter_name}"
logger.debug(
"Diagnostic counter: %s, labels=%s",
full_counter_name,
labels or {},
)
except Exception:
logger.debug("Failed to increment diagnostic counter: %s", counter_name, exc_info=True)
def handle(self, envelope: TelemetryEnvelope) -> None:
"""Main entry point for processing telemetry envelopes.
Args:
envelope: The telemetry envelope to process.
"""
# Check for duplicate events
if self._is_duplicate(envelope):
logger.debug(
"Skipping duplicate event: tenant_id=%s, event_id=%s",
envelope.tenant_id,
envelope.event_id,
)
self._increment_diagnostic_counter("deduped_total")
return
# Route to appropriate handler based on case
case = envelope.case
if case == TelemetryCase.APP_CREATED:
self._on_app_created(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "app_created"})
elif case == TelemetryCase.APP_UPDATED:
self._on_app_updated(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "app_updated"})
elif case == TelemetryCase.APP_DELETED:
self._on_app_deleted(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "app_deleted"})
elif case == TelemetryCase.FEEDBACK_CREATED:
self._on_feedback_created(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "feedback_created"})
elif case == TelemetryCase.MESSAGE_RUN:
self._on_message_run(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "message_run"})
elif case == TelemetryCase.TOOL_EXECUTION:
self._on_tool_execution(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "tool_execution"})
elif case == TelemetryCase.MODERATION_CHECK:
self._on_moderation_check(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "moderation_check"})
elif case == TelemetryCase.SUGGESTED_QUESTION:
self._on_suggested_question(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "suggested_question"})
elif case == TelemetryCase.DATASET_RETRIEVAL:
self._on_dataset_retrieval(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "dataset_retrieval"})
elif case == TelemetryCase.GENERATE_NAME:
self._on_generate_name(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "generate_name"})
elif case == TelemetryCase.PROMPT_GENERATION:
self._on_prompt_generation(envelope)
self._increment_diagnostic_counter("processed_total", {"case": "prompt_generation"})
else:
logger.warning(
"Unknown telemetry case: %s (tenant_id=%s, event_id=%s)",
case,
envelope.tenant_id,
envelope.event_id,
)
def _is_duplicate(self, envelope: TelemetryEnvelope) -> bool:
"""Check if this event has already been processed.
Uses Redis with TTL for deduplication. Returns True if duplicate,
False if first time seeing this event.
Args:
envelope: The telemetry envelope to check.
Returns:
True if this event_id has been seen before, False otherwise.
"""
dedup_key = f"telemetry:dedup:{envelope.tenant_id}:{envelope.event_id}"
try:
# Atomic set-if-not-exists with 1h TTL
# Returns True if key was set (first time), None if already exists (duplicate)
was_set = redis_client.set(dedup_key, b"1", nx=True, ex=3600)
return was_set is None
except Exception:
# Fail open: if Redis is unavailable, process the event
# (prefer occasional duplicate over lost data)
logger.warning(
"Redis unavailable for deduplication check, processing event anyway: %s",
envelope.event_id,
exc_info=True,
)
return False
def _rehydrate(self, envelope: TelemetryEnvelope) -> dict[str, Any]:
"""Rehydrate payload from reference or fallback.
Attempts to resolve payload_ref to full data. If that fails,
falls back to payload_fallback. If both fail, emits a degraded
event marker.
Args:
envelope: The telemetry envelope containing payload data.
Returns:
The rehydrated payload dictionary.
"""
# For now, payload is directly in the envelope
# Future: implement payload_ref resolution from storage
payload = envelope.payload
if not payload and envelope.payload_fallback:
import pickle
try:
payload = pickle.loads(envelope.payload_fallback) # noqa: S301
logger.debug("Used payload_fallback for event_id=%s", envelope.event_id)
except Exception:
logger.warning(
"Failed to deserialize payload_fallback for event_id=%s",
envelope.event_id,
exc_info=True,
)
if not payload:
# Both ref and fallback failed - emit degraded event
logger.error(
"Payload rehydration failed for event_id=%s, tenant_id=%s, case=%s",
envelope.event_id,
envelope.tenant_id,
envelope.case,
)
# Emit degraded event marker
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
from enterprise.telemetry.telemetry_log import emit_metric_only_event
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.REHYDRATION_FAILED,
attributes={
"dify.tenant_id": envelope.tenant_id,
"dify.event_id": envelope.event_id,
"dify.case": envelope.case,
"rehydration_failed": True,
},
tenant_id=envelope.tenant_id,
)
self._increment_diagnostic_counter("rehydration_failed_total")
return {}
return payload
# Stub methods for each metric/log case
# These will be implemented in later tasks with actual emission logic
def _on_app_created(self, envelope: TelemetryEnvelope) -> None:
"""Handle app created event."""
from enterprise.telemetry.entities import EnterpriseTelemetryCounter, EnterpriseTelemetryEvent
from enterprise.telemetry.telemetry_log import emit_metric_only_event
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if not exporter:
logger.debug("No exporter available for APP_CREATED: event_id=%s", envelope.event_id)
return
payload = self._rehydrate(envelope)
if not payload:
return
attrs = {
"dify.app.id": payload.get("app_id"),
"dify.tenant_id": envelope.tenant_id,
"dify.event.id": envelope.event_id,
"dify.app.mode": payload.get("mode"),
}
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.APP_CREATED,
attributes=attrs,
tenant_id=envelope.tenant_id,
)
exporter.increment_counter(
EnterpriseTelemetryCounter.APP_CREATED,
1,
{
"tenant_id": envelope.tenant_id,
"app_id": str(payload.get("app_id", "")),
"mode": str(payload.get("mode", "")),
},
)
def _on_app_updated(self, envelope: TelemetryEnvelope) -> None:
"""Handle app updated event."""
from enterprise.telemetry.entities import EnterpriseTelemetryCounter, EnterpriseTelemetryEvent
from enterprise.telemetry.telemetry_log import emit_metric_only_event
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if not exporter:
logger.debug("No exporter available for APP_UPDATED: event_id=%s", envelope.event_id)
return
payload = self._rehydrate(envelope)
if not payload:
return
attrs = {
"dify.app.id": payload.get("app_id"),
"dify.tenant_id": envelope.tenant_id,
"dify.event.id": envelope.event_id,
}
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.APP_UPDATED,
attributes=attrs,
tenant_id=envelope.tenant_id,
)
exporter.increment_counter(
EnterpriseTelemetryCounter.APP_UPDATED,
1,
{
"tenant_id": envelope.tenant_id,
"app_id": str(payload.get("app_id", "")),
},
)
def _on_app_deleted(self, envelope: TelemetryEnvelope) -> None:
"""Handle app deleted event."""
from enterprise.telemetry.entities import EnterpriseTelemetryCounter, EnterpriseTelemetryEvent
from enterprise.telemetry.telemetry_log import emit_metric_only_event
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if not exporter:
logger.debug("No exporter available for APP_DELETED: event_id=%s", envelope.event_id)
return
payload = self._rehydrate(envelope)
if not payload:
return
attrs = {
"dify.app.id": payload.get("app_id"),
"dify.tenant_id": envelope.tenant_id,
"dify.event.id": envelope.event_id,
}
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.APP_DELETED,
attributes=attrs,
tenant_id=envelope.tenant_id,
)
exporter.increment_counter(
EnterpriseTelemetryCounter.APP_DELETED,
1,
{
"tenant_id": envelope.tenant_id,
"app_id": str(payload.get("app_id", "")),
},
)
def _on_feedback_created(self, envelope: TelemetryEnvelope) -> None:
"""Handle feedback created event."""
from enterprise.telemetry.entities import EnterpriseTelemetryCounter, EnterpriseTelemetryEvent
from enterprise.telemetry.telemetry_log import emit_metric_only_event
from extensions.ext_enterprise_telemetry import get_enterprise_exporter
exporter = get_enterprise_exporter()
if not exporter:
logger.debug("No exporter available for FEEDBACK_CREATED: event_id=%s", envelope.event_id)
return
payload = self._rehydrate(envelope)
if not payload:
return
include_content = exporter.include_content
attrs: dict = {
"dify.message.id": payload.get("message_id"),
"dify.tenant_id": envelope.tenant_id,
"dify.event.id": envelope.event_id,
"dify.app_id": payload.get("app_id"),
"dify.conversation.id": payload.get("conversation_id"),
"gen_ai.user.id": payload.get("from_end_user_id") or payload.get("from_account_id"),
"dify.feedback.rating": payload.get("rating"),
"dify.feedback.from_source": payload.get("from_source"),
}
if include_content:
attrs["dify.feedback.content"] = payload.get("content")
user_id = payload.get("from_end_user_id") or payload.get("from_account_id")
emit_metric_only_event(
event_name=EnterpriseTelemetryEvent.FEEDBACK_CREATED,
attributes=attrs,
tenant_id=envelope.tenant_id,
user_id=str(user_id or ""),
)
exporter.increment_counter(
EnterpriseTelemetryCounter.FEEDBACK,
1,
{
"tenant_id": envelope.tenant_id,
"app_id": str(payload.get("app_id", "")),
"rating": str(payload.get("rating", "")),
},
)
def _on_message_run(self, envelope: TelemetryEnvelope) -> None:
"""Handle message run event (stub)."""
logger.debug("Processing MESSAGE_RUN: event_id=%s", envelope.event_id)
def _on_tool_execution(self, envelope: TelemetryEnvelope) -> None:
"""Handle tool execution event (stub)."""
logger.debug("Processing TOOL_EXECUTION: event_id=%s", envelope.event_id)
def _on_moderation_check(self, envelope: TelemetryEnvelope) -> None:
"""Handle moderation check event (stub)."""
logger.debug("Processing MODERATION_CHECK: event_id=%s", envelope.event_id)
def _on_suggested_question(self, envelope: TelemetryEnvelope) -> None:
"""Handle suggested question event (stub)."""
logger.debug("Processing SUGGESTED_QUESTION: event_id=%s", envelope.event_id)
def _on_dataset_retrieval(self, envelope: TelemetryEnvelope) -> None:
"""Handle dataset retrieval event (stub)."""
logger.debug("Processing DATASET_RETRIEVAL: event_id=%s", envelope.event_id)
def _on_generate_name(self, envelope: TelemetryEnvelope) -> None:
"""Handle generate name event (stub)."""
logger.debug("Processing GENERATE_NAME: event_id=%s", envelope.event_id)
def _on_prompt_generation(self, envelope: TelemetryEnvelope) -> None:
"""Handle prompt generation event (stub)."""
logger.debug("Processing PROMPT_GENERATION: event_id=%s", envelope.event_id)

View File

@@ -0,0 +1,122 @@
"""Structured-log emitter for enterprise telemetry events.
Emits structured JSON log lines correlated with OTEL traces via trace_id.
Picked up by ``StructuredJSONFormatter`` → stdout/Loki/Elastic.
"""
from __future__ import annotations
import logging
import uuid
from functools import lru_cache
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
logger = logging.getLogger("dify.telemetry")
@lru_cache(maxsize=4096)
def compute_trace_id_hex(uuid_str: str | None) -> str:
"""Convert a business UUID string to a 32-hex OTEL-compatible trace_id.
Returns empty string when *uuid_str* is ``None`` or invalid.
"""
if not uuid_str:
return ""
normalized = uuid_str.strip().lower()
if len(normalized) == 32 and all(ch in "0123456789abcdef" for ch in normalized):
return normalized
try:
return f"{uuid.UUID(normalized).int:032x}"
except (ValueError, AttributeError):
return ""
@lru_cache(maxsize=4096)
def compute_span_id_hex(uuid_str: str | None) -> str:
if not uuid_str:
return ""
normalized = uuid_str.strip().lower()
if len(normalized) == 16 and all(ch in "0123456789abcdef" for ch in normalized):
return normalized
try:
from enterprise.telemetry.id_generator import compute_deterministic_span_id
return f"{compute_deterministic_span_id(normalized):016x}"
except (ValueError, AttributeError):
return ""
def emit_telemetry_log(
*,
event_name: str | EnterpriseTelemetryEvent,
attributes: dict[str, Any],
signal: str = "metric_only",
trace_id_source: str | None = None,
span_id_source: str | None = None,
tenant_id: str | None = None,
user_id: str | None = None,
) -> None:
"""Emit a structured log line for a telemetry event.
Parameters
----------
event_name:
Canonical event name, e.g. ``"dify.workflow.run"``.
attributes:
All event-specific attributes (already built by the caller).
signal:
``"metric_only"`` for events with no span, ``"span_detail"``
for detail logs accompanying a slim span.
trace_id_source:
A UUID string (e.g. ``workflow_run_id``) used to derive a 32-hex
trace_id for cross-signal correlation.
tenant_id:
Tenant identifier (for the ``IdentityContextFilter``).
user_id:
User identifier (for the ``IdentityContextFilter``).
"""
if not logger.isEnabledFor(logging.INFO):
return
attrs = {
"dify.event.name": event_name,
"dify.event.signal": signal,
**attributes,
}
extra: dict[str, Any] = {"attributes": attrs}
trace_id_hex = compute_trace_id_hex(trace_id_source)
if trace_id_hex:
extra["trace_id"] = trace_id_hex
span_id_hex = compute_span_id_hex(span_id_source)
if span_id_hex:
extra["span_id"] = span_id_hex
if tenant_id:
extra["tenant_id"] = tenant_id
if user_id:
extra["user_id"] = user_id
logger.info("telemetry.%s", signal, extra=extra)
def emit_metric_only_event(
*,
event_name: str | EnterpriseTelemetryEvent,
attributes: dict[str, Any],
trace_id_source: str | None = None,
span_id_source: str | None = None,
tenant_id: str | None = None,
user_id: str | None = None,
) -> None:
emit_telemetry_log(
event_name=event_name,
attributes=attributes,
signal="metric_only",
trace_id_source=trace_id_source,
span_id_source=span_id_source,
tenant_id=tenant_id,
user_id=user_id,
)

View File

@@ -3,6 +3,12 @@ from blinker import signal
# sender: app
app_was_created = signal("app-was-created")
# sender: app
app_was_deleted = signal("app-was-deleted")
# sender: app
app_was_updated = signal("app-was-updated")
# sender: app, kwargs: app_model_config
app_model_config_was_updated = signal("app-model-config-was-updated")

View File

@@ -0,0 +1,4 @@
from blinker import signal
# sender: MessageFeedback, kwargs: tenant_id
feedback_was_created = signal("feedback-was-created")

View File

@@ -0,0 +1,58 @@
"""Flask extension for enterprise telemetry lifecycle management.
Initializes the EnterpriseExporter and TelemetryGateway singletons during
``create_app()`` (single-threaded), registers blinker event handlers,
and hooks atexit for graceful shutdown.
Skipped entirely when ``ENTERPRISE_ENABLED`` and ``ENTERPRISE_TELEMETRY_ENABLED``
are false (``is_enabled()`` gate).
"""
from __future__ import annotations
import atexit
import logging
from typing import TYPE_CHECKING
from configs import dify_config
if TYPE_CHECKING:
from dify_app import DifyApp
from enterprise.telemetry.exporter import EnterpriseExporter
from enterprise.telemetry.gateway import TelemetryGateway
logger = logging.getLogger(__name__)
_exporter: EnterpriseExporter | None = None
_gateway: TelemetryGateway | None = None
def is_enabled() -> bool:
return bool(dify_config.ENTERPRISE_ENABLED and dify_config.ENTERPRISE_TELEMETRY_ENABLED)
def init_app(app: DifyApp) -> None:
global _exporter, _gateway
if not is_enabled():
return
from enterprise.telemetry.exporter import EnterpriseExporter
from enterprise.telemetry.gateway import TelemetryGateway
_exporter = EnterpriseExporter(dify_config)
_gateway = TelemetryGateway()
atexit.register(_exporter.shutdown)
# Import to trigger @signal.connect decorator registration
import enterprise.telemetry.event_handlers # noqa: F401 # type: ignore[reportUnusedImport]
logger.info("Enterprise telemetry initialized")
def get_enterprise_exporter() -> EnterpriseExporter | None:
return _exporter
def get_gateway() -> TelemetryGateway | None:
return _gateway

View File

@@ -21,3 +21,15 @@ class DifySpanAttributes:
INVOKE_FROM = "dify.invoke_from"
"""Invocation source, e.g. SERVICE_API, WEB_APP, DEBUGGER."""
INVOKED_BY = "dify.invoked_by"
"""Invoked by, e.g. end_user, account, user."""
USAGE_INPUT_TOKENS = "gen_ai.usage.input_tokens"
"""Number of input tokens (prompt tokens) used."""
USAGE_OUTPUT_TOKENS = "gen_ai.usage.output_tokens"
"""Number of output tokens (completion tokens) generated."""
USAGE_TOTAL_TOKENS = "gen_ai.usage.total_tokens"
"""Total number of tokens used."""

View File

@@ -14,7 +14,7 @@ from core.model_runtime.entities.model_entities import ModelPropertyKey, ModelTy
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.tools.tool_manager import ToolManager
from core.tools.utils.configuration import ToolParameterConfigurationManager
from events.app_event import app_was_created
from events.app_event import app_was_created, app_was_deleted
from extensions.ext_database import db
from libs.datetime_utils import naive_utc_now
from libs.login import current_user
@@ -340,6 +340,8 @@ class AppService:
db.session.delete(app)
db.session.commit()
app_was_deleted.send(app)
# clean up web app settings
if FeatureService.get_system_features().webapp_auth.enabled:
EnterpriseService.WebAppAuth.cleanup_webapp(app.id)

View File

@@ -7,9 +7,10 @@ from core.llm_generator.llm_generator import LLMGenerator
from core.memory.token_buffer_memory import TokenBufferMemory
from core.model_manager import ModelManager
from core.model_runtime.entities.model_entities import ModelType
from core.ops.entities.trace_entity import TraceTaskName
from core.ops.ops_trace_manager import TraceQueueManager, TraceTask
from core.ops.utils import measure_time
from core.telemetry import TelemetryContext, TelemetryEvent, TraceTaskName
from core.telemetry import emit as telemetry_emit
from events.feedback_event import feedback_was_created
from extensions.ext_database import db
from libs.infinite_scroll_pagination import InfiniteScrollPagination
from models import Account
@@ -179,6 +180,9 @@ class MessageService:
db.session.commit()
if feedback and rating:
feedback_was_created.send(feedback, tenant_id=app_model.tenant_id)
return feedback
@classmethod
@@ -294,10 +298,15 @@ class MessageService:
questions: list[str] = list(questions_sequence)
# get tracing instance
trace_manager = TraceQueueManager(app_id=app_model.id)
trace_manager.add_trace_task(
TraceTask(
TraceTaskName.SUGGESTED_QUESTION_TRACE, message_id=message_id, suggested_question=questions, timer=timer
telemetry_emit(
TelemetryEvent(
name=TraceTaskName.SUGGESTED_QUESTION_TRACE,
context=TelemetryContext(tenant_id=app_model.tenant_id, app_id=app_model.id),
payload={
"message_id": message_id,
"suggested_question": questions,
"timer": timer,
},
)
)

View File

@@ -1,3 +1,4 @@
import logging
from typing import Any
from core.ops.entities.config_entity import BaseTracingConfig
@@ -5,6 +6,8 @@ from core.ops.ops_trace_manager import OpsTraceManager, provider_config_map
from extensions.ext_database import db
from models.model import App, TraceAppConfig
logger = logging.getLogger(__name__)
class OpsService:
@classmethod
@@ -135,12 +138,13 @@ class OpsService:
return trace_config_data.to_dict()
@classmethod
def create_tracing_app_config(cls, app_id: str, tracing_provider: str, tracing_config: dict):
def create_tracing_app_config(cls, app_id: str, tracing_provider: str, tracing_config: dict, account_id: str):
"""
Create tracing app config
:param app_id: app id
:param tracing_provider: tracing provider
:param tracing_config: tracing config
:param account_id: account id of the user creating the config
:return:
"""
try:
@@ -207,15 +211,19 @@ class OpsService:
db.session.add(trace_config_data)
db.session.commit()
# Log the creation with modifier information
logger.info("Trace config created: app_id=%s, provider=%s, created_by=%s", app_id, tracing_provider, account_id)
return {"result": "success"}
@classmethod
def update_tracing_app_config(cls, app_id: str, tracing_provider: str, tracing_config: dict):
def update_tracing_app_config(cls, app_id: str, tracing_provider: str, tracing_config: dict, account_id: str):
"""
Update tracing app config
:param app_id: app id
:param tracing_provider: tracing provider
:param tracing_config: tracing config
:param account_id: account id of the user updating the config
:return:
"""
try:
@@ -251,14 +259,18 @@ class OpsService:
current_trace_config.tracing_config = tracing_config
db.session.commit()
# Log the update with modifier information
logger.info("Trace config updated: app_id=%s, provider=%s, updated_by=%s", app_id, tracing_provider, account_id)
return current_trace_config.to_dict()
@classmethod
def delete_tracing_app_config(cls, app_id: str, tracing_provider: str):
def delete_tracing_app_config(cls, app_id: str, tracing_provider: str, account_id: str):
"""
Delete tracing app config
:param app_id: app id
:param tracing_provider: tracing provider
:param account_id: account id of the user deleting the config
:return:
"""
trace_config = (
@@ -270,6 +282,9 @@ class OpsService:
if not trace_config:
return None
# Log the deletion with modifier information
logger.info("Trace config deleted: app_id=%s, provider=%s, deleted_by=%s", app_id, tracing_provider, account_id)
db.session.delete(trace_config)
db.session.commit()

View File

@@ -27,6 +27,7 @@ from core.workflow.nodes.start.entities import StartNodeData
from core.workflow.runtime import VariablePool
from core.workflow.system_variable import SystemVariable
from core.workflow.workflow_entry import WorkflowEntry
from enterprise.telemetry.draft_trace import enqueue_draft_node_execution_trace
from enums.cloud_plan import CloudPlan
from events.app_event import app_draft_workflow_was_synced, app_published_workflow_was_updated
from extensions.ext_database import db
@@ -647,6 +648,7 @@ class WorkflowService:
node_config = draft_workflow.get_node_config_by_id(node_id)
node_type = Workflow.get_node_type_from_node_config(node_config)
node_data = node_config.get("data", {})
workflow_execution_id: str | None = None
if node_type.is_start_node:
with Session(bind=db.engine) as session, session.begin():
draft_var_srv = WorkflowDraftVariableService(session)
@@ -672,10 +674,13 @@ class WorkflowService:
node_type=node_type,
conversation_id=conversation_id,
)
workflow_execution_id = variable_pool.system_variables.workflow_execution_id
else:
workflow_execution_id = str(uuid.uuid4())
system_variable = SystemVariable(workflow_execution_id=workflow_execution_id)
variable_pool = VariablePool(
system_variables=SystemVariable.default(),
system_variables=system_variable,
user_inputs=user_inputs,
environment_variables=draft_workflow.environment_variables,
conversation_variables=[],
@@ -729,6 +734,13 @@ class WorkflowService:
with Session(db.engine) as session:
outputs = workflow_node_execution.load_full_outputs(session, storage)
enqueue_draft_node_execution_trace(
execution=workflow_node_execution,
outputs=outputs,
workflow_execution_id=workflow_execution_id,
user_id=account.id,
)
with Session(bind=db.engine) as session, session.begin():
draft_var_saver = DraftVariableSaver(
session=session,
@@ -784,19 +796,20 @@ class WorkflowService:
Returns:
WorkflowNodeExecution: The execution result
"""
created_at = naive_utc_now()
node, node_run_result, run_succeeded, error = self._execute_node_safely(invoke_node_fn)
finished_at = naive_utc_now()
# Create base node execution
node_execution = WorkflowNodeExecution(
id=str(uuid.uuid4()),
id=node.execution_id or str(uuid.uuid4()),
workflow_id="", # Single-step execution has no workflow ID
index=1,
node_id=node_id,
node_type=node.node_type,
title=node.title,
elapsed_time=time.perf_counter() - start_at,
created_at=naive_utc_now(),
finished_at=naive_utc_now(),
created_at=created_at,
finished_at=finished_at,
)
# Populate execution result data

View File

@@ -0,0 +1,52 @@
"""Celery worker for enterprise metric/log telemetry events.
This module defines the Celery task that processes telemetry envelopes
from the enterprise_telemetry queue. It deserializes envelopes and
dispatches them to the EnterpriseMetricHandler.
"""
import json
import logging
from celery import shared_task
from enterprise.telemetry.contracts import TelemetryEnvelope
from enterprise.telemetry.metric_handler import EnterpriseMetricHandler
logger = logging.getLogger(__name__)
@shared_task(queue="enterprise_telemetry")
def process_enterprise_telemetry(envelope_json: str) -> None:
"""Process enterprise metric/log telemetry envelope.
This task is enqueued by the TelemetryGateway for metric/log-only
events. It deserializes the envelope and dispatches to the handler.
Best-effort processing: logs errors but never raises, to avoid
failing user requests due to telemetry issues.
Args:
envelope_json: JSON-serialized TelemetryEnvelope.
"""
try:
# Deserialize envelope
envelope_dict = json.loads(envelope_json)
envelope = TelemetryEnvelope.model_validate(envelope_dict)
# Process through handler
handler = EnterpriseMetricHandler()
handler.handle(envelope)
logger.debug(
"Successfully processed telemetry envelope: tenant_id=%s, event_id=%s, case=%s",
envelope.tenant_id,
envelope.event_id,
envelope.case,
)
except Exception:
# Best-effort: log and drop on error, never fail user request
logger.warning(
"Failed to process enterprise telemetry envelope, dropping event",
exc_info=True,
)

View File

@@ -39,12 +39,24 @@ def process_trace_tasks(file_info):
trace_info["documents"] = [Document.model_validate(doc) for doc in trace_info["documents"]]
try:
trace_type = trace_info_info_map.get(trace_info_type)
if trace_type:
trace_info = trace_type(**trace_info)
from extensions.ext_enterprise_telemetry import is_enabled as is_ee_telemetry_enabled
if is_ee_telemetry_enabled():
from enterprise.telemetry.enterprise_trace import EnterpriseOtelTrace
try:
EnterpriseOtelTrace().trace(trace_info)
except Exception:
logger.warning("Enterprise trace failed for app_id: %s", app_id, exc_info=True)
if trace_instance:
with current_app.app_context():
trace_type = trace_info_info_map.get(trace_info_type)
if trace_type:
trace_info = trace_type(**trace_info)
trace_instance.trace(trace_info)
logger.info("Processing trace tasks success, app_id: %s", app_id)
except Exception as e:
logger.info("error:\n\n\n%s\n\n\n\n", e)
@@ -52,4 +64,12 @@ def process_trace_tasks(file_info):
redis_client.incr(failed_key)
logger.info("Processing trace tasks failed, app_id: %s", app_id)
finally:
storage.delete(file_path)
try:
storage.delete(file_path)
except Exception as e:
logger.warning(
"Failed to delete trace file %s for app_id %s: %s",
file_path,
app_id,
e,
)

View File

@@ -0,0 +1,200 @@
"""Unit tests for TraceQueueManager telemetry guard.
This test suite verifies that TraceQueueManager correctly drops trace tasks
when telemetry is disabled, proving Bug 1 from code review is a false positive.
The guard logic moved from persistence.py to TraceQueueManager.add_trace_task()
at line 1282 of ops_trace_manager.py:
if self._enterprise_telemetry_enabled or self.trace_instance:
trace_task.app_id = self.app_id
trace_manager_queue.put(trace_task)
Tasks are only enqueued if EITHER:
- Enterprise telemetry is enabled (_enterprise_telemetry_enabled=True), OR
- A third-party trace instance (Langfuse, etc.) is configured
When BOTH are false, tasks are silently dropped (correct behavior).
"""
import queue
import sys
import types
from unittest.mock import MagicMock, patch
import pytest
@pytest.fixture
def trace_queue_manager_and_task(monkeypatch):
"""Fixture to provide TraceQueueManager and TraceTask with delayed imports."""
module_name = "core.ops.ops_trace_manager"
if module_name not in sys.modules:
ops_stub = types.ModuleType(module_name)
class StubTraceTask:
def __init__(self, trace_type):
self.trace_type = trace_type
self.app_id = None
class StubTraceQueueManager:
def __init__(self, app_id=None):
self.app_id = app_id
from core.telemetry import is_enterprise_telemetry_enabled
self._enterprise_telemetry_enabled = is_enterprise_telemetry_enabled()
self.trace_instance = StubOpsTraceManager.get_ops_trace_instance(app_id)
def add_trace_task(self, trace_task):
if self._enterprise_telemetry_enabled or self.trace_instance:
trace_task.app_id = self.app_id
from core.ops.ops_trace_manager import trace_manager_queue
trace_manager_queue.put(trace_task)
class StubOpsTraceManager:
@staticmethod
def get_ops_trace_instance(app_id):
return None
ops_stub.TraceQueueManager = StubTraceQueueManager
ops_stub.TraceTask = StubTraceTask
ops_stub.OpsTraceManager = StubOpsTraceManager
ops_stub.trace_manager_queue = MagicMock(spec=queue.Queue)
monkeypatch.setitem(sys.modules, module_name, ops_stub)
from core.ops.entities.trace_entity import TraceTaskName
ops_module = __import__(module_name, fromlist=["TraceQueueManager", "TraceTask"])
TraceQueueManager = ops_module.TraceQueueManager
TraceTask = ops_module.TraceTask
return TraceQueueManager, TraceTask, TraceTaskName
class TestTraceQueueManagerTelemetryGuard:
"""Test TraceQueueManager's telemetry guard in add_trace_task()."""
def test_task_not_enqueued_when_telemetry_disabled_and_no_trace_instance(self, trace_queue_manager_and_task):
"""Verify task is NOT enqueued when telemetry disabled and no trace instance.
This is the core guard: when _enterprise_telemetry_enabled=False AND
trace_instance=None, the task should be silently dropped.
"""
TraceQueueManager, TraceTask, TraceTaskName = trace_queue_manager_and_task
mock_queue = MagicMock(spec=queue.Queue)
trace_task = TraceTask(trace_type=TraceTaskName.WORKFLOW_TRACE)
with (
patch("core.telemetry.is_enterprise_telemetry_enabled", return_value=False),
patch("core.ops.ops_trace_manager.OpsTraceManager.get_ops_trace_instance", return_value=None),
patch("core.ops.ops_trace_manager.trace_manager_queue", mock_queue),
):
manager = TraceQueueManager(app_id="test-app-id")
manager.add_trace_task(trace_task)
mock_queue.put.assert_not_called()
def test_task_enqueued_when_telemetry_enabled(self, trace_queue_manager_and_task):
"""Verify task IS enqueued when enterprise telemetry is enabled.
When _enterprise_telemetry_enabled=True, the task should be enqueued
regardless of trace_instance state.
"""
TraceQueueManager, TraceTask, TraceTaskName = trace_queue_manager_and_task
mock_queue = MagicMock(spec=queue.Queue)
trace_task = TraceTask(trace_type=TraceTaskName.WORKFLOW_TRACE)
with (
patch("core.telemetry.is_enterprise_telemetry_enabled", return_value=True),
patch("core.ops.ops_trace_manager.OpsTraceManager.get_ops_trace_instance", return_value=None),
patch("core.ops.ops_trace_manager.trace_manager_queue", mock_queue),
):
manager = TraceQueueManager(app_id="test-app-id")
manager.add_trace_task(trace_task)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.app_id == "test-app-id"
def test_task_enqueued_when_trace_instance_configured(self, trace_queue_manager_and_task):
"""Verify task IS enqueued when third-party trace instance is configured.
When trace_instance is not None (e.g., Langfuse configured), the task
should be enqueued even if enterprise telemetry is disabled.
"""
TraceQueueManager, TraceTask, TraceTaskName = trace_queue_manager_and_task
mock_queue = MagicMock(spec=queue.Queue)
mock_trace_instance = MagicMock()
trace_task = TraceTask(trace_type=TraceTaskName.WORKFLOW_TRACE)
with (
patch("core.telemetry.is_enterprise_telemetry_enabled", return_value=False),
patch(
"core.ops.ops_trace_manager.OpsTraceManager.get_ops_trace_instance", return_value=mock_trace_instance
),
patch("core.ops.ops_trace_manager.trace_manager_queue", mock_queue),
):
manager = TraceQueueManager(app_id="test-app-id")
manager.add_trace_task(trace_task)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.app_id == "test-app-id"
def test_task_enqueued_when_both_telemetry_and_trace_instance_enabled(self, trace_queue_manager_and_task):
"""Verify task IS enqueued when both telemetry and trace instance are enabled.
When both _enterprise_telemetry_enabled=True AND trace_instance is set,
the task should definitely be enqueued.
"""
TraceQueueManager, TraceTask, TraceTaskName = trace_queue_manager_and_task
mock_queue = MagicMock(spec=queue.Queue)
mock_trace_instance = MagicMock()
trace_task = TraceTask(trace_type=TraceTaskName.WORKFLOW_TRACE)
with (
patch("core.telemetry.is_enterprise_telemetry_enabled", return_value=True),
patch(
"core.ops.ops_trace_manager.OpsTraceManager.get_ops_trace_instance", return_value=mock_trace_instance
),
patch("core.ops.ops_trace_manager.trace_manager_queue", mock_queue),
):
manager = TraceQueueManager(app_id="test-app-id")
manager.add_trace_task(trace_task)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.app_id == "test-app-id"
def test_app_id_set_before_enqueue(self, trace_queue_manager_and_task):
"""Verify app_id is set on the task before enqueuing.
The guard logic sets trace_task.app_id = self.app_id before calling
trace_manager_queue.put(trace_task). This test verifies that behavior.
"""
TraceQueueManager, TraceTask, TraceTaskName = trace_queue_manager_and_task
mock_queue = MagicMock(spec=queue.Queue)
trace_task = TraceTask(trace_type=TraceTaskName.WORKFLOW_TRACE)
with (
patch("core.telemetry.is_enterprise_telemetry_enabled", return_value=True),
patch("core.ops.ops_trace_manager.OpsTraceManager.get_ops_trace_instance", return_value=None),
patch("core.ops.ops_trace_manager.trace_manager_queue", mock_queue),
):
manager = TraceQueueManager(app_id="expected-app-id")
manager.add_trace_task(trace_task)
called_task = mock_queue.put.call_args[0][0]
assert called_task.app_id == "expected-app-id"

View File

@@ -0,0 +1,181 @@
"""Unit tests for core.telemetry.emit() routing and enterprise-only filtering."""
from __future__ import annotations
import queue
import sys
import types
from unittest.mock import MagicMock, patch
import pytest
from core.ops.entities.trace_entity import TraceTaskName
from core.telemetry.events import TelemetryContext, TelemetryEvent
@pytest.fixture
def telemetry_test_setup(monkeypatch):
module_name = "core.ops.ops_trace_manager"
ops_stub = types.ModuleType(module_name)
class StubTraceTask:
def __init__(self, trace_type, **kwargs):
self.trace_type = trace_type
self.app_id = None
self.kwargs = kwargs
class StubTraceQueueManager:
def __init__(self, app_id=None, user_id=None):
self.app_id = app_id
self.user_id = user_id
self.trace_instance = StubOpsTraceManager.get_ops_trace_instance(app_id)
def add_trace_task(self, trace_task):
trace_task.app_id = self.app_id
from core.ops.ops_trace_manager import trace_manager_queue
trace_manager_queue.put(trace_task)
class StubOpsTraceManager:
@staticmethod
def get_ops_trace_instance(app_id):
return None
ops_stub.TraceQueueManager = StubTraceQueueManager
ops_stub.TraceTask = StubTraceTask
ops_stub.OpsTraceManager = StubOpsTraceManager
ops_stub.trace_manager_queue = MagicMock(spec=queue.Queue)
monkeypatch.setitem(sys.modules, module_name, ops_stub)
from core.telemetry import emit
return emit, ops_stub.trace_manager_queue
class TestTelemetryEmit:
@patch("core.telemetry._is_enterprise_telemetry_enabled", return_value=True)
def test_emit_enterprise_trace_creates_trace_task(self, _mock_ee, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
event = TelemetryEvent(
name=TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={"key": "value"},
)
emit_fn(event)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.trace_type == TraceTaskName.DRAFT_NODE_EXECUTION_TRACE
def test_emit_community_trace_enqueued(self, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
event = TelemetryEvent(
name=TraceTaskName.WORKFLOW_TRACE,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={},
)
emit_fn(event)
mock_queue.put.assert_called_once()
def test_emit_enterprise_only_trace_dropped_when_ee_disabled(self, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
event = TelemetryEvent(
name=TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={},
)
emit_fn(event)
mock_queue.put.assert_not_called()
@patch("core.telemetry._is_enterprise_telemetry_enabled", return_value=True)
def test_emit_all_enterprise_only_traces_allowed_when_ee_enabled(self, _mock_ee, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
enterprise_only_traces = [
TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
TraceTaskName.NODE_EXECUTION_TRACE,
TraceTaskName.PROMPT_GENERATION_TRACE,
]
for trace_name in enterprise_only_traces:
mock_queue.reset_mock()
event = TelemetryEvent(
name=trace_name,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={},
)
emit_fn(event)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.trace_type == trace_name
@patch("core.telemetry._is_enterprise_telemetry_enabled", return_value=True)
def test_emit_passes_name_directly_to_trace_task(self, _mock_ee, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
event = TelemetryEvent(
name=TraceTaskName.DRAFT_NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={"extra": "data"},
)
emit_fn(event)
mock_queue.put.assert_called_once()
called_task = mock_queue.put.call_args[0][0]
assert called_task.trace_type == TraceTaskName.DRAFT_NODE_EXECUTION_TRACE
assert isinstance(called_task.trace_type, TraceTaskName)
@patch("core.telemetry._is_enterprise_telemetry_enabled", return_value=True)
def test_emit_with_provided_trace_manager(self, _mock_ee, telemetry_test_setup):
emit_fn, mock_queue = telemetry_test_setup
mock_trace_manager = MagicMock()
mock_trace_manager.add_trace_task = MagicMock()
event = TelemetryEvent(
name=TraceTaskName.NODE_EXECUTION_TRACE,
context=TelemetryContext(
tenant_id="test-tenant",
user_id="test-user",
app_id="test-app",
),
payload={},
)
emit_fn(event, trace_manager=mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
called_task = mock_trace_manager.add_trace_task.call_args[0][0]
assert called_task.trace_type == TraceTaskName.NODE_EXECUTION_TRACE

View File

@@ -0,0 +1,252 @@
from __future__ import annotations
import sys
from unittest.mock import MagicMock, patch
import pytest
from core.telemetry import is_enterprise_telemetry_enabled
from enterprise.telemetry.contracts import TelemetryCase
from enterprise.telemetry.gateway import TelemetryGateway
class TestTelemetryCoreExports:
def test_is_enterprise_telemetry_enabled_exported(self) -> None:
from core.telemetry import is_enterprise_telemetry_enabled as exported_func
assert callable(exported_func)
@pytest.fixture
def mock_ops_trace_manager():
mock_module = MagicMock()
mock_trace_task_class = MagicMock()
mock_trace_task_class.return_value = MagicMock()
mock_module.TraceTask = mock_trace_task_class
mock_module.TraceQueueManager = MagicMock()
mock_trace_entity = MagicMock()
mock_trace_task_name = MagicMock()
mock_trace_task_name.return_value = "workflow"
mock_trace_entity.TraceTaskName = mock_trace_task_name
with (
patch.dict(sys.modules, {"core.ops.ops_trace_manager": mock_module}),
patch.dict(sys.modules, {"core.ops.entities.trace_entity": mock_trace_entity}),
):
yield mock_module, mock_trace_entity
class TestGatewayIntegrationTraceRouting:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
@pytest.fixture
def mock_trace_manager(self) -> MagicMock:
return MagicMock()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_ce_eligible_trace_routed_to_trace_manager(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=True):
context = {"app_id": "app-123", "user_id": "user-456", "tenant_id": "tenant-789"}
payload = {"workflow_run_id": "run-abc"}
gateway.emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_ce_eligible_trace_routed_when_ee_disabled(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"workflow_run_id": "run-abc"}
gateway.emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_enterprise_only_trace_dropped_when_ee_disabled(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_id": "node-abc"}
gateway.emit(TelemetryCase.NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_not_called()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_enterprise_only_trace_routed_when_ee_enabled(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=True):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_id": "node-abc"}
gateway.emit(TelemetryCase.NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
class TestGatewayIntegrationMetricRouting:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
def test_metric_case_routes_to_celery_task(
self,
gateway: TelemetryGateway,
) -> None:
from enterprise.telemetry.contracts import TelemetryEnvelope
with patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay") as mock_delay:
context = {"tenant_id": "tenant-123"}
payload = {"app_id": "app-abc", "name": "My App"}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
mock_delay.assert_called_once()
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.case == TelemetryCase.APP_CREATED
assert envelope.tenant_id == "tenant-123"
assert envelope.payload["app_id"] == "app-abc"
def test_tool_execution_metric_routed(
self,
gateway: TelemetryGateway,
) -> None:
from enterprise.telemetry.contracts import TelemetryEnvelope
with patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay") as mock_delay:
context = {"tenant_id": "tenant-123", "app_id": "app-123"}
payload = {"tool_name": "test_tool", "tool_inputs": {}, "tool_outputs": "result"}
gateway.emit(TelemetryCase.TOOL_EXECUTION, context, payload)
mock_delay.assert_called_once()
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.case == TelemetryCase.TOOL_EXECUTION
def test_moderation_check_metric_routed(
self,
gateway: TelemetryGateway,
) -> None:
from enterprise.telemetry.contracts import TelemetryEnvelope
with patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay") as mock_delay:
context = {"tenant_id": "tenant-123", "app_id": "app-123"}
payload = {"message_id": "msg-123", "moderation_result": {"flagged": False}}
gateway.emit(TelemetryCase.MODERATION_CHECK, context, payload)
mock_delay.assert_called_once()
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.case == TelemetryCase.MODERATION_CHECK
class TestGatewayIntegrationCEEligibility:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
@pytest.fixture
def mock_trace_manager(self) -> MagicMock:
return MagicMock()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_workflow_run_is_ce_eligible(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"workflow_run_id": "run-abc"}
gateway.emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_message_run_is_ce_eligible(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"message_id": "msg-abc", "conversation_id": "conv-123"}
gateway.emit(TelemetryCase.MESSAGE_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_node_execution_not_ce_eligible(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_id": "node-abc"}
gateway.emit(TelemetryCase.NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_not_called()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_draft_node_execution_not_ce_eligible(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_execution_data": {}}
gateway.emit(TelemetryCase.DRAFT_NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_not_called()
@pytest.mark.usefixtures("mock_ops_trace_manager")
def test_prompt_generation_not_ce_eligible(
self,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
) -> None:
with patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False):
context = {"app_id": "app-123", "user_id": "user-456", "tenant_id": "tenant-789"}
payload = {"operation_type": "generate", "instruction": "test"}
gateway.emit(TelemetryCase.PROMPT_GENERATION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_not_called()
class TestIsEnterpriseTelemetryEnabled:
def test_returns_false_when_exporter_import_fails(self) -> None:
with patch.dict(sys.modules, {"enterprise.telemetry.exporter": None}):
result = is_enterprise_telemetry_enabled()
assert result is False
def test_function_is_callable(self) -> None:
assert callable(is_enterprise_telemetry_enabled)

View File

@@ -0,0 +1,264 @@
"""Unit tests for telemetry gateway contracts."""
from __future__ import annotations
import pytest
from pydantic import ValidationError
from enterprise.telemetry.contracts import CaseRoute, SignalType, TelemetryCase, TelemetryEnvelope
from enterprise.telemetry.gateway import CASE_ROUTING
class TestTelemetryCase:
"""Tests for TelemetryCase enum."""
def test_all_cases_defined(self) -> None:
"""Verify all 14 telemetry cases are defined."""
expected_cases = {
"WORKFLOW_RUN",
"NODE_EXECUTION",
"DRAFT_NODE_EXECUTION",
"MESSAGE_RUN",
"TOOL_EXECUTION",
"MODERATION_CHECK",
"SUGGESTED_QUESTION",
"DATASET_RETRIEVAL",
"GENERATE_NAME",
"PROMPT_GENERATION",
"APP_CREATED",
"APP_UPDATED",
"APP_DELETED",
"FEEDBACK_CREATED",
}
actual_cases = {case.name for case in TelemetryCase}
assert actual_cases == expected_cases
def test_case_values(self) -> None:
"""Verify case enum values are correct."""
assert TelemetryCase.WORKFLOW_RUN.value == "workflow_run"
assert TelemetryCase.NODE_EXECUTION.value == "node_execution"
assert TelemetryCase.DRAFT_NODE_EXECUTION.value == "draft_node_execution"
assert TelemetryCase.MESSAGE_RUN.value == "message_run"
assert TelemetryCase.TOOL_EXECUTION.value == "tool_execution"
assert TelemetryCase.MODERATION_CHECK.value == "moderation_check"
assert TelemetryCase.SUGGESTED_QUESTION.value == "suggested_question"
assert TelemetryCase.DATASET_RETRIEVAL.value == "dataset_retrieval"
assert TelemetryCase.GENERATE_NAME.value == "generate_name"
assert TelemetryCase.PROMPT_GENERATION.value == "prompt_generation"
assert TelemetryCase.APP_CREATED.value == "app_created"
assert TelemetryCase.APP_UPDATED.value == "app_updated"
assert TelemetryCase.APP_DELETED.value == "app_deleted"
assert TelemetryCase.FEEDBACK_CREATED.value == "feedback_created"
class TestCaseRoute:
"""Tests for CaseRoute model."""
def test_valid_trace_route(self) -> None:
"""Verify valid trace route creation."""
route = CaseRoute(signal_type=SignalType.TRACE, ce_eligible=True)
assert route.signal_type == SignalType.TRACE
assert route.ce_eligible is True
def test_valid_metric_log_route(self) -> None:
"""Verify valid metric_log route creation."""
route = CaseRoute(signal_type=SignalType.METRIC_LOG, ce_eligible=False)
assert route.signal_type == SignalType.METRIC_LOG
assert route.ce_eligible is False
def test_invalid_signal_type(self) -> None:
"""Verify invalid signal_type is rejected."""
with pytest.raises(ValidationError):
CaseRoute(signal_type="invalid", ce_eligible=True)
class TestTelemetryEnvelope:
"""Tests for TelemetryEnvelope model."""
def test_valid_envelope_minimal(self) -> None:
"""Verify valid minimal envelope creation."""
envelope = TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
event_id="event-456",
payload={"key": "value"},
)
assert envelope.case == TelemetryCase.WORKFLOW_RUN
assert envelope.tenant_id == "tenant-123"
assert envelope.event_id == "event-456"
assert envelope.payload == {"key": "value"}
assert envelope.payload_fallback is None
assert envelope.metadata is None
def test_valid_envelope_full(self) -> None:
"""Verify valid envelope with all fields."""
metadata = {"source": "api"}
fallback = b"fallback data"
envelope = TelemetryEnvelope(
case=TelemetryCase.MESSAGE_RUN,
tenant_id="tenant-789",
event_id="event-012",
payload={"message": "hello"},
payload_fallback=fallback,
metadata=metadata,
)
assert envelope.case == TelemetryCase.MESSAGE_RUN
assert envelope.tenant_id == "tenant-789"
assert envelope.event_id == "event-012"
assert envelope.payload == {"message": "hello"}
assert envelope.payload_fallback == fallback
assert envelope.metadata == metadata
def test_missing_required_case(self) -> None:
"""Verify missing case field is rejected."""
with pytest.raises(ValidationError):
TelemetryEnvelope(
tenant_id="tenant-123",
event_id="event-456",
payload={"key": "value"},
)
def test_missing_required_tenant_id(self) -> None:
"""Verify missing tenant_id field is rejected."""
with pytest.raises(ValidationError):
TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
event_id="event-456",
payload={"key": "value"},
)
def test_missing_required_event_id(self) -> None:
"""Verify missing event_id field is rejected."""
with pytest.raises(ValidationError):
TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
payload={"key": "value"},
)
def test_missing_required_payload(self) -> None:
"""Verify missing payload field is rejected."""
with pytest.raises(ValidationError):
TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
event_id="event-456",
)
def test_payload_fallback_within_limit(self) -> None:
"""Verify payload_fallback within 64KB limit is accepted."""
fallback = b"x" * 65536
envelope = TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
event_id="event-456",
payload={"key": "value"},
payload_fallback=fallback,
)
assert envelope.payload_fallback == fallback
def test_payload_fallback_exceeds_limit(self) -> None:
"""Verify payload_fallback exceeding 64KB is rejected."""
fallback = b"x" * 65537
with pytest.raises(ValidationError) as exc_info:
TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
event_id="event-456",
payload={"key": "value"},
payload_fallback=fallback,
)
assert "64KB" in str(exc_info.value)
def test_payload_fallback_none(self) -> None:
"""Verify payload_fallback can be None."""
envelope = TelemetryEnvelope(
case=TelemetryCase.WORKFLOW_RUN,
tenant_id="tenant-123",
event_id="event-456",
payload={"key": "value"},
payload_fallback=None,
)
assert envelope.payload_fallback is None
class TestCaseRouting:
"""Tests for CASE_ROUTING table."""
def test_all_cases_routed(self) -> None:
"""Verify all 14 cases have routing entries."""
assert len(CASE_ROUTING) == 14
for case in TelemetryCase:
assert case in CASE_ROUTING
def test_trace_ce_eligible_cases(self) -> None:
"""Verify trace cases with CE eligibility."""
ce_eligible_trace_cases = {
TelemetryCase.WORKFLOW_RUN,
TelemetryCase.MESSAGE_RUN,
}
for case in ce_eligible_trace_cases:
route = CASE_ROUTING[case]
assert route.signal_type == SignalType.TRACE
assert route.ce_eligible is True
def test_trace_enterprise_only_cases(self) -> None:
"""Verify trace cases that are enterprise-only."""
enterprise_only_trace_cases = {
TelemetryCase.NODE_EXECUTION,
TelemetryCase.DRAFT_NODE_EXECUTION,
TelemetryCase.PROMPT_GENERATION,
}
for case in enterprise_only_trace_cases:
route = CASE_ROUTING[case]
assert route.signal_type == SignalType.TRACE
assert route.ce_eligible is False
def test_metric_log_cases(self) -> None:
"""Verify metric/log-only cases."""
metric_log_cases = {
TelemetryCase.APP_CREATED,
TelemetryCase.APP_UPDATED,
TelemetryCase.APP_DELETED,
TelemetryCase.FEEDBACK_CREATED,
TelemetryCase.TOOL_EXECUTION,
TelemetryCase.MODERATION_CHECK,
TelemetryCase.SUGGESTED_QUESTION,
TelemetryCase.DATASET_RETRIEVAL,
TelemetryCase.GENERATE_NAME,
}
for case in metric_log_cases:
route = CASE_ROUTING[case]
assert route.signal_type == SignalType.METRIC_LOG
assert route.ce_eligible is False
def test_routing_table_completeness(self) -> None:
"""Verify routing table covers all cases with correct types."""
trace_cases = {
TelemetryCase.WORKFLOW_RUN,
TelemetryCase.MESSAGE_RUN,
TelemetryCase.NODE_EXECUTION,
TelemetryCase.DRAFT_NODE_EXECUTION,
TelemetryCase.PROMPT_GENERATION,
}
metric_log_cases = {
TelemetryCase.APP_CREATED,
TelemetryCase.APP_UPDATED,
TelemetryCase.APP_DELETED,
TelemetryCase.FEEDBACK_CREATED,
TelemetryCase.TOOL_EXECUTION,
TelemetryCase.MODERATION_CHECK,
TelemetryCase.SUGGESTED_QUESTION,
TelemetryCase.DATASET_RETRIEVAL,
TelemetryCase.GENERATE_NAME,
}
all_cases = trace_cases | metric_log_cases
assert len(all_cases) == 14
assert all_cases == set(TelemetryCase)
for case in trace_cases:
assert CASE_ROUTING[case].signal_type == SignalType.TRACE
for case in metric_log_cases:
assert CASE_ROUTING[case].signal_type == SignalType.METRIC_LOG

View File

@@ -0,0 +1,134 @@
from unittest.mock import MagicMock, patch
import pytest
from enterprise.telemetry import event_handlers
from enterprise.telemetry.contracts import TelemetryCase
@pytest.fixture
def mock_exporter():
with patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock:
exporter = MagicMock()
mock.return_value = exporter
yield exporter
@pytest.fixture
def mock_task():
with patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry") as mock:
yield mock
def test_handle_app_created_calls_task(mock_exporter, mock_task):
sender = MagicMock()
sender.id = "app-123"
sender.tenant_id = "tenant-456"
sender.mode = "chat"
event_handlers._handle_app_created(sender)
mock_task.delay.assert_called_once()
call_args = mock_task.delay.call_args[0][0]
assert "app_created" in call_args
assert "tenant-456" in call_args
assert "app-123" in call_args
assert "chat" in call_args
def test_handle_app_created_no_exporter(mock_task):
with patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter", return_value=None):
sender = MagicMock()
sender.id = "app-123"
sender.tenant_id = "tenant-456"
event_handlers._handle_app_created(sender)
mock_task.delay.assert_not_called()
def test_handle_app_updated_calls_task(mock_exporter, mock_task):
sender = MagicMock()
sender.id = "app-123"
sender.tenant_id = "tenant-456"
event_handlers._handle_app_updated(sender)
mock_task.delay.assert_called_once()
call_args = mock_task.delay.call_args[0][0]
assert "app_updated" in call_args
assert "tenant-456" in call_args
assert "app-123" in call_args
def test_handle_app_deleted_calls_task(mock_exporter, mock_task):
sender = MagicMock()
sender.id = "app-123"
sender.tenant_id = "tenant-456"
event_handlers._handle_app_deleted(sender)
mock_task.delay.assert_called_once()
call_args = mock_task.delay.call_args[0][0]
assert "app_deleted" in call_args
assert "tenant-456" in call_args
assert "app-123" in call_args
def test_handle_feedback_created_calls_task(mock_exporter, mock_task):
sender = MagicMock()
sender.message_id = "msg-123"
sender.app_id = "app-456"
sender.conversation_id = "conv-789"
sender.from_end_user_id = "user-001"
sender.from_account_id = None
sender.rating = "like"
sender.from_source = "api"
sender.content = "Great response!"
event_handlers._handle_feedback_created(sender, tenant_id="tenant-456")
mock_task.delay.assert_called_once()
call_args = mock_task.delay.call_args[0][0]
assert "feedback_created" in call_args
assert "tenant-456" in call_args
assert "msg-123" in call_args
assert "app-456" in call_args
assert "conv-789" in call_args
assert "user-001" in call_args
assert "like" in call_args
assert "api" in call_args
assert "Great response!" in call_args
def test_handle_feedback_created_no_exporter(mock_task):
with patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter", return_value=None):
sender = MagicMock()
sender.message_id = "msg-123"
event_handlers._handle_feedback_created(sender, tenant_id="tenant-456")
mock_task.delay.assert_not_called()
def test_handlers_create_valid_envelopes(mock_exporter, mock_task):
import json
from enterprise.telemetry.contracts import TelemetryEnvelope
sender = MagicMock()
sender.id = "app-123"
sender.tenant_id = "tenant-456"
sender.mode = "chat"
event_handlers._handle_app_created(sender)
call_args = mock_task.delay.call_args[0][0]
envelope_dict = json.loads(call_args)
envelope = TelemetryEnvelope(**envelope_dict)
assert envelope.case == TelemetryCase.APP_CREATED
assert envelope.tenant_id == "tenant-456"
assert envelope.event_id
assert envelope.payload["app_id"] == "app-123"
assert envelope.payload["mode"] == "chat"

View File

@@ -0,0 +1,215 @@
"""Unit tests for EnterpriseExporter and _ExporterFactory."""
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
from configs.enterprise import EnterpriseTelemetryConfig
from enterprise.telemetry.exporter import EnterpriseExporter
def test_config_api_key_default_empty():
"""Test that ENTERPRISE_OTLP_API_KEY defaults to empty string."""
config = EnterpriseTelemetryConfig()
assert config.ENTERPRISE_OTLP_API_KEY == ""
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_api_key_only_injects_bearer_header(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that API key alone injects Bearer authorization header."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="test-secret-key",
)
EnterpriseExporter(mock_config)
# Verify span exporter was called with Bearer header
assert mock_span_exporter.call_args is not None
headers = mock_span_exporter.call_args.kwargs.get("headers")
assert headers is not None
assert ("authorization", "Bearer test-secret-key") in headers
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_empty_api_key_no_auth_header(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that empty API key does not inject authorization header."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="",
)
EnterpriseExporter(mock_config)
# Verify span exporter was called without authorization header
assert mock_span_exporter.call_args is not None
headers = mock_span_exporter.call_args.kwargs.get("headers")
# Headers should be None or not contain authorization
if headers is not None:
assert not any(key == "authorization" for key, _ in headers)
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_api_key_and_custom_headers_merge(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that API key and custom headers are merged correctly."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="x-custom=foo",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="test-key",
)
EnterpriseExporter(mock_config)
# Verify both headers are present
assert mock_span_exporter.call_args is not None
headers = mock_span_exporter.call_args.kwargs.get("headers")
assert headers is not None
assert ("authorization", "Bearer test-key") in headers
assert ("x-custom", "foo") in headers
@patch("enterprise.telemetry.exporter.logger")
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_api_key_overrides_conflicting_header(
mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock, mock_logger: MagicMock
) -> None:
"""Test that API key overrides conflicting authorization header and logs warning."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="authorization=Basic old",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="test-key",
)
EnterpriseExporter(mock_config)
# Verify Bearer header takes precedence
assert mock_span_exporter.call_args is not None
headers = mock_span_exporter.call_args.kwargs.get("headers")
assert headers is not None
assert ("authorization", "Bearer test-key") in headers
# Verify old authorization header is not present
assert ("authorization", "Basic old") not in headers
# Verify warning was logged
mock_logger.warning.assert_called_once()
assert mock_logger.warning.call_args is not None
warning_message = mock_logger.warning.call_args[0][0]
assert "ENTERPRISE_OTLP_API_KEY is set" in warning_message
assert "authorization" in warning_message
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_api_key_set_uses_secure_grpc(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that API key presence enables TLS (insecure=False) for gRPC."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="test-key",
)
EnterpriseExporter(mock_config)
# Verify insecure=False for both exporters
assert mock_span_exporter.call_args is not None
assert mock_span_exporter.call_args.kwargs["insecure"] is False
assert mock_metric_exporter.call_args is not None
assert mock_metric_exporter.call_args.kwargs["insecure"] is False
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_no_api_key_uses_insecure_grpc(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that empty API key uses insecure gRPC (backward compat)."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="http://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="",
)
EnterpriseExporter(mock_config)
# Verify insecure=True for both exporters
assert mock_span_exporter.call_args is not None
assert mock_span_exporter.call_args.kwargs["insecure"] is True
assert mock_metric_exporter.call_args is not None
assert mock_metric_exporter.call_args.kwargs["insecure"] is True
@patch("enterprise.telemetry.exporter.HTTPSpanExporter")
@patch("enterprise.telemetry.exporter.HTTPMetricExporter")
def test_insecure_not_passed_to_http_exporters(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that insecure parameter is not passed to HTTP exporters."""
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="http://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="http",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY="test-key",
)
EnterpriseExporter(mock_config)
# Verify insecure kwarg is NOT in HTTP exporter calls
assert mock_span_exporter.call_args is not None
assert "insecure" not in mock_span_exporter.call_args.kwargs
assert mock_metric_exporter.call_args is not None
assert "insecure" not in mock_metric_exporter.call_args.kwargs
@patch("enterprise.telemetry.exporter.GRPCSpanExporter")
@patch("enterprise.telemetry.exporter.GRPCMetricExporter")
def test_api_key_with_special_chars_preserved(mock_metric_exporter: MagicMock, mock_span_exporter: MagicMock) -> None:
"""Test that API key with special characters is preserved without mangling."""
special_key = "abc+def/ghi=jkl=="
mock_config = SimpleNamespace(
ENTERPRISE_OTLP_ENDPOINT="https://collector.example.com",
ENTERPRISE_OTLP_HEADERS="",
ENTERPRISE_OTLP_PROTOCOL="grpc",
ENTERPRISE_SERVICE_NAME="dify",
ENTERPRISE_OTEL_SAMPLING_RATE=1.0,
ENTERPRISE_INCLUDE_CONTENT=True,
ENTERPRISE_OTLP_API_KEY=special_key,
)
EnterpriseExporter(mock_config)
# Verify special characters are preserved in Bearer header
assert mock_span_exporter.call_args is not None
headers = mock_span_exporter.call_args.kwargs.get("headers")
assert headers is not None
assert ("authorization", f"Bearer {special_key}") in headers

View File

@@ -0,0 +1,301 @@
from __future__ import annotations
import sys
from unittest.mock import MagicMock, patch
import pytest
from core.ops.entities.trace_entity import TraceTaskName
from enterprise.telemetry.contracts import SignalType, TelemetryCase, TelemetryEnvelope
from enterprise.telemetry.gateway import (
CASE_ROUTING,
CASE_TO_TRACE_TASK,
PAYLOAD_SIZE_THRESHOLD_BYTES,
TelemetryGateway,
emit,
)
class TestCaseRoutingTable:
def test_all_cases_have_routing(self) -> None:
for case in TelemetryCase:
assert case in CASE_ROUTING, f"Missing routing for {case}"
def test_trace_cases(self) -> None:
trace_cases = [
TelemetryCase.WORKFLOW_RUN,
TelemetryCase.MESSAGE_RUN,
TelemetryCase.NODE_EXECUTION,
TelemetryCase.DRAFT_NODE_EXECUTION,
TelemetryCase.PROMPT_GENERATION,
]
for case in trace_cases:
assert CASE_ROUTING[case].signal_type is SignalType.TRACE, f"{case} should be trace"
def test_metric_log_cases(self) -> None:
metric_log_cases = [
TelemetryCase.APP_CREATED,
TelemetryCase.APP_UPDATED,
TelemetryCase.APP_DELETED,
TelemetryCase.FEEDBACK_CREATED,
TelemetryCase.TOOL_EXECUTION,
TelemetryCase.MODERATION_CHECK,
TelemetryCase.SUGGESTED_QUESTION,
TelemetryCase.DATASET_RETRIEVAL,
TelemetryCase.GENERATE_NAME,
]
for case in metric_log_cases:
assert CASE_ROUTING[case].signal_type is SignalType.METRIC_LOG, f"{case} should be metric_log"
def test_ce_eligible_cases(self) -> None:
ce_eligible_cases = [TelemetryCase.WORKFLOW_RUN, TelemetryCase.MESSAGE_RUN]
for case in ce_eligible_cases:
assert CASE_ROUTING[case].ce_eligible is True, f"{case} should be CE eligible"
def test_enterprise_only_cases(self) -> None:
enterprise_only_cases = [
TelemetryCase.NODE_EXECUTION,
TelemetryCase.DRAFT_NODE_EXECUTION,
TelemetryCase.PROMPT_GENERATION,
]
for case in enterprise_only_cases:
assert CASE_ROUTING[case].ce_eligible is False, f"{case} should be enterprise-only"
def test_trace_cases_have_task_name_mapping(self) -> None:
trace_cases = [c for c in TelemetryCase if CASE_ROUTING[c].signal_type is SignalType.TRACE]
for case in trace_cases:
assert case in CASE_TO_TRACE_TASK, f"Missing TraceTaskName mapping for {case}"
@pytest.fixture
def mock_ops_trace_manager():
mock_module = MagicMock()
mock_trace_task_class = MagicMock()
mock_trace_task_class.return_value = MagicMock()
mock_module.TraceTask = mock_trace_task_class
mock_module.TraceQueueManager = MagicMock()
mock_trace_entity = MagicMock()
mock_trace_task_name = MagicMock()
mock_trace_task_name.return_value = "workflow"
mock_trace_entity.TraceTaskName = mock_trace_task_name
with (
patch.dict(sys.modules, {"core.ops.ops_trace_manager": mock_module}),
patch.dict(sys.modules, {"core.ops.entities.trace_entity": mock_trace_entity}),
):
yield mock_module, mock_trace_entity
class TestTelemetryGatewayTraceRouting:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
@pytest.fixture
def mock_trace_manager(self) -> MagicMock:
return MagicMock()
@patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=True)
def test_trace_case_routes_to_trace_manager(
self,
_mock_ee_enabled: MagicMock,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
mock_ops_trace_manager: tuple[MagicMock, MagicMock],
) -> None:
context = {"app_id": "app-123", "user_id": "user-456", "tenant_id": "tenant-789"}
payload = {"workflow_run_id": "run-abc"}
gateway.emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False)
def test_ce_eligible_trace_enqueued_when_ee_disabled(
self,
_mock_ee_enabled: MagicMock,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
mock_ops_trace_manager: tuple[MagicMock, MagicMock],
) -> None:
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"workflow_run_id": "run-abc"}
gateway.emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
@patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=False)
def test_enterprise_only_trace_dropped_when_ee_disabled(
self,
_mock_ee_enabled: MagicMock,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
mock_ops_trace_manager: tuple[MagicMock, MagicMock],
) -> None:
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_id": "node-abc"}
gateway.emit(TelemetryCase.NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_not_called()
@patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=True)
def test_enterprise_only_trace_enqueued_when_ee_enabled(
self,
_mock_ee_enabled: MagicMock,
gateway: TelemetryGateway,
mock_trace_manager: MagicMock,
mock_ops_trace_manager: tuple[MagicMock, MagicMock],
) -> None:
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"node_id": "node-abc"}
gateway.emit(TelemetryCase.NODE_EXECUTION, context, payload, mock_trace_manager)
mock_trace_manager.add_trace_task.assert_called_once()
class TestTelemetryGatewayMetricLogRouting:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
@patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay")
def test_metric_case_routes_to_celery_task(
self,
mock_delay: MagicMock,
gateway: TelemetryGateway,
) -> None:
context = {"tenant_id": "tenant-123"}
payload = {"app_id": "app-abc", "name": "My App"}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
mock_delay.assert_called_once()
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.case == TelemetryCase.APP_CREATED
assert envelope.tenant_id == "tenant-123"
assert envelope.payload["app_id"] == "app-abc"
@patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay")
def test_envelope_has_unique_event_id(
self,
mock_delay: MagicMock,
gateway: TelemetryGateway,
) -> None:
context = {"tenant_id": "tenant-123"}
payload = {"app_id": "app-abc"}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
assert mock_delay.call_count == 2
envelope1 = TelemetryEnvelope.model_validate_json(mock_delay.call_args_list[0][0][0])
envelope2 = TelemetryEnvelope.model_validate_json(mock_delay.call_args_list[1][0][0])
assert envelope1.event_id != envelope2.event_id
class TestTelemetryGatewayPayloadSizing:
@pytest.fixture
def gateway(self) -> TelemetryGateway:
return TelemetryGateway()
@patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay")
def test_small_payload_inlined(
self,
mock_delay: MagicMock,
gateway: TelemetryGateway,
) -> None:
context = {"tenant_id": "tenant-123"}
payload = {"key": "small_value"}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.payload == payload
assert envelope.metadata is None
@patch("enterprise.telemetry.gateway.storage")
@patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay")
def test_large_payload_stored(
self,
mock_delay: MagicMock,
mock_storage: MagicMock,
gateway: TelemetryGateway,
) -> None:
context = {"tenant_id": "tenant-123"}
large_value = "x" * (PAYLOAD_SIZE_THRESHOLD_BYTES + 1000)
payload = {"key": large_value}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
mock_storage.save.assert_called_once()
storage_key = mock_storage.save.call_args[0][0]
assert storage_key.startswith("telemetry/tenant-123/")
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.payload == {}
assert envelope.metadata is not None
assert envelope.metadata["payload_ref"] == storage_key
@patch("enterprise.telemetry.gateway.storage")
@patch("tasks.enterprise_telemetry_task.process_enterprise_telemetry.delay")
def test_large_payload_fallback_on_storage_error(
self,
mock_delay: MagicMock,
mock_storage: MagicMock,
gateway: TelemetryGateway,
) -> None:
mock_storage.save.side_effect = Exception("Storage failure")
context = {"tenant_id": "tenant-123"}
large_value = "x" * (PAYLOAD_SIZE_THRESHOLD_BYTES + 1000)
payload = {"key": large_value}
gateway.emit(TelemetryCase.APP_CREATED, context, payload)
envelope_json = mock_delay.call_args[0][0]
envelope = TelemetryEnvelope.model_validate_json(envelope_json)
assert envelope.payload == payload
assert envelope.metadata is None
class TestModuleLevelFunctions:
@patch("extensions.ext_enterprise_telemetry.get_gateway")
@patch("enterprise.telemetry.gateway._is_enterprise_telemetry_enabled", return_value=True)
def test_emit_function_uses_gateway(
self,
_mock_ee_enabled: MagicMock,
mock_get_gateway: MagicMock,
mock_ops_trace_manager: tuple[MagicMock, MagicMock],
) -> None:
mock_gateway = TelemetryGateway()
mock_get_gateway.return_value = mock_gateway
mock_trace_manager = MagicMock()
context = {"app_id": "app-123", "user_id": "user-456"}
payload = {"workflow_run_id": "run-abc"}
with patch.object(mock_gateway, "emit") as mock_emit:
emit(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
mock_emit.assert_called_once_with(TelemetryCase.WORKFLOW_RUN, context, payload, mock_trace_manager)
class TestTraceTaskNameMapping:
def test_workflow_run_mapping(self) -> None:
assert CASE_TO_TRACE_TASK[TelemetryCase.WORKFLOW_RUN] is TraceTaskName.WORKFLOW_TRACE
def test_message_run_mapping(self) -> None:
assert CASE_TO_TRACE_TASK[TelemetryCase.MESSAGE_RUN] is TraceTaskName.MESSAGE_TRACE
def test_node_execution_mapping(self) -> None:
assert CASE_TO_TRACE_TASK[TelemetryCase.NODE_EXECUTION] is TraceTaskName.NODE_EXECUTION_TRACE
def test_draft_node_execution_mapping(self) -> None:
assert CASE_TO_TRACE_TASK[TelemetryCase.DRAFT_NODE_EXECUTION] is TraceTaskName.DRAFT_NODE_EXECUTION_TRACE
def test_prompt_generation_mapping(self) -> None:
assert CASE_TO_TRACE_TASK[TelemetryCase.PROMPT_GENERATION] is TraceTaskName.PROMPT_GENERATION_TRACE

View File

@@ -0,0 +1,474 @@
"""Unit tests for EnterpriseMetricHandler."""
from unittest.mock import MagicMock, patch
import pytest
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from enterprise.telemetry.metric_handler import EnterpriseMetricHandler
@pytest.fixture
def mock_redis():
with patch("enterprise.telemetry.metric_handler.redis_client") as mock:
yield mock
@pytest.fixture
def sample_envelope():
return TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id="test-tenant",
event_id="test-event-123",
payload={"app_id": "app-123", "name": "Test App"},
)
def test_dispatch_app_created(sample_envelope, mock_redis):
mock_redis.set.return_value = True
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_app_created") as mock_handler:
handler.handle(sample_envelope)
mock_handler.assert_called_once_with(sample_envelope)
def test_dispatch_app_updated(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_UPDATED,
tenant_id="test-tenant",
event_id="test-event-456",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_app_updated") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_app_deleted(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_DELETED,
tenant_id="test-tenant",
event_id="test-event-789",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_app_deleted") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_feedback_created(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.FEEDBACK_CREATED,
tenant_id="test-tenant",
event_id="test-event-abc",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_feedback_created") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_message_run(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.MESSAGE_RUN,
tenant_id="test-tenant",
event_id="test-event-msg",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_message_run") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_tool_execution(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.TOOL_EXECUTION,
tenant_id="test-tenant",
event_id="test-event-tool",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_tool_execution") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_moderation_check(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.MODERATION_CHECK,
tenant_id="test-tenant",
event_id="test-event-mod",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_moderation_check") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_suggested_question(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.SUGGESTED_QUESTION,
tenant_id="test-tenant",
event_id="test-event-sq",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_suggested_question") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_dataset_retrieval(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.DATASET_RETRIEVAL,
tenant_id="test-tenant",
event_id="test-event-ds",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_dataset_retrieval") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_generate_name(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.GENERATE_NAME,
tenant_id="test-tenant",
event_id="test-event-gn",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_generate_name") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_dispatch_prompt_generation(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.PROMPT_GENERATION,
tenant_id="test-tenant",
event_id="test-event-pg",
payload={},
)
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_prompt_generation") as mock_handler:
handler.handle(envelope)
mock_handler.assert_called_once_with(envelope)
def test_all_known_cases_have_handlers(mock_redis):
mock_redis.set.return_value = True
handler = EnterpriseMetricHandler()
for case in TelemetryCase:
envelope = TelemetryEnvelope(
case=case,
tenant_id="test-tenant",
event_id=f"test-{case.value}",
payload={},
)
handler.handle(envelope)
def test_idempotency_duplicate(sample_envelope, mock_redis):
mock_redis.set.return_value = None
handler = EnterpriseMetricHandler()
with patch.object(handler, "_on_app_created") as mock_handler:
handler.handle(sample_envelope)
mock_handler.assert_not_called()
def test_idempotency_first_seen(sample_envelope, mock_redis):
mock_redis.set.return_value = True
handler = EnterpriseMetricHandler()
is_dup = handler._is_duplicate(sample_envelope)
assert is_dup is False
mock_redis.set.assert_called_once_with(
"telemetry:dedup:test-tenant:test-event-123",
b"1",
nx=True,
ex=3600,
)
def test_idempotency_redis_failure_fails_open(sample_envelope, mock_redis, caplog):
mock_redis.set.side_effect = Exception("Redis unavailable")
handler = EnterpriseMetricHandler()
is_dup = handler._is_duplicate(sample_envelope)
assert is_dup is False
assert "Redis unavailable for deduplication check" in caplog.text
def test_rehydration_uses_payload(sample_envelope):
handler = EnterpriseMetricHandler()
payload = handler._rehydrate(sample_envelope)
assert payload == {"app_id": "app-123", "name": "Test App"}
def test_rehydration_fallback():
import pickle
fallback_data = {"fallback": "data"}
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id="test-tenant",
event_id="test-event-fb",
payload={},
payload_fallback=pickle.dumps(fallback_data),
)
handler = EnterpriseMetricHandler()
payload = handler._rehydrate(envelope)
assert payload == fallback_data
def test_rehydration_emits_degraded_event_on_failure():
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id="test-tenant",
event_id="test-event-fail",
payload={},
payload_fallback=None,
)
handler = EnterpriseMetricHandler()
with patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit:
payload = handler._rehydrate(envelope)
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
assert payload == {}
mock_emit.assert_called_once()
call_args = mock_emit.call_args
assert call_args[1]["event_name"] == EnterpriseTelemetryEvent.REHYDRATION_FAILED
assert call_args[1]["attributes"]["rehydration_failed"] is True
def test_on_app_created_emits_correct_event(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id="tenant-123",
event_id="event-456",
payload={"app_id": "app-789", "mode": "chat"},
)
handler = EnterpriseMetricHandler()
with (
patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock_get_exporter,
patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit,
):
mock_exporter = MagicMock()
mock_get_exporter.return_value = mock_exporter
handler._on_app_created(envelope)
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
mock_emit.assert_called_once_with(
event_name=EnterpriseTelemetryEvent.APP_CREATED,
attributes={
"dify.app.id": "app-789",
"dify.tenant_id": "tenant-123",
"dify.app.mode": "chat",
},
tenant_id="tenant-123",
)
from enterprise.telemetry.entities import EnterpriseTelemetryCounter
mock_exporter.increment_counter.assert_called_once()
call_args = mock_exporter.increment_counter.call_args
assert call_args[0][0] == EnterpriseTelemetryCounter.APP_CREATED
assert call_args[0][1] == 1
assert call_args[0][2]["tenant_id"] == "tenant-123"
assert call_args[0][2]["app_id"] == "app-789"
assert call_args[0][2]["mode"] == "chat"
def test_on_app_updated_emits_correct_event(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_UPDATED,
tenant_id="tenant-123",
event_id="event-456",
payload={"app_id": "app-789"},
)
handler = EnterpriseMetricHandler()
with (
patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock_get_exporter,
patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit,
):
mock_exporter = MagicMock()
mock_get_exporter.return_value = mock_exporter
handler._on_app_updated(envelope)
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
mock_emit.assert_called_once_with(
event_name=EnterpriseTelemetryEvent.APP_UPDATED,
attributes={
"dify.app.id": "app-789",
"dify.tenant_id": "tenant-123",
},
tenant_id="tenant-123",
)
from enterprise.telemetry.entities import EnterpriseTelemetryCounter
mock_exporter.increment_counter.assert_called_once()
call_args = mock_exporter.increment_counter.call_args
assert call_args[0][0] == EnterpriseTelemetryCounter.APP_UPDATED
assert call_args[0][1] == 1
assert call_args[0][2]["tenant_id"] == "tenant-123"
assert call_args[0][2]["app_id"] == "app-789"
def test_on_app_deleted_emits_correct_event(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_DELETED,
tenant_id="tenant-123",
event_id="event-456",
payload={"app_id": "app-789"},
)
handler = EnterpriseMetricHandler()
with (
patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock_get_exporter,
patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit,
):
mock_exporter = MagicMock()
mock_get_exporter.return_value = mock_exporter
handler._on_app_deleted(envelope)
from enterprise.telemetry.entities import EnterpriseTelemetryEvent
mock_emit.assert_called_once_with(
event_name=EnterpriseTelemetryEvent.APP_DELETED,
attributes={
"dify.app.id": "app-789",
"dify.tenant_id": "tenant-123",
},
tenant_id="tenant-123",
)
from enterprise.telemetry.entities import EnterpriseTelemetryCounter
mock_exporter.increment_counter.assert_called_once()
call_args = mock_exporter.increment_counter.call_args
assert call_args[0][0] == EnterpriseTelemetryCounter.APP_DELETED
assert call_args[0][1] == 1
assert call_args[0][2]["tenant_id"] == "tenant-123"
assert call_args[0][2]["app_id"] == "app-789"
def test_on_feedback_created_emits_correct_event(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.FEEDBACK_CREATED,
tenant_id="tenant-123",
event_id="event-456",
payload={
"message_id": "msg-001",
"app_id": "app-789",
"conversation_id": "conv-123",
"from_end_user_id": "user-456",
"from_account_id": None,
"rating": "like",
"from_source": "api",
"content": "Great!",
},
)
handler = EnterpriseMetricHandler()
with (
patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock_get_exporter,
patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit,
):
mock_exporter = MagicMock()
mock_exporter.include_content = True
mock_get_exporter.return_value = mock_exporter
handler._on_feedback_created(envelope)
mock_emit.assert_called_once()
call_args = mock_emit.call_args
assert call_args[1]["event_name"] == "dify.feedback.created"
assert call_args[1]["attributes"]["dify.message.id"] == "msg-001"
assert call_args[1]["attributes"]["dify.feedback.content"] == "Great!"
assert call_args[1]["tenant_id"] == "tenant-123"
assert call_args[1]["user_id"] == "user-456"
mock_exporter.increment_counter.assert_called_once()
counter_args = mock_exporter.increment_counter.call_args
assert counter_args[0][2]["app_id"] == "app-789"
assert counter_args[0][2]["rating"] == "like"
def test_on_feedback_created_without_content(mock_redis):
mock_redis.set.return_value = True
envelope = TelemetryEnvelope(
case=TelemetryCase.FEEDBACK_CREATED,
tenant_id="tenant-123",
event_id="event-456",
payload={
"message_id": "msg-001",
"app_id": "app-789",
"conversation_id": "conv-123",
"from_end_user_id": "user-456",
"from_account_id": None,
"rating": "like",
"from_source": "api",
"content": "Great!",
},
)
handler = EnterpriseMetricHandler()
with (
patch("extensions.ext_enterprise_telemetry.get_enterprise_exporter") as mock_get_exporter,
patch("enterprise.telemetry.telemetry_log.emit_metric_only_event") as mock_emit,
):
mock_exporter = MagicMock()
mock_exporter.include_content = False
mock_get_exporter.return_value = mock_exporter
handler._on_feedback_created(envelope)
mock_emit.assert_called_once()
call_args = mock_emit.call_args
assert "dify.feedback.content" not in call_args[1]["attributes"]

View File

@@ -0,0 +1,69 @@
"""Unit tests for enterprise telemetry Celery task."""
import json
from unittest.mock import MagicMock, patch
import pytest
from enterprise.telemetry.contracts import TelemetryCase, TelemetryEnvelope
from tasks.enterprise_telemetry_task import process_enterprise_telemetry
@pytest.fixture
def sample_envelope_json():
envelope = TelemetryEnvelope(
case=TelemetryCase.APP_CREATED,
tenant_id="test-tenant",
event_id="test-event-123",
payload={"app_id": "app-123"},
)
return envelope.model_dump_json()
def test_process_enterprise_telemetry_success(sample_envelope_json):
with patch("tasks.enterprise_telemetry_task.EnterpriseMetricHandler") as mock_handler_class:
mock_handler = MagicMock()
mock_handler_class.return_value = mock_handler
process_enterprise_telemetry(sample_envelope_json)
mock_handler.handle.assert_called_once()
call_args = mock_handler.handle.call_args[0][0]
assert isinstance(call_args, TelemetryEnvelope)
assert call_args.case == TelemetryCase.APP_CREATED
assert call_args.tenant_id == "test-tenant"
assert call_args.event_id == "test-event-123"
def test_process_enterprise_telemetry_invalid_json(caplog):
invalid_json = "not valid json"
process_enterprise_telemetry(invalid_json)
assert "Failed to process enterprise telemetry envelope" in caplog.text
def test_process_enterprise_telemetry_handler_exception(sample_envelope_json, caplog):
with patch("tasks.enterprise_telemetry_task.EnterpriseMetricHandler") as mock_handler_class:
mock_handler = MagicMock()
mock_handler.handle.side_effect = Exception("Handler error")
mock_handler_class.return_value = mock_handler
process_enterprise_telemetry(sample_envelope_json)
assert "Failed to process enterprise telemetry envelope" in caplog.text
def test_process_enterprise_telemetry_validation_error(caplog):
invalid_envelope = json.dumps(
{
"case": "INVALID_CASE",
"tenant_id": "test-tenant",
"event_id": "test-event",
"payload": {},
}
)
process_enterprise_telemetry(invalid_envelope)
assert "Failed to process enterprise telemetry envelope" in caplog.text

View File

@@ -1,4 +1,3 @@
import type { App } from '@/types/app'
import { fireEvent, render, screen, waitFor } from '@testing-library/react'
import { useRouter } from 'next/navigation'
import { afterAll, beforeEach, describe, expect, it, vi } from 'vitest'
@@ -14,8 +13,8 @@ import { getRedirection } from '@/utils/app-redirection'
import CreateAppModal from './index'
vi.mock('ahooks', () => ({
useDebounceFn: <T extends (...args: unknown[]) => unknown>(fn: T) => {
const run = (...args: Parameters<T>) => fn(...args)
useDebounceFn: (fn: (...args: any[]) => any) => {
const run = (...args: any[]) => fn(...args)
const cancel = vi.fn()
const flush = vi.fn()
return { run, cancel, flush }
@@ -84,7 +83,7 @@ describe('CreateAppModal', () => {
beforeEach(() => {
vi.clearAllMocks()
mockUseRouter.mockReturnValue({ push: mockPush } as unknown as ReturnType<typeof useRouter>)
mockUseRouter.mockReturnValue({ push: mockPush } as any)
mockUseProviderContext.mockReturnValue({
plan: {
type: AppModeEnum.ADVANCED_CHAT,
@@ -93,10 +92,10 @@ describe('CreateAppModal', () => {
reset: {},
},
enableBilling: true,
} as unknown as ReturnType<typeof useProviderContext>)
} as any)
mockUseAppContext.mockReturnValue({
isCurrentWorkspaceEditor: true,
} as unknown as ReturnType<typeof useAppContext>)
} as any)
mockSetItem.mockClear()
Object.defineProperty(window, 'localStorage', {
value: {
@@ -119,8 +118,8 @@ describe('CreateAppModal', () => {
})
it('creates an app, notifies success, and fires callbacks', async () => {
const mockApp: Partial<App> = { id: 'app-1', mode: AppModeEnum.ADVANCED_CHAT }
mockCreateApp.mockResolvedValue(mockApp as App)
const mockApp = { id: 'app-1', mode: AppModeEnum.ADVANCED_CHAT }
mockCreateApp.mockResolvedValue(mockApp as any)
const { onClose, onSuccess } = renderModal()
const nameInput = screen.getByPlaceholderText('app.newApp.appNamePlaceholder')

View File

@@ -216,22 +216,13 @@ describe('image-uploader utils', () => {
type FileCallback = (file: MockFile) => void
type EntriesCallback = (entries: FileSystemEntry[]) => void
// Helper to create mock FileSystemEntry with required properties
const createMockEntry = (props: {
isFile: boolean
isDirectory: boolean
name?: string
file?: (callback: FileCallback) => void
createReader?: () => { readEntries: (callback: EntriesCallback) => void }
}): FileSystemEntry => props as unknown as FileSystemEntry
it('should resolve with file array for file entry', async () => {
const mockFile: MockFile = { name: 'test.png' }
const mockEntry = createMockEntry({
const mockEntry = {
isFile: true,
isDirectory: false,
file: (callback: FileCallback) => callback(mockFile),
})
}
const result = await traverseFileEntry(mockEntry)
expect(result).toHaveLength(1)
@@ -241,11 +232,11 @@ describe('image-uploader utils', () => {
it('should resolve with file array with prefix for nested file', async () => {
const mockFile: MockFile = { name: 'test.png' }
const mockEntry = createMockEntry({
const mockEntry = {
isFile: true,
isDirectory: false,
file: (callback: FileCallback) => callback(mockFile),
})
}
const result = await traverseFileEntry(mockEntry, 'folder/')
expect(result).toHaveLength(1)
@@ -253,24 +244,24 @@ describe('image-uploader utils', () => {
})
it('should resolve empty array for unknown entry type', async () => {
const mockEntry = createMockEntry({
const mockEntry = {
isFile: false,
isDirectory: false,
})
}
const result = await traverseFileEntry(mockEntry)
expect(result).toEqual([])
})
it('should handle directory with no files', async () => {
const mockEntry = createMockEntry({
const mockEntry = {
isFile: false,
isDirectory: true,
name: 'empty-folder',
createReader: () => ({
readEntries: (callback: EntriesCallback) => callback([]),
}),
})
}
const result = await traverseFileEntry(mockEntry)
expect(result).toEqual([])
@@ -280,20 +271,20 @@ describe('image-uploader utils', () => {
const mockFile1: MockFile = { name: 'file1.png' }
const mockFile2: MockFile = { name: 'file2.png' }
const mockFileEntry1 = createMockEntry({
const mockFileEntry1 = {
isFile: true,
isDirectory: false,
file: (callback: FileCallback) => callback(mockFile1),
})
}
const mockFileEntry2 = createMockEntry({
const mockFileEntry2 = {
isFile: true,
isDirectory: false,
file: (callback: FileCallback) => callback(mockFile2),
})
}
let readCount = 0
const mockEntry = createMockEntry({
const mockEntry = {
isFile: false,
isDirectory: true,
name: 'folder',
@@ -301,14 +292,14 @@ describe('image-uploader utils', () => {
readEntries: (callback: EntriesCallback) => {
if (readCount === 0) {
readCount++
callback([mockFileEntry1, mockFileEntry2])
callback([mockFileEntry1, mockFileEntry2] as unknown as FileSystemEntry[])
}
else {
callback([])
}
},
}),
})
}
const result = await traverseFileEntry(mockEntry)
expect(result).toHaveLength(2)

View File

@@ -18,17 +18,17 @@ type FileWithPath = {
relativePath?: string
} & File
export const traverseFileEntry = (entry: FileSystemEntry, prefix = ''): Promise<FileWithPath[]> => {
export const traverseFileEntry = (entry: any, prefix = ''): Promise<FileWithPath[]> => {
return new Promise((resolve) => {
if (entry.isFile) {
(entry as FileSystemFileEntry).file((file: FileWithPath) => {
entry.file((file: FileWithPath) => {
file.relativePath = `${prefix}${file.name}`
resolve([file])
})
}
else if (entry.isDirectory) {
const reader = (entry as FileSystemDirectoryEntry).createReader()
const entries: FileSystemEntry[] = []
const reader = entry.createReader()
const entries: any[] = []
const read = () => {
reader.readEntries(async (results: FileSystemEntry[]) => {
if (!results.length) {

View File

@@ -1,218 +0,0 @@
'use client'
import { useDebounceFn } from 'ahooks'
import { useRouter } from 'next/navigation'
import { useCallback, useMemo, useRef, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { useContext } from 'use-context-selector'
import { ToastContext } from '@/app/components/base/toast'
import { usePluginDependencies } from '@/app/components/workflow/plugin-dependency/hooks'
import {
DSLImportMode,
DSLImportStatus,
} from '@/models/app'
import { useImportPipelineDSL, useImportPipelineDSLConfirm } from '@/service/use-pipeline'
export enum CreateFromDSLModalTab {
FROM_FILE = 'from-file',
FROM_URL = 'from-url',
}
export type UseDSLImportOptions = {
activeTab?: CreateFromDSLModalTab
dslUrl?: string
onSuccess?: () => void
onClose?: () => void
}
export type DSLVersions = {
importedVersion: string
systemVersion: string
}
export const useDSLImport = ({
activeTab = CreateFromDSLModalTab.FROM_FILE,
dslUrl = '',
onSuccess,
onClose,
}: UseDSLImportOptions) => {
const { push } = useRouter()
const { t } = useTranslation()
const { notify } = useContext(ToastContext)
const [currentFile, setDSLFile] = useState<File>()
const [fileContent, setFileContent] = useState<string>()
const [currentTab, setCurrentTab] = useState(activeTab)
const [dslUrlValue, setDslUrlValue] = useState(dslUrl)
const [showConfirmModal, setShowConfirmModal] = useState(false)
const [versions, setVersions] = useState<DSLVersions>()
const [importId, setImportId] = useState<string>()
const [isConfirming, setIsConfirming] = useState(false)
const { handleCheckPluginDependencies } = usePluginDependencies()
const isCreatingRef = useRef(false)
const { mutateAsync: importDSL } = useImportPipelineDSL()
const { mutateAsync: importDSLConfirm } = useImportPipelineDSLConfirm()
const readFile = useCallback((file: File) => {
const reader = new FileReader()
reader.onload = (event) => {
const content = event.target?.result
setFileContent(content as string)
}
reader.readAsText(file)
}, [])
const handleFile = useCallback((file?: File) => {
setDSLFile(file)
if (file)
readFile(file)
if (!file)
setFileContent('')
}, [readFile])
const onCreate = useCallback(async () => {
if (currentTab === CreateFromDSLModalTab.FROM_FILE && !currentFile)
return
if (currentTab === CreateFromDSLModalTab.FROM_URL && !dslUrlValue)
return
if (isCreatingRef.current)
return
isCreatingRef.current = true
let response
if (currentTab === CreateFromDSLModalTab.FROM_FILE) {
response = await importDSL({
mode: DSLImportMode.YAML_CONTENT,
yaml_content: fileContent || '',
})
}
if (currentTab === CreateFromDSLModalTab.FROM_URL) {
response = await importDSL({
mode: DSLImportMode.YAML_URL,
yaml_url: dslUrlValue || '',
})
}
if (!response) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
isCreatingRef.current = false
return
}
const { id, status, pipeline_id, dataset_id, imported_dsl_version, current_dsl_version } = response
if (status === DSLImportStatus.COMPLETED || status === DSLImportStatus.COMPLETED_WITH_WARNINGS) {
onSuccess?.()
onClose?.()
notify({
type: status === DSLImportStatus.COMPLETED ? 'success' : 'warning',
message: t(status === DSLImportStatus.COMPLETED ? 'creation.successTip' : 'creation.caution', { ns: 'datasetPipeline' }),
children: status === DSLImportStatus.COMPLETED_WITH_WARNINGS && t('newApp.appCreateDSLWarning', { ns: 'app' }),
})
if (pipeline_id)
await handleCheckPluginDependencies(pipeline_id, true)
push(`/datasets/${dataset_id}/pipeline`)
isCreatingRef.current = false
}
else if (status === DSLImportStatus.PENDING) {
setVersions({
importedVersion: imported_dsl_version ?? '',
systemVersion: current_dsl_version ?? '',
})
onClose?.()
setTimeout(() => {
setShowConfirmModal(true)
}, 300)
setImportId(id)
isCreatingRef.current = false
}
else {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
isCreatingRef.current = false
}
}, [
currentTab,
currentFile,
dslUrlValue,
fileContent,
importDSL,
notify,
t,
onSuccess,
onClose,
handleCheckPluginDependencies,
push,
])
const { run: handleCreateApp } = useDebounceFn(onCreate, { wait: 300 })
const onDSLConfirm = useCallback(async () => {
if (!importId)
return
setIsConfirming(true)
const response = await importDSLConfirm(importId)
setIsConfirming(false)
if (!response) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
return
}
const { status, pipeline_id, dataset_id } = response
if (status === DSLImportStatus.COMPLETED) {
onSuccess?.()
setShowConfirmModal(false)
notify({
type: 'success',
message: t('creation.successTip', { ns: 'datasetPipeline' }),
})
if (pipeline_id)
await handleCheckPluginDependencies(pipeline_id, true)
push(`/datasets/${dataset_id}/pipeline`)
}
else if (status === DSLImportStatus.FAILED) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
}
}, [importId, importDSLConfirm, notify, t, onSuccess, handleCheckPluginDependencies, push])
const handleCancelConfirm = useCallback(() => {
setShowConfirmModal(false)
}, [])
const buttonDisabled = useMemo(() => {
if (currentTab === CreateFromDSLModalTab.FROM_FILE)
return !currentFile
if (currentTab === CreateFromDSLModalTab.FROM_URL)
return !dslUrlValue
return false
}, [currentTab, currentFile, dslUrlValue])
return {
// State
currentFile,
currentTab,
dslUrlValue,
showConfirmModal,
versions,
buttonDisabled,
isConfirming,
// Actions
setCurrentTab,
setDslUrlValue,
handleFile,
handleCreateApp,
onDSLConfirm,
handleCancelConfirm,
}
}

View File

@@ -1,18 +1,24 @@
'use client'
import { useKeyPress } from 'ahooks'
import { useDebounceFn, useKeyPress } from 'ahooks'
import { noop } from 'es-toolkit/function'
import { useRouter } from 'next/navigation'
import { useMemo, useRef, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { useContext } from 'use-context-selector'
import Button from '@/app/components/base/button'
import Input from '@/app/components/base/input'
import Modal from '@/app/components/base/modal'
import DSLConfirmModal from './dsl-confirm-modal'
import { ToastContext } from '@/app/components/base/toast'
import { usePluginDependencies } from '@/app/components/workflow/plugin-dependency/hooks'
import {
DSLImportMode,
DSLImportStatus,
} from '@/models/app'
import { useImportPipelineDSL, useImportPipelineDSLConfirm } from '@/service/use-pipeline'
import Header from './header'
import { CreateFromDSLModalTab, useDSLImport } from './hooks/use-dsl-import'
import Tab from './tab'
import Uploader from './uploader'
export { CreateFromDSLModalTab }
type CreateFromDSLModalProps = {
show: boolean
onSuccess?: () => void
@@ -21,6 +27,11 @@ type CreateFromDSLModalProps = {
dslUrl?: string
}
export enum CreateFromDSLModalTab {
FROM_FILE = 'from-file',
FROM_URL = 'from-url',
}
const CreateFromDSLModal = ({
show,
onSuccess,
@@ -28,34 +39,150 @@ const CreateFromDSLModal = ({
activeTab = CreateFromDSLModalTab.FROM_FILE,
dslUrl = '',
}: CreateFromDSLModalProps) => {
const { push } = useRouter()
const { t } = useTranslation()
const { notify } = useContext(ToastContext)
const [currentFile, setDSLFile] = useState<File>()
const [fileContent, setFileContent] = useState<string>()
const [currentTab, setCurrentTab] = useState(activeTab)
const [dslUrlValue, setDslUrlValue] = useState(dslUrl)
const [showErrorModal, setShowErrorModal] = useState(false)
const [versions, setVersions] = useState<{ importedVersion: string, systemVersion: string }>()
const [importId, setImportId] = useState<string>()
const { handleCheckPluginDependencies } = usePluginDependencies()
const {
currentFile,
currentTab,
dslUrlValue,
showConfirmModal,
versions,
buttonDisabled,
isConfirming,
setCurrentTab,
setDslUrlValue,
handleFile,
handleCreateApp,
onDSLConfirm,
handleCancelConfirm,
} = useDSLImport({
activeTab,
dslUrl,
onSuccess,
onClose,
})
const readFile = (file: File) => {
const reader = new FileReader()
reader.onload = function (event) {
const content = event.target?.result
setFileContent(content as string)
}
reader.readAsText(file)
}
const handleFile = (file?: File) => {
setDSLFile(file)
if (file)
readFile(file)
if (!file)
setFileContent('')
}
const isCreatingRef = useRef(false)
const { mutateAsync: importDSL } = useImportPipelineDSL()
const onCreate = async () => {
if (currentTab === CreateFromDSLModalTab.FROM_FILE && !currentFile)
return
if (currentTab === CreateFromDSLModalTab.FROM_URL && !dslUrlValue)
return
if (isCreatingRef.current)
return
isCreatingRef.current = true
let response
if (currentTab === CreateFromDSLModalTab.FROM_FILE) {
response = await importDSL({
mode: DSLImportMode.YAML_CONTENT,
yaml_content: fileContent || '',
})
}
if (currentTab === CreateFromDSLModalTab.FROM_URL) {
response = await importDSL({
mode: DSLImportMode.YAML_URL,
yaml_url: dslUrlValue || '',
})
}
if (!response) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
isCreatingRef.current = false
return
}
const { id, status, pipeline_id, dataset_id, imported_dsl_version, current_dsl_version } = response
if (status === DSLImportStatus.COMPLETED || status === DSLImportStatus.COMPLETED_WITH_WARNINGS) {
if (onSuccess)
onSuccess()
if (onClose)
onClose()
notify({
type: status === DSLImportStatus.COMPLETED ? 'success' : 'warning',
message: t(status === DSLImportStatus.COMPLETED ? 'creation.successTip' : 'creation.caution', { ns: 'datasetPipeline' }),
children: status === DSLImportStatus.COMPLETED_WITH_WARNINGS && t('newApp.appCreateDSLWarning', { ns: 'app' }),
})
if (pipeline_id)
await handleCheckPluginDependencies(pipeline_id, true)
push(`/datasets/${dataset_id}/pipeline`)
isCreatingRef.current = false
}
else if (status === DSLImportStatus.PENDING) {
setVersions({
importedVersion: imported_dsl_version ?? '',
systemVersion: current_dsl_version ?? '',
})
if (onClose)
onClose()
setTimeout(() => {
setShowErrorModal(true)
}, 300)
setImportId(id)
isCreatingRef.current = false
}
else {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
isCreatingRef.current = false
}
}
const { run: handleCreateApp } = useDebounceFn(onCreate, { wait: 300 })
useKeyPress('esc', () => {
if (show && !showConfirmModal)
if (show && !showErrorModal)
onClose()
})
const { mutateAsync: importDSLConfirm } = useImportPipelineDSLConfirm()
const onDSLConfirm = async () => {
if (!importId)
return
const response = await importDSLConfirm(importId)
if (!response) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
return
}
const { status, pipeline_id, dataset_id } = response
if (status === DSLImportStatus.COMPLETED) {
if (onSuccess)
onSuccess()
if (onClose)
onClose()
notify({
type: 'success',
message: t('creation.successTip', { ns: 'datasetPipeline' }),
})
if (pipeline_id)
await handleCheckPluginDependencies(pipeline_id, true)
push(`datasets/${dataset_id}/pipeline`)
}
else if (status === DSLImportStatus.FAILED) {
notify({ type: 'error', message: t('creation.errorTip', { ns: 'datasetPipeline' }) })
}
}
const buttonDisabled = useMemo(() => {
if (currentTab === CreateFromDSLModalTab.FROM_FILE)
return !currentFile
if (currentTab === CreateFromDSLModalTab.FROM_URL)
return !dslUrlValue
return false
}, [currentTab, currentFile, dslUrlValue])
return (
<>
<Modal
@@ -69,25 +196,29 @@ const CreateFromDSLModal = ({
setCurrentTab={setCurrentTab}
/>
<div className="px-6 py-4">
{currentTab === CreateFromDSLModalTab.FROM_FILE && (
<Uploader
className="mt-0"
file={currentFile}
updateFile={handleFile}
/>
)}
{currentTab === CreateFromDSLModalTab.FROM_URL && (
<div>
<div className="system-md-semibold leading6 mb-1 text-text-secondary">
DSL URL
</div>
<Input
placeholder={t('importFromDSLUrlPlaceholder', { ns: 'app' }) || ''}
value={dslUrlValue}
onChange={e => setDslUrlValue(e.target.value)}
{
currentTab === CreateFromDSLModalTab.FROM_FILE && (
<Uploader
className="mt-0"
file={currentFile}
updateFile={handleFile}
/>
</div>
)}
)
}
{
currentTab === CreateFromDSLModalTab.FROM_URL && (
<div>
<div className="system-md-semibold leading6 mb-1 text-text-secondary">
DSL URL
</div>
<Input
placeholder={t('importFromDSLUrlPlaceholder', { ns: 'app' }) || ''}
value={dslUrlValue}
onChange={e => setDslUrlValue(e.target.value)}
/>
</div>
)
}
</div>
<div className="flex justify-end gap-x-2 p-6 pt-5">
<Button onClick={onClose}>
@@ -103,14 +234,32 @@ const CreateFromDSLModal = ({
</Button>
</div>
</Modal>
{showConfirmModal && (
<DSLConfirmModal
versions={versions}
onCancel={handleCancelConfirm}
onConfirm={onDSLConfirm}
confirmDisabled={isConfirming}
/>
)}
<Modal
isShow={showErrorModal}
onClose={() => setShowErrorModal(false)}
className="w-[480px]"
>
<div className="flex flex-col items-start gap-2 self-stretch pb-4">
<div className="title-2xl-semi-bold text-text-primary">{t('newApp.appCreateDSLErrorTitle', { ns: 'app' })}</div>
<div className="system-md-regular flex grow flex-col text-text-secondary">
<div>{t('newApp.appCreateDSLErrorPart1', { ns: 'app' })}</div>
<div>{t('newApp.appCreateDSLErrorPart2', { ns: 'app' })}</div>
<br />
<div>
{t('newApp.appCreateDSLErrorPart3', { ns: 'app' })}
<span className="system-md-medium">{versions?.importedVersion}</span>
</div>
<div>
{t('newApp.appCreateDSLErrorPart4', { ns: 'app' })}
<span className="system-md-medium">{versions?.systemVersion}</span>
</div>
</div>
</div>
<div className="flex items-start justify-end gap-2 self-stretch pt-6">
<Button variant="secondary" onClick={() => setShowErrorModal(false)}>{t('newApp.Cancel', { ns: 'app' })}</Button>
<Button variant="primary" destructive onClick={onDSLConfirm}>{t('newApp.Confirm', { ns: 'app' })}</Button>
</div>
</Modal>
</>
)
}

View File

@@ -1,334 +0,0 @@
import type { FileListItemProps } from './file-list-item'
import type { CustomFile as File, FileItem } from '@/models/datasets'
import { fireEvent, render, screen } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { PROGRESS_COMPLETE, PROGRESS_ERROR, PROGRESS_NOT_STARTED } from '../constants'
import FileListItem from './file-list-item'
// Mock theme hook - can be changed per test
let mockTheme = 'light'
vi.mock('@/hooks/use-theme', () => ({
default: () => ({ theme: mockTheme }),
}))
// Mock theme types
vi.mock('@/types/app', () => ({
Theme: { dark: 'dark', light: 'light' },
}))
// Mock SimplePieChart with dynamic import handling
vi.mock('next/dynamic', () => ({
default: () => {
const DynamicComponent = ({ percentage, stroke, fill }: { percentage: number, stroke: string, fill: string }) => (
<div data-testid="pie-chart" data-percentage={percentage} data-stroke={stroke} data-fill={fill}>
Pie Chart:
{' '}
{percentage}
%
</div>
)
DynamicComponent.displayName = 'SimplePieChart'
return DynamicComponent
},
}))
// Mock DocumentFileIcon
vi.mock('@/app/components/datasets/common/document-file-icon', () => ({
default: ({ name, extension, size }: { name: string, extension: string, size: string }) => (
<div data-testid="document-icon" data-name={name} data-extension={extension} data-size={size}>
Document Icon
</div>
),
}))
describe('FileListItem', () => {
const createMockFile = (overrides: Partial<File> = {}): File => ({
name: 'test-document.pdf',
size: 1024 * 100, // 100KB
type: 'application/pdf',
lastModified: Date.now(),
...overrides,
} as File)
const createMockFileItem = (overrides: Partial<FileItem> = {}): FileItem => ({
fileID: 'file-123',
file: createMockFile(overrides.file as Partial<File>),
progress: PROGRESS_NOT_STARTED,
...overrides,
})
const defaultProps: FileListItemProps = {
fileItem: createMockFileItem(),
onPreview: vi.fn(),
onRemove: vi.fn(),
}
beforeEach(() => {
vi.clearAllMocks()
mockTheme = 'light'
})
describe('rendering', () => {
it('should render the file item container', () => {
const { container } = render(<FileListItem {...defaultProps} />)
const item = container.firstChild as HTMLElement
expect(item).toHaveClass('flex', 'h-12', 'items-center', 'rounded-lg')
})
it('should render document icon with correct props', () => {
render(<FileListItem {...defaultProps} />)
const icon = screen.getByTestId('document-icon')
expect(icon).toBeInTheDocument()
expect(icon).toHaveAttribute('data-name', 'test-document.pdf')
expect(icon).toHaveAttribute('data-extension', 'pdf')
expect(icon).toHaveAttribute('data-size', 'xl')
})
it('should render file name', () => {
render(<FileListItem {...defaultProps} />)
expect(screen.getByText('test-document.pdf')).toBeInTheDocument()
})
it('should render file extension in uppercase via CSS class', () => {
render(<FileListItem {...defaultProps} />)
const extensionSpan = screen.getByText('pdf')
expect(extensionSpan).toBeInTheDocument()
expect(extensionSpan).toHaveClass('uppercase')
})
it('should render file size', () => {
render(<FileListItem {...defaultProps} />)
// Default mock file is 100KB (1024 * 100 bytes)
expect(screen.getByText('100.00 KB')).toBeInTheDocument()
})
it('should render delete button', () => {
const { container } = render(<FileListItem {...defaultProps} />)
const deleteButton = container.querySelector('.cursor-pointer')
expect(deleteButton).toBeInTheDocument()
})
})
describe('progress states', () => {
it('should show progress chart when uploading (0-99)', () => {
const fileItem = createMockFileItem({ progress: 50 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const pieChart = screen.getByTestId('pie-chart')
expect(pieChart).toBeInTheDocument()
expect(pieChart).toHaveAttribute('data-percentage', '50')
})
it('should show progress chart at 0%', () => {
const fileItem = createMockFileItem({ progress: 0 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const pieChart = screen.getByTestId('pie-chart')
expect(pieChart).toHaveAttribute('data-percentage', '0')
})
it('should not show progress chart when complete (100)', () => {
const fileItem = createMockFileItem({ progress: PROGRESS_COMPLETE })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.queryByTestId('pie-chart')).not.toBeInTheDocument()
})
it('should not show progress chart when not started (-1)', () => {
const fileItem = createMockFileItem({ progress: PROGRESS_NOT_STARTED })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.queryByTestId('pie-chart')).not.toBeInTheDocument()
})
})
describe('error state', () => {
it('should show error indicator when progress is PROGRESS_ERROR', () => {
const fileItem = createMockFileItem({ progress: PROGRESS_ERROR })
const { container } = render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const errorIndicator = container.querySelector('.text-text-destructive')
expect(errorIndicator).toBeInTheDocument()
})
it('should not show error indicator when not in error state', () => {
const { container } = render(<FileListItem {...defaultProps} />)
const errorIndicator = container.querySelector('.text-text-destructive')
expect(errorIndicator).not.toBeInTheDocument()
})
})
describe('theme handling', () => {
it('should use correct chart color for light theme', () => {
mockTheme = 'light'
const fileItem = createMockFileItem({ progress: 50 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const pieChart = screen.getByTestId('pie-chart')
expect(pieChart).toHaveAttribute('data-stroke', '#296dff')
expect(pieChart).toHaveAttribute('data-fill', '#296dff')
})
it('should use correct chart color for dark theme', () => {
mockTheme = 'dark'
const fileItem = createMockFileItem({ progress: 50 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const pieChart = screen.getByTestId('pie-chart')
expect(pieChart).toHaveAttribute('data-stroke', '#5289ff')
expect(pieChart).toHaveAttribute('data-fill', '#5289ff')
})
})
describe('event handlers', () => {
it('should call onPreview when item is clicked with file id', () => {
const onPreview = vi.fn()
const fileItem = createMockFileItem({
file: createMockFile({ id: 'uploaded-id' } as Partial<File>),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} onPreview={onPreview} />)
const item = screen.getByText('test-document.pdf').closest('[class*="flex h-12"]')!
fireEvent.click(item)
expect(onPreview).toHaveBeenCalledTimes(1)
expect(onPreview).toHaveBeenCalledWith(fileItem.file)
})
it('should not call onPreview when file has no id', () => {
const onPreview = vi.fn()
const fileItem = createMockFileItem()
render(<FileListItem {...defaultProps} fileItem={fileItem} onPreview={onPreview} />)
const item = screen.getByText('test-document.pdf').closest('[class*="flex h-12"]')!
fireEvent.click(item)
expect(onPreview).not.toHaveBeenCalled()
})
it('should call onRemove when delete button is clicked', () => {
const onRemove = vi.fn()
const fileItem = createMockFileItem()
const { container } = render(<FileListItem {...defaultProps} fileItem={fileItem} onRemove={onRemove} />)
const deleteButton = container.querySelector('.cursor-pointer')!
fireEvent.click(deleteButton)
expect(onRemove).toHaveBeenCalledTimes(1)
expect(onRemove).toHaveBeenCalledWith('file-123')
})
it('should stop propagation when delete button is clicked', () => {
const onPreview = vi.fn()
const onRemove = vi.fn()
const fileItem = createMockFileItem({
file: createMockFile({ id: 'uploaded-id' } as Partial<File>),
})
const { container } = render(<FileListItem {...defaultProps} fileItem={fileItem} onPreview={onPreview} onRemove={onRemove} />)
const deleteButton = container.querySelector('.cursor-pointer')!
fireEvent.click(deleteButton)
expect(onRemove).toHaveBeenCalledTimes(1)
expect(onPreview).not.toHaveBeenCalled()
})
})
describe('file type handling', () => {
it('should handle files with multiple dots in name', () => {
const fileItem = createMockFileItem({
file: createMockFile({ name: 'my.document.file.docx' }),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByText('my.document.file.docx')).toBeInTheDocument()
expect(screen.getByText('docx')).toBeInTheDocument()
})
it('should handle files without extension', () => {
const fileItem = createMockFileItem({
file: createMockFile({ name: 'README' }),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
// File name appears once, and extension area shows empty string
expect(screen.getByText('README')).toBeInTheDocument()
})
it('should handle various file extensions', () => {
const extensions = ['txt', 'md', 'json', 'csv', 'xlsx']
extensions.forEach((ext) => {
const fileItem = createMockFileItem({
file: createMockFile({ name: `file.${ext}` }),
})
const { unmount } = render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByText(ext)).toBeInTheDocument()
unmount()
})
})
})
describe('file size display', () => {
it('should display size in KB for small files', () => {
const fileItem = createMockFileItem({
file: createMockFile({ size: 5 * 1024 }),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByText('5.00 KB')).toBeInTheDocument()
})
it('should display size in MB for larger files', () => {
const fileItem = createMockFileItem({
file: createMockFile({ size: 5 * 1024 * 1024 }),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByText('5.00 MB')).toBeInTheDocument()
})
})
describe('upload progress values', () => {
it('should show chart at progress 1', () => {
const fileItem = createMockFileItem({ progress: 1 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByTestId('pie-chart')).toBeInTheDocument()
})
it('should show chart at progress 99', () => {
const fileItem = createMockFileItem({ progress: 99 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.getByTestId('pie-chart')).toHaveAttribute('data-percentage', '99')
})
it('should not show chart at progress 100', () => {
const fileItem = createMockFileItem({ progress: 100 })
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
expect(screen.queryByTestId('pie-chart')).not.toBeInTheDocument()
})
})
describe('styling', () => {
it('should have proper shadow styling', () => {
const { container } = render(<FileListItem {...defaultProps} />)
const item = container.firstChild as HTMLElement
expect(item).toHaveClass('shadow-xs')
})
it('should have proper border styling', () => {
const { container } = render(<FileListItem {...defaultProps} />)
const item = container.firstChild as HTMLElement
expect(item).toHaveClass('border', 'border-components-panel-border')
})
it('should truncate long file names', () => {
const longFileName = 'this-is-a-very-long-file-name-that-should-be-truncated.pdf'
const fileItem = createMockFileItem({
file: createMockFile({ name: longFileName }),
})
render(<FileListItem {...defaultProps} fileItem={fileItem} />)
const nameElement = screen.getByText(longFileName)
expect(nameElement).toHaveClass('truncate')
})
})
})

View File

@@ -1,89 +0,0 @@
'use client'
import type { CustomFile as File, FileItem } from '@/models/datasets'
import { RiDeleteBinLine, RiErrorWarningFill } from '@remixicon/react'
import dynamic from 'next/dynamic'
import { useMemo } from 'react'
import DocumentFileIcon from '@/app/components/datasets/common/document-file-icon'
import useTheme from '@/hooks/use-theme'
import { Theme } from '@/types/app'
import { formatFileSize, getFileExtension } from '@/utils/format'
import { PROGRESS_COMPLETE, PROGRESS_ERROR } from '../constants'
const SimplePieChart = dynamic(() => import('@/app/components/base/simple-pie-chart'), { ssr: false })
export type FileListItemProps = {
fileItem: FileItem
onPreview: (file: File) => void
onRemove: (fileID: string) => void
}
const FileListItem = ({
fileItem,
onPreview,
onRemove,
}: FileListItemProps) => {
const { theme } = useTheme()
const chartColor = useMemo(() => theme === Theme.dark ? '#5289ff' : '#296dff', [theme])
const isUploading = fileItem.progress >= 0 && fileItem.progress < PROGRESS_COMPLETE
const isError = fileItem.progress === PROGRESS_ERROR
const handleClick = () => {
if (fileItem.file?.id)
onPreview(fileItem.file)
}
const handleRemove = (e: React.MouseEvent) => {
e.stopPropagation()
onRemove(fileItem.fileID)
}
return (
<div
onClick={handleClick}
className="flex h-12 max-w-[640px] items-center rounded-lg border border-components-panel-border bg-components-panel-on-panel-item-bg text-xs leading-3 text-text-tertiary shadow-xs"
>
<div className="flex w-12 shrink-0 items-center justify-center">
<DocumentFileIcon
size="xl"
className="shrink-0"
name={fileItem.file.name}
extension={getFileExtension(fileItem.file.name)}
/>
</div>
<div className="flex shrink grow flex-col gap-0.5">
<div className="flex w-full">
<div className="w-0 grow truncate text-sm leading-4 text-text-secondary">
{fileItem.file.name}
</div>
</div>
<div className="w-full truncate leading-3 text-text-tertiary">
<span className="uppercase">{getFileExtension(fileItem.file.name)}</span>
<span className="px-1 text-text-quaternary">·</span>
<span>{formatFileSize(fileItem.file.size)}</span>
</div>
</div>
<div className="flex w-16 shrink-0 items-center justify-end gap-1 pr-3">
{isUploading && (
<SimplePieChart
percentage={fileItem.progress}
stroke={chartColor}
fill={chartColor}
animationDuration={0}
/>
)}
{isError && (
<RiErrorWarningFill className="size-4 text-text-destructive" />
)}
<span
className="flex h-6 w-6 cursor-pointer items-center justify-center"
onClick={handleRemove}
>
<RiDeleteBinLine className="size-4 text-text-tertiary" />
</span>
</div>
</div>
)
}
export default FileListItem

View File

@@ -1,210 +0,0 @@
import type { RefObject } from 'react'
import type { UploadDropzoneProps } from './upload-dropzone'
import { fireEvent, render, screen } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import UploadDropzone from './upload-dropzone'
// Helper to create mock ref objects for testing
const createMockRef = <T,>(value: T | null = null): RefObject<T | null> => ({ current: value })
// Mock react-i18next
vi.mock('react-i18next', () => ({
useTranslation: () => ({
t: (key: string, options?: Record<string, unknown>) => {
const translations: Record<string, string> = {
'stepOne.uploader.button': 'Drag and drop files, or',
'stepOne.uploader.buttonSingleFile': 'Drag and drop file, or',
'stepOne.uploader.browse': 'Browse',
'stepOne.uploader.tip': 'Supports {{supportTypes}}, Max {{size}}MB each, up to {{batchCount}} files at a time, {{totalCount}} files total',
}
let result = translations[key] || key
if (options && typeof options === 'object') {
Object.entries(options).forEach(([k, v]) => {
result = result.replace(`{{${k}}}`, String(v))
})
}
return result
},
}),
}))
describe('UploadDropzone', () => {
const defaultProps: UploadDropzoneProps = {
dropRef: createMockRef<HTMLDivElement>() as RefObject<HTMLDivElement | null>,
dragRef: createMockRef<HTMLDivElement>() as RefObject<HTMLDivElement | null>,
fileUploaderRef: createMockRef<HTMLInputElement>() as RefObject<HTMLInputElement | null>,
dragging: false,
supportBatchUpload: true,
supportTypesShowNames: 'PDF, DOCX, TXT',
fileUploadConfig: {
file_size_limit: 15,
batch_count_limit: 5,
file_upload_limit: 10,
},
acceptTypes: ['.pdf', '.docx', '.txt'],
onSelectFile: vi.fn(),
onFileChange: vi.fn(),
}
beforeEach(() => {
vi.clearAllMocks()
})
describe('rendering', () => {
it('should render the dropzone container', () => {
const { container } = render(<UploadDropzone {...defaultProps} />)
const dropzone = container.querySelector('[class*="border-dashed"]')
expect(dropzone).toBeInTheDocument()
})
it('should render hidden file input', () => {
render(<UploadDropzone {...defaultProps} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
expect(input).toBeInTheDocument()
expect(input).toHaveClass('hidden')
expect(input).toHaveAttribute('type', 'file')
})
it('should render upload icon', () => {
render(<UploadDropzone {...defaultProps} />)
const icon = document.querySelector('svg')
expect(icon).toBeInTheDocument()
})
it('should render browse label when extensions are allowed', () => {
render(<UploadDropzone {...defaultProps} />)
expect(screen.getByText('Browse')).toBeInTheDocument()
})
it('should not render browse label when no extensions allowed', () => {
render(<UploadDropzone {...defaultProps} acceptTypes={[]} />)
expect(screen.queryByText('Browse')).not.toBeInTheDocument()
})
it('should render file size and count limits', () => {
render(<UploadDropzone {...defaultProps} />)
const tipText = screen.getByText(/Supports.*Max.*15MB/i)
expect(tipText).toBeInTheDocument()
})
})
describe('file input configuration', () => {
it('should allow multiple files when supportBatchUpload is true', () => {
render(<UploadDropzone {...defaultProps} supportBatchUpload={true} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
expect(input).toHaveAttribute('multiple')
})
it('should not allow multiple files when supportBatchUpload is false', () => {
render(<UploadDropzone {...defaultProps} supportBatchUpload={false} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
expect(input).not.toHaveAttribute('multiple')
})
it('should set accept attribute with correct types', () => {
render(<UploadDropzone {...defaultProps} acceptTypes={['.pdf', '.docx']} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
expect(input).toHaveAttribute('accept', '.pdf,.docx')
})
})
describe('text content', () => {
it('should show batch upload text when supportBatchUpload is true', () => {
render(<UploadDropzone {...defaultProps} supportBatchUpload={true} />)
expect(screen.getByText(/Drag and drop files/i)).toBeInTheDocument()
})
it('should show single file text when supportBatchUpload is false', () => {
render(<UploadDropzone {...defaultProps} supportBatchUpload={false} />)
expect(screen.getByText(/Drag and drop file/i)).toBeInTheDocument()
})
})
describe('dragging state', () => {
it('should apply dragging styles when dragging is true', () => {
const { container } = render(<UploadDropzone {...defaultProps} dragging={true} />)
const dropzone = container.querySelector('[class*="border-components-dropzone-border-accent"]')
expect(dropzone).toBeInTheDocument()
})
it('should render drag overlay when dragging', () => {
const dragRef = createMockRef<HTMLDivElement>()
render(<UploadDropzone {...defaultProps} dragging={true} dragRef={dragRef as RefObject<HTMLDivElement | null>} />)
const overlay = document.querySelector('.absolute.left-0.top-0')
expect(overlay).toBeInTheDocument()
})
it('should not render drag overlay when not dragging', () => {
render(<UploadDropzone {...defaultProps} dragging={false} />)
const overlay = document.querySelector('.absolute.left-0.top-0')
expect(overlay).not.toBeInTheDocument()
})
})
describe('event handlers', () => {
it('should call onSelectFile when browse label is clicked', () => {
const onSelectFile = vi.fn()
render(<UploadDropzone {...defaultProps} onSelectFile={onSelectFile} />)
const browseLabel = screen.getByText('Browse')
fireEvent.click(browseLabel)
expect(onSelectFile).toHaveBeenCalledTimes(1)
})
it('should call onFileChange when files are selected', () => {
const onFileChange = vi.fn()
render(<UploadDropzone {...defaultProps} onFileChange={onFileChange} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
const file = new File(['content'], 'test.pdf', { type: 'application/pdf' })
fireEvent.change(input, { target: { files: [file] } })
expect(onFileChange).toHaveBeenCalledTimes(1)
})
})
describe('refs', () => {
it('should attach dropRef to drop container', () => {
const dropRef = createMockRef<HTMLDivElement>()
render(<UploadDropzone {...defaultProps} dropRef={dropRef as RefObject<HTMLDivElement | null>} />)
expect(dropRef.current).toBeInstanceOf(HTMLDivElement)
})
it('should attach fileUploaderRef to input element', () => {
const fileUploaderRef = createMockRef<HTMLInputElement>()
render(<UploadDropzone {...defaultProps} fileUploaderRef={fileUploaderRef as RefObject<HTMLInputElement | null>} />)
expect(fileUploaderRef.current).toBeInstanceOf(HTMLInputElement)
})
it('should attach dragRef to overlay when dragging', () => {
const dragRef = createMockRef<HTMLDivElement>()
render(<UploadDropzone {...defaultProps} dragging={true} dragRef={dragRef as RefObject<HTMLDivElement | null>} />)
expect(dragRef.current).toBeInstanceOf(HTMLDivElement)
})
})
describe('styling', () => {
it('should have base dropzone styling', () => {
const { container } = render(<UploadDropzone {...defaultProps} />)
const dropzone = container.querySelector('[class*="border-dashed"]')
expect(dropzone).toBeInTheDocument()
expect(dropzone).toHaveClass('rounded-xl')
})
it('should have cursor-pointer on browse label', () => {
render(<UploadDropzone {...defaultProps} />)
const browseLabel = screen.getByText('Browse')
expect(browseLabel).toHaveClass('cursor-pointer')
})
})
describe('accessibility', () => {
it('should have an accessible file input', () => {
render(<UploadDropzone {...defaultProps} />)
const input = document.getElementById('fileUploader') as HTMLInputElement
expect(input).toHaveAttribute('id', 'fileUploader')
})
})
})

View File

@@ -1,84 +0,0 @@
'use client'
import type { RefObject } from 'react'
import type { FileUploadConfig } from '../hooks/use-file-upload'
import { RiUploadCloud2Line } from '@remixicon/react'
import { useTranslation } from 'react-i18next'
import { cn } from '@/utils/classnames'
export type UploadDropzoneProps = {
dropRef: RefObject<HTMLDivElement | null>
dragRef: RefObject<HTMLDivElement | null>
fileUploaderRef: RefObject<HTMLInputElement | null>
dragging: boolean
supportBatchUpload: boolean
supportTypesShowNames: string
fileUploadConfig: FileUploadConfig
acceptTypes: string[]
onSelectFile: () => void
onFileChange: (e: React.ChangeEvent<HTMLInputElement>) => void
}
const UploadDropzone = ({
dropRef,
dragRef,
fileUploaderRef,
dragging,
supportBatchUpload,
supportTypesShowNames,
fileUploadConfig,
acceptTypes,
onSelectFile,
onFileChange,
}: UploadDropzoneProps) => {
const { t } = useTranslation()
return (
<>
<input
ref={fileUploaderRef}
id="fileUploader"
className="hidden"
type="file"
multiple={supportBatchUpload}
accept={acceptTypes.join(',')}
onChange={onFileChange}
/>
<div
ref={dropRef}
className={cn(
'relative mb-2 box-border flex min-h-20 max-w-[640px] flex-col items-center justify-center gap-1 rounded-xl border border-dashed border-components-dropzone-border bg-components-dropzone-bg px-4 py-3 text-xs leading-4 text-text-tertiary',
dragging && 'border-components-dropzone-border-accent bg-components-dropzone-bg-accent',
)}
>
<div className="flex min-h-5 items-center justify-center text-sm leading-4 text-text-secondary">
<RiUploadCloud2Line className="mr-2 size-5" />
<span>
{supportBatchUpload
? t('stepOne.uploader.button', { ns: 'datasetCreation' })
: t('stepOne.uploader.buttonSingleFile', { ns: 'datasetCreation' })}
{acceptTypes.length > 0 && (
<label
className="ml-1 cursor-pointer text-text-accent"
onClick={onSelectFile}
>
{t('stepOne.uploader.browse', { ns: 'datasetCreation' })}
</label>
)}
</span>
</div>
<div>
{t('stepOne.uploader.tip', {
ns: 'datasetCreation',
size: fileUploadConfig.file_size_limit,
supportTypes: supportTypesShowNames,
batchCount: fileUploadConfig.batch_count_limit,
totalCount: fileUploadConfig.file_upload_limit,
})}
</div>
{dragging && <div ref={dragRef} className="absolute left-0 top-0 h-full w-full" />}
</div>
</>
)
}
export default UploadDropzone

View File

@@ -1,3 +0,0 @@
export const PROGRESS_NOT_STARTED = -1
export const PROGRESS_ERROR = -2
export const PROGRESS_COMPLETE = 100

View File

@@ -1,921 +0,0 @@
import type { ReactNode } from 'react'
import type { CustomFile, FileItem } from '@/models/datasets'
import { act, render, renderHook, waitFor } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { ToastContext } from '@/app/components/base/toast'
import { PROGRESS_COMPLETE, PROGRESS_ERROR, PROGRESS_NOT_STARTED } from '../constants'
// Import after mocks
import { useFileUpload } from './use-file-upload'
// Mock notify function
const mockNotify = vi.fn()
const mockClose = vi.fn()
// Mock ToastContext
vi.mock('use-context-selector', async () => {
const actual = await vi.importActual<typeof import('use-context-selector')>('use-context-selector')
return {
...actual,
useContext: vi.fn(() => ({ notify: mockNotify, close: mockClose })),
}
})
// Mock upload service
const mockUpload = vi.fn()
vi.mock('@/service/base', () => ({
upload: (...args: unknown[]) => mockUpload(...args),
}))
// Mock file upload config
const mockFileUploadConfig = {
file_size_limit: 15,
batch_count_limit: 5,
file_upload_limit: 10,
}
const mockSupportTypes = {
allowed_extensions: ['pdf', 'docx', 'txt', 'md'],
}
vi.mock('@/service/use-common', () => ({
useFileUploadConfig: () => ({ data: mockFileUploadConfig }),
useFileSupportTypes: () => ({ data: mockSupportTypes }),
}))
// Mock i18n
vi.mock('react-i18next', () => ({
useTranslation: () => ({
t: (key: string) => key,
}),
}))
// Mock locale
vi.mock('@/context/i18n', () => ({
useLocale: () => 'en-US',
}))
vi.mock('@/i18n-config/language', () => ({
LanguagesSupported: ['en-US', 'zh-Hans'],
}))
// Mock config
vi.mock('@/config', () => ({
IS_CE_EDITION: false,
}))
// Mock file upload error message
vi.mock('@/app/components/base/file-uploader/utils', () => ({
getFileUploadErrorMessage: (_e: unknown, defaultMsg: string) => defaultMsg,
}))
const createWrapper = () => {
return ({ children }: { children: ReactNode }) => (
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
{children}
</ToastContext.Provider>
)
}
describe('useFileUpload', () => {
const defaultOptions = {
fileList: [] as FileItem[],
prepareFileList: vi.fn(),
onFileUpdate: vi.fn(),
onFileListUpdate: vi.fn(),
onPreview: vi.fn(),
supportBatchUpload: true,
}
beforeEach(() => {
vi.clearAllMocks()
mockUpload.mockReset()
// Default mock to return a resolved promise to avoid unhandled rejections
mockUpload.mockResolvedValue({ id: 'default-id' })
mockNotify.mockReset()
})
describe('initialization', () => {
it('should initialize with default values', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
expect(result.current.dragging).toBe(false)
expect(result.current.hideUpload).toBe(false)
expect(result.current.dropRef.current).toBeNull()
expect(result.current.dragRef.current).toBeNull()
expect(result.current.fileUploaderRef.current).toBeNull()
})
it('should set hideUpload true when not batch upload and has files', () => {
const { result } = renderHook(
() => useFileUpload({
...defaultOptions,
supportBatchUpload: false,
fileList: [{ fileID: 'file-1', file: {} as CustomFile, progress: 100 }],
}),
{ wrapper: createWrapper() },
)
expect(result.current.hideUpload).toBe(true)
})
it('should compute acceptTypes correctly', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
expect(result.current.acceptTypes).toEqual(['.pdf', '.docx', '.txt', '.md'])
})
it('should compute supportTypesShowNames correctly', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
expect(result.current.supportTypesShowNames).toContain('PDF')
expect(result.current.supportTypesShowNames).toContain('DOCX')
expect(result.current.supportTypesShowNames).toContain('TXT')
// 'md' is mapped to 'markdown' in the extensionMap
expect(result.current.supportTypesShowNames).toContain('MARKDOWN')
})
it('should set batch limit to 1 when not batch upload', () => {
const { result } = renderHook(
() => useFileUpload({
...defaultOptions,
supportBatchUpload: false,
}),
{ wrapper: createWrapper() },
)
expect(result.current.fileUploadConfig.batch_count_limit).toBe(1)
expect(result.current.fileUploadConfig.file_upload_limit).toBe(1)
})
})
describe('selectHandle', () => {
it('should trigger click on file input', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
const mockClick = vi.fn()
const mockInput = { click: mockClick } as unknown as HTMLInputElement
Object.defineProperty(result.current.fileUploaderRef, 'current', {
value: mockInput,
writable: true,
})
act(() => {
result.current.selectHandle()
})
expect(mockClick).toHaveBeenCalled()
})
it('should do nothing when file input ref is null', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
expect(() => {
act(() => {
result.current.selectHandle()
})
}).not.toThrow()
})
})
describe('handlePreview', () => {
it('should call onPreview when file has id', () => {
const onPreview = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onPreview }),
{ wrapper: createWrapper() },
)
const mockFile = { id: 'file-123', name: 'test.pdf', size: 1024 } as CustomFile
act(() => {
result.current.handlePreview(mockFile)
})
expect(onPreview).toHaveBeenCalledWith(mockFile)
})
it('should not call onPreview when file has no id', () => {
const onPreview = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onPreview }),
{ wrapper: createWrapper() },
)
const mockFile = { name: 'test.pdf', size: 1024 } as CustomFile
act(() => {
result.current.handlePreview(mockFile)
})
expect(onPreview).not.toHaveBeenCalled()
})
})
describe('removeFile', () => {
it('should call onFileListUpdate with filtered list', () => {
const onFileListUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileListUpdate }),
{ wrapper: createWrapper() },
)
act(() => {
result.current.removeFile('file-to-remove')
})
expect(onFileListUpdate).toHaveBeenCalled()
})
it('should clear file input value', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
const mockInput = { value: 'some-file' } as HTMLInputElement
Object.defineProperty(result.current.fileUploaderRef, 'current', {
value: mockInput,
writable: true,
})
act(() => {
result.current.removeFile('file-123')
})
expect(mockInput.value).toBe('')
})
})
describe('fileChangeHandle', () => {
it('should handle valid files', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id' })
const prepareFileList = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, prepareFileList }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
expect(prepareFileList).toHaveBeenCalled()
})
})
it('should limit files to batch count', () => {
const prepareFileList = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, prepareFileList }),
{ wrapper: createWrapper() },
)
const files = Array.from({ length: 10 }, (_, i) =>
new File(['content'], `file${i}.pdf`, { type: 'application/pdf' }))
const event = {
target: { files },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
// Should be called with at most batch_count_limit files
if (prepareFileList.mock.calls.length > 0) {
const calledFiles = prepareFileList.mock.calls[0][0]
expect(calledFiles.length).toBeLessThanOrEqual(mockFileUploadConfig.batch_count_limit)
}
})
it('should reject invalid file types', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.exe', { type: 'application/x-msdownload' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
expect(mockNotify).toHaveBeenCalledWith(
expect.objectContaining({ type: 'error' }),
)
})
it('should reject files exceeding size limit', () => {
const { result } = renderHook(
() => useFileUpload(defaultOptions),
{ wrapper: createWrapper() },
)
// Create a file larger than the limit (15MB)
const largeFile = new File([new ArrayBuffer(20 * 1024 * 1024)], 'large.pdf', { type: 'application/pdf' })
const event = {
target: { files: [largeFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
expect(mockNotify).toHaveBeenCalledWith(
expect.objectContaining({ type: 'error' }),
)
})
it('should handle null files', () => {
const prepareFileList = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, prepareFileList }),
{ wrapper: createWrapper() },
)
const event = {
target: { files: null },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
expect(prepareFileList).not.toHaveBeenCalled()
})
})
describe('drag and drop handlers', () => {
const TestDropzone = ({ options }: { options: typeof defaultOptions }) => {
const {
dropRef,
dragRef,
dragging,
} = useFileUpload(options)
return (
<div>
<div ref={dropRef} data-testid="dropzone">
{dragging && <div ref={dragRef} data-testid="drag-overlay" />}
</div>
<span data-testid="dragging">{String(dragging)}</span>
</div>
)
}
it('should set dragging true on dragenter', async () => {
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={defaultOptions} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dragEnterEvent = new Event('dragenter', { bubbles: true, cancelable: true })
dropzone.dispatchEvent(dragEnterEvent)
})
expect(getByTestId('dragging').textContent).toBe('true')
})
it('should handle dragover event', async () => {
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={defaultOptions} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dragOverEvent = new Event('dragover', { bubbles: true, cancelable: true })
dropzone.dispatchEvent(dragOverEvent)
})
expect(dropzone).toBeInTheDocument()
})
it('should set dragging false on dragleave from drag overlay', async () => {
const { getByTestId, queryByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={defaultOptions} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dragEnterEvent = new Event('dragenter', { bubbles: true, cancelable: true })
dropzone.dispatchEvent(dragEnterEvent)
})
expect(getByTestId('dragging').textContent).toBe('true')
const dragOverlay = queryByTestId('drag-overlay')
if (dragOverlay) {
await act(async () => {
const dragLeaveEvent = new Event('dragleave', { bubbles: true, cancelable: true })
Object.defineProperty(dragLeaveEvent, 'target', { value: dragOverlay })
dropzone.dispatchEvent(dragLeaveEvent)
})
}
})
it('should handle drop with files', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id' })
const prepareFileList = vi.fn()
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: [{
getAsFile: () => mockFile,
webkitGetAsEntry: () => null,
}],
},
})
dropzone.dispatchEvent(dropEvent)
})
await waitFor(() => {
expect(prepareFileList).toHaveBeenCalled()
})
})
it('should handle drop without dataTransfer', async () => {
const prepareFileList = vi.fn()
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', { value: null })
dropzone.dispatchEvent(dropEvent)
})
expect(prepareFileList).not.toHaveBeenCalled()
})
it('should limit to single file on drop when supportBatchUpload is false', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id' })
const prepareFileList = vi.fn()
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, supportBatchUpload: false, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
const files = [
new File(['content1'], 'test1.pdf', { type: 'application/pdf' }),
new File(['content2'], 'test2.pdf', { type: 'application/pdf' }),
]
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: files.map(f => ({
getAsFile: () => f,
webkitGetAsEntry: () => null,
})),
},
})
dropzone.dispatchEvent(dropEvent)
})
await waitFor(() => {
if (prepareFileList.mock.calls.length > 0) {
const calledFiles = prepareFileList.mock.calls[0][0]
expect(calledFiles.length).toBe(1)
}
})
})
it('should handle drop with FileSystemFileEntry', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id' })
const prepareFileList = vi.fn()
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: [{
getAsFile: () => mockFile,
webkitGetAsEntry: () => ({
isFile: true,
isDirectory: false,
file: (callback: (file: File) => void) => callback(mockFile),
}),
}],
},
})
dropzone.dispatchEvent(dropEvent)
})
await waitFor(() => {
expect(prepareFileList).toHaveBeenCalled()
})
})
it('should handle drop with FileSystemDirectoryEntry', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id' })
const prepareFileList = vi.fn()
const mockFile = new File(['content'], 'nested.pdf', { type: 'application/pdf' })
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
let callCount = 0
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: [{
getAsFile: () => null,
webkitGetAsEntry: () => ({
isFile: false,
isDirectory: true,
name: 'folder',
createReader: () => ({
readEntries: (callback: (entries: Array<{ isFile: boolean, isDirectory: boolean, name?: string, file?: (cb: (f: File) => void) => void }>) => void) => {
// First call returns file entry, second call returns empty (signals end)
if (callCount === 0) {
callCount++
callback([{
isFile: true,
isDirectory: false,
name: 'nested.pdf',
file: (cb: (f: File) => void) => cb(mockFile),
}])
}
else {
callback([])
}
},
}),
}),
}],
},
})
dropzone.dispatchEvent(dropEvent)
})
await waitFor(() => {
expect(prepareFileList).toHaveBeenCalled()
})
})
it('should handle drop with empty directory', async () => {
const prepareFileList = vi.fn()
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: [{
getAsFile: () => null,
webkitGetAsEntry: () => ({
isFile: false,
isDirectory: true,
name: 'empty-folder',
createReader: () => ({
readEntries: (callback: (entries: never[]) => void) => {
callback([])
},
}),
}),
}],
},
})
dropzone.dispatchEvent(dropEvent)
})
// Should not prepare file list if no valid files
await new Promise(resolve => setTimeout(resolve, 100))
})
it('should handle entry that is neither file nor directory', async () => {
const prepareFileList = vi.fn()
const { getByTestId } = await act(async () =>
render(
<ToastContext.Provider value={{ notify: mockNotify, close: mockClose }}>
<TestDropzone options={{ ...defaultOptions, prepareFileList }} />
</ToastContext.Provider>,
),
)
const dropzone = getByTestId('dropzone')
await act(async () => {
const dropEvent = new Event('drop', { bubbles: true, cancelable: true }) as Event & { dataTransfer: DataTransfer | null }
Object.defineProperty(dropEvent, 'dataTransfer', {
value: {
items: [{
getAsFile: () => null,
webkitGetAsEntry: () => ({
isFile: false,
isDirectory: false,
}),
}],
},
})
dropzone.dispatchEvent(dropEvent)
})
// Should not throw and should handle gracefully
await new Promise(resolve => setTimeout(resolve, 100))
})
})
describe('file upload', () => {
it('should call upload with correct parameters', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id', name: 'test.pdf' })
const onFileUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileUpdate }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
expect(mockUpload).toHaveBeenCalled()
})
})
it('should update progress during upload', async () => {
let progressCallback: ((e: ProgressEvent) => void) | undefined
mockUpload.mockImplementation(async (options: { onprogress: (e: ProgressEvent) => void }) => {
progressCallback = options.onprogress
return { id: 'uploaded-id' }
})
const onFileUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileUpdate }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
expect(mockUpload).toHaveBeenCalled()
})
if (progressCallback) {
act(() => {
progressCallback!({
lengthComputable: true,
loaded: 50,
total: 100,
} as ProgressEvent)
})
expect(onFileUpdate).toHaveBeenCalled()
}
})
it('should handle upload error', async () => {
mockUpload.mockRejectedValue(new Error('Upload failed'))
const onFileUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileUpdate }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
expect(mockNotify).toHaveBeenCalledWith(
expect.objectContaining({ type: 'error' }),
)
})
})
it('should update file with PROGRESS_COMPLETE on success', async () => {
mockUpload.mockResolvedValue({ id: 'uploaded-id', name: 'test.pdf' })
const onFileUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileUpdate }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
const completeCalls = onFileUpdate.mock.calls.filter(
([, progress]) => progress === PROGRESS_COMPLETE,
)
expect(completeCalls.length).toBeGreaterThan(0)
})
})
it('should update file with PROGRESS_ERROR on failure', async () => {
mockUpload.mockRejectedValue(new Error('Upload failed'))
const onFileUpdate = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, onFileUpdate }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
const errorCalls = onFileUpdate.mock.calls.filter(
([, progress]) => progress === PROGRESS_ERROR,
)
expect(errorCalls.length).toBeGreaterThan(0)
})
})
})
describe('file count validation', () => {
it('should reject when total files exceed limit', () => {
const existingFiles: FileItem[] = Array.from({ length: 8 }, (_, i) => ({
fileID: `existing-${i}`,
file: { name: `existing-${i}.pdf`, size: 1024 } as CustomFile,
progress: 100,
}))
const { result } = renderHook(
() => useFileUpload({
...defaultOptions,
fileList: existingFiles,
}),
{ wrapper: createWrapper() },
)
const files = Array.from({ length: 5 }, (_, i) =>
new File(['content'], `new-${i}.pdf`, { type: 'application/pdf' }))
const event = {
target: { files },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
expect(mockNotify).toHaveBeenCalledWith(
expect.objectContaining({ type: 'error' }),
)
})
})
describe('progress constants', () => {
it('should use PROGRESS_NOT_STARTED for new files', async () => {
mockUpload.mockResolvedValue({ id: 'file-id' })
const prepareFileList = vi.fn()
const { result } = renderHook(
() => useFileUpload({ ...defaultOptions, prepareFileList }),
{ wrapper: createWrapper() },
)
const mockFile = new File(['content'], 'test.pdf', { type: 'application/pdf' })
const event = {
target: { files: [mockFile] },
} as unknown as React.ChangeEvent<HTMLInputElement>
act(() => {
result.current.fileChangeHandle(event)
})
await waitFor(() => {
if (prepareFileList.mock.calls.length > 0) {
const files = prepareFileList.mock.calls[0][0]
expect(files[0].progress).toBe(PROGRESS_NOT_STARTED)
}
})
})
})
})

View File

@@ -1,351 +0,0 @@
'use client'
import type { RefObject } from 'react'
import type { CustomFile as File, FileItem } from '@/models/datasets'
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { useContext } from 'use-context-selector'
import { getFileUploadErrorMessage } from '@/app/components/base/file-uploader/utils'
import { ToastContext } from '@/app/components/base/toast'
import { IS_CE_EDITION } from '@/config'
import { useLocale } from '@/context/i18n'
import { LanguagesSupported } from '@/i18n-config/language'
import { upload } from '@/service/base'
import { useFileSupportTypes, useFileUploadConfig } from '@/service/use-common'
import { getFileExtension } from '@/utils/format'
import { PROGRESS_COMPLETE, PROGRESS_ERROR, PROGRESS_NOT_STARTED } from '../constants'
export type FileUploadConfig = {
file_size_limit: number
batch_count_limit: number
file_upload_limit: number
}
export type UseFileUploadOptions = {
fileList: FileItem[]
prepareFileList: (files: FileItem[]) => void
onFileUpdate: (fileItem: FileItem, progress: number, list: FileItem[]) => void
onFileListUpdate?: (files: FileItem[]) => void
onPreview: (file: File) => void
supportBatchUpload?: boolean
/**
* Optional list of allowed file extensions. If not provided, fetches from API.
* Pass this when you need custom extension filtering instead of using the global config.
*/
allowedExtensions?: string[]
}
export type UseFileUploadReturn = {
// Refs
dropRef: RefObject<HTMLDivElement | null>
dragRef: RefObject<HTMLDivElement | null>
fileUploaderRef: RefObject<HTMLInputElement | null>
// State
dragging: boolean
// Config
fileUploadConfig: FileUploadConfig
acceptTypes: string[]
supportTypesShowNames: string
hideUpload: boolean
// Handlers
selectHandle: () => void
fileChangeHandle: (e: React.ChangeEvent<HTMLInputElement>) => void
removeFile: (fileID: string) => void
handlePreview: (file: File) => void
}
type FileWithPath = {
relativePath?: string
} & File
export const useFileUpload = ({
fileList,
prepareFileList,
onFileUpdate,
onFileListUpdate,
onPreview,
supportBatchUpload = false,
allowedExtensions,
}: UseFileUploadOptions): UseFileUploadReturn => {
const { t } = useTranslation()
const { notify } = useContext(ToastContext)
const locale = useLocale()
const [dragging, setDragging] = useState(false)
const dropRef = useRef<HTMLDivElement>(null)
const dragRef = useRef<HTMLDivElement>(null)
const fileUploaderRef = useRef<HTMLInputElement>(null)
const fileListRef = useRef<FileItem[]>([])
const hideUpload = !supportBatchUpload && fileList.length > 0
const { data: fileUploadConfigResponse } = useFileUploadConfig()
const { data: supportFileTypesResponse } = useFileSupportTypes()
// Use provided allowedExtensions or fetch from API
const supportTypes = useMemo(
() => allowedExtensions ?? supportFileTypesResponse?.allowed_extensions ?? [],
[allowedExtensions, supportFileTypesResponse?.allowed_extensions],
)
const supportTypesShowNames = useMemo(() => {
const extensionMap: { [key: string]: string } = {
md: 'markdown',
pptx: 'pptx',
htm: 'html',
xlsx: 'xlsx',
docx: 'docx',
}
return [...supportTypes]
.map(item => extensionMap[item] || item)
.map(item => item.toLowerCase())
.filter((item, index, self) => self.indexOf(item) === index)
.map(item => item.toUpperCase())
.join(locale !== LanguagesSupported[1] ? ', ' : '、 ')
}, [supportTypes, locale])
const acceptTypes = useMemo(() => supportTypes.map((ext: string) => `.${ext}`), [supportTypes])
const fileUploadConfig = useMemo(() => ({
file_size_limit: fileUploadConfigResponse?.file_size_limit ?? 15,
batch_count_limit: supportBatchUpload ? (fileUploadConfigResponse?.batch_count_limit ?? 5) : 1,
file_upload_limit: supportBatchUpload ? (fileUploadConfigResponse?.file_upload_limit ?? 5) : 1,
}), [fileUploadConfigResponse, supportBatchUpload])
const isValid = useCallback((file: File) => {
const { size } = file
const ext = `.${getFileExtension(file.name)}`
const isValidType = acceptTypes.includes(ext.toLowerCase())
if (!isValidType)
notify({ type: 'error', message: t('stepOne.uploader.validation.typeError', { ns: 'datasetCreation' }) })
const isValidSize = size <= fileUploadConfig.file_size_limit * 1024 * 1024
if (!isValidSize)
notify({ type: 'error', message: t('stepOne.uploader.validation.size', { ns: 'datasetCreation', size: fileUploadConfig.file_size_limit }) })
return isValidType && isValidSize
}, [fileUploadConfig, notify, t, acceptTypes])
const fileUpload = useCallback(async (fileItem: FileItem): Promise<FileItem> => {
const formData = new FormData()
formData.append('file', fileItem.file)
const onProgress = (e: ProgressEvent) => {
if (e.lengthComputable) {
const percent = Math.floor(e.loaded / e.total * 100)
onFileUpdate(fileItem, percent, fileListRef.current)
}
}
return upload({
xhr: new XMLHttpRequest(),
data: formData,
onprogress: onProgress,
}, false, undefined, '?source=datasets')
.then((res) => {
const completeFile = {
fileID: fileItem.fileID,
file: res as unknown as File,
progress: PROGRESS_NOT_STARTED,
}
const index = fileListRef.current.findIndex(item => item.fileID === fileItem.fileID)
fileListRef.current[index] = completeFile
onFileUpdate(completeFile, PROGRESS_COMPLETE, fileListRef.current)
return Promise.resolve({ ...completeFile })
})
.catch((e) => {
const errorMessage = getFileUploadErrorMessage(e, t('stepOne.uploader.failed', { ns: 'datasetCreation' }), t)
notify({ type: 'error', message: errorMessage })
onFileUpdate(fileItem, PROGRESS_ERROR, fileListRef.current)
return Promise.resolve({ ...fileItem })
})
.finally()
}, [notify, onFileUpdate, t])
const uploadBatchFiles = useCallback((bFiles: FileItem[]) => {
bFiles.forEach(bf => (bf.progress = 0))
return Promise.all(bFiles.map(fileUpload))
}, [fileUpload])
const uploadMultipleFiles = useCallback(async (files: FileItem[]) => {
const batchCountLimit = fileUploadConfig.batch_count_limit
const length = files.length
let start = 0
let end = 0
while (start < length) {
if (start + batchCountLimit > length)
end = length
else
end = start + batchCountLimit
const bFiles = files.slice(start, end)
await uploadBatchFiles(bFiles)
start = end
}
}, [fileUploadConfig, uploadBatchFiles])
const initialUpload = useCallback((files: File[]) => {
const filesCountLimit = fileUploadConfig.file_upload_limit
if (!files.length)
return false
if (files.length + fileList.length > filesCountLimit && !IS_CE_EDITION) {
notify({ type: 'error', message: t('stepOne.uploader.validation.filesNumber', { ns: 'datasetCreation', filesNumber: filesCountLimit }) })
return false
}
const preparedFiles = files.map((file, index) => ({
fileID: `file${index}-${Date.now()}`,
file,
progress: PROGRESS_NOT_STARTED,
}))
const newFiles = [...fileListRef.current, ...preparedFiles]
prepareFileList(newFiles)
fileListRef.current = newFiles
uploadMultipleFiles(preparedFiles)
}, [prepareFileList, uploadMultipleFiles, notify, t, fileList, fileUploadConfig])
const traverseFileEntry = useCallback(
(entry: FileSystemEntry, prefix = ''): Promise<FileWithPath[]> => {
return new Promise((resolve) => {
if (entry.isFile) {
(entry as FileSystemFileEntry).file((file: FileWithPath) => {
file.relativePath = `${prefix}${file.name}`
resolve([file])
})
}
else if (entry.isDirectory) {
const reader = (entry as FileSystemDirectoryEntry).createReader()
const entries: FileSystemEntry[] = []
const read = () => {
reader.readEntries(async (results: FileSystemEntry[]) => {
if (!results.length) {
const files = await Promise.all(
entries.map(ent =>
traverseFileEntry(ent, `${prefix}${entry.name}/`),
),
)
resolve(files.flat())
}
else {
entries.push(...results)
read()
}
})
}
read()
}
else {
resolve([])
}
})
},
[],
)
const handleDragEnter = useCallback((e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.target !== dragRef.current)
setDragging(true)
}, [])
const handleDragOver = useCallback((e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
}, [])
const handleDragLeave = useCallback((e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.target === dragRef.current)
setDragging(false)
}, [])
const handleDrop = useCallback(
async (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
setDragging(false)
if (!e.dataTransfer)
return
const nested = await Promise.all(
Array.from(e.dataTransfer.items).map((it) => {
const entry = (it as DataTransferItem & { webkitGetAsEntry?: () => FileSystemEntry | null }).webkitGetAsEntry?.()
if (entry)
return traverseFileEntry(entry)
const f = it.getAsFile?.()
return f ? Promise.resolve([f as FileWithPath]) : Promise.resolve([])
}),
)
let files = nested.flat()
if (!supportBatchUpload)
files = files.slice(0, 1)
files = files.slice(0, fileUploadConfig.batch_count_limit)
const valid = files.filter(isValid)
initialUpload(valid)
},
[initialUpload, isValid, supportBatchUpload, traverseFileEntry, fileUploadConfig],
)
const selectHandle = useCallback(() => {
if (fileUploaderRef.current)
fileUploaderRef.current.click()
}, [])
const removeFile = useCallback((fileID: string) => {
if (fileUploaderRef.current)
fileUploaderRef.current.value = ''
fileListRef.current = fileListRef.current.filter(item => item.fileID !== fileID)
onFileListUpdate?.([...fileListRef.current])
}, [onFileListUpdate])
const fileChangeHandle = useCallback((e: React.ChangeEvent<HTMLInputElement>) => {
let files = Array.from(e.target.files ?? []) as File[]
files = files.slice(0, fileUploadConfig.batch_count_limit)
initialUpload(files.filter(isValid))
}, [isValid, initialUpload, fileUploadConfig])
const handlePreview = useCallback((file: File) => {
if (file?.id)
onPreview(file)
}, [onPreview])
useEffect(() => {
const dropArea = dropRef.current
dropArea?.addEventListener('dragenter', handleDragEnter)
dropArea?.addEventListener('dragover', handleDragOver)
dropArea?.addEventListener('dragleave', handleDragLeave)
dropArea?.addEventListener('drop', handleDrop)
return () => {
dropArea?.removeEventListener('dragenter', handleDragEnter)
dropArea?.removeEventListener('dragover', handleDragOver)
dropArea?.removeEventListener('dragleave', handleDragLeave)
dropArea?.removeEventListener('drop', handleDrop)
}
}, [handleDragEnter, handleDragOver, handleDragLeave, handleDrop])
return {
// Refs
dropRef,
dragRef,
fileUploaderRef,
// State
dragging,
// Config
fileUploadConfig,
acceptTypes,
supportTypesShowNames,
hideUpload,
// Handlers
selectHandle,
fileChangeHandle,
removeFile,
handlePreview,
}
}

View File

@@ -1,278 +0,0 @@
import type { CustomFile as File, FileItem } from '@/models/datasets'
import { fireEvent, render, screen } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { PROGRESS_NOT_STARTED } from './constants'
import FileUploader from './index'
// Mock react-i18next
vi.mock('react-i18next', () => ({
useTranslation: () => ({
t: (key: string) => {
const translations: Record<string, string> = {
'stepOne.uploader.title': 'Upload Files',
'stepOne.uploader.button': 'Drag and drop files, or',
'stepOne.uploader.buttonSingleFile': 'Drag and drop file, or',
'stepOne.uploader.browse': 'Browse',
'stepOne.uploader.tip': 'Supports various file types',
}
return translations[key] || key
},
}),
}))
// Mock ToastContext
const mockNotify = vi.fn()
vi.mock('use-context-selector', async () => {
const actual = await vi.importActual<typeof import('use-context-selector')>('use-context-selector')
return {
...actual,
useContext: vi.fn(() => ({ notify: mockNotify })),
}
})
// Mock services
vi.mock('@/service/base', () => ({
upload: vi.fn().mockResolvedValue({ id: 'uploaded-id' }),
}))
vi.mock('@/service/use-common', () => ({
useFileUploadConfig: () => ({
data: { file_size_limit: 15, batch_count_limit: 5, file_upload_limit: 10 },
}),
useFileSupportTypes: () => ({
data: { allowed_extensions: ['pdf', 'docx', 'txt'] },
}),
}))
vi.mock('@/context/i18n', () => ({
useLocale: () => 'en-US',
}))
vi.mock('@/i18n-config/language', () => ({
LanguagesSupported: ['en-US', 'zh-Hans'],
}))
vi.mock('@/config', () => ({
IS_CE_EDITION: false,
}))
vi.mock('@/app/components/base/file-uploader/utils', () => ({
getFileUploadErrorMessage: () => 'Upload error',
}))
// Mock theme
vi.mock('@/hooks/use-theme', () => ({
default: () => ({ theme: 'light' }),
}))
vi.mock('@/types/app', () => ({
Theme: { dark: 'dark', light: 'light' },
}))
// Mock DocumentFileIcon - uses relative path from file-list-item.tsx
vi.mock('@/app/components/datasets/common/document-file-icon', () => ({
default: ({ extension }: { extension: string }) => <div data-testid="document-icon">{extension}</div>,
}))
// Mock SimplePieChart
vi.mock('next/dynamic', () => ({
default: () => {
const Component = ({ percentage }: { percentage: number }) => (
<div data-testid="pie-chart">
{percentage}
%
</div>
)
return Component
},
}))
describe('FileUploader', () => {
const createMockFile = (overrides: Partial<File> = {}): File => ({
name: 'test.pdf',
size: 1024,
type: 'application/pdf',
...overrides,
} as File)
const createMockFileItem = (overrides: Partial<FileItem> = {}): FileItem => ({
fileID: `file-${Date.now()}`,
file: createMockFile(overrides.file as Partial<File>),
progress: PROGRESS_NOT_STARTED,
...overrides,
})
const defaultProps = {
fileList: [] as FileItem[],
prepareFileList: vi.fn(),
onFileUpdate: vi.fn(),
onFileListUpdate: vi.fn(),
onPreview: vi.fn(),
supportBatchUpload: true,
}
beforeEach(() => {
vi.clearAllMocks()
})
describe('rendering', () => {
it('should render the component', () => {
render(<FileUploader {...defaultProps} />)
expect(screen.getByText('Upload Files')).toBeInTheDocument()
})
it('should render dropzone when no files', () => {
render(<FileUploader {...defaultProps} />)
expect(screen.getByText(/Drag and drop files/i)).toBeInTheDocument()
})
it('should render browse button', () => {
render(<FileUploader {...defaultProps} />)
expect(screen.getByText('Browse')).toBeInTheDocument()
})
it('should apply custom title className', () => {
render(<FileUploader {...defaultProps} titleClassName="custom-class" />)
const title = screen.getByText('Upload Files')
expect(title).toHaveClass('custom-class')
})
})
describe('file list rendering', () => {
it('should render file items when fileList has items', () => {
const fileList = [
createMockFileItem({ file: createMockFile({ name: 'file1.pdf' }) }),
createMockFileItem({ file: createMockFile({ name: 'file2.pdf' }) }),
]
render(<FileUploader {...defaultProps} fileList={fileList} />)
expect(screen.getByText('file1.pdf')).toBeInTheDocument()
expect(screen.getByText('file2.pdf')).toBeInTheDocument()
})
it('should render document icons for files', () => {
const fileList = [createMockFileItem()]
render(<FileUploader {...defaultProps} fileList={fileList} />)
expect(screen.getByTestId('document-icon')).toBeInTheDocument()
})
})
describe('batch upload mode', () => {
it('should show dropzone with batch upload enabled', () => {
render(<FileUploader {...defaultProps} supportBatchUpload={true} />)
expect(screen.getByText(/Drag and drop files/i)).toBeInTheDocument()
})
it('should show single file text when batch upload disabled', () => {
render(<FileUploader {...defaultProps} supportBatchUpload={false} />)
expect(screen.getByText(/Drag and drop file/i)).toBeInTheDocument()
})
it('should hide dropzone when not batch upload and has files', () => {
const fileList = [createMockFileItem()]
render(<FileUploader {...defaultProps} supportBatchUpload={false} fileList={fileList} />)
expect(screen.queryByText(/Drag and drop/i)).not.toBeInTheDocument()
})
})
describe('event handlers', () => {
it('should handle file preview click', () => {
const onPreview = vi.fn()
const fileItem = createMockFileItem({
file: createMockFile({ id: 'file-id' } as Partial<File>),
})
const { container } = render(<FileUploader {...defaultProps} fileList={[fileItem]} onPreview={onPreview} />)
// Find the file list item container by its class pattern
const fileElement = container.querySelector('[class*="flex h-12"]')
if (fileElement)
fireEvent.click(fileElement)
expect(onPreview).toHaveBeenCalledWith(fileItem.file)
})
it('should handle file remove click', () => {
const onFileListUpdate = vi.fn()
const fileItem = createMockFileItem()
const { container } = render(
<FileUploader {...defaultProps} fileList={[fileItem]} onFileListUpdate={onFileListUpdate} />,
)
// Find the delete button (the span with cursor-pointer containing the icon)
const deleteButtons = container.querySelectorAll('[class*="cursor-pointer"]')
// Get the last one which should be the delete button (not the browse label)
const deleteButton = deleteButtons[deleteButtons.length - 1]
if (deleteButton)
fireEvent.click(deleteButton)
expect(onFileListUpdate).toHaveBeenCalled()
})
it('should handle browse button click', () => {
render(<FileUploader {...defaultProps} />)
// The browse label should trigger file input click
const browseLabel = screen.getByText('Browse')
expect(browseLabel).toHaveClass('cursor-pointer')
})
})
describe('upload progress', () => {
it('should show progress chart for uploading files', () => {
const fileItem = createMockFileItem({ progress: 50 })
render(<FileUploader {...defaultProps} fileList={[fileItem]} />)
expect(screen.getByTestId('pie-chart')).toBeInTheDocument()
expect(screen.getByText('50%')).toBeInTheDocument()
})
it('should not show progress chart for completed files', () => {
const fileItem = createMockFileItem({ progress: 100 })
render(<FileUploader {...defaultProps} fileList={[fileItem]} />)
expect(screen.queryByTestId('pie-chart')).not.toBeInTheDocument()
})
it('should not show progress chart for not started files', () => {
const fileItem = createMockFileItem({ progress: PROGRESS_NOT_STARTED })
render(<FileUploader {...defaultProps} fileList={[fileItem]} />)
expect(screen.queryByTestId('pie-chart')).not.toBeInTheDocument()
})
})
describe('multiple files', () => {
it('should render all files in the list', () => {
const fileList = [
createMockFileItem({ fileID: 'f1', file: createMockFile({ name: 'doc1.pdf' }) }),
createMockFileItem({ fileID: 'f2', file: createMockFile({ name: 'doc2.docx' }) }),
createMockFileItem({ fileID: 'f3', file: createMockFile({ name: 'doc3.txt' }) }),
]
render(<FileUploader {...defaultProps} fileList={fileList} />)
expect(screen.getByText('doc1.pdf')).toBeInTheDocument()
expect(screen.getByText('doc2.docx')).toBeInTheDocument()
expect(screen.getByText('doc3.txt')).toBeInTheDocument()
})
})
describe('styling', () => {
it('should have correct container width', () => {
const { container } = render(<FileUploader {...defaultProps} />)
const wrapper = container.firstChild as HTMLElement
expect(wrapper).toHaveClass('w-[640px]')
})
it('should have proper spacing', () => {
const { container } = render(<FileUploader {...defaultProps} />)
const wrapper = container.firstChild as HTMLElement
expect(wrapper).toHaveClass('mb-5')
})
})
})

View File

@@ -1,10 +1,23 @@
'use client'
import type { CustomFile as File, FileItem } from '@/models/datasets'
import { RiDeleteBinLine, RiUploadCloud2Line } from '@remixicon/react'
import * as React from 'react'
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { useContext } from 'use-context-selector'
import { getFileUploadErrorMessage } from '@/app/components/base/file-uploader/utils'
import SimplePieChart from '@/app/components/base/simple-pie-chart'
import { ToastContext } from '@/app/components/base/toast'
import { IS_CE_EDITION } from '@/config'
import { useLocale } from '@/context/i18n'
import useTheme from '@/hooks/use-theme'
import { LanguagesSupported } from '@/i18n-config/language'
import { upload } from '@/service/base'
import { useFileSupportTypes, useFileUploadConfig } from '@/service/use-common'
import { Theme } from '@/types/app'
import { cn } from '@/utils/classnames'
import FileListItem from './components/file-list-item'
import UploadDropzone from './components/upload-dropzone'
import { useFileUpload } from './hooks/use-file-upload'
import DocumentFileIcon from '../../common/document-file-icon'
type IFileUploaderProps = {
fileList: FileItem[]
@@ -26,62 +39,358 @@ const FileUploader = ({
supportBatchUpload = false,
}: IFileUploaderProps) => {
const { t } = useTranslation()
const { notify } = useContext(ToastContext)
const locale = useLocale()
const [dragging, setDragging] = useState(false)
const dropRef = useRef<HTMLDivElement>(null)
const dragRef = useRef<HTMLDivElement>(null)
const fileUploader = useRef<HTMLInputElement>(null)
const hideUpload = !supportBatchUpload && fileList.length > 0
const {
dropRef,
dragRef,
fileUploaderRef,
dragging,
fileUploadConfig,
acceptTypes,
supportTypesShowNames,
hideUpload,
selectHandle,
fileChangeHandle,
removeFile,
handlePreview,
} = useFileUpload({
fileList,
prepareFileList,
onFileUpdate,
onFileListUpdate,
onPreview,
supportBatchUpload,
})
const { data: fileUploadConfigResponse } = useFileUploadConfig()
const { data: supportFileTypesResponse } = useFileSupportTypes()
const supportTypes = supportFileTypesResponse?.allowed_extensions || []
const supportTypesShowNames = (() => {
const extensionMap: { [key: string]: string } = {
md: 'markdown',
pptx: 'pptx',
htm: 'html',
xlsx: 'xlsx',
docx: 'docx',
}
return [...supportTypes]
.map(item => extensionMap[item] || item) // map to standardized extension
.map(item => item.toLowerCase()) // convert to lower case
.filter((item, index, self) => self.indexOf(item) === index) // remove duplicates
.map(item => item.toUpperCase()) // convert to upper case
.join(locale !== LanguagesSupported[1] ? ', ' : '、 ')
})()
const ACCEPTS = supportTypes.map((ext: string) => `.${ext}`)
const fileUploadConfig = useMemo(() => ({
file_size_limit: fileUploadConfigResponse?.file_size_limit ?? 15,
batch_count_limit: supportBatchUpload ? (fileUploadConfigResponse?.batch_count_limit ?? 5) : 1,
file_upload_limit: supportBatchUpload ? (fileUploadConfigResponse?.file_upload_limit ?? 5) : 1,
}), [fileUploadConfigResponse, supportBatchUpload])
const fileListRef = useRef<FileItem[]>([])
// utils
const getFileType = (currentFile: File) => {
if (!currentFile)
return ''
const arr = currentFile.name.split('.')
return arr[arr.length - 1]
}
const getFileSize = (size: number) => {
if (size / 1024 < 10)
return `${(size / 1024).toFixed(2)}KB`
return `${(size / 1024 / 1024).toFixed(2)}MB`
}
const isValid = useCallback((file: File) => {
const { size } = file
const ext = `.${getFileType(file)}`
const isValidType = ACCEPTS.includes(ext.toLowerCase())
if (!isValidType)
notify({ type: 'error', message: t('stepOne.uploader.validation.typeError', { ns: 'datasetCreation' }) })
const isValidSize = size <= fileUploadConfig.file_size_limit * 1024 * 1024
if (!isValidSize)
notify({ type: 'error', message: t('stepOne.uploader.validation.size', { ns: 'datasetCreation', size: fileUploadConfig.file_size_limit }) })
return isValidType && isValidSize
}, [fileUploadConfig, notify, t, ACCEPTS])
const fileUpload = useCallback(async (fileItem: FileItem): Promise<FileItem> => {
const formData = new FormData()
formData.append('file', fileItem.file)
const onProgress = (e: ProgressEvent) => {
if (e.lengthComputable) {
const percent = Math.floor(e.loaded / e.total * 100)
onFileUpdate(fileItem, percent, fileListRef.current)
}
}
return upload({
xhr: new XMLHttpRequest(),
data: formData,
onprogress: onProgress,
}, false, undefined, '?source=datasets')
.then((res) => {
const completeFile = {
fileID: fileItem.fileID,
file: res as unknown as File,
progress: -1,
}
const index = fileListRef.current.findIndex(item => item.fileID === fileItem.fileID)
fileListRef.current[index] = completeFile
onFileUpdate(completeFile, 100, fileListRef.current)
return Promise.resolve({ ...completeFile })
})
.catch((e) => {
const errorMessage = getFileUploadErrorMessage(e, t('stepOne.uploader.failed', { ns: 'datasetCreation' }), t)
notify({ type: 'error', message: errorMessage })
onFileUpdate(fileItem, -2, fileListRef.current)
return Promise.resolve({ ...fileItem })
})
.finally()
}, [fileListRef, notify, onFileUpdate, t])
const uploadBatchFiles = useCallback((bFiles: FileItem[]) => {
bFiles.forEach(bf => (bf.progress = 0))
return Promise.all(bFiles.map(fileUpload))
}, [fileUpload])
const uploadMultipleFiles = useCallback(async (files: FileItem[]) => {
const batchCountLimit = fileUploadConfig.batch_count_limit
const length = files.length
let start = 0
let end = 0
while (start < length) {
if (start + batchCountLimit > length)
end = length
else
end = start + batchCountLimit
const bFiles = files.slice(start, end)
await uploadBatchFiles(bFiles)
start = end
}
}, [fileUploadConfig, uploadBatchFiles])
const initialUpload = useCallback((files: File[]) => {
const filesCountLimit = fileUploadConfig.file_upload_limit
if (!files.length)
return false
if (files.length + fileList.length > filesCountLimit && !IS_CE_EDITION) {
notify({ type: 'error', message: t('stepOne.uploader.validation.filesNumber', { ns: 'datasetCreation', filesNumber: filesCountLimit }) })
return false
}
const preparedFiles = files.map((file, index) => ({
fileID: `file${index}-${Date.now()}`,
file,
progress: -1,
}))
const newFiles = [...fileListRef.current, ...preparedFiles]
prepareFileList(newFiles)
fileListRef.current = newFiles
uploadMultipleFiles(preparedFiles)
}, [prepareFileList, uploadMultipleFiles, notify, t, fileList, fileUploadConfig])
const handleDragEnter = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.target !== dragRef.current)
setDragging(true)
}
const handleDragOver = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
}
const handleDragLeave = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.target === dragRef.current)
setDragging(false)
}
type FileWithPath = {
relativePath?: string
} & File
const traverseFileEntry = useCallback(
(entry: any, prefix = ''): Promise<FileWithPath[]> => {
return new Promise((resolve) => {
if (entry.isFile) {
entry.file((file: FileWithPath) => {
file.relativePath = `${prefix}${file.name}`
resolve([file])
})
}
else if (entry.isDirectory) {
const reader = entry.createReader()
const entries: any[] = []
const read = () => {
reader.readEntries(async (results: FileSystemEntry[]) => {
if (!results.length) {
const files = await Promise.all(
entries.map(ent =>
traverseFileEntry(ent, `${prefix}${entry.name}/`),
),
)
resolve(files.flat())
}
else {
entries.push(...results)
read()
}
})
}
read()
}
else {
resolve([])
}
})
},
[],
)
const handleDrop = useCallback(
async (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
setDragging(false)
if (!e.dataTransfer)
return
const nested = await Promise.all(
Array.from(e.dataTransfer.items).map((it) => {
const entry = (it as any).webkitGetAsEntry?.()
if (entry)
return traverseFileEntry(entry)
const f = it.getAsFile?.()
return f ? Promise.resolve([f]) : Promise.resolve([])
}),
)
let files = nested.flat()
if (!supportBatchUpload)
files = files.slice(0, 1)
files = files.slice(0, fileUploadConfig.batch_count_limit)
const valid = files.filter(isValid)
initialUpload(valid)
},
[initialUpload, isValid, supportBatchUpload, traverseFileEntry, fileUploadConfig],
)
const selectHandle = () => {
if (fileUploader.current)
fileUploader.current.click()
}
const removeFile = (fileID: string) => {
if (fileUploader.current)
fileUploader.current.value = ''
fileListRef.current = fileListRef.current.filter(item => item.fileID !== fileID)
onFileListUpdate?.([...fileListRef.current])
}
const fileChangeHandle = useCallback((e: React.ChangeEvent<HTMLInputElement>) => {
let files = Array.from(e.target.files ?? []) as File[]
files = files.slice(0, fileUploadConfig.batch_count_limit)
initialUpload(files.filter(isValid))
}, [isValid, initialUpload, fileUploadConfig])
const { theme } = useTheme()
const chartColor = useMemo(() => theme === Theme.dark ? '#5289ff' : '#296dff', [theme])
useEffect(() => {
dropRef.current?.addEventListener('dragenter', handleDragEnter)
dropRef.current?.addEventListener('dragover', handleDragOver)
dropRef.current?.addEventListener('dragleave', handleDragLeave)
dropRef.current?.addEventListener('drop', handleDrop)
return () => {
dropRef.current?.removeEventListener('dragenter', handleDragEnter)
dropRef.current?.removeEventListener('dragover', handleDragOver)
dropRef.current?.removeEventListener('dragleave', handleDragLeave)
dropRef.current?.removeEventListener('drop', handleDrop)
}
}, [handleDrop])
return (
<div className="mb-5 w-[640px]">
<div className={cn('mb-1 text-sm font-semibold leading-6 text-text-secondary', titleClassName)}>
{t('stepOne.uploader.title', { ns: 'datasetCreation' })}
</div>
{!hideUpload && (
<UploadDropzone
dropRef={dropRef}
dragRef={dragRef}
fileUploaderRef={fileUploaderRef}
dragging={dragging}
supportBatchUpload={supportBatchUpload}
supportTypesShowNames={supportTypesShowNames}
fileUploadConfig={fileUploadConfig}
acceptTypes={acceptTypes}
onSelectFile={selectHandle}
onFileChange={fileChangeHandle}
<input
ref={fileUploader}
id="fileUploader"
className="hidden"
type="file"
multiple={supportBatchUpload}
accept={ACCEPTS.join(',')}
onChange={fileChangeHandle}
/>
)}
{fileList.length > 0 && (
<div className="max-w-[640px] cursor-default space-y-1">
{fileList.map(fileItem => (
<FileListItem
key={fileItem.fileID}
fileItem={fileItem}
onPreview={handlePreview}
onRemove={removeFile}
/>
))}
<div className={cn('mb-1 text-sm font-semibold leading-6 text-text-secondary', titleClassName)}>{t('stepOne.uploader.title', { ns: 'datasetCreation' })}</div>
{!hideUpload && (
<div ref={dropRef} className={cn('relative mb-2 box-border flex min-h-20 max-w-[640px] flex-col items-center justify-center gap-1 rounded-xl border border-dashed border-components-dropzone-border bg-components-dropzone-bg px-4 py-3 text-xs leading-4 text-text-tertiary', dragging && 'border-components-dropzone-border-accent bg-components-dropzone-bg-accent')}>
<div className="flex min-h-5 items-center justify-center text-sm leading-4 text-text-secondary">
<RiUploadCloud2Line className="mr-2 size-5" />
<span>
{supportBatchUpload ? t('stepOne.uploader.button', { ns: 'datasetCreation' }) : t('stepOne.uploader.buttonSingleFile', { ns: 'datasetCreation' })}
{supportTypes.length > 0 && (
<label className="ml-1 cursor-pointer text-text-accent" onClick={selectHandle}>{t('stepOne.uploader.browse', { ns: 'datasetCreation' })}</label>
)}
</span>
</div>
<div>
{t('stepOne.uploader.tip', {
ns: 'datasetCreation',
size: fileUploadConfig.file_size_limit,
supportTypes: supportTypesShowNames,
batchCount: fileUploadConfig.batch_count_limit,
totalCount: fileUploadConfig.file_upload_limit,
})}
</div>
{dragging && <div ref={dragRef} className="absolute left-0 top-0 h-full w-full" />}
</div>
)}
<div className="max-w-[640px] cursor-default space-y-1">
{fileList.map((fileItem, index) => (
<div
key={`${fileItem.fileID}-${index}`}
onClick={() => fileItem.file?.id && onPreview(fileItem.file)}
className={cn(
'flex h-12 max-w-[640px] items-center rounded-lg border border-components-panel-border bg-components-panel-on-panel-item-bg text-xs leading-3 text-text-tertiary shadow-xs',
// 'border-state-destructive-border bg-state-destructive-hover',
)}
>
<div className="flex w-12 shrink-0 items-center justify-center">
<DocumentFileIcon
size="xl"
className="shrink-0"
name={fileItem.file.name}
extension={getFileType(fileItem.file)}
/>
</div>
<div className="flex shrink grow flex-col gap-0.5">
<div className="flex w-full">
<div className="w-0 grow truncate text-sm leading-4 text-text-secondary">{fileItem.file.name}</div>
</div>
<div className="w-full truncate leading-3 text-text-tertiary">
<span className="uppercase">{getFileType(fileItem.file)}</span>
<span className="px-1 text-text-quaternary">·</span>
<span>{getFileSize(fileItem.file.size)}</span>
{/* <span className='px-1 text-text-quaternary'>·</span>
<span>10k characters</span> */}
</div>
</div>
<div className="flex w-16 shrink-0 items-center justify-end gap-1 pr-3">
{/* <span className="flex justify-center items-center w-6 h-6 cursor-pointer">
<RiErrorWarningFill className='size-4 text-text-warning' />
</span> */}
{(fileItem.progress < 100 && fileItem.progress >= 0) && (
// <div className={s.percent}>{`${fileItem.progress}%`}</div>
<SimplePieChart percentage={fileItem.progress} stroke={chartColor} fill={chartColor} animationDuration={0} />
)}
<span
className="flex h-6 w-6 cursor-pointer items-center justify-center"
onClick={(e) => {
e.stopPropagation()
removeFile(fileItem.fileID)
}}
>
<RiDeleteBinLine className="size-4 text-text-tertiary" />
</span>
</div>
</div>
))}
</div>
</div>
)
}

View File

@@ -154,7 +154,7 @@ export const GeneralChunkingOptions: FC<GeneralChunkingOptionsProps> = ({
</div>
))}
{
showSummaryIndexSetting && IS_CE_EDITION && (
showSummaryIndexSetting && (
<div className="mt-3">
<SummaryIndexSetting
entry="create-document"

View File

@@ -12,7 +12,6 @@ import Divider from '@/app/components/base/divider'
import { ParentChildChunk } from '@/app/components/base/icons/src/vender/knowledge'
import RadioCard from '@/app/components/base/radio-card'
import SummaryIndexSetting from '@/app/components/datasets/settings/summary-index-setting'
import { IS_CE_EDITION } from '@/config'
import { ChunkingMode } from '@/models/datasets'
import FileList from '../../assets/file-list-3-fill.svg'
import Note from '../../assets/note-mod.svg'
@@ -192,7 +191,7 @@ export const ParentChildOptions: FC<ParentChildOptionsProps> = ({
</div>
))}
{
showSummaryIndexSetting && IS_CE_EDITION && (
showSummaryIndexSetting && (
<div className="mt-3">
<SummaryIndexSetting
entry="create-document"

View File

@@ -1,262 +0,0 @@
import type { SimpleDocumentDetail } from '@/models/datasets'
import { render } from '@testing-library/react'
import { describe, expect, it } from 'vitest'
import { DataSourceType } from '@/models/datasets'
import { DatasourceType } from '@/models/pipeline'
import DocumentSourceIcon from './document-source-icon'
const createMockDoc = (overrides: Record<string, unknown> = {}): SimpleDocumentDetail => ({
id: 'doc-1',
position: 1,
data_source_type: DataSourceType.FILE,
data_source_info: {},
data_source_detail_dict: {},
dataset_process_rule_id: 'rule-1',
dataset_id: 'dataset-1',
batch: 'batch-1',
name: 'test-document.txt',
created_from: 'web',
created_by: 'user-1',
created_at: Date.now(),
tokens: 100,
indexing_status: 'completed',
error: null,
enabled: true,
disabled_at: null,
disabled_by: null,
archived: false,
archived_reason: null,
archived_by: null,
archived_at: null,
updated_at: Date.now(),
doc_type: null,
doc_metadata: undefined,
doc_language: 'en',
display_status: 'available',
word_count: 100,
hit_count: 10,
doc_form: 'text_model',
...overrides,
}) as unknown as SimpleDocumentDetail
describe('DocumentSourceIcon', () => {
describe('Rendering', () => {
it('should render without crashing', () => {
const doc = createMockDoc()
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
})
describe('Local File Icon', () => {
it('should render FileTypeIcon for FILE data source type', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.FILE,
data_source_info: {
upload_file: { extension: 'pdf' },
},
})
const { container } = render(<DocumentSourceIcon doc={doc} fileType="pdf" />)
const icon = container.querySelector('svg, img')
expect(icon).toBeInTheDocument()
})
it('should render FileTypeIcon for localFile data source type', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.localFile,
created_from: 'rag-pipeline',
data_source_info: {
extension: 'docx',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
const icon = container.querySelector('svg, img')
expect(icon).toBeInTheDocument()
})
it('should use extension from upload_file for legacy data source', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.FILE,
created_from: 'web',
data_source_info: {
upload_file: { extension: 'txt' },
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should use fileType prop as fallback for extension', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.FILE,
created_from: 'web',
data_source_info: {},
})
const { container } = render(<DocumentSourceIcon doc={doc} fileType="csv" />)
expect(container.firstChild).toBeInTheDocument()
})
})
describe('Notion Icon', () => {
it('should render NotionIcon for NOTION data source type', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.NOTION,
created_from: 'web',
data_source_info: {
notion_page_icon: 'https://notion.so/icon.png',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should render NotionIcon for onlineDocument data source type', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDocument,
created_from: 'rag-pipeline',
data_source_info: {
page: { page_icon: 'https://notion.so/icon.png' },
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should use page_icon for rag-pipeline created documents', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.NOTION,
created_from: 'rag-pipeline',
data_source_info: {
page: { page_icon: 'https://notion.so/custom-icon.png' },
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
})
describe('Web Crawl Icon', () => {
it('should render globe icon for WEB data source type', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.WEB,
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
const icon = container.querySelector('svg')
expect(icon).toBeInTheDocument()
expect(icon).toHaveClass('mr-1.5')
expect(icon).toHaveClass('size-4')
})
it('should render globe icon for websiteCrawl data source type', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.websiteCrawl,
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
const icon = container.querySelector('svg')
expect(icon).toBeInTheDocument()
})
})
describe('Online Drive Icon', () => {
it('should render FileTypeIcon for onlineDrive data source type', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDrive,
data_source_info: {
name: 'document.xlsx',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should extract extension from file name', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDrive,
data_source_info: {
name: 'spreadsheet.xlsx',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should handle file name without extension', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDrive,
data_source_info: {
name: 'noextension',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should handle empty file name', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDrive,
data_source_info: {
name: '',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should handle hidden files (starting with dot)', () => {
const doc = createMockDoc({
data_source_type: DatasourceType.onlineDrive,
data_source_info: {
name: '.gitignore',
},
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
})
describe('Unknown Data Source Type', () => {
it('should return null for unknown data source type', () => {
const doc = createMockDoc({
data_source_type: 'unknown',
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeNull()
})
})
describe('Edge Cases', () => {
it('should handle undefined data_source_info', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.FILE,
data_source_info: undefined,
})
const { container } = render(<DocumentSourceIcon doc={doc} />)
expect(container.firstChild).toBeInTheDocument()
})
it('should memoize the component', () => {
const doc = createMockDoc()
const { rerender, container } = render(<DocumentSourceIcon doc={doc} />)
const firstRender = container.innerHTML
rerender(<DocumentSourceIcon doc={doc} />)
expect(container.innerHTML).toBe(firstRender)
})
})
})

View File

@@ -1,100 +0,0 @@
import type { FC } from 'react'
import type { LegacyDataSourceInfo, LocalFileInfo, OnlineDocumentInfo, OnlineDriveInfo, SimpleDocumentDetail } from '@/models/datasets'
import { RiGlobalLine } from '@remixicon/react'
import * as React from 'react'
import FileTypeIcon from '@/app/components/base/file-uploader/file-type-icon'
import NotionIcon from '@/app/components/base/notion-icon'
import { extensionToFileType } from '@/app/components/datasets/hit-testing/utils/extension-to-file-type'
import { DataSourceType } from '@/models/datasets'
import { DatasourceType } from '@/models/pipeline'
type DocumentSourceIconProps = {
doc: SimpleDocumentDetail
fileType?: string
}
const isLocalFile = (dataSourceType: DataSourceType | DatasourceType) => {
return dataSourceType === DatasourceType.localFile || dataSourceType === DataSourceType.FILE
}
const isOnlineDocument = (dataSourceType: DataSourceType | DatasourceType) => {
return dataSourceType === DatasourceType.onlineDocument || dataSourceType === DataSourceType.NOTION
}
const isWebsiteCrawl = (dataSourceType: DataSourceType | DatasourceType) => {
return dataSourceType === DatasourceType.websiteCrawl || dataSourceType === DataSourceType.WEB
}
const isOnlineDrive = (dataSourceType: DataSourceType | DatasourceType) => {
return dataSourceType === DatasourceType.onlineDrive
}
const isCreateFromRAGPipeline = (createdFrom: string) => {
return createdFrom === 'rag-pipeline'
}
const getFileExtension = (fileName: string): string => {
if (!fileName)
return ''
const parts = fileName.split('.')
if (parts.length <= 1 || (parts[0] === '' && parts.length === 2))
return ''
return parts[parts.length - 1].toLowerCase()
}
const DocumentSourceIcon: FC<DocumentSourceIconProps> = React.memo(({
doc,
fileType,
}) => {
if (isOnlineDocument(doc.data_source_type)) {
return (
<NotionIcon
className="mr-1.5"
type="page"
src={
isCreateFromRAGPipeline(doc.created_from)
? (doc.data_source_info as OnlineDocumentInfo).page.page_icon
: (doc.data_source_info as LegacyDataSourceInfo).notion_page_icon
}
/>
)
}
if (isLocalFile(doc.data_source_type)) {
return (
<FileTypeIcon
type={
extensionToFileType(
isCreateFromRAGPipeline(doc.created_from)
? (doc?.data_source_info as LocalFileInfo)?.extension
: ((doc?.data_source_info as LegacyDataSourceInfo)?.upload_file?.extension ?? fileType),
)
}
className="mr-1.5"
/>
)
}
if (isOnlineDrive(doc.data_source_type)) {
return (
<FileTypeIcon
type={
extensionToFileType(
getFileExtension((doc?.data_source_info as unknown as OnlineDriveInfo)?.name),
)
}
className="mr-1.5"
/>
)
}
if (isWebsiteCrawl(doc.data_source_type)) {
return <RiGlobalLine className="mr-1.5 size-4" />
}
return null
})
DocumentSourceIcon.displayName = 'DocumentSourceIcon'
export default DocumentSourceIcon

View File

@@ -1,342 +0,0 @@
import type { ReactNode } from 'react'
import type { SimpleDocumentDetail } from '@/models/datasets'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { fireEvent, render, screen } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { DataSourceType } from '@/models/datasets'
import DocumentTableRow from './document-table-row'
const mockPush = vi.fn()
vi.mock('next/navigation', () => ({
useRouter: () => ({
push: mockPush,
}),
}))
const createTestQueryClient = () => new QueryClient({
defaultOptions: {
queries: { retry: false, gcTime: 0 },
mutations: { retry: false },
},
})
const createWrapper = () => {
const queryClient = createTestQueryClient()
return ({ children }: { children: ReactNode }) => (
<QueryClientProvider client={queryClient}>
<table>
<tbody>
{children}
</tbody>
</table>
</QueryClientProvider>
)
}
type LocalDoc = SimpleDocumentDetail & { percent?: number }
const createMockDoc = (overrides: Record<string, unknown> = {}): LocalDoc => ({
id: 'doc-1',
position: 1,
data_source_type: DataSourceType.FILE,
data_source_info: {},
data_source_detail_dict: {
upload_file: { name: 'test.txt', extension: 'txt' },
},
dataset_process_rule_id: 'rule-1',
dataset_id: 'dataset-1',
batch: 'batch-1',
name: 'test-document.txt',
created_from: 'web',
created_by: 'user-1',
created_at: Date.now(),
tokens: 100,
indexing_status: 'completed',
error: null,
enabled: true,
disabled_at: null,
disabled_by: null,
archived: false,
archived_reason: null,
archived_by: null,
archived_at: null,
updated_at: Date.now(),
doc_type: null,
doc_metadata: undefined,
doc_language: 'en',
display_status: 'available',
word_count: 500,
hit_count: 10,
doc_form: 'text_model',
...overrides,
}) as unknown as LocalDoc
// Helper to find the custom checkbox div (Checkbox component renders as a div, not a native checkbox)
const findCheckbox = (container: HTMLElement): HTMLElement | null => {
return container.querySelector('[class*="shadow-xs"]')
}
describe('DocumentTableRow', () => {
const defaultProps = {
doc: createMockDoc(),
index: 0,
datasetId: 'dataset-1',
isSelected: false,
isGeneralMode: true,
isQAMode: false,
embeddingAvailable: true,
selectedIds: [],
onSelectOne: vi.fn(),
onSelectedIdChange: vi.fn(),
onShowRenameModal: vi.fn(),
onUpdate: vi.fn(),
}
beforeEach(() => {
vi.clearAllMocks()
})
describe('Rendering', () => {
it('should render without crashing', () => {
render(<DocumentTableRow {...defaultProps} />, { wrapper: createWrapper() })
expect(screen.getByText('test-document.txt')).toBeInTheDocument()
})
it('should render index number correctly', () => {
render(<DocumentTableRow {...defaultProps} index={5} />, { wrapper: createWrapper() })
expect(screen.getByText('6')).toBeInTheDocument()
})
it('should render document name with tooltip', () => {
render(<DocumentTableRow {...defaultProps} />, { wrapper: createWrapper() })
expect(screen.getByText('test-document.txt')).toBeInTheDocument()
})
it('should render checkbox element', () => {
const { container } = render(<DocumentTableRow {...defaultProps} />, { wrapper: createWrapper() })
const checkbox = findCheckbox(container)
expect(checkbox).toBeInTheDocument()
})
})
describe('Selection', () => {
it('should show check icon when isSelected is true', () => {
const { container } = render(<DocumentTableRow {...defaultProps} isSelected />, { wrapper: createWrapper() })
// When selected, the checkbox should have a check icon (RiCheckLine svg)
const checkbox = findCheckbox(container)
expect(checkbox).toBeInTheDocument()
const checkIcon = checkbox?.querySelector('svg')
expect(checkIcon).toBeInTheDocument()
})
it('should not show check icon when isSelected is false', () => {
const { container } = render(<DocumentTableRow {...defaultProps} isSelected={false} />, { wrapper: createWrapper() })
const checkbox = findCheckbox(container)
expect(checkbox).toBeInTheDocument()
// When not selected, there should be no check icon inside the checkbox
const checkIcon = checkbox?.querySelector('svg')
expect(checkIcon).not.toBeInTheDocument()
})
it('should call onSelectOne when checkbox is clicked', () => {
const onSelectOne = vi.fn()
const { container } = render(<DocumentTableRow {...defaultProps} onSelectOne={onSelectOne} />, { wrapper: createWrapper() })
const checkbox = findCheckbox(container)
if (checkbox) {
fireEvent.click(checkbox)
expect(onSelectOne).toHaveBeenCalledWith('doc-1')
}
})
it('should stop propagation when checkbox container is clicked', () => {
const { container } = render(<DocumentTableRow {...defaultProps} />, { wrapper: createWrapper() })
// Click the div containing the checkbox (which has stopPropagation)
const checkboxContainer = container.querySelector('td')?.querySelector('div')
if (checkboxContainer) {
fireEvent.click(checkboxContainer)
expect(mockPush).not.toHaveBeenCalled()
}
})
})
describe('Row Navigation', () => {
it('should navigate to document detail on row click', () => {
render(<DocumentTableRow {...defaultProps} />, { wrapper: createWrapper() })
const row = screen.getByRole('row')
fireEvent.click(row)
expect(mockPush).toHaveBeenCalledWith('/datasets/dataset-1/documents/doc-1')
})
it('should navigate with correct datasetId and documentId', () => {
render(
<DocumentTableRow
{...defaultProps}
datasetId="custom-dataset"
doc={createMockDoc({ id: 'custom-doc' })}
/>,
{ wrapper: createWrapper() },
)
const row = screen.getByRole('row')
fireEvent.click(row)
expect(mockPush).toHaveBeenCalledWith('/datasets/custom-dataset/documents/custom-doc')
})
})
describe('Word Count Display', () => {
it('should display word count less than 1000 as is', () => {
const doc = createMockDoc({ word_count: 500 })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByText('500')).toBeInTheDocument()
})
it('should display word count 1000 or more in k format', () => {
const doc = createMockDoc({ word_count: 1500 })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByText('1.5k')).toBeInTheDocument()
})
it('should display 0 with empty style when word_count is 0', () => {
const doc = createMockDoc({ word_count: 0 })
const { container } = render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
const zeroCells = container.querySelectorAll('.text-text-tertiary')
expect(zeroCells.length).toBeGreaterThan(0)
})
it('should handle undefined word_count', () => {
const doc = createMockDoc({ word_count: undefined as unknown as number })
const { container } = render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(container).toBeInTheDocument()
})
})
describe('Hit Count Display', () => {
it('should display hit count less than 1000 as is', () => {
const doc = createMockDoc({ hit_count: 100 })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByText('100')).toBeInTheDocument()
})
it('should display hit count 1000 or more in k format', () => {
const doc = createMockDoc({ hit_count: 2500 })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByText('2.5k')).toBeInTheDocument()
})
it('should display 0 with empty style when hit_count is 0', () => {
const doc = createMockDoc({ hit_count: 0 })
const { container } = render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
const zeroCells = container.querySelectorAll('.text-text-tertiary')
expect(zeroCells.length).toBeGreaterThan(0)
})
})
describe('Chunking Mode', () => {
it('should render ChunkingModeLabel with general mode', () => {
render(<DocumentTableRow {...defaultProps} isGeneralMode isQAMode={false} />, { wrapper: createWrapper() })
// ChunkingModeLabel should be rendered
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should render ChunkingModeLabel with QA mode', () => {
render(<DocumentTableRow {...defaultProps} isGeneralMode={false} isQAMode />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
})
describe('Summary Status', () => {
it('should render SummaryStatus when summary_index_status is present', () => {
const doc = createMockDoc({ summary_index_status: 'completed' })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should not render SummaryStatus when summary_index_status is absent', () => {
const doc = createMockDoc({ summary_index_status: undefined })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
})
describe('Rename Action', () => {
it('should call onShowRenameModal when rename button is clicked', () => {
const onShowRenameModal = vi.fn()
const { container } = render(
<DocumentTableRow {...defaultProps} onShowRenameModal={onShowRenameModal} />,
{ wrapper: createWrapper() },
)
// Find the rename button by finding the RiEditLine icon's parent
const renameButtons = container.querySelectorAll('.cursor-pointer.rounded-md')
if (renameButtons.length > 0) {
fireEvent.click(renameButtons[0])
expect(onShowRenameModal).toHaveBeenCalledWith(defaultProps.doc)
expect(mockPush).not.toHaveBeenCalled()
}
})
})
describe('Operations', () => {
it('should pass selectedIds to Operations component', () => {
render(<DocumentTableRow {...defaultProps} selectedIds={['doc-1', 'doc-2']} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should pass onSelectedIdChange to Operations component', () => {
const onSelectedIdChange = vi.fn()
render(<DocumentTableRow {...defaultProps} onSelectedIdChange={onSelectedIdChange} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
})
describe('Document Source Icon', () => {
it('should render with FILE data source type', () => {
const doc = createMockDoc({ data_source_type: DataSourceType.FILE })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should render with NOTION data source type', () => {
const doc = createMockDoc({
data_source_type: DataSourceType.NOTION,
data_source_info: { notion_page_icon: 'icon.png' },
})
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should render with WEB data source type', () => {
const doc = createMockDoc({ data_source_type: DataSourceType.WEB })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
})
describe('Edge Cases', () => {
it('should handle document with very long name', () => {
const doc = createMockDoc({ name: `${'a'.repeat(500)}.txt` })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByRole('row')).toBeInTheDocument()
})
it('should handle document with special characters in name', () => {
const doc = createMockDoc({ name: '<script>test</script>.txt' })
render(<DocumentTableRow {...defaultProps} doc={doc} />, { wrapper: createWrapper() })
expect(screen.getByText('<script>test</script>.txt')).toBeInTheDocument()
})
it('should memoize the component', () => {
const wrapper = createWrapper()
const { rerender } = render(<DocumentTableRow {...defaultProps} />, { wrapper })
rerender(<DocumentTableRow {...defaultProps} />)
expect(screen.getByRole('row')).toBeInTheDocument()
})
})
})

View File

@@ -1,152 +0,0 @@
import type { FC } from 'react'
import type { SimpleDocumentDetail } from '@/models/datasets'
import { RiEditLine } from '@remixicon/react'
import { pick } from 'es-toolkit/object'
import { useRouter } from 'next/navigation'
import * as React from 'react'
import { useCallback } from 'react'
import { useTranslation } from 'react-i18next'
import Checkbox from '@/app/components/base/checkbox'
import Tooltip from '@/app/components/base/tooltip'
import ChunkingModeLabel from '@/app/components/datasets/common/chunking-mode-label'
import Operations from '@/app/components/datasets/documents/components/operations'
import SummaryStatus from '@/app/components/datasets/documents/detail/completed/common/summary-status'
import StatusItem from '@/app/components/datasets/documents/status-item'
import useTimestamp from '@/hooks/use-timestamp'
import { DataSourceType } from '@/models/datasets'
import { formatNumber } from '@/utils/format'
import DocumentSourceIcon from './document-source-icon'
import { renderTdValue } from './utils'
type LocalDoc = SimpleDocumentDetail & { percent?: number }
type DocumentTableRowProps = {
doc: LocalDoc
index: number
datasetId: string
isSelected: boolean
isGeneralMode: boolean
isQAMode: boolean
embeddingAvailable: boolean
selectedIds: string[]
onSelectOne: (docId: string) => void
onSelectedIdChange: (ids: string[]) => void
onShowRenameModal: (doc: LocalDoc) => void
onUpdate: () => void
}
const renderCount = (count: number | undefined) => {
if (!count)
return renderTdValue(0, true)
if (count < 1000)
return count
return `${formatNumber((count / 1000).toFixed(1))}k`
}
const DocumentTableRow: FC<DocumentTableRowProps> = React.memo(({
doc,
index,
datasetId,
isSelected,
isGeneralMode,
isQAMode,
embeddingAvailable,
selectedIds,
onSelectOne,
onSelectedIdChange,
onShowRenameModal,
onUpdate,
}) => {
const { t } = useTranslation()
const { formatTime } = useTimestamp()
const router = useRouter()
const isFile = doc.data_source_type === DataSourceType.FILE
const fileType = isFile ? doc.data_source_detail_dict?.upload_file?.extension : ''
const handleRowClick = useCallback(() => {
router.push(`/datasets/${datasetId}/documents/${doc.id}`)
}, [router, datasetId, doc.id])
const handleCheckboxClick = useCallback((e: React.MouseEvent) => {
e.stopPropagation()
}, [])
const handleRenameClick = useCallback((e: React.MouseEvent) => {
e.stopPropagation()
onShowRenameModal(doc)
}, [doc, onShowRenameModal])
return (
<tr
className="h-8 cursor-pointer border-b border-divider-subtle hover:bg-background-default-hover"
onClick={handleRowClick}
>
<td className="text-left align-middle text-xs text-text-tertiary">
<div className="flex items-center" onClick={handleCheckboxClick}>
<Checkbox
className="mr-2 shrink-0"
checked={isSelected}
onCheck={() => onSelectOne(doc.id)}
/>
{index + 1}
</div>
</td>
<td>
<div className="group mr-6 flex max-w-[460px] items-center hover:mr-0">
<div className="flex shrink-0 items-center">
<DocumentSourceIcon doc={doc} fileType={fileType} />
</div>
<Tooltip popupContent={doc.name}>
<span className="grow-1 truncate text-sm">{doc.name}</span>
</Tooltip>
{doc.summary_index_status && (
<div className="ml-1 hidden shrink-0 group-hover:flex">
<SummaryStatus status={doc.summary_index_status} />
</div>
)}
<div className="hidden shrink-0 group-hover:ml-auto group-hover:flex">
<Tooltip popupContent={t('list.table.rename', { ns: 'datasetDocuments' })}>
<div
className="cursor-pointer rounded-md p-1 hover:bg-state-base-hover"
onClick={handleRenameClick}
>
<RiEditLine className="h-4 w-4 text-text-tertiary" />
</div>
</Tooltip>
</div>
</div>
</td>
<td>
<ChunkingModeLabel
isGeneralMode={isGeneralMode}
isQAMode={isQAMode}
/>
</td>
<td>{renderCount(doc.word_count)}</td>
<td>{renderCount(doc.hit_count)}</td>
<td className="text-[13px] text-text-secondary">
{formatTime(doc.created_at, t('dateTimeFormat', { ns: 'datasetHitTesting' }) as string)}
</td>
<td>
<StatusItem status={doc.display_status} />
</td>
<td>
<Operations
selectedIds={selectedIds}
onSelectedIdChange={onSelectedIdChange}
embeddingAvailable={embeddingAvailable}
datasetId={datasetId}
detail={pick(doc, ['name', 'enabled', 'archived', 'id', 'data_source_type', 'doc_form', 'display_status'])}
onUpdate={onUpdate}
/>
</td>
</tr>
)
})
DocumentTableRow.displayName = 'DocumentTableRow'
export default DocumentTableRow

View File

@@ -1,4 +0,0 @@
export { default as DocumentSourceIcon } from './document-source-icon'
export { default as DocumentTableRow } from './document-table-row'
export { default as SortHeader } from './sort-header'
export { renderTdValue } from './utils'

View File

@@ -1,124 +0,0 @@
import { fireEvent, render, screen } from '@testing-library/react'
import { describe, expect, it, vi } from 'vitest'
import SortHeader from './sort-header'
describe('SortHeader', () => {
const defaultProps = {
field: 'name' as const,
label: 'File Name',
currentSortField: null,
sortOrder: 'desc' as const,
onSort: vi.fn(),
}
describe('rendering', () => {
it('should render the label', () => {
render(<SortHeader {...defaultProps} />)
expect(screen.getByText('File Name')).toBeInTheDocument()
})
it('should render the sort icon', () => {
const { container } = render(<SortHeader {...defaultProps} />)
const icon = container.querySelector('svg')
expect(icon).toBeInTheDocument()
})
})
describe('inactive state', () => {
it('should have disabled text color when not active', () => {
const { container } = render(<SortHeader {...defaultProps} />)
const icon = container.querySelector('svg')
expect(icon).toHaveClass('text-text-disabled')
})
it('should not be rotated when not active', () => {
const { container } = render(<SortHeader {...defaultProps} />)
const icon = container.querySelector('svg')
expect(icon).not.toHaveClass('rotate-180')
})
})
describe('active state', () => {
it('should have tertiary text color when active', () => {
const { container } = render(
<SortHeader {...defaultProps} currentSortField="name" />,
)
const icon = container.querySelector('svg')
expect(icon).toHaveClass('text-text-tertiary')
})
it('should not be rotated when active and desc', () => {
const { container } = render(
<SortHeader {...defaultProps} currentSortField="name" sortOrder="desc" />,
)
const icon = container.querySelector('svg')
expect(icon).not.toHaveClass('rotate-180')
})
it('should be rotated when active and asc', () => {
const { container } = render(
<SortHeader {...defaultProps} currentSortField="name" sortOrder="asc" />,
)
const icon = container.querySelector('svg')
expect(icon).toHaveClass('rotate-180')
})
})
describe('interaction', () => {
it('should call onSort when clicked', () => {
const onSort = vi.fn()
render(<SortHeader {...defaultProps} onSort={onSort} />)
fireEvent.click(screen.getByText('File Name'))
expect(onSort).toHaveBeenCalledWith('name')
})
it('should call onSort with correct field', () => {
const onSort = vi.fn()
render(<SortHeader {...defaultProps} field="word_count" onSort={onSort} />)
fireEvent.click(screen.getByText('File Name'))
expect(onSort).toHaveBeenCalledWith('word_count')
})
})
describe('different fields', () => {
it('should work with word_count field', () => {
render(
<SortHeader
{...defaultProps}
field="word_count"
label="Words"
currentSortField="word_count"
/>,
)
expect(screen.getByText('Words')).toBeInTheDocument()
})
it('should work with hit_count field', () => {
render(
<SortHeader
{...defaultProps}
field="hit_count"
label="Hit Count"
currentSortField="hit_count"
/>,
)
expect(screen.getByText('Hit Count')).toBeInTheDocument()
})
it('should work with created_at field', () => {
render(
<SortHeader
{...defaultProps}
field="created_at"
label="Upload Time"
currentSortField="created_at"
/>,
)
expect(screen.getByText('Upload Time')).toBeInTheDocument()
})
})
})

View File

@@ -1,44 +0,0 @@
import type { FC } from 'react'
import type { SortField, SortOrder } from '../hooks'
import { RiArrowDownLine } from '@remixicon/react'
import * as React from 'react'
import { cn } from '@/utils/classnames'
type SortHeaderProps = {
field: Exclude<SortField, null>
label: string
currentSortField: SortField
sortOrder: SortOrder
onSort: (field: SortField) => void
}
const SortHeader: FC<SortHeaderProps> = React.memo(({
field,
label,
currentSortField,
sortOrder,
onSort,
}) => {
const isActive = currentSortField === field
const isDesc = isActive && sortOrder === 'desc'
return (
<div
className="flex cursor-pointer items-center hover:text-text-secondary"
onClick={() => onSort(field)}
>
{label}
<RiArrowDownLine
className={cn(
'ml-0.5 h-3 w-3 transition-all',
isActive ? 'text-text-tertiary' : 'text-text-disabled',
isActive && !isDesc ? 'rotate-180' : '',
)}
/>
</div>
)
})
SortHeader.displayName = 'SortHeader'
export default SortHeader

View File

@@ -1,90 +0,0 @@
import { render, screen } from '@testing-library/react'
import { describe, expect, it } from 'vitest'
import { renderTdValue } from './utils'
describe('renderTdValue', () => {
describe('Rendering', () => {
it('should render string value correctly', () => {
const { container } = render(<>{renderTdValue('test value')}</>)
expect(screen.getByText('test value')).toBeInTheDocument()
expect(container.querySelector('div')).toHaveClass('text-text-secondary')
})
it('should render number value correctly', () => {
const { container } = render(<>{renderTdValue(42)}</>)
expect(screen.getByText('42')).toBeInTheDocument()
expect(container.querySelector('div')).toHaveClass('text-text-secondary')
})
it('should render zero correctly', () => {
const { container } = render(<>{renderTdValue(0)}</>)
expect(screen.getByText('0')).toBeInTheDocument()
expect(container.querySelector('div')).toHaveClass('text-text-secondary')
})
})
describe('Null and undefined handling', () => {
it('should render dash for null value', () => {
render(<>{renderTdValue(null)}</>)
expect(screen.getByText('-')).toBeInTheDocument()
})
it('should render dash for null value with empty style', () => {
const { container } = render(<>{renderTdValue(null, true)}</>)
expect(screen.getByText('-')).toBeInTheDocument()
expect(container.querySelector('div')).toHaveClass('text-text-tertiary')
})
})
describe('Empty style', () => {
it('should apply text-text-tertiary class when isEmptyStyle is true', () => {
const { container } = render(<>{renderTdValue('value', true)}</>)
expect(container.querySelector('div')).toHaveClass('text-text-tertiary')
})
it('should apply text-text-secondary class when isEmptyStyle is false', () => {
const { container } = render(<>{renderTdValue('value', false)}</>)
expect(container.querySelector('div')).toHaveClass('text-text-secondary')
})
it('should apply text-text-secondary class when isEmptyStyle is not provided', () => {
const { container } = render(<>{renderTdValue('value')}</>)
expect(container.querySelector('div')).toHaveClass('text-text-secondary')
})
})
describe('Edge Cases', () => {
it('should handle empty string', () => {
render(<>{renderTdValue('')}</>)
// Empty string should still render but with no visible text
const div = document.querySelector('div')
expect(div).toBeInTheDocument()
})
it('should handle large numbers', () => {
render(<>{renderTdValue(1234567890)}</>)
expect(screen.getByText('1234567890')).toBeInTheDocument()
})
it('should handle negative numbers', () => {
render(<>{renderTdValue(-42)}</>)
expect(screen.getByText('-42')).toBeInTheDocument()
})
it('should handle special characters in string', () => {
render(<>{renderTdValue('<script>alert("xss")</script>')}</>)
expect(screen.getByText('<script>alert("xss")</script>')).toBeInTheDocument()
})
it('should handle unicode characters', () => {
render(<>{renderTdValue('Test Unicode: \u4E2D\u6587')}</>)
expect(screen.getByText('Test Unicode: \u4E2D\u6587')).toBeInTheDocument()
})
it('should handle very long strings', () => {
const longString = 'a'.repeat(1000)
render(<>{renderTdValue(longString)}</>)
expect(screen.getByText(longString)).toBeInTheDocument()
})
})
})

View File

@@ -1,16 +0,0 @@
import type { ReactNode } from 'react'
import { cn } from '@/utils/classnames'
import s from '../../../style.module.css'
export const renderTdValue = (value: string | number | null, isEmptyStyle = false): ReactNode => {
const className = cn(
isEmptyStyle ? 'text-text-tertiary' : 'text-text-secondary',
s.tdValue,
)
return (
<div className={className}>
{value ?? '-'}
</div>
)
}

View File

@@ -1,4 +0,0 @@
export { useDocumentActions } from './use-document-actions'
export { useDocumentSelection } from './use-document-selection'
export { useDocumentSort } from './use-document-sort'
export type { SortField, SortOrder } from './use-document-sort'

View File

@@ -1,438 +0,0 @@
import type { ReactNode } from 'react'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { act, renderHook, waitFor } from '@testing-library/react'
import { beforeEach, describe, expect, it, vi } from 'vitest'
import { DocumentActionType } from '@/models/datasets'
import * as useDocument from '@/service/knowledge/use-document'
import { useDocumentActions } from './use-document-actions'
vi.mock('@/service/knowledge/use-document')
const mockUseDocumentArchive = vi.mocked(useDocument.useDocumentArchive)
const mockUseDocumentSummary = vi.mocked(useDocument.useDocumentSummary)
const mockUseDocumentEnable = vi.mocked(useDocument.useDocumentEnable)
const mockUseDocumentDisable = vi.mocked(useDocument.useDocumentDisable)
const mockUseDocumentDelete = vi.mocked(useDocument.useDocumentDelete)
const mockUseDocumentBatchRetryIndex = vi.mocked(useDocument.useDocumentBatchRetryIndex)
const mockUseDocumentDownloadZip = vi.mocked(useDocument.useDocumentDownloadZip)
const createTestQueryClient = () => new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
})
const createWrapper = () => {
const queryClient = createTestQueryClient()
return ({ children }: { children: ReactNode }) => (
<QueryClientProvider client={queryClient}>
{children}
</QueryClientProvider>
)
}
describe('useDocumentActions', () => {
const mockMutateAsync = vi.fn()
beforeEach(() => {
vi.clearAllMocks()
// Setup all mocks with default values
const createMockMutation = () => ({
mutateAsync: mockMutateAsync,
isPending: false,
isError: false,
isSuccess: false,
isIdle: true,
data: undefined,
error: null,
mutate: vi.fn(),
reset: vi.fn(),
status: 'idle' as const,
variables: undefined,
context: undefined,
failureCount: 0,
failureReason: null,
submittedAt: 0,
})
mockUseDocumentArchive.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentArchive>)
mockUseDocumentSummary.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentSummary>)
mockUseDocumentEnable.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentEnable>)
mockUseDocumentDisable.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentDisable>)
mockUseDocumentDelete.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentDelete>)
mockUseDocumentBatchRetryIndex.mockReturnValue(createMockMutation() as unknown as ReturnType<typeof useDocument.useDocumentBatchRetryIndex>)
mockUseDocumentDownloadZip.mockReturnValue({
...createMockMutation(),
isPending: false,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
})
describe('handleAction', () => {
it('should call archive mutation when archive action is triggered', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.archive)()
})
expect(mockMutateAsync).toHaveBeenCalledWith({
datasetId: 'ds1',
documentIds: ['doc1'],
})
})
it('should call onUpdate on successful action', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.enable)()
})
await waitFor(() => {
expect(onUpdate).toHaveBeenCalled()
})
})
it('should call onClearSelection on delete action', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.delete)()
})
await waitFor(() => {
expect(onClearSelection).toHaveBeenCalled()
})
})
})
describe('handleBatchReIndex', () => {
it('should call retry index mutation', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1', 'doc2'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchReIndex()
})
expect(mockMutateAsync).toHaveBeenCalledWith({
datasetId: 'ds1',
documentIds: ['doc1', 'doc2'],
})
})
it('should call onClearSelection on success', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchReIndex()
})
await waitFor(() => {
expect(onClearSelection).toHaveBeenCalled()
expect(onUpdate).toHaveBeenCalled()
})
})
})
describe('handleBatchDownload', () => {
it('should not proceed when already downloading', async () => {
mockUseDocumentDownloadZip.mockReturnValue({
mutateAsync: mockMutateAsync,
isPending: true,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: ['doc1'],
onUpdate: vi.fn(),
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchDownload()
})
expect(mockMutateAsync).not.toHaveBeenCalled()
})
it('should call download mutation with downloadable ids', async () => {
const mockBlob = new Blob(['test'])
mockMutateAsync.mockResolvedValue(mockBlob)
mockUseDocumentDownloadZip.mockReturnValue({
mutateAsync: mockMutateAsync,
isPending: false,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1', 'doc2'],
downloadableSelectedIds: ['doc1'],
onUpdate: vi.fn(),
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchDownload()
})
expect(mockMutateAsync).toHaveBeenCalledWith({
datasetId: 'ds1',
documentIds: ['doc1'],
})
})
})
describe('isDownloadingZip', () => {
it('should reflect isPending state from mutation', () => {
mockUseDocumentDownloadZip.mockReturnValue({
mutateAsync: mockMutateAsync,
isPending: true,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: [],
downloadableSelectedIds: [],
onUpdate: vi.fn(),
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
expect(result.current.isDownloadingZip).toBe(true)
})
})
describe('error handling', () => {
it('should show error toast when handleAction fails', async () => {
mockMutateAsync.mockRejectedValue(new Error('Action failed'))
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.archive)()
})
// onUpdate should not be called on error
expect(onUpdate).not.toHaveBeenCalled()
})
it('should show error toast when handleBatchReIndex fails', async () => {
mockMutateAsync.mockRejectedValue(new Error('Re-index failed'))
const onUpdate = vi.fn()
const onClearSelection = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection,
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchReIndex()
})
// onUpdate and onClearSelection should not be called on error
expect(onUpdate).not.toHaveBeenCalled()
expect(onClearSelection).not.toHaveBeenCalled()
})
it('should show error toast when handleBatchDownload fails', async () => {
mockMutateAsync.mockRejectedValue(new Error('Download failed'))
mockUseDocumentDownloadZip.mockReturnValue({
mutateAsync: mockMutateAsync,
isPending: false,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: ['doc1'],
onUpdate: vi.fn(),
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchDownload()
})
// Mutation was called but failed
expect(mockMutateAsync).toHaveBeenCalled()
})
it('should show error toast when handleBatchDownload returns null blob', async () => {
mockMutateAsync.mockResolvedValue(null)
mockUseDocumentDownloadZip.mockReturnValue({
mutateAsync: mockMutateAsync,
isPending: false,
} as unknown as ReturnType<typeof useDocument.useDocumentDownloadZip>)
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: ['doc1'],
onUpdate: vi.fn(),
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleBatchDownload()
})
// Mutation was called but returned null
expect(mockMutateAsync).toHaveBeenCalled()
})
})
describe('all action types', () => {
it('should handle summary action', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.summary)()
})
expect(mockMutateAsync).toHaveBeenCalled()
await waitFor(() => {
expect(onUpdate).toHaveBeenCalled()
})
})
it('should handle disable action', async () => {
mockMutateAsync.mockResolvedValue({ result: 'success' })
const onUpdate = vi.fn()
const { result } = renderHook(
() => useDocumentActions({
datasetId: 'ds1',
selectedIds: ['doc1'],
downloadableSelectedIds: [],
onUpdate,
onClearSelection: vi.fn(),
}),
{ wrapper: createWrapper() },
)
await act(async () => {
await result.current.handleAction(DocumentActionType.disable)()
})
expect(mockMutateAsync).toHaveBeenCalled()
await waitFor(() => {
expect(onUpdate).toHaveBeenCalled()
})
})
})
})

View File

@@ -1,126 +0,0 @@
import type { CommonResponse } from '@/models/common'
import { useCallback, useMemo } from 'react'
import { useTranslation } from 'react-i18next'
import Toast from '@/app/components/base/toast'
import { DocumentActionType } from '@/models/datasets'
import {
useDocumentArchive,
useDocumentBatchRetryIndex,
useDocumentDelete,
useDocumentDisable,
useDocumentDownloadZip,
useDocumentEnable,
useDocumentSummary,
} from '@/service/knowledge/use-document'
import { asyncRunSafe } from '@/utils'
import { downloadBlob } from '@/utils/download'
type UseDocumentActionsOptions = {
datasetId: string
selectedIds: string[]
downloadableSelectedIds: string[]
onUpdate: () => void
onClearSelection: () => void
}
/**
* Generate a random ZIP filename for bulk document downloads.
* We intentionally avoid leaking dataset info in the exported archive name.
*/
const generateDocsZipFileName = (): string => {
const randomPart = (typeof crypto !== 'undefined' && typeof crypto.randomUUID === 'function')
? crypto.randomUUID()
: `${Date.now().toString(36)}${Math.random().toString(36).slice(2, 10)}`
return `${randomPart}-docs.zip`
}
export const useDocumentActions = ({
datasetId,
selectedIds,
downloadableSelectedIds,
onUpdate,
onClearSelection,
}: UseDocumentActionsOptions) => {
const { t } = useTranslation()
const { mutateAsync: archiveDocument } = useDocumentArchive()
const { mutateAsync: generateSummary } = useDocumentSummary()
const { mutateAsync: enableDocument } = useDocumentEnable()
const { mutateAsync: disableDocument } = useDocumentDisable()
const { mutateAsync: deleteDocument } = useDocumentDelete()
const { mutateAsync: retryIndexDocument } = useDocumentBatchRetryIndex()
const { mutateAsync: requestDocumentsZip, isPending: isDownloadingZip } = useDocumentDownloadZip()
type SupportedActionType
= | typeof DocumentActionType.archive
| typeof DocumentActionType.summary
| typeof DocumentActionType.enable
| typeof DocumentActionType.disable
| typeof DocumentActionType.delete
const actionMutationMap = useMemo(() => ({
[DocumentActionType.archive]: archiveDocument,
[DocumentActionType.summary]: generateSummary,
[DocumentActionType.enable]: enableDocument,
[DocumentActionType.disable]: disableDocument,
[DocumentActionType.delete]: deleteDocument,
} as const), [archiveDocument, generateSummary, enableDocument, disableDocument, deleteDocument])
const handleAction = useCallback((actionName: SupportedActionType) => {
return async () => {
const opApi = actionMutationMap[actionName]
if (!opApi)
return
const [e] = await asyncRunSafe<CommonResponse>(
opApi({ datasetId, documentIds: selectedIds }),
)
if (!e) {
if (actionName === DocumentActionType.delete)
onClearSelection()
Toast.notify({ type: 'success', message: t('actionMsg.modifiedSuccessfully', { ns: 'common' }) })
onUpdate()
}
else {
Toast.notify({ type: 'error', message: t('actionMsg.modifiedUnsuccessfully', { ns: 'common' }) })
}
}
}, [actionMutationMap, datasetId, selectedIds, onClearSelection, onUpdate, t])
const handleBatchReIndex = useCallback(async () => {
const [e] = await asyncRunSafe<CommonResponse>(
retryIndexDocument({ datasetId, documentIds: selectedIds }),
)
if (!e) {
onClearSelection()
Toast.notify({ type: 'success', message: t('actionMsg.modifiedSuccessfully', { ns: 'common' }) })
onUpdate()
}
else {
Toast.notify({ type: 'error', message: t('actionMsg.modifiedUnsuccessfully', { ns: 'common' }) })
}
}, [retryIndexDocument, datasetId, selectedIds, onClearSelection, onUpdate, t])
const handleBatchDownload = useCallback(async () => {
if (isDownloadingZip)
return
const [e, blob] = await asyncRunSafe(
requestDocumentsZip({ datasetId, documentIds: downloadableSelectedIds }),
)
if (e || !blob) {
Toast.notify({ type: 'error', message: t('actionMsg.downloadUnsuccessfully', { ns: 'common' }) })
return
}
downloadBlob({ data: blob, fileName: generateDocsZipFileName() })
}, [datasetId, downloadableSelectedIds, isDownloadingZip, requestDocumentsZip, t])
return {
handleAction,
handleBatchReIndex,
handleBatchDownload,
isDownloadingZip,
}
}

View File

@@ -1,317 +0,0 @@
import type { SimpleDocumentDetail } from '@/models/datasets'
import { act, renderHook } from '@testing-library/react'
import { describe, expect, it, vi } from 'vitest'
import { DataSourceType } from '@/models/datasets'
import { useDocumentSelection } from './use-document-selection'
type LocalDoc = SimpleDocumentDetail & { percent?: number }
const createMockDocument = (overrides: Partial<LocalDoc> = {}): LocalDoc => ({
id: 'doc1',
name: 'Test Document',
data_source_type: DataSourceType.FILE,
data_source_info: {},
data_source_detail_dict: {},
word_count: 100,
hit_count: 10,
created_at: 1000000,
position: 1,
doc_form: 'text_model',
enabled: true,
archived: false,
display_status: 'available',
created_from: 'api',
...overrides,
} as LocalDoc)
describe('useDocumentSelection', () => {
describe('isAllSelected', () => {
it('should return false when documents is empty', () => {
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: [],
selectedIds: [],
onSelectedIdChange,
}),
)
expect(result.current.isAllSelected).toBe(false)
})
it('should return true when all documents are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1', 'doc2'],
onSelectedIdChange,
}),
)
expect(result.current.isAllSelected).toBe(true)
})
it('should return false when not all documents are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1'],
onSelectedIdChange,
}),
)
expect(result.current.isAllSelected).toBe(false)
})
})
describe('isSomeSelected', () => {
it('should return false when no documents are selected', () => {
const docs = [createMockDocument({ id: 'doc1' })]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: [],
onSelectedIdChange,
}),
)
expect(result.current.isSomeSelected).toBe(false)
})
it('should return true when some documents are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1'],
onSelectedIdChange,
}),
)
expect(result.current.isSomeSelected).toBe(true)
})
})
describe('onSelectAll', () => {
it('should select all documents when none are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: [],
onSelectedIdChange,
}),
)
act(() => {
result.current.onSelectAll()
})
expect(onSelectedIdChange).toHaveBeenCalledWith(['doc1', 'doc2'])
})
it('should deselect all when all are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1', 'doc2'],
onSelectedIdChange,
}),
)
act(() => {
result.current.onSelectAll()
})
expect(onSelectedIdChange).toHaveBeenCalledWith([])
})
it('should add to existing selection when some are selected', () => {
const docs = [
createMockDocument({ id: 'doc1' }),
createMockDocument({ id: 'doc2' }),
createMockDocument({ id: 'doc3' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1'],
onSelectedIdChange,
}),
)
act(() => {
result.current.onSelectAll()
})
expect(onSelectedIdChange).toHaveBeenCalledWith(['doc1', 'doc2', 'doc3'])
})
})
describe('onSelectOne', () => {
it('should add document to selection when not selected', () => {
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: [],
selectedIds: [],
onSelectedIdChange,
}),
)
act(() => {
result.current.onSelectOne('doc1')
})
expect(onSelectedIdChange).toHaveBeenCalledWith(['doc1'])
})
it('should remove document from selection when already selected', () => {
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: [],
selectedIds: ['doc1', 'doc2'],
onSelectedIdChange,
}),
)
act(() => {
result.current.onSelectOne('doc1')
})
expect(onSelectedIdChange).toHaveBeenCalledWith(['doc2'])
})
})
describe('hasErrorDocumentsSelected', () => {
it('should return false when no error documents are selected', () => {
const docs = [
createMockDocument({ id: 'doc1', display_status: 'available' }),
createMockDocument({ id: 'doc2', display_status: 'error' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1'],
onSelectedIdChange,
}),
)
expect(result.current.hasErrorDocumentsSelected).toBe(false)
})
it('should return true when an error document is selected', () => {
const docs = [
createMockDocument({ id: 'doc1', display_status: 'available' }),
createMockDocument({ id: 'doc2', display_status: 'error' }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc2'],
onSelectedIdChange,
}),
)
expect(result.current.hasErrorDocumentsSelected).toBe(true)
})
})
describe('downloadableSelectedIds', () => {
it('should return only FILE type documents from selection', () => {
const docs = [
createMockDocument({ id: 'doc1', data_source_type: DataSourceType.FILE }),
createMockDocument({ id: 'doc2', data_source_type: DataSourceType.NOTION }),
createMockDocument({ id: 'doc3', data_source_type: DataSourceType.FILE }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1', 'doc2', 'doc3'],
onSelectedIdChange,
}),
)
expect(result.current.downloadableSelectedIds).toEqual(['doc1', 'doc3'])
})
it('should return empty array when no FILE documents selected', () => {
const docs = [
createMockDocument({ id: 'doc1', data_source_type: DataSourceType.NOTION }),
createMockDocument({ id: 'doc2', data_source_type: DataSourceType.WEB }),
]
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: docs,
selectedIds: ['doc1', 'doc2'],
onSelectedIdChange,
}),
)
expect(result.current.downloadableSelectedIds).toEqual([])
})
})
describe('clearSelection', () => {
it('should call onSelectedIdChange with empty array', () => {
const onSelectedIdChange = vi.fn()
const { result } = renderHook(() =>
useDocumentSelection({
documents: [],
selectedIds: ['doc1', 'doc2'],
onSelectedIdChange,
}),
)
act(() => {
result.current.clearSelection()
})
expect(onSelectedIdChange).toHaveBeenCalledWith([])
})
})
})

View File

@@ -1,66 +0,0 @@
import type { SimpleDocumentDetail } from '@/models/datasets'
import { uniq } from 'es-toolkit/array'
import { useCallback, useMemo } from 'react'
import { DataSourceType } from '@/models/datasets'
type LocalDoc = SimpleDocumentDetail & { percent?: number }
type UseDocumentSelectionOptions = {
documents: LocalDoc[]
selectedIds: string[]
onSelectedIdChange: (selectedIds: string[]) => void
}
export const useDocumentSelection = ({
documents,
selectedIds,
onSelectedIdChange,
}: UseDocumentSelectionOptions) => {
const isAllSelected = useMemo(() => {
return documents.length > 0 && documents.every(doc => selectedIds.includes(doc.id))
}, [documents, selectedIds])
const isSomeSelected = useMemo(() => {
return documents.some(doc => selectedIds.includes(doc.id))
}, [documents, selectedIds])
const onSelectAll = useCallback(() => {
if (isAllSelected)
onSelectedIdChange([])
else
onSelectedIdChange(uniq([...selectedIds, ...documents.map(doc => doc.id)]))
}, [isAllSelected, documents, onSelectedIdChange, selectedIds])
const onSelectOne = useCallback((docId: string) => {
onSelectedIdChange(
selectedIds.includes(docId)
? selectedIds.filter(id => id !== docId)
: [...selectedIds, docId],
)
}, [selectedIds, onSelectedIdChange])
const hasErrorDocumentsSelected = useMemo(() => {
return documents.some(doc => selectedIds.includes(doc.id) && doc.display_status === 'error')
}, [documents, selectedIds])
const downloadableSelectedIds = useMemo(() => {
const selectedSet = new Set(selectedIds)
return documents
.filter(doc => selectedSet.has(doc.id) && doc.data_source_type === DataSourceType.FILE)
.map(doc => doc.id)
}, [documents, selectedIds])
const clearSelection = useCallback(() => {
onSelectedIdChange([])
}, [onSelectedIdChange])
return {
isAllSelected,
isSomeSelected,
onSelectAll,
onSelectOne,
hasErrorDocumentsSelected,
downloadableSelectedIds,
clearSelection,
}
}

View File

@@ -1,340 +0,0 @@
import type { SimpleDocumentDetail } from '@/models/datasets'
import { act, renderHook } from '@testing-library/react'
import { describe, expect, it } from 'vitest'
import { useDocumentSort } from './use-document-sort'
type LocalDoc = SimpleDocumentDetail & { percent?: number }
const createMockDocument = (overrides: Partial<LocalDoc> = {}): LocalDoc => ({
id: 'doc1',
name: 'Test Document',
data_source_type: 'upload_file',
data_source_info: {},
data_source_detail_dict: {},
word_count: 100,
hit_count: 10,
created_at: 1000000,
position: 1,
doc_form: 'text_model',
enabled: true,
archived: false,
display_status: 'available',
created_from: 'api',
...overrides,
} as LocalDoc)
describe('useDocumentSort', () => {
describe('initial state', () => {
it('should return null sortField initially', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
expect(result.current.sortField).toBeNull()
expect(result.current.sortOrder).toBe('desc')
})
it('should return documents unchanged when no sort is applied', () => {
const docs = [
createMockDocument({ id: 'doc1', name: 'B' }),
createMockDocument({ id: 'doc2', name: 'A' }),
]
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
expect(result.current.sortedDocuments).toEqual(docs)
})
})
describe('handleSort', () => {
it('should set sort field when called', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortField).toBe('name')
expect(result.current.sortOrder).toBe('desc')
})
it('should toggle sort order when same field is clicked twice', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortOrder).toBe('desc')
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortOrder).toBe('asc')
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortOrder).toBe('desc')
})
it('should reset to desc when different field is selected', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortOrder).toBe('asc')
act(() => {
result.current.handleSort('word_count')
})
expect(result.current.sortField).toBe('word_count')
expect(result.current.sortOrder).toBe('desc')
})
it('should not change state when null is passed', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort(null)
})
expect(result.current.sortField).toBeNull()
})
})
describe('sorting documents', () => {
const docs = [
createMockDocument({ id: 'doc1', name: 'Banana', word_count: 200, hit_count: 5, created_at: 3000 }),
createMockDocument({ id: 'doc2', name: 'Apple', word_count: 100, hit_count: 10, created_at: 1000 }),
createMockDocument({ id: 'doc3', name: 'Cherry', word_count: 300, hit_count: 1, created_at: 2000 }),
]
it('should sort by name descending', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
const names = result.current.sortedDocuments.map(d => d.name)
expect(names).toEqual(['Cherry', 'Banana', 'Apple'])
})
it('should sort by name ascending', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
act(() => {
result.current.handleSort('name')
})
const names = result.current.sortedDocuments.map(d => d.name)
expect(names).toEqual(['Apple', 'Banana', 'Cherry'])
})
it('should sort by word_count descending', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('word_count')
})
const counts = result.current.sortedDocuments.map(d => d.word_count)
expect(counts).toEqual([300, 200, 100])
})
it('should sort by hit_count ascending', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('hit_count')
})
act(() => {
result.current.handleSort('hit_count')
})
const counts = result.current.sortedDocuments.map(d => d.hit_count)
expect(counts).toEqual([1, 5, 10])
})
it('should sort by created_at descending', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('created_at')
})
const times = result.current.sortedDocuments.map(d => d.created_at)
expect(times).toEqual([3000, 2000, 1000])
})
})
describe('status filtering', () => {
const docs = [
createMockDocument({ id: 'doc1', display_status: 'available' }),
createMockDocument({ id: 'doc2', display_status: 'error' }),
createMockDocument({ id: 'doc3', display_status: 'available' }),
]
it('should not filter when statusFilterValue is empty', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
expect(result.current.sortedDocuments.length).toBe(3)
})
it('should not filter when statusFilterValue is all', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: 'all',
remoteSortValue: '',
}),
)
expect(result.current.sortedDocuments.length).toBe(3)
})
})
describe('remoteSortValue reset', () => {
it('should reset sort state when remoteSortValue changes', () => {
const { result, rerender } = renderHook(
({ remoteSortValue }) =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue,
}),
{ initialProps: { remoteSortValue: 'initial' } },
)
act(() => {
result.current.handleSort('name')
})
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortField).toBe('name')
expect(result.current.sortOrder).toBe('asc')
rerender({ remoteSortValue: 'changed' })
expect(result.current.sortField).toBeNull()
expect(result.current.sortOrder).toBe('desc')
})
})
describe('edge cases', () => {
it('should handle documents with missing values', () => {
const docs = [
createMockDocument({ id: 'doc1', name: undefined as unknown as string, word_count: undefined }),
createMockDocument({ id: 'doc2', name: 'Test', word_count: 100 }),
]
const { result } = renderHook(() =>
useDocumentSort({
documents: docs,
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortedDocuments.length).toBe(2)
})
it('should handle empty documents array', () => {
const { result } = renderHook(() =>
useDocumentSort({
documents: [],
statusFilterValue: '',
remoteSortValue: '',
}),
)
act(() => {
result.current.handleSort('name')
})
expect(result.current.sortedDocuments).toEqual([])
})
})
})

Some files were not shown because too many files have changed in this diff Show More