Compare commits

...

109 Commits

Author SHA1 Message Date
GareArc
825765231b fix: remove extra exempts
Some checks are pending
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/amd64, build-api-amd64) (push) Waiting to run
Build and Push API & Web / build (api, DIFY_API_IMAGE_NAME, linux/arm64, build-api-arm64) (push) Waiting to run
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/amd64, build-web-amd64) (push) Waiting to run
Build and Push API & Web / build (web, DIFY_WEB_IMAGE_NAME, linux/arm64, build-web-arm64) (push) Waiting to run
Build and Push API & Web / create-manifest (api, DIFY_API_IMAGE_NAME, merge-api-images) (push) Blocked by required conditions
Build and Push API & Web / create-manifest (web, DIFY_WEB_IMAGE_NAME, merge-web-images) (push) Blocked by required conditions
2026-03-05 01:10:59 -08:00
GareArc
4e35fbbff4 Merge branch 'fix/enterprise-api-error-handling' into deploy/enterprise 2026-03-05 00:27:55 -08:00
GareArc
11f657019a Squash merge fix/enterprise-api-error-handling into deploy/enterprise
2026-03-04 22:31:21 -08:00
GareArc
eaea4ad6dd fix: use payload.id instead of undefined args in set_default_provider 2026-03-04 22:28:35 -08:00
GareArc
7007aa3c61 Merge branch 'fix/enterprise-api-error-handling' into deploy/enterprise 2026-03-04 19:54:13 -08:00
GareArc
8049c90a38 Merge remote-tracking branch 'origin/release/e-1.12.1' into deploy/enterprise 2026-03-04 17:32:33 -08:00
GareArc
3f771544b1 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-04 17:31:51 -08:00
GareArc
ee13650e3d fix(api): restore missing reg(ModelConfig) from 1.12.1 refactor 2026-03-04 17:31:19 -08:00
GareArc
9fa8f6235e Merge branch 'release/e-1.12.1' into 1.12.1-otel-ee 2026-03-04 16:59:21 -08:00
GareArc
5d54c198c0 Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-03-02 20:01:15 -08:00
GareArc
6536489195 fix(telemetry): restore TRACE_TASK_TO_CASE lookup broken by CE safety refactor
The CE safety commit (8a3485454a) converted module-level dicts to lazy
functions but forgot to update __init__.py, which still imported the
now-deleted TRACE_TASK_TO_CASE constant, causing an ImportError at startup.

Add get_trace_task_to_case() to gateway.py as a lazy public wrapper
(inverse of _get_case_to_trace_task) and update __init__.py to call it.
2026-03-02 19:59:20 -08:00
GareArc
8f1d2455f4 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-02 18:50:39 -08:00
GareArc
8a3485454a fix(telemetry): ensure CE safety for enterprise-only imports and DB lookups
- Move enqueue_draft_node_execution_trace import inside call site in workflow_service.py
- Make gateway.py enterprise type imports lazy (routing dicts built on first call)
- Restore typed ModelConfig in llm_generator method signatures (revert dict regression)
- Fix generate_structured_output using wrong key model_parameters -> completion_params
- Replace unsafe cast(str, msg.content) with get_text_content() across llm_generator
- Remove duplicated payload classes from generator.py, import from core.llm_generator.entities
- Gate _lookup_app_and_workspace_names and credential lookups in ops_trace_manager behind is_enterprise_telemetry_enabled()
2026-03-02 18:45:33 -08:00
GareArc
cf15f0d681 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-02 15:56:52 -08:00
GareArc
d6de27a25a feat(telemetry): promote gen_ai scalar fields from log-only to span attributes
Move gen_ai.usage.*, gen_ai.request.model, gen_ai.provider.name, and
gen_ai.user.id from companion-log-only to span attributes on workflow
and node execution spans.

These are small scalars with no size risk. Having them on spans enables
filtering and grouping in trace UIs (Tempo, Jaeger, Datadog) without
requiring a cross-signal join to companion logs.

Data dictionary updated: span tables gain the new fields; companion log
'additional attributes' tables trimmed to only list fields not already
covered by 'All span attributes'.
2026-03-02 15:55:10 -08:00
GareArc
11ab67c8cb Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-03-02 04:20:06 -08:00
GareArc
fe741140d5 fix(telemetry): fix zero-value message and workflow duration histograms
Workflow RT: replace float(info.workflow_run_elapsed_time) with
(end_time - start_time).total_seconds() using workflow_run.created_at and
workflow_run.finished_at. The elapsed_time DB field defaults to 0 and can
be stale if the workflow_storage Celery task has not committed yet when the
trace fires. Wall-clock timestamps are more reliable; elapsed_time is kept
as a fallback.

Message RT: change end_time from created_at + provider_response_latency to
message.updated_at when updated_at > created_at. The pipeline explicitly
sets message.updated_at = naive_utc_now() at the moment the LLM response
is complete, making it the canonical response-complete timestamp.
Falls back to the latency-based calculation for error/aborted messages.
2026-03-02 04:14:57 -08:00
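
The two duration rules described above might look like this in isolation. The function names and signatures are hypothetical; only the field names (created_at, finished_at, updated_at, elapsed_time, provider_response_latency) come from the commit message, and the sketch assumes the latency is expressed in seconds.

```python
from datetime import datetime, timedelta
from typing import Optional


def workflow_duration_seconds(
    created_at: Optional[datetime],
    finished_at: Optional[datetime],
    elapsed_time: float,
) -> float:
    """Prefer wall-clock timestamps; fall back to the stored elapsed_time."""
    if created_at is not None and finished_at is not None:
        return (finished_at - created_at).total_seconds()
    return float(elapsed_time)


def message_end_time(
    created_at: datetime,
    updated_at: Optional[datetime],
    provider_response_latency: float,
) -> datetime:
    """Use updated_at when it is later than created_at, else the latency-based value."""
    if updated_at is not None and updated_at > created_at:
        return updated_at
    return created_at + timedelta(seconds=provider_response_latency)
```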
GareArc
9b5b355a4e fix(telemetry): gate ObservabilityLayer content attrs behind ENTERPRISE_INCLUDE_CONTENT
Add should_include_content() helper to extensions/otel/parser/base.py that
returns True in CE (no behaviour change) and respects ENTERPRISE_INCLUDE_CONTENT
in EE. Gate all content-bearing span attributes in LLM, retrieval, tool, and
default node parsers so that gen_ai.completion, gen_ai.prompt, retrieval.document,
tool call arguments/results, and node input/output values are suppressed when
ENTERPRISE_ENABLED=True and ENTERPRISE_INCLUDE_CONTENT=False.
2026-03-02 04:04:26 -08:00
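
A rough sketch of the gating rule, under the assumption that the two settings are passed in as booleans (the real helper presumably reads them from configuration). The `gate_content_attributes` function and the `CONTENT_KEYS` set are hypothetical and list only keys named in the commit message.

```python
def should_include_content(enterprise_enabled: bool, enterprise_include_content: bool) -> bool:
    """CE (enterprise disabled) always includes content; EE follows the flag."""
    if not enterprise_enabled:
        return True
    return enterprise_include_content


# Content-bearing attribute keys named in the commit message (not exhaustive).
CONTENT_KEYS = {"gen_ai.completion", "gen_ai.prompt", "retrieval.document"}


def gate_content_attributes(attrs: dict, include_content: bool) -> dict:
    """Drop content-bearing attributes when content must be suppressed."""
    if include_content:
        return dict(attrs)
    return {key: value for key, value in attrs.items() if key not in CONTENT_KEYS}
```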
GareArc
ff35f1bfaa Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-02 02:28:30 -08:00
GareArc
3364003f90 fix(telemetry): add credential_name lookup with async-safe fallback 2026-03-02 02:27:31 -08:00
GareArc
e387d0205b Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-02 01:54:55 -08:00
GareArc
6df00c83ae fix(telemetry): populate LLM credential info in node execution traces
- Add _lookup_llm_credential_info() to query Provider/ProviderModel tables
- Lookup LLM credentials when tool credential_id is null
- Fall back to provider-level credential if no model-specific credential
2026-03-02 01:47:39 -08:00
GareArc
05cf2336ac docs(telemetry): add token consumption query patterns to data dictionary
Add token hierarchy diagram, common PromQL queries (totals, drill-down,
rates), and app name lookup via trace query.
2026-03-02 01:19:00 -08:00
GareArc
b710c9ad59 fix(telemetry): populate missing fields in node execution trace
- Extract model_provider/model_name from process_data (LLM nodes store
  model info there, not in execution_metadata)
- Add invoke_from to node execution trace metadata dict
- Add credential_id to node execution trace metadata dict
- Add conversation_id to metadata after message_id lookup
- Add tool_name to tool_info dict in tool node
2026-03-02 01:18:59 -08:00
GareArc
a2a5b02a53 docs(telemetry): add token consumption query patterns to data dictionary
Add token hierarchy diagram, common PromQL queries (totals, drill-down,
rates), and app name lookup via trace query.
2026-03-02 01:07:18 -08:00
GareArc
1fcb05432d fix(telemetry): populate missing fields in node execution trace
- Extract model_provider/model_name from process_data (LLM nodes store
  model info there, not in execution_metadata)
- Add invoke_from to node execution trace metadata dict
- Add credential_id to node execution trace metadata dict
- Add conversation_id to metadata after message_id lookup
- Add tool_name to tool_info dict in tool node
2026-03-02 01:07:10 -08:00
L1nSn0w
9c148218fc Merge branch 'deploy/enterprise' of https://github.com/langgenius/dify into deploy/enterprise 2026-03-02 16:58:01 +08:00
L1nSn0w
02ab3a34b4 Merge branch 'release/e-1.12.1' into deploy/enterprise 2026-03-02 16:57:31 +08:00
GareArc
aa7f648712 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-01 22:30:09 -08:00
GareArc
9d4b2715e8 fix(celery): register enterprise_telemetry_task in worker imports
Fixes Celery worker error where process_enterprise_telemetry task
was unregistered despite being dispatched from the app.

Added conditional import when ENTERPRISE_TELEMETRY_ENABLED=true
to ensure the task is available in the worker process.

Resolves: KeyError 'tasks.enterprise_telemetry_task.process_enterprise_telemetry'
2026-03-01 22:27:44 -08:00
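
The conditional registration can be pictured as building the worker's import list from the flag. This is a sketch only: `build_celery_imports` and `tasks.example_task` are hypothetical, while the `tasks.enterprise_telemetry_task` module path is taken from the KeyError quoted above.

```python
def build_celery_imports(enterprise_telemetry_enabled: bool) -> list:
    """Return the module paths a worker should import so its tasks get registered."""
    imports = ["tasks.example_task"]  # hypothetical always-on task module
    if enterprise_telemetry_enabled:
        # Without this entry the worker never registers the task and
        # dispatching it fails with a KeyError on the task name.
        imports.append("tasks.enterprise_telemetry_task")
    return imports
```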
GareArc
e2fc3417be Merge branch 'fix/otel-upgrade-e-1.12.1' into deploy/enterprise 2026-03-01 21:48:37 -08:00
GareArc
eb1b1eb09c Merge 1.12.1-otel-ee into deploy/enterprise
2026-03-01 19:37:06 -08:00
GareArc
83f5850d0a refactor(telemetry): add resolved_parent_context property and fix edge cases
- Add resolved_parent_context property to BaseTraceInfo for reusable parent context extraction
- Refactor enterprise_trace.py to use property instead of duplicated dict plucking (~19 lines eliminated)
- Fix UUID validation in exporter.py with specific error logging for invalid trace correlation IDs
- Add error isolation in event_handlers.py to prevent telemetry failures from breaking user operations
- Replace pickle-based payload_fallback with JSON storage rehydration for security
- Update TelemetryEnvelope to use Pydantic v2 ConfigDict with extra='forbid'
- Update tests to reflect contract changes and new error handling behavior
2026-03-01 19:33:59 -08:00
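
The error-isolation item above could be approximated with a small decorator. The decorator name and the example handler are hypothetical, and the actual event_handlers.py may isolate failures differently.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def isolate_telemetry_errors(func):
    """Log and swallow exceptions so a telemetry failure never reaches the caller."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("telemetry handler %s failed", func.__name__)
            return None

    return wrapper


@isolate_telemetry_errors
def handle_app_created(payload: dict) -> str:
    # Hypothetical handler: raises KeyError when the payload is malformed.
    return payload["app_id"]
```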
yunlu.wen
3368d4cf02 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-03-02 10:10:28 +08:00
yunlu.wen
7a92c1764f fix token label 2026-03-02 10:10:01 +08:00
yunlu.wen
5617d69ca7 try to fix exception logging 2026-03-02 09:53:11 +08:00
GareArc
1a6aded8e0 Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-03-01 02:25:23 -08:00
GareArc
9952a17fed fix(telemetry): use URL scheme instead of API key for gRPC TLS detection
- Change insecure parameter from API key-based to URL scheme-based detection
- https:// endpoints now correctly use TLS (insecure=False)
- All other endpoints (http://, no scheme) use insecure=True
- Update tests to reflect URL scheme-based logic
- Remove incorrect documentation claiming API key controls TLS
2026-03-01 02:24:25 -08:00
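
The scheme-based rule is simple enough to state as one function. The function name is hypothetical; the behavior follows the bullets above (https:// means TLS, everything else is treated as insecure).

```python
def is_insecure_endpoint(endpoint: str) -> bool:
    """Return the value for the exporter's insecure flag, based only on the URL scheme."""
    return not endpoint.strip().lower().startswith("https://")
```

Note that the API key plays no part in the decision: an endpoint such as `collector:4317` with no scheme stays insecure whether or not a key is configured.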
GareArc
36ff9b447d Merge origin/release/e-1.12.1 into 1.12.1-otel-ee
Sync enterprise 1.12.1 changes:
- feat: implement heartbeat mechanism for database migration lock
- refactor: replace AutoRenewRedisLock with DbMigrationAutoRenewLock
- fix: improve logging for database migration lock release
- fix: make flask upgrade-db fail on error
- fix: include sso_verified in access_mode validation
- fix: inherit web app permission from original app
- fix: make e-1.12.1 enterprise migrations database-agnostic
- fix: get_message_event_type return wrong message type
- refactor: document_indexing_sync_task split db session
- fix: trigger output schema miss
- test: remove unrelated enterprise service test

Conflict resolution:
- Combined OTEL telemetry imports with tool signature import in easy_ui_based_generate_task_pipeline.py
2026-03-01 00:18:46 -08:00
GareArc
1fa1960201 Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-02-28 20:34:15 -08:00
GareArc
ff877ee39c fix(telemetry): add resolved_trace_id property to eliminate trace_id inconsistencies
Add computed property to BaseTraceInfo that provides intelligent fallback:
1. External trace_id (from X-Trace-Id header)
2. workflow_run_id (for workflow-related traces)
3. message_id (as final fallback)

This ensures attribute dify.trace_id always matches log-level trace_id,
eliminating inconsistencies where attribute was null but log-level had value.

Changes:
- Add resolved_trace_id property to BaseTraceInfo (trace_entity.py)
- Replace 4 direct trace_id attribute assignments with resolved_trace_id
- Add trace_id_source parameter to 5 emit_metric_only_event calls

Fixes trace_id inconsistency found in MESSAGE_RUN, TOOL_EXECUTION,
MODERATION_CHECK, SUGGESTED_QUESTION_GENERATION, GENERATE_NAME_EXECUTION,
DATASET_RETRIEVAL, and PROMPT_GENERATION_EXECUTION events.

All 78 telemetry tests passing.
2026-02-28 20:32:15 -08:00
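
The fallback order can be sketched with a small dataclass. `TraceInfoSketch` is a hypothetical stand-in for BaseTraceInfo that carries only the three fields involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceInfoSketch:
    trace_id: Optional[str] = None         # external ID, e.g. from X-Trace-Id
    workflow_run_id: Optional[str] = None
    message_id: Optional[str] = None

    @property
    def resolved_trace_id(self) -> Optional[str]:
        """External trace_id first, then workflow_run_id, then message_id."""
        return self.trace_id or self.workflow_run_id or self.message_id
```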
GareArc
370e1fa5e2 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-02-28 19:30:49 -08:00
GareArc
abcf14a571 refactor(telemetry): move gateway to core as stateless module-level functions
Move routing table, emit(), and is_enterprise_telemetry_enabled() from
enterprise/telemetry/gateway.py into core/telemetry/gateway.py so both
CE and EE share one code path. The ce_eligible flag in CASE_ROUTING
controls which events flow in CE — flipping it is the only change needed
to enable an event in community edition.

- Delete enterprise/telemetry/gateway.py (class-based singleton)
- Create core/telemetry/gateway.py (stateless functions, no shared state)
- Simplify core/telemetry/__init__.py to thin facade over gateway
- Remove TelemetryGateway class and get_gateway() from ext_enterprise_telemetry
- Single-source is_enterprise_telemetry_enabled in core.telemetry.gateway
- Fix pre-existing test bugs (missing dify.event.id in metric handler tests)
- Update all imports and mock paths across 7 test files
2026-02-28 19:27:24 -08:00
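
One way to picture the ce_eligible flag is a routing table of frozen records. The case names, the `CaseRoute` type, and the `should_emit` helper are hypothetical; only CASE_ROUTING and ce_eligible appear in the commit message.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CaseRoute:
    signal_type: str   # "trace" or "metric_log"
    ce_eligible: bool  # whether the event also flows in community edition


# Hypothetical case names; the real table is keyed by an enum.
CASE_ROUTING = {
    "workflow_run": CaseRoute("trace", ce_eligible=True),
    "app_created": CaseRoute("metric_log", ce_eligible=False),
}


def should_emit(case: str, enterprise_enabled: bool, routing: dict = CASE_ROUTING) -> bool:
    """Stateless check: EE emits every known case, CE only the eligible ones."""
    route = routing.get(case)
    if route is None:
        return False
    return enterprise_enabled or route.ce_eligible
```

Enabling an event in CE then amounts to flipping one flag, for example `replace(CASE_ROUTING["app_created"], ce_eligible=True)`.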
GareArc
9bd938b4e1 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-02-28 17:41:17 -08:00
GareArc
5e57f73598 feat(telemetry): add model provider and name tags to all trace metrics
Add comprehensive model tracking across all OTEL metrics and logs:
- Node execution metrics now include model_name for LLM operations
- Suggested question metrics include model_provider and model_name
- Dataset retrieval captures both embedding and rerank model info
- Updated DATA_DICTIONARY.md with complete metric label documentation

This enables granular cost tracking, performance analysis, and usage monitoring per model across all operation types.
2026-02-28 00:06:44 -08:00
GareArc
62592be60b docs(enterprise): split telemetry docs into README and data dictionary
Separate background/configuration instructions from the data dictionary:
- README.md: Overview, configuration, correlation model, content gating
- DATA_DICTIONARY.md: Pure reference format with signals and attributes

The data dictionary is now concise (465 lines vs 911) and focuses on
attribute types and relationships without verbose explanations.
2026-02-27 12:32:48 -08:00
L1nSn0w
7a8c96b4b7 Merge branch 'release/e-1.12.1' into deploy/enterprise
2026-02-14 17:00:06 +08:00
GareArc
f17b51ab3a Merge branch 'fix/access-mode-sso-verified-e-1.12.1' into deploy/enterprise 2026-02-13 23:41:04 -08:00
GareArc
23c75c7ec7 fix: centralize access_mode validation and support sso_verified
- Add ALLOWED_ACCESS_MODES constant to centralize valid access modes
- Include 'sso_verified' in validation to fix app duplication errors
- Update error message to dynamically list all allowed modes
- Refactor for maintainability: single source of truth for access modes

This fixes the issue where apps with access_mode='sso_verified' could not
be duplicated because the validation in update_app_access_mode() was missing
this mode, even though it was documented in WebAppSettings model.
2026-02-13 23:29:05 -08:00
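
A minimal sketch of the centralized validation. The mode names are the four listed elsewhere in this log (public, private, private_all, sso_verified); the function name and the exact error wording are assumptions.

```python
ALLOWED_ACCESS_MODES = ("public", "private", "private_all", "sso_verified")


def validate_access_mode(access_mode: str) -> str:
    """Reject unknown modes with a message built from the single source of truth."""
    if access_mode not in ALLOWED_ACCESS_MODES:
        allowed = ", ".join(ALLOWED_ACCESS_MODES)
        raise ValueError(f"access_mode must be one of: {allowed}")
    return access_mode
```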
GareArc
588e6561dc Merge branch 'hotfix/e-1.12.1-app-copy-inherit-webapp-permission' into deploy/enterprise 2026-02-13 22:42:35 -08:00
GareArc
efbdb4c706 fix(app-copy): inherit web app permission from original app
When copying an app, the copied app was not getting a web_app_settings
record created. This caused the enterprise service to query for settings
that don't exist, falling back to default behavior.

This fix ensures copied apps inherit the same access mode as the original:
- If original has explicit settings (public/private/private_all/sso_verified),
  the copy gets the same setting
- If original has no settings (old apps), copy defaults to 'public' to match
  the original's effective permission via fallback

This prevents permission mismatches between original and copied apps and
ensures the enterprise service has explicit settings to query.

Related: langgenius/dify-enterprise#423
2026-02-13 22:11:03 -08:00
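
The inheritance rule reduces to a one-line fallback. The function name is hypothetical, and `None` is assumed here to represent an original app with no web_app_settings record.

```python
from typing import Optional


def access_mode_for_copy(original_access_mode: Optional[str]) -> str:
    """Copy the original's explicit mode; old apps without settings default to 'public'."""
    return original_access_mode if original_access_mode is not None else "public"
```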
L1nSn0w
2bbe74be23 fix: make e-1.12.1 enterprise migrations database-agnostic for MySQL/TiDB (#32269)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-12 15:57:38 +08:00
GareArc
76471821d7 Merge branch 'release/e-1.12.1' into deploy/enterprise 2026-02-11 21:43:42 -08:00
GareArc
25c457e2ed Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-02-10 20:12:16 -08:00
GareArc
262b7d4d08 docs(enterprise): add telemetry data dictionary for OTEL signals
- Comprehensive reference for all enterprise telemetry signals
- Documents 3 span types, 10 counters, 6 histograms, 13 log events
- Includes trace correlation model with ASCII diagrams
- Configuration reference for all 8 ENTERPRISE_* variables
- Per-emission-site label tables for metrics
- Full JSON schemas for structured log events
- Content gating behavior and token double-counting warnings
2026-02-10 19:51:14 -08:00
GareArc
efeae4c46f Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-02-10 00:31:34 -08:00
GareArc
b5dbabf5d0 feat(telemetry): add missing ID fields for name attributes
- Add dify.credential.id to node execution events
- Add dify.event.id to all telemetry events (APP_CREATED, APP_UPDATED, APP_DELETED, FEEDBACK_CREATED)

This ensures all .name fields have corresponding .id fields for reliable aggregation and deduplication.
2026-02-10 00:09:41 -08:00
GareArc
d207ca3f1e Merge branch 'deploy/enterprise' of https://github.com/langgenius/dify into deploy/enterprise
2026-02-09 01:57:13 -08:00
GareArc
7cabef2b42 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-02-09 01:47:23 -08:00
GareArc
aa34ec0d25 test(enterprise-telemetry): add unit tests for OTEL bearer auth and insecure flag 2026-02-09 01:44:21 -08:00
GareArc
ffa8aedc48 feat(enterprise-telemetry): wire bearer token auth and configurable insecure flag into OTEL exporter 2026-02-09 01:44:21 -08:00
GareArc
f78b0f1f36 feat(enterprise-telemetry): add ENTERPRISE_OTLP_API_KEY config field 2026-02-09 01:44:21 -08:00
GareArc
f85275e5f9 test(enterprise-telemetry): add unit tests for OTEL bearer auth and insecure flag 2026-02-09 01:35:17 -08:00
GareArc
f1b5863bb5 feat(enterprise-telemetry): wire bearer token auth and configurable insecure flag into OTEL exporter 2026-02-09 01:29:40 -08:00
GareArc
f2d07f3ec5 feat(enterprise-telemetry): add ENTERPRISE_OTLP_API_KEY config field 2026-02-09 01:26:26 -08:00
wangxiaolei
284c5f40f1 refactor: document_indexing_update_task split database session (#32105)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-09 15:57:42 +08:00
wangxiaolei
55de893984 refactor: partition Celery task sessions into smaller, discrete execu… (#32085)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-02-09 15:57:42 +08:00
QuantumGhost
b035b091fa perf: use batch delete method instead of single delete (#32036)
Co-authored-by: fatelei <fatelei@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: FFXN <lizy@dify.ai>
2026-02-09 15:57:42 +08:00
wangxiaolei
9898df5ed5 fix: fix tool type is miss (#32042) 2026-02-09 15:57:42 +08:00
wangxiaolei
075e90a253 fix: fix agent node tool type is not right (#32008)
Infer the real tool type by querying the relevant database tables.

The root cause of the incorrect `type` field is still unclear.
2026-02-09 15:57:42 +08:00
QuantumGhost
9742185e6b perf(api): Optimize the response time of AppListApi endpoint (#31999) 2026-02-09 15:57:42 +08:00
wangxiaolei
51946a734a fix: fix miss use db.session (#31971) 2026-02-09 15:57:42 +08:00
NFish
243c3f7dc0 fix: include app id in automatic generation requests (#32138) 2026-02-09 15:57:42 +08:00
GareArc
1b3a21e6f8 feat(telemetry): unify token metric label structure with Pydantic enforcement
- Add TokenMetricLabels BaseModel to enforce consistent label structure
- All dify.token.* metrics now use identical 6-label structure:
  * tenant_id, app_id, operation_type, model_provider, model_name, node_type
- Pydantic validation ensures runtime enforcement (extra='forbid', frozen=True)
- Enables filtering by operation_type to avoid double-counting:
  * workflow: aggregated workflow-level tokens
  * node_execution: individual node-level tokens
  * message: direct message tokens
  * rule_generate/code_generate: prompt generation tokens

Previously, inconsistent label cardinality made aggregation impossible:
- WORKFLOW: 3 labels
- NODE_EXECUTION: 6 labels
- MESSAGE: 5 labels
- PROMPT_GENERATION: 5 labels

Now all use the same 6-label structure for consistent querying.
2026-02-06 03:10:20 -08:00
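
The fixed six-label structure can be illustrated without Pydantic. The sketch below deliberately swaps the Pydantic BaseModel for a frozen standard-library dataclass to stay dependency-free: unknown keyword arguments raise TypeError (roughly what extra='forbid' gives) and attribute assignment raises FrozenInstanceError (roughly what frozen=True gives). The empty-string defaults are an assumption.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TokenMetricLabelsSketch:
    tenant_id: str
    app_id: str
    operation_type: str       # e.g. "workflow", "node_execution", "message"
    model_provider: str = ""  # assumed default when not applicable
    model_name: str = ""
    node_type: str = ""

    def to_dict(self) -> dict:
        """Always yields the same six keys, whatever the emission site."""
        return asdict(self)
```

With every dify.token.* emission going through one such type, a query can filter on a single operation_type (for example only `workflow`) to avoid counting the same tokens at both workflow and node level.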
GareArc
944eb28486 feat(telemetry): unify token metric label structure with Pydantic enforcement
- Add TokenMetricLabels BaseModel to enforce consistent label structure
- All dify.token.* metrics now use identical 6-label structure:
  * tenant_id, app_id, operation_type, model_provider, model_name, node_type
- Pydantic validation ensures runtime enforcement (extra='forbid', frozen=True)
- Enables filtering by operation_type to avoid double-counting:
  * workflow: aggregated workflow-level tokens
  * node_execution: individual node-level tokens
  * message: direct message tokens
  * rule_generate/code_generate: prompt generation tokens

Previously, inconsistent label cardinality made aggregation impossible:
- WORKFLOW: 3 labels
- NODE_EXECUTION: 6 labels
- MESSAGE: 5 labels
- PROMPT_GENERATION: 5 labels

Now all use the same 6-label structure for consistent querying.
2026-02-06 03:06:06 -08:00
GareArc
4e624af5e0 Merge branch '1.12.1-otel-ee' into deploy/enterprise 2026-02-06 02:41:58 -08:00
GareArc
11c74d741a feat: add dedicated app event counters and convert event names to StrEnum
- Add APP_CREATED, APP_UPDATED, APP_DELETED counters to EnterpriseTelemetryCounter
- Create EnterpriseTelemetryEvent StrEnum for type-safe event names
- Update metric_handler to use new app-specific counters with labels (tenant_id, app_id, mode)
- Convert all event_name strings to EnterpriseTelemetryEvent enum values
- Update exporter to create OTEL meters for new app counters (dify.app.created.total, etc.)
- Update tests to verify new counter behavior and enum usage
2026-02-06 02:38:19 -08:00
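
A sketch of type-safe event names. The class name and the string values are hypothetical; the fallback branch exists only so the example also runs on interpreters older than Python 3.11, where `enum.StrEnum` is unavailable.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # minimal stand-in for older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class EnterpriseTelemetryEventSketch(StrEnum):
    # Hypothetical values; the real event names are not shown in this log.
    APP_CREATED = "app_created"
    APP_UPDATED = "app_updated"
    APP_DELETED = "app_deleted"


def parse_event(name: str) -> EnterpriseTelemetryEventSketch:
    """Raise ValueError on a misspelled event name instead of passing it through."""
    return EnterpriseTelemetryEventSketch(name)
```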
GareArc
ea9081f22d feat(telemetry): add operation_type labels for token metrics
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-06 01:06:07 -08:00
GareArc
4e3112bd7f feat(telemetry): add enterprise OTEL telemetry with gateway, traces, metrics, and logs 2026-02-06 01:02:19 -08:00
GareArc
ac8e96bd9d docs(telemetry): clarify enterprise_telemetry queue is EE-only 2026-02-05 23:10:37 -08:00
GareArc
91a6fe25d1 feat(telemetry): add enterprise OTEL telemetry with gateway, traces, metrics, and logs 2026-02-05 23:10:30 -08:00
GareArc
576eca2113 Merge branch '1.12.1-otel-ee' into deploy/enterprise
2026-02-05 23:07:48 -08:00
GareArc
8ded2d73f0 fix(telemetry): move EE guard to gateway routing level
Prevents CE users from enqueueing EE-only events (all METRIC_LOG cases)
to non-existent enterprise_telemetry Celery queue.

- Add _should_drop_ee_only_event() check in emit() before routing
- Remove redundant check from _emit_trace()
- Single guard at gateway level protects both trace and metric/log paths
2026-02-05 22:58:40 -08:00
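The single-guard idea described above might look roughly like the sketch below. `emit`, `_should_drop_ee_only_event`, and the two signal types are named in the commit messages; the flag variable, the `EE_ONLY_CASES` set, the case names, and the list standing in for the Celery queue are assumptions made for illustration.

```python
from enum import Enum


class SignalType(str, Enum):
    TRACE = "trace"
    METRIC_LOG = "metric_log"


# Hypothetical stand-ins for the real config flag and the Celery enqueue call.
ENTERPRISE_TELEMETRY_ENABLED = False
enqueued: list[tuple[str, SignalType]] = []

# Hypothetical set of trace cases that exist only in the enterprise edition.
EE_ONLY_CASES = {"draft_node_execution"}


def _should_drop_ee_only_event(case: str, signal: SignalType) -> bool:
    """Every METRIC_LOG case is EE-only; some trace cases may be as well."""
    ee_only = signal is SignalType.METRIC_LOG or case in EE_ONLY_CASES
    return ee_only and not ENTERPRISE_TELEMETRY_ENABLED


def emit(case: str, signal: SignalType) -> bool:
    """Check the guard once, before any routing, then enqueue."""
    if _should_drop_ee_only_event(case, signal):
        return False
    enqueued.append((case, signal))
    return True
```

Because the check runs before routing, neither the trace path nor the metric/log path needs its own copy of it.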
GareArc
4a9b74f86b refactor(telemetry): simplify by eliminating TelemetryFacade
**Problem:**
The telemetry system had unnecessary abstraction layers and bad practices
from the last 3 commits introducing the gateway implementation:
- TelemetryFacade class wrapper around emit() function
- String literals instead of SignalType enum
- Dictionary mapping enum → string instead of enum → enum
- Unnecessary ENTERPRISE_TELEMETRY_GATEWAY_ENABLED feature flag
- Duplicate guard checks scattered across files
- Non-thread-safe TelemetryGateway singleton pattern
- Missing guard in ops_trace_task.py causing RuntimeError spam

**Solution:**
1. Deleted TelemetryFacade - replaced with thin emit() function in core/telemetry/__init__.py
2. Added SignalType enum ('trace' | 'metric_log') to enterprise/telemetry/contracts.py
3. Replaced CASE_TO_TRACE_TASK_NAME dict with CASE_TO_TRACE_TASK: dict[TelemetryCase, TraceTaskName]
4. Deleted is_gateway_enabled() and _emit_legacy() - using existing ENTERPRISE_ENABLED + ENTERPRISE_TELEMETRY_ENABLED instead
5. Extracted _should_drop_ee_only_event() helper to eliminate duplicate checks
6. Moved TelemetryGateway singleton to ext_enterprise_telemetry.py:
   - Init once in init_app() for thread-safety
   - Access via get_gateway() function
7. Re-added guard to ops_trace_task.py to prevent RuntimeError when EE=OFF but CE tracing enabled
8. Updated 11 caller files to import 'emit as telemetry_emit' instead of 'TelemetryFacade'

**Result:**
- 322 net lines deleted (533 removed, 211 added)
- All 91 tests pass
- Thread-safe singleton pattern
- Cleaner API surface: from TelemetryFacade.emit() to telemetry_emit()
- Proper enum usage throughout
- No RuntimeError spam in EE=OFF + CE=ON scenario
2026-02-05 22:41:09 -08:00
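Step 6 of the solution (initialize once in init_app(), read through get_gateway()) can be sketched as follows. This is a simplified assumption of the pattern, not the code in ext_enterprise_telemetry.py: the placeholder class body and the argument-free `init_app()` signature are hypothetical.

```python
from typing import Optional


class TelemetryGateway:
    """Placeholder for the real gateway; only object identity matters here."""


_gateway: Optional[TelemetryGateway] = None


def init_app() -> None:
    """Create the gateway once during startup, before request threads exist."""
    global _gateway
    if _gateway is None:
        _gateway = TelemetryGateway()


def get_gateway() -> Optional[TelemetryGateway]:
    """Return the shared instance, or None if init_app() has not run yet."""
    return _gateway
```

Creating the instance eagerly at startup sidesteps the race that a lazy "create on first access" singleton has when two threads reach it at the same time.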
GareArc
4d47339ce6 feat: Add parent trace context propagation for workflow-as-tool hierarchy
Enables distributed tracing for nested workflows across all trace providers
(Langfuse, LangSmith, community providers). When a workflow invokes another
workflow via workflow-as-tool, the child workflow now includes parent context
attributes that allow trace systems to reconstruct the full execution tree.

Changes:
- Add parent_trace_context field to WorkflowTool
- Set parent context in tool node when invoking workflow-as-tool
- Extract and pass parent context through app generator

This is a community enhancement (ungated) that improves distributed tracing
for all users. Parent context includes: trace_id, node_execution_id,
workflow_run_id, and app_id.
2026-02-05 20:19:29 -08:00
GareArc
6e47e163b8 fix(telemetry): use atomic Redis SET NX for idempotency and register Celery queue 2026-02-05 20:15:34 -08:00
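A minimal sketch of idempotency via SET NX, assuming a client whose `set()` accepts `nx` and `ex` keyword arguments as redis-py's does. The `FakeRedis` class, the `claim_once` helper, the key format, and the TTL are all hypothetical; the in-memory fake exists only so the example runs without a server.

```python
import time
from typing import Optional


class FakeRedis:
    """In-memory stand-in that mimics set(name, value, nx=..., ex=...)."""

    def __init__(self) -> None:
        self._data: dict = {}

    def set(self, name: str, value: str, nx: bool = False, ex: Optional[int] = None):
        now = time.monotonic()
        entry = self._data.get(name)
        exists = entry is not None and entry[1] > now
        if nx and exists:
            return None  # redis-py also returns None when NX blocks the write
        expires = now + ex if ex is not None else float("inf")
        self._data[name] = (value, expires)
        return True


def claim_once(client, event_id: str, ttl_seconds: int = 3600) -> bool:
    """Return True only for the first caller to claim this event ID."""
    key = f"telemetry:dedup:{event_id}"  # hypothetical key format
    return bool(client.set(key, "1", nx=True, ex=ttl_seconds))
```

A separate "GET, then SET" lets two workers both see the key as missing and both process the event; a single SET with NX makes the check and the write one atomic step.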
GareArc
1663a7ab4c feat(telemetry): add gateway diagnostics and verify integration
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-05 20:15:13 -08:00
GareArc
51b0c5c89c feat(telemetry): implement gateway routing and enqueue logic
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-05 20:15:13 -08:00
GareArc
752b01ae91 refactor(telemetry): migrate event handlers to gateway-only producers
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-02-05 20:15:12 -08:00
GareArc
3d3e8d75d8 feat(telemetry): add gateway envelope contracts and routing table 2026-02-05 20:15:12 -08:00
GareArc
55c0fe503d fix(telemetry): correct enterprise-only trace filtering logic
The logic was inverted - we were blocking all CE traces and only allowing
enterprise traces. The correct logic should be:
- Allow all CE traces (workflow, message, tool, etc.)
- Only block enterprise-only traces when enterprise telemetry is disabled

Before: if event.name not in _ENTERPRISE_ONLY_TRACES: return
After: if event.name in _ENTERPRISE_ONLY_TRACES and not is_enterprise_telemetry_enabled(): return
2026-02-05 20:15:12 -08:00
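The before/after conditions above, written out as two predicates so the inversion is easy to see. The set member and the function names are hypothetical; the two boolean expressions follow the commit message.

```python
_ENTERPRISE_ONLY_TRACES = {"draft_node_execution"}  # hypothetical member


def should_skip_before(name: str, enterprise_enabled: bool) -> bool:
    """Buggy version: skips every trace that is NOT enterprise-only."""
    return name not in _ENTERPRISE_ONLY_TRACES


def should_skip_after(name: str, enterprise_enabled: bool) -> bool:
    """Fixed version: skips only enterprise-only traces while EE is off."""
    return name in _ENTERPRISE_ONLY_TRACES and not enterprise_enabled
```

With the fix, an ordinary CE trace such as a workflow trace is never skipped, and an enterprise-only trace is skipped only while enterprise telemetry is disabled.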
GareArc
adadf1ec5f refactor(telemetry): migrate to type-safe enum-based event routing with centralized enterprise filtering
Changes:
- Change TelemetryEvent.name from str to TraceTaskName enum for type safety
- Remove hardcoded trace_task_name_map from facade (no mapping needed)
- Add centralized enterprise-only filter in TelemetryFacade.emit()
- Rename is_telemetry_enabled() to is_enterprise_telemetry_enabled()
- Update all 11 call sites to pass TraceTaskName enum values
- Remove redundant enterprise guard from draft_trace.py
- Add unit tests for TelemetryFacade.emit() routing (6 tests)
- Add unit tests for TraceQueueManager telemetry guard (5 tests)
- Fix test fixture scoping issue for full test suite compatibility
- Fix tenant_id handling in agent tool callback handler

Benefits:
- 100% type-safe: basedpyright catches errors at compile time
- No string literals: eliminates entire class of typo bugs
- Single point of control: centralized filtering in facade
- All guards removed except facade
- Zero regressions: 4887 tests passing

Verification:
- make lint: PASS
- make type-check: PASS (0 errors, 0 warnings)
- pytest: 4887 passed, 8 skipped
2026-02-05 20:15:12 -08:00
GareArc
ed222945aa refactor(telemetry): introduce TelemetryFacade to centralize event emission
Migrate from direct TraceQueueManager.add_trace_task calls to TelemetryFacade.emit
with TelemetryEvent abstraction. This reduces CE code invasion by consolidating
telemetry logic in core/telemetry/ with a single guard in ops_trace_manager.py.
2026-02-05 20:15:11 -08:00
GareArc
2d60be311d fix: extract model_provider from model_config in prompt generation trace
The model_provider field in prompt generation traces was being incorrectly
extracted by parsing the model name (e.g., 'deepseek-chat'), which resulted
in an empty string when the model name didn't contain a '/' character.

Now extracts the provider directly from the model_config parameter, with
a fallback to the old parsing logic for backward compatibility.

Changes:
- Update _emit_prompt_generation_trace to accept model_config parameter
- Extract provider from model_config.get('provider') when available
- Update all 6 call sites to pass model_config
- Maintain backward compatibility with fallback logic
2026-02-05 20:15:11 -08:00
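The provider lookup with its legacy fallback could be sketched as below. `model_config.get('provider')` and the '/' parsing come from the commit message; the helper name and its exact signature are assumptions.

```python
from typing import Any, Optional


def extract_model_provider(model_config: Optional[dict[str, Any]], model_name: str) -> str:
    """Prefer the explicit provider; otherwise parse a 'provider/model' name."""
    provider = (model_config or {}).get("provider")
    if provider:
        return str(provider)
    # Legacy fallback: a name without '/' yields an empty string, which is
    # the behaviour the commit describes for names like 'deepseek-chat'.
    return model_name.split("/", 1)[0] if "/" in model_name else ""
```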
GareArc
80ee2e982e fix(telemetry): prevent UUID validation error for tenant-prefixed storage IDs
- get_ops_trace_instance was trying to query App table with storage_id format "tenant-{uuid}"
- This caused psycopg2.errors.InvalidTextRepresentation when app_id is None
- Added early return for tenant-prefixed storage identifiers to skip App lookup
- Enterprise telemetry still works correctly with these storage IDs
2026-02-05 20:15:11 -08:00
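The early return can be pictured as a small parsing step placed in front of the App lookup. The helper name and the constant are hypothetical; the "tenant-{uuid}" format is the one the commit message gives.

```python
import uuid
from typing import Optional

TENANT_PREFIX = "tenant-"


def app_lookup_id(storage_id: Optional[str]) -> Optional[uuid.UUID]:
    """Return a UUID suitable for an App query, or None to skip the query."""
    if not storage_id or storage_id.startswith(TENANT_PREFIX):
        # Tenant-prefixed identifiers are not app UUIDs; passing them to a
        # UUID column is what triggered InvalidTextRepresentation.
        return None
    return uuid.UUID(storage_id)
```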
GareArc
5bbc938a0d fix(telemetry): add prompt generation trace emission for no_variable=false path
- The no_variable=false code path in generate_rule_config was missing trace emission
- Added timing wrapper and _emit_prompt_generation_trace call to ensure metrics/logs are captured
- Trace now emitted on both success and failure cases for consistency with no_variable=true path
2026-02-05 20:15:10 -08:00
GareArc
052f50805f feat(telemetry): add node_execution_id and app_id support to trace metadata
- Forward kwargs to message_trace to preserve node_execution_id
- Add node_execution_id extraction to all trace methods
- Add app_id parameter to prompt generation API endpoints
- Enable app_id tracing for rule_generate, code_generate, and structured_output operations
2026-02-05 20:15:10 -08:00
GareArc
f5043a8ac8 fix(telemetry): enable metrics and logs for standalone prompt generation
Remove app_id parameter from three endpoints and update trace manager to use
tenant_id as storage identifier when app_id is unavailable. This allows
standalone prompt generation utilities to emit telemetry.

Changes:
- controllers/console/app/generator.py: Remove app_id=None from 3 endpoints
  (RuleGenerateApi, RuleCodeGenerateApi, RuleStructuredOutputGenerateApi)
- core/ops/ops_trace_manager.py: Use tenant_id fallback in send_to_celery
  - Extract tenant_id from task.kwargs when app_id is None
  - Use 'tenant-{tenant_id}' format as storage identifier
  - Skip traces only if neither app_id nor tenant_id available

The trace metadata still contains the actual tenant_id, so enterprise
telemetry correctly emits metrics and logs grouped by tenant.
2026-02-05 20:15:10 -08:00
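The fallback order described for send_to_celery, reduced to a standalone function. The function name and signature are hypothetical; the 'tenant-{tenant_id}' format and the "skip only if neither is available" rule are from the commit message.

```python
from typing import Optional


def storage_identifier(app_id: Optional[str], tenant_id: Optional[str]) -> Optional[str]:
    """Pick the identifier under which a trace is stored."""
    if app_id:
        return app_id
    if tenant_id:
        return f"tenant-{tenant_id}"
    return None  # neither is available, so the caller skips the trace
```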
GareArc
a4bebbb5b5 fix(telemetry): remove app_id parameter from standalone prompt generation endpoints
Remove app_id=None from three prompt generation endpoints that lack proper
app context. These standalone utilities only have tenant_id available, so
we don't pass app_id at all rather than passing incomplete information.

Affected endpoints:
- /rule-generate (RuleGenerateApi)
- /code-generate (RuleCodeGenerateApi)
- /structured-output-generate (RuleStructuredOutputGenerateApi)
2026-02-05 20:15:10 -08:00
GareArc
22c8d8d772 feat(telemetry): add prompt generation telemetry to Enterprise OTEL
- Add PromptGenerationTraceInfo trace entity with operation_type field
- Implement telemetry for rule-generate, code-generate, structured-output, instruction-modify operations
- Emit metrics: tokens (total/input/output), duration histogram, requests counter, errors counter
- Emit structured logs with model info and operation context
- Content redaction controlled by ENTERPRISE_INCLUDE_CONTENT env var
- Fix user_id propagation in TraceTask kwargs
- Fix latency calculation when llm_result is None

No spans exported - metrics and logs only for lightweight observability.
2026-02-05 20:14:49 -08:00
GareArc
e67afa7a5b feat(telemetry): add input/output token metrics and fix trace cleanup
- Add dify.tokens.input and dify.tokens.output OTEL metrics
- Remove token split from trace log attributes (keep metrics only)
- Emit split token metrics for workflows and node executions
- Gracefully handle trace file deletion failures to prevent task crashes

BREAKING: None
MIGRATION: None
2026-02-05 20:12:30 -08:00
GareArc
8ceb1ed96f feat(telemetry): add input/output token split to enterprise OTEL traces
- Add PROMPT_TOKENS and COMPLETION_TOKENS to WorkflowNodeExecutionMetadataKey
- Store prompt/completion tokens in node execution metadata JSON (no schema change)
- Calculate workflow-level token split by summing node executions on-the-fly
- Export gen_ai.usage.input_tokens and output_tokens to enterprise telemetry
- Add semantic convention constants for token attributes
- Maintain backward compatibility (historical data shows null)

BREAKING: None
MIGRATION: None (uses JSON metadata, no schema changes)
2026-02-05 20:12:30 -08:00
GareArc
701f02f853 feat(telemetry): add invoked_by user tracking to enterprise OTEL 2026-02-05 20:12:29 -08:00
GareArc
639fb304ca fix(enterprise): Remove OTEL log export 2026-02-05 20:12:29 -08:00
GareArc
df44e79599 feat(enterprise): Add independent metrics export with dedicated MeterProvider
- Create dedicated MeterProvider instance (independent from ext_otel.py)
- Add create_metric_exporter() to _ExporterFactory with HTTP/gRPC support
- Enterprise metrics now work without requiring standard OTEL to be enabled
- Add MeterProvider shutdown to cleanup lifecycle
- Update module docstring to reflect full independence (Tracer, Logger, Meter)
2026-02-05 20:12:29 -08:00
GareArc
0497fd7469 fix(enterprise): Scope log handler to telemetry logger only
Only export structured telemetry logs, not all application logs. The attach_log_handler method now attaches to the 'dify.telemetry' logger instead of the root logger.
2026-02-05 20:12:29 -08:00
GareArc
bb3fcbfd5c feat(enterprise): Add gRPC protocol support for OTLP telemetry
- Add ENTERPRISE_OTLP_PROTOCOL config (http/grpc, default: http)
- Introduce _ExporterFactory class for protocol-agnostic exporter creation
- Support both HTTP and gRPC OTLP endpoints for traces and logs
- Refactor endpoint path handling into factory methods
2026-02-05 20:12:28 -08:00
GareArc
4d7ab24eb1 feat(enterprise): Add OTEL logs export with span_id correlation
- Add ENTERPRISE_OTEL_LOGS_ENABLED and ENTERPRISE_OTLP_LOGS_ENDPOINT config options
- Implement EnterpriseLoggingHandler for log record translation with trace/span ID parsing
- Add LoggerProvider and BatchLogRecordProcessor for OTLP log export
- Correlate telemetry logs with spans via span_id_source parameter
- Attach log handler during enterprise telemetry initialization
2026-02-05 20:12:28 -08:00
GareArc
3461c3a8ef feat(enterprise): Add OTEL telemetry with slim traces, metrics, and structured logs
- Add EnterpriseOtelTrace handler with span emission for workflows and nodes
- Implement minimal-span strategy: slim spans + detailed companion logs
- Add deterministic span/trace IDs for cross-workflow trace correlation
- Add metric collection at 100% accuracy (counters & histograms)
- Add event handlers for app lifecycle and feedback telemetry
- Add cross-workflow trace linking with parent context propagation
- Add OTEL exporter with configurable sampling and privacy controls
- Wire enterprise telemetry into workflow execution pipeline
- Add telemetry configuration in enterprise configs
2026-02-05 20:12:28 -08:00
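The commit does not say how the deterministic IDs are derived, so the following is only one plausible scheme, given as an assumption: reuse a run UUID as the 128-bit trace ID and hash a stable key down to a 64-bit span ID. Both function names are hypothetical.

```python
import hashlib
import uuid


def trace_id_from_run(workflow_run_id: str) -> int:
    """A UUID is 128 bits, the same width as an OpenTelemetry trace ID."""
    return uuid.UUID(workflow_run_id).int


def span_id_from_key(key: str) -> int:
    """Use the first 8 bytes of a SHA-256 digest as a 64-bit span ID."""
    span_id = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return span_id or 1  # an all-zero span ID is invalid in OpenTelemetry
```

Because both functions are pure, any process that knows the run ID can compute the same trace ID, which is what lets a child workflow attach to its parent's trace without passing live span objects around.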


@@ -43,22 +43,9 @@ def create_flask_app_with_configs() -> DifyApp:
if is_console_api or is_webapp_api:
if is_console_api:
# Console bootstrap APIs exempt from license check:
# - system-features: license status for expiry UI (GlobalPublicStoreProvider)
# - setup: install/setup status check (AppInitializer)
# - features: billing/plan features (ProviderContextProvider)
# - account/profile: login check + user profile (AppContextProvider, useIsLogin)
# - workspaces/current: workspace + model providers (AppContextProvider)
# - version: version check (AppContextProvider)
# - activate/check: invitation link validation (signin page)
# Without these exemptions, the signin page triggers location.reload()
# on unauthorized_and_force_logout, causing an infinite loop.
console_exempt_prefixes = (
"/console/api/system-features",
"/console/api/setup",
"/console/api/features",
"/console/api/account/profile",
"/console/api/workspaces/current",
"/console/api/version",
"/console/api/activate/check",
)
@@ -71,10 +58,6 @@ def create_flask_app_with_configs() -> DifyApp:
# Check license status with caching (10 min TTL)
license_status = EnterpriseService.get_cached_license_status()
if license_status in ["inactive", "expired", "lost"]:
# Cookie clearing is handled by register_external_error_handlers
# in libs/external_api.py which detects the error code and calls
# build_force_logout_cookie_headers(). Frontend then checks
# code === 'unauthorized_and_force_logout' and calls location.reload().
raise UnauthorizedAndForceLogout(
f"Enterprise license is {license_status}. "
"Please contact your administrator."
@@ -82,9 +65,6 @@ def create_flask_app_with_configs() -> DifyApp:
except UnauthorizedAndForceLogout:
raise
except Exception:
# If license check fails, log but don't block the request.
# This prevents service disruption if enterprise API is temporarily
# unavailable.
logger.exception("Failed to check enterprise license status")
# add after request hook for injecting trace headers from OpenTelemetry span context