Compare commits

...

33 Commits
0.4.6 ... 0.4.8

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| takatost | 8654415f33 | bump version to 0.4.8 (#2074) | 2024-01-17 22:51:02 +08:00 |
| takatost | 1a6ad05a23 | feat: service api add llm usage (#2051) | 2024-01-17 22:39:47 +08:00 |
| takatost | 1d91535ba6 | fix: azure customize model name duplicate (#2073) | 2024-01-17 21:17:59 +08:00 |
| takatost | 8799c888e3 | fix: free quota type apply button missing (#2069) (Co-authored-by: StyleZhang <jasonapring2015@outlook.com>) | 2024-01-17 15:02:27 +08:00 |
| crazywoola | d7209d9057 | feat: add abab5.5s-chat (#2063) | 2024-01-16 19:45:21 +08:00 |
| Chenhe Gu | 5960103cb8 | Fix aspect ratio of buttons in readme (#2062) | 2024-01-16 19:25:57 +08:00 |
| Bowen Liang | 2ffea39a5c | fix: add sharp package to fix sharp-missing-in-production warning (#2060) | 2024-01-16 17:25:18 +08:00 |
| crazywoola | 1e76b1bf2d | Update README.md (#2058) | 2024-01-16 17:24:49 +08:00 |
| Chenhe Gu | 2022ca1d52 | fix indentation & move contact into community section (#2055) | 2024-01-16 16:59:38 +08:00 |
| Chenhe Gu | e1319d1a2d | Update contribution guide (#2053) | 2024-01-16 15:12:35 +08:00 |
| Jyong | a61df6cb03 | timeout parameter error (#2052) (Co-authored-by: jyong <jyong@dify.ai>) | 2024-01-16 14:44:47 +08:00 |
| Jyong | 790b885d0a | fix multi-dataset retrieve score limit (#2050) (Co-authored-by: jyong <jyong@dify.ai>) | 2024-01-16 14:14:34 +08:00 |
| Ricky | 1a2eacc5a6 | Add jina-embeddings-v2-base-zh model configuration (#2049) | 2024-01-16 12:25:42 +08:00 |
| zxhlyh | f7a2f7a727 | fix: dataset sidebar (#2048) | 2024-01-16 12:14:09 +08:00 |
| takatost | a4adca595a | fix qdrant tag in docker-compose.yaml (#2047) | 2024-01-16 10:24:06 +08:00 |
| takatost | c51e179db8 | bump version to 0.4.7 (#2045) | 2024-01-16 01:13:10 +08:00 |
| takatost | b582fc13c3 | fix: qwen top_p min/max wrong (#2044) | 2024-01-16 01:12:55 +08:00 |
| Jyong | add33cb5e6 | fix SQL slow query (#2043) (Co-authored-by: jyong <jyong@dify.ai>) | 2024-01-16 00:59:28 +08:00 |
| Garfield Dai | 83105d0d8f | fix: dataset and moderation. (#2042) | 2024-01-15 21:53:31 +08:00 |
| Joel | 7b0818b8e5 | feat: fix debug rerank params error (#2041) | 2024-01-15 20:27:22 +08:00 |
| takatost | 28cd3a8c9f | fix: dependencies security problems (#2040) | 2024-01-15 19:26:08 +08:00 |
| crazywoola | 0355645a0e | doc: replace readme images (#2039) | 2024-01-15 17:38:22 +08:00 |
| Jyong | cb7a608d75 | ascii filter Unicode U+FFFE (#2038) (Co-authored-by: jyong <jyong@dify.ai>) | 2024-01-15 16:52:18 +08:00 |
| crazywoola | bdb0d77227 | doc: replace readme images (#2030) | 2024-01-15 12:23:30 +08:00 |
| Yeuoly | 149102927b | fix: openai tool tokens (#2026) | 2024-01-14 15:51:05 +08:00 |
| Vikey Chen | d8c0d722d2 | fix: datasets indexing-status api document (#2019) | 2024-01-14 09:43:52 +08:00 |
| Garfield Dai | cb7be3767c | feat: huggingface llm add new params. (#2014) | 2024-01-12 21:15:07 +08:00 |
| takatost | 34bf2877c8 | fix: tongyi stream generate not incremental and add qwen max models (#2013) | 2024-01-12 19:19:12 +08:00 |
| killpanda | 3ebec8fa41 | fixup /stop api (#2012) (Co-authored-by: mayue <mayue05@qiyi.com>) | 2024-01-12 19:10:42 +08:00 |
| Mark Sun | f877d19c6a | Update CONTRIBUTING.md (#2010) | 2024-01-12 19:01:29 +08:00 |
| Jyong | a63a9c7d45 | text spliter length method use default embedding model tokenizer (#2011) (Co-authored-by: jyong <jyong@dify.ai>) | 2024-01-12 18:45:34 +08:00 |
| takatost | 1779cea6e3 | fix: model provider credentials null value validate failed (#2009) | 2024-01-12 16:48:38 +08:00 |
| Ricky | 26eff330f9 | fix: chat log wont show up (#2007) | 2024-01-12 14:36:56 +08:00 |
71 changed files with 1211 additions and 501 deletions

View File

@@ -14,22 +14,35 @@ body:
required: true
- type: textarea
attributes:
label: Description of the new feature / enhancement
placeholder: What is the expected behavior of the proposed feature?
label: 1. Is this request related to a challenge you're experiencing?
placeholder: Please describe the specific scenario or problem you're facing as clearly as possible. For instance "I was trying to use [feature] for [specific task], and [what happened]... It was frustrating because...."
validations:
required: true
- type: textarea
attributes:
label: Scenario when this would be used?
placeholder: What is the scenario this would be used? Why is this important to your workflow as a dify user?
label: 2. Describe the feature you'd like to see
placeholder: Think about what you want to achieve and how this feature will help you. Sketches, flow diagrams, or any visual representation will be a major plus.
validations:
required: true
- type: textarea
attributes:
label: Supporting information
placeholder: "Having additional evidence, data, tweets, blog posts, research, ... anything is extremely helpful. This information provides context to the scenario that may otherwise be lost."
label: 3. How will this feature improve your workflow or experience?
placeholder: Tell us how this change will benefit your work. This helps us prioritize based on user impact.
validations:
required: true
- type: textarea
attributes:
label: 4. Additional context or comments
placeholder: (Any other information, comments, documentations, links, or screenshots that would provide more clarity. This is the place to add anything else not covered above.)
validations:
required: false
- type: checkboxes
attributes:
label: 5. Can you help us with this feature?
description: Let us know! This is not a commitment, but a starting point for collaboration.
options:
- label: I am interested in contributing to this feature.
required: false
- type: markdown
attributes:
value: Please limit one request per issue.

View File

@@ -4,10 +4,6 @@ on:
pull_request:
branches:
- main
push:
branches:
- deploy/dev
- feat/model-runtime
jobs:
test:

View File

@@ -4,9 +4,6 @@ on:
pull_request:
branches:
- main
push:
branches:
- deploy/dev
concurrency:
group: dep-${{ github.head_ref || github.run_id }}

View File

@@ -1,66 +1,156 @@
# Contributing
So you're looking to contribute to Dify - that's awesome, we can't wait to see what you do. As a startup with limited headcount and funding, we have grand ambitions to design the most intuitive workflow for building and managing LLM applications. Any help from the community counts, truly.
Thanks for your interest in [Dify](https://dify.ai) and for wanting to contribute! Before you begin, read the
[code of conduct](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md) and check out the
[existing issues](https://github.com/langgenius/langgenius-gateway/issues).
This document describes how to set up your development environment to build and test [Dify](https://dify.ai).
We need to be nimble and ship fast given where we are, but we also want to make sure that contributors like you get as smooth an experience at contributing as possible. We've assembled this contribution guide for that purpose, aiming at getting you familiarized with the codebase & how we work with contributors, so you could quickly jump to the fun part.
### Install dependencies
This guide, like Dify itself, is a constant work in progress. We highly appreciate your understanding if at times it lags behind the actual project, and welcome any feedback for us to improve.
You need to install and configure the following dependencies on your machine to build [Dify](https://dify.ai):
In terms of licensing, please take a minute to read our short [License and Contributor Agreement](./license). The community also adheres to the [code of conduct](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md).
## Before you jump in
[Find](https://github.com/langgenius/dify/issues?q=is:issue+is:closed) an existing issue, or [open](https://github.com/langgenius/dify/issues/new/choose) a new one. We categorize issues into 2 types:
### Feature requests:
* If you're opening a new feature request, we'd like you to explain what the proposed feature achieves, and include as much context as possible.
* If you want to pick one up from the existing issues, simply drop a comment below it saying so.
A team member working in the related direction will be looped in. If all looks good, they will give the go-ahead for you to start coding. We ask that you hold off working on the feature until then, so none of your work goes to waste should we propose changes.
Depending on whichever area the proposed feature falls under, you might talk to different team members. Here's a rundown of the areas each of our team members is working on at the moment:
| Member | Scope |
| ------------------------------------------------------------ | ---------------------------------------------------- |
| [@yeuoly](https://github.com/Yeuoly) | Architecting Agents |
| [@jyong](https://github.com/JohnJyong) | RAG pipeline design |
| [@GarfieldDai](https://github.com/GarfieldDai) | Building workflow orchestrations |
| [@iamjoel](https://github.com/iamjoel) & [@zxhlyh](https://github.com/zxhlyh) | Making our frontend a breeze to use |
| [@guchenhe](https://github.com/guchenhe) & [@crazywoola](https://github.com/crazywoola) | Developer experience, points of contact for anything |
| [@takatost](https://github.com/takatost) | Overall product direction and architecture |
How we prioritize:
| Feature Type | Priority |
| ------------------------------------------------------------ | --------------- |
| High-Priority Features as being labeled by a team member | High Priority |
| Popular feature requests from our [community feedback board](https://feedback.dify.ai/) | Medium Priority |
| Non-core features and minor enhancements | Low Priority |
| Valuable but not immediate | Future-Feature |
### Anything else (e.g. bug report, performance optimization, typo correction):
* Start coding right away.
How we prioritize:
| Issue Type | Priority |
| ------------------------------------------------------------ | --------------- |
| Bugs in core functions (cannot login, applications not working, security loopholes) | Critical |
| Non-critical bugs, performance boosts | Medium Priority |
| Minor fixes (typos, confusing but working UI) | Low Priority |
## Installing
Here are the steps to set up Dify for development:
### 1. Fork this repository
### 2. Clone the repo
Clone the forked repository from your terminal:
```
git clone git@github.com:<github_username>/dify.git
```
### 3. Verify dependencies
Dify requires the following dependencies to build, make sure they're installed on your system:
- [Git](http://git-scm.com/)
- [Docker](https://www.docker.com/)
- [Docker Compose](https://docs.docker.com/compose/install/)
- [Node.js v18.x (LTS)](http://nodejs.org)
- [npm](https://www.npmjs.com/) version 8.x.x or [Yarn](https://yarnpkg.com/)
- [Python](https://www.python.org/) version 3.10.x
## Local development
### 4. Installations
To set up a working development environment, just fork the project git repository and install the backend and frontend dependencies using the proper package manager and create run the docker-compose stack.
Dify is composed of a backend and a frontend. Navigate to the backend directory by `cd api/`, then follow the [Backend README](api/README.md) to install it. In a separate terminal, navigate to the frontend directory by `cd web/`, then follow the [Frontend README](web/README.md) to install.
### Fork the repository
Check the [installation FAQ](https://docs.dify.ai/getting-started/faq/install-faq) for a list of common issues and steps to troubleshoot.
you need to fork the [repository](https://github.com/langgenius/dify).
### 5. Visit dify in your browser
### Clone the repo
To validate your set up, head over to [http://localhost:3000](http://localhost:3000) (the default, or your self-configured URL and port) in your browser. You should now see Dify up and running.
Clone your GitHub forked repository:
## Developing
If you are adding a model provider, [this guide](https://github.com/langgenius/dify/blob/main/api/core/model_runtime/README.md) is for you.
To help you quickly navigate where your contribution fits, a brief, annotated outline of Dify's backend & frontend is as follows:
### Backend
Dify's backend is written in Python using [Flask](https://flask.palletsprojects.com/en/3.0.x/). It uses [SQLAlchemy](https://www.sqlalchemy.org/) for ORM and [Celery](https://docs.celeryq.dev/en/stable/getting-started/introduction.html) for task queueing. Authorization logic goes via Flask-login.
```
git clone git@github.com:<github_username>/dify.git
[api/]
├── constants // Constant settings used throughout code base.
├── controllers // API route definitions and request handling logic.
├── core // Core application orchestration, model integrations, and tools.
├── docker // Docker & containerization related configurations.
├── events // Event handling and processing
├── extensions // Extensions with 3rd party frameworks/platforms.
├── fields // field definitions for serialization/marshalling.
├── libs // Reusable libraries and helpers.
├── migrations // Scripts for database migration.
├── models // Database models & schema definitions.
├── services // Specifies business logic.
├── storage // Private key storage.
├── tasks // Handling of async tasks and background jobs.
└── tests
```
### Install backend
### Frontend
To learn how to install the backend application, please refer to the [Backend README](api/README.md).
The website is bootstrapped on [Next.js](https://nextjs.org/) boilerplate in Typescript and uses [Tailwind CSS](https://tailwindcss.com/) for styling. [React-i18next](https://react.i18next.com/) is used for internationalization.
### Install frontend
```
[web/]
├── app // layouts, pages, and components
│ ├── (commonLayout) // common layout used throughout the app
│ ├── (shareLayout) // layouts specifically shared across token-specific sessions
│ ├── activate // activate page
│ ├── components // shared by pages and layouts
│ ├── install // install page
│ ├── signin // signin page
│ └── styles // globally shared styles
├── assets // Static assets
├── bin // scripts ran at build step
├── config // adjustable settings and options
├── context // shared contexts used by different portions of the app
├── dictionaries // Language-specific translate files
├── docker // container configurations
├── hooks // Reusable hooks
├── i18n // Internationalization configuration
├── models // describes data models & shapes of API responses
├── public // meta assets like favicon
├── service // specifies shapes of API actions
├── test
├── types // descriptions of function params and return values
└── utils // Shared utility functions
```
To learn how to install the frontend application, please refer to the [Frontend README](web/README.md).
## Submitting your PR
### Visit dify in your browser
At last, time to open a pull request (PR) to our repo. For major features, we first merge them into the `deploy/dev` branch for testing, before they go into the `main` branch. If you run into issues like merge conflicts or don't know how to open a pull request, check out [GitHub's pull request tutorial](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests).
Finally, you can now visit [http://localhost:3000](http://localhost:3000) to view the [Dify](https://dify.ai) in local environment.
And that's it! Once your PR is merged, you will be featured as a contributor in our [README](https://github.com/langgenius/dify/blob/main/README.md).
## Getting Help
## Create a pull request
After making your changes, open a pull request (PR). Once you submit your pull request, others from the Dify team/community will review it with you.
Did you have an issue, like a merge conflict, or don't know how to open a pull request? Check out [GitHub's pull request tutorial](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests) on how to resolve merge conflicts and other issues. Once your PR has been merged, you will be proudly listed as a contributor in the [contributor chart](https://github.com/langgenius/langgenius-gateway/graphs/contributors).
## Community channels
Stuck somewhere? Have any questions? Join the [Discord Community Server](https://discord.gg/j3XRWSPBf7). We are here to help!
### Provider Integrations
If you see a model provider not yet supported by Dify that you'd like to use, follow these [steps](api/core/model_runtime/README.md) to submit a PR.
### i18n (Internationalization) Support
We are looking for contributors to help with translations in other languages. If you are interested in helping, please join the [Discord Community Server](https://discord.gg/AhzKf7dNgk) and let us know.
Also check out the [Frontend i18n README](web/i18n/README_EN.md) for more information.
If you ever get stuck or got a burning question while contributing, simply shoot your queries our way via the related GitHub issue, or hop onto our [Discord](https://discord.gg/AhzKf7dNgk) for a quick chat.
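The backend stack described in the guide above (Flask for routing, SQLAlchemy for ORM, with controllers, models, and services kept in separate packages) can be pictured with a minimal sketch. This is illustrative only, not Dify code; the model, service, and route names are invented for the example:

```python
# Illustrative only, not Dify code: a minimal sketch of the stack named above
# (Flask for routing, SQLAlchemy for ORM), mirroring the controllers/models/
# services split of api/.
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()


class App(db.Model):  # models/: database schema definition
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255), nullable=False)


def list_apps() -> list[dict]:  # services/: business logic
    return [{"id": a.id, "name": a.name} for a in App.query.all()]


def create_app() -> Flask:
    app = Flask(__name__)
    app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///example.db"
    db.init_app(app)

    with app.app_context():
        db.create_all()

    @app.get("/apps")  # controllers/: route definition and request handling
    def apps():
        return jsonify(list_apps())

    return app
```

The real backend wires the same layers through api/controllers, api/models, and api/services, with async work handled under api/tasks via Celery.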

View File

@@ -26,23 +26,25 @@
![](./images/demo.png)
## Use Cloud Services
[Dify.AI Cloud](https://dify.ai) provides all the capabilities of the open-source version, and includes 200 free requests to OpenAI GPT-3.5.
## Why Dify
## Using our Cloud Services
Dify is model-agnostic and boasts a comprehensive tech stack compared to hardcoded development libraries like LangChain. Unlike OpenAI's Assistants API, Dify allows for full local deployment of services.
You can try out [Dify.AI Cloud](https://dify.ai) now. It provides all the capabilities of the self-deployed version, and includes 200 free requests to OpenAI GPT-3.5.
## Dify vs. LangChain vs. Assistants API
| Feature | Dify.AI | Assistants API | LangChain |
|---------|---------|----------------|-----------|
| **Programming Approach** | API-oriented | API-oriented | Python Code-oriented |
| **Ecosystem Strategy** | Open Source | Closed and Commercial | Open Source |
| **Ecosystem Strategy** | Open Source | Close Source | Open Source |
| **RAG Engine** | Supported | Supported | Not Supported |
| **Prompt IDE** | Included | Included | None |
| **Supported LLMs** | Rich Variety | Only GPT | Rich Variety |
| **Supported LLMs** | Rich Variety | OpenAI-only | Rich Variety |
| **Local Deployment** | Supported | Not Supported | Not Applicable |
## Features
![](./images/models.png)
@@ -59,7 +61,7 @@ Dify is model-agnostic and boasts a comprehensive tech stack compared to hardcod
## Before You Start
**Star us, and you'll get instant notifications for all new releases on GitHub!**
**Star us on GitHub, and be instantly notified for new releases!**
![star-us](https://github.com/langgenius/dify/assets/100913391/95f37259-7370-4456-a9f0-0bc01ef8642f)
@@ -103,17 +105,39 @@ If you need to customize the configuration, please refer to the comments in our
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## Contributing
For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
At the same time, please consider supporting Dify by sharing it on social media and at events and conferences.
### Contributors
<a href="https://github.com/langgenius/dify/graphs/contributors">
<img src="https://contrib.rocks/image?repo=langgenius/dify" />
</a>
### Translations
We are looking for contributors to help with translating Dify to languages other than Mandarin or English. If you are interested in helping, please see the [i18n README](https://github.com/langgenius/dify/blob/main/web/i18n/README_EN.md) for more information, and leave us a comment in the `global-users` channel of our [Discord Community Server](https://discord.gg/AhzKf7dNgk).
## Community & Support
We welcome you to contribute to Dify to help make Dify better in various ways, submitting code, issues, new ideas, or sharing the interesting and useful AI applications you have created based on Dify. At the same time, we also welcome you to share Dify at different events, conferences, and social media.
* [Canny](https://feedback.dify.ai/). Best for: sharing feedback and checking out our feature roadmap.
* [GitHub Issues](https://github.com/langgenius/dify/issues). Best for: bugs you encounter using Dify.AI, and feature proposals. See our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
* [Email Support](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify). Best for: questions you have about using Dify.AI.
* [Discord](https://discord.gg/FngNHpbcY7). Best for: sharing your applications and hanging out with the community.
* [Twitter](https://twitter.com/dify_ai). Best for: sharing your applications and hanging out with the community.
* [Business Contact](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry). Best for: business inquiries of licensing Dify.AI for commercial use.
- [Roadmap and Feedback](https://feedback.dify.ai/). Best for: sharing feedback and checking out our feature roadmap.
- [GitHub Issues](https://github.com/langgenius/dify/issues). Best for: bugs and errors you encounter using Dify.AI, see the [Contribution Guide](CONTRIBUTING.md).
- [Email Support](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify). Best for: questions you have about using Dify.AI.
- [Discord](https://discord.gg/FngNHpbcY7). Best for: sharing your applications and hanging out with the community.
- [Twitter](https://twitter.com/dify_ai). Best for: sharing your applications and hanging out with the community.
- [Business License](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry). Best for: business inquiries of licensing Dify.AI for commercial use.
### Direct Meetings
**Help us make Dify better. Reach out directly to us**.
| Point of Contact | Purpose |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| <a href='https://cal.com/guchenhe/15min' target='_blank'><img src='https://i.postimg.cc/fWBqSmjP/Git-Hub-README-Button-3x.png' border='0' alt='Git-Hub-README-Button-3x' height="60" width="214"/></a> | Product design feedback, user experience discussions, feature planning and roadmaps. |
| <a href='https://cal.com/pinkbanana' target='_blank'><img src='https://i.postimg.cc/LsRTh87D/Git-Hub-README-Button-2x.png' border='0' alt='Git-Hub-README-Button-2x' height="60" width="225"/></a> | Technical support, issues, or feature requests |
## Security Disclosure

View File

@@ -102,10 +102,10 @@ NOTION_CLIENT_ID=you-client-id
NOTION_INTERNAL_SECRET=you-internal-secret
# Hosted Model Credentials
HOSTED_OPENAI_ENABLED=false
HOSTED_OPENAI_API_KEY=
HOSTED_OPENAI_API_BASE=
HOSTED_OPENAI_API_ORGANIZATION=
HOSTED_OPENAI_TRIAL_ENABLED=false
HOSTED_OPENAI_QUOTA_LIMIT=200
HOSTED_OPENAI_PAID_ENABLED=false
@@ -114,9 +114,9 @@ HOSTED_AZURE_OPENAI_API_KEY=
HOSTED_AZURE_OPENAI_API_BASE=
HOSTED_AZURE_OPENAI_QUOTA_LIMIT=200
HOSTED_ANTHROPIC_ENABLED=false
HOSTED_ANTHROPIC_API_BASE=
HOSTED_ANTHROPIC_API_KEY=
HOSTED_ANTHROPIC_TRIAL_ENABLED=false
HOSTED_ANTHROPIC_QUOTA_LIMIT=600000
HOSTED_ANTHROPIC_PAID_ENABLED=false

View File

@@ -39,13 +39,19 @@ DEFAULTS = {
'CELERY_BACKEND': 'database',
'LOG_LEVEL': 'INFO',
'HOSTED_OPENAI_QUOTA_LIMIT': 200,
'HOSTED_OPENAI_ENABLED': 'False',
'HOSTED_OPENAI_TRIAL_ENABLED': 'False',
'HOSTED_OPENAI_PAID_ENABLED': 'False',
'HOSTED_OPENAI_PAID_INCREASE_QUOTA': 1,
'HOSTED_OPENAI_PAID_MIN_QUANTITY': 1,
'HOSTED_OPENAI_PAID_MAX_QUANTITY': 1,
'HOSTED_AZURE_OPENAI_ENABLED': 'False',
'HOSTED_AZURE_OPENAI_QUOTA_LIMIT': 200,
'HOSTED_ANTHROPIC_QUOTA_LIMIT': 600000,
'HOSTED_ANTHROPIC_ENABLED': 'False',
'HOSTED_ANTHROPIC_TRIAL_ENABLED': 'False',
'HOSTED_ANTHROPIC_PAID_ENABLED': 'False',
'HOSTED_ANTHROPIC_PAID_INCREASE_QUOTA': 1,
'HOSTED_ANTHROPIC_PAID_MIN_QUANTITY': 1,
'HOSTED_ANTHROPIC_PAID_MAX_QUANTITY': 1,
'HOSTED_MODERATION_ENABLED': 'False',
'HOSTED_MODERATION_PROVIDERS': '',
'CLEAN_DAY_SETTING': 30,
@@ -66,7 +72,8 @@ def get_env(key):
def get_bool_env(key):
return get_env(key).lower() == 'true'
value = get_env(key)
return value.lower() == 'true' if value is not None else False
def get_cors_allow_origins(env, default):
@@ -87,7 +94,7 @@ class Config:
# ------------------------
# General Configurations.
# ------------------------
self.CURRENT_VERSION = "0.4.6"
self.CURRENT_VERSION = "0.4.8"
self.COMMIT_SHA = get_env('COMMIT_SHA')
self.EDITION = "SELF_HOSTED"
self.DEPLOY_ENV = get_env('DEPLOY_ENV')
@@ -260,23 +267,35 @@ class Config:
# ------------------------
# Platform Configurations.
# ------------------------
self.HOSTED_OPENAI_ENABLED = get_bool_env('HOSTED_OPENAI_ENABLED')
self.HOSTED_OPENAI_API_KEY = get_env('HOSTED_OPENAI_API_KEY')
self.HOSTED_OPENAI_API_BASE = get_env('HOSTED_OPENAI_API_BASE')
self.HOSTED_OPENAI_API_ORGANIZATION = get_env('HOSTED_OPENAI_API_ORGANIZATION')
self.HOSTED_OPENAI_TRIAL_ENABLED = get_bool_env('HOSTED_OPENAI_TRIAL_ENABLED')
self.HOSTED_OPENAI_QUOTA_LIMIT = int(get_env('HOSTED_OPENAI_QUOTA_LIMIT'))
self.HOSTED_OPENAI_PAID_ENABLED = get_bool_env('HOSTED_OPENAI_PAID_ENABLED')
self.HOSTED_OPENAI_PAID_STRIPE_PRICE_ID = get_env('HOSTED_OPENAI_PAID_STRIPE_PRICE_ID')
self.HOSTED_OPENAI_PAID_INCREASE_QUOTA = int(get_env('HOSTED_OPENAI_PAID_INCREASE_QUOTA'))
self.HOSTED_OPENAI_PAID_MIN_QUANTITY = int(get_env('HOSTED_OPENAI_PAID_MIN_QUANTITY'))
self.HOSTED_OPENAI_PAID_MAX_QUANTITY = int(get_env('HOSTED_OPENAI_PAID_MAX_QUANTITY'))
self.HOSTED_AZURE_OPENAI_ENABLED = get_bool_env('HOSTED_AZURE_OPENAI_ENABLED')
self.HOSTED_AZURE_OPENAI_API_KEY = get_env('HOSTED_AZURE_OPENAI_API_KEY')
self.HOSTED_AZURE_OPENAI_API_BASE = get_env('HOSTED_AZURE_OPENAI_API_BASE')
self.HOSTED_AZURE_OPENAI_QUOTA_LIMIT = int(get_env('HOSTED_AZURE_OPENAI_QUOTA_LIMIT'))
self.HOSTED_ANTHROPIC_ENABLED = get_bool_env('HOSTED_ANTHROPIC_ENABLED')
self.HOSTED_ANTHROPIC_API_BASE = get_env('HOSTED_ANTHROPIC_API_BASE')
self.HOSTED_ANTHROPIC_API_KEY = get_env('HOSTED_ANTHROPIC_API_KEY')
self.HOSTED_ANTHROPIC_TRIAL_ENABLED = get_bool_env('HOSTED_ANTHROPIC_TRIAL_ENABLED')
self.HOSTED_ANTHROPIC_QUOTA_LIMIT = int(get_env('HOSTED_ANTHROPIC_QUOTA_LIMIT'))
self.HOSTED_ANTHROPIC_PAID_ENABLED = get_bool_env('HOSTED_ANTHROPIC_PAID_ENABLED')
self.HOSTED_ANTHROPIC_PAID_STRIPE_PRICE_ID = get_env('HOSTED_ANTHROPIC_PAID_STRIPE_PRICE_ID')
self.HOSTED_ANTHROPIC_PAID_INCREASE_QUOTA = int(get_env('HOSTED_ANTHROPIC_PAID_INCREASE_QUOTA'))
self.HOSTED_ANTHROPIC_PAID_MIN_QUANTITY = int(get_env('HOSTED_ANTHROPIC_PAID_MIN_QUANTITY'))
self.HOSTED_ANTHROPIC_PAID_MAX_QUANTITY = int(get_env('HOSTED_ANTHROPIC_PAID_MAX_QUANTITY'))
self.HOSTED_MINIMAX_ENABLED = get_bool_env('HOSTED_MINIMAX_ENABLED')
self.HOSTED_SPARK_ENABLED = get_bool_env('HOSTED_SPARK_ENABLED')
self.HOSTED_ZHIPUAI_ENABLED = get_bool_env('HOSTED_ZHIPUAI_ENABLED')
self.HOSTED_MODERATION_ENABLED = get_bool_env('HOSTED_MODERATION_ENABLED')
self.HOSTED_MODERATION_PROVIDERS = get_env('HOSTED_MODERATION_PROVIDERS')
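The `get_bool_env` change in this hunk guards against `None` when a key has neither an environment value nor an entry in `DEFAULTS`. A standalone sketch of the same pattern, using only a small subset of the defaults for illustration:

```python
import os

# A subset of the DEFAULTS table above, for illustration.
DEFAULTS = {
    'HOSTED_OPENAI_ENABLED': 'False',
    'HOSTED_OPENAI_QUOTA_LIMIT': 200,
}


def get_env(key):
    # Environment value wins; otherwise fall back to DEFAULTS, else None.
    return os.environ.get(key, DEFAULTS.get(key))


def get_bool_env(key):
    # None-safe: a key with no value and no default reads as False instead of
    # raising AttributeError on None.lower().
    value = get_env(key)
    return value.lower() == 'true' if value is not None else False


print(get_bool_env('HOSTED_OPENAI_ENABLED'))  # False
print(get_bool_env('SOME_UNDEFINED_FLAG'))    # False, no crash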

View File

@@ -163,29 +163,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except services.errors.conversation.ConversationNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Conversation Not Exists.")).get_json()) + "\n\n"
except services.errors.conversation.ConversationCompletedError:
yield "data: " + json.dumps(api.handle_error(ConversationCompletedError()).get_json()) + "\n\n"
except services.errors.app_model_config.AppModelConfigBrokenError:
logging.exception("App model config broken.")
yield "data: " + json.dumps(api.handle_error(AppUnavailableError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')
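The try/except ladder removed from this route's streaming generator is not lost: later in this compare, `GenerateTaskPipeline` gains `_error_to_stream_response_data`, which turns exceptions into a structured `error` event inside the stream itself. A rough sketch of that division of labor, with hypothetical names:

```python
# Illustrative only, hypothetical names: the pipeline converts exceptions into
# a structured 'error' SSE event, so the route's generator can stay as a plain
# pass-through like the simplified version above.
import json
from typing import Generator


def error_event(task_id: str, message: str, code: str = 'internal_server_error',
                status: int = 500) -> str:
    payload = {'event': 'error', 'task_id': task_id,
               'code': code, 'message': message, 'status': status}
    return "data: " + json.dumps(payload) + "\n\n"


def pipeline_stream(task_id: str, chunks) -> Generator[str, None, None]:
    try:
        for chunk in chunks:
            yield "data: " + json.dumps({'event': 'message', 'task_id': task_id,
                                         'answer': chunk}) + "\n\n"
    except Exception as e:
        # Mapped to an error event in the stream instead of bubbling up to the route.
        yield error_event(task_id, str(e))


# Route-level generator, now just forwarding whatever the pipeline yields:
def generate(response) -> Generator[str, None, None]:
    for chunk in response:
        yield chunk
```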

View File

@@ -241,27 +241,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except MessageNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Message Not Exists.")).get_json()) + "\n\n"
except MoreLikeThisDisabledError:
yield "data: " + json.dumps(api.handle_error(AppMoreLikeThisDisabledError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(
api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -158,29 +158,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except services.errors.conversation.ConversationNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Conversation Not Exists.")).get_json()) + "\n\n"
except services.errors.conversation.ConversationCompletedError:
yield "data: " + json.dumps(api.handle_error(ConversationCompletedError()).get_json()) + "\n\n"
except services.errors.app_model_config.AppModelConfigBrokenError:
logging.exception("App model config broken.")
yield "data: " + json.dumps(api.handle_error(AppUnavailableError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -117,26 +117,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except MessageNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Message Not Exists.")).get_json()) + "\n\n"
except MoreLikeThisDisabledError:
yield "data: " + json.dumps(api.handle_error(AppMoreLikeThisDisabledError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -109,29 +109,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except services.errors.conversation.ConversationNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Conversation Not Exists.")).get_json()) + "\n\n"
except services.errors.conversation.ConversationCompletedError:
yield "data: " + json.dumps(api.handle_error(ConversationCompletedError()).get_json()) + "\n\n"
except services.errors.app_model_config.AppModelConfigBrokenError:
logging.exception("App model config broken.")
yield "data: " + json.dumps(api.handle_error(AppUnavailableError()).get_json()) + "\n\n"
except ProviderTokenNotInitError:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError()).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -6,5 +6,6 @@ bp = Blueprint('service_api', __name__, url_prefix='/v1')
api = ExternalApi(bp)
from . import index
from .app import app, audio, completion, conversation, file, message
from .dataset import dataset, document, segment

View File

@@ -13,7 +13,7 @@ from core.application_queue_manager import ApplicationQueueManager
from core.entities.application_entities import InvokeFrom
from core.errors.error import ModelCurrentlyNotSupportError, ProviderTokenNotInitError, QuotaExceededError
from core.model_runtime.errors.invoke import InvokeError
from flask import Response, stream_with_context
from flask import Response, stream_with_context, request
from flask_restful import reqparse
from libs.helper import uuid_value
from services.completion_service import CompletionService
@@ -75,11 +75,18 @@ class CompletionApi(AppApiResource):
class CompletionStopApi(AppApiResource):
def post(self, app_model, end_user, task_id):
def post(self, app_model, _, task_id):
if app_model.mode != 'completion':
raise AppUnavailableError()
ApplicationQueueManager.set_stop_flag(task_id, InvokeFrom.SERVICE_API, end_user.id)
parser = reqparse.RequestParser()
parser.add_argument('user', required=True, nullable=False, type=str, location='json')
args = parser.parse_args()
end_user_id = args.get('user')
ApplicationQueueManager.set_stop_flag(task_id, InvokeFrom.SERVICE_API, end_user_id)
return {'result': 'success'}, 200
@@ -139,11 +146,13 @@ class ChatApi(AppApiResource):
class ChatStopApi(AppApiResource):
def post(self, app_model, end_user, task_id):
def post(self, app_model, _, task_id):
if app_model.mode != 'chat':
raise NotChatAppError()
ApplicationQueueManager.set_stop_flag(task_id, InvokeFrom.SERVICE_API, end_user.id)
end_user_id = request.get_json().get('user')
ApplicationQueueManager.set_stop_flag(task_id, InvokeFrom.SERVICE_API, end_user_id)
return {'result': 'success'}, 200
@@ -153,29 +162,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except services.errors.conversation.ConversationNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Conversation Not Exists.")).get_json()) + "\n\n"
except services.errors.conversation.ConversationCompletedError:
yield "data: " + json.dumps(api.handle_error(ConversationCompletedError()).get_json()) + "\n\n"
except services.errors.app_model_config.AppModelConfigBrokenError:
logging.exception("App model config broken.")
yield "data: " + json.dumps(api.handle_error(AppUnavailableError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')
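With this change, `CompletionStopApi` and `ChatStopApi` no longer take the end user from the authenticated `end_user` object; the caller must send it as `user` in the JSON body. A client-side sketch; the base URL, route path, and header names are assumptions not shown in this hunk:

```python
# Client-side sketch; the base URL, route path, and header are assumptions.
import requests

API_BASE = 'https://api.dify.example/v1'          # assumption
API_KEY = 'your-service-api-key'                  # assumption
task_id = 'task-id-from-the-streaming-response'   # placeholder

resp = requests.post(
    f'{API_BASE}/chat-messages/{task_id}/stop',   # assumed route
    headers={'Authorization': f'Bearer {API_KEY}'},
    json={'user': 'end-user-123'},                # now required in the JSON body
    timeout=10,
)
print(resp.json())  # {'result': 'success'} on success
```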

View File

@@ -86,5 +86,4 @@ class ConversationRenameApi(AppApiResource):
api.add_resource(ConversationRenameApi, '/conversations/<uuid:c_id>/name', endpoint='conversation_name')
api.add_resource(ConversationApi, '/conversations')
api.add_resource(ConversationApi, '/conversations/<uuid:c_id>', endpoint='conversation')
api.add_resource(ConversationDetailApi, '/conversations/<uuid:c_id>', endpoint='conversation_detail')

View File

@@ -0,0 +1,16 @@
from flask import current_app
from flask_restful import Resource
from controllers.service_api import api
class IndexApi(Resource):
def get(self):
return {
"welcome": "Dify OpenAPI",
"api_version": "v1",
"server_version": current_app.config['CURRENT_VERSION']
}
api.add_resource(IndexApi, '/')
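Since the service API blueprint is mounted at `/v1` (see the `__init__.py` hunk above), the new index resource doubles as a quick liveness check. A sketch; the local base URL and port are assumptions:

```python
import requests

# Assumed local base URL and port for the service API blueprint mounted at /v1.
resp = requests.get('http://localhost:5001/v1/', timeout=5)
print(resp.json())
# e.g. {'welcome': 'Dify OpenAPI', 'api_version': 'v1', 'server_version': '0.4.8'}
```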

View File

@@ -146,29 +146,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except services.errors.conversation.ConversationNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Conversation Not Exists.")).get_json()) + "\n\n"
except services.errors.conversation.ConversationCompletedError:
yield "data: " + json.dumps(api.handle_error(ConversationCompletedError()).get_json()) + "\n\n"
except services.errors.app_model_config.AppModelConfigBrokenError:
logging.exception("App model config broken.")
yield "data: " + json.dumps(api.handle_error(AppUnavailableError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -151,26 +151,8 @@ def compact_response(response: Union[dict, Generator]) -> Response:
return Response(response=json.dumps(response), status=200, mimetype='application/json')
else:
def generate() -> Generator:
try:
for chunk in response:
yield chunk
except MessageNotExistsError:
yield "data: " + json.dumps(api.handle_error(NotFound("Message Not Exists.")).get_json()) + "\n\n"
except MoreLikeThisDisabledError:
yield "data: " + json.dumps(api.handle_error(AppMoreLikeThisDisabledError()).get_json()) + "\n\n"
except ProviderTokenNotInitError as ex:
yield "data: " + json.dumps(api.handle_error(ProviderNotInitializeError(ex.description)).get_json()) + "\n\n"
except QuotaExceededError:
yield "data: " + json.dumps(api.handle_error(ProviderQuotaExceededError()).get_json()) + "\n\n"
except ModelCurrentlyNotSupportError:
yield "data: " + json.dumps(api.handle_error(ProviderModelCurrentlyNotSupportError()).get_json()) + "\n\n"
except InvokeError as e:
yield "data: " + json.dumps(api.handle_error(CompletionRequestError(e.description)).get_json()) + "\n\n"
except ValueError as e:
yield "data: " + json.dumps(api.handle_error(e).get_json()) + "\n\n"
except Exception:
logging.exception("internal server error.")
yield "data: " + json.dumps(api.handle_error(InternalServerError()).get_json()) + "\n\n"
for chunk in response:
yield chunk
return Response(stream_with_context(generate()), status=200,
mimetype='text/event-stream')

View File

@@ -146,7 +146,7 @@ class BasicApplicationRunner(AppRunner):
# get context from datasets
context = None
if app_orchestration_config.dataset:
if app_orchestration_config.dataset and app_orchestration_config.dataset.dataset_ids:
context = self.retrieve_dataset_context(
tenant_id=app_record.tenant_id,
app_record=app_record,

View File

@@ -5,16 +5,18 @@ from typing import Generator, Optional, Union, cast
from core.app_runner.moderation_handler import ModerationRule, OutputModerationHandler
from core.application_queue_manager import ApplicationQueueManager, PublishFrom
from core.entities.application_entities import ApplicationGenerateEntity
from core.entities.application_entities import ApplicationGenerateEntity, InvokeFrom
from core.entities.queue_entities import (AnnotationReplyEvent, QueueAgentThoughtEvent, QueueErrorEvent,
QueueMessageEndEvent, QueueMessageEvent, QueueMessageReplaceEvent,
QueuePingEvent, QueueRetrieverResourcesEvent, QueueStopEvent)
from core.errors.error import ProviderTokenNotInitError, QuotaExceededError, ModelCurrentlyNotSupportError
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta, LLMUsage
from core.model_runtime.entities.message_entities import (AssistantPromptMessage, ImagePromptMessageContent,
PromptMessage, PromptMessageContentType, PromptMessageRole,
TextPromptMessageContent)
from core.model_runtime.errors.invoke import InvokeAuthorizationError, InvokeError
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.model_runtime.utils.encoders import jsonable_encoder
from core.prompt.prompt_template import PromptTemplateParser
from events.message_event import message_was_created
from extensions.ext_database import db
@@ -135,6 +137,8 @@ class GenerateTaskPipeline:
completion_tokens
)
self._task_state.metadata['usage'] = jsonable_encoder(self._task_state.llm_result.usage)
# response moderation
if self._output_moderation_handler:
self._output_moderation_handler.stop_thread()
@@ -145,12 +149,13 @@ class GenerateTaskPipeline:
)
# Save message
self._save_message(event.llm_result)
self._save_message(self._task_state.llm_result)
response = {
'event': 'message',
'task_id': self._application_generate_entity.task_id,
'id': self._message.id,
'message_id': self._message.id,
'mode': self._conversation.mode,
'answer': event.llm_result.message.content,
'metadata': {},
@@ -161,7 +166,7 @@ class GenerateTaskPipeline:
response['conversation_id'] = self._conversation.id
if self._task_state.metadata:
response['metadata'] = self._task_state.metadata
response['metadata'] = self._get_response_metadata()
return response
else:
@@ -176,7 +181,9 @@ class GenerateTaskPipeline:
event = message.event
if isinstance(event, QueueErrorEvent):
raise self._handle_error(event)
data = self._error_to_stream_response_data(self._handle_error(event))
yield self._yield_response(data)
break
elif isinstance(event, (QueueStopEvent, QueueMessageEndEvent)):
if isinstance(event, QueueMessageEndEvent):
self._task_state.llm_result = event.llm_result
@@ -213,6 +220,8 @@ class GenerateTaskPipeline:
completion_tokens
)
self._task_state.metadata['usage'] = jsonable_encoder(self._task_state.llm_result.usage)
# response moderation
if self._output_moderation_handler:
self._output_moderation_handler.stop_thread()
@@ -244,13 +253,14 @@ class GenerateTaskPipeline:
'event': 'message_end',
'task_id': self._application_generate_entity.task_id,
'id': self._message.id,
'message_id': self._message.id,
}
if self._conversation.mode == 'chat':
response['conversation_id'] = self._conversation.id
if self._task_state.metadata:
response['metadata'] = self._task_state.metadata
response['metadata'] = self._get_response_metadata()
yield self._yield_response(response)
elif isinstance(event, QueueRetrieverResourcesEvent):
@@ -410,6 +420,86 @@ class GenerateTaskPipeline:
else:
return Exception(e.description if getattr(e, 'description', None) is not None else str(e))
def _error_to_stream_response_data(self, e: Exception) -> dict:
"""
Error to stream response.
:param e: exception
:return:
"""
if isinstance(e, ValueError):
data = {
'code': 'invalid_param',
'message': str(e),
'status': 400
}
elif isinstance(e, ProviderTokenNotInitError):
data = {
'code': 'provider_not_initialize',
'message': e.description,
'status': 400
}
elif isinstance(e, QuotaExceededError):
data = {
'code': 'provider_quota_exceeded',
'message': "Your quota for Dify Hosted Model Provider has been exhausted. "
"Please go to Settings -> Model Provider to complete your own provider credentials.",
'status': 400
}
elif isinstance(e, ModelCurrentlyNotSupportError):
data = {
'code': 'model_currently_not_support',
'message': e.description,
'status': 400
}
elif isinstance(e, InvokeError):
data = {
'code': 'completion_request_error',
'message': e.description,
'status': 400
}
else:
logging.error(e)
data = {
'code': 'internal_server_error',
'message': 'Internal Server Error, please contact support.',
'status': 500
}
return {
'event': 'error',
'task_id': self._application_generate_entity.task_id,
'message_id': self._message.id,
**data
}
def _get_response_metadata(self) -> dict:
"""
Get response metadata by invoke from.
:return:
"""
metadata = {}
# show_retrieve_source
if 'retriever_resources' in self._task_state.metadata:
if self._application_generate_entity.invoke_from in [InvokeFrom.DEBUGGER, InvokeFrom.SERVICE_API]:
metadata['retriever_resources'] = self._task_state.metadata['retriever_resources']
else:
metadata['retriever_resources'] = []
for resource in self._task_state.metadata['retriever_resources']:
metadata['retriever_resources'].append({
'segment_id': resource['segment_id'],
'position': resource['position'],
'document_name': resource['document_name'],
'score': resource['score'],
'content': resource['content'],
})
# show usage
if self._application_generate_entity.invoke_from in [InvokeFrom.DEBUGGER, InvokeFrom.SERVICE_API]:
metadata['usage'] = self._task_state.metadata['usage']
return metadata
def _yield_response(self, response: dict) -> str:
"""
Yield response.
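Because errors now arrive as a dedicated `error` event in the SSE stream (built by `_error_to_stream_response_data` above) rather than only as exceptions caught in each route, clients should branch on the event type. A consumer sketch; the endpoint, headers, and request payload are assumptions:

```python
# SSE consumer sketch; the endpoint, headers, and request payload are assumptions.
import json
import requests

resp = requests.post(
    'http://localhost:5001/v1/chat-messages',     # assumed route
    headers={'Authorization': 'Bearer your-service-api-key'},
    json={'query': 'Hello', 'inputs': {}, 'user': 'end-user-123',
          'response_mode': 'streaming'},
    stream=True,
    timeout=60,
)

for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith('data: '):
        continue
    event = json.loads(line[len('data: '):])
    if event.get('event') == 'error':
        raise RuntimeError(f"{event.get('code')}: {event.get('message')}")
    if event.get('event') == 'message':
        print(event.get('answer', ''), end='', flush=True)
    if event.get('event') == 'message_end':
        print('\nusage:', event.get('metadata', {}).get('usage'))
```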

View File

@@ -165,7 +165,7 @@ class ProviderConfiguration(BaseModel):
if value == '[__HIDDEN__]' and key in original_credentials:
credentials[key] = encrypter.decrypt_token(self.tenant_id, original_credentials[key])
model_provider_factory.provider_credentials_validate(
credentials = model_provider_factory.provider_credentials_validate(
self.provider.provider,
credentials
)
@@ -308,24 +308,13 @@ class ProviderConfiguration(BaseModel):
if value == '[__HIDDEN__]' and key in original_credentials:
credentials[key] = encrypter.decrypt_token(self.tenant_id, original_credentials[key])
model_provider_factory.model_credentials_validate(
credentials = model_provider_factory.model_credentials_validate(
provider=self.provider.provider,
model_type=model_type,
model=model,
credentials=credentials
)
model_schema = (
model_provider_factory.get_provider_instance(self.provider.provider)
.get_model_instance(model_type)._get_customizable_model_schema(
model=model,
credentials=credentials
)
)
if model_schema:
credentials['schema'] = json.dumps(encoders.jsonable_encoder(model_schema))
for key, value in credentials.items():
if key in provider_credential_secret_variables:
credentials[key] = encrypter.encrypt_token(self.tenant_id, value)
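The fix in this hunk is that `provider_credentials_validate` / `model_credentials_validate` now return the filtered credential dict, and the caller reassigns `credentials` rather than discarding the return value. A toy sketch of that calling convention, with hypothetical keys:

```python
# Toy sketch, hypothetical keys: validation returns the filtered dict and the
# caller must reassign it, as in the hunk above.
def provider_credentials_validate(credentials: dict) -> dict:
    allowed = {'api_key', 'api_base'}
    cleaned = {k: v.strip() for k, v in credentials.items() if k in allowed and v}
    if 'api_key' not in cleaned:
        raise ValueError('api_key is required')
    return cleaned


raw = {'api_key': ' sk-test ', 'api_base': '', 'unused': 'x'}
credentials = provider_credentials_validate(raw)  # reassign instead of discarding
print(credentials)  # {'api_key': 'sk-test'}
```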

View File

@@ -166,8 +166,7 @@ class DatasetRetrievalFeature:
dataset_ids=[dataset.id for dataset in available_datasets],
tenant_id=tenant_id,
top_k=retrieve_config.top_k or 2,
score_threshold=(retrieve_config.score_threshold or 0.5)
if retrieve_config.reranking_model.get('score_threshold_enabled', False) else None,
score_threshold=retrieve_config.score_threshold,
hit_callbacks=[hit_callback],
return_resource=return_resource,
retriever_from=invoke_from.to_source(),

View File

@@ -1,9 +1,8 @@
import os
from typing import Optional
from core.entities.provider_entities import QuotaUnit, RestrictModel
from core.model_runtime.entities.model_entities import ModelType
from flask import Flask
from flask import Flask, Config
from models.provider import ProviderQuotaType
from pydantic import BaseModel
@@ -48,46 +47,47 @@ class HostingConfiguration:
moderation_config: HostedModerationConfig = None
def init_app(self, app: Flask) -> None:
if app.config.get('EDITION') != 'CLOUD':
config = app.config
if config.get('EDITION') != 'CLOUD':
return
self.provider_map["azure_openai"] = self.init_azure_openai()
self.provider_map["openai"] = self.init_openai()
self.provider_map["anthropic"] = self.init_anthropic()
self.provider_map["minimax"] = self.init_minimax()
self.provider_map["spark"] = self.init_spark()
self.provider_map["zhipuai"] = self.init_zhipuai()
self.provider_map["azure_openai"] = self.init_azure_openai(config)
self.provider_map["openai"] = self.init_openai(config)
self.provider_map["anthropic"] = self.init_anthropic(config)
self.provider_map["minimax"] = self.init_minimax(config)
self.provider_map["spark"] = self.init_spark(config)
self.provider_map["zhipuai"] = self.init_zhipuai(config)
self.moderation_config = self.init_moderation_config()
self.moderation_config = self.init_moderation_config(config)
def init_azure_openai(self) -> HostingProvider:
def init_azure_openai(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TIMES
if os.environ.get("HOSTED_AZURE_OPENAI_ENABLED") and os.environ.get("HOSTED_AZURE_OPENAI_ENABLED").lower() == 'true':
if app_config.get("HOSTED_AZURE_OPENAI_ENABLED"):
credentials = {
"openai_api_key": os.environ.get("HOSTED_AZURE_OPENAI_API_KEY"),
"openai_api_base": os.environ.get("HOSTED_AZURE_OPENAI_API_BASE"),
"openai_api_key": app_config.get("HOSTED_AZURE_OPENAI_API_KEY"),
"openai_api_base": app_config.get("HOSTED_AZURE_OPENAI_API_BASE"),
"base_model_name": "gpt-35-turbo"
}
quotas = []
hosted_quota_limit = int(os.environ.get("HOSTED_AZURE_OPENAI_QUOTA_LIMIT", "1000"))
if hosted_quota_limit != -1 or hosted_quota_limit > 0:
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit,
restrict_models=[
RestrictModel(model="gpt-4", base_model_name="gpt-4", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-32k", base_model_name="gpt-4-32k", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-1106-preview", base_model_name="gpt-4-1106-preview", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-vision-preview", base_model_name="gpt-4-vision-preview", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo", base_model_name="gpt-35-turbo", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-1106", base_model_name="gpt-35-turbo-1106", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-instruct", base_model_name="gpt-35-turbo-instruct", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-16k", base_model_name="gpt-35-turbo-16k", model_type=ModelType.LLM),
RestrictModel(model="text-davinci-003", base_model_name="text-davinci-003", model_type=ModelType.LLM),
RestrictModel(model="text-embedding-ada-002", base_model_name="text-embedding-ada-002", model_type=ModelType.TEXT_EMBEDDING),
]
)
quotas.append(trial_quota)
hosted_quota_limit = int(app_config.get("HOSTED_AZURE_OPENAI_QUOTA_LIMIT", "1000"))
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit,
restrict_models=[
RestrictModel(model="gpt-4", base_model_name="gpt-4", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-32k", base_model_name="gpt-4-32k", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-1106-preview", base_model_name="gpt-4-1106-preview", model_type=ModelType.LLM),
RestrictModel(model="gpt-4-vision-preview", base_model_name="gpt-4-vision-preview", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo", base_model_name="gpt-35-turbo", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-1106", base_model_name="gpt-35-turbo-1106", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-instruct", base_model_name="gpt-35-turbo-instruct", model_type=ModelType.LLM),
RestrictModel(model="gpt-35-turbo-16k", base_model_name="gpt-35-turbo-16k", model_type=ModelType.LLM),
RestrictModel(model="text-davinci-003", base_model_name="text-davinci-003", model_type=ModelType.LLM),
RestrictModel(model="text-embedding-ada-002", base_model_name="text-embedding-ada-002", model_type=ModelType.TEXT_EMBEDDING),
]
)
quotas.append(trial_quota)
return HostingProvider(
enabled=True,
@@ -101,43 +101,44 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_openai(self) -> HostingProvider:
def init_openai(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TIMES
if os.environ.get("HOSTED_OPENAI_ENABLED") and os.environ.get("HOSTED_OPENAI_ENABLED").lower() == 'true':
quotas = []
if app_config.get("HOSTED_OPENAI_TRIAL_ENABLED"):
hosted_quota_limit = int(app_config.get("HOSTED_OPENAI_QUOTA_LIMIT", "200"))
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit,
restrict_models=[
RestrictModel(model="gpt-3.5-turbo", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-1106", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-instruct", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-16k", model_type=ModelType.LLM),
RestrictModel(model="text-davinci-003", model_type=ModelType.LLM),
RestrictModel(model="whisper-1", model_type=ModelType.SPEECH2TEXT),
]
)
quotas.append(trial_quota)
if app_config.get("HOSTED_OPENAI_PAID_ENABLED"):
paid_quota = PaidHostingQuota(
stripe_price_id=app_config.get("HOSTED_OPENAI_PAID_STRIPE_PRICE_ID"),
increase_quota=int(app_config.get("HOSTED_OPENAI_PAID_INCREASE_QUOTA", "1")),
min_quantity=int(app_config.get("HOSTED_OPENAI_PAID_MIN_QUANTITY", "1")),
max_quantity=int(app_config.get("HOSTED_OPENAI_PAID_MAX_QUANTITY", "1"))
)
quotas.append(paid_quota)
if len(quotas) > 0:
credentials = {
"openai_api_key": os.environ.get("HOSTED_OPENAI_API_KEY"),
"openai_api_key": app_config.get("HOSTED_OPENAI_API_KEY"),
}
if os.environ.get("HOSTED_OPENAI_API_BASE"):
credentials["openai_api_base"] = os.environ.get("HOSTED_OPENAI_API_BASE")
if app_config.get("HOSTED_OPENAI_API_BASE"):
credentials["openai_api_base"] = app_config.get("HOSTED_OPENAI_API_BASE")
if os.environ.get("HOSTED_OPENAI_API_ORGANIZATION"):
credentials["openai_organization"] = os.environ.get("HOSTED_OPENAI_API_ORGANIZATION")
quotas = []
hosted_quota_limit = int(os.environ.get("HOSTED_OPENAI_QUOTA_LIMIT", "200"))
if hosted_quota_limit != -1 or hosted_quota_limit > 0:
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit,
restrict_models=[
RestrictModel(model="gpt-3.5-turbo", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-1106", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-instruct", model_type=ModelType.LLM),
RestrictModel(model="gpt-3.5-turbo-16k", model_type=ModelType.LLM),
RestrictModel(model="text-davinci-003", model_type=ModelType.LLM),
]
)
quotas.append(trial_quota)
if os.environ.get("HOSTED_OPENAI_PAID_ENABLED") and os.environ.get(
"HOSTED_OPENAI_PAID_ENABLED").lower() == 'true':
paid_quota = PaidHostingQuota(
stripe_price_id=os.environ.get("HOSTED_OPENAI_PAID_STRIPE_PRICE_ID"),
increase_quota=int(os.environ.get("HOSTED_OPENAI_PAID_INCREASE_QUOTA", "1")),
min_quantity=int(os.environ.get("HOSTED_OPENAI_PAID_MIN_QUANTITY", "1")),
max_quantity=int(os.environ.get("HOSTED_OPENAI_PAID_MAX_QUANTITY", "1"))
)
quotas.append(paid_quota)
if app_config.get("HOSTED_OPENAI_API_ORGANIZATION"):
credentials["openai_organization"] = app_config.get("HOSTED_OPENAI_API_ORGANIZATION")
return HostingProvider(
enabled=True,
@@ -151,33 +152,33 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_anthropic(self) -> HostingProvider:
def init_anthropic(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TOKENS
if os.environ.get("HOSTED_ANTHROPIC_ENABLED") and os.environ.get("HOSTED_ANTHROPIC_ENABLED").lower() == 'true':
quotas = []
if app_config.get("HOSTED_ANTHROPIC_TRIAL_ENABLED"):
hosted_quota_limit = int(app_config.get("HOSTED_ANTHROPIC_QUOTA_LIMIT", "0"))
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit
)
quotas.append(trial_quota)
if app_config.get("HOSTED_ANTHROPIC_PAID_ENABLED"):
paid_quota = PaidHostingQuota(
stripe_price_id=app_config.get("HOSTED_ANTHROPIC_PAID_STRIPE_PRICE_ID"),
increase_quota=int(app_config.get("HOSTED_ANTHROPIC_PAID_INCREASE_QUOTA", "1000000")),
min_quantity=int(app_config.get("HOSTED_ANTHROPIC_PAID_MIN_QUANTITY", "20")),
max_quantity=int(app_config.get("HOSTED_ANTHROPIC_PAID_MAX_QUANTITY", "100"))
)
quotas.append(paid_quota)
if len(quotas) > 0:
credentials = {
"anthropic_api_key": os.environ.get("HOSTED_ANTHROPIC_API_KEY"),
"anthropic_api_key": app_config.get("HOSTED_ANTHROPIC_API_KEY"),
}
if os.environ.get("HOSTED_ANTHROPIC_API_BASE"):
credentials["anthropic_api_url"] = os.environ.get("HOSTED_ANTHROPIC_API_BASE")
quotas = []
hosted_quota_limit = int(os.environ.get("HOSTED_ANTHROPIC_QUOTA_LIMIT", "0"))
if hosted_quota_limit != -1 or hosted_quota_limit > 0:
trial_quota = TrialHostingQuota(
quota_limit=hosted_quota_limit
)
quotas.append(trial_quota)
if os.environ.get("HOSTED_ANTHROPIC_PAID_ENABLED") and os.environ.get(
"HOSTED_ANTHROPIC_PAID_ENABLED").lower() == 'true':
paid_quota = PaidHostingQuota(
stripe_price_id=os.environ.get("HOSTED_ANTHROPIC_PAID_STRIPE_PRICE_ID"),
increase_quota=int(os.environ.get("HOSTED_ANTHROPIC_PAID_INCREASE_QUOTA", "1000000")),
min_quantity=int(os.environ.get("HOSTED_ANTHROPIC_PAID_MIN_QUANTITY", "20")),
max_quantity=int(os.environ.get("HOSTED_ANTHROPIC_PAID_MAX_QUANTITY", "100"))
)
quotas.append(paid_quota)
if app_config.get("HOSTED_ANTHROPIC_API_BASE"):
credentials["anthropic_api_url"] = app_config.get("HOSTED_ANTHROPIC_API_BASE")
return HostingProvider(
enabled=True,
@@ -191,9 +192,9 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_minimax(self) -> HostingProvider:
def init_minimax(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TOKENS
if os.environ.get("HOSTED_MINIMAX_ENABLED") and os.environ.get("HOSTED_MINIMAX_ENABLED").lower() == 'true':
if app_config.get("HOSTED_MINIMAX_ENABLED"):
quotas = [FreeHostingQuota()]
return HostingProvider(
@@ -208,9 +209,9 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_spark(self) -> HostingProvider:
def init_spark(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TOKENS
if os.environ.get("HOSTED_SPARK_ENABLED") and os.environ.get("HOSTED_SPARK_ENABLED").lower() == 'true':
if app_config.get("HOSTED_SPARK_ENABLED"):
quotas = [FreeHostingQuota()]
return HostingProvider(
@@ -225,9 +226,9 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_zhipuai(self) -> HostingProvider:
def init_zhipuai(self, app_config: Config) -> HostingProvider:
quota_unit = QuotaUnit.TOKENS
if os.environ.get("HOSTED_ZHIPUAI_ENABLED") and os.environ.get("HOSTED_ZHIPUAI_ENABLED").lower() == 'true':
if app_config.get("HOSTED_ZHIPUAI_ENABLED"):
quotas = [FreeHostingQuota()]
return HostingProvider(
@@ -242,12 +243,12 @@ class HostingConfiguration:
quota_unit=quota_unit,
)
def init_moderation_config(self) -> HostedModerationConfig:
if os.environ.get("HOSTED_MODERATION_ENABLED") and os.environ.get("HOSTED_MODERATION_ENABLED").lower() == 'true' \
and os.environ.get("HOSTED_MODERATION_PROVIDERS"):
def init_moderation_config(self, app_config: Config) -> HostedModerationConfig:
if app_config.get("HOSTED_MODERATION_ENABLED") \
and app_config.get("HOSTED_MODERATION_PROVIDERS"):
return HostedModerationConfig(
enabled=True,
providers=os.environ.get("HOSTED_MODERATION_PROVIDERS").split(',')
providers=app_config.get("HOSTED_MODERATION_PROVIDERS").split(',')
)
return HostedModerationConfig(
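
Note on the hunks above: the hosted-provider setup moves from ad-hoc `os.environ.get(...).lower() == 'true'` checks to lookups on a dict-like `app_config`, so flags arrive already parsed and defaults live in one place. A minimal sketch of that pattern follows; the `Config` class and variable names are illustrative, not Dify's actual configuration code.

```python
import os


class Config(dict):
    """Illustrative dict-backed config: environment variables parsed once, with defaults."""

    def __init__(self) -> None:
        super().__init__()
        # Booleans become real bools here, so callers can simply write `if config.get(...)`.
        self["HOSTED_OPENAI_ENABLED"] = os.environ.get("HOSTED_OPENAI_ENABLED", "false").lower() == "true"
        self["HOSTED_OPENAI_TRIAL_ENABLED"] = os.environ.get("HOSTED_OPENAI_TRIAL_ENABLED", "false").lower() == "true"
        self["HOSTED_OPENAI_QUOTA_LIMIT"] = os.environ.get("HOSTED_OPENAI_QUOTA_LIMIT", "200")
        self["HOSTED_OPENAI_API_KEY"] = os.environ.get("HOSTED_OPENAI_API_KEY")


def hosted_openai_credentials(app_config: Config) -> dict:
    # The provider initializers above follow this shape: consult the config, not os.environ.
    if not app_config.get("HOSTED_OPENAI_ENABLED"):
        return {}
    return {"openai_api_key": app_config.get("HOSTED_OPENAI_API_KEY")}
```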

View File

@@ -13,7 +13,7 @@ from core.docstore.dataset_docstore import DatasetDocumentStore
from core.errors.error import ProviderTokenNotInitError
from core.generator.llm_generator import LLMGenerator
from core.index.index import IndexBuilder
from core.model_manager import ModelManager
from core.model_manager import ModelManager, ModelInstance
from core.model_runtime.entities.model_entities import ModelType, PriceType
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.model_runtime.model_providers.__base.text_embedding_model import TextEmbeddingModel
@@ -61,8 +61,24 @@ class IndexingRunner:
# load file
text_docs = self._load_data(dataset_document, processing_rule.mode == 'automatic')
# get embedding model instance
embedding_model_instance = None
if dataset.indexing_technique == 'high_quality':
if dataset.embedding_model_provider:
embedding_model_instance = self.model_manager.get_model_instance(
tenant_id=dataset.tenant_id,
provider=dataset.embedding_model_provider,
model_type=ModelType.TEXT_EMBEDDING,
model=dataset.embedding_model
)
else:
embedding_model_instance = self.model_manager.get_default_model_instance(
tenant_id=dataset.tenant_id,
model_type=ModelType.TEXT_EMBEDDING,
)
# get splitter
splitter = self._get_splitter(processing_rule)
splitter = self._get_splitter(processing_rule, embedding_model_instance)
# split to documents
documents = self._step_split(
@@ -121,8 +137,24 @@ class IndexingRunner:
# load file
text_docs = self._load_data(dataset_document, processing_rule.mode == 'automatic')
# get embedding model instance
embedding_model_instance = None
if dataset.indexing_technique == 'high_quality':
if dataset.embedding_model_provider:
embedding_model_instance = self.model_manager.get_model_instance(
tenant_id=dataset.tenant_id,
provider=dataset.embedding_model_provider,
model_type=ModelType.TEXT_EMBEDDING,
model=dataset.embedding_model
)
else:
embedding_model_instance = self.model_manager.get_default_model_instance(
tenant_id=dataset.tenant_id,
model_type=ModelType.TEXT_EMBEDDING,
)
# get splitter
splitter = self._get_splitter(processing_rule)
splitter = self._get_splitter(processing_rule, embedding_model_instance)
# split to documents
documents = self._step_split(
@@ -253,7 +285,7 @@ class IndexingRunner:
text_docs = FileExtractor.load(file_detail, is_automatic=processing_rule.mode == 'automatic')
# get splitter
splitter = self._get_splitter(processing_rule)
splitter = self._get_splitter(processing_rule, embedding_model_instance)
# split to documents
documents = self._split_to_documents_for_estimate(
@@ -384,7 +416,7 @@ class IndexingRunner:
)
# get splitter
splitter = self._get_splitter(processing_rule)
splitter = self._get_splitter(processing_rule, embedding_model_instance)
# split to documents
documents = self._split_to_documents_for_estimate(
@@ -499,10 +531,13 @@ class IndexingRunner:
def filter_string(self, text):
text = re.sub(r'<\|', '<', text)
text = re.sub(r'\|>', '>', text)
text = re.sub(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F\x80-\xFF]', '', text)
text = re.sub(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F\xEF\xBF\xBE]', '', text)
# Unicode U+FFFE
text = re.sub(u'\uFFFE', '', text)
return text
def _get_splitter(self, processing_rule: DatasetProcessRule) -> TextSplitter:
def _get_splitter(self, processing_rule: DatasetProcessRule,
embedding_model_instance: Optional[ModelInstance]) -> TextSplitter:
"""
Get the NodeParser object according to the processing rule.
"""
@@ -517,19 +552,20 @@ class IndexingRunner:
if separator:
separator = separator.replace('\\n', '\n')
character_splitter = FixedRecursiveCharacterTextSplitter.from_gpt2_encoder(
character_splitter = FixedRecursiveCharacterTextSplitter.from_encoder(
chunk_size=segmentation["max_tokens"],
chunk_overlap=0,
fixed_separator=separator,
separators=["\n\n", "", ".", " ", ""]
separators=["\n\n", "", ".", " ", ""],
embedding_model_instance=embedding_model_instance
)
else:
# Automatic segmentation
character_splitter = EnhanceRecursiveCharacterTextSplitter.from_gpt2_encoder(
character_splitter = EnhanceRecursiveCharacterTextSplitter.from_encoder(
chunk_size=DatasetProcessRule.AUTOMATIC_RULES['segmentation']['max_tokens'],
chunk_overlap=0,
separators=["\n\n", "", ".", " ", ""]
separators=["\n\n", "", ".", " ", ""],
embedding_model_instance=embedding_model_instance
)
return character_splitter
@@ -714,7 +750,7 @@ class IndexingRunner:
return text
def format_split_text(self, text):
regex = r"Q\d+:\s*(.*?)\s*A\d+:\s*([\s\S]*?)(?=Q\d+:|$)"
regex = r"Q\d+:\s*(.*?)\s*A\d+:\s*([\s\S]*?)(?=Q\d+:|$)"
matches = re.findall(regex, text, re.UNICODE)
return [

View File

@@ -149,8 +149,8 @@ class ParameterRule(BaseModel):
help: Optional[I18nObject] = None
required: bool = False
default: Optional[Any] = None
min: Optional[float | int] = None
max: Optional[float | int] = None
min: Optional[float] = None
max: Optional[float] = None
precision: Optional[int] = None
options: list[str] = []
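
The `ParameterRule` hunk narrows `min`/`max` from `Optional[float | int]` to `Optional[float]`. Under pydantic's default coercion an integer bound from a YAML rule still validates, it is simply stored as a float, so the union was unnecessary. A small illustrative check, assuming pydantic v1-style coercion:

```python
from typing import Optional

from pydantic import BaseModel


class Rule(BaseModel):
    # Stand-in for ParameterRule's numeric bounds.
    min: Optional[float] = None
    max: Optional[float] = None


# Integer values from YAML (e.g. `min: 1`) are coerced to floats on validation.
print(Rule(min=1, max=4096))    # min=1.0, max=4096.0
print(Rule(min=0.01, max=0.9))  # min=0.01, max=0.9
```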

View File

@@ -1,6 +1,4 @@
import decimal
import json
import logging
import os
from abc import ABC, abstractmethod
from typing import Optional
@@ -12,7 +10,6 @@ from core.model_runtime.entities.model_entities import (AIModelEntity, DefaultPa
PriceConfig, PriceInfo, PriceType)
from core.model_runtime.errors.invoke import InvokeAuthorizationError, InvokeError
from core.model_runtime.model_providers.__base.tokenizers.gpt2_tokenzier import GPT2Tokenizer
from pydantic import ValidationError
class AIModel(ABC):
@@ -54,14 +51,16 @@ class AIModel(ABC):
:param error: model invoke error
:return: unified error
"""
provider_name = self.__class__.__module__.split('.')[-3]
for invoke_error, model_errors in self._invoke_error_mapping.items():
if isinstance(error, tuple(model_errors)):
if invoke_error == InvokeAuthorizationError:
return invoke_error(description="Incorrect model credentials provided, please check and try again. ")
return invoke_error(description=f"[{provider_name}] Incorrect model credentials provided, please check and try again. ")
return invoke_error(description=f"{invoke_error.description}: {str(error)}")
return invoke_error(description=f"[{provider_name}] {invoke_error.description}, {str(error)}")
return InvokeError(description=f"Error: {str(error)}")
return InvokeError(description=f"[{provider_name}] Error: {str(error)}")
def get_price(self, model: str, credentials: dict, price_type: PriceType, tokens: int) -> PriceInfo:
"""

View File

@@ -1,3 +1,4 @@
import copy
import logging
from typing import Generator, List, Optional, Union, cast
@@ -625,9 +626,10 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
def _get_ai_model_entity(base_model_name: str, model: str) -> AzureBaseModel:
for ai_model_entity in LLM_BASE_MODELS:
if ai_model_entity.base_model_name == base_model_name:
ai_model_entity.entity.model = model
ai_model_entity.entity.label.en_US = model
ai_model_entity.entity.label.zh_Hans = model
return ai_model_entity
ai_model_entity_copy = copy.deepcopy(ai_model_entity)
ai_model_entity_copy.entity.model = model
ai_model_entity_copy.entity.label.en_US = model
ai_model_entity_copy.entity.label.zh_Hans = model
return ai_model_entity_copy
return None
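
The previous lookup mutated the shared entries of `LLM_BASE_MODELS` in place, so two Azure deployments built on the same base model ended up showing whichever custom name was written last. Deep-copying gives each caller an independent entity. A self-contained sketch of the failure mode, with simplified stand-in classes rather than the real model entities:

```python
import copy
from dataclasses import dataclass


@dataclass
class Entity:
    model: str


@dataclass
class AzureBaseModel:
    base_model_name: str
    entity: Entity


# Module-level template list, shared by every lookup (like LLM_BASE_MODELS).
BASE_MODELS = [AzureBaseModel("gpt-4", Entity(model="gpt-4"))]


def get_entity_buggy(base_model_name: str, model: str) -> AzureBaseModel:
    for m in BASE_MODELS:
        if m.base_model_name == base_model_name:
            m.entity.model = model  # mutates the shared template in place
            return m


def get_entity_fixed(base_model_name: str, model: str) -> AzureBaseModel:
    for m in BASE_MODELS:
        if m.base_model_name == base_model_name:
            m_copy = copy.deepcopy(m)  # each deployment gets its own copy
            m_copy.entity.model = model
            return m_copy


a = get_entity_buggy("gpt-4", "deployment-a")
b = get_entity_buggy("gpt-4", "deployment-b")
print(a.entity.model, b.entity.model)  # deployment-b deployment-b -> names collide
```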

View File

@@ -1,4 +1,5 @@
import base64
import copy
import time
from typing import Optional, Tuple
@@ -186,9 +187,10 @@ class AzureOpenAITextEmbeddingModel(_CommonAzureOpenAI, TextEmbeddingModel):
def _get_ai_model_entity(base_model_name: str, model: str) -> AzureBaseModel:
for ai_model_entity in EMBEDDING_BASE_MODELS:
if ai_model_entity.base_model_name == base_model_name:
ai_model_entity.entity.model = model
ai_model_entity.entity.label.en_US = model
ai_model_entity.entity.label.zh_Hans = model
return ai_model_entity
ai_model_entity_copy = copy.deepcopy(ai_model_entity)
ai_model_entity_copy.entity.model = model
ai_model_entity_copy.entity.label.en_US = model
ai_model_entity_copy.entity.label.zh_Hans = model
return ai_model_entity_copy
return None

View File

@@ -134,7 +134,55 @@ class HuggingfaceHubLargeLanguageModel(_CommonHuggingfaceHub, LargeLanguageModel
precision=0,
)
return [temperature_rule, top_k_rule, top_p_rule]
max_new_tokens = ParameterRule(
name='max_new_tokens',
label={
'en_US': 'Max New Tokens',
'zh_Hans': '最大新标记',
},
type='int',
help={
'en_US': 'Maximum number of generated tokens.',
'zh_Hans': '生成的标记的最大数量。',
},
required=False,
default=20,
min=1,
max=4096,
precision=0,
)
seed = ParameterRule(
name='seed',
label={
'en_US': 'Random sampling seed',
'zh_Hans': '随机采样种子',
},
type='int',
help={
'en_US': 'Random sampling seed.',
'zh_Hans': '随机采样种子。',
},
required=False,
precision=0,
)
repetition_penalty = ParameterRule(
name='repetition_penalty',
label={
'en_US': 'Repetition Penalty',
'zh_Hans': '重复惩罚',
},
type='float',
help={
'en_US': 'The parameter for repetition penalty. 1.0 means no penalty.',
'zh_Hans': '重复惩罚的参数。1.0 表示没有惩罚。',
},
required=False,
precision=1,
)
return [temperature_rule, top_k_rule, top_p_rule, max_new_tokens, seed, repetition_penalty]
def _handle_generate_stream_response(self,
model: str,

View File

@@ -0,0 +1,9 @@
model: jina-embeddings-v2-base-zh
model_type: text-embedding
model_properties:
context_size: 8192
max_chunks: 2048
pricing:
input: '0.001'
unit: '0.001'
currency: USD

View File

@@ -17,7 +17,7 @@ class JinaTextEmbeddingModel(TextEmbeddingModel):
Model class for Jina text embedding model.
"""
api_base: str = 'https://api.jina.ai/v1/embeddings'
models: list[str] = ['jina-embeddings-v2-base-en', 'jina-embeddings-v2-small-en']
models: list[str] = ['jina-embeddings-v2-base-en', 'jina-embeddings-v2-small-en', 'jina-embeddings-v2-base-zh']
def _invoke(self, model: str, credentials: dict,
texts: list[str], user: Optional[str] = None) \

View File

@@ -10,8 +10,14 @@ model_properties:
parameter_rules:
- name: temperature
use_template: temperature
min: 0.01
max: 1
default: 0.9
- name: top_p
use_template: top_p
min: 0.01
max: 1
default: 0.95
- name: max_tokens
use_template: max_tokens
required: true

View File

@@ -0,0 +1,35 @@
model: abab5.5s-chat
label:
en_US: Abab5.5s-Chat
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
min: 0.01
max: 1
default: 0.9
- name: top_p
use_template: top_p
min: 0.01
max: 1
default: 0.95
- name: max_tokens
use_template: max_tokens
required: true
default: 3072
min: 1
max: 8192
- name: presence_penalty
use_template: presence_penalty
- name: frequency_penalty
use_template: frequency_penalty
pricing:
input: '0.00'
output: '0.005'
unit: '0.001'
currency: RMB

View File

@@ -22,7 +22,7 @@ class MinimaxChatCompletionPro(object):
"""
generate chat completion
"""
if model != 'abab5.5-chat':
if model not in ['abab5.5-chat', 'abab5.5s-chat']:
raise BadRequestError(f'Invalid model: {model}')
if not api_key or not group_id:

View File

@@ -18,6 +18,7 @@ from core.model_runtime.model_providers.minimax.llm.types import MinimaxMessage
class MinimaxLargeLanguageModel(LargeLanguageModel):
model_apis = {
'abab5.5s-chat': MinimaxChatCompletionPro,
'abab5.5-chat': MinimaxChatCompletionPro,
'abab5-chat': MinimaxChatCompletion
}

View File

@@ -61,7 +61,7 @@ class ModelProviderFactory:
# return providers
return providers
def provider_credentials_validate(self, provider: str, credentials: dict) -> None:
def provider_credentials_validate(self, provider: str, credentials: dict) -> dict:
"""
Validate provider credentials
@@ -80,13 +80,15 @@ class ModelProviderFactory:
# validate provider credential schema
validator = ProviderCredentialSchemaValidator(provider_credential_schema)
validator.validate_and_filter(credentials)
filtered_credentials = validator.validate_and_filter(credentials)
# validate the credentials, raise exception if validation failed
model_provider_instance.validate_provider_credentials(credentials)
model_provider_instance.validate_provider_credentials(filtered_credentials)
return filtered_credentials
def model_credentials_validate(self, provider: str, model_type: ModelType,
model: str, credentials: dict) -> None:
model: str, credentials: dict) -> dict:
"""
Validate model credentials
@@ -107,13 +109,15 @@ class ModelProviderFactory:
# validate model credential schema
validator = ModelCredentialSchemaValidator(model_type, model_credential_schema)
validator.validate_and_filter(credentials)
filtered_credentials = validator.validate_and_filter(credentials)
# get model instance of the model type
model_instance = model_provider_instance.get_model_instance(model_type)
# call validate_credentials method of model type to validate credentials, raise exception if validation failed
model_instance.validate_credentials(model, credentials)
model_instance.validate_credentials(model, filtered_credentials)
return filtered_credentials
def get_models(self,
provider: Optional[str] = None,

View File

@@ -765,7 +765,6 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
num_tokens = 0
for tool in tools:
num_tokens += len(encoding.encode('type'))
num_tokens += len(encoding.encode(tool.get("type")))
num_tokens += len(encoding.encode('function'))
# calculate num tokens for function object

View File

@@ -1,8 +1,8 @@
from http import HTTPStatus
from typing import Generator, List, Optional, Union
import dashscope
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta
from dashscope import get_tokenizer
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta, LLMMode
from core.model_runtime.entities.message_entities import (AssistantPromptMessage, PromptMessage, PromptMessageTool,
SystemPromptMessage, UserPromptMessage)
from core.model_runtime.errors.invoke import (InvokeAuthorizationError, InvokeBadRequestError, InvokeConnectionError,
@@ -51,19 +51,12 @@ class TongyiLargeLanguageModel(LargeLanguageModel):
:param tools: tools for tool calling
:return:
"""
# transform credentials to kwargs for model instance
credentials_kwargs = self._to_credential_kwargs(credentials)
tokenizer = get_tokenizer(model)
response = dashscope.Tokenization.call(
model=model,
prompt=self._convert_messages_to_prompt(prompt_messages),
**credentials_kwargs
)
if response.status_code == HTTPStatus.OK:
return response['usage']['input_tokens']
else:
raise self._invoke_error_mapping[InvokeBadRequestError][0](response['message'])
# convert string to token ids
tokens = tokenizer.encode(self._convert_messages_to_prompt(prompt_messages))
return len(tokens)
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
@@ -119,14 +112,22 @@ class TongyiLargeLanguageModel(LargeLanguageModel):
params = {
'model': model,
'prompt': self._convert_messages_to_prompt(prompt_messages),
**model_parameters,
**credentials_kwargs
}
mode = self.get_model_mode(model, credentials)
if mode == LLMMode.CHAT:
params['messages'] = self._convert_prompt_messages_to_tongyi_messages(prompt_messages)
else:
params['prompt'] = self._convert_messages_to_prompt(prompt_messages)
if stream:
responses = stream_generate_with_retry(
client,
stream=True,
incremental_output=True,
**params
)
@@ -267,6 +268,35 @@ class TongyiLargeLanguageModel(LargeLanguageModel):
# trim off the trailing ' ' that might come from the "Assistant: "
return text.rstrip()
def _convert_prompt_messages_to_tongyi_messages(self, prompt_messages: list[PromptMessage]) -> list[dict]:
"""
Convert prompt messages to tongyi messages
:param prompt_messages: prompt messages
:return: tongyi messages
"""
tongyi_messages = []
for prompt_message in prompt_messages:
if isinstance(prompt_message, SystemPromptMessage):
tongyi_messages.append({
'role': 'system',
'content': prompt_message.content,
})
elif isinstance(prompt_message, UserPromptMessage):
tongyi_messages.append({
'role': 'user',
'content': prompt_message.content,
})
elif isinstance(prompt_message, AssistantPromptMessage):
tongyi_messages.append({
'role': 'assistant',
'content': prompt_message.content,
})
else:
raise ValueError(f"Got unknown type {prompt_message}")
return tongyi_messages
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""

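The Tongyi change above replaces a remote `dashscope.Tokenization.call` round-trip with dashscope's bundled tokenizer, so counting prompt tokens no longer needs credentials or a network call (hence the `dashscope[tokenizer]` extra in the requirements change further down). A rough sketch of the new local path, with a placeholder model name:

```python
from dashscope import get_tokenizer  # needs the dashscope[tokenizer] extra


def count_prompt_tokens(model: str, prompt: str) -> int:
    tokenizer = get_tokenizer(model)      # e.g. model="qwen-max"
    return len(tokenizer.encode(prompt))  # token ids computed locally
```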
View File

@@ -0,0 +1,59 @@
model: qwen-max-1201
label:
en_US: qwen-max-1201
model_type: llm
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
default: 1.0
min: 0.0
max: 2.0
help:
zh_Hans: 用于控制随机性和多样性的程度。具体来说temperature值控制了生成文本时对每个候选词的概率分布进行平滑的程度。较高的temperature值会降低概率分布的峰值使得更多的低概率词被选择生成结果更加多样化而较低的temperature值则会增强概率分布的峰值使得高概率词更容易被选择生成结果更加确定。
en_US: Used to control the degree of randomness and diversity. Specifically, the temperature value controls the degree to which the probability distribution of each candidate word is smoothed when generating text. A higher temperature value will reduce the peak value of the probability distribution, allowing more low-probability words to be selected, and the generated results will be more diverse; while a lower temperature value will enhance the peak value of the probability distribution, making it easier for high-probability words to be selected. , the generated results are more certain.
- name: top_p
use_template: top_p
default: 0.8
min: 0.1
max: 0.9
help:
zh_Hans: 生成过程中核采样方法概率阈值例如取值为0.8时仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为0,1.0),取值越大,生成的随机性越高;取值越低,生成的确定性越高。
en_US: The probability threshold of the kernel sampling method during the generation process. For example, when the value is 0.8, only the smallest set of the most likely tokens with a sum of probabilities greater than or equal to 0.8 is retained as the candidate set. The value range is (0,1.0). The larger the value, the higher the randomness generated; the lower the value, the higher the certainty generated.
- name: max_tokens
use_template: max_tokens
default: 1500
min: 1
max: 6000
help:
zh_Hans: 用于限制模型生成token的数量max_tokens设置的是生成上限并不表示一定会生成这么多的token数量。
en_US: It is used to limit the number of tokens generated by the model. max_tokens sets the upper limit of generation, which does not mean that so many tokens will be generated.
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 生成时采样候选集的大小。例如取值为50时仅将单次生成中得分最高的50个token组成随机采样的候选集。取值越大生成的随机性越高取值越小生成的确定性越高。默认不传递该参数取值为None或当top_k大于100时表示不启用top_k策略此时仅有top_p策略生效。
en_US: The size of the sample candidate set when generated. For example, when the value is 50, only the 50 highest-scoring tokens in a single generation form a randomly sampled candidate set. The larger the value, the higher the randomness generated; the smaller the value, the higher the certainty generated. This parameter is not passed by default. The value is None or when top_k is greater than 100, it means that the top_k policy is not enabled. At this time, only the top_p policy takes effect.
required: false
- name: seed
label:
zh_Hans: 随机种子
en_US: Random seed
type: int
help:
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types.
required: false
- name: repetition_penalty
label:
en_US: Repetition penalty
type: float
default: 1.1
help:
zh_Hans: 用于控制模型生成时的重复度。提高repetition_penalty时可以降低模型生成的重复度。1.0表示不做惩罚。
en_US: Used to control the repetition of model generation. Increasing the repetition_penalty can reduce the repetition of model generation. 1.0 means no punishment.
required: false

View File

@@ -0,0 +1,59 @@
model: qwen-max-longcontext
label:
en_US: qwen-max-longcontext
model_type: llm
model_properties:
mode: chat
context_size: 30000
parameter_rules:
- name: temperature
use_template: temperature
default: 1.0
min: 0.0
max: 2.0
help:
zh_Hans: 用于控制随机性和多样性的程度。具体来说temperature值控制了生成文本时对每个候选词的概率分布进行平滑的程度。较高的temperature值会降低概率分布的峰值使得更多的低概率词被选择生成结果更加多样化而较低的temperature值则会增强概率分布的峰值使得高概率词更容易被选择生成结果更加确定。
en_US: Used to control the degree of randomness and diversity. Specifically, the temperature value controls the degree to which the probability distribution of each candidate word is smoothed when generating text. A higher temperature value will reduce the peak value of the probability distribution, allowing more low-probability words to be selected, and the generated results will be more diverse; while a lower temperature value will enhance the peak value of the probability distribution, making it easier for high-probability words to be selected. , the generated results are more certain.
- name: top_p
use_template: top_p
default: 0.8
min: 0.1
max: 0.9
help:
zh_Hans: 生成过程中核采样方法概率阈值例如取值为0.8时仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为0,1.0),取值越大,生成的随机性越高;取值越低,生成的确定性越高。
en_US: The probability threshold of the kernel sampling method during the generation process. For example, when the value is 0.8, only the smallest set of the most likely tokens with a sum of probabilities greater than or equal to 0.8 is retained as the candidate set. The value range is (0,1.0). The larger the value, the higher the randomness generated; the lower the value, the higher the certainty generated.
- name: max_tokens
use_template: max_tokens
default: 2000
min: 1
max: 28000
help:
zh_Hans: 用于限制模型生成token的数量max_tokens设置的是生成上限并不表示一定会生成这么多的token数量。
en_US: It is used to limit the number of tokens generated by the model. max_tokens sets the upper limit of generation, which does not mean that so many tokens will be generated.
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 生成时采样候选集的大小。例如取值为50时仅将单次生成中得分最高的50个token组成随机采样的候选集。取值越大生成的随机性越高取值越小生成的确定性越高。默认不传递该参数取值为None或当top_k大于100时表示不启用top_k策略此时仅有top_p策略生效。
en_US: The size of the sample candidate set when generated. For example, when the value is 50, only the 50 highest-scoring tokens in a single generation form a randomly sampled candidate set. The larger the value, the higher the randomness generated; the smaller the value, the higher the certainty generated. This parameter is not passed by default. The value is None or when top_k is greater than 100, it means that the top_k policy is not enabled. At this time, only the top_p policy takes effect.
required: false
- name: seed
label:
zh_Hans: 随机种子
en_US: Random seed
type: int
help:
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types.
required: false
- name: repetition_penalty
label:
en_US: Repetition penalty
type: float
default: 1.1
help:
zh_Hans: 用于控制模型生成时的重复度。提高repetition_penalty时可以降低模型生成的重复度。1.0表示不做惩罚。
en_US: Used to control the repetition of model generation. Increasing the repetition_penalty can reduce the repetition of model generation. 1.0 means no punishment.
required: false

View File

@@ -0,0 +1,59 @@
model: qwen-max
label:
en_US: qwen-max
model_type: llm
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
default: 1.0
min: 0.0
max: 2.0
help:
zh_Hans: 用于控制随机性和多样性的程度。具体来说temperature值控制了生成文本时对每个候选词的概率分布进行平滑的程度。较高的temperature值会降低概率分布的峰值使得更多的低概率词被选择生成结果更加多样化而较低的temperature值则会增强概率分布的峰值使得高概率词更容易被选择生成结果更加确定。
en_US: Used to control the degree of randomness and diversity. Specifically, the temperature value controls the degree to which the probability distribution of each candidate word is smoothed when generating text. A higher temperature value will reduce the peak value of the probability distribution, allowing more low-probability words to be selected, and the generated results will be more diverse; while a lower temperature value will enhance the peak value of the probability distribution, making it easier for high-probability words to be selected. , the generated results are more certain.
- name: top_p
use_template: top_p
default: 0.8
min: 0.1
max: 0.9
help:
zh_Hans: 生成过程中核采样方法概率阈值例如取值为0.8时仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为0,1.0),取值越大,生成的随机性越高;取值越低,生成的确定性越高。
en_US: The probability threshold of the kernel sampling method during the generation process. For example, when the value is 0.8, only the smallest set of the most likely tokens with a sum of probabilities greater than or equal to 0.8 is retained as the candidate set. The value range is (0,1.0). The larger the value, the higher the randomness generated; the lower the value, the higher the certainty generated.
- name: max_tokens
use_template: max_tokens
default: 1500
min: 1
max: 6000
help:
zh_Hans: 用于限制模型生成token的数量max_tokens设置的是生成上限并不表示一定会生成这么多的token数量。
en_US: It is used to limit the number of tokens generated by the model. max_tokens sets the upper limit of generation, which does not mean that so many tokens will be generated.
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 生成时采样候选集的大小。例如取值为50时仅将单次生成中得分最高的50个token组成随机采样的候选集。取值越大生成的随机性越高取值越小生成的确定性越高。默认不传递该参数取值为None或当top_k大于100时表示不启用top_k策略此时仅有top_p策略生效。
en_US: The size of the sample candidate set when generated. For example, when the value is 50, only the 50 highest-scoring tokens in a single generation form a randomly sampled candidate set. The larger the value, the higher the randomness generated; the smaller the value, the higher the certainty generated. This parameter is not passed by default. The value is None or when top_k is greater than 100, it means that the top_k policy is not enabled. At this time, only the top_p policy takes effect.
required: false
- name: seed
label:
zh_Hans: 随机种子
en_US: Random seed
type: int
help:
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types.
required: false
- name: repetition_penalty
label:
en_US: Repetition penalty
type: float
default: 1.1
help:
zh_Hans: 用于控制模型生成时的重复度。提高repetition_penalty时可以降低模型生成的重复度。1.0表示不做惩罚。
en_US: Used to control the repetition of model generation. Increasing the repetition_penalty can reduce the repetition of model generation. 1.0 means no punishment.
required: false

View File

@@ -17,6 +17,8 @@ parameter_rules:
- name: top_p
use_template: top_p
default: 0.8
min: 0.1
max: 0.9
help:
zh_Hans: 生成过程中核采样方法概率阈值例如取值为0.8时仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为0,1.0),取值越大,生成的随机性越高;取值越低,生成的确定性越高。
en_US: The probability threshold of the kernel sampling method during the generation process. For example, when the value is 0.8, only the smallest set of the most likely tokens with a sum of probabilities greater than or equal to 0.8 is retained as the candidate set. The value range is (0,1.0). The larger the value, the higher the randomness generated; the lower the value, the higher the certainty generated.
@@ -24,7 +26,7 @@ parameter_rules:
use_template: max_tokens
default: 2000
min: 1
max: 2000
max: 30000
help:
zh_Hans: 用于限制模型生成token的数量max_tokens设置的是生成上限并不表示一定会生成这么多的token数量。
en_US: It is used to limit the number of tokens generated by the model. max_tokens sets the upper limit of generation, which does not mean that so many tokens will be generated.
@@ -42,10 +44,9 @@ parameter_rules:
zh_Hans: 随机种子
en_US: Random seed
type: int
default: 1234
help:
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。默认值 1234。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types. Default value 1234.
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types.
required: false
- name: repetition_penalty
label:
@@ -55,3 +56,8 @@ parameter_rules:
help:
zh_Hans: 用于控制模型生成时的重复度。提高repetition_penalty时可以降低模型生成的重复度。1.0表示不做惩罚。
en_US: Used to control the repetition of model generation. Increasing the repetition_penalty can reduce the repetition of model generation. 1.0 means no punishment.
pricing:
input: '0.02'
output: '0.02'
unit: '0.001'
currency: RMB

View File

@@ -17,6 +17,8 @@ parameter_rules:
- name: top_p
use_template: top_p
default: 0.8
min: 0.1
max: 0.9
help:
zh_Hans: 生成过程中核采样方法概率阈值例如取值为0.8时仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为0,1.0),取值越大,生成的随机性越高;取值越低,生成的确定性越高。
en_US: The probability threshold of the kernel sampling method during the generation process. For example, when the value is 0.8, only the smallest set of the most likely tokens with a sum of probabilities greater than or equal to 0.8 is retained as the candidate set. The value range is (0,1.0). The larger the value, the higher the randomness generated; the lower the value, the higher the certainty generated.
@@ -24,7 +26,7 @@ parameter_rules:
use_template: max_tokens
default: 1500
min: 1
max: 1500
max: 6000
help:
zh_Hans: 用于限制模型生成token的数量max_tokens设置的是生成上限并不表示一定会生成这么多的token数量。
en_US: It is used to limit the number of tokens generated by the model. max_tokens sets the upper limit of generation, which does not mean that so many tokens will be generated.
@@ -42,10 +44,9 @@ parameter_rules:
zh_Hans: 随机种子
en_US: Random seed
type: int
default: 1234
help:
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。默认值 1234。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types. Default value 1234.
zh_Hans: 生成时随机数的种子用于控制模型生成的随机性。如果使用相同的种子每次运行生成的结果都将相同当需要复现模型的生成结果时可以使用相同的种子。seed参数支持无符号64位整数类型。
en_US: When generating, the random number seed is used to control the randomness of model generation. If you use the same seed, the results generated by each run will be the same; when you need to reproduce the results of the model, you can use the same seed. The seed parameter supports unsigned 64-bit integer types.
required: false
- name: repetition_penalty
label:
@@ -56,3 +57,8 @@ parameter_rules:
zh_Hans: 用于控制模型生成时的重复度。提高repetition_penalty时可以降低模型生成的重复度。1.0表示不做惩罚。
en_US: Used to control the repetition of model generation. Increasing the repetition_penalty can reduce the repetition of model generation. 1.0 means no punishment.
required: false
pricing:
input: '0.008'
output: '0.008'
unit: '0.001'
currency: RMB

View File

@@ -46,7 +46,7 @@ class CommonValidator:
:return: validated credential form schema value
"""
# If the variable does not exist in credentials
if credential_form_schema.variable not in credentials:
if credential_form_schema.variable not in credentials or not credentials[credential_form_schema.variable]:
# If required is True, an exception is thrown
if credential_form_schema.required:
raise ValueError(f'Variable {credential_form_schema.variable} is required')
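
With the extra `or not credentials[...]` above, a credential field submitted as an empty string is treated the same as a missing one: required fields still raise, optional ones fall back to their defaults instead of persisting `''`. A small illustrative version of that check, with a simplified signature rather than the real validator:

```python
def validate_field(variable: str, required: bool, default, credentials: dict):
    # Empty strings now count as "not provided".
    if variable not in credentials or not credentials[variable]:
        if required:
            raise ValueError(f'Variable {variable} is required')
        return default
    return credentials[variable]


# An empty optional field falls back to its default rather than storing ''.
print(validate_field('openai_api_base', required=False, default=None,
                     credentials={'openai_api_base': ''}))  # None
```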

View File

@@ -151,6 +151,8 @@ def jsonable_encoder(
return str(obj)
if isinstance(obj, (str, int, float, type(None))):
return obj
if isinstance(obj, Decimal):
return format(obj, 'f')
if isinstance(obj, dict):
encoded_dict = {}
allowed_keys = set(obj.keys())
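
Serializing `Decimal` with `format(obj, 'f')` keeps fixed-point notation; plain `str()` switches to scientific notation for small values, which is awkward for API clients reading prices. The behavior is standard library:

```python
from decimal import Decimal

price = Decimal('0.001') * Decimal('0.000001')
print(str(price))          # '1E-9'          (scientific notation)
print(format(price, 'f'))  # '0.000000001'   (what the encoder now emits)
```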

View File

@@ -30,7 +30,7 @@ class KeywordsModeration(Moderation):
if query:
inputs['query__'] = query
keywords_list = self.config['keywords'].split('\n')
keywords_list = [keyword for keyword in self.config['keywords'].split('\n') if keyword]
flagged = self._is_violated(inputs, keywords_list)
return ModerationInputsResult(flagged=flagged, action=ModerationAction.DIRECT_OUTPUT, preset_response=preset_response)

View File

@@ -597,18 +597,28 @@ class ProviderManager:
quota_configurations = []
for provider_quota in provider_hosting_configuration.quotas:
if provider_quota.quota_type not in quota_type_to_provider_records_dict:
continue
if provider_quota.quota_type == ProviderQuotaType.FREE:
quota_configuration = QuotaConfiguration(
quota_type=provider_quota.quota_type,
quota_unit=provider_hosting_configuration.quota_unit,
quota_used=0,
quota_limit=0,
is_valid=False,
restrict_models=provider_quota.restrict_models
)
else:
continue
else:
provider_record = quota_type_to_provider_records_dict[provider_quota.quota_type]
provider_record = quota_type_to_provider_records_dict[provider_quota.quota_type]
quota_configuration = QuotaConfiguration(
quota_type=provider_quota.quota_type,
quota_unit=provider_hosting_configuration.quota_unit,
quota_used=provider_record.quota_used,
quota_limit=provider_record.quota_limit,
is_valid=provider_record.quota_limit > provider_record.quota_used or provider_record.quota_limit == -1,
restrict_models=provider_quota.restrict_models
)
quota_configuration = QuotaConfiguration(
quota_type=provider_quota.quota_type,
quota_unit=provider_hosting_configuration.quota_unit,
quota_used=provider_record.quota_used,
quota_limit=provider_record.quota_limit,
is_valid=provider_record.quota_limit > provider_record.quota_used or provider_record.quota_limit == -1,
restrict_models=provider_quota.restrict_models
)
quota_configurations.append(quota_configuration)
@@ -670,6 +680,7 @@ class ProviderManager:
current_using_credentials = cached_provider_credentials
else:
current_using_credentials = {}
quota_configurations = []
return SystemConfiguration(
enabled=True,

View File

@@ -1,8 +1,10 @@
"""Functionality for splitting text."""
from __future__ import annotations
from typing import Any, List, Optional
from typing import Any, List, Optional, cast
from core.model_manager import ModelInstance
from core.model_runtime.model_providers.__base.text_embedding_model import TextEmbeddingModel
from core.model_runtime.model_providers.__base.tokenizers.gpt2_tokenzier import GPT2Tokenizer
from langchain.text_splitter import (TS, AbstractSet, Collection, Literal, RecursiveCharacterTextSplitter,
TokenTextSplitter, Type, Union)
@@ -12,22 +14,30 @@ class EnhanceRecursiveCharacterTextSplitter(RecursiveCharacterTextSplitter):
"""
This class is used to implement from_gpt2_encoder, to prevent using of tiktoken
"""
@classmethod
def from_gpt2_encoder(
cls: Type[TS],
encoding_name: str = "gpt2",
model_name: Optional[str] = None,
allowed_special: Union[Literal["all"], AbstractSet[str]] = set(),
disallowed_special: Union[Literal["all"], Collection[str]] = "all",
**kwargs: Any,
def from_encoder(
cls: Type[TS],
embedding_model_instance: Optional[ModelInstance],
allowed_special: Union[Literal["all"], AbstractSet[str]] = set(),
disallowed_special: Union[Literal["all"], Collection[str]] = "all",
**kwargs: Any,
):
def _token_encoder(text: str) -> int:
return GPT2Tokenizer.get_num_tokens(text)
if embedding_model_instance:
embedding_model_type_instance = embedding_model_instance.model_type_instance
embedding_model_type_instance = cast(TextEmbeddingModel, embedding_model_type_instance)
return embedding_model_type_instance.get_num_tokens(
model=embedding_model_instance.model,
credentials=embedding_model_instance.credentials,
texts=[text]
)
else:
return GPT2Tokenizer.get_num_tokens(text)
if issubclass(cls, TokenTextSplitter):
extra_kwargs = {
"encoding_name": encoding_name,
"model_name": model_name,
"model_name": embedding_model_instance.model if embedding_model_instance else 'gpt2',
"allowed_special": allowed_special,
"disallowed_special": disallowed_special,
}
@@ -35,6 +45,7 @@ class EnhanceRecursiveCharacterTextSplitter(RecursiveCharacterTextSplitter):
return cls(length_function=_token_encoder, **kwargs)
class FixedRecursiveCharacterTextSplitter(EnhanceRecursiveCharacterTextSplitter):
def __init__(self, fixed_separator: str = "\n\n", separators: Optional[List[str]] = None, **kwargs: Any):
"""Create a new TextSplitter."""
@@ -90,4 +101,4 @@ class FixedRecursiveCharacterTextSplitter(EnhanceRecursiveCharacterTextSplitter)
if _good_splits:
merged_text = self._merge_splits(_good_splits, separator)
final_chunks.extend(merged_text)
return final_chunks
return final_chunks
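
With `from_encoder`, chunk lengths are measured by the tokenizer of the dataset's configured embedding model when one is available, and only fall back to the GPT-2 tokenizer otherwise; this is why the indexing runner above now passes `embedding_model_instance` into `_get_splitter`. A condensed sketch of the length-function selection, reusing the imports from this file (simplified, not the full splitter):

```python
from typing import Callable, Optional, cast

from core.model_manager import ModelInstance
from core.model_runtime.model_providers.__base.text_embedding_model import TextEmbeddingModel
from core.model_runtime.model_providers.__base.tokenizers.gpt2_tokenzier import GPT2Tokenizer


def make_length_function(embedding_model_instance: Optional[ModelInstance]) -> Callable[[str], int]:
    def _token_encoder(text: str) -> int:
        if embedding_model_instance:
            # Delegate token counting to the dataset's embedding model.
            type_instance = cast(TextEmbeddingModel, embedding_model_instance.model_type_instance)
            return type_instance.get_num_tokens(
                model=embedding_model_instance.model,
                credentials=embedding_model_instance.credentials,
                texts=[text],
            )
        # Fallback when no embedding model is configured.
        return GPT2Tokenizer.get_num_tokens(text)

    return _token_encoder
```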

View File

@@ -94,6 +94,7 @@ class DatasetMultiRetrieverTool(BaseTool):
document_context_list = []
index_node_ids = [document.metadata['doc_id'] for document in all_documents]
segments = DocumentSegment.query.filter(
DocumentSegment.dataset_id.in_(self.dataset_ids),
DocumentSegment.completed_at.isnot(None),
DocumentSegment.status == 'completed',
DocumentSegment.enabled == True,

View File

@@ -1450,7 +1450,7 @@ class Qdrant(VectorStore):
wal_config=wal_config,
quantization_config=quantization_config,
init_from=init_from,
timeout=timeout, # type: ignore[arg-type]
timeout=int(timeout), # type: ignore[arg-type]
)
is_new_collection = True
if force_recreate:

View File

@@ -23,12 +23,16 @@ def handle(sender, **kwargs):
for quota_configuration in system_configuration.quota_configurations:
if quota_configuration.quota_type == system_configuration.current_quota_type:
quota_unit = quota_configuration.quota_unit
if quota_configuration.quota_limit == -1:
return
break
used_quota = None
if quota_unit:
if quota_unit == QuotaUnit.TOKENS.value:
used_quota = message.message_tokens + message.prompt_tokens
if quota_unit == QuotaUnit.TOKENS:
used_quota = message.message_tokens + message.answer_tokens
else:
used_quota = 1
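
The quota handler above now compares against the `QuotaUnit` enum member rather than its `.value`, sums prompt and answer tokens, and returns early for unlimited quotas. A hedged sketch of the resulting deduction rule, with simplified names:

```python
from enum import Enum
from typing import Optional


class QuotaUnit(Enum):
    TIMES = 'times'
    TOKENS = 'tokens'


def used_quota(quota_unit: Optional[QuotaUnit], quota_limit: int,
               prompt_tokens: int, answer_tokens: int) -> Optional[int]:
    if quota_limit == -1:
        return None  # unlimited quota: nothing to deduct
    if quota_unit == QuotaUnit.TOKENS:
        return prompt_tokens + answer_tokens  # token-metered providers
    return 1  # call-metered providers deduct one per message


print(used_quota(QuotaUnit.TOKENS, 1_000_000, 120, 380))  # 500
```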

View File

@@ -31,6 +31,10 @@ class ExternalApi(Api):
'message': getattr(e, 'description', http_status_message(status_code)),
'status': status_code
}
if default_data['message'] and default_data['message'] == 'Failed to decode JSON object: Expecting value: line 1 column 1 (char 0)':
default_data['message'] = 'Invalid JSON payload received or JSON payload is empty.'
headers = e.get_response().headers
elif isinstance(e, ValueError):
status_code = 400

View File

@@ -9,12 +9,12 @@ flask-restful==0.3.9
flask-session2==1.3.1
flask-cors==3.0.10
gunicorn~=21.2.0
gevent~=22.10.2
gevent~=23.9.1
langchain==0.0.250
openai~=1.3.6
tiktoken~=0.5.2
psycopg2-binary~=2.9.6
pycryptodome==3.17
pycryptodome==3.19.1
python-dotenv==1.0.0
pytest~=7.3.1
pytest-mock~=3.11.1
@@ -44,14 +44,14 @@ readabilipy==0.2.0
google-search-results==2.4.2
replicate~=0.22.0
websocket-client~=1.7.0
dashscope~=1.13.5
dashscope[tokenizer]~=1.14.0
huggingface_hub~=0.16.4
transformers~=4.31.0
pandas==1.5.3
xinference-client~=0.6.4
safetensors==0.3.2
zhipuai==1.0.7
werkzeug==2.3.7
werkzeug==2.3.8
pymilvus==2.3.0
qdrant-client==1.6.4
cohere~=4.32

View File

@@ -27,10 +27,15 @@ class CompletionService:
auto_generate_name = args['auto_generate_name'] \
if 'auto_generate_name' in args else True
if app_model.mode != 'completion' and not query:
raise ValueError('query is required')
if app_model.mode != 'completion':
if not query:
raise ValueError('query is required')
query = query.replace('\x00', '')
if query:
if not isinstance(query, str):
raise ValueError('query must be a string')
query = query.replace('\x00', '')
conversation_id = args['conversation_id'] if 'conversation_id' in args else None
@@ -230,6 +235,10 @@ class CompletionService:
value = user_inputs[variable]
if value:
if not isinstance(value, str):
raise ValueError(f"{variable} in input form must be a string")
if input_type == "select":
options = input_config["options"] if "options" in input_config else []
if value not in options:
@@ -243,4 +252,3 @@ class CompletionService:
filtered_inputs[variable] = value.replace('\x00', '') if value else None
return filtered_inputs
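
The completion-service change tightens input handling: `query` is mandatory only for non-completion apps, must be a string when present, and NUL bytes are stripped before the text goes any further; the same string check is now applied to user-filled form variables. A compact sketch of that validation order, as a standalone function rather than the real service:

```python
def clean_query(app_mode: str, query):
    if app_mode != 'completion' and not query:
        raise ValueError('query is required')
    if query:
        if not isinstance(query, str):
            raise ValueError('query must be a string')
        query = query.replace('\x00', '')  # strip NUL bytes before storage
    return query


print(repr(clean_query('chat', 'hello\x00world')))  # 'helloworld'
```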

View File

@@ -327,10 +327,35 @@ def test_get_num_tokens():
UserPromptMessage(
content='Hello World!'
)
],
tools=[
PromptMessageTool(
name='get_weather',
description='Determine weather in my location',
parameters={
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"c",
"f"
]
}
},
"required": [
"location"
]
}
),
]
)
assert num_tokens == 21
assert num_tokens == 72
@pytest.mark.parametrize('setup_openai_mock', [['chat', 'remote']], indirect=True)
def test_fine_tuned_models(setup_openai_mock):

View File

@@ -2,7 +2,7 @@ version: '3.1'
services:
# API service
api:
image: langgenius/dify-api:0.4.6
image: langgenius/dify-api:0.4.8
restart: always
environment:
# Startup mode, 'api' starts the API server.
@@ -131,7 +131,7 @@ services:
# worker service
# The Celery worker for processing the queue.
worker:
image: langgenius/dify-api:0.4.6
image: langgenius/dify-api:0.4.8
restart: always
environment:
# Startup mode, 'worker' starts the Celery worker for processing the queue.
@@ -202,7 +202,7 @@ services:
# Frontend web application.
web:
image: langgenius/dify-web:0.4.6
image: langgenius/dify-web:0.4.8
restart: always
environment:
EDITION: SELF_HOSTED
@@ -287,7 +287,7 @@ services:
# (if uncommented, you need to comment out the weaviate service above,
# and set VECTOR_STORE to qdrant in the api & worker service.)
# qdrant:
# image: langgenius/qdrant:1.7.3
# image: langgenius/qdrant:v1.7.3
# restart: always
# volumes:
# - ./volumes/qdrant:/qdrant/storage

Binary file not shown (image asset updated: 80 KiB before, 82 KiB after).

View File

@@ -55,8 +55,8 @@ const LikedItem = ({
isMobile,
}: ILikedItemProps) => {
return (
<Link className={classNames(s.itemWrapper, 'px-0 sm:px-3 justify-center sm:justify-start')} href={`/app/${detail?.id}/overview`}>
<div className={classNames(s.iconWrapper, 'mr-0 sm:mr-2')}>
<Link className={classNames(s.itemWrapper, 'px-0', isMobile && 'justify-center')} href={`/app/${detail?.id}/overview`}>
<div className={classNames(s.iconWrapper, 'mr-0')}>
<AppIcon size='tiny' icon={detail?.icon} background={detail?.icon_background}/>
{type === 'app' && (
<div className={s.statusPoint}>
@@ -64,7 +64,7 @@ const LikedItem = ({
</div>
)}
</div>
{!isMobile && <div className={s.appInfo}>{detail?.name || '--'}</div>}
{!isMobile && <div className={classNames(s.appInfo, 'ml-2')}>{detail?.name || '--'}</div>}
</Link>
)
}
@@ -197,14 +197,14 @@ const DatasetDetailLayout: FC<IAppDetailLayoutProps> = (props) => {
return <Loading />
return (
<div className='flex overflow-hidden'>
<div className='grow flex overflow-hidden'>
{!hideSideBar && <AppSideBar
title={datasetRes?.name || '--'}
icon={datasetRes?.icon || 'https://static.dify.ai/images/dataset-default-icon.png'}
icon_background={datasetRes?.icon_background || '#F5F5F5'}
desc={datasetRes?.description || '--'}
navigation={navigation}
extraInfo={<ExtraInfo isMobile={isMobile} relatedApps={relatedApps} />}
extraInfo={mode => <ExtraInfo isMobile={mode === 'collapse'} relatedApps={relatedApps} />}
iconType={datasetRes?.data_source_type === DataSourceType.NOTION ? 'notion' : 'dataset'}
/>}
<DatasetDetailContext.Provider value={{

View File

@@ -539,7 +539,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, Paragraph } from
---
<Heading
url='/datasets/{dataset_id}/batch/{batch}/indexing-status'
url='/datasets/{dataset_id}/documents/{batch}/indexing-status'
method='GET'
title='Get document embedding status (progress)'
name='#indexing_status'
@@ -560,7 +560,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, Paragraph } from
<CodeGroup
title="Request"
tag="GET"
label="/datasets/{dataset_id}/batch/{batch}/indexing-status"
label="/datasets/{dataset_id}/documents/{batch}/indexing-status"
targetCode={`curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{batch}/indexing-status' \\\n--header 'Authorization: Bearer {api_key}'`}
>
```bash {{ title: 'cURL' }}

View File

@@ -539,7 +539,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, Paragraph } from
---
<Heading
url='/datasets/{dataset_id}/batch/{batch}/indexing-status'
url='/datasets/{dataset_id}/documents/{batch}/indexing-status'
method='GET'
title='获取文档嵌入状态(进度)'
name='#indexing_status'
@@ -560,7 +560,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, Paragraph } from
<CodeGroup
title="Request"
tag="GET"
label="/datasets/{dataset_id}/batch/{batch}/indexing-status"
label="/datasets/{dataset_id}/documents/{batch}/indexing-status"
targetCode={`curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{batch}/indexing-status' \\\n--header 'Authorization: Bearer {api_key}'`}
>
```bash {{ title: 'cURL' }}

View File

@@ -20,7 +20,7 @@ export type IAppDetailNavProps = {
icon: NavIcon
selectedIcon: NavIcon
}>
extraInfo?: React.ReactNode
extraInfo?: (modeState: string) => React.ReactNode
}
const AppDetailNav = ({ title, desc, icon, icon_background, navigation, extraInfo, iconType = 'app' }: IAppDetailNavProps) => {
@@ -72,7 +72,7 @@ const AppDetailNav = ({ title, desc, icon, icon_background, navigation, extraInf
<NavLink key={index} mode={modeState} iconMap={{ selected: item.selectedIcon, normal: item.icon }} name={item.name} href={item.href} />
)
})}
{extraInfo ?? null}
{extraInfo && extraInfo(modeState)}
</nav>
{
!isMobile && (

View File

@@ -46,7 +46,11 @@ const Log: FC<LogProps> = ({
`}>
<div
className='flex items-center justify-center rounded-md w-full h-full hover:bg-gray-100'
onClick={() => setShowModal(true)}
onClick={(e) => {
e.stopPropagation()
setShowModal(true)
}
}
>
<File02 className='w-4 h-4 text-gray-500' />
</div>

View File

@@ -103,7 +103,7 @@ const ParamsConfig: FC = () => {
const config = { ...tempDataSetConfigs }
if (config.retrieval_model === RETRIEVE_TYPE.multiWay && !config.reranking_model) {
config.reranking_model = {
reranking_provider_name: rerankDefaultModel?.provider,
reranking_provider_name: rerankDefaultModel?.provider?.provider,
reranking_model_name: rerankDefaultModel?.model,
} as any
}

View File

@@ -107,7 +107,6 @@ const PlanItem: FC<Props> = ({
<div>{t('billing.plansCommon.supportItems.emailSupport')}</div>
<div className='mt-3.5 flex items-center space-x-1'>
<div>+ {t('billing.plansCommon.supportItems.logoChange')}</div>
<div>{comingSoon}</div>
</div>
<div className='mt-3.5 flex items-center space-x-1'>
<div className='flex items-center'>
@@ -135,7 +134,6 @@ const PlanItem: FC<Props> = ({
<div>{t('billing.plansCommon.supportItems.priorityEmail')}</div>
<div className='mt-3.5 flex items-center space-x-1'>
<div>+ {t('billing.plansCommon.supportItems.logoChange')}</div>
<div>{comingSoon}</div>
</div>
<div className='mt-3.5 flex items-center space-x-1'>
<div>+ {t('billing.plansCommon.supportItems.SSOAuthentication')}</div>

View File

@@ -38,7 +38,13 @@ const ModelProviderPage = () => {
const notConfigedProviders: ModelProvider[] = []
providers.forEach((provider) => {
if (provider.custom_configuration.status === CustomConfigurationStatusEnum.active || provider.system_configuration.enabled === true)
if (
provider.custom_configuration.status === CustomConfigurationStatusEnum.active
|| (
provider.system_configuration.enabled === true
&& provider.system_configuration.quota_configurations.find(item => item.quota_type === provider.system_configuration.current_quota_type)
)
)
configedProviders.push(provider)
else
notConfigedProviders.push(provider)

View File

@@ -9,6 +9,8 @@ import type {
import { ConfigurateMethodEnum } from '../declarations'
import {
DEFAULT_BACKGROUND_COLOR,
MODEL_PROVIDER_QUOTA_GET_FREE,
MODEL_PROVIDER_QUOTA_GET_PAID,
modelTypeFormat,
} from '../utils'
import ProviderIcon from '../provider-icon'
@@ -41,7 +43,7 @@ const ProviderAddedCard: FC<ProviderAddedCardProps> = ({
const configurateMethods = provider.configurate_methods.filter(method => method !== ConfigurateMethodEnum.fetchFromRemote)
const systemConfig = provider.system_configuration
const hasModelList = fetched && !!modelList.length
const showQuota = systemConfig.enabled && ['minimax', 'spark', 'zhipuai', 'anthropic', 'openai'].includes(provider.provider) && !IS_CE_EDITION
const showQuota = systemConfig.enabled && [...MODEL_PROVIDER_QUOTA_GET_FREE, ...MODEL_PROVIDER_QUOTA_GET_PAID].includes(provider.provider) && !IS_CE_EDITION
const getModelList = async (providerName: string) => {
if (loading)

View File

@@ -11,6 +11,10 @@ import {
useFreeQuota,
useUpdateModelProviders,
} from '../hooks'
import {
MODEL_PROVIDER_QUOTA_GET_FREE,
MODEL_PROVIDER_QUOTA_GET_PAID,
} from '../utils'
import PriorityUseTip from './priority-use-tip'
import { InfoCircle } from '@/app/components/base/icons/src/vender/line/general'
import Button from '@/app/components/base/button'
@@ -34,7 +38,7 @@ const QuotaPanel: FC<QuotaPanelProps> = ({
const priorityUseType = provider.preferred_provider_type
const systemConfig = provider.system_configuration
const currentQuota = systemConfig.enabled && systemConfig.quota_configurations.find(item => item.quota_type === systemConfig.current_quota_type)
const openaiOrAnthropic = ['openai', 'anthropic'].includes(provider.provider)
const openaiOrAnthropic = MODEL_PROVIDER_QUOTA_GET_PAID.includes(provider.provider)
return (
<div className='group relative shrink-0 min-w-[112px] px-3 py-2 rounded-lg bg-white/[0.3] border-[0.5px] border-black/5'>
@@ -72,7 +76,7 @@ const QuotaPanel: FC<QuotaPanelProps> = ({
)
}
{
!currentQuota && ['minimax', 'spark', 'zhipuai'].includes(provider.provider) && (
!currentQuota && MODEL_PROVIDER_QUOTA_GET_FREE.includes(provider.provider) && (
<Button
className='h-6 bg-white text-xs font-medium rounded-md'
onClick={() => handleFreeQuota(provider.provider)}

View File

@@ -7,6 +7,7 @@ import type {
import { ConfigurateMethodEnum } from '../declarations'
import {
DEFAULT_BACKGROUND_COLOR,
MODEL_PROVIDER_QUOTA_GET_FREE,
modelTypeFormat,
} from '../utils'
import {
@@ -55,7 +56,7 @@ const ProviderCard: FC<ProviderCardProps> = ({
}
const handleFreeQuota = useFreeQuota(handleFreeQuotaSuccess)
const configurateMethods = provider.configurate_methods.filter(method => method !== ConfigurateMethodEnum.fetchFromRemote)
const canGetFreeQuota = ['mininmax', 'spark', 'zhipuai'].includes(provider.provider) && !IS_CE_EDITION
const canGetFreeQuota = MODEL_PROVIDER_QUOTA_GET_FREE.includes(provider.provider) && !IS_CE_EDITION && provider.system_configuration.enabled
return (
<div

View File

@@ -23,6 +23,9 @@ export const languageMaps = {
'zh-Hans': 'zh_Hans'
}
export const MODEL_PROVIDER_QUOTA_GET_FREE = ['minimax', 'spark', 'zhipuai']
export const MODEL_PROVIDER_QUOTA_GET_PAID = ['anthropic', 'openai']
export const DEFAULT_BACKGROUND_COLOR = '#F3F4F6'
export const isNullOrUndefined = (value: any) => {

View File

@@ -1,6 +1,6 @@
{
"name": "dify-web",
"version": "0.4.6",
"version": "0.4.8",
"private": true,
"scripts": {
"dev": "next dev",
@@ -74,6 +74,7 @@
"remark-math": "^5.1.1",
"scheduler": "^0.23.0",
"server-only": "^0.0.1",
"sharp": "^0.33.2",
"sortablejs": "^1.15.0",
"swr": "^2.1.0",
"use-context-selector": "^1.4.1"

View File

@@ -112,6 +112,13 @@
resolved "https://registry.npmjs.org/@braintree/sanitize-url/-/sanitize-url-6.0.4.tgz"
integrity sha512-s3jaWicZd0pkP0jf5ysyHUI/RE7MHos6qlToFcGWXVp+ykHOy77OUMrfbgJ9it2C5bow7OIQwYYaHjk9XlBQ2A==
"@emnapi/runtime@^0.45.0":
version "0.45.0"
resolved "https://registry.yarnpkg.com/@emnapi/runtime/-/runtime-0.45.0.tgz#e754de04c683263f34fd0c7f32adfe718bbe4ddd"
integrity sha512-Txumi3td7J4A/xTTwlssKieHKTGl3j4A1tglBx72auZ49YK7ePY6XZricgIg9mnZT4xPfA+UPCUdnhRuEFDL+w==
dependencies:
tslib "^2.4.0"
"@emoji-mart/data@^1.1.2":
version "1.1.2"
resolved "https://registry.npmjs.org/@emoji-mart/data/-/data-1.1.2.tgz"
@@ -235,6 +242,119 @@
resolved "https://registry.npmjs.org/@humanwhocodes/object-schema/-/object-schema-1.2.1.tgz"
integrity sha512-ZnQMnLV4e7hDlUvw8H+U8ASL02SS2Gn6+9Ac3wGGLIe7+je2AeAOxPY+izIPJDfFDb7eDjev0Us8MO1iFRN8hA==
"@img/sharp-darwin-arm64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-darwin-arm64/-/sharp-darwin-arm64-0.33.2.tgz#0a52a82c2169112794dac2c71bfba9e90f7c5bd1"
integrity sha512-itHBs1rPmsmGF9p4qRe++CzCgd+kFYktnsoR1sbIAfsRMrJZau0Tt1AH9KVnufc2/tU02Gf6Ibujx+15qRE03w==
optionalDependencies:
"@img/sharp-libvips-darwin-arm64" "1.0.1"
"@img/sharp-darwin-x64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.33.2.tgz#982e26bb9d38a81f75915c4032539aed621d1c21"
integrity sha512-/rK/69Rrp9x5kaWBjVN07KixZanRr+W1OiyKdXcbjQD6KbW+obaTeBBtLUAtbBsnlTTmWthw99xqoOS7SsySDg==
optionalDependencies:
"@img/sharp-libvips-darwin-x64" "1.0.1"
"@img/sharp-libvips-darwin-arm64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.0.1.tgz#81e83ffc2c497b3100e2f253766490f8fad479cd"
integrity sha512-kQyrSNd6lmBV7O0BUiyu/OEw9yeNGFbQhbxswS1i6rMDwBBSX+e+rPzu3S+MwAiGU3HdLze3PanQ4Xkfemgzcw==
"@img/sharp-libvips-darwin-x64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.0.1.tgz#fc1fcd9d78a178819eefe2c1a1662067a83ab1d6"
integrity sha512-eVU/JYLPVjhhrd8Tk6gosl5pVlvsqiFlt50wotCvdkFGf+mDNBJxMh+bvav+Wt3EBnNZWq8Sp2I7XfSjm8siog==
"@img/sharp-libvips-linux-arm64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.0.1.tgz#26eb8c556a9b0db95f343fc444abc3effb67ebcf"
integrity sha512-bnGG+MJjdX70mAQcSLxgeJco11G+MxTz+ebxlz8Y3dxyeb3Nkl7LgLI0mXupoO+u1wRNx/iRj5yHtzA4sde1yA==
"@img/sharp-libvips-linux-arm@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.0.1.tgz#2a377b959ff7dd6528deee486c25461296a4fa8b"
integrity sha512-FtdMvR4R99FTsD53IA3LxYGghQ82t3yt0ZQ93WMZ2xV3dqrb0E8zq4VHaTOuLEAuA83oDawHV3fd+BsAPadHIQ==
"@img/sharp-libvips-linux-s390x@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linux-s390x/-/sharp-libvips-linux-s390x-1.0.1.tgz#af28ac9ba929204467ecdf843330d791e9421e10"
integrity sha512-3+rzfAR1YpMOeA2zZNp+aYEzGNWK4zF3+sdMxuCS3ey9HhDbJ66w6hDSHDMoap32DueFwhhs3vwooAB2MaK4XQ==
"@img/sharp-libvips-linux-x64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.0.1.tgz#4273d182aa51912e655e1214ea47983d7c1f7f8d"
integrity sha512-3NR1mxFsaSgMMzz1bAnnKbSAI+lHXVTqAHgc1bgzjHuXjo4hlscpUxc0vFSAPKI3yuzdzcZOkq7nDPrP2F8Jgw==
"@img/sharp-libvips-linuxmusl-arm64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linuxmusl-arm64/-/sharp-libvips-linuxmusl-arm64-1.0.1.tgz#d150c92151cea2e8d120ad168b9c358d09c77ce8"
integrity sha512-5aBRcjHDG/T6jwC3Edl3lP8nl9U2Yo8+oTl5drd1dh9Z1EBfzUKAJFUDTDisDjUwc7N4AjnPGfCA3jl3hY8uDg==
"@img/sharp-libvips-linuxmusl-x64@1.0.1":
version "1.0.1"
resolved "https://registry.yarnpkg.com/@img/sharp-libvips-linuxmusl-x64/-/sharp-libvips-linuxmusl-x64-1.0.1.tgz#e297c1a4252c670d93b0f9e51fca40a7a5b6acfd"
integrity sha512-dcT7inI9DBFK6ovfeWRe3hG30h51cBAP5JXlZfx6pzc/Mnf9HFCQDLtYf4MCBjxaaTfjCCjkBxcy3XzOAo5txw==
"@img/sharp-linux-arm64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.33.2.tgz#af3409f801a9bee1d11d0c7e971dcd6180f80022"
integrity sha512-pz0NNo882vVfqJ0yNInuG9YH71smP4gRSdeL09ukC2YLE6ZyZePAlWKEHgAzJGTiOh8Qkaov6mMIMlEhmLdKew==
optionalDependencies:
"@img/sharp-libvips-linux-arm64" "1.0.1"
"@img/sharp-linux-arm@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linux-arm/-/sharp-linux-arm-0.33.2.tgz#181f7466e6ac074042a38bfb679eb82505e17083"
integrity sha512-Fndk/4Zq3vAc4G/qyfXASbS3HBZbKrlnKZLEJzPLrXoJuipFNNwTes71+Ki1hwYW5lch26niRYoZFAtZVf3EGA==
optionalDependencies:
"@img/sharp-libvips-linux-arm" "1.0.1"
"@img/sharp-linux-s390x@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linux-s390x/-/sharp-linux-s390x-0.33.2.tgz#9c171f49211f96fba84410b3e237b301286fa00f"
integrity sha512-MBoInDXDppMfhSzbMmOQtGfloVAflS2rP1qPcUIiITMi36Mm5YR7r0ASND99razjQUpHTzjrU1flO76hKvP5RA==
optionalDependencies:
"@img/sharp-libvips-linux-s390x" "1.0.1"
"@img/sharp-linux-x64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linux-x64/-/sharp-linux-x64-0.33.2.tgz#b956dfc092adc58c2bf0fae2077e6f01a8b2d5d7"
integrity sha512-xUT82H5IbXewKkeF5aiooajoO1tQV4PnKfS/OZtb5DDdxS/FCI/uXTVZ35GQ97RZXsycojz/AJ0asoz6p2/H/A==
optionalDependencies:
"@img/sharp-libvips-linux-x64" "1.0.1"
"@img/sharp-linuxmusl-arm64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linuxmusl-arm64/-/sharp-linuxmusl-arm64-0.33.2.tgz#10e0ec5a79d1234c6a71df44c9f3b0bef0bc0f15"
integrity sha512-F+0z8JCu/UnMzg8IYW1TMeiViIWBVg7IWP6nE0p5S5EPQxlLd76c8jYemG21X99UzFwgkRo5yz2DS+zbrnxZeA==
optionalDependencies:
"@img/sharp-libvips-linuxmusl-arm64" "1.0.1"
"@img/sharp-linuxmusl-x64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-linuxmusl-x64/-/sharp-linuxmusl-x64-0.33.2.tgz#29e0030c24aa27c38201b1fc84e3d172899fcbe0"
integrity sha512-+ZLE3SQmSL+Fn1gmSaM8uFusW5Y3J9VOf+wMGNnTtJUMUxFhv+P4UPaYEYT8tqnyYVaOVGgMN/zsOxn9pSsO2A==
optionalDependencies:
"@img/sharp-libvips-linuxmusl-x64" "1.0.1"
"@img/sharp-wasm32@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-wasm32/-/sharp-wasm32-0.33.2.tgz#38d7c740a22de83a60ad1e6bcfce17462b0d4230"
integrity sha512-fLbTaESVKuQcpm8ffgBD7jLb/CQLcATju/jxtTXR1XCLwbOQt+OL5zPHSDMmp2JZIeq82e18yE0Vv7zh6+6BfQ==
dependencies:
"@emnapi/runtime" "^0.45.0"
"@img/sharp-win32-ia32@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-win32-ia32/-/sharp-win32-ia32-0.33.2.tgz#09456314e223f68e5417c283b45c399635c16202"
integrity sha512-okBpql96hIGuZ4lN3+nsAjGeggxKm7hIRu9zyec0lnfB8E7Z6p95BuRZzDDXZOl2e8UmR4RhYt631i7mfmKU8g==
"@img/sharp-win32-x64@0.33.2":
version "0.33.2"
resolved "https://registry.yarnpkg.com/@img/sharp-win32-x64/-/sharp-win32-x64-0.33.2.tgz#148e96dfd6e68747da41a311b9ee4559bb1b1471"
integrity sha512-E4magOks77DK47FwHUIGH0RYWSgRBfGdK56kIHSVeB9uIS4pPFr4N2kIVsXdQQo4LzOsENKV5KAhRlRL7eMAdg==
"@jridgewell/gen-mapping@^0.3.2":
version "0.3.3"
resolved "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.3.tgz"
@@ -1581,11 +1701,27 @@ color-name@1.1.3:
resolved "https://registry.npmjs.org/color-name/-/color-name-1.1.3.tgz"
integrity sha512-72fSenhMw2HZMTVHeCA9KCmpEIbzWiQsjN+BHcBbS9vr1mtt+vJjPdksIBNUmKAW8TFUDPJK5SUU3QhE9NEXDw==
color-name@~1.1.4:
color-name@^1.0.0, color-name@~1.1.4:
version "1.1.4"
resolved "https://registry.npmjs.org/color-name/-/color-name-1.1.4.tgz"
integrity sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==
color-string@^1.9.0:
version "1.9.1"
resolved "https://registry.yarnpkg.com/color-string/-/color-string-1.9.1.tgz#4467f9146f036f855b764dfb5bf8582bf342c7a4"
integrity sha512-shrVawQFojnZv6xM40anx4CkoDP+fZsw/ZerEMsW/pyzsRbElpsL/DBVW7q3ExxwusdNXI3lXpuhEZkzs8p5Eg==
dependencies:
color-name "^1.0.0"
simple-swizzle "^0.2.2"
color@^4.2.3:
version "4.2.3"
resolved "https://registry.yarnpkg.com/color/-/color-4.2.3.tgz#d781ecb5e57224ee43ea9627560107c0e0c6463a"
integrity sha512-1rXeuUUiGGrykh+CeBdu5Ie7OJwinCgQY0bc7GCRxy5xVHy+moaqkpL/jqQq0MtQOeYcrqEz4abc5f0KtU7W4A==
dependencies:
color-convert "^2.0.1"
color-string "^1.9.0"
colorette@^2.0.19:
version "2.0.20"
resolved "https://registry.npmjs.org/colorette/-/colorette-2.0.20.tgz"
@@ -2076,6 +2212,11 @@ dequal@^2.0.0, dequal@^2.0.3:
resolved "https://registry.npmjs.org/dequal/-/dequal-2.0.3.tgz"
integrity sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA==
detect-libc@^2.0.2:
version "2.0.2"
resolved "https://registry.yarnpkg.com/detect-libc/-/detect-libc-2.0.2.tgz#8ccf2ba9315350e1241b88d0ac3b0e1fbd99605d"
integrity sha512-UX6sGumvvqSaXgdKGUsgZWqcUyIXZ/vZTrlRT/iobiKhGL0zL4d3osHj3uqllWJK+i+sixDS/3COVEOFbupFyw==
didyoumean@^1.2.2:
version "1.2.2"
resolved "https://registry.npmjs.org/didyoumean/-/didyoumean-1.2.2.tgz"
@@ -3521,6 +3662,11 @@ is-arrayish@^0.2.1:
resolved "https://registry.npmjs.org/is-arrayish/-/is-arrayish-0.2.1.tgz"
integrity sha512-zz06S8t0ozoDXMG+ube26zeCTNXcKIPJZJi8hBrF4idCLms4CG9QtK7qBl1boi5ODzFpjswb5JPmHCbMpjaYzg==
is-arrayish@^0.3.1:
version "0.3.2"
resolved "https://registry.yarnpkg.com/is-arrayish/-/is-arrayish-0.3.2.tgz#4574a2ae56f7ab206896fb431eaeed066fdf8f03"
integrity sha512-eVRqCvVlZbuw3GrM63ovNSNAeA1K16kaR/LRY/92w0zxQ5/1YzwblUX652i4Xs9RwAGjW9d9y6X88t8OaAJfWQ==
is-async-function@^2.0.0:
version "2.0.0"
resolved "https://registry.yarnpkg.com/is-async-function/-/is-async-function-2.0.0.tgz#8e4418efd3e5d3a6ebb0164c05ef5afb69aa9646"
@@ -6072,6 +6218,35 @@ set-function-name@^2.0.0, set-function-name@^2.0.1:
functions-have-names "^1.2.3"
has-property-descriptors "^1.0.0"
sharp@^0.33.2:
version "0.33.2"
resolved "https://registry.yarnpkg.com/sharp/-/sharp-0.33.2.tgz#fcd52f2c70effa8a02160b1bfd989a3de55f2dfb"
integrity sha512-WlYOPyyPDiiM07j/UO+E720ju6gtNtHjEGg5vovUk1Lgxyjm2LFO+37Nt/UI3MMh2l6hxTWQWi7qk3cXJTutcQ==
dependencies:
color "^4.2.3"
detect-libc "^2.0.2"
semver "^7.5.4"
optionalDependencies:
"@img/sharp-darwin-arm64" "0.33.2"
"@img/sharp-darwin-x64" "0.33.2"
"@img/sharp-libvips-darwin-arm64" "1.0.1"
"@img/sharp-libvips-darwin-x64" "1.0.1"
"@img/sharp-libvips-linux-arm" "1.0.1"
"@img/sharp-libvips-linux-arm64" "1.0.1"
"@img/sharp-libvips-linux-s390x" "1.0.1"
"@img/sharp-libvips-linux-x64" "1.0.1"
"@img/sharp-libvips-linuxmusl-arm64" "1.0.1"
"@img/sharp-libvips-linuxmusl-x64" "1.0.1"
"@img/sharp-linux-arm" "0.33.2"
"@img/sharp-linux-arm64" "0.33.2"
"@img/sharp-linux-s390x" "0.33.2"
"@img/sharp-linux-x64" "0.33.2"
"@img/sharp-linuxmusl-arm64" "0.33.2"
"@img/sharp-linuxmusl-x64" "0.33.2"
"@img/sharp-wasm32" "0.33.2"
"@img/sharp-win32-ia32" "0.33.2"
"@img/sharp-win32-x64" "0.33.2"
shebang-command@^2.0.0:
version "2.0.0"
resolved "https://registry.npmjs.org/shebang-command/-/shebang-command-2.0.0.tgz"
@@ -6098,6 +6273,13 @@ signal-exit@^3.0.2, signal-exit@^3.0.3, signal-exit@^3.0.7:
resolved "https://registry.npmjs.org/signal-exit/-/signal-exit-3.0.7.tgz"
integrity sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==
simple-swizzle@^0.2.2:
version "0.2.2"
resolved "https://registry.yarnpkg.com/simple-swizzle/-/simple-swizzle-0.2.2.tgz#a4da6b635ffcccca33f70d17cb92592de95e557a"
integrity sha512-JA//kQgZtbuY83m+xT+tXJkmJncGMTFT+C+g2h2R9uxkYIrE2yy9sgmcLhCnw57/WSD+Eh3J97FPEDFnbXnDUg==
dependencies:
is-arrayish "^0.3.1"
size-sensor@^1.0.1:
version "1.0.1"
resolved "https://registry.npmjs.org/size-sensor/-/size-sensor-1.0.1.tgz"