📌 Overview

Purpose: Gaia-X aims to provide an enterprise-level chatbot application platform that addresses existing AI product limitations while leveraging a new AI paradigm.

Overview: The platform integrates advanced management features, multi-agent collaboration, and dynamic rendering capabilities, designed to enhance enterprise AI operations. Its innovative architecture allows for secure, modular functionalities that respond to various enterprise needs.

Key Features:

MCP (Model Context Protocol) Support: Integrates with community MCP Servers for improved management and resource usage while ensuring security through isolated environments.
Multi-Agent Intelligent Coordination: Facilitates automated collaboration among agents to efficiently execute complex tasks, enhancing productivity and operational efficiency.
Enterprise Management Center: Offers centralized user authentication and model management, streamlining enterprise resource control and user access across various authentication methods.

Gaia-X: Next-Generation Enterprise AI Application Platform Based on the New AI Paradigm

🌟 Project Overview

Gaia-X is the first chatbot application platform designed for enterprise scenarios based on the new AI paradigm, addressing core pain points of existing AI products with an innovative technical architecture:

🚀 Enterprise Management
🤖 MCP Protocol Support
👥 Multi-Agent Collaboration
💻 Natural Language RPA
✅ Human Confirmation Mechanism
🎨 Intelligent Canvas Rendering

Try Now | Admin Center | Documentation Center

🎯 Core Pain Point Solutions

Pain Point Area	Gaia-X Innovative Solutions
Lack of Enterprise Management	Complete user/permission/billing system and LLM API hosting
No MCP Protocol Support	The first enterprise-grade MCP support
Risk of Sensitive Operations	ReAct tool calls with human confirmation, dynamic rendering
Difficulty in Natural Language RPA	Large model-driven RPA automation
Weak Multi-Agent Collaboration	Intelligent Agent retrieval and multi-agent collaboration for complex tasks

📦 Project Architecture

1.1 Overall Project Architecture

1.2 MCP Call Process

During an Agent call, the Admin Center never invokes MCP tools itself. It relays the request to the LLM API and returns the response to the client. The client's Node program then checks the response for function calls and, if any are present, runs the corresponding MCP tools on the client side. The execution chain is as follows:

graph LR
    U[User] -->|1| C[Gaia-X Client]
    C-->|2|API[Admin Center]
    API-->|3|LLM[LLM API]
    LLM-->|4|API[Admin Center]
    API-->|5|C[Gaia-X Client]
    C-->|6|B{Function Call?}
    B -- No -->U[User]
    B -- Yes --> MCP[MCP Servers]
    MCP -->|8a. Call| H[Tool 1]
    MCP -->|8b. Call| I[Tool 2]
    H-->|8.1|C[Gaia-X Client]
    I-->|8.1|C[Gaia-X Client]
    C-->|8.2|API[Admin Center]
    API-->|8.3|LLM[LLM API]
    LLM-->|8.4|API[Admin Center]
    API-->|8.5|C[Gaia-X Client]
    C-->|8.6|U[User]
   
    subgraph Possible Nested/Iterative Calls
    H
    I
    end
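
The chain above can be sketched as a client-side loop. This is a minimal illustration, not the Gaia-X implementation: the types `ToolCall`, `ModelReply`, and `ToolResult`, the callbacks `askAdminCenter` and `callMcpTool`, and the `maxRounds` limit are all assumptions made for the example, since this README does not specify the wire format.

```typescript
// Hypothetical shapes; the real Gaia-X message format is not documented here.
type ToolCall = { id: string; tool: string; args: Record<string, unknown> };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type ToolResult = { id: string; output: string };

// Steps 2-5: the client asks the Admin Center, which forwards to the LLM API.
type AskAdminCenter = (prompt: string, results: ToolResult[]) => Promise<ModelReply>;
// Steps 8a/8b: the client, not the Admin Center, executes the MCP tool.
type CallMcpTool = (call: ToolCall) => Promise<string>;

async function runTurn(
  prompt: string,
  askAdminCenter: AskAdminCenter,
  callMcpTool: CallMcpTool,
  maxRounds = 5, // assumed guard against endless nested calls
): Promise<string> {
  const results: ToolResult[] = [];
  for (let round = 0; round < maxRounds; round++) {
    const reply = await askAdminCenter(prompt, results);
    // Step 6: no function call means the reply goes straight to the user.
    if (reply.toolCalls.length === 0) return reply.text;
    // Otherwise run each requested tool and send the results back (8.1-8.5).
    for (const call of reply.toolCalls) {
      results.push({ id: call.id, output: await callMcpTool(call) });
    }
  }
  throw new Error(`no final answer after ${maxRounds} rounds`);
}
```

Passing the two callbacks in as parameters keeps the loop independent of any particular transport, so it can be exercised with in-memory fakes.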

1.3 Client Authentication Logic at Startup

Because Gaia-X targets enterprises, the client requires users to log in before they can access the chatbot interface. Taking OAuth 2.0 as an example, the flow is:

sequenceDiagram
    participant C as Client Program
    participant A as Admin Center
    participant O as OAuth2.0 Server
    participant U as User

    C->>A: Request authentication page (non-OAuth2.0 login page)
    A->>A: Check current login status
    alt Not logged in
        A->>O: Redirect and open OAuth2.0 login page
        O->>U: Display login interface
        U->>O: Submit credentials for login
        O->>A: Return authentication result (e.g., Token)
        A->>A: Generate JWT after verifying Token and update login status
    else Already logged in
        A->>U: Display authorization button
    end
    U->>A: Click authorize
    A->>C: Redirect back to client and pass authorization information (gaia://oauth-callback?code=xyz)
    C->>C: Client completes login process

Process Explanation:

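The client requests the authentication page from the Admin Center.
The Admin Center checks whether the user is already logged in.
If the user is not logged in, the Admin Center redirects to the OAuth 2.0 login page.
The user submits credentials, and the OAuth 2.0 server returns the authentication result (for example, a token) to the Admin Center.
The Admin Center verifies the token, generates a JWT, and updates the login status.
The user clicks authorize, and the Admin Center redirects back to the client with the authorization information so that the client can complete login.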
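
The last step hands the client a deep link such as `gaia://oauth-callback?code=xyz`. The sketch below shows one way the client might extract the code from that link. The function name `parseOAuthCallback` is hypothetical, and the example assumes it runs in a Node-based process (such as Electron's main process), where the WHATWG `URL` class parses the host of a custom scheme.

```typescript
// Extracts the authorization code from a deep link such as
// gaia://oauth-callback?code=xyz. Returns null for anything else.
function parseOAuthCallback(deepLink: string): string | null {
  let url: URL;
  try {
    url = new URL(deepLink);
  } catch (_err) {
    return null; // not a parseable URL
  }
  // Only accept the scheme and host shown in the sequence diagram.
  if (url.protocol !== "gaia:" || url.hostname !== "oauth-callback") {
    return null;
  }
  const code = url.searchParams.get("code");
  return code !== null && code.length > 0 ? code : null;
}
```

How the `gaia://` scheme is registered with the operating system is not covered in this README; in an Electron app, `app.setAsDefaultProtocolClient` would be one option.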

2. Core Features (Some still under development)

2.1 MCP (Model Context Protocol) Support

Support for integrating any community MCP Servers.
Unified management of all MCP Servers through the enterprise management center (in development).
All MCP Servers run in isolated sandbox environments on the client for security and data isolation.
Independent MCP Server display page (planned).
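
When several community MCP Servers are installed side by side, the client needs to know which server owns a given tool. The following is a rough sketch of such a lookup. `McpServerInfo`, `buildToolIndex`, and `resolveTool` are hypothetical names, and the policy of rejecting ambiguous tool names is an assumption for illustration rather than documented Gaia-X behavior.

```typescript
// Hypothetical summary of one MCP server after its tools have been listed.
type McpServerInfo = { name: string; tools: string[] };

// Maps each tool name to the names of all servers that expose it.
function buildToolIndex(servers: McpServerInfo[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const server of servers) {
    for (const tool of server.tools) {
      const owners = index.get(tool) || [];
      owners.push(server.name);
      index.set(tool, owners);
    }
  }
  return index;
}

// Returns the server that should handle `tool`. If more than one server
// exposes the tool, the caller must name the server explicitly.
function resolveTool(index: Map<string, string[]>, tool: string, server?: string): string {
  const owners = index.get(tool) || [];
  if (owners.length === 0) throw new Error(`unknown tool: ${tool}`);
  if (server !== undefined) {
    if (owners.indexOf(server) === -1) throw new Error(`${server} does not expose ${tool}`);
    return server;
  }
  if (owners.length > 1) throw new Error(`ambiguous tool: ${tool}`);
  return owners[0] as string;
}
```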

2.2 Multi-Agent Intelligent Coordination

Support for intelligent collaboration based on task orchestration or RAG to automatically complete complex tasks (planned).
Built-in key agents for computer operations, Python programming, web browsing, etc. (planned).

2.3 New Paradigm Infinite Canvas

In Multi-Agent conversations, each Agent's process gets its own canvas, and an automatically generated summary is kept as permanent memory (planned).
Support for common Artifacts including SVG, HTML, Mermaid, ECharts, PlantUML.
MCP tools can dynamically render forms so that users can review and modify parameters and control when they are submitted.
Support for Python, TypeScript, HTML, and other code execution (planned).
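
The form-based review of MCP tool calls can be pictured as a gate between the model's proposed call and its execution. The sketch below is an assumption-laden illustration: `PendingCall`, `ReviewForm`, and `submitWithReview` are hypothetical names, and the rendered form is reduced to a callback that returns the edited arguments or `null` when the user cancels.

```typescript
type ToolArgs = Record<string, string>;
type PendingCall = { tool: string; args: ToolArgs };

// Stands in for the rendered form: returns the (possibly edited) arguments
// when the user submits, or null when the user cancels.
type ReviewForm = (call: PendingCall) => ToolArgs | null;

// Runs the tool only after the user has reviewed the call, and always with
// the arguments the user actually submitted.
function submitWithReview(
  call: PendingCall,
  review: ReviewForm,
  run: (call: PendingCall) => string,
): string {
  const edited = review(call);
  if (edited === null) {
    return `cancelled: ${call.tool}`;
  }
  return run({ tool: call.tool, args: edited });
}
```

Keeping the review step as an injected function means the same gate works whether the form is drawn on the canvas or replaced by an automatic approval rule for low-risk tools.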

2.4 Intelligent Computer Operations

Integration with Claude 3.5 Sonnet and later, Zhipu CogAgent, ByteDance UI-TARS, OpenAI computer-use, and other models.
Agents can autonomously execute any computer operation.

2.5 Text Selection Analysis

After the user selects text in any application, an Agent toolbar appears automatically and offers instant functions such as translation and copywriting.

2.6 Enterprise Management Center

Unified Authentication and User Management supporting OAuth 2.0, LDAP, DingTalk, Feishu, and more.
Unified Model and Tool Management via centralized configuration and authorization.
Quota management for users and APIs (planned).
Enterprise internal application ecosystem: MCP marketplace, Agent marketplace, application task marketplace (planned).
Business reports with comprehensive usage data (planned).

2.7 Third-party Agent Integration

Native support for third-party Agent platforms such as Dify and Coze with unified authorization management.

3. Technology Stack

3.1 Client

The client uses a plugin-based design and is planned to evolve into a micro-kernel plugin architecture, similar to VS Code, that supports extensions.

Framework: Electron + React
LLM UI: Ant Design X
Text Selection Monitoring: C++ (Windows), Objective-C (macOS)

3.2 Admin Backend

The backend implements model and tool calls but does not execute MCP tools directly; execution happens on the client.

Large Model Interaction: Eino Framework + Self-developed Multi-Agent
Admin UI: Ant Design Pro (refactored frontend from GVA)
API Service: Golang + Gin (based on GVA)

3.3 MCP Server

Runs in an independent sandbox environment.
Can be implemented in either Python or TypeScript.

4. Model Selection Reference

Computer Operation Models: Claude 3.5 Sonnet or later, CogAgent, UI-TARS, OpenAI computer-use
ReAct Recommended Models: Claude 3.5 Sonnet or later, GPT-4o (DeepSeek-V3 is not recommended for critical tasks)

5. Acknowledgements

The admin API is built on Gin-Vue-Admin (GVA), which reduced backend development effort.
Eino, ByteDance's Go-based framework for large model applications, handles model integration and Agent orchestration.
Ant Design Pro was used to refactor the Gin-Vue-Admin frontend pages, unifying the UI framework across the admin backend and the client.
Secondary development on Eino, which adds integrations for common model providers and a Multi-Agent architecture, is published as Eino-X.
The embedded binary MCP runtime environment (supporting macOS and Windows) is packaged as the mcp-runtime project.

6. License

This project is licensed under the Apache 2.0 License. Contributions and customization are welcome!

7. Community Support

Feedback and suggestions are welcome via Issues. We appreciate your participation in building the future enterprise AI application ecosystem.

🎉 Thank you for your attention and support!
