Effective social intelligence simulation requires language agents to dynamically adjust their reasoning depth, a capability conspicuously absent from current methods, which either lack such reasoning or enforce uniform long-chain reasoning in every scenario, leading to excessive token usage and socially inappropriate simulations. This paper proposes an Adaptive Mindset Learning (AML) framework that strategically selects among four mindsets, ranging from intuitive reaction to deep thinking, based on real-time context. The core innovation of the framework, the Adaptive Mindset Policy Optimization (AMPO) algorithm, offers three advances over existing methods: (1) multi-granularity mindset design, (2) context-aware mindset switching during social interaction, and (3) token-efficient reasoning through adaptive depth control. Extensive experiments on social intelligence tasks show that AML outperforms the current state-of-the-art method by 15.6%. Notably, our method surpasses GRPO by 7.0% while shortening reasoning chains by 32.8%. These results demonstrate that AMPO's context-sensitive mindset selection is closer to human adaptive thinking than GRPO's fixed-depth reasoning.
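To make the selection mechanism concrete, below is a minimal Python sketch of context-aware mindset switching under stated assumptions: only the two endpoint mindsets are named in the abstract, so the two intermediate labels, the per-mindset token budgets, and the `policy.generate` interface are illustrative placeholders rather than the authors' implementation.

```python
from typing import List

# Four mindsets ordered from shallowest to deepest reasoning; only the two
# endpoints are named in the abstract, the middle two labels are placeholders.
MINDSETS = ["intuitive_reaction", "shallow_thinking",
            "strategic_thinking", "deep_thinking"]

# Illustrative per-mindset reasoning-token budgets (not taken from the paper).
BUDGETS = {"intuitive_reaction": 64, "shallow_thinking": 256,
           "strategic_thinking": 512, "deep_thinking": 1024}

def select_mindset(policy, dialogue: List[str]) -> str:
    """Ask the policy model which mindset fits the current social context.

    `policy.generate(prompt, max_new_tokens)` is an assumed text-generation
    interface, not an API defined by the paper.
    """
    prompt = "\n".join(dialogue) + "\nChoose one mindset: " + ", ".join(MINDSETS)
    choice = policy.generate(prompt, max_new_tokens=8).strip()
    return choice if choice in MINDSETS else "intuitive_reaction"  # cheap fallback

def respond(policy, dialogue: List[str]) -> str:
    """Generate a reply whose reasoning budget matches the chosen mindset."""
    mindset = select_mindset(policy, dialogue)
    prompt = "\n".join(dialogue) + f"\n[Mindset: {mindset}]\n"
    return policy.generate(prompt, max_new_tokens=BUDGETS[mindset])
```

In this sketch, falling back to the cheapest mindset when the policy's choice is unparseable keeps token cost bounded; this is a design choice of the illustration, not a claim about AMPO itself.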