Vllm Mlx

Categories: Coding, Free, Open Source, Featured

An OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

<p><strong>Vllm Mlx</strong> is a coding MCP server that provides an OpenAI- and Anthropic-compatible API for Apple Silicon. It runs LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support on a native MLX backend at 400+ tok/s, and it works with Claude Code.</p> <p>With <strong>650 GitHub stars</strong>, Vllm Mlx is one of the most popular coding MCP servers in the open-source community.</p> <p>Built with <strong>Python</strong>, Vllm Mlx is designed for developers who want a reliable and maintainable solution.</p> <h2>How to Use Vllm Mlx</h2> <p>To use Vllm Mlx, you need an MCP-compatible client such as Claude Desktop, Cursor, or VS Code with an MCP extension. Once configured, the AI model can automatically use the tools provided by this server.</p>
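As a sketch of the client-side setup, a Claude Desktop entry might look like the following. Claude Desktop reads MCP servers from an `mcpServers` map in its config file; the command name, arguments, and port here are assumptions for illustration, not the project's documented invocation — check the project's README for the real launch command:

```json
{
  "mcpServers": {
    "vllm-mlx": {
      "command": "vllm-mlx",
      "args": ["serve", "--port", "8000"]
    }
  }
}
```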

Key Features

  • Open source with community contributions
  • Code generation and editing
  • Multi-language support
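Because the server advertises an OpenAI-compatible API, a request can be sketched with the Python standard library alone. This is a minimal sketch: the host, port, endpoint path, and model name below are assumptions, not values confirmed by the project.

```python
import json
import urllib.request

# Assumed endpoint: an OpenAI-compatible /v1/chat/completions route.
# Host, port, and model name are placeholders, not documented values.
BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat completion request body as JSON bytes."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return json.dumps(payload).encode("utf-8")

body = build_chat_request("mlx-community/Llama-3.2-3B-Instruct-4bit", "Hello!")

# To actually send the request (requires a running server):
# req = urllib.request.Request(
#     BASE_URL, data=body, headers={"Content-Type": "application/json"}
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])

print(json.loads(body)["model"])
```

Any client that speaks the OpenAI chat-completions format (including the official `openai` Python package pointed at a custom `base_url`) should work the same way.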