Vllm Mlx

Categories: Coding, Free, Open Source, Featured

An OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

<p><strong>Vllm Mlx</strong> is a coding MCP server that provides an OpenAI- and Anthropic-compatible API for Apple Silicon. It runs LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support on a native MLX backend at 400+ tok/s, and it works with Claude Code.</p> <p>With <strong>650 GitHub stars</strong>, Vllm Mlx is one of the most popular coding MCP servers in the open-source community.</p> <p>Built with <strong>Python</strong>, Vllm Mlx is designed for developers who want a reliable and maintainable solution.</p> <h2>How to Use Vllm Mlx</h2> <p>To use Vllm Mlx, you need an MCP-compatible client such as Claude Desktop, Cursor, or VS Code with an MCP extension. Once configured, the AI model can automatically use the tools provided by this server.</p>
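As a sketch of the client-side setup, a Claude Desktop entry might look like the following. Claude Desktop reads MCP servers from an `mcpServers` map in its config file; the command name, arguments, and port here are assumptions for illustration, not the project's documented invocation — check the project's README for the real launch command:

```json
{
  "mcpServers": {
    "vllm-mlx": {
      "command": "vllm-mlx",
      "args": ["serve", "--port", "8000"]
    }
  }
}
```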

Key Features

  • Open source with community contributions
  • Code generation and editing
  • Multi-language support
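Because the server advertises an OpenAI-compatible API, a request can be sketched with the Python standard library alone. This is a minimal sketch: the host, port, endpoint path, and model name below are assumptions, not values confirmed by the project.

```python
import json
import urllib.request

# Assumed endpoint: an OpenAI-compatible /v1/chat/completions route.
# Host, port, and model name are placeholders, not documented values.
BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat completion request body as JSON bytes."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return json.dumps(payload).encode("utf-8")

body = build_chat_request("mlx-community/Llama-3.2-3B-Instruct-4bit", "Hello!")

# To actually send the request (requires a running server):
# req = urllib.request.Request(
#     BASE_URL, data=body, headers={"Content-Type": "application/json"}
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])

print(json.loads(body)["model"])
```

Any client that speaks the OpenAI chat-completions format (including the official `openai` Python package pointed at a custom `base_url`) should work the same way.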