🎯 Action Pack · Intermediate · Free

Running Gemma 4 locally with LM Studio's new headless CLI and Claude Code

Run Google's Gemma 4 LLM locally using LM Studio's headless CLI for privacy, cost savings, and offline development. Automate LLM interactions directly on your machine for rapid prototyping and custom agent building.

llm · infrastructure · automation · devops · open-source · lm-studio · lm-studio's-headless-cli · claude-code

4 Steps

1. Install LM Studio: Download and install the LM Studio application for your operating system from their official website. This provides the GUI for initial model downloads and the headless CLI functionality.

2. Download Gemma 4 Model: Open the LM Studio application. Use the built-in search functionality to find and download a compatible Gemma 4 model (e.g., 'gemma-2b-it-q4_k_m.gguf'). Ensure the model is fully downloaded before proceeding.

3. Start LM Studio Headless Server: Open your terminal or command prompt and make sure LM Studio's `lms` command-line tool is on your PATH (the desktop app can set this up for you). Start the headless server on your desired port and load the downloaded Gemma 4 model. Replace `path/to/gemma-4-model.gguf` with the actual path to your downloaded model file and `1234` with your preferred port.
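The step above can be sketched as a pair of terminal commands. This is a hedged sketch, not the pack's own listing: it assumes LM Studio's bundled `lms` CLI is on your PATH and that `lms server start` accepts a `--port` flag and `lms load` accepts a model path, as LM Studio's CLI documented at the time of writing. The model path is the same placeholder used in the step text.

```shell
# Sketch, assuming LM Studio's `lms` CLI is available on PATH.
# Replace the port and model path with your own values, as described above.

# Start the local headless server on the chosen port.
lms server start --port 1234

# Load the downloaded Gemma 4 model into memory.
lms load path/to/gemma-4-model.gguf
```

If `lms` is not found, open the LM Studio app and follow its prompt to install the command-line tool before retrying.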

4. Interact with Gemma 4 via API: Once the LM Studio server is running, send API requests to your local Gemma 4 instance. Use a tool like `curl` or any HTTP client to interact with the model endpoint. The example below sends a simple chat completion request to the default `/v1/chat/completions` endpoint.
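A minimal `curl` sketch for the request described above. Assumptions not stated in the pack: the server from the previous step is listening on `localhost:1234`, and LM Studio exposes its OpenAI-compatible endpoint at `/v1/chat/completions` (its documented behavior). The `"model"` value here is a placeholder; use the identifier your server reports for the loaded model.

```shell
# Hedged sketch: requires the headless server from step 3 to be running.
# "gemma-4" is a placeholder model identifier, not a confirmed name.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4",
    "messages": [
      {"role": "user", "content": "Explain what a GGUF file is in one sentence."}
    ],
    "temperature": 0.7
  }'
```

The response is a JSON object whose generated text lives at `choices[0].message.content`, following the OpenAI chat completions schema.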
