When the Ecosystem Lags: Adding gpt-image-2 to an MCP Server Myself

I was working on a project where I needed Claude Code to generate and edit images — diagrams, mockups, visual assets that would have taken me longer to produce by hand than to describe in a prompt. Claude Code doesn’t generate images natively, but it supports MCP servers, and MCP servers exist for exactly this kind of capability gap. So I went looking for one that wrapped OpenAI’s image generation API.

I found openai-gpt-image-mcp by SureScaleAI. Clean TypeScript, focused scope, supported both image generation and editing, worked with Claude Code out of the box. It had about a hundred stars on GitHub and seemed like the most mature option. One problem: it only supported gpt-image-1, and OpenAI had released gpt-image-2 six days earlier.

The repository’s last commit was from May 2025 — almost a year old. No open issues requesting the update, no signs the maintainer was actively working on it. I had a choice that’s familiar to anyone who depends on open source tooling: wait for someone else to update it, build my own from scratch, or fork it and add what I needed.

I forked it.

Finding the Right Server

The MCP ecosystem for image generation has grown quickly. When I searched, I found half a dozen servers that could wrap OpenAI’s image API — writingmate/imagegen-mcp, lpenguin/openai-image-mcp, spartanz51/imagegen-mcp, and several others. Some support multiple providers (Gemini, Flux, DALL-E alongside GPT Image models), some are minimal wrappers around a single endpoint.

I picked SureScaleAI’s server because it hit the right balance for my needs. It exposes two tools — create-image for generation and edit-image for modifying existing images with optional masks — and handles the practical details well: automatic switching to file output when the base64 payload exceeds 1MB, support for Azure OpenAI deployments, and sensible defaults. The codebase was small enough to understand in an afternoon, which matters when you’re about to modify it.

None of the servers I found supported gpt-image-2 yet. The model had been out for less than a week, and open source maintainers have their own priorities. That’s not a criticism — it’s just the reality of how volunteer-maintained projects work.

Why gpt-image-2 Was Worth the Effort

If gpt-image-2 were just a version number bump with marginal improvements, I would have stuck with gpt-image-1 and moved on. But the differences are substantial enough to change what you can actually do with image generation in a development workflow.

The most immediately useful change is custom resolutions. gpt-image-1 gives you three preset sizes: 1024×1024, 1536×1024, or 1024×1536. That’s it. gpt-image-2 accepts any resolution where both dimensions are multiples of 16, neither edge exceeds 3840 pixels, the total pixel count falls between 655,360 and 8,294,400, and the aspect ratio stays within 3:1. In practice, this means you can generate images at, or within a few pixels of, the dimensions you need: a 2048×2048 texture, a 3840×2160 background, or a 1920×1088 hero image (1080 is not a multiple of 16, so 1088 is the nearest valid height). Common targets such as 1920×1080 and 1200×630 still need a small crop, but that is far less post-processing than starting from one of three presets.

Text rendering improved dramatically, going from roughly 90-95% accuracy to over 99%. For diagrams, architecture visuals, or anything with labels, that’s the difference between usable output and output you have to fix by hand. The model also has what OpenAI calls “thinking mode” — it reasons about layout and composition before generating, which produces noticeably more coherent results for complex prompts.

There are trade-offs. gpt-image-2 doesn’t support transparent backgrounds, which gpt-image-1 does. It’s slower due to the reasoning overhead. It costs more per image. These are worth knowing about, but for my use case — generating and editing images inside a coding workflow — the flexible resolutions and text accuracy alone justified the switch.

The Contribution

I could have made a minimal change — add "gpt-image-2" to the model enum and call it done. The API endpoint is the same; OpenAI didn’t change the URL or the authentication. A one-line change would have technically worked for basic generation with default parameters.

But gpt-image-2’s flexible resolution system means the size validation logic needed to change fundamentally. gpt-image-1’s sizes are a simple enum — you pick from a list. gpt-image-2 requires parsing a width×height string and validating it against four independent constraints. Passing an invalid size to the API would produce a cryptic error from OpenAI’s backend, which is a poor experience when the MCP server could validate it locally and return a clear message instead.
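A minimal sketch of what that local check could look like, assuming the four limits described earlier in this post. The function names, constant names, and error wording are hypothetical and are not copied from the PR; treat it as an illustration of the approach, not the actual implementation.

```typescript
// Hypothetical names throughout; limits are the ones quoted in this post.
type ImageModel = "gpt-image-1" | "gpt-image-2";

const GPT_IMAGE_1_SIZES = ["1024x1024", "1536x1024", "1024x1536"];

const EDGE_MULTIPLE = 16;
const MAX_EDGE = 3840;
const MIN_PIXELS = 655_360;
const MAX_PIXELS = 8_294_400;
const MAX_ASPECT_RATIO = 3;

// Returns null when the size is valid, otherwise a human-readable error.
function validateGptImage2Size(size: string): string | null {
  // Accept both "2048x2048" and "2048×2048".
  const match = /^(\d+)[x×](\d+)$/.exec(size.trim());
  if (!match) {
    return `Size "${size}" must look like WIDTHxHEIGHT, e.g. 2048x2048.`;
  }
  const width = Number(match[1]);
  const height = Number(match[2]);

  if (width % EDGE_MULTIPLE !== 0 || height % EDGE_MULTIPLE !== 0) {
    return `Both dimensions must be multiples of ${EDGE_MULTIPLE} (got ${width}x${height}).`;
  }
  if (width > MAX_EDGE || height > MAX_EDGE) {
    return `Neither edge may exceed ${MAX_EDGE} pixels (got ${width}x${height}).`;
  }
  const pixels = width * height;
  if (pixels < MIN_PIXELS || pixels > MAX_PIXELS) {
    return `Total pixels must be between ${MIN_PIXELS} and ${MAX_PIXELS} (got ${pixels}).`;
  }
  const ratio = Math.max(width, height) / Math.min(width, height);
  if (ratio > MAX_ASPECT_RATIO) {
    return `Aspect ratio must stay within ${MAX_ASPECT_RATIO}:1 (got ${ratio.toFixed(2)}:1).`;
  }
  return null;
}

// Dispatch on the model: preset list for gpt-image-1, rule-based for gpt-image-2.
function validateSize(model: ImageModel, size: string): string | null {
  if (model === "gpt-image-1") {
    return GPT_IMAGE_1_SIZES.includes(size)
      ? null
      : `gpt-image-1 only supports ${GPT_IMAGE_1_SIZES.join(", ")}.`;
  }
  return validateGptImage2Size(size);
}
```

A tool handler could call validateSize("gpt-image-2", "1920x1080") before making the request and return the resulting message (here, the multiples-of-16 complaint) to the client instead of forwarding the request to OpenAI. Checking the constraints one at a time, rather than in a single combined condition, is what lets each failure produce its own specific message.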

The original codebase was a single index.ts file with no tests and no CI pipeline. If I was going to add non-trivial validation logic, I wanted tests to prove it worked correctly. And if I was going to submit a PR with tests, a CI pipeline to run them automatically seemed like the responsible addition. So the scope grew beyond what I initially planned.

The final PR touches eight files: model support with resolution validation, extracted helper modules, comprehensive test coverage, and a GitHub Actions workflow that runs tests across Node 18, 20, 22, and 24. About 1,200 lines added, 200 removed.

I’m aware this makes for a larger PR than most maintainers prefer to review. There’s a real tension in open source between making the minimal viable change and making a change that’s properly engineered. I chose the latter because a contribution with tests and CI is more likely to be accepted and maintained long-term than a quick patch that shifts the testing burden to the maintainer. Whether the maintainer agrees is up to them — the PR is open.

MCP Makes This Worth It

The reason this contribution pattern works is MCP itself. I didn’t need to wait for the PR to be reviewed and merged before I could use gpt-image-2 in my workflow. I pointed my Claude Code configuration at my fork:

{
  "mcpServers": {
    "openai-image": {
      "command": "node",
      "args": ["/path/to/my-fork/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

That’s it. Claude Code picked up the server on next launch, and I had gpt-image-2 generation and editing available immediately. No waiting for Anthropic to add native image support, no waiting for the MCP server maintainer to merge my PR, no intermediary steps. The protocol decouples tool capability from any single party’s release schedule.

This is what makes MCP more than just “a way to add tools to Claude.” It’s an architecture that means no capability is a dead end. If the tool you need doesn’t exist, you build a server. If it exists but is outdated, you fork it. If your fork diverges permanently, you publish it. At every step, you’re unblocked, because the protocol is designed so that any compliant server works with any compliant client.

The Gap Is Structural

Models are being released faster than the ecosystem around them can adapt. OpenAI ships a new image model, and it takes days to weeks before the MCP servers, SDK wrappers, and integration tools catch up. This isn’t anyone’s fault; it’s the natural consequence of a fast-moving field built on volunteer-maintained infrastructure.

The engineers who treat this gap as an opportunity rather than a blocker will consistently have access to better tools sooner. The skills involved aren’t exotic: read the API changelog, understand what changed, fork the relevant project, implement the support, write tests, submit a PR. It’s standard open source contribution, applied to a domain that moves fast enough to reward the effort.

I submitted my PR six days after gpt-image-2 launched. I was using it in my workflow within hours of starting the fork. That’s the value proposition of open source in a rapidly evolving ecosystem — not free software, but the freedom to fix what’s missing, when you need it fixed.