How to 10x Quality and Consistency With LLMs
Go beyond your default LLM chat and give yourself real PM superpowers in minutes.
If you caught my conversation with Ben and Marc on the Super Insider Podcast, you heard me geek out about why I made the switch from ChatGPT to Claude with MCP (Model Context Protocol). During our conversation, Ben mentioned that he hadn’t met many people who had really dug into this stuff yet and that the broader PM community could get a lot out of it, so I decided to put together a follow-up.
I ended up splitting my first draft into two posts: one focused on MCP in general (published last week) and one focused on giving your AI tools memory via MCP (this post). Why memory? Product managers are responsible for bringing clarity to the team, and in practice that means consistency across even the most minor details makes a big difference in your team’s ability to build a high-quality product quickly.

This guide is for fellow PMs who, like me, aren't developers but recognize that our work is fundamentally about connecting dots across different domains — and are eager to learn as much as possible. If you've read my thoughts on the rise of generalists or my recent piece on why I broke up with ChatGPT, you know I'm always looking for tools that amplify our ability to think across systems rather than forcing us into narrow specializations.
What MCP Actually Means for Product Managers
Let me cut through the tech-speak here. MCP is essentially an SDK for AI: it gives your AI tools the ability to connect to APIs, data sources, and outside functions as “tools,” along with the manual for how to use them.
I've been thinking about this a lot since I wrote about being technical as a non-technical PM. MCP is exactly the kind of tool that lets us leverage technical concepts without needing to become engineers ourselves.

The "Sharp Problem" MCP Solves for PMs
Throughout my career, I've been identified as someone who can "figure it out," "pull things together," and "be the glue." Sound familiar? That's classic generalist territory.
As product managers, we constantly work at the intersection of multiple data sources:
User research findings
PRDs and technical specifications
Design files and user flows
Roadmap priorities and business goals
Team context and project history
The traditional approach forces us to either:
Manually copy-paste context into every AI conversation (exhausting)
Create elaborate prompts that may or may not work consistently (unreliable)
Use separate tools that don't talk to each other (frustrating)
MCP changes this by creating persistent, structured connections between your AI and your actual work context. It's like having an AI thought partner that actually understands the full scope of what you're working on — and who can take action to help lighten the workload.
While some MCP servers ARE about providing context, many of them are focused on allowing your LLM or agent to leverage tools more effectively. As I mentioned in other posts, it feels very similar to when Neo starts his simulated training in the first Matrix movie.

I wish I knew kung fu.
One of my favorite unlocks isn’t an MCP server that connects to an external tool or data source, but one that helps me build context locally that the tools I use can access to improve their understanding and their responses. It’s simply called “memory” and it lets your tool build out a knowledge graph with you and access it for additional context.
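To make “knowledge graph” concrete, here’s a tiny sketch of the idea: entities as nodes, relationships as edges. (The names and data structures here are my own illustration, not the memory server’s actual storage format.)

```python
# Illustrative knowledge-graph sketch: entities are nodes, relationships
# are edges. (Example data only -- not the memory server's real format.)
entities = {
    "Checkout Redesign": {"type": "project", "observations": ["Q3 priority"]},
    "Cart Abandonment": {"type": "insight", "observations": ["Drop-off at shipping step"]},
}
relations = [
    ("Cart Abandonment", "motivates", "Checkout Redesign"),
]

def related_to(name):
    """Return every (source, relationship, target) triple touching `name`."""
    return [triple for triple in relations if name in (triple[0], triple[2])]

print(related_to("Checkout Redesign"))
```

The key point: the value isn’t in any single node, it’s in the edges — the graph remembers *how* your projects, insights, and decisions relate.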
Knowledge Graph RAG vs Vector Database RAG: The Technical Deep Dive
Let me get more specific about why I chose knowledge graphs over traditional vector database RAG, because this decision fundamentally changes how AI understands and uses your product context.
And honestly? Most people explaining this stuff make it way more complicated than it needs to be.

How Vector Database RAG Works
Vector RAG converts your content into high-dimensional numerical vectors (embeddings) that capture semantic meaning. When you ask a question, it:
Converts your query into a vector
Searches for mathematically similar vectors in the database
Returns the most semantically similar content as context
Feeds that context to the LLM for response generation
Think of it like a very sophisticated search engine that understands meaning, not just keywords.
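The four steps above can be sketched in a few lines of Python. Real systems use learned embeddings with hundreds or thousands of dimensions; the tiny hand-made vectors and document names here are purely illustrative:

```python
import math

# Toy vector-RAG retrieval. Real systems use learned embeddings;
# these tiny hand-made vectors just show the mechanics.
docs = {
    "pricing PRD":   [0.9, 0.1, 0.0],
    "user research": [0.1, 0.8, 0.3],
    "roadmap":       [0.2, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: how close two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, k=1):
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
    return ranked[:k]

# A query vector close to the "pricing PRD" vector retrieves that document.
print(retrieve([0.85, 0.15, 0.05]))
```

Notice what’s missing: nothing in this setup knows that the pricing PRD *depends on* the user research. Similarity is all it has.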
How Knowledge Graph RAG Works
Knowledge Graph RAG structures information as entities (nodes) connected by relationships (edges). When you query it:
Identifies relevant entities in your question
Traverses the relationship network to find connected information
Understands how different concepts relate to each other
Provides context that includes the relationship structure
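Those traversal steps boil down to a graph walk. Here’s a minimal breadth-first sketch over illustrative (source, relationship, target) triples — a simplified picture of the idea, not the memory server’s implementation:

```python
from collections import deque

# Illustrative (source, relationship, target) triples -- example data only.
relations = [
    ("NPS survey", "informed", "Onboarding Revamp"),
    ("Onboarding Revamp", "blocks", "Q3 Launch"),
]

def connected(start, depth=2):
    """Breadth-first traversal: entities within `depth` hops of `start`."""
    adjacency = {}
    for src, _, dst in relations:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)  # treat edges as bidirectional
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return seen - {start}

# Two hops out from the survey reaches both the project and the launch.
print(sorted(connected("NPS survey")))
```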
It's like having a map of how all your product knowledge connects together.Setting Up Your Memory MCP
I went through a lot of the basics of MCP in last week’s post, so if you feel like you’re missing some context, definitely check it out. This post is specifically about adding improved memory to your AI tools via Model Context Protocol (MCP) servers.
Advanced Setup: Sharing Knowledge Graphs Across Tools
One powerful feature mentioned in the podcast is using the same knowledge graph across both Claude and Cursor. Both tools can read the same details, add observations, and make edits as I work, so each one always has access to the same set of information.
Method 1: Using Fleur's Configuration
If you're using Fleur, it handles the config automatically, but you can modify the memory storage location via Claude Settings > Developer > Edit Config.
By default, the Fleur config is somewhat hidden and closely tied to Claude; updating the file location for your local memory lets other tools (like Cursor) access it as well.
Step 1: Use Claude settings to open your config or, if you prefer, find the file directly
Mac:
~/Library/Application Support/Claude/claude_desktop_config.json
Windows:
%APPDATA%\Claude\claude_desktop_config.json

Step 2: Set a custom storage path for shared memory:
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-memory"
      ],
      "env": {
        "MEMORY_FILE_PATH": "/Users/[username]/Documents/product-knowledge-graph/memory.json"
      }
    }
  }
}
For Cursor:

Go to Cursor Settings > Tools & Integration
Click on the “+New MCP Server” to open the config
Add a config; if you installed via Fleur, it will look like this:
{
  "mcpServers": {
    "memory": {
      "command": "/Users/batmike/.local/share/fleur/bin/npx-fleur",
      "args": [
        "-y",
        "@modelcontextprotocol/server-memory"
      ],
      "env": {
        "MEMORY_FILE_PATH": "/Users/batmike/.local/share/mcp/memory.json"
      }
    }
  }
}
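Once both configs point at the same memory file, you can sanity-check what’s being shared. A short script like the one below can list the stored entities, assuming the memory server’s line-delimited JSON format (one object per line with a "type" field); adjust the path to match wherever your own MEMORY_FILE_PATH points:

```python
import json
from pathlib import Path

# Assumed location: matches the shared path from the Claude config example
# above -- change this to wherever your MEMORY_FILE_PATH points.
MEMORY_FILE = Path.home() / "Documents" / "product-knowledge-graph" / "memory.json"

def list_entities(path):
    """List entity names in a memory file, assuming one JSON object per
    line with a "type" field of either "entity" or "relation"."""
    names = []
    for line in path.read_text().splitlines():
        if line.strip():
            record = json.loads(line)
            if record.get("type") == "entity":
                names.append(record.get("name"))
    return names

if MEMORY_FILE.exists():
    print(list_entities(MEMORY_FILE))
else:
    print(f"No memory file at {MEMORY_FILE} yet")
```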
Alternate option: Using Meta MCP Tools
Another option, which I haven’t personally tried, is a meta-tool that manages your MCPs across all of the applications that use them. I started with a local instance of 5ire but haven’t gotten to play around with it enough to recommend it yet.
There are a handful of others on my radar as well:
MetaMCP - GUI for managing MCP connections across applications
MCP Access Point - Unified endpoint for multiple MCP servers
Plugged In MCP Proxy - Comprehensive proxy combining multiple servers
Benefits:
Single source of truth for your product knowledge
Consistent context across different AI tools
Easier maintenance - update knowledge in one place
Better collaboration if team members use different tools
How to Start Using Memory in Your Daily Product Work
Now that you’ve got memory wired up, here are some real ways to use it:
| Task | Prompt + Context Strategy |
| --- | --- |
| Writing PRDs | Store examples of past specs in memory and retrieve patterns or structure |
| Sprint Planning | Query memory for past tradeoff decisions or roadmap dependencies |
| User Research | Save raw notes + insights and use the graph to connect them to feature ideas |
| OKR Reviews | Ask your AI to summarize team progress based on memory of project updates |
| Debugging | Link error logs or issue tickets to related product initiatives |
In normal prompting and chats
You can use the terms “memory” or “knowledge graph” to help the model recognize when it should be using the tools.
Check your memory for details about ____ and then [continue prompt]
I just got new information I want you to add to your memory about [core concept] here it is…
Actually [correction] can you please note that in your memory
Please update your existing memory with the updated information below and ensure it’s…
That’s not right, please remove that from your memory…
In your system prompt
If you don’t want to have to tell it to update its memory all the time, you can add a couple lines to your system prompt via Claude Settings > Profile
Here’s the section of mine focused on memory:
You are very good about checking your knowledge graph for additional context or asking for additional context before starting to work on one of my requests. You use that information to focus your response, but you never let a lack of information in the knowledge graph prevent you from responding with something helpful, useful, and accurate.
As you learn new things about what I'm doing or how I'm thinking, you proactively add observations to the relevant entities and relationships to larger entities in our local knowledge graph. You always check to see what entities and relationships exist before creating new ones as you have a strong distaste for disconnected, duplicate, or fragmented data. You use the knowledge graph to improve your ability to collaborate with me, so you take pride in keeping it up to date with detailed information — especially when it comes to defining the edges and relationships between the things we work on.
Wrapping Up
Giving your AI tools memory isn’t about making them smarter—it’s about making you faster, clearer, and more consistent.
The biggest unlock I’ve experienced? Thinking with my tools—not just using them.
If you haven’t checked out last week’s post, start there for a deeper dive on MCP itself. If you’re new to all of this, I also recommend my write-up on becoming more technical without being a developer.
And if you're curious about how this kind of AI integration affects design and UX, I touch on that in this post about AI-powered interfaces.