Public GitHub Scraping Was 2015.
Agent Prompt Scraping Is 2026.
In my years as a pentester, I’ve seen the playbook evolve. What used to be a noisy, complex operation has become a quiet, devastatingly simple one.
For years, scraping public GitHub repositories was the attacker’s easiest path. Database URIs with passwords baked in. Stripe tokens in config files. AWS keys in .env files. It was a gold mine.
2015: Attackers scraped GitHub commits for secrets.
2026: Attackers scrape AI prompts, agent logs, and MCP tool calls.
The attack surface moved. Most defenders didn’t follow.
The Old Problem (Still Active)
Public GitHub scraping never stopped. It’s still a goldmine of exposed credentials.
- Database URIs — passwords baked into connection strings
- API keys — Stripe, Twilio, SendGrid, OpenAI
- Cloud credentials — AWS, GCP, Azure keys in .env files
- Webhooks — Slack, Discord, Teams with full channel access
- SSH private keys — accidentally committed, never revoked
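Here's how low the bar is. Below is a minimal sketch of that scraper in Python: a handful of simplified regex rules run over any text an attacker can pull down. Real scrapers ship hundreds of rules, and the credentials here are made up.

```python
# Minimal sketch of the 2015-era playbook: pattern-match known credential
# formats in anything public (commits, gists, pastebins). Rules simplified.
import re

PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
    "db_uri_with_password": re.compile(r"\b\w+://[^:\s]+:[^@\s]+@\S+"),
    "ssh_private_key": re.compile(r"-----BEGIN (?:RSA |OPENSSH )?PRIVATE KEY-----"),
}

def scan(text: str) -> list[tuple[str, str]]:
    """Return (rule_name, matched_string) for every secret-shaped hit."""
    return [
        (name, m.group(0))
        for name, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]

# A fake commit with made-up credentials baked into the connection string.
leaked_commit = "DATABASE_URL=postgres://admin:hunter2@db.internal:5432/prod"
print(scan(leaked_commit))
# [('db_uri_with_password', 'postgres://admin:hunter2@db.internal:5432/prod')]
```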
🔐 New rule: Never type secrets into an AI prompt. Never let your agent read .env files unsupervised.
The New Problem (Worse Than You Think)
Credentials don’t wait for the commit anymore. They show up in:
- AI prompts — developers asking “how do I connect to my AWS bucket?” and pasting keys for context
- Agent .env reads — AI agents reading environment variables to help debug or configure (a sketch of this leak follows the list)
- MCP tool calls — Model Context Protocol calls that pass secrets between tools
- Agent logs — debugging output that captures sensitive data
- Shared chat histories — internal AI chats where secrets are pasted for help
- Agent memory — long-term memory stores that retain secrets across sessions
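The mechanics are depressingly simple. Here's a hypothetical sketch of the .env and log vectors above (the helper name is made up): one careless debug function, and a secret lands in the model's context, the provider's logs, and a plaintext file on disk at the same time.

```python
# Hypothetical failure mode: a "helpful" debugging tool dumps the whole
# environment into the model's context, so the secret leaks twice over.
import logging
import os

logging.basicConfig(filename="agent_debug.log", level=logging.DEBUG)
log = logging.getLogger("agent")

def build_debug_prompt(error: str) -> str:
    # Naive: include every environment variable "for context".
    env_dump = "\n".join(f"{k}={v}" for k, v in os.environ.items())
    prompt = f"My app fails with: {error}\nHere is my environment:\n{env_dump}"
    log.debug("Sending prompt to model: %s", prompt)  # leak #2: plaintext log on disk
    return prompt                                     # leak #1: the model provider

os.environ["AWS_SECRET_ACCESS_KEY"] = "made-up-value-for-this-demo"
prompt = build_debug_prompt("S3 upload returns 403 Forbidden")
assert "AWS_SECRET_ACCESS_KEY" in prompt  # the secret has already left your control
```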
The Tools Are Catching Up (Slowly)
GitGuardian just shipped ggshield hooks for Cursor, Claude Code, and GitHub Copilot. They scan prompts in real time, before secrets reach the model. It’s free. That’s good. But it’s not enough.
- Scanning is not pentesting. It catches known patterns. It doesn't find novel leaks (see the sketch after this list).
- MCP still has design flaws. Vulnerabilities in widely used MCP servers, with 150M+ downloads between them, have included remote code execution.
- Prompt injection bypasses scanning. An attacker can simply ask the agent for the key.
- Agent memory leaks. Secrets can persist in agent memory across sessions.
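To make the first and third points concrete, here's the failure in a few lines: a known-format rule catches a raw AWS key ID but misses the same value after one round of base64, exactly the kind of transform a prompt-injected agent can be asked to apply.

```python
# Why pattern scanning alone falls short: one trivial encoding and the
# known-format rule goes blind.
import base64
import re

AWS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
key_id = "AKIAIOSFODNN7EXAMPLE"  # AWS's documented example key ID, not a live secret

raw = f"Here is my key: {key_id}"
wrapped = f"Here is my key: {base64.b64encode(key_id.encode()).decode()}"

print(bool(AWS_KEY.search(raw)))      # True  -> the scanner blocks this
print(bool(AWS_KEY.search(wrapped)))  # False -> this sails straight through
```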
ggshield is a great first line of defense. But it’s not a replacement for understanding how secrets actually leak in AI systems — and testing for those leaks.
What You Should Do Right Now
- Install ggshield hooks — free and easy. Stop secrets from reaching models.
- Audit your prompts — stop pasting keys into AI chats (one enforcement sketch follows this list)
- Review agent .env access — does your agent need to read all environment variables?
- Monitor MCP traffic — audit what secrets are being passed between tools.
- Pentest your AI agents — scanning catches known leaks. A red team finds the rest.
- Assume compromise — if a secret touched an AI prompt, rotate it.
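For the prompt and .env items, one enforcement option is an outbound redaction filter between your agent and the model API. This is a hypothetical sketch, not a product: the helper names, thresholds, and demo key are mine, and it belongs in front of a scanner like ggshield, not in place of it.

```python
# Hypothetical outbound filter: mask secret-shaped strings before the
# prompt leaves the machine. Pass 1 is known formats, pass 2 is entropy.
import math
import re

KNOWN_FORMATS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),          # AWS access key IDs
    re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),  # Stripe live keys
]

def shannon_entropy(token: str) -> float:
    """Bits per character; long random secrets score high, prose scores low."""
    probs = [token.count(c) / len(token) for c in set(token)]
    return -sum(p * math.log2(p) for p in probs)

def mask_token(t: str) -> str:
    if t.startswith("[REDACTED"):
        return t  # already masked in pass 1
    if len(t) >= 20 and shannon_entropy(t) > 4.0:
        return "[REDACTED:high-entropy]"
    return t

def redact(prompt: str) -> str:
    for pattern in KNOWN_FORMATS:
        prompt = pattern.sub("[REDACTED:known-format]", prompt)
    return " ".join(mask_token(t) for t in prompt.split())

print(redact("Deploy with sk_live_4eC39HqLyjWDarjtT1zdp7dc as the API key"))
# Deploy with [REDACTED:known-format] as the API key
```

The entropy pass is the interesting part: it's a crude way to catch novel, random-looking secrets that no format rule knows about, at the cost of occasional false positives on hashes and IDs.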
The old attack surface was public repositories. The new attack surface is private conversations with AI.
Public GitHub scraping was 2015. Agent prompt scraping is 2026.
Don’t wait for your secrets to be the next headline.
Scanning your prompts? Good. Now pentest your agents.
I break AI agents — and find the leaks scanners miss. Full-stack security assessment for AI-assisted development.
📩 DM @StackOfTruths on X. Free 15-min consultation. No hard sell. Just honest answers about your AI agent security.