Public GitHub Scraping Was 2015. Agent Prompt Scraping Is 2026. | Stack of Truths

Public GitHub Scraping Was 2015.
Agent Prompt Scraping Is 2026.

April 30, 2026 — 6 min read — Pedro Jose

In my years as a pentester, I’ve seen the playbook evolve. What used to be a noisy, complex operation has become a quiet, devastatingly simple one.

For years, scraping public GitHub repositories was the attacker’s easiest path. Database URIs with passwords baked in. Stripe tokens in config files. AWS keys in .env files. It was a gold mine.

⚠️ THEN VS. NOW

2015: Attackers scraped GitHub commits for secrets.
2026: Attackers scrape AI prompts, agent logs, and MCP tool calls.

The attack surface moved. Most defenders didn’t follow.

The Old Problem (Still Active)

Public GitHub scraping never stopped. It’s still a gold mine of exposed credentials.

  • Database URIs — passwords baked into connection strings
  • API keys — Stripe, Twilio, SendGrid, OpenAI
  • Cloud credentials — AWS, GCP, Azure keys in .env files
  • Webhooks — Slack, Discord, Teams with full channel access
  • SSH private keys — accidentally committed, never revoked
🔐 Old rule: Never commit secrets to Git.
🔐 New rule: Never type secrets into an AI prompt. Never let your agent read .env files in the open.
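What attackers run against public repos is not sophisticated: regex sweeps over text blobs. A minimal sketch of that kind of scraper (the patterns are illustrative samples, not a production ruleset; real scanners like gitleaks or ggshield ship hundreds of tuned detectors plus entropy checks):

```python
import re

# Illustrative detectors only -- a tiny subset of what real scanners use.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "stripe_secret":  re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),
    "db_uri":         re.compile(r"postgres://[^:\s]+:[^@\s]+@[^\s]+"),
}

def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (detector_name, match) pairs found in a blob of text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    return hits

leaked = "DATABASE_URL=postgres://admin:hunter2@db.internal:5432/prod"
print(find_secrets(leaked))
# [('db_uri', 'postgres://admin:hunter2@db.internal:5432/prod')]
```

The point: the same dozen lines that swept commits in 2015 sweep prompt logs and agent transcripts just as well today.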

The New Problem (Worse Than You Think)

Credentials don’t wait for the commit anymore. They show up in:

  • AI prompts — developers asking “how do I connect to my AWS bucket?” and pasting keys for context
  • Agent .env reads — AI agents reading environment variables to help debug or configure
  • MCP tool calls — Model Context Protocol calls that pass secrets between tools
  • Agent logs — debugging output that captures sensitive data
  • Shared chat histories — internal AI chats where secrets are pasted for help
  • Agent memory — long-term memory stores that retain secrets across sessions
┌──────────────────────────────────────────────────────────┐
│ HOW SECRETS LEAK IN THE AI ERA                           │
├──────────────────────────────────────────────────────────┤
│ 1. Developer pastes API key into a prompt                │
│ 2. Agent reads .env file to “understand the environment” │
│ 3. MCP tool call passes credentials between services     │
│ 4. Agent logs everything for “debugging”                 │
│ 5. Attacker accesses logs, prompts, or MCP traffic       │
└──────────────────────────────────────────────────────────┘
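The defensive mirror image of that flow is a redaction hook: strip anything secret-shaped before text reaches the model, the log, or the memory store. A hedged sketch (the `redact` helper and its patterns are illustrative, not any vendor’s API):

```python
import re

# Hypothetical pre-prompt hook: scrub obvious secrets before any text
# is sent to a model or written to an agent log. Sample patterns only.
REDACTIONS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key
    re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),      # Stripe live key
    re.compile(r"(?<=://)[^:\s]+:[^@\s]+(?=@)"),  # user:pass inside a URI
]

def redact(prompt: str) -> str:
    """Replace anything secret-shaped with a placeholder."""
    for pattern in REDACTIONS:
        prompt = pattern.sub("[REDACTED]", prompt)
    return prompt

print(redact("connect with postgres://admin:hunter2@db:5432/app"))
# connect with postgres://[REDACTED]@db:5432/app
```

A hook like this sits at step 1 of the box above; it does nothing about steps 2–5, which is exactly why scanning alone is not a fix.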

The Tools Are Catching Up (Slowly)

GitGuardian just shipped ggshield hooks for Cursor, Claude Code, and GitHub Copilot. They scan prompts in real time, before secrets reach the model. It’s free. That’s good. But it’s not enough.

  • Scanning is not pentesting. It catches known patterns. It doesn’t find novel leaks.
  • MCP still has design flaws. Packages totaling 150M+ downloads have shipped RCE vulnerabilities.
  • Prompt injection bypasses scanning. An attacker can simply ask the agent for the key.
  • Agent memory leaks. Secrets can persist in agent memory across sessions.
🔐 Tools help. Testing saves.

ggshield is a great first line of defense. But it’s not a replacement for understanding how secrets actually leak in AI systems — and testing for those leaks.
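To make the “known patterns only” limitation concrete: a toy scanner armed with two well-known key formats flags the AWS-style key but lets a home-grown token sail straight through. (Both the ruleset and the `acme_tok_` format are invented for illustration.)

```python
import re

# A detector tuned for famous key formats -- illustrative, not a real ruleset.
KNOWN_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),
]

def scanner_flags(text: str) -> bool:
    """True if the text matches any known secret format."""
    return any(p.search(text) for p in KNOWN_PATTERNS)

print(scanner_flags("AKIAABCDEFGHIJKLMNOP"))   # True  -- known format, caught
print(scanner_flags("acme_tok_9f8e7d6c5b4a"))  # False -- novel format, leaks through
```

That second line is the gap a red team exists to find: your internal token formats, your session cookies, your signing keys that match no public ruleset.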

What You Should Do Right Now

  1. Install ggshield hooks — free and easy. Stop secrets from reaching models.
  2. Audit your prompts — stop pasting keys into AI chats.
  3. Review agent .env access — does your agent need to read all environment variables?
  4. Monitor MCP traffic — audit what secrets are being passed between tools.
  5. Pentest your AI agents — scanning catches known leaks. A red team finds the rest.
  6. Assume compromise — if a secret touched an AI prompt, rotate it.
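Step 3 above can be enforced mechanically: instead of handing an agent raw `os.environ`, give it an allowlisted view. A sketch, with hypothetical variable names:

```python
import os

# Hypothetical allowlist: the only variables an agent may read.
# Names are examples, not a standard.
AGENT_ENV_ALLOWLIST = {"APP_ENV", "LOG_LEVEL", "FEATURE_FLAGS"}

def env_for_agent() -> dict[str, str]:
    """Return only non-sensitive environment variables for agent use."""
    return {k: v for k, v in os.environ.items()
            if k in AGENT_ENV_ALLOWLIST}

os.environ["APP_ENV"] = "staging"
os.environ["AWS_SECRET_ACCESS_KEY"] = "do-not-leak"
print(env_for_agent())  # AWS key is filtered out, APP_ENV passes through
```

Default-deny beats default-allow here: an agent that “reads .env to understand the environment” should see three variables, not thirty.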
  • 2.3 yrs → 10 hrs: time to weaponize a vulnerability
  • 91.5% of vibe-coded apps have hallucination flaws
  • 150M+ MCP downloads with RCE vulnerabilities
🔮 THE BOTTOM LINE

The old attack surface was public repositories. The new attack surface is private conversations with AI.

Public GitHub scraping was 2015. Agent prompt scraping is 2026.

Don’t wait for your secrets to be the next headline.
🦞🔐

Scanning your prompts? Good. Now pentest your agents.

I break AI agents — and find the leaks scanners miss. Full-stack security assessment for AI-assisted development.

📩 DM @StackOfTruths on X

Free 15-min consultation. No hard sell. Just honest answers about your AI agent security.


© 2026 Stack of Truths — AI Agent Pentesting & Security Audits. All opinions are my own.
English is not my first language, I use AI to help write clearly. The ideas and experience are mine.

🦞 “10 years cybersecurity. 5 years AI. I break AI agents so you don’t get broken.”
