← Back to feed
Lecture · April 24, 2026

Guest Lecture @ University of Moratuwa - LLM Penetration Testing

#llm#ai-security#redteam#UoM

Delivered a session on LLM Penetration Testing — Breaking AI Systems for final-year Cyber Security students at University of Moratuwa.


🚨 Why LLM Security Matters

  • LLMs don’t understand security — only text prediction
  • 12+ major incidents (2023–2024)
  • Attacks require zero technical skill

Natural language is the new attack surface.


🔍 LLM Attack Surface

  • User Input → Prompt Injection
  • System Prompt → Prompt Extraction
  • LLM Core → Hallucination
  • Tools → Command Injection
  • External Data → Indirect Injection

⚠️ Real Incidents

  • Samsung (2023): Source code leaked via ChatGPT
  • Bing Chat: System prompt extracted (“Sydney”)
  • DPD Bot: Manipulated to damage its own brand

💡 All attacks used simple prompts — not exploits.


🧠 Key Insight

LLMs treat all input equally
(System vs User vs Malicious)


🛡️ What Actually Works

  • Allowlist actions
  • Sandbox execution
  • Human approval
  • Input/output validation
  • Least privilege

Prompts are NOT a security boundary.


🔴 Live Demo (AI Bot Exploit)

  • Bypassed 3 layers:

    • Input WAF
    • LLM Guardrails
    • Command Filter
  • Used:

    • Prompt injection
    • Roleplay / impersonation
    • Command tag abuse [CMD]...[/CMD]

➡️ Result: Full system compromise


📌 Takeaway

AI systems are not hacked — they are manipulated.

Security must be enforced at the application layer, not the prompt.