This summary covers how different AI models (Claude Sonnet 4.5, GPT-4o, Gemini 2.5 Pro, and others) perform at detecting security vulnerabilities such as SQL injection, with some additional context.
Summary of Findings
- SQL Injection Detection:
  - All frontier models (Claude Sonnet 4.5, GPT-4o, Gemini 2.5 Pro) detected direct SQL injection vulnerabilities with 100% recall.
  - The models differed on indirect SQL injections, where user input passes through multiple function calls before reaching a query.
- Indirect Injection Example:
  - Claude Sonnet 4.5 (with extended thinking) traced the data flow across three function boundaries in a Django application and identified an injection vector.
  - GPT-4o also caught this issue but might have required more context or specific prompts.
- Security Analysis Performance:
  - Claude Sonnet 4.5 with extended thinking had the highest recall (86.0%) and precision (88.1%) for security issues, followed by GPT-4o
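
To make the indirect injection case concrete, here is a minimal sketch of user input crossing three function boundaries before it reaches a query. It is not the code from the article: it assumes the standard-library `sqlite3` module in place of a Django application, and every function name, the `users` table, and the payload are hypothetical choices for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn


# --- Vulnerable path: handler -> service -> query builder -> execute ---

def build_query_unsafe(name_filter: str) -> str:
    # Sink: user-controlled text is interpolated into the SQL string.
    return f"SELECT id, name FROM users WHERE name = '{name_filter}'"


def find_users_unsafe(conn: sqlite3.Connection, name_filter: str) -> list:
    # Middle layer: forwards the value without validating it.
    return conn.execute(build_query_unsafe(name_filter)).fetchall()


def handle_request_unsafe(conn: sqlite3.Connection, params: dict) -> list:
    # Source: the value comes straight from request parameters.
    return find_users_unsafe(conn, params.get("name", ""))


# --- Safer path: the same flow, but the value is bound as a parameter ---

def find_users_safe(conn: sqlite3.Connection, name_filter: str) -> list:
    # The "?" placeholder keeps the value out of the SQL text entirely.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name_filter,)
    ).fetchall()


def handle_request_safe(conn: sqlite3.Connection, params: dict) -> list:
    return find_users_safe(conn, params.get("name", ""))


if __name__ == "__main__":
    conn = make_db()
    payload = {"name": "x' OR '1'='1"}
    print("unsafe:", handle_request_unsafe(conn, payload))
    print("safe:  ", handle_request_safe(conn, payload))
```

In the vulnerable path no single function looks wrong on its own: the handler only reads a parameter, the middle layer only forwards it, and the builder only formats a string. The flaw shows up only when the value is followed from source to sink, which is the kind of cross-function tracing the summary describes.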
Read the full article at DEV Community