Based on your detailed analysis of various models for an event extraction task, several key insights stand out:
Model Performance Highlights
- Gemma 4 26B A4B:
  - Best Accuracy: 96%
  - Perfect Recall: no false negatives (no real events were missed)
  - False Positives: 8
  - Conclusion: The best choice when missing a real event is the costliest error, which matters if completeness is the priority for your product.
- Qwen 3.5 9B:
  - Best Precision: 94.7%
  - False Positives: Only 2
  - Conclusion: This model excels at minimizing false positives but may miss some true events, making it suitable if avoiding incorrect classifications is a priority.
- Qwen 3.5 35B A3B:
  - Best Judge Score: average score of 70.3
  - Conclusion: Produces more polished and coherent structured outputs, but its event detection accuracy may be slightly worse than that of Gemma 4 26B A4B.
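The trade-off between recall and precision described above can be made concrete with a small sketch. The `Confusion` class and every count below are hypothetical and chosen only for illustration. They are not the article's evaluation data; only the false-positive counts (8 and 2) echo the figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for a binary 'is this an event?' decision."""
    tp: int  # real events correctly detected
    fp: int  # non-events wrongly flagged as events
    fn: int  # real events that were missed
    tn: int  # non-events correctly ignored

    def precision(self) -> float:
        # Of everything flagged as an event, how much was real?
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        # Of all real events, how many were found?
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0

    def accuracy(self) -> float:
        # Share of all decisions that were correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0


# Hypothetical counts (assumptions, not measured results).
high_recall = Confusion(tp=50, fp=8, fn=0, tn=142)      # misses nothing, over-flags a little
high_precision = Confusion(tp=36, fp=2, fn=14, tn=148)  # rarely over-flags, misses some events

for name, c in [("high recall", high_recall), ("high precision", high_precision)]:
    print(f"{name}: precision={c.precision():.3f} "
          f"recall={c.recall():.3f} accuracy={c.accuracy():.3f}")
```

With counts like these, a model with zero false negatives can lead on accuracy and recall while still trailing on precision, which is the pattern the comparison above describes.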
Fine-Tuning Insights
- The fine-tuned
Read the full article at DEV Community