AIResearch

Open-Source AI Matches Big Tech in Security

Locally run AI models can classify cyber threats with 60% accuracy, offering privacy and cost savings for organizations handling sensitive data.

AI Research
November 20, 2025
3 min read

In a world where cyber incidents are skyrocketing—Brazil alone reported over 516,000 in 2024 and 181,000 in just the first half of 2025—security teams are overwhelmed by alerts. This surge pushes organizations to seek AI solutions that can automate threat classification without compromising sensitive data. A new study reveals that open-source AI models, run on local servers, provide a viable alternative to expensive commercial systems, balancing accuracy with critical benefits like data privacy and cost control. For businesses and institutions, this means faster, more secure handling of security incidents without relying on external providers.

The researchers discovered that open-source models, when deployed on-premise, achieve accuracy rates around 60% for classifying security incidents into categories like malware, data leaks, and denial-of-service attacks. This performance, while lower than the over 90% accuracy of proprietary models like GPT-4o, is sufficient for initial triage and decision support in security operations centers. The study compared two groups of models: larger ones with up to 70 billion parameters and smaller ones ranging from 7 to 12 billion parameters, showing that even compact models can handle complex categorization tasks effectively. Importantly, these local models eliminate privacy risks associated with sending data to third-party servers, aligning with regulations like Brazil's General Personal Data Protection Law (LGPD).

To conduct the evaluation, the team used a dataset of 24 anonymized real-world security incidents, categorized according to the NIST SP 800-61r3 taxonomy, which includes 12 types such as account compromise and social engineering. They applied five prompt-engineering techniques—Progressive Hint Prompting (PHP), Self-Hint Prompting (SHP), Hypothesis Testing Prompting (HTP), Pattern Recognition Prompting (PRP), and Zero-Shot Learning (ZSL)—to various open-source models running locally via the Ollama framework. The pipeline involved stages from input data preprocessing to output analysis, ensuring reproducibility and standardization across models with different architectures, attention mechanisms, and activation functions.
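To make the prompting strategies concrete, here is a minimal sketch of two of them, Zero-Shot Learning (ZSL) and Progressive Hint Prompting (PHP). The category list, prompt wording, and function names are illustrative assumptions, not the paper's exact templates; `ask_model` stands in for any local inference call (e.g. a request to an Ollama server).

```python
# Illustrative sketch of ZSL and PHP prompting, as described above.
# Category names and prompt wording are assumptions for illustration,
# not the paper's exact templates.

CATEGORIES = [
    "malware", "data leak", "denial of service",
    "account compromise", "social engineering",
]  # subset of the NIST SP 800-61r3-style taxonomy mentioned in the study

def zero_shot_prompt(incident: str) -> str:
    """ZSL: a single-turn prompt with no examples or hints."""
    options = ", ".join(CATEGORIES)
    return (
        f"Classify the following security incident into exactly one "
        f"of these categories: {options}.\n\nIncident: {incident}\nCategory:"
    )

def progressive_hint_prompt(incident: str, previous_answers: list[str]) -> str:
    """PHP: feed the model's earlier answers back as hints."""
    base = zero_shot_prompt(incident)
    if not previous_answers:
        return base
    hints = "; ".join(previous_answers)
    return f"{base}\n(Hint: previous answers were: {hints}. Reconsider and answer again.)"

def classify_with_php(ask_model, incident: str, max_rounds: int = 3) -> str:
    """Iterate PHP rounds until the answer repeats or rounds run out.
    `ask_model` is any callable prompt -> answer (e.g. a local Ollama call)."""
    answers: list[str] = []
    for _ in range(max_rounds):
        answer = ask_model(progressive_hint_prompt(incident, answers))
        if answers and answer == answers[-1]:  # answer stabilized: stop early
            return answer
        answers.append(answer)
    return answers[-1]
```

In a local deployment, `ask_model` would wrap a call to the model served by Ollama; the iterative hinting loop is what the results suggest gives PHP its edge over single-shot prompting.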

The results, detailed in Figures 2 and 3 of the paper, show that Progressive Hint Prompting (PHP) was the most effective technique, achieving 61.7% accuracy in larger models and around 53% in smaller ones. In contrast, Hypothesis Testing Prompting (HTP) performed poorly, with accuracy as low as 18.9% in smaller models, highlighting how iterative hinting improves outcomes. Models with optimized architectures, such as those using grouped-query attention (GQA) and modern activation functions like SwiGLU, demonstrated greater consistency and efficiency. Additionally, operational metrics revealed that local deployments could reduce costs significantly, with token-processing expenses as low as $0.003 per thousand tokens compared to $0.09 for commercial options, making them economically attractive for sustained use.
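The quoted per-token rates imply roughly a 30x cost gap, in line with the savings of up to 29x reported elsewhere in the paper. A quick back-of-the-envelope calculation (the monthly token volume is an assumed figure for illustration, not from the study):

```python
# Back-of-the-envelope cost comparison using the per-thousand-token
# rates quoted above. The monthly token volume is an assumption.

LOCAL_RATE = 0.003       # USD per 1,000 tokens, local open-source deployment
COMMERCIAL_RATE = 0.09   # USD per 1,000 tokens, commercial API

def monthly_cost(tokens: int, rate_per_1k: float) -> float:
    """Cost in USD for processing `tokens` tokens at `rate_per_1k`."""
    return tokens / 1000 * rate_per_1k

tokens_per_month = 50_000_000  # assumed SOC triage volume
local = monthly_cost(tokens_per_month, LOCAL_RATE)
commercial = monthly_cost(tokens_per_month, COMMERCIAL_RATE)
print(f"local: ${local:,.0f}/mo, commercial: ${commercial:,.0f}/mo, "
      f"ratio: {commercial / local:.0f}x")
# → local: $150/mo, commercial: $4,500/mo, ratio: 30x
```

At sustained triage volumes the per-token gap compounds quickly, which is why the paper frames local deployment as economically attractive despite the accuracy trade-off.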

This research matters because it empowers organizations, especially those in sensitive sectors, to maintain control over their cybersecurity operations without sacrificing performance. By using open-source models, companies can avoid dependency on external AI providers, reduce costs by up to 29 times, and ensure compliance with data protection laws. For everyday readers, this means that critical infrastructure—from banks to government agencies—can respond faster to threats while keeping personal information secure. The approach also supports digital sovereignty, allowing countries to build resilient cyber defenses tailored to local needs.

However, the study acknowledges limitations, including the use of a small dataset of 24 incidents, which may not capture the full diversity of real-world threats. The accuracy of open-source models, while adequate for support tasks, still falls short of top-tier commercial systems, and performance varies with model size and architecture. Future work aims to expand the dataset, incorporate fine-tuning techniques, and add metrics like precision and recall to enhance reliability. These steps will help solidify open-source AI as a robust, interpretable tool for cybersecurity, though current users should be aware of its constraints in high-stakes scenarios.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn