AI Threat Landscape Digest January-February 2026

Sp123

29 Mar, 2026

KEY FINDINGS

Introduction to Malware Binary Triage (IMBT) Course

Looking to level up your skills? Get 10% off using coupon code: MWNEWS10 for any flavor.

Enroll Now and Save 10%: Coupon Code MWNEWS10

Note: Affiliate link – your enrollment helps support this platform at no extra cost to you.

AI-assisted malware development has reached operational maturity.
VoidLink framework, which is modular, professionally engineered, and fully functional,
was built by a single developer using a commercial AI-powered IDE within a compressed
timeframe. AI-assisted development is no longer experimental but produces deploymentready
output.

AI-assisted development is not always obvious from the final product.
VoidLink was initially assessed as the work of a coordinated team based on its architecture
and implementation quality. The development method was exposed not from analyzing
the malware but through an operational security failure. AI-assisted development
should be considered a possibility from the outset, not as an afterthought.

Adoption of self-hosted, open-source AI models is growing but still limited in
practice.
Actors of varying skill levels are investing in self-hosted and unrestricted models to
avoid commercial platform restrictions. However, underground discussions consistently
reveal a gap between aspiration and capability: local models still underperform, finetuning
remains aspirational, and commercial models remain the productive choice
even for actors with explicit malicious intent.

Jailbreaking is shifting from direct prompt engineering toward agenticarchitecture
abuse.
Traditional copy-paste jailbreaks are increasingly ineffective. The misuse of AI agent
configuration mechanisms, specifically project files that redefine agent behavior, is a
more significant development as it represents a qualitative shift from manipulating a
model’s responses to abusing its operational architecture.

AI is showing early signs of deployment as a real-time operational component.
Beyond its use as a development aid, AI is beginning to appear as a live element in
offensive workflows as autonomous agents performing security research tasks, and
LLMs classifying and engaging targets at scale within automated pipelines.

Enterprise AI adoption is itself an expanding attack surface.
GenAI activity across enterprise networks shows that one in every 31 prompts risked
sensitive data leakage, impacting 90% of GenAI-adopting organizations.

INTRODUCTION

During January-February 2026, cyber crime ecosystems continue to adopt AI in a widespread but
uneven pattern. Throughout 2025, legitimate software development began shifting from promptbased
AI assistance to agent-based development. Tools such as Cursor, GitHub Copilot, Claude Code,
and TRAE introduced a common paradigm: developers write structured specifications in markdown
files, and AI agents autonomously implement, test, and iterate code based on those instructions. This
agentic model, in which markdown is the operative control layer, is now starting to appear across the
threat landscape.

The critical differentiator in what we observed is AI methodology combined with domain expertise. Across
cyber crime forums, the dominant pattern of AI use remains unstructured prompting: actors request
malware or exploit code from AI models as if entering a query in a search engine. VoidLink (detailed
below) on the other hand, is the first documented case of AI producing truly advanced, deploymentready
malware. The developer combined deep security knowledge with a disciplined, spec-driven
workflow to produce results indistinguishable from professional team-based engineering. Forum
activity, which constitutes the bulk of observable evidence, primarily consists of actors who have not
yet adopted structured AI workflows and whose efforts remain relatively unsophisticated. The more
capable actors, those who combine domain expertise with disciplined AI methodology, leave far fewer
traces in open forums, making the true scope of this shift harder to measure.

VOIDLINK: THE STANDARD WE MEASURE AGAINST

In January 2026, Check Point Research (CPR) exposed VoidLink, a Linux-based malware framework
featuring modular command-and-control (C2) architecture, eBPF and LKM rootkits, cloud and container
enumeration, and more than 30 post-exploitation plugins. The framework is highly sophisticated and
professionally engineered, so much so that the initial assessment was that VoidLink was likely the product
of a coordinated, multi-person development effort conducted over months of intensive development.

Operational security (OPSEC) failures by the developer later exposed internal development artifacts
that told a different story. These materials revealed that VoidLink was authored by a single
developer using TRAE SOLO, the paid tier of ByteDance’s commercial AI-powered IDE. Instead of
unstructured prompting, the developer used Spec Driven Development (SDD), a disciplined engineering
workflow, to first define the project goals and constraints, and then use an AI agent to generate a
comprehensive architecture and development plan across three virtual teams (Core, Arsenal, and
Backend). The resulting plan included sprint schedules, feature breakdowns, coding standards,
and acceptance criteria, all documented as structured markdown files. The AI agent implemented
the framework sprint by sprint, with each sprint producing working, testable code. The developer
acted as product owner, directing, reviewing, and refining, while the AI agent did the actual work.

The results were striking. The recovered source code aligned so closely with the specification documents
that it left little doubt that the codebase was written to those exact instructions. What normally would
have been a 30-week engineering effort across three teams was executed in under a week, producing
over 88,000 lines of functional code. VoidLink reached its first functional implant around December
4, 2025, one week after development began.

THIS CASE ESTABLISHES TWO PRINCIPLES:

AI-assisted development now produces operationally viable, deployment-ready
malware: it has crossed the threshold from experimental to functional.
The AI involvement was invisible until it was exposed by an unrelated OPSEC failure. For
analysts and defenders, this means AI involvement in malware development should be
treated as a default working assumption, even when there are no visible indicators

The ramifications of VoidLink’s methodology go beyond this individual case. Its workflow, in which structured markdown specifications direct an AI agent to autonomously implement, test, and iterate, is the same paradigm that defined the agentic AI revolution in legitimate software development throughout 2025. The cyber crime ecosystem is not developing its own AI capability. It is adopting the same tools and architectural patterns as legitimate technology, with the additional goal of trying to overcome the protective limitations built into these systems. This is more important than which model or platform the attackers use.

The same architectural pattern repeatedly appears across the cases highlighted in our report: markdown skill files that transform a coding agent into an autonomous offensive security operator, and configuration files abused to override agent safety controls. In each case, the operative control layer is not code but structured documentation that determines what the AI agents build, how they behave, and what constraints they observe or ignore. This is in direct contrast to the underground forum activity, where the dominant approach remains unstructured prompting.

MODELS: COMMERCIAL, SELF-HOSTED, AND INFORMAL SERVICES

SELF-HOSTED OPEN-SOURCE MODELS

Across cyber crime forums, actors at all skill levels are actively exploring self-hosted, open-source AI models as alternatives to commercial platforms. Their motivations are consistent: to avoid moderation, prevent account bans, and maintain operational privacy.

Users with malware and hacking backgrounds are installing uncensored model variants such as wizardlm-33b-v1.0-uncensored and openhermes-2.5-mistral, and prompt them with comprehensive malicious wishlists spanning ransomware, keyloggers, phishing kits, and exploit code.

Figure 1 – User installing local LLM variants and prompting them to generate malware and fraud tooling.

More established actors are conducting structured cost-benefit analyses, evaluating not only hardware requirements and GPU costs but whether locally hosted models produce reliable output (or hallucinate to the point of being operationally useless), and whether AI-generated malware meets the quality bar of current evasion techniques.

Figure 2 – Threat actor inquiry into hardware, cost, and feasibility of running a fully “unrestricted” locally hosted model.

SELF-HOSTED MODELS: LIMITATIONS IN PRACTICE

Self-hosted models consistently show a gap between aspiration and capability. Community advice on improving local model output focuses on basic optimizations, such as switching to English-language prompts and increasing quantization levels, while references to more advanced techniques such as LoRA fine-tuning remain aspirational rather than operational.

Figure 3 – Community feedback suggesting alternative local models and highlighting token/context limitations of smaller deployments.

Cost estimates range from $5,000 to $50,000 depending on the desired performance, with training timelines of 3–12 months and frank admissions that models “hallucinate a lot” without extensive investment.

Figure 4 – Discussion on cost and requirements for locally hosted unrestricted models.

Most tellingly, an active offensive tools vendor, advertising C2 setups, EDR bypass services, and red team tooling, concluded that local deployment is currently “more of a burden than something productive,” while acknowledging that commercial models remain useful despite increasing restrictions.

Figure 5 – Participants comparing commercial AI systems with alternative models and discussing perceived restriction levels.

COMMERCIAL PLATFORMS AND INFORMAL ACCESS SHARING

Rather than migrating to self-hosted infrastructure, users are comparing what the prevailing workarounds among commercial models provide. Participants recommended specific providers they view as less restrictive, shared experiences with account enforcement on multiple platforms, and refined prompt-splitting techniques to incrementally bypass safeguards, such as requesting explanations before progressing toward executable code.

Figure 6 – Example of the structured prompt-splitting technique suggested to incrementally bypass AI safety restrictions.

Some early signs of informal access sharing have been observed, with operators of local models offering to generate restricted outputs for others on request. However, given the historical precedent of “dark LLM” services that largely failed to deliver on their promises, it remains to be seen whether these will develop into durable service models.

Figure 7 – Community member offering private generation of restricted output via locally hosted model infrastructure.

JAILBREAKING AS ARCHITECTURAL ABUSE

Traditional jailbreaking, the practice of circulating copy‑paste prompts designed to trick models into producing restricted output, is becoming increasingly difficult to utilize. In some forum discussions, users seeking Claude jailbreaks were told that easy public prompts are no longer available, platforms have been cracking down on abusers, dedicated subreddits have been banned, and developing new jailbreaks is costly because the accounts are eventually terminated. Single‑prompt jailbreaking is becoming less attractive as model providers invest in safety enforcement.

Figure 8 – Forum discussion highlighting the declining availability of easy public jailbreak prompts.

ABUSING AGENT ARCHITECTURE

A more significant development is the emergence of jailbreaking techniques that target the architecture of AI agent systems rather than the model’s conversational safeguards. A packaged “Claude Code Jailbreak” distributed on forums illustrates this shift.

Claude Code is designed to read a CLAUDE.md file from a project’s root directory as configuration. Legitimate developers use this mechanism to define the project context, coding standards, and agent behavior. The jailbreak abuses this by placing override instructions in the CLAUDE.md file that suppresses safety controls and redefines the agent’s role. When Claude Code initializes in the directory, it reads these instructions as authoritative project configuration and follows them. The screenshots below claim successful generation of a RAT (Remote Access Trojan) using this method.

Figure 9 – Packaged Claude Code jailbreak exploiting the CLAUDE.md project configuration mechanism.

Figure 10 – Alleged jailbreak output showing generation of remote access malware code.

This is not prompt injection in the traditional sense, but manipulation of the agent’s instruction hierarchy, the same architecture used for agentic AI tools in legitimate development. The CLAUDE. md file occupies the same functional role as VoidLink’s markdown specification files or RAPTOR’s skill definitions: a structured document that determines what the agent does, how it behaves, and what constraints it observes.

FROM DEVELOPMENT TOOL TO OPERATIONAL AGENT

The preceding sections document AI as a development aid (as seen by VoidLink), a resource actors struggle to access on their own terms (self-hosted models), and as a system whose restrictions they attempt to bypass (jailbreaking). Now let’s look at AI deployed as a real-time operational component, performing offensive tasks autonomously within live workflows.

RAPTOR: AGENT-BASED OFFENSIVE ARCHITECTURE VIA MARKDOWN SKILLS

RAPTOR is a legitimate, open-source security research framework created by established security researchers and published on GitHub under an MIT license. It is not malicious tooling. Its significance for threat intelligence lies in its architectural pattern, and that criminal communities are paying attention.

RAPTOR transforms Claude Code into an autonomous offensive security agent through a set of markdown skill files and agent definitions. The framework integrates static analysis, fuzzing, exploit generation, and vulnerability triage into an agentic pipeline orchestrated entirely through structured markdown instructions, with no compiled tooling required. In its most explicit form, it demonstrates what the agentic paradigm makes possible: a set of text files that turn a general‑purpose coding agent into a specialized offensive security operator.

Figure 11 – RAPTOR documentation highlighting offensive security agent capabilities and exploit generation benchmarks across LLM providers.

RAPTOR’s own data provides an additional data point on the commercial versus self-hosted question we discussed earlier. An evaluation of exploit generation across multiple model providers found that commercial frontier models (Anthropic Claude, OpenAI GPT-4, and Google Gemini) consistently produce compilable C code at approximately $0.03 per vulnerability, while locally hosted models via Ollama were marked as “often broken” and unreliable for exploit generation. This reinforces the conclusion reached independently by experienced actors in underground forums: commercial models remain significantly more capable than self-hosted alternatives for operational tasks.

Figure 12 – Forum post sharing RAPTOR as an autonomous offensive and defensive security framework built on Claude Code.

Discussions on criminal forums indicate that threat actors are aware of this architecture. The combination of a proven architectural pattern, open source availability, and documented criminal interest suggests that similar configurations, whether directly based on RAPTOR or just replicating its approach, are likely being developed and tested privately.

AI AS ATTACK SURFACE: ENTERPRISE EXPOSURE

The preceding sections document how threat actors engage with AI as an offensive tool. But the same wave of AI adoption is simultaneously creating exposure from the defensive side. As enterprises integrate generative AI into daily workflows, the volume of sensitive data flowing through these tools introduces a distinct category of risk: instead of AI weaponized against organizations, AI is adopted by organizations in ways that outpace security controls.

In January – February 2026, corporate use of generative AI tools continued to expand at scale. Analysis of GenAI activity across enterprise networks shows that one in every 31 prompts (approximately 3.2%) posed a high risk of sensitive data leakage, including the potential sharing of confidential business information, regulated data, source code, or other sensitive corporate content with external GenAI services.

Critically, this risk is broadly distributed across the enterprise landscape rather than limited to a small number of outliers. High-risk prompt activity impacted 90% of organizations that use GenAI tools on a regular basis, indicating that nearly all GenAI-adopting enterprises encounter meaningful data leakage risk through everyday AI usage. Beyond these clearly high-risk events,16% of prompts contained potentially sensitive information, reflecting a wider pattern of questionable data-handling behavior that can still translate into compliance exposure or IP loss.

Adoption trends further amplify the challenge. Over the last couple of months, organizations used 10 different GenAI tools on average, reflecting multi-tool environments. At the user level, an average employee generated 69 GenAI prompts per month. As prompt volume grows, the possibility of data exposure events scales accordingly, reinforcing the need for security policies, visibility, and real-time prevention controls.

The post AI Threat Landscape Digest January-February 2026 appeared first on Check Point Research.

Article Link: https://research.checkpoint.com/2026/ai-threat-landscape-digest-january-february-2026/

1 post - 1 participant

Read full topic

Malware Analysis, News and Indicators - Latest topics

Sp123

"An underestimated security threat to organizations is employee apathy and burn out.‌‌"

AI Threat Landscape Digest January-February 2026

KEY FINDINGS

Introduction to Malware Binary Triage (IMBT) Course

INTRODUCTION

VOIDLINK: THE STANDARD WE MEASURE AGAINST

THIS CASE ESTABLISHES TWO PRINCIPLES:

MODELS: COMMERCIAL, SELF-HOSTED, AND INFORMAL SERVICES

SELF-HOSTED OPEN-SOURCE MODELS

SELF-HOSTED MODELS: LIMITATIONS IN PRACTICE

COMMERCIAL PLATFORMS AND INFORMAL ACCESS SHARING

JAILBREAKING AS ARCHITECTURAL ABUSE

ABUSING AGENT ARCHITECTURE

FROM DEVELOPMENT TOOL TO OPERATIONAL AGENT

AI AS ATTACK SURFACE: ENTERPRISE EXPOSURE

Sp123

Popular Posts

Archive