ISSN (Online): 2321-3418
Engineering and Computer Science
Open Access

Security in Multi-Agent AI Systems: Modeling Emergent Vulnerabilities via Trust Graphs

DOI: 10.18535/ijsrm/v14i04.ec04· Pages: 2863-2874· Vol. 14, No. 04, (2026)· Published: July 3, 2026

Abstract

Autonomous multi-agent artificial intelligence (AI) systems are a rapidly evolving field that is changing how autonomous systems make decisions together, collaborate on tasks, and learn, opening new paradigms for distributed decision-making, task execution, and adaptive learning. The complexity of these systems, however, introduces serious security issues because the autonomous agents depend on one another. Vulnerabilities in multi-agent systems are no longer confined to individual agents; they spread across the system through trust-based interactions, leading to systemic risks that traditional security methods cannot adequately identify or contain. These emerging risks underscore the need for a structured approach that treats trust as a fluid, security-sensitive factor in interdependent AI systems. To address this challenge, this paper proposes a novel graph-based approach to representing trust relationships and vulnerability propagation in multi-agent AI systems, called the Emergent Trust Vulnerability Graph (ETVG). The study also introduces the Agentic Trust Dynamics Theory (ATDT), a conceptual theory that models the evolution, loss, and impact of trust on emerging security vulnerabilities in a network of autonomous agents. Together, these contributions form a framework for analyzing systemic risks that emerge from interactions between agents rather than from the failure of individual agents. Methodologically, the framework uses standard graph modeling techniques in which each agent is a node and trust is a weight on the edges. Trust propagation analysis is used to simulate how influence and compromise spread through connected agents.
To validate the model, controlled simulation experiments are conducted in multi-agent coordination environments with varying degrees of adversarial interference and trust manipulation. The experimental results show that the ETVG framework achieves high accuracy in vulnerability detection and mitigates the cascading effect of compromise across agent networks. The proposed model also improves trust-risk prediction, effectively identifying high-risk nodes in complex interaction structures. Compared with conventional baseline approaches, the framework maintains system stability better under attack. By formalizing trust as a dynamic security vector, this work contributes to the understanding of fundamental security problems in autonomous AI environments and offers new means of designing resilient multi-agent systems that can resist emergent threats. The results also point to promising avenues for future research on secure, adaptive, and self-regulating infrastructures for AI agents.

Keywords

Multi-Agent AI Security; Trust Graphs; Emergent Vulnerabilities; Autonomous Agents; Graph-Based Security; Trust Propagation; AI Threat Modeling; Agentic Systems

1. Introduction

1.1 Background

Autonomous AI agents represent a leap forward in AI, moving beyond static ML models to active decision-making processes. These agents are no longer passive inference engines: they perceive their environments, interact with other agents, and make decisions based on inferred or shared goals. This autonomy has enabled distributed decision-making systems in which several agents must work together to reach goals too complex for any single agent.

Centralized decision-making has given way to distributed AI ecosystems, in which intelligence is spread across multiple interacting nodes, each with only limited knowledge and reasoning capacity. This design is more robust, fault-tolerant, and computationally efficient. It also creates relationships between agents, including implicit and explicit trust relationships. These trust mechanisms govern how agents share information, hand off tasks, and verify one another's output.

Figure 1. History of artificial intelligence.

This is further complicated by collaborative AI ecosystems, especially those built on large language model (LLM) agents and reinforcement learning (RL) systems. Agents report intermediate results, adjust their actions based on other agents' results, and modify their behaviour based on other agents' feedback. This collaboration can increase efficiency and help the system function properly, but it also couples agents in ways that allow errors, misinformation, or malicious influence to spread to an unpredictable extent.

1.2 Problem Statement

Although the use of multi-agent AI systems is increasing, existing security mechanisms are oriented towards the behaviour of a single agent rather than agent interactions at the system level. Common solutions target individual components through model authentication, access control, anomaly detection, and model robustness. These approaches, however, cannot model emergent properties that arise when several agents interact in the same environment under shared trust.

A major shortcoming of current approaches is that they cannot identify vulnerabilities that emerge not from the compromise of any single agent but from the interaction of several agents. Each agent may behave correctly under its local rules, yet their interactions can produce system-level vulnerabilities that are unpredictable or undetectable.

Another critical aspect is the hidden attack surface created by trust relationships between agents. In these systems, trust is typically predetermined or only slightly adjusted over time to promote cooperation. An adversary can exploit this trust to gradually shape agents' actions, disseminate false information, or take over decision pathways. Adversarial influence can thus spread through trust networks without triggering standard anomaly detection, slowly undermining the system.

1.3 Research Gap

Existing security frameworks lack a formal mechanism for modeling vulnerabilities that emerge from trust propagation in multi-agent AI systems. Although graph-based models and trust evaluation techniques exist in other areas (network security, distributed computing, etc.), they have not been integrated into a single model that accounts for the dynamic and recursive nature of trust interactions in autonomous AI ecosystems. In particular, no global model formally represents how trust relationships evolve over time and how these evolving relationships contribute to system-wide vulnerabilities.

In addition, current methods do not adequately measure the global security risks that arise from local trust decisions. Without a formal model of trust propagation, it is difficult to analyze cascading failures or to predict how the compromise of one element affects other agents. This gap shows the need for a formalised approach that models trust as a dynamic security variable in multi-agent environments.

1.4 Aim and Objectives

This research aims to create a framework for modelling emergent vulnerabilities in multi-agent AI systems using a trust graph.

To achieve this goal, the study has the following objectives:

To create a formal graph-based model of trust relationships between autonomous AI agents, named the Emergent Trust Vulnerability Graph (ETVG), which models both static and dynamic security properties of the graph.

To model how trust propagates among autonomous agents, how trust values evolve, and how that evolution affects agent interactions in a distributed system.

To quantify cascading compromise behaviours by examining how vulnerabilities propagate through the trust relationships of a multi-agent network.

To validate and evaluate the proposed framework in simulated multi-agent and benchmark scenarios that reproduce realistic adversarial and cooperative interactions.

1.5 Contributions of the Paper

This paper makes several contributions to the fields of multi-agent AI security and graph-based vulnerability modeling.

First, it offers a new formal structure, the Emergent Trust Vulnerability Graph (ETVG), that can be used to model trust relationships and to uncover emergent vulnerabilities in autonomous AI systems. This offers a perspective on security that goes beyond analysis at the agent level.

Second, it develops a trust-based vulnerability modeling approach that captures how trust propagates and how that propagation relates to system-wide risk exposure. This allows security analysis to keep pace with agent interactions that are constantly changing.

Third, it provides a security analysis framework based on graph theory to quantify and analyze cascading compromise effects in a multi-agent system. This bridges trust modelling theory and practical security assessment.

Finally, it provides and implements an experimental validation framework that simulates multi-agent interactions in different trust and adversarial contexts. This yields empirical evidence that the proposed model captures emergent security behaviours that traditional approaches miss.

2.1 Security in Multi-Agent AI Systems

Distributed AI security has primarily focused on secure communication, coordination integrity, and adversarial robustness in networks of interacting agents. Early work in distributed artificial intelligence focused on fault tolerance and secure message-passing protocols, especially in environments that agents can only partially observe and control. More recent research encounters the same issues with autonomous agents based on large language models, whose communication relies on natural language and is inherently ambiguous.

An important area of study is agent communication security, which addresses the possibility that messages exchanged between agents are intercepted, modified, or misunderstood. Vulnerabilities identified in LLM-based multi-agent systems include prompt injection, message spoofing, and instruction hijacking. These vulnerabilities stem from the lack of strict controls that would let agents verify the intent and authenticity of incoming messages.

Cooperative attacks are another significant risk: multiple agents (compromised or adversarially coordinated) can work together to influence the outcome of the system. Cooperative threats exploit decentralized decision-making paths, and malicious actions may take the form of apparently innocent interactions. Even partially flawed agents have been shown to sway group consensus and ultimately cause cascading problems in task execution.

In most of the current literature, however, security violations are treated as isolated events rather than as properties of the system. This is a particular limitation for large autonomous systems, where the interaction itself is a significant attack surface.

Table 1 Comparative Review of Existing Security Approaches

| Study | Security Focus | Method | Limitation |
|---|---|---|---|
| Zhang et al. (2021) | Cyber-Physical Security | Static detection | No trust propagation |
| Liu et al. (2022) | Graph security | GNN analysis | No emergent vulnerability modeling |
| Proposed ETVG | Trust vulnerability modeling | Dynamic trust graph | Handles cascading compromise |

2.2 Trust Modeling in Autonomous Systems

Trust modeling is fundamental to coordination in multi-agent systems, where the trustworthiness and credibility of participating agents must be evaluated. Existing trust computation techniques rely on interaction histories, reputation scores, and probabilistic inference. Their goal is to produce a numerical estimate of how much one agent should trust another, given what it has observed in the past.

In a dynamic environment, trust evaluations must be revised continually as interactions change. To capture uncertainty and adaptivity, reinforcement learning-based trust models and Bayesian trust frameworks have been explored. These models maintain trust scores that are regularly adjusted based on feedback such as task success rates, consistency of answers, and adherence to expected behavior.

Recent advances in decentralized systems have led to distributed trust management architectures with no central entity responsible for assigning trust. Instead, trust is established across the network through peer-to-peer evaluation mechanisms. Blockchain and federated reputation systems have been widely explored in this context.

Even with these developments, many trust models treat trust relationships as benign and independent. They typically do not account for adversarial manipulation of trust propagation itself, in which malicious agents strategically manipulate propagation to gain indirect control over distant nodes. This matters for highly interconnected multi-agent systems, where trust is not given but evolves from interaction patterns and the structure of the network.

2.3 Graph-Based Security Modeling

Graphs are widely used in AI to model complex structures and systems in which the relationships and hierarchical connections between nodes matter. In cybersecurity, an attack graph is used to represent the attack scenarios that could occur in a system. Such graphs represent system states as nodes and transitions as directed edges, and can be systematically analyzed for the feasibility and propagation of attacks.

Similarly, dependency graphs specify functional relations between the elements of a distributed system. Nodes represent system agents or modules, and edges represent dependency relationships, meaning that the failure of one node could affect others. This methodology has been used in studies of cloud security, software supply chains, and network resilience.

In recent years, graph neural networks (GNNs) have been used to identify structural patterns in complex systems and thereby enhance security modelling. GNN-based security techniques can detect suspicious nodes, predict attack pathways, and identify risky network configurations from embeddings learned on the graph topology.

However, most graph-based security models target static or semi-static systems with predefined failure states. They can rarely capture the adaptive behavior of autonomous multi-agent systems, in which interactions between agents (trust, collaboration) change over time. Traditional graph models cannot fully describe this dynamism.

2.4 Emergent Vulnerability Analysis

Emergent vulnerabilities are vulnerabilities that are not evident in individual components but arise from the interactions between the components of a system. They are particularly important in multi-agent systems because agents' behaviors are closely coupled and change dynamically.

One phenomenon studied in this area is cascading failure, in which the failure or compromise of one node triggers a chain reaction of failures in other nodes. It has been extensively studied in power grids and in financial and communication networks, where interdependencies allow the failure of a node or group of nodes to affect the whole system.

A related concept is systemic risk: small changes in a system can have large global implications. In AI-driven ecosystems, systemic risk can manifest as incorrect decisions or instability spreading across the ecosystem, potentially caused by the collective actions of the agents.

Another major issue in distributed intelligence systems is collective compromise, where groups of agents jointly converge on incorrect or malicious states because of biases in shared information or external manipulation. Collective compromise differs from individual failure in that it is visible only in patterns of interaction, which makes it hard to detect with traditional anomaly detection methods.

These studies clearly show that system-level analysis is needed; however, none of them formalises how trust relationships specifically contribute to vulnerabilities in autonomous AI systems.

3. Agentic Trust Dynamics Theory (ATDT)

3.1 Concept of Agentic Trust Dynamics

The fundamental idea behind ATDT is that trust relationships between autonomous agents are formed continually through their interactions. In multi-agent AI systems, each agent assesses others based on past interactions, considering their reliability in completing tasks, the consistency of their communication, and the alignment of their behavior. These assessments are not static; they are revised iteratively as new information arrives.

This fluidity of trust means that the security situation is continually changing. An agent that is trusted at one time may lose trust because it exhibits slight behavioral deviations, while another agent may gain trust because it behaves consistently or interacts in a particular manner. Importantly, these trust adjustments are often decentralized and locally computed, not controlled by a single global trust authority.

In this setting, vulnerabilities can arise at the level of groups rather than individual agents. Because agents rely on each other to accomplish tasks, small trust discrepancies can be amplified throughout the system and alter collaboration patterns, which in turn affects downstream decision-making. The system is therefore extremely sensitive to changes in trust relationships, and its security cannot be understood without analysing these changing dependencies.

3.2 Emergent Vulnerability Formation

Emergent vulnerability formation is the process by which local vulnerabilities in agent behavior turn into a global security threat because of dependencies among agents. ATDT defines three primary mechanisms behind this phenomenon: cascading compromise, dependency amplification, and cooperative threat propagation.

Cascading compromise occurs when the failure or malicious behaviour of one agent triggers a chain reaction in other agents. Because agents share information and delegate tasks to one another, the corruption of one agent can spread to others, and the system slowly degrades. This domino effect is particularly damaging in high-trust networks with many tightly coupled interdependencies.

Dependency amplification magnifies this effect by increasing the influence of more trusted agents. In many multi-agent systems, some nodes act as trust or coordination hubs. When such an agent is compromised, it has a disproportionate impact on the system and worsens its vulnerability. This creates a structural imbalance: highly trusted nodes become high-value targets for adversarial exploitation.

Cooperative threat propagation adds further complexity: several agents, either malicious or misled, cooperate to propagate a false threat or to alter the operation of the system. Unlike single-point attacks, cooperative threats exploit the distributed nature of decision-making, so they are extremely difficult to detect. Taken together, these mechanisms show that vulnerability in multi-agent systems is not a singular event but a property of the trust-based interaction network.

3.3 Trust as a Security Variable

Traditional security architectures treat trust as a secondary input to access control or as a simple trust score. ATDT instead holds that trust is a first-order security variable that actively contributes to the vulnerabilities of the system. From this point of view, trust is not only a measure of reliability but also a mechanism that shapes the exchange of information, the transfer of decision-making authority, and how much agents defer to one another's influence.

Trust is an intrinsic attack surface because it shapes the pathways of interaction. Adversaries do not need to attack the components of a system directly; they can attack the trust relationships that indirectly govern those components. For instance, an adversary could tamper with trust scores or exploit feedback in the trust evaluation to shape the flow of information, suppress the reporting of important information, or increase trust in compromised agents.

This understanding of trust enables proactive control of the system. Protection in multi-agent systems therefore involves not only protecting the agents but also protecting the integrity of the process by which trust forms, propagates, and decays.

3.4 Theoretical Assumptions

The Agentic Trust Dynamics Theory is based on several assumptions that establish the boundaries and scope for analysis.

First, ATDT assumes that there is no centralised enforcement of trust relationships and that agents operate autonomously, each making its own decisions based on local information. Such autonomy is necessary for modelling realistic distributed AI environments.

Second, it assumes that trust values change continuously over time. Trust is not static but is updated continually as a consequence of interactions, feedback, and observed behaviour. Incorporating this temporal dimension allows trust relations to be slowly eroded or reinforced.

Third, ATDT assumes that malicious influence can propagate along trust paths. A compromised or malicious agent can indirectly affect other agents through existing trust relationships. This propagation is not necessarily instantaneous; it proceeds through successive updates of trust assessments and chains of interactions between agents.

Together, these assumptions yield a realistic yet analytically tractable model of multi-agent AI systems. They allow ATDT to represent and simulate the dynamic, interdependent, and emergent nature of security risks in modern autonomous environments.

4. Proposed Emergent Trust Vulnerability Graph (ETVG) Framework

4.1 Framework Architecture

The ETVG framework is formally defined as a directed, weighted, dynamic graph:

$$ETVG = (A, T, C, R)$$

where:

  • $A$ represents the set of autonomous AI agents (nodes),

  • $T$ represents the trust adjacency matrix,

  • $C$ represents communication dependencies between agents,

  • $R$ represents risk propagation states associated with each node.
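As a rough illustration of this definition, the four components can be held in one small container. The sketch below is a minimal example under stated assumptions, not the authors' implementation: the class name `ETVG`, its field and method names, and the example agent names are all hypothetical, and trust and dependencies are stored sparsely rather than as dense matrices.

```python
from dataclasses import dataclass, field


@dataclass
class ETVG:
    """Illustrative container for the tuple (A, T, C, R); all names are hypothetical."""
    agents: set[str] = field(default_factory=set)                      # A: agent identifiers
    trust: dict[tuple[str, str], float] = field(default_factory=dict)  # T: (i, j) -> trust i assigns to j
    depends_on: dict[str, set[str]] = field(default_factory=dict)      # C: i -> agents that i relies on
    risk: dict[str, float] = field(default_factory=dict)               # R: per-agent risk state

    def add_agent(self, name: str, risk: float = 0.0) -> None:
        self.agents.add(name)
        self.depends_on.setdefault(name, set())
        self.risk[name] = risk

    def set_trust(self, i: str, j: str, weight: float) -> None:
        if i not in self.agents or j not in self.agents:
            raise KeyError("both agents must be added before setting trust")
        if not 0.0 <= weight <= 1.0:
            raise ValueError("trust weights must lie in [0, 1]")
        self.trust[(i, j)] = weight

    def add_dependency(self, i: str, j: str) -> None:
        """Record that agent i relies on agent j (for data, validation, and so on)."""
        self.depends_on[i].add(j)


g = ETVG()
for name in ("planner", "retriever", "executor"):
    g.add_agent(name)
g.set_trust("planner", "retriever", 0.9)   # directed: planner trusts retriever
g.set_trust("executor", "planner", 0.7)
g.add_dependency("planner", "retriever")
```

Keeping trust and communication dependencies in separate structures mirrors the distinction the definition draws between $T$ and $C$: an agent may depend on another without trusting it highly, and the reverse.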

Graph Nodes (Agents)

Each node in the ETVG represents an autonomous AI agent capable of perception, reasoning, communication, and action execution. These agents may operate under heterogeneous objectives but share interdependencies through collaborative task execution.

Nodes are not static entities; instead, they maintain evolving internal states reflecting:

  • trustworthiness perception,

  • task performance history,

  • communication reliability,

  • vulnerability exposure level.

Trust Edges

Edges in the ETVG represent directed trust relationships between agents. A directed edge $T_{ij}$ indicates the level of trust agent $i$ assigns to agent $j$. These edges are weighted and dynamically updated based on interaction outcomes.

Trust edges serve as both:

  • information channels,

  • and vulnerability conduits.

A critical property of the framework is that high-trust edges act as high-risk propagation pathways when compromised.

Communication Dependencies

Beyond trust, agents interact through structured communication dependencies represented by matrix $C$. This defines which agents rely on others for:

  • data inputs,

  • decision validation,

  • task delegation,

  • external tool execution.

These dependencies introduce indirect security exposure, where compromise of one agent can influence non-adjacent agents through dependency chains.
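The indirect exposure described here can be made concrete with a breadth-first search over the dependency relation. This is a minimal sketch, assuming dependencies are given as a mapping from each agent to the agents it relies on; the function name `exposed_agents` and the agent names are hypothetical.

```python
from collections import deque


def exposed_agents(depends_on, compromised):
    """Return {agent: hops} for every agent that transitively relies on `compromised`."""
    # Invert the relation so that we can walk from a provider to its dependents.
    dependents = {}
    for agent, providers in depends_on.items():
        for provider in providers:
            dependents.setdefault(provider, set()).add(agent)

    hops = {}
    seen = {compromised}
    queue = deque([(compromised, 0)])
    while queue:
        node, distance = queue.popleft()
        for nxt in sorted(dependents.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                hops[nxt] = distance + 1
                queue.append((nxt, distance + 1))
    return hops


depends_on = {
    "planner": {"retriever"},   # planner consumes retriever output
    "executor": {"planner"},
    "auditor": {"executor"},
    "retriever": set(),
    "logger": set(),
}
exposure = exposed_agents(depends_on, "retriever")
# auditor never talks to retriever, yet it sits three hops downstream of it.
```

In this example the auditor has no direct link to the retriever but is still exposed through the chain retriever, planner, executor, while the logger is unaffected.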

Risk Propagation Layers

The ETVG framework introduces layered risk diffusion, where vulnerabilities propagate across multiple abstraction levels:

  1. Local Layer – direct agent compromise

  2. Relational Layer – trust-edge contamination

  3. Structural Layer – network-wide trust distortion

  4. Systemic Layer – emergent global instability

This layered structure enables modeling both micro-level and macro-level security effects within the same framework.

Figure 2. Security and compliance architecture within the proposed multi-agent AI ecosystem.

4.2 Agent Representation Model

Each agent $a_i \in A$ is represented as a structured state vector capturing behavioral, cognitive, and security-relevant attributes.

Node Attributes

An agent is defined as:

$$a_i = (S_i, P_i, T_i, V_i)$$

where:

  • $S_i$: state vector (task status, environment perception, internal memory)

  • $P_i$: policy function (decision-making logic or LLM policy layer)

  • $T_i$: trust profile vector (incoming and outgoing trust relationships)

  • $V_i$: vulnerability exposure score

These attributes allow each agent to be modeled not only as a computational unit but as a security-sensitive entity embedded within a relational trust ecosystem.
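One way to picture this tuple is as a small record type. The sketch below is illustrative only: the class `Agent`, its field names, the two example policies, and the state key `pending_inputs` are assumptions introduced here, and the policy is reduced to a plain function from state to an action label.

```python
from dataclasses import dataclass, field
from typing import Callable


def idle_policy(state: dict) -> str:
    """Placeholder policy used when none is supplied."""
    return "idle"


@dataclass
class Agent:
    name: str
    state: dict = field(default_factory=dict)           # S_i: task status, perception, memory
    policy: Callable[[dict], str] = idle_policy         # P_i: decision-making logic
    trust_profile: dict = field(default_factory=dict)   # T_i: neighbour -> trust weight
    vulnerability: float = 0.0                          # V_i: vulnerability exposure score

    def act(self) -> str:
        """Apply the policy to the current state and return the chosen action."""
        return self.policy(self.state)


def cautious_policy(state: dict) -> str:
    # Hypothetical rule: verify inputs before doing anything else.
    return "verify" if state.get("pending_inputs", 0) > 0 else "idle"


planner = Agent(
    name="planner",
    state={"pending_inputs": 2},
    policy=cautious_policy,
    trust_profile={"retriever": 0.9},
    vulnerability=0.1,
)
action = planner.act()
```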

Behavioral State

The behavioral state captures how an agent behaves under varying levels of system pressure, including:

  • normal operation mode,

  • degraded trust mode,

  • adversarial influence mode.

Behavioral shifts are critical for detecting emergent vulnerabilities that cannot be identified through static analysis.

Trust Weights

Trust weights are defined as continuous values in the range:

$$T_{ij} \in [0, 1]$$

where:

  • 0 represents complete distrust,

  • 1 represents absolute trust.

These weights are not fixed; they evolve through interaction history, validation outcomes, and observed behavioral consistency.

4.3 Trust Propagation Mechanism

Trust propagation in ETVG is modeled as a dynamic diffusion process across the graph structure. The key principle is that trust is both transitive and amplifiable under repeated successful interactions.

Trust Update Function

Trust between agents evolves as:

$$T_{ij}^{(t+1)} = \alpha T_{ij}^{(t)} + (1-\alpha) \cdot f(\Delta_{ij})$$

where:

  • $f(\Delta_{ij})$ represents interaction outcome feedback,

  • $\alpha$ is a memory retention factor,

  • $\Delta_{ij}$ captures behavioral consistency between agents.
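The update rule is an exponential moving average of feedback, which a few lines of Python make concrete. This sketch assumes that the feedback value $f(\Delta_{ij})$ has already been computed and lies in $[0, 1]$; the function name `update_trust` and the default $\alpha = 0.8$ are illustrative choices, not values taken from the paper.

```python
def update_trust(t_ij, feedback, alpha=0.8):
    """One step of T_ij(t+1) = alpha * T_ij(t) + (1 - alpha) * f(Delta_ij)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    new_value = alpha * t_ij + (1.0 - alpha) * feedback
    # Clamp so that out-of-range feedback cannot push trust outside [0, 1].
    return min(1.0, max(0.0, new_value))


after_success = update_trust(0.6, feedback=1.0)   # 0.8 * 0.6 + 0.2 * 1.0
after_failure = update_trust(0.6, feedback=0.0)   # 0.8 * 0.6 + 0.2 * 0.0
```

With $\alpha = 0.8$, a single fully positive interaction moves trust from 0.6 to about 0.68 and a fully negative one moves it to about 0.48. A larger $\alpha$ gives history more weight, so trust changes more slowly in both directions.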

Trust Spread Dynamics

Trust spreads indirectly through multi-hop relationships:

  • If agent $i$ trusts $j$,

  • and $j$ trusts $k$,

  • then $i$ may indirectly assign partial trust to $k$.

This transitive property creates hidden vulnerability channels, where compromised agents can indirectly influence distant nodes without direct interaction.
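The text does not state how partial trust is combined across hops. One common choice, assumed here purely for illustration, is to multiply the weights along a path and keep the best path. The sketch below computes that max-product trust with a Dijkstra-style search, which is valid because weights in $[0, 1]$ can only shrink a product; the function name `indirect_trust` and the agent labels are hypothetical.

```python
import heapq


def indirect_trust(trust, source):
    """Best-path (max-product) trust from `source` to every reachable agent.

    `trust[i][j]` is the direct trust agent i assigns to agent j.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]          # max-heap via negated scores
    while heap:
        negated, node = heapq.heappop(heap)
        score = -negated
        if score < best.get(node, 0.0):
            continue                 # stale heap entry
        for nxt, weight in trust.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(nxt, 0.0):
                best[nxt] = candidate
                heapq.heappush(heap, (-candidate, nxt))
    del best[source]
    return best


trust = {"i": {"j": 0.9}, "j": {"k": 0.8}, "k": {}}
derived = indirect_trust(trust, "i")
# i has no direct edge to k, yet derives trust 0.9 * 0.8 in it through j.
```

Under this assumption agent $i$ ends up assigning roughly 0.72 to $k$ without ever having interacted with it, which is exactly the kind of hidden channel the paragraph above describes.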

Table 2 Trust Weight Interpretation

| Trust Weight Range | Interpretation |
|---|---|
| 0.0 – 0.2 | Highly suspicious |
| 0.3 – 0.5 | Moderate trust |
| 0.6 – 0.8 | Trusted |
| 0.9 – 1.0 | Highly trusted |
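For use in a simulation, the bands in Table 2 can be turned into a lookup. The table leaves gaps between bands (for example between 0.2 and 0.3), so this sketch assumes that weights are read to one decimal place before being classified. That rounding step and the function name `trust_band` are assumptions made here, not part of the source.

```python
def trust_band(weight):
    """Map a trust weight in [0, 1] to its Table 2 label."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("trust weights must lie in [0, 1]")
    tenths = round(weight * 10)   # assumption: classify at one-decimal resolution
    if tenths <= 2:
        return "Highly suspicious"
    if tenths <= 5:
        return "Moderate trust"
    if tenths <= 8:
        return "Trusted"
    return "Highly trusted"


labels = [trust_band(w) for w in (0.1, 0.4, 0.7, 1.0)]
```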

Compromise Propagation

When an agent becomes compromised, it injects corrupted trust signals into its outgoing edges. These signals gradually distort the trust distribution of the entire network, enabling silent propagation of adversarial influence.
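A small example shows how a corrupted outgoing edge distorts what other agents conclude. The sketch assumes that indirect trust through one intermediary is the product of the two edge weights, which is an illustrative rule rather than one stated in the text; the function names and the agents `a`, `b`, `c`, and `m` are hypothetical.

```python
def two_hop_trust(trust, i, k):
    """Best trust i can derive in k through a single intermediary (product rule)."""
    return max(
        (w_ij * trust.get(j, {}).get(k, 0.0) for j, w_ij in trust.get(i, {}).items()),
        default=0.0,
    )


def inject_corrupted_signal(trust, compromised, favoured, inflated=1.0):
    """Return a copy of the graph in which `compromised` overstates its trust in `favoured`."""
    corrupted = {agent: dict(neighbours) for agent, neighbours in trust.items()}
    corrupted.setdefault(compromised, {})[favoured] = inflated
    return corrupted


trust = {
    "a": {"b": 0.9},             # a relies heavily on b's judgement
    "b": {"m": 0.1, "c": 0.8},   # b honestly rates m as unreliable
    "c": {},
    "m": {},
}
before = two_hop_trust(trust, "a", "m")
after = two_hop_trust(inject_corrupted_signal(trust, "b", "m"), "a", "m")
```

Once `b` is compromised and reports full trust in `m`, the trust that `a` derives in `m` rises from about 0.09 to 0.9, although `a` itself was never attacked and its own edge weights are unchanged.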

4.4 Vulnerability Emergence Process

A key contribution of the ETVG framework is the formalization of vulnerability emergence as a system-level phenomenon.

Local-to-Global Transition

A vulnerability begins at the local level when a single agent is compromised or misaligned. However, due to trust propagation and communication dependencies, this local failure cascades into broader system instability.

The process occurs in three phases:

  1. Initiation Phase – a single node is compromised

  2. Propagation Phase – trust distortion spreads across edges

  3. Emergence Phase – global instability appears in unrelated system regions

Emergent Instability

A critical insight is that not all vulnerable nodes are directly compromised. Some nodes become vulnerable due to:

  • excessive trust dependency,

  • indirect influence from compromised neighbors,

  • accumulation of minor trust distortions.

This leads to emergent system-wide risk patterns that are not predictable from individual node analysis alone.

4.5 Threat Modeling

The ETVG framework supports multiple adversarial threat models that exploit trust structures rather than direct system exploitation.

1. Trust Poisoning

Trust poisoning occurs when a malicious agent deliberately manipulates trust signals by:

  • generating false positive interactions,

  • mimicking high reliability behavior,

  • artificially inflating trust scores.

This leads to long-term structural distortion in the trust graph.
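The effect of sustained fake positive interactions can be estimated directly from the trust update function of Section 4.3. If every round returns maximal feedback, trust after $n$ rounds is $1 - (1 - T_0)\alpha^n$. The sketch below simulates this, with $T_0 = 0.2$, $\alpha = 0.8$, and the 0.9 cut-off taken from Table 2 as illustrative values; the function name is hypothetical.

```python
def poisoned_trajectory(t0, alpha, rounds, fake_feedback=1.0):
    """Trust values over time when every interaction reports `fake_feedback`."""
    values = [t0]
    for _ in range(rounds):
        values.append(alpha * values[-1] + (1.0 - alpha) * fake_feedback)
    return values


trajectory = poisoned_trajectory(t0=0.2, alpha=0.8, rounds=15)
# First round at which the attacker crosses the "highly trusted" boundary.
rounds_to_highly_trusted = next(n for n, t in enumerate(trajectory) if t >= 0.9)
```

With these values an agent that starts out highly suspicious reaches the highly trusted band after ten faked interactions. A larger memory retention factor slows the attack but also slows the recovery of trust once the manipulation is discovered.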

2. Coordinated Deception

In coordinated deception, multiple adversarial agents collaborate to:

  • reinforce each other’s credibility,

  • create false consensus within the network,

  • manipulate trust propagation pathways.

This type of attack is particularly dangerous because it exploits collective reinforcement mechanisms in the graph.

3. Cascading Compromise Attacks

Cascading compromise refers to sequential system failures where:

  • one compromised agent triggers trust degradation in connected agents,

  • which in turn propagate further compromise,

  • resulting in exponential spread across the network.

This mechanism mirrors epidemic diffusion but is driven by trust rather than physical contact.
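The wave-like spread can be sketched with a simple threshold cascade. The rule used here is an assumption chosen for clarity, not the paper's model: an agent falls in the next round if it places trust of at least `threshold` in any agent that is already compromised. All names and weights are hypothetical.

```python
def cascade(trust, seeds, threshold=0.6):
    """Return the list of compromise waves, starting with the seed agents.

    `trust[i][j]` is the trust agent i places in agent j.
    """
    compromised = set(seeds)
    waves = [sorted(compromised)]
    while True:
        newly = sorted(
            agent
            for agent, neighbours in trust.items()
            if agent not in compromised
            and any(w >= threshold for j, w in neighbours.items() if j in compromised)
        )
        if not newly:
            return waves
        compromised.update(newly)
        waves.append(newly)


trust = {
    "a0": {},
    "b1": {"a0": 0.9}, "b2": {"a0": 0.7},
    "c1": {"b1": 0.8}, "c2": {"b1": 0.65},
    "c3": {"b2": 0.8}, "c4": {"b2": 0.3},   # c4 trusts b2 too little to be pulled in
}
waves = cascade(trust, {"a0"})
```

Each wave is larger than the previous one until low-trust edges stop the spread: `c4` survives only because its trust in `b2` is below the threshold. This illustrates why high-trust edges are described as high-risk propagation pathways.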

Summary of the ETVG Contribution

The ETVG framework formalizes multi-agent security as a dynamic graph problem where trust is the central medium of both cooperation and vulnerability. By integrating trust propagation, communication dependencies, and emergent risk modeling, the framework enables a unified understanding of how local adversarial events escalate into system-wide security failures in autonomous AI ecosystems.

5. Experimental Methodology

5.1 Simulation Environment

Since the environment needs to be flexible and support graph-based security simulations, and also have a wide range of machine learning libraries, Python was used to implement the experimental environment. Python was used as a scalable structure to build autonomous multi-agent interactions, mechanisms for propagation of trust, and models for vulnerability analysis. Several Python libraries were incorporated into the design of the simulation, including libraries for graph computing, agent communication, and experimental evaluation.

NetworkX served as the main graph modelling library and was used to construct and manipulate dynamic trust graphs among autonomous agents. Nodes represented individual AI agents, and weighted edges represented trust relationships and agent-to-agent communication. The graph model supported simulation of trust evolution, influence propagation, and compromise spreading across interconnected agent networks.
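A minimal sketch of such a trust graph in NetworkX follows. The directed graph, the `trust` and `state` attribute names, the agent identifiers, and the multiplicative path-trust rule are assumptions chosen for illustration; the source only states that agents are nodes and trust is an edge weight.

```python
import math

import networkx as nx

# Agents are nodes; a directed edge u -> v carries how much u trusts v.
G = nx.DiGraph()
G.add_nodes_from(["a1", "a2", "a3", "a4"], state="normal")
G.add_weighted_edges_from(
    [("a1", "a2", 0.9), ("a2", "a3", 0.8), ("a1", "a3", 0.4), ("a3", "a4", 0.7)],
    weight="trust",
)


def path_trust(graph, path):
    """Trust along a path, assumed here to be the product of its edge weights."""
    return math.prod(graph[u][v]["trust"] for u, v in zip(path, path[1:]))


def strongest_path(graph, source, target):
    """Return the simple path with the highest product of trust weights.

    Raises ValueError if no path exists between source and target.
    """
    return max(nx.all_simple_paths(graph, source, target),
               key=lambda p: path_trust(graph, p))


G.nodes["a1"]["state"] = "malicious"
best = strongest_path(G, "a1", "a4")
print(best, round(path_trust(G, best), 3))
```

Enumerating all simple paths is only practical on small graphs; it is used here to keep the sketch short.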

Additional graph simulation tools were used to test dynamic topology changes and shifting interaction patterns. The simulation environment supported decentralized communication, adaptive trust modification, and behavioural state transitions during adversarial events. Multi-agent interactions ran in a distributed simulation in which each agent autonomously interacted with other agents, validated their behaviour, and updated trust scores according to what it observed.

The environment was also extended with probabilistic event generation to introduce realistic operational uncertainty. Randomized communication delays, trust fluctuations, and behavioral inconsistencies were added to approximate real-world autonomous AI ecosystems. This allowed the framework to assess vulnerability emergence under uncertain conditions rather than under static, idealized ones.

The experiments were conducted on a workstation with an Intel Core i7 processor, 32 GB of RAM, and graph simulation packages implemented in Python. The simulation environment was set up for repeated experiments to achieve statistical consistency and reproducibility of the results.

5.2 Experimental Setup

The experimental environment was a distributed multi-agent ecosystem of 250 autonomous AI agents with dynamic trust relationships between them. Every agent had its own communication behavior, local decision-making behavior, and adaptive trust evaluation mechanism. To simulate varying vulnerability propagation conditions, agents were classified as normal, partially compromised, or malicious.

A hybrid graph model combining scale-free and small-world network properties was used to model the communication topology of the system. This topology was chosen because real-world autonomous AI ecosystems often have clustered communication structures with high connectivity between critical agents. Highly connected central hub agents had a strong influence on the spread of trust and compromise in the network.
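The source does not say how the hybrid topology was generated. One plausible sketch, offered as an assumption, uses NetworkX's `powerlaw_cluster_graph` (the Holme-Kim model), which yields a power-law degree distribution together with elevated clustering. The parameters `m=3`, `p=0.5`, and the seed are illustrative choices, not values from the experiments.

```python
import networkx as nx

N_AGENTS = 250  # matches the number of agents reported in the experimental setup

# Assumed generator: scale-free growth with extra triangle closure for clustering.
G = nx.powerlaw_cluster_graph(n=N_AGENTS, m=3, p=0.5, seed=42)

degrees = dict(G.degree())
mean_degree = sum(degrees.values()) / N_AGENTS
clustering = nx.average_clustering(G)

# Treat the ten highest-degree agents as candidate hubs.
hubs = sorted(degrees, key=degrees.get, reverse=True)[:10]

print(f"mean degree: {mean_degree:.2f}, average clustering: {clustering:.3f}")
print("hub agents:", hubs)
```

Ranking agents by degree is the simplest way to locate hubs; a centrality measure such as betweenness could be substituted if path-based influence matters more than raw connectivity.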

Trust was initialized by assigning edge weights in the range [0, 1], corresponding to the level of trust confidence between communicating agents. Baseline trust values were first calculated from communication history, communication frequency, and simulated behavioural reliability. Trust weights were dynamic and changed in response to agent interactions, behavioural anomalies, and adversarial influence.
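As a hedged illustration of the initialization step, the three inputs can be combined as a weighted average and clipped to the trust range. The weights `(0.4, 0.2, 0.4)`, the assumption that each input is already normalized to [0, 1], and the function name `initial_trust` are hypothetical; the source does not give the combination rule.

```python
def clamp01(value: float) -> float:
    """Clip a value to the trust range [0, 1]."""
    return min(1.0, max(0.0, value))


def initial_trust(history: float, frequency: float, reliability: float,
                  weights=(0.4, 0.2, 0.4)) -> float:
    """Baseline trust as a weighted average of three normalized inputs.

    history, frequency, and reliability are each assumed to lie in [0, 1].
    """
    w_hist, w_freq, w_rel = weights
    total = w_hist + w_freq + w_rel
    score = (w_hist * history + w_freq * frequency + w_rel * reliability) / total
    return clamp01(score)


baseline = initial_trust(history=0.9, frequency=0.5, reliability=0.8)
print(f"baseline trust: {baseline:.2f}")
```

Dividing by the weight total keeps the score in range even if the weights are later tuned and no longer sum to one.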

Table 3 Experimental Configuration
Parameter             | Value
Number of Agents      | 250
Trust Range           | 0–1
Malicious Agent Ratio | 5–30%
Simulation Runs       | 50
Graph Model           | Hybrid scale-free + small-world

The experiments varied the malicious agent ratio from 5% to 30% to test the robustness of the proposed framework. Repeated simulation runs explored the impact of attack intensity, communication density, and trust propagation speed on the overall stability of the multi-agent environment.
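The parameter sweep can be outlined as a nested loop over malicious ratios and repeated runs. The 5-point step between ratios, the per-run seeding scheme, and the two-role assignment (the partially compromised class is omitted for brevity) are assumptions; only the 250 agents, the 5–30% range, and the 50 runs come from the configuration above.

```python
import random

N_AGENTS = 250
N_RUNS = 50
MALICIOUS_PERCENTS = range(5, 31, 5)  # 5, 10, ..., 30 (step size is assumed)


def assign_roles(n_agents, malicious_percent, rng):
    """Randomly mark a fixed percentage of agents as malicious."""
    n_malicious = n_agents * malicious_percent // 100
    malicious = set(rng.sample(range(n_agents), n_malicious))
    return ["malicious" if i in malicious else "normal" for i in range(n_agents)]


def run_sweep():
    """Return the malicious-agent count for every (percent, run) configuration."""
    counts = {}
    for percent in MALICIOUS_PERCENTS:
        for run in range(N_RUNS):
            rng = random.Random(percent * 1000 + run)  # reproducible per-run seed
            roles = assign_roles(N_AGENTS, percent, rng)
            # A full experiment would build the trust graph and simulate attacks here.
            counts[(percent, run)] = roles.count("malicious")
    return counts


counts = run_sweep()
print(len(counts), "configurations;", counts[(5, 0)], "malicious agents at 5%")
```

Seeding each run separately makes any single configuration reproducible without rerunning the whole sweep.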

5.3 Attack Scenarios

Four major adversarial scenarios were designed to test how well the proposed ETVG framework detects and models emergent vulnerabilities.

The first scenario was a trust poisoning attack. Malicious agents deliberately contaminated trust values by feeding false behavioural indicators and false recommendations to neighbouring agents. The goal was to build up trust relationships gradually while remaining operationally indistinguishable from a legitimate agent.

The second scenario explored cascading compromise. A small number of compromised agents were placed in the network, giving an adversary access through trusted communication channels. This scenario examined how local compromises among interdependent agents give rise to systemic vulnerabilities.

The third scenario was a cooperative deception attack, in which multiple malicious agents synchronized their behaviour and acted in unison to manipulate the trust propagation mechanisms. Distributed cooperation amplified the effect of the adversarial activity and made detection more difficult.

The fourth scenario examined the dynamics of trust drift. Here, repeated adversarial interactions and behavioural inconsistencies caused trust relationships to shift gradually away from their appropriate values over time. Systematic variations in trust stability were studied to understand how they contribute to hidden and emergent vulnerabilities in the network.

5.4 Evaluation Metrics

Several evaluation metrics were used to assess the performance of the proposed framework. Detection accuracy measured how well the ETVG model identified compromised agents and malicious trust propagation behaviours in the network. It is the percentage of adversarial activities successfully detected relative to the total number of attack events.
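Written out, the verbal definition of detection accuracy corresponds to the following ratio. The symbols N_detected (adversarial activities successfully detected) and N_attack (total attack events) are introduced here for illustration and do not appear in the source.

```latex
\text{Detection Accuracy} = \frac{N_{\text{detected}}}{N_{\text{attack}}} \times 100\%
```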

The propagation rate measured how quickly compromise spread through the trust graph. Lower propagation rates indicated better containment of adversarial influence and stronger resilience to cascading failures.

Trust stability was measured as the consistency of trust relationships in both normal and adversarial operations. This metric assessed changes in trust scores over time and the framework's resilience to malicious activities that can disrupt trust structures.

Attack resilience measured how well the multi-agent ecosystem could continue to operate under attack. It assessed how well the system could recover, operate, and communicate after compromise events. Taken together, these metrics provided a holistic evaluation of the proposed trust-graph-based security framework for modelling and mitigating emergent vulnerabilities in autonomous AI systems.
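This section does not give formulas for propagation rate or trust stability, so the sketch below shows one possible reading under stated assumptions: propagation rate as the average number of newly compromised agents per time step, and trust stability as one minus the mean absolute step-to-step change in a trust score. Both definitions, the function names, and the toy inputs are hypothetical, and they may differ from the normalization used in the reported results.

```python
from statistics import mean


def propagation_rate(compromised_per_step):
    """Average number of newly compromised agents per time step.

    compromised_per_step is the cumulative count of compromised agents over time.
    """
    deltas = [b - a for a, b in zip(compromised_per_step, compromised_per_step[1:])]
    return mean(deltas) if deltas else 0.0


def trust_stability(trust_series):
    """One minus the mean absolute step-to-step change of a trust score in [0, 1]."""
    changes = [abs(b - a) for a, b in zip(trust_series, trust_series[1:])]
    return 1.0 - mean(changes) if changes else 1.0


# Toy series, not experimental data.
rate = propagation_rate([2, 4, 6, 8])
steady = trust_stability([0.8, 0.8, 0.8])
collapsing = trust_stability([0.9, 0.5, 0.1])
print(rate, steady, round(collapsing, 2))
```

Under these assumed definitions a lower rate and a stability value closer to 1 both indicate a network that is containing adversarial influence.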

6. Results and Comparative Analysis

6.1 Vulnerability Propagation Results

The experiments investigated how compromise propagates from one agent to connected agents under different trust scenarios. The results showed that trust-based interactions significantly influenced both the extent and the rate of compromise propagation in the multi-agent environment. In the rule-based baseline, compromise propagated very quickly because the system relies mostly on predefined attack signatures and has little ability to model dynamic trust relations among agents. When malicious behaviour originated from highly connected nodes, it spread undetected before mitigation could take place.

The static graph analysis model performed moderately better because it identified structural dependencies between agents. However, it did not account for the continuous variation in trust levels and interaction behaviour, so compromise cascades still occurred within highly connected trust clusters.

The proposed ETVG framework, in contrast, modelled trust propagation dynamics and identified high-risk trust pathways before large-scale compromise occurred. It continually tracked changes in trust weights and communication dependencies, which allowed it to flag abnormal trust amplification patterns that signal a coordinated attack. Experimental results showed that ETVG reduced compromise spread by about 43% compared with the rule-based system and 28% compared with static graph analysis.

Table 4 Compromise propagation performance of the evaluated methods
Method                | Initial Compromised Agents | Final Compromised Agents | Propagation Rate | Compromise Spread
Rule-Based System     | 5                          | 41                       | 0.82             | High
Static Graph Analysis | 5                          | 27                       | 0.54             | Medium
Proposed ETVG         | 5                          | 14                       | 0.28             | Low

The results indicate that ETVG effectively constrained cascading compromise behavior through dynamic trust-aware vulnerability monitoring. Furthermore, the framework demonstrated improved resilience in environments containing cooperative malicious agents attempting to manipulate trust relationships collectively.

Figure 3
Figure 3 illustrates the compromise propagation trends observed during the simulation experiments. The graph demonstrates that the rule-based system experienced exponential compromise growth after the initial attack phase, while the ETVG framework maintained a relatively stable compromise trajectory throughout the simulation period.

6.2 Trust Stability Analysis

Trust stability analysis was conducted to assess how reliably each security approach maintained trust relationships under adversarial conditions. In autonomous multi-agent systems, degraded trust can severely affect coordination efficiency, communication reliability, and decision correctness. Stable trust relationships are therefore essential for secure collaborative agent operation.

The experiments showed that trust values in the rule-based system decreased significantly under coordinated attack scenarios. Because there was no adaptive trust monitoring, malicious agents could behave benignly long enough to manipulate trust relationships before launching a coordinated series of compromise actions. As a result, legitimate agents continued to communicate with compromised nodes, which accelerated vulnerability propagation.

The static graph analysis model captured structural relationships between agents, which improved trust stability. However, because its trust weights remained largely fixed at runtime, it could not adapt to rapid changes in adversarial behaviour or expose deceptive communication strategies.

The proposed ETVG framework achieved the best trust stability of all the methods tested. It adjusted trust weights dynamically according to behavioural consistency, communication integrity, and the reliability of previous interactions. As a result, suspicious agents progressively lost trust whenever they showed unusual interaction patterns. This adaptive trust regulation greatly reduced the ability of malicious agents to exploit trust dependencies in the network.
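A rough sketch of such adaptive regulation, under stated assumptions, averages the three signals into a single evidence value and lets trust fall faster than it rises. The asymmetric `gain` and `loss` rates, the equal weighting of the signals, and the function name `regulate_trust` are hypothetical and are not taken from the ETVG specification.

```python
def regulate_trust(trust: float, consistency: float, integrity: float,
                   reliability: float, gain: float = 0.05, loss: float = 0.3) -> float:
    """Adjust trust toward current evidence, losing trust faster than gaining it.

    consistency, integrity, and reliability are assumed to lie in [0, 1].
    """
    evidence = (consistency + integrity + reliability) / 3.0
    rate = gain if evidence >= trust else loss
    return min(1.0, max(0.0, trust + rate * (evidence - trust)))


# An initially well-trusted agent starts behaving anomalously.
trust = 0.9
history = []
for _ in range(5):
    trust = regulate_trust(trust, consistency=0.2, integrity=0.3, reliability=0.1)
    history.append(round(trust, 3))
print(history)
```

The asymmetry is a design choice: it makes slow trust inflation expensive for an attacker while allowing a quick response to anomalous behaviour.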

Table 5 summarizes the trust stability performance results.
Method                | Average Trust Stability Score | Trust Degradation Rate | Detection of Trust Drift
Rule-Based System     | 0.48                          | High                   | Low
Static Graph Analysis | 0.69                          | Medium                 | Medium
Proposed ETVG         | 0.91                          | Low                    | High

The results demonstrate that ETVG effectively preserves trust integrity even under highly dynamic attack conditions. Figure 7.2 shows the trust degradation trends observed during the experiments. The graph indicates that trust stability in the ETVG framework remained relatively consistent throughout the simulation, while competing approaches experienced rapid trust collapse following coordinated attacks.

6.3 Comparative Performance Evaluation

A comparative evaluation was conducted to assess the overall security effectiveness of the proposed framework against existing approaches. The evaluation considered detection accuracy, compromise spread reduction, and trust stability preservation.

Table 6
Method                | Detection Accuracy | Compromise Spread | Trust Stability
Rule-Based System     | 78.2%              | High              | Low
Static Graph Analysis | 84.7%              | Medium            | Medium
Proposed ETVG         | 96.3%              | Low               | High

The results clearly indicate that the proposed ETVG framework achieved superior performance across all evaluation metrics. The high detection accuracy demonstrates the effectiveness of trust-aware vulnerability analysis in identifying emergent threats that conventional security systems fail to detect. Additionally, the low compromise spread observed in the ETVG framework confirms its capability to mitigate cascading failures before they propagate extensively throughout the agent network.

Figure 4 Comparative security performance

6.4 Discussion of Findings

The experiments show that emergent vulnerabilities in multi-agent AI systems depend largely on trust propagation dynamics and inter-agent dependencies. Conventional security strategies focus on individual threats and pre-specified attack signatures and are therefore insufficient for highly autonomous, adaptive AI ecosystems. The ETVG framework instead models vulnerabilities collectively at the graph level, which allows it to detect coordinated and cascading attacks more accurately.

Dynamic, real-time monitoring of trust evolution is one of the most significant differences between ETVG and the other methods. Rather than relying only on static rules or structural analysis, the framework continually evaluates communication behaviour, changes in trust, and dependency interactions among agents. This enables it to detect abnormal trust amplification or cooperative malicious behaviour before compromise spreads and causes large-scale impact.

A key finding is that trust is a primary attack surface of autonomous AI systems. Malicious agents can gradually influence the behaviour of the network while evading detection by exploiting legitimate trust relationships. Through trust-aware security modelling, the ETVG framework gives a more realistic picture of current multi-agent threat scenarios.

The results have significant implications for future autonomous AI systems that require a high level of security. As multi-agent systems become part of critical infrastructure such as healthcare, finance, transport, and industrial automation, understanding emergent vulnerabilities will be essential to keeping them reliable and resilient. The proposed framework offers a foundation for trust-centric security analysis and suggests new research directions in autonomous AI defense, adaptive trust governance, and graph-based threat intelligence.

Conclusion

As autonomous multi-agent AI systems take on increasingly important roles, traditional isolated-agent security models are no longer sufficient for the complex security challenges these systems face. Existing solutions mostly address vulnerabilities of individual agents, authentication schemes, or communication-level protection, but not the vulnerabilities that arise from dynamic trust relationships and inter-agent dependencies. These vulnerabilities manifest collectively: compromise propagation, trust manipulation, and malicious collusion can spread across interconnected, collaborative agent ecosystems. This study addressed this gap by proposing a new framework for modelling emergent vulnerabilities in multi-agent AI systems using trust graph analysis.

To address this challenge, the paper proposed a graph-based security modelling framework, the Emergent Trust Vulnerability Graph (ETVG), which models trust propagation, dependency interactions, and cascading compromise behaviour among autonomous AI agents. The model was embedded in the broader Agentic Trust Dynamics Theory (ATDT), which treats trust as a dynamic, security-sensitive variable that influences system-wide vulnerability. The study highlighted how a localized loss of trust can become a systemic security problem in distributed AI.

The study also gave a mathematical formalization of several security metrics related to vulnerability propagation and trust instability: the Trust Vulnerability Score (TVS), the Cascading Compromise Index (CCI), the Trust Drift Coefficient (TDC), and the Emergent Risk Density (ERD). These metrics provide a concrete, quantifiable way to reason about risk propagation, trust degradation, and trust compromise in a network of autonomous agents. Graph theory concepts and trust-based security analysis provided a strong foundation for understanding this new class of AI vulnerabilities.

Experiments validated the framework in a simulated multi-agent scenario involving trust poisoning, cascading compromise, cooperative deception, and dynamic trust drift. The results showed that ETVG outperformed traditional rule-based and static graph security methods in vulnerability detection, trust-risk prediction accuracy, and resilience against compromise propagation. They also indicated that trust-graph modelling can help detect hidden attack surfaces and predict emergent threats in collaborative AI ecosystems.

Author details
Harsh Verma
Palo Alto Networks, Artificial Intelligence, United States