SMILE-IT: Stable Multi-agent learning for networks
Agent-driven networks, consisting of many collaborating and competing (human and machine) agents, can be found in a wide range of problem areas, such as telecommunications, smart grids, smart cities, traffic guidance, and flight control. As the complexity and size of these networks increase, automated techniques for configuring, guiding, and managing them become increasingly important, in order to limit operational costs and guarantee optimality. The SMILE-IT project aims to develop such an automated network management framework, based on multi-agent reinforcement learning (MARL) techniques.
The central research question is: “How can complex networks become self-organising while ensuring stability and without sacrificing performance? Moreover, the decisions taken by the system should be understandable and guidable.”
More precisely, the project aims to develop a framework for studying and managing modern distributed networked systems that contain a large number of entities or agents, both machine and human, each striving to achieve its personal objectives. The framework developed within the proposal will guide these entities, either through direct control or by way of incentives, in order to achieve system-wide optimal behaviour, satisfy global objectives, and adhere to the system’s operational constraints in the face of diverging and incompatible personal goals. Software language abstractions will be identified and developed to ease the deployment of the framework on a wide variety of networks.
The framework will build on the expertise of the teams in machine learning (including game theory, self-organisation of complex systems, large-scale multi-agent systems, and emergent social behaviour), network management and modelling, and software language design. The key idea of the framework is that the context within which intelligent decision-making components or agents operate may depend on spatial and temporal factors. As such, they should be able to adapt their behaviour and goals as a function of space and time.
The framework should satisfy the following requirements: it should be generic, so as to be applicable to a wide range of networks; it should be scalable with respect to the size of the network; and the resulting behaviour should be (near) optimal, with a minimum level of performance guaranteed at all times, including in unexpected situations. Several fundamental scientific challenges remain to be solved before this high-level objective can be achieved.