Shaan Desai, Marios Mattheakis, Hayden Joy, Pavlos Protopapas, and Stephen Roberts. Submitted. “One-Shot Transfer Learning of Physics-Informed Neural Networks.”
Solving differential equations efficiently and accurately sits at the heart of progress in many areas of scientific research, from classical dynamical systems to quantum mechanics. There is a surge of interest in using Physics-Informed Neural Networks (PINNs) to tackle such problems as they provide numerous benefits over traditional numerical approaches. Despite their potential benefits for solving differential equations, transfer learning of PINNs has been underexplored. In this study, we present a general framework for transfer learning PINNs that results in one-shot inference for linear systems of both ordinary and partial differential equations. This means that highly accurate solutions to many unknown differential equations can be obtained instantaneously without retraining an entire network. We demonstrate the efficacy of the proposed deep learning approach by solving several real-world problems, such as first- and second-order linear ordinary differential equations, the Poisson equation, and the time-dependent Schrödinger complex-valued partial differential equation.
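The core mechanism, freezing the trained nonlinear layers and solving for the linear output weights in closed form, can be sketched with a fixed random-feature basis standing in for a trained PINN body. The test equation u' = -λu, the basis size, and the initial-condition weighting below are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 200)[:, None]      # collocation points

# frozen "hidden layer": random tanh features (stand-in for a trained PINN body)
W, b = rng.normal(size=(1, 40)), rng.normal(size=40)
H = np.tanh(t @ W + b)                       # basis functions phi_j(t)
dH = (1.0 - H**2) * W                        # analytic d(phi_j)/dt

lam = 1.5                                    # solve u' = -lam*u with u(0) = 1
A = dH + lam * H                             # ODE residual applied to each basis fn
H0 = np.tanh(np.zeros((1, 1)) @ W + b)       # basis evaluated at t = 0
A_full = np.vstack([A, 100.0 * H0])          # weighted initial-condition row
rhs = np.concatenate([np.zeros(len(t)), [100.0]])

c, *_ = np.linalg.lstsq(A_full, rhs, rcond=None)  # closed-form output weights
u = H @ c                                    # "one-shot" solution on the grid
err = np.max(np.abs(u - np.exp(-lam * t[:, 0])))
```

Changing lam and re-solving the least-squares system, with the basis left untouched, is the one-shot transfer step: no gradient-based retraining is needed.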
Anwesh Bhattacharya, Marios Mattheakis, and Pavlos Protopapas. Submitted. “Encoding Involutory Invariance in Neural Networks.”

In certain situations, Neural Networks (NNs) are trained upon data that obey underlying physical symmetries. However, it is not guaranteed that NNs will obey the underlying symmetry unless it is embedded in the network structure. In this work, we explore a special kind of symmetry where functions are invariant with respect to involutory linear/affine transformations up to parity p = ±1. We develop mathematical theorems and propose NN architectures that ensure invariance and universal approximation properties. Numerical experiments indicate that the proposed models outperform baseline networks while respecting the imposed symmetry. An adaptation of our technique to convolutional NN classification tasks for datasets with inherent horizontal/vertical reflection symmetry is also proposed.
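A minimal sketch of the invariance construction: averaging a base network over the involution S with parity p yields a model satisfying f(Sx) = p f(x) exactly, by construction. The tiny fixed MLP and the coordinate-swap involution below are illustrative stand-ins, not the paper's architectures.

```python
import numpy as np

def mlp(x, params):
    # tiny fixed MLP standing in for the base network N(x)
    W1, b1, W2 = params
    return np.tanh(x @ W1 + b1) @ W2

rng = np.random.default_rng(1)
params = (rng.normal(size=(2, 16)), rng.normal(size=16), rng.normal(size=16))

S = np.array([[0.0, 1.0], [1.0, 0.0]])   # involution: S @ S = identity (swap)
p = -1.0                                 # target parity

def sym_net(x):
    # symmetrized model: satisfies f(S x) = p * f(x) identically,
    # since p**2 = 1 for an involution
    return 0.5 * (mlp(x, params) + p * mlp(x @ S.T, params))

x = rng.normal(size=(5, 2))
gap = np.max(np.abs(sym_net(x @ S.T) - p * sym_net(x)))
```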

Marios Mattheakis, Hayden Joy, and Pavlos Protopapas. Submitted. “Unsupervised Reservoir Computing for Solving Ordinary Differential Equations.”

There is a wave of interest in using unsupervised neural networks for solving differential equations. The existing methods are based on feed-forward networks, while recurrent neural network differential equation solvers have not yet been reported. We introduce unsupervised reservoir computing (RC), an echo-state recurrent neural network capable of discovering approximate solutions that satisfy ordinary differential equations (ODEs). We suggest an approach to calculate time derivatives of recurrent neural network outputs without using backpropagation. The internal weights of an RC are fixed, while only a linear output layer is trained, yielding efficient training. However, RC performance strongly depends on finding the optimal hyper-parameters, which is a computationally expensive process. We use Bayesian optimization to efficiently discover optimal sets in a high-dimensional hyper-parameter space and numerically show that one set is robust and can be used to solve an ODE for different initial conditions and time ranges. A closed-form formula for the optimal output weights is derived to solve first-order linear equations in a backpropagation-free learning process. We extend the RC approach by solving a nonlinear system of ODEs using a hybrid optimization method consisting of gradient descent and Bayesian optimization. Evaluation of linear and nonlinear systems of equations demonstrates the efficiency of the RC ODE solver.
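The echo-state mechanics (fixed random recurrent weights rescaled to spectral radius below one, and a trained linear readout) can be sketched as follows. Here the readout is fit by ordinary ridge regression on a known signal, a supervised stand-in for the paper's unsupervised, residual-based training; all sizes and scales are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100                                           # reservoir size
W = rng.normal(size=(N, N)) / np.sqrt(N)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # echo-state: spectral radius < 1
w_in = rng.normal(size=N)                         # fixed input weights

t = np.linspace(0.0, 4.0 * np.pi, 400)
u = np.sin(t)                                     # driving signal
h = np.zeros(N)
H = np.zeros((len(t), N))
for k in range(len(t)):
    h = np.tanh(W @ h + w_in * u[k])              # fixed recurrent dynamics
    H[k] = h

# only the linear readout is trained (ridge regression here; the paper instead
# fits it so the ODE residual vanishes, without ground-truth data)
keep = slice(50, None)                            # discard initial transient
G = H[keep]
w_out = np.linalg.solve(G.T @ G + 1e-6 * np.eye(N), G.T @ u[keep])
fit_err = np.max(np.abs(G @ w_out - u[keep]))
```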

Mattia Angeli, Georgios Neofotistos, Marios Mattheakis, and Efthimios Kaxiras. 1/2022. “Modeling the effect of the vaccination campaign on the Covid-19 pandemic.” Chaos, Solitons and Fractals, 154, Pp. 111621.

Population-wide vaccination is critical for containing the SARS-CoV-2 (Covid-19) pandemic when combined with restrictive and prevention measures. In this study we introduce SAIVR, a mathematical model able to forecast the Covid-19 epidemic evolution during the vaccination campaign. SAIVR extends the widely used Susceptible-Infectious-Removed (SIR) model by considering the Asymptomatic (A) and Vaccinated (V) compartments. The model contains several parameters and initial conditions that are estimated by employing a semi-supervised machine learning procedure. After training an unsupervised neural network to solve the SAIVR differential equations, a supervised framework then estimates the optimal conditions and parameters that best fit recent infectious curves of 27 countries. Instructed by these results, we performed an extensive study on the temporal evolution of the pandemic under varying values of roll-out daily rates, vaccine efficacy, and a broad range of societal vaccine hesitancy/denial levels. The concept of herd immunity is questioned by studying future scenarios which involve different vaccination efforts and more infectious Covid-19 variants.
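The compartmental structure that SAIVR extends can be illustrated with its SIR core; the Asymptomatic and Vaccinated compartments and the machine-learned parameter values of the paper are not reproduced here, and the rates below are illustrative.

```python
import numpy as np

# SIR core of the compartmental model (SAIVR adds A and V compartments on top);
# rates are illustrative, not the fitted values from the paper
beta, gamma = 0.3, 0.1            # infection and removal rates (per day)
S, I, R = 0.99, 0.01, 0.0         # population fractions
dt, days = 0.1, 160.0
peak_I = I
for _ in range(int(days / dt)):   # forward-Euler time stepping
    dS = -beta * S * I
    dI = beta * S * I - gamma * I
    dR = gamma * I
    S, I, R = S + dt * dS, I + dt * dI, R + dt * dR
    peak_I = max(peak_I, I)
total = S + I + R                 # the flow terms cancel, conserving the total
```

With these rates the basic reproduction number is beta/gamma = 3, so the sketch produces a full epidemic wave that peaks and then recedes.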

Shaan Desai, Marios Mattheakis, David Sondak, Pavlos Protopapas, and Stephen Roberts. 9/2021. “Port-Hamiltonian Neural Networks for Learning Explicit Time-Dependent Dynamical Systems.” Phys. Rev. E, 104, Pp. 034312.
Accurately learning the temporal behavior of dynamical systems requires models with well-chosen learning biases. Recent innovations embed the Hamiltonian and Lagrangian formalisms into neural networks and demonstrate a significant improvement over other approaches in predicting trajectories of physical systems. These methods generally tackle autonomous systems that depend implicitly on time or systems for which a control signal is known a priori. Despite this success, many real-world dynamical systems are non-autonomous: they are driven by time-dependent forces and experience energy dissipation. In this study, we address the challenge of learning from such non-autonomous systems by embedding the port-Hamiltonian formalism into neural networks, a versatile framework that can capture energy dissipation and time-dependent control forces. We show that the proposed port-Hamiltonian neural network can efficiently learn the dynamics of nonlinear physical systems of practical interest and accurately recover the underlying stationary Hamiltonian, time-dependent force, and dissipative coefficient. A promising outcome of our network is its ability to learn and predict chaotic systems such as the Duffing equation, for which the trajectories are typically hard to learn.
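The port-Hamiltonian form dx/dt = (J - R) grad H(x) + F(t) that the network embeds can be written out explicitly for a damped, sinusoidally driven oscillator; the matrices and coefficients below are illustrative choices, not learned quantities.

```python
import numpy as np

# port-Hamiltonian form dx/dt = (J - R) grad H(x) + F(t) for a damped,
# sinusoidally driven oscillator; all coefficients are illustrative
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # skew-symmetric interconnection
R = np.array([[0.0, 0.0], [0.0, 0.2]])    # dissipation acting on the momentum
grad_H = lambda x: x                      # H(q, p) = (q**2 + p**2) / 2
F = lambda t: np.array([0.0, 0.5 * np.sin(2.0 * t)])  # time-dependent forcing

x = np.array([1.0, 0.0])
dt, steps = 1e-3, 20000
energy0 = 0.5 * x @ x
for k in range(steps):                    # explicit Euler integration to t = 20
    x = x + dt * ((J - R) @ grad_H(x) + F(k * dt))
energy_end = 0.5 * x @ x                  # dissipation has drained the transient
```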
Shaan Desai, Marios Mattheakis, and Stephen Roberts. 9/2021. “Variational Integrator Graph Networks for Learning Energy Conserving Dynamical Systems.” Phys. Rev. E, 104, Pp. 035310.
Recent advances show that neural networks embedded with physics-informed priors significantly outperform vanilla neural networks in learning and predicting the long-term dynamics of complex physical systems from noisy data. Despite this success, there has only been a limited study on how to optimally combine physics priors to improve predictive performance. To tackle this problem we unpack and generalize recent innovations into individual inductive bias segments. As such, we are able to systematically investigate all possible combinations of inductive biases of which existing methods are a natural subset. Using this framework we introduce Variational Integrator Graph Networks, a novel method that unifies the strengths of existing approaches by combining an energy constraint, high-order symplectic variational integrators, and graph neural networks. We demonstrate, across an extensive ablation, that the proposed unifying framework outperforms existing methods, for data-efficient learning and in predictive accuracy, across both single and many-body problems studied in recent literature. We empirically show that the improvements arise because high-order variational integrators combined with a potential energy constraint induce coupled learning of generalized position and momentum updates which can be formalized via the Partitioned Runge-Kutta method.
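The benefit of a symplectic variational integrator over a naive one can be seen on the harmonic oscillator, where explicit Euler's energy grows without bound while leapfrog (Störmer-Verlet, a low-order member of the variational family) keeps the energy error bounded:

```python
import numpy as np

# energy drift on the harmonic oscillator H = (q**2 + p**2)/2: explicit Euler
# vs. the symplectic leapfrog (Stormer-Verlet) scheme
dt, steps = 0.1, 1000
qe, pe = 1.0, 0.0                 # explicit Euler state
qv, pv = 1.0, 0.0                 # leapfrog state
for _ in range(steps):
    qe, pe = qe + dt * pe, pe - dt * qe   # explicit Euler step
    pv = pv - 0.5 * dt * qv               # leapfrog: half kick,
    qv = qv + dt * pv                     # full drift,
    pv = pv - 0.5 * dt * qv               # half kick
drift_euler = abs(0.5 * (qe**2 + pe**2) - 0.5)
drift_verlet = abs(0.5 * (qv**2 + pv**2) - 0.5)
```

After 1000 steps the Euler energy has blown up by orders of magnitude, while the leapfrog energy stays within O(dt**2) of its initial value.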
Tiago A. E. Ferreira, Marios Mattheakis, and Pavlos Protopapas. 2021. “A New Artificial Neuron Proposal with Trainable Simultaneous Local and Global Activation Function.”
The activation function plays a fundamental role in the artificial neural network learning process. However, there is no obvious choice or procedure to determine the best activation function, which depends on the problem. This study proposes a new artificial neuron, named Global-Local Neuron, with a trainable activation function composed of two components, a global and a local. The global component, as the term is used here, is a mathematical function describing a general feature present in all problem domains. The local component is a function that can represent a localized behavior, like a transient or a perturbation. This new neuron can define the importance of each activation function component in the learning phase. Depending on the problem, it results in a purely global, or purely local, or a mixed global and local activation function after the training phase. Here, the trigonometric sine function was employed for the global component and the hyperbolic tangent for the local component. The proposed neuron was tested for problems where the target was a purely global function, or a purely local function, or a composition of global and local functions. Two classes of test problems were investigated, regression problems and differential equation solving. The experimental tests demonstrated the Global-Local Neuron network's superior performance, compared with simple neural networks with sine or hyperbolic tangent activation functions, and with a hybrid network that combines these two simple neural networks.
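A minimal sketch of the described activation, assuming the two components are combined as a weighted sum with trainable coefficients (the precise parametrization in the paper may differ):

```python
import numpy as np

def gln_activation(z, alpha, beta):
    # trainable mix of a global (sine) and a local (tanh) component; in the
    # network, alpha and beta would be learned alongside the usual weights
    return alpha * np.sin(z) + beta * np.tanh(z)

z = np.linspace(-6.0, 6.0, 7)
purely_global = gln_activation(z, 1.0, 0.0)   # reduces to sin
purely_local = gln_activation(z, 0.0, 1.0)    # reduces to tanh
mixed = gln_activation(z, 0.5, 0.5)           # intermediate learned case
```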
Alessandro Paticchio, Tommaso Scarlatti, Marios Mattheakis, Pavlos Protopapas, and Marco Brambilla. 12/2020. “Semi-supervised Neural Networks solve an inverse problem for modeling Covid-19 spread.” In 2020 NeurIPS Workshop on Machine Learning and the Physical Sciences. NeurIPS.
Henry Jin, Marios Mattheakis, and Pavlos Protopapas. 12/2020. “Unsupervised Neural Networks for Quantum Eigenvalue Problems.” In 2020 NeurIPS Workshop on Machine Learning and the Physical Sciences. NeurIPS.
Eigenvalue problems are critical to several fields of science and engineering. We present a novel unsupervised neural network for discovering eigenfunctions and eigenvalues for differential eigenvalue problems with solutions that identically satisfy the boundary conditions. A scanning mechanism is embedded, allowing the method to find an arbitrary number of solutions. The network optimization is data-free and depends solely on the predictions. The unsupervised method is used to solve the quantum infinite well and quantum oscillator eigenvalue problems.
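The "identically satisfy the boundary conditions" property is typically achieved with a multiplicative trial form; a sketch for homogeneous Dirichlet conditions on [0, L], with an arbitrary function standing in for the network output:

```python
import numpy as np

L = 1.0                                  # width of the infinite well
net = lambda x: np.cos(3.0 * x) + 0.7    # stand-in for the network output

def trial(x):
    # the x*(L - x) prefactor forces psi(0) = psi(L) = 0 identically,
    # whatever the network predicts in the interior
    return x * (L - x) * net(x)

x = np.array([0.0, 0.25, 0.5, 1.0])
psi = trial(x)
```

Because the boundary values are zero by construction, the loss can focus entirely on the differential-equation residual.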
Marios Mattheakis. 8/26/2020. “Riding Waves in Neuromorphic Computing.” APS Physics 12 (132), Pp. 1-3.
An artificial neural network incorporating nonlinear waves could help reduce energy consumption within a bioinspired (neuromorphic) computing device.
Yue Luo, Rebecca Engelke, Marios Mattheakis, Michele Tamagnone, Stephen Carr, Kenji Watanabe, Takashi Taniguchi, Efthimios Kaxiras, Philip Kim, and William L. Wilson. 8/2020. “In-situ nanoscale imaging of moiré superlattices in twisted van der Waals heterostructures.” Nature Communications, 11, 4209, Pp. 1-7.
Direct visualization of nanometer-scale properties of moiré superlattices in van der Waals heterostructure devices is a critically needed diagnostic tool for study of the electronic and optical phenomena induced by the periodic variation of atomic structure in these complex systems. Conventional imaging methods are destructive and insensitive to the buried device geometries, preventing practical inspection. Here we report a versatile scanning probe microscopy technique employing infrared light for imaging moiré superlattices of twisted bilayer graphene encapsulated by hexagonal boron nitride. We map the pattern using the scattering dynamics of phonon polaritons launched in the hexagonal boron nitride capping layers via their interaction with the buried moiré superlattices. We explore the origin of the imaged double-line features and show the mechanism of the underlying effective phase change of the phonon polariton reflectance at domain walls. The nano-imaging tool developed provides a non-destructive analytical approach to elucidate the complex physics of moiré-engineered heterostructures.
G. A. Tritsaris, S. Carr, Z. Zhu, Y. Xie, S. Torrisi, J. Tang, M. Mattheakis, D. Larson, and E. Kaxiras. 6/2020. “Electronic structure calculations of twisted multi-layer graphene superlattices.” 2D Materials, 7, Pp. 035028.
Quantum confinement endows two-dimensional (2D) layered materials with exceptional physics and novel properties compared to their bulk counterparts. Although certain two- and few-layer configurations of graphene have been realized and studied, a systematic investigation of the properties of arbitrarily layered graphene assemblies is still lacking. We introduce theoretical concepts and methods for the processing of materials information, and as a case study, apply them to investigate the electronic structure of multi-layer graphene-based assemblies in a high-throughput fashion. We provide a critical discussion of patterns and trends in tight binding band structures and we identify specific layered assemblies using low-dispersion electronic bands as indicators of potentially interesting physics like strongly correlated behavior. A combination of data-driven models for visualization and prediction is used to intelligently explore the materials space. This work more generally aims to increase confidence in the combined use of physics-based and data-driven modeling for the systematic refinement of knowledge about 2D layered materials, with implications for the development of novel quantum devices.
Georgios A. Tritsaris, Yiqi Xie, Alexander M. Rush, Stephen Carr, Marios Mattheakis, and Efthimios Kaxiras. 6/2020. “LAN -- A materials notation for 2D layered assemblies.” J. Chem. Inf. Model.
Two-dimensional (2D) layered materials offer intriguing possibilities for novel physics and applications. Before any attempt at exploring the materials space in a systematic fashion, or combining insights from theory, computation and experiment, a formal description of information about an assembly of arbitrary composition is required. Here, we introduce a domain-generic notation that is used to describe the space of 2D layered materials from monolayers to twisted assemblies of arbitrary composition, existent or not-yet-fabricated. The notation corresponds to a theoretical materials concept of stepwise assembly of layered structures using a sequence of rotation, vertical stacking, and other operations on individual 2D layers. Its scope is demonstrated with a number of example structures using common single-layer materials as building blocks. This work overall aims to contribute to the systematic codification, capture and transfer of materials knowledge in the area of 2D layered materials.
Feiyu Chen, David Sondak, Pavlos Protopapas, Marios Mattheakis, Shuheng Liu, Devansh Agarwal, and Marco Di Giovanni. 2/2020. “NeuroDiffEq: A Python package for solving differential equations with neural networks.” Journal of Open Source Software, 5, 46.
G. Barmparis, G. Neofotistos, M. Mattheakis, J. Hitzanidi, G. P. Tsironis, and E. Kaxiras. 2/2020. “Robust prediction of complex spatiotemporal states through machine learning with sparse sensing.” Physics Letters A, 384, Pp. 126300.
Complex spatiotemporal states arise frequently in material as well as biological systems consisting of multiple interacting units. A specific, but rather ubiquitous and interesting example is that of “chimeras”, existing at the edge between order and chaos. We use Machine Learning methods involving “observers” to predict the evolution of two systems: coupled lasers comprising turbulent chimera states, and a less chaotic biological one, modular neuronal networks containing states that are synchronized across the networks. We demonstrated the necessity of using “observers” to improve the performance of Feed-Forward Networks in such complex systems. We also investigated the robustness of the forecasting capabilities of the “Observer Feed-Forward Networks” with respect to the distribution of the observers (equidistant or random) and their motion (stationary or moving). We conclude that the method has broader applicability in the dynamical systems context when partial dynamical information about the system is available.
M. Mattheakis, D. Sondak, and P. Protopapas. 2020. “Hamiltonian neural networks for solving differential equations.”

There has been a wave of interest in applying machine learning to study dynamical systems. In particular, neural networks have been applied to solve the equations of motion, and therefore, track the evolution of a system. In contrast to other applications of neural networks and machine learning, dynamical systems possess invariants such as energy, momentum, and angular momentum, depending on their underlying symmetries. Traditional numerical integration methods sometimes violate these conservation laws, propagating errors in time, ultimately reducing the predictability of the method. We present a data-free Hamiltonian neural network that solves the differential equations that govern dynamical systems. This is an equation-driven unsupervised learning method where the optimization process of the network depends solely on the predicted functions without using any ground truth data. This unsupervised model learns solutions that satisfy identically, up to an arbitrarily small error, Hamilton’s equations and, therefore, conserve the Hamiltonian invariants. Once the network is optimized, the proposed architecture is considered a symplectic unit due to the introduction of an efficient parametric form of solutions. In addition, the choice of an appropriate activation function drastically improves the predictability of the network. An error analysis is derived, showing that the numerical errors depend on the overall network performance. The symplectic architecture is then employed to solve the equations for the nonlinear oscillator and the chaotic Hénon-Heiles dynamical system. In both systems, a symplectic Euler integrator requires two orders of magnitude more evaluation points than the Hamiltonian network in order to achieve the same order of numerical error in the predicted phase space trajectories.
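The parametric form of solutions with embedded initial conditions can be sketched as follows; the stand-in functions below replace the network outputs, and the (1 - e^{-t}) prefactor is one common choice of a factor that vanishes at t = 0:

```python
import numpy as np

q0, p0 = 1.3, -0.4                     # initial conditions to embed
f = lambda t: 1.0 - np.exp(-t)         # f(0) = 0, so the ICs hold exactly
Nq = lambda t: np.sin(0.5 * t)         # stand-ins for the two network outputs
Np = lambda t: np.cos(0.5 * t) - 1.0

q = lambda t: q0 + f(t) * Nq(t)        # parametric solutions: the optimizer
p = lambda t: p0 + f(t) * Np(t)        # only shapes the t > 0 behavior

t = np.array([0.0, 1.0, 2.0])
qt, pt = q(t), p(t)
```

Because the initial conditions are satisfied by construction, the training loss can consist solely of the residual of Hamilton's equations at collocation points.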

M. Maier, M. Mattheakis, E. Kaxiras, M. Luskin, and D. Margetis. 10/2019. “Homogenization of plasmonic crystals: Seeking the epsilon-near-zero behavior.” Proceedings of the Royal Society A, 475, 2230.
By using an asymptotic analysis and numerical simulations, we derive and investigate a system of homogenized Maxwell's equations for conducting material sheets that are periodically arranged and embedded in a heterogeneous and anisotropic dielectric host.  This structure is motivated by the need to design plasmonic crystals that enable the propagation of electromagnetic waves with no phase delay (epsilon-near-zero effect). Our microscopic model incorporates the surface conductivity of the two-dimensional (2D) material of each sheet and a corresponding line charge density through a line conductivity along possible edges of the sheets. Our analysis generalizes averaging principles inherent in previous Bloch-wave approaches. We investigate physical implications of our findings. In particular, we emphasize the role of the vector-valued corrector field, which expresses microscopic modes of surface waves on the 2D material. By using a Drude model for the surface conductivity of the sheet, we construct a Lorentzian function that describes the effective dielectric permittivity tensor of the plasmonic crystal as a function of frequency.
Marios Mattheakis, Matthias Maier, Wei Xi Boo, and Efthimios Kaxiras. 9/2019. “Graphene epsilon-near-zero plasmonic crystals.” In NANOCOM '19 Proceedings of the Sixth Annual ACM International Conference on Nanoscale Computing and Communication. Dublin, Ireland.
Plasmonic crystals are a class of optical metamaterials that consist of engineered structures at the sub-wavelength scale. They exhibit optical properties that are not found under normal circumstances in nature, such as negative-refractive-index and epsilon-near-zero (ENZ) behavior. Graphene-based plasmonic crystals present linear, elliptical, or hyperbolic dispersion relations that exhibit ENZ behavior, normal or negative-index diffraction. The optical properties can be dynamically tuned by controlling the operating frequency and the doping level of graphene. We propose a construction approach to expand the frequency range of the ENZ behavior. We demonstrate how combining a host material with an optical Lorentzian response and a graphene conductivity that follows a Drude model leads to an ENZ condition spanning a large frequency range.
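The ENZ condition (the real part of the effective permittivity crossing zero) can be located numerically for a Drude term combined with a Lorentzian host resonance. All parameter values below are illustrative normalized numbers, not fitted graphene values:

```python
import numpy as np

# effective permittivity sketch: Drude term (conducting sheets) plus a
# Lorentzian host resonance, in normalized units
eps_inf, wp = 1.0, 1.0                # background and Drude plasma frequency
w0, dw, gam = 0.6, 0.3, 0.02          # Lorentzian position, strength, damping

w = np.linspace(0.05, 2.0, 4000)      # frequency grid
drude = -wp**2 / (w**2 + 1j * gam * w)
lorentz = dw * w0**2 / (w0**2 - w**2 - 1j * gam * w)
eps = eps_inf + drude + lorentz

re = np.real(eps)
crossings = np.nonzero(np.diff(np.sign(re)))[0]   # ENZ: Re(eps) changes sign
n_enz = len(crossings)                # resonance adds crossings vs. Drude alone
```

A bare Drude response yields a single zero crossing; the Lorentzian resonance introduces additional ones, which is the mechanism behind extending the ENZ frequency range.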
M. Mattheakis, P. Protopapas, D. Sondak, M. Di Giovanni, and E. Kaxiras. 4/2019. “Physical Symmetries Embedded in Neural Networks.” arXiv:1904.08991.
Neural networks are a central technique in machine learning. Recent years have seen a wave of interest in applying neural networks to physical systems for which the governing dynamics are known and expressed through differential equations. Two fundamental challenges facing the development of neural networks in physics applications are their lack of interpretability and their physics-agnostic design. The focus of the present work is to embed physical constraints into the structure of the neural network to address the second fundamental challenge. By constraining tunable parameters (such as weights and biases) and adding special layers to the network, the desired constraints are guaranteed to be satisfied without the need for explicit regularization terms. This is demonstrated on supervised and unsupervised networks for two basic symmetries: even/odd symmetry of a function and energy conservation. In the supervised case, the network with embedded constraints is shown to perform well on regression problems while simultaneously obeying the desired constraints, whereas a traditional network fits the data but violates the underlying constraints. Finally, a new unsupervised neural network is proposed that guarantees energy conservation through an embedded symplectic structure. The symplectic neural network is used to solve a system of energy-conserving differential equations and outperforms an unsupervised, non-symplectic neural network.
G. N. Neofotistos, M. Mattheakis, G. Barmparis, J. Hitzanidi, G. P. Tsironis, and E. Kaxiras. 3/1/2019. “Machine learning with observers predicts complex spatiotemporal behavior.” Front. Phys. - Quantum Computing, 7, 24, Pp. 1-9.
Chimeras and branching are two archetypical complex phenomena that appear in many physical systems; because of their different intrinsic dynamics, they delineate opposite non-trivial limits in the complexity of wave motion and present severe challenges in predicting chaotic and singular behavior in extended physical systems. We report on the long-term forecasting capability of Long Short-Term Memory (LSTM) and reservoir computing (RC) recurrent neural networks, when they are applied to the spatiotemporal evolution of turbulent chimeras in simulated arrays of coupled superconducting quantum interference devices (SQUIDs) or lasers, and branching in the electronic flow of two-dimensional graphene with random potential. We propose a new method in which we assign one LSTM network to each system node except for “observer” nodes which provide continual “ground truth” measurements as input; we refer to this method as “Observer LSTM” (OLSTM). We demonstrate that even a small number of observers greatly improves the data-driven (model-free) long-term forecasting capability of the LSTM networks and provide the framework for a consistent comparison between the RC and LSTM methods. We find that RC requires smaller training datasets than OLSTMs, but the latter require fewer observers. Both methods are benchmarked against Feed-Forward neural networks (FNNs), also trained to make predictions with observers (OFNNs).