PROXIMAL BELLMAN MAPPINGS FOR REINFORCEMENT LEARNING AND THEIR
APPLICATION TO ROBUST ADAPTIVE FILTERING
Abstract
This paper aims at the algorithmic/theoretical core of reinforcement
learning (RL) by introducing the novel class of proximal Bellman
mappings. These mappings are defined in reproducing kernel Hilbert
spaces (RKHSs) to benefit from the rich approximation properties and
the inner product of RKHSs; they are shown to belong to the powerful
Hilbertian family of (firmly) nonexpansive mappings, regardless of the
values of their discount factors, and they possess ample degrees of
design freedom, even reproducing attributes of the classical Bellman
mappings and paving the way for novel RL designs. An approximate
policy-iteration scheme is built on the proposed class of mappings to
solve the problem of selecting online, at every time instance, the
“optimal” exponent p in a p-norm loss to combat outliers in linear
adaptive filtering, without training data and without any knowledge of
the statistical properties of the outliers. Numerical tests on synthetic
data showcase the superior performance of the proposed framework over
several non-RL and kernel-based RL schemes.
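
As a concrete illustration of the adaptive-filtering application summarized
above, the following is a minimal Python sketch of a linear adaptive filter
whose p-norm loss exponent is switched online at every time instance. The
rule used here to pick p (falling back to p = 1 whenever the instantaneous
error looks like an outlier relative to a running scale estimate) is only a
hypothetical placeholder; it does not reproduce the paper's proximal-Bellman
or approximate policy-iteration scheme, and all names and parameter values
(D, mu, the grid of exponents, the 3x threshold) are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    D = 5                              # filter length (assumed)
    mu = 0.01                          # step size (assumed)
    w = np.zeros(D)                    # adaptive-filter weights
    w_true = rng.standard_normal(D)    # unknown system to be identified
    scale = 1.0                        # running scale estimate of the error

    for n in range(5000):
        x = rng.standard_normal(D)
        # nominal Gaussian noise plus occasional heavy-tailed outliers
        noise = 0.01 * rng.standard_normal()
        if rng.random() < 0.05:
            noise += 5.0 * rng.standard_normal()
        y = w_true @ x + noise

        e = y - w @ x

        # placeholder "policy" (NOT the paper's RKHS policy iteration):
        # use p = 1 when the error looks like an outlier, else p = 2
        scale = 0.99 * scale + 0.01 * abs(e)
        p = 1.0 if abs(e) > 3.0 * scale else 2.0

        # stochastic-gradient step on the l_p loss |e|^p
        w = w + mu * p * abs(e) ** (p - 1) * np.sign(e) * x

With p = 2 the update reduces to the classical LMS recursion, while p = 1
yields the sign-LMS recursion, which is less sensitive to large, outlier-driven
errors; switching between exponents online is the role played by the RL agent
in the proposed framework.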
© 20XX IEEE. Personal use of this material is permitted. Permission
from IEEE must be obtained for all other uses, in any current or future
media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted
component of this work in other works.