VVV08 Interaction Histories
Assif Mirza - IIT, Genova and previously University of Hertfordshire, UK
Interaction Histories for Grounded Robot Ontogeny have been the subject of my PhD research over the last few years. Here at the summer school, my goal is to demonstrate their capabilities using the simple interaction game "peekaboo".
Interaction History Architecture
An interaction history is a collection of a robot's past "experiences" that can be used to generate future action based on new experiences. At the heart of the architecture is a "metric space of experiences", in which any experience can be compared with any other in terms of their information-theoretic relationship. This comparison is a metric, so experiences can be "placed" in a space at well-defined distances from each other.
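As a rough illustration of what an information-theoretic metric between experiences can look like, here is a minimal sketch. It assumes that an experience is a short window of discretized sensor values and that the metric is the information distance d(X, Y) = H(X|Y) + H(Y|X); both the representation and the function names are assumptions made for this example and are not taken from the architecture's code.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_distance(xs, ys):
    """Information distance d(X, Y) = H(X|Y) + H(Y|X).

    Computed as 2*H(X,Y) - H(X) - H(Y) from two equal-length,
    time-aligned series of discretized sensor values.
    """
    if len(xs) != len(ys):
        raise ValueError("series must have equal length")
    joint = entropy(list(zip(xs, ys)))
    return 2 * joint - entropy(xs) - entropy(ys)
```

Two identical series are at distance 0, and the distance grows as each series tells you less about the other.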
The Interaction History Architecture selects those experiences from the history that are "closest" to the current experience. It then chooses one of those experiences, takes the action that was executed in the past just after that experience, and executes that action. The choice is determined by proximity in the metric space as well as by the subjective value of the experience, which reflects the reward received in the past.
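The selection step can be sketched roughly as follows. The `Experience` record, the neighbourhood size `k` and the weighting `value / (1 + distance)` are all hypothetical choices made for this example; the text only says that both proximity and subjective value influence the choice, not how they are combined.

```python
import random
from dataclasses import dataclass


@dataclass
class Experience:
    distance: float   # distance from the current experience in the metric space
    value: float      # subjective value derived from past reward
    next_action: int  # action executed just after this experience


def select_action(candidates, k=3, rng=random):
    """Pick the next action from the k experiences nearest to the current one."""
    nearest = sorted(candidates, key=lambda e: e.distance)[:k]
    # Hypothetical weighting: favour high value and small distance.
    weights = [e.value / (1.0 + e.distance) for e in nearest]
    if sum(weights) <= 0:
        # No neighbour has any value, so fall back to a uniform choice.
        return rng.choice(nearest).next_action
    return rng.choices(nearest, weights=weights, k=1)[0].next_action
```

A distant experience is never chosen if it falls outside the k nearest, however valuable it was, and a near but worthless experience gets zero weight.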
The architecture also includes mechanisms for "forgetting" and for the formation of prototypical experiences. The first simply deletes those experiences with low subjective value, and the second merges experiences together based on their proximity in the space.
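A minimal sketch of these two mechanisms is given below. Experiences are represented as plain dictionaries with a `value` field, the distance function is passed in by the caller, and the merge rule (a running mean of the values) is a hypothetical stand-in, since the actual merging rule is not described here.

```python
def forget(history, min_value):
    """Drop experiences whose subjective value is below min_value."""
    return [e for e in history if e["value"] >= min_value]


def merge_close(history, distance, radius):
    """Greedily fold each experience into the first prototype within radius."""
    prototypes = []
    for exp in history:
        for proto in prototypes:
            if distance(proto, exp) <= radius:
                # Hypothetical merge rule: running mean of the values.
                n = proto["members"]
                proto["value"] = (proto["value"] * n + exp["value"]) / (n + 1)
                proto["members"] = n + 1
                break
        else:
            # No prototype is close enough, so this experience starts a new one.
            prototypes.append({**exp, "members": 1})
    return prototypes
```

For example, `merge_close(forget(history, 0.5), dist, 0.5)` would first discard low-value experiences and then collapse the remaining near-duplicates into prototypes.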
Motivation feedback (reward) is provided through two mechanisms: observation of a face, and audio feedback.
A face can be detected in the robot's camera image using OpenCV Haar cascades, and this provides direct positive reward. Habituation causes this reward to drop off over time.
The reward for face detection, f, constrained to be in the range [0,1], is a function of the number of consecutive timesteps for which a face is seen. The reward first rises linearly, then holds at 1 for a period before decaying towards 0. f is calculated incrementally.
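The shape of this reward can be sketched with an incremental update like the one below. The parameters `rise_steps`, `hold_steps` and `decay`, the geometric form of the decay, and the reset to 0 when the face is lost are all assumptions made for illustration; the exact update rule used in the experiments is not reproduced here.

```python
class FaceReward:
    """Incremental face reward: linear rise, hold at 1, then decay towards 0."""

    def __init__(self, rise_steps=5, hold_steps=10, decay=0.9):
        self.rise_steps = rise_steps  # timesteps needed to reach 1 (assumed)
        self.hold_steps = hold_steps  # timesteps spent at 1 (assumed)
        self.decay = decay            # per-step habituation factor (assumed)
        self.seen = 0                 # consecutive timesteps with a face
        self.f = 0.0

    def step(self, face_detected):
        if not face_detected:
            # Assumption: losing the face resets the count and the reward.
            self.seen = 0
            self.f = 0.0
            return self.f
        self.seen += 1
        if self.seen <= self.rise_steps:
            self.f = min(1.0, self.f + 1.0 / self.rise_steps)
        elif self.seen <= self.rise_steps + self.hold_steps:
            self.f = 1.0
        else:
            self.f *= self.decay
        return self.f
```

Calling `step(True)` once per timestep while a face is visible traces out the rise, hold and habituation phases in turn.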
Sound is captured from a microphone using Portaudio v1.9, and is used both as an additional sensory signal and as a source of further environmental reward. The "energy" of the sound over the period of a timestep, Esound, provides a sensory input to the robot. It is calculated as the sum of the amplitude of the sound signal over every sound sample in a timestep, and is normalized to take values in the range [0,1]. In converting Esound to a reward signal Rs, low-level background noise is attenuated by taking the square of the sound sensor variable for all values below a threshold Tsound; at or above the threshold, the reward value is set to 1. Squaring attenuates smaller values of the variable more than larger ones, which effectively reduces background noise and emphasizes the reward when the sound is above the threshold.
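The conversion from sound samples to Rs can be sketched as follows. The normalization by the largest possible sum, the default threshold of 0.5 and the choice to treat a value exactly at the threshold as "above" are assumptions made for this example, not values taken from the experiments.

```python
def sound_energy(samples, max_amplitude=1.0):
    """Esound: summed absolute amplitude over one timestep, scaled to [0, 1]."""
    if not samples:
        return 0.0
    total = sum(abs(s) for s in samples)
    # Assumed normalization: divide by the largest possible sum.
    return min(1.0, total / (len(samples) * max_amplitude))


def sound_reward(e_sound, t_sound=0.5):
    """Rs: square values below the threshold, saturate at 1 otherwise."""
    if e_sound >= t_sound:
        return 1.0
    return e_sound ** 2
```

With a threshold of 0.5, an energy of 0.2 gives a reward of about 0.04, while an energy of 0.8 gives the full reward of 1.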
Resulting Reward Signal
The final reward signal is a combination of the sound and face reward signals, as follows:
R = min( 1, α( Rf + Rs ) )
where Rf is the face reward, Rs is the sound reward, and α, in the range [0,1], attenuates the reward signal. With α=0.5, R is the average of the two reward signals, and with α=1, either reward signal on its own can produce the maximum reward. For these experiments, α=0.75, meaning that neither reward signal on its own can produce the maximum R; each requires support from the other.
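As a small worked example, the combination can be written as a one-line function. It caps R at 1 by taking the minimum, which is what the description of α implies; the function and argument names are chosen for this example only.

```python
def combined_reward(r_face, r_sound, alpha=0.75):
    """R = min(1, alpha * (Rf + Rs)), with Rf and Rs each in [0, 1]."""
    return min(1.0, alpha * (r_face + r_sound))
```

With α=0.75, a full face reward alone gives R = 0.75, and the maximum R = 1 is only reached when the two signals sum to at least 4/3, for instance Rf = 1 together with Rs of about 0.34 or more.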
|No.||Code||Description|
|6||HID||Hide Head with Hands|
|8||RAU||Right Arm Up|
|9||LAU||Left Arm Up|
|12||RAW||Wave Right Arm|
|13||LAW||Wave Left Arm|
|14||TR||"Think" Right: raise right arm to chin and look right|
|15||TL||"Think" Left: raise left arm to chin|
|0||Rst||All motors to resting position|
|5||HF||Head to forward position|
|10||RAD||Right Arm Down|
|11||LAD||Left Arm Down|
The development of gestural communicative interaction skills is grounded in the early interaction games that infants play. In the study of the ontogeny of social interaction, gestural communication and turn-taking in artificial agents, it is instructive to look at the kinds of interactions that children are capable of in early development and how they learn to interact appropriately with adults and other children. A well-known interaction game is "peekaboo".
Defining a Peekaboo Sequence
A "peekaboo sequence" is defined to be a sequence of actions beginning with the robot hiding its face (action 6 - HID), followed by any number of "no-action" actions (action 7 - NA) and ending with the robot back in the resting position (action 0 - Rst). Furthermore, for the purposes of evaluating the results of this experiment, the actions should be selected from previous experience rather than executed randomly.
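This definition can be checked against a logged run with a small counter like the one below. It assumes the log is a list of `(action, from_history)` pairs, where `from_history` is True when the action was selected from previous experience rather than at random, and it requires every action in the sequence to come from the history. The log format and that strict reading are assumptions made for this sketch.

```python
HID, NA, RST = 6, 7, 0  # action numbers as used in the definition above


def count_peekaboo(actions):
    """Count HID, NA*, Rst sequences whose actions all came from the history."""
    count = 0
    in_sequence = False
    for action, from_history in actions:
        if action == HID and from_history:
            in_sequence = True   # start (or restart) a candidate sequence
        elif in_sequence and action == NA and from_history:
            continue             # any number of no-action steps is allowed
        elif in_sequence and action == RST and from_history:
            count += 1           # sequence completed
            in_sequence = False
        else:
            in_sequence = False  # any other action breaks the sequence
    return count
```

A hide followed directly by a return to rest counts as a sequence, since "any number" of no-action steps includes none.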
The code is in the iCub repository at:
and the application start-up/stop scripts are in
The key scripts are:
10_run_supporting.sh
20_connect_supporting.sh
30_run_and_connect_iha_processes.sh
40_run_controller.sh
and to stop:
70_stop_controller.sh
80_stop_iha_processes.sh
90_stop_supporting.sh
Here is a flow diagram showing how the processes in the Interaction History are interconnected.