the  test  dataset. Fifteen classifier types  were  tested, 
and their performances evaluated using subsets of the 
features.  The  model  achieving  the  highest  accuracy 
was of the Fine Tree type and used a subset of 29 
features. Applying temporal averaging to the test
data in order to remove noise increased the
performance of most of the classifiers.
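As an illustration of the temporal averaging step, the following is a minimal sketch, assuming the per-frame features are stored as a NumPy array with one row per frame and that a simple trailing moving average is used. The function name `temporal_average` and the `window` parameter are hypothetical; the window length and the exact averaging scheme used in this work are not specified here.

```python
import numpy as np


def temporal_average(features, window=5):
    """Smooth each feature column with a trailing moving average.

    features: array of shape (n_frames, n_features), one row per frame.
    window:   number of frames averaged (assumed value, not from the paper).
    The first frames are averaged over the shorter history that exists.
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2:
        raise ValueError("features must have shape (n_frames, n_features)")
    if window < 1:
        raise ValueError("window must be >= 1")

    n_frames, n_features = features.shape
    # Prepend a zero row so every window sum is a difference of two cumulative sums.
    csum = np.cumsum(np.vstack([np.zeros((1, n_features)), features]), axis=0)
    end = np.arange(1, n_frames + 1)        # exclusive end index of each window
    start = np.maximum(end - window, 0)     # inclusive start index of each window
    counts = (end - start)[:, None]         # number of frames in each window
    return (csum[end] - csum[start]) / counts
```

The smoothed array has the same shape as the input, so it can be passed to an already trained classifier in place of the raw test features.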
The three intensity Action Units that proved to be
relevant for the classification model were AU02,
AU04 and AU12. They refer to ‘Outer Brow Raiser’,
‘Brow Lowerer’ and ‘Lip Corner Puller’ respectively.
These are facial features that relate to frowning and 
smiling,  which  could  have  significance  when 
evaluating interest. It should be noted, however, that 
the  Action  Units  might  not  be  completely  reliable, 
especially in many of the ‘No Interest’ cases where
the person is not facing the camera directly. Also, the
reliability of the Action Unit values is noted to
possibly be lower when using the feature extraction
method on sequences containing multiple faces,
which was used for this work. Despite this, the
achieved performance of the classifiers was good.
The  only  previous  work  found  discussing 
classification of human interest in a robot
(Munoz-Salinas et al., 2005) used the detection of skin area
based  on  colour  to  determine  how  much  interest  a 
person  was  showing.  This  method  has  two  major 
limitations. Firstly, the usage of skin colour as a
determining factor is not ideal, as it can be
problematic in the case of bald people and people
with low contrast between hair and skin colour.
Secondly,  face  orientation  does  not  necessarily 
signify  an  interest  in  the  robot.  Both  of  these 
limitations are addressed with our suggested method. 
Including gaze provides a more reliable estimate of a
person’s focus and thereby their point of interest.
The interest detection system described above can 
have  different  applications,  and  it  will  be  primarily 
used on our robot for detecting which person in the 
robot’s environment is interested in interacting with 
it and taking a cup of water which the robot carries. 
This detection algorithm will be especially useful in
high-traffic, noisy situations where we cannot rely on
the  verbal  communication  channel,  i.e.  speech 
recognition, to gauge people’s interest in having a cup 
of water. 
ACKNOWLEDGEMENTS 
This work has been supported by InnovationsFonden 
Danmark  in  the  context  of  the  Project  “Seamless 
huMan-robot interactiOn fOr THe support of elderly 
people” (SMOOTH). 
REFERENCES 
Argyle,  M.  (1972).  Non-verbal  communication in  human 
social interaction. 
Mavridis,  N.  (2015).  A  review  of  verbal  and  non-verbal 
human–robot interactive communication. Robotics and 
Autonomous Systems, 63, 22-35. 
Juel, W. K., Haarslev, F., Ramírez, E. R., Marchetti, E., 
Fischer,  K.,  Shaikh,  D.,  ...  &  Krüger,  N.  (2020). 
SMOOTH Robot: Design for a novel modular welfare 
robot. Journal of Intelligent & Robotic Systems, 98(1), 
19-37. 
Palinko,  O.,  Fischer,  K.,  Ruiz  Ramirez,  E.,  Damsgaard 
Nissen,  L.,  &  Langedijk,  R.  M.  (2020,  March).  A 
Drink-Serving  Mobile  Social  Robot  Selects  who  to 
Interact  with  Using  Gaze.  In  Companion of the 2020 
ACM/IEEE International Conference on Human-Robot 
Interaction (pp. 384-385). 
Baltrušaitis,  T.,  Robinson,  P.,  &  Morency,  L.  P.  (2016, 
March). OpenFace: an open source facial behavior
analysis  toolkit.  In  2016 IEEE Winter Conference on 
Applications of Computer Vision (WACV)  (pp.  1-10). 
IEEE. 
Hjortsjö,  C.  H.  (1969).  Man's face and mimic language. 
Studentlitteratur. 
Argyle, M., & Cook, M. (1976). Gaze and mutual gaze. 
Palinko,  O.,  Rea,  F.,  Sandini,  G.,  &  Sciutti,  A.  (2015, 
November). Eye gaze tracking for a humanoid robot. In 
2015 IEEE-RAS 15th International Conference on 
Humanoid Robots (Humanoids) (pp. 318-324). IEEE. 
Munoz-Salinas,  R.,  Aguirre,  E.,  García-Silvente,  M.,  & 
González,  A.  (2005).  A  fuzzy  system  for  visual 
detection of interest in human-robot interaction. In 2nd 
International Conference on Machine Intelligence 
(ACIDCA-ICMI’2005) (pp. 574-581). 
Palinko,  O.,  Rea,  F.,  Sandini,  G.,  &  Sciutti,  A.  (2016, 
October). Robot reading human gaze: Why eye tracking 
is  better  than  head  tracking  for  human-robot 
collaboration.  In  2016 IEEE/RSJ International 
Conference on Intelligent Robots and Systems (IROS) 
(pp. 5048-5054). IEEE. 
Zhang,  Y.  (Ed.).  (2010).  New advances in machine 
learning. BoD–Books on Demand. 
Ekman, R. (2012). What the face reveals: Basic and applied 
studies of spontaneous expression using the Facial 
Action Coding System (FACS).  
 
HUCAPP 2021 - 5th International Conference on Human Computer Interaction Theory and Applications