Guoying Zhao is currently a Professor with the Center for Machine Vision and Signal Analysis, University of Oulu, Finland, where she has been a Senior Researcher since 2005 and an Associate Professor since 2014. She received the Ph.D. degree in computer science from the Chinese Academy of Sciences, Beijing, China, in 2005. She obtained an Academy Postdoctoral position in 2007, was selected for the highly competitive Academy Research Fellow position in 2011, and was selected as one of ten Academy Professors in 2020. She has authored or co-authored more than 260 papers in journals and conferences and has served as a reviewer for many journals and conferences. Her papers currently have over 15,400 citations in Google Scholar (h-index 57). She is Co-Program Chair of the ACM International Conference on Multimodal Interaction (ICMI) 2021 and Special Sessions/Panels Chair for FG 2023. She was General Chair of ICBEA (2019, 2020), Co-Chair for Late Breaking Results of ICMI 2019, and Co-Publicity Chair for FG 2018; she has served as an area chair for several conferences and is an associate editor for the Pattern Recognition, IEEE Transactions on Circuits and Systems for Video Technology, and Image and Vision Computing journals. She has lectured tutorials at ICPR 2006, ICCV 2009, SCIA 2013, and FG 2018, and has authored/edited three books and eight journal special issues. Dr. Zhao has been a co-chair of 17 international workshops/special sessions at top venues such as ICCV, CVPR, ECCV, ACCV, and FG. Her students and researchers are frequent recipients of prestigious and highly competitive fellowships and grants, such as the Academy of Finland Postdoc position, the Nokia Scholarship, the Endeavour Research Fellowship, Tauno Tönning research funding, the KAUTE Foundation grant, and the Jorma Ollila grant. She is an IEEE Senior Member.
Her current research interests include image and video descriptors, gait analysis, dynamic-texture recognition, facial-expression recognition, human motion analysis, and person identification. Her research has been covered by Finnish TV programs, newspapers, and MIT Technology Review.

Nationality: P.R. China

Center for Machine Vision and Signal Analysis,
P.O.Box 4500 FI-90014 University of Oulu, Finland.



Phone Number: +358 294487564


Research Interests: Computer Vision, Pattern Recognition, Affective Computing, Digital image & video processing, Human motion analysis, Virtual Reality, Biometrics, etc.
Google Scholar

Open positions: Postdoctoral researchers, PhD students, Master thesis workers.

Call for Postdoctoral Researcher for Machine Vision and Signal Analysis. Deadline: Apr. 15, 2021.

Call for Doctoral Student for Machine Vision and Signal Analysis. Deadline: Apr. 15, 2021.


2021.06: Muzammil recently won the Tauno Tönning Foundation Grant.

2021.03: The 2nd Multimodal Sentiment Analysis Challenge, held in conjunction with ACM MM 2021. Welcome to participate!

2021.03: Our ICCV 2019 paper received the IEEE Finland Section Best Student Conference Paper Award 2020: "Zitong Yu, Wei Peng, Xiaobai Li, Xiaopeng Hong, Guoying Zhao. Remote Heart Rate Measurement from Highly Compressed Facial Videos: An End-to-End Deep Learning Solution with Video Enhancement. ICCV 2019."

2020.10: Mr. Muzammil Behzad won the three-minute PhD thesis competition organized by ICIP 2020.

2020.08: We took 2nd place on the Action Recognition Track of the ECCV 2020 VIPriors Challenges (with an accuracy of 88.31%). More details can be found here.

2020.08: Dr. Xiaobai Li has been selected for an Assistant Professor (Tenure Track) position in CMVS, University of Oulu.

2020.07: Dr. Jingang Shi has been appointed to an Associate Professor position at Xi'an Jiaotong University, China.

2020.06: We won first place in the ChaLearn multi-modal face anti-spoofing attack detection challenge @ CVPR 2020, and second place in the ChaLearn single-modal face anti-spoofing attack detection challenge @ CVPR 2020.

2020.06: Joint work with the Learning & Educational Technology Research Unit (LET) accepted to the top-tier IEEE Transactions on Affective Computing: "Muhterem Dindar, Sanna Järvelä, Sara Ahola, Xiaohua Huang, Guoying Zhao. Leaders and followers identified by emotional mimicry during collaborative learning: A facial expression recognition study on emotional valence."

2020.06: Ms. Yingyue Xu has successfully defended her doctoral dissertation and obtained her PhD degree.

2020.05: Dr. Xin Liu has been selected for the highly competitive Academy of Finland postdoctoral position (2020.09-2023.08).

Databases for download:


  • OuluVS database: It includes video and audio data for 20 subjects uttering ten phrases: Hello, Excuse me, I am sorry, Thank you, Good bye, See you, Nice to meet you, You are welcome, How are you, Have a good time. Each person spoke each phrase five times. There are also videos with head motion from front to left and from front to right, without utterance, five times for each person. The details and baseline results for visual speech recognition can be found in:

Zhao G, Barnard M & Pietikäinen M (2009). Lipreading with local spatiotemporal descriptors. IEEE Transactions on Multimedia 11(7):1254-1265.

The database can be used, for example, in studying visual speech recognition (lipreading). If you want a copy, please contact me.

  • Oulu-CASIA NIR&VIS facial expression database: It contains videos with the six typical expressions (happiness, sadness, surprise, anger, fear, disgust) from 80 subjects, captured with two imaging systems, NIR (near infrared) and VIS (visible light), under three different illumination conditions: normal indoor illumination, weak illumination (only the computer display is on), and dark illumination (all lights are off). The database can be used, for example, in studying the effects of illumination variations on facial expressions, cross-imaging-system facial expression recognition, or face recognition.

This database has been released. If you are interested, please contact me.

  • SPOS database - spontaneous and posed facial expressions database

The SPOS database includes spontaneous and posed facial expressions of 7 subjects. Emotional movie clips were shown to the subjects to induce spontaneous facial expressions, covering six categories of basic emotions (happy, sad, anger, surprise, fear, disgust). The subjects were also asked to pose the six kinds of facial expressions after watching the movie clips. Data were recorded by both visible-light and near-infrared cameras. Altogether, 84 posed and 147 spontaneous facial expression clips were labeled, from onset to apex.

So far, spontaneous and posed facial expressions have usually been found in different databases. Differences between databases (different experimental settings and different participants) have hindered research that considers both spontaneous and posed facial expressions. This database offers data collected from the same participants under the same recording conditions, so it can be used for comparing or distinguishing spontaneous and posed facial expressions.