Code and data

ChaLearn Looking at People series: material from all competitions, workshops, special issues, and publications (code, data, fact sheets, etc.).

Folded Recurrent Neural Networks for Future Video Prediction. Code. ArXiv paper.

Mohammad A. Haque, Ruben B. Bautista, Kamal Nasrollahi, Sergio Escalera, Christian B. Laursen, Ramin Irani, Ole K. Andersen, Erika G. Spaich, Kaustubh Kulkarni, Thomas B. Moeslund, Marco Bellantonio, Gholamreza Anbarjafari, and Fatemeh Noroozi, Deep Multimodal Pain Recognition: A Database and Comparison of Spatio-Temporal Visual Modalities, Faces and Gestures, FG, 2018. Slides. Poster. Data.


Rain Eric Haamer, Kaustubh Kulkarni, Nasrin Imanpour, Mohammad Ahsanul Haque, Egils Avots, Michelle Breisch, Kamal Nasrollahi, Sergio Escalera, Cagri Ozcinar, Xavier Baro, Ahmad R. Naghsh-Nilchi, Thomas B. Moeslund, and Gholamreza Anbarjafari, Changes in Facial Expression as Biometric: A Database and Benchmarks of Identification, Faces and Gestures workshops, FGW, FG, 2018. Slides. Data available on request.


APPA-REAL dataset and extended face attribute annotations


Eirikur Agustsson, Radu Timofte, Sergio Escalera, Xavier Baro, Isabelle Guyon, Rasmus Rothe, Apparent and real age estimation in still images with deep residual regressors on APPA-REAL database, FG, 2017.

Albert Clapés, Ozan Bilici, Dariia Temirova, Egils Avots, Gholamreza Anbarjafari, and Sergio Escalera, From apparent to real age: gender, age, ethnic, makeup, and expression bias analysis in real age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2373-2382, 2018.

3D hand pose

HandDB

Meysam Madadi, Sergio Escalera, Alex Carruesco Llorens, Carlos Andujar, Xavier Baro, Jordi Gonzalez, Occlusion aware hand pose recovery from sequences of depth images, FG, 2017.


Support Vector Machines with Time Series Distance Kernels for Action Classification. Code

Mohammad Ali Bagheri, Qigang Gao, Sergio Escalera, Support Vector Machines with Time Series Distance Kernels for Action Classification, WACV, 2016.

A new class of SVM applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distance measures into the kernel function. In addition, the pairwise proximity learning strategy is used so that non-positive semi-definite kernels can be employed in the SVM formulation.
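The pairwise proximity idea can be illustrated with a minimal sketch. It assumes dynamic time warping (DTW) as the time-series distance, which this page does not confirm as one of the paper's two measures, and the names `dtw_distance` and `proximity_features` are hypothetical, not taken from the released code. Each sequence is represented by its vector of distances to a set of training sequences, so that a standard SVM could then be trained on those vectors without requiring the distance to induce a positive semi-definite kernel. The sketch stops at feature construction and does not train a classifier.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] matched again with b[j-1]'s predecessor path
                cost[i][j - 1],      # b[j-1] matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def proximity_features(sequences, prototypes):
    """Represent each sequence by its DTW distances to the prototype sequences."""
    return [[dtw_distance(s, p) for p in prototypes] for s in sequences]


# Toy trajectories (illustrative data only): two rising, two falling.
train = [[0, 1, 2, 3], [0, 1, 1, 2, 3], [3, 2, 1, 0], [3, 2, 2, 1, 0]]

# Using the training set itself as prototypes gives one feature per training sample.
features = proximity_features(train, train)
```

The rows of `features` can be passed to any off-the-shelf SVM implementation as ordinary feature vectors; sequences of the same shape end up with similar rows even when their lengths differ.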


Tri-modal RGB-Depth-Thermal dataset for human analysis

Database description: The dataset features a total of 5724 annotated frames divided into three indoor scenes.

Activity in scenes 1 and 3 uses the full depth range of the Kinect for XBOX 360 sensor, whereas activity in scene 2 is constrained to a depth range of ±0.250 m in order to suppress the parallax between the two physical sensors. Scenes 1 and 2 are situated in a closed meeting room with little natural light to disturb the depth sensing, whereas scene 3 is situated in an area with wide windows and a substantial amount of sunlight. In each scene, three persons are interacting, reading, walking, sitting, etc. Every person is annotated with a unique ID in the scene at pixel level in the RGB modality. For the thermal and depth modalities, annotations are transferred from the RGB images using a registration algorithm found in registrator.cpp.

Reference: Palmero, C., Clapés, A., Bahnsen, C., Møgelmose, A., Moeslund, T. B., & Escalera, S. (2016). Multi-modal RGB–Depth–Thermal Human Body Segmentation. International Journal of Computer Vision, pp. 1-23.


Continuous Supervised Descent Method for Facial Landmark Localisation


Reference: Marc Oliu, Ciprian Corneanu, Laszlo A. Jeni, Jeffrey F. Cohn, Takeo Kanade, and Sergio Escalera, Continuous Supervised Descent Method for Facial Landmark Localisation, ACCV 2016. Slides. Poster. Oral. Code and data webpage.

Reference: Sergio Escalera, Oriol Pujol, and Petia Radeva, Error-Correcting Output Codes Library, Journal of Machine Learning Research, vol. 11, pp. 661-664, MIT Press, USA, ISSN 1532-4435, IF JCR CCIA 2.789 2009 18/103, 2010. Open Source Library, Machine Learning Open Source Software.


Contextual rescoring


HuPBA-90 data set

Video sample
Reference: Daniel Sánchez, Juan Carlos Ortega, Miguel Ángel Bautista, and Sergio Escalera, Human Body Segmentation with Multi-limb Error-Correcting Output Codes Detection and Graph Cuts Optimization, 6th Iberian Conference on Pattern Recognition and Image Analysis, IBPRIA, Madeira, 2013.

ChaLearn-HuPBA Multi-Modal Gesture Recognition Challenge 2013 data set

Data set webpage

Reference: S. Escalera, J. Gonzàlez, X. Baró, M. Reyes, O. Lopes, I. Guyon, V. Athitsos, and H.J. Escalante, Multi-modal Gesture Recognition Challenge 2013: Dataset and Results, ICMI, 2013. Video sample of gesture categories and data modalities.

HumanLimb data set

227 images of 25 different people against backgrounds of varying complexity. 14 limbs are labeled per image.
Reference: Antonio Hernández-Vela, Miguel Reyes, Víctor Ponce, and Sergio Escalera, GrabCut-Based Human Segmentation in Video Sequences, Sensors, Volume 12, Issue 11, 15376-15393; doi: 10.3390/s121115376, 2012.


3D human pose data

This dataset contains labelled body parts in videos recorded with a Kinect camera (RGB+Depth).
Reference: Antonio Hernández-Vela, Nadezhda Zlateva, Alexander Marinov, Miguel Reyes, Petia Radeva, Dimo Dimov, and Sergio Escalera, Graph Cuts Optimization for Multi-Limb Human Segmentation in Depth Maps, IEEE Computer Vision and Pattern Recognition conference, 16/06/2012-21/06/2012, Providence, Rhode Island, 2012.

Reference: Antonio Hernández-Vela, Nadezhda Zlateva, Alexander Marinov, Miguel Reyes, Petia Radeva, Dimo Dimov, and Sergio Escalera, Human Limb Segmentation in Depth Maps based on Spatio-Temporal Graph Cuts Optimization, Journal of Ambient Intelligence and Smart Environments JAISE, 2012.


Cover data set for text detection

Includes more than 15,000 images, a subset of which is labeled in XML.

Reference: Sergio Escalera, Xavier Baró, Jordi Vitrià and Petia Radeva, Text Detection in Urban Scenes, International Conference of the “Associació Catalana d’Intel·ligència Artificial”, CCIA 2009.


Symbols in natural scenes data set

This data set includes about 550 images of 17 different symbols that appear in natural scenes.

Reference: Sergio Escalera, Alicia Fornés, Oriol Pujol, Petia Radeva, Gemma Sánchez, and Josep Lladós, Blurred Shape Model for Binary and Grey-level Symbol Recognition, Pattern Recognition Letters, doi:10.1016/j.patrec.2009.08.001, 2009.