Self-supervised learning (SSL) has shown great promise in problems involving natural language and vision modalities. Nonetheless, human-centric problems (such as activity recognition, pose estimation, affective computing, brain-computer interfaces (BCI), and health analytics) rely on information modalities with specific spatiotemporal properties, e.g., ECG, EEG, speech, human motion, and clinical data. To adapt SSL frameworks for building effective human-centric deep learning solutions, the following challenges and opportunities need to be explored:
- What are effective current and future directions for generating pseudo-labels for human-related sensor time series, activity images/videos, speech signals, and other modalities? Given the unique properties of human-centric data, what network architectures, auxiliary tasks, and data augmentations are effective for robust representation learning in SSL frameworks? What are the challenges and opportunities in human-centric multi-modal SSL?
- What are the considerations around responsible development of human-centric SSL? Specifically, what are the ethical and legal implications of using SSL on human-centric data, i.e., generating and attributing pseudo-labels to real human data? From an ethical and privacy perspective, what are the opportunities and benefits of training models without actual labels, given that labels can themselves be biased? How does SSL affect bias and fairness in downstream human-centric tasks?
The goal of HC-SSL is to bring together researchers from academia and industry to promote the exchange of ideas, results, methods, and recent findings at the intersection of SSL and human-centric AI. We aim to highlight and facilitate discussion of the challenges and opportunities above, to expose attendees to the emerging potential of SSL for human-centric representation learning, and to promote responsible AI within the context of SSL.