Modern representation learning techniques like deep neural networks have had a major impact both within and beyond the field of machine learning, achieving new state-of-the-art performance with little or no feature engineering on a vast array of tasks. However, these gains are often difficult to translate into real-world settings because they require massive hand-labeled training sets. In the vast majority of real-world settings, collecting such training sets by hand is infeasible due to the cost of labeling data or the paucity of data in a given domain (e.g. rare diseases in medical applications). In this workshop we focus on techniques for few-sample learning and for using weaker supervision when large unlabeled datasets are available, as well as the theory associated with both.

One increasingly popular approach is to use weaker forms of supervision—i.e. supervision that is potentially noisier, biased, and/or less precise. An overarching goal of such approaches is to use domain knowledge and resources from subject matter experts, but to solicit it in higher-level, lower-fidelity, or more opportunistic ways. Examples include higher-level abstractions such as heuristic labeling rules, feature annotations, constraints, expected distributions, and generalized expectation criteria; noisier or biased labels from distant supervision, crowd workers, and weak classifiers; data augmentation strategies to express class invariances; and potentially mismatched training data such as in multitask and transfer learning settings.

Along with practical methods and techniques for dealing with limited labeled data settings, this workshop will also focus on the theory of learning in this general setting. Although several classic techniques in statistical learning theory handle the case of few samples and high dimensions, extending these results to, for example, the recent successes of deep learning is still a challenge. How can the theory or the techniques that have gained success in deep learning be adapted to the case of limited labeled data? How can systems designed (and potentially deployed) for large-scale learning be adapted to small data settings? What are efficient and practical ways to incorporate prior knowledge? This workshop will focus on highlighting both practical and theoretical aspects of learning with limited labeled data, including but not limited to topics such as:

  • Learning from noisy labels
  • Distant or heuristic supervision
  • Non-standard labels such as feature annotations, distributions, and constraints
  • Zero-shot, one-shot, transfer, and multi-task learning
  • Data augmentation and/or the use of simulated data
  • Frameworks that can tackle both very few samples and settings with more data without extensive intervention
  • Effective and practical techniques for incorporating domain knowledge
  • Applications of machine learning for small data problems in medical imaging and industry

December 9, 2017

Location: Grand Ballroom B

890 Seats

10 Speakers


Andrew McCallum

University of Massachusetts Amherst

Tom Mitchell

Carnegie Mellon University

Gaël Varoquaux


Ian Goodfellow

Google Brain

Sebastian Riedel

University College London/Bloomsbury AI

Sameer Singh

University of California, Irvine

Nina Balcan

Carnegie Mellon University

Alan Ritter

The Ohio State University

Daniel Rubin


Matthew Lungren


Ina Fiterau


Submit a Contribution

Please format your papers using the standard NIPS 2017 style files. The page limit is 5 pages (excluding references).
Please do not include author information; submissions must be anonymous. All accepted papers will be presented as posters (poster dimensions: 36 x 48 in., 91 cm x 122 cm), with exceptional submissions also presented as oral talks.

We are pleased to announce that our sponsors, Owkin (2 awards of $500 each) and CFM (2 awards of $500 each), will provide best paper awards.

  • Submission site: https://easychair.org/conferences/?conf=lld2017
  • Style files: https://nips.cc/Conferences/2017/PaperInformation/StyleFiles
  • Submissions are reviewed through a confidential double-blind process.
  • Cross-submissions are allowed, but please clearly indicate whether the submitted work has been presented elsewhere. Accepted papers will not be archived, so submission does not preclude publication in other venues.
  • We strongly encourage at least one author per submission to attend the workshop and present in person. However, due to registration difficulties this year, submissions with no attending authors will still be considered.
  • Email the organizing chairs: lld2017[at]googlegroups[dot]com

Important dates

  • Submission deadline: November 3, 2017, 23:59 EST
  • Notification of acceptance: November 15, 2017
  • Camera-ready due: December 1, 2017
  • Workshop: December 9, 2017

Event Schedule

8:30 AM: Welcome (pdf)
8:40 AM: Invited Talk: Gaël Varoquaux, Tales from fMRI: Learning from limited labeled data (pdf)
9:10 AM: Invited Talk: Tom Mitchell, Learning from Limited Labeled Data (But a Lot of Unlabeled Data) (pdf)
9:40 AM: Contributed Talk 1: Yucen Luo, Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
9:55 AM: 1-minute poster spotlights
10:15 AM: Poster Session 1 / Coffee Break
11:00 AM: Invited Talk: Andrew McCallum, Light Supervision of Structured Prediction Energy Networks (pdf)
11:30 AM: Invited Talk: Sebastian Riedel, Forcing Neural Link Predictors to Play by the Rules (pdf)
12:00 PM: Lunch
2:00 PM: Panel: Daniel Rubin, Matt Lungren, Ina Fiterau, Limited Labeled Data in Medical Imaging (pdf)
2:30 PM: 1-minute poster spotlights
2:50 PM: Poster Session 2 / Coffee Break
3:30 PM: Invited Talk: Nina Balcan, Sample and Computationally Efficient Active Learning Algorithms (pdf)
4:00 PM: Contributed Talk 2: Maxim Grechkin, EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation
4:15 PM: Invited Talk: Sameer Singh, That Doesn't Make Sense! A Case Study in Actively Annotating Model Explanations (pdf)
4:45 PM: Invited Talk: Ian Goodfellow, Overcoming Limited Data with GANs (pdf)
5:15 PM: Contributed Talk 3: Tatjana Chavdarova (on behalf of Suraj Srinivas), Local Affine Approximators of Deep Neural Nets for Improving Knowledge Transfer
5:30 PM: Contributed Talk 4: Elaheh Raisi, Co-trained Ensemble Models for Weakly Supervised Cyberbullying Detection
5:45 PM: Invited Talk: Alan Ritter, What’s so Hard About Natural Language Understanding? (pdf)
6:15 PM: Award ceremony
6:25 PM: Closing Remarks (pdf)

Accepted papers

(*) marks papers that received a best paper award or were runner-up for one.

Poster Session 1

(*)Yucen Luo, Jun Zhu, Mengxi Li, Yong Ren and Bo Zhang. Smooth Neighbors on Teacher Graphs for Semi-supervised Learning, pdf
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov and Stefano Ermon. Structured Prediction with Adversarial Constraint Learning, pdf
Dennis Forster, Abdul-Saboor Sheikh and Jörg Lücke. Neural Simpletrons – Learning in the Limit of Few Labels with Directed Generative Networks, pdf
Nazanin Zounemat Kermani. Identifying Novel Peroxisomal Proteins by Multiple Kernel Learning (MKL) and likely Positive-Iterative Classification (LP-IC)
Raphael Gontijo Lopes, Stefano Fenu and Thad Starner. Data-Free Knowledge Distillation for Deep Neural Networks, pdf
David Inouye, Pradeep Ravikumar, Pradipto Das and Ankur Datta. Hyperparameter Selection under Localized Label Noise via Corrupt Validation, pdf
Petr Babkin, Md Faisal Chowdhury, Alfio Gliozzo, Martin Hirzel and Avraham Shinnar. Bootstrapping Chatbots for Novel Domains, pdf
Michal Derezinski and Manfred Warmuth. Subsampling for Ridge Regression via Regularized Volume Sampling, pdf
(*)Nathan Hodas, Aryk Anderson, Artem Yankov, Kyle Schaffer and Courtney Corley. Beyond Fine Tuning: Adding capacity to leverage few labels, pdf
Bin Yu, Daniel Gray, Martine De Cock and Anderson Nascimento. Training Malicious Domain Name Classifiers with Real, Heuristically Labeled Data, pdf
Han Zhao, Otilia Stretcu, Renato Negrinho, Alex Smola and Geoff Gordon. Efficient Multi-task Feature and Relationship Learning, pdf
Zhipeng Luo and Milos Hauskrecht. Active Learning of Classification Models from Soft-Labeled Groups, pdf
Jialin Song, Ravi Lanka, Albert Zhao, Yisong Yue and Masahiro Ono. Learning to Search via Self-Imitation with Application to Risk-Aware Planning, pdf
Taewan Kim and Joydeep Ghosh. Relaxed Oracles for Semi-Supervised Clustering, pdf
Zhe Cui and Shivani Rao. Video to Skill Tagging using Transcripts under Weak Supervision, pdf
Rui Shu, Hung Bui and Stefano Ermon. A DIRT-T Approach to Unsupervised Domain Adaptation, pdf
Sheng Lundquist, Melanie Mitchell and Garrett Kenyon. Sparse Coding on Stereo Video for Object Detection, pdf
Shashank Srivastava, Igor Labutov and Tom Mitchell. Learning Classifiers from Declarative Language
(*)Suraj Srinivas and Francois Fleuret. Local Affine Approximators for Improving Knowledge Transfer, pdf
Yao-Hung Hubert Tsai and Ruslan Salakhutdinov. Improving One-Shot Learning through Fusing Side Information, pdf
Steven Hickson, Anelia Angelova, Irfan Essa and Rahul Sukthankar. Object category learning and retrieval with weak supervision, pdf
Ana Marasovic and Anette Frank. SRL4ORL: Improving Opinion Role Labelling using Multi-task Learning with Semantic Role Labeling, pdf
Vikas Raunak. Simple and Effective Dimensionality Reduction for Word Embeddings, pdf
Yu-Xiong Wang, Deva Ramanan and Martial Hebert. Learning to Model the Tail, pdf

Poster Session 2

(*)Elaheh Raisi and Bert Huang. Co-trained Ensemble Models for Weakly Supervised Cyberbullying Detection, pdf
Katharina Kann, Jesus Manuel Mager Hois, Ivan Vladimir Meza Ruiz and Hinrich Schütze. Improving Neural Morphological Segmentation for Polysynthetic Minimal-Resource Languages
Cicero Nogueira Dos Santos, Kahini Wadhawan and Bowen Zhou. Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks, pdf
(*)Maxim Grechkin, Hoifung Poon and Bill Howe. EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation, pdf
Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua Tenenbaum, Hugo Larochelle and Richard Zemel. Meta-Learning for Semi-Supervised Few-Shot Classification, pdf
Pierre Courtiol, Eric Tramel, Marc Sanselme and Gilles Wainrib. Classification and Disease Localization in Histopathology Using Only Global Labels: A Weakly-Supervised Approach, pdf
Jos van der Westhuizen, Joan Lasenby, Tris Edwards, Raphael Schmetterling, Robert Tinn, Oliver Armitage and Emil Hewage. Using adversarial autoencoders to infer actions from the peripheral nervous system, pdf
Liane Canas, Benjamin Yvernault, Carole Sudre, M Jorge Cardoso, John Thornton, Frederik Barkhof, Sebastien Ourselin, Simon Mead and Marc Modat. Multikernel Gaussian Processes for patient stratification from imaging biomarkers with heterogeneous patterns, pdf
Esube Bekele and Wallace Lawson. Learning by Generation with Mental Imagery from Limited Labeled Dataset, pdf
Sabina Tomkins, Lise Getoor, Yunfei Chen and Yi Zhang. Detecting Cyber-bullying from Sparse Data and Inconsistent Labels, pdf
Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Iyad Rahwan and Nikhil Naik. Regularizing Prediction Entropy Enhances Deep Learning with Limited Data, pdf
Mayank Kejriwal, Jiayuan Ding, Runqi Shao, Anoop Kumar and Pedro Szekely. FlagIt: A System for Minimally Supervised Human Trafficking Indicator Mining, pdf
Todor Mihaylov, Zornitsa Kozareva and Anette Frank. Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension, pdf
Grigorios Chrysos, Yannis Panagakis and Stefanos Zafeiriou. Augmenting the data by linearizing the dynamics, pdf
Saki Shinoda, Daniel Worrall and Gabriel Brostow. Virtual Adversarial Ladder Networks For Semi-supervised Learning, pdf
(*)Jingyi Xu, Zilu Zhang, Tal Friedman, Yitao Liang and Guy Van den Broeck. A Semantic Loss Function for Deep Learning Under Weak Supervision, pdf
Mateusz Kozinski, Loic Simon and Frederic Jurie. An Adversarial Regularisation for Semi-Supervised Training of Structured Output Neural Networks, pdf
Manik Dhar, Aditya Grover and Stefano Ermon. Sparse-Gen: Combining Sparse Recovery and Generative Modeling for Compressed Sensing
Aniket Anand Deshmukh, Srinagesh Sharma, James Cutler and Clayton Scott. Multiclass Domain Generalization, pdf
Lijie Fan, Shengjia Zhao and Stefano Ermon. Adversarial Localization Network, pdf




  • Mathieu Andreux, ENS
  • Maruan Al-Shedivat, CMU
  • Maria Barrett, University of Copenhagen
  • Maxim Berman, KU Leuven
  • Joachim Bingel, University of Copenhagen
  • Johannes Bjerva, University of Copenhagen
  • Brian Cheung, UC Berkeley
  • Bogdan Cirstea, Télécom ParisTech
  • Christoph Dann, CMU
  • Tri Dao, Stanford
  • Laurent Dinh, Université de Montréal
  • Elvis Dohmatob, INRIA
  • Jared Dunnmon, Stanford
  • Henry Ehrenberg, Facebook
  • Michael Eickenberg, UC Berkeley
  • Georgios Exarchakis, ENS
  • Enzo Ferrante, CONICET/UNL
  • Lucie Flekova, Amazon
  • Aina Frau-Pascual, Massachusetts General Hospital
  • Ronan Fruit, INRIA
  • Aditya Grover, Stanford
  • Arthur Guillon, LIP6
  • Braden Hancock, Stanford
  • Bryan He, Stanford
  • Chin-Wei Huang, Université de Montréal
  • Jörn Jacobsen, University of Amsterdam
  • Katharina Kann, University of Munich
  • Kyle Kastner, Université de Montréal
  • Ravi Kiran, Uncanny Vision
  • Géraud Le Falher, INRIA
  • Joël Legrand, INRIA
  • Vincent Lostanlen, Cornell Lab
  • José Ignacio Orlando, UNCPBA, Argentina
  • Mohammad Pezeshki, Université de Montréal
  • Thomas Pumir, Princeton
  • Amal Rannen, KU Leuven
  • Mengye Ren, University of Toronto
  • Xiang Ren, University of Southern California
  • Stéphane Rivaud, Sony
  • Sebastian Ruder, Insight Research Centre for Data Analytics
  • Abulhair Saparov, CMU
  • Naomi Saphra, University of Edinburgh
  • Damien Scieur, INRIA
  • Daniel Selsam, Stanford
  • Konstantinos Skianis, Ecole Polytechnique
  • Louis Thiry, ENS
  • Mariya Toneva, CMU
  • Eleni Triantafillou, University of Toronto
  • Stavros Tsogkas, University of Toronto
  • Jonathan Vacher, Albert Einstein College of Medicine
  • Paroma Varma, Stanford
  • Claire Vernade, Télécom ParisTech
  • Irene Waldspurger, CNRS
  • Johannes Welbl, University College London
  • Jian Zhang, Louisiana State University
  • Sixin Zhang, ENS