LLD Workshop - NIPS 2017

Overview

Modern representation learning techniques like deep neural networks have had a major impact both within and beyond the field of machine learning, achieving new state-of-the-art performance with little or no feature engineering on a vast array of tasks. However, these gains are often difficult to translate into real-world settings because they require massive hand-labeled training sets. In the vast majority of real-world settings, collecting such training sets by hand is infeasible due to the cost of labeling data or the paucity of data in a given domain (e.g. rare diseases in medical applications). In this workshop we focus on techniques for few-sample learning and for using weaker supervision when large unlabeled datasets are available, as well as the theory associated with both.

One increasingly popular approach is to use weaker forms of supervision, i.e. supervision that is potentially noisier, biased, and/or less precise. An overarching goal of such approaches is to use domain knowledge and resources from subject matter experts, but to solicit them in higher-level, lower-fidelity, or more opportunistic ways. Examples include higher-level abstractions such as heuristic labeling rules, feature annotations, constraints, expected distributions, and generalized expectation criteria; noisier or biased labels from distant supervision, crowd workers, and weak classifiers; data augmentation strategies to express class invariances; and potentially mismatched training data such as in multitask and transfer learning settings.

Along with practical methods and techniques for dealing with limited labeled data settings, this workshop will also focus on the theory of learning in this general setting. Although several classic techniques in statistical learning theory handle the case of few samples and high dimensions, extending these results to, for example, the recent successes of deep learning remains a challenge. How can the theory or the techniques that have succeeded in deep learning be adapted to the case of limited labeled data? How can systems designed (and potentially deployed) for large-scale learning be adapted to small-data settings? What are efficient and practical ways to incorporate prior knowledge? This workshop will highlight both practical and theoretical aspects of learning with limited labeled data, including but not limited to topics such as:

Learning from noisy labels
Distant or heuristic supervision
Non-standard labels such as feature annotations, distributions, and constraints
Zero-shot, one-shot, transfer, and multi-task learning
Data augmentation and/or the use of simulated data
Frameworks that can tackle both very few samples and settings with more data without extensive intervention
Effective and practical techniques for incorporating domain knowledge
Applications of machine learning to small-data problems in medical imaging and industry

December 9 2017

Location Grand Ballroom B

890 Seats

10 Speakers

Speakers

Andrew McCallum

University of Massachusetts Amherst

Tom Mitchell

Carnegie Mellon University

Gaël Varoquaux

INRIA

Ian Goodfellow

Google Brain

Sebastian Riedel

University College London / Bloomsbury AI

Sameer Singh

University of California, Irvine

Nina Balcan

Carnegie Mellon University

Alan Ritter

The Ohio State University

Daniel Rubin

Stanford

Matthew Lungren

Stanford

Ina Fiterau

Stanford

Submit a Contribution

Please format your papers using the standard NIPS 2017 style files. The page limit is 5 pages (excluding references).
Please do not include author information; submissions must be anonymous. All accepted papers will be presented as posters (poster dimensions: 36 x 48 in., or 91 cm x 122 cm), with exceptional submissions also presented as oral talks.

We are pleased to announce that our sponsors, Owkin (2 awards of $500 each) and CFM (2 awards of $500 each), will provide best paper awards.

Submission site: https://easychair.org/conferences/?conf=lld2017
Style files: https://nips.cc/Conferences/2017/PaperInformation/StyleFiles
Submissions are reviewed through a confidential double-blind process.
Cross-submissions are allowed, but please clearly indicate if the submitted work has been presented elsewhere. Accepted papers will not be archived, so submission does not preclude publication in other venues.
We strongly encourage at least one author per submission to attend the workshop and present in person. However, due to registration difficulties this year, submissions with no attending authors will still be considered.
Email organizing chairs: lld2017[at]googlegroups[dot]com

Important dates

Submission deadline: ~~November 3, 2017, 23:59 EST~~
Notification of acceptance: ~~November 15, 2017~~
Camera-ready Due: ~~December 1, 2017~~
Workshop: December 9, 2017

Event Schedule

8:30 AM: Welcome (pdf)
8:40 AM: Invited Talk: Gaël Varoquaux, Tales from fMRI: Learning from limited labeled data (pdf)
9:10 AM: Invited Talk: Tom Mitchell, Learning from Limited Labeled Data (But a Lot of Unlabeled Data) (pdf)
9:40 AM: Contributed Talk 1: Yucen Luo, Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
9:55 AM: 1-minute poster spotlights
10:15 AM: Poster Session 1 / Coffee Break
11:00 AM: Invited Talk: Andrew McCallum, Light Supervision of Structured Prediction Energy Networks (pdf)
11:30 AM: Invited Talk: Sebastian Riedel, Forcing Neural Link Predictors to Play by the Rules (pdf)
12:00 PM: Lunch
2:00 PM: Panel: Daniel Rubin, Matt Lungren, Ina Fiterau, Limited Labeled Data in Medical Imaging (pdf)
2:30 PM: 1-minute poster spotlights
2:50 PM: Poster Session 2 / Coffee Break
3:30 PM: Invited Talk: Nina Balcan, Sample and Computationally Efficient Active Learning Algorithms (pdf)
4:00 PM: Contributed Talk 2: Maxim Grechkin, EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation
4:15 PM: Invited Talk: Sameer Singh, That Doesn't Make Sense! A Case Study in Actively Annotating Model Explanations (pdf)
4:45 PM: Invited Talk: Ian Goodfellow, Overcoming Limited Data with GANs (pdf)
5:15 PM: Contributed Talk 3: Tatjana Chavdarova (on behalf of Suraj Srinivas), Local Affine Approximators of Deep Neural Nets for Improving Knowledge Transfer
5:30 PM: Contributed Talk 4: Elaheh Raisi, Co-trained Ensemble Models for Weakly Supervised Cyberbullying Detection
5:45 PM: Invited Talk: Alan Ritter, What’s so Hard About Natural Language Understanding? (pdf)
6:15 PM: Award ceremony
6:25 PM: Closing Remarks (pdf)

Accepted papers

(*) Award! (*)Runner up for best paper awards!

Session Poster 1

(*)Yucen Luo, Jun Zhu, Mengxi Li, Yong Ren and Bo Zhang. Smooth Neighbors on Teacher Graphs for Semi-supervised Learning, pdf

Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov and Stefano Ermon. Structured Prediction with Adversarial Constraint Learning, pdf

Dennis Forster, Abdul-Saboor Sheikh and Jörg Lücke. Neural Simpletrons – Learning in the Limit of Few Labels with Directed Generative Networks, pdf

Nazanin Zounemat Kermani. Identifying Novel Peroxisomal Proteins by Multiple Kernel Learning (MKL) and likely Positive-Iterative Classification (LP-IC)

Raphael Gontijo Lopes, Stefano Fenu and Thad Starner. Data-Free Knowledge Distillation for Deep Neural Networks, pdf

David Inouye, Pradeep Ravikumar, Pradipto Das and Ankur Datta. Hyperparameter Selection under Localized Label Noise via Corrupt Validation, pdf

Petr Babkin, Md Faisal Chowdhury, Alfio Gliozzo, Martin Hirzel and Avraham Shinnar. Bootstrapping Chatbots for Novel Domains, pdf

Michal Derezinski and Manfred Warmuth. Subsampling for Ridge Regression via Regularized Volume Sampling, pdf

(*)Nathan Hodas, Aryk Anderson, Artem Yankov, Kyle Schaffer and Courtney Corley. Beyond Fine Tuning: Adding capacity to leverage few labels, pdf

Bin Yu, Daniel Gray, Martine De Cock and Anderson Nascimento. Training Malicious Domain Name Classifiers with Real, Heuristically Labeled Data, pdf

Han Zhao, Otilia Stretcu, Renato Negrinho, Alex Smola and Geoff Gordon. Efficient Multi-task Feature and Relationship Learning, pdf

Zhipeng Luo and Milos Hauskrecht. Active Learning of Classification Models from Soft-Labeled Groups, pdf

Jialin Song, Ravi Lanka, Albert Zhao, Yisong Yue and Masahiro Ono. Learning to Search via Self-Imitation with Application to Risk-Aware Planning, pdf

Taewan Kim and Joydeep Ghosh. Relaxed Oracles for Semi-Supervised Clustering, pdf

Zhe Cui and Shivani Rao. Video to Skill Tagging using Transcripts under Weak Supervision, pdf

Rui Shu, Hung Bui and Stefano Ermon. A DIRT-T Approach to Unsupervised Domain Adaptation, pdf

Sheng Lundquist, Melanie Mitchell and Garrett Kenyon. Sparse Coding on Stereo Video for Object Detection, pdf

Shashank Srivastava, Igor Labutov and Tom Mitchell. Learning Classifiers from Declarative Language

(*)Suraj Srinivas and Francois Fleuret. Local Affine Approximators for Improving Knowledge Transfer, pdf

Yao-Hung Hubert Tsai and Ruslan Salakhutdinov. Improving One-Shot Learning through Fusing Side Information, pdf

Steven Hickson, Anelia Angelova, Irfan Essa and Rahul Sukthankar. Object category learning and retrieval with weak supervision, pdf

Ana Marasovic and Anette Frank. SRL4ORL: Improving Opinion Role Labelling using Multi-task Learning with Semantic Role Labeling, pdf

Vikas Raunak. Simple and Effective Dimensionality Reduction for Word Embeddings, pdf

Yu-Xiong Wang, Deva Ramanan and Martial Hebert. Learning to Model the Tail, pdf

Session Poster 2

(*)Elaheh Raisi and Bert Huang. Co-trained Ensemble Models for Weakly Supervised Cyberbullying Detection, pdf

Katharina Kann, Jesus Manuel Mager Hois, Ivan Vladimir Meza Ruiz and Hinrich Schütze. Improving Neural Morphological Segmentation for Polysynthetic Minimal-Resource Languages

Cicero Nogueira Dos Santos, Kahini Wadhawan and Bowen Zhou. Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks, pdf

(*)Maxim Grechkin, Hoifung Poon and Bill Howe. EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation, pdf

Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua Tenenbaum, Hugo Larochelle and Richard Zemel. Meta-Learning for Semi-Supervised Few-Shot Classification, pdf

Pierre Courtiol, Eric Tramel, Marc Sanselme and Gilles Wainrib. Classification and Disease Localization in Histopathology Using Only Global Labels: A Weakly-Supervised Approach, pdf

Jos van der Westhuizen, Joan Lasenby, Tris Edwards, Raphael Schmetterling, Robert Tinn, Oliver Armitage and Emil Hewage. Using adversarial autoencoders to infer actions from the peripheral nervous system, pdf

Liane Canas, Benjamin Yvernault, Carole Sudre, M Jorge Cardoso, John Thornton, Frederik Barkhof, Sebastien Ourselin, Simon Mead and Marc Modat. Multikernel Gaussian Processes for patient stratification from imaging biomarkers with heterogeneous patterns, pdf

Esube Bekele and Wallace Lawson. Learning by Generation with Mental Imagery from Limited Labeled Dataset, pdf

Sabina Tomkins, Lise Getoor, Yunfei Chen and Yi Zhang. Detecting Cyber-bullying from Sparse Data and Inconsistent Labels, pdf

Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Iyad Rahwan and Nikhil Naik. Regularizing Prediction Entropy Enhances Deep Learning with Limited Data, pdf

Mayank Kejriwal, Jiayuan Ding, Runqi Shao, Anoop Kumar and Pedro Szekely. FlagIt: A System for Minimally Supervised Human Trafficking Indicator Mining, pdf

Todor Mihaylov, Zornitsa Kozareva and Anette Frank. Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension, pdf

Grigorios Chrysos, Yannis Panagakis and Stefanos Zafeiriou. Augmenting the data by linearizing the dynamics, pdf

Saki Shinoda, Daniel Worrall and Gabriel Brostow. Virtual Adversarial Ladder Networks For Semi-supervised Learning, pdf

(*)Jingyi Xu, Zilu Zhang, Tal Friedman, Yitao Liang and Guy Van den Broeck. A Semantic Loss Function for Deep Learning Under Weak Supervision, pdf

Mateusz Kozinski, Loic Simon and Frederic Jurie. An Adversarial Regularisation for Semi-Supervised Training of Structured Output Neural Networks, pdf

Manik Dhar, Aditya Grover and Stefano Ermon. Sparse-Gen: Combining Sparse Recovery and Generative Modeling for Compressed Sensing

Aniket Anand Deshmukh, Srinagesh Sharma, James Cutler and Clayton Scott. Multiclass Domain Generalization, pdf

Lijie Fan, Shengjia Zhao and Stefano Ermon. Adversarial Localization Network, pdf

People

Organizers

Reviewers

Mathieu Andreux, ENS
Maruan Al-Shedivat, CMU
Maria Barrett, University of Copenhagen
Maxim Berman, KU Leuven
Joachim Bingel, University of Copenhagen
Johannes Bjerva, University of Copenhagen
Brian Cheung, UC Berkeley
Bogdan Cirstea, Télécom ParisTech
Christoph Dann, CMU
Tri Dao, Stanford
Laurent Dinh, Université de Montréal
Elvis Dohmatob, INRIA
Jared Dunnmon, Stanford
Henry Ehrenberg, Facebook
Michael Eickenberg, UC Berkeley
Georgios Exarchakis, ENS
Enzo Ferrante, CONICET/UNL
Lucie Flekova, Amazon
Aina Frau-Pascual, Massachusetts General Hospital
Ronan Fruit, INRIA
Aditya Grover, Stanford
Arthur Guillon, LIP6
Braden Hancock, Stanford
Bryan He, Stanford
Chin-Wei Huang, Université de Montréal
Jörn Jacobsen, University of Amsterdam
Katharina Kann, University of Munich
Kyle Kastner, Université de Montréal
Ravi Kiran, Uncanny Vision
Géraud Le Falher, INRIA
Joël Legrand, INRIA
Vincent Lostanlen, Cornell Lab
José Ignacio Orlando, UNCPBA, Argentina
Mohammad Pezeshki, Université de Montréal
Thomas Pumir, Princeton
Amal Rannen, KU Leuven
Mengye Ren, University of Toronto
Xiang Ren, University of Southern California
Stéphane Rivaud, Sony
Sebastian Ruder, Insight Research Centre for Data Analytics
Abulhair Saparov, CMU
Naomi Saphra, University of Edinburgh
Damien Scieur, INRIA
Daniel Selsam, Stanford
Konstantinos Skianis, Ecole Polytechnique
Louis Thiry, ENS
Mariya Toneva, CMU
Eleni Triantafillou, University of Toronto
Stavros Tsogkas, University of Toronto
Jonathan Vacher, Albert Einstein College of Medicine
Paroma Varma, Stanford
Claire Vernade, Télécom ParisTech
Irene Waldspurger, CNRS
Johannes Welbl, University College London
Jian Zhang, Louisiana State University
Sixin Zhang, ENS

Sat Dec 9th, 8.30AM-6.30PM

Learning with Limited Labeled Data:
Weak Supervision and Beyond

This webpage is outdated! Please check http://lld-workshop.github.io to reach the new website!
