Welcome to Ahmad Humayun's webpage.

I work in the AI for Health team at Google DeepMind. The team develops multi-modal techniques spanning radiology, pathology, genomics, and transcriptomics. Our goal is to help clinicians make better decisions for their patients, and to enable scientists to discover promising medicines.

Prior to this, I led the Neural Networks Vision group at Vicarious. We developed visual recognition algorithms for kitting, packaging and bin picking robotic applications.

I completed my Ph.D. in computer vision under Jim Rehg at the School of Interactive Computing, Georgia Tech, in 2018. My thesis explored recognition problems in videos where motion and sparse labeling can be used to build a life-long object learning system. Before Georgia Tech, I received my Master's degree in CG, Vision & Imaging from UCL in late 2010. There, I worked with Gabriel Brostow (aka Gabe) on detecting regions of occlusion in consecutive video frames.

I also did a brief stint at The University of Warwick with Nasir Rajpoot, developing registration and dimensionality reduction techniques for cancerous tissue examined under the Toponome Imaging System. Previously, I was stationed at LUMS SSE, where I had the honor of working with Sohaib Khan. During my three-year stay, I collaborated with biologists at LUMS SSE and MRC NIMR on developing tracking techniques for fluorescence microscopy, with some interludes in Systems research with Umar Saif.

Résumé   (last updated: September 2024)
  • PhD Thesis
    "Detection and Incremental Object Learning in Videos"
    Ph.D. Dissertation, School of Interactive Computing, Georgia Tech, April 2018.
    [PDF], [BibTeX]
  • MS Thesis
    "Learning Occlusion Regions"
    M.S. Dissertation, Computer Graphics, Vision and Imaging, University College London, Sept. 2010.
    Microsoft Research Project Prize
    [PDF], [BibTeX]

Publications    [Google Scholar | MS Academic]
  • "Incremental Object Learning From Contiguous Views" (new)
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
    [Webpage + Data + Code], [Talk], [Paper], [BibTeX]
  • "Iterative Machine Teaching"
    International Conference on Machine Learning (ICML), August 2017.
    [Code], [Demo], [Talk], [Slides], [PDF], [BibTeX]
  • "The Middle Child Problem: Revisiting Parametric Min-cut and Seeds for Object Proposals"
    IEEE International Conference on Computer Vision (ICCV), Dec. 2015.
    [Webpage + Code], [PDF], [BibTeX]
  • "Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context"
    IEEE Winter Conference on Applications of Computer Vision (WACV), Jan. 2015.
    [Webpage + Dataset], [PDF], [BibTeX]
  • "RIGOR: Reusing Inference in Graph Cuts for generating Object Regions"
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
    [Webpage + Code], [PDF], [BibTeX]
  • "Video Segmentation by Tracking Many Figure-Ground Segments"
    IEEE International Conference on Computer Vision (ICCV), Dec. 2013.
    [Webpage + Code], [Dataset], [PDF], [BibTeX]
  • "Learning a Confidence Measure for Optical Flow"
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), May 2013.
    [Webpage + Code], [PDF], [BibTeX]
  • "A Novel Paradigm for Mining Cell Phenotypes in Multi-Tag Bioimages using a Locality Preserving Nonlinear Embedding"
    International Conference on Neural Information Processing (ICONIP), Nov. 2012.
    [Preprint PDF], [BibTeX]
  • "RAMTaB: Robust Alignment of Multi-Tag Bioimages"
    PLoS ONE, Feb. 2012.
    [link], [BibTeX]
    This journal is ranked 1st by Eigenfactor across all scientific proceedings
  • "A Framework for Molecular Co-Expression Pattern Analysis in Multi-Channel Toponome Fluorescence Images"
    Microscopy Image Analysis with Apps. in Biology (MIAAB), Sept. 2011.
    [PDF], [BibTeX]
  • "Towards Protein Network Analysis Using TIS Imaging and Exploratory Data Analysis"
    Workshop on Computational Systems Biology (WCSB), June 2011.
    [Preprint PDF], [BibTeX]
  • "Learning to Find Occlusion Regions"
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.
    [Webpage + Code], [PDF], [BibTeX]
  • "Myosin motors drive long-range alignment of actin filaments"
    Journal of Biological Chemistry, Feb. 2010.
    [link], [BibTeX]
    This journal is ranked 12th by Eigenfactor across all scientific proceedings
Notes
  • Machine Learning
    Notes I took while preparing for my Ph.D. qualifying exam. References from different books: Kevin Murphy's ML, Simon Prince's CV Models, Tom Mitchell's ML, and Russell and Norvig's AI.
    Topics: Probability, Generative and Discriminative Models, GLMs, Exponential Family, Directed/Undirected Graphical Models, EM, Gaussian Processes, Variational Inference, Reinforcement Learning, etc.
  • Computer Vision
    Notes I took during Simon Prince's Computer Vision class at UCL.
    Topics: Single/Multi Camera Geometry, Overview of Probability, Learning, and Inference, Single Pixel Inference, MRFs, MCMC, and Graph Cuts.
  • Linear Algebra (hand-written)
    Notes I took while listening to Gilbert Strang's Linear Algebra lectures.
    Topics: Factorization, Subspaces, Projections, Orthogonal Matrices, Determinants, Eigenvectors, FFT, SVD.
  • Optimization
    Notes I took while taking the Numerical Optimization course at UCL. References from Chong and Zak's Introduction to Optimization.
    Topics: Basics of Unconstrained, Multivariate, and Constrained Optimization, and Regularization.
  • Computer Graphics
    Notes I took during Jan Kautz' CG class at UCL (GV10).
    Topics: Reflection, Illumination, View Cameras, Representations, Clipping, Rasterization, Texture Mapping, Radiosity, Shadows, Culling.
  • Computational Photography
    Notes I took during Gabe Brostow's Computational Photography and Capture class at UCL.
    Topics: Cameras and Aberrations, Compositing, Blending, Time-lapse, Morphing, HDR, Bilateral Filter, Texture Synthesis, Deblurring, Deconvolution, Intrinsic Images.
  • Image Processing
    Notes I took during Gabe Brostow/Simon Prince's Image Processing class at UCL (GV12).
    Topics: Segmentation, Transformations, Morphological Operators, Image Filtering, Edge Detection, Corner/Feature Detection, Matching and Flow.