29 October 2020
Prof. Michal Irani
Weizmann Institute of Science
Michal Irani is a Professor at the Weizmann Institute of Science, in the Department of CS and Applied Mathematics. She received her PhD from the Hebrew University (1994), and joined the Weizmann Institute in 1997. Her research interests center around Computer-Vision, Image-Processing, AI and Video information analysis. Michal's recent prizes and honors include the Maria Petrou Prize (2016), the Helmholtz “Test of Time Award” (2017), the Landau Prize for Arts & Sciences (2019), and the Rothschild Prize (2020). She also received the ECCV Best Paper Award in 2000 and in 2002, and was awarded the Honorable Mention for the Marr Prize in 2001 and in 2005.
I will show how complex visual inference can be performed with Deep-Learning, in a totally unsupervised way, by training on a single image -- the test image itself. The strong recurrence of information inside a single image provides powerful internal examples, which suffice for self-supervision of CNNs, without any prior examples or training data. This gives rise to true “Zero-Shot Learning”. I will show the power of this approach on a variety of problems, including super-resolution, segmentation, transparency separation, dehazing, image-retargeting, and more.
I will further show how self-supervision can be used for “Mind-Reading” (reconstructing images from fMRI brain recordings), despite having only little training data.
Prof. Shai Shalev-Shwartz
Chief Technology Officer, Mobileye
Senior Fellow, Intel Corporation
Professor at the Rachel and Selim Benin School of Computer Science and Engineering at the Hebrew University of Jerusalem
Shai Shalev-Shwartz is the CTO of Mobileye and a Senior Fellow at Intel.
Professor Shalev-Shwartz holds a professorship at the Rachel and Selim Benin School of Computer Science and Engineering at the Hebrew University of Jerusalem. Before joining Hebrew University, Prof. Shalev-Shwartz was a research assistant professor at Toyota Technological Institute in Chicago, and also worked at Google and IBM Research. Prof. Shalev-Shwartz is the author of the book “Online Learning and Online Convex Optimization,” and a co-author of the book “Understanding Machine Learning: From Theory to Algorithms.” Prof. Shalev-Shwartz has written more than 100 research papers, focusing on machine learning, online prediction, optimization techniques, and practical algorithms.
Humans can drive a car using a vision-only system, without relying on 3D sensors at all, and achieve remarkably high accuracy. Can we match this ability using computer vision? The talk will focus on some of the challenges, including machine learning with extremely high accuracy, lifting a 2D projection back to the 3D world, and developing decision-making algorithms that are robust to sensing errors.
Prof. Amir Globerson
The Blavatnik School of Computer Science
Tel Aviv University
Amir Globerson holds a BSc in computer science and physics, and a PhD in computational neuroscience. After his PhD, he was a postdoctoral fellow at the University of Toronto and a postdoctoral fellow at MIT. His research interests include machine learning, deep learning, graphical models, optimization, machine vision, and natural language processing. His work has received several prizes including five paper awards at NeurIPS, ICML and UAI. In 2019, he received the ERC Consolidator Grant.
Scene graphs are detailed semantic descriptions of images. In this talk I will describe methods for annotating images with scene graphs, learning how to annotate from weak supervision, and generating images from scene graphs. In particular, I will discuss questions of representation invariance in these architectures.
Prof. Nadav Cohen
Asst. Professor of Computer Science at Tel Aviv University
Chief Scientist at Imubit
Deep learning has experienced unprecedented success in recent years, delivering state-of-the-art performance in numerous application domains. However, despite its extreme popularity and the vast attention it is receiving, this technology suffers from various limitations (in terms of stability, reliability, explainability and more) that hinder its proliferation. In this talk I will argue that theoretical analyses of deep learning may assist in addressing such limitations, by providing principled tools for neural architecture and optimization algorithm design. Two examples will be given: (i) application of tensor analysis and quantum mechanics for configuring the architecture of a convolutional neural network; and (ii) dynamical analysis of gradient descent over linear neural networks for enhancing convergence and generalization properties.
VP of Algorithms
Nathaniel Bubis is VP of Algorithms at Healthy.io, leading the development of advanced computer vision algorithms across the company's research teams. These algorithms enable Healthy.io's vision of turning smartphone cameras into medical devices, allowing for early detection of chronic kidney disease, faster treatment for urinary tract infections and digitizing chronic wound management. Prior to joining Healthy.io, Nathaniel held roles at a number of successful start-ups before leading computer vision teams at Amazon's Lab126. He holds an M.Sc in Physics from Tel Aviv University.
In this talk we'll discuss the opportunities and challenges involved in taking smartphone based medical diagnostics to the next dimension, by using the camera to create accurate 3D models. These 3D models not only allow for accurate physical measurements, but also open up new opportunities to provide practitioners with novel clinical viewpoints, while utilizing recent advancements in the quickly growing field of 3D machine learning.
Imperial College London
Michael Bronstein is a professor at Imperial College London, where he holds the Chair in Machine Learning and Pattern Recognition, and Head of Graph Learning Research at Twitter. He also heads ML research in Project CETI, a TED Audacious Prize winning collaboration aimed at understanding the communication of sperm whales. Michael received his PhD from the Technion in 2007. He has held visiting appointments at Stanford, MIT, Harvard, and Tel Aviv University, and has also been affiliated with three Institutes for Advanced Study (at TU Munich as a Rudolf Diesel Fellow (2017-2019), at Harvard as a Radcliffe fellow (2017-2018), and at Princeton as visitor (2020)). Michael is the recipient of five ERC grants, Member of the Academia Europaea, Fellow of IEEE, IAPR, and ELLIS, ACM Distinguished Speaker, and World Economic Forum Young Scientist. In addition to his academic career, Michael is a serial entrepreneur and founder of multiple startup companies, including Novafora, Invision (acquired by Intel in 2012), Videocites, and Fabula AI (acquired by Twitter in 2019). He has previously served as Principal Engineer at Intel Perceptual Computing and was one of the key developers of the Intel RealSense technology.
Geometric deep learning has recently become one of the hottest topics in machine learning, with its particular instance, graph neural networks, being used in a broad spectrum of applications ranging from 3D computer vision and graphics to high energy physics and drug design. Despite the promise and a series of success stories of geometric deep learning methods, we have not witnessed so far anything close to the smashing success convolutional networks have had in computer vision. In this talk, I will outline my views on the possible reasons and how the field could progress in the next few years.
Senior Research Scientist
Tali Dekel is currently a Senior Research Scientist at Google, Cambridge MA, developing algorithms at the intersection of computer vision, computer graphics, and machine learning. She will join the Mathematics and Computer Science Department at the Weizmann Institute, Israel, as a faculty member (Assistant Professor) in 2021. Before Google, she was a Postdoctoral Associate at the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT. Tali completed her PhD studies at the School of Electrical Engineering, Tel-Aviv University, Israel. Her research interests include computational photography, image/video synthesis, geometry and 3D reconstruction. Her awards and honors include the National Postdoctoral Award for Advancing Women in Science (2014), the Rothschild Postdoctoral Fellowship (2015), the SAMSON - Prime Minister's Researcher Recruitment Prize (2019), Best Paper Honorable Mention in CVPR 2019, and Best Paper Award (Marr Prize) in ICCV 2019. She served as workshop co-chair for CVPR 2020.
Dr. Matan Protter
Head of eXtended Reality (XR) efforts
Alibaba DAMO Israel Lab
Matan is leading the eXtended Reality (XR) efforts in Alibaba DAMO Israel Lab. He was previously the CTO and co-founder of Infinity Augmented Reality, which developed AR glasses and was acquired by Alibaba in 2019. He has been working in various computer vision fields for over 15 years. Matan holds a PhD (direct program) in Computer Science from the Technion (2010) and is an alumnus of the Talpiot program.
Gone are the days when we researchers would spend years specializing in only one computer vision field. In this talk, we will show how we are combining the gamut of CV tasks, from classification and segmentation to 3D and GANs, using data (both real and synthetic) to solve real-world e-commerce challenges at Alibaba’s scale. As an example, we will detail how we effectively train, combine and deploy all of these varied tasks in the context of a Home Decor e-retail project.
Prof. Tal Arbel
Tal Arbel is a Professor in the Department of Electrical and Computer Engineering, where she is the Director of the Probabilistic Vision Group and Medical Imaging Lab in the Centre for Intelligent Machines, McGill University. She is also an elected Associate Member of MILA (Montreal Institute for Learning Algorithms) and the Goodman Cancer Research Centre. Prof. Arbel’s research focuses on development of probabilistic machine learning methods in computer vision and medical image analysis, with a wide range of applications in neurology and neurosurgery. Her recent awards include receiving a Canada CIFAR AI Chair (2019), and the 2019 McGill Engineering Christophe Pierre Research Award. She regularly serves on the organizing team of major international conferences in both fields (e.g. MICCAI, MIDL, ICCV, CVPR). She is currently an Associate Editor (AE) for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and is the Editor-in-Chief of a newly launched arXiv overlay journal: Machine Learning for Biomedical Imaging (MELBA).
Although deep learning (DL) models have been shown to outperform other frameworks for a variety of medical contexts, inference in the presence of pathology in medical images presents challenges to popular networks. Errors in deterministic outputs lead to distrust by clinicians and hinder the adoption of DL methods in the clinic. Moreover, given that medical image analysis typically requires a sequence of inference tasks to be performed, this results in an accumulation of errors over the sequence of outputs. This talk will describe recent work exploring (MC-dropout) measures of uncertainty in DL lesion and tumour detection and segmentation models in patient images and illustrate how propagating uncertainties across cascaded medical imaging tasks can improve DL inference. The models are successfully applied to large-scale, multi-scanner, multi-center clinical trial datasets of patients with Multiple Sclerosis and to the MICCAI BraTS brain tumour segmentation challenge datasets. Finally, current work on prediction of future lesion activity and disease progression based on baseline MRI will be briefly described.
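The MC-dropout idea mentioned in the abstract above can be illustrated with a minimal sketch: keep dropout active at inference, run several stochastic forward passes, and read the spread of the outputs as an uncertainty estimate. The toy logistic unit, the function names, and the parameter values below are all hypothetical and chosen only for illustration; they are not the models described in the talk.

```python
import math
import random


def stochastic_forward(x, weights, bias, drop_prob, rng):
    """One forward pass of a toy logistic unit with dropout kept active."""
    keep = 1.0 - drop_prob
    z = bias
    for xi, wi in zip(x, weights):
        if rng.random() < keep:        # keep this input with probability `keep`
            z += (xi / keep) * wi      # inverted-dropout scaling
    return 1.0 / (1.0 + math.exp(-z))  # predicted probability for this pass


def mc_dropout(x, weights, bias, drop_prob=0.5, passes=200, seed=0):
    """Return (mean prediction, variance across passes) for one input."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, weights, bias, drop_prob, rng)
               for _ in range(passes)]
    mean = sum(samples) / passes
    var = sum((s - mean) ** 2 for s in samples) / passes
    return mean, var


# Hypothetical feature vector and weights for a single voxel.
mean, var = mc_dropout([1.0, 2.0, -1.0], [0.8, -0.5, 1.2], bias=0.1)
print(f"mean probability {mean:.3f}, variance {var:.4f}")
```

A downstream task in a cascade could, under this sketch, receive both the mean and the variance rather than a single deterministic output.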
As deep learning is showing potential value in different markets, there is an increasing need to be able to run inference efficiently on edge devices.
In this talk we will focus on the fundamental characteristics of deep learning algorithms, analyze the challenges they introduce to the classical 60-year-old von Neumann processing approach and review the guidelines for building more efficient domain-specific processing architectures.
We will begin with some theoretical reasoning behind domain-specific architectures and their implementation in the field of deep learning, and more specifically for machine vision applications. We will then use various quantitative measures and more detailed design examples to link theory and practice.
Hailo has developed a specialized deep learning processor that delivers the performance of a data center-class computer to edge devices. Hailo’s AI microprocessor is the product of a rethinking of traditional computer architectures, enabling smart devices to perform sophisticated deep learning tasks such as imagery and sensory processing in real time with minimal power consumption, size and cost.
Yuval is a postdoctoral researcher working with Prof. Tomer Michaeli at the Technion. His research focuses on the intersection of computer vision and audio processing with machine learning. He completed his PhD at the Weizmann Institute of Science, where his advisor was Prof. Michal Irani. Previously, he completed his M.Sc. at the Technion, where he was advised by Prof. Yoav Y. Schechner.
Single image super resolution (SR) has seen major performance leaps in recent years. However, existing methods do not allow exploring the infinitely many plausible reconstructions that might have given rise to the observed low-resolution (LR) image. These different explanations to the LR image may dramatically vary in their textures and fine details, and may often encode completely different semantic information. In this work, we introduce the task of explorable super resolution. We propose a framework comprising a graphical user interface with a neural network backend, allowing editing the SR output so as to explore the abundance of plausible HR explanations to the LR input. At the heart of our method is a novel module that can wrap any existing SR network, analytically guaranteeing that its SR outputs would precisely match the LR input, when downsampled. Besides its importance in our setting, this module is guaranteed to decrease the reconstruction error of any SR network it wraps, and can be used to cope with blur kernels that are different from the one the network was trained for. We illustrate our approach in a variety of use cases, ranging from medical imaging and forensics, to graphics.
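To make the consistency guarantee in the abstract above concrete, here is a minimal 1D sketch. It assumes a box-filter (block averaging) downsampling kernel, for which the orthogonal projection onto the set of signals consistent with the LR input reduces to adding back the replicated LR residual. The function names are hypothetical and this is not the authors' module, which handles general kernels and wraps a neural network.

```python
def downsample(x, s):
    """Box-filter downsampling: average each run of `s` consecutive samples.

    Assumes len(x) is a multiple of s.
    """
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def enforce_consistency(x_sr, y_lr, s):
    """Orthogonally project x_sr onto {x : downsample(x, s) == y_lr}."""
    residual = [y - d for y, d in zip(y_lr, downsample(x_sr, s))]
    # For a box filter H, H^T (H H^T)^{-1} replicates each residual value s times.
    return [v + residual[i // s] for i, v in enumerate(x_sr)]


def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


x_true = [1.0, 3.0, 2.0, 6.0]           # unknown HR signal
y_lr = downsample(x_true, 2)            # observed LR signal: [2.0, 4.0]
x_sr = [0.0, 2.0, 5.0, 5.0]             # output of some SR method
x_fixed = enforce_consistency(x_sr, y_lr, 2)

print(x_fixed)                          # [1.0, 3.0, 4.0, 4.0]
print(downsample(x_fixed, 2))           # [2.0, 4.0], matches y_lr
print(squared_error(x_sr, x_true), squared_error(x_fixed, x_true))  # 12.0 8.0
```

Because the true HR signal lies in the consistent set, projecting onto that set cannot move the estimate further from it, which is why the error drops from 12.0 to 8.0 in this example.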
Senior Algorithm Researcher
Alibaba DAMO Israel Lab
Tal Ridnik is a Senior Algorithm Engineer and Researcher at the AutoML team in Alibaba DAMO Israel Lab. His research focuses on automating the process of training high-quality and efficient neural network models, making deep learning accessible to developers with limited machine learning expertise. Tal completed his B.Sc in physics and electrical engineering at the Technion, as part of the "Psagot" program, and his M.Sc in physics at Ben-Gurion University.
Alibaba DAMO Academy Israel
Hussam is an Algorithm Engineer in Alibaba DAMO Israel Lab. Hussam enjoys doing applied research on pose estimation, person re-identification, and image classification. Prior to Alibaba, Hussam worked in several companies in the retail business as a senior Android developer.
Hussam completed his B.Sc in tandem with high school studies as part of the "Etgar" program at the University of Haifa.
Content-based image retrieval (CBIR) is an important vision problem and significant progress has been made thanks to deep learning. One of the most popular applications of CBIR is visual product search, which has recently gained popularity among leading e-commerce vendors. Visual product search enables a more convenient interaction for the consumer as well as a more fine-grained description of intent than text. The product matching can be further improved by incorporating the user’s feedback on the search query, in the form of relevance or of relative or absolute attributes.
Cross-modal image retrieval allows incorporating different types of queries and user feedback into the visual search, such as text-to-image retrieval or retrieval from a combination of text and image. A deep learning approach for learning the joint embeddings of images and text has shown impressive results in addressing this scenario.
In this talk, we will present the latest trends for product visual search including the multi-modal scenarios and will provide some hands-on tips for effective results.
Nathan is an Algorithm Engineer at DataGen Technologies.
His research focuses on creating high quality simulated data for computer vision applications such as pose estimation.
Nathan previously worked at Intel as a Computer Vision Engineer and graduated Summa Cum Laude from Imperial College London with an MEng in Electrical Engineering and a thesis on Action Recognition.
In the computer vision industry, gathering and manually annotating data is the most substantial bottleneck in the development of deep learning solutions. A promising solution is to generate data through 3D simulations as they provide perfect annotations and densely sample edge cases that real datasets fail to capture. Yet, a known shortcoming of this method is the domain gap between the simulated and real-world domains. We show it can be overcome through the combined use of Photorealistic Simulation and Domain Adaptation. To validate our claim with a case study, we generated simulated datasets that achieve state-of-the-art performance for 2D hand-joint estimation. In this talk, we will present this methodology as a base for solving practical computer vision challenges in a wide range of domains.
Principal Data Scientist
Pavel Levin is a Principal Data Scientist at Booking.com, one of the world's leading digital travel platforms. Over the past five years with the company, he has worked on a number of important AI products, including the Booking Assistant (a customer service chatbot), an in-house machine translation engine, various recommendation and personalization applications and computer vision projects to create an even smoother, more insightful and more relevant experience on Booking.com. Trained as an applied mathematician, he has a keen interest in all applied aspects of statistical models, learning algorithms and data science in general.
In today's increasingly visual world of e-commerce, products are often accompanied by photo galleries describing various product aspects. We are going to dive deep into the travel accommodations use case and discuss a deep learning-based solution to the problem of finding meaningful representations of hotel galleries in a large-scale e-commerce setting. The universality of embeddings and their flexibility to new downstream tasks is achieved through training the gallery encoder on multiple independent tasks using a multi-task learning (MTL) approach. To evaluate the role of MTL in gallery encoding, we look at how the performance of the joint MTL-trained model on each task compares to that of separately trained end-to-end models. To assess the quality of learned representations we mainly look at their performance in downstream applications.
Weizmann Institute of Science
This paper is my MSc thesis at the Weizmann Institute of Science; it was accepted for an oral presentation at NeurIPS 2019. I am currently working as a deep learning research engineer at Deci.ai.
Super-resolution (SR) methods typically assume that the low-resolution (LR) image was downscaled from the unknown high-resolution (HR) image by a fixed 'ideal' downscaling kernel (e.g. Bicubic downscaling). However, this is rarely the case in real LR images, in contrast to synthetically generated SR datasets. When the assumed downscaling kernel deviates from the true one, the performance of SR methods significantly deteriorates. This gave rise to Blind-SR - namely, SR when the downscaling kernel ("SR-kernel") is unknown. It was further shown that the true SR-kernel is the one that maximizes the recurrence of patches across scales of the LR image. In this paper we show how this powerful cross-scale recurrence property can be realized using Deep Internal Learning. We introduce "KernelGAN", an image-specific Internal-GAN, which trains solely on the LR test image at test time, and learns its internal distribution of patches. Its Generator is trained to produce a downscaled version of the LR test image, such that its Discriminator cannot distinguish between the patch distribution of the downscaled image, and the patch distribution of the original LR image. The Generator, once trained, constitutes the downscaling operation with the correct image-specific SR-kernel. KernelGAN is fully unsupervised, requires no training data other than the input image itself, and leads to state-of-the-art results in Blind-SR when plugged into existing SR algorithms.
Irit Chelly is a Computer Science PhD student at Ben-Gurion University under the supervision of Dr. Oren Freifeld at the Vision, Inference, and Learning group. Her current research focuses on unsupervised learning and video analysis. She is interested in probabilistic graphical models, spatial transformations, dimensionality reduction, and deep learning. Irit won the national-level Aloni PhD scholarship from Israel's Ministry of Technology and Science as well as the BGU Hi-tech scholarship for excellent PhD students.
Background models are widely used in computer vision. While successful Static-camera Background (SCB) models exist, Moving-camera Background (MCB) models are limited. Seemingly, there is a straightforward solution: 1) align the video frames; 2) learn an SCB model; 3) warp either original or previously-unseen frames toward the model. This approach, however, has drawbacks, especially when the accumulative camera motion is large and/or the video is long. Here we propose a purely-2D unsupervised modular method that systematically eliminates those issues. First, to estimate warps in the original video, we solve a joint-alignment problem while leveraging a certifiably-correct initialization. Next, we learn both multiple partially-overlapping local subspaces and how to predict alignments. Lastly, at test time, we warp a previously-unseen frame, based on the prediction, and project it on a subset of those subspaces to obtain a background/foreground separation. We show the method handles even large scenes with a relatively-free camera motion (provided the camera-to-scene distance does not change much) and that it not only yields State-of-the-Art results on the original video but also generalizes gracefully to previously-unseen videos of the same scene. The talk is based on [Chelly et al., CVPR '20]. This is joint work with Vlad Winter, Dor Litvak, Oren Freifeld (all from BGU CS) and David Rosen (MIT).
Dr. Aviv Zeevi Balasiano
VP and Head of the Division - Technology Infrastructure
Israeli Innovation Authority
Dr. Aviv Zeevi Balasiano is a VP and Head of the Technology Infrastructure Division at the Israeli Innovation Authority. Until two years ago, Dr. Balasiano served as the head of the ICT department in the Israeli Directorate for EU FP, a government agency that aims to promote joint Israeli-EU R&D ventures within the EU’s R&D Framework Program. He has a PhD in Information Systems from Tel Aviv University. His research field involves estimating the value of information of R&D. Aviv has also taken part in an international research effort to define the productivity of ICT in the Era of Cyberspace, Internet, Open Information and Shared Knowledge, in cooperation with Stevens Institute of Technology. He holds degrees in Economics and Political Science.
Dr. Balasiano served for 5 years as an Artillery Officer in the IDF, where he received the General IDF Commander's honor, followed by 16 years in the IT industry, mainly in software development and simulation.
Complex calculations in the field of artificial intelligence require a great deal of computational power that can also handle very large volumes of information. In fact, the need for artificial intelligence computing and the need to solve increasingly complex computing problems are consistently pushing the computing market forward, including a move to GPU processing units and the design of new components specifically tailored for artificial intelligence computing. For Israel, the ability to stay up-to-date and relevant is required, along with the ability to research and innovate independently. It is important to emphasize that the needs lie not only in computing power itself, but also in storage, communication, support and more.
When we come to define the infrastructure needs for artificial intelligence uses, we need to ask two questions: The first is who our users are. The second is what their needs are regarding the following topics: access to shared information, cost savings, ability to solve large-scale problems, classification constraints, confidentiality and security, performing innovative hardware and software testing, community support, education and training.
Student Researcher, IBM
MSc Student, Tel Aviv University
Sivan Doveh is a student researcher at the Computer Vision and Augmented Reality (CVAR) group at IBM Research AI.
She also completed an MSc at Tel Aviv University under the supervision of Raja Giryes. Her research is focused on meta-learning.
Network architecture search (NAS) achieves state-of-the-art results in various tasks such as classification and semantic segmentation. Recently, a reinforcement learning-based approach has been proposed for Generative Adversarial Networks (GANs) search. In this work, we propose an alternative strategy for GAN search by using a proxy task instead of common GAN training. Our method is called DEGAS (Differentiable Efficient GenerAtor Search), which focuses on efficiently finding the generator in the GAN. Our search algorithm is inspired by the differentiable architecture search strategy and the Global Latent Optimization (GLO) procedure. This leads to a GAN search that is both efficient and stable. After the generator architecture is found, it can be plugged into any existing framework for GAN training.
ProVision Algorithm Manager
I hold a B.Sc. in Electronic and Computer Engineering from Ben-Gurion University and an M.Sc. in Electronic and Computer Engineering from Tel Aviv University, with a specialization in signal and image processing. In recent years I have managed an algorithm group at Applied Materials. We develop innovative metrology methods on SEM images for the semiconductor industry. We specialize in Computer Vision, Machine Learning and Deep Learning, mainly in the fields of detection, background subtraction and segmentation.
Many real-world applications suffer from a lack of ground truth. We propose an innovative end-to-end network for zero-shot or few-shot segmentation.
We will show an innovative visual intuition that makes triplet-loss post processing redundant and enables end-to-end networks for many applications.
In addition, our network has the advantage of dealing with noisy labeling, by letting the network optimize accuracy without compromising consistency.
Oshri Halimi is a Ph.D. student in the electrical engineering faculty at Technion, supervised by Prof. Ron Kimmel.
Her research investigates geometric invariants and their application in computer vision and shape analysis. In particular, she is interested in the interface between geometry and deep learning.
She published in top-tier conferences for computer vision (CVPR, ECCV) and organized workshops in the field: "iGDL 2020: Israeli Geometric Deep Learning Workshop" and "Learning and Processing of Geometric Visual Structures," SIAM Conference on Imaging Science (SIAM-IS20). She was awarded the Israel Ministry of Science Jabotinsky Fellowship for Doctoral Students.
She holds a B.Sc in physics and electrical engineering from the Technion, from which she graduated cum laude. She is an alumna of the Technion Excellence Program and the Archimedes Program, and a bronze medalist in the IChO. She served in Unit 8200.
We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes.
Key to our model is the understanding that natural deformations, such as changes in pose, approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it by a purely geometric criterion. The resulting learning model is class-agnostic, and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize in the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where the proposed method outperforms other methods in terms of accuracy, generalization, and efficiency.
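The geometric criterion described in the abstract above can be sketched as a distortion score: a correspondence is good if pairwise surface distances are preserved under it. The sketch below assumes precomputed pairwise distance matrices and a hard point-to-point map, both simplifications made for illustration; the function name is hypothetical, and the learning model that predicts the correspondence is not shown.

```python
def metric_distortion(dist_x, dist_y, corr):
    """Mean squared change in pairwise distances under the map i -> corr[i].

    dist_x, dist_y: square matrices of pairwise surface distances.
    corr: corr[i] is the index on shape Y matched to point i on shape X.
    """
    n = len(corr)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (dist_x[i][j] - dist_y[corr[i]][corr[j]]) ** 2
    return total / (n * n)


# Three points along a path; the second shape is an isometric deformation,
# so its pairwise distances are identical.
d = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 1.0],
     [2.0, 1.0, 0.0]]

print(metric_distortion(d, d, [0, 1, 2]))  # 0.0 for the correct correspondence
print(metric_distortion(d, d, [1, 0, 2]))  # positive for a wrong correspondence
```

Since the score needs no ground-truth labels, it could in principle serve as a training loss once the hard map is relaxed to a differentiable soft correspondence.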
Senior Applied Scientist
Assaf is a Senior Applied Scientist at Amazon. Since 2015, he has taken part in various deep learning projects, mostly in the Fashion AI domain.
Before joining Amazon, Assaf was a computer vision researcher at the Israeli Intelligence Corps and a senior algorithm engineer at medical and cybersecurity startups.
Assaf has published papers in IEEE, KDD and CVPR, and holds a BSc and MSc in Electrical Engineering from Tel-Aviv University.
This paper presents a new image-based virtual try-on approach (Outfit-VITON) that helps visualize how a composition of clothing items selected from various reference images form a cohesive outfit on a person in a query image. Our algorithm has two distinctive properties. First, it is inexpensive, as it simply requires a large set of single (non-corresponding) images (both real and catalog) of people wearing various garments without explicit 3D information. The training phase requires only single images, eliminating the need for manually creating image pairs, where one image shows a person wearing a particular garment and the other shows the same catalog garment alone. Secondly, it can synthesize images of multiple garments composed into a single, coherent outfit; and it enables control of the type of garments rendered in the final outfit. Once trained, our approach can then synthesize a cohesive outfit from multiple images of clothed human models, while fitting the outfit to the body shape and pose of the query person. An online optimization step takes care of fine details such as intricate textures and logos. Quantitative and qualitative evaluations on an image dataset containing large shape and style variations demonstrate superior accuracy compared to existing state-of-the-art methods, especially when dealing with highly detailed garments.
Computer Vision Algorithm Team Leader
Tal Perl is a computer vision algorithm team leader at EyeSight. His main focus is in geometrical deep learning, specifically estimating 3D properties using 2D images. He holds a BSc and MSc in electrical engineering from Tel Aviv University.
Face alignment is a known computer vision challenge that has been explored enthusiastically in the past two decades. Thanks to the deep learning paradigm, researchers are now able to train networks to estimate 3D poses of objects using large amounts of data. A recurring limitation in these solutions is the lack of labelled data. Moreover, it is challenging to annotate 2D images with 3D labels. In this talk, we review the latest works on this topic and present how we generated massive amounts of semi-synthetic data using 3D morphable models. With this data, we were able to train a competitive real-time face tracker that is also lightweight and highly accurate for the task of head pose estimation.
Tammy Riklin Raviv
Faculty Member, School of Electrical and Computer Engineering
Tammy Riklin Raviv is a faculty member in the School of Electrical and Computer Engineering of Ben-Gurion University. Her main research interests are Computer Vision, Machine Learning and Biomedical Image Analysis. She is an associate editor at the IEEE Transactions on Medical Imaging (TMI) Journal and a TC member at the IEEE Bio Imaging and Signal Processing (BISP) Committee. She holds a B.Sc. in Physics and an M.Sc. in Computer Science (both magna cum laude) from the Hebrew University of Jerusalem. She received her Ph.D. from the School of Electrical Engineering of Tel-Aviv University. During 2010-2012 she was a research fellow at Harvard Medical School and the Broad Institute of MIT and Harvard. Prior to this (2008-2010) she was a post-doctorate associate in CSAIL, MIT.
In this talk I will introduce a novel Deep Learning framework, which quantitatively estimates image segmentation quality without the need for human inspection or labeling. We refer to this method as a Quality Assurance Network - QANet. Specifically, given an image and a ‘proposed’ corresponding segmentation, obtained by any method including manual annotation, the QANet solves a regression problem in order to estimate a predefined quality measure (for example, the IoU or a Dice score) with respect to the unknown ground truth. The QANet is by no means yet another segmentation method. Instead, it performs a multi-level, multi-feature comparison of an image-segmentation pair based on a unique network architecture, called the RibCage.
To demonstrate the strength of the QANet, we addressed the evaluation of instance segmentation using two different datasets from different domains, namely, high throughput live cell microscopy images from the Cell Segmentation Benchmark and natural images of plants from the Leaf Segmentation Challenge. While synthesized segmentations were used to train the QANet, it was tested on segmentations obtained by publicly available methods that participated in the different challenges. We show that the QANet accurately estimates the scores of the evaluated segmentations with respect to the hidden ground truth, as published by the challenges’ organizers.
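For readers unfamiliar with the quality measures mentioned above, here is a minimal sketch of how IoU and Dice are computed when the ground truth is available. It illustrates only the regression target, not the QANet itself; the function name `iou_and_dice`, the nested-list mask format, and the convention for two empty masks are assumptions made for this example.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two equally sized binary masks given as nested lists."""
    inter = union = pred_area = truth_area = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            p, t = bool(p), bool(t)
            inter += p and t        # pixel set in both masks
            union += p or t         # pixel set in at least one mask
            pred_area += p
            truth_area += t
    if union == 0:
        # Both masks are empty: treated here as perfect agreement (an assumed convention).
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + truth_area)


# Hypothetical 2x3 masks: a proposed segmentation and its ground truth.
proposed = [[1, 1, 0],
            [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
iou, dice = iou_and_dice(proposed, ground_truth)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

A network such as the one described would be trained to predict values like these from the image and the proposed segmentation alone, without access to the ground-truth mask.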
Computer Vision & Deep Learning Researcher
Elbit Systems Aerospace
Yakov Miron holds a BScEE from Ben-Gurion University and an MScEE from Tel Aviv University in Israel. He previously worked for Motorola Inc. and Silentium as an algorithm developer. He is currently a Computer Vision and Deep Learning Researcher in the R&D division at Elbit Systems. His interests include Machine Learning, Deep Learning, Computer Vision, 3D Modeling, as well as Navigation, Localization and SLAM.
This work offers a new method for generating photo-realistic images from semantic label maps and simulator edge-map images. We do so in a conditional manner, where we train a Generative Adversarial Network (GAN) given an image and its semantic label map to output a photo-realistic version of that scene. Existing GAN architectures still fall short on photo-realism. We address this issue by embedding edge maps and presenting the Generator with an edge-map image as a prior, which enables generating high-level details in the image. We also offer a model that uses this generator to create visually appealing videos when a sequence of images is given.
Chief Science Officer & Head of AI
I have led the science, computer vision and ML activities of SEETREE.AI, a successful new start-up in the agtech domain, since its pre-seed stage. As a former senior director of algorithms and R&D at Mobileye, I leverage 13 years of experience there to help introduce a similar transformation to the agtech world, together with an excellent multidisciplinary team.
We introduce a new method of image-registration, named "semantic spatial alignment" (SSA).
This method optimizes a semantic difference loss between two images, using a gradient-descent process that optimizes the parameters of a neural network composed of a single differentiable spatial transformer. The new method shows a dramatic improvement over state-of-the-art feature-point-matching methods (e.g., SIFT, ORB) when the inputs are time-repeating orthomosaics of tree plantations, where inputs can come from different sensors and resolutions and contain changes in the shape of the tree objects. The method is also superior in cases where the success of affine, projective or other simple homographic transformation maps is limited. It demonstrates a successful use of deep learning to dramatically improve a traditional "classical computer vision" task such as image registration.
Dr. Leonid Karlinsky
Senior Research Scientist, Research Team Lead
IBM Research AI
Leonid Karlinsky leads the CV & DL research team in the Computer Vision and Augmented Reality (CVAR) group @ IBM Research AI. Before joining IBM, he served as a research scientist at Applied Materials, Elbit, and FDNA. He actively publishes and reviews at ECCV, ICCV, CVPR and NeurIPS, and has served as an IMVC steering committee member for the past 3 years. His recent research is in the area of few-shot learning, with a specific focus on object detection, metric learning, and example synthesis methods. He received his PhD degree at the Weizmann Institute of Science, supervised by Prof. Shimon Ullman.
In this talk we will discuss our recent advances in few-shot learning, a regime where only a handful of training examples (maybe just one) are available for learning novel categories unseen during training. We will cover a method for few-shot classification that is capable of matching and localizing instances of novel categories, despite being trained and used with only category level image labels and without any location supervision, also opening the door for weakly supervised few-shot detection. We will cover a method for meta-learning a model that automatically modifies its architecture to better adapt to novel few-shot tasks. Finally, we will discuss the limitation of the current few-shot learning methods when handling extreme cases of domain transfer, and offer a new benchmark and some ideas towards cross-domain few-shot learning.
Michal Holtzman Gazit
Lead Computer Vision Researcher
Michal Holtzman Gazit is a lead computer vision researcher at Novocure, with nearly 20 years of experience in computer vision, image processing and medical imaging. She received her BSc (1998) and MSc (2004) in Electrical Engineering and her PhD (2010) in Computer Science, all from the Technion. During 2010-2012, she was a post-doctoral fellow in the computer science department at the University of British Columbia, Vancouver, Canada. Her main research interests are computer vision, image processing, AI in healthcare and deep learning.
Tumor Treating Fields (TTFields) is an FDA-approved treatment for specific types of cancer that significantly extends patients’ lives. The intensity of the TTFields within the tumor is associated with treatment outcomes: the larger the intensity, the longer patients are likely to survive. This requires optimizing the location of the TTFields transducer arrays such that the field intensity is maximized around the tumor. Finding the best array placement is a multi-stage, computationally expensive optimization problem. Here, we present a novel method that incorporates machine learning and deep learning to enable physicians to plan TTFields treatment better.
Staff Researcher at the Smart Sensing and Vision Group
General Motors R&D Israel
I am a Staff Researcher at the Smart Sensing and Vision group, General Motors R&D Israel, in the fields of computer vision and machine learning. I received my B.Sc. degree (with honors) in mathematics and computer science from Tel-Aviv University in 2000, and my M.Sc. and PhD degrees in applied mathematics and computer science from the Weizmann Institute in 2004 and 2009, respectively. At the Weizmann Institute I conducted research in human and computer vision under the supervision of Professor Shimon Ullman. Since 2007 I have been conducting industrial computer vision research and development at several companies, including General Motors and Elbit Systems, Israel.
We introduce a network that directly predicts the 3D layout of lanes in a road scene from a single image. This work marks a first attempt to address this task with on-board sensing without assuming a known constant lane width or relying on pre-mapped environments. Our network architecture, 3D-LaneNet, applies two new concepts: intra-network inverse-perspective mapping (IPM) and anchor-based lane representation. The intra-network IPM projection facilitates a dual-representation information flow in both regular image-view and top-view. An anchor-per-column output representation enables our end-to-end approach which replaces common heuristics such as clustering and outlier rejection, casting lane estimation as an object detection problem. In addition, our approach explicitly handles complex situations such as lane merges and splits. Results are shown on two new 3D lane datasets, a synthetic and a real one. For comparison with existing methods, we test our approach on the image-only tuSimple lane detection benchmark, achieving performance competitive with state-of-the-art.
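As background for the inverse-perspective mapping (IPM) mentioned above, the following is a minimal sketch of classical, geometry-only IPM for a single pixel. It is not the intra-network projection used in 3D-LaneNet; it assumes a pinhole camera, a flat road, zero roll and yaw, and the function and parameter names (`pixel_to_ground`, `cam_height`, `pitch`) are hypothetical.

```python
import math


def pixel_to_ground(u, v, f, cx, cy, cam_height, pitch):
    """Map pixel (u, v) to ground-plane coordinates (X lateral, Y forward).

    Assumptions: pinhole camera with focal length f (pixels) and principal
    point (cx, cy); flat ground; zero roll and yaw; pitch in radians,
    positive when the camera looks down. Output is in the units of cam_height.
    Returns None for pixels at or above the horizon.
    """
    # Ray direction in camera coordinates (x right, y down, z forward).
    dx = (u - cx) / f
    dy = (v - cy) / f
    s, c = math.sin(pitch), math.cos(pitch)
    # Rate at which the ray descends toward the ground in world coordinates.
    descent = dy * c + s
    if descent <= 1e-12:
        return None  # the ray never reaches the ground plane
    t = cam_height / descent          # ray parameter at the ground intersection
    return t * dx, t * (c - dy * s)   # (lateral offset, forward distance)


# Hypothetical camera: 500 px focal length, 640x480 image, 1.5 m above the road.
print(pixel_to_ground(420, 340, 500.0, 320.0, 240.0, 1.5, 0.0))
```

Applying this mapping to every pixel below the horizon yields a top-view image, which is the kind of representation the dual image-view and top-view information flow refers to.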
Computer Vision Researcher
Uriya is currently a computer vision researcher at Rafael. He has worked on noise removal from imagery for noise-sensitive sensors and on change detection. His current research is focused on unsupervised change detection on aerial images, based on metric-learning.
He holds an M.Sc. in Electrical Engineering from Tel Aviv University, specializing in computer vision algorithms and software development.
Given a pair of images of the same geographic area taken at different times, we wish to detect changes between them. Change detection is a challenging task. It is required to distinguish between fundamental changes, often man made, and insignificant natural ones. The latter may result from changing lighting, weather, camera pose, slight vegetation movement due to wind, and small errors in image registration. We address the change detection problem by training a learned descriptor using registered image pairs. Our fully convolutional CNN-based descriptor can efficiently detect changes in large aerial image pairs. It is shown to generalize well for a completely new scene and type of changes, while being robust to registration errors. The labeling of each image pair as similar or different is implied by the automatic registration process. Therefore, no manual annotation of any kind is required. While the lack of supervision results in label noise, the algorithm proves highly robust to it.
Dr. Anna Levant
3D Metrology Algorithm Team Leader
Dr. Anna Levant is a 3D metrology algorithm team leader at Applied Materials. She holds a PhD in Applied Mathematics from the Weizmann Institute of Science, with a focus on chaos problems. Prior to joining Applied Materials, she worked for 10 years at various medical device companies, leading the development of algorithms for modalities such as MRI, X-ray and ECG.
3D metrology is a new fascinating field in the semiconductor industry. Shrinkage of planar devices has reached its physical limit and advanced nodes resort to 3D design to increase the feature density in the device. Reliable measurements of these 3D structures are crucial for a chip development process.
We propose a novel supervised ML (Machine Learning) based solution for inferring 3D structure from 2D SEM (Scanning Electron Microscope) images. Our algorithm reached sub-nanometer accuracy and high precision.
The generality of our method and its ability to extract hidden information from SEM images open the door to a plethora of applications in 3D metrology for memory and logic devices.
Algorithm Department Manager
Ovadya joined Percepto in January 2019 as a Computer Vision team leader, with over 20 years of experience building Computer Vision solutions in the industry at companies such as Intel Corporation, Applied Materials and PointGrab. Ovadya’s last position was with Innoviz-Tech, heading the Computer Vision department. Ovadya set the foundation for the Innoviz-Tech Computer Vision department, including defining the computer vision product specs. Ovadya has vast experience in Computer Vision applications, including deep learning and object detection and tracking in mass-produced products such as Samsung TVs.
Ovadya holds an M.Sc. degree in the field of computer vision from the Weizmann Institute of Science.
Monitoring large areas is presently feasible with high-resolution drone cameras, as opposed to time-consuming and expensive ground surveys. In this work we reveal, for the first time, the potential of using a state-of-the-art GAN-based change detection algorithm with high-resolution drone images for infrastructure inspection. We demonstrate this concept on a solar panel installation. We propose a data-driven deep learning algorithm for identifying changes, using the Conditional Adversarial Network approach to build a framework for change detection in images. The proposed network architecture is based on the pix2pix GAN framework. Extensive experimental results show that our approach outperforms other state-of-the-art change detection methods.
Dr. Amir Handelman
Senior Lecturer at Faculty of Electrical Engineering
Holon Institute of Technology
Dr. Amir Handelman received his BSc, MSc and PhD degrees in Electrical Engineering in 2008, 2011 and 2014, respectively, all from Tel-Aviv University, Israel. In 2014, Amir joined the Faculty of Electrical Engineering at the Holon Institute of Technology (HIT) as a tenure-track faculty member and established the Applied Optics and Machine Vision Lab there. In addition to his academic background, Amir has over 10 years of experience in computer vision and optics, gained while working at several high-tech companies, such as Israel Aerospace Industries (IAI), Volume-Elements Ltd., and KLA-Tencor.
Today, patients and hospital managers increasingly want residents and surgeons to train before performing actual operations. This has led modern surgical curricula to include simulators for improving surgical skills in teaching programs. In this talk I will review our computerized algorithms, which aim to score the performance of laparoscopic cutting and suturing operations in both general surgery and ophthalmology. Using our algorithms, human assessment is not necessary and the quality of the surgical outcome is objectively evaluated.
MaxQ-AI and Tel Aviv University
Leah Bar holds a B.Sc. in Physics, an M.Sc. in Bio-Medical Engineering and a PhD in Electrical Engineering from Tel-Aviv University.
She worked as a post-doctoral fellow in the Department of Electrical Engineering at the University of Minnesota.
She is currently a senior researcher at MaxQ-AI, a medical AI start-up, and in addition a researcher at the Mathematics Department in Tel-Aviv University.
Her research interests are machine learning, image processing, computer vision and variational methods.
We introduce a novel neural network-based partial differential equations solver for forward and inverse problems. The solver is grid free, mesh free and shape free, and the solution is approximated by a neural network.
We employ an unsupervised approach in which the input to the network is a set of points in an arbitrary domain, and the output is the set of corresponding function values. The network is trained to minimize deviations of the learned function from the PDE solution and to satisfy the boundary conditions.
The resulting solution in turn is an explicit smooth differentiable function with a known analytical form.
Unlike other numerical methods such as finite differences and finite elements, the derivatives of the desired function can be analytically calculated to any order. This framework therefore enables the solution of high-order non-linear PDEs. The proposed algorithm is a unified formulation of both forward and inverse problems, where the optimized loss function consists of a few elements: fidelity terms in the L2 and L-infinity norms, boundary and initial condition constraints, and additional regularizers. This setting is flexible in the sense that regularizers can be tailored to specific problems. We demonstrate our method on several free-shape 2D second-order systems with application to Electrical Impedance Tomography (EIT).
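The loss structure described above can be written out schematically. The form below is only an illustrative sketch for a forward problem, not the authors' exact formulation: the differential operator $\mathcal{F}$, the boundary data $g$, the interior points $x_i$, the boundary points $b_j$, the weights $\lambda_2, \lambda_\infty, \mu$ and the regularizer $\mathcal{R}$ are notation assumed here. Initial-condition terms, where relevant, would enter in the same way as the boundary term.

```latex
% PDE: \mathcal{F}[u](x) = 0 in \Omega, with u = g on \partial\Omega.
% u_\theta is the neural-network approximation with parameters \theta.
\mathcal{L}(\theta)
  = \underbrace{\frac{\lambda_2}{N} \sum_{i=1}^{N}
      \bigl|\mathcal{F}[u_\theta](x_i)\bigr|^{2}}_{L_2 \text{ fidelity}}
  + \underbrace{\lambda_\infty \max_{1 \le i \le N}
      \bigl|\mathcal{F}[u_\theta](x_i)\bigr|}_{L_\infty \text{ fidelity}}
  + \underbrace{\frac{\mu}{M} \sum_{j=1}^{M}
      \bigl|u_\theta(b_j) - g(b_j)\bigr|^{2}}_{\text{boundary conditions}}
  + \mathcal{R}(\theta)
```

Because $u_\theta$ is a smooth function of its input, the derivatives inside $\mathcal{F}[u_\theta]$ can be evaluated exactly at each sample point, so no grid or mesh is needed.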
Elad Levi is a machine learning researcher at Nexar, a startup aiming to create a real-time traffic network for shaping the future of mobility. His work focuses on leveraging Nexar's large-scale datasets of real-world driving environments for mapping and automotive safety applications. Elad received a PhD degree in mathematics from the Hebrew University. His thesis was in the field of model theory, with applications to combinatorial problems.
Building fresh, accurate maps of road items is a key ingredient in smart city management and in enabling fully autonomous vehicles. Building such maps from cheap sensors such as a monocular camera, a GPS sensor and an IMU is a major challenge. It is even harder in a crowdsourcing setting, where the data is noisy and the camera position is arbitrary and unknown.
In this talk, we address this problem and related issues, namely camera alignment, self-localization, depth estimation, etc. We demonstrate that by using self-supervised approaches along with a large corpus of diverse, noisy, unlabeled data, we can get surprisingly accurate results.
AI & Data Science Researcher
Adi is a member of the core AI & data science research team of Intel’s Advanced Analytics group (Deep learning, NLP and computer vision research for sales and marketing, manufacturing, healthcare), in parallel to PhD research at the Hebrew University’s Computer Science department, supervised by Prof. Leo Joskowicz.
Adi holds an M.Sc in Bio-Engineering, an M.E. in Bio-Medical Engineering and a B.Sc in Electronics engineering.
We present a deep learning system for testing graphics units by detecting novel visual corruptions in videos. Unlike previous work in which manual tagging was required to collect labeled training data, our weak supervision method is fully automatic and needs no human labelling. This is achieved by reproducing driver bugs that increase the probability of generating corruptions, and by making use of ideas and methods from the Multiple Instance Learning (MIL) setting. In our experiments, we significantly outperform self-supervised methods such as GAN-based models and discover novel corruptions undetected by baselines, while adhering to strict requirements on accuracy and efficiency of our real-time system.
Engineering Manager – Image Processing
Jeff Mather is a senior software engineer and the development manager of the Image Processing Toolbox. He has managed the team since 2013 and has developed features for the toolbox and MATLAB since 2000, particularly in the area of file formats, medical image processing, HDR imaging, color science, and software performance optimization. He has an undergraduate degree in mathematics from Grinnell College and a Master of Software Engineering from Brandeis University. He has been with MathWorks since 1998.
We will explore the challenges of processing very large pathology images and present a framework to solve problems in segmentation, feature extraction, measurement, and labeling. The solution exploits multiple resolution levels, parallelism, and conditional processing. Special attention is given to using deep convolutional neural networks.
94, Yigal Alon St.
Tel Aviv 6109202