October 26, 2021
Pavilion 10, EXPO Tel Aviv
Datagen Technologies Ltd.
Tel Aviv University
Israel Atomic Energy Commission (IAEC)
Alibaba Damo Academy
Huawei Tel Aviv Research Center
Siemens Digital Industries Software
Bosch Center for Artificial Intelligence (BCAI)
IBM Research AI
Sensory Motor Integration lab
Applied Materials Israel
Image Denoising – Not What You Think
Laurence Moroney leads AI Advocacy at Google. He's the author of over 20 books, including the recent best-seller "AI and Machine Learning for Coders", published by O'Reilly. He's the instructor of the popular online TensorFlow specializations at Coursera and deeplearning.ai, as well as the TinyML specialization on edX with Harvard University. When not googling, he's also the author of the popular 'Legend of the Locust' sci-fi book series, the prequel comic books to the movie 'Equilibrium', and an IMDb-listed screenwriter. Laurence is based in Washington State, in the USA, where he drinks way too much coffee.
'State of the Union' Around the TensorFlow Ecosystem
Yoav Shoham is professor emeritus of computer science at Stanford University. A leading AI expert, Prof. Shoham is a Fellow of AAAI, ACM, and the Game Theory Society. Among his awards are the IJCAI Research Excellence Award, the AAAI/ACM Allen Newell Award, and the ACM/SIGAI Autonomous Agents Research Award. His online Game Theory course has been watched by close to a million people. Prof. Shoham has founded several AI companies, including TradingDynamics (acquired by Ariba), Katango and Timeful (both acquired by Google), and AI21 Labs. Prof. Shoham also chairs the AI Index initiative (www.AIindex.org), which tracks global AI activity and progress, and WeCode (www.wecode.org.il), a nonprofit initiative to train high-quality programmers from disadvantaged populations.
Lessons from Developing Very Large Language Models
Chip Huyen (@chipro) is an engineer and founder working on infrastructure for real-time machine learning. Through her work with Snorkel AI, NVIDIA, Netflix, Primer AI, and her current startup, she has helped some of the world’s largest organizations develop and deploy machine learning systems. She teaches CS 329S: Machine Learning Systems Design at Stanford.
LinkedIn included her among Top Voices in Software Development (2019) and Top Voices in Data Science & AI (2020).
Previously, she helped launch Cốc Cốc, Vietnam’s second most popular web browser with 20M monthly active users. She’s also the author of four bestselling Vietnamese books.
Real-time Machine Learning: Motivations and Challenges
This talk covers the two levels of real-time machine learning: online prediction (as opposed to batch prediction) and continual learning (as opposed to batch learning). Online prediction means making predictions as soon as requests arrive. Continual learning allows an ML system to automatically update itself with new data in production, speeding up the iteration cycle to combat the decay of model performance due to concept drift.
Real-time machine learning allows a system to be adaptive to changing users' behaviors and environments, which leads to better model performance and, in some cases, reduced compute cost. However, real-time ML comes with many infrastructural and theoretical challenges, and this talk discusses the key challenges.
Mark Grobman is the ML CTO at Hailo, a startup offering a uniquely designed microprocessor for accelerating embedded AI applications on edge devices. Mark has been at Hailo since it was founded in 2017 and has overseen the ML R&D in the company. Before joining Hailo, Mark served at the Israeli Intelligence Corps Technology Unit in various inter-disciplinary R&D roles. Mark holds a double B.Sc. in Physics and Electrical Engineering from the Technion and an M.Sc. in Neuroscience from the Gonda Multidisciplinary Brain Research Center at Bar-Ilan University.
Quantization at Hailo – Co-Evolution of Hardware, Software and Algorithms
Power-efficient hardware for deep neural network (DNN) acceleration is a major obstacle to the successful deployment at scale of many edge AI applications, such as autonomous driving, smart retail, smart cities and more. Modern DNN accelerators at the edge exploit quantization to significantly reduce power consumption. In the first part of the talk, we will give a brief overview of the hardware aspects of quantization, followed by a high-level review of the main approaches to quantization. We show how these concepts are leveraged in the Hailo-8, a highly power-efficient DNN accelerator. In the second part, we discuss "real-world" challenges of quantization and suggest perspectives for future work to address current gaps.
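As a toy illustration of the hardware motivation above (a generic post-training scheme, not Hailo's actual quantizer), an affine uniform quantizer maps floats to 8-bit integers via a scale and zero-point:

```python
import numpy as np

def quantize_uniform(x, num_bits=8):
    """Affine uniform quantization of a float tensor to unsigned integers.

    Integer-only accelerators store a per-tensor `scale` and `zero_point`
    so that real values are approximated as scale * (q - zero_point).
    Assumes x has a non-degenerate value range.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover the approximate float values from the integer codes."""
    return scale * (q.astype(np.float32) - zero_point)
```

The round-trip error is bounded by roughly one quantization step, which is the trade-off the talk's hardware/algorithm co-design aims to manage.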
Towards Fully Unsupervised Learning
Supervising all future scenarios a machine learning application may encounter with Boolean class labels is understood to be infeasible, leading to renewed interest in adaptive and self-supervised methods. Yet several popular “unsupervised” methods implicitly presume knowledge of target-domain invariances. I’ll present new methods that do not require hand-selected augmentation strategies and learn without supervision: multi-task contrastive learning, automatic selection of augmentation policies, and entropy-based adaptation at test time without access to source labels or data. Fully unsupervised and adaptive learning can work together to dramatically reduce the level of supervision required for real-world tasks.
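For background on the contrastive learning mentioned above, the standard InfoNCE-style objective (a generic SimCLR-like sketch, not the talk's specific multi-task method) pulls two views of the same input together and pushes other batch items away:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Minimal NumPy sketch of an InfoNCE-style contrastive loss.

    z1, z2: (batch, dim) embeddings of two augmented views of the same
    inputs. Row i of z1 is attracted to row i of z2 (the positive) and
    repelled from all other rows of z2 (the negatives).
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # positives on the diagonal
```

Note how the choice of augmentations that produce `z1` and `z2` encodes exactly the domain invariances the talk argues should not have to be hand-selected.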
Dr. Jonathan Laserson is the Head of AI Research at Datagen and an early adopter of deep learning algorithms. He did his bachelor's studies at the Israel Institute of Technology and holds a PhD from the Computer Science AI lab at Stanford University.
After a few years at Google, he ventured into the startup world and has been involved in many practical applications of ML and deep learning. Most recently, at Zebra Medical Vision, he led the development of two FDA-approved clinical products, applying neural networks to millions of medical images and unstructured textual reports.
Embedding Synthetic Assets Using a Neural Radiance Field
At Datagen, we maintain a large catalogue of artist-made 3D assets from many categories (e.g. tables, chairs, bottles). Each asset consists of a 3D mesh and a texture map. We form an alternative, implicit volumetric representation of all assets, where each asset in a category is assigned a latent code. A single network, coupled with a differentiable renderer, is trained to render 2D images of each asset given its code that are similar to the images rendered directly from the asset's mesh. The codes fully encapsulate each asset's shape and appearance, and can be used to extract the visual attributes of each asset without the need to re-render it.
PhD Student in Computer Science, Technion
Yossi is a PhD student at the Technion, researching computer vision and planning algorithms for robotic systems. He holds an M.Sc. in Physics from Tel-Aviv University, and a B.Sc. in Electrical Engineering and a B.Sc. in Physics from the Technion.
Local Trajectory Planning For UAV Autonomous Landing
An important capability of autonomous Unmanned Aerial Vehicles (UAVs) is autonomous landing while avoiding collision with obstacles in the process. Such capability requires real-time local trajectory planning. Although trajectory-planning methods have been introduced for cases such as emergency landing, they have not been evaluated in real-life scenarios where only the surface of obstacles can be sensed and detected. We propose a novel optimization framework using a pre-planned global path and a priority map of the landing area. Several trajectory planning algorithms were implemented and evaluated in a simulator that includes a 3D urban environment, LiDAR-based obstacle-surface sensing and UAV guidance and dynamics. We show that using our proposed optimization criterion can successfully improve the landing-mission success probability while avoiding collisions with obstacles in real-time.
Graduate Student, School of Computer Science, Tel Aviv University
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
StyleGAN is able to generate highly realistic images in a variety of domains; therefore, much recent work has focused on understanding how to use the latent spaces of StyleGAN to manipulate generated and real images. In this talk, I will present our recent paper "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery". In this paper, we explore leveraging the power of the recently introduced CLIP models to develop a text-based interface for StyleGAN image manipulation. This interface provides great expressivity for image editing and allows one to perform edits that were not possible with previous approaches.
Computer Vision Algorithm Engineer at 3DFY.ai; Technion
Rajaei Khatib is a Computer Vision Algorithm Engineer at 3DFY.ai. Rajaei obtained both B.Sc. and M.Sc. from the department of Computer Science at the Technion, where he was advised by Prof. Michael Elad. His research focused on the connection between sparse representation and deep neural networks.
Learned Greedy Method (LGM): A novel neural architecture for sparse coding and beyond
The fields of signal and image processing have been deeply influenced by the introduction of deep neural networks. Despite their impressive success, the architectures used in these solutions come with no clear justification, being "black box" machines that lack interpretability. A constructive remedy to this drawback is the systematic design of networks by unfolding well-understood iterative algorithms. A popular representative of this approach is LISTA, which evaluates sparse representations of processed signals. In this work, we revisit this task and propose an unfolded version of a greedy pursuit algorithm for the same goal; this method is the Learned Greedy Method (LGM).
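For context, the classical greedy pursuit that such architectures unfold is in the spirit of Orthogonal Matching Pursuit. A plain, non-learned version (an illustrative sketch, not the LGM network itself) can be written as:

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit for sparse coding.

    D: (n, m) dictionary with unit-norm columns; y: (n,) signal;
    k >= 1: target sparsity. Returns x with at most k nonzeros such
    that D @ x approximates y.
    """
    support, residual = [], y.copy()
    x = np.zeros(D.shape[1])
    for _ in range(k):
        # greedy step: pick the atom most correlated with the residual
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # projection step: least-squares fit on the current support
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x[support] = coeffs
    return x
```

An unfolded ("learned") variant replaces the fixed dictionary and thresholding decisions in each of the k iterations with trainable layers, which is the general idea behind LGM.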
Master's Student in Physics; Israel Atomic Energy Commission (IAEC)
Re'em Harel is a Master's student in Physics at Bar-Ilan University while working for the Israel Atomic Energy Commission (IAEC).
Complete Deep Computer Vision Methodology for Investigating Hydrodynamic Instabilities
In fluid dynamics, one of the most important research fields is hydrodynamic instabilities and their evolution in different flow regimes. Currently, three main methods are used for understanding such phenomena -- namely analytical and statistical models, experiments, and simulations -- and all of them are primarily investigated and correlated using human expertise. This work demonstrates how a major portion of this research effort could and should be analyzed using recent breakthrough advancements in computer vision with deep learning, specifically Image Retrieval, Template Matching, Parameter Regression, and Spatiotemporal Prediction, for the quantitative and qualitative benefits they provide. To do so, this research focuses mainly on one of the most representative instabilities, the Rayleigh-Taylor instability. The techniques developed and proved in this work can serve as essential tools for physicists in the field of hydrodynamics for investigating a variety of physical systems. Some of them can be easily applied to already existing simulation results, while others could be applied via transfer learning to research on other instabilities.
Applied Research Scientist, DAMO Academy, Alibaba Group
Emanuel Ben Baruch is an applied researcher at the Alibaba DAMO Academy, Machine Intelligence Israel Lab. His main fields of interest are deep learning approaches for image understanding, such as multi-label classification and object detection. Before joining Alibaba, Emanuel worked as a computer vision algorithm developer at Applied Materials and at an Israeli defense company.
Emanuel holds B.Sc. and M.Sc. degrees in Electrical Engineering, specializing in statistical signal processing, both from Bar-Ilan University.
Asymmetric Loss For Multi-Label Classification
Pictures of everyday life are inherently multi-label in nature. Hence, multi-label classification is commonly used to analyze their content. In typical multi-label datasets, each picture contains only a few positive labels, and many negative ones. This positive-negative imbalance can result in under-emphasizing gradients from positive labels during training, leading to poor accuracy.
In this lecture, we will introduce a novel asymmetric loss (ASL) that operates differently on positive and negative samples. The loss dynamically down-weights the importance of easy negative samples, causing the optimization process to focus more on the positive samples, and also enables discarding of mislabeled negative samples.
We demonstrate how ASL leads to a more "balanced" network, with increased average probabilities for positive samples, and show how this balanced network translates to better mAP scores compared to commonly used losses. Furthermore, we offer a method that can dynamically adjust the level of asymmetry throughout the training.
With ASL, we reach new state-of-the-art results on three common multi-label datasets, including 86.6% mAP on MS-COCO. We also demonstrate ASL's applicability to other tasks such as fine-grain single-label classification and object detection. ASL is effective, easy to implement, and does not increase training time or complexity.
Code is available at: https://github.com/Alibaba-MIIL/ASL
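The asymmetric focusing and probability-margin ideas described in the abstract can be sketched in a few lines of NumPy (hyperparameter names here are illustrative; see the linked repository for the official implementation):

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    clip=0.05, eps=1e-8):
    """Sketch of an asymmetric loss for multi-label classification.

    logits, targets: (batch, num_labels) arrays; targets in {0, 1}.
    gamma_neg > gamma_pos down-weights easy negatives, focusing the
    optimization on positives; `clip` is a probability margin that lets
    very easy negatives contribute exactly zero loss.
    """
    p = 1.0 / (1.0 + np.exp(-logits))      # sigmoid probabilities
    p_m = np.clip(p - clip, 0.0, 1.0)      # margin-shifted prob. for negatives

    loss_pos = targets * (1 - p) ** gamma_pos * np.log(p + eps)
    loss_neg = (1 - targets) * p_m ** gamma_neg * np.log(1 - p_m + eps)
    return -(loss_pos + loss_neg).sum(axis=-1).mean()
```

With `gamma_pos = gamma_neg = 0` and `clip = 0` this reduces to plain binary cross-entropy, which makes the asymmetry explicit.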
PhD Candidate in Applied Mathematics, Tel-Aviv University; Research Scientist at eBay
Ido is a PhD candidate in Applied Mathematics at Tel-Aviv University, as well as a research scientist at eBay Research. He is interested in representation learning, model interpretability, and theoretical deep learning. Before his work at eBay, Ido worked as a team leader at DeePathology.ai.
Sparsity-Probe: Analysis tool for Deep Learning Models
We propose a probe for the analysis of deep learning architectures that is based on machine learning and approximation-theoretic principles. Given a deep learning architecture and a training set, during or after training, the Sparsity Probe allows one to analyze the performance of intermediate layers by quantifying the geometrical features of the training set's representations. We talk about the importance of representation learning and of geometrical features in the latent space.
Graduate Student, Tel-Aviv University
ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement
PhD Student, Applied Mathematics Program, Mathematics Department, Technion
Samah is a Ph.D. candidate in the Applied Mathematics Department, working on her research under the supervision of Dr. Moti Freiman from the Faculty of Biomedical Engineering. Currently, she is conducting research on Bayesian deep-learning methods for MRI registration. She received her B.Sc. and M.Sc. (Cum Laude), both from the Technion's Viterbi Faculty of Electrical Engineering, in 2017 and 2019, respectively. She supervises image-processing-related projects. Previously, she worked at Intel and served as a TA in various courses and as a tutor in the Landa (equal opportunities) project. Samah is a fellow of the Ariane de Rothschild Women Doctoral Program.
Non-Parametric Bayesian Deep-learning Method for Brain MRI Registration
Recently, deep neural networks (DNNs) have been successfully employed to predict the best deformation field by minimizing a dissimilarity function between the moving and target images. Bayesian DNN models enable safer utilization in medical imaging, improve generalization, and assess the uncertainty of the predictions. In our study, we propose a non-parametric Bayesian approach to estimate the uncertainty in DNN-based algorithms for deformable MRI registration. We demonstrate the added value of our Bayesian registration framework on a brain MRI dataset. Further, we quantify the uncertainty of the registration and assess its correlation with out-of-distribution data.
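As general background (this is a generic sampling-based approximation, not the talk's specific non-parametric method), predictive uncertainty in Bayesian deep learning is often estimated by aggregating several stochastic forward passes, where `predict_fn` below stands for a hypothetical stochastic model:

```python
import numpy as np

def predictive_uncertainty(predict_fn, x, n_samples=20):
    """Estimate predictive mean and uncertainty via repeated stochastic
    forward passes (e.g., with dropout kept active at inference time).

    predict_fn: callable mapping an input to a prediction array, assumed
    stochastic across calls. Returns the per-element mean and variance,
    where high variance flags low-confidence regions of the prediction.
    """
    samples = np.stack([predict_fn(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)
```

In registration, such a per-voxel variance map can be overlaid on the deformation field to highlight where the alignment should not be trusted.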
Researcher, Alibaba Damo Academy
Dr. Yonathan Aflalo holds an M.Sc. from the École Polytechnique in France and a PhD from the Technion, where he specialized in spectral analysis of geometrical shapes. He is currently working as a researcher at the Alibaba Damo Academy, specializing in deep learning model optimization, including deep neural network pruning and neural architecture search, to discover very efficient models able to run in real time on any platform.
Realistic use of neural networks often requires adhering to multiple constraints on latency, energy, and memory, among others. At Alibaba, models deployed in production need to have a very low inference time to meet several cost constraints. In this talk, we present a Neural Architecture Search (NAS) method that we developed to construct such models: Hard Constrained diffeRentiable NAS (HardCoRe-NAS), which is based on an accurate formulation of the expected resource requirement and a scalable search method that satisfies the hard constraint throughout the search. Our experiments show that HardCoRe-NAS generates state-of-the-art architectures, surpassing other NAS methods.
Algorithm Researcher, Foresight Autonomous
Omri is a computer vision researcher at the R&D team of Foresight Autonomous.
His main research topics are sensor pose estimation and 3D reconstruction. He holds a BSc in Computer Science from Ben-Gurion University.
Multi-Sensor Quality Assessment Using 3D Reconstruction
Many systems have sensor redundancy and thus require real-time assessment of the quality and relevance of the data from each source. This assessment is useful when allocating computational resources and designing the user interface. It is difficult to reliably make such a comparison between different sensor types in real time, as that usually requires data registration and/or a deep understanding of the observed scene.
We present a method for assessing and comparing data quality and relevance through 3D reconstruction, such as by stereo vision. This method allows data assessment to be performed in real time, before other, more complex algorithms are run. Using this approach, we can avoid confusing data from noisy channels and concentrate our resources on the best channel.
Chief Scientist & Co-founder, Brodmann17
Using Synthetic Data for ADAS Applications and Perception Challenges
The performance of the Artificial Intelligence (AI) that powers automotive systems is directly linked to the data used to train and test it. Our research tackles the issue of using synthetic data for two different computer vision tasks where relevant data is hard to collect or simply doesn't exist. We show how to train object detectors based only on synthetic data that generalize to real-world data. We then show how to use synthetic data to evaluate the performance of distance estimation algorithms. The success of these two tasks paves the way for future research taking advantage of synthetic data.
Research Scientist Team Lead, Huawei Tel Aviv Research Center
Yoli Shavit is a Research Scientist Team Lead at Huawei TRC and a Postdoctoral Researcher at Bar-Ilan University. Her current research focuses on deep learning methods for camera localization and 3D reconstruction. Before joining Huawei, Yoli worked at Amazon and interned at Microsoft Research. She holds a PhD in Computer Science from the University of Cambridge, an MSc in Bioinformatics from Imperial College London and a BSc in Computer Science and in Life Science from Tel Aviv University. Yoli is the recipient of the Cambridge International Scholarship and her thesis was nominated for the best thesis award in the UK.
Learning Multi-Scene Absolute Pose Regression with Transformers
Absolute camera pose regressors (APRs) estimate the position and orientation of a camera from the captured image alone, offering a fast, lightweight and standalone alternative to localization pipelines. However, APRs are also less accurate and are typically trained per scene. In this work, we propose a transformer-based approach for improving the accuracy of APRs while learning multiple scenes in parallel. We use encoders to aggregate activation maps with self-attention and decoders for transforming latent features and scene encodings into candidate pose predictions. Our proposed approach achieves a new state-of-the-art localization accuracy for pose regression methods across indoor and outdoor benchmarks.
Data Scientist and Software Engineer, Siemens Digital Industries Software
Shahar is a Data Scientist and software engineer at Siemens Digital Industries Software, and an electrical engineering M.Sc. student at TAU.
In her role at Siemens, she connects advanced robotics simulation with machine learning.
She develops solutions for a variety of problems, from initial feasibility study, through a basic proof of concept, to productization.
She is passionate about taking existing concepts and creatively introducing them to new domains.
Her current focus is on making ML accessible to manufacturing engineers through synthetic data and AutoML.
Her research interests are geometric deep learning, 3D sensor processing, and sim2real transfer.
Synthetic Data for Dataset Enhancement
"By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated” - Erick Brethenoux, Gartner VP analyst.
As data collection and annotation are hard, tedious, and expensive, the use of rich, accurately annotated, synthetically generated data is gaining momentum.
In manufacturing, 3D simulators are used to generate synthetic data for training tasks like bin picking, robotic manipulation, and quality inspection.
We will review the current state, possibilities, and limitations of synthetic data, as well as known synthetic data generation tools.
Finally, we will discuss advanced methods for reducing the gap between synthetic and real data.
PhD Student, Tel-Aviv University
Roei Herzig is a CS Ph.D. student at Tel Aviv University and a Berkeley AI Research Lab member working with Prof. Amir Globerson and Prof. Trevor Darrell. His research focuses on the intersection of computer vision, language, and robotics, particularly video understanding and compositionality. Previously, Roei graduated magna cum laude from Tel Aviv University with MSc (CS), BSc (CS), and BSc (Physics), and worked as a researcher at Nexar and Trax for five years. His long-term research goal is to develop compositional models that leverage inductive biases into our architectures to generalize well across various tasks.
Towards Compositionality in Video Understanding
The key aspect of compositionality is that humans understand the world as a sum of its parts. For example, objects can be broken down into pieces, and events are composed of atomic actions. Our understanding of the world is naturally hierarchical and structured, and when new concepts are introduced, humans tend to decompose them into familiar parts. This leads to the hypothesis that intelligent machines will need to develop a compositional understanding that is robust and generalizable. In this talk, I will present two of our recent projects towards achieving this goal: our ICML 2021 paper on synthesizing goal-oriented videos, and hot-off-the-press work on bringing object interactions to video transformers, called Object-Region Video Transformers (ORViT).
Algorithm Developer, WSC Sports
Nitzan is a member of the Algorithms team at WSC Sports. The team researches and develops end-to-end real-time solutions for sports content automation, using deep learning for video, audio, and data analysis.
Before joining WSC Sports, Nitzan interned at Lightricks and Facebook and served for 5 years in an elite intelligence unit, leading an R&D team.
Nitzan holds a BSc in Computer Science from the Hebrew University of Jerusalem.
Weakly Supervised Sports Event Detection
WSC Sports' main goal is developing sports-event segmentation and recognition algorithms in order to automatically generate highlight videos from sports broadcasts.
A segment of an event usually includes a short period of preparation for the play, a celebration, or another expression by the players at the end of the event. Each event can have a few valid boundaries, and the ideal ones are subjective.
In this presentation, we will show our method for solving this problem, how we leverage our massive raw data to train a weakly supervised model, and how we evaluated this task, which admits multiple valid solutions.
Post-Doctoral Research Affiliate, Technion
Shira Nemirovsky-Rotman is a post-doctoral research fellow at the Technion's Computational MRI Laboratory (TCML), Faculty of Bio-Medical Engineering. She completed her PhD research, which dealt with medical image processing and analysis, at the Faculty of Electrical and Computer Engineering at the Technion in 2020. She holds an MSc in Electrical Engineering and a double BSc in Electrical Engineering and Physics (both Summa Cum Laude). She is also an alumna of the Technion's Excellence Program. Previously, Shira worked for several years as a computer vision engineer in industry. Shira has won the Qualcomm-Jacobs Fellowship Award for excellent PhD students and the Israel Ministry of Science Scholarship.
Physically Motivated Deep-Learning Models for Quantitative MRI
Quantitative Diffusion-Weighted MRI (DW-MRI) analysis with the "Intra-voxel Incoherent Motion" (IVIM) signal decay model shows potential to provide quantitative imaging biomarkers for various clinical applications. Nevertheless, reliable estimation of the IVIM parameters from clinical DW-MRI data remains a challenge. Recently, deep-neural-network (DNN) based approaches for IVIM model parameter prediction have demonstrated potential in obtaining more accurate and faster parameter estimates compared to classical model-fitting methods; however, these methods' capability to generalize the IVIM model fitting to different clinical acquisition protocols is limited. To address this shortcoming, we propose a novel DNN model, "IVIM-MIAP: Model Learning based on Incorporation of the Acquisition Protocol", which incorporates the acquisition protocol as part of the architecture. We demonstrate the added value of IVIM-MIAP compared to previously proposed DNN-based methods through simulation studies as well as in-vivo DW-MRI data.
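For reference, the IVIM signal decay model that these networks fit is the standard bi-exponential S(b) = S0 * (f * exp(-b * D*) + (1 - f) * exp(-b * D)), which can be written directly (a reference implementation of the textbook formula, not the IVIM-MIAP network):

```python
import numpy as np

def ivim_signal(b, s0, f, d_star, d):
    """Bi-exponential IVIM signal decay model.

    b: acquisition b-value(s) in s/mm^2; s0: signal at b = 0;
    f: perfusion fraction; d_star: pseudo-diffusion coefficient (D*);
    d: tissue diffusion coefficient (D). Returns the modeled DW signal.
    """
    return s0 * (f * np.exp(-b * d_star) + (1 - f) * np.exp(-b * d))
```

Classical fitting estimates (S0, f, D*, D) per voxel by least squares over the acquired b-values; the set of b-values is exactly the acquisition-protocol information that IVIM-MIAP feeds into the architecture.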
Computer Vision and Algorithms Team Leader, Penta-AI
Rethinking FUN: Frequency-Domain Utilization Networks
Lead Algorithm Researcher, Novocure
Michal Holtzman Gazit is a lead computer vision researcher at Novocure, with 20 years of experience in computer vision, image processing, and medical imaging. She received her B.Sc. (1998) and M.Sc. (2004) in Electrical Engineering and her PhD (2010) in Computer Science, all from the Technion. During 2010-2012, she was a post-doctoral fellow in the Computer Science Department at the University of British Columbia, Vancouver, Canada. Her main research interests are computer vision, image processing, AI in healthcare, and deep learning.
Uncertainty Estimation in Postoperative GBM Segmentation
Glioblastoma multiforme (GBM) is the most frequent and lethal malignant brain tumor in adults. Novocure's Tumor Treating Fields (TTFields) therapy was recently introduced as a novel therapeutic modality that significantly extends GBM patients' lives. TTFields treatment planning requires the segmentation of GBM tissues on postoperative MR images to evaluate the distribution of TTFields in the tumor. We present a novel method for the segmentation of GBM in postoperative patients that incorporates noisy labels and models the uncertainty of the segmentation. This uncertainty can further be deployed for intuitive and fast editing of the segmentation result.
MSc Student, Tel-Aviv University
Hila is an MSc student at the Blavatnik School of Computer Sciences at Tel-Aviv University, working at the Deep Learning lab under the supervision of Prof. Lior Wolf. Her research interests include attention-based models for computer vision and NLP, self-supervised learning, zero-shot and few-shot learning, multi-modal learning, as well as interpretability of deep neural networks.
Her work on Transformer interpretability appeared in CVPR 2021 and is set to appear in ICCV 2021 as an oral presentation. The novel methods presented in her work are state-of-the-art in Transformer interpretability and have been widely adopted for computer vision and NLP.
Interpretability of Transformer-based Models
In this lecture, we will explore state-of-the-art methods for Transformer interpretability. We will present applications for Transformer interpretability in computer vision, NLP, and multi-modal tasks while showing examples with some of the most popular Transformer-based models such as the Vision Transformer, CLIP, and BERT.
DNN interpretability is an ever-growing field of research targeted at explaining neural networks by developing methods to diagnose what aspects of the input drive the decisions made by the model.
Recently, Transformers gained increasing popularity in several fields, such as NLP and computer vision. This wave of popularity substantiates the necessity of interpretability methods for Transformers.
Deep Learning Computer Vision Researcher, Viz.ai
Clara works as a researcher on Viz.ai's AI team, leading algorithm development for various healthcare products, leveraging computer vision, signal processing, and deep learning methods. Clara holds a BSc and an MSc from the Hebrew University of Jerusalem, where she specialized in medical image processing using machine learning and deep learning techniques.
A Deep Learning and Signal Processing approach for dynamic brain CT analysis
CT Perfusion (CTP) is a 4-dimensional, dynamic CT time-series scan, in which intravenous contrast is injected while the patient's head is scanned repeatedly as contrast enters and leaves the brain. The scan is useful for assessing a stroke patient's suitability for treatment, as it enables differentiation of salvageable brain tissue from irrevocably damaged brain tissue.
In this talk, we describe the complex algorithmic pipeline we have developed at Viz.ai for the automatic, scalable, and accurate analysis of this high-variability, noisy data, including deep-learning-based motion correction, signal processing methods for denoising, deep learning models for damaged-tissue segmentation, and more, turning data into clinically actionable information for improving patient care.
Senior Research Scientist, OriginAI
Rami Ben-Ari is a senior research scientist at OriginAI (an AI research center) and an adjunct professor at Bar-Ilan University. Prior to that, Rami was a research staff member in medical imaging and a technical lead in Video-AI technologies at IBM Research. He holds a PhD in Applied Mathematics from Tel-Aviv University, specializing in computer vision. He is active in the academic community, has published over 50 papers and patents, and has organized workshops and challenges at CV/ML venues. His research interests cover deep learning methods for video understanding, image retrieval, and multimodal learning.
Challenges in Deep Fake Detection
Deep fake images and videos are ubiquitous, e.g. in social networks, on YouTube, and in movies. Recent studies have shown rapid progress in facial manipulation, enabling attackers to manipulate the facial image of an individual and generate a new identity. As a remedy, many CNN-based approaches suggest a deep fake detection classifier capable of distinguishing between real and fake images/videos. In recent years we have witnessed competitions organized and driven by large and diverse fake footage created by different generators. Apparently, the generalization capability of current detectors to unknown generators is still debated in the community. In this talk, I'll present an overview of the problem, focusing on the generalization task.
Research Engineer, Bosch Center for Artificial Intelligence (BCAI)
Oren Spector is a Research Engineer at the Bosch Center for Artificial Intelligence (BCAI). His current research is focused on machine learning and reinforcement learning for robotic manipulation, mainly for industrial purposes. Oren obtained both his B.Sc. and M.Sc. from the Department of Mechanical Engineering at the Technion, where he was advised by Prof. Miriam Zacksenhouse; his research focused on learning contact-rich skills using residual admittance. Before his work at BCAI, Oren did research and algorithm development at Rafael Advanced Defense Systems.
InsertionNet - A Scalable Solution for Insertion
Complicated assembly processes can be described as a sequence of two main activities: grasping and insertion. While general grasping solutions are common in industry, insertion is still only applicable to a small subset of problems, mainly ones involving simple shapes in fixed locations, in which variations are not taken into consideration. Recently, RL approaches with prior knowledge (e.g., LfD or residual policy) have been adopted. However, these approaches can be problematic in contact-rich tasks, since interaction might endanger the robot and its equipment. In this paper, we tackle this challenge by formulating the problem as a regression problem. By combining visual and force inputs, we demonstrate that our method can scale to 16 different insertion tasks in less than 10 minutes. The resulting policies are robust to changes in the socket position, orientation, or peg color, as well as to small differences in peg shape. Finally, we demonstrate an end-to-end solution for two complex assembly tasks with multi-insertion objectives, where the assembly board is randomly placed on a table.
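The regression formulation at the heart of this approach — predicting a corrective motion from combined visual and force inputs — can be illustrated with a toy numpy sketch. A linear least-squares model stands in for the actual network; the feature names and dimensions here are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: regress a corrective Cartesian delta from
# concatenated visual and force features.
n = 200
visual_feat = rng.normal(size=(n, 4))  # e.g. peg/socket offset cues from the image
force_feat = rng.normal(size=(n, 3))   # e.g. a force-torque sensor reading
X = np.hstack([visual_feat, force_feat])

true_W = rng.normal(size=(7, 3))       # unknown mapping to a 3-D correction
y = X @ true_W                         # target corrective delta (dx, dy, dz)

# Fit the regression (the "training" step).
W_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# At run time: predict the correction for a new observation.
delta = np.hstack([visual_feat[0], force_feat[0]]) @ W_hat
```

Framing insertion as regression, rather than trial-and-error RL, is what keeps contact-rich interaction safe during data collection; the real method replaces the linear map with a deep network over raw images and force readings.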
Ph.D. Candidate, Tel Aviv University
Sigal Raab is a Ph.D. candidate in the School of Computer Science at Tel Aviv University, under the supervision of Daniel Cohen-Or. Her research deals with 3D human motion analysis and synthesis, with the term motion referring to bone lengths associated with temporally coherent 3D joint rotations. Before pursuing her doctoral studies, Sigal worked for many years in the high-tech industry. There, she took part in the research and development of a variety of innovative computer vision products.
FLEX: Parameter-Free Multi-View Human Motion Reconstruction
Video recordings made by multiple cameras facilitate mitigating occlusion and depth ambiguities in pose and motion reconstruction. Yet, multi-view algorithms strongly depend on camera parameters. Such a dependency becomes a hurdle once shifting to uncontrolled settings. We introduce FLEX (Free muLti-view rEconstruXion), an end-to-end parameter-free multi-view model, that does not require any camera parameters. Our key idea is that the 3D angles between skeletal parts, as well as bone lengths, are invariant to the camera position, in contrast to joint locations. We present our results on several datasets and show that in the absence of camera parameters, we outperform other algorithms by a large margin.
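The key invariance can be checked numerically: under a rigid camera rotation, joint locations change while bone lengths and the angles between skeletal parts do not. A minimal sketch with made-up bone vectors (illustrative only, not the paper's code):

```python
import numpy as np

def bone_angle(u, v):
    """Angle (radians) between two bone direction vectors."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))

# Two bones meeting at a joint, expressed in some world frame.
upper_arm = np.array([0.0, 0.30, 0.0])
forearm = np.array([0.25, 0.10, 0.05])

# A camera change is a rigid rotation R (here: 40 degrees about z).
t = np.deg2rad(40)
R = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])

# Joint *locations* change under R, but angles and bone lengths do not.
angle_world = bone_angle(upper_arm, forearm)
angle_rot = bone_angle(R @ upper_arm, R @ forearm)
```

Because these quantities are camera-invariant, a network predicting them can fuse multiple uncalibrated views without ever knowing the camera parameters.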
Principal AI Researcher, Vimeo
Alon Faktor is a principal AI researcher at Vimeo, the world’s leading all-in-one video software solution. Before joining Vimeo, Alon worked as a lead AI researcher at Magisto, which was acquired by Vimeo in 2019. Alon's research focuses on deep learning for consumer and SMB applications. He is especially passionate about research in the video domain, including video segmentation, tracking, and action recognition. Alon holds a B.Sc. in Physics and Electrical Engineering from the Technion, and an M.Sc. and PhD in Mathematics and Computer Science from the Weizmann Institute of Science.
Foreground layer segmentation - Bridging the gap between semantic segmentation, depth and saliency
Automatic image segmentation is a well-studied and long-lasting problem in computer vision. Most of the vision community effort in this field over the years has focused on semantic segmentation of a predefined and usually small set of object or scene categories. However, much less focus has been given to the problem of foreground layer segmentation, which is a fundamental task in the photo editing and video creation community. In this talk, we overview the main challenges of this task, show why semantic segmentation fails here, and present a system which can handle the problem robustly.
Manager - Applied Scientist, Amazon Lab126
Amazon Halo will use your smartphone camera to assess your ‘Movement Health’
In this talk we present the underlying algorithmic work developed and launched by Amazon's Health and Wellness CVML team. We developed a novel set of biomarkers and CV algorithms to assess physical (Movement) Health. Movement Health is based on functional fitness: your body’s readiness to execute the everyday movements you do without thinking. To provide these biomarkers, we use the customer's phone camera to acquire fitness videos. Once a video is acquired and uploaded to the cloud, our algorithms evaluate the customer's body movement to identify limitations in their stability, mobility, and posture. We then provide an overall Movement score out of 100, details about stability, mobility, and posture, and a breakdown across body areas.
PhD in Electrical Engineering, Diploma in Physics, Corephotonics (Samsung)
Dr. Michael Scherer is enthusiastic about history, languages, and bringing new technologies to market. As a versatile engineer, he developed and marketed technologies for mobile photography, computer vision, organic electronics and printed sensors in Israel and Germany.
Autonomous Mobile Cameras
After a brief history of mobile photography, we show how Corephotonics uses the latest AI advancements and unique proprietary camera hardware to introduce the world’s first autonomous mobile camera.
Bella is a Ph.D. candidate at HUJI under the supervision of Prof. Leo Joskowicz, and a machine learning consultant. Prior to that, she worked for seven years at Philips as a senior scientist, developing and productizing medical imaging algorithms. Bella is also a community manager of the Haifa Machine Learning meetup.
Self-Training for Sequence Transfer: Bootstrapping State-of-the-Art Placenta Segmentation
Quantitative evaluation of the placenta is important for fetal health assessment. However, manual segmentation of the placenta is time-consuming and suffers from high observer variability. We present a method for bootstrapping automatic placenta segmentation by deep learning on different MRI sequences, without requiring annotations for each scanning sequence. The method consists of automatic segmentation of one sequence, followed by automatic adaptation to a new sequence via self-training with only additional unlabeled cases. It uses a novel combined contour and soft Dice loss function. The contour Dice loss and self-training approach achieve state-of-the-art placenta segmentation results and sequence-transfer bootstrapping.
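A common formulation of the soft Dice term in such a combined loss can be sketched as follows (a generic sketch of soft Dice; the talk's exact combined contour + soft Dice loss may differ):

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a predicted probability map vs. a binary mask.

    pred:   predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask of the same shape
    Returns ~0 for a perfect match, approaching 1 for no overlap.
    """
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
```

Unlike a hard Dice score, this version is differentiable in the predicted probabilities, which is what makes it usable as a training loss; the contour term in the combined loss additionally emphasizes boundary agreement.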
CEO & Founder, Cognata Ltd.
Danny Atsmon is a seasoned technologist and subject matter expert in the domains of machine learning, computer vision, and automated driving. Leveraging this unique expertise, Danny has been in the business of launching high-tech products for more than 20 years. He served as Harman’s (NYSE:HAR now Samsung) Director of Advanced Driver Assistance Systems and Senior Director of Machine learning, and has co-founded several successful technology companies operating at the intersection of artificial intelligence and automotive technology.
Depth Perception from Synthetic Data
Depth perception is a unique task: no real-life sensor provides direct ground truth for monocular depth estimation from a camera, which makes synthetic data the best source for this task.
In this session we present data handling techniques that improve the performance of monocular depth perception deep neural networks, utilizing Generative Adversarial Networks (GANs) and scalable data generation techniques.
Our results are based on the well-known KITTI dataset, and we present a benchmark conducted on it.
Manager, IBM Research AI
Leonid Karlinsky leads the AI Vision research group at IBM Research AI. Before joining IBM, he served as a research scientist at Applied Materials, Elbit, and FDNA. He serves on the program committees of, and actively publishes at, ECCV, ICCV, CVPR, AAAI, ICLR, and NeurIPS, and has been an IMVC steering committee member for the past four years. His recent research is in the areas of cross-domain, multi-modal, and low-supervision learning (including few-shot, self-supervised, and weakly supervised learning directions). He received his PhD degree at the Weizmann Institute of Science, supervised by Prof. Shimon Ullman.
Learning with Weak Supervision - from fine-grained recognition to text grounding in images
Any practitioner of computer vision, or machine learning in general, knows: supervision is an asset, and in some cases quite an expensive one!
In this talk, we will focus on two of our recent works highlighting what can be achieved with limited supervision, be it learning to recognize new fine-grained classes with few examples after pre-training with only coarse class labels, or learning to ground (localize) arbitrary free-text phrases in images while learning from image and text pairs without any image location supervision.
Guy Tamir is a technology evangelist in Intel's Software and Advanced Technology group. Bringing years of experience in hardware and software design, his main areas of interest and expertise are AI, computer vision, video processing, and heterogeneous parallel computing. He is an active YouTuber with the OpenVINO and oneAPI YouTube channels. Guy was the product manager of OpenVINO and has held various positions leading engineering teams designing Intel Core products. Guy holds an M.Sc. in computer engineering from the Technion and an MBA.
oneAPI – Cross architecture software solutions for the AI era
Software tools for the AI accelerator era: how to accelerate the AI development process from data preparation to deployment and serving using Intel software tools; how to program multi-device (CPU/GPU/AI accelerator) heterogeneous systems efficiently with oneAPI; and a quick overview and demos of Intel software tools to preprocess, visualize, train, and deploy ML/AI and analytics workloads.
Head of AI, Wix.com
Dr. Eli Brosh is the head of AI research at Wix, where he and his team are building creativity tools powered by machine learning and computer vision for media content editing and creation and website design intelligence. Prior to Wix, Eli held AI leadership positions in leading startups in the fields of visual driving analytics and smartphone-based medical diagnostics. Eli holds a Ph.D. in Computer Science from Columbia University and has authored multiple publications and patents.
Towards a Data-Centric AI Development
Wix provides a cloud-based website development platform for over 200 million users worldwide today. At Wix, we’re actively building creativity tools powered by machine learning and computer vision to help users build professional websites, from automatic image editing to website design understanding and more. Building such tools at Wix’s scale requires not only adapting standard models to specific application use cases, but also dataset engineering practices and supporting platforms to deal with constantly evolving data.
In this talk, we’ll present our computer vision pipelines and workflows, and our journey to move from a conventional 'model-centric' AI to a more ‘data-centric’ one. Throughout this journey, we significantly improved product velocity, created unique tools for MLOps, and established new roles. We learned that many challenges of evolving AI systems can be addressed by high quality data and systematic approaches to data labelling, curation and analytics.
Head of Data Science Operations, Wix.com
Computer Vision Department Manager, Percepto
Ovadya joined Percepto in January 2019 as a Computer Vision team leader. He has over 20 years of experience building computer vision solutions in industry, with companies such as Intel Corporation, Applied Materials, and PointGrab. Ovadya's last position was with Innoviz-Tech, where he was a member of the company's leadership team and headed the Computer Vision department, which he built. He has vast experience in computer vision applications, including deep learning and object detection and tracking in mass-produced products such as Samsung TVs.
He is a co-inventor of more than 30 patents and patent applications.
Ovadya holds an M.Sc. in computer vision from the Weizmann Institute of Science and a B.Sc. in Math and Computer Science from Bar-Ilan University.
Attention based change detection using transformers
Recently, attention-based neural networks have been shown to achieve state-of-the-art results on image classification using pre-training on huge datasets. We address the problem of change detection using transformers on a small change detection dataset. Following the Compact Transformers introduced by Hassani et al., our model uses both convolutions and the Compact Convolutional Transformer (CCT). We show that compared to pure CNNs, compact transformers have fewer parameters while obtaining similar accuracies. The method is modular in terms of model configuration; in this case, it has 4.1M parameters, a third of the parameter count of a pure CNN-based model with similar accuracy. In this work we demonstrate the potential of a state-of-the-art transformer-based change detection algorithm on the CD2014 dataset.
Senior Research Scientist, Nexar
Matan Friedmann is a research scientist at Nexar, a startup company dedicated to creating a network of connected vehicles for the future of mobility. His work focuses on utilizing large-scale datasets collected from real-world driving scenarios, in order to unlock value from fresh crowd-sourced data, and provide solutions for autonomous mapping systems. Matan received a BSc in physics, an MBA, and an MSc in astrophysics from Tel Aviv University, where he published several papers in leading journals in the fields of microlensing exoplanets and observational supernovae analysis using the Hubble Space Telescope.
Data-Driven Approach for 3D Collision Reconstruction
Understanding what happens on the road can be tricky. Autonomous driving and road insight platforms require the ability to untangle diverse and complex dynamic scenes. At Nexar, we use crowd-sourced data to train deep-learning models to reconstruct corner-cases from monocular views taken by our network of dashcams. In this talk, we’ll discuss how we leverage vision with sensor fusion to produce full temporal 3D reconstruction of corner-cases, and how we overcome obstacles originating from monocular views. We will also see how such systems can be used as stepping stones for robust scene reconstruction even when sensor fusion is not applicable.
Director, Sensory Motor Integration lab
Miriam Zacksenhouse received B.Sc. degrees in Mathematics and Physics from the Hebrew University of Jerusalem (1977), and in Mechanical Engineering from the Technion (1980). She was awarded her M.Sc. in Mechanical Engineering from MIT in 1982, and her Ph.D. in Electrical and Computer Engineering from Rice University in 1993.
Miriam joined the Faculty of Mechanical Engineering at the Technion in 1995, where she directs the Sensory-Motor Integration Laboratory, and the Brain-Computer Interfaces for Rehabilitation Laboratory.
Her research interests include computational motor control, neural modeling, brain-machine interfaces, bio-inspired robotics and machine learning.
Brain Computer Interfaces for hand control using convolutional neural networks
Brain computer interfaces (BCIs) provide direct communication between the brain and the external world. A widely used technique for measuring brain activity is electroencephalogram (EEG). EEG measurements from a number of electrodes (here 35) over time (here 900 msec) can be represented as 3-dimensional (3-d) images and analyzed using 3-d image processing techniques. In particular, convolutional neural networks (CNN) can be trained to classify spatial-temporal EEG patterns.
We developed 3-d CNNs for EEG classification to: (1) determine whether the user wants to close, open, or maintain the posture of the hands, and (2) determine whether the executed movement is the intended one. The latter is very important since BCIs are prone to mistakes. Detected errors can be corrected to improve overall accuracy. Our experiments demonstrate that CNN-EEG-based error correction can improve the accuracy of a CNN-EEG-based BCI for hand control by more than 7%.
Our research facilitates the development of comfortable and intuitive brain-controlled hand prosthesis that may help 3 million upper limb amputees worldwide.
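The representation described above — an electrode grid over time treated as a 3-d volume and processed with 3-d convolutions — can be sketched with a hand-rolled 3-d convolution. This is illustrative only: the grid and filter sizes are assumptions, and the actual classifiers were trained with a deep learning framework:

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid'-mode 3-d convolution (cross-correlation),
    as computed by the first layer of a 3-d CNN."""
    D, H, W = volume.shape
    d, h, w = kernel.shape
    out = np.zeros((D - d + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(volume[i:i+d, j:j+h, k:k+w] * kernel)
    return out

# EEG epoch as a 3-d "image": an assumed 5x7 electrode grid over
# 90 time samples (e.g. 900 ms at 100 Hz) -- sizes for illustration.
eeg = np.random.default_rng(1).normal(size=(5, 7, 90))
feat = conv3d_valid(eeg, np.ones((3, 3, 5)) / 45.0)  # a smoothing filter
```

Stacking such filters (with learned weights, nonlinearities, and pooling) lets the network pick up spatial-temporal EEG patterns jointly, rather than treating electrodes and time separately.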
Head of Computer Vision & Perception Research Group, General Motors
I am currently leading the computer vision & perception research group at General Motors (GM) Israel. Since 2009 I have been a researcher in the fields of computer vision and machine learning at GM. I received my B.Sc. degree (with honors) in mathematics and computer science from Tel-Aviv University in 2000, and my M.Sc. and PhD degrees in applied mathematics and computer science from the Weizmann Institute in 2004 and 2009, respectively. At the Weizmann Institute I conducted research in human and computer vision under the supervision of Professor Shimon Ullman. Since 2007 I have been conducting industrial computer vision research and development at several companies, including General Motors and Elbit Systems Israel.
Synthetic-to-real domain adaptation for lane detection
Accurate lane detection, a crucial enabler for autonomous driving, currently relies on obtaining a large and diverse labeled training dataset. In this work, we explore learning from abundant, randomly generated synthetic data, together with unlabeled or partially labeled target domain data, instead. Randomly generated synthetic data has the advantage of controlled variability in the lane geometry and lighting, but it is limited in terms of photo-realism. This poses the challenge of adapting models learned on the unrealistic synthetic domain to real images. To this end we develop a novel autoencoder-based approach that uses synthetic labels unaligned with particular images for adapting to target domain data. In addition, we explore existing domain adaptation approaches, such as image translation and self-supervision, and adjust them to the lane detection task. We test all approaches in the unsupervised domain adaptation setting in which no target domain labels are available and in the semi-supervised setting in which a small portion of the target images are labeled. In extensive experiments using three different datasets, we demonstrate the possibility to save costly target domain labeling efforts. For example, using our proposed autoencoder approach on the llamas and tuSimple lane datasets, we can almost recover the fully supervised accuracy with only 10% of the labeled data. In addition, our autoencoder approach outperforms all other methods in the semi-supervised domain adaptation scenario.
Algorithm Group Manager, Applied Materials Israel
Dr. Boris Levant is the Data Products Algorithm group manager at Applied Materials Israel. In this role he leads the algorithmic R&D of new products that take advantage of the huge amounts of data generated by Applied Materials tools in semiconductor manufacturers' fabs. In addition, Boris leads the company’s center of excellence for Algorithms and Deep Learning. Boris has over 15 years of experience leading algorithm development at various companies, previously at Nova and Landa Digital Printing.
Boris holds BSc from TAU and MSc and PhD from the Weizmann Institute of Science.
Unsupervised Anomaly Detection in Semiconductor Manufacturing
Defect detection plays a key role in preventing yield-loss excursion events in semiconductor manufacturing. In this talk we present the industry-standard detection pipeline, performed with Applied Materials electron-beam and optical inspection tools. We focus on two challenging real-world scenarios: one in which non-defective reference images of the pattern are available, and an unsupervised scenario in which they are not. In the “with reference” scenario we propose an efficient method based on the internal statistics of the images. For the completely unsupervised scenario, our approach builds on recent advances in self-supervised contrastive learning.
This research is joint work with Dr. Ran Yacoby, Bar Dubovsky, Ore Shtalrid, Dr. Nati Ofir, and Omer Granovitser.
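The "with reference" setting — scoring deviations from a non-defective reference against the images' own noise statistics — can be sketched generically. This is a hypothetical formulation for illustration, not Applied Materials' actual method:

```python
import numpy as np

def detect_anomalies(test, reference, k=3.0):
    """Flag pixels whose deviation from a non-defective reference image
    exceeds k robust standard deviations of the difference image, so
    that normal process variation is absorbed by the statistics."""
    diff = test - reference
    med = np.median(diff)
    # Robust noise-scale estimate via the median absolute deviation
    # (1.4826 scales MAD to the std of a Gaussian).
    sigma = 1.4826 * np.median(np.abs(diff - med))
    return np.abs(diff - med) > k * max(sigma, 1e-12)
```

The point of the robust (median-based) statistics is that a few defective pixels barely influence the estimated noise scale, so true defects stand out while ordinary pattern and illumination variation does not.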
Lead Computer Vision and Machine Learning Engineer, Apple
Brandon Joffe is a lead Computer Vision and Machine Learning engineer at Apple, working on depth sensing technologies. Previously, Brandon was the chief neural network architect and computer vision engineer at Camerai. He holds an undergraduate degree in electrical and computer engineering from the University of Cape Town (2016) and is currently pursuing a master's in Electrical Engineering at Tel Aviv University.
LiDAR Scanner and Depth Densification
94, Yigal Alon St.
Tel Aviv 6109202