Speakers

Gershon Celniker

R&D Lab Group Manager, General Motors

Bio:

Gershon Celniker is an R&D Lab Group Manager at GM, previously a Principal Data Scientist at Verint and Check Point and Chief Data Scientist at Wiser. He holds a BSc from Technion Institute and an MSc from Hebrew University in Bioinformatics and Machine Learning applications, with vast academic experience as a fellow CS researcher at the Weizmann Institute and Tel-Aviv University. Currently, his main areas of research interest lie in the design of AI and CV algorithms and their applications in the automotive industry.

Title:

Understanding and modeling gaze patterns in the automotive environment

Abstract:

Intuitively, when a person is relaxed and has no task to perform, one tends to look at salient objects in the field of view (bottom-up). As tasks are introduced and workload increases, one usually tends to select a more task-oriented gaze behavior (top-down), and a shift from salient-object-oriented gaze patterns to important-object-oriented gaze patterns can be observed. In the automotive environment, this shift between gaze pattern types and their linkage to the driver’s or passengers’ states suggests that modeling a gaze pattern can lead to an understanding of one’s state and vice versa.

 

Gaze patterns were modeled by training both deep learning networks and statistical models. Deep learning networks were trained to digest larger datasets effectively, and statistical models were selected for their simplicity and explainability. A set of experiments was conducted both in real-world setups and in simulated environments. The real-world experiments took place in Israel and the USA and modeled the behavior of drivers and passengers. Overall, our results supported our assumptions and can be divided into two types: prediction of expected gaze patterns given the environment, and establishing a linkage between gaze patterns and the driver’s and passenger’s state.

 

Roy Orfaig

Blue White Robotics and Tel-Aviv University

Bio:

Roy Orfaig specializes in the fields of AI, computer vision, and robotics for autonomous vehicles. He has been serving as a lecturer and an advisor for master's research at Tel-Aviv University, focusing on perception, localization and mapping applications for autonomous robots within the Department of Electrical Engineering.

 

Furthermore, he is an AI Tech Lead at Blue White Robotics, a startup that pioneers cutting-edge autonomous tractors for smart farming. Before joining Blue White Robotics, he held various key roles and gained extensive experience at top companies such as Applied Materials, Elta (within the autonomous ground robotics group), and Brodmann17. He holds an M.Sc. in Electrical Engineering from Ben-Gurion University.

Title:

CLRMatchNet: Enhancing Curved Lane Detection with Deep Matching Process

Abstract:

Lane detection is crucial for autonomous driving, furnishing indispensable data for safe navigation. Modern algorithms employ anchor-based detectors, followed by a label assignment process to categorize training detections as either positive or negative instances. However, the existing methods might be limited and not necessarily optimal because they rely on predefined classic cost functions with few calibration parameters.
Our research introduces MatchNet, a deep learning-based approach aimed at optimizing the label assignment process. Interwoven into a SOTA lane detection network such as CLRNet, MatchNet replaces the conventional label assignment process with a submodule network. This integration yields significant enhancements, particularly in challenging scenarios such as curve detection (+1.82%), shadows (+1.1%), and non-visible lane markings (+0.88%). Notably, this method boosts lane detection confidence level, enabling a 3% increase in the confidence threshold.

Daniel Duenias

M.Sc. student in Electrical and Computer Engineering, Ben-Gurion University of the Negev

Bio:

Daniel Duenias holds a B.Sc. in Computer Engineering (with honors) and is currently pursuing his M.Sc. in Electrical and Computer Engineering at Ben-Gurion University, mentored by Prof. Tammy Riklin Raviv. He collaborates with Prof. Tal Arbel from McGill University on his master's research, which is focused on multimodal data integration of medical imaging, aiming to leverage deep-learning models for enhanced medical imaging analysis.

Title:

HyperFusion: Imaging-Tabular Data Integration for Predictive Modeling in Healthcare

Abstract:

Integrating medical imaging with Electronic Health Records (EHRs) is crucial for comprehensive patient analysis. Deep Neural Networks excel in multimodal tasks in the medical domain, yet effectively merging medical imaging with clinical, demographic and genetic information represented as numerical tabular data remains a highly active research pursuit. We propose a novel hypernetwork-based framework for tabular-imaging fusion, where the image processing is conditioned on EHR values and measurements, thereby leveraging them to enhance the predictive results. Tested on brain MRI tasks, including age prediction and Alzheimer's classification, our method shows generality and outperforms single-modality models and existing fusion methods.


 

Shachar Ben Dayan

Applied Scientist, Amazon Prime Video Sports

Bio:

Shachar is an applied scientist at Amazon, Prime Video Sports division. Specializing in computer vision and deep learning, she holds an M.Sc. in Electrical Engineering from Tel Aviv University, with research focused on Light Field photography.

Title:

Sports, Computer Vision and AI

Abstract:

American Football, a complex sport of strategy, is the most popular sport in the US, engaging tens of millions of fans every week. Amazon owns exclusive broadcast rights for Thursday Night Football (TNF) and is working to create a unique viewing experience, presenting new analytic features and enhanced graphics to help fans get more out of the game. This lecture will give a peek into the new features of the 2023 season, covering two ML-powered features that are based on player tracking data collected with low latency in NFL venues.

Miri Kenig

Tel Aviv University

Bio:

Miri Kenig is a physicist studying the ability of generative AI to learn, generalize, and explore quantum reality. She spearheads the application and training of GenAI to explore quantum systems. With a background in physics (BSc and MSc degrees with honors), Miri led groundbreaking research that developed the world’s first deep-learning algorithm capable of learning quantum processes from examples only. Her pioneering research was published in the physics journal Physical Review A (PRA), highlighted at the leading Israel Physical Society (IPS) conference in 2023, and most recently covered in Ynet Science. Miri is currently furthering this approach to analyze and explore the implications of creating scientific AI to explore as-yet-unsolved physical phenomena, with the aim of fostering breakthrough insights and scientific progress.

Title:

Exploring and analyzing quantum dynamics with generative AI

Abstract:

In this talk, I will show that generative models can learn the dynamics of interacting quantum particles on disordered chains, a general scenario underlying a wide range of physical problems, from many-body quantum physics to quantum computation. Our algorithm learns complex quantum correlations from unlabeled examples and can then generate new physically valid instances with tunable physical parameters. This enables post-training exploration of the problem space, revealing underlying physical phenomena and accelerating the learning of more complex problems. These results suggest a general framework for generative AI in physical analysis and discovery. 

Dr. Alona Strugatski Faktor

Postdoctoral fellow, Weizmann Institute of Science

Bio:

Alona Strugatski-Faktor is a Postdoctoral fellow at the Weizmann Institute of Science. Alona's research focuses on cognitive capabilities of AI models and visual scene interpretation. She is specifically interested in combining human vision research with state-of-the-art AI models. Alona holds a B.Sc. in Physics and Electrical Engineering from the Technion, an M.Sc. in Electrical Engineering from Tel Aviv University and a PhD in Mathematics and Computer Science from the Weizmann Institute of Science.

Title:

Why Do Visual-Language Models Struggle with Scene Structure Extraction?

Abstract:

Despite the huge breakthroughs in vision-language models, they are still far from achieving human-level scene understanding and have several fundamental limitations. We show that these models are not able to perform simple tasks such as answering questions about the locations of objects and the relations between them. We suggest a model which can naturally answer such questions and achieve scene understanding even for complex scenes. It does this by using an iterative goal-driven approach that resembles human vision. Our model is able to focus its attention in each iteration on the relevant parts of the scene and thus iteratively build a complex understanding of the scene.

Yochai Yemini

PhD student, Bar-Ilan University and OriginAI

Bio:

Yochai Yemini is a PhD student at Bar-Ilan University, under the supervision of Prof. Sharon Gannot and Dr. Ethan Fetaya. He is also a deep learning researcher at OriginAI. His areas of interest include computer vision, speech processing and their intersection, and his current research focuses on deep learning methods for audio-visual tasks.

Title:

LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading

Abstract:

In the lip-to-speech task, the objective is to accurately generate the missing speech for a soundless video of a person talking. It is required, e.g., when the speech signal is completely obfuscated by background noises. In this talk, I will present LipVoicer, a novel approach for producing high-quality speech for in-the-wild silent videos. LipVoicer leverages the transcription of the speech we wish to generate as predicted by a lip-reading model, and a diffusion model conditioned on the video to generate mel-spectrograms. LipVoicer achieves exceptional results, and the generated speech sounds natural and synchronized to the lip motion. 

Boris Greenberg

VP of XR Solutions, VoxelSensors

Bio:

Boris Greenberg leads the Spatial and Empathic Computing Solution team at VoxelSensors. He has over 21 years of expertise in multi-disciplinary R&D within the high-tech industry and academia. His previous roles include founding EyeWay Vision, I.C. Inside, and serving as an R&D lead in Automated Optical Inspection at Orbotech. Notably, Mr. Greenberg holds more than 20 patents and pursued his studies in physics at the Hebrew University of Jerusalem.

Title:

Low-Power, Low-Latency Perception for XR

Abstract:

XR devices immerse users in augmented realities, seamlessly merging digital and physical realms. Achieving this demands advanced perception technology that is resilient across environments, low in power usage, and has minimal latency. Yet, existing solutions struggle to meet this demanding combination of requirements, even with Apple’s Vision Pro setting a new standard for XR glasses’ 3D perception.

VoxelSensors’ Active Event Sensors (AES) enable robust, low-power, low-latency 3D sensing using laser triangulation. This innovation enhances SLAM, odometry, gesture recognition, and tracking, potentially revolutionizing augmented reality experiences. Boris will outline this groundbreaking approach and perspectives in XR 3D perception.

 

Ofer Lavi

CEO, dataspan.ai

Bio:

Over his long career in machine learning, Ofer has learned that data is the biggest obstacle to implementing successful AI projects. His company, dataspan.ai, uses generative artificial intelligence to assist teams in creating better computer vision applications.


His last position was as a program manager for IBM Research AI, responsible for natural language processing and artificial intelligence for customer care. Prior to that, he managed IBM Research Haifa's Machine Learning Technology group. Bringing AI from research to production, he published more than twenty peer-reviewed papers and patents.

Title:

Can AI train AI?

Abstract:

This talk will demonstrate how Generative AI can be adapted to augment datasets for computer vision training. Utilizing diffusion models, we enhance dataset quality by seamlessly implanting concepts into images, improving downstream model performance. We provide an algorithmic framework for localizing the appropriate place for implanting a concept and for the actual generation of the concept given the background. To address the stochastic nature of diffusion models, which may generate images that do not contribute to the improvement of downstream models, we employ clustering and filtering, maximizing dataset relevance. We demonstrate the methods on both public and real-world datasets.

Omri Danziger

Computer Vision Researcher, Foresight Autonomous

Bio:

Omri is a computer vision researcher on the R&D team of Foresight Autonomous.

His main research topics are sensor pose estimation and 3D reconstruction. He holds a BSc in Computer Science from Ben-Gurion University.

Title:

Consistent Pixel Matching between different cameras using individual temporal updates

Abstract:

Matching points across images from different cameras is commonly used in vision systems for a variety of purposes such as 3D reconstruction and field calibration. Systems doing so over time are often designed to achieve high consistency, meaning that matches may change but do not jitter between solutions or errors. The presented method estimates consistent matches in a dynamic environment using dense optical flows between the consecutive images of each individual camera.

Sapir Kontente

Tel Aviv University

Bio:

Sapir holds a B.Sc. in Physics and Electrical Engineering from Tel Aviv University. She is currently pursuing her M.Sc. in Electrical Engineering at TAU, focusing on detection for autonomous driving under the supervision of Prof. Ben-Zion Bobrovsky and Roy Orfaig. Additionally, Sapir works as an Algorithm Engineer at Samsung R&D Center in the field of image processing.

Title:

CLRMatchNet: Enhancing Curved Lane Detection with Deep Matching Process

Abstract:

Lane detection is crucial for autonomous driving, furnishing indispensable data for safe navigation. Modern algorithms employ anchor-based detectors, followed by a label assignment process to categorize training detections as either positive or negative instances. However, the existing methods might be limited and not necessarily optimal because they rely on predefined classic cost functions with few calibration parameters.
Our research introduces MatchNet, a deep learning-based approach aimed at optimizing the label assignment process. Interwoven into a SOTA lane detection network such as CLRNet, MatchNet replaces the conventional label assignment process with a submodule network. This integration yields significant enhancements, particularly in challenging scenarios such as curve detection (+1.82%), shadows (+1.1%), and non-visible lane markings (+0.88%). Notably, this method boosts lane detection confidence level, enabling a 3% increase in the confidence threshold.

Hamza Murad

MD, Dept. of Orthopedics B and Spine Surgery, Galilee Medical Center, Nahariya, Israel

Bio:

I am Hamza Murad, currently a resident in Orthopedic Surgery at Galilee Medical Center. My academic journey includes a solid biology education at Technion Institute of Technology, accompanied by medical studies at Hebrew University in Jerusalem. Proficient in Python programming, I am particularly drawn to the application of unsupervised techniques in skeletal radiology. By merging my medical expertise with technical skills, I am eager to offer a distinctive viewpoint at the upcoming Computer Vision Conference, where the fusion of medical imaging and technology takes center stage.

Title:

Clustering-based Detection of Occult Osteoporotic Fractures using Machine Learning and CT Scans

Abstract:

Osteoporotic vertebral compression fractures (VCFs) in the elderly pose significant quality-of-life challenges. A fraction of VCFs are occult and cannot be distinguished from normal vertebrae by traditional imaging such as X-rays and CT scans. Tc99 bone scans are usually used to identify occult fractures. We propose a data-driven solution employing machine learning and computer vision to identify unique radiological patterns in occult VCFs. Our method, using only CT scan data of 24 vertebrae, successfully segments vertebrae into clusters, revealing distinct volume ratios that distinguish normal vertebrae from those with occult fractures. Importantly, we identified that vertebral posterior element volumes aid occult fracture identification and may play a role in the pathology of VCFs. This approach highlights the potential of machine learning in enhancing skeletal condition diagnosis, bridging inter-modality gaps.

 

Loai AbdAllah

Ph.D., Senior Lecturer at the Department of Information Systems, Max Stern Yezreel Valley College

Bio:

I am Dr. Loai Abdallah, and I have honed my expertise in data analysis and artificial intelligence over 15 years. I hold a senior lecturer position at the Department of Information Systems at the Max Stern Yezreel Valley College. My main research focuses on data mining and big data. In addition to my academic pursuits, I am the founder and CEO of xBiDa, a company at the forefront of AI for big data and computer vision.

Title:

Clustering-based Detection of Occult Osteoporotic Fractures using Machine Learning and CT Scans

Abstract:

Osteoporotic vertebral compression fractures (VCFs) in the elderly pose significant quality-of-life challenges. A fraction of VCFs are occult and cannot be distinguished from normal vertebrae by traditional imaging such as X-rays and CT scans. Tc99 bone scans are usually used to identify occult fractures. We propose a data-driven solution employing machine learning and computer vision to identify unique radiological patterns in occult VCFs. Our method, using only CT scan data of 24 vertebrae, successfully segments vertebrae into clusters, revealing distinct volume ratios that distinguish normal vertebrae from those with occult fractures. Importantly, we identified that vertebral posterior element volumes aid occult fracture identification and may play a role in the pathology of VCFs. This approach highlights the potential of machine learning in enhancing skeletal condition diagnosis, bridging inter-modality gaps.

 

Dr. Elad Levi

Senior Machine Learning Engineer, Sightful

Bio:

Elad Levi is a machine learning engineer at Sightful, a startup that is creating the first AR laptop. His work focuses on leveraging multimodal inputs (in particular vision and language) in order to build a novel AR operating system. Elad received a PhD degree in mathematics from the Hebrew University. His thesis was in the field of model theory, with applications to combinatorial problems.

Title:

Democratizing Large Language Models

Abstract:

Large language models (LLMs) have emerged as a breakthrough technology, exhibiting remarkable performance across a wide range of tasks. Until recently, the development of LLMs seemed constrained by high barriers, resulting in a few companies dominating the field. However, recent advancements in the field have significantly lowered these barriers, enabling the development of high-quality LLMs with a limited amount of effort and computational resources.

In this tutorial, we will explore the challenges involved in building LLMs, the developments that allow building such high-performance custom models with a small amount of resources, and the new possibilities they unlock, including multimodal extension and expanded context windows.

Eyal Hanania

MSc student in the Electrical and Computer Engineering faculty, Technion

Bio:

Eyal is currently pursuing his MSc in Electrical and Computer Engineering at the Technion, mentored jointly by Dr. Moti Freiman (Faculty of Biomedical Engineering) and Prof. Israel Cohen (Faculty of Electrical and Computer Engineering). His ongoing research is centered on creating deep-learning models with physical constraints for motion correction in medical imaging. Alongside his academic work, Eyal serves as an AI Research Intern at GE Research. He brings several years of industrial experience as an algorithm and computer vision engineer to his role. He earned his B.Sc in Electrical and Computer Engineering from the Technion.

Title:

Free-breathing myocardial T1 mapping with Physically-Constrained Motion Correction

Abstract:

T1 mapping is a quantitative MRI technique that has emerged as a valuable tool in the diagnosis of diffuse myocardial diseases. However, prevailing approaches have relied heavily on breath-hold sequences to eliminate respiratory motion artifacts. This limitation hinders accessibility and effectiveness for patients who cannot tolerate breath-holding. We address this limitation by introducing PCMC-T1, a physically-constrained deep-learning model that accounts for the signal decay along the longitudinal relaxation axis for motion correction in free-breathing T1 mapping. PCMC-T1 demonstrated superior results compared to baseline methods using a 5-fold experimental setup on a publicly available dataset of 210 patients.

Dr. Ravid Shwartz Ziv

Assistant Professor and Faculty Fellow, New York University (NYU)

Bio:

Ravid Shwartz-Ziv is currently a CDS Assistant Professor and Faculty Fellow at the NYU Center for Data Science. He collaborates with Prof. Yann LeCun, focusing on neural networks, information theory, and self-supervised learning. Ravid's research aims to dissect the complexities of deep neural networks to enhance their efficiency and effectiveness. He is particularly intrigued by what defines a 'good' representation in machine learning and explores its impact on various applications. His work also delves into data compression and its implications for machine learning, as well as investigating the essential components for effective learning and the dynamics of training algorithms.

 

Title:

Decoding the Information Bottleneck in Self-Supervised Learning: Pathway to Optimal Representations and Semantic Alignment

Abstract:

Deep Neural Networks (DNNs) have excelled in many fields, largely due to their proficiency in supervised learning tasks. However, the dependence on vast labeled data becomes a constraint when such data is scarce.
Self-Supervised Learning (SSL), a promising approach, harnesses unlabeled data to derive meaningful representations. Yet, how SSL filters irrelevant information without explicit labels remains unclear.
In this talk, we aim to unravel the enigma of SSL using the lens of Information Theory, with a spotlight on the Information Bottleneck principle. This principle, while providing a sound understanding of the balance between compressing and preserving relevant features in supervised learning, presents a puzzle when applied to SSL due to the absence of labels during training.
We will delve into the concept of 'optimal representation' in SSL, its relationship with data augmentations, optimization methods, and downstream tasks, and how SSL training learns and achieves optimal representations.
Our discussion unveils our pioneering discoveries, demonstrating how SSL training naturally leads to the creation of optimal, compact representations that correlate with semantic labels. Remarkably, SSL seems to orchestrate an alignment of learned representations with semantic classes across multiple hierarchical levels, an alignment that intensifies during training and grows more defined deeper into the network.
Considering these insights and their implications for class set performance, we conclude our talk by applying our analysis to devise more robust SSL-based information algorithms. These enhancements in transfer learning could lead to more efficient learning systems, particularly in data-scarce environments.
Joint work with Yann LeCun, Ido Ben Shaul, and Tomer Galanti.

 

Ron Shapira Weber

Ph.D. Student, Ben Gurion University

Bio:

Ron Shapira Weber is a Ph.D. student at Ben Gurion University (BGU) in the Vision, Inference, and Learning (VIL) group, under the supervision of Dr. Oren Freifeld at the Computer Science Dept. His interest areas include time series analysis and computer vision, with applications to time series joint alignment and averaging, image registration, and video analysis. He did his master's in Cognitive Science at BGU as part of the VIL group, under Dr. Oren Freifeld, and of the Computational Psychiatry Lab under Dr. Oren Shriki. Between 2019 and 2021, he worked as an algorithm researcher at BeyondMinds.

Title:

Regularization-free Diffeomorphic Temporal Alignment Nets

Abstract:

In time-series analysis, nonlinear temporal misalignment is a major problem that forestalls even simple averaging. An effective learning-based solution for this problem is the Diffeomorphic Temporal Alignment Net (DTAN), which, by relying on a diffeomorphic temporal transformer net and the amortization of the joint-alignment task, eliminates the drawbacks of traditional alignment methods. Unfortunately, existing solutions for the joint alignment problem crucially depend on a regularization term whose optimal hyperparameters are dataset-specific and usually searched via a large number of experiments. Here we propose a regularization-free DTAN that obviates the need to perform such an expensive, and often impractical, search. Concretely, we propose a new well-behaved loss that we call the Inverse Consistency Averaging Error (ICAE), as well as a related new triplet loss and support for joint alignment of variable-length signals. Our code is available at https://github.com/BGU-CS-VIL/RF-DTAN.

 

Ruth Bergman

CTO and VP Software Engineering, Edison Data, GE HealthCare

Bio:

Dr. Ruth Bergman serves as the Chief Technology Officer at Edison Data within GE HealthCare's Science and Technology Organization. Her team's focus lies in establishing a unified data fabric to ensure consistency in data aggregation, normalization, and exchange across healthcare devices and applications. This comprehensive data fabric incorporates diverse patient data, spanning medical images, waveforms, labs, pathology, and genomic profiles. This integrated data accelerates analytics and machine learning efforts, accessible via open-standard Application Programming Interfaces (APIs). Dr. Bergman's prior achievements include spearheading the development of Graffiti, the first FDA Cleared Clinical Virtual Assistant, and a cloud-based collaboration tool for clinicians, both aimed at enhancing patient care and preventing sepsis-related deterioration. Her expansive experience encompasses leadership roles at GE Global Research and Hewlett Packard Labs Israel, underpinned by a profound technology background encompassing machine learning, artificial intelligence, computer vision, and algorithms. Dr. Bergman holds a PhD in Electrical Engineering and Computer Science from MIT, along with a wealth of patents and academic publications.

Title:

Navigating the AI Landscape in Healthcare: Striking the Balance Between Uncertainty and Risk

Abstract:

Risk management is paramount in all enterprises, and particularly in healthcare, where patient safety takes precedence. Flaws in design or product functionality can result in treatment delays, patient harm, and reputational damage. AI, exemplified by models like ChatGPT and DALL-E, has the potential to revolutionize digital healthcare, enabling more patient-focused care and informed interactions. However, AI's inherent uncertainty and the risk of generating inaccurate or misleading outputs pose challenges. This discussion explores the delicate balance between leveraging AI's capabilities and safeguarding patient safety, highlighting the need for responsible implementation in healthcare settings.

Michael Baltaxe

Senior Researcher, General Motors

Bio:

Michael Baltaxe is a senior researcher at General Motors. He works on machine learning and computer vision projects in the automotive field, focusing especially on scene understanding using multiple viewing sensors and 3D point clouds. His research strives to improve machine perception in complex scenarios by harnessing multi-modal data gathered in efficient manners. Previously, Michael held algorithm development positions at Microsoft and Orbotech. He holds an M.Sc. in Computer Science from the Technion.

Title:

Polarimetric Imaging for Perception

Abstract:

Autonomous driving and advanced driver-assistance systems rely on sensors and algorithms to perform appropriate actions. Typically, the sensors include color cameras, radar, lidar and ultrasonic sensors. Strikingly, however, although polarization is a fundamental property of light, it is seldom harnessed for perception tasks. Here, we analyze the potential for improvement when using an RGB-polarimetric camera for the tasks of monocular depth estimation and free space detection, as compared to using a standard RGB-only camera. We show that quantifiable improvement can be achieved using state-of-the-art neural networks, with minimal architectural changes. Additionally, we introduce an open dataset with RGB-polarimetric images, lidar scans, GNSS / IMU readings and free space segmentations that can be used by the community for new research.

Prof. Ilan Tsarfaty

Head of Lab, Department of Clinical Microbiology and Immunology, Faculty of Medical and Health Sciences, TAU

Bio:

My research harnesses molecular biology, pathomics, radiomics, and AI to explore carcinogenesis mechanisms. Together with my team, we cloned the Muc1 gene, establishing it as a crucial breast cancer marker. Our work has illuminated the MET tyrosine kinase receptor's significance in cell transformation and metabolic alteration via Mimp/MTCH1 induction, another gene we have cloned. Currently, we are pioneering personalized medicine, focusing on identifying unique signatures and therapeutic targets for individuals with inherited MET, p53, BRCA1, and BRCA2 mutations. This effort aims to revolutionize cancer prevention and treatment, offering hope for more effective, customized interventions.

Title:

From Cancer Research to (IS-AI4VI) - Iron Sword War AI for Victim Identification

Abstract:

The Iron Swords War brought profound tragedy, notably in identifying countless victims under challenging conditions. Iron Sword Artificial Intelligence Victim Identification (IS_AI4VI) is developing an AI tool to facilitate victim identification, leveraging the expertise of over 80 volunteers from academia, the medical sector, and industry. This AI-driven initiative focuses on matching post-mortem CT scans with ante-mortem records to identify victims accurately. Operating under stringent data integrity and confidentiality protocols on a secure AWS cloud endorsed by the Ministry of Health, IS_AI4VI prioritizes ethical data handling. By streamlining the collection, anonymization, and analysis of medical data, and seeking Helsinki approval from Israeli hospitals, IS_AI4VI is dedicated to refining the process of victim identification.

Shila Ofek-Koifman

IBM

Bio:

Shila Ofek-Koifman is a Director for Language Technologies in IBM Research AI. Shila manages the AI- Language & Media area in the Haifa Research Lab, and co-leads Research AI’s strategy in the area of Natural Language Understanding, and aspects of the AI-Driven Customer Care strategy, including research on natural language generation, document understanding, summarization, neural information retrieval, conversation and Large Language models. Shila works closely with the Watson products, and under her leadership, her teams deliver differentiating research technologies into the products. Shila received multiple IBM awards for her research work and contributions to the business, including an IBM Corporate award and the "Best of IBM" award.

Title:

What’s next in Multimodal Learning for Enterprises

Abstract:

Two mostly separate fields of machine learning -- computer vision and natural language processing -- have gradually become closer in recent years.

Advancements in each field have greatly influenced the other, driven in part by the abundance of weakly annotated data in the form of image-text pairs. These advancements brought focus to the multimodal Vision-Language models (VL) which jointly process images and free-text.

At IBM Research, we have focused our research on multimodal learning: its limitations, its applications, and the adaptation of these models to the world of business documents.

In this talk, we will cover our latest work in the VL field, with topics such as Foundation Models for Expert Task Applications, Understanding Structured Vision and Language Concepts, and more.

Yair Adato

Co-founder & CEO BRIA AI

Bio:

Dr. Yair Adato, Co-founder & CEO of BRIA AI, is a visionary in his field. He holds a PhD in Computer Vision from Ben-Gurion University and has conducted joint research with Harvard University. With 67 patents in machine learning and AI, Dr. Adato boasts a remarkable innovation record.

 

Before leading BRIA, Dr. Adato served as CTO at Trax Retail, where he was pivotal in propelling the company from startup to unicorn status. His expertise extends beyond BRIA: he offers advisory guidance to prominent firms such as Sparx, Vicomi, Tasq, DataGen, and Anima.

Title:

Solving the big problems for Visual Generative AI

Abstract:

The biggest and most challenging problems when using this amazing technology in a commercial setting are not necessarily algorithmic in nature. This talk suggests that the biggest problems relate to training data and responsible AI, to the accessibility of models to the community, and lastly, to how Visual Generative AI can be used to create commercial impact. Specifically, we will focus on solving one hard problem: attribution and transparency in Visual Generative AI.

 

Hila Chefer

PhD Candidate Tel Aviv University & Google Research

Bio:

Hila is a PhD candidate at Tel-Aviv University, advised by Prof. Lior Wolf. Her research focuses on constructing faithful explainable AI algorithms for classifiers and generative models, and leveraging explanations to promote model accuracy and robustness. 

Title:

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

Abstract:

Can a diffusion process be corrected after taking a wrong turn? We present Attend-and-Excite, a novel method that guides a text-to-image diffusion model to attend to all subjects in a text prompt and strengthen — or excite — their activations, encouraging the generation of all subjects in the prompt.

Dr. Dori Peleg

Sightful

Bio:

Dori Peleg is Sightful’s Sr. Director of Algorithms. He has a PhD in Electrical Engineering from the Technion – Israel Institute of Technology and was a lecturer for a graduate optimization course. Dori’s technical expertise is machine learning and optimization. He has led AI and algorithms teams for 15 years in companies such as Cortica, Given Imaging, Medtronic and Sightful. At Medtronic, the world’s largest medical device company, he was a technical and Bakken fellow and led the AI for the Gastrointestinal division. He also initiated and led Medtronic's AI conference and mentorship program.

 

Title:

How to make a useful AR product in the balance of AI, HW and UX

Abstract:

Sightful is committed to moving AR beyond the hype and to creating solutions that are immediately valuable and intuitive. In this talk, join Dori Peleg, Sr. Director of Algorithms, as he presents the first AR laptop and discusses how we build perception algorithms by synthesizing and balancing Artificial Intelligence (AI), hardware (HW) engineering and user experience (UX) design.

Assaf Hoogi

PhD, School of Computer Science Ariel University

Bio:

Assaf is a senior lecturer at Ariel University, leading the Computer Vision and Deep Learning lab. His research stands on the theoretical-practical line, improving core elements of deep learning by proposing adaptive solutions for optimization, data normalization, and regularization. These improvements aim to enhance accuracy, robustness, and efficiency in both natural and medical computer vision, addressing their significant challenges. Assaf holds a BSc in biomedical engineering from Ben Gurion University and MSc/PhD in biomedical signal and image processing from Technion. He completed a postdoc at Stanford University and received the Young Investigator Award from NCI-NIH for his exceptional contributions to medical imaging.

Title:

Leveraging the Triple Exponential Moving Average for Fast-Adaptive Moment Estimation

Abstract:

To enhance deep network performance, the precision and efficiency of optimizers in recognizing gradient trends are crucial. Existing optimizers primarily rely on first-order Exponential Moving Averages, resulting in noticeable delays and suboptimal performance. We introduce the Fast-Adaptive Moment Estimation (FAME) optimizer. FAME leverages a higher-order Triple Exponential Moving Average (TEMA, inspired by the financial domain) to improve gradient trend identification. Here, TEMA actively influences optimization dynamics, unlike its passive role in finance. FAME excels in identifying gradient trends accurately, reducing lag and offering smoother responses to fluctuations compared to first-order methods. Results showed FAME’s superiority. It minimizes noisy trend fluctuations, enhances robustness, and boosts accuracy in significantly fewer training epochs than existing optimizers.
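For readers unfamiliar with the Triple Exponential Moving Average, the following is a minimal sketch of the standard TEMA formula from the financial domain, applied to a plain sequence of numbers. It is not the FAME update rule itself, which the abstract does not spell out; the function names and the smoothing factor `alpha` are illustrative assumptions.

```python
def ema(values, alpha):
    """First-order exponential moving average, seeded with the first value."""
    smoothed = []
    prev = values[0]
    for v in values:
        prev = alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def tema(values, alpha):
    """Standard TEMA: 3*EMA1 - 3*EMA2 + EMA3, where each EMA smooths the previous one."""
    e1 = ema(values, alpha)
    e2 = ema(e1, alpha)
    e3 = ema(e2, alpha)
    return [3 * a - 3 * b + c for a, b, c in zip(e1, e2, e3)]


# Illustrative comparison on a steadily rising signal (hypothetical data):
# the plain EMA trails the ramp, while TEMA tracks it much more closely.
ramp = [float(t) for t in range(50)]
ema_last = ema(ramp, 0.2)[-1]
tema_last = tema(ramp, 0.2)[-1]
```

On a linear trend, the three nested averages lag by different amounts, and the 3, -3, 1 weighting cancels most of that lag. This is the lag-reduction property the abstract refers to.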

Dr. Eli Brosh

Head of AI Research Wix.com

Bio:

Dr. Eli Brosh is the Head of AI Research at Wix, where he is working on future-looking technologies for website building using language and vision-based models. His research focuses on applying deep learning models and utilizing multimodal inputs for graphic design systems and layout generation. Prior to Wix, Eli held leadership positions in top companies in the fields of visual driving analytics and medical diagnostics. Eli holds a PhD in Computer Science from Columbia University and is the author of more than 30 publications and patents.

Title:

Generative AI in graphic design: challenges and opportunities

Abstract:

In this talk, we focus on the layout generation process, an essential ingredient of graphic design applications. We discuss the main challenges in the field, describe the different solution approaches, and introduce our recently proposed method, DLT, which consists of a novel joint discrete-continuous diffusion process, and highlight its effectiveness for conditioned layout generation.

Andrey Gurevich

Algorithmic Team Leader Mobileye

Bio:

Andrey Gurevich is an Algorithmic Team Lead in Mobileye's AI Engineering group, developing computer vision systems for autonomous driving. At Mobileye Andrey leads research on unsupervised knowledge distillation, efficient finetuning, neural architecture search, and auto-regressive models for object detection. He holds an MSc in Electrical and Computer Engineering from Ben-Gurion University, where his research and publications were focused on sequential anomaly detection under the supervision of Prof. Kobi Cohen.

Title:

Less is More

Abstract:

In this talk, I will introduce SubTuning, a novel parameter-efficient finetuning method for neural networks that selectively trains a subset of the layers. This approach is based on the observation that the utility of layers in a pretrained model varies when adapting to a target task, influenced by factors such as model architecture, pretraining tasks, and data volume. By leveraging this observation, SubTuning carefully chooses the layers to finetune, providing a flexible method that outperforms conventional methods in scenarios with scarce data while also enabling efficient inference in multi-task settings.

Efrat Shimron

Technion – Israel Institute of Technology

Bio:

Efrat Shimron is an assistant professor at the Technion, with dual affiliation to the departments of Electrical and Computer Engineering and Biomedical Engineering. She was previously a postdoctoral fellow at UC Berkeley. Her research spans the development of Compressed Sensing and AI algorithms for medical imaging, focusing on magnetic resonance imaging (MRI). She also investigates topics of bias in AI models; her work on identifying “data crimes” in medical AI was published in the Proceedings of the National Academy of Sciences (PNAS) journal. Efrat recently received several career awards, including MIT’s Rising Star in Electrical Engineering and Computer Science award.

Title:

Data Crimes: The Risk in Naive Training of Medical AI Algorithms

Abstract:

In contrast to the computer vision field, where large open-access databases are abundant, the medical AI field suffers from data scarcity. Specifically, datasets of raw magnetic resonance imaging (MRI) measurements are small and quite limited. This poses a challenge for training AI algorithms for certain tasks, e.g. image reconstruction from MRI measurements. A common workaround is to download non-raw datasets that were published for other tasks, such as tumor segmentation, and use them for synthesizing “raw” MRI data and training reconstruction algorithms. Nevertheless, this could lead to biased results. In this talk I will describe how the bias emerges from such naïve workflows, and how it leads to fantastic, overly-optimistic results, which are too good to be true. Moreover, I will show that algorithms trained on synthesized data could later fail in clinical settings and miss important details. Next, I will introduce a new framework, titled “k-band”, which our team developed to address this challenge. The k-band framework enables training MRI reconstruction algorithms using only limited data, in a self-supervised manner, and hence reduces the need for massive datasets.

Dr. Maor Farid

Co-Founder and CEO Leo AI

Bio:

Dr. Maor Farid (Co-Founder & CEO, Leo AI) is an AI and Chaos Theory scientist and lecturer at the Technion. He previously served as a Fulbright postdoctoral fellow at MIT and Israel's representative at Harvard’s leadership program. During his military service in the IDF, he served as a researcher and commander in the Brakim excellence program, the Israeli Prime Minister's Office, and Unit 8200 (Captain), and was acknowledged as a Distinguished Scientist (top 3 scientists in the IDF). He completed his Ph.D. with the highest honors as the youngest graduate at the Technion, at the age of 24. Dr. Farid is the recipient of some of the most prestigious academic awards, including Israel's National Academy Award, and Israel's Ministry of Science and Technology award for groundbreaking research. He is also the founder of the Center of Israeli Scholars at MIT (ScienceAbroad, NGO) and of the NGO "Learn to Succeed" for empowering youths at risk, and the author of a bestseller of the same name. Dr. Farid is a member of the Forbes 30 Under 30 list.
 

Title:

GenAI & the Next Industrial Revolution - How will humanity engineer the future?

Abstract:

In today's landscape, engineering design remains mostly manual, posing significant challenges in translating market and product requirements into engineering concepts, technical specifications, and 3D computational (CAD) models. This labor-intensive process results in extended Time to Market (TTM), often causing organizations to lag behind their competitors. While the potential of Generative Artificial Intelligence (GenAI) is promising, existing general-purpose Large Language Models (LLMs) and Deep Learning models struggle to comprehend the complexity of engineering systems. A tailored engineering-specific solution is essential.

 

Raphael Mamane

Researcher in the AI Automotive group Nexar Inc.

Bio:

Raphael Mamane is a researcher in the AI Automotive group at Nexar, a startup company dedicated to creating a network of connected vehicles for the future of mobility. There, Raphael is part of the autonomous mapping team which focuses on the creation of high-definition road maps using crowd-sourced vision datasets from AI-powered dashcams. These maps are built as a precise and scalable solution for smart driving platforms. Prior to joining Nexar, Raphael conducted theoretical research at the Racah Institute of Physics. Raphael holds an M.Sc. in Physics from the Hebrew University of Jerusalem.

Title:

Scalable HD-map creation from crowd-source vision

Abstract:

Creating and updating HD maps for autonomous driving is critical yet prohibitively expensive at scale. This presentation delves into our scalable solution, relying on crowd-sourced vision data from AI-powered dashcams, while addressing mixed-fleet localization accuracy challenges, and without the use of expensive lidars. Utilizing deep learning in computer vision and structure-from-motion components, we generate precise 3D point clouds with dense 3D representations of various road assets, and provide high levels of asset localization. Join us as we explore the technical intricacies of this approach, offering insights into its potential to revolutionize autonomous navigation. 

Noam Tal

Algorithm Manager Applied Materials

Bio:

I hold a BSc and MSc in Physics with research in the field of Superconductivity.

I have 13 years of experience in the Semiconductors industry in leading companies like Intel, Nova, and Applied Materials filling various positions in data science and algorithm development.

I am a co-inventor of 10 patents in this field, in machine learning and deep learning.

Currently, I lead an algorithm group developing deep learning solutions for next-generation, industry-specific Anomaly Detection and Segmentation.

Title:

Generating the Perfect Reference: Anomaly Detection Via Fusion of Stochastic and Deterministic Learning

Abstract:

Defect Detection in the Semiconductor industry is an extreme case of Anomaly Detection, involving very small anomalies that often blend into the background pattern.

Although very similar, no reference sample is perfect for comparison due to production variation.

We propose to generate this perfect reference: a generated image counterpart that is identical to the input sample everywhere except the defective area.

We are using a novel fusion of stochastic and deterministic learning to train a conditional generative deep VAE model.

We demonstrate perfect reference generation for the MVTec dataset and for silicon manufacturing Scanning Electron Microscope (SEM) images, achieving industry SOTA results.

Dr. Yossi Rubner

CEO RTC Vision

Bio:

Yossi Rubner serves as the CEO of RTC Vision, a company focused on the development and implementation of AI and Computer Vision technologies.

Rubner has combined industrial and academic roles for more than 25 years, and he is also the founder and CTO of Kitov.ai.

He earned his B.Sc. in Computer Engineering at the Technion Institute of Technology, followed by a Ph.D. in Computer Science and Electrical Engineering from Stanford University, specializing in Computer Vision.

Rubner is the author of more than 30 publications and patents and his contributions to the field were recognized in 2013 when the IEEE Computer Society awarded him the Helmholtz Prize.

Title:

From Spine Surgery to Body Identification: Computer Vision’s Role in Solving October 7th’s Forensic Challenges

Abstract:

After the atrocities of Oct 7th, 2023, there was a need to identify victims whose conditions made traditional forensic methods ineffective. In this talk, we will show how RTC Vision, in collaboration with Mazor Robotics (now part of Medtronic), pivoted Computer-Vision technology originally designed for robot-assisted spine surgery to meet this urgent need.

Our approach leverages the technology to match postmortem CT scans of vertebrae with pre-existing CT scans, achieving a remarkable 100% identification success rate in our cases. Furthermore, I'll show how, even in the absence of prior CT scans, we can utilize antemortem X-ray images for identification.

 

Amit Svarzenberg

Microsoft for Startups EMEA CTO

Bio:

Amit is the Microsoft for Startups EMEA CTO.

He is a trusted AI advisor to portfolio companies of VCs, accelerators, and incubators, offering mentorship, supporting their technical journey and connection to Microsoft products and business groups. In addition, Amit is building the wider technical motion for Startups as part of the global tech team.

Before joining Microsoft, Amit was the Open Innovation Manager for Samsung Research, Israel where he led strategic investment projects. Amit was instrumental in opening Samsung Research’s R&D site, in conjunction with the University of Haifa, for developing the next generation of AI startups.

Amit lectures in Metaverse and Practical AI at Tel Aviv University.

Title:

How AI Multimodality is the Missing Link for Autonomous Agents

Abstract:

In this groundbreaking talk, Amit Svarzenberg, CTO of Microsoft for Startups, explores the crucial role of AI multimodality in advancing the capabilities of autonomous agents. He discusses how leveraging multimodal AI systems—capable of processing and understanding diverse data types including text, images, audio, and sensor data—significantly enhances the autonomy, adaptability, and operational efficiency of these agents. Through an examination of recent advancements and practical case studies, Svarzenberg demonstrates how multimodal AI not only improves agents' perception and decision-making skills but also enables more seamless and natural interactions between humans and agents. By incorporating multimodal AI, autonomous agents can achieve a more comprehensive understanding of their surroundings, allowing for the execution of complex tasks with remarkable accuracy and dependability. This presentation highlights the transformative impact of multimodal AI on the evolution of autonomous agents towards truly intelligent and self-sufficient entities, marking a pivotal advancement in our pursuit of an AI-empowered future.

 

Eyal Enav

Vision AI Alliances Manager NVIDIA

Bio:

Eyal is a Vision AI Alliances Manager at NVIDIA, leading the NVIDIA Metropolis program for Vision AI in Israel. Specializing in computer vision and deep learning, he holds a B.Sc. in Electrical Engineering from the Technion. He has focused on video analytics for the past 15 years, working closely with dozens of Vision AI companies in Israel on both technology and business development.

Title:

Create purpose-built AI using vision and language with multi-modal Foundation Models

Abstract:

Address the challenges and opportunities of AI training in developing Vision AI solutions. Hear about the latest capabilities from NVIDIA TAO Toolkit and learn how TAO brings the power of Vision Transformers, Foundational models and Gen AI models to help developers build their AI solutions.

Tali Dekel

Assistant Professor (Senior Lecturer) Mathematics and Computer Science Department at the Weizmann Institute

Bio:

Tali Dekel is an Assistant Professor (Senior Lecturer) at the Mathematics and Computer Science Department at the Weizmann Institute, Israel. She is also a Staff Research Scientist at Google, developing algorithms at the intersection of computer vision, computer graphics, and machine learning. Before Google, she was a Postdoctoral Associate at the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT. Tali completed her Ph.D. studies at the School of Electrical Engineering, Tel-Aviv University, Israel. Her research interests include computational photography, image/video synthesis, geometry and 3D reconstruction. Her awards and honors include the National Postdoctoral Award for Advancing Women in Science (2014), the Rothschild Postdoctoral Fellowship (2015), the SAMSON - Prime Minister's Researcher Recruitment Prize (2019), Best Paper Honorable Mention in CVPR 2019, and Best Paper Award (Marr Prize) in ICCV 2019. She often serves as program committee member and area chair of major vision and graphics conferences. More information: https://www.weizmann.ac.il/math/dekel/home

Title:

From Single-Video Models to All-Video Models

Abstract:

The field of computer vision is in the midst of a generative revolution, demonstrating groundbreaking image synthesis results, portraying highly complex visual concepts such as objects’ interaction, lighting, 3D shape, and pose. Expanding this progress to videos introduces two key challenges: (i) the distribution of natural videos is vast and complex, requiring orders of magnitude more training data than images, and (ii) raw video data is extremely high dimensional, requiring extensive computation and memory. In this talk, I’ll present different methodologies aimed at overcoming these challenges and advancing our capabilities to synthesize and edit visual content across both space and time. These methods range from layered video representations tailored to a specific video, to leveraging generative image priors for video synthesis tasks, and finally, designing and harnessing large-scale text-to-video models, which provides us with powerful motion priors. I’ll demonstrate how these methods unlock a variety of novel content creation applications, such as transferring motion across distinct object categories, image-to-video synthesis, video inpainting, and stylized video generation.

Dr. Chen Sagiv

Co-Founder & Co-CEO SagivTech

Bio:

Chen Sagiv earned her PhD in Applied Mathematics from the Tel Aviv University focusing on variational methods and Gabor Analysis.

After working as an algorithms developer, she became a parallel entrepreneur and co-founded SagivTech, a computer vision projects company; DeePathology, working in AI for computational pathology; and SurgeonAI, working on bringing AI to the OR.

Chen is also a co-founder of IMVC.

Chen is passionate about bringing technology to healthcare, promoting math education for at-risk youth, and dogs.

Title:

Introduction to Transformers

Abstract:

Transformers are neural networks that learn context from relationships in sequential data using a mechanism called attention.

The modern transformer was proposed in the 2017 paper titled 'Attention Is All You Need' by Ashish Vaswani et al., Google Brain team.

While transformer models are basically large encoder/decoder blocks that process data, they also have an attention ingredient that allows them to detect patterns in data.

In this session, a brief introduction to the foundations of transformers will be given.  
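As a taste of the attention mechanism mentioned above, here is a minimal sketch of scaled dot-product attention in plain Python, following the formulation in 'Attention Is All You Need': each query is compared with every key, the scores are normalized with a softmax, and the output is the resulting weighted average of the values. The function names and toy inputs are illustrative assumptions; real implementations use tensor libraries, multiple heads, and learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, on lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt of the key dimension.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy example (hypothetical numbers): one query attending over two key/value pairs.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

In the toy example the query matches the first key more closely, so the output leans toward the first value vector.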

Morris Alper

Tel Aviv University

Bio:

Morris Alper is a PhD student at the School of Electrical Engineering, Tel Aviv University (TAU). Under the mentorship of Dr. Hadar Averbuch-Elor, he is researching multimodal learning – machine learning applied to tasks involving vision and language. He received his MSc with honors from TAU (Computer Science), and his BSc from MIT (Mathematics and Linguistics).

Title:

Kiki or Bouba? Sound Symbolism in Vision-and-Language Models

Abstract:

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and well-demonstrated with regard to cross-modal associations between language and the visual domain. In this work, we investigate vision-and-language models such as CLIP and Stable Diffusion and find strong evidence that they do display sound symbolic patterns, paralleling the well-known kiki–bouba effect in psycholinguistics.

Shira Weinberg Harel

Product and AI Consultant

Bio:

With over two decades of experience in the tech industry, Shira has held prominent product and AI leadership roles at companies like Microsoft and monday.com. As a co-founder of LeadWith, a non-profit organization dedicated to empowering women in the tech field, she is passionate about promoting diversity and inclusivity in the industry. Shira’s contributions have been recognized by being selected for the esteemed 40 under 40 list by Globes. Currently, she works as an independent consultant and speaker, leveraging her expertise to mentor product managers and impart knowledge through her own academy. Additionally, her influential podcast serves as a platform for engaging discussions on all aspects of product management.

Title:

Artificial Intelligence, Real Biases: Examining Gender Biases in AI

Abstract:

Have you ever wondered why when you ask Midjourney for pictures of drivers, only images of men appear? Or why Siri's default voice is female?
The lecture explores the existence of gender biases in artificial intelligence and how they impact our understanding of reality.

As AI continues to permeate our lives, it's crucial to recognize that it is not always neutral. From Google Translate to cutting-edge generative AI platforms, this lecture will examine the gender biases present in various technologies. Through engaging examples, attendees will gain a deeper understanding of the issue and learn about available tools to address these biases and create a more equitable future.

Dr. Itzik Ben Shabat

Research Fellow The Australian National University and Technion, Israel Institute of Technology

Bio:

Dr. Yizhak Ben-Shabat (Itzik) is a Research Fellow at the Australian National University (ANU) and Technion – Israel Institute of Technology. With expertise in 3D computer vision, machine learning, and geometric algorithms, Itzik's research focuses on applying deep learning methods to 3D point clouds for tasks like 3D reconstruction, classification, detection, and action recognition. Besides his research role, Itzik is the founder and host of The Talking Papers Podcast, a groundbreaking platform for disseminating research and supporting early career academics and PhD students. Itzik earned his Ph.D. in 2019 from the Technion and later served as a Research Fellow at the ARC Centre of Excellence for Robotic Vision (ACRV).

Full details, publications, and code are available on his personal website: www.itzikbs.com.

Title:

Octree Guided Unoriented Surface Reconstruction

Abstract:

We address the problem of surface reconstruction from unoriented point clouds. Implicit neural representations (INRs) have become popular for this task, but when information relating to the inside versus outside of a shape is not available, optimization relies on heuristics and regularizers to recover the surface. These methods can be slow to converge and easily get stuck in local minima. We propose a two-step approach, OG-INR, where we (1) construct an octree and label what is inside and outside, and (2) optimize for a continuous and high-fidelity shape using an INR that is initially guided by the octree's labelling. To solve for our labelling, we propose an energy function over the discrete structure and provide an efficient move-making algorithm that explores many possible labellings. Our results show that the exploration by the move-making algorithm avoids many of the bad local minima reached by methods optimized purely by gradient descent.

Or Levi

VP of Data Science Zefr

Bio:

Or Levi is an AI Researcher and VP of Data Science at Zefr. He holds an M.Sc. (Magna Cum Laude) in Information Retrieval from the Technion – Israel Institute of Technology. Or’s strongest passion is using AI for social impact, which led him to develop innovative AI to fight the spread of misinformation online. The technology was named among CB Insights’ International Game Changers, with potential to transform society and economies for the better. Or’s work has been presented in leading AI conferences and covered by international media.

 

Title:

Detecting AI-Generated Fakes with Machine Vision

Abstract:

With the meteoric rise of AI image generators, fake images of public figures, such as 'Trump's arrest', have recently become viral sensations. The risks of synthetic media being used to spread misinformation and undermine democracy were brought to the public’s attention, raising an interesting question: can we use AI to catch AI-generated images before they become the next viral hit? Zefr, the global leader in brand suitability, is introducing advanced vision models to detect AI-generated images and counter misinformation. The talk will cover real-world examples, the challenges of detecting fakes, and practical tips for training and deploying specialized vision models at scale.

Adam Polyak

Research Engineer Meta AI

Bio:

Adam is a Research Engineer at Meta AI Research (formerly Facebook AI Research) and a PhD student under Prof. Lior Wolf at Tel-Aviv University. He holds a BSc in computer science and mathematics from Bar-Ilan University, and an MSc in computer science from Tel-Aviv University. His research is focused on advancing generative models in image, audio, and video domains, with recent achievements in large-scale foundational generative models for images and videos.

 

Title:

Text to Dynamic Visual Worlds: Advancements in Video and 4D Scene Generation

Abstract:

In this talk, we present two methods for Text-to-Video generation: i) Make-A-Video (MAV), video generation from textual prompts, and ii) Make-A-Video3D (MAV3D), generation of three-dimensional dynamic scenes from text descriptions. MAV introduces a paradigm for directly translating the tremendous recent progress in Text-to-Image generation to Text-to-Video. MAV3D leverages a 4D dynamic Neural Radiance Field (NeRF) optimized for scene appearance, density, and motion consistency through the MAV model. The dynamic video output generated from the provided text can be viewed from any camera location and angle, and can be composited into any 3D environment. Both methods rely only on text-image pairs and unlabeled videos. To the best of our knowledge, MAV3D is the first to generate 3D dynamic scenes given a text description.

 

Bella Specktor Fadida

Teaching Fellow Haifa University

Bio:

Bella is a lecturer in the Medical Imaging Sciences department at Haifa University. She completed her PhD at the Hebrew University under the supervision of Prof. Leo Joskowicz. Prior to that, Bella worked on medical imaging algorithms for 7 years at Philips. She is also the founder and organizer of the Machine Learning for Medical Imaging (MLMI) and Haifa Machine Learning meetups.

Title:

Abstract:

We present a new method for partial annotations of MR images that uses a small set of consecutive annotated slices from each scan, with an annotation effort equal to that of a few annotated cases. The training is performed by using only annotated blocks, incorporating information about slices outside the structure, and modifying a batch loss function. For fetal body segmentation of in-distribution data, the use of partial annotations resulted in a decrease in the standard deviations of Dice scores by 22% and 27.5% for the FIESTA and TRUFI sequences, respectively. For TRUFI out-of-distribution data, the method increased average Dice scores from 0.84 to 0.9.
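For context on the metric reported above, the Dice score measures the overlap between a predicted segmentation mask and a ground-truth mask as twice the size of their intersection divided by the sum of their sizes. The snippet below is a minimal sketch on flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the talk.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        # Both masks are empty; treated here as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / size_sum


# Hypothetical 1D masks: two predicted voxels, one of which matches the single true voxel.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they do not overlap at all.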

Natan Bagrov

Deep Learning Research Engineer Deci

Bio:

Natan is a Deep Learning Research Engineer with vast experience in designing state-of-the-art Object Detection and Semantic Segmentation models and optimizing them to run efficiently on diverse hardware platforms. At Deci, Natan leads the Computer Vision team, focusing on the productization of Deci's core technological breakthroughs, and developing tools that are used by the world's leading AI teams. He holds a Master's degree in Machine Learning and a Bachelor's degree in Computer Science from the Technion – Israel Institute of Technology.

 

Title:

Advancing Object Detection with YOLO-NAS: A new foundation model designed with Neural Architecture Search-Based Approach

Abstract:

Object detection is a pivotal component in the realm of computer vision, instrumental in facilitating machines to discern and localize objects within visual data. Recent years have seen significant advancements in object detection through the evolution of potent neural network architectures, notably the YOLO (You Only Look Once) family. This talk introduces a novel YOLO-based architecture, YOLO-NAS, developed via a proprietary neural architecture search (NAS) algorithm, AutoNAC. By optimizing accuracy and efficiency, YOLO-NAS redefines state-of-the-art object detection, paving the way for increased precision and performance in applications like autonomous vehicles, robotics, and video analytics.

Amos Bercovich

Algorithm Team Leader WSC Sports

Bio:

Amos Bercovich is an Algorithm Team Leader at WSC Sports, where he and his team research and develop end-to-end real-time solutions for automatically generating sports content from live sports broadcasts, using deep learning for video, image, and audio analysis. Before joining WSC Sports, Amos worked at Cortica as an Algorithm Developer, where he focused on developing image recognition applications. He earned his B.Sc. and M.Sc. degrees at the Ben-Gurion University of the Negev, with a thesis in the field of Computer Vision in collaboration with the Agricultural Research Organization.

Title:

Zero-Shot Event Retrieval in Sports Broadcasting

Abstract:

WSC Sports is developing an AI platform for generating automatic sports highlights. To make the storytelling of our content more compelling, our system is also programmed to add short transitions to the highlight video, such as team lineup graphics and close-ups of reactions.
Unlike regular plays in the game, these transitions tend to change from one broadcaster to another, and over time. Although plain supervised algorithms can recognize and classify these transitions, keeping their performance at a high level can be quite costly and difficult. In this presentation, we will present our Concept Detector, a framework for creating concepts: sets of textual and visual queries combined with a set of rules to retrieve those events.

Lilach Arbel

Algorithm Developer Gentex Technologies Israel

Bio:

Lilach works as an algorithm developer at Gentex Technologies Israel. She holds a BSc in Biomedical Engineering from Tel Aviv University and is now pursuing a master's in electrical engineering, in partnership with Sheba Medical Center and Tel Aviv University. Her research, under the guidance of Prof. Nahum Kiryati and Dr. Arnaldo Mayer, is centered on using deep learning techniques to transform between modalities, specifically focusing on CT and MRI data.

Title:

Solving 3D Human Pose Ambiguities with Quadratic Programming

Abstract:

3D human pose estimation (HPE) is a fundamental task in human-computer interaction. Monocular 3D HPE is a challenging task due to a lack of in-the-wild annotated data, high computational load, and limited access to depth observations. Despite their success on 2D HPE, end-to-end DNN approaches to 3D HPE hardly generalize to in-the-wild scenes with multiple self-occlusions. Recent approaches suggested optimizing 3D humanoid model parameters to minimize a 2D objective; however, as they optimize in 2D, they suffer from depth ambiguities.

We propose a two-stage depth-based solution to monocular 3D HPE. We start by using a deep neural network to predict 2D body-joint locations and to classify joints as occluded or visible. Then, having valid depth for the visible joints, we solve a Quadratically Constrained Quadratic Program enforcing skeletal and temporal-continuity constraints, thereby solving the self-occlusion problem. We demonstrate our method's effectiveness on Gentex’s in-cabin 180-degree fisheye depth camera and show that it can reconstruct reliable 3D human pose in complex situations.

Zvi Figov

Principal Data Scientist Microsoft

Bio:

Zvi Figov is a data scientist with over 20 years of experience in various computer vision fields. He currently works in the Azure Video Indexer group at Microsoft. He holds a BSc and MSc in computer science and mathematics from Bar-Ilan University. Zvi has vast experience in computer vision applications, including deep learning, object detection and tracking. Since joining the Video Indexer group 4 years ago, Zvi has also been working on creating solutions based on multimodality analysis, combining vision, audio and NLP.

Title:

Person tracker for Media and Entertainment videos

Abstract:

Azure Video Indexer is an analytical tool to generate insights from videos while indexing them. Person tracking is a crucial aspect of video analysis and plays a significant role in Azure Video Indexer. However, it poses several algorithmic and computational challenges, particularly in real-world scenarios such as the media industry. These challenges include the need for efficient and scalable algorithms, handling multiple camera switches, dealing with different angles and poses, occlusions and more.

In this talk, I will present our novel pipeline for person tracking, with significant improvements for media and entertainment videos. Our approach addresses the above-mentioned challenges and significantly reduces the computational cost and runtime required for person tracking. It combines neural network models with a novel tracking algorithm, all running fast and efficiently on a CPU.

Sivan Doveh

Ph.D. candidate at Weizmann Institute of Science and AI Researcher at IBM

Bio:

Sivan works as an AI Researcher at IBM. She is a Ph.D. candidate at the Weizmann Institute of Science under the supervision of Prof. Shimon Ullman. Her papers have been published in top AI conferences, including CVPR, NeurIPS, ICCV, and AAAI. Sivan's research focuses on the fields of weakly supervised learning and Multi-Modal image-text learning.

 

Title:

Teaching Structured Vision&Language Concepts to Vision & Language Models

Abstract:

Vision and Language (VL) models have demonstrated remarkable zero-shot performance in a variety of tasks. However, some aspects of complex language understanding remain a challenge. We introduce the collective notion of Structured Vision&Language Concepts (SVLC), which includes object attributes, relations, and states that are present in the text and visible in the image. Recent studies have shown that even the best VL models struggle with SVLC. A possible way to fix this issue is by collecting dedicated datasets for teaching each SVLC type, which might be expensive and time-consuming. Instead, we propose a more elegant data-driven approach for enhancing VL models' understanding of SVLCs that makes more effective use of existing VL pre-training datasets and does not require any additional data. While automatic understanding of image structure remains largely unsolved, language structure is much better modeled and understood, allowing for its effective use in teaching VL models. We propose various techniques based on language structure understanding that can be used to manipulate the textual part of off-the-shelf paired VL datasets. VL models trained with the updated data exhibit a significant improvement of up to 15% in their SVLC understanding, with only a mild degradation in their zero-shot capabilities, whether training from scratch or fine-tuning a pre-trained model.

Or Litany

NVIDIA and Technion

Bio:

Or Litany is a senior researcher at NVIDIA and an assistant professor at the Technion, where he leads the visual computing and AI lab. His research focuses on semantic scene understanding and spatiotemporal content generation.

Title:

Data driven simulation for autonomous driving

Abstract:

Simulation is a critical tool for ensuring the safety of autonomous driving. However, traditional simulation methods can be labor-intensive and struggle to scale.
In this talk, I will discuss an innovative neural simulation approach that learns to simulate driving scenarios from data. Specifically, I will focus on my latest research in three key areas: scene reconstruction in both appearance and geometry, motion generation of humans and vehicles, and LiDAR view synthesis.

Tami Ellison

MS, CEO and Co-Founder, Conflu3nce Ltd and Conflu3nce Health AI (CHAI)

Bio:

Tami Ellison is co-founder of Conflu3nce, a Jerusalem-based health technology start-up. She leads the company’s early disease detection initiatives, applying her patented technologies to transform image intelligence for both humans and machines. An accomplished photographer with exhibitions in Israel and the US, her multiplexed, figure-ground visual illusions bring a Gestalt-based understanding of how images/image parts interact with one another. A C-level consultant, working with public and private entities for over 25 years, she holds a thesis research MS from UIC’s Laboratory for Cell, Molecular and Developmental Biology, investigating developmental model systems, expression patterns, and systems-level regulation/control mechanisms.

Title:

Deep Learning for ALL: Enhancing Image Inputs - Building Knowledge Outputs

Abstract:

Globally, an estimated 40M diagnostic reading errors occur annually; approximately 62% can be attributed to cognitive/perceptual issues associated with complacency, underreading, and search satisfaction. AI expert systems are critical to address the exponential growth in the volume of medical images generated and help alleviate workforce:workflow inefficiencies. But outsourcing clinical decision-making can exacerbate existing errors and introduce FN/FP reporting issues. We will present image enhancement methods that transform early disease detection capabilities, applying a “Deep Learning for ALL” approach that advances pixel-level “Image Intelligence” for both humans and AI and cooperatively promotes knowledge-building, pattern recognition, and attribute extraction.

Tomer Weiss

Technion

Bio:

Tomer Weiss is a PhD student at the Computer Science faculty at the Technion, where he is working under the guidance of Prof. Alex Bronstein. His primary research focuses on harnessing deep learning methodologies for inverse design in computational imaging and cheminformatics. He holds an MSc with honors in Computer Science from the Technion and a BSc in Mathematics and Computer Science from Ben-Gurion University.

Title:

Computational Imaging

Abstract:

This talk introduces the concept of joint optimization in computational imaging using deep learning, demonstrating its practical benefits for improved performance. By showcasing real-world examples from Magnetic Resonance Imaging (MRI) and Multiple Input Multiple Output (MIMO) radar imaging, we reveal how this approach can positively impact the end performance. Join us for an exploration into the world of computational imaging, where simple yet effective techniques can make a meaningful difference.

Topaz Gilad

VP of AI and Algorithms Voyage81, ODDITY

Bio:

Topaz Gilad is an R&D manager specializing in AI, machine learning, and computer vision, leading production-oriented innovative research. With experience in large companies as well as startups, in industries ranging from space imaging and semiconductor microscopy to sports tech, wellness, beauty, and self-care, she has developed methodologies to scale up while improving quality, delivery, and teamwork. She is currently VP of AI and Algorithms at Voyage81, ODDITY, which excels in computer vision deep learning algorithms in both RGB and hyper-spectral domains. She was previously head of AI at Pixellot, a leading AI-automated sports production company. Topaz is also an advocate for women in tech.

Title:

From Cost-Sensitive Classification to Regression: Unlock the True Potential of Your Labels!

Abstract:

Many of the tasks we face as data scientists or machine-learning researchers relate to categorization in one way or another. In the words of David Mumford: "The world is continuous, but the mind is discrete." We often define categories when breaking down a real-world problem into an ML-based solution. However, actual target values may be continuous or at least ordered. This is something to consider and even leverage in the design of your ML model.

 

Using case studies from real-world data domains, we will see how acknowledging the inner relations of our target labels can boost the knowledge we provide in the training phase, better model the world, reduce overconfidence, and improve robustness. From classical concepts to state-of-the-art, this talk will walk you through regression-based approaches for what may seem like classification problems. Unlock the true potential of your labels and boost your classifiers!

Ofir Bibi

VP Research Lightricks

Bio:

Ofir Bibi, VP Research at Lightricks, has led the research department of 40+ researchers for the past seven and a half years and counting. Ofir specializes in bringing core technologies and ML solutions into products. He has vast experience in building products and processes by utilizing data to the fullest extent. His main research focus is on Machine Learning, Statistical Signal Processing, and Optimization.

Title:

Taming the Wild Generative Beast for the Everyday Creator

Abstract:

As the use of generative AI becomes more prevalent in the tech industry, it can be difficult to understand how to effectively implement this new technology in your company's products. In this talk, Ofir Bibi will provide a general overview of how we implemented generative AI at Lightricks and showcase our latest developments in image transformation technology. He will also share thoughts on the impact that generative AI will have on the industry.

Amir Alush

Co-founder and CTO Visual Layer

Bio:

Amir Alush is the co-founder and CTO of Visual Layer, a company dedicated to improving the quality of image datasets used in AI model development. He holds a Ph.D. in Computer Vision and Machine Learning from Bar-Ilan University and has extensive experience in the field, including roles at Quris.AI and Brodmann17. His work focuses on AI system design, deep learning, and computer vision. His recent project fastdup (co-authored with Dr. Danny Bickson) showcases his commitment to practical, data-driven solutions in AI.

 

Title:

From Raw Data to Refined Datasets: Introducing VL Datasets for Reliable AI Model Development

Abstract:

Generative AI has revolutionized various domains, including art and design. However, the success of generative models relies heavily on high-quality, extensive image datasets for effective training. Whether you're an AI enthusiast, a researcher, or a student, you've likely encountered challenges associated with untidy image datasets in Generative AI or other visual data-focused AI applications.

 

These challenges, including issues like duplicated images, mislabeled data, and outliers, can severely impact model reliability, waste computational resources and storage, and demand significant manual cleanup efforts. In our research on LAION-1B, we uncovered quality issues in approximately 105 million images. Notably, more than 90 million images were identified as duplicates, over 7 million images were blurry or of low quality, and more than 6 million images were deemed outliers.

 

To address these challenges, we have released a set of refined versions of popular visual datasets, namely LAION-1B and ImageNet-21K. We have named these refined datasets VL Datasets, and they are freely accessible through the visuallayer Python SDK or the free user-friendly VL Profiler UI. By utilizing VL Datasets, AI practitioners and researchers can enhance the development of more robust and reliable AI models.

Natanel Davidovits

Senior Manager, Data Science & Research DoubleVerify

Bio:

Seasoned AI leader, with over a decade of experience, solving complex industry problems.

Title:

Auto Labeling of Data Sets

Abstract:

Connecting traditional algorithms to ensure a successful lazy-labeling strategy.

Eyal Gil-Ad

Computer Vision Perception Group Manager Innoviz Technologies

Bio:

Eyal is an Algorithm Engineer with over 10 years of experience in leading productization of Computer Vision technologies.
Following his experience in building and leading research and development algorithm teams, Eyal now leads the Perception Group at Innoviz Technologies, overseeing Innoviz's AI Perception and Calibration solutions, from data aspects through algorithms to deployment.

Prior to Innoviz, Eyal held key positions in several startup companies, from early-stage core teams to late-stage companies, leading computer vision projects from conception to productization for applications such as autonomous drones, security, and industrial automation.
Eyal holds a BSc in Electrical Engineering from Ben-Gurion University and an MSc in Electrical Engineering from Tel-Aviv University.

Title:

Using AI for 3D LiDAR perception in L3 Autonomous driving

Abstract:

To improve safety in autonomous driving, it is important to use multiple sensors. In this talk, we will discuss how to perform perception from 3D data acquired by the Innoviz LiDAR. Specifically, we will describe how to identify obstacles or moving objects and perform detection in 3D.

We will show how combining AI with classic techniques provides us with both high accuracy and robustness to diverse conditions and scenarios.

Dr. Amit Alfassy

AI Research Scientist IBM

Bio:

Amit works as an AI Research Scientist at IBM. He graduated with a PhD from the Technion under the supervision of Prof. Alex Bronstein. His papers have been published in top AI conferences such as CVPR, NeurIPS, ECCV, and AAAI. Amit researches the fields of few-shot learning, multi-modal image-text learning, and currently multi-modal image-language foundation models.

Title:

Attention based change detection using transformers

Abstract:

Amit will discuss "FETA: Towards Specializing Foundation Models for Expert Task Applications," a NeurIPS 2022 main conference paper. While Foundation Models (FMs) have demonstrated unprecedented capabilities, they still have poor out-of-the-box performance on expert tasks. We offer an automatic system and method to adapt FMs to expert data using raw documents only, without requiring any annotations. FETA can be easily used on any document. We also propose a benchmark built around the task of teaching FMs to understand technical documentation. Our FETA benchmark focuses on text-to-image and image-to-text retrieval in public car manuals and sales catalogue brochures.

Rami Ben-Ari

Senior Research Scientist and Technical Leader OriginAI

Bio:

Rami Ben-Ari is a senior research scientist and technical leader at OriginAI, an AI research center in Israel. He maintains close collaboration with several universities in Israel, supervises graduate students, and serves as an adjunct professor at Bar-Ilan University. Rami has published over 50 papers and patents, and has organized various workshops and challenges in CV & ML.

His research interests cover deep learning methods in computer vision and particularly, image retrieval, multimodal learning and generative models.

He holds a PhD in Applied Mathematics from Tel-Aviv University, specializing in computer vision.

Title:

Enhancing Image Retrieval: Novel Approaches and Scenarios

Abstract:

Image retrieval is essential for managing, searching, and making sense of the ever-growing volume of visual data across diverse fields and applications. In this talk, I will present several of our research works in interactive image retrieval, including a new architecture that leverages few-shot learning, a greedy active learning approach for image retrieval, a new method that combines textual and visual search, and finally a system that leverages emerging chat capabilities for the benefit of image retrieval.

Rebecca Hojsteen

AI Team Leader 4M Analytics

Bio:

Rebecca Hojsteen is AI team leader at 4M Analytics, a company specializing in providing cutting-edge AI solutions for mapping underground infrastructures on a large scale.
Prior to joining 4M Analytics, Rebecca was an expert in computer vision algorithm development at RTC-vision and Samsung.
She holds an MSc in biomedical engineering from the Technion and an ME in electrical engineering from Supelec in Paris.
 

Title:

How to process engineering records for infrastructure mapping

Abstract:

In today's construction industry, precise knowledge of underground infrastructures is of paramount importance in project planning. However, the current scenario lacks a reliable and up-to-date map of underground infrastructure.
We have developed a groundbreaking mapping technology based on the processing of multiple sources, including engineering records which are extremely accurate and rich in information.
Engineering records need to be geolocated, extracted, and digitized. This is particularly challenging due to the very high variability in documents and the density of the sketches.
In this talk, we will present the innovative solution we have developed to process this data at scale.

Shiri Manor

Sr. Director of EyeQ Deep Learning Frameworks Mobileye

Bio:

Shiri Manor is a Senior Director at Mobileye with over 20 years of experience in software engineering and management. Leading a dynamic team at Mobileye, Shiri Manor spearheads efforts to enable and enhance deep learning algorithms on Mobileye's embedded car hardware. With a focus on innovation, the team crafts cutting-edge tools utilizing open-source technology for streamlined deployment of deep learning networks, incorporating advanced optimization techniques.

Shiri Manor's professional journey includes being a Software Group Engineering Manager at Intel, where she played an instrumental role in designing and developing computer vision SDKs and Intel OpenCL products. She holds a BSc in computer science with honors and an MSc in the same field from the Technion.

Title:

How to enable running DL Networks in the car

Abstract:

Deep learning networks have emerged as a prominent technology for the accurate detection and identification of objects on the road, paving the way to fully autonomous cars. In the pursuit of cost-effective and power-efficient solutions, this presentation delves into the challenges and strategies associated with optimizing deep learning networks for real-time execution in vehicular environments.

Given the stringent constraints of cost-efficiency and low power consumption, a key challenge lies in achieving high performance without compromising accuracy.

Several optimization techniques that contribute to network efficiency are elucidated, with a focus on minimizing computational redundancy and encouraging resource-efficient execution.

Through a compelling case study, the effectiveness of these optimization techniques is demonstrated in the real-world scenario of running a transformer network within an automotive system.

Tom Sharon

Master's student Weizmann Institute of Science

Bio:

Tom Sharon received her B.Sc. degree in science, summa cum laude, with a focus on mathematics and physics, from The Open University, Israel, in 2021. She is currently completing her M.Sc. in mathematics and computer science at the Weizmann Institute, Rehovot, Israel, under Prof. Yonina Eldar’s supervision.

Her research interests include the application of deep learning and computer vision methods to physics challenges, including medical applications. Her work focuses on electromagnetic and acoustic signals for medical imaging using deep learning methods such as model-based neural networks, and on solving inverse scattering problems for quantitative imaging.

Her awards include the scholarship for excellent master’s degree students in high-tech fields.

Title:

Real-Time Model Based Quantitative Radar

Abstract:

Ultrasound and radar signals are beneficial for medical imaging due to their non-invasive and low-cost nature. Quantitative medical imaging can display various physical properties of the scanned medium, in contrast to traditional imaging techniques. This broadens the scope of medical applications, including fast stroke imaging. However, current quantitative imaging techniques are time-consuming and tend to converge to local minima. We propose a neural network based on the physical model of wave propagation to achieve real-time multiple quantitative imaging for complex and realistic scenarios, using data from only eight elements, demonstrated for diverse transmission setups using either radar or ultrasound signals.

Maya Gilad

PhD candidate Technion

Bio:

Maya is a PhD candidate at the Technion. Working under the supervision of Dr. Moti Freiman, she specializes in the development of innovative algorithms for medical imaging. Maya holds both a BSc and an MSc in Computer Science, having graduated Magna Cum Laude. With a diverse background in software engineering and machine learning, she has had the opportunity to lead engineering teams in both the IDF and the private tech sector. Before assuming her current role at Voyantis, Maya served as an Algorithms Architect at Gett. Her current research efforts focus on leveraging DWI-MRI to improve breast cancer treatment outcomes.

Title:

Integrating Radiomics and Physiological Decomposition of DWI

Abstract:

We introduce PD-DWI, a machine-learning model for early prediction of pathological complete response (pCR) in breast cancer patients undergoing neoadjuvant chemotherapy (NAC). Leveraging decomposed diffusion-weighted MRI (DWI) and clinical data, our model outperforms conventional methods in the BMMR2 challenge, achieving an area under the curve (AUC) of 0.8849 versus 0.8397. PD-DWI has the potential to enhance pCR prediction accuracy, reduce MRI acquisition times, and eliminate the need for contrast agents.