Tuesday, April 1, 2025
Department of Mathematics and Computer Science, The Weizmann Institute of Science
Yonina Eldar is a Professor in the Department of Mathematics and Computer Science, Weizmann Institute of Science, Rehovot, Israel where she heads the center for Biomedical Engineering and Signal Processing and holds the Dorothy and Patrick Gorman Professorial Chair. She is also a Visiting Professor at MIT, a Visiting Scientist at the Broad Institute, and an Adjunct Professor at Duke University and was a Visiting Professor at Stanford. She is a member of the Israel Academy of Sciences and Humanities, an IEEE Fellow and a EURASIP Fellow. She received the B.Sc. degree in physics and the B.Sc. degree in electrical engineering from Tel-Aviv University, and the Ph.D. degree in electrical engineering and computer science from MIT, in 2002. She has received many awards for excellence in research and teaching, including the Israel Prize (2025), IEEE Signal Processing Society Technical Achievement Award (2013), the IEEE/AESS Fred Nathanson Memorial Radar Award (2014) and the IEEE Kiyo Tomiyasu Award (2016). She received the Michael Bruno Memorial Award from the Rothschild Foundation, the Weizmann Prize for Exact Sciences, the Wolf Foundation Krill Prize for Excellence in Scientific Research, the Henry Taub Prize for Excellence in Research (twice), the Hershel Rich Innovation Award (three times), and the Award for Women with Distinguished Contributions. She was selected as one of the 50 most influential women in Israel, and was a member of the Israel Committee for Higher Education. She is the Editor in Chief of Foundations and Trends in Signal Processing, a member of several IEEE Technical Committees and Award Committees, and heads the Committee for Promoting Gender Fairness in Higher Education Institutions in Israel.
Model Based Deep Learning: Applications to Imaging and Communications
Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., a lack of interpretability and the need for very large training sets.
CTO, President & Co-Founder, Imubit & Tel Aviv University
Nadav Cohen is an Assoc. Prof. of Computer Science at Tel Aviv University, and CTO, President & Co-Founder at Imubit. His academic research centers on the foundations of deep learning, while at Imubit he leads the development of deep reinforcement learning systems that control manufacturing plants. Nadav earned a BSc in electrical engineering and a BSc in mathematics (both summa cum laude) at the Technion. He obtained his PhD (summa cum laude) at the Hebrew University and was a postdoc at Princeton. For his contributions, Nadav has won a number of awards, including an ERC Grant and a Google Research Scholar Award.
Offline Reinforcement Learning in the Wild
Professor, Technion - Israel Institute of Technology
Ayellet Tal is a professor and the Alfred and Marion Bär Chair in Engineering at the Technion's Department of Electrical and Computer Engineering. She holds a Ph.D. in Computer Science from Princeton University and a B.Sc. degree (summa cum laude) in Mathematics and Computer Science from Tel Aviv University. Among Prof. Tal's accomplishments are the Rechler Prize for Excellence in Research, the Henry Taub Prize for Academic Excellence, and the Milton and Lillian Edwards Academic Lectureship. Prof. Tal has chaired several conferences on computer graphics, shape modeling, and computer vision, including the upcoming ICCV.
Point Cloud Visualization – Why and How?
Chief Strategy & Product Officer, Mentee Robotics
Assoc. Prof., Tel Aviv University
Visual Priors and How to Control Them for Generation
The emergence of large scale models has given rise to distilling these models’ vast knowledge to specific needs, treating them as priors. This foundation model approach allows generalizing to new domains, as well as more precise and intuitive control. In this talk, I discuss recent visual priors (e.g. Stable Diffusion, MDM), and the ways to control them. I exemplify these approaches through the work of my lab over the last couple of years, spanning generative tasks in three domains – 2D images, 3D shapes, and human motion. The talk presents SOTA methods for style transfer, personalization, text-to-mesh generation, and perhaps most importantly, demonstrates that the knowledge of visual priors can be leveraged in surprising ways.
Vice President, AI Technologies, IBM
Dr. Aya Soffer is the Vice President of AI Technologies at IBM Research and the Director of IBM’s research labs in Israel. In her role, Dr. Soffer is responsible for setting strategic directions and collaborating with IBM scientists globally to transform innovative ideas into cutting-edge AI technologies. She also works closely with IBM’s product groups and customers to bring research innovations to market. Dr. Soffer specializes in generative AI and its application in enterprise contexts, focusing on effectiveness, evaluation, trust, governance, and integration with enterprise data and assets. As the director of IBM Research – Israel, she ensures the lab remains a vibrant environment where research and innovation converge to tackle real-world challenges. Additionally, Dr. Soffer plays a key role in positioning the lab within the Israeli hi-tech ecosystem, fostering collaborations with academic institutions, multinational companies, and VC-backed startups. Throughout her tenure at IBM, she has spearheaded several strategic initiatives that evolved into successful products and solutions in the AI domains. Dr. Soffer has authored over 50 peer-reviewed papers, filed more than 15 patents, and has been an invited speaker at numerous conferences.
IBM Granite Vision – The Journey to Develop a Large Enterprise-Focused Visual Language Model
Co-founder and CSO, NeuroKaire
Daphna is co-founder and CSO at NeuroKaire, pioneering precision medicine in psychiatry and neurology. Previously she was CSO at Ibex Medical Analytics, focusing on AI-based cancer diagnostics. Before joining Ibex, Daphna headed Teva Pharmaceutical’s personalized medicine group, supporting multiple aspects of Teva’s pipeline drugs and working in collaboration with the Israeli healthcare sector and multinational companies. Prior to that, Daphna led biomarker and diagnostic development activities within the pipeline of multiple top-10 pharma and fortune 500 companies. Daphna trained as post-doc at Harvard University, after receiving her PhD in Neuroscience from the Technion – Israel Institute of Technology.
A window to the brain: computer vision and AI in mental health
Co-Founder & CEO, Decart
Dean Leitersdorf grew up between Israel, Switzerland and Silicon Valley. Dean completed his PhD at the Technion at the age of 23, while serving in Unit 8200, and later completed his postdoc at NUS Singapore.
Dean won the ACM PODC Dissertation Award in 2023, for the best PhD in distributed computing worldwide. Additional awards include three best student paper awards at PODC, and the Israel Defense Prize from the IDF.
Dean serves as CEO of Decart, an efficiency-focused AI research lab, which he founded in 2023 with his cofounder, Moshe Shalev. Decart burst out of stealth in October 2024 with its demo, Oasis: a real-time, generative AI video game world. Decart aims to become the leading consumer AI company by helping users transform their imagination into visual reality, blending interactive, generative AI experiences into everyday life.
Decart's innovation lies in its groundbreaking AI platform, which reduces the cost of running and training AI models tenfold, a development that has put the company on the radar of global tech giants. Their platform delivers real-time generative capabilities that include creating fully playable AI-generated video game worlds. This marks a transformative step forward in AI infrastructure.
Turning Israel into a Global Leader in Building Foundation Models for Computer Vision
Director of AI Research, Vimeo
Alon Faktor is a Director of AI Research at Vimeo, the world's largest private video hosting platform. Alon works on innovative applications for video consumption and interaction, aiming to make video content more dynamic and valuable for Vimeo's customers. His research focuses on efficient multimodal large language models and video-RAG techniques for video understanding and indexing. Alon holds a B.Sc. in Physics and Electrical Engineering from the Technion and an M.Sc. and Ph.D. in Computer Science from the Weizmann Institute of Science.
From Passive Viewing to Active Dialogue: Redefining Video Consumption
In this lecture, we will present how Vimeo is harnessing the latest AI techniques to enable novel video viewing experiences. We will give a deep dive into our talk-to-video technology, which applies RAG and LLMs to the video domain. We will demonstrate our approach to extracting and indexing multimodal video information so that it can be used effectively within the RAG architecture. We will also present several applications we have built on top of this method, such as video Q&A and library search.
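The retrieve-then-prompt flow behind a talk-to-video system can be sketched in miniature: transcript chunks are embedded, the best-matching chunks are retrieved for a query, and a prompt is assembled for an LLM. This is an illustrative toy, not Vimeo's system: it uses bag-of-words vectors and plain strings where a production pipeline would use multimodal embeddings and a real LLM, and all names and data are invented.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in embedding: a bag-of-words term-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Rank transcript chunks by similarity to the query; keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    # Assemble retrieved context (with timestamps) plus the question for an LLM.
    context = "\n".join(f"[{c['t']}s] {c['text']}" for c in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"

chunks = [
    {"t": 0, "text": "the host introduces the guests"},
    {"t": 42, "text": "a demo of the new camera stabilizer"},
    {"t": 90, "text": "closing remarks and credits"},
]
top = retrieve("when is the camera demo", chunks, k=1)
print(build_prompt("when is the camera demo", top))
```

The same retrieval index supports both Q&A (feed the prompt to an LLM) and library search (return the ranked chunks directly).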
PhD candidate Hebrew University of Jerusalem
Task-Specific Adaptation with Restricted Model Access
The emergence of foundational models has greatly improved performance across various downstream tasks, with fine-tuning often yielding even better results. However, existing fine-tuning approaches typically require access to model weights and layers, leading to challenges such as managing multiple model copies or inference pipelines, inefficiencies in edge device optimization, and concerns over proprietary rights, privacy, and exposure to unsafe model variants. In this work, we address these challenges by exploring "Gray-box" fine-tuning approaches, where the model's architecture and weights remain hidden, allowing only gradient propagation. We introduce a novel yet simple and effective framework that adapts to new tasks using two lightweight learnable modules at the model's input and output. Additionally, we present a less restrictive variant that offers more entry points into the model, balancing performance with model exposure. We evaluate our approaches across several backbones on benchmarks such as text-image alignment, text-video alignment, and sketch-image alignment. Results show that our Gray-box approaches are competitive with full-access fine-tuning methods, despite having limited access to the model.
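A minimal sketch of the Gray-box idea, under strong simplifying assumptions: the hidden model is reduced to a fixed scalar map, the two learnable modules are scalars, and gradients are derived by hand so the toy needs no frameworks. The paper's modules and backbones are real networks; this only illustrates the core mechanism, that gradients can propagate through frozen weights which are never updated.

```python
# Frozen "model": f(z) = W * z. W is never touched by training.
W = 2.0
# Learnable entry/exit adapter modules (scalars here for simplicity).
a_in, a_out = 1.0, 1.0
x, target = 3.0, 18.0        # fit a_out * W * (a_in * x) to the target
lr = 0.001

for _ in range(500):
    y = a_out * (W * (a_in * x))          # forward pass through frozen model
    err = y - target                      # gradient of 0.5 * err**2 w.r.t. y
    a_out -= lr * err * (W * a_in * x)    # update output adapter
    a_in -= lr * err * (a_out * W * x)    # update input adapter (through W)

print(round(a_out * W * (a_in * x), 3))   # converges to ~18.0; W stays 2.0
```

The point of the sketch is the division of labor: all task adaptation happens in the adapters at the model's boundary, so the provider never exposes architecture or weights.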
AI System/Software ArchitectQualcomm
Generative AI on Mobile Devices: Challenges and Innovations
Founder & CEO, Cognata
Supervised Generative AI – Real Data Augmentation with DriveMatriX
AI Hub Research Manager, Weizmann Institute of Science
Tamar Kashti is a seasoned leader in algorithms and AI with over 15 years of research experience across academia and industry. She specializes in deep learning, computer vision, and image processing. Tamar currently manages the AI Hub at the Weizmann Institute, where she oversees the AI internship program for students and research fellows and advances cutting-edge medical imaging projects. Dr. Kashti has led algorithmic teams at Landa Labs and HP Indigo, pioneering innovations in calibration and print technologies, earning 7 patents. She has authored 15 published papers and holds a Ph.D. in theoretical physics from the Weizmann Institute.
Deep Denoising of Multiplexed Mass-based Images: Supervised vs. Self-supervised
Multiplexed mass-based imaging (MBI) technologies offer transformative insights into cellular diversity but are often hindered by significant noise. This study presents a deep learning-based approach to denoise MBI data, comparing supervised and self-supervised methods. Both approaches effectively address noise-related artifacts, with the supervised method excelling after fine-tuning for specific datasets, while the self-supervised approach demonstrates strong generalization across diverse data. These methods significantly reduce manual effort, delivering high-quality, denoised images within minutes. By enhancing image usability and analysis efficiency, this automated solution accelerates the adoption of MBI technologies for researchers and clinicians in biological and clinical studies.
Ph.D. Student, Ben Gurion University
SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images
Unsupervised Joint Alignment (JA) of images faces high complexity, geometric distortions, and convergence to poor optima. Vision Transformers provide powerful features but do not fully resolve these challenges, leading to reliance on expensive models with heavy regularization and extensive hyperparameter tuning. We propose the Spatial Joint Alignment Model (SpaceJAM), a compact architecture with ~16K trainable parameters requiring no regularization. Evaluations on SPair-71K and CUB demonstrate that SpaceJAM matches state-of-the-art performance while offering at least a 10x speedup. By setting a new standard for rapid, effective image alignment, SpaceJAM makes JA more accessible and efficient. Code is available at: https://bgu-cs-vil.github.io/SpaceJAM/
MSc student, Technion - Israel Institute of Technology
SIMPLE: Simultaneous Multi-Plane Self-Supervised Learning for Isotropic MRI Restoration from Anisotropic Data
Algorithm Researcher, 4M Analytics
Utilizing aerial imagery for utility object detection in infrastructure mapping
Algorithm Developer, Applied Materials
Repair Blind Spots in Semantic Segmentation
Semantic segmentation of images is crucial in AI expert systems. The U-Net architecture is widely considered the most practical choice for CNN solutions in application-specific semantic segmentation.
Despite its success, the U-Net architecture has inherent problems that prevent true shift-equivariance and single-pixel-level accuracy, as we demonstrate.
Here we propose an alternative CNN architecture, offering higher accuracy and efficiency, specifically designed for mission-critical applications where each pixel is crucial, such as anomaly detection in medical imaging and industrial quality control.
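The shift-equivariance failure referred to above can be seen in a few lines: strided pooling, the kind of downsampling a U-Net encoder performs, does not commute with a one-pixel shift of the input. This is a generic illustration of the phenomenon, not the talk's proposed architecture.

```python
def max_pool_stride2(x):
    # Stride-2 max pooling over a 1-D signal, as in a CNN downsampling stage.
    return [max(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]

def shift_right(x):
    # Shift the signal one sample to the right, padding with zero.
    return [0] + x[:-1]

signal = [0, 1, 0, 0, 3, 0, 0, 0]

# Pooling then shifting is NOT the same as shifting then pooling:
pooled_then_shift = shift_right(max_pool_stride2(signal))
shift_then_pooled = max_pool_stride2(shift_right(signal))

print(pooled_then_shift)   # [0, 1, 0, 3]
print(shift_then_pooled)   # [0, 1, 3, 0]
```

Because the stride discards sample positions, a one-pixel input shift produces a pooled output that no shift of the original pooled output can reproduce, which is exactly why strided encoders cannot be truly shift-equivariant.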
PhD Student, Tel Aviv University
I am an Electrical Engineering Ph.D. student at Tel-Aviv University, advised by Prof. Raja Giryes.
My research is in the area of Artificial Intelligence, focusing on the use of 3D data. My research goal is to explore new 3D representations and their integration with generative AI methods such as diffusion models. Our vision is to use this unique integration to take current 3D solutions a step forward. Previously, I obtained a B.Sc. (cum laude) and an M.Sc. from the Department of Computer Science at the Technion, where I was advised by Prof. Michael Elad. My M.Sc. thesis was about unfolding greedy sparse pursuit algorithms into deep neural networks.
TriNeRFLet: A Wavelet Based Triplane NeRF Representation
Lecturer, University of Haifa
Fetal weight estimation using deep learning-based segmentation
Principal Scientist, Walmart Global Tech
Generative video virtual try-on
Visualizations are critical in e-commerce, both for marketing and for matching customer and vendor expectations. The particular use case of garment virtual try-on has undergone tremendous growth in the past decade, accelerating with the current leap in generative models. The specific differentiator between an impressive presentation and a real-life solution is the trustworthiness of the result, which mitigates the customer's risk during a remote purchase.
M.Sc. student, Technion - Israel Institute of Technology
Sharon is an M.Sc. student at the Technion's Faculty of Data and Decision Sciences, co-advised by Dr. Moti Freiman and Dr. Yosi Maruvka.
His research focuses on developing Multiple Instance Learning (MIL) models for gigapixel histopathological images (Whole Slide Images), with an emphasis on creating scalable solutions for clinical diagnostics.
PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification
Whole slide images are gigapixel-sized digital scans of tissue samples, designed to capture intricate cellular and morphological patterns.
Their immense size necessitates division into smaller tiles, typically analyzed as instances within a Multiple Instance Learning (MIL) framework.
However, this formulation often overlooks spatial relationships between instances, which are crucial for capturing complex tissue structures during a histopathological examination.
To this end, we present a novel attention-based MIL framework that utilizes a probabilistic interpretation of self-attention to dynamically infer spatial dependencies during training.
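Loosely, folding spatial structure into attention can look like the toy below, where each tile's attention logit mixes a content score with a distance penalty before the softmax. The grid layout, scores, and decay scale `tau` are all illustrative stand-ins; PSA-MIL infers its spatial dependencies probabilistically during training rather than fixing them by hand.

```python
import math

# Tile id -> (grid position on the slide, content/feature score).
tiles = {
    "a": ((0, 0), 2.0),
    "b": ((0, 1), 1.5),
    "c": ((5, 5), 2.5),
}

def spatial_attention(query_pos, tiles, tau=1.0):
    # Attention logit = content score minus a distance penalty, then softmax.
    logits = {}
    for name, (pos, score) in tiles.items():
        dist = math.dist(query_pos, pos)
        logits[name] = score - dist / tau
    z = sum(math.exp(v) for v in logits.values())
    return {name: math.exp(v) / z for name, v in logits.items()}

weights = spatial_attention((0, 0), tiles, tau=1.0)
print(weights)
```

With the spatial prior active, the nearby tile "a" outweighs the distant tile "c" even though "c" has the higher content score, which is the kind of locality a purely content-based MIL attention would miss.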
Principal Research Scientist, OriginAI
Rami Ben-Ari is a Principal Research Scientist and Technical Lead at OriginAI and an Adjunct Professor in the CS-EE Faculty at Bar-Ilan University. Actively engaged in the academic community, he co-supervises MSc and PhD students and has authored over 50 papers, along with numerous patents. His research focuses on deep learning techniques for image retrieval, multimodal learning, video understanding, and generative models. He holds a PhD in Applied Mathematics from Tel-Aviv University, specializing in computer vision.
Fast image inversion and editing with diffusion models
Emerging diffusion models have demonstrated impressive capabilities in generating images from textual prompts and sampled random noise, commonly referred to as a seed. A common approach to editing a real image first recovers its corresponding seed, a process known as image inversion, and then regenerates the image with an adjusted prompt.
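The inversion idea can be sketched with a deterministic DDIM-style sampler on a scalar "image". The noise predictor below is a constant stand-in so that the inversion/generation roundtrip is exactly invertible; a real method queries a trained diffusion model, where inversion is only approximate, which is precisely what makes fast, accurate inversion and editing challenging. The alpha schedule is a toy.

```python
import math

alphas = [1.0, 0.8, 0.5, 0.2]            # toy cumulative-alpha noise schedule

def predict_eps(x, t):
    # Constant stand-in for a trained noise predictor; real predictors
    # depend on x, t, and the text prompt.
    return 0.2

def ddim_step(x, a_from, a_to, eps):
    # Deterministic DDIM update between noise levels a_from -> a_to.
    x0 = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    return math.sqrt(a_to) * x0 + math.sqrt(1.0 - a_to) * eps

def invert(image):
    # Run the sampler "backwards": image -> latent seed.
    x = image
    for t in range(len(alphas) - 1):
        x = ddim_step(x, alphas[t], alphas[t + 1], predict_eps(x, t))
    return x

def generate(seed):
    # Standard sampling: latent seed -> image.
    x = seed
    for t in reversed(range(len(alphas) - 1)):
        x = ddim_step(x, alphas[t + 1], alphas[t], predict_eps(x, t))
    return x

seed = invert(1.0)
print(round(generate(seed), 6))          # 1.0 (exact roundtrip here)
```

Editing then amounts to regenerating from the recovered seed while changing the conditioning; the cost of the many inversion steps is what fast-inversion methods aim to cut.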
PhD student, Reichman University
CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering
M.Sc. student, Technion - Israel Institute of Technology
From Semantic Understanding to Geometric Features: Using Foundation Models for Novel Robotic Tasks
AI Research Lead, AppsFlyer
Designing VLM-Based AI Agents for Large-Scale Video Analysis
Ph.D. Candidate, Bar Ilan University
Ben Fishman is an AI & Algorithms researcher and manager with 11 years of experience in both industry and academia. Today he is a Computer Science Ph.D. candidate at Bar Ilan University, supervised by Prof. Gal Chechik and Dr. Idan Schwartz focusing on Generative AI & Multi Modal. Prior to that he served as a Director of AI & Algorithms at Microsoft and as a team leader at Mobileye. Ben’s main areas of interest include Computer Vision, Speech & Audio, Deep Learning, and LLMs. He holds an M.Sc. in Electrical Engineering and a B.Sc. in Biomedical Engineering both from Tel Aviv University.
Bringing AI to Production
Senior Lecturer, Ben-Gurion University of the Negev
Breaking Barriers in Time Series Generation: From Koopman Dynamics to Image-Based Models
MSc Student, Technion - Israel Institute of Technology
Ido Sobol is an MSc student in Computer Science at Technion, supervised by Prof. Or Litany. His research focuses on 3D computer vision and generative AI, with a particular emphasis on exploring the internal mechanisms of diffusion models.
Diffusion Models in the 3D Space: From Robust Inference to New Training Procedures
Diffusion models have come to dominate generative applications, but they still face challenges. This talk explores two key challenges:
1. Generation Artifacts: Image-based diffusion models, used for 3D tasks such as Novel View Synthesis, often produce artifacts. To address these artifacts, we introduce “Zero-to-Hero”, a training-free attention filtering mechanism that enhances quality and condition enforcement.
2. Training Challenges: Training diffusion models requires large-scale datasets of the target modality, but 3D data is scarce. In our work “A Lesson in Splats”, we propose a novel training strategy that decouples the denoised modality from the supervision modality, enabling 3D diffusion models to be trained using only 2D supervision.
Algorithmic Research Lead, Visual Layer
Garbage In, Garbage Out: How Label Noise Affects Vision Models, and What To Do About It
Director of Research, Trigo
Ido, a member of Trigo's founding team, manages the company's interdisciplinary research group. Trigo specializes in computer-vision-based solutions for physical retail stores, such as category-leading autonomous checkout and theft detection systems. Ido's group develops cutting-edge AI-in-the-wild algorithms, serving millions of shoppers around the world, at scale and in real time.
Anonymized tracking of hundreds of people across thousands of cameras, online, in autonomous retail stores
At Trigo we developed the world’s most advanced fully autonomous store, serving shoppers with accurate receipts in real-time. The underlying problem, namely to track all people and “understand” their interactions with products, is complex, multi-tiered and multi-faceted. Hence, solving it requires harnessing the latest advancements and exploring new frontiers across multiple algorithmic disciplines.
Senior Algorithm Developer, Align Technology
Roie Cohen is a Senior Algorithm Developer at Align Technology.
His work focuses on identifying optimal NIR images for dental caries detection by combining optical-geometrical insights with traditional and AI-based computer vision techniques. Roie earned his PhD in physics from Tel Aviv University, where he was honored with multiple awards for his biophysical research in cellular mechanics.
His groundbreaking work has significantly advanced treatments for hearing deficiencies.
Align Technology, an S&P 500 company, leads the global dental industry with its Invisalign clear aligners and iTero 3D intraoral scanners, revolutionizing dental treatment through advanced 3D imaging and CAD/CAM technologies.
Combining NIR imaging with cutting-edge AI for non-radiation diagnostics
Screening for dental caries is a common practice aimed at early detection and prevention of invasive treatments. Using non-ionizing-radiation methods like Near-infrared (NIR) imaging eliminates the exposure to harmful X-rays while allowing accurate diagnosis.
NIR imaging leverages the teeth's partial transparency to visualize carious lesions. However, it faces challenges such as internal and specular reflections that deteriorate image quality. Our innovative solution combines classical methods with machine learning, using geometrical optics and image data.
This approach produces high-contrast, clear NIR images, enabling quick and precise diagnoses.
This allows for frequent, radiation-free patient monitoring with minimal inconvenience, revolutionizing dental diagnostics.
AI Algorithms Principal Engineer, Mobileye
Computationally Efficient Transformer for Autonomous Driving
Senior AI Researcher, Nexar
PixelSHAP: Extending TokenSHAP for Vision-Language Models (VLMs)
Research Engineer, Google & The Hebrew University of Jerusalem
Supervised Image Editing with Diffusion Models
AI team leader, DeePathology
Segmentation by Factorization: Unsupervised Semantic Segmentation for Pathology by Factorizing Foundation Model Features
Senior Researcher, Autobrains & Tel-Aviv University
Leah Bar holds a B.Sc. in Physics, an M.Sc. in Biomedical Engineering, and a Ph.D. in Electrical Engineering from Tel Aviv University. She completed her postdoctoral fellowship in the Department of Electrical Engineering at the University of Minnesota. Currently, she is a senior researcher at Autobrains, and also a researcher in the Applied Mathematics Department at Tel Aviv University. Her research interests include machine and deep learning, image processing, computer vision, and inverse problems.
Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval
Active Learning (AL) is a user-interactive approach aimed at reducing annotation costs by selecting the most crucial examples to label. Although AL has been extensively studied for image classification tasks, the specific scenario of interactive image retrieval has received relatively little attention. This scenario presents unique characteristics, including an open-set and class-imbalanced binary classification, starting with very few labeled samples. We introduce a novel batch-mode Active Learning framework named GAL (Greedy Active Learning) that better copes with this application. It incorporates new acquisition functions for sample selection that measure the impact of each unlabeled sample on the classifier. We further embed this strategy in a greedy selection approach, better exploiting the samples within each batch. We evaluate our framework with both linear (SVM) and non-linear MLP/Gaussian Process classifiers. For the Gaussian Process case, we show a theoretical guarantee on the greedy approximation. Finally, we assess our performance for the interactive content-based image retrieval task on several benchmarks and demonstrate its superiority over existing approaches and common baselines.
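As a rough illustration of greedy batch-mode selection (not GAL's actual impact-based acquisition functions), the toy below scores each unlabeled sample by its closeness to the current linear decision boundary, then greedily builds a batch that also favors samples far from those already chosen. The classifier, pool, and trade-off weight `lam` are all invented for the sketch.

```python
# Current 1-D linear classifier: predict sign(w * x + b); boundary at x = 0.5.
w, b = 1.0, -0.5
unlabeled = [0.0, 0.45, 0.55, 0.52, 3.0, -2.0]

def uncertainty(x):
    # Samples near the decision boundary are the most informative.
    return -abs(w * x + b)

def min_dist(x, chosen):
    # Distance to the nearest already-selected sample (diversity term).
    return min((abs(x - c) for c in chosen), default=1.0)

def greedy_select(pool, batch_size, lam=0.5):
    # Greedily grow the batch, re-scoring the pool after every pick so each
    # new sample accounts for the ones selected before it.
    chosen = []
    for _ in range(batch_size):
        best = max((x for x in pool if x not in chosen),
                   key=lambda x: uncertainty(x) + lam * min_dist(x, chosen))
        chosen.append(best)
    return chosen

batch = greedy_select(unlabeled, 3)
print(batch)
```

The greedy loop is what distinguishes batch-mode selection from picking the top-k scores at once: each pick conditions on the previous ones, so the batch is not filled with near-duplicates from one side of the boundary.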
PhD candidate, Tel Aviv University
ProtoSAM - One shot medical image segmentation with foundational models
Computer vision lead engineer, Samsung
Alexandra is a seasoned computer vision algorithm engineer with more than 20 years of experience in the industry.
In the last ten years she has been leading various computer vision projects at Samsung for its automotive and mobile SoCs.
She holds a PhD in Bioinformatics from Tel-Aviv University, an MSc in Electrical Engineering from Tel-Aviv University, and a BSc in Computer Engineering from the Technion, and is the author of numerous academic papers and patents.
Do More With What You Have: Transferring Depth-scale from Labeled to Unlabeled Domains
Adjusting pre-trained absolute depth predictors to new domains is a task with significant real-world applications.
This task is specifically challenging when images from the new domain are collected without ground-truth depth measurements, and possibly with sensors of different intrinsics.
We suggest a novel solution for this challenge and successfully demonstrate it on the KITTI, DDAD, and nuScenes datasets, while using other existing real or synthetic source datasets, achieving comparable or better accuracy than other existing methods that do not have access to target ground-truth depths.
Research engineer, General Motors
Open-Vocabulary Object-Based Image Retrieval
Tech lead algorithm developer, Applied Materials
Beyond Nanoscale: Novel SEM Image Segmentation Technique for Accurate Metrology
The chip manufacturing process heavily relies on accurate segmentation of semiconductor patterns to ensure the quality and efficiency of advanced chips. This talk addresses the challenges involved in segmenting nano-scale features in scanning electron microscope images of semiconductor patterns. We propose a foundation model, trained on a large dataset, and specifically designed to tackle the unique challenges present in this domain. We demonstrate the high performance and robustness of our algorithm, establishing it as a groundbreaking tool for the semiconductor industry. With its versatility and ease-of-use, our approach paves the way for advancements of semiconductor metrology and chip manufacturing.
Ph.D. Candidate, Ben-Gurion University of the Negev
Wavelet Convolutions for Large Receptive Fields
Senior Research Scientist and Manager, IBM
Assaf Arbelle (PhD) is a Senior Research Scientist and Manager of the Multimodal-AI group at IBM Research.
Assaf leads a team of 13 scientists with expertise in the fields of Computer Vision, Speech, NLP, and more. The team focuses on Multimodal Foundation Models, working to develop the next generation of AI systems capable of handling multiple tasks across various modalities. Assaf’s research interests include the intersection of Computer Vision and Language Models, as well as broader topics in Machine Learning, Speech and Audio, and general AI.
The work at IBM Research emphasizes advancing AI for business, implementing innovative methods and tools to accelerate real-world use cases.
Training a Large Vision-Language Model - Insights and lessons learned from building IBM’s Granite-Vision VLM
With the exponential growth of Large Language Models came a rise in the joint-modality paradigm of Large Multimodal Models (LMMs), which expand an LLM's reasoning and understanding capabilities to additional modalities such as images and audio. This talk will present the work done at IBM Research to build Granite Vision, a lightweight large language model with vision capabilities, specifically designed to excel in enterprise use cases, particularly in visual document understanding. We will discuss the network's architecture, the data used to train the model, and the technical details of training the model. We will also present compelling results surpassing SoTA models on enterprise-related tasks.
PhD student, Weizmann Institute of Science
Teaching VLMs to Localize Specific Objects from In-context Examples
Vision-Language Models (VLMs) excel in various visual tasks but struggle with context-aware object localization, a crucial ability in ambiguous scenarios. We address this gap with a few-shot personalized localization task, where a model learns to localize objects from a small set of annotated examples. To enhance contextual reasoning, we fine-tune VLMs using curated video object tracking data, simulating instruction-tuning dialogues. Additionally, we introduce a novel regularization technique that replaces object labels with pseudo-names, enforcing reliance on visual context. Our approach significantly improves few-shot localization while preserving generalization, establishing the first benchmark for personalized localization and advancing context-driven vision-language modeling.