Monday, April 27, 2026
Vision AI Alliances Manager, NVIDIA
Eyal is a Vision AI Alliances Manager at NVIDIA, specializing in computer vision and deep learning. He holds a B.Sc. in Electrical Engineering from the Technion and has focused on video analytics for the past 17 years, working closely with dozens of Vision AI companies in Israel on both the technology and business-development sides.
Build Vision AI Agents With NVIDIA Cosmos Reason VLM and Video Search and Summarization Blueprint
AI systems often struggle to connect perception with reasoning in dynamic real-world environments. NVIDIA’s Cosmos Reason VLM bridges this gap by combining vision, language, and world knowledge to power intelligent video and multimodal understanding. Join this session to learn how to post-train Cosmos Reason Vision Language Model with your own data and build Vision AI agents using NVIDIA NIM microservices and the VSS blueprint. The session will feature real-world use cases and practical guidance on creating intelligent workflows for applications across manufacturing, logistics, safety, and more.
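For readers who want to experiment before the session, NIM microservices expose an OpenAI-compatible chat endpoint, so a vision query can be issued in a few lines of Python. This is a minimal sketch; the endpoint URL, model id, and API key are placeholders, not the session's exact configuration.

```python
# Minimal sketch of querying a vision NIM microservice through the
# OpenAI-compatible chat endpoint that NIM containers typically expose.
# URL, model id, and key below are placeholders (assumptions), not the
# exact setup shown in the session.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")  # hypothetical local NIM

with open("frame.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="nvidia/cosmos-reason",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe any safety violations in this frame."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```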
Professor, Sheba Medical Center & Ariel University
Aidoc
Professor, Tel Aviv University
Sr. Research Manager, IBM Research
Granite Vision – the VLM for Enterprise Workflows
Granite Vision is IBM’s open‑source vision‑language model, engineered to meet the demands of enterprise‑scale workflows. Our latest advancements significantly enhance the model’s ability to understand complex visual structures – such as tables, charts, and forms – and to perform high‑fidelity semantic and structured information extraction from real‑world business documents. In this session, we will highlight the technologies behind these capabilities, share key insights from our research, and demonstrate how they enable more intelligent RAG pipelines and agentic workflows, ultimately accelerating critical enterprise processes.
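As a rough illustration of how such an open model is typically consumed, the sketch below loads a vision-language checkpoint with Hugging Face transformers and asks it to extract a table. The model id follows Granite Vision's naming on the Hub but should be treated as an assumption, as should the chat-template details.

```python
# Illustrative sketch of loading an open VLM for document understanding;
# the checkpoint name and prompt format are assumptions, not verified
# details of the release discussed in the talk.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "ibm-granite/granite-vision-3.2-2b"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

image = Image.open("invoice_page.png")
messages = [{"role": "user",
             "content": [{"type": "image"},
                         {"type": "text", "text": "Extract the table as JSON."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(out[0], skip_special_tokens=True))
```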
Associate Professor, Ben-Gurion University of the Negev
Deep Learning for Geometric Alignment
Senior Researcher, Autobrains & Tel Aviv University
Leah Bar received her B.Sc. in Physics, M.Sc. in Biomedical Engineering, and Ph.D. in Electrical Engineering from Tel Aviv University. She then completed her postdoctoral fellowship at the University of Minnesota. She is currently a Senior Researcher at Autobrains and in the Applied Mathematics Department at Tel Aviv University. Her work lies at the intersection of machine and deep learning, image processing, computer vision, and inverse problems, with particular interest in bridging mathematical structure and data-driven methods.
A Geometric-Probabilistic View of Diffusion and Manifold Projection for Image Restoration
Natural images are often viewed as concentrating near low-dimensional structures embedded in high-dimensional spaces. Yet in many modern approaches this geometry remains implicit, absorbed into large probabilistic models. In this talk, I revisit Blind Image Denoising from a geometric-probabilistic perspective. We couple an encoder-decoder representation with a learned distance function, interpreting restoration as iterative projection toward the set of clean and meaningful images. This viewpoint clarifies diffusion-type dynamics and leads naturally to a deterministic alternative, the Manifold-Probabilistic Projection Model (MPPM), applicable in both pixel and latent spaces and exhibiting robust behavior across diverse degradations.
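The projection view lends itself to a compact sketch: starting from a degraded image, repeatedly step against the gradient of a learned distance-to-manifold function. The toy network below stands in for the learned distance; this illustrates the deterministic iteration only, not the authors' implementation.

```python
# Toy sketch of "restoration as projection": iterate gradient steps on a
# learned distance-to-the-clean-image-set. dist_net is a stand-in, not
# the talk's trained distance function.
import torch
import torch.nn as nn

dist_net = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(), nn.Linear(256, 1))

def project(x, steps=50, eta=0.1):
    x = x.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        d = dist_net(x).sum()            # estimated distance to the clean-image set
        (grad,) = torch.autograd.grad(d, x)
        x = (x - eta * grad).detach()    # deterministic projection step (no injected noise)
    return x

restored = project(torch.rand(1, 1, 64, 64))
```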
Professor, Technion - Israel Institute of Technology
Tomer Michaeli is a Professor in the Electrical and Computer Engineering Department at Technion – Israel Institute of Technology. He completed his BSc and PhD degrees in that department in 2005 and 2012, respectively. After a postdoctoral period at the Weizmann Institute of Science, he joined the Technion as a faculty member in 2015. His research lies in the fields of Computer Vision and Machine Learning. He is the recipient of several awards, among which are the Krill Prize for Excellence in Scientific Research by the Wolf Foundation (2020), the Best Paper Award (Marr Prize) at ICCV 2019, the Best Paper Award at SIGGRAPH 2025, and the Best Student Paper Award at ICCV 2025.
Editing real images with pre-trained flow models
Flow models can generate images and videos based on textual descriptions. However, in many cases, it is desirable to edit real images/videos rather than generating synthetic ones. Repurposing a pre-trained flow model for editing has attracted significant attention in recent years. Yet, existing solutions tended to struggle either with maintaining similarity to the source image/video or with adhering to the text describing the desired edit. This talk will show that the root cause of those limitations is the reliance on an “editing-by-inversion” paradigm, where the source image/video is first mapped to the initial noise that generates it. It will be demonstrated how breaking away from this paradigm allows achieving significantly better results. The proposed inversion-free, training-free, and model-agnostic approaches have seen widespread adoption and integration into popular models, achieving results that compete even with training-based methods.
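To make the contrast concrete, the toy sketch below illustrates one inversion-free idea: rather than mapping the source image back to noise, integrate the difference between target- and source-conditioned velocity fields starting from the source itself. The dummy velocity function stands in for a pretrained flow model; this is a generic illustration, not the exact method presented in the talk.

```python
# Toy illustration of inversion-free editing with a flow model: no
# inversion to noise; instead integrate the *difference* of conditional
# velocities starting from the source image. `velocity` is a dummy
# stand-in for a pretrained flow network.
import torch

def velocity(z, t, cond):
    return cond * torch.ones_like(z) - z  # dummy conditional velocity field

def edit(x_src, src_cond, tgt_cond, n_steps=50):
    z, dt = x_src.clone(), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        z = z + dt * (velocity(z, t, tgt_cond) - velocity(z, t, src_cond))
    return z

edited = edit(torch.rand(1, 3, 8, 8), src_cond=0.0, tgt_cond=1.0)
```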
Chief Technology Officer, Mobileye
Shai Shalev-Shwartz is the chief technology officer of Mobileye. He leads software and algorithm technology development for Mobileye’s advanced driving assist systems (ADAS), highly autonomous and fully autonomous driving solutions, and enabling technologies including Responsibility-Sensitive Safety (RSS) and Road Experience Management™ (REM™) maps. He is also a professor at the Rachel and Selim Benin School of Computer Science and Engineering at the Hebrew University of Jerusalem.
Shalev-Shwartz developed the science behind RSS as well as an economically viable path toward a future where there are no casualties from car accidents. He is drawn to autonomous driving as the first large-scale deployment of artificial intelligence and machine learning outside of the cybernetic world.
Shalev-Shwartz is best known for pioneering research in machine learning and was listed as one of the 100 most influential researchers worldwide in 2016 by AMiner. In 2014, he co-authored one of the top books used by major universities on theoretical machine learning: “Understanding Machine Learning: From Theory to Algorithms.”
Before joining Hebrew University and Mobileye, Shalev-Shwartz was a research assistant professor at Toyota Technological Institute in Chicago, and also worked in research at both Google and IBM. Shalev-Shwartz has written more than 100 research papers, focusing on machine learning, online prediction, optimization techniques, and practical algorithms.
In 2020, he was awarded the prestigious Michael Bruno Award for his groundbreaking research and his unique contribution to computer science and engineering.
Scaling Autonomous Driving: Explicit State, Synthetic Societies, and Foundation-Model Supervision
Autonomous driving sits at the intersection of two worlds: the geometric precision demanded by safety-critical robotics, and the open-ended semantic diversity captured by modern foundation models. To reach human-level reliability, self-driving systems must master both.
This keynote introduces a new architectural perspective that combines explicit online sensing state, semantic reasoning, and large-scale closed-loop training in artificially simulated “driving societies.” I will show how foundation models can be used not inside the high-frequency control loop, but around it: as automatic labelers, long-tail generators, anomaly detectors, and slow-thinking semantic supervisors.
By merging real fleet data with semantic simulation, by learning intentions and interactions rather than only trajectories, and by enforcing principled safety interfaces on top of learned policies, we obtain a system that scales like modern AI while satisfying the constraints of safety-critical engineering.
This architecture highlights a roadmap for the next generation of SDS: structured perception is not a limitation for end-to-end learning, but a force multiplier that enables data efficiency, reasoning, and safety in the long tail of autonomous driving.
Assistant Professor, Bar-Ilan University
COPER: Correlation-based Permutations for Multi-View Clustering
Combining data from multiple sources often leads to better insights, yet many existing multi-view clustering methods are tailored to specific domains or require complex, multi-stage pipelines. We present a practical end-to-end deep learning framework that works across diverse data types, including images and tabular data. Our approach learns unified representations that capture shared structure across sources and enables consistent grouping without manual labels. The method is scalable, robust to noise, and supported by both theoretical insights and extensive experiments on ten benchmark datasets, demonstrating strong and reliable performance across varied real-world settings.
AI Research Scientist, Earth Dynamics AI
Michal Holtzman Gazit is a Computer Vision and AI Researcher at Earth Dynamics AI with over 25 years of expertise in image processing, computer vision, and deep learning. Her career evolved from a foundation in medical imaging to sophisticated 2D and 3D structural analysis. She holds a BSc and MSc in Electrical Engineering and a PhD in Computer Science from the Technion, and performed post-doctoral research in inverse problems at the University of British Columbia. Michal specializes in leading the transition of advanced research to production-ready systems. Currently, she develops Geoscience Foundation Models for mineral exploration, utilizing AI to decode Earth’s 3D structures and revolutionize resource discovery.
Multi-Modal Geologic Intelligence: 3D Inversion and Map Synthesis via Generative Foundation Models
The integration of generative foundation models into geoscientific workflows represents a transformative shift in solving complex inverse problems. We explore advanced architectures for map synthesis via Conditional Flow Matching and volumetric inversion via 3D VAEs, leveraging magnetic, gravity, and drilling data. By constraining multi-modal generative priors with physical laws, we synthesize high-fidelity geologic insights from sparse, unorganized measurements. This approach accelerates mineral exploration by significantly reducing the cost and uncertainty of targeting subsurface anomalies. This synergy of cross-modal generative processes and potential field theory defines a new era of geologic intelligence.
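As background, conditional flow matching trains a network to regress the velocity of a straight path between noise and data. A minimal training step might look as follows, with a toy MLP and made-up shapes standing in for the map-synthesis model.

```python
# Sketch of one conditional flow-matching training step, the generative
# backbone mentioned for map synthesis. The tiny MLP and data shapes are
# placeholders, not the talk's architecture.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2 + 1 + 1, 128), nn.SiLU(), nn.Linear(128, 2))

def cfm_loss(x1, cond):
    """x1: target samples (e.g., geologic map patches); cond: conditioning."""
    x0 = torch.randn_like(x1)                 # noise endpoint
    t = torch.rand(x1.shape[0], 1)            # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # point on the straight path
    v_target = x1 - x0                        # the path's constant velocity
    v_pred = net(torch.cat([xt, t, cond], dim=1))
    return ((v_pred - v_target) ** 2).mean()

loss = cfm_loss(torch.randn(64, 2), torch.randn(64, 1))
loss.backward()
```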
Principal Image Processing Specialist, Abbott Laboratories
Lorina Dascal is a principal computer vision and image processing specialist at Abbott Labs. Her research interests include deep learning for image/video understanding, 3D medical shapes, multimodal fusion of imaging, and neural partial differential equations in vision. She has authored 14 published papers and has earned 11 patents. She holds a PhD in Applied Mathematics from Tel Aviv University and was a postdoctoral fellow and research assistant in the Computer Science Department at the Technion.
Automatic 3D Surface Reconstruction of the Left Atrium from Unorganized Contours
ICE (intracardiac echocardiography) is a valuable tool in cardiac catheterization and electrophysiology (EP) procedures, assisting physicians in visualizing anatomical details and in monitoring procedures like catheter ablation, septal defect closure, left atrial appendage occlusion, and valve implantation. Our aim is to automatically create an accurate three-dimensional surface model of the left atrium from automatically segmented boundaries of ICE images. We propose a modified Poisson reconstruction method with additional geometric constraints, which enables the creation of accurate, highly detailed, and computationally efficient surfaces from diverse sets of unorganized and sparse contours.
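For orientation, the classical building block is standard (unconstrained) Poisson surface reconstruction from oriented points, shown below with Open3D. The talk's additional geometric constraints are not reproduced here, and the random points stand in for segmented ICE contours.

```python
# Baseline sketch: vanilla Poisson surface reconstruction with Open3D
# from an unorganized point set. The method in the talk adds geometric
# constraints on top of this classical step, which are not shown here.
import numpy as np
import open3d as o3d

pts = np.random.rand(5000, 3)                      # stand-in for ICE contour points
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(pts)
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
```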
Tel Aviv University
Hana Bezalel holds an M.Sc. from Tel Aviv University, supervised by Hadar Averbuch-Elor. Her CVPR 2025 publication focuses on relative, in-the-wild pose estimation in extreme settings. Previously Lead Computer Vision Engineer at Rafael, she currently serves as an Algorithm Developer at Mobileye, where her work centers on geometric computer vision and spatial understanding.
Extreme Rotation Estimation in the Wild
We present a technique and benchmark dataset for estimating the relative 3D orientation between a pair of Internet images captured in an extreme setting, where the images have limited or non-overlapping fields of view. Prior work targeting extreme rotation estimation assumes constrained 3D environments and emulates perspective images by cropping regions from panoramic views. However, real images captured in the wild are highly diverse, exhibiting variation in both appearance and camera intrinsics. In this work, we propose a Transformer-based method for estimating relative rotations in extreme real-world settings, and contribute the ExtremeLandmarkPairs dataset, assembled from scene-level Internet photo collections. Our evaluation demonstrates that our approach succeeds in estimating the relative rotations in a wide variety of extreme-view Internet image pairs, outperforming various baselines, including dedicated rotation estimation techniques and contemporary 3D reconstruction methods.
Senior AI Applied Researcher, Bluewhite Robotics
Or Kozlovsky is a Senior AI Applied Researcher at Bluewhite Robotics and was recently a Student Researcher at Google. His work and research focus on generative AI, spatial AI, and real-time computer vision in both 2D and 3D domains. He has a strong record of bridging cutting-edge research with real-world computer vision applications across a broad range of areas, including medical, space, entertainment, and robotics.
Currently, Or is an M.Sc. student at Tel Aviv University under the supervision of Prof. Amit Bermano, and holds dual B.Sc. degrees in Electrical Engineering and Economics from the Technion.
BINA: Bootstrapped Intelligence for Novel Adaptation
Robotic systems in real-world environments face conditions unseen during development, and while foundation models promise better generalization, integrating them under real-time onboard constraints remains challenging. We introduce BINA, a deployment-driven framework for online perceptual adaptation. BINA leverages online sparse supervision from a VLM to incrementally distill semantic knowledge into an onboard perception module. Beyond single-robot learning, BINA supports fleet-level knowledge aggregation, enabling scalable adaptation to new environments. Demonstrated on off-road traversability estimation, BINA rapidly converges from zero prior knowledge through operator-guided driving. Although demonstrated on traversability, BINA is task-agnostic and applicable to other perception and autonomy tasks.
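In spirit, the online adaptation loop can be sketched as sparse VLM supervision distilled into a small onboard network as frames stream in. The `query_vlm` stub and the tiny per-pixel classifier below are placeholders, not the BINA implementation.

```python
# Conceptual sketch of online distillation from sparse VLM labels into a
# small onboard model; all components are illustrative stand-ins.
import torch
import torch.nn as nn

onboard = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 2, 1))          # per-pixel traversable / not
opt = torch.optim.Adam(onboard.parameters(), lr=1e-3)

def query_vlm(frame):
    """Placeholder for a slow VLM call returning a few labeled pixels."""
    ys, xs = torch.randint(0, 64, (8,)), torch.randint(0, 64, (8,))
    return ys, xs, torch.randint(0, 2, (8,))

for frame in torch.rand(10, 1, 3, 64, 64):             # stream of frames
    ys, xs, labels = query_vlm(frame)                   # sparse supervision only
    logits = onboard(frame)[0, :, ys, xs].T             # (8, 2) at supervised pixels
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```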
AI Researcher, The Hebrew University of Jerusalem
Latent Space JAM: Layout-Guided Video Generation
R&D, Lightricks
Holds an M.Sc. in Computer Science from HUJI under Prof. Shmuel Peleg's supervision. Currently works at Lightricks; main research interest is video generation.
Sr. Algorithm Developer, Align Technology, Inc.
Zvi Stein is an Algorithms Engineer at Align Technology, working on computer vision and 3D geometry pipelines for multi-view scanning. His work focuses on surface reconstruction, mesh refinement, and performance-critical implementations with GPU acceleration. He has experience building end-to-end systems, from image-based inference to real-time processing and quality evaluation, aimed at improving surface accuracy and robustness in challenging acquisition conditions.
Mesh Refinement from Multi-View RGB Using Image-Predicted Surface Normals
Accurate surface refinement in regions with fine geometric detail remains challenging in practical 3D acquisition pipelines, where reconstructed meshes are often limited by scan resolution and noise. Although many scanning systems capture high-resolution multi-view RGB imagery, exploiting these images for metric geometry refinement is difficult due to scale ambiguity and perspective effects inherent to wide field-of-view 2D projections.
We present a geometry-refinement pipeline that converts multi-view RGB observations into a consistent surface normal field and integrates it to deform an initial mesh toward a refined surface. The central approach is to use image-predicted surface normals as the primary refinement signal, providing scale-consistent geometric constraints that are not directly available from intensity values alone. Input views are selected and scored based on geometric visibility and viewpoint diversity to ensure robust coverage and stable convergence across the surface. To mitigate projection-induced distortions, images are undistorted and re-parameterized into locally aligned patches, with corresponding rotations applied to the predicted normals.
A U-Net model trained from scratch predicts normal maps on a dedicated network surface, while deformation is applied on a separate, explicitly upsampled sampling surface designed to absorb high-frequency detail beyond the resolution of the original reconstruction; an additional simplified surface supports efficient view selection and scoring. The refined normal field is fused by solving a Poisson formulation to recover metrically consistent vertex displacements. Experimental results demonstrate improved reconstruction fidelity in high-curvature and detail-critical regions, recovering subtle structures that are commonly smoothed or missing in scan-resolution-limited meshes.
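To convey the core objective, the sketch below optimizes vertex positions so that face normals agree with an image-predicted normal field. The pipeline described above instead solves a Poisson system for metrically consistent displacements, so plain gradient descent here is purely illustrative.

```python
# Simplified sketch of normal-driven mesh refinement: align face normals
# with a predicted normal field via gradient descent. Illustrates the
# objective only; the talk's method solves a Poisson formulation instead.
import torch

verts = torch.rand(100, 3, requires_grad=True)
faces = torch.randint(0, 100, (180, 3))
target_n = torch.nn.functional.normalize(torch.rand(180, 3), dim=1)  # predicted normals
opt = torch.optim.Adam([verts], lr=1e-2)

for _ in range(200):
    v0, v1, v2 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    n = torch.cross(v1 - v0, v2 - v0, dim=1)           # unnormalized face normals
    n = torch.nn.functional.normalize(n, dim=1)
    loss = (1 - (n * target_n).sum(dim=1)).mean()      # cosine misalignment
    opt.zero_grad(); loss.backward(); opt.step()
```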
Software Tech Lead, Senior AI Engineer, General Motors
Introduction to Multi-Agent Architecture Patterns
Senior Algorithm Engineer, WSC Sports
Unified Sports Perception: Single-Pass Extraction of Dense Player Metadata via Lightweight VLMs
Algorithm Team Lead, WSC Sports
Assistant Professor (Senior Lecturer), The Hebrew University of Jerusalem
Retrieval-Augmented 3D Vision: Analysis and Generation in the Long Tail
Algorithm Engineer, Corephotonics, a Samsung Company
Abraham (Avi) Pelz received his B.Sc. and M.Sc. degrees in Electrical and Electronics Engineering from Tel Aviv University. Since 2018, he has been an Algorithm Researcher at Corephotonics, specializing in vision algorithm proofs-of-concept. His research encompasses challenging data scenarios, including self-supervision, domain adaptation, neural uncertainty estimation, and data scaling. Avi is the corresponding author of Kim, Jaeseong, Abraham Pelz, Michael Scherer, and David Mendlovic, “On the Effectiveness of Sparse Linear Polarization Pixels for Face Anti-Spoofing,” IEEE Sensors Journal 25 (2025).
Data Efficiency Estimation from Tiny Datasets: Sparse Polarization Biometric Case Study
Developing robust AI systems requires massive datasets, a significant hurdle for emerging technologies. This talk explores how to evaluate data utility early in the R&D cycle, drawing from our recent paper, "On the Effectiveness of Sparse Linear Polarization Pixels for Face Anti-Spoofing." We not only show that sparse linear polarization is highly effective for face anti-spoofing (tenfold error reduction relative to RGB), but also demonstrate that less informative representations require exponentially more training data to reach given specifications—a trend predictable using tiny datasets. This talk provides a practical case study for comparing physical representations early in research, helping teams identify the most promising technologies before hitting the "data bottleneck."
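The extrapolation itself is simple to reproduce: fit a power law error(n) ≈ a·n^b in log-log space to a handful of tiny-dataset measurements, then solve for the sample count that reaches a target error. The numbers below are made up for illustration.

```python
# Sketch of the kind of data-scaling extrapolation described: fit a
# power law to a few tiny-dataset error measurements, then predict how
# much data a target error requires. All numbers are illustrative.
import numpy as np

n = np.array([100, 200, 400, 800])            # tiny training-set sizes
err = np.array([0.30, 0.24, 0.19, 0.15])      # measured validation errors (made up)

b, log_a = np.polyfit(np.log(n), np.log(err), 1)   # linear fit in log-log space
a = np.exp(log_a)                                   # err ~ a * n**b  (b < 0)

target = 0.05
n_needed = (target / a) ** (1.0 / b)
print(f"extrapolated samples for {target:.0%} error: {n_needed:,.0f}")
```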
Computer Vision Researcher, Tel Aviv University
Leveraging Diffusion Models towards PE Early Diagnosis using CXRs
Patients with respiratory issues in emergency rooms typically undergo chest X-rays (CXR), which are accessible and low-cost but provide limited low-resolution imaging. Higher-risk patients are referred for more detailed and expensive CT or Computed Tomography Pulmonary Angiography (CTPA) scans, which involve higher radiation. The study focuses on detecting Pulmonary Embolism (PE), usually invisible in CXR but detectable in CTPA.
By leveraging paired CXR–CTPA data, we investigate two complementary diffusion-based strategies that transfer diagnostic knowledge from the high-fidelity CTPA modality to the widely available CXR domain. In the first, a conditional diffusion model is trained to generate 3D CTPA-like representations directly from 2D CXRs, enriching the initial imaging with high-resolution vascular cues and improving PE detection performance from 69% to 80% AUC. In addition, we introduce a latent-space diffusion prior that performs cross-modal knowledge distillation, generating CTPA-informed classifier embeddings from CXR embeddings without explicit image synthesis, enabling state-of-the-art PE classification using CXR alone. Together, these approaches demonstrate that diffusion models can act as powerful cross-modal bridges, either through image generation or embedding-level supervision, substantially enhancing early PE diagnosis from CXRs while reducing reliance on expensive, high-radiation imaging. Although not a replacement for clinical CTPA, this framework highlights a scalable and generalizable pathway for augmenting low-cost imaging with high-level diagnostic insight. Our contributions through these works are as follows: (1) the first true CXR→CTPA diffusion pipeline with diagnostic validation; (2) a novel 1D-diffusion prior for CXR→CTPA embedding distillation; (3) state-of-the-art CXR-based PE classification; (4) a modality-agnostic framework extendable to other cross-modal imaging tasks, facilitating wider access to advanced diagnostic tools.
Algorithm Developer, Applied Materials
Warp and Render: A Dual-Network Framework for Geometry-Controlled Simulation in Semiconductor Process Diagnostics
Data Scientist, Microsoft
Tom Hirshberg is a data scientist at Microsoft in the Edge AI group, where she develops multimodal AI systems for large-scale video understanding. Previously, she was a research intern at Microsoft in Redmond, focusing on optimization and control methods for autonomous robotic systems.
Tom holds a BSc and an MSc (cum laude) in Computer Science from the Technion. During her studies, she was part of the algorithm team that developed the first student autonomous Formula race car at the Technion. Her master’s thesis explored acoustic-based indoor localization for drones, bridging signal processing, machine learning, and robotics.
Object Detection and Tracking in Live Streams Using Textual and Visual Detailed Descriptions
In the live video analysis domain, everything must happen quickly, efficiently and accurately. While traditional object detection systems rely on predefined classes, modern applications require flexibility to describe, detect, and track any object in live video streams. This brings algorithmic and computational challenges, especially for edge devices, like handling detailed attributes (e.g., “a red vintage car”), integrating specialized trackers, and managing high camera loads efficiently.
This lecture presents an algorithm for detecting and tracking objects in live video streams using detailed textual description, image examples or both. Our approach is already successfully implemented in Microsoft’s Azure AI Video Indexer.
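A generic version of the open-vocabulary step can be sketched with CLIP: score candidate detector crops against a free-text description and keep the best matches. This illustrates the idea only and is not the Azure AI Video Indexer implementation.

```python
# Minimal sketch of open-vocabulary filtering: rank detector crops by
# CLIP similarity to a free-text description. Generic illustration, not
# the production system described in the talk.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

crops = [Image.new("RGB", (224, 224)) for _ in range(4)]  # stand-ins for detector crops
inputs = proc(text=["a red vintage car"], images=crops, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)
scores = out.logits_per_image.squeeze(1)       # similarity of each crop to the text
keep = scores > scores.mean()                  # naive threshold for illustration
```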
Research Manager, IBM Research
Adaptive Resolution Processing in Vision-Language Models
AI Researcher, The Hebrew University of Jerusalem
I am a PhD candidate at The Hebrew University of Jerusalem, advised by Prof. Dani Lischinski, and a Senior Researcher at General Motors. My research focuses on image and video generation and manipulation, with a particular interest in adverse viewing conditions and out-of-distribution (OOD) concepts, primarily related to automotive scenarios.
Seed-to-Seed: Unpaired Image Translation in Diffusion Seed Space
We introduce Seed-to-Seed Translation (StS), a framework for unpaired image-to-image translation built on two primary contributions. First, we provide an in-depth analysis of the space of inverted latents ("seeds"), denoted "seed-space", demonstrating that it encodes critical semantic features for discriminative tasks. Second, we leverage these features through a novel hybrid mechanism that combines a GAN with a diffusion model to perform unpaired seed-to-seed translation (i.e., image translation in the seed space) before the diffusion sampling steps start.
We show that our method outperforms existing GAN and diffusion-based baselines in complex automotive scene synthesis, while establishing a novel paradigm for latent-based image manipulation.
Researcher, Volcani Institute
Dr. Iftach Klapp (Ph.D., Electrical Engineering) joined the Volcani Institute following a short post-doctoral training and founded the Agro-Optics and Sensing Laboratory, dedicated to developing advanced electro-optical sensing systems for agriculture. The lab studies interactions between sensors, objects, and the environment, creating optical systems and embedding physical models into data processing to ensure accurate sensing in dynamic conditions. Its approach integrates physical modeling with inverse-problem methods, including Physically Aware Convolutional Neural Networks, to extract reliable, meaningful information from sensor data. Prior to his PhD studies, he worked for six years in the Automatic Optical Inspection industry as an opto-mechanical R&D engineer.
Affordable Thermal Imaging: Overcoming Accuracy and Resolution Limits with AI
Plant temperature serves as a critical indicator of crop health, especially for identifying water stress that triggers stomatal closure and canopy heating. Although radiometric thermal IR cameras can detect such stress at an early stage, their high cost (>$20,000) restricts their use in agriculture. More affordable uncooled thermal cameras (~$4,000) present a promising alternative but suffer from drift, non-uniformity, limited accuracy (±5 °C), and low spatial resolution. To overcome these limitations, we developed deep-learning methods for non-uniformity correction (NUC) and super-resolution (SR), enhancing image resolution by factors of ×2 and ×4. In field experiments using a low-cost FLIR TAU2 alongside a scientific-grade FLIR A655Sc mounted on the same drone, our end-to-end system achieved real-time processing (<1 s per frame) with high fidelity, reducing root mean square error to ~0.5 °C. The derived Crop Water Stress Index (CWSI) closely matched reference measurements, with deviations of only ~1.4–1.9%, demonstrating that this approach enables precise, affordable, and scalable agricultural monitoring for water management.
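The stress index mentioned above follows the standard definition CWSI = (T_canopy - T_wet) / (T_dry - T_wet). A minimal per-pixel computation, with assumed reference temperatures, looks like this.

```python
# Per-pixel Crop Water Stress Index from a corrected thermal frame,
# using the standard definition; reference temperatures are assumed
# values for illustration.
import numpy as np

t_canopy = np.random.uniform(24.0, 34.0, size=(480, 640))  # corrected thermal frame, deg C
t_wet, t_dry = 22.0, 36.0                                  # wet/dry reference temps (assumed)

cwsi = np.clip((t_canopy - t_wet) / (t_dry - t_wet), 0.0, 1.0)
print(f"mean CWSI: {cwsi.mean():.2f}")   # 0 = unstressed, 1 = fully stressed
```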
Senior Data Scientist, NVIDIA
Advancing AI in Radiology Research with NVIDIA Clara Open Medical Imaging Models
This session presents the Clara Medical Open Models, a suite of pre-trained deep learning models designed to advance research in medical image analysis. We will present the architectural principles, dataset curation strategies, and benchmarking protocols that underpin these models, emphasizing explainability, reproducibility, and domain generalization. Case studies will illustrate their application across diverse imaging modalities, including CT, MRI, and digital pathology. Through this exploration, the session will highlight how open, standardized model repositories accelerate scientific discovery and enable robust evaluation frameworks in healthcare research.
Ben-Gurion University of the Negev
Omri Hirsch is an M.Sc. student in Computer Science at Ben-Gurion University of the Negev, conducting research in Computer Vision and Machine Learning in the Vision, Inference, and Learning (VIL) group under the supervision of Prof. Oren Freifeld. His research focuses on efficient geometric learning and joint image alignment, and he is the first author of FastJAM, recently accepted to NeurIPS 2025. He has previously worked on medical imaging in collaboration with Dr. Yonatan Winetraub’s lab at Stanford University. Omri is a recipient of competitive scholarships for outstanding M.Sc. students in AI and Data Science for two consecutive years.
FastJAM: a Fast Joint Alignment Model for Images
Joint Alignment (JA) aims to align a collection of images into a shared coordinate frame such that semantically corresponding features coincide spatially. Despite its importance in many vision applications, existing JA methods often rely on heavy optimization, large-capacity models, and extensive hyperparameters, leading to long training and limited scalability. In this talk, we present FastJAM, a fast joint alignment framework that reframes JA as a graph-based problem over sparse keypoints. FastJAM leverages pairwise correspondences and a graph neural network to efficiently predict per-image transformations, achieving state-of-the-art alignment quality while reducing runtime from minutes or hours to just seconds.
PhD Candidate, Bar-Ilan University
Natalya Segal is a data scientist and biomedical AI researcher with experience leading data science teams and developing AI/ML systems across multiple domains. She is an inventor on multiple granted U.S. patents recognized and adopted by leading technology companies. As a PhD candidate at Bar-Ilan University, she is pioneering a contactless, affordable brain-computer interface (BCI) that uses remote optical sensing and deep learning to decode internal speech. Her work advances optical neural decoding and cortical monitoring. She holds an MSc in Electrical Engineering and a BSc in Mathematics and Computer Science.
Adapting Long-Video Masked Autoencoders to High-Speed Brain Imaging
Self-supervised video foundation models have recently shown strong transferability across natural video tasks, yet their applicability to domains with radically different spatiotemporal statistics remains largely unexplored. We investigate whether long-video masked autoencoders (LV-MAE), originally designed for low-frame-rate semantic natural videos, can be adapted to high-speed coherent imaging that lacks semantic structure. We apply LV-MAE to speckle-pattern video recordings captured at 1000 fps from the scalp overlying language-related cortex during silent speech tasks. Despite extreme differences in resolution, modality, and temporal scale, LV-MAE learns transferable representations that enable accurate downstream classification with minimal labeled data. Using leave-one-subject-out evaluation with one-minute subject-specific calibration, the proposed approach achieves strong cross-subject performance on millisecond-scale inputs. These results suggest that masked video representation learning can generalize beyond natural video, enabling efficient learning in specialized high-speed imaging domains.
Researcher, Ben-Gurion University of the Negev
Shachar Shmueli is an Electrical Engineering Master’s student at Ben-Gurion University, specializing in the security of generative AI. His research focuses on developing robust attacks and defenses for diffusion models. Alongside his studies, he works as a Data Engineer at the startup Octup.
Black-box Adversarial Attack on Stable Diffusion Models
We present a black-box adversarial attack on diffusion models that uses genetic algorithms to evolve adversarial prompts. Our method injects evolved code strings into input prompts, modifying generated images to match target semantic content. Operating in a black-box setting with only image outputs, we use CLIP embeddings to measure semantic similarity. Evaluated on Stable Diffusion across multiple categories, our attack successfully manipulates image generation while maintaining perceptual quality. Multi-classifier evaluation demonstrates significant classification changes with minimal degradation, revealing vulnerabilities in diffusion model robustness.
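Schematically, the attack is a standard genetic-algorithm loop with CLIP similarity as the fitness function. The sketch below uses illustrative population and mutation settings and public checkpoints; it is not the paper's exact configuration.

```python
# Schematic black-box loop: evolve a prompt suffix and score each
# candidate by CLIP similarity between the generated image and a target
# concept. Settings and checkpoints are illustrative only.
import random, string, torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def fitness(img, target_text):
    inputs = proc(text=[target_text], images=[img], return_tensors="pt", padding=True)
    with torch.no_grad():
        return clip(**inputs).logits_per_image.item()

def mutate(s):
    i = random.randrange(len(s))
    return s[:i] + random.choice(string.ascii_letters) + s[i + 1:]

base, target = "a photo of a dog", "a cat"
population = ["".join(random.choices(string.ascii_letters, k=8)) for _ in range(6)]
for _ in range(5):                                   # a few GA generations
    scored = [(fitness(pipe(f"{base} {s}").images[0], target), s) for s in population]
    scored.sort(reverse=True)
    elite = [s for _, s in scored[:3]]               # keep the fittest suffixes
    population = elite + [mutate(random.choice(elite)) for _ in range(3)]
```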
PhD Student, Technion - Israel Institute of Technology
Meir Yossef Levi (Yossi Levi) is in the final stages of his Ph.D. at the Technion, advised by Prof. Guy Gilboa, after receiving both his B.Sc. and M.Sc. in Electrical Engineering from the Technion. His research focuses on multimodal representation learning, with a particular interest in understanding the latent geometry of vision-language models and its implications. His recent work centers on the representation of foundation models, with several papers accepted to ICML and ICLR on this topic. Prior to this, he studied robust classification in 3D vision, with publications at ICCV and 3DV.
The Geometry and Likelihood Structure of CLIP Embeddings
The talk mainly covers two papers to be presented at ICML 2025. I will present our recent work analyzing the geometry of CLIP’s latent space from both geometric and probabilistic perspectives. We show that the embedding space is better characterized by two distinct, shifted ellipsoids, rather than a shared hypersphere. This finding challenges common assumptions about CLIP’s latent structure. Building on this double-ellipsoid perspective, we introduce a new measure called conformity, which captures how closely a sample aligns with its Modality Mean. Finally, I will introduce Whitened CLIP (W-CLIP) — a simple, linear transformation of the latent space into an isotropic space. It enables the use of embedding norms as a surrogate for likelihood approximation. This approach supports a wide range of applications, including domain shift detection and the identification of generative artifacts.
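The whitening step is easy to state concretely: estimate the mean and covariance of a collection of embeddings, apply the inverse square root of the covariance, and read the norm of the whitened vector as a Mahalanobis distance, a likelihood surrogate under a Gaussian assumption. A numpy sketch with random stand-in embeddings:

```python
# Sketch of whitening an embedding collection; after the transform, the
# norm is a Mahalanobis distance to the mean. Random data stands in for
# real CLIP embeddings.
import numpy as np

emb = np.random.randn(10000, 512)             # stand-in for CLIP embeddings
mu = emb.mean(axis=0)
cov = np.cov(emb - mu, rowvar=False)

vals, vecs = np.linalg.eigh(cov)
W = vecs @ np.diag(vals ** -0.5) @ vecs.T     # inverse square root of covariance

def whiten(x):
    return (x - mu) @ W                        # isotropic, zero-mean space

scores = np.linalg.norm(whiten(emb), axis=1)  # larger norm = less typical sample
```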
Senior Research Scientist, OriginAI
Dvir Samuel is a researcher at OriginAI. He holds a PhD from Bar-Ilan University, and his research focuses on long-tail and few-shot learning, as well as generative models, with a particular emphasis on diffusion- and flow-matching–based methods for image and video editing and personalization. At OriginAI, he develops scalable and practical methods to advance generative AI across images, videos, and 3D content.
In Omnimatte, one aims to decompose a given video into semantically meaningful layers, including the background and individual objects along with their associated effects, such as shadows and reflections. Existing methods often require extensive training or costly self-supervised optimization. In this paper, we present OmnimatteZero, a training-free approach that leverages off-the-shelf pre-trained video diffusion models for omnimatte. It can remove objects from videos, extract individual object layers along with their effects, and composite those objects onto new videos. These are accomplished by adapting zero-shot image inpainting techniques for video object removal, a task they fail to handle effectively out-of-the-box. To overcome this, we introduce temporal and spatial attention guidance modules that steer the diffusion process for accurate object removal and temporally consistent background reconstruction. We further show that self-attention maps capture information about the object and its footprints and use them to inpaint the object's effects, leaving a clean background. Additionally, through simple latent arithmetic, object layers can be isolated and recombined seamlessly with new video layers to produce new videos. Evaluations show that OmnimatteZero not only achieves superior performance in terms of background reconstruction but also sets a new record for the fastest Omnimatte approach, achieving real-time performance with minimal frame runtime.
Co-Founder & CEO, Data Compass AI
Dori Gaton works at the intersection of AI R&D and delivery: consulting, leading projects, and shipping AI systems used in real workflows. He co-founded Data Compass AI seven years ago and leads it as CEO, supporting teams with full-scale execution or targeted advisory. He enjoys tackling the hard parts - domain shift, noisy labels, and ambiguous ground truth - across computer vision, multimodal models, and trustworthy AI. At Proofig AI, where he serves as Chief AI Officer (CAIO), he has worked with the team for the past five years on production systems that help safeguard research integrity.
The Scientific Image Arms Race: How We Detect AI Generated Figures in the Wild
Generative AI is lowering the barrier to producing plausible scientific figures, creating new risks for research integrity. We present a production-oriented approach to detecting AI-generated images in academic papers, focusing on microscopy - a domain with biological and imaging “rules,” domain-specific textures, and noise. In an internal survey, domain experts found that distinguishing real from generated imagery is very challenging, even side-by-side. We describe a deep learning classifier trained on real literature images and synthetic images generated at scale via image-to-image and image-to-text-to-image pipelines, with publication-like compression and figure distortions. We close with lessons on the generator-detector arms race and limited explainability.
Senior Applied Researcher, Wix
Irit Chelly is a PhD graduate from the Computer Science Department at Ben-Gurion University, where she also earned her M.Sc., under the supervision of Prof. Oren Freifeld and Dr. Ari Pakman in the Vision, Inference, and Learning group. Her research focuses on probabilistic clustering using non-parametric Bayesian models and unsupervised learning. Her previous projects involved spatial transformations, dimensionality reduction in video analysis, and generative models. Irit won the national-level Aloni PhD Scholarship from Israel’s Ministry of Technology and Science, as well as the BGU Hi-Tech Scholarship for outstanding PhD students, and received annual awards and instructor rank for excellence in teaching core Computer Science courses.
Consistent Amortized Clustering via Generative Flow Networks
Neural models for amortized probabilistic clustering yield samples of cluster labels given a set-structured input, while avoiding lengthy Markov chain runs and the need for explicit data likelihoods. Existing methods that label each data point sequentially, like the Neural Clustering Process, often lead to cluster assignments highly dependent on the data order. Alternatively, methods that sequentially create full clusters do not provide assignment probabilities. In this paper, we introduce GFNCP, a novel framework for amortized clustering. GFNCP is formulated as a Generative Flow Network with a shared energy-based parametrization of policy and reward. We show that the flow matching conditions are equivalent to consistency of the clustering posterior under marginalization, which in turn implies order invariance. GFNCP also outperforms existing methods in clustering performance on both synthetic and real-world data. The talk is based on [Chelly et al., AISTATS '25].
Senior VP, Data Science, Zefr
Or Levi is an AI Researcher and Senior VP of Data Science at Zefr. He holds an M.Sc. (Magna Cum Laude) in Information Retrieval from the Technion, the Israel Institute of Technology. Or’s strongest passion is using AI for social impact, which led him to develop innovative AI to fight the spread of misinformation online. His work has been presented in leading AI conferences and covered by international media.
When AI Agents Should Ask for Help - Building Reliable Human–AI Systems
Large Language Models (LLMs) and AI Agents are increasingly deployed in high-stakes human–AI systems such as video content moderation. Yet a fundamental limitation remains: they tend to respond with confidence even when they are wrong, creating significant real-world risks.
The central challenge in deploying LLMs and Agents is not maximizing autonomy, but enabling systems to recognize when an Agent should not be trusted. To address this, we introduce a trust-aware framework for human–AI collaboration in which a judge model predicts whether the LLM output should be trusted or escalated to a human.
Our approach relies on LLM Performance Predictors (LPPs) derived directly from LLM outputs, capturing confidence signals, self-reported uncertainty, and indicators of missing evidence or ambiguous decision rules. Evaluated on a large-scale multimodal moderation benchmark, our method improves performance while reducing unnecessary human intervention. These results suggest that reliable AI systems are built not by replacing humans, but by enabling models to know when to ask for human judgment.
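A toy version of such a judge can be built with any calibrated classifier over output-derived features. The feature names and threshold below are illustrative, not the production LPPs.

```python
# Toy trust-aware router: a lightweight judge predicts from
# output-derived features whether to trust the LLM or escalate to a
# human. Features, data, and threshold are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# features per answer: [self-reported confidence, answer length, cited-evidence flag]
X = np.array([[0.9, 120, 1], [0.4, 15, 0], [0.8, 90, 1], [0.3, 40, 0]])
y = np.array([1, 0, 1, 0])                    # 1 = LLM was correct (trust), 0 = escalate

judge = LogisticRegression().fit(X, y)

def route(features, threshold=0.7):
    p_trust = judge.predict_proba([features])[0, 1]
    return "auto-accept" if p_trust >= threshold else "escalate to human"

print(route([0.5, 30, 0]))
```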
Data Scientist, Zefr
Or Bachar is a Data Scientist at Zefr and an M.Sc. student in Machine Learning and Data Science at Reichman University, focusing on reliable machine learning and computer vision systems, with an emphasis on model uncertainty and human–AI collaboration.
VP of Research, Lightricks
Building Production Video Applications with LTX-2 and IC-LoRA
Senior Algorithms Researcher, Trigo
Efficient Multi-Camera Multi-Person Tracking in Sparse CCTV Layouts for Loss Prevention in Retail Spaces
Senior Machine Learning Engineer, Nexar
Roni Goldshmidt is a Senior AI Researcher building real-time video foundation models for autonomous driving and safety-critical systems. He leads BADAS at Nexar, a world-model collision-prediction system built on real-world dashcam data, and works on video world models, video-to-video generation, and VLMs, publishing research and contributing to open source in explainable AI for safety-critical applications.
BADAS: Context-Aware Ego-Centric Collision Prediction Using Real-World Dashcam Data
Research Scientist, IBM Research
Roi is a Research Scientist at IBM Research, Vision & Learning Technologies Group, where he focuses on multimodal embeddings, retrieval-augmented generation (RAG), vision-language models (VLMs), and LLMs. With experience spanning classical computer vision to modern deep learning, he brings a broad perspective to AI research. He holds an M.Sc. and B.Sc. in Electrical Engineering, both from the Technion – Israel Institute of Technology.
Real-World Multi-Modal RAG: Innovative Benchmarking and Efficient Visual Document Retrieval
Enterprise documents carry critical information in their visual layout, not just their text. As RAG systems evolve to handle these multi-modal documents, new challenges emerge around evaluation, retrieval quality, and production-scale efficiency. In this talk, I will present our team's recent work across three directions. First, building realistic benchmarks for multi-modal RAG that reflect real enterprise needs. Second, training vision-language based document retrievers that capture layout and visual semantics beyond text extraction. Third, our findings on redundancy in multi-vector document representations and how this insight enables significantly more efficient retrieval at query time.
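For context, multi-vector retrievers typically score documents with ColBERT-style late interaction (MaxSim), which is where per-vector redundancy becomes expensive. The sketch below shows the scoring rule and a naive pruning heuristic; it is an illustration, not the team's actual method.

```python
# Late-interaction (MaxSim) scoring for multi-vector retrieval, plus a
# naive redundancy-pruning heuristic; dimensions and the pruning rule
# are illustrative stand-ins.
import torch

q = torch.nn.functional.normalize(torch.randn(16, 128), dim=1)    # query token vectors
d = torch.nn.functional.normalize(torch.randn(300, 128), dim=1)   # document patch vectors

def maxsim(q, d):
    return (q @ d.T).max(dim=1).values.sum()   # best doc match per query vector, summed

# naive pruning: drop doc vectors nearly duplicated by an earlier one
sim = d @ d.T
keep = [i for i in range(len(d)) if not (sim[i, :i] > 0.95).any()]
print(maxsim(q, d), maxsim(q, d[keep]), f"{len(keep)}/{len(d)} vectors kept")
```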
Head of Computer Vision, Skana Robotics
Ori holds a BSc in Electrical Engineering from Ben-Gurion University of the Negev (Israel) and an MSc in Marine Technologies from the Hatter Department of Marine Technologies at the University of Haifa (Israel), graduating on the Dean’s Honor List. During his MSc, he published two papers, including a NeurIPS 2025 paper co-authored with Prof. Tali Treibitz and Dr. Dan Rosenbaum, in which he addressed refraction-induced water-surface distortions using unsupervised, physics-constrained deep learning. In industry, Ori reduced training-data requirements for a semantic-segmentation model serving thousands of client API calls and owned deep-learning pipelines end to end—from research to deployment. He now heads Computer Vision at Skana Robotics, developing robust, real-world perception systems for marine environments.
Looking Into the Water by Unsupervised Learning of the Surface Shape
We address the problem of looking into the water from the air, where we seek to remove image distortions caused by refractions at the water surface. Our approach is based on modeling the different water surface structures at various points in time, assuming the underlying image is constant. To this end, we propose a model that consists of two neural-field networks. The first network predicts the height of the water surface at each spatial position and time, and the second network predicts the image color at each position. Using both networks, we reconstruct the observed sequence of images and can therefore use unsupervised training. We show that using implicit neural representations with periodic activation functions (SIREN) leads to effective modeling of the surface height spatio-temporal signal and its derivative, as required for image reconstruction. Using both simulated and real data we show that our method outperforms the latest unsupervised image restoration approach. In addition, it provides an estimate of the water surface.
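The SIREN building block referenced above is a linear layer followed by a scaled sine, with the initialization of Sitzmann et al. A minimal sketch follows; it is not the authors' full two-network model.

```python
# Minimal SIREN-style layer: sine activation with frequency scaling w0
# and the initialization from Sitzmann et al.; used here to model a
# smooth height field h(x, y, t) with well-behaved derivatives.
import math
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_f, out_f, w0=30.0, first=False):
        super().__init__()
        self.w0, self.linear = w0, nn.Linear(in_f, out_f)
        bound = 1 / in_f if first else math.sqrt(6 / in_f) / w0
        nn.init.uniform_(self.linear.weight, -bound, bound)

    def forward(self, x):
        return torch.sin(self.w0 * self.linear(x))

height_net = nn.Sequential(SineLayer(3, 256, first=True), SineLayer(256, 256),
                           nn.Linear(256, 1))
h = height_net(torch.rand(1024, 3))   # water-surface height at sampled (x, y, t)
```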
Senior Lecturer, The Max Stern Yezreel Valley College
Murad M. Badarna received his B.Sc. in Information Systems from the University of Haifa, his M.Sc. in Computer Science from the University of Haifa, and his Ph.D. in Machine Learning from the University of Haifa. He is currently a member of the Department of Information Systems at both the University of Haifa and the Max Stern Yezreel Valley College, and he also serves as a lecturer in the Department of Industrial Engineering and Management at Braude College.
Dr. Badarna’s primary research interests lie in the field of machine learning, with a particular focus on selective sampling, active learning, and deep learning. In addition to his academic work, he is actively involved in the high-tech industry. He serves as the Head of the Research and Development Department at xBiDa, a company that provides a combination of advanced video analytics technology and data science services.
Scaling Convolutional Neural Networks for Tabular Data via Correlation-Based Image Transformations
Senior Lecturer (Assistant Professor), Ben-Gurion University
Dr. Yehuda Dar is a Senior Lecturer (Assistant Professor) in the AI Research Institutes of the Computer and Information Science Faculty at Ben-Gurion University. He and his research group work in the area of machine and deep learning. Yehuda holds a BSc in Computer Engineering, MSc in Electrical Engineering, and a PhD in Computer Science, all from the Technion. He had postdoctoral positions at the Technion, and in the Department of Electrical and Computer Engineering at Rice University. In addition to his research, Yehuda teaches machine learning courses at BGU and acts as area chair at leading machine learning conferences such as NeurIPS and ICML.
Machine Unlearning of Deep Neural Networks: The Effect of Overparameterization
M.Sc. Student, Technion - Israel Institute of Technology
Pseudo-Invertible Neural Networks
VP R&D, Visual Layer
Why Visual Data Breaks LLMs: Solving the 50M Token Problem
University of Haifa
A label-efficient active learning framework for medical image segmentation: from cold start to increased scale