Selected Projects
This page highlights representative projects I have been involved in and traces the progression of my research interests and directions.
Tonmoay Deb, Lichen Wang, Zachary Bessinger, Naji Khosravan, Eric Penner, Sing Bing Kang,
CVPRW, 2024
We presented an extension to the Zillow Indoor Dataset that adds natural-language descriptions of room layouts, bridging the gap between layout understanding and layout description.
Taotao Jing, Lichen Wang, Naji Khosravan, Zhiqiang Wan, Zachary Bessinger, Zhengming Ding, Sing Bing Kang
WACV, 2024
We proposed a method that addresses challenges arising from the long-tailed distribution of room layouts. Our method is more robust than the state of the art to appearance, structure, and shape variations in real-world room layouts.
Negar Nejatishahidin, Will Hutchcroft, Manjunath Narayana, Ivaylo Boyadzhiev, Yuguang Li, Naji Khosravan, Jana Košecká, Sing Bing Kang
CVPRW, 2023 (OmniCV Best Paper Award)
We addressed the problem of wide-baseline camera pose estimation from a group of 360° panoramas, proposing a graph neural network method that handles textureless, empty indoor spaces by exploiting co-visible structure across multiple views.
Yu Yin, Will Hutchcroft, Naji Khosravan, Ivaylo Boyadzhiev, Yun Fu, Sing Bing Kang
ICMR, 2022
We presented an iterative topological learning algorithm, a graph neural network solution that learns to predict the topological structure of floorplans from individual room attributes.
Zhixiang Min, Naji Khosravan, Zachary Bessinger, Manjunath Narayana, Sing Bing Kang, Enrique Dunn, Ivaylo Boyadzhiev
CVPR, 2022 (Oral)
We proposed LASER, an image-based Monte Carlo Localization (MCL) framework. We introduced the concept of latent space rendering, where 2D pose hypotheses on the floor map are rendered directly into a geometrically structured latent space by aggregating viewing-ray features. Our method achieved state-of-the-art accuracy and speed for visual localization in textureless indoor spaces.
Steve Cruz, Will Hutchcroft, Yuguang Li, Naji Khosravan, Ivaylo Boyadzhiev, Sing Bing Kang
CVPR, 2021
We released the Zillow Indoor Dataset (ZInD), the largest real dataset with layout annotations, providing 3D room layouts, 2D and 3D floor plans, panorama locations in the floor plan, and locations of windows and doors. ZInD follows a real-world distribution of layouts (cuboid, more general Manhattan, and non-Manhattan), as opposed to the mostly cuboid or Manhattan layouts in other publicly available datasets.
Rodney LaLonde, Naji Khosravan, Ulas Bagci
Advanced Intelligent Systems, 2021
We proposed a new family of capsule neural networks, deformable capsules (DeformCaps), to address the object detection problem. DeformCaps resolves the memory-efficiency limitations of existing capsule networks by using deformable convolutions rather than imposing geometric constraints. We also proposed two new components for DeformCaps: a novel capsule structure (SplitCaps) and a novel dynamic routing algorithm (SE-Routing).
Naji Khosravan, Shervin Ardeshir, Rohit Puri
CVPRW, 2019
We proposed a convolutional neural network (CNN) architecture with a spatio-temporal attention module that identifies the important portions of a video (suppressing background audio) to determine synchronization between the audio and visual signals. We explored formulating audio-video synchronization both as a regression problem and as a classification problem.
Naji Khosravan
Ph.D. Dissertation, 2019
In this dissertation, I proposed a wide range of novel computer vision and machine learning algorithms to tackle current challenges in radiology screening. I proposed incorporating expert/radiologist knowledge through eye-tracking, making the whole process a human-centered AI system. The proposed machine learning methods address image-based diagnosis, detection, and segmentation, using gaze as a minimally distracting and natural interaction medium with the human in the loop.
Naji Khosravan, Aliasghar Mortazi, Michael Wallace, Ulas Bagci
MICCAI, 2019
PAN is an adversarial learning approach designed to effectively capture long-range, high-level label consistencies in semantic segmentation. For medical images, we proposed capturing 3D semantics in a computationally efficient way through 2D projections. We also introduced an attention module into our framework that enables selective integration of global information.
Naji Khosravan, Haydar Celik, Baris Turkbey, Elizabeth C Jones, Bradford Wood, Ulas Bagci
Medical image analysis journal, 2019
In this study, we developed a paradigm-shifting CAD system, called collaborative CAD (C-CAD), that unifies CAD and eye-tracking systems in realistic radiology room settings. We presented a new graph-based clustering and sparsification algorithm that transforms eye-tracking data (gaze) into a graph model, enabling effective interaction with the radiologist through eye-tracking. C-CAD incorporates a deep learning algorithm in a newly designed multi-task learning platform to segment and diagnose suspicious areas simultaneously.
Naji Khosravan, Ulas Bagci
IEEE EMBC, 2018
Limited expert-annotated data is one of the major challenges in developing deep learning models for diagnosis and segmentation of abnormalities in medical images. This paper shows the benefit of multi-task learning, when auxiliary tasks are selected carefully, in improving the performance of semi-supervised methods when little annotated data is available.
Naji Khosravan, Ulas Bagci
MICCAI, 2018
S4ND is the first single-shot lung nodule detection method. Modeling detection as a point detection problem and using dense residual connections within the architecture allowed us to rely on a single feed-forward pass of a 3D CNN to find tiny objects in a large 3D gray-scale search space (lung nodules within volumetric CT scans).
Naji Khosravan, Haydar Celik, Baris Turkbey, Ruida Cheng, Evan McCreedy, Matthew McAuliffe, Sandra Bednarova, Elizabeth Jones, Xinjian Chen, Peter Choyke, Bradford Wood, Ulas Bagci
MICCAIW, 2017
Gaze2Segment was our first step toward designing and implementing a computer-aided diagnosis system that integrates expert radiologists' gaze into computer vision algorithms in an interactive and dynamic way.