




| Paper ID | Type | Title |
| --- | --- | --- |
| 5 | Poster | Single-Shot Refinement Neural Network for Object Detection |
| 7 | Poster | Video Captioning via Hierarchical Reinforcement Learning |
| 12 | Oral | DensePose: Multi-Person Dense Human Pose Estimation In The Wild |
| 12 | Poster | DensePose: Multi-Person Dense Human Pose Estimation In The Wild |
| 19 | Poster | Frustum PointNets for 3D Object Detection from RGB-D Data |
| 21 | Poster | Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge |
| 24 | Poster | Rethinking the Faster R-CNN Architecture for Temporal Action Localization |
| 27 | Spotlight | Shape from Shading through Shape Evolution |
| 27 | Poster | Shape from Shading through Shape Evolution |
| 34 | Poster | A High-Quality Denoising Dataset for Smartphone Cameras |
| 35 | Poster | Improving Color Reproduction Accuracy in the Camera Imaging Pipeline |
| 37 | Spotlight | End-to-End Dense Video Captioning with Masked Transformer |
| 37 | Poster | End-to-End Dense Video Captioning with Masked Transformer |
| 41 | Poster | pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment |
| 47 | Poster | Learning to Segment Every Thing |
| 48 | Poster | Density-aware Single Image De-raining using a Multi-stream Dense Network |
| 49 | Poster | Densely Connected Pyramid Dehazing Network |
| 52 | Poster | Embodied Question Answering |
| 53 | Spotlight | TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays |
| 53 | Poster | TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays |
| 64 | Poster | Towards Open-Set Identity Preserving Face Synthesis |
| 67 | Poster | Baseline Desensitizing In Translation Averaging |
| 68 | Poster | Learning from the Deep: A Revised Underwater Image Formation Model |
| 76 | Oral | Context Encoding for Semantic Segmentation |
| 76 | Poster | Context Encoding for Semantic Segmentation |
| 77 | Poster | Deep Texture Manifold for Ground Terrain Recognition |
| 83 | Poster | DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems |
| 85 | Poster | Sparse, Smart Contours to Represent and Edit Images |
| 92 | Poster | Every Smile is Unique: Landmark-guided Diverse Smile Generation |
| 95 | Poster | Generative Non-Rigid Shape Completion with Graph Convolutional Autoencoders |
| 97 | Poster | Learning a Discriminative Prior for Blind Image Deblurring |
| 100 | Poster | Attentional ShapeContextNet for Point Cloud Recognition |
| 102 | Poster | Learning Superpixels with Segmentation-Aware Affinity Loss |
| 103 | Spotlight | Real-World Repetition Estimation by Div, Grad and Curl |
| 103 | Poster | Real-World Repetition Estimation by Div, Grad and Curl |
| 106 | Poster | Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation |
| 109 | Poster | MegaDepth: Learning Single-View Depth Prediction from Internet Photos |
| 110 | Spotlight | Learning Intrinsic Image Decomposition from Watching the World |
| 110 | Poster | Learning Intrinsic Image Decomposition from Watching the World |
| 112 | Poster | Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering |
| 116 | Poster | Human-centric Indoor Scene Synthesis Using Stochastic Grammar |
| 120 | Poster | Learning by Asking Questions |
| 121 | Poster | Instance Embedding Transfer to Unsupervised Video Object Segmentation |
| 122 | Poster | Detect-and-Track: Efficient Pose Estimation in Videos |
| 124 | Poster | Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval |
| 125 | Poster | Guided Proofreading of Automatic Segmentations for Connectomics |
| 128 | Oral | Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation |
| 128 | Poster | Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation |
| 130 | Poster | Context-aware Synthesis for Video Frame Interpolation |
| 131 | Poster | 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning |
| 135 | Poster | NAG: Network for Adversary Generation |
| 136 | Spotlight | LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation |
| 136 | Poster | LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation |
| 137 | Poster | Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration |
| 142 | Spotlight | Multi-view Harmonized Bilinear Network for 3D Object Recognition |
| 142 | Poster | Multi-view Harmonized Bilinear Network for 3D Object Recognition |
| 144 | Spotlight | Tangent Convolutions for Dense Prediction in 3D |
| 144 | Poster | Tangent Convolutions for Dense Prediction in 3D |
| 145 | Oral | Semi-parametric Image Synthesis |
| 145 | Poster | Semi-parametric Image Synthesis |
| 147 | Poster | Interactive Image Segmentation with Latent Diversity |
| 155 | Spotlight | 3D Hand Pose Estimation: From Current Achievements to Future Goals |
| 155 | Poster | 3D Hand Pose Estimation: From Current Achievements to Future Goals |
| 165 | Poster | W2F: A Weakly-Supervised to Fully-Supervised Framework for Object Detection |
| 167 | Spotlight | BlockDrop: Dynamic Inference Paths in Residual Networks |
| 167 | Poster | BlockDrop: Dynamic Inference Paths in Residual Networks |
| 168 | Spotlight | MapNet: Geometry-Aware Learning of Maps for Camera Localization |
| 168 | Poster | MapNet: Geometry-Aware Learning of Maps for Camera Localization |
| 170 | Poster | BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning |
| 178 | Poster | Salient Object Detection Driven by Fixation Prediction |
| 179 | Poster | 3D Object Detection with Latent Support Surfaces |
| 181 | Oral | Practical Block-wise Neural Network Architecture Generation |
| 181 | Poster | Practical Block-wise Neural Network Architecture Generation |
| 182 | Poster | Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points |
| 185 | Oral | Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning |
| 185 | Poster | Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning |
| 186 | Poster | Visual Grounding via Accumulated Attention |
| 191 | Poster | Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors |
| 195 | Poster | ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing |
| 200 | Poster | Perturbative Neural Networks: Rethinking Convolution in CNNs |
| 203 | Spotlight | Nonlinear 3D Face Morphable Model |
| 203 | Poster | Nonlinear 3D Face Morphable Model |
| 205 | Spotlight | Neural Baby Talk |
| 205 | Poster | Neural Baby Talk |
| 216 | Poster | Towards Pose Invariant Face Recognition in the Wild |
| 224 | Poster | MoNet: Deep Motion Exploitation for Video Object Segmentation |
| 229 | Poster | Exploring Disentangled Feature Representation Beyond Face Identification |
| 232 | Poster | Towards Effective Low-bitwidth Convolutional Neural Networks |
| 234 | Poster | Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries |
| 237 | Poster | Learning Facial Action Units from Web Images with Scalable Weakly Supervised Clustering |
| 242 | Spotlight | Few-Shot Image Recognition by Predicting Parameters from Activations |
| 242 | Poster | Few-Shot Image Recognition by Predicting Parameters from Activations |
| 246 | Poster | Single-Shot Object Detection with Enriched Semantics |
| 250 | Poster | Unifying Identification and Context Learning for Person Recognition |
| 252 | Poster | Separating Self-Expression and Visual Content in Hashtag Supervision |
| 255 | Poster | Multi-Cue Correlation Filters for Robust Visual Tracking |
| 260 | Poster | Beyond Trade-off: Accelerate FCN-based Face Detection with Higher Accuracy |
| 261 | Poster | On the Robustness of Semantic Segmentation Models to Adversarial Attacks |
| 266 | Oral | PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume |
| 266 | Poster | PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume |
| 270 | Oral | Illuminant Spectra-based Source Separation Using Flash Photography |
| 270 | Poster | Illuminant Spectra-based Source Separation Using Flash Photography |
| 281 | Spotlight | Tracking Multiple Objects Outside the Line of Sight using Speckle Imaging |
| 281 | Poster | Tracking Multiple Objects Outside the Line of Sight using Speckle Imaging |
| 285 | Poster | Improved Human Pose Estimation through Adversarial Data Augmentation |
| 289 | Poster | Generative Adversarial Learning Towards Fast Weakly Supervised Detection |
| 298 | Spotlight | Audio to Body Dynamics |
| 298 | Poster | Audio to Body Dynamics |
| 299 | Poster | The Unreasonable Effectiveness of Deep Features as a Perceptual Metric |
| 303 | Poster | Frame-Recurrent Video Super-Resolution |
| 304 | Poster | Deep Mutual Learning |
| 308 | Poster | Real-world Anomaly Detection in Surveillance Videos |
| 310 | Poster | Soccer on Your Tabletop |
| 312 | Poster | Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification |
| 313 | Poster | HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN |
| 316 | Poster | Excitation Backprop for RNNs |
| 319 | Poster | Dynamic-Structured Semantic Propagation Network |
| 325 | Spotlight | Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation |
| 325 | Poster | Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation |
| 326 | Oral | SPLATNet: Sparse Lattice Networks for Point Cloud Processing |
| 326 | Poster | SPLATNet: Sparse Lattice Networks for Point Cloud Processing |
| 329 | Poster | Video Representation Learning Using Discriminative Pooling |
| 330 | Poster | Attend and Interact: Higher-Order Object Interactions for Video Understanding |
| 342 | Poster | Human Pose Estimation with Parsing Induced Learner |
| 345 | Poster | 4D Human Body Correspondences from Panoramic Depth Maps |
| 346 | Poster | Recognizing Human Actions as Evolution of Pose Estimation Maps |
| 348 | Poster | GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning |
| 350 | Spotlight | Deep Adversarial Metric Learning |
| 350 | Poster | Deep Adversarial Metric Learning |
| 353 | Poster | Revisiting Video Saliency: A Large-scale Benchmark and a New Model |
| 362 | Poster | Graph-Cut RANSAC |
| 363 | Poster | Five-point Fundamental Matrix Estimation for Uncalibrated Cameras |
| 367 | Poster | Hashing as Tie-Aware Learning to Rank |
| 368 | Poster | Optimizing Local Feature Descriptors for Nearest Neighbor Matching |
| 369 | Oral | Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies |
| 369 | Poster | Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies |
| 374 | Spotlight | Consensus Maximization for Semantic Region Correspondences |
| 374 | Poster | Consensus Maximization for Semantic Region Correspondences |
| 380 | Poster | ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing |
| 391 | Poster | Motion-Guided Cascaded Refinement Network for Video Object Segmentation |
| 397 | Poster | Zigzag Learning for Weakly Supervised Object Detection |
| 405 | Spotlight | Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models |
| 405 | Poster | Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models |
| 406 | Spotlight | VITON: An Image-based Virtual Try-on Network |
| 406 | Poster | VITON: An Image-based Virtual Try-on Network |
| 408 | Poster | Cross-Domain Self-supervised Multi-task Feature Learning Using Synthetic Game Imagery |
| 409 | Poster | LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image |
| 418 | Poster | Thoracic Disease Identification and Localization with Limited Supervision |
| 419 | Poster | Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks |
| 420 | Poster | Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation |
| 421 | Poster | Deep End-to-End Time-of-Flight Imaging |
| 423 | Spotlight | Fast and Accurate Online Video Object Segmentation via Tracking Parts |
| 423 | Poster | Fast and Accurate Online Video Object Segmentation via Tracking Parts |
| 425 | Poster | Min-Entropy Latent Model for Weakly Supervised Object Detection |
| 429 | Poster | Future Frame Prediction for Anomaly Detection - A New Baseline |
| 430 | Poster | Face Aging with Identity-Preserved Conditional Generative Adversarial Networks |
| 431 | Poster | Learning to Compare: Relation Network for Few-Shot Learning |
| 435 | Oral | Deep Layer Aggregation |
| 435 | Poster | Deep Layer Aggregation |
| 436 | Poster | Style Aggregated Network for Facial Landmark Detection |
| 442 | Spotlight | M3: Multimodal Memory Modelling for Video Captioning |
| 442 | Poster | M3: Multimodal Memory Modelling for Video Captioning |
| 449 | Poster | Classification Driven Dynamic Image Enhancement |
| 456 | Poster | Generative Image Inpainting with Contextual Attention |
| 458 | Spotlight | Iterative Visual Reasoning Beyond Convolutions |
| 458 | Poster | Iterative Visual Reasoning Beyond Convolutions |
| 460 | Poster | Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification |
| 465 | Spotlight | Textbook Question Answering under Teacher Guidance with Memory Networks |
| 465 | Poster | Textbook Question Answering under Teacher Guidance with Memory Networks |
| 468 | Poster | Multi-Level Factorisation Net for Person Re-Identification |
| 471 | Spotlight | Functional Map of the World |
| 471 | Poster | Functional Map of the World |
| 473 | Poster | A Two-Step Disentanglement Method |
| 475 | Poster | Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization |
| 482 | Poster | Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? |
| 483 | Oral | Left-Right Comparative Recurrent Model for Stereo Matching |
| 483 | Poster | Left-Right Comparative Recurrent Model for Stereo Matching |
| 487 | Oral | Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input |
| 487 | Poster | Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input |
| 488 | Spotlight | Zero-Shot Sketch-Image Hashing |
| 488 | Poster | Zero-Shot Sketch-Image Hashing |
| 490 | Spotlight | Interpretable Convolutional Neural Networks |
| 490 | Poster | Interpretable Convolutional Neural Networks |
| 491 | Poster | Reconstructing Thin Structures of Manifold Surfaces by Integrating Spatial Curves |
| 493 | Poster | Enhancing the Spatial Resolution of Stereo Images using a Parallax Prior |
| 494 | Poster | Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB |
| 500 | Spotlight | Generating Synthetic X-ray Images of a Person from the Surface Geometry |
| 500 | Poster | Generating Synthetic X-ray Images of a Person from the Surface Geometry |
| 505 | Poster | Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification |
| 506 | Poster | Unsupervised CCA |
| 510 | Poster | Discovering Point Lights with Intensity Distance Fields |
| 512 | Poster | Universal Denoising Networks: A Novel CNN-based Network Architecture for Image Denoising |
| 517 | Poster | Easy Identification from Better Constraints: Multi-Shot Person Re-Identification from Reference Constraints |
| 533 | Spotlight | Recurrent Pixel Embedding for Instance Grouping |
| 533 | Poster | Recurrent Pixel Embedding for Instance Grouping |
| 534 | Poster | Recurrent Scene Parsing with Perspective Understanding in the Loop |
| 540 | Poster | Learning to Hash by Discrepancy Minimization |
| 542 | Poster | Fast End-to-End Trainable Guided Filter |
| 550 | Poster | Disentangling Structure and Aesthetics for Content-aware Image Completion |
| 552 | Oral | An Analysis of Scale Invariance in Object Detection - SNIP |
| 552 | Poster | An Analysis of Scale Invariance in Object Detection - SNIP |
| 561 | Poster | CSGNet: Neural Shape Parser for Constructive Solid Geometry |
| 565 | Oral | Finding Tiny Faces in the Wild with Generative Adversarial Network |
| 565 | Poster | Finding Tiny Faces in the Wild with Generative Adversarial Network |
| 567 | Spotlight | SSNet: Scale Selection Network for Online 3D Action Prediction |
| 567 | Poster | SSNet: Scale Selection Network for Online 3D Action Prediction |
| 568 | Spotlight | Integrated facial landmark localization and super-resolution of real-world very low resolution faces in arbitrary poses with GANs |
| 568 | Poster | Integrated facial landmark localization and super-resolution of real-world very low resolution faces in arbitrary poses with GANs |
| 569 | Poster | The Best of Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion Segmentation |
| 573 | Poster | In-Place Activated BatchNorm for Memory-Optimized Training of DNNs |
| 574 | Poster | Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks |
| 581 | Spotlight | Deep Cross-media Knowledge Transfer |
| 581 | Poster | Deep Cross-media Knowledge Transfer |
| 588 | Poster | Coupled End-to-end Transfer Learning with Generalized Fisher Information |
| 589 | Poster | Knowledge Aided Consistency for Weakly Supervised Phrase Grounding |
| 593 | Poster | Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification |
| 594 | Poster | MatNet: Modular Attention Network for Referring Expression Comprehension |
| 598 | Poster | CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation |
| 601 | Spotlight | NISP: Pruning Networks using Neuron Importance Score Propagation |
| 601 | Poster | NISP: Pruning Networks using Neuron Importance Score Propagation |
| 603 | Poster | Who Let The Dogs Out? Modeling Dog Behavior From Visual Data |
| 609 | Poster | Efficient Video Object Segmentation via Network Modulation |
| 615 | Poster | Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision |
| 618 | Poster | Feedback-prop: Convolutional Neural Network Inference under Partial Evidence |
| 619 | Poster | A Memory Network Approach for Story-based Temporal Summarization of 360° Videos |
| 620 | Poster | Improving Occlusion and Hard Negative Handling for Single-Stage Object Detectors |
| 623 | Poster | UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition |
| 630 | Spotlight | Learning a Toolchain for Image Restoration |
| 630 | Poster | Learning a Toolchain for Image Restoration |
| 631 | Poster | Learning to Act Properly: Predicting and Explaining Affordances from Images |
| 632 | Poster | Learning a Discriminative Feature Network for Semantic Segmentation |
| 633 | Poster | Optimizing Video Object Detection via a Scale-Time Lattice |
| 642 | Poster | ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices |
| 643 | Poster | Cascaded Pyramid Network for Multi-Person Pose Estimation |
| 648 | Poster | Seeing Temporal Modulation of Lights from Standard Cameras |
| 649 | Poster | Point-wise Convolutional Neural Networks |
| 668 | Spotlight | Fine-grained Video Captioning for Sports Narrative |
| 668 | Poster | Fine-grained Video Captioning for Sports Narrative |
| 671 | Poster | Dense 3D Regression for Hand Pose Estimation |
| 672 | Poster | Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space |
| 673 | Poster | Learning Convolutional Networks for Content-weighted Image Compression |
| 678 | Poster | Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking |
| 680 | Poster | Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation |
| 683 | Poster | First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations |
| 687 | Spotlight | Hand PointNet: 3D Hand Pose Estimation using Point Sets |
| 687 | Poster | Hand PointNet: 3D Hand Pose Estimation using Point Sets |
| 695 | Poster | Recovering Realistic Texture in Image Super-resolution by Spatial Feature Modulation |
| 700 | Poster | Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos |
| 710 | Poster | A Face to Face Neural Conversation Model |
| 711 | Poster | SurfConv: Bridging 3D and 2D Convolution for RGBD Images |
| 717 | Poster | Dynamic Video Segmentation Network |
| 721 | Poster | Multiple Granularity Group Interaction Prediction |
| 732 | Spotlight | Visual Question Reasoning on General Dependency Tree |
| 732 | Poster | Visual Question Reasoning on General Dependency Tree |
| 733 | Poster | From Lifestyle VLOGs to Everyday Interactions |
| 735 | Poster | COCO-Stuff: Thing and Stuff Classes in Context |
| 736 | Spotlight | GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB |
| 736 | Poster | GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB |
| 739 | Poster | Non-local Neural Networks |
| 740 | Poster | Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs |
| 744 | Oral | Taskonomy: Disentangling Task Transfer Learning |
| 744 | Poster | Taskonomy: Disentangling Task Transfer Learning |
| 747 | Spotlight | Embodied Real-World Active Perception |
| 747 | Poster | Embodied Real-World Active Perception |
| 754 | Spotlight | SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the wild' |
| 754 | Poster | SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the wild' |
| 756 | Poster | End-to-end Recovery of Human Shape and Pose |
| 757 | Poster | Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene |
| 759 | Poster | Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction |
| 762 | Poster | A Fast Resection-Intersection Method for the Known Rotation Problem |
| 764 | Poster | Image Generation from Scene Graphs |
| 765 | Spotlight | What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets |
| 765 | Poster | What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets |
| 766 | Poster | PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation |
| 768 | Oral | High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs |
| 768 | Poster | High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs |
| 769 | Poster | Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks |
| 777 | Spotlight | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference |
| 777 | Poster | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference |
| 778 | Oral | Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video |
| 778 | Poster | Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video |
| 779 | Poster | Unsupervised Cross-dataset Person Re-identification by Transfer Learning of Spatio-temporal Patterns |
| 784 | Poster | Kernelized Subspace Pooling for Deep Local Descriptors |
| 786 | Poster | Video Rain Removal By Multiscale Convolutional Sparse Coding |
| 789 | Poster | Learning from Millions of 3D Scans for Large-scale 3D Face Recognition |
| 792 | Poster | Referring Relationships |
| 794 | Poster | Improving Object Localization with Fitness NMS and Bounded IoU Loss |
| 801 | Spotlight | Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination |
| 801 | Poster | Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination |
| 809 | Spotlight | CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization |
| 809 | Poster | CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization |
| 811 | Spotlight | Visual Question Generation as Dual Task of Visual Question Answering |
| 811 | Poster | Visual Question Generation as Dual Task of Visual Question Answering |
| 812 | Spotlight | Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation |
| 812 | Poster | Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation |
| 816 | Poster | Learning Dual Convolutional Neural Networks for Low-Level Vision |
| 823 | Poster | Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation |
| 836 | Spotlight | MegDet: A Large Mini-Batch Object Detector |
| 836 | Poster | MegDet: A Large Mini-Batch Object Detector |
| 842 | Poster | AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks |
| 844 | Spotlight | TOM-Net: Learning Transparent Object Matting from a Single Image |
| 844 | Poster | TOM-Net: Learning Transparent Object Matting from a Single Image |
| 847 | Poster | End-to-End Deep Kronecker-Product Matching for Person Re-identification |
| 849 | Poster | Semantic Visual Localization |
| 851 | Poster | Joint Cuts and Matching of Partitions in One Graph |
| 853 | Spotlight | Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions |
| 853 | Poster | Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions |
| 862 | Poster | Crowd Counting via Adversarial Cross-Scale Consistency Pursuit |
| 874 | Poster | Deep Group-shuffling Random Walk for Person Re-identification |
| 878 | Spotlight | Learning to Detect Features in Texture Images |
| 878 | Poster | Learning to Detect Features in Texture Images |
| 888 | Poster | Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification |
| 890 | Poster | CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles |
| 892 | Poster | Context-aware Deep Feature Compression for High-speed Visual Tracking |
| 894 | Poster | Deep Material-aware Cross-spectral Stereo Matching |
| 899 | Poster | Deep Extreme Cut: From Extreme Points to Object Segmentation |
| 906 | Spotlight | Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images |
| 906 | Poster | Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images |
| 908 | Poster | Harmonious Attention Network for Person Re-Identification |
| 909 | Spotlight | Unsupervised Deep Generative Adversarial Hashing Network |
| 909 | Poster | Unsupervised Deep Generative Adversarial Hashing Network |
| 910 | Poster | Pseudo-Mask Augmented Object Detection |
| 914 | Spotlight | LSTM stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH) |
| 914 | Poster | LSTM stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH) |
| 927 | Poster | Adversarial Complementary Learning for Weakly Supervised Object Localization |
| 932 | Oral | Unsupervised Discovery of Object Landmarks as Structural Representations |
| 932 | Poster | Unsupervised Discovery of Object Landmarks as Structural Representations |
| 936 | Poster | DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map |
| 944 | Poster | Monocular Relative Depth Perception with Web Stereo Data Supervision |
| 948 | Poster | Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification |
| 952 | Poster | Objects as context for detecting their semantic parts |
| 954 | Poster | Camera Style Adaptation for Person Re-identification |
| 961 | Poster | Conditional Generative Adversarial Network for Structured Domain Adaptation |
| 962 | Poster | Rotation-sensitive Regression for Oriented Scene Text Detection |
| 963 | Poster | Residual Parameter Transfer for Deep Domain Adaptation |
| 967 | Spotlight | SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation |
| 967 | Poster | SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation |
| 974 | Spotlight | Weakly Supervised Instance Segmentation using Class Peak Response |
| 974 | Poster | Weakly Supervised Instance Segmentation using Class Peak Response |
| 978 | Poster | Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network |
| 984 | Oral | Rotation Averaging and Strong Duality |
| 984 | Poster | Rotation Averaging and Strong Duality |
| 985 | Poster | PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning |
| 999 | Oral | Im2Flow: Motion Hallucination from Static Images for Action Recognition |
| 999 | Poster | Im2Flow: Motion Hallucination from Static Images for Action Recognition |
| 1001 | Poster | Feature Quantization for Defending Against Distortion of Images |
| 1016 | Poster | End-to-end weakly-supervised semantic alignment |
| 1018 | Spotlight | PointGrid: A Deep Network for 3D Shape Understanding |
| 1018 | Poster | PointGrid: A Deep Network for 3D Shape Understanding |
| 1019 | Poster | Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts |
| 1020 | Poster | A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds |
| 1022 | Poster | A Benchmark for Articulated Human Pose Estimation and Tracking |
| 1024 | Poster | Boosting Self-Supervised Learning via Knowledge Transfer |
| 1025 | Spotlight | PPFNet: Global Context Aware Local Features for Robust 3D Point Matching |
| 1025 | Poster | PPFNet: Global Context Aware Local Features for Robust 3D Point Matching |
| 1027 | Spotlight | Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments |
| 1027 | Poster | Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments |
| 1029 | Spotlight | Fast Video Object Segmentation by Reference-Guided Mask Propagation |
| 1029 | Poster | Fast Video Object Segmentation by Reference-Guided Mask Propagation |
| 1035 | Poster | Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes |
| 1036 | Poster | Video Person Re-identification with Competitive Snippet-similarity Aggregation and Co-attentive Snippet Embedding |
| 1037 | Poster | One-shot Action Localization by Sequence Matching Network |
| 1052 | Poster | Efficient Subpixel Refinement with Symbolic Linear Predictors |
| 1056 | Poster | Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning |
| 1057 | Oral | Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification |
| 1057 | Poster | Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification |
| 1058 | Poster | Single Image Reflection Separation with Perceptual Losses |
| 1063 | Spotlight | AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions |
| 1063 | Poster | AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions |
| 1067 | Poster | Recognize Actions by Disentangling Components of Dynamics |
| 1078 | Poster | Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains |
| 1082 | Poster | Attention-aware Compositional Network for Person Re-Identification |
| 1083 | Poster | HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification |
| 1085 | Poster | Mask-guided Contrastive Attention Model for Person Re-Identification |
| 1097 | Spotlight | Pose-Guided Photorealistic Face Rotation |
| 1097 | Poster | Pose-Guided Photorealistic Face Rotation |
| 1099 | Spotlight | Automatic 3D Indoor Scene Modeling from Single Panorama |
| 1099 | Poster | Automatic 3D Indoor Scene Modeling from Single Panorama |
| 1101 | Spotlight | SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion |
| 1101 | Poster | SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion |
| 1103 | Poster | A Biresolution Spectral framework for Product Quantization |
| 1109 | Poster | Dynamic Zoom-in Network for Fast Object Detection in Large Images |
| 1110 | Poster | On the Importance of Label Quality for Semantic Segmentation |
| 1113 | Poster | EPINET: A Fully-Convolutional Neural Network for Light Field Depth Estimation by Using Epipolar Geometry |
| 1114 | Poster | A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking |
| 1118 | Poster | Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos |
| 1124 | Poster | Scalable and Effective Deep CCA via Soft Decorrelation |
| 1126 | Poster | High-order tensor regularization with application to attribute ranking |
| 1128 | Oral | 3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare |
| 1128 | Poster | 3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare |
| 1129 | Spotlight | FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds |
| 1129 | Poster | FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds |
| 1133 | Poster | Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network |
| 1134 | Poster | Decorrelated Batch Normalization |
| 1139 | Spotlight | Unsupervised Textual Grounding: Linking Words to Image Concepts |
| 1139 | Poster | Unsupervised Textual Grounding: Linking Words to Image Concepts |
| 1156 | Poster | Scale-recurrent Network for Deep Image Deblurring |
| 1162 | Poster | Low-Shot Recognition with Imprinted Weights |
| 1163 | Oral | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering |
| 1163 | Poster | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering |
| 1164 | Poster | Cross-Domain Weakly-Supervised Object Detection through Progressive Domain Adaptation |
| 1170 | Poster | Facelet-Bank for Fast Portrait Manipulation |
| 1172 | Poster | Duplex Generative Adversarial Network for Unsupervised Domain Adaptation |
| 1173 | Poster | Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation |
| 1177 | Poster | Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks |
| 1178 | Poster | Structure Preserving Video Prediction |
| 1182 | Poster | Tagging Like Humans: Diverse and Distinct Image Annotation |
| 1185 | Poster | Learning to Sketch with Shortcut Cycle Consistency |
| 1186 | Poster | GroupCap: Group-based Image Captioning with Structured Relevance and Diversity Constraints |
| 1193 | Spotlight | Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks |
| 1193 | Poster | Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks |
| 1194 | Poster | Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning |
| 1202 | Spotlight | Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective |
| 1202 | Poster | Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective |
| 1203 | Spotlight | NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning |
| 1203 | Poster | NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning |
| 1209 | Spotlight | Detecting and Recognizing Human-Object Interactions |
| 1209 | Poster | Detecting and Recognizing Human-Object Interactions |
| 1213 | Poster | Augmenting Crowd-Sourced 3D Reconstructions using Semantic Detections |
| 1219 | Poster | Visual Relationship Learning with a Factorization-based Prior |
| 1224 | Poster | Re-weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation |
| 1226 | Poster | Flow Guided Recurrent Neural Encoder for Video Salient Object Detection |
| 1230 | Poster | Disentangling 3D Pose in A Dendritic CNN for Unconstrained 2D Face Alignment |
| 1235 | Poster | Progressive Attention Guided Recurrent Network for Salient Object Detection |
| 1240 | Spotlight | Answer with Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering |
| 1240 | Poster | Answer with Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering |
| 1244 | Poster | Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints |
| 1247 | Poster | Repulsion Loss: Detecting Pedestrians in a Crowd |
| 1248 | Poster | PU-Net: Point Cloud Upsampling Network |
| 1249 | Spotlight | Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF |
| 1249 | Poster | Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF |
| 1251 | Poster | PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection |
| 1252 | Poster | Gated Fusion Network for Single Image Dehazing |
| 1255 | Spotlight | Interleaved Structured Sparse Convolutional Neural Networks |
| 1255 | Poster | Interleaved Structured Sparse Convolutional Neural Networks |
| 1258 | Poster | Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks |
| 1264 | Poster | End-to-end Flow Correlation Tracking with Spatial-temporal Attention |
| 1271 | Poster | Left/Right Asymmetric Layer Skippable Networks |
| 1276 | Oral | Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation |
| 1276 | Poster | Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation |
| 1280 | Spotlight | VITAL: VIsual Tracking via Adversarial Learning |
| 1280 | Poster | VITAL: VIsual Tracking via Adversarial Learning |
| 1282 | Poster | RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints |
| 1284 | Spotlight | Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints |
| 1284 | Poster | Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints |
| 1287 | Oral | Squeeze-and-Excitation Networks |
| 1287 | Poster | Squeeze-and-Excitation Networks |
| 1288 | Poster | Edit Probability for Scene Text Recognition |
| 1289 | Spotlight | Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning |
| 1289 | Poster | Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning |
| 1290 | Poster | Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning |
| 1294 | Poster | Learning to Localize Sound Source in Visual Scenes |
| 1296 | Poster | Dynamic Few-Shot Visual Learning without Forgetting |
| 1303 | Poster | Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features |
| 1304 | Poster | SINT++: Robust Visual Tracking via Adversarial Hard Positive Generation |
| 1308 | Poster | Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer |
| 1315 | Poster | Fast and Accurate Single Image Super-Resolution via Information Distillation Network |





Download code: follow the 计算机视觉联盟 (Computer Vision Alliance) account and reply: CVPR2018

