DAEDALUS (Massive-scale Urban Reconstruction, Classification, and Rendering from Remote Sensor Imagery) has received $599,317.50 CAD in funding from Presagis Inc, Canada’s Department of National Defence, and the Natural Sciences and Engineering Research Council of Canada under grant agreement DNDPJ515556-17.
On August 1st, 2018, the kick-off meeting for the DAEDALUS project was held.
More information can be found on the project’s website: daedalus.theictlab.org
Project description: Advances in remote sensing technologies have enabled the widespread availability of geo-imagery including satellite and oblique aerial images, and a relatively recent yet fast maturing technology called wide-area motion imagery (WAMI). Remote sensing has already been successfully employed in many applications such as intelligence, security, reconnaissance, urban planning and monitoring, etc. However, images corresponding to urban areas typically result in large volumes of data which require new computer vision and 3D graphics techniques for emerging data exploitation, including 3D reconstruction, geospatial feature classification and photorealistic rendering of the reconstructed models. Our proposed research methodology is to develop new techniques for handling big sets of large-sized images containing repetitive patterns using dense matching and deep learning for 3D reconstruction, object classification, and realistic appearance modeling.
Join us at the 3DTV Conference 2018
ICT lab researcher Chen Qiao will be presenting our work on “Single-shot Dense Reconstruction with Epic-flow”. This work is co-authored by Chen Qiao and Charalambos Poullis.
In this paper we present a novel method for generating dense reconstructions by applying only structure-from-motion (SfM) on large-scale datasets, without the need for multi-view stereo as a post-processing step. A state-of-the-art optical flow technique is used to generate dense matches. The matches are encoded such that verification for correctness becomes possible, and are stored in an on-disk database. This out-of-core approach transfers the requirement for large memory space to disk, thereby allowing the processing of even larger-scale datasets than before. We compare our approach with the state of the art and present results which verify our claims.
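The out-of-core idea described above can be illustrated with a minimal sketch. The abstract does not describe the actual encoding or database, so everything here is an assumption: SQLite stands in for the on-disk store, a CRC32 checksum stands in for the verifiable encoding, and the function names are hypothetical.

```python
import os
import sqlite3
import struct
import tempfile
import zlib


def open_match_db(path):
    """Create (or open) an on-disk database holding pairwise matches."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS matches ("
        "img_a INTEGER, img_b INTEGER, payload BLOB, crc INTEGER)"
    )
    return db


def pack_match(xa, ya, xb, yb):
    """Pack one match as four little-endian float32 values plus a CRC32.

    The checksum is only a stand-in for the paper's verifiable encoding.
    """
    payload = struct.pack("<4f", xa, ya, xb, yb)
    return payload, zlib.crc32(payload)


def add_matches(db, img_a, img_b, matches):
    """Append matches for one image pair; they live on disk, not in RAM."""
    rows = []
    for match in matches:
        payload, crc = pack_match(*match)
        rows.append((img_a, img_b, payload, crc))
    db.executemany("INSERT INTO matches VALUES (?, ?, ?, ?)", rows)
    db.commit()


def iter_verified_matches(db, img_a, img_b):
    """Stream matches for an image pair, skipping rows that fail the check."""
    cursor = db.execute(
        "SELECT payload, crc FROM matches WHERE img_a = ? AND img_b = ?",
        (img_a, img_b),
    )
    for payload, crc in cursor:
        if zlib.crc32(payload) == crc:
            yield struct.unpack("<4f", payload)
```

Because matches are streamed from a cursor, only the rows for the image pair currently being processed need to be held in memory.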
The Photogrammetric Vision lab of the Cyprus University of Technology will be presenting the joint work “Underwater Photogrammetry in Very Shallow Waters: Caustics Effect Removal and Main Challenges” at the ISPRS Technical Commission II Symposium, 2018. This work was done in collaboration with the ICT lab and the Lab. of Photogrammetry of the National Technical University of Athens, and is co-authored by P. Agrafiotis, D. Skarlatos, T. Forbes, C. Poullis, M. Skamantzari, and A. Georgopoulos.
In this paper, the main challenges of underwater photogrammetry in shallow waters are described and analysed. The very short camera-to-object distance in such cases, as well as buoyancy issues, wave effects and turbidity of the waters, are challenges to be resolved. Additionally, the major challenge of all, caustics, is addressed by a new approach for caustics removal (Forbes et al., 2018) which is applied in order to investigate its performance in terms of SfM-MVS and 3D reconstruction results. In the proposed approach the complex problem of removing caustics effects is addressed by classifying and then removing them from the images. We propose and test a novel solution based on two small and easily trainable Convolutional Neural Networks (CNNs). Real ground truth for caustics is not easily available. We show how a small set of synthetic data can be used to train the network and later transfer the learning to real data with robustness to intra-class variation. The proposed solution results in caustic-free images which can be further used for other tasks as may be needed.
Our work “DeepCaustics: Classification and Removal of Caustics from Underwater Imagery” will appear as a regular journal publication in IEEE Journal of Oceanic Engineering 2018. This work is co-authored by Timothy Forbes, Mark Goldsmith, Sudhir Mudur, and Charalambos Poullis.
Caustics are complex physical phenomena resulting from the projection of light rays being reflected or refracted by a curved surface. In this work, we address the problem of classifying and removing caustics from images and propose a novel solution based on two Convolutional Neural Networks (CNNs): SalienceNet and DeepCaustics. Caustics result in changes in illumination which are continuous in nature; therefore, the first network is trained to produce a classification of caustics which is represented as a saliency map of the likelihood of caustics occurring at a pixel. In applications where caustic removal is essential, the second network is trained to generate a caustic-free image. It is extremely hard to generate real ground truth for caustics. We demonstrate how synthetic caustic data can be used for training in such cases, and then transfer the learning to real data. To the best of our knowledge, out of the handful of techniques which have been proposed, this is the first time that the complex problem of caustic removal has been reformulated and addressed as a classification and learning problem. This work is motivated by the real-world challenges in underwater archaeology.
Join us at the 15th Conference on Computer and Robot Vision 2018
ICT lab researcher Timothy Forbes will be presenting our work on “Deep Autoencoders with Aggregated Residual Transformations for Urban Reconstruction from Remote Sensing Data”. This work is co-authored by Timothy Forbes and Charalambos Poullis.
In this work we investigate urban reconstruction and propose a complete and automatic framework for reconstructing urban areas from remote sensing data.
Firstly, we address the complex problem of semantic labeling and propose a novel network architecture named SegNeXT. It combines two strengths: deep autoencoders with feed-forward links, which generate smooth predictions and reduce the number of learning parameters, and cardinality-enabled residual-based building blocks, which have been shown to improve prediction accuracy and to outperform deeper/wider network architectures with a smaller number of learning parameters. The network is trained with benchmark datasets and the reported results show that it provides classification at least similar to, and in some cases better than, the state of the art.
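As a rough illustration of why cardinality (grouped convolutions) keeps the parameter count low, the sketch below counts the weights of a residual bottleneck block with and without grouping. The channel widths are illustrative values of the kind used in aggregated-residual designs; they are assumptions and are not taken from SegNeXT.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2D convolution (bias terms ignored)."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels.
    return kernel * kernel * (c_in // groups) * c_out


def bottleneck_params(c_io, width, cardinality=1):
    """1x1 reduce, 3x3 (optionally grouped), then 1x1 expand."""
    return (
        conv_params(c_io, width, 1)
        + conv_params(width, width, 3, groups=cardinality)
        + conv_params(width, c_io, 1)
    )


# Assumed, illustrative widths: a plain block with a 64-channel bottleneck
# versus a grouped block that is twice as wide but split into 32 groups.
plain = bottleneck_params(256, 64)
grouped = bottleneck_params(256, 128, cardinality=32)
print(plain, grouped)  # 69632 70144
```

Under these assumed widths the grouped block is twice as wide in its 3x3 layer yet has almost the same number of weights as the plain block.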
Secondly, we address the problem of urban reconstruction and propose a complete pipeline for automatically converting semantic labels into virtual representations of the urban areas. Agglomerative clustering is performed on the points according to their classification, resulting in a set of contiguous and disjoint clusters. Finally, each cluster is processed according to the class to which it belongs: tree clusters are substituted with procedural models, cars are replaced with simplified CAD models, buildings’ boundaries are extruded to form 3D models, and road, low vegetation, and clutter clusters are triangulated and simplified.
The result is a complete virtual representation of the urban area. The proposed framework has been extensively tested on large-scale benchmark datasets and the semantic labeling and reconstruction results are reported.
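The clustering and per-class processing steps can be sketched as follows. This is a simplified stand-in rather than the paper's method: it assumes the labels lie on a regular 2D grid, merges 4-connected cells of the same class with a union-find structure (a basic form of adjacency-constrained agglomerative merging), and only names the action each cluster would receive. All function and class names are hypothetical.

```python
def cluster_by_class(labels):
    """Group 4-connected cells sharing a class label into disjoint clusters.

    labels: list of rows, each a list of class names.
    Returns a list of (class_name, [(row, col), ...]) tuples.
    """
    rows, cols = len(labels), len(labels[0])
    parent = list(range(rows * cols))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Merge each cell with its right and lower neighbour when labels agree.
    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            if c + 1 < cols and labels[r][c] == labels[r][c + 1]:
                union(idx, idx + 1)
            if r + 1 < rows and labels[r][c] == labels[r + 1][c]:
                union(idx, idx + cols)

    clusters = {}
    for r in range(rows):
        for c in range(cols):
            clusters.setdefault(find(r * cols + c), []).append((r, c))
    return [(labels[cells[0][0]][cells[0][1]], cells)
            for cells in clusters.values()]


# Per-class actions mirroring the text; the strings are placeholders.
ACTIONS = {
    "tree": "substitute procedural model",
    "car": "replace with simplified CAD model",
    "building": "extrude boundary into 3D model",
}


def action_for(class_name):
    """Road, low vegetation and clutter fall through to triangulation."""
    return ACTIONS.get(class_name, "triangulate and simplify")
```

For example, `[(name, len(cells), action_for(name)) for name, cells in cluster_by_class(grid)]` lists each cluster's class, size, and the processing it would receive.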
ICT lab researcher Mhd Adnan Utayim has been awarded the NSERC Undergraduate Student Research Award.
The proposal is on “3D Reconstruction of large-scale areas from remote-sensor data”. A summary of the project is shown below:
In the proposed work we will explore other methods to alleviate issues relating to large-scale 3D reconstruction from remote-sensor data and to produce higher-quality, accurate 3D models. In particular, we will investigate the reformulation of the problem as an optimization problem involving a finite set of parameterized primitives. The anticipated result is the complete representation of the captured scene as a combination of basic primitives consisting of lightweight, watertight, and simple polygonal 3D models.
Our journal paper “Reflecting on the Design Process for Virtual Reality Applications” has been published and is available online:
A reflective analysis on the experience of Virtual Environment (VE) design is presented, focusing on the human-computer interaction (HCI) challenges presented by virtual reality (VR). HCI design guidelines were applied to the development of two VR applications, one in marine archaeology and the other in situation awareness simulation experiments. The impact of methods and HCI knowledge on the VR design process is analyzed, leading to proposals for presenting HCI and cognitive knowledge in the context of design trade-offs in the choice of VR design techniques. Problems reconciling VE and standard GUI design components are investigated. A trade-off framework for design options set against criteria for usability, efficient operation, realism and presence is proposed. HCI-VR design advice and proposals for further research aimed towards improving human factors-related design in VEs are discussed.
Object classification is one of the many holy grails in computer vision and as such has resulted in a very large number of algorithms being proposed already. In recent years specifically, there has been considerable progress in this area, primarily due to the increased efficiency and accessibility of deep learning techniques. In fact, for single-label object classification (i.e. only one object present in the image) the state-of-the-art techniques employ deep neural networks and report close to human-like performance. However, there are specialized applications in which single-label object-level classification will not suffice, for example in cases where the image contains multiple intertwined objects of different labels.
In this paper, we address the complex problem of multi-label pixelwise classification.
We present our distinct solution based on a convolutional neural network (CNN) for performing multi-label pixelwise classification and its application to large-scale urban reconstruction. A supervised learning approach is followed for training a 13-layer CNN using both LiDAR and satellite images. An empirical study has been conducted to determine the hyperparameters which result in the optimal performance of the CNN. Scale invariance is introduced by training the network on five different scales of the input and labeled data. This results in six pixelwise classifications for each different scale. An SVM is then trained to map the six pixelwise classifications into a single label. Lastly, we refine boundary pixel labels using graph-cuts for maximum a-posteriori (MAP) estimation with Markov Random Field (MRF) priors. The resulting pixelwise classification is then used to accurately extract and reconstruct the buildings in large-scale urban areas. The proposed approach has been extensively tested and the results are reported.
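To make the MAP refinement step more concrete, the sketch below evaluates the kind of energy that such a formulation minimizes: a per-pixel data term plus a penalty for neighbouring pixels with different labels. The Potts smoothness prior, the 4-neighbourhood, and the function name are assumptions; the abstract does not specify them, and the graph-cut solver itself is not shown.

```python
import math


def mrf_energy(probs, labeling, smoothness=1.0):
    """Energy of a labeling: negative log-likelihood plus a Potts penalty.

    probs[r][c] maps each class name to its predicted probability.
    labeling[r][c] is the class assigned to pixel (r, c).
    Lower energy corresponds to a more probable labeling under the model.
    """
    rows, cols = len(labeling), len(labeling[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            # Data term: how well the label agrees with the classifier.
            energy += -math.log(probs[r][c][labeling[r][c]])
            # Smoothness term: penalize label changes between neighbours.
            if c + 1 < cols and labeling[r][c] != labeling[r][c + 1]:
                energy += smoothness
            if r + 1 < rows and labeling[r][c] != labeling[r + 1][c]:
                energy += smoothness
    return energy
```

With a positive smoothness weight, an isolated label that the classifier only weakly prefers costs more than agreeing with its neighbours, which is the effect the boundary refinement relies on.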
Join us at the 25th ACM Multimedia conference in Mountain View, CA, USA.
ICT lab researcher Behnam Maneshgar will be presenting our work “Automatic Adjustment of Stereoscopic Content for Long-Range Projections in Outdoor Areas”. The work is co-authored with Leila Sujir, Sudhir Mudur, and Charalambos Poullis.
Abstract: Projecting stereoscopic content onto large general outdoor surfaces, say building facades, presents many challenges that must be overcome to achieve colour and depth perception that is as accurate as possible, particularly when using the red-cyan anaglyph stereo representation.
In this paper, we address the challenges relating to long-range projection mapping of stereoscopic content in outdoor areas and present a complete framework for the automatic adjustment of the content to compensate for any adverse projection surface behaviour. We formulate the problem of modeling the projection surface as one of simultaneous recovery of shape and appearance. Our system is composed of two standard fixed cameras, a long-range fixed projector, and a roving video camera for multi-view capture. The overall computational framework comprises four modules: calibration of a long-range vision system using the structure-from-motion technique, dense 3D reconstruction of the projection surface from calibrated camera images, modeling the light behaviour of the projection surface using roving camera images, and iterative adjustment of the stereoscopic content. In addition to cleverly adapting some of the established computer vision techniques, the system design we present is distinct from previous work. The proposed framework has been tested in real-world applications with two non-trivial user experience studies and the results reported show considerable improvements in the quality of 3D depth and colour perceived by human participants.
ICT Lab researcher Oliver Philpin-Briscoe will be presenting our paper “A Serious Game for Understanding Ancient Seafaring in the Mediterranean Sea”. The work is co-authored with B. Simon, S. Mudur, C. Poullis, S. Rizvic, D. Boskovic, F. Liarokapis, D. Skarlatos, I. Katsouri, S. Demesticha.
Abstract: Commercial sea routes joining Europe with other cultures are vivid examples of cultural interaction. In this work, we present a serious game which aims to provide better insight and understanding of seaborne trade mechanisms and seafaring practices in the eastern Mediterranean during the Classical and Hellenistic periods. The game incorporates probabilistic geospatial analysis of possible ship routes through the re-use and spatial analysis of open GIS maritime, ocean, and weather data. These routes, along with naval engineering and sailing techniques from the period, are used as underlying information for the seafaring game. This work is part of the EU-funded project iMareCulture, whose purpose is to raise awareness of European identity through maritime and underwater cultural interaction and exchange in the Mediterranean Sea.