Join us at the 16th Conference on Computer and Robot Vision, 2019

ICT lab researcher Bodhiswatta Chatterjee will be presenting our work “On Building Classification from Remote Sensor Imagery Using Deep Neural Networks and the Relation Between Classification and Reconstruction Accuracy Using Border Localization as Proxy”. The work is co-authored with Bodhiswatta Chatterjee and Charalambos Poullis.

Abstract: Convolutional neural networks have been shown to have a very high accuracy when applied to certain visual tasks and in particular semantic segmentation. In this paper we address the problem of semantic segmentation of buildings from remote sensor imagery. We present ICT-Net: a novel network with the underlying architecture of a fully convolutional network, infused with feature re-calibrated Dense blocks at each layer. Uniquely, the proposed network combines the localization accuracy and use of context of the U-Net network architecture, the compact internal representations and reduced feature redundancy of the Dense blocks, and the dynamic channel-wise feature re-weighting of the Squeeze-and-Excitation (SE) blocks. The proposed network has been tested on INRIA’s benchmark dataset and is shown to outperform all other state-of-the-art methods by more than 1.5% on the Jaccard index.
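For readers unfamiliar with the metric, the Jaccard index (also called intersection over union) of a predicted building mask and a reference mask is the number of pixels marked in both, divided by the number marked in either. The following is a minimal sketch of that definition only; the function name and the toy masks are illustrative assumptions and are not taken from the paper or its source code.

```python
def jaccard_index(predicted, reference):
    """Jaccard index (intersection over union) of two binary masks.

    Both masks are equally sized 2D lists of 0/1 values, where 1 marks
    a building pixel. Returns 1.0 when both masks are empty.
    """
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    return intersection / union if union else 1.0


# Toy 3x3 masks (illustrative values, not data from the paper).
predicted = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
reference = [[1, 1, 0],
             [1, 0, 0],
             [1, 0, 0]]

# 3 pixels are marked in both masks and 5 are marked in at least one.
print(jaccard_index(predicted, reference))
```

A score of 1.0 means the two masks agree on every building pixel, so a gain of 1.5% on this index corresponds to a larger overlap between predicted and reference footprints.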

Furthermore, as the building classification is typically the first step of the reconstruction process, in the latter part of the paper we investigate the relationship of the classification accuracy to the reconstruction accuracy. A comparative quantitative analysis of reconstruction accuracies corresponding to different classification accuracies confirms the strong correlation between the two. We present the results which show a consistent and considerable reduction in the reconstruction accuracy. The source code and supplemental material are publicly available at

Our work on “Large-scale Urban Reconstruction with Tensor Clustering and Global Boundary Refinement” has been published as a regular journal paper in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019. The work is authored by C. Poullis.


Accurate and efficient methods for large-scale urban reconstruction are of significant importance to the computer vision and computer graphics communities. Although rapid acquisition techniques such as airborne LiDAR have been around for many years, creating a useful and functional virtual environment from such data remains difficult and labor intensive. This is due largely to the necessity in present solutions for data dependent user defined parameters. In this paper we present a new solution for automatically converting large LiDAR data pointcloud into simplified polygonal 3D models. The data is first divided into smaller components which are processed independently and concurrently to extract various metrics about the points. Next, the extracted information is converted into tensors. A robust agglomerate clustering algorithm is proposed to segment the tensors into clusters representing geospatial objects e.g. roads, buildings, etc. Unlike previous methods, the proposed tensor clustering process has no data dependencies and does not require any user-defined parameter. The required parameters are adaptively computed assuming a Weibull distribution for similarity distances. Lastly, to extract boundaries from the clusters a new multi-stage boundary refinement process is developed by reformulating this extraction as a global optimization problem. We have extensively tested our methods on several pointcloud datasets of different resolutions which exhibit significant variability in geospatial characteristics e.g. ground surface inclination, building density, etc., and the results are reported. The source code for both tensor clustering and global boundary refinement will be made publicly available with the publication.
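The abstract does not say how the Weibull assumption is turned into concrete parameters, so the sketch below shows only the generic building blocks: the two-parameter Weibull cumulative distribution function and its inverse. One plausible use, stated here purely as an assumption, is to read a distance threshold off as a quantile once the shape and scale have been estimated from the observed similarity distances. The function names, the shape and scale values, and the 0.95 level are all hypothetical.

```python
import math


def weibull_cdf(x, shape, scale):
    """P(X <= x) for a two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_quantile(p, shape, scale):
    """Value below which a fraction p of the distribution falls."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Hypothetical parameters; in practice they would be estimated from the
# similarity distances observed in the data being processed.
shape, scale = 1.5, 2.0

# Hypothetical rule: treat distances beyond the 95th percentile as "dissimilar".
threshold = weibull_quantile(0.95, shape, scale)
print(threshold, weibull_cdf(threshold, shape, scale))
```

Because the threshold is derived from the fitted distribution and not typed in by hand, it adapts to each dataset, which is the property the abstract emphasizes.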

Available here:

Our work “Evaluation of “The Seafarers”: A serious game on seaborne trade in the Mediterranean sea during the Classical period” has been published as a regular journal publication in Elsevier Journal on Digital Applications in Archaeology and Cultural Heritage 2019. This work is co-authored with Charalambos Poullis, Marta Kersten-Oertel, J. Praveen Benjamin, Oliver Philbin-Briscoe, Bart Simon, Dimitra Perissiou, Stella Demesticha, Evangeline Markou, Elias Frentzos, Phaedon Kyriakidis, Dimitrios Skarlatos, and Selma Rizvic.


Throughout the history of the Mediterranean region, seafaring and trading played a significant role in the interaction between the cultures and people in the area. In order to engage the general public in learning about maritime cultural heritage we have designed and developed a serious game incorporating geospatially analyzed data from open GIS archaeological maritime sources, and archaeological data resulting from shipwreck excavations. We present a second prototype of the seafaring serious game, and discuss the results of an evaluation which involved a large multi-site user study with participants from three continents.

More specifically, we present the evaluation of “The Seafarers”, a strategy-based game which integrates knowledge from multiple disciplines in order to educate the user through playing. A first prototype was reported in Philbin-Briscoe et al. (2017), where an expert-user evaluation of the usability and the effectiveness of the game in terms of the learning objectives was performed.

In this paper, we present how the outcomes of the evaluation of the first prototype “The Seafarers-1” by expert-users were used in the redesign and development of the game mechanics for the second prototype “The Seafarers-2”. We then present our methodology for evaluating the game with respect to the game objective of engagement in learning about maritime cultural heritage, seafaring and trading in particular. Specifically, the evaluation was to test the hypothesis that game playing allows for more engaged learning, thus improving longer-term knowledge retention. The evaluation was conducted in two phases and includes a pilot study, followed by a multi-site, multi-continent user-study involving a large number of participants. We analyze the results of the user evaluation and discuss the outcomes.

This work is part of the EU-funded project iMareCulture and involves truly multi-continental, multi-institutional and multi-disciplinary cooperation – civil engineers and archaeologists from Cyprus, Human Computer Interaction (HCI) experts and Educationists from Bosnia and Herzegovina and Canada, and cultural sociologists and computer scientists from Canada.

Available here

Join us at the 3DTV Conference 2018

ICT lab researcher Chen Qiao will be presenting our work on “Single-shot Dense Reconstruction with Epic-flow”. This work is co-authored with Chen Qiao and Charalambos Poullis.


In this paper we present a novel method for generating dense reconstructions by applying only structure-from-motion (SfM) on large-scale datasets without the need for multi-view stereo as a post-processing step. A state-of-the-art optical flow technique is used to generate dense matches. The matches are encoded such that verification for correctness becomes possible, and are stored in an on-disk database. This out-of-core approach transfers the requirement for large memory space to disk, thereby allowing the processing of even larger-scale datasets than before. We compare our approach with the state of the art and present the results which verify our claims.

The Photogrammetric Vision lab of the Cyprus University of Technology will be presenting the joint work “Underwater Photogrammetry in Very Shallow Waters: Caustics Effect Removal and Main Challenges” at the ISPRS Technical Commission II Symposium, 2018. This work was done in collaboration with the ICT lab, and the Lab. of Photogrammetry of the National Technical University of Athens, and is co-authored with P. Agrafiotis, D. Skarlatos, T. Forbes, C. Poullis, M. Skamantzari, and A. Georgopoulos.


In this paper, the main challenges of underwater photogrammetry in shallow waters are described and analysed. The very short camera to object distance in such cases, as well as buoyancy issues, wave effects and turbidity of the waters are challenges to be resolved. Additionally, the major challenge of all, caustics, is addressed by a new approach for caustics removal (Forbes et al., 2018) which is applied in order to investigate its performance in terms of SfM-MVS and 3D reconstruction results. In the proposed approach the complex problem of removing caustics effects is addressed by classifying and then removing them from the images. We propose and test a novel solution based on two small and easily trainable Convolutional Neural Networks (CNNs). Real ground truth for caustics is not easily available. We show how a small set of synthetic data can be used to train the network and later transfer the learning to real data with robustness to intra-class variation. The proposed solution results in caustic-free images which can be further used for other tasks as may be needed.

Our work “DeepCaustics: Classification and Removal of Caustics from Underwater Imagery” will appear as a regular journal publication in IEEE Journal of Oceanic Engineering 2018. This work is co-authored with Timothy Forbes, Mark Goldsmith, Sudhir Mudur, and Charalambos Poullis.


Caustics are complex physical phenomena resulting from the projection of light rays being reflected or refracted by a curved surface. In this work, we address the problem of classifying and removing caustics from images and propose a novel solution based on two Convolutional Neural Networks (CNNs): SalienceNet and DeepCaustics. Caustics result in changes in illumination which are continuous in nature, therefore the first network is trained to produce a classification of caustics which is represented as a saliency map of the likelihood of caustics occurring at a pixel. In applications where caustic removal is essential, the second network is trained to generate a caustic-free image. It is extremely hard to generate real ground truth for caustics. We demonstrate how synthetic caustic data can be used for training in such cases, and then transfer the learning to real data. To the best of our knowledge, out of the handful of techniques which have been proposed this is the first time that the complex problem of caustic removal has been reformulated and addressed as a classification and learning problem. This work is motivated by the real-world challenges in underwater archaeology.

Join us at the 15th Conference on Computer and Robot Vision 2018

ICT lab researcher Timothy Forbes will be presenting our work on “Deep Autoencoders with Aggregated Residual Transformations for Urban Reconstruction from Remote Sensing Data”. This work is co-authored with Timothy Forbes and Charalambos Poullis.


In this work we investigate urban reconstruction and propose a complete and automatic framework for reconstructing urban areas from remote sensing data.

Firstly, we address the complex problem of semantic labeling and propose a novel network architecture named SegNeXT. It combines the strengths of deep autoencoders with feed-forward links, which generate smooth predictions and reduce the number of learning parameters, with the effectiveness of cardinality-enabled residual-based building blocks, which have been shown to improve prediction accuracy and to outperform deeper/wider network architectures while using fewer learning parameters. The network is trained with benchmark datasets and the reported results show that it can provide at least similar and in some cases better classification than the state of the art.

Secondly, we address the problem of urban reconstruction and propose a complete pipeline for automatically converting semantic labels into virtual representations of the urban areas. An agglomerative clustering is performed on the points according to their classification and results in a set of contiguous and disjoint clusters. Finally, each cluster is processed according to the class to which it belongs: tree clusters are substituted with procedural models, cars are replaced with simplified CAD models, buildings’ boundaries are extruded to form 3D models, and road, low vegetation, and clutter clusters are triangulated and simplified.
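To make the idea of contiguous, disjoint clusters followed by per-class handling more concrete, here is a rough sketch. It is not the paper's agglomerative clustering on points: as a simpler stand-in, it flood-fills a small 2D grid of class labels using 4-connectivity, then looks up a per-class action. The grid, the function name, and the action strings are illustrative assumptions.

```python
from collections import deque


def cluster_by_label(labels):
    """Group 4-connected cells that share a class label.

    `labels` is a 2D list of class names. Returns a list of
    (class_name, cells) pairs; every cell lands in exactly one cluster.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            name = labels[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                # Visit the four axis-aligned neighbours with the same label.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == name):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            clusters.append((name, cells))
    return clusters


# Per-class handling as listed in the post; the strings are placeholders
# standing in for the actual modelling steps.
ACTIONS = {
    "tree": "substitute procedural model",
    "car": "replace with simplified CAD model",
    "building": "extrude boundary to 3D model",
}
DEFAULT_ACTION = "triangulate and simplify"  # road, low vegetation, clutter

# Toy label grid (illustrative, not benchmark data).
grid = [
    ["building", "building", "road"],
    ["building", "road", "road"],
    ["tree", "road", "car"],
]

for name, cells in cluster_by_label(grid):
    print(name, len(cells), ACTIONS.get(name, DEFAULT_ACTION))
```

Keeping the class-to-action mapping in a lookup table separates the clustering step from the modelling step, so a new class only needs a new table entry.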

The result is a complete virtual representation of the urban area. The proposed framework has been extensively tested on large-scale benchmark datasets and the semantic labeling and reconstruction results are reported.

ICT lab researcher Mhd Adnan Utayim has been awarded the NSERC Undergraduate Student Research Award.

The proposal is on “3D Reconstruction of large-scale areas from remote-sensor data”. A summary of the project is shown below:

In the proposed work we will explore other methods to alleviate issues relating to large-scale 3D reconstruction from remote-sensor data and produce higher-quality, more accurate 3D models. In particular, we will investigate the reformulation of the problem as an optimization problem involving a finite set of parameterized primitives. The anticipated result is the complete representation of the captured scene as a combination of basic primitives comprising lightweight, watertight, and simple polygonal 3D models.

Our journal paper “Reflecting on the Design Process for Virtual Reality Applications” has been published and is available online:


A reflective analysis on the experience of Virtual Environment (VE) design is presented, focusing on the human computer interaction (HCI) challenges presented by virtual reality (VR). HCI design guidelines were applied to the development of two VRs, one in marine archaeology and the other in situation awareness simulation experiments. The impact of methods and HCI knowledge on the VR design process is analyzed, leading to proposals for presenting HCI and cognitive knowledge in the context of design trade-offs in the choice of VR design techniques. Problems reconciling VE and standard GUI design components are investigated. A trade-off framework for design options set against criteria for usability, efficient operation, realism and presence is proposed. HCI-VR design advice and proposals for further research aimed towards improving human factors-related design in VEs are discussed.



Object classification is one of the many holy grails in computer vision and as such has resulted in a very large number of algorithms being proposed already. Specifically, in recent years there has been considerable progress in this area, primarily due to the increased efficiency and accessibility of deep learning techniques. In fact, for single-label object classification [i.e. only one object present in the image] the state-of-the-art techniques employ deep neural networks and are reporting very close to human-like performance. There are specialized applications in which single-label object-level classification will not suffice; for example, in cases where the image contains multiple intertwined objects of different labels.
In this paper, we address the complex problem of multi-label pixelwise classification.

We present our distinct solution based on a convolutional neural network (CNN) for performing multi-label pixelwise classification and its application to large-scale urban reconstruction. A supervised learning approach is followed for training a 13-layer CNN using both LiDAR and satellite images. An empirical study has been conducted to determine the hyperparameters which result in the optimal performance of the CNN. Scale invariance is introduced by training the network on five different scales of the input and labeled data. This results in six pixelwise classifications for each different scale. An SVM is then trained to map the six pixelwise classifications into a single label. Lastly, we refine boundary pixel labels using graph-cuts for maximum a-posteriori (MAP) estimation with Markov Random Field (MRF) priors. The resulting pixelwise classification is then used to accurately extract and reconstruct the buildings in large-scale urban areas. The proposed approach has been extensively tested and the results are reported.