III: Small: Geospatial Data Representation and Analysis through the Stellar Decomposition
Award: NSF-IIS - 1910766
October 1, 2019 – September 30, 2024
Project description
The project focuses on developing scalable data representations, and algorithms for processing and analyzing scattered big geospatial data. The emphasis is on dealing with point clouds of very large size, arising from LIDAR acquisitions, and on applications to terrain modeling and to tree reconstruction from forestry data, with benefits to the research in environmental and Earth science. Being the approach entirely data agnostic, the project has a potential impact on a broader range of applications, including neuroscience, social science, and virtual reality. Software tools for modeling and analysis of very large terrains, and for forestry segmentation will be developed and distributed in the public domain.
Currently, raw LiDAR point clouds are processed by first converting them into raster models, with high computational costs, potential loss of information, and creation of artifacts due to missing data and to the presence of noise. The innovative aspect of the project is in performing data analysis directly on the LiDAR point clouds. This requires encoding the neighboring relations among the points as a simplicial mesh so as to provide an approximation of the underlying “shape” of the point cloud. In order to be able to deal with massive data sets, the research will develop efficient and effective mesh data structures, based on a new data clustered spatio-topological model, the Stellar decomposition, supporting scalability, and efficient processing of fundamental spatial and connectivity queries. New algorithms, rooted in computational topology, will be developed for terrain simplification and analysis, and for tree reconstruction from forestry data. The new scalable framework will make the analysis of terrain and forest data possible on commodity hardware even for datasets composed of billions of points. Moreover, thanks to the use of the Stellar decomposition, it will be well suited for implementations in a distributed environment for the analysis of such data at the global Earth scale.
The major challenges of the project are in (i) defining and implementing compact and efficient data structures for simplicial complexes in order to make the direct processing and analysis of big scattered geospatial data effective and efficient on commodity hardware; in (ii) developing scalable representations which are well suited for future implementations in distributed environments; in (iii) designing and developing new algorithms, rooted in computational topology, for terrain analysis and simplification, and for forest segmentation, applicable also in other contexts; in (iv) developing a software platform for modeling and analysis of terrains and for tree segmentation from forestry data, which will consist of several integrated libraries and software tools based on highly innovative data representation and analysis techniques, capable of dealing with large-size point clouds.
The fundamental research tasks identified in this project are:
-
Task 1. Develop a new decomposition-based model for data structures for simplicial complexes (meshes in two, three and higher dimension complexes), as well as a dimension-independent implementation.
-
Task 2. Develop a new framework for processing and analysis of terrains reconstructed from large LIDAR point clouds, based on the new data model, and on topological data analysis algorithms.
-
Task 3. Develop a new framework for the representation and analysis of 3D shapes for applications to tree identification and reconstruction from LiDAR point clouds based on topological algorithms and on a data structure rooted in the new decomposition-based model.
Participants
Principal Investigator, Professor, University of Maryland at College Park
Collaborator, Assistant professor, Clemson University, South Carolina
Collaborator, Senior Scientist, German Aerospace Center (DLR), Germany
Collaborator, Senior Scientist, Lawrence Livermore National Laboratory, USA
PhD student, University of Maryland at College Park
PhD student, University of Maryland at College Park
PhD student, University of Maryland at College Park
PhD student, University of Maryland at College Park
PhD student, University of Maryland at College Park
Publications
2024
-
ImplicitTerrain: a Continuous Surface Model for Terrain Data Analysis
-
Parallel Topology-aware Mesh Simplification on Terrain Trees
2023
-
A topology-based approach to individual tree segmentation from airborne LiDAR data
-
Terrain trees: a framework for representing, analyzing and visualizing triangulated terrains
2022
-
An unsupervised building footprints delineation approach for large-scale LiDAR point clouds
-
Label-based generalization of bathymetry data for hydrographic sounding selection
2021
-
The Stellar decomposition: A compact representation for simplicial complexes and beyond
-
Efficient topology-aware simplification of large triangulated terrains
2020
-
Efficient Homology-Preserving Simplification of High-Dimensional Simplicial Shapes
-
Computing Multiparameter Persistent Homology through a Discrete Morse-Based Approach
-
Tetrahedral trees: A family of hierarchical spatial indexes for tetrahedral meshes
Presentations
-
Xin Xu @ ACM SIGSPATIAL, November 2022An unsupervised building footprints delineation approach for large-scale LiDAR point clouds
-
Noel Dyer @ Canadian Hydrographic Conference, June 2022Towards an Automated Chart-Ready Cartographic Sounding Selection
-
Yunting Song @ ACM SIGSPATIAL, November 2021Efficient topology-aware simplification of large triangulated terrains | video
-
Leila De Floriani @ IEEE Computer Society Japan Chapter, December 2020Mesh-based approached to modeling point clouds for geospatial applications
-
Xin Xu @ ACM SIGSPATIAL, November 2020A Persistence-Based Approach for Individual Tree Mapping | video
-
Leila De Floriani @ IEEE YESIST12, August 2020Topology-based approaches to data analysis
-
Leila De Floriani @ IEEE Vardhaman College of Engineering and IEEE Hyderabad Section, August 2020Topology-based clustering methods for geospatial data analysis
-
Leila De Floriani and Yunting Song @ NOAA, August 2020Representation and analysis of big terrain data
-
Federico Iuricich @ Eurographics, May 2020Efficient Homology-Preserving Simplification of High-Dimensional Simplicial Shapes | video
Software
A C++ library for computing a discrete gradient on multivariate data |
code
DescriptionThis library includes the implementation of the Indexed data structure with Adjacencies (IA data structure) for the representation of triangle meshes. The IA data structure is a triangle-based data structure that encodes the vertices and the triangles of the mesh. The vertices of the meshes with their attributes are encoding in a vertex table. Each triangle is encoded in a triangle table as a triple of indexes to its vertices in the vertex table. The IA data structure also explicitly encodes, for each triangle, the indexes in the triangle table of its three bounding vertices, and for each vertex, the index of one triangle incident in the vertex. This enables the efficient retrieval of all connectivity relations. This library is developed based on the LibTri for terrain analysis, which includes state-of-the-art estimators for slope and curvature for triangulated surfaces (Triangulated Irregular Networks - TINs), and for the extraction of critical points from a TIN. For a description of the required libraries for building the library, the compilation process, the run-time options and of the supported input format, please refer to the github page of the repository listed above. |
|
Tetrahedral Trees |
code
DescriptionIn this library we address the problem of performing efficient spatial and topological queries on large tetrahedral meshes with arbitrary topology and complex boundaries. Such meshes arise in several application domains, such as 3D Geographic Information Systems (GISs), scientific visualization, and finite element analysis. To this aim, we have defined the Tetrahedral trees, a family of spatial indexes based on a nested space subdivision (an octree or a kD-tree) and defined by several different subdivision criteria. In this library, we provide efficient algorithms for spatial and topological queries on Tetrahedral trees and it has been used for generating the results of the paper "Tetrahedral Trees: A Family of Hierarchical Spatial Indexes for Tetrahedral Meshes" by Riccardo Fellegara, Leila De Floriani, Paola Magillo, and Kenneth Weiss in ACM Transaction on Spatial Algorithms and Systems 6, 4, Article 23, 34 pages, 2020 [doi]. For a description of the required libraries for building the library, the compilation process, the run-time options and of the supported input format, please refer to the github page of the repository listed above. |
|
Stellar Trees |
code
DescriptionIn this library we address the problem of the efficient representation and management of simplicial and cell complexes in arbitrary dimension and with arbitrary domain. Managing complexes in three dimensions and higher is not a simple task, since the topological data structures proposed in the literature do not scale when the size and the dimension of a complex become high. We propose the Stellar library as a topological C++ framework for performing efficient topological queries on simplicial and non-simplicial meshes. The Stellar tree library provides a scalable and compact representation that encodes the minimal information to locally reconstruct the topological connectivity of its indexed elements. This provides the flexibility to efficiently construct the optimal data structures to solve the task at hand using a fraction of the memory required for a corresponding topological data structure on the global mesh. The efficiency of the Stellar library increases with the execution of successive queries, as the construction costs of these runtime data structures are amortized over multiple accesses while processing each node. This library has been used for generating the results of the paper "The Stellar decomposition: A compact representation for simplicial complexes and beyond" by Riccardo Fellegara, Kenneth Weiss, and Leila De Floriani in Elsevier Computer & Graphics journal, 98: 322-343, 2021 [doi]. A preprint of this article can be also fund on arXiv here. For a description of the required libraries for building the library, the compilation process, the run-time options and of the supported input format, please refer to the readme file available in the repository archive. |
|
Terrain Modeling and Analysis based on the IA data structure |
code
DescriptionThis library includes the implementation of the Indexed data structure with Adjacencies (IA data structure) for the representation of triangle meshes. The IA data structure is a triangle-based data structure that encodes the vertices and the triangles of the mesh. The vertices of the meshes with their attributes are encoding in a vertex table. Each triangle is encoded in a triangle table as a triple of indexes to its vertices in the vertex table. The IA data structure also explicitly encodes, for each triangle, the indexes in the triangle table of its three bounding vertices, and for each vertex, the index of one triangle incident in the vertex. This enables the efficient retrieval of all connectivity relations. This library is developed based on the LibTri for terrain analysis, which includes state-of-the-art estimators for slope and curvature for triangulated surfaces (Triangulated Irregular Networks - TINs), and for the extraction of critical points from a TIN. For a description of the required libraries for building the library, the compilation process, the run-time options and of the supported input format, please refer to the github page of the repository listed above. |
|
Terrain Trees |
code
DescriptionTerrain trees are a new in-core family of spatial indexes for the representation and analysis of Triangulated Irregular Networks (TINs). Terrain trees combine a minimal encoding of the connectivity of the underlying triangle mesh with a hierarchical spatial index, implicitly representing the other topological relations among vertices, edges and vertices. Topological relations are extracted locally within each leaf block of the hierarchal index at runtime, based on specific application needs. Based on Terrain trees, we have developed a set of tools for terrain analysis and named this toolset as Terrain Trees library (TTL). TTL contains a kernel for connectivity and spatial queries, and modules for extracting morphological features, including edge and triangle slopes, roughness, curvature. It also contains modules for extracting topological structures, like critical point, critical net, watershed segmentation, based on the discrete Morse gradient, and a technique for multivariate data visualization, which enables the analysis of multiple scalar fields defined on the same terrain. By working on TINs generated from big LiDAR (Light, Detection and Ranging) data sets, we demonstrate the effectiveness and scalability of the Terrain trees against state-of-the-art compact data structures. |
Datasets
-
Princeton Shape BenchmarkThe Princeton Shape Benchmark provides a repository of 3D models and software tools for evaluating shape-based retrieval and analysis algorithms. The benchmark contains a database of 3D polygonal models collected from the World Wide Web. For each 3D model, there is an Object File Format (.off) file with the polygonal geometry of the model, a model information file (e.g., the URL from where it came), and a JPEG image file with a thumbnail view of the model.
-
NEWFOR (NEW technologies for a better mountain FORest timber mobilization) single tree detection benchmarkThe NEWFOR single tree detection benchmark is used to evaluate LIDAR based detection methods. The benchmark includes the data of 15 different study areas in 5 countries of the Alpine Space. In each study area, the following data are provided: (1) Airborne Laser Scanning data in *.las format; (2) Digital Terrain Model (DTM) in *.tif format (1×1m or 0.5×0.5m spatial resolution); (3) Forest Inventory data in *.shp format; (4) Area of Interest in *.shp format; and (5) Metainformation of the data. Statistical Parameters about the Field Inventory (FI) data can be found in the publication paper:
Eysn, L.; Hollaus, M.; Lindberg, E.; Berger, F.; Monnet, J.-M.; Dalponte, M.; Kobal, M.; Pellegrini, M.; Lingua, E.; Mongus, D.; Pfeifer, N. A Benchmark of Lidar-Based Single Tree Detection Methods Using Heterogeneous Forest Data from the Alpine Space. Forests 2015, 6, 1721-1747. http://www.mdpi.com/1999-4907/6/5/1721 -
Volume LibraryThe Volume Library provides volume datasets for scientists involved with volume visualization and rendering. The datasets contain regular volume data mainly coming from CT or MRI scanners. The data is stored in the PVM format which contains information about the grid size, bit depth, and the cell spacing of a dataset. Optionally it may also contain a dataset description, courtesy information, the type of the scanner and a comment. This information and the raw data can be extracted easily using the PVM tools distributed with the V^3 volume rendering package available here.
-
CMU Unstructured Mesh SuiteThe CMU Unstructured Mesh Suite is a collection of partitioned unstructured 3D meshes of the San Fernando Valley in Southern California. The meshes were generated as part of the Quake project using the Archimedes tool chain. The meshes are described and characterized in detail in the following report: David R. O'Hallaron and Jonathan Richard Shewchuk, Properties of a Family of Parallel Finite Element Simulations, Technical Report CMU-CS-96-141, December, 1996. Abstract and BibTex extry, PostScript.
-
Virtual Terrain ProjectThe Virtual Terrain Project provides both tools for the creation of terrain datasets as well as has a collection of several 2D datasets including DEMs, Bathymetry, and Contour Data.