CNG Datasets Toolkit

A Python toolkit for processing large geospatial datasets into cloud-native formats with H3 hexagonal indexing.

Features

  • Vector Processing: Convert polygon and point datasets to H3-indexed GeoParquet

  • Raster Processing: Create Cloud-Optimized GeoTIFFs (COGs) and H3-indexed Parquet

  • Kubernetes Integration: Generate and submit K8s jobs for large-scale processing

  • Cloud Storage: Manage S3 buckets and sync across multiple providers with rclone

  • Scalable: Chunk-based processing for datasets that don’t fit in memory
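As a rough illustration of the chunk-based idea in the last bullet, the sketch below streams points in fixed-size batches and assigns each one a cell ID, so only one batch is held in memory at a time. It is not the toolkit's API: `iter_chunks`, `index_in_chunks`, and `grid_cell` are hypothetical names, and `grid_cell` is a simple 1-degree grid stand-in rather than real H3 indexing. With the `h3` package installed, one would presumably pass a function built on `h3.latlng_to_cell` at a fixed resolution instead.

```python
import math
from itertools import islice
from typing import Callable, Iterable, Iterator

Point = tuple[float, float]  # (lat, lng) in degrees


def iter_chunks(rows: Iterable[Point], chunk_size: int) -> Iterator[list[Point]]:
    """Yield successive lists of at most `chunk_size` rows from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    it = iter(rows)
    # islice pulls only the next chunk_size rows, so the source is never fully loaded.
    while chunk := list(islice(it, chunk_size)):
        yield chunk


def index_in_chunks(
    rows: Iterable[Point],
    cell_for: Callable[[float, float], str],
    chunk_size: int = 2,
) -> Iterator[list[tuple[str, float, float]]]:
    """Attach a cell ID to every point, one chunk at a time."""
    for chunk in iter_chunks(rows, chunk_size):
        yield [(cell_for(lat, lng), lat, lng) for lat, lng in chunk]


def grid_cell(lat: float, lng: float) -> str:
    # Placeholder indexer (assumption for this sketch, NOT H3):
    # floors each coordinate to a 1-degree grid cell.
    return f"{math.floor(lat)}_{math.floor(lng)}"


points = [(37.77, -122.42), (37.80, -122.27), (40.71, -74.01)]
batches = list(index_in_chunks(points, grid_cell, chunk_size=2))
for batch in batches:
    print(batch)
```

Because `rows` can be any iterable, the same loop works over a generator that reads records lazily from a file, and each yielded batch could be written out as its own Parquet part before the next one is read.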
