
Week #1 #

Project description #

Project name: Autonomous Semantic 3D Reconstruction and High-Fidelity Mapping #

Code repository: https://github.com/Mousatat/Autonomous-Drone-Semantic-3D-Reconstruction-and-High-Fidelity-Mapping

An autonomous UAV-based system designed for semantic 3D spatial reconstruction and high-fidelity mapping. The project integrates real-time semantic segmentation via SpatialLM, autonomous UAV navigation using agentic AI, and precise exploration strategies to efficiently generate detailed digital twins.


Problem Statement #

Existing autonomous 3D mapping solutions face limitations in:

  • Real-time semantic understanding of environments.

  • Autonomous optimal exploration decision-making.

  • Efficient data processing for high-quality spatial reconstruction.

Our system addresses these challenges through advanced semantic models, autonomous UAV navigation powered by agentic AI, and two-stage reconstruction techniques for superior spatial accuracy and usability.


Team Members #

| Team Member | Telegram Alias | Email Address | Track | Responsibilities |
| --- | --- | --- | --- | --- |
| Mahmoud Mousatat | @Mousatat | m.mousatat@innopolis.university | Computer Vision | Semantic understanding from visual data |
| Nikita Sergeev | @nary_2 | n.sergeev@innopolis.university | AI & Backend | Developing agentic AI logic |
| Alexander Rozanov | @alrozanov | al.rozanov@innopolis.university | DevOps | Integrating AI solutions with cloud environments |
| Ilvina Akhmetzianova | @IviUnicorn | i.akhmetzianova@innopolis.university | Simulation & Frontend | Building simulation environments and interfaces |

Brainstorming #

Ideas during brainstorming #

  1. Autonomous Semantic 3D Mapping (Chosen idea) — UAV system using semantic segmentation, real-time navigation, and agentic AI for 3D mapping and digital twin creation.

Basic requirements #

Target users and their primary needs #

  • Construction and Architecture Firms: Require accurate digital twins for BIM and project monitoring.

  • Disaster Response Teams: Need rapid and accurate maps for damage assessment and planning.

  • Urban Planning and Infrastructure Management: Require updated spatial data for urban management.


Initial scope (MVP) #

Included:

  • Real-time semantic spatial reconstruction using SLAM and SpatialLM.

  • Autonomous navigation planning and exploration via agentic AI.

  • Simple game-server communication setup for simulation integration.

Excluded (future iterations):

  • Detailed photogrammetric reconstruction using traditional methods.

  • Neural rendering for interactive visualizations.

  • Fully scaled cloud solutions and multi-UAV coordination.
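
To make the included MVP scope above more concrete, here is a minimal Python sketch of how the pieces could be wired together in a single loop. Every name in it (`MapState`, `slam_update`, `label_points`, `choose_next_waypoint`) is a hypothetical placeholder for the SLAM, SpatialLM, and agentic-AI components, not an actual API of any of those projects.

```python
"""Hypothetical MVP loop sketch: SLAM -> semantic labeling -> exploration decision.

All names below are illustrative placeholders, not real APIs of SLAM3R,
SpatialLM, or any LLM SDK.
"""
from dataclasses import dataclass, field


@dataclass
class MapState:
    """Accumulated reconstruction state shared between the components."""
    points: list = field(default_factory=list)   # (x, y, z) points from SLAM
    labels: dict = field(default_factory=dict)   # point index -> semantic label
    visited: set = field(default_factory=set)    # waypoints already commanded


def slam_update(state: MapState, frame: dict) -> None:
    """Placeholder for a visual-inertial SLAM update (e.g. a SLAM3R wrapper)."""
    state.points.extend(frame.get("points", []))


def label_points(state: MapState) -> None:
    """Placeholder for SpatialLM-style semantic labeling of newly added points."""
    for i in range(len(state.points)):
        state.labels.setdefault(i, "unknown")


def choose_next_waypoint(state: MapState, candidates: list) -> tuple:
    """Placeholder for the agentic exploration decision (LLM-backed later)."""
    unvisited = [c for c in candidates if c not in state.visited]
    return unvisited[0] if unvisited else candidates[0]


def run_mvp_loop(frames: list, candidates: list) -> MapState:
    """One pass over incoming frames: map, label, decide, (pretend to) fly."""
    state = MapState()
    for frame in frames:
        slam_update(state, frame)
        label_points(state)
        waypoint = choose_next_waypoint(state, candidates)
        state.visited.add(waypoint)  # real system: send the command to the simulator
    return state


if __name__ == "__main__":
    demo_frames = [{"points": [(0.0, 0.0, 0.0)]}, {"points": [(1.0, 0.0, 0.0)]}]
    demo_waypoints = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0)]
    print(run_mvp_loop(demo_frames, demo_waypoints).visited)
```

In the real system each placeholder would be replaced by the corresponding component from the tech stack below; the value of the sketch is only the data flow between them.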


Tech-stack #

Frontend / Visualization #

  • PlayCanvas: Chosen as the initial game engine for simulation and frontend interactions.

Backend / AI Models #

  • SpatialLM: Semantic segmentation.

  • Visual-Inertial SLAM: SLAM3R or similar solutions.

  • Claude 4 or Gemini 2.5 Pro: Agentic AI path planning and decision-making.
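
As a rough illustration of the agentic planning step, the sketch below asks an LLM to pick the next waypoint from a set of candidates. The prompt wording, the JSON reply format, and the `query_llm` stub are assumptions for illustration only; the real integration would go through the Claude or Gemini SDK (or our cloud proxy) with a prompt we have not finalized yet.

```python
"""Hypothetical sketch of LLM-driven waypoint selection for exploration.

`query_llm` is a stub standing in for a Claude / Gemini call (via SDK or proxy);
the prompt wording and JSON reply format are assumptions.
"""
import json


def build_prompt(coverage: float, candidates: list) -> str:
    """Summarize the map state and candidate waypoints for the planner LLM."""
    lines = [
        f"Explored coverage: {coverage:.0%}.",
        "Candidate waypoints (index: x, y, expected information gain):",
    ]
    for i, (x, y, gain) in enumerate(candidates):
        lines.append(f"{i}: ({x}, {y}), gain={gain:.2f}")
    lines.append('Reply with JSON: {"waypoint_index": <int>, "reason": "<short>"}')
    return "\n".join(lines)


def query_llm(prompt: str) -> str:
    """Stub: returns a canned answer instead of calling a real model."""
    return '{"waypoint_index": 1, "reason": "highest expected gain"}'


def select_waypoint(coverage: float, candidates: list) -> tuple:
    """Ask the (stubbed) LLM for a waypoint index, falling back to greedy gain."""
    raw = query_llm(build_prompt(coverage, candidates))
    try:
        idx = int(json.loads(raw)["waypoint_index"])
        if not 0 <= idx < len(candidates):
            raise ValueError("index out of range")
    except (ValueError, KeyError, TypeError, json.JSONDecodeError):
        idx = max(range(len(candidates)), key=lambda i: candidates[i][2])
    return candidates[idx]


if __name__ == "__main__":
    cands = [(0.0, 0.0, 0.1), (5.0, 2.0, 0.8), (3.0, 7.0, 0.4)]
    print(select_waypoint(0.35, cands))
```

The greedy fallback keeps the drone moving even if the model returns malformed output, which is a design choice we expect to revisit once the real prompt and parser are in place.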

Cloud Infrastructure #

We are utilizing a Virtual Dedicated Server (VDS) to host our cloud infrastructure, specifically deploying an Nginx server to facilitate efficient and secure communication between the LLM and the agentic AI components.
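
For context, here is a minimal sketch of what the agent-side call through that Nginx proxy might look like; the endpoint URL, path, and payload shape are placeholders, not our deployed configuration.

```python
"""Hypothetical agent-side call through the Nginx reverse proxy on the VDS.

The URL, path, and JSON payload shape below are placeholders for illustration;
they do not describe the deployed configuration.
"""
import requests

PROXY_URL = "https://example-vds.invalid/llm/plan"  # placeholder endpoint behind Nginx


def request_plan(map_summary: str, timeout_s: float = 30.0) -> dict:
    """POST the current map summary and return the planner's JSON response."""
    response = requests.post(
        PROXY_URL,
        json={"map_summary": map_summary},
        timeout=timeout_s,
    )
    response.raise_for_status()  # surface 4xx/5xx from the proxy or the backend
    return response.json()


if __name__ == "__main__":
    try:
        print(request_plan("coverage=40%, 3 candidate waypoints"))
    except requests.RequestException as exc:
        print(f"planner request failed: {exc}")
```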


Weekly commitments #

Individual contribution of each participant: #

  • Mahmoud Mousatat:

    • Investigated suitable vision models for semantic understanding.

  • Nikita Sergeev:

    • Explored agentic frameworks for semantic vision integration and spatial data processing.

  • Alexander Rozanov:

    • Prepared the initial weekly report.

    • Explored integration solutions for agentic AI and cloud infrastructure.

  • Ilvina Akhmetzianova:

    • Evaluated suitable game engines, initially selecting PlayCanvas for simulation.

Confirmation of the code’s operability #

We confirm that the code in the main branch:

  • Is in working condition.