🚀 Non-Standard Reinforcement Learning for Prioritized Multi-Objective Problems: lunarlander

Authors: Daniel Namaki, Niccolò Settimelli
Course: Symbolic and Evolutionary Artificial Intelligence
Academic Year: 2024/2025 – University of Pisa


🧠 Project Overview

This project investigates non-standard reinforcement learning (RL) methods that leverage lexicographic reward prioritization on the classic LunarLander-v2 environment. Instead of a single scalar reward, our agents optimize a vector reward with strict priorities:

  1. ✅ Survival (avoid crashing)
  2. 🎯 Landing quality (upright, centered touchdown)
  3. ⛽ Fuel efficiency
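With strict priorities, action selection can be done lexicographically: actions are first filtered by the highest-priority objective, and only near-ties are broken by the next one. A minimal sketch of this idea (the function name, slack tolerance, and per-objective Q arrays are illustrative assumptions, not the project's exact implementation):

```python
import numpy as np

def lex_action(q_values, slack=0.1):
    """Lexicographic action selection sketch.

    q_values: list of per-objective Q-value arrays over actions, ordered by
    priority (survival, landing quality, fuel efficiency). At each level,
    keep only actions within `slack` of the best Q-value, then break the
    remaining ties with the next objective.
    """
    candidates = np.arange(len(q_values[0]))
    for q in q_values:
        best = q[candidates].max()
        # keep actions whose Q-value is near-optimal at this priority level
        candidates = candidates[q[candidates] >= best - slack]
        if len(candidates) == 1:
            break
    return int(candidates[0])
```

For example, an action that is slightly worse for survival but much better for landing quality can still be chosen, as long as it stays within the survival slack.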

We implement and compare:

  • Potential-Based Survival Shaping
  • Cone-Aware Survival Shaping
  • Curriculum Learning with Prioritized Replay
  • Standard DQN Baselines
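Potential-based shaping adds a term F(s, s') = γΦ(s') − Φ(s) to the environment reward, which is known to leave the optimal policy unchanged (Ng, Harada & Russell, 1999). A hedged sketch of such a survival-oriented shaping term (the specific potential Φ below, penalizing distance from the pad and tilt, is an illustrative assumption; the project's actual Φ lives in `v_potential_shaping/`):

```python
import math

GAMMA = 0.99  # illustrative discount factor

def potential(x, y, angle):
    """Toy potential: higher when the lander is near the pad and upright."""
    return -math.hypot(x, y) - abs(angle)

def shaped_bonus(s, s_next, gamma=GAMMA):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s),
    added to the environment reward during training."""
    return gamma * potential(*s_next) - potential(*s)
```

Because the shaping term telescopes along any trajectory, it accelerates learning of the survival objective without altering which policy is optimal.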

🗂 Repository Structure

2025_SEAI_F01/
├── models/                             # Saved model checkpoints
├── networks/                           # LexQNetwork & standard Q-network code
├── v_cone/                             # Cone-aware shaping agent
├── v_potential_shaping/                # Potential-based shaping agent
├── v_prioritized_curriculum_learning/  # Curriculum + prioritized replay agent
├── v_standard/                         # Standard & prioritized DQN agents
├── requirements.txt                    # Python dependencies
├── doc_seai_f01.pdf                    # Full project report
└── README.md                           # This overview
