# RL-Library

A comprehensive, modular C++ library for implementing and experimenting with Reinforcement Learning (RL) algorithms. Built with clean architecture principles, it separates core mathematics, RL algorithms, and simulation environments.
## Project Structure

```
RL-Library/
├── Core/
│   ├── MathUtils.h        # Mathematical foundations (Unit 1)
│   └── MathUtils.cpp      # Implementation
├── Algorithms/
│   ├── QLearning.h        # Q-Learning algorithm
│   ├── QLearning.cpp      # Q-Learning implementation
│   ├── SARSA.h            # SARSA algorithm
│   └── SARSA.cpp          # SARSA implementation
├── Env/
│   ├── GridWorld.h        # GridWorld environment
│   └── GridWorld.cpp      # GridWorld implementation
├── RL_Lib.h               # Main library header
├── main.cpp               # Driver/example code
├── CMakeLists.txt         # Build configuration
└── README.md              # This file
```
## Core Math (MathUtils)

Implements foundational mathematical operations for machine learning:

- **Activation Functions**
  - Sigmoid: σ(x) = 1 / (1 + e^(-x))
  - Tanh: tanh(x)
  - ReLU: max(0, x)
- **Evaluation Metrics**
  - Mean Squared Error (MSE): Σ(target - pred)² / n
  - Mean Absolute Error (MAE): Σ|target - pred| / n
- **Vector Operations**
  - Dot product
  - Vector addition
  - Scalar multiplication
- **Matrix Operations**
  - Matrix multiplication
  - Matrix transpose
  - Matrix-vector multiplication
## Q-Learning

An off-policy temporal-difference algorithm that learns the optimal action-value function.

**Update Rule:**

```
Q(s,a) ← Q(s,a) + α[r + γ·max_a' Q(s',a') - Q(s,a)]
```

**Key Parameters:**

- α (alpha): Learning rate (0.1)
- γ (gamma): Discount factor (0.99)
- ε (epsilon): Exploration rate (0.1), used by the epsilon-greedy strategy

**Features:**

- Epsilon-greedy action selection
- Q-table management
- Automatic state initialization
## SARSA

An on-policy temporal-difference algorithm that learns the value of the policy being followed.

**Update Rule:**

```
Q(s,a) ← Q(s,a) + α[r + γ·Q(s',a') - Q(s,a)]
```

**Key Differences from Q-Learning:**

- Uses the actual next action taken (a') instead of the greedy maximum
- More conservative (on-policy) learning
## GridWorld Environment

A simple grid-based environment for RL agents to navigate.

**Features:**

- Configurable grid dimensions (default: 5×5)
- Agent starts at (0, 0)
- Customizable goal position
- 4 actions: UP, DOWN, LEFT, RIGHT
- Reward: +1.0 for reaching the goal, -0.01 per step
- Boundary constraints (the agent cannot leave the grid)

**State Representation:**

Linear state index: `state = y * width + x`
## Requirements

- C++17 or higher
- CMake 3.10+
- A standard C++ compiler (g++, clang, MSVC)
## Building and Running

```bash
# Clone the repository
git clone https://github.com/manjushwarkhairkar/ML-Library.git
cd ML-Library

# Create build directory
mkdir build
cd build

# Configure and build
cmake ..
make

# Run the program
./rl_main
```

Expected output:

```
========================================
RL Library v1.0.0
========================================
[1] Initializing GridWorld Environment...
Grid Size: 5x5
Total States: 25
Goal Position: (4, 4)
[2] Initializing Q-Learning Agent...
Learning Rate (alpha): 0.1
Discount Factor (gamma): 0.99
Exploration Rate (epsilon): 0.1
[3] Starting Training...
Episode 10 | Reward: -0.38
Episode 20 | Reward: -0.23
...
Episode 100 | Reward: 0.89
[4] Training Complete!
========================================
```
## Usage Examples

**Q-Learning on GridWorld:**

```cpp
#include "RL_Lib.h"

using namespace RLLib;

int main() {
    // Initialize environment
    GridWorld env(5, 5, 4, 4);

    // Initialize agent
    QLearner agent(0.1, 0.99, 0.1);

    // Training loop
    for (int episode = 0; episode < 100; ++episode) {
        env.reset();
        int state = env.getState();
        for (int step = 0; step < 50; ++step) {
            int action = agent.chooseAction(state, GridWorld::NUM_ACTIONS);
            double reward = env.step(action);
            int next_state = env.getState();
            agent.update(state, action, reward, next_state, GridWorld::NUM_ACTIONS);
            state = next_state;
            if (env.isTerminal()) break;
        }
    }
    return 0;
}
```

**Core math utilities:**

```cpp
#include "Core/MathUtils.h"

using namespace RLLib;

int main() {
    // Vector operations
    std::vector<double> a = {1.0, 2.0, 3.0};
    std::vector<double> b = {4.0, 5.0, 6.0};
    double dot = dot_product(a, b);
    auto sum = vector_add(a, b);

    // Activation functions
    double sig = sigmoid(0.5);
    double act = relu(-2.0);

    // Matrix operations
    Matrix m1(3, 2);
    Matrix m2(2, 3);
    Matrix result = matrix_multiply(m1, m2);
    return 0;
}
```

## API Reference

**QLearner:**

```cpp
// Constructor
QLearner(double alpha = 0.1, double gamma = 0.9, double epsilon = 0.1);

// Initialize state in Q-table
void initializeState(int state, int numActions);

// Select action using epsilon-greedy strategy
int chooseAction(int state, int numActions);

// Update Q-value using the Q-Learning rule
void update(int state, int action, double reward, int nextState, int numActions);

// Get Q-value for state-action pair
double getQValue(int state, int action);

// Get maximum Q-value for a state
double getMaxQValue(int state);
```

**GridWorld:**

```cpp
// Constructor
GridWorld(int width = 5, int height = 5, int goalX = 4, int goalY = 4);

// Get current state index
int getState() const;

// Reset environment
void reset();

// Execute action, return reward
double step(int action);

// Check if at goal
bool isTerminal() const;

// Get positions
std::pair<int, int> getAgentPosition() const;
std::pair<int, int> getGoalPosition() const;
```

**MathUtils:**

```cpp
// Activation functions
double sigmoid(double x);
double tanh_activation(double x);
double relu(double x);

// Evaluation metrics
double calculate_mse(const std::vector<double>& target, const std::vector<double>& pred);
double calculate_mae(const std::vector<double>& target, const std::vector<double>& pred);

// Vector operations
double dot_product(const std::vector<double>& a, const std::vector<double>& b);
std::vector<double> vector_add(const std::vector<double>& a, const std::vector<double>& b);
std::vector<double> scalar_multiply(const std::vector<double>& a, double scalar);

// Matrix operations
Matrix matrix_multiply(const Matrix& a, const Matrix& b);
Matrix matrix_transpose(const Matrix& a);
std::vector<double> matrix_vector_multiply(const Matrix& mat, const std::vector<double>& vec);
```

## Performance

- Time Complexity: O(1) average per Q-Learning update
- Space Complexity: O(n·m) for n states and m actions
- Scalability: Efficient for tabular methods; consider function approximation for larger state spaces
## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Open a Pull Request
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## References

- Sutton & Barto, *Reinforcement Learning: An Introduction* (2nd Edition)
- Bellman equation for MDPs
- Temporal-difference learning
## Roadmap

- Policy Gradient Methods (REINFORCE, Actor-Critic)
- Deep Q-Networks (DQN)
- Experience Replay and Target Networks
- Multiple environments (CartPole, Mountain Car)
- Visualization tools
- GPU acceleration support
- Python bindings
---

**Version:** 1.0.0
**Last Updated:** 2024
**Maintainer:** Manjushwar Khairkar