Solving Brax Problems in EvoX#

EvoX dives deep into neuroevolution with Brax. Here we will show an example of solving a Brax problem in EvoX.

# Install EvoX and Brax; skip this if they are already installed
from importlib.util import find_spec
from IPython.display import HTML

if find_spec("evox") is None:
    %pip install evox
if find_spec("brax") is None:
    %pip install brax
# Import the packages and functions used in this example
import torch
import torch.nn as nn

from evox.algorithms import PSO
from evox.problems.neuroevolution.brax import BraxProblem
from evox.utils import ParamsAndVector
from evox.workflows import EvalMonitor, StdWorkflow

Use EvoX to solve Neuroevolution Tasks#

Neuroevolution is an optimization method that combines neural networks with evolutionary algorithms, evolving network architectures and weights by simulating natural selection and genetic mechanisms. It addresses complex problems such as game AI, robotic control, and more.
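To make the idea concrete, below is a minimal, self-contained toy sketch of a neuroevolution-style loop (mutation-only selection with a stand-in fitness function); it is purely illustrative and not how EvoX implements things.

import torch

# Toy neuroevolution-style loop (illustrative only, not EvoX's implementation).
# Each row of `population` stands for one flattened set of network weights.
dim, pop_size = 10, 32
population = torch.randn(pop_size, dim)
for generation in range(100):
    # Stand-in fitness; in practice this would be the episode reward of a policy
    fitness = -(population**2).sum(dim=1)
    elite = population[fitness.topk(pop_size // 4).indices]  # selection
    # Mutation: copy the elites and perturb them with Gaussian noise
    population = elite.repeat(4, 1) + 0.1 * torch.randn(pop_size, dim)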

Our neuroevolution example depends on Brax, so it is recommended to install it if you want to replicate this example.

What is Brax#

Brax is a fast and fully differentiable physics engine used for research and development of robotics, human perception, materials science, reinforcement learning, and other simulation-heavy applications.

Here we will demonstrate the “swimmer” environment of Brax. For more information, you can browse the Brax repository on GitHub.
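Independent of EvoX, a minimal Brax rollout sketch might look like the following; the entry points below assume the brax.envs API, which varies across Brax versions.

import jax
from brax import envs

# Minimal Brax rollout sketch (assumes the brax.envs API; versions differ)
env = envs.create(env_name="swimmer")
state = env.reset(rng=jax.random.PRNGKey(0))
for _ in range(10):
    action = jax.numpy.zeros(env.action_size)  # dummy zero action
    state = env.step(state, action)
print(state.reward)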

Design a neural network class#

To start with, we need to decide what kind of neural network to construct.

Here we will give a simple Multilayer Perceptron (MLP) class.

# Construct an MLP using PyTorch.
# It has two linear layers and maps an 8-dim observation to a 2-dim action.


class SimpleMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(8, 4), nn.Tanh(), nn.Linear(4, 2))

    def forward(self, x):
        x = self.features(x)
        # Squash the output into [-1, 1] to match the swimmer's action space
        return torch.tanh(x)

Initiate a model#

Through the SimpleMLP class, we can initiate an MLP model.

# Make sure that the model is on the right device, preferably the GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# Set the random seed for reproducibility
seed = 1234
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

# Initialize the MLP model
model = SimpleMLP().to(device)
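As a quick sanity check (optional, not required by the tutorial), a single forward pass confirms that the network maps an 8-dimensional observation to a 2-dimensional action in [-1, 1]:

# Optional sanity check: one forward pass through the policy network
dummy_obs = torch.randn(8, device=device)
action = model(dummy_obs)
print(action.shape)  # torch.Size([2]), each component in [-1, 1]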

Initiate an adapter#

An adapter can help us convert the model parameters back and forth.

  • to_vector can convert a parameters dictionary to a vector.

  • to_params can convert a vector back to a parameters dictionary.

There are also batched versions of these conversions.

  • batched_to_vector can convert a batched parameters dictionary to a batch of vectors.

  • batched_to_params can convert a batch of vectors back to a batched parameters dictionary.

adapter = ParamsAndVector(dummy_model=model)

With the adapter in place, we can set out on this neuroevolution task.
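As a quick illustration of the round trip described above:

# Round-trip check: parameters dictionary -> flat vector -> dictionary
params = dict(model.named_parameters())
vector = adapter.to_vector(params)
restored = adapter.to_params(vector)
print(vector.shape)  # one flat vector holding all 46 weights and biases of SimpleMLP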

Set up the running process#

Initiate an algorithm and a problem#

We initiate a PSO algorithm, and the problem is a Brax problem in the “swimmer” environment.

# Set the population size
POP_SIZE = 1024

# Get the bounds for the PSO algorithm
model_params = dict(model.named_parameters())
pop_center = adapter.to_vector(model_params)
lower_bound = torch.full_like(pop_center, -5)
upper_bound = torch.full_like(pop_center, 5)

# Initialize PSO; you can also use any other algorithm
algorithm = PSO(
    pop_size=POP_SIZE,
    lb=lower_bound,
    ub=upper_bound,
    device=device,
)
algorithm.setup()
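For reference, PSO moves each particle according to the classic velocity and position update. A generic one-step sketch (with illustrative coefficients, not EvoX's internals) looks like this:

# One generic PSO update step (illustrative; EvoX's internals may differ)
w, c1, c2 = 0.6, 2.5, 0.8  # hypothetical inertia / attraction coefficients
x = torch.randn(4, 3)  # positions of 4 particles in a 3-dim search space
v = torch.zeros_like(x)
pbest, gbest = x.clone(), x[0]  # personal bests and global best
r1, r2 = torch.rand_like(x), torch.rand_like(x)
v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
x = x + v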

# Initialize the Brax problem
problem = BraxProblem(
    policy=model,
    env_name="swimmer",
    max_episode_length=1000,
    num_episodes=3,
    pop_size=POP_SIZE,
    device=device,
)

In this case, each episode runs for at most 1000 steps, and the average reward over 3 episodes is returned as the fitness value.
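Conceptually, the fitness of a single candidate is computed as sketched below; in practice BraxProblem evaluates the whole population in parallel on the device rather than looping like this.

# Conceptual per-candidate evaluation (BraxProblem vectorizes this across the population):
# total = 0.0
# for episode in range(3):                   # num_episodes
#     obs = env.reset()
#     for step in range(1000):               # max_episode_length
#         action = policy(obs)
#         obs, reward, done = env.step(action)
#         total += reward
#         if done:
#             break
# fitness = total / 3                        # average reward over the episodes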

Set a monitor#

# Set a monitor that records the top 3 fitness values
monitor = EvalMonitor(
    topk=3,
    device=device,
)
monitor.setup()

Initiate a workflow#

# Initiate a workflow
workflow = StdWorkflow(opt_direction="max")
workflow.setup(
    algorithm=algorithm,
    problem=problem,
    solution_transform=adapter,
    monitor=monitor,
    device=device,
)

Run the workflow#

Run the workflow and see the magic!

Note

The following block will take around 20 minutes to run. The time may vary depending on your hardware.

# Set the maximum number of generations
max_generation = 50

# Run the workflow
for i in range(max_generation):
    if i % 10 == 0:
        print(f"Generation {i}")
    workflow.step()

monitor = workflow.get_submodule("monitor")
print(f"Top fitness: {monitor.get_best_fitness()}")
best_params = adapter.to_params(monitor.get_best_solution())
print(f"Best params: {best_params}")
Generation 0
Generation 10
Generation 20
Generation 30
Generation 40
Top fitness: 369.4692077636719
Best params: {'features.0.weight': tensor([[ 2.3992, -1.8511,  4.8109, -4.4597, -1.0910,  1.4677, -4.9631,  5.0000],
        [-5.0000, -2.5050,  2.4442, -3.0992, -0.8043,  3.4015, -5.0000, -4.4697],
        [-4.9733,  3.3274,  2.6283, -1.8122, -4.9979, -5.0000, -4.2314, -1.4714],
        [-4.0897,  5.0000, -5.0000,  4.6735,  5.0000, -5.0000,  5.0000,  1.5789]],
       device='cuda:0'), 'features.0.bias': tensor([ 2.6846, -4.9499,  3.2993,  3.6577], device='cuda:0'), 'features.2.weight': tensor([[-0.5519,  5.0000, -3.8752,  5.0000],
        [-3.2433, -4.9600, -0.1063,  2.1125]], device='cuda:0'), 'features.2.bias': tensor([ 1.4154, -0.5969], device='cuda:0')}
monitor.get_best_fitness()
tensor(369.4692, device='cuda:0')
monitor.plot()
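Finally, since to_params returns a dictionary keyed like model.named_parameters(), the evolved weights can be loaded straight back into the PyTorch model for deployment (a small sketch, assuming the keys line up as shown in the output above):

# Load the evolved parameters back into the policy network
model.load_state_dict(best_params)
model.eval()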