
Player based on evolutionary algorithm, with average score of around 7.8 ± 0.89 #5

Merged
rougier merged 1 commit into rougier:master from tjayada:master
Aug 19, 2025

Conversation


@tjayada (Contributor) commented Jul 31, 2025

I have tried many different approaches and hyperparameters to end up handing in a rather simple model ;)

So the algorithm to find the optimal parameters / weights in Wout is quite generic and follows the standard setup:

  1. Randomly generate the initial population of individuals, the first generation.
  2. Evaluate the fitness of each individual in the population.
  3. Check whether the goal is reached and the algorithm can be terminated.
  4. Select individuals as parents, preferably of higher fitness.
  5. Produce offspring with optional crossover (mimicking reproduction).
  6. Apply mutation operations on the offspring.
  7. Select individuals preferably of lower fitness for replacement with new individuals (mimicking natural selection).
  8. Return to 2.

Source: https://en.wikipedia.org/wiki/Evolutionary_algorithm
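The eight steps above can be sketched as a minimal loop. All sizes, rates, and the toy fitness function below are illustrative placeholders, not the values used in the entry:

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, pop_size=24, n_params=1000, n_gens=50, goal=10.0):
    # 1. Randomly generate the initial population.
    pop = rng.uniform(-1, 1, (pop_size, n_params))
    for gen in range(n_gens):
        # 2. Evaluate the fitness of each individual.
        scores = np.array([fitness(ind) for ind in pop])
        # 3. Terminate if the goal is reached.
        if scores.max() >= goal:
            break
        # 4. Select parents, preferring higher fitness.
        order = np.argsort(scores)[::-1]
        parents = pop[order[:pop_size // 2]]
        # 5./6. Produce offspring via uniform crossover, then mutate.
        offspring = np.empty((pop_size // 2, n_params))
        for i in range(pop_size // 2):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(n_params) < 0.5
            offspring[i] = np.where(mask, a, b) + rng.normal(0, 0.1, n_params)
        # 7. Replace the lowest-fitness half with the offspring, then loop (8).
        pop[order[pop_size // 2:]] = offspring
    return pop[np.argmax([fitness(ind) for ind in pop])]
```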

So I begin by generating min(cpu_count(), 8) * 3 individual Wout matrices with np.random.uniform(-1, 1, (1, 1000)), keep the best-performing individuals, and then apply crossover (taking weights from both "parents") and mutation (adding "variation" with np.random.normal) to create the next generation.
This is pretty simple and thus only performs well around half the time.
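In code, that generation step might look roughly like this; the 50/50 crossover mask and the mutation scale sigma are my own illustrative choices, not necessarily the values used in the entry:

```python
import numpy as np
from multiprocessing import cpu_count

pop_size = min(cpu_count(), 8) * 3
shape = (1, 1000)                      # shape of Wout

population = [np.random.uniform(-1, 1, shape) for _ in range(pop_size)]

def crossover(parent_a, parent_b):
    """Take each weight from one of the two parents at random."""
    mask = np.random.random(shape) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(child, sigma=0.05):
    """Add small Gaussian "variation" to the offspring's weights."""
    return child + np.random.normal(0.0, sigma, shape)

child = mutate(crossover(population[0], population[1]))
```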

To speed things up I used multiprocessing, and for the evaluation of the population I'm using a handcrafted reward function, which discourages hitting the wall and encourages "movement": visiting many locations and both energy sources, plus gaining energy in general, especially when running low on energy.
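A sketch of the parallel evaluation, where evaluate stands in for the actual run-the-bot-and-score reward function (the real reward lives in the submitted code, so the scoring below is a placeholder):

```python
import numpy as np
from multiprocessing import Pool, cpu_count

def evaluate(wout):
    # Stand-in for simulating the bot with these weights and computing
    # the handcrafted reward; here we simply score closeness to zero.
    return -float(np.abs(wout).mean())

if __name__ == "__main__":
    population = [np.random.uniform(-1, 1, (1, 1000))
                  for _ in range(min(cpu_count(), 8) * 3)]
    # Evaluate the whole population in parallel, one worker per core.
    with Pool(min(cpu_count(), 8)) as pool:
        scores = pool.map(evaluate, population)
    best = population[int(np.argmax(scores))]
```

Note that `evaluate` must be a module-level function so it can be pickled and shipped to the worker processes.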

Most parameters are chosen reasonably, e.g. leak = 0.8, spectral_radius = 0.95 and density = 0.1, but if any questions arise about any specific choice, I'm happy to answer them.
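For context, those three parameters typically enter an echo-state-style reservoir update roughly as follows; this is a generic sketch, not the repository's actual code:

```python
import numpy as np

n = 1000
leak, spectral_radius, density = 0.8, 0.95, 0.1

rng = np.random.default_rng(42)
# Sparse random recurrent weights: keep roughly `density` of the connections.
W = rng.uniform(-1, 1, (n, n)) * (rng.random((n, n)) < density)
# Rescale so the largest eigenvalue magnitude equals spectral_radius.
W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))

def step(x, u, Win):
    """Leaky-integrator reservoir state update with leak rate `leak`."""
    return (1 - leak) * x + leak * np.tanh(W @ x + Win @ u)
```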

I tested some random seeds and got an average of 7.8 ± 0.89, with the highest being 13.32 ± 0.37 and the lowest 1.53 ± 0.19. Because of the short training time and the not-so-efficient optimisation algorithm, the random seed has a big influence on the final performance, so it's a bit like playing roulette, and I'm curious to see which seed you will pick for evaluation ;)
Anyway, I think this can be a good baseline for other people to beat. I'm looking forward to seeing other solutions, and thanks for the fun challenge!


@rougier (Owner) commented Aug 19, 2025

Many thanks for your entry. I'll merge it and evaluate it using a new random seed. From your description of the performances, I fear the bot might have learned to turn in a specific direction, which might explain why you oscillate between very poor and very good performances. Did you have a chance to observe its behavior?

Note also that since you've made this first entry, if you found interesting structures inside your model, you can freeze these structures and re-use them for a new entry.

@rougier rougier merged commit 2db5144 into rougier:master Aug 19, 2025

@rougier (Owner) commented Aug 19, 2025

I ran with debug and the bot is circling the outer loop, which is a good strategy even though it is not optimal. On my machine, training lasts 2.36 seconds, so there is still plenty of time to improve the behavior.
