## hnefatafl and the quest for balance

### Re: hnefatafl and the quest for balance

Ah, the pitfalls of 'confirmation bias'. Beware one and all.

I had always hoped that our rigorous testing of variants would help us to zero in on the ultimate tafl rule set, to make tournaments and clubs easier to arrange. However, I now think that hope, while laudable, was rather ill-informed. We have found the exact opposite thus far. Our testing is instead revealing the various merits of the many variants, all of which have their own peculiarities and charm. We have found some that are, on the face of it, poorly balanced, but this is so dependent on the strength and experience of the players that it becomes a matter of handicapping. Perpetual board positions are perhaps the only thing some of us would like to see the back of, and now that most variants are set to forbid them on the site, let's see if we miss them.

Through this research we have in fact generated new variants, while maintaining the old. Personally I am surprised by this, but I think it's good to be surprised by results!

My feeling now is that, thanks to Aage's excellent site and all our dedication, the gaming statistics will start to speak for themselves. Lesser variants will naturally fall into disuse, while the ones that stand the test of time will have done so simply because they work. And that removes the problems of confirmation bias. Hooray for science :)

### Re: hnefatafl and the quest for balance

Game balance.

A measure for game balance could be:

(number of most winning color's wins) divided by (number of least winning color's wins),

the number positive if white is more winning, negative if black is more winning.

Draws are counted as half wins.

Examples:

White wins 20 times and black wins 10 times. Balance = +20/10 = +2 (white is twice as likely to win as black).

White wins 20 times and black wins 30 times. Balance = -30/20 = -1.5 (black is 1.5 times as likely to win as white).

White wins 10 times and black wins 10 times, and 20 draws. Balance = 20/20 = 1 (perfect balance).
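The measure above can be written as a short function. This is a minimal sketch in Python; the name `game_balance` and its arguments are illustrative and do not come from the site's code. A colour with no wins at all is rejected, because the ratio is undefined in that case.

```python
def game_balance(white_wins, black_wins, draws=0):
    """Signed win ratio: positive if white is ahead, negative if black is."""
    # Draws count as half a win for each colour.
    white = white_wins + 0.5 * draws
    black = black_wins + 0.5 * draws
    if white <= 0 or black <= 0:
        raise ValueError("balance is undefined when a colour has no wins")
    if white >= black:
        return white / black
    return -black / white


print(game_balance(20, 10))      # 2.0
print(game_balance(20, 30))      # -1.5
print(game_balance(10, 10, 20))  # 1.0
```

Note that the value is never between -1 and +1: a perfectly balanced game gives exactly 1.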

Calculated this way, the series of test tournaments resulted in these game balances:

+3.25 Hnefatafl 9x9

+1.55 History Craft Hnefatafl 11x11

+1.50 Foteviken Lapp Tablut

+1.47 Fetlar Hnefatafl 11x11

+1.32 Skalk Edge Hnefatafl 9x9

+1.21 Sea Battle Tafl 9x9

+1.13 Hnefatafl 11x11

-1.17 Skalk Edge Hnefatafl 11x11

-1.22 Berserk Hnefatafl

-1.60 Sea Battle Tafl 11x11

-2.25 Skalk Hnefatafl 9x9

-3.83 Skalk Hnefatafl 11x11

-5.09 Ashton Lapp Tablut

Since Fetlar Hnefatafl 11x11 (balance +1.47) is accepted as a balanced game, all variants with balances from -1.5 to +1.5 should be accepted as balanced games.

This leaves Skalk Hnefatafl 9x9 & 11x11, Hnefatafl 9x9 and Ashton Lapp Tablut as unbalanced variants.

### Re: hnefatafl and the quest for balance

I think this is a very good idea to calculate, but I see a problem with the data you put in: whether a game is balanced depends on the experience level of the players. For example, Fetlar Hnefatafl is quite hard to play as black for beginners, but my impression is that when both players play perfectly, black might even have slightly better chances of winning, so +1.62 does not seem generally correct to me.

Hagbard wrote:

A measure for game balance could be:

(number of most winning color's wins) divided by (number of least winning color's wins),

the number positive if white is more winning, negative if black is more winning.

Draws are counted as half wins.

Examples:

White wins 20 times and black wins 10 times. Balance = +20/10 = +2 (white has twice the chance of winning than black).

White wins 20 times and black wins 30 times. Balance = -30/20 = -1.5 (black has 1.5 times the chance of winning than white).

White wins 10 times and black wins 10 times, and 20 draws. Balance = 20/20 = 1 (perfect balance).

Calculated this way, the series of test tournaments resulted in these game balances:

+1.62 Fetlar Hnefatafl 11x11

+1.55 History Craft Hnefatafl 11x11

+1.50 Foteviken Lapp Tablut

+1.32 Skalk Edge Hnefatafl 9x9

+1.21 Sea Battle Tafl 9x9

-1.32 Skalk Edge Hnefatafl 11x11

-1.40 Berserk Hnefatafl

-1.60 Sea Battle Tafl 11x11

-2.25 Skalk Hnefatafl 9x9

-3.83 Skalk Hnefatafl 11x11

-4.83 Ashton Lapp Tablut

Since Fetlar Hnefatafl 11x11 (balance +1.62) is accepted as a balanced game, all variants with balances from -1.62 to +1.62 should be accepted as balanced games.

This leaves Skalk Hnefatafl 9x9 & 11x11 and Ashton Lapp Tablut as unbalanced variants.

My suggestion would be having a balance number for different experience levels.

### Re: Balanced 9x9 and 11x11 tafl variants

arne64 wrote: My suggestion would be having a balance number for different experience levels.

I think that is a very good idea. Aage has made some comments on the site and forum, based on the research tournaments, suggesting which variants are best for beginners and which are best for experienced players. A statistical table showing the evidence might be tricky to piece together, but it would convincingly prove once and for all how the variants play according to ability.

### Re: Balanced 9x9 and 11x11 tafl variants

arne64 wrote: My suggestion would be having a balance number for different experience levels.

Adam wrote: A statistical table showing the evidence might be tricky to piece together, but it would convincingly prove once and for all how the variants play according to ability.

The Fetlar variant is a good object of study, because we have long experience with it and a perception that it becomes easier for black as players gain experience.

Last summer, in August 2012, we held the Fetlar championship tournament here, with eight participants (excluding participants with many time-outs). If the tournament results are ordered by player strength and analyzed as partial tournaments among four players of adjacent strength (4 players x 4 players), we find:

players 1-4 (four strongest players):

7 white wins, 4 black wins, 1 draw. Balance +1.67

players 2-5:

7 white wins, 5 black wins. Balance +1.40

players 3-6:

6 white wins, 4 black wins, 2 time outs. Balance +1.50

players 4-7:

6 white wins, 4 black wins, 2 time outs. Balance +1.50

players 5-8:

5 white wins, 4 black wins, 1 draw, 2 time outs. Balance +1.22

The outcome of this analysis is somewhat unexpected: it indicates that in the Fetlar variant, white is favoured at all experience levels.
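As a cross-check, the balances quoted above follow from the stated counts when draws count as half a win for each side and timed-out games are left out. The following is a minimal Python sketch; the helper name and the data layout are illustrative, and only the counts given in this post are used.

```python
def game_balance(white_wins, black_wins, draws=0):
    """Signed win ratio with draws counted as half a win for each colour."""
    white = white_wins + 0.5 * draws
    black = black_wins + 0.5 * draws
    return white / black if white >= black else -black / white


# (white wins, black wins, draws) per window of four players, as listed above.
# Timed-out games are not counted.
windows = {
    "players 1-4": (7, 4, 1),
    "players 2-5": (7, 5, 0),
    "players 3-6": (6, 4, 0),
    "players 4-7": (6, 4, 0),
    "players 5-8": (5, 4, 1),
}

balances = {name: round(game_balance(*counts), 2) for name, counts in windows.items()}
for name, value in balances.items():
    print(f"{name}: {value:+.2f}")
```

Each window contains only 9 to 12 counted games, so the differences between the windows are small compared with what a single game result would change.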

### Re: Balanced 9x9 and 11x11 tafl variants

Interesting. If one takes a look at page one of this topic, the overall statistics, with a much bigger number of games, do seem to say the same thing. A general rule seems to be that white tends to be favoured on average, to a greater or lesser degree, in all variants and between all levels of players, but that the imbalance is less extreme, though still present, with experienced players. Copenhagen looks conspicuously more even, even in its old form.

How is Copenhagen doing statistically these days?

### Re: Balanced 9x9 and 11x11 tafl variants

Adam wrote: How is Copenhagen doing statistically these days?

Current overall statistics for Copenhagen Hnefatafl 11x11:

25 players played 176 games, with the result:

91 white wins, 80 black wins, 5 draws.

Balance: +1.13

### Re: hnefatafl and the quest for balance

Game rating.

I've had an interesting conversation with nath about the calculation method of game ratings.

The method used here is very straightforward: the Elo rating system known from chess, with an exponential constant of 400 (that is, a difference of 400 points in rating means odds of 10 to 1, so the weaker player is expected to score about 1/11 of the points), and a K-factor of 32 (that is, a single game can change a rating by at most 32 points). In addition, the aim is to keep the average rating at about 1500 at all times.
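The calculation described above can be sketched as follows. This is a minimal Python sketch of the standard Elo formulas with a scale of 400 and K = 32, not the site's actual code; the function and constant names are illustrative.

```python
K_FACTOR = 32  # maximum rating change from a single game
SCALE = 400    # a 400-point gap corresponds to odds of 10 to 1


def expected_score(rating, opponent_rating):
    """Expected score (win = 1, draw = 0.5, loss = 0) under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / SCALE))


def updated_rating(rating, opponent_rating, score):
    """New rating after one game; the change never exceeds K_FACTOR points."""
    return rating + K_FACTOR * (score - expected_score(rating, opponent_rating))


# A player rated 400 points below the opponent: expected score 1/11.
print(expected_score(1500, 1900))
# Two equally rated players: the winner gains half the K-factor.
print(updated_rating(1500, 1500, 1.0))
```

Nothing in these formulas depends on which colour a player has, which is the assumption discussed next.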

The theory behind the Elo system is simple: the probability of one or the other player winning is calculated solely from the two players' ratings. This assumes, however, that the game is perfectly balanced, as chess is, for example, where white and black are identical except that white has the first move and thereby a small advantage, which is ignored.

Not so with tafl. Tafl games are all asymmetrical games and therefore innately unbalanced.

The game balance of Sea Battle tafl 11x11, for example, is measured to be -1.60: black wins 16 times for every 10 white wins. A player who decided to play nothing but Sea Battle tafl 11x11 as black would be overrated by about 80 rating points.

It would be simple to bring the Elo system back on track, by incorporating the measured game balances along with the player ratings in the calculation of the probabilities of winning, and thereby finding more accurate rating changes for the tafl games.
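One possible way to do this is sketched below in Python, under stated assumptions: the measured balance is converted into a rating offset of `400 * log10(|balance|)` points, which is credited to the favoured colour before the usual Elo expectation is computed. The function names and this particular conversion are illustrative assumptions, not the method actually used on the site.

```python
import math

SCALE = 400  # same Elo scale as the ordinary rating calculation


def balance_offset(balance):
    """Rating points by which white is favoured (negative: black is favoured)."""
    # Assumption: a win ratio of r between the colours is treated like the
    # win ratio between two players whose ratings differ by 400 * log10(r).
    return math.copysign(SCALE * math.log10(abs(balance)), balance)


def expected_score_white(white_rating, black_rating, balance):
    """Elo expectation for white after crediting the colour advantage."""
    diff = white_rating + balance_offset(balance) - black_rating
    return 1.0 / (1.0 + 10 ** (-diff / SCALE))


# Sea Battle tafl 11x11, balance -1.60: black's edge in rating points.
print(round(balance_offset(-1.60)))
# Equally rated players: white's expected score is 10 wins out of 26 games.
print(expected_score_white(1500, 1500, -1.60))
```

With this conversion a balance of -1.60 corresponds to roughly 82 points in black's favour, which is in line with the figure of 80 points mentioned above.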

### Re: Balanced 9x9 and 11x11 tafl variants

An overview of the measured tafl game balances can be found here:

http://aagenielsen.dk/tafl_balances.php

### Re: Rating of tafl games

I think an important part is to measure different variants differently. Different variants have different draw chances etc.

I don't think that variants like Skalk Hnefatafl should be rated. Maybe we should just keep separate ratings for each variant and display them in a different menu, but display only the rating from a main variant on the main page.

Also, variants with a lot of draws, like 'old' Hnefatafl, make the distance between players look much smaller than it does in a draw-less variant like Copenhagen Hnefatafl.
