The History class

We use the History class to compute the learning curves and predictions of a sequence of events.

TrueSkillThroughTime.History

History(composition::Vector{Vector{Vector{String}}},
        results::Vector{Vector{Float64}}=Vector{Vector{Float64}}(),
        times::Vector{Int64}=Int64[],
        priors::Dict{String,Player}=Dict{String,Player}();
        mu::Float64=MU, sigma::Float64=SIGMA, beta::Float64=BETA,
        gamma::Float64=GAMMA, p_draw::Float64=P_DRAW, online::Bool=false,
        weights::Vector{Vector{Vector{Float64}}}=Vector{Vector{Vector{Float64}}}())

Properties:

size::Int64
batches::Vector{Batch}
agents::Dict{String,Agent}
time::Bool
mu::Float64
sigma::Float64
beta::Float64
gamma::Float64
p_draw::Float64
online::Bool

Let us return to the example seen on the first page of this manual. We define the composition of each game using the names of the agents (i.e. their identifiers). In the following example, each of the agents "a", "b", and "c" wins one game and loses the other. Because no results are given, they are implicitly defined by the order of the teams within each game: the teams appearing first defeat those appearing later. By setting gamma = 0.0 we specify that skills do not change over time.

# assumes `ttt` is an alias for the TrueSkillThroughTime module, as defined on the first page
c1 = [["a"],["b"]]
c2 = [["b"],["c"]]
c3 = [["c"],["a"]]
composition = [c1, c2, c3]
h = ttt.History(composition, gamma=0.0)
History(Events=3, Batches=3, Agents=3)
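
The same history can also be built with explicit results. As a minimal sketch (not part of the original example), and following the convention used later in this manual where the team with the higher score wins, the three compositions above correspond to:

results = [[1., 0.], [1., 0.], [1., 0.]]
h_explicit = ttt.History(composition, results, gamma=0.0)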

After initialization, the History class immediately instantiates a new player for each name and activates the computation of the TrueSkill estimates (not yet TrueSkill Through Time).
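
A few of the properties listed above can be inspected directly. The comments below are a sketch of the expected values, following the summary printed by the constructor:

h.size                          # 3, the number of events
length(h.batches)               # 3, one batch per event because no times were given
sort(collect(keys(h.agents)))   # ["a", "b", "c"]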

Learning curves

To access the estimates we can call the method learning_curves(), which returns a dictionary indexed by the names of the agents.

ttt.learning_curves(h)["a"]
2-element Vector{Tuple{Int64, Gaussian}}:
 (1, Gaussian(mu=3.339079, sigma=4.985033))
 (3, Gaussian(mu=-2.687824, sigma=3.77941))
ttt.learning_curves(h)["b"]
2-element Vector{Tuple{Int64, Gaussian}}:
 (1, Gaussian(mu=-3.339079, sigma=4.985033))
 (2, Gaussian(mu=0.058622, sigma=4.218053))

Individual learning curves are lists of tuples: the first component of each tuple is the time of the estimate and the second is the estimate itself. Although in this example no player is stronger than the others, the TrueSkill estimates vary considerably between players.
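
A curve can be traversed like any vector of tuples. Here is a small sketch, assuming the Gaussian estimates expose the mu and sigma fields used elsewhere in this manual:

for (t, g) in ttt.learning_curves(h)["a"]
    println("t = ", t, ", mu = ", g.mu, ", sigma = ", g.sigma)
end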

Convergence

TrueSkill Through Time solves TrueSkill's inability to obtain correct estimates by allowing the information to propagate throughout the system. To compute them, we call the method convergence() of the History class.

TrueSkillThroughTime.convergence

convergence(h::History; epsilon::Float64=EPSILON, iterations::Int64=ITERATIONS, verbose=true)

TrueSkill Through Time not only returns the correct estimates (the same for all players), it also reduces their uncertainty.

ttt.convergence(h)
ttt.learning_curves(h)["a"]
2-element Vector{Tuple{Int64, Gaussian}}:
 (1, Gaussian(mu=0.0, sigma=2.394808))
 (3, Gaussian(mu=-0.0, sigma=2.394808))
ttt.learning_curves(h)["b"]
2-element Vector{Tuple{Int64, Gaussian}}:
 (1, Gaussian(mu=-0.0, sigma=2.394808))
 (2, Gaussian(mu=-0.0, sigma=2.394808))

Model evidence

We would like a procedure to decide whether TrueSkill Through Time is better than other models, and to select the optimal values of the parameters $\sigma$ and $\gamma$. In the same way that we use probability theory to evaluate the hypotheses of a model given the data, we can also evaluate different models given the data.

$P(\text{Model}|\text{Data}) \propto P(\text{Data}|\text{Model})P(\text{Model})$

where $P(\text{Model})$ is the prior of the models, which we define, and $P(\text{Data}|\text{Model})$ is the prediction made by the model. In the special case where we have no prior preference over any model, we need only compare the predictions made by the models.

$P(\text{Model}|\text{Data}) \propto P(\text{Data}|\text{Model})$

In other words, we prefer the model with the best prediction.

$P(\text{Data}|\text{Model}) = P(d_1|\text{M})P(d_2|d_1,\text{M}) \dots P(d_n|d_{n-1}, \dots, d_1, \text{M})$

where Data $= \{d_1, \dots, d_n\}$ is the data set, M abbreviates Model, and the $d_i$ are the individual data points. This measure can be obtained, on a logarithmic scale, with the log_evidence method used below.
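
As a small sketch using the three games defined earlier (the parameter values are hypothetical, chosen only for illustration), two settings of gamma can be compared through the evidence of their predictions, preferring the one with the larger value:

h1 = ttt.History(composition, gamma=0.0); ttt.convergence(h1, verbose=false)
h2 = ttt.History(composition, gamma=0.1); ttt.convergence(h2, verbose=false)
ttt.log_evidence(h1), ttt.log_evidence(h2)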

Let us develop a complex synthetic example in which this measure is useful for choosing the optimal dynamic uncertainty.

Optimizing the dynamic factor

We now analyze a scenario in which a new player joins a large community of already known players. In this example, we focus on the estimation of an evolving skill: the target player's skill changes over time following a logistic function. The community is generated by ensuring that each opponent has a skill similar to that of the target player throughout their evolution. In the following code, we generate the target player's learning curve and 1000 random opponents.

using Random; Random.seed!(999); N = 1000
# Logistic growth: the skill rises smoothly from 0 towards `maximum` as experience grows.
function skill(experience, middle, maximum, slope)
    return maximum/(1+exp(slope*(-experience+middle)))
end
target = skill.(1:N, 500, 2, 0.0075)            # target player's true skill at each moment
opponents = Random.randn(N)*0.5 .+ target       # one opponent per game, close to the target
println("t1 = ", target[1], ", tn = ", target[end])
t1 = 0.046292688460268065, tn = 1.9540452601799487

The list target has the agent's skills at each moment: the values start at zero and grow smoothly until the target player's skill reaches two. The list opponents contains the randomly generated opponents' skills, drawn from a Gaussian distribution centered on the target's skill at each moment with a standard deviation of $0.5$.

composition = [[["a"], [string(i)]] for i in 1:N]   # the target "a" faces a new opponent each game
results = [r ? [1.,0.] : [0.,1.] for r in (Random.randn(N).+target .> Random.randn(N).+opponents)]
times = [i for i in 1:N]
priors = Dict{String,ttt.Player}()
for i in 1:N  priors[string(i)] = ttt.Player(ttt.Gaussian(opponents[i], 0.2))  end

h = ttt.History(composition, results, times, priors, gamma=0.018)
ttt.convergence(h, iterations = 16)
mu = [tp[2].mu for tp in ttt.learning_curves(h)["a"]]
Iteration = 1, step = (4.599309069592936, 3.6248598158131187)
Iteration = 2, step = (1.291636872973942e-5, 4.8906614031063445e-6)
Iteration = 3, step = (3.8670833896192747e-11, 3.056527253519903e-12)
End

In this code, we define four variables to instantiate the History class and compute the target's learning curve. The variable composition contains 1000 games between the target player and different opponents. The list results is generated randomly by sampling the agents' performances from Gaussian distributions centered on their skills; the winner is the player with the highest performance. The variable times is a list of integers from 1 to 1000 indicating the time batch in which each game occurs: the History class uses the temporal distance between events to determine the amount of dynamic uncertainty ($\gamma^2$) to be added between games. The variable priors is a dictionary used to customize player attributes: we assign low uncertainty to the opponents' priors because we know their skills beforehand.
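
To illustrate the role of times with a small, hypothetical sketch (not part of the original example): events that share the same time value are placed in the same batch, so no dynamic uncertainty is added between them. Reusing the three compositions c1, c2 and c3 from the beginning of this section:

same_time = ttt.History([c1, c2, c3], Vector{Vector{Float64}}(), [1, 1, 1], gamma=0.0)
println(same_time)   # expected: History(Events=3, Batches=1, Agents=3)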

The class History receives these four parameters and initializes the target player using the default values and a dynamic uncertainty gamma=0.018. Using the method convergence(), we obtain the TrueSkill Through Time estimates and the target's learning curve. The following figure shows the evolution of the true (solid line) and estimated (dotted line) target player's learning curves.

The estimated learning curves remain close to the actual skill during the whole evolution.
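
One way to produce such a figure is sketched below; it assumes the Plots.jl package, which is not used elsewhere in this manual, and reuses the mu vector computed above:

using Plots
times_a = [tp[1] for tp in ttt.learning_curves(h)["a"]]
plot(1:N, target, label="true skill")
plot!(times_a, mu, label="estimated skill", linestyle=:dash)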

le = ttt.log_evidence(h)
-668.6661312494882

The geometric mean of the evidence is

exp(le/h.size)
0.5123915852877343

To find the optimal value, repeat this procedure with different values of gamma and keep the one that maximizes the log evidence (or, equivalently, the geometric mean of the evidence).
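
A minimal sketch of such a search (the candidate values of gamma are hypothetical; it reuses the composition, results, times and priors defined above):

function search_gamma(gammas)
    best = (gamma = NaN, log_evidence = -Inf)
    for g in gammas
        hg = ttt.History(composition, results, times, priors, gamma=g)
        ttt.convergence(hg, iterations=16, verbose=false)
        le_g = ttt.log_evidence(hg)
        if le_g > best.log_evidence
            best = (gamma = g, log_evidence = le_g)
        end
    end
    return best
end
best = search_gamma([0.005, 0.01, 0.018, 0.03])
println("gamma = ", best.gamma, ", geometric mean = ", exp(best.log_evidence/N))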