
Daphne Academy
Daphne Academy is an intelligent tutoring system that adapts its content to a user's ability levels, where each ability level corresponds to a learning topic. Through a partnership with the DoD, Daphne Academy has developed into a mature prototype ready for more rigorous testing and, soon, final deployment. Guest account credentials for testing can be found below (as available).
Location: https://academy.selva-research.com
Guest Accounts
None available…
Methods
Item Response Theory
IRT Parameter Estimation
Dependencies
- numpy
- scipy
- matplotlib
These can be installed with your local Python package manager:
pip3 install numpy scipy matplotlib
If you see any import errors, change the path in the snippet below to point to the location of the incorrectly imported module and run:
import sys
# sys.path.insert(1, '/path/to/repo/cost-estimator-ca')
Item Characteristic Curve
Here, a test item is modeled with a three-parameter logistic (3PL) model. The test item's parameters are the following:
- a = 10 (discrimination – how well the question separates examinees of differing ability)
- b = 0.6 (difficulty – the ability level at which an examinee answers correctly 50% of the time, before accounting for guessing)
- c = 0.25 (guessing parameter – the probability that a very-low-ability examinee answers the item correctly)
The characteristics of this multiple-choice (MC) question are the following:
- has four equally plausible choices (hence c = 0.25)
- is slightly harder than average
- discriminates between novices and experts very well
The 3PL model gives the probability that a user at any given ability level will answer the question correctly:
$p(\theta) = c + (1 - c)\frac{\exp[a(\theta - b)]}{1 + \exp[a(\theta - b)]}$
where
- $a$: discrimination parameter
- $b$: difficulty parameter
- $c$: guessing parameter
- $\theta$: user skill level
- $p(\theta)$: probability the user answers correctly
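As a quick sanity check, at $\theta = b$ the exponential term equals 1, so $p(b) = c + (1 - c)/2 = 0.625$ for $c = 0.25$. The IIIPL class imported below ships with the repository; a minimal sketch of its core, assuming only the constructor and the prob_correct methods exercised in this notebook (the repo's actual internals may differ), might look like:
import numpy as np

class IIIPL:
    # Minimal sketch of a 3PL item model; the repo's IIIPL may differ internally
    def __init__(self, a, b, c):
        self.a = a  # discrimination
        self.b = b  # difficulty
        self.c = c  # guessing parameter

    def prob_correct(self, theta):
        # p(theta) = c + (1 - c) * logistic(a * (theta - b))
        return self.c + (1 - self.c) / (1 + np.exp(-self.a * (theta - self.b)))

    def prob_correct_arr(self, thetas):
        # Vectorized evaluation over an array of ability values
        return self.prob_correct(np.asarray(thetas))

# Sanity check: at theta == b the logistic term is 0.5, so p(b) == 0.625 for c == 0.25
assert abs(IIIPL(10, 0.6, 0.25).prob_correct(0.6) - 0.625) < 1e-9
The repository's implementation is the one used for the plots below.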
import sys
from irt_estimation.IIIPL import IIIPL
import numpy as np
import matplotlib.pyplot as plt
# Define test-item parameters
a = 10
b = 0.6
c = 0.25
# Instantiate 3PL model for test-item
item_model = IIIPL(a, b, c)
# Sample ability parameter values on a uniform grid over [0, 1] to draw the Item Characteristic Curve
theta = np.arange(0, 1, 0.01)
prob_correct = item_model.prob_correct_arr(theta)
plt.plot(theta, prob_correct)
plt.xlabel("user ability")
plt.ylabel("probability of correctly answering")
_ = plt.title(f"Item Characteristic Curve a={a}, b={b}, c={c}")
Estimating an Examinee's Ability
To estimate an examinee's unknown ability parameter, the numerical values of the test-item parameters are assumed to be known. Additionally, the initial guess of the examinee's ability parameter is some a priori value (e.g. $\theta = 1$).
Two methods for estimating an examinee's ability will be demonstrated:
- Maximum Likelihood Estimation
- Maximum a Posteriori Estimation
Maximum Likelihood Estimation
Consider a K-item test where the items are scored $u_j = 0$ or $u_j = 1$ depending on whether the answer is incorrect or correct, for $j = 1, \ldots, K$. Assuming local independence of the test items in the response vector $u = (u_1, \ldots, u_K)$, the maximum likelihood estimate of an examinee's ability parameter $\theta$ is found with:
maximize: $L(\theta|u) = p(u|\theta) = \prod_{j = 1}^{K} p_j(\theta)^{u_j}[1-p_j(\theta)]^{1-u_j}$ over $\theta \in (0, 1)$
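In practice it is numerically safer to maximize the log-likelihood, since a product of many probabilities underflows quickly. A minimal sketch of what MLE_Ability_Estimator might do internally, assuming a bounded scalar optimizer (the repo's actual implementation may differ):
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_likelihood(theta, answers):
    # answers: list of (u_j, item_model) pairs with u_j in {0, 1}
    ll = 0.0
    for u, item in answers:
        p = item.prob_correct(theta)
        ll += u * np.log(p) + (1 - u) * np.log(1 - p)
    return -ll  # negate so that minimizing maximizes the likelihood

# Hypothetical two-item exam (IIIPL as imported above)
example = [(1, IIIPL(10, .5, .25)), (0, IIIPL(4, .4, .25))]
result = minimize_scalar(neg_log_likelihood, args=(example,),
                         bounds=(0, 1), method='bounded')
print(result.x)  # ML estimate of theta
Note that an all-correct or all-incorrect response vector pushes the maximizer to the boundary of $(0, 1)$, which is why the simulation at the end of this section discards such draws. The repository's estimator is used below.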
from irt_estimation.IIIPL import MLE_Ability_Estimator
import matplotlib.pyplot as plt
# Exam results
answers = [
(1, IIIPL(10, .5, .25)),
(0, IIIPL(4, .4, .25)),
(1, IIIPL(6, .8, .25)),
(1, IIIPL(6, .9, .25)),
(1, IIIPL(3, .2, .25)),
(0, IIIPL(6, .4, .25)),
(1, IIIPL(4, .3, .25)),
(1, IIIPL(7, .5, .25)),
(1, IIIPL(7, .7, .25)),
(1, IIIPL(1.3, .5, .25))
]
# Instantiate estimator
estimator = MLE_Ability_Estimator(answers)
# Estimate the user's ability parameter by maximizing the likelihood function
estimate = estimator.estimate().x
print('--> ESTIMATE', estimate)
# Sample ability parameter values on a uniform grid over [0, 1] to draw the likelihood function
theta = np.arange(0, 1, 0.01)
likelihoods = estimator.likelihood_arr(theta)
plt.plot(theta, likelihoods)
plt.axvline(x=estimate, color='r', linestyle='dashed', linewidth=2, label='Estimate')
plt.legend()
plt.xlabel("user ability")
plt.ylabel("likelihood given exam results")
_ = plt.title(f"ML Estimation")
--> ESTIMATE 0.7338761709505345
Maximum a Posteriori Estimation
This estimation method is identical to the previous one except that the likelihood function is multiplied by the prior distribution of the ability parameter. The posterior distribution of $\theta$ is found according to Bayes' theorem:
- $p(\theta|u) = \frac{p(u|\theta)p(\theta)}{p(u)}$
maximize: $p(\theta|u) \propto [\prod_{j = 1}^{K} p_j(\theta)^{u_j}[1-p_j(\theta)]^{1-u_j}] \cdot p(\theta)$ over $\theta \in (0, 1)$ where
- $u$: the examinee's response vector: $u_j = 0$ or $u_j = 1$ depending on whether question $j$ was answered incorrectly or correctly, for $j = 1, \ldots, K$
- $p_j(\theta)$: probability the user answers question $j$ correctly
Because the denominator is not a function of $\theta$, it has no bearing on the optimization and hence can be ignored. It is sufficient to find the value of $\theta$ that maximizes the numerator, giving:
- max $p(u|\theta)p(\theta)$
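Relative to the MLE, the only change is adding the log of the prior density to the log-likelihood before optimizing. A minimal sketch, reusing the hypothetical neg_log_likelihood above (again an assumption about the internals, not the repository's code):
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def neg_log_posterior(theta, answers, theta_dist):
    # -[log p(u|theta) + log p(theta)], the negated unnormalized log-posterior
    return neg_log_likelihood(theta, answers) - theta_dist.logpdf(theta)

# Same hypothetical two-item exam, with a N(0.5, 0.2) prior on theta
result = minimize_scalar(neg_log_posterior, args=(example, norm(0.5, 0.2)),
                         bounds=(0, 1), method='bounded')
print(result.x)  # MAP estimate of theta
The repository's MAP estimator, used below, accepts the prior as a theta_dist argument.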
from irt_estimation.IIIPL import MAP_Ability_Estimator
from scipy.stats import norm, uniform
# Exam results (same as above)
answers = [
(1, IIIPL(10, .5, .25)),
(0, IIIPL(4, .4, .25)),
(1, IIIPL(6, .8, .25)),
(1, IIIPL(6, .9, .25)),
(1, IIIPL(3, .2, .25)),
(0, IIIPL(6, .4, .25)),
(1, IIIPL(4, .3, .25)),
(1, IIIPL(7, .5, .25)),
(1, IIIPL(7, .7, .25)),
(1, IIIPL(1.3, .5, .25))
]
# Define ability parameter prior density
mean = 0.5
sd = 0.2
dist = norm(mean, sd)
# dist = uniform()
# Instantiate estimator, set prior distribution
estimator = MAP_Ability_Estimator(answers, theta_dist=dist)
# Estimate the user's ability parameter by maximizing the MAP function
estimate = estimator.estimate().x
print('--> ESTIMATE', estimate)
# Sample ability parameter values on a uniform grid over [0, 1] to draw the MAP function
theta = np.arange(0, 1, 0.01)
probabilities = estimator.likelihood_arr(theta)
plt.plot(theta, probabilities)
plt.axvline(x=estimate, color='r', linestyle='dashed', linewidth=2, label='Estimate')
plt.legend()
plt.xlabel("user ability")
plt.ylabel("a posteriori probability given exam results")
_ = plt.title(f"MAP Estimation")
--> ESTIMATE 0.6160507721906785
Estimation vs Reality
To show how the ML estimate of $\theta$ compares with reality, an exam built from a bank of 10 MC questions is simulated 100 times for a user with ability $\theta = 0.6$. For each question in a given simulation, we sample the ICC to get the probability that the user answers the question correctly and draw a random variable to decide whether the user gets the question right. The item response vector for the simulation is then passed to the ML estimator to produce an estimate of the user's ability parameter $\theta$. This procedure is repeated 100 times, and a histogram of the $\theta$ estimates is displayed along with additional metrics.
- It can be seen that increasing the number of exam questions reduces the standard deviation of the point estimates; a quick check of this is sketched after the simulation output below.
from irt_estimation.IIIPL import MLE_Ability_Estimator, IIIPL
import numpy as np
import random
import matplotlib.pyplot as plt
# Simulates one exam and returns an ML ability estimate (or None if degenerate)
def run_simulation(true_theta, questions):
    # Probability of a correct answer for each item at the true ability
    probs = [q.prob_correct(true_theta) for q in questions]
    # Draw random outcomes: 1 if the examinee answers correctly, else 0
    answers = [int(r) for r in np.random.rand(len(questions)) < probs]
    # All-correct or all-incorrect response vectors have no interior maximum
    if all(u == 1 for u in answers) or all(u == 0 for u in answers):
        return None
    item_response_vector = list(zip(answers, questions))
    return MLE_Ability_Estimator(item_response_vector).estimate().x
# Define the examinee's true ability parameter value
true_theta = 0.6
num_questions = 3  # number of times the 10-question bank below is repeated (30 questions total)
num_sims = 100
# Define exam questions
questions = [
IIIPL(10, .6, .25),
IIIPL(5, .3, .25),
IIIPL(8, .8, .25),
IIIPL(2, .4, .25),
IIIPL(3, .65, .25),
IIIPL(4, .45, .25),
IIIPL(6, .7, .25),
IIIPL(4, .3, .25),
IIIPL(7, .25, .25),
IIIPL(8, .9, .25)
] * num_questions
# Run simulation and record estimates
estimates = []
for x in range(num_sims):
    estimate = run_simulation(true_theta, questions)
    if estimate is not None:  # skip degenerate (all-right/all-wrong) exams
        estimates.append(estimate)
print('------> NUM SAMPLES:', num_sims)
print('-----> TRUE ABILITY:', round(true_theta, 5))
print('----> MEAN ESTIMATE:', round(np.mean(estimates), 5))
print('--> ESTIMATES STDEV:', round(np.std(estimates), 5))
print('----> ESTIMATES VAR:', round(np.var(estimates), 5))
# Plot histogram of estimates
bins = np.arange(0, 1.1, 0.1)
plt.hist(estimates, bins=bins)
plt.axvline(x=true_theta, color='r', linestyle='dashed', linewidth=2, label='True Ability')
plt.axvline(x=np.mean(estimates), color='g', linestyle='dashed', linewidth=2, label='Mean Estimate')
plt.legend()
plt.xlabel('Estimated User Ability')
plt.ylabel('Number of Occurrences')
_ = plt.title(f"Histogram of Ability Parameter Estimates: questions={len(questions)}")
------> NUM SAMPLES: 100
-----> TRUE ABILITY: 0.6
----> MEAN ESTIMATE: 0.60393
--> ESTIMATES STDEV: 0.09845
----> ESTIMATES VAR: 0.00969
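To check the bullet point above about exam length, the simulation can be rerun while varying how many times the 10-question bank is repeated, comparing the spread of the resulting estimates (a quick sketch reusing run_simulation and the variables from the cell above; exact numbers vary from run to run):
# Compare estimate spread as the 10-question bank is repeated 1x, 2x, 3x
for repeats in (1, 2, 3):
    exam = questions[:10] * repeats
    ests = [run_simulation(true_theta, exam) for _ in range(num_sims)]
    ests = [e for e in ests if e is not None]  # drop degenerate exams
    print(f"questions={len(exam):2d}  estimate stdev={np.std(ests):.5f}")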