Goals of the talk
Provide an overview of:
- Data
- Tasks
- Evaluation
- Experience with implementing the evaluation procedure
- Feedback from NIST assessors
Introduce the results:
- Sanity checking the results and measures
- Effect of reassessment with a different model summary (Phase 2)
Emphasize:
- Exploratory data analysis
- Attention to evaluation fundamentals over “final” conclusions
- Improving future evaluations