Evaluating Human-Intelligent Virtual Agent Interactions

Welcome to the website of the Open Science Framework (OSF) work group "Artificial Social Agent Evaluation Instrument". In this work group, scientists from the Intelligent Virtual Agent (IVA) community are collaborating to create a validated, community-driven, standardized questionnaire instrument for evaluating human interaction with Intelligent Virtual Agents. This instrument will help researchers make claims about people’s perceptions, attitudes, and beliefs towards their agent. It will allow agents to be compared across user studies and, importantly, it will help in replicating our scientific findings. This is essential for the community if we want to make valid claims about the impact that our social agents can have in domains such as health, entertainment, and education.


The plan, as preregistered on the OSF platform, will be updated to reflect progress. Where possible, a link to the result of each step in the plan will be added. For all resources, visit the publication on this site and/or the OSF webpage.

  1. Determine the process, and get people involved

  2. Determine the model

    1. Examine existing questionnaires

    2. Discussion among experts

  3. Determine the constructs and dimensions

    1. Face validity among experts

    2. Grouping of existing constructs

  4. Determine initial set of construct items

    1. Content validity analysis – study of experts’ agreement that the items measure the intended constructs

    2. Reformulating into easy-to-understand item questions

  5. Determine the final item set, with provision for creating a long and a short questionnaire version (i.e., construct validity, convergent and discriminant validity analysis: select items that both converge and discriminate)

  6. Determine the generalization performance of the long and short questionnaire versions (i.e., cross-validation: fit the model on a data set from a new set of artificial social agents (ASAs)) (Ongoing)

  7. Criterion validity

    1. Predictive validity: agreement with future observations

    2. Concurrent validity: agreement with other ‘valid’ measures collected at the same time

  8. Translate questionnaire (forward/backward translation)

  9. Develop a normative data set
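
To make the item selection in step 5 more concrete, the sketch below shows one simple way to check that an item both converges and discriminates. It is an illustration under stated assumptions, not the work group's actual analysis: the plan does not specify the statistical procedure, and a full analysis would more likely rely on factor-analytic models. The construct names, item identifiers, and response data are hypothetical. An item is kept when its correlation with the rest of its own construct (convergent) exceeds its absolute correlation with every other construct's score (discriminant).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def construct_score(responses, items, exclude=None):
    """Per-respondent mean over a construct's items, optionally leaving one item out."""
    used = [i for i in items if i != exclude]
    n_respondents = len(responses[used[0]])
    return [sum(responses[i][r] for i in used) / len(used) for r in range(n_respondents)]


def select_items(responses, constructs):
    """Keep items that correlate more with their own construct than with any other."""
    kept = {}
    for name, items in constructs.items():
        kept[name] = []
        for item in items:
            # Convergent check: correlate with own construct, excluding the item itself.
            own = pearson(responses[item], construct_score(responses, items, exclude=item))
            # Discriminant check: absolute correlation with each other construct's score.
            others = [
                abs(pearson(responses[item], construct_score(responses, other_items)))
                for other, other_items in constructs.items()
                if other != name
            ]
            if all(own > o for o in others):
                kept[name].append(item)
    return kept


# Hypothetical data: 8 respondents, ratings on a 1-5 scale.
responses = {
    "a1": [1, 1, 2, 2, 4, 4, 5, 5],
    "a2": [2, 1, 2, 2, 4, 5, 5, 5],
    "a3": [1, 5, 1, 4, 2, 5, 2, 4],  # assigned to construct_a, but tracks construct_b
    "b1": [1, 5, 2, 4, 1, 5, 2, 4],
    "b2": [1, 4, 2, 4, 2, 5, 2, 5],
}
# Hypothetical assignment of items to constructs.
constructs = {"construct_a": ["a1", "a2", "a3"], "construct_b": ["b1", "b2"]}

kept = select_items(responses, constructs)
print(kept)
```

With this made-up data, item `a3` is dropped because it follows the pattern of `construct_b` more closely than that of its own construct, while the other four items are kept. Each construct needs at least two items for the leave-one-out score to be defined.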