Chapter Number
5
Authors
Majdi Owda
Amani Yousef Owda
Fathi Gasir
Pages From

196
Pages To
208
Book Title
Intelligent Systems and Applications
Publisher
Springer
Project
Data Science
Abstract

Evaluation can be defined as a process of determining the significance of a research output. This is usually done by devising a well-structured study on this output using one or more evaluation measures in which a careful inspection is performed. This paper presents a review of evaluation techniques for Conversational Agents (CAs) and Natural Language Interfaces to Databases (NLIDBs). It then introduces the developed customized evaluation methodology for Conversation-Based Interface to Relational Databases (C-BIRDs). The evaluation methodology created has been divided into two groups of measures. The first is based on quantitative measures, including two measures: task success and dialogue length. The second group is based on a number of qualitative measures, including: prototype ease of use, naturalness of system responses, positive/negative emotion, appearance, text on screen, organization of information, and error message clarity. Then an elaboration is carried out on the devised methodology by adding a discussion and recommendations on the sample size, the experimental setup and the scaling in order to provide a comprehensive evaluation methodology for C-BIRDs. In conclusion the evaluation methodology created is better way for identifying the strengths and weaknesses of C-BIRDs in comparison to the usage of single measure evaluations.