A Natural Conversational Virtual Human with Multimodal Dialog System
DOI: https://doi.org/10.11113/jt.v71.3859

Keywords: Speech Synchronization, Dialog Behavior Systems

Abstract
Making a virtual human character realistic and credible in a real-time automated dialog animation system is essential. This kind of animation carries important elements for many applications, such as games, virtual agents, and film animation. It is also important for applications that require interaction between humans and computers. For this purpose, however, the machine must have sufficient intelligence to recognize and synthesize human speech. As one of the most vital interaction methods between human and machine, speech has recently received significant attention, especially in avatar research. One of the challenges is to create precise lip movements for the avatar and synchronize them with recorded audio. This paper introduces the concept of multimodal dialog systems for a virtual character and focuses on the output part of such systems. More specifically, its focus is on behavior planning and developing the data control languages (DCL).
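
To make the output side concrete, the sketch below shows one common shape such a data control language takes: an XML behavior block in the style of the Behavior Markup Language (BML), in which a nonverbal behavior is scheduled against a named synchronization point inside the speech. This is a minimal illustrative sketch, not the system described in the paper; the element and attribute names loosely follow BML, and the helper function build_behavior_plan is an assumption introduced here for illustration.

import xml.etree.ElementTree as ET

def build_behavior_plan(utterance: str, sync_word_index: int) -> str:
    """Return a BML-style behavior block as a string (illustrative schema)."""
    bml = ET.Element("bml", id="bml1")

    # Speech behavior: the text to synthesize, with a named sync point
    # inserted before the word the gesture should align with.
    speech = ET.SubElement(bml, "speech", id="s1")
    text = ET.SubElement(speech, "text")
    words = utterance.split()
    text.text = " ".join(words[:sync_word_index]) + " "
    mark = ET.SubElement(text, "sync", id="tm1")
    mark.tail = " ".join(words[sync_word_index:])

    # Nonverbal behavior: a head nod whose stroke phase is tied to the
    # sync point inside the speech, rather than to an absolute timestamp.
    ET.SubElement(bml, "head", id="h1", lexeme="NOD", stroke="s1:tm1")

    return ET.tostring(bml, encoding="unicode")

# Example: nod on the word "completely".
print(build_behavior_plan("I completely agree with you", 1))

Tying the gesture stroke to a named sync point, instead of an absolute time, is what lets a behavior realizer keep speech and motion aligned even when the synthesized audio's timing changes.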