We live in a multi-modal world: we learn, think, and express ourselves through multiple modalities. AI systems should therefore be able to understand that multi-modal world.
Our research on building AI systems focuses on multi-level, multi-aspect, and multi-modal understanding.