Pronunciation training aids delivered through media tools such as mobile apps and web-based systems are widely used nowadays. These tools often provide audio samples and phonetic-style texts that help learners practice their pronunciation without a language teacher. However, learners still struggle in the learning process because they find it hard to detect and locate the mispronounced parts of their own speech while practicing. In this paper, we present a method for visualizing detailed speech features such as pitch, intensity, and duration in text form. The medium used to portray these speech features is animated text, which can express them through text attributes such as size, color, position, or motion. By viewing speech features in a rich text form such as animated text, learners can more easily spot their mispronounced parts and correct them. We examined how actual analyzed speech data can be mapped onto the features of animated text, and how effectively the proposed visualization system portrays the pitch, intensity, and duration of speech. The evaluation was conducted as a survey of forty non-native learners of Japanese, all of them Malaysian novice-level learners. The subjects appeared to accept animated text as a representation for speech visualization, and speech data based on daily conversation appeared to be an easy approach for novice-level learners.