My team recently held the first in a series of training sessions on Emmersion fundamentals: the key features and processes that help anyone better understand the important work we do. If you are a regular reader, you will hopefully recognize some overlap between what is discussed here and what we have covered in earlier posts.
If you are new, welcome! This will be a great place for you to start learning more about the language assessment principles and technology that support our work to close the global communication gap. In this first session, our team focused on answering the following questions:
- What is elicited imitation?
- Which aspects of listening does EI measure?
- What is chunking?
- Which elements of speaking are measured with EI?
- Which elements of speaking are not measured by EI?
- What is partial construct coverage?
What is elicited imitation?
Elicited imitation (EI) is the key innovation that our speaking test uses to measure speaking ability quickly and accurately. This task type consists of a language-learner listening to an audio prompt and then repeating what they heard as completely as their language ability allows. Using AI technology, the test compares the speech produced by the speaker to the prompt in order to assess ability and generate an automatic score.
While researchers have known about EI’s potential for decades, it took advancements in automatic speech recognition and the persistence of Emmersion’s founders to take it from the laboratory to classrooms and offices around the world.
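To make the comparison step above more concrete, here is a minimal sketch. It assumes the spoken response has already been transcribed to text (the speech recognition step is not shown), and it aligns words with Python's standard `difflib` module. The function names `normalize` and `repetition_accuracy` are hypothetical, and this is an illustration only, not the scoring method our test actually uses.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text, replace punctuation with spaces, and split into words."""
    cleaned = "".join(ch if ch.isalnum() or ch in "' " else " " for ch in text.lower())
    return cleaned.split()


def repetition_accuracy(prompt: str, response: str) -> float:
    """Fraction of prompt words that the response reproduces in order (0.0 to 1.0).

    Illustrative assumption: a plain word-level alignment stands in for
    whatever comparison a real scoring engine performs.
    """
    prompt_words = normalize(prompt)
    response_words = normalize(response)
    if not prompt_words:
        return 0.0
    matcher = SequenceMatcher(a=prompt_words, b=response_words, autojunk=False)
    # Sum the sizes of all aligned runs of identical words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(prompt_words)


prompt = "The train leaves at nine tomorrow morning."
print(repetition_accuracy(prompt, "The train leaves at nine tomorrow morning"))
print(repetition_accuracy(prompt, "The train leaves nine tomorrow"))
```

A complete repetition scores 1.0, and a response that drops words scores lower. A real system would also need to weigh pronunciation and item difficulty, which this sketch ignores.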
Which aspects of listening does EI measure?
Elicited imitation measures a person’s ability to listen with linguistic precision. It measures this ability with a task type that has a single, clearly identified purpose. To present information to the test-taker, EI uses audio recordings of a single utterance that vary in length and complexity. The text of this audio spans many different contexts and is presented only once under idealized conditions.
I will provide a little more detail for each of those aspects:
Listen with linguistic precision. Unlike other listening tasks where understanding the main idea or gist of what a speaker is saying is sufficient, EI requires the listener to listen with precision to not only what was said but also how it was formed.
When there is a single clearly identified purpose. The test-taker is listening in order to be able to repeat with accuracy what was said. The directness of its purpose helps even people who have never taken an EI test before to quickly understand the expectations of the task.
Audio that consists of a single utterance of varied length and complexity. The recording or prompt of an EI task consists of one person speaking a sentence or a few connected sentences. These sentences will vary in their length and complexity depending on the ability level that is being targeted.
That spans many different contexts. EI items have been selected from a wide variety of genres and sources of texts. They are intended to encompass a broad spectrum of language depending on the proficiency level that they target. In order to ensure the appropriateness of the content, we consult with language experts and items are calibrated across thousands of test-takers before they can be used to calculate a score.
Presented only once. A test-taker can only listen to an audio prompt once. This is by design and aligned with the body of research into EI’s effectiveness. Playing the audio once targets a specific cognitive skill set that is highly relevant to language performance.
Under idealized conditions. The recording and playback of the audio prompt are done with care to prevent any disadvantage to the test-taker. Voice talent used to record the prompt is reviewed by language experts and selected for a highly standard accent and speaking rate. Great efforts are made to ensure that playback audio is clear and clean when the recommended hardware and software are used.
Related to this question, we also discussed what increases the difficulty of an EI task. Several factors increase the difficulty of the prompt, including:
- Grammar complexity
- Vocabulary frequency
- Function complexity
What is chunking?
Chunking refers to a specific type of memory skill that is very prominent in language performance.
In its most elemental form, spoken language is simply sounds or phonemes. As the sounds of the language are learned, these begin to be grouped together into syllables. As words are learned, these syllables are grouped together and gain new meaning.
Further mastery of the language results in words being grouped together in longer strings or multi-word phrases. These phases of language acquisition are regulated by chunking, as illustrated below with some building blocks.
As the brain is performing an EI task, it needs to take the input (what is heard) from the areas of the brain that receive information when listening and pass it to the areas of the brain that will produce it (through speaking). The success of this handoff is directly impacted by chunking.
A person’s ability to chunk is directly impacted by language proficiency. If you are highly proficient in a language, you can take a longer, more complex sentence and successfully chunk it into meaningful bits and reproduce it with accuracy. Lower levels of language ability result in less effective chunking. Bits go missing or come out with errors that were not in the prompt, even when the prompt is short and simple.
This process can be visualized in the following sequence of pictures. If I am completely unfamiliar with a language, the sounds of the language quickly overwhelm me and I’m very limited in what input I can successfully move to output. If I am highly skillful with the language, I can take even the many sounds in a long complex sentence and chunk them into manageable pieces. If I am somewhere in-between, I will be mixed in my chunking success. This will be evidenced by inconsistencies in my ability to listen and repeat.
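The same idea can be sketched as a toy model. It assumes memory holds a fixed number of slots and that each slot holds one chunk, so a listener who forms larger chunks carries more words from listening to speaking. The capacity of four slots, the fixed chunk sizes, and the function names `chunk` and `recalled_words` are illustrative assumptions, not a cognitive model and not part of how the test is scored.

```python
def chunk(words: list[str], chunk_size: int) -> list[list[str]]:
    """Group consecutive words into chunks of at most chunk_size words."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def recalled_words(sentence: str, chunk_size: int, capacity: int = 4) -> list[str]:
    """Return the words retained when only the first `capacity` chunks fit in memory.

    Illustrative assumption: a larger chunk_size stands in for higher proficiency.
    """
    chunks = chunk(sentence.split(), chunk_size)
    kept = chunks[:capacity]
    return [word for group in kept for word in group]


sentence = "the manager asked us to send the report before the meeting starts"
for size in (1, 2, 3):
    kept = recalled_words(sentence, chunk_size=size)
    print(f"chunk size {size}: {len(kept)} of {len(sentence.split())} words retained")
```

With the same number of slots, word-by-word chunking keeps only the start of the sentence, while three-word chunks keep all of it. Real chunking follows meaning rather than a fixed size, so treat this only as an intuition aid.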
So, is EI a memory test? Meaning only a test of a person’s ability to memorize? No.
Does EI test language ability as it encompasses a very specific aspect of memory? Yes.
We use limitation testing to avoid over-testing a person’s memory. Limitation testing reveals the reasonable capacity of memory for high-ability speakers of a language. As we calibrate new test content, we ensure that we are measuring language ability and not something else (including memory).
Which elements of speaking are measured with EI?
Elicited imitation measures a person’s ability to use spoken English to transmit information accurately (without error) and comprehensibly (in a way that is understood). For many organizations around the world, this is the specific type of speaking skill needed in order to perform important work.
Elicited imitation measures speaking ability with a very strong emphasis on its linguistic components. Components of speaking that are tested very well by EI (much more precisely than human-powered assessments) include:
- Grammatical accuracy – the ability to produce language with its expected and correct order and formation
- Vocabulary control – the ability to use vocabulary correctly
- Phonological control – pronunciation and degree of accentedness
The following are also targeted by EI, but more generally:
- Vocabulary range – the ability to understand and produce vocabulary that varies in difficulty
- General range – the ability to use speaking across a range of situations
The following are loosely measured by EI:
- Flexibility – the ability to adjust to changes in topic
- Turn-taking – the ability to switch from listening to speaking
Which elements of speaking are not measured by EI?
There are also important elements of speaking that are not measured by EI. These include components in the areas of sociolinguistics (responding appropriately based on changes in interaction and situation) and pragmatics (the non-linguistic features of speaking).
Other than the interactive and generative components of speaking that are missing because of the structure of an EI task, the largest gap in how EI tasks are scored is spoken fluency. Fluency in this specific case refers to the flow of speech. Fluent speech is speech that is delivered at an appropriate rate and is free from disruptions that distract or change meaning.
Fluency is an important part of speaking ability. It is one way that a speaker shows their ability. It is appropriate that our use of EI does not include a focus on fluency because of its targeted focus on accuracy and control. Future feature work will use the open response module to add measures of fluency to our overall speaking scoring.
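The coverage levels from the two sections above can be gathered into a single lookup. This is only a summary of what this post says; the `Coverage` labels and the `components_with` helper are hypothetical names chosen for the sketch.

```python
from enum import Enum


class Coverage(Enum):
    PRECISE = "measured precisely"
    GENERAL = "targeted more generally"
    LOOSE = "loosely measured"
    NOT_MEASURED = "not measured"


# Components and levels as described in this post.
EI_COVERAGE = {
    "grammatical accuracy": Coverage.PRECISE,
    "vocabulary control": Coverage.PRECISE,
    "phonological control": Coverage.PRECISE,
    "vocabulary range": Coverage.GENERAL,
    "general range": Coverage.GENERAL,
    "flexibility": Coverage.LOOSE,
    "turn-taking": Coverage.LOOSE,
    "sociolinguistics": Coverage.NOT_MEASURED,
    "pragmatics": Coverage.NOT_MEASURED,
    "fluency": Coverage.NOT_MEASURED,
}


def components_with(level: Coverage) -> list[str]:
    """Return the components at a given coverage level, sorted alphabetically."""
    return sorted(name for name, cov in EI_COVERAGE.items() if cov is level)


for level in Coverage:
    print(f"{level.value}: {', '.join(components_with(level))}")
```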
What is partial construct coverage?
A construct is an abstract phenomenon that is real and significant but cannot be directly measured (like measuring something with a ruler). Speaking ability is a construct. While it can be observed, it cannot be fully and perfectly captured by a single measurement.
Elicited imitation does not fully represent the construct of speaking ability. As described above, it is limited in what aspects of speaking it measures and how it measures them. Other types of speaking tests have different strengths and weaknesses. Some types of speaking tests may more fully cover the construct of speaking ability.
This includes interview style assessments, which include some of the interactive and generative aspects of speaking. However, despite having greater coverage of the construct, the vulnerabilities to reliability and practicality in administering and scoring these types of assessments are important to consider.
Elicited imitation testing presents partial construct coverage. However, any potential negative effects of this limitation are reduced by the predictive modeling that is done as a TrueNorth score is generated. With this type of machine learning analysis, we are able to generate an understanding of ability similar to that of assessments with fuller construct coverage, without their vulnerabilities and limitations.
What’s coming next?
We will keep you in the Emmersion know through future posts that follow up on my team’s training series. As always, if you would like to learn more about our products, reach out. Also, if you have any feedback on the information that is presented here, we would love to connect!
Personally, I am interested to hear how you would evaluate the information presented here on a scale from 1-5 across three characteristics: ease of understanding, usefulness, and insightfulness (how new or newly compelling was this information to you).
Despite having a lot of new stuff that we are excited by and working on, we are committed to improving the work that has previously been shared or released, including content that comes out in this blog. If there is anything you would like us to specifically highlight, we would love to hear from you. Comment below!
The TrueNorth English Speaking Assessment was developed to modernize English language testing with patented artificial intelligence and machine learning. This technology allows for immediate results and scoring as opposed to all other language assessments that take 24 to 48 hours to grade. TrueNorth provides a convenient and immediate English testing solution that has been validated and calibrated to global testing standards and is also available in several other languages.
Some of the largest international companies rely on TrueNorth testing every day. With this technology, BPOs and international companies utilize real-time data and reporting to assist in attracting and hiring the best talent. Additionally, more than 650 universities, colleges, and training institutions around the globe use the TrueNorth platform for course placement and progress tracking. TrueNorth is delivered online with only the need for headphones and microphones.