When I consider an enterprise software solution, I like to compare the features of each option side by side. Sometimes I learn about important features that weren’t originally in my consideration set. Once I have a solid understanding of what I am looking for, I evaluate the firms that offer the options I need and eliminate those that cannot deliver on them.
Corporations with large-scale software implementations often formalize the process I described above into a Request for Information (RFI) or Request for Proposal (RFP).
I suggest comparing the following characteristics when you evaluate a language testing service or software.
Adaptive testing is a testing technique designed to adjust to the response characteristics of individual examinees by presenting items of varying difficulty based on the examinee’s responses to previous items. The process continues until a stable estimate of the ability level of the examinee can be determined (APA Dictionary of Psychology).
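To make the definition above concrete, here is a minimal sketch of an adaptive loop. It is not how any particular vendor implements adaptivity (production systems typically rely on item response theory); it is a simplified "halving staircase" under my own assumptions. The function name `run_adaptive_test`, the difficulty scale, and the stopping rule are all hypothetical.

```python
from typing import Callable, List


def run_adaptive_test(
    item_difficulties: List[float],
    answer_item: Callable[[float], bool],
    start_ability: float = 0.0,
    start_step: float = 2.0,
    min_step: float = 0.25,
) -> float:
    """Estimate ability with a simple halving staircase (illustrative only)."""
    remaining = sorted(item_difficulties)
    ability = start_ability
    step = start_step
    # Stop when the adjustments become small (a stand-in for a "stable
    # estimate") or when the item bank runs out.
    while remaining and step >= min_step:
        # Present the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda d: abs(d - ability))
        remaining.remove(item)
        # Raise the estimate after a correct answer, lower it after a miss.
        if answer_item(item):
            ability += step
        else:
            ability -= step
        # Halve the step so the estimate settles.
        step /= 2
    return ability


if __name__ == "__main__":
    # Hypothetical item bank with difficulties from -3.0 to 3.0.
    bank = [x * 0.5 for x in range(-6, 7)]
    # Simulated examinee who answers correctly up to difficulty 1.2.
    estimate = run_adaptive_test(bank, lambda difficulty: difficulty <= 1.2)
    print(f"Estimated ability: {estimate}")
```

Notice that the simulated examinee only sees four items, each chosen near the current estimate, instead of working through the whole bank. That is the practical benefit of adaptivity: fewer items that are far too easy or far too hard.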
Adaptivity is an important feature of language testing. Without it, test-takers are likely to encounter testing fatigue and frustration, creating a greater chance of invalid and unexpected results. Adaptive language tests exist today in the form of human-rated interviews or even automated solutions. Whichever you choose, you should be sure adaptivity plays a part.
Calibration is the process of assigning values to a measuring device (instrument, test, or scale) relative to a reference standard.
For example, it would be useful to compare the scores on a new test of intelligence with those from an older, well-accepted test to ensure that the new test scores provide comparable ratings or values. To do so, a researcher might select a specific group of people (the calibration sample), administer each individual both the old and new tests, and then assess the results (APA Dictionary of Psychology).
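The comparison described above can be sketched in a few lines. This is only an illustration under my own assumptions: the scores are made-up placeholder numbers, the helper names are hypothetical, and a real calibration study would use a much larger sample and more careful equating methods than a straight-line fit.

```python
from math import sqrt
from typing import List, Tuple


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def linear_mapping(new_scores: List[float], old_scores: List[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept mapping new-test scores to the old scale."""
    mean_new = sum(new_scores) / len(new_scores)
    mean_old = sum(old_scores) / len(old_scores)
    sxy = sum((n - mean_new) * (o - mean_old) for n, o in zip(new_scores, old_scores))
    sxx = sum((n - mean_new) ** 2 for n in new_scores)
    slope = sxy / sxx
    intercept = mean_old - slope * mean_new
    return slope, intercept


if __name__ == "__main__":
    # Placeholder calibration sample: each person took both tests.
    new_test = [10.0, 20.0, 30.0, 40.0]
    old_test = [25.0, 45.0, 65.0, 85.0]
    slope, intercept = linear_mapping(new_test, old_test)
    print("correlation:", pearson_r(new_test, old_test))
    print("a new-test score of 35 maps to", slope * 35 + intercept, "on the old scale")
```

A high correlation suggests the two tests rank people similarly, and the fitted line gives a rough way to express a new-test score on the older, well-accepted scale.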
A fair number of standards exist in English language testing, for example. Mapping and correlating language test results to these standards is typical when calibrating a language test. Some of the more common standards are:
In addition to calibrating a test to a common standard, some testing software takes it a step further. In some cases, organizations want a language test that best fits their specific needs. They don’t need a scale that specializes in testing for academic purposes (like TOEFL) or testing for business settings (like TOEIC); rather, they prefer a scale fitted specifically to their organization.
Emmersion’s tools, for example, are designed to be calibrated and fit to each organization. The most common use is in education, where universities and colleges want to test and place students based on their own courses and curriculum. Finding software that can both match industry standards and calibrate to unique organizational standards adds a lot of flexibility and accuracy.
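One simple way to picture organization-specific calibration is a set of cut scores that map a test score to a course. The sketch below is my own illustration, not Emmersion's method; the cut scores and course names are hypothetical, and in practice an institution would set them from its own placement data.

```python
from bisect import bisect_right
from typing import List


def place_student(score: float, cut_scores: List[float], course_names: List[str]) -> str:
    """Return the course for a score, given ascending cut scores.

    A score equal to a cut score is placed in the higher course.
    """
    if len(course_names) != len(cut_scores) + 1:
        raise ValueError("need exactly one more course than cut scores")
    return course_names[bisect_right(cut_scores, score)]


if __name__ == "__main__":
    # Hypothetical cut scores and course names for one institution.
    cuts = [40.0, 60.0, 80.0]
    courses = ["ESL 100", "ESL 200", "ESL 300", "Exempt"]
    for score in (35.0, 60.0, 95.0):
        print(score, "->", place_student(score, cuts, courses))
```

Two organizations could use the same underlying test and scale while keeping entirely different cut scores, which is where the flexibility comes from.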
Validity is the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of conclusions drawn from some form of assessment. Validity has multiple forms, typically depending on the research question and the particular type of inference being made.
4. Usage and Adoption
Because there is a plethora of testing standards, it is easy to become concerned about which one is the best or most accurate. Sometimes the best way to validate your decision is to research which testing software has the highest adoption rate or is most widely accepted. Other considerations include finding out which assessments are being used in K-12 education and higher education.
Often, assessments designed for one specific purpose are used inappropriately for another. While the scores may correlate with the ability you care about, that does not mean the test is accurately measuring what you intend to measure. Be sure to check how language assessments are adopted and used by educational peers.
5. Automation and Development
Automation is probably one of the most difficult and important features of language testing. That is, how fast can you get accurate results?
Only recently, with patented technologies, has automated language-speaking testing become available. Most speaking tests are still graded by a human or multiple humans. Human grading is inherently inaccurate, time-consuming, and expensive.
Some testing software today uses machine learning, a branch of artificial intelligence (AI), to improve over time. As more and more test-takers use the software, more data becomes available for learning. That data is incorporated into the scoring algorithm and test-taking process.
Using surveys and background information, test scoring can be adjusted based on key information points. Additionally, machine learning can be used to identify which particular sources of data from surveys and background information are stronger ability predictors and can be further incorporated.
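As a rough illustration of identifying the stronger predictors, the sketch below ranks background fields by how strongly each correlates with measured ability. This is a deliberately simple stand-in for machine learning, not a description of any vendor's scoring algorithm; the field names and numbers are made up for the example.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either list has no variance."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_predictors(
    features: Dict[str, List[float]], ability: List[float]
) -> List[Tuple[str, float]]:
    """Sort survey fields from strongest to weakest absolute correlation."""
    scored = [(name, abs(pearson_r(values, ability))) for name, values in features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical survey answers for five test-takers.
    survey = {
        "age": [30.0, 22.0, 41.0, 25.0, 33.0],
        "years_of_study": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
    measured_ability = [1.0, 2.0, 3.0, 4.0, 5.0]
    for name, strength in rank_predictors(survey, measured_ability):
        print(f"{name}: {strength:.2f}")
```

In this made-up sample, years of study tracks ability closely while age barely does, so only the former would be worth feeding back into scoring.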
6. Language Syntax and Modality
Syntax is the set of rules that describes how words and phrases in a language are arranged into grammatical sentences. Essentially, it is the branch of linguistics that studies such language rules.
Language testing has syntax too. The main areas of language testing include:
Not all language tests cover these five language syntaxes. Finding a test that accurately measures the language component you’re looking for is extremely important for valid results. Many testing services will attempt to adapt their language assessment for purposes it was not designed or calibrated to accomplish. This can lead to poor results and performance.
Making your software accessible and usable by those with disabilities is just good practice. Every country has different laws and regulations with regard to accessibility, but software companies (especially language testing companies) should make a practice of staying compliant and constantly improving. Technology exists today to support those with disabilities, and companies with high ethical standards should stay as compliant as possible. For example, the United States has an Americans with Disabilities Act (ADA) standard for accessible design.
Great software comes with great support. In the evaluation of software, it is imperative to evaluate the support systems in place, as well. Does the company have solid reviews with easy and multiple access points to support teams? Does the company care about your success? What about upgrades? Do you get unlimited access to future development? Many companies today have customer success teams to improve the usage and benefit of their software. Do not purchase software without a deep understanding of their customer support and success efforts.
While the interface, usability, and capabilities are often the focus of feature discovery, you cannot forget about the availability of the foundational technology. I am talking about the scalability, capacity, and robust programming around the actual language tool or software. Scalability is directly affected by how many assessments the system can administer without any lag or failure at one time.
Some organizations need the software to scale rapidly for mass testing. Even if you don’t have this need, you need to know the vendor’s system can handle it, because other customers on the system may tax the infrastructure and cause problems for you. Some businesses use powerful third-party tools and partners to ensure scalability (e.g., IBM Watson, AWS).
Knowing about the bandwidth requirements for the various assessments is also important. If you have staff or students in remote parts of the world, then bandwidth becomes a big challenge. Learn about what requirements exist for technology and bandwidths. Know what hardware and software requirements are needed for the users to have a positive experience.
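If a vendor tells you the audio bitrate and typical recording length for a speaking test, you can do a quick back-of-the-envelope check yourself. The sketch below is only arithmetic under assumed numbers; the 64 kbps bitrate, 60-second recording, and 0.5 Mbps uplink are placeholders I chose for illustration, not figures from any real product.

```python
def recording_size_megabytes(duration_seconds: float, bitrate_kbps: float) -> float:
    """Approximate size of an audio recording in megabytes (1 MB = 1,000,000 bytes)."""
    bits = duration_seconds * bitrate_kbps * 1000
    return bits / 8 / 1_000_000


def upload_time_seconds(size_megabytes: float, uplink_mbps: float) -> float:
    """Approximate upload time, ignoring latency and protocol overhead."""
    return size_megabytes * 8 / uplink_mbps


if __name__ == "__main__":
    # Assumed values: a 60-second response recorded at 64 kbps,
    # uploaded over a 0.5 Mbps connection.
    size = recording_size_megabytes(duration_seconds=60, bitrate_kbps=64)
    seconds = upload_time_seconds(size, uplink_mbps=0.5)
    print(f"{size:.2f} MB per response, about {seconds:.1f} s to upload")
```

Multiplying the per-response figure by the number of responses per test, and by the number of simultaneous test-takers at one site, gives a rough sense of whether a remote location can support the assessment.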
With language testing, there are two key security concerns. The first is cheating. Obviously, there is always potential for cheating, but you will want to know how to mitigate or eliminate those risks. More importantly, how does the software help you with those risks?
The second key security concern is data: both yours and your users’. Make sure servers and services are aligned with current industry benchmarks and standards. Companies must apply and prove rigorous standards and practices to get key certifications in security for their business and software. You need to make sure your data is as secure as possible.
On occasion, there are special cases of language learning or testing that warrant a customized test bank. English for Special Purposes is often a term used for such a case. While a basic understanding of English is usually required, there are times when advanced English understanding or English vocabulary specific to a discipline is needed.
While I ordered this one last, it is by no means inconsequential. While the experience needs to be positive for your users, you need to have administrator rights that allow you simple, fast access to what is important to you.
Do you want to make customizations? Create groups? Tailor the system in a specific direction? What rights and privileges do you want to control for your other admins and users? How complex or simple is the administration interface? If you cannot get access to the language testing results comfortably and fluidly, you need to consider another system.
While this may not be a complete set of feature considerations, I think it will give you a great start. Also, I hope you realize there is more to consider than just the software or the interface alone. A great experience comes not only from great coding, but also from great practices and systems thinking.
Emmersion certifies language ability for organizations around the world using a fully automated and adaptive language assessment engine. It’s revolutionizing the language testing process with instant, accurate scoring for speaking, grammar, and writing ability in 9 global languages and counting. With a scalable assessment solution they can count on, hundreds of global businesses are building successful teams, reducing turnover, and improving their customer satisfaction scores. Learn more at www.Emmersion.ai.