Australia's national curriculum authority, ACARA, raised eyebrows last week when it revealed plans to have computers mark the written parts of NAPLAN tests from 2017.
NAPLAN is the National Assessment Program - Literacy and Numeracy, which tests students in Years 3, 5, 7 and 9 every year.
ACARA general manager Dr Stanley Rabinowitz said a technology platform was yet to be decided.
However, it is envisioned that the chosen technology will take a sample of 1000 tests marked by a human teacher and "learn a criteria for spotting the score points in each paper", iTnews reports, then use what it has learned to mark the remaining papers.
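To make the idea concrete, the kind of supervised approach being described might look roughly like the sketch below, which fits a simple model to human-marked scripts and then predicts scores for unmarked ones. The choice of features (TF-IDF), the model (ridge regression) and all of the example data are assumptions for illustration only - ACARA has not said what technology it will use.

```python
# Rough sketch of the supervised approach described above: fit a model to a
# sample of human-marked scripts, then score the remaining ones automatically.
# TF-IDF features and ridge regression are illustrative assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical human-marked sample (in ACARA's plan, about 1000 scripts).
marked_essays = [
    "The dog ran quickly across the park to find its owner.",
    "dog run park fast",
    "On a cold winter morning, the explorers set out across the ice.",
]
human_scores = [4.0, 1.0, 5.0]

# "Learn the criteria" from the marked sample.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(marked_essays, human_scores)

# Apply the learned criteria to the unmarked scripts.
unmarked = ["The explorers ran across the park on a cold morning."]
print(model.predict(unmarked))
```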
This is where it gets tricky. To achieve its vision, ACARA will need a platform armed with powerful artificial intelligence to be ready within just two years.
It's possible that vendor platforms such as IBM's Watson could be ready to accept the challenge - but only because so little is known about how they work or about their development roadmaps.
Judged against what is known about the state of artificial intelligence and cognitive computing today, ACARA's deadline looks far more challenging.
"Artificial intelligence has been saying, 'It's around the corner' since the 1950s," Professor David Powers tells Information Age.
"The state of the art is still a long way - I would say decades - from being able to be a competitor for human cognitive capabilities."
Powers directs the Knowledge and Interaction Technologies Centre in Flinders University's School of Computer Science, Engineering and Maths, where he specialises in artificial intelligence and cognitive science.
The Centre has also spun off a company, Clevertar, which produces an AI tool for aged care.
Pushing AI's limits
One of the major objections teaching bodies have raised to ACARA's plan is whether a computer can recognise and reward creativity in children's written responses.
Powers is doubtful that this is possible, even with another two years of research and development on top of current artificial intelligence technology.
"Can it even recognise grammatical versus ungrammatical sentences? I don't know of any computer system that can," he said.
"We make mistakes. If you're looking at an email or an essay that hasn't had 10 passes of proofreading, it's going to have errors in it that make it ungrammatical, but most people will still be able to read past the spelling errors, missing words, duplicated words, or 'false friends' where the wrong spelling actually makes a totally different word."
Language recognition is further complicated by ambiguity, dialect and linguistic variation, according to Powers.
"If you train [a machine] on essays from a class one year, then next year there's going to be a totally different class with their own different styles," he said.
Children also tended to develop their own idiolect - a unique use of language. The challenge for any computer marking children's work would be to recognise shifts in language use as well as a human can.
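The drift Powers describes is easy to reproduce in miniature. In the sketch below (all data invented for illustration), a vectoriser fitted to one cohort's essays simply has no representation for the next cohort's vocabulary, so their new constructions carry no weight at all:

```python
# Toy illustration of the cohort-drift problem: a model fitted to one year's
# essays has never seen the next year's vocabulary. All data is invented.
from sklearn.feature_extraction.text import TfidfVectorizer

vectoriser = TfidfVectorizer()
year_one_essays = [
    "the match was epic and the crowd went wild",
    "our excursion to the museum was boring",
]
vectoriser.fit(year_one_essays)

# A new cohort's slang and constructions fall outside the learned vocabulary,
# so this essay produces an all-zero feature vector - the model "sees" nothing.
year_two_essay = ["such a banger of a goal, refereeing felt sus though"]
features = vectoriser.transform(year_two_essay)
print(features.nnz)  # 0: none of these words were in the training vocabulary
```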
"We [as humans] pick up in a very dynamic way how language is changing, how people are using it," Powers said.
"Even the first time we see a construction in an essay that we don't recognise, we note it and we learn a bit about context and what it might mean. Then if we see it again and again, by the time we see it three times it's clear what it is."