OK, I get your point, but transcribing e.e. cummings poems is kind of a corner case for voice recognition, no?
What Siri is going to test is whether the 80/20 rule applies to a voice recognition based personal assistant. By constraining it to assistant-type tasks, and with what seems to be the most intelligent design we've seen yet, it has a better shot of achieving it than anything else out there. Previous entrants have been tripped up by the lack of a decent UX, failure to integrate with other data sources, or any number of shortcomings in the long chain from microphone to software back to speaker. Apple is the first company to have such precise control over every component in that chain. (For instance, I wouldn't be surprised if part of the reason Siri is 4S-only is because hardware has been added for smarter noise or echo cancellation, or other subtle design changes.) And the fact that it can learn from the internet, and presumably submit its training data back to the cloud, means that maybe it will eventually be able to handle not just 80%, but 90% or 99% of total inputs.
What Siri is going to test is whether the 80/20 rule applies to a voice recognition based personal assistant. By constraining it to assistant-type tasks, and with what seems to be the most intelligent design we've seen yet, it has a better shot of achieving it than anything else out there. Previous entrants have been tripped up by the lack of a decent UX, failure to integrate with other data sources, or any number of shortcomings in the long chain from microphone to software back to speaker. Apple is the first company to have such precise control over every component in that chain. (For instance, I wouldn't be surprised if part of the reason Siri is 4S-only is because hardware has been added for smarter noise or echo cancellation, or other subtle design changes.) And the fact that it can learn from the internet, and presumably submit its training data back to the cloud, means that maybe it will eventually be able to handle not just 80%, but 90% or 99% of total inputs.