By Will Pearson
[Editor’s Note: Will sent me the following in an email. I thought it interesting enough to put up as a blog post. Other than a few grammatical repairs I handled, this post is entirely by the soon-to-be Doctor Pearson.]
I’ve been doing some thinking following on from the discussions around whether commercial or open source AT is better. One thought I came up with
is quite interesting. If the goal is a situation where someone can just pick something up and use it, without anyone having to do any work
to enable that person to use that thing, then we’re going about things the wrong way at the moment. The way we think about accessibility is wrong,
the way we achieve accessibility is wrong, and largely the people who are responsible for accessibility are doing the wrong things to achieve it.
You can think of commercial AT vendors and accessibility consultancies as consumers of accessibility problems. They take accessibility problems and turn
them into cash for themselves. They do this by providing solutions to those accessibility problems; either solutions to end users in the case of AT vendors
or solutions to people who produce things in the case of accessibility consultancies. So, both AT vendors and accessibility consultancies need accessibility
problems to exist in order to make money and stay in business.
I think this need for accessibility problems to exist has led to us thinking about accessibility in the wrong way. Accessibility is seen as making a particular
product or feature accessible; I think this is a fundamental mistake but one that suits the AT vendors and accessibility consultancies quite well.
If problems are only solved within a particular context, say
a particular software package, then vendors can go and solve the same problem across multiple contexts, making money each time they solve it.
The alternative to a contextual way of thinking is a context-independent one. Instead of solving a problem in a particular context, say a particular software
package, you create a generalisable solution that solves the problem regardless of use case. This general solution can then be applied to all
instances of that problem instead of just solving it within a specific area. This is better for users but worse for AT vendors and accessibility
consultancies, as it gives them fewer opportunities to make money.
This need for a context-independent solution means that the solution has to live in something that spans different contexts. It can’t be placed
in a particular piece of software, as that’s just a single context; instead it needs to be placed in assistive technologies that operate across different
pieces of software. So, I believe that asking application and web developers to make things accessible is wrong, and that we should instead be asking AT
vendors to make things accessible using generalisable solutions if we want to make the most things accessible.
While I agree with Will’s statements, I also believe that the generic solutions he proposes may be the “Holy Grail” of screen reading. So many applications out there require some bit diddling to get them to speak properly but, unlike Will, I’m not too strong in artificial intelligence or notions like synthesized vision so I’ll defer to his expertise on this matter.
When I get the chance, I’ll write full length pieces about two newly introduced technologies to the world of people with vision impairment. The first is the User Centric Licensing scheme available in the recent releases of MSS and MSP from Code Factory. I would like all software vendors to move to such a solution. Second, the Victor Reader Stream from Humanware is really fricking cool.
Will’s note came as an HTML email; I did a Select All, Copy, and pasted it into Word. For no reason apparent to me, after pasting the text, a whole lot of hard line breaks showed up in Word, making its grammar checker think the text had a lot of sentence fragments and letters that should be capitalized. Does anyone know how to keep this from happening in the future? I hate fixing such things one line at a time in an editor.
Finally, if you, like me, use a lot of batteries (in my case, an Olympus DS 50, my Sony 4-track cassette player and a few other odds and ends), the new Duracell rechargeables are not just better for the environment; they charge really quickly and, if you have two sets, you should enjoy thousands of hours of use before they fade away. As mine still work without having been replaced, I cannot speak to their life cycle, but the fast recharges and long use periods make them really nice.
2 thoughts on “Thinking About Accessibility”
Will makes some assertions in his piece that I find totally unjustified. Firstly, the AT vendors do have generic solutions: if application developers implement MSAA or UIA correctly, Jaws and/or Window Eyes will read that app, with only usability/design issues popping up as the app gets read. The attempt has been made to provide this generic solution in the form of programmatic APIs, but their uptake in the developer community is hit and miss, particularly where custom controls are used. He totally ignores the enormous technical challenges in supplying a real-world, efficient, and reliable solution.
I think the biggest problem screen readers face is mapping the inherently visual nature of Windows, and whatever Microsoft may dream up for a user interface in the future (tablet? 3D?), into something that is intuitive and efficient for blind users. A context-free screen reader would have this task, and there just aren’t programmatic APIs or tools that will accomplish it. The issue with using artificial intelligence is that AI is largely probabilistic and thus will have unreliable results; just take OCR as a reference. I certainly would not want to work in an environment in which 10% of my world consisted of errors.
At this time, if we look upon the landscape of screen readers, the solution that has worked in practice is the context-sensitive approach. The generic approaches have all had to special-case in one way or another to make some funky UI read correctly. Unless you can get Microsoft to agree to rewrite a large portion of their controls and UIs to conform to some generic standard that facilitates a screen reader’s use of that UI, and to deprecate all the old control/custom-control APIs, I think Jaws and Window Eyes will live on and continue charging the same rates for their software.
I welcome your attempt to defend the screen reader vendors; however, I have to disagree with you.
True, MSAA, UIA, IAccessible2, and the other programmatic APIs are generic solutions in theory. As you mentioned, the problem is that they are not generic or generalisable solutions in practice.
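The theory-versus-practice gap can be made concrete with a small sketch. This is not how MSAA or UIA actually work internally (those are COM interfaces, not Python objects); every class, role, and control name below is invented purely to illustrate why a programmatic accessibility API is only as generic as each application's implementation of it:

```python
# Hypothetical model of an accessibility tree: each node exposes a role
# and a name, which is roughly the information MSAA/UIA-style APIs hand
# to a screen reader. A standard toolkit control fills these in for free;
# a custom-drawn control that ignores the API exposes nothing, leaving a
# "silent" node the screen reader cannot announce.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AccessibleNode:
    role: str                      # e.g. "window", "button", "client"
    name: Optional[str] = None     # label exposed to assistive technology
    children: List["AccessibleNode"] = field(default_factory=list)


def silent_nodes(node: AccessibleNode, path: str = "") -> List[str]:
    """Return the paths of nodes a screen reader would have nothing to say about."""
    here = f"{path}/{node.role}"
    found = [here] if node.name is None else []
    for child in node.children:
        found += silent_nodes(child, here)
    return found


# The toolkit button implements the API; the custom-drawn control does not.
window = AccessibleNode("window", "Settings", [
    AccessibleNode("button", "OK"),
    AccessibleNode("client", None),   # custom control: no name exposed
])

print(silent_nodes(window))  # prints ['/window/client']
```

The API itself is perfectly generic; the coverage is not, because every application developer independently decides whether their custom controls participate. That is the sense in which these solutions fail to be generalisable in practice.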
I agree that AI is largely probabilistic, but then so is human behaviour. It’s this probabilistic nature that allows humans to transfer learning and knowledge between different contexts. Without it, people would be restricted to applying knowledge only within the context in which they learnt it, which obviously isn’t the case. For example, people can read a far wider body of writing than the simplistic storybooks they learnt to read from, and they can read a wide variety of scripts and typefaces they haven’t encountered before. They cannot always be certain that they are reading correctly, but by using heuristics and other cognitive processes they can achieve a very high probability that their interpretation is correct. So, the probabilistic nature of human behaviour is exactly what allows behaviour to span contexts: the knowledge people carry often gives a high probability that their recognition of objects is correct.
OCR tends to provide a weak reference point when discussing computer vision. OCR doesn’t really mimic many of the processes found in vision and perception, so typical OCR implementations do have the low accuracy that you point out; however, when OCR systems do mimic the ways in which we currently think vision and perception work, you can get accuracy rates of 98% and above.
One assumption that you seem to make is that speech is the only method of achieving accessibility. If that is the case, then I don’t believe you fully understand what accessibility is in the context of computing. Computers are conceptually little black boxes that perform state transitions; think of concepts from the theory of computation such as finite state machines (FSMs), pushdown automata (PDAs), and Turing machines. To make computers useful to people, there needs to be a communications mechanism that allows a person to tell the computer which state transitions they want performed, and that lets the computer communicate the results of those transitions back to the user. The part of the system responsible for this communication is the user interface. So, the user interface is just a communications channel, and speech is not the only non-visual communications channel in existence.
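Will’s black-box view can be sketched in a few lines. The example below is a toy finite state machine (a hypothetical turnstile, not anything from the post): the transition table is the “computer”, and the input and output channels that select transitions and report results are the user interface. Nothing in the machine cares whether those channels are visual, spoken, or braille:

```python
# Toy FSM: the "computer" is nothing but this transition table.
TRANSITIONS = {
    ("locked", "coin"): "unlocked",
    ("unlocked", "push"): "locked",
}


def step(state: str, event: str) -> str:
    """Perform one state transition; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = "locked"
for event in ["coin", "push", "push"]:
    state = step(state, event)
    # Output channel: report the result. Swapping print() for a speech or
    # braille call changes the channel, not the machine.
    print(f"{event} -> {state}")
```

The machine is identical no matter how `event` arrives or how `state` is reported, which is exactly Will’s point: accessibility is a property of the communications channel, not of the computation.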
Your argument also seems to fail to account for both of the key functions that a screen reader performs. You have focused on the communicative aspect of screen readers, but you fail to account for simulating selective attention. At present the simulation that screen readers employ is in no way generic, but it is relatively easy to create a generic simulation, or to allow users to apply their own selective-attention mechanisms without the need for simulation.