[jawsscripts] Re: jaws14 public Beta2, enhancements in Scripting, +Convenient OCR usage.

  • From: Don Marang <donald.marang@xxxxxxxxx>
  • To: jawsscripts@xxxxxxxxxxxxx
  • Date: Fri, 19 Oct 2012 06:43:53 -0400

I have slight vision in my right eye.  From 10 feet away, I can tell
where the TV is and whether one or two people are on screen.  If I
put my nose up against the computer screen, I may be able to read one
word at a time after focusing for a minute or more.  Unfortunately, this
is no help with color schemes, since I am also color blind.
I mostly depend on JAWS to tell me basic colors.  Once in a while, I am
able to ask my daughter.

Most of my experience with color combinations comes from attempting to
get good results from screen snapshots in Vinux with my speedy-ocr
program.  It needs good contrast and does not do well with inverse
video.  I want to find a way to tell whether inverse colors are being
used, so I can reverse them again before performing OCR.  More likely, I
will run the OCR both ways and pick the more accurate result.  First, I
need a way to determine automatically which is more accurate, possibly
by getting confidence information from the OCR engine.
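The run-it-both-ways idea can be sketched as follows.  This is a minimal sketch, not working speedy-ocr code: it assumes an engine that reports per-word confidences (tesseract's TSV output does), and the helper names and sample word lists below are hypothetical stand-ins for the real engine calls.

```python
# Sketch: OCR a snapshot both normally and color-inverted, then keep
# whichever pass the engine itself scored higher.  The (word, confidence)
# pairs stand in for what an engine like tesseract reports per word.

def mean_confidence(words):
    """Average the per-word confidence scores of one OCR pass."""
    if not words:
        return 0.0
    return sum(conf for _, conf in words) / len(words)

def pick_best(normal_pass, inverted_pass):
    """Return the word list from whichever pass scored higher overall."""
    if mean_confidence(inverted_pass) > mean_confidence(normal_pass):
        return inverted_pass
    return normal_pass

if __name__ == "__main__":
    # Hypothetical results: inverse video gives poor scores until re-inverted.
    normal = [("Instali", 41.0), ("Now", 55.0)]
    inverted = [("Install", 92.0), ("Now", 96.0)]
    best = pick_best(normal, inverted)
    print(" ".join(word for word, _ in best))
```

A mean is the simplest tie-breaker; a real utility might also weight by word length or discard words below a confidence floor before comparing.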

I have yet to find an accessible light-text-on-dark-background color
scheme for my website suitable for low-vision users.  Nor have I found
good colors for visited, hover, and active links.
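The snapshot preprocessing steps described in the quoted thread below (convert to greyscale, adjust thresholds, enlarge) can be illustrated with a toy sketch.  In practice one would hand this to ImageMagick or a similar tool as described there; this stdlib-only version just shows the per-pixel operations on a nested list of (R, G, B) tuples, and the sample pixel values are made up for illustration.

```python
# Toy screen-snapshot preprocessing: greyscale -> threshold -> enlarge.
# A real pipeline would use ImageMagick; this only demonstrates the math.

def to_greyscale(image):
    """Luminosity-weighted greyscale conversion of (R, G, B) pixels."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]

def threshold(grey, cutoff=128):
    """Binarize: dark pixels become 0 (ink), light ones 255 (paper)."""
    return [[0 if px < cutoff else 255 for px in row] for row in grey]

def enlarge(image, factor=2):
    """Nearest-neighbour upscale, so small fonts land in the size range
    where OCR engines tend to work best."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend([wide] * factor)
    return out

if __name__ == "__main__":
    snapshot = [[(30, 30, 30), (240, 240, 240)],
                [(250, 250, 250), (20, 20, 20)]]
    ready = enlarge(threshold(to_greyscale(snapshot)))
    for row in ready:
        print(row)
```

Thresholding is also where inverse video shows up: if most pixels come out as ink rather than paper, the image is probably light-on-dark and worth inverting before OCR.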

*Don Marang*
Vinux Package Development Coordinator - vinuxproject.org
<http://www.vinuxproject.org/>


On 10/18/2012 8:49 PM, Geoff Chapman wrote:
> Thanks so much for this response, Don, and apologies for the lack of prior 
> acknowledgement of your efforts in replying to my queries on this.
>
> Out of interest, may I ask whether you have any usable vision at all, 
> which might, for example, aid you in trialing different color schemes 
> to determine a more optimal contrast for better onscreen Convenient 
> OCR results in the JAWS implementation?
>
> Perhaps I should take this discussion with you off-list, but I think many of 
> us would be grateful for any knowledge that might help us make better use 
> of this Convenient OCR feature of JAWS when we next get stuck with either 
> individual graphics/controls, or perhaps entire screens, and want to know 
> where/how best to configure things in order to glean optimal results.
>
> Did you yourself physically alter color schemes and obtain varied results 
> using the JAWS Convenient OCR feature with its built-in engine?
>
> ----- Original Message ----- 
> From: "Don Marang" <donald.marang@xxxxxxxxx>
> To: <jawsscripts@xxxxxxxxxxxxx>
> Sent: Friday, October 12, 2012 2:57 AM
> Subject: [jawsscripts] Re: jaws14 public Beta2, enhancements in Scripting, 
> +Convenient OCR usage.
>
>
>> In Windows and VMware, I mostly did simple things to get the best
>> possible results.  This mostly meant maximizing the Windows
>> resolution, removing most status bars / toolbars / side panes, and going
>> full screen in VMware.  I had not gotten to changing the contrast of the
>> screen snapshot in Windows.  I quickly realized that things like the color
>> choices and contrast of the install screens made a big difference in the
>> accuracy of the results.  Sometimes, no results were possible.
>> You would think that with the exact same input image, the OCR results
>> would be identical.  I do not think most people realize the amount of
>> image preprocessing, and the many algorithms, applied to turn an image
>> into a screen of characters, numbers, punctuation, lines, and
>> graphics.  I think that the OmniPage code used by the FS Convenient
>> OCR runs a competition among a few different engines or algorithms.
>> Perhaps the variation in results is due to different algorithms winning
>> out each time.
>>
>> I ended up moving my development of OCR to Vinux because it is so much
>> easier to put different tools together, like building blocks.  In my
>> case, engines like cuneiform and tesseract with Image Magick for image
>> preprocessing.  The obvious desired image manipulations for a screen are
>> convert to greyscale, adjust thresholds, convert to TIFF, and enlarge to
>> get a higher resolution and font size in the range where the OCR engine
>> works best.  Other preprocessing steps may be desired for images from a
>> scanner as well, such as page splitting, rotation, auto-alignment,
>> deskewing, despeckling, page-layout analysis, and sharpening.  It is
>> hard to come up with effective preprocessing for a general-case
>> utility.  On the tesseract mailing list I have seen much complex
>> preprocessing done to get reasonable results in some very specific and
>> difficult situations, like reading tombstones, gas meters, LED
>> displays, handwriting, and license plates.  The newest version of
>> tesseract on Google Code can now do some image manipulation within the
>> OCR itself.  I have even started to see OCR programs iterate over some
>> of the different possible thresholds and other image factors to
>> optimize from a sample before performing the OCR on the entire document.
>>
>>
>> By the way, the XML location information that Convenient OCR can
>> provide is probably in the standardized hOCR format.  It is HTML with
>> additional markup that provides location coordinates for each block of
>> text found.  I think the OCR engine determines how to split up the
>> blocks.  I believe tesseract would define a block as an unbroken line
>> of text.
>>
>> *Don Marang*
>> Vinux Package Development Coordinator - vinuxproject.org
>> <http://www.vinuxproject.org/>
>>
>>
>> On 10/10/2012 8:11 PM, Geoff Chapman wrote:
>>> Hi Don/scripters.
>>>
>>> So, Don, now I'm curious!
>>> When you mentioned contrast below, are you intimating that you were
>>> able, or had, to do something to the screen's foreground-to-background
>>> color contrast to get better OCR results?
>>>
>>> Did you, or anyone, ever trial it several times on the same
>>> screen/window and obtain differing results each time?  This is what
>>> disturbed me about it when I tried it on a bitmap graphic where I had a
>>> sighted person with me to tell me it was there, but JAWS wasn't seeing
>>> it.  The first time I tried it on this screen, it did render an OCR
>>> version of the text on this bitmap graphic, but then I lost it somehow
>>> by reverting to the PC cursor or something.  When I tried it a couple
>>> more times, even after screen OSM refreshes etc., I wasn't able to get
>>> the OCR process to render it the same way again!
>>>
>>> I know the thing isn't gonna be super good in all
>>> circumstances, depending on the contrast/colors of the stuff it's
>>> trying to extract.  But I did think it wasn't too much to expect that
>>> it might render the same OCR results, on the same screen, with no
>>> changes, dealing with those same colors, no matter how many times I
>>> invoked it.  That didn't seem to be my experience.
>>> Would you or anyone else concur, Don?
>>>
>>> The other thing I'd really like to see enhanced with this Convenient
>>> OCR feature would be some kind of tighter integration between text
>>> grabbed through the OCR function and determining whether any bits of
>>> that text are in close enough proximity to an actual unlabelled but
>>> identifiable graphic in the OSM, which one could then label with the
>>> OCR text.  That might give another bite at the cherry for locating,
>>> and navigating to, a now-labelled graphic, with all the normal
>>> methods.
>>> Although, hmmm, I can see a flaw with that, given that quite often you'll
>>> get a whole row of exactly the same graphic number, which all do very
>>> different things, which of course would severely limit the usefulness of
>>> such an idea.
>>> hmmm.
>>>
>>> It might still be handy, though, where those graphics have different
>>> numbers.  Maybe further tests would need to be done in the background
>>> to try to determine that.
>>> Sounds tricky.
>>>
>>> Anyway, I don't think any such integration with graphics labelling is
>>> available at present, is it?  I must admit it's been a while since
>>> I've tested this, as I only have an SMA count up to 12.
>>>
>>> So at present, until Jim builds this funky utility he proposes, <grin>
>>> is it true that the user has no way of marking such a spot onscreen,
>>> even if they have identified it as a useful one through OCR?
>>> It's just a one-shot deal at present, right?
>>>
>>> Geoff C.
>>>
>>>
>>> ----- Original Message ----- 
>>> From: "Don Marang" <donald.marang@xxxxxxxxx>
>>> To: <jawsscripts@xxxxxxxxxxxxx>
>>> Sent: Thursday, October 11, 2012 8:36 AM
>>> Subject: [jawsscripts] Re: jaws14 public Beta2, enhancements in 
>>> Scripting.
>>>
>>>
>>>> I find it is also useful with virtualization software, like
>>>> VMware, on some guest OS installations or when speech is lost.  With
>>>> luck on guest resolution and contrast, it can normally provide the
>>>> text of the buttons on the guest screen.  Not only that, you can also
>>>> click on them with NumPad Slash.
>>>> I had attempted something less comprehensive, which called the same
>>>> OmniPage COM modules used by Convenient OCR.  My JAWS script would
>>>> take a screen snapshot, perform OCR using the COM module, then
>>>> display the text.  It required that the scanning software of MS
>>>> Office 2003 or 2007 be installed.  It did not allow you to interact
>>>> with the results the way the FS Convenient OCR feature does.
>>>>
>>>> *Don Marang*
>>>> Vinux Package Development Coordinator - vinuxproject.org
>>>> <http://www.vinuxproject.org/>
>>>>
>>>>
>>>> On 10/10/2012 2:24 PM, Soronel Haetir wrote:
>>>>> Convenient OCR is not meant to be a super-accurate mechanism.  Its
>>>>> main use is for things like setup programs that label their buttons
>>>>> with fancy text graphics, and the like.  Places where, before, we
>>>>> would often have to go grab someone and ask, "Now what does this say?"
>>>>>
>>>>> It's not meant to be used for things like PDF conversion, nor is it
>>>>> any good at tasks like that.  But for things like the aforementioned
>>>>> setup programs, Convenient OCR is in fact a great advance.
>>>>>
>>>>> On 10/9/12, Jim Snowbarger <Snowman@xxxxxxxxxxxxxxxx> wrote:
>>>>>> I think what you have is the text, the location at which it was
>>>>>> found, its attributes, and its colors.
>>>>>> But you don't know what kind of window it resided in, or what
>>>>>> control it was associated with, unless you go look.
>>>>>> You can do all that with existing technology, except for the OCR
>>>>>> portion.
>>>>>> I don't know about you guys, but I have not had much luck with
>>>>>> inconvenient OCR.  Usually, it makes lots of mistakes, or misses
>>>>>> text altogether.
>>>>>>
>>>>>> But, assuming the best about that, one wonders whether you could
>>>>>> turn it loose on your favorite bit-mapped wizzy application and go
>>>>>> drink a cup of coffee.  Then, when you came back, you might be able
>>>>>> to use the OCR-rendered text to perform a single mouse click, at
>>>>>> which point you would repeat the process until desired results were
>>>>>> achieved, or coffee overload, whichever came first.
>>>>>> Well, OK, maybe that might interfere with the creative process.
>>>>>>
>>>>>> Seriously though, how about an enhancement to something like the
>>>>>> toolTip scanner in HotSpotClicker, which builds a list of fired
>>>>>> toolTips, and the locations at which they were detected, as a
>>>>>> result of a painfully slow mouse scan?
>>>>>> This guy would use the .xml as a database of OCR-rendered text
>>>>>> chunks: what they said, and where they were.  And you could take
>>>>>> the JAWS cursor to each of them in turn, click there, and decide if
>>>>>> anything useful happened.  If it did, then it might be a place you
>>>>>> would want to save, assign a name to, and use in the proper
>>>>>> context.  It would help something like HSC deal with totally
>>>>>> graphical apps that do not support toolTips.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Bissett, Tom" <tom.bissett@xxxxxxx>
>>>>>> To: <jawsscripts@xxxxxxxxxxxxx>
>>>>>> Sent: Tuesday, October 09, 2012 8:57 AM
>>>>>> Subject: [jawsscripts] Re: jaws14 public Beta2, enhancements in
>>>>>> Scripting.
>>>>>>
>>>>>>
>>>>>> Hi,  I am thinking that this could be used for finding text and
>>>>>> trigger points on the screen.  My guess is this is what they use
>>>>>> for Convenient OCR, and now they have made it available to us.  We
>>>>>> just need to figure out how to parse it and implement what we want
>>>>>> to achieve with it.
>>>>>> Could the string returned be used to display its content in the
>>>>>> Results Viewer, and from there we could activate buttons etc.?
>>>>>> I haven't taken the time to play with it, so I am just throwing
>>>>>> some thoughts out there.
>>>>>> Tom Bissett
>>>>>> -----Original Message-----
>>>>>> From: jawsscripts-bounce@xxxxxxxxxxxxx
>>>>>> [mailto:jawsscripts-bounce@xxxxxxxxxxxxx] On Behalf Of Stefan Moisei
>>>>>> Sent: October 7, 2012 4:02 AM
>>>>>> To: jawsscripts@xxxxxxxxxxxxx
>>>>>> Subject: [jawsscripts] Re: jaws14 public Beta2, enhancements in
>>>>>> Scripting.
>>>>>>
>>>>>> I don't know FS's intended use for this function, but I know how I
>>>>>> might use it.
>>>>>> The screen XML output includes OCR text, if present.  It is marked
>>>>>> as OCR in the XML.  One can copy it; I guess one can also trigger
>>>>>> OCR on a specific event and get the new text with this.
>>>>>> -----Original Message-----
>>>>>> From: Jim Snowbarger
>>>>>> Sent: Sunday, October 07, 2012 4:55 AM
>>>>>> To: jawsscripts@xxxxxxxxxxxxx
>>>>>> Subject: [jawsscripts] Re: jaws14 public Beta2, enhancements in
>>>>>> Scripting.
>>>>>>
>>>>>> Very interesting.  The next thing will be how to make use of this
>>>>>> .xml information.
>>>>>> Basically, this is our equivalent of print screen.
>>>>>> If I understand, from this you can get all text, including text
>>>>>> color and attributes.  You don't get graphics, and don't know
>>>>>> anything about window boundaries.
>>>>>>
>>>>>> What is actually the intended purpose of this?  Anybody know?
>>>>>>
>>>>>> One idea that comes to mind is that, if a client wanted to pass you
>>>>>> a screen shot, they could capture this .xml information and send it
>>>>>> to you.  You, as a script developer, would have some clever tool
>>>>>> that decodes the XML gibberish and renders it in some familiar
>>>>>> form: a web page, or a virtual viewer, or something like that.
>>>>>> Does something already exist?  Or do I feel a utility coming on?
>>>>>> Sounds like a nice addition to my JLS_utilities, and/or Doug's BX.
>>>>>>
>>>>>> JLS actually has a means of collecting the window structure from a
>>>>>> client's machine, including window attributes such as boundary
>>>>>> coordinates, style, type, class, etc., and even the text in each
>>>>>> window, done of course only under the control of the client.  But
>>>>>> they basically create a big data file that they send to me.  I put
>>>>>> JLS in a special mode where all the window navigation functions,
>>>>>> GetFocus, GetPriorWindow, GetNextWindow, etc., consult the data
>>>>>> file rather than my own system.
>>>>>> The missing piece was color and attributes for the text.
>>>>>> This would supplement that.
>>>>>>
>>>>>> Of course, we still don't have pixel colors in non-textual areas.
>>>>>> But I'm getting the disturbing impression that precious information
>>>>>> like that may be becoming harder to obtain.  <sigh>
>>>>>>
>>>>>> Anyway, back to the original question: does anybody know the
>>>>>> intended means of interfacing with the XML data?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Geoff Chapman" <gch@xxxxxxxxxxxxxxxx>
>>>>>> To: <jawsscripts@xxxxxxxxxxxxx>
>>>>>> Sent: Friday, October 05, 2012 2:57 AM
>>>>>> Subject: [jawsscripts] jaws14 public Beta2, enhancements in Scripting.
>>>>>>
>>>>>>
>>>>>> JAWS 14 includes a new scripting function for obtaining screen
>>>>>> content in XML format.  For more information on using this new
>>>>>> function, see:
>>>>>> http://www.freedomscientific.com/documentation/scripts/scripting-info.asp
>>>>>>
>>>>>> Short URL:
>>>>>>
>>>>>> http://bit.ly/RFtrUd
>>>>>> Geoff C.
>>>>>>


__________

View the list's information and change your settings at 
http://www.freelists.org/list/jawsscripts
