05 Apr SOLIDWORKS Inspection Tips: How to get the best out of the Optical Character Recognition (OCR)
SOLIDWORKS Inspection is an invaluable tool that creates error-proof inspection reports and ballooned drawings in no time.
To make things even easier, SOLIDWORKS Inspection comes with an integrated add-in to leverage existing SOLIDWORKS Drawing files and a standalone application to work with your company’s PDF and TIFF files.
While the SOLIDWORKS Inspection add-in can extract all the information needed to create your reports in just a couple of clicks, when working with a PDF document the process is not as automated since the software is working with a simple image.
That’s where the OCR comes in handy. By simply selecting the information to add to your inspection reports, the software automatically takes a picture and extracts or “reads” the information from that picture.
It can recognize dimensions, notes, GD&T, and more, and it can even identify the type of dimension (linear, radius, diameter) as well as the nominal value and tolerances.
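To make the idea of "understanding" a characteristic more concrete, here is a minimal Python sketch of how a string returned by an OCR engine could be split into a dimension type, a nominal value and tolerances. It is only an illustration: the `Characteristic` class, the `parse_characteristic` function and the regular expression are hypothetical names and assumptions made for this example, not the way SOLIDWORKS Inspection works internally.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Characteristic:
    kind: str                          # "linear", "radius" or "diameter"
    nominal: float
    plus_tol: Optional[float] = None   # upper tolerance, stored as a magnitude
    minus_tol: Optional[float] = None  # lower tolerance, stored as a magnitude


# Assumed text layout: optional R or diameter prefix, a nominal value, then an
# optional symmetric (±0.1) or bilateral (+0.2/-0.1) tolerance.
_PATTERN = re.compile(
    r"^\s*(?P<prefix>R|Ø|⌀)?\s*"
    r"(?P<nominal>\d+(?:\.\d+)?)\s*"
    r"(?:±\s*(?P<sym>\d+(?:\.\d+)?)"
    r"|\+\s*(?P<plus>\d+(?:\.\d+)?)\s*/?\s*-\s*(?P<minus>\d+(?:\.\d+)?))?\s*$"
)


def parse_characteristic(text: str) -> Optional[Characteristic]:
    """Return the parsed characteristic, or None if the text is not a dimension."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    prefix = match.group("prefix")
    if prefix == "R":
        kind = "radius"
    elif prefix is not None:
        kind = "diameter"
    else:
        kind = "linear"
    nominal = float(match.group("nominal"))
    if match.group("sym") is not None:
        tol = float(match.group("sym"))
        return Characteristic(kind, nominal, tol, tol)
    if match.group("plus") is not None:
        return Characteristic(
            kind, nominal, float(match.group("plus")), float(match.group("minus"))
        )
    return Characteristic(kind, nominal)
```

For example, `parse_characteristic("Ø12.5 ±0.1")` yields a diameter with a nominal of 12.5 and a tolerance of 0.1 on each side, while a note such as `"NOTE 3"` returns `None`. A pattern like this also hints at why a clean selection matters: a stray leader line read as an extra character is enough to stop the text from matching.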
However, to get good results, you need to be careful when selecting the characteristics to extract. Ideally, only the characteristic to extract should be selected and not the leaders, lines, etc.
Additionally, the resolution of the PDF drawing or scanned document matters: low-quality documents are difficult for the OCR engine to interpret.
In the options, you can also add a couple of dictionary fonts and apply filters to improve the accuracy of the results. For example, the sharpen filter attempts to improve the clarity of blurry characters, while the stretch percentage stretches the image to create spacing between characters, which can help when letters or numbers are touching.
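If you are curious what filters of this kind do to the pixels, the following Python sketch shows the general idea on a tiny grayscale image stored as a list of rows. The `stretch_horizontally` and `sharpen` functions, and the kernel they use, are assumptions chosen for illustration. The filters built into SOLIDWORKS Inspection may be implemented quite differently.

```python
from typing import List

Image = List[List[int]]  # grayscale rows: 0 = black ink, 255 = white paper


def stretch_horizontally(image: Image, percent: int) -> Image:
    """Resample every row to `percent` of its width (nearest neighbour)."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    old_width = len(image[0])
    new_width = max(1, round(old_width * percent / 100))
    return [
        [row[min(old_width - 1, x * old_width // new_width)] for x in range(new_width)]
        for row in image
    ]


def sharpen(image: Image) -> Image:
    """Apply a common 3x3 sharpening kernel; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = (
                5 * image[y][x]
                - image[y - 1][x]
                - image[y + 1][x]
                - image[y][x - 1]
                - image[y][x + 1]
            )
            # Clamp to the valid grayscale range.
            result[y][x] = max(0, min(255, value))
    return result
```

With a row such as `[0, 0, 255, 0, 0]` (two dark strokes separated by a single white pixel), a 200 percent stretch doubles the gap to two white pixels, which gives a character segmenter more room to tell the strokes apart. The sharpen kernel leaves uniform areas unchanged and exaggerates local differences, so faint edges become more pronounced.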
But the best way to extract information from a PDF is to simply “read” it. A “Searchable Text” PDF is essentially a PDF document that includes two layers: the image layer and the text layer. When scanning a document or drawing, a “normal” (image-only) PDF is created. However, when creating a PDF document from a CAD package, you often have the option to make the document “searchable”. If the PDF document used is a “Searchable Text” PDF, then you should be able to use this option to read the information instead of using the OCR to recognize it.
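If you are not sure whether a given PDF carries a text layer, you can check outside of SOLIDWORKS Inspection before you start. The sketch below is one possible approach in Python, assuming the third-party pypdf package is installed. The `extract_page_texts` and `has_text_layer` helpers and the `min_chars` threshold are hypothetical choices for this example, and the check is only a heuristic.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the text layer of each page (needs the third-party pypdf package)."""
    # Imported here so the helper below can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def has_text_layer(page_texts: List[str], min_chars: int = 10) -> bool:
    """Heuristic: searchable if any page yields enough non-whitespace characters."""
    return any(len("".join(text.split())) >= min_chars for text in page_texts)
```

A typical call would be `has_text_layer(extract_page_texts("drawing.pdf"))`, where `drawing.pdf` is a placeholder file name. A scanned drawing normally returns empty strings for every page, so the function returns `False` and OCR is the way to go.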
Finally, if you can’t use “Searchable Text”, you can always use the characteristic manager to manually modify the extracted value or designate additional information using the several tools available. You can rotate the capture by 90 degrees or use the slider for fine rotations. You can also zoom in and out, or re-perform the OCR on all or part of the characteristic to extract the nominal value or tolerance using the selective recapture tools. The recapture tool, on the other hand, allows you to completely redefine the area selected.
Hopefully this will help you get the best out of the OCR engine to create your reports in no time!
Chethan S – Application Engineer