I am writing out pages of PDFs to BMPs. I've automated the OCR process, but to avoid producing "junk" text, I'd like to be able to discard any page that contains a photograph, or preferably ignore embedded photographs and capture only the text on that page.
Is there a method for scanning an image, detecting photos (for example by testing it line by line), and then OCRing only the "valid" areas of text?
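To make clear what I mean by testing line by line, here is a rough sketch of the kind of check I am imagining, not something I have working. It assumes the page is already available as rows of 8-bit grayscale values, and it relies on the guess that text is mostly near-black on near-white while photographs contain many mid-tone pixels. The names `midtone_fraction` and `photo_bands`, the band height, and the thresholds are all made up for illustration and would need tuning on real scans.

```python
from typing import List, Sequence


def midtone_fraction(pixels: Sequence[int], low: int = 64, high: int = 191) -> float:
    """Fraction of 8-bit grayscale pixels that are neither near-black nor near-white."""
    if not pixels:
        return 0.0
    mid = sum(1 for p in pixels if low <= p <= high)
    return mid / len(pixels)


def photo_bands(rows: Sequence[Sequence[int]],
                band_height: int = 16,
                threshold: float = 0.3) -> List[bool]:
    """Split the page into horizontal bands and flag each band that looks like a photo.

    A band is flagged when more than `threshold` of its pixels are mid-tones.
    Both `band_height` and `threshold` are assumed values, not tuned ones.
    """
    flags = []
    for top in range(0, len(rows), band_height):
        # Flatten all pixels of the rows that make up this band.
        band = [p for row in rows[top:top + band_height] for p in row]
        flags.append(midtone_fraction(band) > threshold)
    return flags


# Synthetic page: 16 rows of "text" (white with some black), then 16 rows of a gray ramp.
text_row = [255] * 90 + [0] * 10
photo_row = [x * 255 // 99 for x in range(100)]
page = [text_row] * 16 + [photo_row] * 16

print(photo_bands(page))
```

Bands flagged `True` would be skipped and only the remaining bands passed to the OCR step. I suspect a simple test like this would misfire on shaded tables or low-contrast scans, which is part of why I am asking whether an established method exists.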