jrpcguru
USA
266 Posts |
Posted - Oct 08 2024 : 19:00:57
|
I have several memorial PDF slide shows with between 150 and 190 pages. Each page is a scanned snapshot, a scanned document about milestones in the decedent's life, or, later, a digital photo. Digital photos are up to 20 megapixels. DPI varies from page to page and dimensions vary considerably. There is a mix of portrait and landscape pages. These PDF files were assembled with my Scan program, using WPViewPDF to read PDF pages and native ImageEn to save the whole thing. Adobe Reader has no difficulty displaying each page, including in full page view. The old version of my Scan program, using WPViewPDF, can read the entire file, page by page (slowly).
I've resumed working with my modified version of your demo program, PDFViewer, to try to read these files. I'm using D10.3, 32 bit, ImageEn 13.5.0.
procedure TfrmMain.OpenFile(const Filename: string);
var
  I : integer;
begin
  IEGlobalSettings().PDFEngine := ieenDLL;
  // ImageEnMView1.AttachedImageEnView := nil;
  try
    ImageEnMView1.Clear;
    ImageEnMView1.Refresh;
    ImageEnMView1.LockUpdate;
    ImageEnMView1.MIO.LoadFromFilePDF( Filename, 2500,2500 ); //smaller sizes don't help load missing pages
    // ImageEnMView1.MIO.LoadFromFilePDF( Filename ); //out of memory
    ImageEnMView1.SelectedImage := 0;
    ImageEnMView1.UnlockUpdate;
  except
    on e:Exception do
      MessageDlg( 'Error encountered loading PDF file: ' + e.message, mtError, [ mbOK ], 0 );
  end;
  // ImageEnMView1.AttachedImageEnView := ImageEnView1;
  ImageEnView1.Modified := false;
  Caption := FileName;
  UpdateStatus();
end;
This code successfully loads most of either file when I call LoadFromFilePDF(Filename,1500,1500), or with 2500 as shown. If I use LoadFromFilePDF( Filename ) I get an out of memory error part way through.
The code shows that I have tried to disconnect ImageEnView1 from ImageEnMView1 until the loading is complete, but that did not solve the out of memory error.
So, using 2500 as shown, or 1500, the code loads the correct number of pages, but many pages are blank. The blanks seem to occur when there is a significant change in page size or orientation, though not always. I've added a double click event to ImageEnMView1:
procedure TfrmMain.ImageEnMView1DblClick(Sender: TObject);
var
  iPage : integer;
begin
  iPage := ImageEnMView1.SelectedImage;
  UpdateStatus;
  begin
    try
      //ImageEnMView1.AttachedImageEnView := nil;
      ImageEnView1.ClearAll;
      ImageEnView1.IO.Params.FileName := sCurrentFilename;
      ImageEnView1.IO.Params.ImageIndex := iPage;
      ImageEnView1.IO.LoadFromFilePDF( sCurrentFilename);
      ImageEnView1.Refresh;
      ImageEnMView1.SetImage(iPage, ImageEnView1.IEBITMAP);
      ImageEnMView1.Refresh;
    except
      on e:Exception do
        MessageDlg( 'Error encountered loading PDF file: ' + e.message, mtError, [ mbOK ], 0 );
    end;
    //ImageEnMView1.AttachedImageEnView := ImageEnView1;
  end;
end;
This code successfully reloads the blank image into the proper page. So I looked for a workaround that could detect the blank pages and use this code to fix the problem. I've found no way to do that. I'm hoping you can tell me what I'm missing. Surely I should be able to identify blank pages in an ImageEnMView?
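One possible way to approach the detection problem, sketched in plain Object Pascal with no ImageEn calls: treat a page as blank when every pixel in its bitmap is identical. This is only a guess at what a failed page looks like in memory. The function name IsUniform and the packed byte buffer are illustrative, and how the pixel bytes are read out of a TImageEnMView page is deliberately left open.

```pascal
program BlankPageCheck;
{$APPTYPE CONSOLE}

// Hypothetical helper (not part of ImageEn): returns True when every pixel
// in the buffer equals the first pixel. Pixels is assumed to hold tightly
// packed pixel data, BytesPerPixel bytes per pixel, with no row padding.
function IsUniform(const Pixels: array of Byte; BytesPerPixel: Integer): Boolean;
var
  I: Integer;
begin
  Result := True;
  if BytesPerPixel <= 0 then
    Exit;
  // Compare each byte with the matching channel of the first pixel.
  for I := BytesPerPixel to High(Pixels) do
    if Pixels[I] <> Pixels[I mod BytesPerPixel] then
    begin
      Result := False;
      Exit;
    end;
end;

const
  // Two-pixel 24-bit samples: one all white, one with real content.
  White: array[0..5] of Byte = (255, 255, 255, 255, 255, 255);
  Mixed: array[0..5] of Byte = (255, 255, 255, 10, 20, 30);
begin
  WriteLn(IsUniform(White, 3));  // TRUE
  WriteLn(IsUniform(Mixed, 3));  // FALSE
end.
```

A legitimately empty scanned page would rarely be perfectly uniform because of JPEG noise, so an exact comparison like this would mostly flag pages that were never rendered. Pages flagged this way could then be reloaded one at a time with the double click code above.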
J.R. |
|
xequte
38607 Posts |
Posted - Oct 08 2024 : 21:41:51
|
Hi JR
Firstly, why are you loading such a high res copy of the PDF into the TImageEnMView? (given that it generally only displays thumbnails)
Are you looking to load the entire PDF into the control, change some pages and then save the entire PDF again?
If you are just looking to update pages within a PDF, you are probably better off to either:
- Load the PDF into the PdfViewer
- Work directly with the PDF file
To update a page you would add a new one using:
https://www.imageen.com/help/TIEPdfViewer.AddPage.html
And then delete the old page using:
http://www.imageen.com/help/TIEPdfViewer.DeletePages.html
Also, what is the value of ImageEnMView1.StoreType?
Nigel Xequte Software www.imageen.com
|
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 09 2024 : 16:26:34
|
I forgot to include important details:
ImageEnMView1.StoreType := ietNormal;
ImageEnMView1.DeprioritizeLargeImages := 0;
ImageEnMView1.EnableImageCaching := false;
When I originally designed this program, I was using WPViewPDF to load PDF pages. I think that required me to load each page into the ImageEnView then transfer to ImageEnMView. That system works, though it is slow to load a large PDF like these examples.
Using IEGlobalSettings().PDFEngine := ieenDLL offered the chance to speed the loading directly into ImageEnMView, so I've been trying to make that work. I've also tried to load page by page via ImageEnView and that presents several additional problems which I haven't fully documented yet.
Yes. I prefer to continue the method I used with WPViewPDF and load the entire page image into ImageEnMView. I use the full editing capability of ImageEnView, at least the parts I've implemented, to allow editing these page images: deskew, crop, redact, combine images, add captions via layers, rearrange pages, etc. These pages are all images, so PDFViewer does not seem to have any advantage. As best I understand it, if I edit a page in ImageEnView, I need to save it back to ImageEnMView and eventually save the entire ImageEnMView. With TIF files, this page by page loading was a good idea since each page can have its own metadata. Since PDF files only have one set of metadata, I can see how to load directly into ImageEnMView, but editing still seems to be page by page.
The program was used to scan original snapshots and documents that illustrated the decedent's life, usually as .jpg. It was then used to append all those .jpg along with .jpg from a variety of digital cameras and arrange the image pages into the proper order before creating the PDF. If the program fails to load individual pages, and I can't automatically detect and fix those failures, the program is seriously deficient.
I am also concerned by the out of memory error. That does not happen when using WPViewPDF to load page by page. Is there anything I can do to properly clear memory between each page, to avoid the out of memory? I would certainly like to be able to load the pages full sized, rather than using LoadFromFilePDF( Filename, 2500,2500 ) or something similar.
J.R. |
|
|
xequte
38607 Posts |
Posted - Oct 09 2024 : 20:34:26
|
Hi JR
Loading 150+ images of around 2000x2000 pixels into memory will be slow and resource hungry (150 x 12MB = 1.8GB of data). ImageEn will try to offload some of that to the disk to prevent memory issues (See: http://www.imageen.com/help/TImageEnMView.ImageCacheSize.html), but it is still not an optimal way to do it.
You are using TImageEnMView as your image container, when for PDF you should be using PDFViewer as the container (with TImageEnMView showing a preview of the PDFViewer content). I cannot think of any situations where TImageEnMView should be used as a container for a PDF. It is not a multiple image format like TIFF, it is a document format, and PDFViewer gives you the optimal method to interact with it (designed specifically for that purpose).
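The arithmetic behind that estimate can be checked with a few lines of plain Object Pascal. This is a rough sketch that assumes 24-bit RGB (3 bytes per pixel) and ignores scanline padding and any overhead ImageEn itself adds. The helper name BitmapBytes and the 5000x4000 size used for a 20 megapixel photo are illustrative choices, not values from the thread.

```pascal
program PdfMemoryEstimate;
{$APPTYPE CONSOLE}

// Uncompressed size in bytes of one bitmap, assuming 24-bit RGB
// (3 bytes per pixel) and no scanline padding.
function BitmapBytes(Width, Height: Int64): Int64;
const
  BytesPerPixel = 3;
begin
  Result := Width * Height * BytesPerPixel;
end;

var
  PerPage, Total: Int64;
  TotalGB: Double;
begin
  PerPage := BitmapBytes(2000, 2000);
  Total := 150 * PerPage;
  TotalGB := Total / 1.0e9;
  WriteLn('Per page:   ', PerPage div 1000000, ' MB');   // Per page:   12 MB
  WriteLn('150 pages:  ', TotalGB:0:1, ' GB');           // 150 pages:  1.8 GB
  // A 20 megapixel photo kept at full size (5000x4000 is an assumed shape).
  WriteLn('20 MP page: ', BitmapBytes(5000, 4000) div 1000000, ' MB');  // 20 MP page: 60 MB
end.
```

For comparison, a 32-bit Windows process normally has about 2 GB of user address space, so a total of this size leaves very little headroom.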
e.g. loading into PDFViewer should be much faster, as it only loads the current page, rather than the whole document.
If your application is designed to load an image, edit the image and then add it to a PDF, then you might be better to have a hidden PDFViewer/TImageEnView as your PDF container that your main editing TImageEnView outputs to.
I would expect this method to be more susceptible to memory issues than loading page by page, because all of the pages are in memory (excluding the ones that TImageEnMView has cached to disk). That said, TImageEnMView should not give you memory issues (even in a challenging situation like this), so if you can create a test project that reproduces the issue, please forward it to me.
Nigel Xequte Software www.imageen.com
|
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 10 2024 : 18:17:32
|
I've sent an email to your support address with a link to a sample PDF file that causes the out of memory error and the other behaviors. It also includes a modified version of your demo program to demonstrate the problem. The original PDF file was 140 MB. After redacting and resaving at 37% JPG compression, it is now 50 MB, but still shows all the issues. It was originally created with 37% JPG compression and apparently resaving it shrank it some more.
Firstly, why are you loading such a high res copy of the PDF into the TImageEnMView? (given that it generally only displays thumbnails)
Are you looking to load the entire PDF into the control, change some pages and then save the entire PDF again?
Yes. That is what I have always done since I was able to load PDF files via WPViewPDF. Native ImageEn creates the PDF and WPViewPDF loads it. Now, because of WPViewPDF deficiencies, and because of the excellent progress with PDFium, I would like to replace WPViewPDF with PDFium.
If you are just looking to update pages within a PDF, you are probably better off to either:
- Load the PDF into the PdfViewer
- Work directly with the PDF file
I guess I don't know how to do this. My pages are all images. I see nothing in the help file that makes me think PDFViewer will do all the image editing that I described earlier in this discussion. It seems to be focused on documents and is very nice for reading documents and editing or searching text based PDFs. I don't know what you mean by "work directly with the PDF file" in the context of a program written to use ImageEn.
Also, what is the value of ImageEnMView1.StoreType?
The demo project that I've sent also includes my attempt to load each page of the PDF into ImageEnView and then transfer to ImageEnMView, which is how my current program works with WPViewPDF. This also works poorly, though out of memory is not an issue.
If I could detect empty or improper pages in ImageEnMView, I could probably fix them programmatically. But I could never find a way to detect them.
An initial impression from paging through the loaded file is that PDFium is choking on loading ImageEnView when there is a transition from a landscape image page to a portrait page or from a small image page to a large image page.
ImageEnMView1.StoreType := ietNormal;
J.R. |
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 10 2024 : 18:29:04
|
I just noticed an earlier suggestion from you:
If your application is designed to load an image, edit the image and then add it to a PDF, then you might be better to have a hidden PDFViewer/TImageEnView as your PDF container that your main editing TImageEnView outputs to.
This is another thing I don't know how to do. Since PDFViewer is not operating as an image, how do I copy it into a normal ImageEnView, edit the image, and then save it back to PDFViewer?
I've been so frustrated with this issue that I'll admit I haven't tried to do this. But the help file doesn't seem to offer any hope. If it is possible, then it is likely to be useful. In this case, I would have to hope PDFViewer updated the underlying PDF file each time I changed pages and saving the edited PDF means PDFViewer would just be saving the last page and closing the file? Further, if I understand correctly, the ImageEnMView thumbnail display would be linked to the PDFViewer (invisible) and I would transfer pages back and forth between the PDFViewer and a normal ImageEnView (visible) for user interaction?
J.R. |
|
|
xequte
38607 Posts |
Posted - Oct 12 2024 : 05:54:24
|
Hi JR
It sounds like you are trying to use PDF as an image format, which is not ideal. It's similar to using a Microsoft Word file to store images. It can do it, but modifying the image(s) on each page is not as simple as modifying an image (the image is only an object on the page, unlike a multiple-image format like TIFF, where each page is an image).
But if that is your design, the most workable solution would be something like this:
1. Use TImageEnMView to load thumbnails from the PDF document (on demand)
2. If you want to edit a page, use a TImageEnView to load just that page as an image (i.e. PDFViewer.Enabled = False)
3. To "save" the page:
   - Delete the old page from file: http://www.imageen.com/help/DeletePagesFromPDF.html
   - Save the image to a temporary PDF file
   - Import the temp PDF file into the original file: http://www.imageen.com/help/ImportPagesIntoPDF.html
4. Refresh the thumbnail in the TImageEnMView
It's not ideal, but I think it is the best way to meet your requirements (which I think are somewhat outside the area of ImageEn functionality)
Nigel Xequte Software www.imageen.com
|
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 26 2024 : 13:03:14
|
I appreciate your suggestions so far. First some good news:
I finally installed D12.1 with ImageEn 13.6.0. I then converted the demo program to 64 bit. It now successfully loads the entire huge PDF files using this code:
ImageEnMView1.MIO.LoadFromFilePDF( Filename );
Each ImageEnMView thumbnail is correct and it displays full sized in ImageEnView correctly.
For the moment, I remain puzzled. My 32 bit D10.3 version, using page by page loading via IEGlobalSettings().PDFEngine := ieenDLL; works to slowly and correctly load all pages of these large PDF files. But the 64 bit version of the demo program fails.
I feel your pain about my abusing the purpose of ImageEnMView. For many years I argued against colleagues using Excel as a database, instead of MS Access. Now it seems I am doing that, though I've been doing it successfully for many years, without realizing it was outside your concept. Good thing ImageEnMView has proven to be versatile! I would love to use PDFViewer since it is faster and higher quality in loading pages. But it doesn't allow the kind of editing that I need.
I will look carefully into your last suggestion for how to make this work better. Thank you.
J.R. |
|
|
xequte
38607 Posts |
Posted - Oct 26 2024 : 17:43:40
|
Hi JR
Just to be sure, check you have added ielib64.dll and iepdf64.dll to your exe folder.
Nigel Xequte Software www.imageen.com
|
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 26 2024 : 18:19:25
|
Yes, I did. I just switched back to the 32bit versions of these DLLs for my scanning program, since the scanner issue will keep me in 32bit.
J.R. |
|
|
xequte
38607 Posts |
Posted - Oct 26 2024 : 19:34:11
|
Hi JR
If you can create a demo that loads PDF pages differently in 32 and 64bit mode, please forward it to me for testing.
Nigel Xequte Software www.imageen.com
|
|
|
jrpcguru
USA
266 Posts |
Posted - Oct 26 2024 : 22:04:15
|
The demo that I emailed you a while ago was a 32 bit program. I just converted it to 64 bit to get the marked improvement that I reported.
J.R. |
|
|
xequte
38607 Posts |
Posted - Nov 18 2024 : 22:16:39
|
Hi JR
We have significantly improved the memory handling of large PDF files in the latest beta. Please email me for an update.
Nigel Xequte Software www.imageen.com
|
|
|