I am trying to convert a regular PDF to PDF/A-3b. The source PDF doesn't have all of its fonts embedded, so I ran this Ghostscript command:
gs -o output.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -dEmbedAllFonts=true -dSubsetFonts=true -dCompressFonts=true input.pdf
The PDF uses several fonts. With the command above, all of them end up embedded; the only one the command did not have to embed is the first font (Albany WT J), which was already embedded in the original.
Once the fonts are embedded, I use my own script to convert the file to PDF/A-3. The problem is that the font Albany WT J has an incomplete CIDSet, and PDF/A-3b validation fails because of it. For PDF/A-3b the CIDSet is optional, but if it exists it must be complete. The exact error shown in Adobe Preflight is: "cidset in subset font is incomplete (font contains glyphs that are not listed)".
veraPDF reports the same thing: "If the FontDescriptor dictionary of an embedded CID font contains a CIDSet stream, then it shall identify all CIDs which are present in the font program, regardless of whether a CID in the font is referenced or used by the PDF or not."
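To make the rule concrete, here is a minimal Python sketch of what a "complete" CIDSet looks like. In the PDF specification the CIDSet stream is a bit table indexed by CID, with the most significant bit of the first byte standing for CID 0. The function names `build_cidset` and `missing_cids` are made up for illustration, and the list of CIDs in the font program is assumed to come from whatever tool you use to inspect the embedded font (for an Identity-H TrueType CID font with an identity CIDToGIDMap, these would be the glyph IDs in the subset).

```python
def build_cidset(cids):
    """Return CIDSet stream bytes: one bit per CID, MSB of byte 0 is CID 0."""
    cids = set(cids)
    if not cids:
        return b""
    bits = bytearray(max(cids) // 8 + 1)
    for cid in cids:
        bits[cid // 8] |= 0x80 >> (cid % 8)
    return bytes(bits)


def missing_cids(cidset, font_cids):
    """List CIDs that exist in the font program but are not flagged in cidset."""
    missing = []
    for cid in sorted(set(font_cids)):
        byte_index, bit = divmod(cid, 8)
        if byte_index >= len(cidset) or not cidset[byte_index] & (0x80 >> bit):
            missing.append(cid)
    return missing


if __name__ == "__main__":
    font_cids = [0, 1, 9]            # hypothetical CIDs found in the font program
    stale = build_cidset([0, 1])     # a CIDSet that forgot CID 9
    print(missing_cids(stale, font_cids))
    print(build_cidset(font_cids).hex())
```

A validator flags the font whenever the equivalent of `missing_cids` is non-empty, so the two possible fixes are to write a stream like the one `build_cidset` produces for every CID in the font program, or to drop the CIDSet entry altogether.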
I have been trying several approaches without success, but I am sure this is fixable, because online converters (such as freepdfconvert.com) are able to convert the file to PDF/A-3b successfully. I think they either remove the CIDSet from the font or re-embed the font with a complete CIDSet.
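The "remove the CIDSet" idea can be tried quickly with a rough Python sketch like the one below. It is a byte-level hack rather than a proper PDF rewrite: it overwrites each `/CIDSet n g R` entry with the same number of spaces, so no byte offsets in the cross-reference table change. This assumes the font descriptor is stored as a plain, uncompressed object (not inside an object stream) and that the byte pattern does not happen to occur inside binary stream data; the function names and file names are hypothetical. The orphaned CIDSet stream object stays in the file, unreferenced.

```python
import re

# Matches an indirect reference such as "/CIDSet 12 0 R" in a font descriptor.
CIDSET_REF = re.compile(rb"/CIDSet\s+\d+\s+\d+\s+R")


def blank_cidset_refs(pdf_bytes):
    """Replace every '/CIDSet n g R' entry with spaces of identical length."""
    return CIDSET_REF.sub(lambda m: b" " * len(m.group(0)), pdf_bytes)


def strip_cidset(src_path, dst_path):
    """Write a copy of src_path without CIDSet entries; return how many were blanked."""
    with open(src_path, "rb") as f:
        data = f.read()
    patched = blank_cidset_refs(data)
    with open(dst_path, "wb") as f:
        f.write(patched)
    return len(CIDSET_REF.findall(data))


if __name__ == "__main__":
    sample = b"<< /Type /FontDescriptor /CIDSet 12 0 R /Flags 4 >>"
    print(blank_cidset_refs(sample))
```

If `strip_cidset("embedded.pdf", "no-cidset.pdf")` returns 0, the descriptor is probably compressed inside an object stream, and a real PDF library would be needed to delete the key instead.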
You can access the original PDF – here
Our conversion to PDF/A3b – which has the above-mentioned error – here
The PDF that we got converted from the online solution which shows success – here
I am looking for a solution using Ghostscript, PDFlib, a Node.js PDF library, or any other language. Any lead on this issue is highly appreciated.
File: Orginal PDF.pdf
Modified: 2023-10-05 18:23:55
PDF Producer: 3.0.5 (5.0.9)
PDF Version: 1.5
File Size: 73.14 KB (74,899 Bytes)
Number of Pages: 1
Page Size: 21.59 x 27.94 cm (Letter)
Fonts:
  Albany WT J (TrueType (CID); Identity-H; embedded)
  Helvetica (Type1; Ansi)
  Helvetica-Bold (Type1; Ansi)
  Times-Bold (Type1; Ansi)
  Times-Roman (Type1; Ansi)
So all the expected fonts are embedded as per the PDF standard. The core problem is that PDF/A has a silly rule: you need to add bloat to meet an unnecessary requirement to replace Type 1 fonts with any, so GS Nimbus will do.

@KJ the issue is with the font Albany WT J and, I think, has nothing to do with Type 1 or TrueType fonts. If you look at the third PDF (the one converted online), it has Type 1 fonts but no PDF/A compliance issue. Also, can you please give me the Ghostscript script that will do this: "replace Type 1 fonts with any so GS Nimbus"? Let me try whether that solution works for PDF/A-3b compliance.