Compressing MS Word docx documents that use a barcode font in a PHP script
Using TinyButStrong and OpenTBS, I created a PHP script that opens multiple docx templates and replaces a lot of variables with values from a database. In a nutshell, clients can download their unique files, add information and pictures, and upload them again. This works very well. But of course I wouldn't post here if there wasn't some sort of problem.
Because of the barcodes (I am using barcode fonts and embedding them in Word, because the documents will be scanned much later in the process), the documents get huge. Instead of 100 KB on average, they easily reach 7 MB. This is a problem, because about 20,000 documents will be scanned per year. That's roughly 130 GB extra per year.
It's a long story but we need docx, so we can't simply replace it with some sort of PHP / MySQL template that would be far more efficient.
Word has an option to embed only the font characters that are actually used, to cut down on the size. But that isn't an option, because the main template needs to have all characters available. It's also not an option to send the font to the users, since there are about 20,000 new ones each year.
Is there another way to cut the file size or to use compression? Perhaps in Word, PHP, FTP, or Apache?
I'm afraid the option "Embed fonts in the file" combined with "Embed only characters used in the document" cannot be reproduced from a script. MS Word saves the font in a special format with the extension ODTTF (for example, you will find it in "word\fonts\font1.odttf"). This format is binary and seems poorly documented, so in practice it remains proprietary. Only MS Word will be able to build such a sub-file.
Since you don't have a lighter font for the barcode, the only solution I can see is to use an image instead of a font for your barcode:
- OpenTBS has a feature to easily replace a picture inside a DOCX file (parameter "ope=changepic").
- Barcode-to-image tools are easy to find in PHP. For example: Barcode Generator.
Then you only have to code your process like this:
- Load the DOCX template.
- Create the temporary image of the barcode.
- Change the image inside the template.
- Merge the template, and save or send the result.
- Delete the temporary image.
It's important to delete the temporary image only after the final merge of the template, because OpenTBS actually inserts the image only when the $tbs->Show() method is called.
It's also important to use a different temporary file for each merge, because several merges can run at the same time.
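The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the template contains a placeholder picture whose description (alt text) holds a TBS field such as `[barcode;ope=changepic]`, the field name `barcode` is made up for this example, and `$renderBarcode` is a hypothetical callable standing in for whichever barcode-to-image library you pick. The file names `tbs_class.php` and `tbs_plugin_opentbs.php` are assumed to be on the include path.

```php
<?php
// Sketch only, not a definitive implementation.

/**
 * Build a random file name per merge, so simultaneous merges never share a file.
 * The "barcode_" prefix is an arbitrary choice for this example.
 */
function uniqueBarcodePath(string $tmpDir): string
{
    return rtrim($tmpDir, '/\\') . DIRECTORY_SEPARATOR
        . 'barcode_' . bin2hex(random_bytes(8)) . '.png';
}

/**
 * Merge one DOCX with a barcode picture.
 * $renderBarcode is a hypothetical callable(string $code, string $pngPath): void.
 */
function mergeDocxWithBarcode(
    string $template,
    string $output,
    string $code,
    callable $renderBarcode,
    string $tmpDir
): void {
    // Assumed locations of TinyButStrong and the OpenTBS plug-in.
    require_once 'tbs_class.php';
    require_once 'tbs_plugin_opentbs.php';

    $imgPath = uniqueBarcodePath($tmpDir);

    try {
        // 1. Create the temporary barcode image.
        $renderBarcode($code, $imgPath);

        // 2. Load the DOCX template.
        $tbs = new clsTinyButStrong;
        $tbs->Plugin(TBS_INSTALL, OPENTBS_PLUGIN);
        $tbs->LoadTemplate($template, OPENTBS_ALREADY_UTF8);

        // 3. Point the picture field at the temporary image
        //    (assumes a field named "barcode" with ope=changepic).
        $tbs->MergeField('barcode', $imgPath);

        // 4. Final merge: the image is read here, so it must still exist.
        $tbs->Show(OPENTBS_FILE, $output);
    } finally {
        // 5. Delete the temporary image only after Show() has run.
        if (is_file($imgPath)) {
            unlink($imgPath);
        }
    }
}
```

The `finally` block is one way to make sure the temporary image is removed even if the merge throws, while still keeping the deletion after `Show()`.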
If the temporary files have a prefix or are saved in a dedicated directory, it is advisable to clean up old temporary images regularly.
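Such a clean-up could look like the sketch below, for example run from a cron job. The function name, the `barcode_` prefix, the `.png` extension and the one-hour age limit are all assumptions made for the example, not anything prescribed by OpenTBS.

```php
<?php
/**
 * Delete leftover barcode images older than $maxAgeSeconds.
 * Returns the number of files removed.
 */
function cleanOldBarcodeImages(
    string $dir,
    int $maxAgeSeconds = 3600,
    string $prefix = 'barcode_'
): int {
    $deleted = 0;
    $cutoff  = time() - $maxAgeSeconds;
    $pattern = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $prefix . '*.png';

    // glob() can return false on error, so fall back to an empty list.
    foreach (glob($pattern) ?: [] as $file) {
        // Only touch regular files whose last modification is before the cutoff.
        if (is_file($file) && filemtime($file) < $cutoff && unlink($file)) {
            $deleted++;
        }
    }

    return $deleted;
}
```

The age limit should be comfortably longer than the slowest merge, so that an image still in use by a running request is never removed.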