HTML Email to Print Gateway

Tech Bash Email Exim Centos CUPS Procmail

What’s the situation?

Multiple libraries, covering thousands of students, want to move away from the old-school method of book loan/return slips, which is a very manual and time-consuming process.

Cloud-based Library Management Service

A cloud-hosted service was purchased to manage book loans/returns in a single interface, with all the trimmings a single platform can offer; managing print, electronic and digital materials. In essence, a full library management service.

This is a great step forward, but whilst they now have a rather great management system with all the students in it, barcode scanning and extra functionality, what they still have not fixed is the manual effort required by the librarians.

Developing the Email to Print Gateway

I was approached to help figure out an automated solution to remove as much of the manual requirements as possible.

I am well versed in email and automation, and as the primary problem to solve started out as an email coming through to the library inbox, I knew how to get started on this.

First Things First

The first thing to do was to stop the email from the cloud service coming through Exchange Online. We already had SPF issues, and having the cloud service send as the library that was receiving the email was unnecessary. As I already ran a rather large internal email infrastructure, it was a no-brainer to add a couple more servers to make this work.

I began to think about the requirements and reducing them to their simplest components. Essentially, as I stated above, it all begins with an email from the library management service and ends when it is printed out on the required/requested printer.

Flow Process

This is only a quick high-level view of the process, but it gave me an idea of how this was going to work.

  • Email comes in from library management service
  • Email is stripped of unrequired HTML elements
  • Leftover HTML email is converted to PostScript
  • PostScript is sent to the relevant printer

Requirements

Thinking on the above, here’s my list of requirements to put all this together.

Server Requirements

Software Requirements

Piecing it all Together

Now that I have a better understanding of the OS and software required to make this work, here’s the new, more detailed flow:

  • An email is sent to a user account that exists on both boxes, i.e. printername@library.tld (user accounts are named after the printers).
  • Exim then pipes that email to Procmail.
  • Procmail looks for .procmailrc in the user’s home directory and takes action.
  • MHonArc converts the email to an HTML file, saving the images and attachments.
  • Using “sed”, open the HTML file, look for the start of the email (<!--X-Body-of-Message-->) and collect all text to the end of the file.
  • Pipe to “sed” again to remove superfluous HTML tags (hr tags) added by MHonArc.
  • Pipe to html2ps to convert to PostScript.
  • Pipe to the designated printer (printers are named the same as the user account).

The reason I named the user accounts after the printers was so I didn’t have to create a custom script per account/printer. This meant I could call “whoami” in the .procmailrc script, as that script is run as the user whose home directory it lives in.

It also made it easier to create new accounts and deploy the folder structure and script via the on-prem GitLab server.

Here’s the “.procmailrc” file that brings all of that together.

SHELL=/bin/bash

# Designate the printer. Printer names match usernames so you don't have to manually change 60+ files.
printer=`whoami`

# Generate a unique ID
f=`uuidgen`

# Convert email, including headers and body into a HTML file and save off the images using MHONARC https://www.mhonarc.org/
# Open file and search <!--X-Body-of-Message--> string using SED and collect all text to EOF.
# Pipe the result into SED again to remove unwanted HTML tags added by MHONARC
# Pipe result into HTML2PS to convert to PostScript
# Pipe PostScript file to the designated printer
:0E
| mhonarc -single > ${f}.html; sed -n '/^<!--X-Body-of-Message-->$/ { s///; :a; n; p; ba; }' ${f}.html | sed -e '/<hr>/d' | html2ps | lp -d ${printer} -o media=a4 2>&1

# Finally, delete the email
:0
/dev/null

Six lines of code is all it took to transform all the aforementioned manual steps into an automated process, leaving the final step to collect the slip from the printer and put it in the book.
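
To see what the two “sed” stages in that recipe do, they can be run on their own against a small stand-in file. This is only a sketch: the function names and the sample HTML are made up for illustration (real MHonArc output carries more headers and markers), and it assumes GNU sed, which accepts the semicolon-separated label syntax used in the recipe.

```shell
#!/usr/bin/env bash

# Same two sed stages as the .procmailrc recipe, but reading HTML on stdin
# instead of from ${f}.html. Stage 1 prints everything after the
# <!--X-Body-of-Message--> marker; stage 2 drops any line containing <hr>.
extract_slip_body() {
    sed -n '/^<!--X-Body-of-Message-->$/ { s///; :a; n; p; ba; }' \
        | sed -e '/<hr>/d'
}

# Hypothetical stand-in for a file produced by "mhonarc -single".
sample_html() {
    printf '%s\n' \
        '<html><head><title>Loan</title></head>' \
        '<h1>Subject: Loan receipt</h1>' \
        '<hr>' \
        '<!--X-Body-of-Message-->' \
        '<p>Title: Example Book</p>' \
        '<hr>' \
        '<p>Due: Friday</p>'
}

# Prints only the two <p> lines that follow the marker.
sample_html | extract_slip_body
```

Everything above the marker (the headers MHonArc renders) is discarded, which is why the printed slip contains only the message body.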

This service has stood for 5 years without a single outage. The only downtime either of those servers has had is to run security patches at the OS level and keep Exim up to date to prevent vulnerabilities allowing an attacker in. The servers are patched one at a time, so the service itself is never impacted; the load balancer just routes around the downed server.

Supporting the Service
#

Adding/Removing Accounts and Printers

I wrote a Bash script to do the following, based on the details in a CSV file:

  • Create the user account
  • Add a printer to CUPS (trying multiple different connection methods until one worked - printers were not the same make/model and varied in age)
  • Connect to GitLab and pull down the script and folder structure.
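
The original script is not shown here, so what follows is only a rough sketch of how those three steps could be driven from a CSV file. The CSV layout (printer_name,device_uri), the repository URL, the target folder and the function names are all assumptions, and the sketch registers each printer with a single URI rather than trying several connection methods. It defaults to a dry run that prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only. Hypothetical CSV layout, no header row: printer_name,device_uri

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them
# Placeholder URL, not the real GitLab server.
REPO_URL="${REPO_URL:-https://gitlab.example.internal/library/print-gateway.git}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Read CSV rows on stdin and provision one account and print queue per row.
provision_from_csv() {
    local name uri
    while IFS=, read -r name uri; do
        [ -n "$name" ] || continue          # skip blank lines
        # Account name matches the printer name, as in the .procmailrc.
        run useradd -m "$name" || continue
        # "-m everywhere" assumes a driverless (IPP Everywhere) printer;
        # older models would need a specific driver or PPD instead.
        run lpadmin -p "$name" -E -v "$uri" -m everywhere
        # Hypothetical target folder inside the new home directory.
        run git clone "$REPO_URL" "/home/${name}/print-gateway"
    done
}
```

With a file such as printers.csv, a dry run would be `provision_from_csv < printers.csv`, and setting DRY_RUN=0 (as root) would execute the commands for real.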

Lessons Learned

There were two primary lessons learned:

  • The images that get stripped from the incoming email (a barcode that was not required) all get saved to the local drive. Writing a script at the start of the service to purge these images on a regular basis would have prevented low-disk warnings after the first year.
  • Users often switch printers off, or they break down, or the network connection drops. Adding a “check if printer is live” function would have been handy here.
    • CUPS can be a bit temperamental as well, so I did build in a “wake printer” cron job, but I added no error checking at the time.
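
Both lessons lend themselves to small cron-driven helpers. The sketch below is one possible shape, not the script that ran in production: the function names, the image extensions and the 7-day default are assumptions, and the queue check relies on English (C locale) “lpstat -p” output of the form “printer NAME disabled since ...”. It reports paused CUPS queues, which is not the same as proving the physical printer is switched on.

```shell
#!/usr/bin/env bash

# Delete image files older than DAYS days under DIR.
# DIR is wherever MHonArc saves the extracted attachments (assumed here).
# The generated .html files are left alone; add a pattern to purge them too.
purge_old_images() {
    local dir="$1" days="${2:-7}"
    find "$dir" -type f \
        \( -name '*.png' -o -name '*.jpg' -o -name '*.gif' \) \
        -mtime +"$days" -delete
}

# Read "lpstat -p" output on stdin and print the names of disabled queues.
disabled_printers() {
    awk '$1 == "printer" && $3 == "disabled" { print $2 }'
}

# Possible cron usage (not run here):
#   purge_old_images /home/printername 7
#   LC_ALL=C lpstat -p | disabled_printers | while read -r p; do
#       cupsenable "$p" || logger -t print-gateway "could not enable $p"
#   done
```

Logging a failed “cupsenable” is the error checking that the original wake-printer job lacked.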

Summary

In all, this was one of the best scripts I ever wrote to automate a manual process. Utilising open-source software, I was able to deliver a resilient and smooth service that solved a major issue. A great piece of work, even if I say so myself.

Hi! I'm Mark!
Author
Mark McKee is the Microsoft 365 Lead Solutions Architect for a public sector organisation, with over 20 years of experience. Mark is a leader in blue-sky-thinking, automation, and identity.