Run a Postfix mail server with TLS and SPF in Kubernetes

Let’s call the outcome of this blog post “usable” with room for improvement 😉 Potential improvements are mentioned throughout the text and at the end. So if you want to do the same, consider this text a first step or starting point. At some points I can only give a recommendation or suggest a minimal setup to make it work. Sometimes it simply depends on your setup which implementation makes the most sense.

Also note that this is only about Postfix and not about Postfix plus whatever IMAP server. Personally I use Archiveopteryx (an advanced PostgreSQL-based IMAP/POP server) as the final delivery destination to store my mails. Postfix delivers the mails via LMTP to Archiveopteryx. But it’s up to you if you want to use Dovecot, Cyrus IMAP or whatever. However, I’ll cover the basic settings that tell Postfix where to deliver mails for the virtual users (because you don’t want to store mails in a Pod’s filesystem, right? 😉).

The first thing I want to mention, although it is a little off topic, is a tool called kubectx. Since I already have quite a few namespaces it’s a little annoying to pass -n whatever_namespace to kubectl all the time (of course you could also create Bash aliases). kubectx ships with a very handy tool called kubens which switches to whatever namespace you like. So as long as you work in the same namespace you no longer need -n for kubectl. That said, let’s start:

I decided to put everything mail server related into its own K8s namespace called mailserver as it has no dependencies on services running in other namespaces:

bash

kubectl create namespace mailserver

If you want to store your configuration files in Git, e.g. to preserve your change history, the YAML file looks like this:

yaml

apiVersion: v1
kind: Namespace
metadata:
  name: mailserver

Store the content in a file called namespace.yml, for example, and apply it via

bash

kubectl create -f namespace.yml

As mentioned above if you use kubens tool you just need to enter

bash

kubens mailserver

now and there is no need to further specify -n mailserver or --namespace=mailserver to kubectl. If you don’t have kubens installed remember to specify the namespace to kubectl all the time (I won’t do it in the following text).

As Postfix should run in a K8s cluster we need a Postfix container image. I used Jessie Frazelle’s Postfix Dockerfile as a template and modified it to my needs. You can clone my GitHub repo kubernetes-postfix which includes a few resources to get started with running Postfix in Kubernetes:

bash

git clone https://github.com/githubixx/kubernetes-postfix

In the repository you’ll find the Dockerfile for the Postfix container which looks like this (installing bash isn’t a requirement so you can (or even should) remove it later but it makes debugging issues at the beginning a little bit easier):

dockerfile

FROM alpine:3.19

RUN apk add --no-cache \
        bash \
        ca-certificates \
        libsasl \
        mailx \
        postfix \
        rsyslog \
        runit \
        postfix-policyd-spf-perl

COPY service /etc/service
COPY runit_bootstrap /usr/sbin/runit_bootstrap
COPY rsyslog.conf /etc/rsyslog.conf

STOPSIGNAL SIGKILL

ENTRYPOINT ["/usr/sbin/runit_bootstrap"]

To keep the container image small Alpine Linux is used as OS, plus a few packages which are needed to operate Postfix. The first two COPY lines copy the files needed for runit. runit is an init scheme for Unix-like operating systems that initializes, supervises, and ends processes throughout the operating system. As an init daemon it is the direct or indirect ancestor of all other processes: it is the first process started during booting and continues running until the system is shut down. Among other things this avoids zombie processes. In our case runit starts and supervises Postfix itself (as you can see in service/postfix/run) and rsyslogd (service/rsyslog/run). That’s the second reason why runit is needed: normally you only have one process per container, but in this case we have two. Have a look at the files in the GitHub repo if you need more information. The runit_bootstrap script is the entrypoint of the container.

Before we can build the container image we need a bunch of additional files.

The first ones are for Forward Secrecy. From the Postfix docs: The term "Forward Secrecy" (or sometimes "Perfect Forward Secrecy") is used to describe security protocols in which the confidentiality of past traffic is not compromised when long-term keys used by either or both sides are later disclosed. For better forward secrecy settings we create two DH (Diffie-Hellman) parameter files which we later use in Postfix’s main.cf. With prime-field EDH, OpenSSL wants the server to provide two explicitly-selected (prime, generator) combinations. One for the now long-obsolete “export” cipher suites, and another for non-export cipher suites. Postfix has two such default combinations compiled in, but also supports explicitly-configured overrides:

bash

mkdir dhparam
openssl dhparam -out dhparam/dh512.pem 512
openssl dhparam -out dhparam/dh2048.pem 2048

NOTE: In Postfix >= 3.6 the parameter smtpd_tls_dh512_param_file is silently ignored. See: smtpd_tls_dh512_param_file

Next we need a few configuration files for Postfix. I prepared a few files which are included in the repo you checked out above. Adjust the files to your needs. I put quite a lot of comments into the files, so I won’t repeat everything here. Just to get a quick overview:

bash

etc/mail/aliases

The aliases table provides a system-wide mechanism to redirect mail for local recipients. The redirection is processed by the Postfix local delivery agent. This file isn’t that interesting for us, as we are mainly interested in receiving mails from other mail servers and delivering the mails for our “virtual” users to a final destination like a Dovecot IMAP server or Archiveopteryx via LMTP as mentioned above. Nevertheless it’s a good habit to create aliases for some common users and send those mails to an administrator mail account.

bash

etc/postfix/headercheck
etc/postfix/bodycheck

Sometimes it is very useful to filter mails directly where they are received, which is Postfix in our case. You can use regular expression rules in these files to reject mails that contain certain keywords or even use a specific character set. That’s quite handy if a spam wave pops up which isn’t recognized by your spam filter.

I’ve added the next file in two flavors, so to say:
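As a rough illustration of what such rules can look like, the following sketch writes two sample entries in Postfix regexp table syntax (a pattern between slashes followed by an action) to a file. The patterns, the reject texts and the helper name write_sample_headercheck are made up for this example and are not taken from the repository, so treat them as placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write two example header check rules to the file
# given as first argument. Patterns and reject texts are illustrative only.
write_sample_headercheck() {
  cat > "$1" <<'EOF'
# Reject mails whose subject contains a (made-up) spam keyword.
/^Subject:.*cheap[ ]+watches/ REJECT Spam keyword in subject
# Reject a specific character set announced in the Subject header.
/^Subject:.*=\?koi8-r\?/ REJECT Unwanted character set
EOF
}

# Usage (assumes Postfix is installed where you run this):
#   write_sample_headercheck /tmp/headercheck
#   postmap -q "Subject: cheap watches today" regexp:/tmp/headercheck
```

Querying the table with postmap -q before deploying is a cheap way to see whether a rule fires for a given header line.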

bash

etc/postfix/main_single.cf.tmpl
etc/postfix/main_multi.cf.tmpl

This is of course the most important one: the main Postfix configuration file. Please read the comments and adjust it to your needs (and add additional parameters if needed of course 😉)!

Why two files? If you don’t have high availability requirements, it’s good enough to go with one Postfix Pod. For this use case etc/postfix/main_single.cf.tmpl is the main.cf to use. This is the easiest setup. Of course, if the node the Postfix container runs on goes down, or the Postfix Pod itself is down, you won’t receive any mails until that node and/or the Postfix Pod is back. If this doesn’t take that long it’s normally not a problem, as other mail servers should(!) queue the mail they want to deliver and retry later. But there might be mail servers out there which don’t retry.

If you have high availability requirements you most probably want to run two or more Postfix Pods on different K8s nodes of course. In that case the other mail server tries to deliver the mail to one of the remaining Postfix services if the “primary” one is down (how to configure this will be discussed shortly). But this kind of setup is obviously more difficult.

Now why do these files have a .tmpl suffix? Because they are templates (well, at least main_multi.cf.tmpl) and not the final main.cf file. That one is generated during startup of the container. Why?

E.g. if you have three worker nodes and want to use all three as Postfix servers, you need to think about it a little upfront. Postfix’s myhostname parameter should resolve to the public IP of the host the Postfix container runs on (which in this case is normally the K8s worker node).

Now if we have mail.domain.tld as our mail server name, the IP behind that DNS entry can only be assigned to one host. So in case of a host failure you either move that IP around via some failover mechanism, accept the fact that Postfix could be down, or you do something different. That’s basically the single Postfix deployment mentioned above.

As you’ll see later, Postfix will be deployed as a DaemonSet. That means if you have three worker nodes, all three will run Postfix on port 25 bound to the IP(s) of the host network. That way the Postfix service can be exposed to the outside world, normally via a firewall or NAT (which e.g. maps the public IP of a VM to an internal IP address). For a single-Pod Postfix deployment you can tell Kubernetes on which node the Pod should run (see Assign Pods to Nodes using Node Affinity and Assign Pods to Nodes). If two or more Postfix Pods should run, it’s also possible to tell Kubernetes to run them only on a subset of the K8s worker nodes (see the same links). So you might have a dedicated pool of K8s worker nodes for Postfix with fixed/static public IPs.

Quite often K8s node names are not static. You might have a setup where you constantly scale nodes up and down. But with Postfix running on K8s that becomes a problem if the public IPs change constantly. Since quite a few mail servers still do reverse lookups, you either do some DNS magic (e.g. during Pod startup or via an init container) or do it like I did. I only have a very small cluster, so my worker nodes also run the Postfix workload. But if you do some serious stuff it might be a good idea to run Postfix on two or three dedicated worker nodes because of its special requirements: DNS reverse lookups need to work for the public IPs, and the myhostname parameter should resolve to the IP of your mail server (as already mentioned). Those are requirements that you normally don’t have for typical web server or application server workloads.

Good host candidates for running Postfix Pods are also the hosts that run the Ingress proxies. Normally you want to expose your Kubernetes workload to the public Internet. That’s what Ingress is used for. In general, Ingress is basically a proxy that has a connection to the public Internet, accepts HTTP/HTTPS requests and forwards them to internal Kubernetes services. I’m using Traefik Proxy as you’ll see below. As you need to point the DNS entries for your website to some static IPs anyway, these Ingress proxy hosts are perfect for Postfix too. So you can use two or three dedicated nodes for external traffic like HTTP/HTTPS requests and for mail. This also has the advantage that you can harden these instances (security-wise) even better, as you just run this dedicated workload there.

Ingress proxies like Traefik or the NGINX Ingress Controller can also forward incoming TCP traffic and not only HTTP/HTTPS. BUT there is a potential problem with this approach when it comes to email: for incoming mail it probably works very well if the Ingress proxy forwards mail requests to any of the Postfix containers, but not for outgoing mail traffic (at least not out of the box). If you send mails from your internal network to someone else via one of your Postfix instances, the mail server on the “other side” will most probably do a DNS reverse lookup to check if the IP address of your mail server matches the DNS forward lookup. So e.g. we have a DNS record for mail01.domain.tld which resolves to an IP address xxx.yyy.xxx.yyy:

bash

dig mail01.domain.tld

mail01.domain.tld.    60      IN      A       xxx.yyy.xxx.yyy

Now the opposite must also work. If we look up the IP address we should get a PTR record with the hostname:

bash

dig -x xxx.yyy.xxx.yyy

xxx.yyy.xxx.yyy.in-addr.arpa. 86400 IN     PTR     mail01.domain.tld.

Once your Postfix has queued a mail that it should forward elsewhere, it opens a connection to the mail server that should receive the mail. If the outgoing public IP address of your host is different from the one configured in the DNS PTR record, quite a few mail servers will reject your mails. So whatever you do and wherever the Postfix Pod runs, the outgoing IP address and the DNS reverse lookup (PTR record) need to match. You can do some “routing magic” to send all requests to external resources via a gateway and point your DNS reverse lookup to that gateway address. But as you can see, the whole mail server setup can get ever more complicated 😉

So if running a single Postfix Pod isn’t good enough, this might be one(!) of the possible solutions: say you have three worker nodes called worker01, worker02 and worker03. The service/postfix/run_multi file contains the following lines:

bash

# Set myhostname depending on the hostname the Postfix Pod runs on.
# The ID gets extracted from the hostname.
SERVER_ID="${HOSTNAME//worker/}"
sed "s/@ID@/${SERVER_ID}/g" /etc/postfix/main_multi.cf.tmpl > \
  /etc/postfix/main.cf

And in etc/postfix/main_multi.cf.tmpl we have this line:

bash

myhostname = mail@ID@.$mydomain

Since the Postfix Pods are deployed as a DaemonSet with hostNetwork: true, the ${HOSTNAME} variable inside the Pod matches the worker node name, which again is worker01, worker02 or worker03. So the two lines of Bash code above strip the worker part from the hostname, which results in 01, 02 or 03. This ID is then used in main_multi.cf.tmpl to set the myhostname value.
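The same idea can be written a little more defensively. The following is only a sketch, not the script from the repository: it uses POSIX prefix stripping instead of the Bash substitution shown above, and the function names server_id_from_hostname and render_template are made up for this example. It refuses to continue if the node name does not follow the worker<digits> pattern, which avoids generating a broken main.cf.

```shell
#!/bin/sh
# Derive the numeric ID from a node name such as "worker02".
# Prints the ID, or returns 1 if the name does not look like worker<digits>.
server_id_from_hostname() {
  id="${1#worker}"           # strip the leading "worker" prefix
  case "$id" in
    ''|*[!0-9]*) return 1 ;; # empty or contains a non-digit
  esac
  printf '%s\n' "$id"
}

# Replace every @ID@ placeholder read from stdin with the given ID.
render_template() {
  sed "s/@ID@/$1/g"
}

# Usage inside the container (paths as used in the text above):
#   id="$(server_id_from_hostname "$HOSTNAME")" || exit 1
#   render_template "$id" < /etc/postfix/main_multi.cf.tmpl > /etc/postfix/main.cf
```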

A more elegant approach (and more Kubernetes/cloud like) might be to put a label on every K8s node in question with kubectl label nodes .... Similar to the HOSTNAME approach described above, this label can then be used to set myhostname. But how do you get the node labels in a Pod?

What you basically need is again an init container. This container runs before your Postfix container starts, and all containers of a Pod can share a volume. So in the init container you can run a simple Bash script that runs env (to get the HOSTNAME variable, which is normally also the name of the Pod), kubectl describe pod ... (to grep for Node:, which is the node/hostname the Pod runs on) and finally kubectl describe nodes ... to get the labels of that K8s node. Of course this Pod needs a service account that is allowed to query this information from the Kubernetes API server (which normally has the hostname kubernetes.default.svc.cluster.local when doing the request inside a Pod). But it’s out of the scope of this blog post to go into more detail.

For more information see Accessing the API from a Pod and Inject node labels into Kubernetes pod. In the end the Bash script that runs in the init container puts the discovered label or hostname into a file on a shared volume that can be accessed by the Postfix container too. The script service/postfix/run_multi, which starts the Postfix process at the end, can then use that information to set the myhostname variable accordingly.
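To make the init container idea a bit more concrete, here is a minimal sketch. It makes several assumptions that are not part of the repository: kubectl is available in the init container image, the nodes carry a hypothetical label called mailid (e.g. mailid=01), the output path is arbitrary, and the service account is allowed to read Pods and nodes. Reading node objects needs a cluster-scoped permission (a ClusterRole), which the namespaced Role shown later in this post does not grant.

```shell
#!/bin/sh
# Sketch of an init container script (label name "mailid" is hypothetical).

# Print the name of the node a Pod runs on.
node_of_pod() {
  kubectl get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Print the value of the "mailid" label of a node.
mailid_of_node() {
  kubectl get node "$1" -o jsonpath='{.metadata.labels.mailid}'
}

# Write the discovered ID to a file on the volume shared with Postfix.
write_mailid() {
  pod="$1"; out="$2"
  node="$(node_of_pod "$pod")" || return 1
  id="$(mailid_of_node "$node")" || return 1
  [ -n "$id" ] || return 1   # fail if the node has no such label
  printf '%s\n' "$id" > "$out"
}

# Usage (Pod name and path are placeholders):
#   write_mailid "<pod-name>" /shared/mailid
```

Using jsonpath output instead of grepping kubectl describe keeps the script independent of the human-readable output format.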

If we now create DNS entries like mail01.domain.tld, mail02.domain.tld and mail03.domain.tld, there is no DNS magic needed, as we know that the Postfix behind mail01.domain.tld will always run on K8s node worker01, and the same is true for the other two Postfix processes. So we have a 1:1 mapping between K8s worker node and DNS entry, and that’s very easy to handle. This allows us to have a DNS MX entry for the domain that contains all three Postfix instances e.g.:

bash

domain.tld.           60      IN      MX      10 mail01.domain.tld.
domain.tld.           60      IN      MX      20 mail02.domain.tld.
domain.tld.           60      IN      MX      30 mail03.domain.tld.

With this setup mails are normally delivered to mail01.domain.tld as it has the highest priority, which is the lowest preference value (10). You can also set the same value for all entries, which basically gives you some kind of load balancing over all nodes. If host mail01.domain.tld (which is worker01) fails for whatever reason, the sender will try the next host in the list, which is mail02.domain.tld with preference 20. So with this setup incoming mails should always be handled by one of the three available Postfix Pods, as they run independently and on different hosts. And we need no load balancer or failover mechanism to move a single IP in case one of the nodes fails.
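To see the order in which a sending server would try your hosts, you can sort the MX records by their preference value. A minimal sketch follows; the function name mx_order is made up, and the input format is the one printed by dig +short MX. Note that for entries with equal preference real senders pick randomly, which this simple sort does not model.

```shell
#!/bin/sh
# Read "preference hostname" lines (the format printed by `dig +short MX`)
# from stdin and print the hostnames in the order a sender tries them:
# lowest preference number first.
mx_order() {
  sort -n -k1,1 | awk '{ print $2 }'
}

# Usage (needs dig and network access):
#   dig +short MX domain.tld | mx_order
```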

If running a single Postfix Pod is good enough for you, you don’t need to care about this template handling above. In this case a static main.cf is good enough (see service/postfix/run_single and etc/postfix/main_single.cf.tmpl). But nevertheless this mechanism allows you to change Postfix settings before the Postfix process starts if you need it. And as already mentioned this can also be done in init containers. Using init containers allows for separation of concerns. So all the initialization and preparation happens in the init container and the Postfix container just consumes the provided and ready to use resources.

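To make sure that Postfix Pods only run on a specific set of K8s nodes, please have a look at the Kubernetes documentation pages already mentioned above (Assign Pods to Nodes and Assign Pods to Nodes using Node Affinity).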

bash

etc/postfix/vmailbox

Here you finally specify the mail addresses for which you accept mail and want to deliver it to its final destination. The scripts service/postfix/run_* contain a line which converts this text file into a binary database file that Postfix can use for fast lookups (postmap /etc/postfix/vmailbox) before Postfix gets started. For this functionality you can (again) use init containers if you want.
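Since a typo in this file only shows up once mail bounces, a small sanity check before running postmap can help. The sketch below assumes the usual Postfix lookup table layout of one address and one value per line; the function name check_vmailbox is made up, and the check ignores Postfix’s continuation-line syntax, so treat it as a rough first filter only.

```shell
#!/bin/sh
# Report lines of a vmailbox-style file that do not consist of an address
# containing "@" followed by at least one more field.
# Exit status is non-zero if any suspicious line was found.
check_vmailbox() {
  awk '
    /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
    NF < 2 || $1 !~ /@/ { printf "line %d looks wrong: %s\n", NR, $0; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}

# Usage:
#   check_vmailbox etc/postfix/vmailbox && echo "vmailbox looks fine"
```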

Now we can build the Postfix image. As the K8s cluster needs to be able to pull the image later when we deploy the Postfix DaemonSet you should include your Docker registry in the image name. E.g. if your Docker registry host is registry.domain.tld and listens on port 5000 you just run this command (don’t forget the . at the end 😉):

bash

docker build -t registry.domain.tld:5000/postfix:0.1 .

Push that image to your Docker registry now:

bash

docker push registry.domain.tld:5000/postfix:0.1

Because we want to be able to change the Postfix configuration quickly without building a new container image we store the configuration files in a K8s ConfigMap. So if you’re still in my kubernetes-postfix repo and adjusted all parameters and scripts accordingly execute the following commands:

bash

kubectl create configmap postfix --from-file=etc/postfix/
kubectl create configmap mail --from-file=etc/mail/

This copies all the files needed into the postfix and mail ConfigMaps. We’ll mount the files via the subPath option into the Postfix container so that Postfix can read the configuration files. If you want to change any of the files in etc/postfix/ later, just change the file accordingly and run

bash

kubectl create configmap postfix \
  --dry-run=client \
  -o yaml \
  --from-file=etc/postfix/ | \
    kubectl replace -f -

This replaces the postfix ConfigMap. Afterwards you can just kill the Postfix container, and Kubernetes will restart it with the new configuration. Another option is a nice utility called Reloader which takes care of container restarts if a ConfigMap changes.
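If you do this regularly, both steps can be wrapped into one small function. The following is a sketch under the assumption that the DaemonSet is called postfix and lives in the mailserver namespace (as in the examples further below); the function name update_postfix_config is made up. Instead of killing containers by hand it uses kubectl rollout restart, which restarts the Pods of the DaemonSet one after another.

```shell
#!/bin/sh
# Regenerate the "postfix" ConfigMap from a directory and restart the
# DaemonSet so that all Pods pick up the change.
update_postfix_config() {
  dir="${1:-etc/postfix/}"
  kubectl -n mailserver create configmap postfix \
    --dry-run=client -o yaml --from-file="$dir" |
    kubectl -n mailserver replace -f - || return 1
  kubectl -n mailserver rollout restart daemonset/postfix
}

# Usage (from the root of the repository):
#   update_postfix_config etc/postfix/
```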

Next a K8s secret gets created. It includes the DH parameters that were created above:

bash

kubectl create secret generic dhparam --from-file=dhparam/

Since the mails should travel encrypted over the wire, a TLS certificate is needed and Postfix has to be configured accordingly. So now it’s time to set up and configure cert-manager. cert-manager is a Kubernetes add-on to automate the management and issuance of TLS certificates from various issuing sources. It periodically ensures that certificates are valid and up to date, and attempts to renew them at an appropriate time before expiry. It stores the received certificates as K8s Secrets. In this case it’s good enough to make the Postfix Pods reread a new certificate by restarting them.

As already mentioned above, Reloader is an option to do the Pod restart if the certificate changes. It’s a Kubernetes controller that watches for changes in ConfigMaps and Secrets and does rolling upgrades on Pods with their associated Deployment, StatefulSet, DaemonSet and DeploymentConfig.

So let’s install cert-manager. I already did this in Kubernetes the Not So Hard Way With Ansible - Ingress with Traefik v2 and cert-manager (Part 2). Please follow the steps there from the top, including the paragraph Install ClusterIssuer. If you don’t want to use my cert-manager Ansible role as described there, you can also use the official Helm chart or install it in any other way of course. If cert-manager is installed and the ClusterIssuer for Let’s Encrypt is in place, read on.

This step is optional if you use Let's Encrypt wildcard certificates. In this case the DNS challenge is the only option to get this kind of certificate, and you don’t need an Ingress proxy for the verification process. This challenge asks you to prove that you control the DNS zone by putting a specific value in a TXT record in that zone. It is harder to configure than HTTP-01 but can work in scenarios where HTTP-01 can’t. So if you can use the DNS challenge it makes sense to configure the cert-manager (Cluster)Issuer accordingly.

Otherwise you need to use HTTP-01 challenge. Most probably you already have a Kubernetes Ingress proxy installed if you allow HTTP/HTTPs requests to your Kubernetes cluster from the public Internet.

I’ve written a blog post on how to setup Traefik proxy: Kubernetes the Not So Hard Way With Ansible - Ingress with Traefik v2 and cert-manager (Part 1). You can also use the official Helm chart to install and configure Traefik. But of course you can use any Ingress proxy you prefer.

If you can use DNS01 challenge you can skip this step.

As already described above, running the Ingress proxy (Traefik in my case) and Postfix together makes things a little easier. If the HTTP01 challenge is used, Let's Encrypt issues an HTTP request to verify that you actually manage the domain. So when cert-manager tries to get a TLS certificate from Let’s Encrypt, it temporarily configures a route (Ingress resource) in Traefik that intercepts requests to the domain you want the TLS certificate for, provided the Host HTTP header matches the domain name AND a specific path is requested. This path normally starts with /.well-known/acme-challenge.

So back to my example above with my three hosts called worker(01|02|03), each of which now runs one Postfix and one Traefik Pod. For this setup three DNS records of type A are needed:

bash

mail01.domain.tld. 60 IN A xxx.yyy.xxx.1
mail02.domain.tld. 60 IN A xxx.yyy.xxx.2
mail03.domain.tld. 60 IN A xxx.yyy.xxx.3

xxx.yyy.xxx.(1|2|3) are the public IPv4 addresses of worker(01|02|03). You could also create DNS records of type CNAME that point to the worker host names, but since the same names are used as MX targets below, A records are the safer choice (RFC 2181 does not allow an MX record to point to a CNAME alias). These entries are needed for the Let's Encrypt HTTP01 challenge. Later we’ll create a certificate request that contains all three domain names. This will cause Let's Encrypt to make HTTP requests for each of the three configured domains to verify that you are in charge of them. So it will request a URL below http://mail01.domain.tld/.well-known/acme-challenge/, and the same is true for mail02 and mail03 of course. cert-manager will make sure that this URL/route exists and responds accordingly. After the verification the URL/route is removed again.

As (in my example) worker(01|02|03) also run a Postfix Pod to handle incoming mail traffic, the three DNS records of type MX (for mail exchange) already shown further above are needed as well:

bash

dig domain.tld MX

domain.tld.           60      IN      MX      10 mail01.domain.tld.
domain.tld.           60      IN      MX      20 mail02.domain.tld.
domain.tld.           60      IN      MX      30 mail03.domain.tld.

With this in place other mail servers now have the information where to deliver mails addressed to you.

There is one more thing that you should do now with regard to DNS: create a PTR record for each of your mail servers’ IP addresses. The PTR record is used for the reverse lookup which maps the IP address to the name (so it’s the opposite of a forward lookup where you want the IP address of a DNS A or CNAME record). So for mail01.domain.tld the “forward lookup” looks like this e.g.:

bash

dig mail01.domain.tld

...
;; ANSWER SECTION:
mail01.domain.tld.      60      IN      A       xxx.yyy.xxx.1
...

But the opposite should also work e.g.:

bash

dig -x xxx.yyy.xxx.1

...
;; ANSWER SECTION:
xxx.yyy.xxx.1.in-addr.arpa. 60   IN      PTR     mail01.domain.tld.
...

How to set up a reverse lookup depends on your provider. For Scaleway it’s pretty easy: just log in to the Scaleway UI, select Network in the left menu, choose the IP of your mail server, click on it, and then you can edit the REVERSE value. If we take the example above the value would be mail01.domain.tld.

The same is basically true for Hetzner Cloud. See Cloud Server rDNS and What is a valid rDNS entry for mail servers?

Strictly speaking a PTR record is not a requirement, but there are still lots of installations out there that do a reverse lookup of your mail server’s IP and refuse your mail if no PTR record exists. Strict mail servers do a forward lookup on the name your mail server introduces itself as (such as mail01.domain.tld), verify that it matches the IP address read off the connection, and do a PTR lookup on that IP address to see if it resolves to the same name. So without properly configured DNS reverse lookups many mail servers will reject your mails.
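The check that strict mail servers perform can be approximated from the command line. The following sketch is an assumption about how one might script it, not something from the repository: the function names normalize_name and fcrdns_ok are made up, and only the first A record and the first PTR record are compared.

```shell
#!/bin/sh
# Remove one trailing dot and lowercase a DNS name so names can be compared.
normalize_name() {
  printf '%s\n' "${1%.}" | tr 'A-Z' 'a-z'
}

# Succeeds if forward and reverse lookup of the given name agree.
# Only the first A record is checked; needs dig and network access.
fcrdns_ok() {
  ip="$(dig +short A "$1" | head -n 1)"
  [ -n "$ip" ] || return 1
  ptr="$(dig +short -x "$ip" | head -n 1)"
  [ "$(normalize_name "$ptr")" = "$(normalize_name "$1")" ]
}

# Usage:
#   fcrdns_ok mail01.domain.tld && echo "PTR matches"
```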

Now with DNS configured and cert-manager and Traefik in place (Traefik is optional if you use the DNS01 challenge), a TLS certificate from Let's Encrypt can be ordered. In my Git repo there is a sample file called kubernetes/certificate.yml and it looks like this:

yaml

---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: mail-domain-tld-cert
  namespace: mailserver
spec:
  commonName: mail01.domain.tld
  secretName: mail-domain-tld-cert
  dnsNames:
    - mail01.domain.tld
    - mail02.domain.tld
    - mail03.domain.tld
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer

Of course you need to adjust the values. So what’s in this spec: if applied successfully, a Certificate resource called mail-domain-tld-cert will be created. If Let's Encrypt successfully verifies your request (the letsencrypt-prod ClusterIssuer is used in this case), a Kubernetes Secret called mail-domain-tld-cert (secretName) will be created in the mailserver namespace. The generated certificate contains several different names using the Subject Alternative Name (SAN) mechanism. So thanks to Let's Encrypt supporting SAN certificates it’s good enough to have one certificate with multiple domain names if more than one mail server is used. Just list all your mail server hostnames here.

As mentioned above, if the certificate request was successful there should now be a K8s Secret called mail-domain-tld-cert (in this example) in the mailserver namespace (kubectl -n mailserver get secrets mail-domain-tld-cert -o yaml). The Secret contains tls.crt (the Let’s Encrypt certificate file) and tls.key (the key file). These two files will be mounted into the Pod and loaded by Postfix. These are the two relevant settings in main.cf:

bash

...
# Let's Encrypt certificate file
smtpd_tls_cert_file = /etc/postfix/certs/tls.crt
# Let's Encrypt key file
smtpd_tls_key_file  = /etc/postfix/certs/tls.key
...
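To check what cert-manager actually stored, the certificate can be decoded from the Secret and inspected with openssl. This is a minimal sketch; the function name secret_field is made up, and it assumes cluster access plus base64 and openssl on your workstation. Note the escaped dot in the key name, which kubectl’s jsonpath syntax requires.

```shell
#!/bin/sh
# Print one decoded field of a Secret in the mailserver namespace.
# $1 = Secret name, $2 = jsonpath-escaped key (dots escaped as "\.").
secret_field() {
  kubectl -n mailserver get secret "$1" -o "jsonpath={.data.$2}" | base64 -d
}

# Usage (needs cluster access and openssl):
#   secret_field mail-domain-tld-cert 'tls\.crt' |
#     openssl x509 -noout -subject -enddate
```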

Now we can finally set up the important K8s resources to run Postfix. Let’s start with a Role. An RBAC Role or ClusterRole contains rules that represent a set of permissions. A Role always sets permissions within a particular namespace; when you create a Role, you have to specify the namespace it belongs in. So for my example the Role resource looks like this:

yaml

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: mailserver
  name: postfix
rules:
- apiGroups: 
    - ""
  resources: 
    - configmaps
    - secrets
  resourceNames:
    - postfix
  verbs:
    - get

All K8s resources needed are also in my kubernetes-postfix repository. Adjust to your needs and save this as role.yaml. As we created ConfigMaps and Secrets, we of course need permissions to at least fetch them. That’s what we see in resources. In verbs we define that we just want to get the resources (i.e. no write permissions needed). The "" in apiGroups indicates the core API group.

Next we need a RoleBinding. A role binding grants the permissions defined in a role to a user or set of users. It holds a list of subjects (users, groups, or service accounts), and a reference to the role being granted. A RoleBinding grants permissions within a specific namespace whereas a ClusterRoleBinding grants that access cluster-wide.

yaml

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: postfix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: postfix
subjects:
- kind: ServiceAccount
  name: postfix
  namespace: mailserver

Save this as rolebinding.yaml. This is basically the “glue” between the Role and the ServiceAccount which comes next:

A ServiceAccount provides an identity for processes that run in a Pod.

yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: postfix
  namespace: mailserver

Save this to serviceaccount.yaml. We’ll use this ServiceAccount in the DaemonSet configuration to assign the Postfix Pod the grants it needs to operate which we defined in the Role above.
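Once Role, RoleBinding and ServiceAccount are applied, you can ask the API server whether the grants work as intended. The sketch below uses kubectl auth can-i with impersonation; the function names sa_user and can_read_config are made up, and your own user needs the right to impersonate service accounts for this to work.

```shell
#!/bin/sh
# Build the user name under which a ServiceAccount authenticates.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the postfix ServiceAccount may read the
# postfix ConfigMap. kubectl prints "yes" or "no".
can_read_config() {
  kubectl -n mailserver auth can-i get configmap/postfix \
    --as="$(sa_user mailserver postfix)"
}

# Usage:
#   can_read_config
```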

The final K8s resource we define is a DaemonSet. So here it is:

yaml

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: postfix
  namespace: mailserver
  labels:
    app: postfix
spec:
  selector:
    matchLabels:
      app: postfix
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: postfix
    spec:
      serviceAccountName: postfix
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      imagePullSecrets:
      - name: registry-domain-tld
      containers:
      - image: registry.domain.tld:5000/postfix:0.1
        name: postfix
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 2
          tcpSocket:
            port: 25
          initialDelaySeconds: 10
          periodSeconds: 60
        readinessProbe:
          failureThreshold: 2
          tcpSocket:
            port: 25
          periodSeconds: 60
        resources:
          requests:
            memory: "32Mi"
            cpu: "50m"
          limits:
            memory: "64Mi"
            cpu: "50m"
        ports:
        - name: smtp
          containerPort: 25
          hostPort: 25
        - name: smtp-auth
          containerPort: 587
          hostPort: 587
        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - DAC_OVERRIDE
            - FOWNER
            - SETUID
            - SETGID
            - NET_BIND_SERVICE
        volumeMounts:
        - name: config
          subPath: bodycheck
          mountPath: /etc/postfix/bodycheck
          readOnly: true
        - name: config
          subPath: headercheck
          mountPath: /etc/postfix/headercheck
          readOnly: true
        - name: config
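          # ConfigMap key as created from etc/postfix/ above
          # (use main_single.cf.tmpl for the single Pod setup)
          subPath: main_multi.cf.tmpl
          mountPath: /etc/postfix/main_multi.cf.tmpl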
          readOnly: true
        - name: config
          subPath: vmailbox
          mountPath: /etc/postfix/vmailbox
          readOnly: true
        - name: aliases
          subPath: aliases
          mountPath: /etc/mail/aliases
          readOnly: true
        - name: certs
          subPath: tls.crt
          mountPath: /etc/postfix/certs/tls.crt
          readOnly: true
        - name: certs
          subPath: tls.key
          mountPath: /etc/postfix/certs/tls.key
          readOnly: true
        - name: dhparam
          subPath: dh512.pem
          mountPath: /etc/postfix/dh512.pem
          readOnly: true
        - name: dhparam
          subPath: dh2048.pem
          mountPath: /etc/postfix/dh2048.pem
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: postfix
      - name: aliases
        configMap:
          name: mail
      - name: certs
        secret:
          secretName: mail-domain-tld-cert
      - name: dhparam
        secret:
          secretName: dhparam

Again adjust to your needs and save as daemonset.yaml. Let’s have a quick look at the important parts of the DaemonSet definition:

yaml

      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet

This will cause Postfix to listen on the host's network and NOT on an internal Pod IP. This makes it possible to reach the Postfix service from the internet (if the firewall allows it 😉). For Pods running with hostNetwork, you should explicitly set the DNS policy to ClusterFirstWithHostNet. Otherwise the Pod falls back to the node's DNS configuration and can't resolve cluster-internal service names (see Pod’s DNS Policy).
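
One way to confirm the effect of hostNetwork is to compare each Pod's IP with the IP of the node it runs on: with host networking the two should be identical. The following is only a sketch. It assumes the mailserver namespace and the app=postfix label from the DaemonSet above, and the helper name all_on_host_network is made up for illustration:

```bash
# Reads lines of "<podIP> <hostIP>" on stdin and succeeds only if there is
# at least one line and every line holds two identical addresses
# (hypothetical helper, not part of kubectl).
all_on_host_network() {
  awk 'NF != 2 || $1 != $2 { bad = 1 } END { exit (bad || NR == 0) }'
}

# Intended use against the cluster (not run here):
#   kubectl -n mailserver get pods -l app=postfix \
#     -o jsonpath='{range .items[*]}{.status.podIP}{" "}{.status.hostIP}{"\n"}{end}' \
#     | all_on_host_network && echo "all Postfix Pods use the host network"
```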

yaml

      imagePullSecrets:
      - name: registry-domain-tld

I need this setting to pull container images from my private Docker registry. Have a look at Pull an Image from a Private Registry for more information. This secret basically contains the Docker credentials needed to log in to the private registry.

yaml

        livenessProbe:
          failureThreshold: 2
          tcpSocket:
            port: 25
          initialDelaySeconds: 10
          periodSeconds: 60
        readinessProbe:
          failureThreshold: 2
          tcpSocket:
            port: 25
          periodSeconds: 60

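The kubelet uses liveness probes to know when to restart a Container and uses readiness probes to know when a Container is ready to start accepting traffic (see Configure Liveness and Readiness Probes). Both probes here only check that a TCP connection to port 25 can be opened. With periodSeconds: 60 and failureThreshold: 2 it takes roughly two minutes of failed checks before the kubelet reacts.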

yaml

        resources:
          requests:
            memory: "32Mi"
            cpu: "50m"
          limits:
            memory: "64Mi"
            cpu: "50m"

Here we specify resource requests and limits. requests is the amount of CPU and memory the scheduler reserves for the container on the node, so the Pod gets at least this amount. For limits I used the same CPU value but doubled the memory value. With limits set higher than requests, the container may use more resources if it needs them and if they are available on that node. If you don’t specify limits, the container may use as many resources as the node provides. Setting at least a memory limit makes sense: if a process in the Pod runs wild and consumes lots of memory, the OOM killer terminates that container once it exceeds its limit, before it can harm other Pods on the node through memory pressure (see also: Resource Quality of Service in Kubernetes).
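
The relation between requests and limits also decides which Quality of Service class the Pod ends up in. The sketch below mirrors the documented rules for a Pod with a single container. The function name qos_class is made up for illustration, and the values are compared as plain strings, which is a simplification:

```bash
# Derives the QoS class for a single-container Pod from its requests and
# limits (hypothetical helper). Arguments: memory request, memory limit,
# CPU request, CPU limit; pass "" for a value that is not set.
# Values are compared as strings, so "1Gi" and "1024Mi" count as different.
qos_class() (
  # Neither requests nor limits set: BestEffort.
  if [ -z "$1$2$3$4" ]; then
    echo BestEffort
    exit 0
  fi
  # A missing request defaults to the corresponding limit.
  req_mem=${1:-$2} lim_mem=$2 req_cpu=${3:-$4} lim_cpu=$4
  if [ -n "$lim_mem" ] && [ -n "$lim_cpu" ] &&
     [ "$req_mem" = "$lim_mem" ] && [ "$req_cpu" = "$lim_cpu" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
)

# The values from the DaemonSet above: prints "Burstable", because the
# memory limit is higher than the memory request.
qos_class 32Mi 64Mi 50m 50m

# What the cluster actually assigned can be read from the Pod status:
#   kubectl -n mailserver get pods -l app=postfix \
#     -o jsonpath='{.items[*].status.qosClass}'
```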

yaml

        ports:
        - name: smtp
          containerPort: 25
          hostPort: 25

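This exposes containerPort 25 on hostPort 25 of the node. In a Pod without hostNetwork, the Postfix process in the container could listen on a different port than 25, because for connectivity only the hostPort matters: that is the one which needs to be reachable from the internet. With hostNetwork: true, as used here, Postfix binds directly to the node's port 25 and Kubernetes requires hostPort and containerPort to be identical.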

yaml

        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - DAC_OVERRIDE
            - FOWNER
            - SETUID
            - SETGID
            - NET_BIND_SERVICE

As we need to bind Postfix to a port < 1024 we need these security settings (NET_BIND_SERVICE is the capability that allows binding to such ports). Linux capabilities split the privileges of root into separate units, so dropping ALL and adding back only a few limits which privileged operations the processes in the container are allowed to perform. I need to do some further experiments. Maybe the capabilities can be reduced even further. But that’s the best I came up with for now. At least it’s better than privileged.

Finally let’s have a look at some parts of the volumeMounts and volumes. In volumes we have
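
To see which capabilities a running container really has, you can read the CapEff bitmask from /proc/1/status inside the container and translate it into names. The following is a sketch only. The helper decode_caps is made up for illustration (capsh --decode from libcap does the same if it is installed), and the mask 00000000000004ca is what I would expect for a root process with exactly the five capabilities listed above, not output copied from a real Pod:

```bash
# Translates a capability bitmask (hex, as in the CapEff line of
# /proc/<pid>/status) into capability names. Only capability numbers
# 0 to 10 from linux/capability.h are listed, which is enough for the
# set used in this DaemonSet (hypothetical helper).
decode_caps() (
  mask=$(( 0x$1 ))
  names="CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE"
  bit=0
  out=""
  for name in $names; do
    # Test whether the bit for this capability number is set.
    if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
      out="$out $name"
    fi
    bit=$(( bit + 1 ))
  done
  echo "${out# }"
)

# Prints: DAC_OVERRIDE FOWNER SETGID SETUID NET_BIND_SERVICE
decode_caps 00000000000004ca

# Reading the real value from a Pod (not run here, <pod> is a placeholder):
#   kubectl -n mailserver exec <pod> -- grep CapEff /proc/1/status
```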

yaml

      - name: config
        configMap:
          name: postfix

Remember that we created a ConfigMap called postfix from all Postfix configuration files above? We basically “mount” this ConfigMap in volumes and name this “volume” config. As you may also recall this ConfigMap contains more “files”. You can see them if you use this command:

bash

kubectl get configmap postfix -o yaml

The output looks like this:

yaml

apiVersion: v1
data:
  bodycheck: |
    ...
  headercheck: |
    ...
  main.cf.tmpl: |
    ...
  vmailbox: |
    ...
kind: ConfigMap
metadata:
  name: postfix
  namespace: mailserver
  ...

In volumeMounts we now tell K8s to use subPath: bodycheck (which contains the content of the bodycheck file) and mount it into the container: mountPath: /etc/postfix/bodycheck. Additionally it should be readOnly:

yaml

        - name: config
          subPath: bodycheck
          mountPath: /etc/postfix/bodycheck
          readOnly: true

Next we have volumes that contain our secrets like this one:

yaml

      - name: certs
        secret:
          secretName: mail-domain-tld-cert

As you may recall this Secret was created by cert-manager and contains tls.crt and tls.key. As already mentioned above this contains the TLS certificate needed for secure communication. Again we use the subPath key to specify which file in the secret we want to mount and where to mount inside the container so that Postfix is able to load the files that were specified in its config file:

yaml

        - name: certs
          subPath: tls.crt
          mountPath: /etc/postfix/certs/tls.crt
          readOnly: true
        - name: certs
          subPath: tls.key
          mountPath: /etc/postfix/certs/tls.key
          readOnly: true

A final word on volumes: Currently the Postfix directories /var/spool/postfix, /var/mail-state and /var/mail are ephemeral. That means if the Postfix container is still processing a mail in its queue when it gets killed for whatever reason, the mail is lost! The mail queue and its contents only exist in the container, and as you know, when a container gets killed everything that wasn’t part of the Docker image is lost.
So if you don’t want to lose mail you need to add persistent storage. This is a topic on its own and can’t be covered in this blog post. For more information see Volumes and Configure a Pod to Use a PersistentVolume for Storage.
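
As long as the queue is ephemeral it's worth checking that it is empty before you delete a Pod or drain a node on purpose. A minimal sketch, assuming postqueue is available in the container image. The helper name queue_requests is made up for illustration; it only parses the summary that postqueue -p prints at the end of its listing:

```bash
# Prints the number of queued messages, given the output of
# "postqueue -p" on stdin (hypothetical helper). Postfix ends the listing
# with a summary such as "-- 3 Kbytes in 2 Requests." or, if nothing is
# queued, prints "Mail queue is empty".
queue_requests() {
  awk '
    /^Mail queue is empty/         { n = 0; seen = 1 }
    /^-- .* in [0-9]+ Requests?\./ { n = $(NF - 1); seen = 1 }
    END { if (seen) print n; else exit 1 }
  '
}

# Intended use against a running Pod (not run here, <pod> is a placeholder):
#   kubectl -n mailserver exec <pod> -- postqueue -p | queue_requests
```

If this prints anything other than 0, wait or flush the queue before removing the Pod.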

Now the K8s resources can be created:

bash

kubectl create -f role.yaml
kubectl create -f rolebinding.yaml
kubectl create -f serviceaccount.yaml
kubectl create -f daemonset.yaml
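
After creating the resources it makes sense to check that the DaemonSet really started a ready Pod on every scheduled node. The following is a sketch under the assumption that the names from above are used. The helper ds_is_ready is made up for illustration and only compares the two numbers that kubectl reports in the DaemonSet status:

```bash
# Succeeds when a "<ready>/<desired>" pair says that every scheduled Pod
# of the DaemonSet is ready (hypothetical helper).
ds_is_ready() (
  ready=${1%/*}
  desired=${1#*/}
  [ "$desired" -gt 0 ] && [ "$ready" -eq "$desired" ]
)

# Intended use against the cluster (not run here):
#   kubectl -n mailserver rollout status daemonset/postfix
#   status=$(kubectl -n mailserver get daemonset postfix \
#     -o jsonpath='{.status.numberReady}/{.status.desiredNumberScheduled}')
#   ds_is_ready "$status" && echo "Postfix is ready on all scheduled nodes"
```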

Next make sure that you opened port 25 on the firewall 😉 If you used my ansible-role-harden-linux you can simply adjust the harden_linux_ufw_rules variable by adding these lines

yaml

harden_linux_ufw_rules:
  - rule: "allow"
    to_port: "25"
    protocol: "tcp"

and roll out the changes via Ansible. Since we (hopefully) already created the DNS entries for mail(01|02|03).domain.tld and the MX record for domain.tld we should now be able to verify if TLS connectivity works as expected (of course again replace mail01.domain.tld with your mail domain) e.g.:

bash

echo QUIT | openssl s_client \
  -starttls smtp \
  -crlf \
  -connect mail01.domain.tld:25 | \
    openssl x509 -noout -dates

depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = mail01.domain.tld
verify return:1
250 DSN
DONE
notBefore=May  4 21:10:28 2021 GMT
notAfter=Aug  2 21:10:28 2021 GMT

That’s looking good :-) You can also just run

bash

echo QUIT | openssl s_client \
  -starttls smtp \
  -crlf \
  -connect mail01.domain.tld:25

which gives you more information.
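
The notBefore and notAfter values from the first command can also be turned into a number of days, e.g. for a simple expiry check. A minimal sketch, assuming GNU date (the -d option). The helper name days_between is made up for illustration:

```bash
# Prints the number of whole days between two timestamps in the format
# openssl prints for notBefore/notAfter (hypothetical helper).
# Assumes GNU date, which can parse these strings via -d.
days_between() (
  start=$(date -u -d "$1" +%s)
  end=$(date -u -d "$2" +%s)
  echo $(( (end - start) / 86400 ))
)

# Validity period of the certificate shown above: prints 90
days_between "May  4 21:10:28 2021 GMT" "Aug  2 21:10:28 2021 GMT"

# Days left until expiry, counted from now:
#   days_between "now" "Aug  2 21:10:28 2021 GMT"
```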

Now we’re basically set up and you should be able to send and receive mails. If you have a Gmail account you can see if the mail was encrypted while it traveled from your mail server to Gmail's mail server. If you open a mail in Gmail which you sent through your mail server you can click on See security details.

What we can implement next is SPF (at least an SPF DNS record, without having the SPF policy agent configured within Postfix). SPF (Sender Policy Framework) lets a domain publish which hosts are allowed to send email for it, so that receiving mail servers can check the sending host against that list. Setting up SPF helps to prevent your email from being classified as spam. To make it possible for other mail servers to check your SPF policy we need to create a TXT DNS record.
MxToolBox SPF Record Generator helps us to create one. Just insert your domain name (that’s only your domain+TLD, e.g. domain.tld and not mail01.domain.tld) and click on Check SPF Record. Now you see the SPF Wizard at the end of the page. Answer the questions and/or fill out the input fields and generate the record. Now add this TXT DNS record to your DNS configuration. I suggest using a small time to live (TTL), e.g. between 60 and 300 seconds, in the beginning so that you can change the record quickly if needed. To verify that the record is set use dig, e.g.:

bash

dig domain.tld TXT

domain.tld. 60 IN TXT "v=spf1 mx ~all"

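This record basically states that mail for domain.tld is sent from the same servers that are listed in its MX records (mx). The ~all at the end tells receivers to treat mail from any other host as a soft fail, meaning it is usually still accepted but marked as suspicious.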
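
The qualifier in front of all is the part of the record that is easiest to get wrong, so here is a small sketch that spells out its meaning. The helper name spf_all_policy is made up for illustration. The four qualifiers themselves (+, -, ~ and ?) come from the SPF specification:

```bash
# Prints how an SPF record asks receivers to treat mail from hosts that
# match none of its mechanisms, based on the qualifier in front of "all"
# (hypothetical helper). Takes the TXT record as printed by dig.
spf_all_policy() (
  # Strip the double quotes dig puts around TXT records.
  record=$(printf '%s' "$1" | tr -d '"')
  case " $record " in
    *" -all "*)            echo "fail" ;;
    *" ~all "*)            echo "softfail" ;;
    *" ?all "*)            echo "neutral" ;;
    *" +all "*|*" all "*)  echo "pass" ;;
    *)                     echo "no all mechanism" ;;
  esac
)

# The record from above: prints "softfail"
spf_all_policy '"v=spf1 mx ~all"'

# Against live DNS (not run here):
#   spf_all_policy "$(dig +short TXT domain.tld | grep 'v=spf1')"
```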

Phew… What a ride! If you made it that far: 🎉 That’s it for now. There are a few other things I want to implement when I have time:

  • Add the SPF policy agent to Postfix
  • Implement DKIM (DomainKeys Identified Mail)
  • Implement DMARC (Domain-based Message Authentication, Reporting & Conformance)
  • Install a CSI driver to have persistent storage for my K8s cluster and move the Postfix volumes there instead of having the mail queue directory in the Pod

Lots to do 😉 I’ll update the blog post accordingly.