What is a Man-in-the-Middle attack? How can you prevent it?

Imagine a hypothetical scenario.

It was a typical day at the office until an engineer rushed to the manager's cabin with alarming news: someone was making unexpected calls to their API endpoints with a valid client token, using it to reverse engineer the system and extract data. The manager understood the gravity of the situation and asked him, “What was compromised?” “Well, not sure, we need to see the logs and investigate,” the engineer replied.

After an investigation, they discovered that it was a Man-in-the-Middle (MitM) attack. The attacker had intercepted the app-server communication, acting as a proxy. The engineering team gathered to discuss how to prevent this from happening again. They realised that although they had implemented standard security measures, like using an HSTS-enabled server, the attack had still happened due to a sophisticated technique called SSL hijacking.

Enough context. Now let’s dive into a case study of the problem and how to cope with it.

Elephant in the Town – MitM

Imagine a hidden listener messing with your chat, like a spy silently tampering with your messages during a conversation. A man-in-the-middle attack is a cyber-attack in which the attacker intercepts the communication between two parties and impersonates one of them, so that it appears to be a normal exchange of information over the network. It’s one of the most sophisticated attacks, and the goal could be anything: monitoring or spying, stealing data, reverse engineering the systems involved, etc.

To understand how this happens, we need to understand the protocols that power the communication.

The Protocols behind

HTTP/HTTPS: An application-layer protocol, built on top of TCP, designed to transfer information between networked devices.

TCP/IP: Transmission Control Protocol (TCP) is a protocol used for establishing reliable, ordered, and error-checked communication between applications running on host computers connected to the Internet.

When you make an HTTP request, a three-way TCP handshake first takes place between the client and the server, as follows:

Step 1 – SYN: Client sends a segment with SYN (hello) which tells the server that it wants to communicate.
Step 2 – SYN + ACK: Server responds with a segment that has both the SYN (hello) and ACK (hello back) flags set. The ACK flag acknowledges the client’s initial sequence number, and the SYN flag indicates the server’s willingness to establish a connection.
Step 3 – ACK: Client sends an ACK segment to acknowledge the server’s initial sequence number, completing the three-way handshake and establishing the TCP connection (not secure yet).
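
The three steps above are carried out by the operating system rather than by application code, so they are easy to miss. The following is a minimal sketch, assuming a throwaway server on the local loopback interface (the function name tcp_handshake_demo is made up for this illustration). It shows where the handshake happens from a program's point of view: inside connect() on the client and before accept() returns on the server.

```python
import socket
import threading


def tcp_handshake_demo() -> bytes:
    """Open a loopback TCP connection and send one message over it."""
    # Server side: bind to an ephemeral port on localhost and start listening.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve() -> None:
        # accept() hands back a connection whose handshake has completed.
        conn, _addr = server.accept()
        with conn:
            received.append(conn.recv(1024))

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: connecting makes the kernel send SYN, wait for SYN+ACK
    # and answer with ACK. The call returns only after that has succeeded.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"hello over plain TCP")

    worker.join(timeout=5)
    server.close()
    return received[0]


if __name__ == "__main__":
    print(tcp_handshake_demo())
```

Note that the message travels in plain text here: the connection is established, but nothing is encrypted yet.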

Now after the successful TCP handshake, an SSL/TLS Handshake takes place to establish a secure HTTPS connection between a server and a client.

TLS/SSL handshake

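TLS (Transport Layer Security) is an encryption and authentication protocol designed to secure Internet communications. SSL is its older, now-deprecated predecessor, although the name is still widely used.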

During a TLS/SSL handshake, the client and server together will do the following:

Agree on which version of TLS to use (TLS 1.0, 1.2, 1.3)
Decide which cipher suites they will use
Authenticate the identity of the server via the server’s public key and the SSL certificate authority’s digital signature
Generate session keys in order to use symmetric encryption
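
To see the outcome of these steps from a client, here is a small sketch using Python's standard ssl module. The helper names build_client_context and describe_handshake are made up for this illustration, and the host you pass in is up to you. The handshake itself happens inside wrap_socket(); afterwards the socket reports the negotiated version and cipher suite.

```python
import socket
import ssl


def build_client_context() -> ssl.SSLContext:
    """Client settings: verify the server certificate, refuse old TLS versions."""
    # The default context loads the system trust store, requires a valid
    # certificate and checks that it matches the hostname.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: anything older than TLS 1.2 is rejected.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def describe_handshake(host: str, port: int = 443) -> dict:
    """Connect to host and report what the TLS handshake negotiated."""
    ctx = build_client_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # wrap_socket() performs the TLS handshake on top of the TCP connection.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "version": tls.version(),      # negotiated protocol version
                "cipher": tls.cipher()[0],     # negotiated cipher suite name
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling describe_handshake with a host of your choice needs network access, so the result depends on the server you connect to.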

The handshake uses asymmetric cryptography to exchange keys securely. Algorithms such as Rivest-Shamir-Adleman (RSA) and Diffie-Hellman let the client and server agree on session keys, and those session keys are then used with faster symmetric encryption to encrypt and decrypt data during communication. TLS 1.3 dropped RSA key exchange and relies on ephemeral Diffie-Hellman only. Read more about encryptions here.

After this handshake, our connection is secure over HTTPS. Now let’s understand the SSL certificate that is verified during the TLS handshake.
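
The idea behind a Diffie-Hellman exchange can be shown with toy numbers. This is only an illustration: the prime 23, generator 5 and private values 6 and 15 are textbook-sized assumptions, far too small to be secure, and real TLS uses much larger groups or elliptic curves. The point is that both sides arrive at the same secret although only the public values cross the network.

```python
def dh_public(generator: int, prime: int, private: int) -> int:
    """Public value that is safe to send over the network."""
    return pow(generator, private, prime)


def dh_shared(other_public: int, private: int, prime: int) -> int:
    """Shared secret, computed from the peer's public value and our private one."""
    return pow(other_public, private, prime)


# Toy parameters (assumptions for illustration only, not secure).
PRIME, GENERATOR = 23, 5
client_private, server_private = 6, 15

client_public = dh_public(GENERATOR, PRIME, client_private)   # 8
server_public = dh_public(GENERATOR, PRIME, server_private)   # 19

# Each side combines its own private value with the other's public value.
client_secret = dh_shared(server_public, client_private, PRIME)
server_secret = dh_shared(client_public, server_private, PRIME)

print(client_secret, server_secret)  # 2 2
```

Someone who only sees 8 and 19 on the wire cannot easily derive the shared value when the numbers are large enough. On its own, though, the exchange does not prove who is on the other end, which is why certificates matter.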

SSL Certificate

An SSL certificate creates a foundation of trust for establishing a secure connection.

It’s a digital certificate issued by a Certificate Authority (CA) that contains information about the owner of the certificate, such as the domain name, organization name, signing algorithm, validity period, and the server's public key, which is used to establish a secure connection.

The SSL certificate is basically a chain of certificates ⛓️ (also called a chain of trust). As you can see in the image below, the chain has three parts: the Root Certificate, the Intermediate Certificate, and the Server Certificate, which is ours. The Root CA (Certificate Authority) signs the Intermediate CA's certificate, and the Intermediate CA signs our Server Certificate.

Now let’s get back to our SSL handshake, where we said that the server sends the certificate to the client and the client verifies it. But how does that actually work? Let’s see.

The client first checks that the certificate is valid for the requested domain and has not expired. It then verifies the certificate chain to ensure that the certificate has been issued by a trusted root certificate authority that is included in the operating system’s trust store. If the certificate is not signed by a trusted CA, the connection will not be established.

So now we know that the TLS/SSL handshake is where the server's SSL certificate is verified and trusted by the client, and that if the certificate is invalid or not signed by a trusted CA, the connection will simply be dropped.
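
The checks described above can be sketched with a deliberately simplified model. ToyCert and is_chain_trusted are hypothetical names made up for this illustration. A real client verifies cryptographic signatures and the hostname as well, whereas this toy version only checks validity dates and follows issuer names until it reaches a root in the trust store.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Set


@dataclass(frozen=True)
class ToyCert:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime


def is_chain_trusted(leaf: ToyCert, intermediates: Dict[str, ToyCert],
                     trusted_roots: Set[str], now: datetime) -> bool:
    """Walk from the server certificate up to a root in the trust store."""
    cert = leaf
    # Bounded loop so a malformed, circular chain cannot spin forever.
    for _ in range(len(intermediates) + 1):
        if not (cert.not_before <= now <= cert.not_after):
            return False                      # expired or not yet valid
        if cert.issuer in trusted_roots:
            return True                       # reached a trusted root
        parent = intermediates.get(cert.issuer)
        if parent is None:
            return False                      # issuer unknown: chain is broken
        cert = parent
    return False


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

server_cert = ToyCert("example.com", "Intermediate CA", start, end)
intermediates = {"Intermediate CA": ToyCert("Intermediate CA", "Root CA", start, end)}
trust_store = {"Root CA"}

print(is_chain_trusted(server_cert, intermediates, trust_store, now))  # True
```

The result depends only on whether the chain ends at some root the client trusts, not on who requested the certificate.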

How does Interception / MitM happen?

Let’s say you are a legitimate client trying to connect to a legitimate server, and someone in the middle (let’s call him Caren) has somehow intercepted the connection. This is very hard, but it happens, and he doesn’t even need to intercept if he has poisoned the DNS, which is out of the scope of this article. He also has an SSL certificate that is valid: it is signed by a trusted CA, which in turn is signed by a Root CA from your OS trust store.

During the TLS handshake, he serves you that false certificate. You can’t do anything here, because the certificate is valid. Now he can decrypt your calls, as you both hold the session keys, and he then makes the connection to the legitimate server, impersonating you.

Boom, the MitM happened! 💥

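In the same way, interception tools such as Charles Proxy and mitmproxy are able to monitor traffic: you install their root certificate in the device's trust store, so the certificates they generate are accepted as valid. Interesting & scary at the same time, right :p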

How to prevent this?

Now we have a bad actor in the middle (Caren), how can you solve this? Try to think in a reverse manner now.

As you might have guessed, we don’t want to trust any certificate apart from ours. We need to make sure the certificate received belongs to us. If we can do so, we are safe against nasty Caren in the middle.

There are two solutions that are recommended in this case:

SSL Certificate Pinning
Certificate Transparency technique

1. SSL Certificate pinning

You can’t trust any certificate apart from your own, so you burn the knowledge of your certificate into the client and tell it to trust only that certificate.

SSL Pinning is a robust security feature incorporated into client applications to deal with unauthorised interceptions or MitM scenarios. With SSL Pinning, an application accepts only a certain SSL certificate for a particular host or group of hosts. Instead of accepting any valid certificate presented during the handshake, which could potentially come from a malicious entity, the application is configured to recognise and validate only the predetermined, specific certificate or public key.

This is commonly used in mobile applications and other client-server scenarios where developers want to enforce stronger security measures. It provides an additional layer of protection against attacks targeting the SSL/TLS infrastructure and helps ensure that the client is communicating securely with the intended server. It greatly reduces the risk of attackers successfully using false certificates, even if they compromise a Certificate Authority and generate seemingly legitimate certificates for a domain.

This can be implemented in mobile apps by incorporating it into the code. Many mobile development frameworks and libraries provide tools or modules, such as TrustKit, for this purpose, making it easier for developers. For web apps, pinning can be achieved through the use of browser extensions or client-side scripts that are designed to enforce the use of a specific SSL certificate or public key for a particular domain.
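
As a rough sketch of the idea (not how TrustKit or any particular library does it), a client can compare the SHA-256 fingerprint of the certificate it receives against fingerprints shipped with the app. The names cert_fingerprint, matches_pin, connect_with_pin and PinMismatch are made up for this illustration. The sketch pins the whole certificate for simplicity, whereas many real setups pin the public key instead.

```python
import hashlib
import hmac
import socket
import ssl
from typing import Iterable


class PinMismatch(Exception):
    """Raised when the server certificate matches none of the pinned values."""


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned: Iterable[str]) -> bool:
    """True if the certificate's fingerprint equals one of the pinned ones."""
    actual = cert_fingerprint(der_cert)
    return any(hmac.compare_digest(actual, pin.lower()) for pin in pinned)


def connect_with_pin(host: str, pinned: Iterable[str], port: int = 443) -> ssl.SSLSocket:
    """Do the normal TLS verification first, then additionally enforce the pin."""
    ctx = ssl.create_default_context()
    raw = socket.create_connection((host, port), timeout=10)
    try:
        tls = ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
    der = tls.getpeercert(binary_form=True)
    if der is None or not matches_pin(der, pinned):
        tls.close()
        raise PinMismatch(f"certificate for {host} is not pinned")
    return tls
```

A certificate that chains to a trusted CA but was not pinned still passes the default checks, yet connect_with_pin refuses it. Shipping more than one pin, for example a backup certificate, is a common way to soften the rotation problem.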

1.1 Problems with Pinning

Force update overhead – You need to force update the application whenever the SSL certificate is rotated.
Handling compromises – How can you handle scenarios where the client code or the pinned cert gets compromised? What if nasty Caren extracted the client code and changed the pinned cert info, or injected a bad root CA into the device? Proper error handling is also essential to avoid getting locked out in edge cases where a certificate or public key expires or an issuer revokes it.
Engineering overhead – You need to make various decisions while implementing SSL pinning, such as which certificate in the chain to pin, which part of the certificate to pin, and how to pin it, along with the other technical challenges that come with it.

Even if you choose to fetch your certificate info dynamically instead of hardcoding it in the client, you need to handle the tradeoffs of that as well.

To deal with such client compromises, you can use obfuscation, which makes it hard to reverse-engineer and modify the client code. Also, you do not want to keep important computations on the client side.

SSL pinning is great, but in the case of web apps (despite the fact that the browser environment is safer than native mobile apps when it comes to network security), client code is easy to see. Pinning doesn’t fit there properly. Yes, you can do it, but you know nasty Caren is smart. :p

Meet certificate transparency

2. Certificate Transparency technique

Certificate Transparency (CT) is an open framework of logs, monitors, and auditors created to help domain owners monitor digital certificates issued to their domains. CT was proposed by Google as a response to the 2011 attack on DigiNotar and other Certificate Authorities, and was published as RFC 6962 in 2013. Before CT, there was no efficient way to get a comprehensive list of the certificates issued to your domain.

What CT aims to achieve is:

To make it impossible (or at least very difficult) for a CA to issue an SSL Certificate for a domain without the certificate being visible to the owner of that domain.
To provide an open auditing and monitoring system that lets any domain owner or CA determine whether certificates have been mistakenly or maliciously issued.
To protect users from being duped by certificates that were mistakenly or maliciously issued.
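
The monitoring goal can be sketched in a few lines. This is a toy model: the entry fields dns_name and sha256 and the function name find_unexpected_certs are assumptions made for this illustration, not the format of any real CT log API. A domain owner keeps a list of fingerprints of the certificates they actually requested and flags anything else that shows up in the logs for their domain.

```python
from typing import Dict, Iterable, List, Set


def find_unexpected_certs(log_entries: Iterable[Dict[str, str]], domain: str,
                          known_fingerprints: Set[str]) -> List[Dict[str, str]]:
    """Return logged certificates for our domain that we did not request."""
    domain = domain.lower()
    unexpected = []
    for entry in log_entries:
        name = entry["dns_name"].lower()
        # Match the domain itself and any of its subdomains.
        covers_domain = name == domain or name.endswith("." + domain)
        if covers_domain and entry["sha256"] not in known_fingerprints:
            unexpected.append(entry)
    return unexpected


# Hypothetical log entries, shaped for this example only.
entries = [
    {"dns_name": "example.com", "sha256": "aa11"},
    {"dns_name": "api.example.com", "sha256": "bb22"},
    {"dns_name": "example.com", "sha256": "ff99"},      # nobody on our team asked for this
    {"dns_name": "notexample.com", "sha256": "cc33"},   # different domain, ignored
]

suspicious = find_unexpected_certs(entries, "example.com", {"aa11", "bb22"})
print([entry["sha256"] for entry in suspicious])  # ['ff99']
```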

Certificate Transparency serves a similar purpose to SSL pinning, but in a different way. With this method, when the server presents its SSL certificate to the client, the client checks that the certificate has been recorded in public CT logs, which hold the certificates issued by trusted certificate authorities. In practice this is typically done by verifying the signed certificate timestamps (SCTs) delivered with the certificate. Thus, if an attacker performing MitM presents a certificate issued by his own root CA, it will not be present in the logs, and the user will be saved from the attack.

There are different ways to apply certificate transparency in applications, using various libraries or your own custom methods. A full implementation is out of the scope of this article.

2.1 Problems with CT

Not Realtime – CT logs are not exactly real-time, so if a certificate cannot be verified in the meantime, your apps won’t be able to connect. So, if you need high availability, you may be better off with SSL pinning.
No 100% guarantee – Certificate Transparency is more secure than SSL Pinning, but it does not protect against rogue certificates that were publicly logged.

It’s good to see that CT policies are in place on the web and that efforts are being made to make the web a safer space. Read the Apple and Chrome policies. Not all browsers have proper CT policies in place as of this writing. There’s another good mechanism, HSTS (HTTP Strict Transport Security), with which the browser always upgrades to HTTPS before a request is sent.
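
For HSTS, the server opts in by sending a Strict-Transport-Security response header, and the browser remembers it. The following is a small sketch of what a client does with that header. parse_hsts and upgrade_url are made-up names, and the header value in the usage lines is an illustrative example, not taken from any particular site.

```python
from urllib.parse import urlsplit, urlunsplit


def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small policy dict."""
    policy = {"max_age": None, "include_subdomains": False}
    for part in header_value.split(";"):
        directive = part.strip()
        lowered = directive.lower()
        if lowered.startswith("max-age="):
            # max-age is the number of seconds the policy stays in force.
            policy["max_age"] = int(directive.split("=", 1)[1].strip().strip('"'))
        elif lowered == "includesubdomains":
            policy["include_subdomains"] = True
    return policy


def upgrade_url(url: str) -> str:
    """Rewrite an http:// URL to https://, as a client does for an HSTS host."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


policy = parse_hsts("max-age=31536000; includeSubDomains")
print(policy)                                   # {'max_age': 31536000, 'include_subdomains': True}
print(upgrade_url("http://example.com/login"))  # https://example.com/login
```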

Parting Words

When you think about integrating security features, you should always ask yourself if it’s worth it. There are always trade-offs with every decision you take and you pay the opportunity cost.

Do you transfer any sensitive data? What could happen to the user or your business in case of an incident? If the damage of an incident might cost more than protection integration, you should do it.

Also please note that no solution is foolproof, each has its pros, cons and limitations.

For example, HTTPS is more secure than HTTP, pinning is more secure than plain HTTPS, CT is more secure than pinning, and so on, but nothing is a guarantee.

We looked at the entire picture from a slightly higher level and did not discuss any implementations, because each of these concepts has its own depth and complexities. You should make decisions based on your tech stack, business requirements, risk, and the other factors discussed above.

Thank you for being a patient and curious reader 🙂