Envoy TLS Handshake Failure

A TLS handshake failure in Envoy usually points to a mismatch in the TLS configuration between Envoy and the upstream server.

Understanding Envoy and Its Purpose

Envoy is a high-performance open-source edge and service proxy designed for cloud-native applications. It is used to manage network traffic, providing features like load balancing, service discovery, and observability. Envoy is particularly popular in microservices architectures, where it acts as a communication bridge between services, ensuring secure and reliable data transfer.

Identifying the TLS Handshake Failure Symptom

One common issue encountered when using Envoy is a TLS handshake failure: Envoy cannot establish a secure connection to an upstream server. You might see handshake errors in the logs, or connectivity problems between services that rely on TLS for secure communication.

Common Error Messages

Typical error messages associated with TLS handshake failures include the following (exact wording varies with the TLS library and the Envoy version):

  • SSL routines:ssl3_get_record:wrong version number
  • SSL routines:ssl3_read_bytes:sslv3 alert handshake failure
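
When scanning many log lines, a small helper can map each message to a likely cause. The sketch below is a heuristic only: the `HINTS` table and the `triage` function are hypothetical names introduced for illustration, and the hints reflect common causes rather than a definitive diagnosis. Underscores are normalized to spaces as a precaution, since some TLS libraries print the same reasons in upper case with underscores.

```python
# Heuristic hints only. This mapping is an illustrative assumption,
# not an exhaustive or authoritative list of causes.
HINTS = {
    "wrong version number": (
        "the peer may not be speaking TLS on this port (plaintext vs TLS), "
        "or the offered protocol versions do not overlap"
    ),
    "alert handshake failure": (
        "the peer rejected the handshake: check shared cipher suites, "
        "TLS versions, SNI, and client-certificate requirements"
    ),
}


def triage(log_line: str) -> str:
    """Return a hint for a TLS error line, or a fallback if unrecognized."""
    # Lower-case and turn underscores into spaces so that both
    # "wrong version number" and "WRONG_VERSION_NUMBER" match.
    normalized = log_line.lower().replace("_", " ")
    for needle, hint in HINTS.items():
        if needle in normalized:
            return hint
    return "unrecognized: inspect the full log line"
```

Calling `triage("SSL routines:ssl3_get_record:wrong version number")` returns the first hint; any line that matches neither pattern returns the fallback text.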

Explaining the TLS Handshake Failure Issue

The TLS handshake is a crucial part of establishing a secure connection. During it, the two sides agree on a protocol version and cipher suite, exchange key material, and authenticate certificates. A handshake failure usually indicates a mismatch in the TLS configuration between Envoy and the upstream server: no TLS version in common, no cipher suite in common, or a certificate that is expired, untrusted, or otherwise invalid.

Root Causes of Handshake Failures

Some common root causes include:

  • Incompatible TLS versions between Envoy and the server.
  • Mismatched cipher suites (no suite supported by both sides).
  • Expired or incorrectly configured certificates.

Steps to Fix the TLS Handshake Failure

To resolve a TLS handshake failure, follow these steps:

Step 1: Verify TLS Versions

Ensure that Envoy and the upstream server have at least one TLS version in common. The versions Envoy offers to an upstream are set in the cluster's TLS configuration in Envoy's configuration file:


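static_resources:
  clusters:
  - name: example_service
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    # Envoy v3 API: upstream TLS is configured through transport_socket
    # (the older cluster-level tls_context field is no longer accepted).
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_params:
            tls_minimum_protocol_version: TLSv1_2
            tls_maximum_protocol_version: TLSv1_3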
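
To find out which versions the upstream actually accepts, you can probe it with one version at a time. The following is a minimal sketch using Python's standard `ssl` module; the helper names `pinned_context` and `accepts_version` are hypothetical, and certificate verification is deliberately disabled so that version support is tested in isolation.

```python
import socket
import ssl


def pinned_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a client context that offers exactly one TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic probe only: skip certificate checks so a bad certificate
    # does not mask the version result. Do not reuse for real traffic.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts_version(host: str, port: int, version: ssl.TLSVersion,
                    timeout: float = 5.0) -> bool:
    """Return True if a handshake pinned to `version` succeeds."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # ssl.SSLError is a subclass of OSError, so this also covers
        # refused connections and timeouts, not only handshake errors.
        return False
```

For example, looping over `ssl.TLSVersion.TLSv1_2` and `ssl.TLSVersion.TLSv1_3` and calling `accepts_version("example.com", 443, v)` shows which of the two the server accepts. A `False` result can also mean the host was unreachable, so confirm basic connectivity first.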

Step 2: Check Cipher Suites

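Ensure that at least one cipher suite configured in Envoy is also accepted by the server. Note that Envoy's cipher_suites setting applies when negotiating TLS 1.2 and earlier; it has no effect on TLS 1.3. You can specify cipher suites in the Envoy configuration: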


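static_resources:
  clusters:
  - name: example_service
    # Envoy v3 API: upstream TLS is configured through transport_socket
    # (the older cluster-level tls_context field is no longer accepted).
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_params:
            cipher_suites:
            - "ECDHE-RSA-AES128-GCM-SHA256"
            - "ECDHE-RSA-AES256-GCM-SHA384"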
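
If you already know which suites the upstream accepts (for example from its own configuration), a quick comparison shows whether the two lists overlap. This is a minimal sketch: the function name `shared_cipher_suites` and the upstream list are hypothetical, and it assumes plain suite names like those in the configuration above.

```python
def shared_cipher_suites(envoy_suites, upstream_suites):
    """Return suites present on both sides, in Envoy's configured order."""
    upstream = {name.upper() for name in upstream_suites}
    return [name for name in envoy_suites if name.upper() in upstream]


envoy = ["ECDHE-RSA-AES128-GCM-SHA256", "ECDHE-RSA-AES256-GCM-SHA384"]
# Hypothetical upstream list, for illustration only.
upstream = ["ECDHE-ECDSA-AES128-GCM-SHA256", "ECDHE-RSA-AES256-GCM-SHA384"]

print(shared_cipher_suites(envoy, upstream))
# prints ['ECDHE-RSA-AES256-GCM-SHA384']
```

An empty result means a TLS 1.2 handshake between the two cannot succeed until one side adds a suite the other supports.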

Step 3: Validate Certificates

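Ensure that the certificates presented by the upstream server, and any client certificate Envoy presents, are valid and correctly configured. Check for expired certificates and verify that the certificate chain leads to a CA that Envoy trusts. You can use tools like OpenSSL to inspect the certificates a server presents: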


openssl s_client -connect example.com:443 -showcerts
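
The expiry check can also be scripted. The sketch below uses Python's standard `ssl` module; `days_until_expiry` and `peer_not_after` are hypothetical helper names. Because `peer_not_after` uses a default verifying context, an expired or untrusted certificate raises `ssl.SSLCertVerificationError` instead of returning a date, which is itself a useful signal.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired).

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400.0


def peer_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Fetch the notAfter field of the certificate a server presents."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(peer_not_after("example.com"))` reports how many days remain on the server's leaf certificate, using the trust store of the machine running the script rather than Envoy's.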

Additional Resources

For more detailed guidance, refer to the Envoy Listener Configuration documentation. Additionally, the Envoy SSL/TLS Overview provides comprehensive information on configuring TLS in Envoy.
