Teleport: How I Managed 50+ Servers and K8s Without Using a Single SSH Key

Security tutorial - IT technology blog

The Anxiety of Scaling from 5 to 50+ Servers

A common sight in startups: in the beginning, there were only 5 Ubuntu servers, and everything was a breeze. Whenever a new dev joined, I’d just spend 2 minutes copying their public key and using Ansible to push it to authorized_keys. However, when the system spiked to over 50 servers, accompanied by dozens of Kubernetes (K8s) clusters and databases, this “handcrafted” process began to reveal its flaws.

The real nightmare began when someone left the company. I once spent an entire afternoon just auditing and removing keys for one person across dozens of servers. Missing just one machine could leave behind a risky backdoor. Furthermore, sharing the root account made accountability nearly impossible. If someone accidentally ran rm -rf at 2 AM, I wouldn’t even know who did it, because the default SSH logs only show that root logged in, not which person was behind the session or what they typed.

Why Traditional SSH Keys Often “Betray” You

The core issue lies in Static Credentials. An SSH key pair is usually valid forever until manually deleted. In a corporate environment, this is a potential risk because:

  • Loose Identity: SSH keys are tied to machines, not people. If a developer’s laptop is stolen or infected with malware, an attacker holding the private key can log in to every server that trusts it, with no second factor to stop them.
  • RBAC Nightmare: Defining that Person A can only access the Web Server while Person B only gets DB Server access is an administrative nightmare with traditional SSH.
  • Kubeconfig Leaks: Kubernetes config files containing long-term certificates are often casually stored in Slack or Google Drive, making them extremely easy to lose control over.

Temporary Fixes That Aren’t Very Effective

Before discovering Teleport, I tried using a Jump Host (Bastion Host). This helps hide servers from the internet but still relies on static keys. I also wrote scripts to rotate keys weekly. However, the more the scripts ran, the more errors occurred; once, a script error locked the entire DevOps team out of the system for 2 hours.

Some organizations use VPNs like WireGuard or OpenVPN. While these are good solutions at the network layer, they neglect the identity layer. A VPN only lets you “into the house”; it neither controls nor records what you do inside the rooms (servers).

Teleport: My Favorite Centralized Access Management Solution

After 6 months of real-world production use, I’ve found that Teleport addresses each of the problems above. Instead of static keys, Teleport uses Short-lived Certificates. When you log in, you’re issued a certificate valid for only a few hours (I usually set it to 8 working hours). Once time is up, the certificate expires and can no longer be used to connect; you have to log in again to get a new one.
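
To make the idea of a short-lived certificate concrete, here is a minimal sketch using plain OpenSSH rather than Teleport itself. It assumes ssh-keygen is installed; the throwaway CA, the key ID alice, and the login ubuntu are hypothetical names chosen for illustration. Teleport acts as the certificate authority and automates this signing step when you log in.

```shell
# Illustration only: a throwaway CA signs a user key for 8 hours.
set -eu
workdir=$(mktemp -d)

# Hypothetical demo CA and user key pair (empty passphrases, demo only).
ssh-keygen -q -t ed25519 -N '' -f "$workdir/demo_ca"
ssh-keygen -q -t ed25519 -N '' -f "$workdir/demo_user"

# Sign the user's public key: key ID "alice", allowed login "ubuntu", valid for 8 hours.
ssh-keygen -q -s "$workdir/demo_ca" -I alice -n ubuntu -V +8h "$workdir/demo_user.pub"

# Print the certificate; the "Valid:" line shows the expiry window.
ssh-keygen -L -f "$workdir/demo_user-cert.pub"
```

A server that trusts the CA’s public key (through TrustedUserCAKeys in sshd_config) would accept this certificate until it expires, with no entry in authorized_keys to clean up afterwards.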

Three features that help me sleep soundly:

  • SSO-Based: Integrates directly with Google Workspace or GitHub. Users must pass the company’s MFA (2FA) to obtain SSH permissions.
  • Session Recording: It records every terminal action like a video. You can replay it to see how someone debugged an issue, which is incredibly useful for training or incident investigation.
  • All-in-One: Centrally manage everything from SSH and Kubernetes to Databases (Postgres, MySQL…) from a single interface.
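
As a rough sketch of how a replay might look from the command line: this assumes a tsh version that includes the recordings subcommand and that you have already logged in. The function names are my own, and the session ID is a placeholder you would copy from the listing.

```shell
# List recorded sessions, then replay one by ID.
# Assumes `tsh login` has already been run against your proxy.
list_recordings() {
  tsh recordings ls
}

replay_session() {
  # $1: session ID taken from the output of list_recordings
  tsh play "$1"
}
```

Recordings can also be replayed from the Web UI, which is usually more convenient for sharing with teammates.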

Quick Teleport Deployment Guide

You need a server to run the Teleport Proxy and Auth services. Prepare a domain name that points to it and a TLS certificate (Certbot is the go-to choice, or let Teleport request one through ACME as in step 2).

1. Installing on Ubuntu

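Run the following commands to add the Teleport APT repository and install the latest stable version:

# Download the repository signing key (writing to /usr/share/keyrings requires root)
sudo curl https://apt.releases.teleport.dev/gpg -o /usr/share/keyrings/teleport-archive-keyring.asc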
echo "deb [signed-by=/usr/share/keyrings/teleport-archive-keyring.asc] https://apt.releases.teleport.dev/ubuntu $(lsb_release -cs) stable main" | sudo tee /etc/apt/sources.list.d/teleport.list
sudo apt update && sudo apt install teleport

2. Automatic Configuration

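Assuming your domain is teleport.example.com, the following commands generate a standard config file with ACME enabled and start the service. Replace YOUR_EMAIL with the address you want to register with the ACME certificate authority:

sudo teleport configure --acme --acme-email=YOUR_EMAIL --cluster-name=teleport.example.com -o /etc/teleport.yaml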
sudo systemctl enable --now teleport
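
Before moving on, it can help to confirm that the service is actually up. The following is a small sketch of such a check, not an official procedure: the function name check_teleport and the PROXY_HOST variable are my own, and it assumes the proxy listens on port 443 under the example domain.

```shell
# Sketch of a post-install check; PROXY_HOST defaults to the article's example domain.
PROXY_HOST="${PROXY_HOST:-teleport.example.com}"

check_teleport() {
  # Is the systemd unit running?
  systemctl is-active --quiet teleport || { echo "teleport service is not active"; return 1; }
  # Does the proxy answer over HTTPS? /webapi/ping is the proxy's unauthenticated ping endpoint.
  curl -fsS "https://$PROXY_HOST/webapi/ping" >/dev/null || { echo "proxy not reachable on 443"; return 1; }
  echo "teleport is up at $PROXY_HOST"
}
```

Run check_teleport on the Teleport host after the service has started. If the second check fails, DNS and firewall rules for port 443 are the first things to look at.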

3. Initializing Admin

Create an initial admin user to access the Web UI:

sudo tctl users add admin --roles=editor,access --logins=root,ubuntu

Copy the displayed link into your browser to set a password and scan the QR code with your authenticator app to enroll an OTP device.

Real-world Experience: Goodbye Old SSH Commands

Nowadays, our team no longer types ssh user@ip. Instead, we use the magical tsh tool:

# Log in once in the morning
tsh login --proxy=teleport.example.com

# View list of servers I have access to
tsh ls

# Access server using a friendly name
tsh ssh root@web-server-01

A huge plus is that tsh automatically configures kubeconfig for you. After running tsh kube login with the name of a cluster, you can use kubectl immediately without carrying certificate files everywhere.
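
A sketch of that Kubernetes workflow, wrapped in a function so the cluster name is explicit. The function name kube_session is hypothetical, and the cluster name you pass in depends on what is registered in your own Teleport cluster.

```shell
# Hypothetical helper; assumes `tsh login` has already been run this morning.
kube_session() {
  cluster="$1"
  tsh kube ls                 # list the Kubernetes clusters your roles allow
  tsh kube login "$cluster"   # point kubeconfig at the chosen cluster
  kubectl get pods            # kubectl requests now go through Teleport
}
```

Calling kube_session with a cluster name taken from the tsh kube ls output should leave kubectl pointed at that cluster until your certificate expires.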

Hard-learned Lessons After Half a Year of Operation

If you plan to bring Teleport into production, keep these in mind:

  1. Protect the Auth Server: If this server goes down, nobody can log in or obtain a new certificate, so the entire team loses access. Prioritize regular backups of its configuration and state data, and test restoring them.
  2. Principle of Least Privilege: Don’t grant root access to everyone. The Dev team should only have roles for Staging, while only DevOps should touch Production.
  3. Monitor Audit Logs: Push Teleport logs to ELK or Splunk. In case of an incident, these logs show who connected to which server and when, which makes finding the root cause much faster.
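
To illustrate the least-privilege point, here is a minimal sketch of a role for developers. The role name dev-staging, the ubuntu login, and the env=staging label are assumptions for the example; your nodes need to carry matching labels for it to work, and the role version may need adjusting for your Teleport release.

```shell
# Write a hypothetical role: staging nodes only, non-root login, 8-hour sessions.
cat > dev-staging-role.yaml <<'EOF'
kind: role
version: v5
metadata:
  name: dev-staging          # hypothetical role name
spec:
  options:
    max_session_ttl: 8h      # certificates expire after one working day
  allow:
    logins: ['ubuntu']       # no root login for this role
    node_labels:
      env: staging           # assumes nodes carry an env=staging label
EOF

# On the Auth server you would then apply it with:
#   sudo tctl create -f dev-staging-role.yaml
echo "wrote dev-staging-role.yaml"
```

A user who holds only this role should see just the staging nodes in tsh ls, while a separate role for DevOps can match the production label.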

Switching from SSH to Teleport is like upgrading from a physical key to a smart fingerprint lock. The initial setup might be a bit more involved, but the peace of mind and professionalism it provides are well worth the effort.
