Accelerating Microservices: Why I Switched from REST to gRPC with Python – ITFROMZERO

Why I Abandoned REST API for gRPC

Back when the system had only three or four services, I used a REST API (JSON over HTTP/1.1) because it was convenient. Everything worked fine until that number grew to 20. At that point, internal traffic reached 1.5 million requests per minute (about 25,000 per second), and two painful issues emerged.

The first was latency spikes caused by HTTP/1.1 overhead. The second was payload bloat: verbose JSON text consumed bandwidth for no good reason. Six months after switching internal communication to gRPC, system performance had improved significantly. Payload size dropped by up to 60% thanks to Protobuf's binary encoding. Just as importantly, I no longer had to write API documentation by hand, because every request and response is strictly defined in Protocol Buffers.

Quick Start: Running gRPC in 5 Minutes

Theory alone is easy to forget, so let’s get hands-on with some code to see the difference. I’ll build a simple service: send a name and receive a greeting.

Step 1: Install Libraries

pip install grpcio grpcio-tools

Step 2: Define the .proto File

Create a hello.proto file. This is the single source of truth (contract) between services.

syntax = "proto3";

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}

Step 3: Generate Python Code

Instead of writing classes manually, let the protoc compiler bundled with grpcio-tools generate the Python code from the proto file:

python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. hello.proto

This command generates two files: hello_pb2.py (the message classes) and hello_pb2_grpc.py (the client stub and the servicer base class).

Step 4: Write Server and Client

File server.py:

import grpc
from concurrent import futures
import hello_pb2
import hello_pb2_grpc

class Greeter(hello_pb2_grpc.GreeterServicer):
    def SayHello(self, request, context):
        # Return a greeting message
        return hello_pb2.HelloReply(message=f'Hello {request.name}, I am a gRPC server!')

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    hello_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
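    # Plaintext port: fine for a local demo, use TLS credentials in production
    server.add_insecure_port('[::]:50051')
    server.start()
    # Report only after the server has actually started
    print("Server is running on port 50051...")
    server.wait_for_termination()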

if __name__ == '__main__':
    serve()

File client.py:

import grpc
import hello_pb2
import hello_pb2_grpc

def run():
    # Connect to the server
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = hello_pb2_grpc.GreeterStub(channel)
        response = stub.SayHello(hello_pb2.HelloRequest(name='itfromzero'))
    print("Client received: " + response.message)

if __name__ == '__main__':
    run()

Decoding the Power: Why is gRPC Significantly Faster?

If you find the proto file compilation step a bit tedious, here are the reasons why that effort is extremely worthwhile.

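1. Protocol Buffers (Protobuf) Compact Binary Encoding

REST typically sends {"user_id": 123} as text. Protobuf is different. It strips the field names and sends only numeric field tags and values in a binary format (this is compact encoding, not compression in the gzip sense). As a result, messages can be 3 to 10 times smaller than the equivalent JSON for numeric-heavy data; string-heavy payloads shrink less, because strings are still sent as raw bytes.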
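To make the size difference concrete, here is a minimal sketch that hand-encodes a single integer field the way the Protobuf wire format does and compares it with the JSON text. It assumes a hypothetical schema with `int32 user_id = 1;`, and the helper names are made up for illustration. In a real project the generated classes do this for you.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: a key (field number + wire type), then the value."""
    key = (field_number << 3) | 0  # wire type 0 means varint
    return encode_varint(key) + encode_varint(value)


json_payload = json.dumps({"user_id": 123}).encode("utf-8")
proto_payload = encode_int_field(1, 123)  # assumes `int32 user_id = 1;`

print(len(json_payload), json_payload)        # 16 b'{"user_id": 123}'
print(len(proto_payload), proto_payload.hex())  # 2 087b
```

For this one-field message the field name never goes on the wire, so 16 bytes of JSON become 2 bytes. The ratio on real messages depends on how much of the payload is field names and numbers versus free-form strings.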

2. Leveraging the Power of HTTP/2

REST over HTTP/1.1 handles one request at a time per connection, so busy clients need connection pools or end up queuing requests behind each other. In contrast, gRPC runs on HTTP/2, which supports multiplexing: many requests can be in flight simultaneously over a single connection. This removes a major connection-level bottleneck as the number of services grows.

3. Error Prevention via Strong Typing

The error “Service A left out a field that Service B needs” is a backend developer’s nightmare. With gRPC, the generated code rejects a value of the wrong type immediately, and in proto3 an unset scalar field arrives as a well-defined default (0, an empty string, false) rather than null. You no longer have to spend hours debugging because of a damn null value in JSON.

Advanced Techniques: Real-world Data Streaming

Streaming is my favorite feature. Imagine you need to push 100,000 log lines from the server to the client. Instead of making the client wait to download a massive file, gRPC allows you to push line by line.

Simply add the stream keyword to the proto file:

rpc ListLogs (LogRequest) returns (stream LogResponse) {}

On the server side, the handler becomes a generator: you use the yield statement to push one message at a time. This keeps memory usage low on both ends, because neither side has to hold the full result set at once.
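The shape of such a handler can be sketched without a network. The example below is a dry run under stated assumptions: `LogResponse` is a hypothetical stand-in for the message class that would be generated from the proto file, its `line` field is invented for illustration, and `LogService` would subclass the generated servicer base class in a real service.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class LogResponse:
    """Hypothetical stand-in for the generated Protobuf message class."""
    line: str


class LogService:
    """In a real service this would subclass the generated servicer base class."""

    def ListLogs(self, request, context) -> Iterator[LogResponse]:
        # Each yield hands one message to the caller; the 100,000 lines are
        # never collected into a list in memory.
        for number in range(1, 100_001):
            yield LogResponse(line=f"log line {number}")


# Local dry run: consume the generator the way the gRPC runtime would.
stream = LogService().ListLogs(request=None, context=None)
first_three = [next(stream).line for _ in range(3)]
print(first_three)  # ['log line 1', 'log line 2', 'log line 3']
```

On the client side, a server-streaming call returns an iterator, so the consuming code is typically a plain for loop over the result of stub.ListLogs(...).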

4 Hard-earned Lessons After 6 Months in Production

Real-world deployment is much more complex than a Hello World example. Here is my experience:

Centralized Proto Management: Don’t let proto files scatter. Create a separate Git repo containing all .proto files and use Git Submodules so other services can point to it.
Middleware (Interceptors): Don’t write Logging or Auth code into every function. Use Interceptors for centralized processing, making your code much cleaner.
Health Checks are Mandatory: Implement gRPC’s standard health check protocol. Without it, Kubernetes cannot tell whether your service is alive or hung, so it has no reason to trigger an automatic restart.
Tag Number Rule: When updating a proto file, never change a field’s number (tag). If you need to remove a field, mark its number as reserved. Reusing an old number makes old and new services interpret the same bytes differently, which leads to silently corrupted data or decoding errors.
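The tag number rule is easier to appreciate with a concrete failure. The sketch below hand-decodes Protobuf varint fields to show what happens when a number is reused. Both schemas are hypothetical: the old one is assumed to have `int32 age = 2;`, and the new one is assumed to have reused number 2 for `bool is_admin = 2;`. The helper names are made up for illustration.

```python
from typing import Dict, Tuple


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_varint_fields(data: bytes) -> Dict[int, int]:
    """Decode a message made only of varint fields into {field_number: value}."""
    fields: Dict[int, int] = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = decode_varint(data, pos)
        fields[field_number] = value
    return fields


# Bytes written by an old service where field 2 meant `int32 age = 2;` (age = 30).
old_bytes = bytes([0x10, 0x1E])

# A new service that reused number 2 for `bool is_admin = 2;` reads the same bytes.
raw = decode_varint_fields(old_bytes)
is_admin = bool(raw[2])
print(raw, is_admin)  # {2: 30} True
```

Nothing fails here: the wire format carries only the number, so a 30-year-old user quietly becomes an admin. Declaring `reserved 2;` in the message makes the compiler reject any new field that tries to take that number.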

Conclusion

gRPC is not a silver bullet for every problem. If you’re building a Web Frontend for users, REST is still the way to go. But if you’re struggling with internal communication between Microservices, gRPC is the key to elevating performance to the next level.

Try applying it to the smallest service in your system. You’ll find that going back to manual JSON is truly… exhausting.