Overview

A request from a client to execute a function on a server

When a computer program causes a procedure to execute in a different address space (perhaps on another computer) → location independent

To the client, looks like a procedure call

To the server, looks like an implementation of a procedure call

Diagram

(Review) Local procedure call

The caller and the callee are both in the same program (in same)
Caller invokes function (by name) with arguments
- Pass some arguments in registers, push others
- Push return program counter
- jump to first instruction in callee
Callee (implementation of function)
- Reads parameters from registers and stack
- Computes function, possibly updates state, returns result
- Jumps back to next instruction in caller
Example
- Caller: if (BuyBook(OSPP)) print(“Done!”);
- Callee: Boolean BuyBook(char \*booktitle) { do stuff; return success; }
- With an RPC, the BuyBook function would be running on a completely different computer
Compiler defines protocol: name binding, where to find arguments, etc.

Remote procedure call (RPC)

Client request to execute a function on the server
- The goal is to hide the network communication from the developer.
  - The developer just writes result = BuyBook(OSPP), and the RPC framework handles all the complex network steps in the background.
On client: result = BuyBook(OSPP)
- Parameters marshalled into a message (arbitrary types)
  - Marshalling (also called “serializing”) = the process of taking the parameters (in this case, OSPP) and converting them into a standardized byte format that can be sent in a network message
- Message sent to server (may be multiple pkts)
  - The byte stream (the “message”) is broken down into network packets and sent over the network (TCP/IP) to the server’s address
- Wait for reply
On server: implement BuyBook - The server has the actual BuyBook function code, it is listening for incoming network requests
- message is parsed
  - Parsing (also called “unmarshalling” or “deserializing”) = the process of receiving the stream of bytes from the network and converting it back into the original data structures the function understands (e.g., the string "OSPP").
- Perform operation (executes local BuyBook("OSPP")) & gets result
- Put result into a message (may be multiple pkts)
- Result returned to client
The client (which was waiting) receives the message, parses (unmarshalls) the result, and its original function call result = BuyBook(OSPP) completes

Why Serialize?

Procedure arguments can be values or pointers
- Need to be assembled into a linear message
What should happen if you call an RPC with a pointer?
- Pass the pointer? Convert it to a global reference?
- The RPC framework (the “stub” code that is auto-generated) detects that the argument is a pointer and follows one of two strategies, depending on how the pointer is being used.
- File write - Pass a copy of the data it points to
  - Most common solution
  - Dereferences the pointer (follows it to the actual data), serializes (copies) that data, and puts the copy of the data into the network message
- File read -Store the result

Implementation

RPC vs Procedure Call

Problems RPCs have that LPC (local procedural calls) don’t have

Binding
- Client needs a connection to server
- Server must implement the required function
- What if the server is running a different version? (ex, function accepts different parameters)
Service Discovery Service
- Solution to binding problem
- Like a phonebook - Includes what services are available, including versions and packet formats (may be 1000s of APIs)
- When server starts it registers itself here, when client makes a call it asks this service
Interface Description Language (IDL)
- An IDL (like Google’s Protobuf) is a file that acts as a language-neutral contract. You define the functions and, most importantly, the exact structure of the data.
- automatically writes all the complex serialization/deserialization code for you, programmer just uses the generated classes
Failures
- What happens if messages get dropped? Duplicated? Delayed? Reordered?
- What if client crashes?
- What if server crashes?
- What if server crashes after performing op but before replying?
- What if server appears to crash but is slow?
- What if network partitions?
- (461 alert) Some are handled by TCP but what if TCP socket fails?
- The Naive RPC is a simple implementation that is vulnerable to almost all failure listed above

Naive RPC

The most basic implementation you could ever build (problematic)
Nodes
- Any number of clients and servers
- No state at client or server
- Server performs operation if it gets a message
Messages
- Client requests, server replies
- Client request contains IP address of client/server, name of procedure, arguments
- Server reply contains IP address of server/client, results
Timers: none

What can go wrong?

Message corruption
- Delivered msg might be different: Becomes “buy Silberschatz” instead of “buy OSPP”
- 📌Fix: include checksum or CRC or cryptographic signature in each message
Message dropped
- Request message dropped (due to failure)
  - operation not performed, client waits forever
- Reply message dropped
  - operation was performed, but client doesn’t know if operation was performed, waits forever
- 📌Fix with client timer and retransmit
Request message duplicated
- operation performed twice
If multiple requests, and some are dropped
- client doesn’t know which operation was done and which ones aren’t
- 📌Fix: use unique ID per request: (client ID, sequence number)
Message ordering
- Network congestion
  - Client sends sequence of msgs to server (a,b,c,d..) (using threads) but some can get dropped (network congestion)
  - If it’s a sequence for CRUD (delete, create, then fill new file), if the sequence is mixed then it can’t do it properly
- Network reorders packets, server gets different sequence
- 📌Fix: Label messages with sequence number
  - {seq: 1, op: "delete"}
  - Detect missing messages & unneeded retransmissions
- Labs assume each client sends only one RPC at a time
  - If client sends the next RPC, it means they must have received the reply for the previous one
  - Still need to worry about lost and duplicate and delayed RPCs
  - E.g., server might get a delayed request

RPC Semantics

Semantics = meaning
How many times will an RPC be executed by the server before it returns to the invoker of the RPC?
At least once (NFS, DNS)
- Client retries
- If reply: executed at least once
- If no reply: maybe executed, maybe multiple times
At most once
- • If reply: executed once
- If no reply: maybe executed, but never more than once
- Server prevents non-unique requests from compound
- UID (sequence numbers like 1,2,3,4)
  - [Don’t use UUID.randomUUID]
- Idempotent
Exactly once
- If reply: executed once
- If no reply: keep waiting
- At most once with retries

Research Papers

A Cloud-Scale Characterization of Remote Procedure Calls (FOCI)

leejunkim

Explorer

RPC (Remote Procedure Call)

(Review) Local procedure call

Remote procedure call (RPC)

Why Serialize?

Implementation

RPC vs Procedure Call

Naive RPC

What can go wrong?

RPC Semantics

Research Papers

Graph View

Table of Contents

Backlinks