author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 18:45:59 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 18:45:59 +0000 |
commit | 19fcec84d8d7d21e796c7624e521b60d28ee21ed (patch) | |
tree | 42d26aa27d1e3f7c0b8bd3fd14e7d7082f5008dc /src/jaegertracing/thrift/doc/specs/thrift.tex | |
parent | Initial commit. (diff) | |
Adding upstream version 16.2.11+ds.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/jaegertracing/thrift/doc/specs/thrift.tex')
-rw-r--r-- | src/jaegertracing/thrift/doc/specs/thrift.tex | 1057 |
1 files changed, 1057 insertions, 0 deletions
diff --git a/src/jaegertracing/thrift/doc/specs/thrift.tex b/src/jaegertracing/thrift/doc/specs/thrift.tex new file mode 100644 index 000000000..a706fcbbc --- /dev/null +++ b/src/jaegertracing/thrift/doc/specs/thrift.tex @@ -0,0 +1,1057 @@ +%----------------------------------------------------------------------------- +% +% Thrift whitepaper +% +% Name: thrift.tex +% +% Authors: Mark Slee (mcslee@facebook.com) +% +% Created: 05 March 2007 +% +% You will need a copy of sigplanconf.cls to format this document. +% It is available at <http://www.sigplan.org/authorInformation.htm>. +% +%----------------------------------------------------------------------------- + + +\documentclass[nocopyrightspace,blockstyle]{sigplanconf} + +\usepackage{amssymb} +\usepackage{amsfonts} +\usepackage{amsmath} +\usepackage{url} + +\begin{document} + +% \conferenceinfo{WXYZ '05}{date, City.} +% \copyrightyear{2007} +% \copyrightdata{[to be supplied]} + +% \titlebanner{banner above paper title} % These are ignored unless +% \preprintfooter{short description of paper} % 'preprint' option specified. + +\title{Thrift: Scalable Cross-Language Services Implementation} +\subtitle{} + +\authorinfo{Mark Slee, Aditya Agarwal and Marc Kwiatkowski} + {Facebook, 156 University Ave, Palo Alto, CA} + {\{mcslee,aditya,marc\}@facebook.com} + +\maketitle + +\begin{abstract} +Thrift is a software library and set of code-generation tools developed at +Facebook to expedite development and implementation of efficient and scalable +backend services. Its primary goal is to enable efficient and reliable +communication across programming languages by abstracting the portions of each +language that tend to require the most customization into a common library +that is implemented in each language. Specifically, Thrift allows developers to +define datatypes and service interfaces in a single language-neutral file +and generate all the necessary code to build RPC clients and servers. 
+ +This paper details the motivations and design choices we made in Thrift, as +well as some of the more interesting implementation details. It is not +intended to be taken as research, but rather it is an exposition on what we did +and why. +\end{abstract} + +% \category{D.3.3}{Programming Languages}{Language constructs and features} + +%\terms +%Languages, serialization, remote procedure call + +%\keywords +%Data description language, interface definition language, remote procedure call + +\section{Introduction} +As Facebook's traffic and network structure have scaled, the resource +demands of many operations on the site (e.g., search, +ad selection and delivery, event logging) have presented technical requirements +drastically outside the scope of the LAMP framework. In our implementation of +these services, various programming languages have been selected to +optimize for the right combination of performance, ease and speed of +development, availability of existing libraries, etc. By and large, +Facebook's engineering culture has tended towards choosing the best +tools and implementations available over standardizing on any one +programming language and begrudgingly accepting its inherent limitations. + +Given this design choice, we were presented with the challenge of building +a transparent, high-performance bridge across many programming languages. +We found that most available solutions were either too limited, did not offer +sufficient datatype freedom, or suffered from subpar performance. +\footnote{See Appendix A for a discussion of alternative systems.} + +The solution that we have implemented combines a language-neutral software +stack implemented across numerous programming languages and an associated code +generation engine that transforms a simple interface and data definition +language into client and server remote procedure call libraries. 
+Choosing static code generation over a dynamic system allows us to create +validated code that can be run without the need for +any advanced introspective run-time type checking. It is also designed to +be as simple as possible for the developer, who can typically define all +the necessary data structures and interfaces for a complex service in a single +short file. + +Surprised that a robust open solution to these relatively common problems +did not yet exist, we committed early on to making the Thrift implementation +open source. + +In evaluating the challenges of cross-language interaction in a networked +environment, some key components were identified: + +\textit{Types.} A common type system must exist across programming languages +without requiring that the application developer use custom Thrift datatypes +or write their own serialization code. That is, +a C++ programmer should be able to transparently exchange a strongly typed +STL map for a dynamic Python dictionary. Neither +programmer should be forced to write any code below the application layer +to achieve this. Section 2 details the Thrift type system. + +\textit{Transport.} Each language must have a common interface to +bidirectional raw data transport. The specifics of how a given +transport is implemented should not matter to the service developer. +The same application code should be able to run against TCP stream sockets, +raw data in memory, or files on disk. Section 3 details the Thrift Transport +layer. + +\textit{Protocol.} Datatypes must have some way of using the Transport +layer to encode and decode themselves. Again, the application +developer need not be concerned by this layer. Whether the service uses +an XML or binary protocol is immaterial to the application code. +All that matters is that the data can be read and written in a consistent, +deterministic manner. Section 4 details the Thrift Protocol layer. 
+ +\textit{Versioning.} For robust services, the involved datatypes must +provide a mechanism for versioning themselves. Specifically, +it should be possible to add or remove fields in an object or alter the +argument list of a function without any interruption in service (or, +worse yet, nasty segmentation faults). Section 5 details Thrift's versioning +system. + +\textit{Processors.} Finally, we generate code capable of processing data +streams to accomplish remote procedure calls. Section 6 details the generated +code and TProcessor paradigm. + +Section 7 discusses implementation details, and Section 8 describes +our conclusions. + +\section{Types} + +The goal of the Thrift type system is to enable programmers to develop using +completely natively defined types, no matter what programming language they +use. By design, the Thrift type system does not introduce any special dynamic +types or wrapper objects. It also does not require that the developer write +any code for object serialization or transport. The Thrift IDL (Interface +Definition Language) file is +logically a way for developers to annotate their data structures with the +minimal amount of extra information necessary to tell a code generator +how to safely transport the objects across languages. + +\subsection{Base Types} + +The type system rests upon a few base types. In considering which types to +support, we aimed for clarity and simplicity over abundance, focusing +on the key types available in all programming languages, omitting any +niche types available only in specific languages. 
+ +The base types supported by Thrift are: +\begin{itemize} +\item \texttt{bool} A boolean value, true or false +\item \texttt{byte} A signed byte +\item \texttt{i16} A 16-bit signed integer +\item \texttt{i32} A 32-bit signed integer +\item \texttt{i64} A 64-bit signed integer +\item \texttt{double} A 64-bit floating point number +\item \texttt{string} An encoding-agnostic text or binary string +\item \texttt{binary} A byte array representation for blobs +\end{itemize} + +Of particular note is the absence of unsigned integer types. Because these +types have no direct translation to native primitive types in many languages, +the advantages they afford are lost. Further, there is no way to prevent the +application developer in a language like Python from assigning a negative value +to an integer variable, leading to unpredictable behavior. From a design +standpoint, we observed that unsigned integers were very rarely, if ever, used +for arithmetic purposes, but in practice were much more often used as keys or +identifiers. In this case, the sign is irrelevant. Signed integers serve this +same purpose and can be safely cast to their unsigned counterparts (most +commonly in C++) when absolutely necessary. + +\subsection{Structs} + +A Thrift struct defines a common object to be used across languages. A struct +is essentially equivalent to a class in object oriented programming +languages. A struct has a set of strongly typed fields, each with a unique +name identifier. The basic syntax for defining a Thrift struct looks very +similar to a C struct definition. Fields may be annotated with an integer field +identifier (unique to the scope of that struct) and optional default values. +Field identifiers will be automatically assigned if omitted, though they are +strongly encouraged for versioning reasons discussed later. 
+ +\subsection{Containers} + +Thrift containers are strongly typed containers that map to the most commonly +used containers in common programming languages. They are annotated using +the C++ template (or Java Generics) style. There are three types available: +\begin{itemize} +\item \texttt{list<type>} An ordered list of elements. Translates directly into +an STL \texttt{vector}, Java \texttt{ArrayList}, or native array in scripting languages. May +contain duplicates. +\item \texttt{set<type>} An unordered set of unique elements. Translates into +an STL \texttt{set}, Java \texttt{HashSet}, \texttt{set} in Python, or native +dictionary in PHP/Ruby. +\item \texttt{map<type1,type2>} A map of strictly unique keys to values. +Translates into an STL \texttt{map}, Java \texttt{HashMap}, PHP associative +array, or Python/Ruby dictionary. +\end{itemize} + +While defaults are provided, the type mappings are not explicitly fixed. Custom +code generator directives have been added to substitute custom types in +destination languages (e.g., +\texttt{hash\_map} or Google's sparse hash map can be used in C++). The +only requirement is that the custom types support all the necessary iteration +primitives. Container elements may be of any valid Thrift type, including other +containers or structs. + +\begin{verbatim} +struct Example { + 1:i32 number=10, + 2:i64 bigNumber, + 3:double decimals, + 4:string name="thrifty" +}\end{verbatim} + +In the target language, each definition generates a type with two methods, +\texttt{read} and \texttt{write}, which perform serialization and transport +of the objects using a Thrift TProtocol object. + +\subsection{Exceptions} + +Exceptions are syntactically and functionally equivalent to structs except +that they are declared using the \texttt{exception} keyword instead of the +\texttt{struct} keyword. 
+ +The generated objects inherit from an exception base class as appropriate +in each target programming language, in order to seamlessly +integrate with native exception handling in any given +language. Again, the design emphasis is on making the code familiar to the +application developer. + +\subsection{Services} + +Services are defined using Thrift types. Definition of a service is +semantically equivalent to defining an interface (or a pure virtual abstract +class) in object oriented +programming. The Thrift compiler generates fully functional client and +server stubs that implement the interface. Services are defined as follows: + +\begin{verbatim} +service <name> { + <returntype> <name>(<arguments>) + [throws (<exceptions>)] + ... +}\end{verbatim} + +An example: + +\begin{verbatim} +service StringCache { + void set(1:i32 key, 2:string value), + string get(1:i32 key) throws (1:KeyNotFound knf), + void delete(1:i32 key) +} +\end{verbatim} + +Note that \texttt{void} is a valid type for a function return, in addition to +all other defined Thrift types. Additionally, an \texttt{async} modifier +keyword may be added to a \texttt{void} function, which will generate code that does +not wait for a response from the server. Note that a pure \texttt{void} +function will return a response to the client which guarantees that the +operation has completed on the server side. With \texttt{async} method calls +the client will only be guaranteed that the request succeeded at the +transport layer. (In many transport scenarios this is inherently unreliable +due to the Byzantine Generals' Problem. Therefore, application developers +should take care only to use the async optimization in cases where dropped +method calls are acceptable or the transport is known to be reliable.) + +Also of note is the fact that argument lists and exception lists for functions +are implemented as Thrift structs. All three constructs are identical in both +notation and behavior. 
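As a rough illustration of the last point, the following Python sketch models the argument list and the outcome of \texttt{StringCache.get} as plain struct-like objects. It is not Thrift-generated code: the names \texttt{GetArgs}, \texttt{GetResult}, \texttt{handle\_get} and \texttt{unpack\_get} are hypothetical, and the idea of a single result struct that carries either the return value or the declared exception is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class KeyNotFound(Exception):
    """Stand-in for the IDL exception KeyNotFound."""


@dataclass
class GetArgs:
    """Hypothetical struct for the argument list of get(1:i32 key)."""
    key: Optional[int] = None  # field 1


@dataclass
class GetResult:
    """Hypothetical struct holding the return value or the thrown exception."""
    success: Optional[str] = None      # the return value (assumed layout)
    knf: Optional[KeyNotFound] = None  # field 1 of the throws list


def handle_get(store: Dict[int, str], args: GetArgs) -> GetResult:
    """Server side: run the call and pack the outcome into a result struct."""
    if args.key in store:
        return GetResult(success=store[args.key])
    return GetResult(knf=KeyNotFound(f"no entry for key {args.key}"))


def unpack_get(result: GetResult) -> Optional[str]:
    """Client side: turn the result struct back into a return or a raise."""
    if result.knf is not None:
        raise result.knf
    return result.success
```

Because arguments and exceptions are ordinary structs in this picture, they can be serialized by the same machinery as any user-defined struct, which is the property the text relies on.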
+ +\section{Transport} + +The transport layer is used by the generated code to facilitate data transfer. + +\subsection{Interface} + +A key design choice in the implementation of Thrift was to decouple the +transport layer from the code generation layer. Though Thrift is typically +used on top of the TCP/IP stack with streaming sockets as the base layer of +communication, there was no compelling reason to build that constraint into +the system. The performance tradeoff incurred by an abstracted I/O layer +(roughly one virtual method lookup / function call per operation) was +immaterial compared to the cost of actual I/O operations (typically invoking +system calls). + +Fundamentally, generated Thrift code only needs to know how to read and +write data. The origin and destination of the data are irrelevant; it may be a +socket, a segment of shared memory, or a file on the local disk. The Thrift +transport interface supports the following methods: + +\begin{itemize} +\item \texttt{open} Opens the transport +\item \texttt{close} Closes the transport +\item \texttt{isOpen} Indicates whether the transport is open +\item \texttt{read} Reads from the transport +\item \texttt{write} Writes to the transport +\item \texttt{flush} Forces any pending writes +\end{itemize} + +There are a few additional methods not documented here which are used to aid +in batching reads and optionally signaling the completion of a read or +write operation from the generated code. + +In addition to the above +\texttt{TTransport} interface, there is a\\ +\texttt{TServerTransport} interface +used to accept or create primitive transport objects. 
Its interface is as +follows: + +\begin{itemize} +\item \texttt{open} Opens the transport +\item \texttt{listen} Begins listening for connections +\item \texttt{accept} Returns a new client transport +\item \texttt{close} Closes the transport +\end{itemize} + +\subsection{Implementation} + +The transport interface is designed for simple implementation in any +programming language. New transport mechanisms can be easily defined as needed +by application developers. + +\subsubsection{TSocket} + +The \texttt{TSocket} class is implemented across all target languages. It +provides a common, simple interface to a TCP/IP stream socket. + +\subsubsection{TFileTransport} + +The \texttt{TFileTransport} is an abstraction of an on-disk file to a data +stream. It can be used to write out a set of incoming Thrift requests to a file +on disk. The on-disk data can then be replayed from the log, either for +post-processing or for reproduction and/or simulation of past events. + +\subsubsection{Utilities} + +The Transport interface is designed to support easy extension using common +OOP techniques, such as composition. Some simple utilities include the +\texttt{TBufferedTransport}, which buffers the writes and reads on an +underlying transport, the \texttt{TFramedTransport}, which transmits data with frame +size headers for chunking optimization or nonblocking operation, and the +\texttt{TMemoryBuffer}, which allows reading and writing directly from the heap +or stack memory owned by the process. + +\section{Protocol} + +A second major abstraction in Thrift is the separation of data structure from +transport representation. Thrift enforces a certain messaging structure when +transporting data, but it is agnostic to the protocol encoding in use. That is, +it does not matter whether data is encoded as XML, human-readable ASCII, or a +dense binary format as long as the data supports a fixed set of operations +that allow it to be deterministically read and written by generated code. 
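To make the separation between transport and encoding concrete, here is a minimal Python sketch under stated assumptions. \texttt{MemoryBuffer} is a toy stand-in for an in-memory transport with only \texttt{read} and \texttt{write}; \texttt{BinaryCodec} and \texttt{TextCodec} are hypothetical encoders invented for this example, not Thrift classes. The point is only that the calling code is identical for both encodings.

```python
import struct


class MemoryBuffer:
    """Toy transport offering read/write only, backed by a bytearray."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._pos = 0

    def write(self, buf: bytes) -> None:
        self._data.extend(buf)

    def read(self, n: int) -> bytes:
        chunk = bytes(self._data[self._pos:self._pos + n])
        self._pos += len(chunk)
        return chunk


class BinaryCodec:
    """Dense encoding: 4-byte big-endian integers, length-prefixed strings."""

    def __init__(self, trans: MemoryBuffer) -> None:
        self.trans = trans

    def write_i32(self, value: int) -> None:
        self.trans.write(struct.pack("!i", value))

    def read_i32(self) -> int:
        return struct.unpack("!i", self.trans.read(4))[0]

    def write_string(self, text: str) -> None:
        raw = text.encode("utf-8")
        self.write_i32(len(raw))
        self.trans.write(raw)

    def read_string(self) -> str:
        return self.trans.read(self.read_i32()).decode("utf-8")


class TextCodec(BinaryCodec):
    """Readable encoding: integers as decimal text terminated by a newline."""

    def write_i32(self, value: int) -> None:
        self.trans.write(f"{value}\n".encode("ascii"))

    def read_i32(self) -> int:
        digits = bytearray()
        while True:
            ch = self.trans.read(1)
            if ch in (b"", b"\n"):
                return int(digits.decode("ascii"))
            digits.extend(ch)


def round_trip(codec_cls, number: int, text: str):
    """Write then read the same values; the caller never sees the encoding."""
    codec = codec_cls(MemoryBuffer())
    codec.write_i32(number)
    codec.write_string(text)
    return codec.read_i32(), codec.read_string()
```

Calling \texttt{round\_trip} with either codec class returns the values that were written, so application code can stay unchanged when the encoding is swapped.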
+ +\subsection{Interface} + +The Thrift Protocol interface is very straightforward. It fundamentally +supports two things: 1) bidirectional sequenced messaging, and +2) encoding of base types, containers, and structs. + +\begin{verbatim} +writeMessageBegin(name, type, seq) +writeMessageEnd() +writeStructBegin(name) +writeStructEnd() +writeFieldBegin(name, type, id) +writeFieldEnd() +writeFieldStop() +writeMapBegin(ktype, vtype, size) +writeMapEnd() +writeListBegin(etype, size) +writeListEnd() +writeSetBegin(etype, size) +writeSetEnd() +writeBool(bool) +writeByte(byte) +writeI16(i16) +writeI32(i32) +writeI64(i64) +writeDouble(double) +writeString(string) + +name, type, seq = readMessageBegin() + readMessageEnd() +name = readStructBegin() + readStructEnd() +name, type, id = readFieldBegin() + readFieldEnd() +k, v, size = readMapBegin() + readMapEnd() +etype, size = readListBegin() + readListEnd() +etype, size = readSetBegin() + readSetEnd() +bool = readBool() +byte = readByte() +i16 = readI16() +i32 = readI32() +i64 = readI64() +double = readDouble() +string = readString() +\end{verbatim} + +Note that every \texttt{write} function has exactly one \texttt{read} counterpart, with +the exception of \texttt{writeFieldStop()}. This is a special method +that signals the end of a struct. The procedure for reading a struct is to +\texttt{readFieldBegin()} until the stop field is encountered, and then to +\texttt{readStructEnd()}. The +generated code relies upon this call sequence to ensure that everything written by +a protocol encoder can be read by a matching protocol decoder. Further note +that this set of functions is by design more robust than necessary. +For example, \texttt{writeStructEnd()} is not strictly necessary, as the end of +a struct may be implied by the stop field. This method is a convenience for +verbose protocols in which it is cleaner to separate these calls (e.g. a closing +\texttt{</struct>} tag in XML). 
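The call sequence described above (field headers followed by values, terminated by a stop field) can be sketched in a few lines of Python. This is an illustrative toy, not the Thrift library: the numeric type codes, the header layout (one byte of type, two bytes of field identifier) and the helper names are assumptions chosen for the example.

```python
import struct

# Type codes are assumptions for this sketch.
STOP, I32, STRING = 0, 8, 11


def write_field_begin(out: bytearray, ftype: int, fid: int) -> None:
    """Field header: 1-byte type specifier, 2-byte field identifier."""
    out.extend(struct.pack("!bh", ftype, fid))


def write_field_stop(out: bytearray) -> None:
    """A header consisting only of the STOP type ends the struct."""
    out.extend(struct.pack("!b", STOP))


def write_example(number: int, name: str) -> bytes:
    """Encode fields 1 (i32 number) and 4 (string name) of a struct."""
    out = bytearray()
    write_field_begin(out, I32, 1)
    out.extend(struct.pack("!i", number))
    raw = name.encode("utf-8")
    write_field_begin(out, STRING, 4)
    out.extend(struct.pack("!i", len(raw)) + raw)
    write_field_stop(out)
    return bytes(out)


def read_struct(data: bytes) -> dict:
    """Read field headers until the stop field; return {field id: value}."""
    fields, pos = {}, 0
    while True:
        (ftype,) = struct.unpack_from("!b", data, pos)
        pos += 1
        if ftype == STOP:
            return fields
        (fid,) = struct.unpack_from("!h", data, pos)
        pos += 2
        if ftype == I32:
            (value,) = struct.unpack_from("!i", data, pos)
            pos += 4
        elif ftype == STRING:
            (size,) = struct.unpack_from("!i", data, pos)
            pos += 4
            value = data[pos:pos + size].decode("utf-8")
            pos += size
        else:
            raise ValueError(f"unsupported type code {ftype}")
        fields[fid] = value
```

Note that the reader never needs the total length of the struct in advance; it simply consumes fields until it meets the stop marker.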
+ +\subsection{Structure} + +Thrift structures are designed to support encoding into a streaming +protocol. The implementation should never need to frame or compute the +entire data length of a structure prior to encoding it. This is critical to +performance in many scenarios. Consider a long list of relatively large +strings. If the protocol interface required reading or writing a list to be an +atomic operation, then the implementation would need to perform a linear pass over the +entire list before encoding any data. However, if the list can be written +as iteration is performed, the corresponding read may begin in parallel, +theoretically offering an end-to-end speedup of $(kN - C)$, where $N$ is the size +of the list, $k$ the cost factor associated with serializing a single +element, and $C$ is a fixed offset for the delay between data being written +and becoming available to read. + +Similarly, structs do not encode their data lengths a priori. Instead, they are +encoded as a sequence of fields, with each field having a type specifier and a +unique field identifier. Note that the inclusion of type specifiers allows +the protocol to be safely parsed and decoded without any generated code +or access to the original IDL file. Structs are terminated by a field header +with a special \texttt{STOP} type. Because all the basic types can be read +deterministically, all structs (even those containing other structs) can be +read deterministically. The Thrift protocol is self-delimiting without any +framing and regardless of the encoding format. + +In situations where streaming is unnecessary or framing is advantageous, it +can be very simply added into the transport layer, using the +\texttt{TFramedTransport} abstraction. + +\subsection{Implementation} + +Facebook has implemented and deployed a space-efficient binary protocol which +is used by most backend services. Essentially, it writes all data +in a flat binary format. 
Integer types are converted to network byte order, +strings are prepended with their byte length, and all message and field headers +are written using the primitive integer serialization constructs. String names +for fields are omitted; when using generated code, field identifiers are +sufficient. + +We decided against some extreme storage optimizations (e.g., packing +small integers into ASCII or using a 7-bit continuation format) for the sake +of simplicity and clarity in the code. These alterations can easily be made +if and when we encounter a performance-critical use case that demands them. + +\section{Versioning} + +Thrift is robust in the face of versioning and data definition changes. This +is critical to enable staged rollouts of changes to deployed services. The +system must be able to support reading of old data from log files, as well as +requests from out-of-date clients to new servers, and vice versa. + +\subsection{Field Identifiers} + +Versioning in Thrift is implemented via field identifiers. The field header +for every member of a struct in Thrift is encoded with a unique field +identifier. The combination of this field identifier and its type specifier +is used to uniquely identify the field. The Thrift definition language +supports automatic assignment of field identifiers, but it is good +programming practice to always explicitly specify field identifiers. +Identifiers are specified as follows: + +\begin{verbatim} +struct Example { + 1:i32 number=10, + 2:i64 bigNumber, + 3:double decimals, + 4:string name="thrifty" +}\end{verbatim} + +To avoid conflicts between manually and automatically assigned identifiers, +fields with identifiers omitted are assigned identifiers +decrementing from -1, and the language only supports the manual assignment of +positive identifiers. 
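The assignment rule can be sketched as follows. This is an illustration of the rule as stated above, not the Thrift compiler's code; the function name \texttt{assign\_field\_ids}, the input shape, and the duplicate check are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def assign_field_ids(
    fields: List[Tuple[Optional[int], str]]
) -> List[Tuple[int, str]]:
    """Give every field an identifier.

    Explicit identifiers must be positive; fields without one receive
    -1, -2, ... in declaration order, so the two ranges never collide.
    """
    next_auto = -1
    seen = set()
    assigned = []
    for fid, name in fields:
        if fid is None:
            fid = next_auto
            next_auto -= 1
        elif fid <= 0:
            raise ValueError(f"explicit identifier for {name!r} must be positive")
        if fid in seen:
            # Rejecting duplicates is an assumption of this sketch.
            raise ValueError(f"duplicate field identifier {fid}")
        seen.add(fid)
        assigned.append((fid, name))
    return assigned
```

Automatically assigned identifiers depend on declaration order, so reordering or inserting unnumbered fields changes them; this is one reason explicit identifiers are the recommended practice.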
+ +When data is being deserialized, the generated code can use these identifiers +to properly identify the field and determine whether it aligns with a field in +its definition file. If a field identifier is not recognized, the generated +code can use the type specifier to skip the unknown field without any error. +Again, this is possible due to the fact that all datatypes are +self-delimiting. + +Field identifiers can (and should) also be specified in function argument +lists. In fact, argument lists are not only represented as structs on the +backend, but actually share the same code in the compiler frontend. This +allows for version-safe modification of method parameters. + +\begin{verbatim} +service StringCache { + void set(1:i32 key, 2:string value), + string get(1:i32 key) throws (1:KeyNotFound knf), + void delete(1:i32 key) +} +\end{verbatim} + +The syntax for specifying field identifiers was chosen to echo their structure. +Structs can be thought of as a dictionary where the identifiers are keys, and +the values are strongly-typed named fields. + +Field identifiers internally use the \texttt{i16} Thrift type. Note, however, +that the \texttt{TProtocol} abstraction may encode identifiers in any format. + +\subsection{Isset} + +When an unexpected field is encountered, it can be safely ignored and +discarded. When an expected field is not found, there must be some way to +signal to the developer that it was not present. This is implemented via an +inner \texttt{isset} structure inside the defined objects. (Isset functionality +is implicit with a \texttt{null} value in PHP, \texttt{None} in Python +and \texttt{nil} in Ruby.) Essentially, +the inner \texttt{isset} object of each Thrift struct contains a boolean value +for each field which denotes whether or not that field is present in the +struct. When a reader receives a struct, it should check for a field being set +before operating directly on it. 
+ +\begin{verbatim} +class Example { + public: + Example() : + number(10), + bigNumber(0), + decimals(0), + name("thrifty") {} + + int32_t number; + int64_t bigNumber; + double decimals; + std::string name; + + struct __isset { + __isset() : + number(false), + bigNumber(false), + decimals(false), + name(false) {} + bool number; + bool bigNumber; + bool decimals; + bool name; + } __isset; +... +} +\end{verbatim} + +\subsection{Case Analysis} + +There are four cases in which version mismatches may occur. + +\begin{enumerate} +\item \textit{Added field, old client, new server.} In this case, the old +client does not send the new field. The new server recognizes that the field +is not set, and implements default behavior for out-of-date requests. +\item \textit{Removed field, old client, new server.} In this case, the old +client sends the removed field. The new server simply ignores it. +\item \textit{Added field, new client, old server.} The new client sends a +field that the old server does not recognize. The old server simply ignores +it and processes as normal. +\item \textit{Removed field, new client, old server.} This is the most +dangerous case, as the old server is unlikely to have suitable default +behavior implemented for the missing field. It is recommended that in this +situation the new server be rolled out prior to the new clients. +\end{enumerate} + +\subsection{Protocol/Transport Versioning} +The \texttt{TProtocol} abstractions are also designed to give protocol +implementations the freedom to version themselves in whatever manner they +see fit. Specifically, any protocol implementation is free to send whatever +it likes in the \texttt{writeMessageBegin()} call. It is entirely up to the +implementor how to handle versioning at the protocol level. The key point is +that protocol encoding changes are safely isolated from interface definition +version changes. + +Note that the exact same is true of the \texttt{TTransport} interface. 
For +example, if we wished to add some new checksumming or error detection to the +\texttt{TFileTransport}, we could simply add a version header into the +data it writes to the file in such a way that it would still accept old +log files without the given header. + +\section{RPC Implementation} + +\subsection{TProcessor} + +The last core interface in the Thrift design is the \texttt{TProcessor}, +perhaps the simplest of the constructs. The interface is as follows: + +\begin{verbatim} +interface TProcessor { + bool process(TProtocol in, TProtocol out) + throws TException +} +\end{verbatim} + +The key design idea here is that the complex systems we build can fundamentally +be broken down into agents or services that operate on inputs and outputs. In +most cases, there is actually just one input and output (an RPC client) that +needs handling. + +\subsection{Generated Code} + +When a service is defined, we generate a +\texttt{TProcessor} instance capable of handling RPC requests to that service, +using a few helpers. The fundamental structure (illustrated in pseudo-C++) is +as follows: + +\begin{verbatim} +Service.thrift + => Service.cpp + interface ServiceIf + class ServiceClient : virtual ServiceIf + TProtocol in + TProtocol out + class ServiceProcessor : TProcessor + ServiceIf handler + +ServiceHandler.cpp + class ServiceHandler : virtual ServiceIf + +TServer.cpp + TServer(TProcessor processor, + TServerTransport transport, + TTransportFactory tfactory, + TProtocolFactory pfactory) + serve() +\end{verbatim} + +From the Thrift definition file, we generate the virtual service interface. +A client class is generated, which implements the interface and +uses two \texttt{TProtocol} instances to perform the I/O operations. The +generated processor implements the \texttt{TProcessor} interface. 
The generated +code has all the logic to handle RPC invocations via the \texttt{process()} +call, and takes as a parameter an instance of the service interface, as +implemented by the application developer. + +The user provides an implementation of the application interface in separate, +non-generated source code. + +\subsection{TServer} + +Finally, the Thrift core libraries provide a \texttt{TServer} abstraction. +The \texttt{TServer} object generally works as follows. + +\begin{itemize} +\item Use the \texttt{TServerTransport} to get a \texttt{TTransport} +\item Use the \texttt{TTransportFactory} to optionally convert the primitive +transport into a suitable application transport (typically the +\texttt{TBufferedTransportFactory} is used here) +\item Use the \texttt{TProtocolFactory} to create an input and output protocol +for the \texttt{TTransport} +\item Invoke the \texttt{process()} method of the \texttt{TProcessor} object +\end{itemize} + +The layers are appropriately separated such that the server code needs to know +nothing about any of the transports, encodings, or applications in play. The +server encapsulates the logic around connection handling, threading, etc. +while the processor deals with RPC. The only code written by the application +developer lives in the definitional Thrift file and the interface +implementation. + +Facebook has deployed multiple \texttt{TServer} implementations, including +the single-threaded \texttt{TSimpleServer}, thread-per-connection +\texttt{TThreadedServer}, and thread-pooling \texttt{TThreadPoolServer}. + +The \texttt{TProcessor} interface is very general by design. There is no +requirement that a \texttt{TServer} take a generated \texttt{TProcessor} +object. Thrift allows the application developer to easily write any type of +server that operates on \texttt{TProtocol} objects (for instance, a server +could simply stream a certain type of object without any actual RPC method +invocation). 
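The division of labour between handler, processor and server can be sketched in Python as below. Everything here is a simplification for illustration: plain lists stand in for the input and output \texttt{TProtocol} objects, messages are tuples rather than encoded bytes, and the class and function names (\texttt{StringCacheHandler}, \texttt{StringCacheProcessor}, \texttt{serve}) are hypothetical rather than generated by Thrift.

```python
from typing import Dict, List


class StringCacheHandler:
    """Application code: the only part the service author writes."""

    def __init__(self) -> None:
        self._store: Dict[int, str] = {}

    def set(self, key: int, value: str) -> None:
        self._store[key] = value

    def get(self, key: int) -> str:
        return self._store[key]


class StringCacheProcessor:
    """Stand-in for a generated processor: decode, dispatch, encode."""

    def __init__(self, handler: StringCacheHandler) -> None:
        # Map method names to handler methods for constant-time dispatch.
        self._process_map = {"set": handler.set, "get": handler.get}

    def process(self, inp: List[tuple], out: List[tuple]) -> bool:
        """Handle one request; return False when no input remains."""
        if not inp:
            return False
        name, seqid, args = inp.pop(0)
        func = self._process_map.get(name)
        if func is None:
            out.append((name, seqid, "exception", "unknown method"))
        else:
            out.append((name, seqid, "reply", func(*args)))
        return True


def serve(processor: StringCacheProcessor,
          inp: List[tuple], out: List[tuple]) -> int:
    """Toy server loop: call process() until the input is exhausted."""
    handled = 0
    while processor.process(inp, out):
        handled += 1
    return handled
```

The server loop knows nothing about the service's methods, and the handler knows nothing about messages or sequence ids; only the processor touches both sides. Error handling inside the handler (for example a missing key in \texttt{get}) is deliberately left out of this sketch.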
+ +\section{Implementation Details} +\subsection{Target Languages} +Thrift currently supports five target languages: C++, Java, Python, Ruby, and +PHP. At Facebook, we have deployed servers predominantly in C++, Java, and +Python. Thrift services implemented in PHP have also been embedded into the +Apache web server, providing transparent backend access to many of our +frontend constructs using a \texttt{THttpClient} implementation of the +\texttt{TTransport} interface. + +Though Thrift was explicitly designed to be much more efficient and robust +than typical web technologies, as we were designing our XML-based REST web +services API we noticed that Thrift could be easily used to define our +service interface. Though we do not currently employ SOAP envelopes (in the +authors' opinions there is already far too much repetitive enterprise Java +software to do that sort of thing), we were able to quickly extend Thrift to +generate XML Schema Definition files for our service, as well as a framework +for versioning different implementations of our web service. Though public +web services are admittedly tangential to Thrift's core use case and design, +Thrift facilitated rapid iteration and affords us the ability to quickly +migrate our entire XML-based web service onto a higher performance system +should the need arise. + +\subsection{Generated Structs} +We made a conscious decision to make our generated structs as transparent as +possible. All fields are publicly accessible; there are no \texttt{set()} and +\texttt{get()} methods. Similarly, use of the \texttt{isset} object is not +enforced. We do not include any \texttt{FieldNotSetException} construct. +Developers have the option to use these fields to write more robust code, but +the system is robust to the developer ignoring the \texttt{isset} construct +entirely and will provide suitable default behavior in all cases. + +This choice was motivated by the desire to ease application development. 
Our stated +goal is not to make developers learn a rich new library in their language of +choice, but rather to generate code that allows them to work with the constructs +that are most familiar in each language. + +We also made the \texttt{read()} and \texttt{write()} methods of the generated +objects public so that the objects can be used outside of the context +of RPC clients and servers. Thrift is a useful tool simply for generating +objects that are easily serializable across programming languages. + +\subsection{RPC Method Identification} +Method calls in RPC are implemented by sending the method name as a string. One +issue with this approach is that longer method names require more bandwidth. +We experimented with using fixed-size hashes to identify methods, but in the +end concluded that the savings were not worth the headaches incurred. Reliably +dealing with conflicts across versions of an interface definition file is +impossible without a meta-storage system (i.e. to generate non-conflicting +hashes for the current version of a file, we would have to know about all +conflicts that ever existed in any previous version of the file). + +We wanted to avoid too many unnecessary string comparisons upon +method invocation. To deal with this, we generate maps from strings to function +pointers, so that invocation is effectively accomplished via a constant-time +hash lookup in the common case. This requires the use of a couple of interesting +code constructs. Because Java does not have function pointers, process +functions are all private member classes implementing a common interface. + +\begin{verbatim} +private class ping implements ProcessFunction { + public void process(int seqid, + TProtocol iprot, + TProtocol oprot) + throws TException + { ...} +} + +HashMap<String,ProcessFunction> processMap_ = + new HashMap<String,ProcessFunction>(); +\end{verbatim} + +In C++, we use a relatively esoteric language construct: member function +pointers. 
+ +\begin{verbatim} +std::map<std::string, + void (ExampleServiceProcessor::*)(int32_t, + facebook::thrift::protocol::TProtocol*, + facebook::thrift::protocol::TProtocol*)> + processMap_; +\end{verbatim} + +Using these techniques, the cost of string processing is minimized, and we +reap the benefit of being able to easily debug corrupt or misunderstood data by +inspecting it for known string method names. + +\subsection{Servers and Multithreading} +Thrift services require basic multithreading to handle simultaneous +requests from multiple clients. For the Python and Java implementations of +Thrift server logic, the standard threading libraries distributed with the +languages provide adequate support. For the C++ implementation, no standard multithread runtime +library exists. Specifically, robust, lightweight, and portable +thread manager and timer class implementations do not exist. We investigated +existing implementations, namely \texttt{boost::thread}, +\texttt{boost::threadpool}, \texttt{ACE\_Thread\_Manager} and +\texttt{ACE\_Timer}. + +While \texttt{boost::threads}\cite{boost.threads} provides clean, +lightweight and robust implementations of multi-thread primitives (mutexes, +conditions, threads) it does not provide a thread manager or timer +implementation. + +\texttt{boost::threadpool}\cite{boost.threadpool} also looked promising but +was not far enough along for our purposes. We wanted to limit the dependency on +third-party libraries as much as possible. Because\\ +\texttt{boost::threadpool} is +not a pure template library and requires runtime libraries and because it is +not yet part of the official Boost distribution we felt it was not ready for +use in Thrift. As \texttt{boost::threadpool} evolves and especially if it is +added to the Boost distribution we may reconsider our decision to not use it. + +ACE has both a thread manager and timer class in addition to multi-thread +primitives. The biggest problem with ACE is that it is ACE. 
Unlike Boost, ACE +API quality is poor. Everything in ACE has large numbers of dependencies on +everything else in ACE, thus forcing developers to throw out standard +classes, such as STL collections, in favor of ACE's homebrewed implementations. In +addition, unlike Boost, ACE implementations demonstrate little understanding +of the power and pitfalls of C++ programming and take no advantage of modern +templating techniques to ensure compile-time safety and reasonable compiler +error messages. For all these reasons, ACE was rejected. Instead, we chose +to implement our own library, described in the following sections. + +\subsection{Thread Primitives} + +The Thrift thread libraries are implemented in the namespace\\ +\texttt{facebook::thrift::concurrency} and have three components: +\begin{itemize} +\item primitives +\item thread pool manager +\item timer manager +\end{itemize} + +As mentioned above, we were hesitant to introduce any additional dependencies +to Thrift. We decided to use \texttt{boost::shared\_ptr} because it is so +useful for multithreaded applications, it requires no link-time or +runtime libraries (i.e. it is a pure template library) and it is due +to become part of the C++0x standard. + +We implement standard \texttt{Mutex} and \texttt{Condition} classes, and a + \texttt{Monitor} class. The latter is simply a combination of a mutex and +condition variable and is analogous to the \texttt{Monitor} implementation provided for +the Java \texttt{Object} class. This is also sometimes referred to as a barrier. We +provide a \texttt{Synchronized} guard class to allow Java-like synchronized blocks. +This is just a bit of syntactic sugar, but, like its Java counterpart, clearly +delimits critical sections of code. Unlike with its Java counterpart, we still +have the ability to programmatically lock, unlock, block, and signal monitors. 
+ +\begin{verbatim} +void run() { + {Synchronized s(manager->monitor); + if (manager->state == TimerManager::STARTING) { + manager->state = TimerManager::STARTED; + manager->monitor.notifyAll(); + } + } +} +\end{verbatim} + +We again borrowed from Java the distinction between a thread and a runnable +class. A \texttt{Thread} is the actual schedulable object. The +\texttt{Runnable} is the logic to execute within the thread. +The \texttt{Thread} implementation deals with all the platform-specific thread +creation and destruction issues, while the \texttt{Runnable} implementation deals +with the application-specific per-thread logic. The benefit of this approach +is that developers can easily subclass the Runnable class without pulling in +platform-specific super-classes. + +\subsection{Thread, Runnable, and shared\_ptr} +We use \texttt{boost::shared\_ptr} throughout the \texttt{ThreadManager} and +\texttt{TimerManager} implementations to guarantee cleanup of dead objects that can +be accessed by multiple threads. For \texttt{Thread} class implementations, +\texttt{boost::shared\_ptr} usage requires particular attention to make sure +\texttt{Thread} objects are neither leaked nor dereferenced prematurely while +creating and shutting down threads. + +Thread creation requires calling into a C library. (In our case the POSIX +thread library, \texttt{libpthread}, but the same would be true for WIN32 threads). +Typically, the OS makes few, if any, guarantees about when \texttt{ThreadMain}, a C thread's entry-point function, will be called. Therefore, it is +possible that our thread create call, +\texttt{ThreadFactory::newThread()} could return to the caller +well before that time. To ensure that the returned \texttt{Thread} object is not +prematurely cleaned up if the caller gives up its reference prior to the +\texttt{ThreadMain} call, the \texttt{Thread} object makes a weak reference to +itself in its \texttt{start} method. 
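The weak self-reference can be illustrated with a small sketch. It is not the
Thrift implementation: \texttt{SketchThread}, \texttt{create} and
\texttt{threadMain} are hypothetical names, no operating-system thread is
created, and \texttt{std::shared\_ptr} and \texttt{std::weak\_ptr} are assumed
as stand-ins for the Boost smart pointers.

\begin{verbatim}
#include <cassert>
#include <memory>

// Hypothetical, simplified thread object.
class SketchThread {
 public:
  // Plays the factory's role: the weak self-reference is
  // made through the same shared_ptr the caller receives.
  static std::shared_ptr<SketchThread> create() {
    std::shared_ptr<SketchThread> t(new SketchThread());
    t->self_ = t;
    assert(!t->self_.expired());
    return t;
  }

  // Stand-in for the C entry point: it proceeds only if a
  // strong reference can still be obtained.
  static bool threadMain(
      const std::weak_ptr<SketchThread>& ref) {
    std::shared_ptr<SketchThread> strong = ref.lock();
    if (!strong) {
      return false;  // object already destroyed
    }
    strong->ran_ = true;
    return true;
  }

  std::weak_ptr<SketchThread> weakRef() const {
    return self_;
  }
  bool ran() const { return ran_; }

 private:
  SketchThread() = default;
  std::weak_ptr<SketchThread> self_;
  bool ran_ = false;
};
\end{verbatim}

In this sketch, calling \texttt{threadMain} while the caller still holds the
pointer returned by \texttt{create} succeeds; once that pointer is released,
\texttt{lock()} yields an empty pointer and the entry point returns without
touching the destroyed object.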
+ +With the weak reference in hand, the \texttt{ThreadMain} function can attempt to get +a strong reference before entering the \texttt{Runnable::run} method of the +\texttt{Runnable} object bound to the \texttt{Thread}. If no strong references to the +thread are obtained between exiting \texttt{Thread::start} and entering \texttt{ThreadMain}, the weak reference returns \texttt{null} and the function +exits immediately. + +The need for the \texttt{Thread} to make a weak reference to itself has a +significant impact on the API. Since references are managed through the +\texttt{boost::shared\_ptr} templates, the \texttt{Thread} object must have a reference +to itself wrapped by the same \texttt{boost::shared\_ptr} envelope that is returned +to the caller. This necessitated the use of the factory pattern. +\texttt{ThreadFactory} creates the raw \texttt{Thread} object and a +\texttt{boost::shared\_ptr} wrapper, and calls a private helper method of the class +implementing the \texttt{Thread} interface (in this case, \texttt{PosixThread::weakRef}) + to allow it to make a weak reference to itself through the + \texttt{boost::shared\_ptr} envelope. + +\texttt{Thread} and \texttt{Runnable} objects reference each other. A \texttt{Runnable} +object may need to know about the thread in which it is executing, and a \texttt{Thread}, obviously, +needs to know what \texttt{Runnable} object it is hosting. This interdependency is +further complicated because the lifecycle of each object is independent of the +other. An application may create a set of \texttt{Runnable} objects to be reused in different threads, or it may create and forget a \texttt{Runnable} object +once a thread has been created and started for it. + +The \texttt{Thread} class takes a \texttt{boost::shared\_ptr} reference to the hosted +\texttt{Runnable} object in its constructor, while the \texttt{Runnable} class has an +explicit \texttt{thread} method to allow explicit binding of the hosted thread. 
+\texttt{ThreadFactory::newThread} binds the objects to each other. + +\subsection{ThreadManager} + +\texttt{ThreadManager} creates a pool of worker threads and +allows applications to schedule tasks for execution as free worker threads +become available. The \texttt{ThreadManager} does not implement dynamic +thread pool resizing, but provides primitives so that applications can add +and remove threads based on load. This approach was chosen because +implementing load metrics and thread pool size is very application +specific. For example, some applications may want to adjust pool size based +on a running average of work arrival rates that are measured via polled +samples. Others may simply wish to react immediately to work-queue +depth high and low water marks. Rather than trying to create a complex +API abstract enough to capture these different approaches, we +simply leave it up to the particular application and provide the +primitives to enact the desired policy and sample current status. + +\subsection{TimerManager} + +\texttt{TimerManager} allows applications to schedule + \texttt{Runnable} objects for execution at some point in the future. Its specific task +is to allow applications to sample \texttt{ThreadManager} load at regular +intervals and make changes to the thread pool size based on application policy. +Of course, it can be used to generate any number of timer or alarm events. + +The default implementation of \texttt{TimerManager} uses a single thread to +execute expired \texttt{Runnable} objects. Thus, if a timer operation needs to +do a large amount of work and especially if it needs to do blocking I/O, +that should be done in a separate thread. + +\subsection{Nonblocking Operation} +Though the Thrift transport interfaces map more directly to a blocking I/O +model, we have implemented a high-performance \texttt{TNonBlockingServer} +in C++ based on \texttt{libevent} and the \texttt{TFramedTransport}. 
We +implemented this by moving all I/O into one tight event loop using a +state machine. Essentially, the event loop reads framed requests into +\texttt{TMemoryBuffer} objects. Once entire requests are ready, they are +dispatched to the \texttt{TProcessor} object, which can read directly from +the data in memory. + +\subsection{Compiler} +The Thrift compiler is implemented in C++ using standard \texttt{lex}/\texttt{yacc} +lexing and parsing. Though it could have been implemented with fewer +lines of code in another language (e.g. Python Lex-Yacc (PLY) or \texttt{ocamlyacc}), using C++ +forces explicit definition of the language constructs. Strongly typing the +parse tree elements (debatably) makes the code more approachable for new +developers. + +Code generation is done using two passes. The first pass looks only for +include files and type definitions. Type definitions are not checked during +this phase, since they may depend upon include files. All included files +are sequentially scanned in a first pass. Once the include tree has been +resolved, a second pass over all files inserts type definitions +into the parse tree and raises an error on any undefined types. The program is +then generated against the parse tree. + +Due to inherent complexities and potential for circular dependencies, +we explicitly disallow forward declaration. Two Thrift structs cannot +each contain an instance of the other. (Since we do not allow \texttt{null} +struct instances in the generated C++ code, this would actually be impossible.) + +\subsection{TFileTransport} +The \texttt{TFileTransport} logs Thrift requests/structs by +framing incoming data with its length and writing it out to disk. +Using a framed on-disk format allows for better error checking and +helps with the processing of a finite number of discrete events. The\\ +\texttt{TFileWriterTransport} uses a system of swapping in-memory buffers +to ensure good performance while logging large amounts of data. 
+A Thrift log file is split up into chunks of a specified size; logged messages +are not allowed to cross chunk boundaries. A message that would cross a chunk +boundary will cause padding to be added until the end of the chunk and the +first byte of the message are aligned to the beginning of the next chunk. +Partitioning the file into chunks makes it possible to read and interpret data +from a particular point in the file. + +\section{Facebook Thrift Services} +Thrift has been employed in a large number of applications at Facebook, including +search, logging, mobile, ads and the developer platform. Two specific usages are discussed below. + +\subsection{Search} +Thrift is used as the underlying protocol and transport layer for the Facebook Search service. +The multi-language code generation is well suited for search because it allows for application +development in an efficient server side language (C++) and allows the Facebook PHP-based web application +to make calls to the search service using Thrift PHP libraries. There is also a large +variety of search stats, deployment and testing functionality that is built on top +of generated Python code. Additionally, the Thrift log file format is +used as a redo log for providing real-time search index updates. Thrift has allowed the +search team to leverage each language for its strengths and to develop code at a rapid pace. + +\subsection{Logging} +The Thrift \texttt{TFileTransport} functionality is used for structured logging. Each +service function definition along with its parameters can be considered to be +a structured log entry identified by the function name. This log can then be used for +a variety of purposes, including inline and offline processing, stats aggregation and as a redo log. + +\section{Conclusions} +Thrift has enabled Facebook to build scalable backend +services efficiently by enabling engineers to divide and conquer. 
Application +developers can focus on application code without worrying about the +sockets layer. We avoid duplicated work by writing buffering and I/O logic +in one place, rather than interspersing it in each application. + +Thrift has been employed in a wide variety of applications at Facebook, +including search, logging, mobile, ads, and the developer platform. We have +found that the marginal performance cost incurred by an extra layer of +software abstraction is far eclipsed by the gains in developer efficiency and +systems reliability. + +\appendix + +\section{Similar Systems} +The following are software systems similar to Thrift. Each is (very!) briefly +described: + +\begin{itemize} +\item \textit{SOAP.} XML-based. Designed for web services via HTTP, excessive +XML parsing overhead. +\item \textit{CORBA.} Relatively comprehensive, debatably overdesigned and +heavyweight. Comparably cumbersome software installation. +\item \textit{COM.} Embraced mainly in Windows client software. Not an entirely +open solution. +\item \textit{Pillar.} Lightweight and high-performance, but missing versioning +and abstraction. +\item \textit{Protocol Buffers.} Closed-source, owned by Google. Described in +Sawzall paper. +\end{itemize} + +\acks + +Many thanks for feedback on Thrift (and extreme trial by fire) are due to +Martin Smith, Karl Voskuil and Yishan Wong. + +Thrift is a successor to Pillar, a similar system developed +by Adam D'Angelo, first while at Caltech and continued later at Facebook. +Thrift simply would not have happened without Adam's insights. + +\begin{thebibliography}{} + +\bibitem{boost.threads} +Kempf, William, +``Boost.Threads'', +\url{http://www.boost.org/doc/html/threads.html} + +\bibitem{boost.threadpool} +Henkel, Philipp, +``threadpool'', +\url{http://threadpool.sourceforge.net} + +\end{thebibliography} + +\end{document}