I have spent the last two days reading about RPC and it is still a bit of a mystery: I read something, forget it, and then forget that I read it. There are also quite a few loose ends, so I want to use these notes to sort out my ideas and then, following the principles, implement an RPC framework of my own.
- What is RPC?
RPC (remote procedure call) lets a program on one computer call a procedure on another computer and get the result back. The caller writes no extra networking code; the call looks just like a local one.
Sample.
The following is a detailed schematic diagram:
We use Google Protocol Buffers for serialization. First, a callMethod function is registered with the RPC layer. The client issues a command with the corresponding key and value, which invokes the previously registered callMethod function. callMethod is mainly responsible for serialization and deserialization, and for packaging and sending messages. The serialized output of callMethod is sent to the server. The server registers the same callMethod; after receiving the message, it uses callMethod to deserialize it and invoke the corresponding service. The server fills in the requested data, calls callMethod again to serialize it, and sends the serialized result back to the client, which deserializes it and processes the result.
Let me introduce the process below:
- What are the problems to be solved?
Nowadays, the scale of computer applications keeps growing. Applications are deployed as clusters and distributed systems, so they need a way to complete calls between machines.
- Technology used?
Here we use GPB (Google Protocol Buffers), a mainstream choice for RPC. JSON can of course be used to implement RPC as well.
- Serialization and deserialization.
So far, Google Protocol Buffers is a relatively efficient way to implement serialization and deserialization. Before sending, GPB serializes the data into key-value pairs and encodes them internally. The encoded result is a byte stream that is usually more compact than the original data, and it is sent to the peer. The peer receives the bytes and deserializes them back into the sender's original, unserialized data.
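To make "serialize into key-value pairs, then encode" a little more concrete, here is a minimal sketch of how a single string field is laid out in the protobuf wire format: a varint key built from the field number and wire type, a varint length, then the raw bytes. The helper names encode_varint and encode_string_field are made up for this illustration; real code should use the classes generated by protoc.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append v as a base-128 varint: 7 payload bits per byte, least significant
// group first, high bit set on every byte except the last.
inline void encode_varint(std::uint64_t v, std::string* out) {
    while (v >= 0x80) {
        out->push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out->push_back(static_cast<char>(v));
}

// Encode one length-delimited field (wire type 2), e.g. a string.
// Layout: key = (field_number << 3) | 2, then the length, then the bytes.
inline std::string encode_string_field(std::uint32_t field_number,
                                       const std::string& value) {
    assert(field_number >= 1);  // field numbers start at 1
    std::string out;
    encode_varint((static_cast<std::uint64_t>(field_number) << 3) | 2, &out);
    encode_varint(value.size(), &out);
    out += value;
    return out;
}
```

For a string "hi" stored in field 1, this produces the four bytes 0A 02 68 69: one byte of key, one byte of length, and the two characters.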
Refer to this blog's introduction to serialization and deserialization:
https://blog.csdn.net/qq_41681241/article/details/99406841
- How does the GPB RPC interface work?
    package echo;

    option cc_generic_services = true;

    message EchoRequest {
        required string msg = 1;
    }

    message EchoResponse {
        required string msg = 2;
    }

    service EchoService {
        rpc Echo(EchoRequest) returns (EchoResponse);
    }
protoc automatically generates echo.pb.h and echo.pb.cc. The "service EchoService" declaration generates two classes, EchoService and EchoService_Stub, which matter to the server and the client respectively. On the server side, requests are handled by EchoService::Echo. The generated version has no real implementation, so a subclass has to override it.
    class EchoService : public ::google::protobuf::Service {
        ...
        virtual void Echo(::google::protobuf::RpcController* controller,
                          const ::echo::EchoRequest* request,
                          ::echo::EchoResponse* response,
                          ::google::protobuf::Closure* done);
    };

    void EchoService::Echo(::google::protobuf::RpcController* controller,
                           const ::echo::EchoRequest*,
                           ::echo::EchoResponse*,
                           ::google::protobuf::Closure* done) {
        //Code not implemented
        controller->SetFailed("Method Echo() not implemented.");
        done->Run();
    }
On the client side, EchoService_Stub::Echo calls ::google::protobuf::RpcChannel::CallMethod. RpcChannel is an abstract class, so the RPC framework has to implement the required function in a subclass.
    class EchoService_Stub : public EchoService {
        ...
        void Echo(::google::protobuf::RpcController* controller,
                  const ::echo::EchoRequest* request,
                  ::echo::EchoResponse* response,
                  ::google::protobuf::Closure* done);
    private:
        ::google::protobuf::RpcChannel* channel_;
    };

    void EchoService_Stub::Echo(::google::protobuf::RpcController* controller,
                                const ::echo::EchoRequest* request,
                                ::echo::EchoResponse* response,
                                ::google::protobuf::Closure* done) {
        channel_->CallMethod(descriptor()->method(0),
                             controller, request, response, done);
    }
The server side implements the following sample functions:
    //override Echo method
    class MyEchoService : public echo::EchoService {
    public:
        virtual void Echo(::google::protobuf::RpcController* /* controller */,
                          const ::echo::EchoRequest* request,
                          ::echo::EchoResponse* response,
                          ::google::protobuf::Closure* done) {
            std::cout << request->msg() << std::endl;
            response->set_msg(
                    std::string("I have received '") + request->msg() + std::string("'"));
            done->Run();
        }
    };//MyEchoService

    int main() {
        MyServer my_server;
        MyEchoService echo_service;
        my_server.add(&echo_service);
        my_server.start("127.0.0.1", 6688);

        return 0;
    }
All that is needed is to define a service subclass that implements the method, and to add the service to the server.
Client-side Implementation
    int main() {
        MyChannel channel;
        channel.init("127.0.0.1", 6688);

        echo::EchoRequest request;
        echo::EchoResponse response;
        request.set_msg("hello, myrpc.");

        echo::EchoService_Stub stub(&channel);
        MyController cntl;
        stub.Echo(&cntl, &request, &response, NULL);
        std::cout << "resp:" << response.msg() << std::endl;

        return 0;
    }
Such usage seems natural, but when you think about the implementation behind it, there are certainly many doubts:
Why does the server only need to implement the MyEchoService::Echo function and the client only need to call EchoService_Stub::Echo to send and receive data in the corresponding format? What is the middle call process like?
If the server receives a variety of pb data (for example, a method rpc Post(DeepLinkReq) returns (DeepLinkResp);), how can we distinguish which format is received?
Once the format is identified, how are the corresponding objects constructed? Take the EchoRequest and EchoResponse parameters of MyEchoService::Echo: the RPC framework does not know in advance that these classes and functions exist, neither the class names nor the method name, yet it must be able to construct the objects and call the function.
The answer must lie in MyServer, MyChannel, and MyController. Next we analyze them step by step.
Processing flow
Consider the server-side processing flow
1. Receive data from the peer
2. Use the identification mechanism to decide which request data type to deserialize into
3. Generate the corresponding response data type
4. Call the corresponding service method to fill in the response data
5. Serialize the response
6. Send the data back to the peer
Specifically, the interface design problem raised in the previous section shows up in steps 2, 3, and 4. Take the Echo example again: the RPC framework cannot know about EchoService::Echo in advance, so how does it call this function?
In google/protobuf/service.h, the source of ::google::protobuf::Service is as follows:
    class LIBPROTOBUF_EXPORT Service {
        virtual void CallMethod(const MethodDescriptor* method,
                                RpcController* controller,
                                const Message* request,
                                Message* response,
                                Closure* done) = 0;
    };//Service
Service is an abstract class: CallMethod is pure virtual (= 0). The generated EchoService implements it as follows:
    void EchoService::CallMethod(const ::google::protobuf::MethodDescriptor* method,
                                 ::google::protobuf::RpcController* controller,
                                 const ::google::protobuf::Message* request,
                                 ::google::protobuf::Message* response,
                                 ::google::protobuf::Closure* done) {
        GOOGLE_DCHECK_EQ(method->service(), EchoService_descriptor_);
        switch(method->index()) {
            case 0:
                Echo(controller,
                     ::google::protobuf::down_cast<const ::echo::EchoRequest*>(request),
                     ::google::protobuf::down_cast< ::echo::EchoResponse*>(response),
                     done);
                break;
            default:
                GOOGLE_LOG(FATAL) << "Bad method index; this should never happen.";
                break;
        }
    }
You can see the down_cast conversions here. The framework can therefore reach Echo simply by calling ::google::protobuf::Service::CallMethod. All data passes through the interface as Message*, which solves the framework's interface problem. Next, consider the client-side processing flow.
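Before turning to the client, the server-side dispatch trick can be reproduced without protobuf. The sketch below uses simplified stand-in types: Message, Service, EchoRequest, and EchoResponse here are toy classes invented for illustration, not the real protobuf ones, and a plain method index replaces MethodDescriptor. It only aims to show how a framework that knows nothing but the base class can still reach the user's Echo.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for the protobuf base classes (assumptions for illustration).
struct Message {
    virtual ~Message() {}
};
struct EchoRequest : Message { std::string msg; };
struct EchoResponse : Message { std::string msg; };

// The framework only ever sees this interface.
struct Service {
    virtual ~Service() {}
    virtual void CallMethod(int method_index, const Message* request,
                            Message* response) = 0;
};

// Plays the role of the generated EchoService: CallMethod narrows the
// generic Message pointers and forwards to the typed virtual method.
struct EchoService : Service {
    virtual void Echo(const EchoRequest* request, EchoResponse* response) = 0;

    void CallMethod(int method_index, const Message* request,
                    Message* response) override {
        assert(method_index == 0);  // this toy service has a single method
        Echo(static_cast<const EchoRequest*>(request),
             static_cast<EchoResponse*>(response));
    }
};

// Plays the role of the user's subclass.
struct MyEchoService : EchoService {
    void Echo(const EchoRequest* request, EchoResponse* response) override {
        response->msg = "I have received '" + request->msg + "'";
    }
};

// Framework-side code: no mention of Echo or of the concrete message types.
inline void framework_invoke(Service* service, int method_index,
                             const Message* request, Message* response) {
    service->CallMethod(method_index, request, response);
}
```

Calling framework_invoke with a MyEchoService, index 0, and a request whose msg is "hello" leaves "I have received 'hello'" in the response, even though framework_invoke itself never names Echo.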
In the implementation of EchoService_Stub::Echo:
channel_->CallMethod(descriptor()->method(0), controller, request, response, done);
So first look at the definition of ::google::protobuf::RpcChannel:
    // Abstract interface for an RPC channel.  An RpcChannel represents a
    // communication line to a Service which can be used to call that Service's
    // methods.  The Service may be running on another machine.  Normally, you
    // should not call an RpcChannel directly, but instead construct a stub Service
    // wrapping it.  Example:
    //   RpcChannel* channel = new MyRpcChannel("remotehost.example.com:1234");
    //   MyService* service = new MyService::Stub(channel);
    //   service->MyMethod(request, &response, callback);
    class LIBPROTOBUF_EXPORT RpcChannel {
     public:
      inline RpcChannel() {}
      virtual ~RpcChannel();

      // Call the given method of the remote service.  The signature of this
      // procedure looks the same as Service::CallMethod(), but the requirements
      // are less strict in one important way:  the request and response objects
      // need not be of any specific class as long as their descriptors are
      // method->input_type() and method->output_type().
      virtual void CallMethod(const MethodDescriptor* method,
                              RpcController* controller,
                              const Message* request,
                              Message* response,
                              Closure* done) = 0;

     private:
      GOOGLE_DISALLOW_EVIL_CONSTRUCTORS(RpcChannel);
    };
The comments in pb are very clear. A channel can be understood as the pipe that connects the two ends of an RPC service; underneath, it communicates over a socket.
But RpcChannel is also an abstract class: CallMethod is pure virtual (= 0).
So we need to implement a subclass of RpcChannel that provides CallMethod. It should do two things:
Serialize the request and send it to the peer. An identification mechanism is needed so that the peer knows how to parse (schema) and how to process (method) the data.
Receive data from the peer and deserialize it into the response.
In addition there is RpcController, also an abstract class. It is a helper used to obtain the RPC result, the peer's IP, and so on.
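As a rough sketch of the client half, the relationship between a stub and a channel can be shown with plain strings. The types below (Channel, EchoStub, LoopbackChannel) are simplified stand-ins invented for illustration, not the protobuf classes; the real RpcChannel::CallMethod takes a MethodDescriptor and Message pointers rather than strings, and a real channel would write to a socket instead of answering in-process.

```cpp
#include <cassert>
#include <string>

// Toy channel interface: ships a named call plus serialized request bytes
// and returns the serialized response bytes.
struct Channel {
    virtual ~Channel() {}
    virtual std::string CallMethod(const std::string& service,
                                   const std::string& method,
                                   const std::string& request_bytes) = 0;
};

// Plays the role of EchoService_Stub: looks like a local function to the
// caller, but only forwards to the channel.
class EchoStub {
public:
    explicit EchoStub(Channel* channel) : channel_(channel) {
        assert(channel_ != nullptr);
    }
    std::string Echo(const std::string& msg) {
        return channel_->CallMethod("EchoService", "Echo", msg);
    }
private:
    Channel* channel_;
};

// A channel that "transmits" in-process, standing in for a socket plus a
// remote server. It records which service and method were named.
struct LoopbackChannel : Channel {
    std::string last_service;
    std::string last_method;

    std::string CallMethod(const std::string& service,
                           const std::string& method,
                           const std::string& request_bytes) override {
        last_service = service;
        last_method = method;
        return "I have received '" + request_bytes + "'";
    }
};
```

The point of the design is that EchoStub never changes when the transport does: swapping LoopbackChannel for a socket-backed subclass is enough.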
Identification mechanism
The identification mechanism mentioned in the previous section means that when the client sends a buffer to the server, the server can tell which data format the buffer holds, how to handle it, and which data format the reply should use.
The simplest brute-force approach is to tag every piece of data with its own format and with the format expected for the return value. This would certainly solve the problem.
With pb this is clearly unnecessary: the server and client use the same (or a compatible) proto, so identifying the data type name would be enough. But that breaks down when several methods share the same types, such as:
    service EchoService {
        rpc Echo(EchoRequest) returns (EchoResponse);
        rpc AnotherEcho(EchoRequest) returns (EchoResponse);
    }
Given the service name and the method name, however, the request/response types can be looked up through the proto.
The conclusion is therefore to attach the service name and method name to every data transfer.
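A small sketch of why the (service name, method name) pair works as a key where the type name alone does not: Echo and AnotherEcho share their types, yet each gets its own entry. The table is hand-built here and the names MethodTypes, MethodTable, make_echo_table, and lookup are hypothetical; in the real framework this information comes from the protobuf descriptors.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// What the proto tells both sides about one rpc method.
struct MethodTypes {
    std::string request_type;
    std::string response_type;
};

// (service name, method name) -> request/response types.
typedef std::map<std::pair<std::string, std::string>, MethodTypes> MethodTable;

// Hand-built copy of what the EchoService definition above declares.
inline MethodTable make_echo_table() {
    MethodTable table;
    table[std::make_pair("EchoService", "Echo")] =
        MethodTypes{"EchoRequest", "EchoResponse"};
    table[std::make_pair("EchoService", "AnotherEcho")] =
        MethodTypes{"EchoRequest", "EchoResponse"};
    assert(table.size() == 2);  // same types, but two distinct keys
    return table;
}

// Returns nullptr when the (service, method) pair is unknown.
inline const MethodTypes* lookup(const MethodTable& table,
                                 const std::string& service,
                                 const std::string& method) {
    MethodTable::const_iterator it =
        table.find(std::make_pair(service, method));
    return it == table.end() ? nullptr : &it->second;
}
```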
pb has many xxxDescriptor classes, and services and methods are no exception. For example, GetDescriptor returns the ServiceDescriptor:
    class LIBPROTOBUF_EXPORT Service {
        ...

        // Get the ServiceDescriptor describing this service and its methods.
        virtual const ServiceDescriptor* GetDescriptor() = 0;
    };//Service
The service name and the MethodDescriptor objects can be obtained through the ServiceDescriptor:
    class LIBPROTOBUF_EXPORT ServiceDescriptor {
     public:
      // The name of the service, not including its containing scope.
      const string& name() const;
      ...
      // The number of methods this service defines.
      int method_count() const;
      // Gets a MethodDescriptor by index, where 0 <= index < method_count().
      // These are returned in the order they were defined in the .proto file.
      const MethodDescriptor* method(int index) const;
    };//ServiceDescriptor
MethodDescriptor in turn provides the method name and the ServiceDescriptor it belongs to:
    class LIBPROTOBUF_EXPORT MethodDescriptor {
     public:
      // Name of this method, not including containing scope.
      const string& name() const;
      ...
      // Gets the service to which this method belongs.  Never NULL.
      const ServiceDescriptor* service() const;
    };//MethodDescriptor
Therefore:
On the server side, we can record the service name and all method names when a ::google::protobuf::Service is passed in.
On the client side, when virtual void CallMethod(const MethodDescriptor* method, ...) is called, the method name and the corresponding service name can be obtained as well.
In this way, you can know the type of data sent.
Constructing the parameters
      //   const MethodDescriptor* method =
      //     service->GetDescriptor()->FindMethodByName("Foo");
      //   Message* request  = stub->GetRequestPrototype (method)->New();
      //   Message* response = stub->GetResponsePrototype(method)->New();
      //   request->ParseFromString(input);
      //   service->CallMethod(method, *request, response, callback);
      virtual const Message& GetRequestPrototype(
        const MethodDescriptor* method) const = 0;
      virtual const Message& GetResponsePrototype(
        const MethodDescriptor* method) const = 0;
Message can construct corresponding objects through New:
    class LIBPROTOBUF_EXPORT Message : public MessageLite {
     public:
      inline Message() {}
      virtual ~Message();

      // Basic Operations ------------------------------------------------

      // Construct a new instance of the same type.  Ownership is passed to the
      // caller.  (This is also defined in MessageLite, but is defined again here
      // for return-type covariance.)
      virtual Message* New() const = 0;
      ...
In this way, we can get the objects that Service::Method needs.
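The prototype idea can be sketched in isolation. In the toy below, Message and EchoRequest are simplified stand-ins (not the protobuf classes), the "wire format" is just the raw msg string, and build_from_prototype is a hypothetical helper. What it tries to show is that code holding only a base-class reference can still create and fill an object of the right concrete type.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy stand-in for protobuf's Message (an assumption for illustration).
struct Message {
    virtual ~Message() {}
    // Create a new, empty object of the same dynamic type.
    virtual Message* New() const = 0;
    virtual bool ParseFromString(const std::string& data) = 0;
    virtual std::string TypeName() const = 0;
};

struct EchoRequest : Message {
    std::string msg;

    EchoRequest* New() const override { return new EchoRequest; }
    bool ParseFromString(const std::string& data) override {
        msg = data;  // toy "wire format": the payload is the msg itself
        return true;
    }
    std::string TypeName() const override { return "EchoRequest"; }
};

// Framework-side helper: it only sees a prototype of unknown concrete type,
// yet returns a freshly built and parsed object of that same type.
inline std::unique_ptr<Message> build_from_prototype(const Message& prototype,
                                                     const std::string& data) {
    std::unique_ptr<Message> msg(prototype.New());
    assert(msg != nullptr);
    if (!msg->ParseFromString(data)) {
        return nullptr;
    }
    return msg;
}
```

Wrapping the result in std::unique_ptr reflects the ownership rule in the comment quoted above: New() hands ownership to the caller, so the caller has to release the object.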
Server/Channel/Controller Subclass Implementation
RpcMeta
RpcMeta solves the problem of passing the service name and method name. It is defined as follows:
    package myrpc;

    message RpcMeta {
        optional string service_name = 1;
        optional string method_name = 2;
        optional int32 data_size = 3;
    }
Here data_size is the size of the data that follows the RpcMeta on the wire, such as the size of the serialized EchoRequest object.
We also need an int to represent the size of RpcMeta, so let's look at Channel's implementation.
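Before the real implementation, the resulting layout can be sketched with plain strings: a 4-byte size of the meta block, the meta bytes, then the request bytes. The functions pack_frame and unpack_frame are hypothetical helpers for illustration. Like the code in this post, the sketch writes the size in host byte order, which assumes both ends share the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Build: | meta size (4 bytes, host byte order) | meta bytes | data bytes |
inline std::string pack_frame(const std::string& meta,
                              const std::string& data) {
    const std::uint32_t meta_size = static_cast<std::uint32_t>(meta.size());
    std::string frame(reinterpret_cast<const char*>(&meta_size),
                      sizeof(meta_size));
    frame += meta;
    frame += data;
    return frame;
}

// Split a complete frame back into its parts. Returns false if the frame is
// too short to hold the header or the announced number of meta bytes.
// Everything after the meta block is treated as data here; on a real stream
// the receiver would instead read exactly data_size bytes, as told by RpcMeta.
inline bool unpack_frame(const std::string& frame, std::string* meta,
                         std::string* data) {
    assert(meta != nullptr && data != nullptr);
    std::uint32_t meta_size = 0;
    if (frame.size() < sizeof(meta_size)) return false;
    std::memcpy(&meta_size, frame.data(), sizeof(meta_size));
    if (frame.size() - sizeof(meta_size) < meta_size) return false;
    meta->assign(frame, sizeof(meta_size), meta_size);
    data->assign(frame, sizeof(meta_size) + meta_size, std::string::npos);
    return true;
}
```

A production protocol would normally fix the byte order of the length field (for example network byte order) so that machines with different endianness can talk to each other.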
Channel:
    //Inherits from RpcChannel to send and receive data
    class MyChannel : public ::google::protobuf::RpcChannel {
    public:
        //init takes ip:port; boost.asio is used for the network interaction
        void init(const std::string& ip, const int port) {
            _io = boost::make_shared<boost::asio::io_service>();
            _sock = boost::make_shared<boost::asio::ip::tcp::socket>(*_io);
            boost::asio::ip::tcp::endpoint ep(
                    boost::asio::ip::address::from_string(ip), port);
            _sock->connect(ep);
        }

        //EchoService_Stub::Echo calls RpcChannel::CallMethod
        //The first parameter, MethodDescriptor* method, gives service-name method-name
        virtual void CallMethod(const ::google::protobuf::MethodDescriptor* method,
                ::google::protobuf::RpcController* /* controller */,
                const ::google::protobuf::Message* request,
                ::google::protobuf::Message* response,
                ::google::protobuf::Closure*) {
            //Serialize the request
            std::string serialized_data = request->SerializeAsString();

            //Get service-name method-name and fill them into rpc_meta
            myrpc::RpcMeta rpc_meta;
            rpc_meta.set_service_name(method->service()->name());
            rpc_meta.set_method_name(method->name());
            rpc_meta.set_data_size(serialized_data.size());

            //Serialize rpc_meta
            std::string serialized_str = rpc_meta.SerializeAsString();

            //Put the size of the serialized rpc_meta at the head of the data, taking up 4 bytes
            int serialized_size = serialized_str.size();
            serialized_str.insert(0, std::string((const char*)&serialized_size, sizeof(int)));
            //Append the serialized request at the tail
            serialized_str += serialized_data;

            //Send all data:
            //| rpc_meta size (fixed length 4 bytes) | rpc_meta serialized data (variable length) | request serialized data (variable length) |
            //boost::asio::write keeps sending until the whole buffer is written;
            //a bare send() may write only part of it
            boost::asio::write(*_sock, boost::asio::buffer(serialized_str));

            //Receive 4 bytes: serialized resp data size
            //boost::asio::read blocks until the buffer is full;
            //a bare receive() may return fewer bytes
            char resp_data_size[sizeof(int)];
            boost::asio::read(*_sock, boost::asio::buffer(resp_data_size));

            //Receive N bytes: N = serialized resp data size
            int resp_data_len = *(int*)resp_data_size;
            std::vector<char> resp_data(resp_data_len, 0);
            boost::asio::read(*_sock, boost::asio::buffer(resp_data));

            //Deserialize into resp
            response->ParseFromString(std::string(resp_data.begin(), resp_data.end()));
        }

    private:
        boost::shared_ptr<boost::asio::io_service> _io;
        boost::shared_ptr<boost::asio::ip::tcp::socket> _sock;
    };//MyChannel
By implementing RpcChannel::CallMethod, data is sent/received and serialized/deserialized automatically whenever a stub method such as EchoService_Stub::Echo is called.
The server implementation is a bit more complicated, because several Service::Method pairs may be registered. When data arrives from the client and RpcMeta is parsed to obtain the service name and method name, we need to find the corresponding Service::Method, so this information has to be recorded at registration time. Let's first look at the implementation of the add method:
    class MyServer {
    public:
        void add(::google::protobuf::Service* service) {
            ServiceInfo service_info;
            service_info.service = service;
            service_info.sd = service->GetDescriptor();
            for (int i = 0; i < service_info.sd->method_count(); ++i) {
                service_info.mds[service_info.sd->method(i)->name()] = service_info.sd->method(i);
            }

            _services[service_info.sd->name()] = service_info;
        }
        ...
    private:
        struct ServiceInfo{
            ::google::protobuf::Service* service;
            const ::google::protobuf::ServiceDescriptor* sd;
            std::map<std::string, const ::google::protobuf::MethodDescriptor*> mds;
        };//ServiceInfo

        //service_name -> {Service*, ServiceDescriptor*, MethodDescriptor* []}
        std::map<std::string, ServiceInfo> _services;
    };//MyServer
In my implementation, _services records each service together with its ServiceDescriptor and MethodDescriptors. ServiceDescriptor::FindMethodByName can also look up a method by name, so recording the method names is not strictly necessary. For performance, though, I think it is worth recording even more, such as the req/resp data types.
    //Listen on ip:port and receive data
    void MyServer::start(const std::string& ip, const int port) {
        boost::asio::io_service io;
        boost::asio::ip::tcp::acceptor acceptor(
                io,
                boost::asio::ip::tcp::endpoint(
                    boost::asio::ip::address::from_string(ip),
                    port));

        while (true) {
            auto sock = boost::make_shared<boost::asio::ip::tcp::socket>(io);
            acceptor.accept(*sock);

            std::cout << "recv from client:"
                << sock->remote_endpoint().address()
                << std::endl;

            //Receive 4 bytes: rpc_meta length
            //boost::asio::read blocks until the buffer is full;
            //a bare receive() may return fewer bytes
            char meta_size[sizeof(int)];
            boost::asio::read(*sock, boost::asio::buffer(meta_size));

            int meta_len = *(int*)(meta_size);

            //Receive the rpc_meta data
            std::vector<char> meta_data(meta_len, 0);
            boost::asio::read(*sock, boost::asio::buffer(meta_data));

            myrpc::RpcMeta meta;
            meta.ParseFromString(std::string(meta_data.begin(), meta_data.end()));

            //Receive the req data
            std::vector<char> data(meta.data_size(), 0);
            boost::asio::read(*sock, boost::asio::buffer(data));

            //Process the data
            dispatch_msg(
                    meta.service_name(),
                    meta.method_name(),
                    std::string(data.begin(), data.end()),
                    sock);
        }
    }
start starts a loop, parses RpcMeta data and receives request data, which is then handled by dispatch_msg.
    void MyServer::dispatch_msg(
            const std::string& service_name,
            const std::string& method_name,
            const std::string& serialized_data,
            const boost::shared_ptr<boost::asio::ip::tcp::socket>& sock) {
        //Find the registered Service* and MethodDescriptor* by service_name method_name
        auto service = _services[service_name].service;
        auto md = _services[service_name].mds[method_name];

        std::cout << "recv service_name:" << service_name << std::endl;
        std::cout << "recv method_name:" << method_name << std::endl;
        std::cout << "recv type:" << md->input_type()->name() << std::endl;
        std::cout << "resp type:" << md->output_type()->name() << std::endl;

        //Generate the req and resp objects from the Service*
        auto recv_msg = service->GetRequestPrototype(md).New();
        recv_msg->ParseFromString(serialized_data);
        auto resp_msg = service->GetResponsePrototype(md).New();

        MyController controller;
        auto done = ::google::protobuf::NewCallback(
                this,
                &MyServer::on_resp_msg_filled,
                recv_msg,
                resp_msg,
                sock);
        //Call Service::Method (that is, the subclass method implemented by the user)
        service->CallMethod(md, &controller, recv_msg, resp_msg, done);
    }
When the user has finished filling in resp_msg, they invoke the callback specified by done (the done->Run() statement in MyEchoService::Echo). At that point on_resp_msg_filled serializes the response and sends it.
    void MyServer::on_resp_msg_filled(
            ::google::protobuf::Message* recv_msg,
            ::google::protobuf::Message* resp_msg,
            const boost::shared_ptr<boost::asio::ip::tcp::socket> sock) {
        //avoid mem leak
        boost::scoped_ptr< ::google::protobuf::Message> recv_msg_guard(recv_msg);
        boost::scoped_ptr< ::google::protobuf::Message> resp_msg_guard(resp_msg);

        std::string resp_str;
        pack_message(resp_msg, &resp_str);

        //boost::asio::write keeps sending until the whole buffer is written
        boost::asio::write(*sock, boost::asio::buffer(resp_str));
    }

pack_message packages the data: it puts a 4-byte length in front of the serialized message.

    void pack_message(
            const ::google::protobuf::Message* msg,
            std::string* serialized_data) {
        int serialized_size = msg->ByteSize();
        serialized_data->assign(
                    (const char*)&serialized_size,
                    sizeof(serialized_size));
        msg->AppendToString(serialized_data);
    }
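The done mechanism used above can be isolated in a few lines. In this sketch, std::function stands in for ::google::protobuf::Closure, FakeSocket stands in for the real socket, and messages are plain strings; all of these are assumptions made for illustration. It only tries to show the control flow: the framework binds "serialize and send" into done, and nothing is sent until the user's method runs it.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for ::google::protobuf::Closure (an assumption for illustration).
typedef std::function<void()> Closure;

// Stands in for the socket: records what would have been sent.
struct FakeSocket {
    std::vector<std::string> sent;
};

// User code: fill in the response, then signal completion through done.
inline void Echo(const std::string& request, std::string* response,
                 const Closure& done) {
    *response = "I have received '" + request + "'";
    done();
}

// Framework code: bind the "send the response" step into done, then hand
// control to the user's method. Nothing is sent until done runs.
inline void dispatch(const std::string& request, FakeSocket* sock) {
    assert(sock != nullptr);
    std::string response;
    Closure done = [&response, sock]() { sock->sent.push_back(response); };
    Echo(request, &response, done);
}
```

In this toy the response is a local variable captured by reference, which is only safe because Echo calls done before returning. A method that finishes later on another thread would need heap-allocated messages that outlive dispatch.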
Reference Blog: https://izualzhy.cn/demo-protobuf-rpc
Following is the RPC framework created by Tencent's development team that readers can study on their own: https://github.com/Tencent/phxrpc