这篇笔记比较老,不再更新维护,请移步最新的手册:envoy相关笔记。
这里阅读的是envoy 1.8.0版本的代码,目的是对envoy的代码结构、启动过程建立基本的认知,以后遇到“感觉比较模糊”的内容时,可以快速通过阅读源码求证。
Envoy是使用C++14开发的,先简单了解一下C++14是有必要的,C++14在2015年12月15日正式发布,取代了2011年发布的C++11。
参考:https://en.cppreference.com 、https://en.cppreference.com
source/exe/main.cc中实现了main()
,是程序运行开始地方,也是代码阅读的入口:
// soruce/exe/main.cc: 14
int main(int argc, char** argv) {
...
std::unique_ptr<Envoy::MainCommon> main_common;
...
return main_common->run() ? EXIT_SUCCESS : EXIT_FAILURE;
}
类Envoy::MainCommon
的定义在source/exe/main_common.h
中,它的run()
方法是envoy运行的主体函数,这个方法调用了Envoy::MainCommonBase
的run()
:
// source/exe/main_common.h: 67
class MainCommon {
public:
MainCommon(int argc, const char* const* argv);
bool run() { return base_.run(); }
...
private:
...
MainCommonBase base_;
...
}
Envoy::MainCommonBase
的run()
中调用类Envoy::Server::InstanceImp
的run()
:
// source/exe/main_common.h: 25
class MainCommonBase {
protected:
std::unique_ptr<Server::InstanceImpl> server_;
// source/exe/main_common.cc: 96
bool MainCommonBase::run() {
switch (options_.mode()) {
case Server::Mode::Serve:
server_->run();
return true;
...
}
Envoy::Server::InstanceImp
中实现了envoy的具体功能:
// source/server/server.h:130
/**
* This is the actual full standalone server which stiches together various common components.
*/
class InstanceImpl : Logger::Loggable<Logger::Id::main>, public Instance {
...
类Envoy::Server::InstanceImp
在实例化的时候,初始化了大量的私有成员:
// source/exe/main_common.cc: 45
InstanceImpl::InstanceImpl(Options& options, Event::TimeSystem& time_system,
Network::Address::InstanceConstSharedPtr local_address, TestHooks& hooks,
HotRestart& restarter, Stats::StoreRoot& store,
Thread::BasicLockable& access_log_lock,
ComponentFactory& component_factory,
Runtime::RandomGeneratorPtr&& random_generator,
ThreadLocal::Instance& tls)
: options_(options), time_system_(time_system), restarter_(restarter),
start_time_(time(nullptr)), original_start_time_(start_time_), stats_store_(store),
thread_local_(tls), api_(new Api::Impl(options.fileFlushIntervalMsec())),
dispatcher_(api_->allocateDispatcher(time_system)),
singleton_manager_(new Singleton::ManagerImpl()),
handler_(new ConnectionHandlerImpl(ENVOY_LOGGER(), *dispatcher_)),
random_generator_(std::move(random_generator)),
secret_manager_(std::make_unique<Secret::SecretManagerImpl>()),
listener_component_factory_(*this), worker_factory_(thread_local_, *api_, hooks, time_system),
dns_resolver_(dispatcher_->createDnsResolver({})),
access_log_manager_(*api_, *dispatcher_, access_log_lock, store), terminated_(false) {
...
initialize(options, local_address, component_factory);
...
}
注意构造函数中的调用的initialize()
,这个函数里完成大量初始化操作。特别注意其中的:
//source/server/server.cc:300
cluster_manager_factory_.reset(new Upstream::ProdClusterManagerFactory(
runtime(), stats(), threadLocal(), random(), dnsResolver(), sslContextManager(), dispatcher(),
localInfo(), secretManager()));
Configuration::MainImpl* main_config = new Configuration::MainImpl();
config_.reset(main_config);
main_config->initialize(bootstrap_, *this, *cluster_manager_factory_);
在后面会再次用到config_
获取到main_config通过initialize()
创建的cluster_manager_:
// source/server/configuration_impl.cc: 46
void MainImpl::initialize(const envoy::config::bootstrap::v2::Bootstrap& bootstrap,
Instance& server,
Upstream::ClusterManagerFactory& cluster_manager_factory) {
...
cluster_manager_ = cluster_manager_factory.clusterManagerFromProto(
bootstrap, server.stats(), server.threadLocal(), server.runtime(), server.random(),
server.localInfo(), server.accessLogManager(), server.admin());
...
cluster_manager_是虚类Envoy::Upstream::ClusterManager
的实现类的对象,它有一个名为setInitializedCb()
的方法。
// include/envoy/upstream: 91
/**
* Set a callback that will be invoked when all owned clusters have been initialized.
*/
virtual void setInitializedCb(std::function<void()> callback) PURE;
通过setInitializedCb()
注入的函数会在所有的clusters初始化完成后被调用,回调函数中会启动所有的worker,见下一节。
// source/server/server.cc: 444
void InstanceImpl::run() {
// We need the RunHelper to be available to call from InstanceImpl::shutdown() below, so
// we save it as a member variable.
run_helper_ = std::make_unique<RunHelper>(*dispatcher_, clusterManager(), restarter_,
access_log_manager_, init_manager_, overloadManager(),
[this]() -> void { startWorkers(); });
// Run the main dispatch loop waiting to exit.
ENVOY_LOG(info, "starting main dispatch loop");
auto watchdog = guard_dog_->createWatchDog(Thread::Thread::currentThreadId());
watchdog->startWatchdog(*dispatcher_);
dispatcher_->run(Event::Dispatcher::RunType::Block);
ENVOY_LOG(info, "main dispatch loop exited");
guard_dog_->stopWatching(watchdog);
watchdog.reset();
terminate();
run_helper_.reset();
}
注意run_helper_
的创建,类RunHelper
的构造函数中启动了envoy的主要服务!刚开始看代码的时候把它漏过去了,好久没找到envoy服务的启动代码:
// source/server/server.cc: 386
RunHelper::RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm,
HotRestart& hot_restart, AccessLog::AccessLogManager& access_log_manager,
InitManagerImpl& init_manager, OverloadManager& overload_manager,
std::function<void()> workers_start_cb) {
...
cm.setInitializedCb([this, &init_manager, &cm, workers_start_cb]() {
if (shutdown_) {
return;
}
...
init_manager.initialize([this, workers_start_cb]() {
if (shutdown_) {
return;
}
workers_start_cb();
});
...
});
overload_manager.start();
}
这里的workers_start_cb()
是Envoy::Server::InstanceImp::()
:
// source/server/server.cc: 444
void InstanceImpl::run() {
// We need the RunHelper to be available to call from InstanceImpl::shutdown() below, so
// we save it as a member variable.
run_helper_ = std::make_unique<RunHelper>(*dispatcher_, clusterManager(), restarter_,
access_log_manager_, init_manager_, overloadManager(),
[this]() -> void { startWorkers(); });
startWorkers()
是类Envoy::Server::InstanceImpl
的方法:
// source/server/server.cc: 342
void InstanceImpl::startWorkers() {
listener_manager_->startWorkers(*guard_dog_);
// At this point we are ready to take traffic and all listening ports are up. Notify our parent
// if applicable that they can stop listening and drain.
restarter_.drainParentListeners();
drain_manager_->startParentShutdownSequence();
}
在类Envoy:Server:InstanceImpl
的run()
方法中还有一个dispatcher_->run()
// source/server/server.cc: 444
void InstanceImpl::run() {
...
run_helper_ = std::make_unique<RunHelper>(*dispatcher_, clusterManager(), restarter_,
access_log_manager_, init_manager_, overloadManager(),
[this]() -> void { startWorkers(); });
...
ENVOY_LOG(info, "starting main dispatch loop");
auto watchdog = guard_dog_->createWatchDog(Thread::Thread::currentThreadId());
watchdog->startWatchdog(*dispatcher_);
dispatcher_->run(Event::Dispatcher::RunType::Block);
ENVOY_LOG(info, "main dispatch loop exited");
guard_dog_->stopWatching(watchdog);
watchdog.reset();
terminate();
run_helper_.reset();
}
dispatcher_的类型是Event::DispatcherPtr
,这是一个虚类,
虚类Event::DispatcherPtr
在include/envoy/event/dispatcher.h
中定义,注释写得很好,通过注释也可以大概了解到这个类的功能是进行事件分发
,可以监听设置的信号量、文件事件、连接事件等,并在事件发生时调用对应的函数。
需要从构造函数中找到dispatcher的创建过程,找到虚函数的实现,然后才能知晓它具体是怎样做的。也可以完全把它当成一个黑盒,看一下它的成员方法都被谁调用,怎样调用的,为哪些事件设置了怎样的处理函数。这里先找到它的实现类,然后在看它的成员方法是被怎样使用的。
在构造函数中(source/server/server.cc: 45),可以看到成员dispatcher_
是用api_->allocateDispatcher
创建的:
// source/server/server.cc: 45
api_(new Api::Impl(options.fileFlushIntervalMsec())),
dispatcher_(api_->allocateDispatcher(time_system)),
...
而api_是类Envoy::Server::Api::Impl
的对象,它的allocateDispatcher()
方法实现如下。
// source/common/api/api_impl.cc:12
Event::DispatcherPtr Impl::allocateDispatcher(Event::TimeSystem& time_system) {
return Event::DispatcherPtr{new Event::DispatcherImpl(time_system)};
}
从而知晓dispatcher_
是类Envoy::Event::DispatcherImpl
的对象。
// source/common/event/dispatcher_impl.h:23
/**
* libevent implementation of Event::Dispatcher.
*/
class DispatcherImpl : Logger::Loggable<Logger::Id::main>, public Dispatcher {...}
DispatcherImpl
的run()函数放在后面,单独分析,先看看哪些地方还用到了dispatcher_
。
在Envoy::Server::InstanceImp的构造函数中用到dispatcher_
的地方,除了dispatcher_->createDnsResolver()
,其它都是在对应对象中保存了dispatcher_的引用:
// source/server/server.cc: 45
InstanceImpl::InstanceImpl(Options& options, Event::TimeSystem& time_system,..):
...
handler_(new ConnectionHandlerImpl(ENVOY_LOGGER(), *dispatcher_)),
dns_resolver_(dispatcher_->createDnsResolver({})),
access_log_manager_(*api_, *dispatcher_, access_log_lock, store), terminated_(false)
...{
restarter_.initialize(*dispatcher_, *this);
}
在handler_
、access_log_manger_
对象中都存了dispatcher_的引用,restarter_
在dispatcher_中注册了socket_event_:
// source/server/hot_restart_impl.cc
void HotRestartImpl::initialize(Event::Dispatcher& dispatcher, Server::Instance& server) {
socket_event_ =
dispatcher.createFileEvent(my_domain_socket_,
[this](uint32_t events) -> void {
ASSERT(events == Event::FileReadyType::Read);
onSocketEvent();
},
Event::FileTriggerType::Edge, Event::FileReadyType::Read);
server_ = &server;
}
createFileEvent()
的三个参数分别是文件句柄、回调函数——onSocketEvent()
、触发时机、事件,restarter_中注册了事件,先记在心里。
dispatcher_
是类Envoy::Event::DispatcherImpl
的对象。
// source/common/event/dispatcher_impl.h:23
/**
* libevent implementation of Event::Dispatcher.
*/
class DispatcherImpl : Logger::Loggable<Logger::Id::main>, public Dispatcher {...}
run()
函数实现如下:
// source/common/event/dispatcher_impl.cc
void DispatcherImpl::run(RunType type) {
run_tid_ = Thread::Thread::currentThreadId();
// Flush all post callbacks before we run the event loop. We do this because there are post
// callbacks that have to get run before the initial event loop starts running. libevent does
// not guarantee that events are run in any particular order. So even if we post() and call
// event_base_once() before some other event, the other event might get called first.
runPostCallbacks();
event_base_loop(base_.get(), type == RunType::NonBlock ? EVLOOP_NONBLOCK : 0);
}