How does Erlang hot code swapping work in the middle of activity?

Submitted by 穿精又带淫゛_ on 2019-12-04 05:54:05

Yes, you are exactly right. No one said hot code swapping is easy. I worked for a telecommunications company where all code upgrades were performed on a live system (so that users weren't disconnected in the middle of their calls). Doing it right means carefully considering all the scenarios you mentioned and preparing the code for every failure, then testing, fixing issues, testing again, and so on. To test it properly you need a system running the old version under load (e.g. in a testing environment); you then deploy the new code and check for any crashes.

For the particular example in your question, the simplest way of dealing with the issue is to write two versions of module2:do_something/1, one accepting the old state and one accepting the new state. The version that receives the old state then handles it accordingly, e.g. by converting it to the new state.
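A minimal sketch of that idea follows. The question's actual state layout is not shown here, so the two tuple shapes below are purely hypothetical: an old state {state, Count} and a new state {state, Count, Name}.

```erlang
-module(module2).
-export([do_something/1]).

%% Hypothetical state shapes, for illustration only:
%%   old: {state, Count}
%%   new: {state, Count, Name}

%% Old-format state: convert it to the new format, then carry on
%% exactly as if the caller had passed the new format.
do_something({state, Count}) ->
    do_something({state, Count, undefined});
%% New-format state: the real work happens here.
do_something({state, Count, Name}) ->
    {state, Count + 1, Name}.
```

With both clauses in place, callers still running old code and callers already running new code can use the same function during the upgrade window.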

For this to work you will also need to ensure that the new version of module2 is deployed before any module has a chance to call it with the new state:

  1. If the application containing module2 is a dependency of the other application, release_handler will upgrade that module first.

  2. Otherwise, you may need to split the deployment into two parts: first upgrade the common functions so that they can handle the new state, then deploy the new versions of the gen_servers and other modules that make calls to module2.

  3. If you are not using the release handler, you can manually specify the order in which the modules are loaded.

This is also the reason why in Erlang it's advised to avoid circular dependencies in function calls between modules, e.g. when modA calls a function in modB, which in turn calls another function in modA.

For upgrades performed with the help of the release handler, you can verify the order in which release_handler will upgrade modules on the old system by inspecting the relup file, which is generated from the old and new releases (typically with systools:make_relup). It's a text file containing all the instructions for the upgrade, e.g. remove (to remove modules), load_object_code (to read in the new module's object code), load, purge, etc.
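To check that order without reading the whole file by eye, something like the sketch below could list just the instruction names for each upgrade path. It assumes the documented relup layout of a single term {Vsn, UpFrom, DownTo}, where every entry is {OtherVsn, Descr, Instructions}; the module name relup_peek is made up for this example.

```erlang
-module(relup_peek).
-export([from_file/1, up_instructions/1]).

%% Read a relup file (one Erlang term) and summarise its upgrade paths.
from_file(Path) ->
    {ok, [Relup]} = file:consult(Path),
    up_instructions(Relup).

%% For each "upgrade from" entry, return the old version together with
%% the instruction names in the order they will be executed.
up_instructions({_Vsn, UpFrom, _DownTo}) ->
    [{FromVsn, [name(I) || I <- Instrs]}
     || {FromVsn, _Descr, Instrs} <- UpFrom].

%% Instructions are either bare atoms (e.g. point_of_no_return)
%% or tuples whose first element is the instruction name.
name(I) when is_atom(I)  -> I;
name(I) when is_tuple(I) -> element(1, I).
```

Calling relup_peek:from_file("relup") in the new release's directory would then show at a glance whether module2 is loaded before the modules that call it.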

Please note that there is no strict requirement that all applications follow OTP principles for hot code swapping to work; however, using gen_servers and a proper supervision tree makes this task much easier for both the developer and the release handler.

If you are not using an OTP release, you can't upgrade using the release handler, but you can still forcefully reload modules on your system and upgrade them to the new version. This works fine as long as you don't need to add or remove Erlang applications, because for that the release definition would need to change, and that can't be done on a live system without support from the release handler.
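Reloading by hand in a fixed order could look roughly like the sketch below. The module name ordered_reload is hypothetical, and the choice of code:soft_purge/1 (which refuses to purge while a process still runs the old code) over code:purge/1 (which kills such processes) is an assumption made here for safety, not something the release handler necessarily does.

```erlang
-module(ordered_reload).
-export([reload/1]).

%% Reload the given modules one at a time, in list order, and return
%% one result per module. Put shared modules (such as module2) first.
reload(Modules) ->
    [reload_one(M) || M <- Modules].

reload_one(Mod) ->
    %% soft_purge/1 removes old code only if no process still runs it.
    case code:soft_purge(Mod) of
        true ->
            %% Returns {module, Mod} on success or {error, Reason}.
            code:load_file(Mod);
        false ->
            {error, {old_code_in_use, Mod}}
    end.
```

For example, ordered_reload:reload([module2, my_server]) would load module2 before my_server (my_server being a placeholder name). Note that this sketch keeps going after a failure; a real script would probably stop at the first error.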

The release handler calls sys:suspend, which sends a message to the gen_server. The server keeps processing requests until it handles the suspend message, at which point it just sits and waits. The new module version is then loaded into the system, and sys:change_code is called, which tells the server to call its code_change callback to perform the upgrade; after that the server again sits and waits. When the release handler calls sys:resume, it sends a message to the server telling it to get back to work and start processing incoming messages again.

The release handler does this at the same time for all servers that depend on a module: first all of them are suspended, then the new module is loaded, then all are told to upgrade themselves, and finally all are told to resume work.
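The suspend / change_code / resume sequence can be tried by hand on a single process. The sketch below is an illustration under stated assumptions: the module counter_srv and its state shapes (a bare integer before the upgrade, a map afterwards) are hypothetical, and the step where new code would be loaded is only marked with a comment.

```erlang
-module(counter_srv).
-behaviour(gen_server).
-export([start_link/0, get/1, upgrade_in_place/1]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

start_link() -> gen_server:start_link(?MODULE, [], []).

get(Pid) -> gen_server:call(Pid, get).

%% Hypothetical old state: a bare integer counter.
init([]) -> {ok, 0}.

handle_call(get, _From, State) -> {reply, State, State}.

handle_cast(_Msg, State) -> {noreply, State}.

%% Convert the old integer state into the (hypothetical) new map state.
code_change(_OldVsn, Count, _Extra) when is_integer(Count) ->
    {ok, #{count => Count}};
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

%% Manually walk one process through the same steps as described above.
upgrade_in_place(Pid) ->
    ok = sys:suspend(Pid),
    %% A real upgrade would load the new module version at this point.
    ok = sys:change_code(Pid, ?MODULE, undefined, []),
    ok = sys:resume(Pid).
```

After counter_srv:upgrade_in_place(Pid), a call to counter_srv:get(Pid) returns the converted state, and any requests that arrived while the server was suspended are handled once it resumes.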
