How to recover Mesos executor after Mesos framework failure?
My scenario: a framework is running on server A. It has an executor on server B running a task (a long-running web service with a long initialization time). Server A is shut down, and the framework is restarted somewhere else in the cluster.
Currently, after the restart, the new framework registers a new executor, which runs a new task. After some time, the Mesos master deactivates the old, no-longer-running framework, which in turn kills the old, still-running executor and task.
I would like the new framework to re-register the old executor rather than register a new one. Is this possible?
This thread on the Mesos user mailing list answers the question:
http://www.mail-archive.com/user%40mesos.apache.org/msg00069.html
It is included here for reference:
(1) One thing in particular that I found unexpected is that executors are shut down if the scheduler is shut down. Is there a way to keep executors/tasks running when the scheduler is down? I would imagine that when the scheduler comes back, it could reestablish its state somehow and keep going without interrupting the running tasks. Is this a use case Mesos is designed for?
You can use FrameworkInfo.failover_timeout to tell Mesos how long to wait for the framework to re-register before it cleans up the framework's executors and tasks.
Also, note that for this to work, the framework has to persist its FrameworkID when it first registers with the master. When the framework comes back up, it needs to reconnect by setting FrameworkInfo.framework_id = the persisted ID.
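The two steps in the answer above (set a failover timeout, then persist and reuse the framework ID) could be sketched roughly as follows. This is only an illustration, not the Mesos API: the framework info is modeled as a plain dict whose keys follow the names used in the quoted answer, and the state file path, framework name, timeout value, and helper functions are all hypothetical. Check the exact protobuf field names against the mesos.proto of your Mesos version. Because the framework may restart on a different server, a real deployment would need to keep the ID somewhere reachable from every host (shared storage or a coordination service, for example) rather than in a local file.

```python
import json
import os
import tempfile

# Hypothetical location for the persisted ID. A local file is used only to
# keep the sketch runnable; after a failover to another host the ID must be
# readable from there, so real storage should be shared.
STATE_FILE = os.path.join(tempfile.gettempdir(), "my_framework_state.json")

# Assumed value: how long the master should wait for the framework to
# re-register before cleaning up its executors and tasks.
FAILOVER_TIMEOUT_SECS = 3600.0


def load_framework_id(path=STATE_FILE):
    """Return the persisted framework ID, or None on the very first start."""
    try:
        with open(path) as f:
            return json.load(f)["framework_id"]
    except (FileNotFoundError, KeyError, ValueError):
        return None


def save_framework_id(framework_id, path=STATE_FILE):
    """Persist the ID the master assigned when the framework first registered."""
    with open(path, "w") as f:
        json.dump({"framework_id": framework_id}, f)


def build_framework_info(path=STATE_FILE):
    """Build a dict standing in for FrameworkInfo (illustrative, not the real type)."""
    info = {
        "name": "my-web-service-framework",  # hypothetical name
        "failover_timeout": FAILOVER_TIMEOUT_SECS,
    }
    persisted = load_framework_id(path)
    if persisted is not None:
        # Reusing the old ID is what lets the master treat this process as
        # the same framework failing over rather than a brand-new one.
        info["framework_id"] = persisted
    return info
```

In a real scheduler, save_framework_id would be called from the registration callback with the ID the master hands back, and build_framework_info would run before the scheduler driver is created.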