Optimizing Push Task Queues
I use Google App Engine (Python) for the backend of a mobile game, which includes social network integration (Twitter) and global & relative leaderboards. The application makes use of two task queues: one for building out relationships between players, and one for updating objects when a player's score changes.
Model
from google.appengine.ext import ndb

class RelativeUserScore(ndb.Model):
    ID_FORMAT = "%s:%s"  # "friend_id:follower_id"

    # --- ndb properties
    follower_id = ndb.StringProperty(indexed=True)          # follower
    user_id = ndb.StringProperty(indexed=True)              # followed (aka friend)
    points = ndb.IntegerProperty(indexed=True)              # user data denormalization
    screen_name = ndb.StringProperty(indexed=False)         # user data denormalization
    profile_image_url = ndb.StringProperty(indexed=False)   # user data denormalization
This allows me to build relative leaderboards by querying for objects where the requesting user is the follower.
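For illustration, the relative leaderboard query looks roughly like this (a simplified sketch; the function name and limit are made up for the example):

def get_relative_leaderboard(follower_id, limit=50):
    # Top scores among the users that `follower_id` follows, highest first.
    return (RelativeUserScore.query(RelativeUserScore.follower_id == follower_id)
            .order(-RelativeUserScore.points)
            .fetch(limit))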
Push Task Queues
I have two major tasks that are performed:
sync-twitter
These tasks fetch friends / followers from Twitter's API and build out the relative user score models. Friends are checked on user sign up, and again if the Twitter friend count changes. Followers are checked on user sign up only. The queue runs in its own module with F4 instances, and min_idle_instances set to 20 (I'd like to reduce both settings if possible, though instance memory usage requires at least F2 instances).
- name: sync-twitter
  target: sync-twitter            # target version / module
  bucket_size: 100                # default 5, max 100?
  max_concurrent_requests: 1000   # default 1000. max?
  rate: 100/s                     # default 5/s. max?
  retry_parameters:
    min_backoff_seconds: 30
    max_backoff_seconds: 900
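For context, tasks are pushed onto this queue roughly like the sketch below (the handler URL is a placeholder, not my actual path; the target in queue.yaml routes the task to the dedicated module):

from google.appengine.api import taskqueue

def enqueue_sync_twitter(user_id):
    # Push a sync task for this user onto the dedicated queue.
    taskqueue.add(
        queue_name='sync-twitter',
        url='/tasks/sync-twitter',   # placeholder handler path
        params={'user_id': user_id})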
update-leaderboard
These tasks update the user's objects after they play a game (which takes about 2 minutes to do). The queue runs in its own module with F2 instances, and min_idle_instances set to 10 (I'd like to reduce both settings if possible).
- name: update-leaderboard
  target: update-leaderboard      # target version / module
  bucket_size: 100                # default 5, max 100?
  max_concurrent_requests: 1000   # default 1000. max?
  rate: 100/s                     # default 5/s. max?
I've optimized these tasks to run asynchronously, and have reduced their run time significantly. Most of the time, the tasks take between 0.5 and 5 seconds. I've put both task queues on their own dedicated modules, and have automatic scaling turned up pretty high (using the F4 and F2 server types respectively). However, I'm still running into a few issues.
As you can see, I've tried to max out bucket_size and max_concurrent_requests so these tasks run as fast as possible.
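To be concrete about what "asynchronously" means here: inside each task I issue the datastore writes with ndb's async calls and only block once at the end, roughly like this simplified sketch (not my actual task code):

from google.appengine.ext import ndb

def update_scores(scores, new_points):
    for score in scores:
        score.points = new_points
    # Issue all puts in parallel, then wait for all of them once.
    ndb.Future.wait_all(ndb.put_multi_async(scores))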
Problems
- Every once in a while I get a DeadlineExceededError on the request handler that initiates the call (a retry sketch follows this list):
DeadlineExceededError: The API call taskqueue.BulkAdd() took too long to respond and was cancelled.
- Every once in a while I get a chunk of similar errors within the tasks (for both task types): "Process terminated because the request deadline was exceeded during a loading request". (Note this isn't listed as a DeadlineExceededError.) The logs show these tasks took the entire 600 seconds allowed. They end up getting rescheduled, and when they re-run they take the expected 0.5 to 5 seconds. I've tried using AppStats to gain more insight into what's going on, but these calls are never recorded because they get killed before AppStats is able to save.
- With users updating their score every 2 minutes, the update-leaderboard queue starts to fall behind somewhere around 10k CCU. I'd ideally like to be prepared for at least 100k CCU. (By CCU I mean actual users playing our game, not the number of concurrent requests; that's about 500 front-end API requests/second at 25k users. We use locust.io to load test.)
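For problem #1, the only client-side mitigation I can think of is retrying the enqueue, something like the sketch below. (I'm assuming the failure surfaces as apiproxy_errors.DeadlineExceededError; I haven't verified that's the exception in every case.)

from google.appengine.api import taskqueue
from google.appengine.runtime import apiproxy_errors

def add_task_with_retry(queue_name, url, params, attempts=3):
    # Retry the enqueue a few times if the taskqueue RPC itself times out.
    for attempt in range(attempts):
        try:
            return taskqueue.add(queue_name=queue_name, url=url, params=params)
        except apiproxy_errors.DeadlineExceededError:
            if attempt == attempts - 1:
                raise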
Potential Optimizations / Questions
My first thought is that maybe the first two issues stem from having only a single task queue for each of the task types. Maybe this is happening because the underlying Bigtable is splitting during these calls? (See this article, "Queue sharding for stable performance".)
So maybe I should shard each queue into, say, 10 different queues. I'd also think problem #3 would benefit from queue sharding. So...
1. Any idea what the underlying causes of problems #1 and #2 are? Would sharding eliminate these errors?
2. If I do shard the queues, can I keep them all on the same respective module and rely on its autoscaling to meet demand, or am I better off with a module per shard?
3. Is there a way to dynamically scale the sharding?
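To make the sharding idea concrete, I mean something like the sketch below: pick a shard deterministically from the user id, with queues named e.g. update-leaderboard-0 through update-leaderboard-9 defined in queue.yaml. (The shard count is hard-coded here, which is part of question #3.)

import hashlib
from google.appengine.api import taskqueue

NUM_SHARDS = 10  # would need to be configurable to scale dynamically

def add_to_sharded_queue(base_name, user_id, url, params):
    # Deterministically pick a shard so one user's tasks stay on one queue.
    shard = int(hashlib.md5(user_id).hexdigest(), 16) % NUM_SHARDS
    taskqueue.add(
        queue_name='%s-%d' % (base_name, shard),
        url=url,
        params=params)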
My next thought is to try and reduce the number of calls to the update-leaderboard tasks. Not every completed game needs to translate directly into a leaderboard update; I'd just need to ensure that if a user plays even one game, their objects are guaranteed to update eventually. Any suggestions on implementing this reduction?
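One idea is to deduplicate with named tasks: name the task after the user plus a coarse time window, so repeated game completions inside that window collapse into a single leaderboard update. A sketch (the window size and handler path are made up; taskqueue rejects duplicate names with TaskAlreadyExistsError / TombstonedTaskError, which I'd just swallow):

import time
from google.appengine.api import taskqueue

WINDOW_SECONDS = 120  # at most one leaderboard update per user per window

def enqueue_leaderboard_update(user_id):
    window = int(time.time() / WINDOW_SECONDS)
    try:
        taskqueue.add(
            queue_name='update-leaderboard',
            name='update-%s-%d' % (user_id, window),  # duplicate names are rejected
            url='/tasks/update-leaderboard',           # placeholder handler path
            params={'user_id': user_id},
            countdown=WINDOW_SECONDS)                  # run after the window closes
    except (taskqueue.TaskAlreadyExistsError, taskqueue.TombstonedTaskError):
        pass  # an update for this window is already scheduled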
Finally, most of the modules' auto-scaling parameters and the queues' parameters were set somewhat arbitrarily, trying to err on the side of maxing them out. Any advice on setting these appropriately so I'm not spending more on resources than I need to?