2012-02-21 01:15:00 -05:00


[{"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312248925.1672111, "message": "Lately I've been having second thoughts about celery / RabbitMQ due to an increasing unease over the amount of complexity we're taking on simply to say \"do this later\", with admittedly very simple usage (queued image processing), particularly since the toolchain's observability and documentation feel a little patchy for a production service - the sysadmin in me is itching over things like rabbitmqctl list_queues showing 1.2M messages in the celery queue which celery doesn't show. I'm debating swapping in redis to simplify that part of the stack but was wondering whether it'd make sense to simply roll a basic task decorator and remove a considerable amount of code from our install footprint.", "group_id": 81, "id": 1771737}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312248981.3514249, "message": "Has anyone gone down this path before? I seem to recall some discussion of simpler task queuing at PyCon but was wondering what people who are doing this in production think.", "group_id": 81, "id": 1771740}, {"user_id": 4383, "stars": [], "topic_id": 43126, "date_created": 1312270565.4251969, "message": "If you don't need to use the features of rabbitmq it's probably easier to use redis though some of the monitoring capabilities that rabbitmq has might be hard to live without. 
You'd need to query redis itself to check the queue size etc.", "group_id": 81, "id": 1773248}, {"user_id": 4383, "stars": [], "topic_id": 43126, "date_created": 1312270668.706404, "message": "Unless you want to roll your own ghetto queue by writing jobs to a table in the db and running a cron job to do the jobs every once in a while, doing things later in a predictable way doesn't get any easier than celery I'm afraid.", "group_id": 81, "id": 1773255}, {"user_id": 4383, "stars": [], "topic_id": 43126, "date_created": 1312270734.3074951, "message": "Even then you'd still need to create a file lock to make sure the cron jobs don't step on each other and make sure that jobs don't get run twice.", "group_id": 81, "id": 1773259}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312281572.7111299, "message": "what parts of the documentation do you think is patchy?", "group_id": 81, "id": 1773759}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312281712.4841189, "message": "(s)", "group_id": 81, "id": 1773765}, {"user_id": 257, "stars": [{"date_created": 1312285903.1296749, "user_id": 1243}], "topic_id": 43126, "date_created": 1312281688.3663139, "message": "and when you say that \"celery doesn't show\" the messages, what are you looking at? If you mean task monitors, then you have to set CELERY_SEND_TASK_SENT_EVENT=True, to keep track of messages before they are picked up by the worker.", "group_id": 81, "id": 1773764}, {"user_id": 1243, "stars": [], "topic_id": 43126, "date_created": 1312287326.895869, "message": "As somebody who gets to \"enjoy\" debugging a Frankenstein's monster built on top of a \"ghetto queue\" + \"cron\" (essentially the idea @ianmlewis threw out there) every time something acts up: just don't do it. You'll almost certainly have weirdo edge cases, poor monitoring, and less automatable sysadmin tasks. 
I'd probably stick with celery, but if you're really weirded out by it, maybe you could look into django-ztask (https://github.com/dmgctrl/django-ztask). Here's the backstory: http://www.zeromq.org/story:3", "group_id": 81, "id": 1774137}, {"user_id": 653, "stars": [], "topic_id": 43126, "date_created": 1312301015.340456, "message": "Do you need a ton of Django integration? Because if not, you could look at something like pyres, which was originally built where I work and uses Redis in place of RabbitMQ: https://github.com/binarydud/pyres", "group_id": 81, "id": 1776443}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312308258.331495, "message": "@asksol Some basic sysadmin stuff: yesterday my queue had 7MB messages in it but my celery workers were idle. It turns out that rabbit had silently deadlocked (which is *much* scarier than anything related to celery) but there was no utility which would show me why the messages were stacking up when the workers were idle or even what those messages were.", "group_id": 81, "id": 1777480}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312308365.9921989, "message": "I was also looking for some basic things like how to get celeryd to log its informational messages at level INFO rather than WARNING or trying to figure out why I was churning TCP connections to the queue server", "group_id": 81, "id": 1777497}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312308598.5230291, "message": "@gthank I was looking at something like http://richardhenry.github.com/hotqueue/ which appears to be a win over Rabbit simply by using a very simple, easy to manage redis queue rather than this opaque blob of Erlang with cumbersome command-line tools and e.g. 
broken logging (apparently the three messages Rabbit logs per-connection is a known source of poor performance but there's no way to disable it?)", "group_id": 81, "id": 1777539}, {"user_id": 1243, "stars": [], "topic_id": 43126, "date_created": 1312368028.084826, "message": "@acdha Is Celery's support for Redis enough? http://ask.github.com/celery/tutorials/otherqueues.html#redis", "group_id": 81, "id": 1783495}, {"user_id": 1243, "stars": [], "topic_id": 43126, "date_created": 1312368258.1648741, "message": "@acdha The docs call it \"limited\", but in the pre-1.0 days when I was actively messing with Celery, it seemed to do the job just fine.", "group_id": 81, "id": 1783504}, {"user_id": 257, "stars": [{"date_created": 1312390296.535187, "user_id": 1736}], "topic_id": 43126, "date_created": 1312380306.253022, "message": "redis support has been significantly improved, it's not really limited anymore (apart from reliability)", "group_id": 81, "id": 1784308}, {"user_id": 3580, "stars": [], "topic_id": 43126, "date_created": 1312428515.28916, "message": "+1 for redis. @acdha I'm in a similar position of recently dipping my toe into this.. just stood up celery/redis primarily for keeping haystack indexes up to date, and I find myself using them for more almost daily. Felt kind of big at first.. but couple reactions to what you're saying... First, redis has been STUPID reliable. Runs for days. It's honestly the one long-running service I don't even think about (that said, Ask's \"apart from reliability\" comment gives me pause. @Asksol what does that mean?).. Second, celery logging at info has been great for me. All of this to be taken with a grain of salt since I've only got about 2mo experience with them in production at this point :)", "group_id": 81, "id": 1790667}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312449921.7591779, "message": "@phil: with redis you can lose messages if redis is shutdown improperly (e.g. 
killed or power failure)", "group_id": 81, "id": 1791838}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312449968.3182409, "message": "redis doesn't have acknowledgments, but celery will put the message back at shutdown if it hasn't been processed", "group_id": 81, "id": 1791844}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450027.614615, "message": "how many messages celery will reserve at a time is decided with the CELERYD_PREFETCH_MULTIPLIER setting", "group_id": 81, "id": 1791845}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450044.9236169, "message": "if that is 0 it will reserve as many messages as it can, as fast as it can", "group_id": 81, "id": 1791846}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312449940.9436059, "message": "also you can lose messages if the worker is shutdown improperly", "group_id": 81, "id": 1791840}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450050.5853159, "message": "so not a good idea", "group_id": 81, "id": 1791848}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450135.422364, "message": "if acks_late is off (default) and that value is 1 it will reserve two messages at a time (one that is processing, and one extra). 
if acks_late is on it will reserve one message at a time", "group_id": 81, "id": 1791852}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450178.161356, "message": "http://ask.github.com/celery/faq.html#should-i-use-retry-or-acks-late for more info on acks_late", "group_id": 81, "id": 1791854}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450210.261034, "message": "http://ask.github.com/celery/userguide/optimizing.html#prefetch-limits for more info on prefetch limits", "group_id": 81, "id": 1791857}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450278.678622, "message": "also redis doesn't immediately write data to disk with the default settings, see http://redis.io/topics/persistence", "group_id": 81, "id": 1791859}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312450303.66447, "message": "so if the messages are important to you, I still recommend using rabbitmq", "group_id": 81, "id": 1791860}, {"user_id": 1126, "stars": [], "topic_id": 43126, "date_created": 1312463992.128407, "message": "@asksol I thought that's what the appendonly option in redis was for, if you needed real durability", "group_id": 81, "id": 1792666}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312492758.6875041, "message": "it still doesn't have message acknowledgements though, so you can still lose messages if the worker is killed", "group_id": 81, "id": 1797172}, {"user_id": 257, "stars": [], "topic_id": 43126, "date_created": 1312492700.452317, "message": "@pewpewarrows: yes, the append only option solves one of these issues", "group_id": 81, "id": 1797157}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312495634.631537, "message": "@asksol: what do you think about https://github.com/ask/celery/issues/445 - I'm trying to get more useful information in my django-sentry reports by improving celery's use of logging", "group_id": 81, "id": 1797507}, {"user_id": 
257, "stars": [], "topic_id": 43126, "date_created": 1312528643.259933, "message": "@acdha I'm not sure yet, you mention a performance gain, but does it then support lazy arguments? e.g. logger.info(\"Report: %s\", self.complex_report) ?", "group_id": 81, "id": 1800175}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1312549768.7617681, "message": "@asksol Correct: the arguments are then lazy, which was really important for me in a couple of cases where stringifying a debug message was noticeably expensive", "group_id": 81, "id": 1801632}, {"user_id": 18845, "stars": [], "topic_id": 43126, "date_created": 1313046557.2859449, "message": "We are old school and using Sun Grid Engine + drmma-python", "group_id": 81, "id": 1846242}, {"user_id": 927, "stars": [], "topic_id": 43126, "date_created": 1313185264.9100921, "message": "@gourneau Nice: SGE brings back some *old* memories I hadn't thought to bring up", "group_id": 81, "id": 1859825}, {"user_id": 18845, "stars": [], "topic_id": 43126, "date_created": 1313710286.239393, "message": "Also this is not Django, but is extra fancy http://learnboost.github.com/kue/", "group_id": 81, "id": 1905489}, {"user_id": 18845, "stars": [], "topic_id": 43126, "date_created": 1320279554.0024819, "message": "Celery is a nightmare for me to roll out to 100s of machines, is there a simpler solution that does not require so many dependencies? Also, the queue is only on one machine at a time.", "group_id": 81, "id": 2495309}, {"user_id": 18845, "stars": [], "topic_id": 43126, "date_created": 1320689581.7200191, "message": "Okay, we just hand rolled the 8 package dependencies, they will be added to the Debian packages soon", "group_id": 81, "id": 2527350}]