2012-02-21 01:15:00 -05:00

[{"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321042022.0488601, "message": "Candid opinion? It sucks for any large applications. On high end 24 core machines I can squeeze out ~400 req/s to one page. Just getting there required heroic tuning. Switched to jinja2 for major performance increase but still nowhere near what I am used to from other platforms. All the tricks are in place, pgbouncer, caching, etc. DB is lightning fast and query count is low. Is this level of performance common?", "group_id": 81, "id": 2554804}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321042262.845078, "message": "Also should mention that DBs are on external and even beefier systems. The 400/s rate is all 24 cores doing just page gen, not happy here :(", "group_id": 81, "id": 2554826}, {"user_id": 11592, "stars": [], "topic_id": 47712, "date_created": 1321087683.272871, "message": "Profiler is your friend. Otherwise it's really chasing a mare's nest.", "group_id": 81, "id": 2558135}, {"user_id": 32294, "stars": [], "topic_id": 47712, "date_created": 1321095807.2454381, "message": "Johnny cache in place? - Template loader caching on? - memcached ?", "group_id": 81, "id": 2558461}, {"user_id": 32294, "stars": [], "topic_id": 47712, "date_created": 1321095873.6321599, "message": "johnny cache caches queries and the template loader caching reduces io", "group_id": 81, "id": 2558464}, {"user_id": 1398, "stars": [], "topic_id": 47712, "date_created": 1321161914.2377069, "message": "@kev900 can you tell us a little more about your application? What are you doing? Saying \"It Sucks\" gives us nothing.", "group_id": 81, "id": 2563651}, {"user_id": 5436, "stars": [], "topic_id": 47712, "date_created": 1321163860.6326859, "message": "What's your wsgi clue? I got a huuuge boost when using uWSGI on nginx. 
(assuming you run nginx)", "group_id": 81, "id": 2563782}, {"user_id": 43404, "stars": [{"date_created": 1322386720.088402, "user_id": 13300}], "topic_id": 47712, "date_created": 1321165370.143394, "message": "cProfile show lots of dictionary/object access.. not a lot can be done there. I've dropped down to the Linux 'perf' level to see what's happening in the interpreter and it's the same story. The app is typical CRUD and not doing anything particularly naughty. The numbers are actually in line with i.e. http://www.askthepony.com/blog/2011/07/django-and-postgresql-improving-the-performance-with-no-effort-and-no-code/ where they are super stoked to get 70/s one one core for a dead simple app with Jinja.\n\nYes, caching is in place all across the board. To ease benchmarking, I usually drop down to 2 cores. In that setup, I see around 40/s on less cachable page gens with 10 queries and up to 100/s with full cache hits. In all cases the CPU pegs in Python and is page gen is the limiting factor.\n\nI had to make double check my watch to make sure it wasn't 2001 again.\n\nuWSGI does scale more linearly on the 24 core machines but doesn't do anything vs. well tuned Apache with mod_wsgi on 2 cores.\n\nMyself and another industry veteran have put about 100 hours a piece into tuning and code simplification to get to these numbers, so I'm skeptical there's anything inherently awry. If you think you know better, it would be worth some money to me to give you a quick consulting job.", "group_id": 81, "id": 2563864}, {"user_id": 11592, "stars": [], "topic_id": 47712, "date_created": 1321174149.144522, "message": "Maybe you could somehow flatten/inline that \"dict/object access\" and/or render it through cython. Get a nice cprofile visualizer that shows tree of offending blocks. 
You may even look into dis-ing hottest spots.", "group_id": 81, "id": 2564171}, {"user_id": 41146, "stars": [], "topic_id": 47712, "date_created": 1321197441.4302831, "message": "Have you tried using Django with Pypy its experimental of course but i heard you get a huge performance boost.", "group_id": 81, "id": 2565312}, {"user_id": 2362, "stars": [], "topic_id": 47712, "date_created": 1321209408.851779, "message": "We end to scale more horizontally, rather than vertically, for increasing our throughput. We just use a bunch of really cheap 2-core VMs, and get far above 400 req/s.", "group_id": 81, "id": 2566328}, {"user_id": 32294, "stars": [], "topic_id": 47712, "date_created": 1321215809.7195849, "message": "Could you post the apache and mod_wsgi settings?", "group_id": 81, "id": 2566987}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321477392.8945191, "message": "I found that Scala with Play 2.0 framework can serve a simple app at 15k/s on 4 cores without any tuning whatsoever. Adding in DB and even if we did everything wrong it would be reasonable to get at least 1% of that, so a higher level business decision was made to begin migrating. Unfortunately I'm left hoping the lumbering Django app won't eat our hardware budget alive in the mean time :p.\n\nI realize it might not be welcome here, but if you are doing a big app with Django, don't. Maybe that is harsh, but at least don't assume it will run acceptably (cost/user) because machines have gotten very fast. This admittedly short-sighted mistake cost us many months of development. Django offers reasonable development time at great performance and scalability (in both deployment and team size) cost.\n\nFWIW, Pypy is also actually slower with a DB driver in my testing vs CPython for Django. The benchmark they use is contrived to show just template performance vs CPython. 
It's got a year or two yet before it's ready for commercial settings.", "group_id": 81, "id": 2585505}, {"user_id": 2116, "stars": [], "topic_id": 47712, "date_created": 1321517785.776509, "message": "@kev009 I don't know what you're calling a big app, but AFAIK disqus is using Django and serves millions of pages each day. Is that big enough ? If so, maybe you should ask some tips & tricks to @zeeg", "group_id": 81, "id": 2588183}, {"user_id": 13320, "stars": [], "topic_id": 47712, "date_created": 1321532273.7035279, "message": "GIL issues? 24 VM with NGNIX in front is probably a better bench mark. I wouldn't say that more CPUs on a single machine would lead to a performance boost or would be the best way to scale.", "group_id": 81, "id": 2588770}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321560552.256413, "message": "GIL is per process, therefore 24 python processes ensures fairly good usage of the hardware. uWSGI handles this well in tests. 24 VMs would equal 24 kernels with hypervisor overheads and boatloads of context switches and cache thrashing.", "group_id": 81, "id": 2590940}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321561021.570811, "message": "@zyegfryed - It's big in terms of lots of code, lots of templates. I've watched many of the djangocon videos and paid close attention to disqus. The primary difference I can think is that they might be very light on templates. They had a ton of webservers vs. DB in 2010, so maybe the cost was just acceptable for their business model.", "group_id": 81, "id": 2590969}, {"user_id": 1398, "stars": [], "topic_id": 47712, "date_created": 1321601314.7582631, "message": "@keven 009 tons of templates can mean tons of templatetags. Those things can be slow, especially when loading lots of templates. 
Which Django version are you on?", "group_id": 81, "id": 2593821}, {"user_id": 13320, "stars": [], "topic_id": 47712, "date_created": 1321618981.223531, "message": "I don't think Disqus owns any 24 CPU servers because they are expensive. I would guess they don't own any servers at all since they are lean and development focused which means there cost scale with their needs, making it an acceptable business model. The 24 VMs wouldn't be running on one machine if you want to scale. Have one 24 CPU machine is the opposite of redundant.", "group_id": 81, "id": 2594524}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321642314.3590021, "message": "@amjoconn that's a strawman. We have two of these systems provisioned at the moment for basic redundancy. I don't want to veer too far off topic but hosted VMs/\"the cloud\" are a LOT more expensive than colocation or dedicated hosting when you move beyond a handful of instances. Your disk I/O also wont be grossly over-committed :)", "group_id": 81, "id": 2596133}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321642542.6025431, "message": "@pydanny Django 1.3.1, Jinja2 2.6. The cached template loader is being used. There's not a whole lot that can be done to trim them, except perhaps something drastic like bare-bones Django templates and assemble all the fluff with Varnish edge side includes.", "group_id": 81, "id": 2596147}, {"user_id": 2362, "stars": [], "topic_id": 47712, "date_created": 1321689491.5703721, "message": "@kev009 It's been alluded to already by several, but you're going to have much better luck scaling horizontally. The 24-core behemoth isn't giving you what you want, try your luck with load balancing across weaker, 2/4-core machines. We get great results doing that. 
It sounds like your hardware was not requisitioned to meet your expectations.", "group_id": 81, "id": 2599436}, {"user_id": 2362, "stars": [], "topic_id": 47712, "date_created": 1321689635.5995519, "message": "@kev009 Also, you are definitely going to get worse performance with PyPy if you're using their CPython C extension compatibility layer for any C extensions, especially DB drivers. AFAIK, there is only a custom mysql driver with good performance, which Alex Gaynor wrote for Quora. It's available somewhere out there. He started working on a Postgres one as well, but I'm not sure if it's usable yet. Both may be in the PyPy mercurial repo.", "group_id": 81, "id": 2599446}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1321772762.5654669, "message": "This is another strawman. You're approaching hardware with a great deal of mysticism. There's nothing inherently different with these 24 core machines that have ample disk and memory bandwidth than an equivalent core count of VMs or lesser machines. The nice thing about server workloads are that you can multiplex them across large thread or in this case process pools (to avoid GIL contention). Furthermore, the problem is the same if I try my tests on 4 core VMs and extrapolate the results.\n\nAt this point I'm fairly confident we haven't done anything wrong. Pypy may make Django acceptable to me in the future, but it's definitely not ready yet. The mid term goal is to move to the Play Scala framework, and just accept the high user cost of the current Django code. From everything I've heard, this is the right move. Think long and hard about your platform choices up front, that's the lesson I learned here.\n\nThanks for the input all.", "group_id": 81, "id": 2605638}, {"user_id": 2116, "stars": [], "topic_id": 47712, "date_created": 1321864116.354023, "message": "@gtaylor @kev009 Here's the github repo for mysql-ctypes : https://github.com/quora/mysql-ctypes. 
Still waiting for the Django 1.3 compatibility - there's a pull-request thats needs to be merged.", "group_id": 81, "id": 2611742}, {"user_id": 32294, "stars": [], "topic_id": 47712, "date_created": 1321950301.551966, "message": "Is not the complexity of business logic per request, important for the overall performance?", "group_id": 81, "id": 2617623}, {"user_id": 13325, "stars": [], "topic_id": 47712, "date_created": 1321973799.606847, "message": "Discus must have wizards on the payroll.", "group_id": 81, "id": 2618761}, {"user_id": 2362, "stars": [{"date_created": 1323204931.7146051, "user_id": 34360}], "topic_id": 47712, "date_created": 1321974321.624433, "message": "@emperorcezar Indeed, though I think @kev009 is making some assumptions as to his code/setup being bottleneck-free. There just hasn't been enough data to suggest anything other than impatience, and assuming that \"everything is correct\".", "group_id": 81, "id": 2618805}, {"user_id": 2375, "stars": [], "topic_id": 47712, "date_created": 1322004268.0944469, "message": "@gtaylor Erm, he's mentioned profiling a couple times... doesn't that mean he's checking his code for bottlenecks?", "group_id": 81, "id": 2621378}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1322009264.3607531, "message": "@gtaylor Really man? I'm quite comfortable with C and at systems programming level and have used this experience to look into the problem far deeper than most web developers could. I don't even know what you're trying to get across other than you might be offended because I'm not completely satisfied with Django. Of course there is a bottleneck, that's the crux of the issue! 
If you're hot shit, I also offered a potential gig to anyone in this thread that knows more.", "group_id": 81, "id": 2621857}, {"user_id": 2116, "stars": [], "topic_id": 47712, "date_created": 1322040070.7547591, "message": "@kev009 let's ping some core developers that might/want to be aware of this issue /cc @alex @jezdez @jacobian", "group_id": 81, "id": 2623780}, {"user_id": 32294, "stars": [{"date_created": 1323651168.4160719, "user_id": 37410}], "topic_id": 47712, "date_created": 1322045269.366823, "message": "Have you thought about doing a contest around the bottleneck?", "group_id": 81, "id": 2624043}, {"user_id": 2116, "stars": [], "topic_id": 47712, "date_created": 1322054487.108706, "message": "@kev009 A thought just hit myself : did you play with the number of process (when using mod_wsgi) or the number of worker (when dealing with gunicorn) and/or another trick to allow multi-core usage to serve the app ?", "group_id": 81, "id": 2624389}, {"user_id": 2362, "stars": [], "topic_id": 47712, "date_created": 1322060280.918041, "message": "@kev009 Are you using pgbouncer or another connection pooler? Were you sieging/testing from one or multiple external machines? No need to compare e-peens, and it's silly to have allegiance to any technology but the best for the job. People can and do serve well in excess of 400 r/s. Discus is even cool enough to post most of how they did it.", "group_id": 81, "id": 2624808}, {"user_id": 31162, "stars": [], "topic_id": 47712, "date_created": 1322066444.26019, "message": "He said he used pgbouncer in the OP.", "group_id": 81, "id": 2625332}, {"user_id": 43404, "stars": [{"date_created": 1322422389.6774731, "user_id": 23352}], "topic_id": 47712, "date_created": 1322097603.209563, "message": "@zyegfryed - yes, since the per page CPU usage is so high I've found that CPUs+1 seems to be the sweet spot for this app. This saturates the VMs pretty well and anything higher would just be unnecessary context switches. 
pgbouncer doesn't yield huge gains but it has been in use the entire time.\n\nThinking a bit about it, the thing that really stands out to me is that a lot of heavy Django users can do really aggressive caching (i.e. CMS type sites). This is commerce site and we can't cache anywhere near as aggressively -- only template blocks and queries, but very few whole pages.", "group_id": 81, "id": 2627857}, {"user_id": 43404, "stars": [{"date_created": 1322148189.259588, "user_id": 26100}], "topic_id": 47712, "date_created": 1322097607.8760951, "message": "@zyegfryed - yes, since the per page CPU usage is so high I've found that CPUs+1 seems to be the sweet spot for this app. This saturates the VMs pretty well and anything higher would just be unnecessary context switches. pgbouncer doesn't yield huge gains but it has been in use the entire time.\n\nThinking a bit about it, the thing that really stands out to me is that a lot of heavy Django users can do really aggressive caching (i.e. CMS type sites). This is commerce site and we can't cache anywhere near as aggressively -- only template blocks and queries, but very few whole pages.", "group_id": 81, "id": 2627861}, {"user_id": 43404, "stars": [], "topic_id": 47712, "date_created": 1322097781.9799581, "message": "The load tests are coming from a single utility box in the same rack. I don't think it's the limiting factor though, i.e. localhost tests aren't different and the CPU usage is pegged on the web server in all cases.", "group_id": 81, "id": 2627873}]