2012-02-21 01:15:00 -05:00

[{"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1299075530.9782951, "message": "Hi there, coming from a PHP/MySQL background, I'm looking to learn a new programming language that will allow me to collect and store, indefinitely, large amounts of statistics/analytics. The type of numbers/data that will be stored are along the line of what Google Analytics collects (browser data, operating system, geographic, time spent, etc). The collected data has to generally be raw number figures as I'd like to display and manipulate the data using beautiful graphs.\n\nThanks for all of your help Convorions", "group_id": 95, "id": 247084}, {"user_id": 1209, "stars": [], "topic_id": 10093, "date_created": 1299076824.332798, "message": "So, you're looking for the back-end systems (programming language + database) vs. front-end systems (dashboards + reports) correct? If so, I'd research Python for a general Swiss Army programming language and \"R\" for the statistical functions (with ggplot2 for graphing).", "group_id": 95, "id": 247224}, {"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1299087475.130264, "message": "I figured R was mainly a desktop program? What I am developing is mainly a web solution, so wanted to be able to run some analysis on the web and display graphically. Any advice?", "group_id": 95, "id": 249191}, {"user_id": 3952, "stars": [], "topic_id": 10093, "date_created": 1299091056.6299729, "message": "I've thought about making something similar, with the intent of open sourcing it (like mint analytics). Proposed to do a heavy JavaScript stack with jQuery + node.js", "group_id": 95, "id": 250959}, {"user_id": 1209, "stars": [], "topic_id": 10093, "date_created": 1299095936.6921749, "message": "I'd stick with the PHP/MySQL for the backend side of it and then use Google Visualization API for the front end. Not sure if I'm still missing your point. 
Can you give some examples or a bit more info perhaps?", "group_id": 95, "id": 252348}, {"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1299110957.3821521, "message": "Google Visualization API looks neat. Will have to look more into it, thanks. Essentially I am going to collect visitor statistics (time spent, browser info, location) and dump them in a database. Going to then make the data visible in graphs and provide various tools to analyze it.", "group_id": 95, "id": 254664}, {"user_id": 16271, "stars": [], "topic_id": 10093, "date_created": 1299188827.7993491, "message": "@AroonUp You might look at something like HBase or even Hadoop Map/Reduce. Although M/R isn't going to drive your realtime web queries, it might find a place in your architecture.", "group_id": 95, "id": 264107}, {"user_id": 15929, "stars": [], "topic_id": 10093, "date_created": 1299209634.9869969, "message": "thought this might be helpful: http://www.readwriteweb.com/hack/2011/03/rstudio-an-open-source-and-cro.php", "group_id": 95, "id": 266286}, {"user_id": 17546, "stars": [{"date_created": 1299636353.575012, "user_id": 1661}, {"date_created": 1300133104.7702019, "user_id": 2293}, {"date_created": 1300141975.913132, "user_id": 2072}], "topic_id": 10093, "date_created": 1299275734.5530479, "message": "I'd definitely look at node.js + MongoDB. You'd be working in the same language on the client and server, it's fast and well suited (imo) to this sort of thing, and there's plenty of graph/visualization libraries for JS/jQuery. 
Using projects like jsdom you can run these visualization libs on the server side if that's a useful thing for you.", "group_id": 95, "id": 273295}, {"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1300123152.847842, "message": "Any idea how Google Analytics implements its tracker?", "group_id": 95, "id": 349244}, {"user_id": 1208, "stars": [], "topic_id": 10093, "date_created": 1300125053.0640941, "message": "If you want to keep a live copy running on a server, I would look at MongoDB really and build from that. If you're looking more for desktop analysis, you may want to check out Incanter (http://incanter.org/), R, SAS, or even Tableau (http://www.tableausoftware.com/), depending on your exact needs! Hope this helps.", "group_id": 95, "id": 349532}, {"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1300123122.1440301, "message": "Thanks @matwill", "group_id": 95, "id": 349240}, {"user_id": 14970, "stars": [], "topic_id": 10093, "date_created": 1300131551.6434209, "message": "The analysis will be on a server, will not be opting to keep anything desktop level. I am looking to create an applet of sorts, like Google Analytics (that script you put before the </body> tag to track analytics). I will then embed this script in my own projects to gather the data I'd like to collect. I am not sure how this is currently done by Google. Any tips/advice would be great.", "group_id": 95, "id": 350206}, {"user_id": 5863, "stars": [], "topic_id": 10093, "date_created": 1300179418.660511, "message": "I'd go through all the bits you need. JavaScript on the client to make an HTTP request to your server, no real option there. The server that takes that request has to be very fast, non-blocking and able to deal with a very large number of connections. If I had to guess I'd bet Google have a custom web server written in C/C++. 
Ruby/EventMachine, Python/Twisted/Eventlet or JavaScript/Node.js would all make for good candidates for this to begin with. You then need a datastore. MongoDB or HBase, maybe Redis or Tokyo Cabinet, would seem worth consideration. You then need to analyse the data and then store it in a way that makes most sense for your frontend. That might be in the same datastore or it might not, depends exactly what you're trying to do. It might even be in a relational database. I'd favour Python for the analysis due to its strong statistical and scientific tools. Java would also be a good choice, especially if you go for HBase and Map/Reduce jobs make sense for the analysis. R would work I think, although I'm less familiar with it. The frontend is then relatively dumb, showing tables of paged data, providing a search facility and drawing graphs of the data on the client or rendering them on the server. You could use whatever you're comfortable with for that, be that PHP or a language/framework you've used elsewhere in this stack.", "group_id": 95, "id": 354135}]