Merge pull request #257 from rgbkrk/editing_on_the_plane

Editing on the plane
2026-06-05 23:00:18 +00:00 · 2013-03-22 11:00:27 -07:00
parent 5cf74bed24 191ee6641e
commit 581284da5f
11 changed files with 145 additions and 132 deletions
@@ -104,7 +104,7 @@ The following command lists all available minions running CentOS using the grain

 Salt also provides a state system. States can be used to configure the minion hosts.

-For example, when a minion host is ordered to read the following state file, will install
+For example, when a minion host is ordered to read the following state file, it will install
 and start the Apache server:

 .. code-block:: yaml
@@ -6,4 +6,4 @@ Command Line Applications
 Clint
 -----

-.. todo:: Write about Clint
+.. todo:: Write about Clint
@@ -41,3 +41,9 @@ messaging library aimed at use in scalable distributed or concurrent
 applications. It provides a message queue, but unlike message-oriented
 middleware, a ØMQ system can run without a dedicated message broker. The
 library is designed to have a familiar socket-style API.
+
+RabbitMQ
+--------
+
+.. todo:: Write about RabbitMQ
+
@@ -30,7 +30,6 @@ Django ORM
 The Django ORM is the interface used by `Django <http://www.djangoproject.com>`_
 to provide database access.

-It's based on the idea of models, an abstraction that makes it easier to
+It's based on the idea of `models <https://docs.djangoproject.com/en/1.3/#the-model-layer>`_, an abstraction that makes it easier to
 manipulate data in Python.

-Documentation can be found `here <https://docs.djangoproject.com/en/1.3/#the-model-layer>`_
@@ -41,7 +41,7 @@ Gtk
 PyGTK provides Python bindings for the GTK+ toolkit. Like the GTK+ library
 itself, it is currently licensed under the GNU LGPL. It is worth noting that
 PyGTK only currently supports the Gtk-2.X API (NOT Gtk-3.0). It is currently
-recommended that PyGTK is not used for new projects and existing applications
+recommended that PyGTK not be used for new projects and existing applications
 be ported from PyGTK to PyGObject.

 Tk
@@ -60,10 +60,10 @@ available on the `Python Wiki <http://wiki.python.org/moin/TkInter>`_.

 Kivy
 ----
-Kivy is a Python library for development of multi-touch enabled media rich applications. The aim is to allow for quick and easy interaction design and rapid prototyping, while making your code reusable and deployable.
+`Kivy <http://kivy.org>`_ is a Python library for development of multi-touch enabled media rich applications. The aim is to allow for quick and easy interaction design and rapid prototyping, while making your code reusable and deployable.

 Kivy is written in Python, based on OpenGL and supports different input devices such as: Mouse, Dual Mouse, TUIO, WiiMote, WM_TOUCH, HIDtouch, Apple's products and so on.

 Kivy is actively being developed by a community and is free to use. It operates on all major platforms (Linux, OSX, Windows, Android).

-The main resource for information is the website: http://kivy.org
+The main resource for information is the website: http://kivy.org
@@ -12,7 +12,7 @@ The `Python Imaging Library <http://www.pythonware.com/products/pil/>`_, or PIL
 for short, is *the* library for image manipulation in Python.

 It works with Python 1.5.2 and above, including 2.5, 2.6 and 2.7. Unfortunately,
-it doesn't work with 3.0+ yet. 
+it doesn't work with 3.0+ yet.

 Installation
 ~~~~~~~~~~~~
@@ -20,7 +20,7 @@ Installation
 PIL has a reputation of not being very straightforward to install. Listed below
 are installation notes on various systems.

-Also, there's a fork named `Pillow <http://pypi.python.org/pypi/Pillow>`_ which is easier 
+Also, there's a fork named `Pillow <http://pypi.python.org/pypi/Pillow>`_ which is easier
 to install. It has good setup instructions for all platforms.

 Installing on Linux
@@ -6,7 +6,7 @@ Twisted

 `Twisted <http://twistedmatrix.com/trac/>`_ is an event-driven networking engine. It can be
 used to build applications around many different networking protocols, including http servers
-and clients, applications using SMTP, POP3, IMAP or SSH protocols, instant messaging and 
+and clients, applications using SMTP, POP3, IMAP or SSH protocols, instant messaging and
 `many more <http://twistedmatrix.com/trac/wiki/Documentation>`_.

 PyZMQ
@@ -14,11 +14,11 @@ PyZMQ

 `PyZMQ <http://zeromq.github.com/pyzmq/>`_ is the Python binding for `ZeroMQ <http://www.zeromq.org/>`_,
 which is a high-performance asynchronous messaging library. One great advantage is that ZeroMQ
-can be used for message queuing without message broker. The basic patterns for this are:
+can be used for message queuing without a message broker. The basic patterns for this are:

 - request-reply: connects a set of clients to a set of services. This is a remote procedure call
  and task distribution pattern.
- publish-subscribe: connects a set of publishers to a set of subscribers. This is a data 
+- publish-subscribe: connects a set of publishers to a set of subscribers. This is a data
  distribution pattern.
 - push-pull (or pipeline): connects nodes in a fan-out / fan-in pattern that can have multiple
  steps, and loops. This is a parallel task distribution and collection pattern.
@@ -35,6 +35,10 @@ people who only need the basic requirements can just use NumPy.

 NumPy is compatible with Python versions 2.4 through to 2.7.2 and 3.1+.

+Numba
+-----
+.. todo:: Write about Numba
+
 SciPy
 -----

@@ -60,8 +64,9 @@ Resources

 Installation of scientific Python packages can be troublesome. Many of these
 packages are implemented as Python C extensions which need to be compiled.
-This section lists various so-called Python distributions which provide precompiled and 
-easy-to-install collections of scientific Python packages.
+This section lists various so-called scientific Python distributions which
+provide precompiled and easy-to-install collections of scientific Python
+packages.

 Unofficial Windows Binaries for Python Extension Packages
 ---------------------------------------------------------
@@ -91,6 +96,6 @@ Anaconda
 Python Distribution <https://store.continuum.io/cshop/anaconda>`_ which
 includes all the common scientific python packages and additionally many
 packages related to data analytics and big data. Anaconda comes in two
-flavours, a paid for version and a completely free and open source community
+flavors, a paid-for version and a completely free and open source community
 edition, Anaconda CE, which contains a slightly reduced feature set. Free
-licences for the paid-for version are available for academics and researchers.
+licenses for the paid-for version are available for academics and researchers.
@@ -1,99 +1,101 @@
-HTML Scraping
-=============
-
-Web Scraping
------------
-
-Web sites are written using HTML, which means that each web page is a
-structured document. Sometimes it would be great to obtain some data from 
-them and preserve the structure while we're at it. Web sites provide 
-don't always provide their data in comfortable formats such as ``.csv``. 
-
-This is where web scraping comes in. Web scraping is the practice of using a
-computer program to sift through a web page and gather the data that you need
-in a format most useful to you while at the same time preserving the structure
-of the data.
-
-lxml and Requests
-----------------
-
-`lxml <http://lxml.de/>`_ is a pretty extensive library written for parsing
-XML and HTML documents really fast. It even handles messed up tags. We will 
-also be using the `Requests <http://docs.python-requests.org/en/latest/>`_ module instead of the already built-in urlib2 
-due to improvements in speed and readability. You can easily install both 
-using ``pip install lxml`` and ``pip install requests``.
-
-Lets start with the imports:
-
-.. code-block:: python
-
-    from lxml import html
-    import requests
-    
-Next we will use ``requests.get`` to retrieve the web page with our data 
-and parse it using the ``html`` module and save the results in ``tree``:
-
-.. code-block:: python
-
-    page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
-    tree = html.fromstring(page.text)
-
-``tree`` now contains the whole HTML file in a nice tree structure which
-we can go over two different ways: XPath and CSSSelect. In this example, I
-will focus on the former. 
-
-XPath is a way of locating information in structured documents such as 
-HTML or XML documents. A good introduction to XPath is on `W3Schools <http://www.w3schools.com/xpath/default.asp>`_ .
-
-There are also various tools for obtaining the XPath of elements such as
-FireBug for Firefox or if you're using Chrome you can right click an 
-element, choose 'Inspect element', highlight the code and then right 
-click again and choose 'Copy XPath'.
-
-After a quick analysis, we see that in our page the data is contained in 
-two elements - one is a div with title 'buyer-name' and the other is a 
-span with class 'item-price':
-
-::
-
-    <div title="buyer-name">Carson Busses</div>
-    <span class="item-price">$29.95</span>
-
-Knowing this we can create the correct XPath query and use the lxml
-``xpath`` function like this:
-
-.. code-block:: python
-
-    #This will create a list of buyers:
-    buyers = tree.xpath('//div[@title="buyer-name"]/text()')
-    #This will create a list of prices
-    prices = tree.xpath('//span[@class="item-price"]/text()')
-
-Lets see what we got exactly:
-
-.. code-block:: python
-
-    print 'Buyers: ', buyers
-    print 'Prices: ', prices
-
-::
-
-    Buyers:  ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes', 
-    'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff',
-    'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup',
-    'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire',
-    'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']
-    
-    Prices:  ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25',
-    '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11',
-    '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68',
-    '$15.00', '$114.07', '$10.09']
-
-Congratulations! We have successfully scraped all the data we wanted from
-a web page using lxml and Requests. We have it stored in memory as two 
-lists. Now we can do all sorts of cool stuff with it: we can analyze it 
-using Python or we can save it a file and share it with the world.
-
-A cool idea to think about is modifying this script to iterate through 
-the rest of the pages of this example dataset or rewriting this 
-application to use threads for improved speed.
+HTML Scraping
+=============
+
+Web Scraping
+------------
+
+Web sites are written using HTML, which means that each web page is a
+structured document. Sometimes it would be great to obtain some data from
+them and preserve the structure while we're at it. Web sites don't always
+provide their data in comfortable formats such as ``csv`` or ``json``.
+
+This is where web scraping comes in. Web scraping is the practice of using a
+computer program to sift through a web page and gather the data that you need
+in a format most useful to you while at the same time preserving the structure
+of the data.
+
+lxml and Requests
+-----------------
+
+`lxml <http://lxml.de/>`_ is a pretty extensive library written for parsing
+XML and HTML documents really fast. It even handles messed up tags. We will
+also be using the `Requests <http://docs.python-requests.org/en/latest/>`_
+module instead of the already built-in urllib2 due to improvements in speed and
+readability. You can easily install both using ``pip install lxml`` and
+``pip install requests``.
+
+Let's start with the imports:
+
+.. code-block:: python
+
+    from lxml import html
+    import requests
+
+Next we will use ``requests.get`` to retrieve the web page with our data,
+parse it using the ``html`` module, and save the results in ``tree``:
+
+.. code-block:: python
+
+    page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
+    tree = html.fromstring(page.text)
+
+``tree`` now contains the whole HTML file in a nice tree structure which
+we can go over in two different ways: XPath and CSSSelect. In this example, we
+will focus on the former.
+
+XPath is a way of locating information in structured documents such as
+HTML or XML documents. A good introduction to XPath is on
+`W3Schools <http://www.w3schools.com/xpath/default.asp>`_.
+
+There are also various tools for obtaining the XPath of elements such as
+Firebug for Firefox or the Chrome Inspector. If you're using Chrome, you
+can right click an element, choose 'Inspect element', highlight the code,
+right click again and choose 'Copy XPath'.
+
+After a quick analysis, we see that in our page the data is contained in
+two elements - one is a div with title 'buyer-name' and the other is a
+span with class 'item-price':
+
+::
+
+    <div title="buyer-name">Carson Busses</div>
+    <span class="item-price">$29.95</span>
+
+Knowing this we can create the correct XPath query and use the lxml
+``xpath`` function like this:
+
+.. code-block:: python
+
+    # This will create a list of buyers:
+    buyers = tree.xpath('//div[@title="buyer-name"]/text()')
+    # This will create a list of prices:
+    prices = tree.xpath('//span[@class="item-price"]/text()')
+
+Let's see what we got exactly:
+
+.. code-block:: python
+
+    print 'Buyers: ', buyers
+    print 'Prices: ', prices
+
+::
+
+    Buyers:  ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes',
+    'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff',
+    'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup',
+    'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire',
+    'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']
+
+    Prices:  ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25',
+    '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11',
+    '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68',
+    '$15.00', '$114.07', '$10.09']
+
+Congratulations! We have successfully scraped all the data we wanted from
+a web page using lxml and Requests. We have it stored in memory as two
+lists. Now we can do all sorts of cool stuff with it: we can analyze it
+using Python or we can save it to a file and share it with the world.
+
+A cool idea to think about is modifying this script to iterate through
+the rest of the pages of this example dataset or rewriting this
+application to use threads for improved speed.
@@ -42,7 +42,7 @@ The GIL

 `The GIL`_ (Global Interpreter Lock) is how Python allows multiple threads to
 operate at the same time. Python's memory management isn't entirely thread-safe,
-so the GIL is required to prevents multiple threads from running the same
+so the GIL is required to prevent multiple threads from running the same
 Python code at once.

 David Beazley has a great `guide`_ on how the GIL operates. He also covers the
@@ -58,8 +58,8 @@ C Extensions
 The GIL
 -------

-`Special care`_ must be taken when writing C extensions to make sure you r
-egister your threads with the interpreter.
+`Special care`_ must be taken when writing C extensions to make sure you
+register your threads with the interpreter.

 C Extensions
 ::::::::::::
@@ -76,7 +76,9 @@ Pyrex
 Shedskin?
 ---------

-
+Numba
+-----
+.. todo:: Write about Numba and the autojit compiler for NumPy

 Threading
 :::::::::
@@ -86,7 +88,7 @@ Threading
 ---------


-Spanwing Processes
+Spawning Processes
 ------------------


@@ -98,12 +98,12 @@ framework like Django and the microframeworks: It comes with a lot of libraries
 and functionality and can thus not be considered lightweight. On the other
 hand, it does not provide all the functionality Django does. Instead Pyramid
 brings basic support for most regular tasks and provides a great deal of
-extensibility. Additionally, Pyramid has a huge focus on complete 
+extensibility. Additionally, Pyramid has a huge focus on complete
 `documentation <http://docs.pylonsproject.org/en/latest/docs/pyramid.html>`_. As
 a little extra it comes with the Werkzeug Debugger which allows you to debug a
 running web application in the browser.

-**Support** can also be found in the 
+**Support** can also be found in the
 `documentation <http://docs.pylonsproject.org/en/latest/index.html#support-desc>`_.


@@ -140,8 +140,8 @@ Gunicorn
 to serve Python applications. It is a Python interpretation of the Ruby
 `Unicorn <http://unicorn.bogomips.org/>`_ server. Unicorn is designed to be
 lightweight, easy to use, and uses many UNIX idioms. Gunicorn is not designed
-to face the internet, in fact it was designed to run behind Nginx which buffers
-slow requests, and takes care of other important considerations. A sample
+to face the internet -- it was designed to run behind Nginx which buffers
+slow requests and takes care of other important considerations. A sample
 setup for Nginx + Gunicorn can be found in the
 `Gunicorn help <http://gunicorn.org/deploy.html>`_.

@@ -189,7 +189,7 @@ support for Python 2.7 applications.

 Heroku allows you to run as many Python web applications as you like, 24/7 and
 free of charge. Heroku is best described as a horizontal scaling platform. They
-start to charge you once you "scale" you application to run on more than one
+start to charge you once you "scale" your application to run on more than one
 Dyno (abstracted servers) at a time.

 Heroku publishes `step-by-step instructions
@@ -202,10 +202,9 @@ DotCloud
 ~~~~~~~~

 `DotCloud <http://www.dotcloud.com/>`_ supports WSGI applications and
-background/worker tasks natively on their platform. Web applications running
-Python version 2.6, and uses :ref:`nginx <nginx-ref>` and :ref:`uWSGI
-<uwsgi-ref>`, and allows custom configuration of both
-for advanced users.
+background/worker tasks natively on their platform. Web applications run
+Python version 2.6, use :ref:`nginx <nginx-ref>` and :ref:`uWSGI
+<uwsgi-ref>`, and allow custom configuration of both for advanced users.

 DotCloud uses a custom command-line API client which can work with
 applications managed in git repositories or any other version control
@@ -222,7 +221,7 @@ getting started.
 Gondor
 ~~~~~~

-`Gondor <https://gondor.io/>`_ is a PaaS specailized for deploying Django
+`Gondor <https://gondor.io/>`_ is a PaaS specialized for deploying Django
 and Pinax applications. Gondor supports Django versions 1.2 and 1.3 on
 Python version 2.7, and can automatically configure your Django site if you
 use ``local_settings.py`` for site-specific configuration information.
@@ -238,7 +237,7 @@ Templating
 Most WSGI applications are responding to HTTP requests to serve
 content in HTML or other markup languages. Instead of generating directly
 textual content from Python, the concept of separation of concerns
-advises us to use templates. A template engine manage a suite of
+advises us to use templates. A template engine manages a suite of
 template files, with a system of hierarchy and inclusion to
 avoid unnecessary repetition, and is in charge of rendering
 (generating) the actual content, filling the static content
@@ -265,7 +264,7 @@ and to the templates themselves.
  templates. This convenience can lead to uncontrolled
  increase in complexity, and often harder to find bugs.

- It is often possible or necessary to mix javascript templates with
+- It is often necessary to mix javascript templates with
  HTML templates. A sane approach to this design is to isolate
  the parts where the HTML template passes some variable content
  to the javascript code.