Compare commits

54 Commits

Author SHA1 Message Date
Timo Furrer 733d77ad1e release: 0.13.0 2019-03-08 12:17:07 +00:00
Timo Furrer 3abd7e8c53 Merge pull request #351 from jdufresne/isinstance
Merge multiple isinstance() calls to one
2019-03-03 16:43:10 +01:00
Timo Furrer 0be9e6a74b Merge pull request #353 from jdufresne/pypy
Remove pypy from tox.ini
2019-03-03 16:42:34 +01:00
Timo Furrer ecd0afbcec Merge pull request #248 from jean/master
Editing while reading: punctuation, markup, linebreaks
2019-03-03 16:41:47 +01:00
Jean Jordaan addaa090ef Merge branch 'master' into master 2019-03-03 13:29:21 +07:00
Jon Dufresne f7b3fd4601 Remove pypy from tox.ini
The platform is not tested on Travis and it fails to run with:

    Processing ./.tox/dist/tablib-0.12.1.zip
    Collecting odfpy (from tablib==0.12.1)
    Collecting openpyxl>=2.4.0 (from tablib==0.12.1)
    Collecting backports.csv (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/71/f7/5db9136de67021a6dce4eefbe50d46aa043e59ebb11c83d4ecfeb47b686e/backports.csv-1.0.6-py2.py3-none-any.whl
    Collecting xlrd (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/b0/16/63576a1a001752e34bf8ea62e367997530dc553b689356b9879339cf45a4/xlrd-1.2.0-py2.py3-none-any.whl
    Collecting xlwt (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/44/48/def306413b25c3d01753603b1a222a011b8621aed27cd7f89cbc27e6b0f4/xlwt-1.3.0-py2.py3-none-any.whl
    Collecting pyyaml (from tablib==0.12.1)
    Collecting pandas (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/81/fd/b1f17f7dc914047cd1df9d6813b944ee446973baafe8106e4458bfb68884/pandas-0.24.1.tar.gz
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<module>", line 1, in <module>
          File "/tmp/pip-install-F5lmAg/pandas/setup.py", line 732, in <module>
            ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
          File "/tmp/pip-install-F5lmAg/pandas/setup.py", line 475, in maybe_cythonize
            numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
          File "tablib/.tox/pypy/site-packages/pkg_resources/__init__.py", line 1144, in resource_filename
            return get_provider(package_or_requirement).get_resource_filename(
          File "tablib/.tox/pypy/site-packages/pkg_resources/__init__.py", line 361, in get_provider
            __import__(moduleOrReq)
        ImportError: No module named numpy

        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-F5lmAg/pandas/
2019-03-02 08:56:06 -08:00
Parth Shandilya 79dc77de49 Merge pull request #352 from jdufresne/ws
Trim trailing white space throughout the project
2019-03-02 22:18:07 +05:30
Jon Dufresne b057cdf05e Trim trailing white space throughout the project
Many editors clean up trailing white space on save. By removing it all
in one go, it helps keep future diffs cleaner by avoiding spurious white
space changes on unrelated lines.
2019-03-02 08:42:53 -08:00
Jon Dufresne fc2f3c07c8 Merge multiple isinstance() calls to one 2019-03-02 08:38:03 -08:00
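A minimal sketch of the kind of change this commit title describes. The function names and the types checked are illustrative assumptions, not taken from the tablib source. `isinstance()` accepts a tuple of types, so a chain of checks can collapse into one call:

```python
# Hypothetical helpers; the types checked here are illustrative only.

def is_sequence_chained(value):
    # Before: one isinstance() call per type, joined with `or`.
    return isinstance(value, list) or isinstance(value, tuple)


def is_sequence_merged(value):
    # After: a single isinstance() call with a tuple of types.
    return isinstance(value, (list, tuple))
```

Both forms return the same result for any input; the merged form is simply shorter.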
Timo Furrer a10327a283 Merge pull request #350 from browniebroke/bugfix/invalid-ascii-csv
Import ascii characters not valid with unicode literals - updated
2019-03-02 15:06:21 +01:00
Bruno Alla e0de42ef06 Add backports.csv to requirements.txt 2019-03-02 10:44:38 -03:00
Bruno Alla f757ab84d1 Merge branch 'master' into bugfix/invalid-ascii-csv
# Conflicts:
#	setup.py
#	tablib/compat.py
#	test_tablib.py
2019-03-02 10:41:07 -03:00
Timo Furrer dc24fda415 Merge pull request #333 from hudgeon/master
Updated xlsx format to remove reference to openpyxl's deprecated get_active_worksheet
2019-03-02 13:03:30 +01:00
Timo Furrer 3ba8d529fc Merge pull request #348 from mloesch/jira
Add Jira table export
2019-03-02 12:16:00 +01:00
Timo Furrer a8bdb4b28f Merge pull request #338 from lepuchi/hotfix/csv-new-line
Handle case where there is an empty line in CSV
2019-03-02 12:12:30 +01:00
Timo Furrer 1aaf235751 Merge pull request #344 from jdufresne/tox-pandas
Include pandas dependency when testing with tox
2019-03-02 12:04:35 +01:00
Parth Shandilya 36ec60d5dd Merge pull request #343 from jdufresne/od
Remove vendored ordereddict package
2019-03-02 00:08:40 +05:30
Parth Shandilya babcbfd949 Merge pull request #339 from thombashi/replace_deprecated_method
Replace a deprecated method call
2019-03-02 00:07:39 +05:30
Parth Shandilya 29b2c08da0 Merge pull request #346 from jdufresne/compat
Remove unused compat entries
2019-03-01 23:59:27 +05:30
Parth Shandilya 862a681263 Merge pull request #345 from jdufresne/cache
Enable pip cache in Travis CI
2019-03-01 23:58:22 +05:30
Mathias Loesch 102073c426 Add Jira table export 2019-01-23 22:34:45 +01:00
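For orientation, a rough sketch of what a Jira table export could produce. This is not the code added by the commit; `to_jira_table` is a hypothetical helper, and the only assumption about the target format is standard Jira wiki markup, where header cells are delimited by `||` and body cells by `|`:

```python
def to_jira_table(headers, rows):
    """Render headers and rows as Jira wiki table markup (hypothetical helper)."""
    lines = []
    if headers:
        # Header cells are wrapped in double pipes.
        lines.append("||" + "||".join(str(h) for h in headers) + "||")
    for row in rows:
        # Body cells are wrapped in single pipes.
        lines.append("|" + "|".join(str(cell) for cell in row) + "|")
    return "\n".join(lines)


table = to_jira_table(["first_name", "age"], [("John", 90), ("Henry", 83)])
```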
Jon Dufresne 499ce52304 Remove unused compat entries
Organize both the Python2 & Python3 sections in the same order so they
are easier to compare.

Removed:

- basestring
- ifilter
- bytes
2019-01-01 10:40:29 -08:00
Jon Dufresne c650b67e06 Enable pip cache in Travis CI
Reduce load on PyPI servers and slightly speed up builds.

For more information, see:

https://docs.travis-ci.com/user/caching/#pip-cache
2019-01-01 10:32:08 -08:00
Jon Dufresne 3e4d6fb5aa Include pandas dependency when testing with tox
Allows all tests to pass.

As pandas is defined as an 'extra', use tox's 'extras' feature. This
requires tox 2.4+, so document that as well.

https://tox.readthedocs.io/en/latest/config.html#conf-extras
2019-01-01 10:28:29 -08:00
Jon Dufresne dd2ba714d3 Remove vendored ordereddict package
Now that Python 2.6 support has been dropped, can remove the vendored
ordereddict package. Use the stdlib collections.OrderedDict instead.
2019-01-01 10:02:13 -08:00
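As a reminder of what the stdlib replacement offers, a small sketch using `collections.OrderedDict`. The headers and row values are illustrative and not taken from the tablib source:

```python
from collections import OrderedDict

# Illustrative column headers and one row of values.
headers = ["first_name", "last_name", "age"]
row = ("John", "Adams", 90)

# Pair each header with its value while preserving column order.
record = OrderedDict(zip(headers, row))
```

Iterating over `record` yields the keys in insertion order, which is what a tabular row needs.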
Tsuyoshi Hombashi a28a057559 Replace a deprecated method call
Workbook.remove_sheet method deprecated since openpyxl 2.4.0
2018-10-06 19:19:09 +09:00
lepuchi d38549ef1e only add row if it exists 2018-10-02 23:26:19 +05:30
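One plausible reading of this commit, given the pull request title about empty lines in CSV, is sketched below with the standard `csv` module. `read_rows` is a hypothetical helper and not the tablib code:

```python
import csv
import io


def read_rows(text):
    """Parse CSV text, skipping lines that produce no fields."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        # csv.reader yields an empty list for a blank line; skip it.
        if row:
            rows.append(row)
    return rows


rows = read_rows("name,age\n\nJohn,90\n")
```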
kennethreitz 5a359ba4de Update README.rst 2018-09-17 08:14:12 -04:00
kennethreitz 359007444c Update README.rst 2018-09-17 08:13:48 -04:00
Maciej "RooTer" Urbański 4f8949417e ujson presence no longer breaks tablib (resolves #297) (#311) 2018-09-12 16:15:20 -03:00
Bruno Soares 3d5943a8a4 Fix: Circular reference detected error (#332)
* Rename function name

* Add uuid handler on json dumps

* Add myself to authors
2018-09-12 15:49:46 -03:00
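The "uuid handler on json dumps" bullet can be illustrated with the `default` hook of `json.dumps`. This is a sketch under assumptions: the name `serialize_extra_types` is hypothetical and the actual handler in the commit may differ.

```python
import json
import uuid


def serialize_extra_types(obj):
    """Hypothetical `default` hook: convert UUIDs to strings."""
    if isinstance(obj, uuid.UUID):
        return str(obj)
    # A default hook should raise TypeError for types it cannot handle.
    raise TypeError("%r is not JSON serializable" % (obj,))


payload = {"id": uuid.UUID("12345678-1234-5678-1234-567812345678")}
encoded = json.dumps(payload, default=serialize_extra_types)
```

Without the hook, `json.dumps` raises `TypeError` for a `uuid.UUID` value.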
Norman Hooper 38486231cc reStructuredText (#336)
* median for Python 2

* More compat

* Support reStructuredText

* Tests
2018-09-12 15:27:10 -03:00
Claude Paroz 75f1bafd69 Removed Python 3.3 support (#310) 2018-09-12 15:24:37 -03:00
Iuri de Silvio 4749760e6f Typo: OSD -> ODS
Fix #330
2018-09-12 15:22:06 -03:00
Gregory Bataille ac3cf67620 fix(): remove openpyxl warning by properly accessing cells (#296) 2018-09-12 08:34:55 -03:00
DougHudgeon f812c29275 Add instructions for handling csv line endings in Windows in Python 3 2018-06-26 10:33:21 +10:00
DougHudgeon 4c5d0b1a45 Instructions for opening Excel workbook and reading the first sheet 2018-06-25 14:25:50 +10:00
DougHudgeon 61063e2b09 Updated xlsx format to use openpyxl's .active property 2018-06-25 14:17:34 +10:00
kennethreitz 4c300e65a5 update install instructions
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-09-01 15:42:51 -04:00
kennethreitz edbb16ec97 next version 2017-09-01 15:37:00 -04:00
kennethreitz dec5cea722 Merge pull request #307 from audiolion/make-pandas-optional
Make pandas optional
2017-09-01 13:49:44 -04:00
Ryan Castner 38183938dc Change how travis installs to get all test dependencies 2017-09-01 13:33:28 -04:00
Ryan Castner 7f1db4023f Raise NotImplementedError if pandas is not installed 2017-09-01 13:21:21 -04:00
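A minimal sketch of the optional-dependency pattern this commit title suggests. The helper names and the `module_name` parameter are assumptions added for illustration (the parameter mainly makes the failure path easy to exercise); this is not the tablib implementation.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def export_df(rows, headers, module_name="pandas"):
    """Hypothetical exporter that needs an optional dependency."""
    pandas = optional_import(module_name)
    if pandas is None:
        # Fail only when the feature is actually used.
        raise NotImplementedError(
            "DataFrame export requires %s to be installed" % module_name
        )
    return pandas.DataFrame(rows, columns=headers)
```

Importing the package stays cheap and safe; only callers that ask for a DataFrame see the error.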
Ryan Castner b09fface1b Make pandas an optional install 2017-09-01 13:20:54 -04:00
kennethreitz 69edb9def3 Update index.rst 2017-08-28 01:14:36 -04:00
kennethreitz ec54918f4a Update tutorial.rst 2017-08-28 01:06:43 -04:00
kennethreitz ab6633549f Update index.rst 2017-08-28 01:04:16 -04:00
kennethreitz 56005d8022 Update README.rst 2017-08-28 01:02:49 -04:00
kennethreitz 36fa7ef097 update docs
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-08-27 03:56:14 -04:00
Bruno Alla 80e72cfa27 Fix unicode encode errors on Python 2 -- Fixes #215
Switch csv library to backports.csv as the implementation
is closer to the python 3 one. Add a test case covering the
problem.

Run tests with unicode_literals from future

Fix unicode encode errors with unicode characters

- Use `backports.csv` instead of `unicodecsv`
- Use StringIO instead of cStringIO
- Clean-up some Python 2 specific code
2017-05-02 17:33:14 +01:00
Jean Jordaan cd67a63b43 Fix typo in label, add missing newline before directive
Label: s/peed/speed/
2016-08-01 11:13:25 +07:00
Jean Jordaan 19b3d6d06a Change the blind reference mit to an URL
:ref:`MIT Licensed <mit>` had no target in the docs, so change it to a
canonical URL.
2016-08-01 11:11:01 +07:00
Jean Jordaan 59090d33a8 Missed some tabs. 2016-07-31 18:23:21 +07:00
Jean Jordaan a4f974287b Editing while reading: punctuation, markup, linebreaks
I fixed some extra commas, missing apostrophes, and typos;
added some linebreaks between sentences for very long lines;
added explicit markup for console blocks,
got rid of some tabs,
fixed indentation of an admonition, and some more small tweaks.

This supersedes https://github.com/kennethreitz/tablib/pull/84
2016-07-31 18:15:12 +07:00
36 changed files with 714 additions and 401 deletions
+2 -2
@@ -1,10 +1,10 @@
language: python
cache: pip
python:
- 2.7
- 3.3
- 3.4
- 3.5
- 3.6
install:
- python setup.py install
- pip install -r requirements.txt
script: python test_tablib.py
+2
@@ -34,3 +34,5 @@ Patches and Suggestions
- Mathias Loesch
- Tushar Makkar
- Andrii Soldatenko
- Bruno Soares
- Tsuyoshi Hombashi
+2 -2
@@ -1,10 +1,10 @@
Where possible, please follow PEP8 with regard to coding style. Sometimes the line
Where possible, please follow PEP8 with regard to coding style. Sometimes the line
length restriction is too hard to follow, so don't bend over backwards there.
Triple-quotes should always be """, single quotes are ' unless using "
would result in less escaping within the string.
All modules, functions, and methods should be well documented reStructuredText for
All modules, functions, and methods should be well documented reStructuredText for
Sphinx AutoDoc.
All functionality should be available in pure Python. Optional C (via Cython)
-1
@@ -254,4 +254,3 @@ History
* Export Support for XLS, JSON, YAML, and CSV.
* DataBook Export for XLS, JSON, and YAML.
* Python Dict Property Support.
+1 -27
@@ -1,32 +1,6 @@
Tablib includes some vendorized python libraries: ordereddict, markup.
Tablib includes some vendorized Python libraries: markup.
Markup License
==============
Markup is in the public domain.
OrderedDict License
===================
Copyright (c) 2009 Raymond Hettinger
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
+27 -17
@@ -6,11 +6,11 @@ Tablib: format-agnostic tabular dataset library
::
_____ ______ ___________ ______
__ /_______ ____ /_ ___ /___(_)___ /_
_ __/_ __ `/__ __ \__ / __ / __ __ \
/ /_ / /_/ / _ /_/ /_ / _ / _ /_/ /
\__/ \__,_/ /_.___/ /_/ /_/ /_.___/
_____ ______ ___________ ______
__ /_______ ____ /_ ___ /___(_)___ /_
_ __/_ __ `/__ __ \__ / __ / __ __ \
/ /_ / /_/ / _ /_/ /_ / _ / _ /_/ /
\__/ \__,_/ /_.___/ /_/ /_/ /_.___/
@@ -23,21 +23,31 @@ Output formats supported:
- YAML (Sets + Books)
- Pandas DataFrames (Sets)
- HTML (Sets)
- Jira (Sets)
- TSV (Sets)
- OSD (Sets)
- ODS (Sets)
- CSV (Sets)
- DBF (Sets)
Note that tablib *purposefully* excludes XML support. It always will. (Note: This is a joke. Pull requests are welcome.)
If you're interested in financially supporting Kenneth Reitz open source, consider `visiting this link <https://cash.me/$KennethReitz>`_. Your support helps tremendously with sustainability of motivation, as Open Source is no longer part of my day job.
Overview
--------
`tablib.Dataset()`
A Dataset is a table of tabular data. It may or may not have a header row. They can be build and manipulated as raw Python datatypes (Lists of tuples|dictionaries). Datasets can be imported from JSON, YAML, DBF, and CSV; they can be exported to XLSX, XLS, ODS, JSON, YAML, DBF, CSV, TSV, and HTML.
A Dataset is a table of tabular data.
It may or may not have a header row.
They can be build and manipulated as raw Python datatypes (Lists of tuples|dictionaries).
Datasets can be imported from JSON, YAML, DBF, and CSV;
they can be exported to XLSX, XLS, ODS, JSON, YAML, DBF, CSV, TSV, and HTML.
`tablib.Databook()`
A Databook is a set of Datasets. The most common form of a Databook is an Excel file with multiple spreadsheets. Databooks can be imported from JSON and YAML; they can be exported to XLSX, XLS, ODS, JSON, and YAML.
A Databook is a set of Datasets.
The most common form of a Databook is an Excel file with multiple spreadsheets.
Databooks can be imported from JSON and YAML;
they can be exported to XLSX, XLS, ODS, JSON, and YAML.
Usage
-----
@@ -87,7 +97,7 @@ JSON!
+++++
::
>>> print(data.json)
>>> print(data.export('json'))
[
{
"last_name": "Adams",
@@ -106,7 +116,7 @@ YAML!
+++++
::
>>> print(data.yaml)
>>> print(data.export('yaml'))
- {age: 90, first_name: John, last_name: Adams}
- {age: 83, first_name: Henry, last_name: Ford}
@@ -114,7 +124,7 @@ CSV...
++++++
::
>>> print(data.csv)
>>> print(data.export('csv'))
first_name,last_name,age
John,Adams,90
Henry,Ford,83
@@ -124,20 +134,20 @@ EXCEL!
::
>>> with open('people.xls', 'wb') as f:
... f.write(data.xls)
... f.write(data.export('xls'))
DBF!
++++
::
>>> with open('people.dbf', 'wb') as f:
... f.write(data.dbf)
... f.write(data.export('dbf'))
Pandas DataFrame!
+++++++++++++++++
::
::
>>> print(data.df):
>>> print(data.export('df')):
first_name last_name age
0 John Adams 90
1 Henry Ford 83
@@ -150,7 +160,7 @@ Installation
To install tablib, simply: ::
$ pip install tablib
$ pip install tablib[pandas]
Make sure to check out `Tablib on PyPi <https://pypi.python.org/pypi/tablib/>`_!
+2 -2
@@ -1,9 +1,9 @@
Modifications:
Modifications:
Copyright (c) 2011 Kenneth Reitz.
Original Project:
Original Project:
Copyright (c) 2010 by Armin Ronacher.
+1 -2
@@ -1,7 +1,7 @@
krTheme Sphinx Style
====================
This repository contains sphinx styles Kenneth Reitz uses in most of
This repository contains sphinx styles Kenneth Reitz uses in most of
his projects. It is a drivative of Mitsuhiko's themes for Flask and Flask related
projects. To use this style in your Sphinx documentation, follow
this guide:
@@ -22,4 +22,3 @@ The following themes exist:
**kr_small**
small one-page theme. Intended to be used by very small addon libraries.
+1 -1
@@ -4,4 +4,4 @@ stylesheet = flasky.css
pygments_style = flask_theme_support.FlaskyStyle
[options]
touch_icon =
touch_icon =
+22 -22
@@ -8,11 +8,11 @@
* :license: BSD, see LICENSE for details.
*
*/
@import url("basic.css");
/* -- page layout ----------------------------------------------------------- */
body {
font-family: 'Georgia', serif;
font-size: 17px;
@@ -35,7 +35,7 @@ div.bodywrapper {
hr {
border: 1px solid #B1B4B6;
}
div.body {
background-color: #ffffff;
color: #3E4349;
@@ -46,7 +46,7 @@ img.floatingflask {
padding: 0 0 10px 10px;
float: right;
}
div.footer {
text-align: right;
color: #888;
@@ -55,12 +55,12 @@ div.footer {
width: 650px;
margin: 0 auto 40px auto;
}
div.footer a {
color: #888;
text-decoration: underline;
}
div.related {
line-height: 32px;
color: #888;
@@ -69,18 +69,18 @@ div.related {
div.related ul {
padding: 0 0 0 10px;
}
div.related a {
color: #444;
}
/* -- body styles ----------------------------------------------------------- */
a {
color: #004B6B;
text-decoration: underline;
}
a:hover {
color: #6D4100;
text-decoration: underline;
@@ -89,7 +89,7 @@ a:hover {
div.body {
padding-bottom: 40px; /* saved for footer */
}
div.body h1,
div.body h2,
div.body h3,
@@ -109,24 +109,24 @@ div.indexwrapper h1 {
height: {{ theme_index_logo_height }};
}
{% endif %}
div.body h2 { font-size: 180%; }
div.body h3 { font-size: 150%; }
div.body h4 { font-size: 130%; }
div.body h5 { font-size: 100%; }
div.body h6 { font-size: 100%; }
a.headerlink {
color: white;
padding: 0 4px;
text-decoration: none;
}
a.headerlink:hover {
color: #444;
background: #eaeaea;
}
div.body p, div.body dd, div.body li {
line-height: 1.4em;
}
@@ -164,25 +164,25 @@ div.note {
background-color: #eee;
border: 1px solid #ccc;
}
div.seealso {
background-color: #ffc;
border: 1px solid #ff6;
}
div.topic {
background-color: #eee;
}
div.warning {
background-color: #ffe4e4;
border: 1px solid #f66;
}
p.admonition-title {
display: inline;
}
p.admonition-title:after {
content: ":";
}
@@ -254,7 +254,7 @@ dl {
dl dd {
margin-left: 30px;
}
pre {
padding: 0;
margin: 15px -30px;
+56 -31
@@ -43,11 +43,12 @@ control machine.
The repository is publicly accessible.
``git clone git://github.com/kennethreitz/tablib.git``
.. code-block:: console
git clone git://github.com/kennethreitz/tablib.git
The project is hosted on **GitHub**.
GitHub:
http://github.com/kennethreitz/tablib
@@ -55,10 +56,9 @@ The project is hosted on **GitHub**.
Git Branch Structure
++++++++++++++++++++
Feature / Hotfix / Release branches follow a `Successful Git Branching Model`_ . Git-flow_ is a great tool for managing the repository. I highly recommend it.
Feature / Hotfix / Release branches follow a `Successful Git Branching Model`_ .
Git-flow_ is a great tool for managing the repository. I highly recommend it.
``develop``
The "next release" branch. Likely unstable.
``master``
Current production release (|version|) on PyPi.
@@ -86,11 +86,14 @@ Tablib welcomes new format additions! Format suggestions include:
Coding by Convention
++++++++++++++++++++
Tablib features a micro-framework for adding format support. The easiest way to understand it is to use it. So, let's define our own format, named *xxx*.
Tablib features a micro-framework for adding format support.
The easiest way to understand it is to use it.
So, let's define our own format, named *xxx*.
1. Write a new format interface.
:class:`tablib.core` follows a simple pattern for automatically utilizing your format throughout Tablib. Function names are crucial.
:class:`tablib.core` follows a simple pattern for automatically utilizing your format throughout Tablib.
Function names are crucial.
Example **tablib/formats/_xxx.py**: ::
@@ -116,17 +119,18 @@ Tablib features a micro-framework for adding format support. The easiest way to
...
# returns True if given stream is parsable as xxx
.. admonition:: Excluding Support
.. admonition:: Excluding Support
If the format excludes support for an import/export mechanism (*e.g.*
:class:`csv <tablib.Dataset.csv>` excludes
:class:`Databook <tablib.Databook>` support),
simply don't define the respective functions.
Appropriate errors will be raised.
If the format excludes support for an import/export mechanism (*eg.* :class:`csv <tablib.Dataset.csv>` excludes :class:`Databook <tablib.Databook>` support), simply don't define the respective functions. Appropriate errors will be raised.
2. Add your new format module to the :class:`tablib.formats.available` tuple.
2.
Add your new format module to the :class:`tablib.formats.available` tuple.
3.
Add a mock property to the :class:`Dataset <tablib.Dataset>` class with verbose `reStructured Text`_ docstring. This alleviates IDE confusion, and allows for pretty auto-generated Sphinx_ documentation.
3. Add a mock property to the :class:`Dataset <tablib.Dataset>` class with verbose `reStructured Text`_ docstring.
This alleviates IDE confusion, and allows for pretty auto-generated Sphinx_ documentation.
4. Write respective :ref:`tests <testing>`.
@@ -136,22 +140,33 @@ Tablib features a micro-framework for adding format support. The easiest way to
Testing Tablib
--------------
Testing is crucial to Tablib's stability. This stable project is used in production by many companies and developers, so it is important to be certain that every version released is fully operational. When developing a new feature for Tablib, be sure to write proper tests for it as well.
Testing is crucial to Tablib's stability.
This stable project is used in production by many companies and developers,
so it is important to be certain that every version released is fully operational.
When developing a new feature for Tablib, be sure to write proper tests for it as well.
When developing a feature for Tablib, the easiest way to test your changes for potential issues is to simply run the test suite directly. ::
When developing a feature for Tablib,
the easiest way to test your changes for potential issues is to simply run the test suite directly.
$ ./test_tablib.py
.. code-block:: console
$ ./test_tablib.py
`Jenkins CI`_, amongst other tools, supports Java's xUnit testing report format. Nose_ allows us to generate our own xUnit reports.
`Jenkins CI`_, amongst other tools, supports Java's xUnit testing report format.
Nose_ allows us to generate our own xUnit reports.
Installing nose is simple. ::
Installing nose is simple.
$ pip install nose
.. code-block:: console
Once installed, we can generate our xUnit report with a single command. ::
$ pip install nose
$ nosetests test_tablib.py --with-xunit
Once installed, we can generate our xUnit report with a single command.
.. code-block:: console
$ nosetests test_tablib.py --with-xunit
This will generate a **nosetests.xml** file, which can then be analyzed.
@@ -165,7 +180,9 @@ This will generate a **nosetests.xml** file, which can then be analyzed.
Continuous Integration
----------------------
Every commit made to the **develop** branch is automatically tested and inspected upon receipt with `Travis CI`_. If you have access to the main repository and broke the build, you will receive an email accordingly.
Every commit made to the **develop** branch is automatically tested and inspected upon receipt with `Travis CI`_.
If you have access to the main repository and broke the build,
you will receive an email accordingly.
Anyone may view the build status and history at any time.
@@ -182,19 +199,27 @@ Additional reports will also be included here in the future, including :pep:`8`
Building the Docs
-----------------
Documentation is written in the powerful, flexible, and standard Python documentation format, `reStructured Text`_.
Documentation builds are powered by the powerful Pocoo project, Sphinx_. The :ref:`API Documentation <api>` is mostly documented inline throughout the module.
Documentation is written in the powerful, flexible,
and standard Python documentation format, `reStructured Text`_.
Documentation builds are powered by the powerful Pocoo project, Sphinx_.
The :ref:`API Documentation <api>` is mostly documented inline throughout the module.
The Docs live in ``tablib/docs``. In order to build them, you will first need to install Sphinx. ::
The Docs live in ``tablib/docs``.
In order to build them, you will first need to install Sphinx.
$ pip install sphinx
.. code-block:: console
$ pip install sphinx
Then, to build an HTML version of the docs, simply run the following from the **docs** directory: ::
Then, to build an HTML version of the docs, simply run the following from the ``docs`` directory:
$ make html
.. code-block:: console
Your ``docs/_build/html`` directory will then contain an HTML representation of the documentation, ready for publication on most web servers.
$ make html
Your ``docs/_build/html`` directory will then contain an HTML representation of the documentation,
ready for publication on most web servers.
You can also generate the documentation in **epub**, **latex**, **json**, *&c* similarly.
+16 -9
@@ -22,7 +22,10 @@ Release v\ |version|. (:ref:`Installation <install>`)
.. * :ref:`search`
Tablib is an :ref:`MIT Licensed <mit>` format-agnostic tabular dataset library, written in Python. It allows you to import, export, and manipulate tabular data sets. Advanced features include, segregation, dynamic columns, tags & filtering, and seamless format import & export.
Tablib is an `MIT Licensed <https://mit-license.org/>`_ format-agnostic tabular dataset library, written in Python.
It allows you to import, export, and manipulate tabular data sets.
Advanced features include segregation, dynamic columns, tags & filtering,
and seamless format import & export.
::
@@ -31,17 +34,17 @@ Tablib is an :ref:`MIT Licensed <mit>` format-agnostic tabular dataset library,
... data.append(i)
>>> print(data.json)
>>> print(data.export('json'))
[{"Last Name": "Reitz", "First Name": "Kenneth", "Age": 22}, {"Last Name": "Monke", "First Name": "Bessie", "Age": 21}]
>>> print(data.yaml)
>>> print(data.export('yaml'))
- {Age: 22, First Name: Kenneth, Last Name: Reitz}
- {Age: 21, First Name: Bessie, Last Name: Monke}
>>> data.xlsx
<censored binary data>
>>> data.export('xlsx')
<redacted binary data>
>>> data.df
>>> data.export('df')
First Name Last Name Age
0 Kenneth Reitz 22
1 Bessie Monke 21
@@ -59,16 +62,20 @@ and `The Sunlight Foundation <http://sunlightfoundation.com/>`_ use Tablib inter
**Greg Thorton**
Tablib by @kennethreitz saved my life. I had to consolidate like 5 huge poorly maintained lists of domains and data. It was a breeze!
Tablib by @kennethreitz saved my life.
I had to consolidate like 5 huge poorly maintained lists of domains and data.
It was a breeze!
**Dave Coutts**
It's turning into one of my most used modules of 2010. You really hit a sweet spot for managing tabular data with a minimal amount of code and effort.
It's turning into one of my most used modules of 2010.
You really hit a sweet spot for managing tabular data with a minimal amount of code and effort.
**Joshua Ourisman**
Tablib has made it so much easier to deal with the inevitable 'I want an Excel file!' requests from clients...
**Brad Montgomery**
I think you nailed the "Python Zen" with tablib. Thanks again for an awesome lib!
I think you nailed the "Python Zen" with tablib.
Thanks again for an awesome lib!
User's Guide
+15 -15
@@ -1,4 +1,5 @@
.. _install:
Installation
============
@@ -14,22 +15,30 @@ Installing Tablib
Distribute & Pip
----------------
Of course, the recommended way to install Tablib is with `pip <http://www.pip-installer.org/>`_::
Of course, the recommended way to install Tablib is with `pip <http://www.pip-installer.org/>`_:
$ pip install tablib
.. code-block:: console
$ pip install tablib[pandas]
-------------------
Download the Source
-------------------
You can also install tablib from source. The latest release (|version|) is available from GitHub.
You can also install tablib from source.
The latest release (|version|) is available from GitHub.
* tarball_
* zipball_
.. _
Once you have a copy of the source, you can embed it in your Python package, or install it into your site-packages easily. ::
Once you have a copy of the source,
you can embed it in your Python package,
or install it into your site-packages easily.
.. code-block:: console
$ python setup.py install
@@ -40,23 +49,14 @@ To download the full source history from Git, see :ref:`Source Control <scm>`.
.. _zipball: http://github.com/kennethreitz/tablib/zipball/master
.. _speed-extensions:
Speed Extensions
----------------
You can gain some speed improvement by optionally installing the ujson_ library.
Tablib will fallback to the standard `json` module if it doesn't find ``ujson``.
.. _ujson: https://pypi.python.org/pypi/ujson
.. _updates:
Staying Updated
---------------
The latest version of Tablib will always be available here:
* PyPi: http://pypi.python.org/pypi/tablib/
* PyPI: http://pypi.python.org/pypi/tablib/
* GitHub: http://github.com/kennethreitz/tablib/
When a new version is available, upgrading is simple::
+3 -7
@@ -6,16 +6,15 @@ Introduction
This part of the documentation covers all the interfaces of Tablib.
Tablib is a format-agnostic tabular dataset library, written in Python.
It allows you to Pythonically import, export, and manipulate tabular data sets.
Advanced features include, segregation, dynamic columns, tags / filtering, and
Advanced features include segregation, dynamic columns, tags / filtering, and
seamless format import/export.
Philosophy
---------
----------
Tablib was developed with a few :pep:`20` idioms in mind.
#. Beautiful is better than ugly.
#. Explicit is better than implicit.
#. Simple is better than complex.
@@ -49,7 +48,7 @@ Tablib is released under terms of `The MIT License`_.
Tablib License
--------------
Copyright 2016 Kenneth Reitz
Copyright 2017 Kenneth Reitz
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -77,14 +76,11 @@ Pythons Supported
At this time, the following Python platforms are officially supported:
* cPython 2.6
* cPython 2.7
* cPython 3.3
* cPython 3.4
* cPython 3.5
* cPython 3.6
* PyPy-c 1.4
* PyPy-c 1.5
Support for other Pythons will be rolled out soon.
+61 -33
@@ -8,7 +8,10 @@ Quickstart
.. module:: tablib
Eager to get started? This page gives a good introduction in how to get started with Tablib. This assumes you already have Tablib installed. If you do not, head over to the :ref:`Installation <install>` section.
Eager to get started?
This page gives a good introduction in how to get started with Tablib.
This assumes you already have Tablib installed.
If you do not, head over to the :ref:`Installation <install>` section.
First, make sure that:
@@ -16,7 +19,7 @@ First, make sure that:
* Tablib is :ref:`up-to-date <updates>`
Lets gets started with some simple use cases and examples.
Let's get started with some simple use cases and examples.
@@ -35,8 +38,8 @@ You can now start filling this :class:`Dataset <tablib.Dataset>` object with dat
.. admonition:: Example Context
From here on out, if you see ``data``, assume that it's a fresh :class:`Dataset <tablib.Dataset>` object.
From here on out, if you see ``data``, assume that it's a fresh
:class:`Dataset <tablib.Dataset>` object.
@@ -57,7 +60,7 @@ Let's say you want to collect a simple list of names. ::
# add names to Dataset
data.append([fname, lname])
You can get a nice, Pythonic view of the dataset at any time with :class:`Dataset.dict`.
You can get a nice, Pythonic view of the dataset at any time with :class:`Dataset.dict`::
>>> data.dict
[('Kenneth', 'Reitz'), ('Bessie', 'Monke')]
@@ -69,14 +72,16 @@ Adding Headers
--------------
It's time to enhance our :class:`Dataset` by giving our columns some titles. To do so, set :class:`Dataset.headers`. ::
It's time to enhance our :class:`Dataset` by giving our columns some titles.
To do so, set :class:`Dataset.headers`. ::
data.headers = ['First Name', 'Last Name']
Now our data looks a little different. ::
>>> data.dict
[{'Last Name': 'Reitz', 'First Name': 'Kenneth'}, {'Last Name': 'Monke', 'First Name': 'Bessie'}]
[{'Last Name': 'Reitz', 'First Name': 'Kenneth'},
{'Last Name': 'Monke', 'First Name': 'Bessie'}]
@@ -93,7 +98,8 @@ Now that we have a basic :class:`Dataset` in place, let's add a column of **ages
Let's view the data now. ::
>>> data.dict
[{'Last Name': 'Reitz', 'First Name': 'Kenneth', 'Age': 22}, {'Last Name': 'Monke', 'First Name': 'Bessie', 'Age': 20}]
[{'Last Name': 'Reitz', 'First Name': 'Kenneth', 'Age': 22},
{'Last Name': 'Monke', 'First Name': 'Bessie', 'Age': 20}]
It's that easy.
@@ -115,28 +121,36 @@ Tablib's killer feature is the ability to export your :class:`Dataset` objects i
**Comma-Separated Values** ::
>>> data.csv
>>> data.export('csv')
Last Name,First Name,Age
Reitz,Kenneth,22
Monke,Bessie,20
**JavaScript Object Notation** ::
>>> data.json
>>> data.export('json')
[{"Last Name": "Reitz", "First Name": "Kenneth", "Age": 22}, {"Last Name": "Monke", "First Name": "Bessie", "Age": 20}]
**YAML Ain't Markup Language** ::
>>> data.yaml
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Last Name: Monke}
**Microsoft Excel** ::
>>> data.xls
<censored binary data>
>>> data.export('xls')
<redacted binary data>
**Pandas DataFrame** ::
>>> data.export('df')
First Name Last Name Age
0 Kenneth Reitz 22
1 Bessie Monke 20
------------------------
@@ -150,7 +164,8 @@ You can slice and dice your data, just like a standard Python list. ::
('Kenneth', 'Reitz', 22)
If we had a set of data consisting of thousands of rows, it could be useful to get a list of values in a column.
If we had a set of data consisting of thousands of rows,
it could be useful to get a list of values in a column.
To do so, we access the :class:`Dataset` as if it were a standard Python dictionary. ::
>>> data['First Name']
@@ -175,11 +190,11 @@ Let's find the average age. ::
Removing Rows & Columns
-----------------------
It's easier than you could imagine::
It's easier than you could imagine. Delete a column::
>>> del data['Col Name']
::
Delete a range of rows::
>>> del data[0:12]
@@ -188,7 +203,6 @@ It's easier than you could imagine::
Advanced Usage
==============
This part of the documentation services to give you an idea that are otherwise hard to extract from the :ref:`API Documentation <api>`
And now for something completely different.
@@ -202,9 +216,11 @@ Dynamic Columns
.. versionadded:: 0.8.3
Thanks to Josh Ourisman, Tablib now supports adding dynamic columns. A dynamic column is a single callable object (*ie.* a function).
Thanks to Josh Ourisman, Tablib now supports adding dynamic columns.
A dynamic column is a single callable object (*e.g.* a function).
Let's add a dynamic column to our :class:`Dataset` object. In this example, we have a function that generates a random grade for our students. ::
Let's add a dynamic column to our :class:`Dataset` object.
In this example, we have a function that generates a random grade for our students. ::
import random
@@ -216,7 +232,7 @@ Let's add a dynamic column to our :class:`Dataset` object. In this example, we h
Let's have a look at our data. ::
>>> data.yaml
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Grade: 0.6, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Grade: 0.75, Last Name: Monke}
@@ -226,7 +242,8 @@ Let's remove that column. ::
>>> del data['Grade']
When you add a dynamic column, the first argument that is passed in to the given callable is the current data row. You can use this to perform calculations against your data row.
When you add a dynamic column, the first argument that is passed in to the given callable is the current data row.
You can use this to perform calculations against your data row.
For example, we can use the data available in the row to guess the gender of a student. ::
@@ -246,7 +263,7 @@ For example, we can use the data available in the row to guess the gender of a s
Adding this function to our dataset as a dynamic column would result in: ::
>>> data.yaml
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Gender: Male, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Gender: Female, Last Name: Monke}
@@ -260,9 +277,11 @@ Filtering Datasets with Tags
.. versionadded:: 0.9.0
When constructing a :class:`Dataset` object, you can add tags to rows by specifying the ``tags`` parameter.
This allows you to filter your :class:`Dataset` later. This can be useful to separate rows of data based on
arbitrary criteria (*e.g.* origin) that you don't want to include in your :class:`Dataset`.
When constructing a :class:`Dataset` object,
you can add tags to rows by specifying the ``tags`` parameter.
This allows you to filter your :class:`Dataset` later.
This can be useful to separate rows of data based on arbitrary criteria
(*e.g.* origin) that you don't want to include in your :class:`Dataset`.
Let's tag some students. ::
@@ -281,14 +300,24 @@ Now that we have extra meta-data on our rows, we can easily filter our :class:`D
It's that simple. The original :class:`Dataset` is untouched.
Open an Excel Workbook and read first sheet
-------------------------------------------
To open a workbook saved by Excel 2007 or later that has a single sheet (or a multi-sheet workbook when you only want the first sheet), use the following::
data = tablib.Dataset()
data.xlsx = open('my_excel_file.xlsx', 'rb').read()
print(data)
Excel Workbook With Multiple Sheets
------------------------------------
When dealing with a large number of :class:`Datasets <Dataset>` in spreadsheet format, it's quite common to group multiple spreadsheets into a single Excel file, known as a Workbook. Tablib makes it extremely easy to build workbooks with the handy, :class:`Databook` class.
When dealing with a large number of :class:`Datasets <Dataset>` in spreadsheet format,
it's quite common to group multiple spreadsheets into a single Excel file, known as a Workbook.
Tablib makes it extremely easy to build workbooks with the handy :class:`Databook` class.
Let's say we have 3 different :class:`Datasets <Dataset>`. All we have to do is add then to a :class:`Databook` object... ::
Let's say we have 3 different :class:`Datasets <Dataset>`.
All we have to do is add them to a :class:`Databook` object... ::
book = tablib.Databook((data1, data2, data3))
@@ -297,7 +326,7 @@ Let's say we have 3 different :class:`Datasets <Dataset>`. All we have to do is
with open('students.xls', 'wb') as f:
f.write(book.xls)
The resulting **students.xls** file will contain a separate spreadsheet for each :class:`Dataset` object in the :class:`Databook`.
The resulting ``students.xls`` file will contain a separate spreadsheet for each :class:`Dataset` object in the :class:`Databook`.
.. admonition:: Binary Warning
@@ -312,9 +341,8 @@ Separators
.. versionadded:: 0.8.2
When, it's often useful to create a blank row containing information on the upcoming data. So,
When constructing a spreadsheet,
it's often useful to create a blank row containing information on the upcoming data. So,
::
@@ -346,7 +374,7 @@ When, it's often useful to create a blank row containing information on the upco
# Write spreadsheet to disk
with open('grades.xls', 'wb') as f:
f.write(tests.xls)
f.write(tests.export('xls'))
The resulting ``grades.xls`` will have the following layout:
+1
@@ -1,3 +1,4 @@
backports.csv==1.0.6
certifi==2017.7.27.1
chardet==3.0.4
et-xmlfile==1.0.1
+5 -13
@@ -14,15 +14,6 @@ if sys.argv[-1] == 'publish':
os.system("python setup.py sdist upload")
sys.exit()
if sys.argv[-1] == 'speedups':
try:
__import__('pip')
except ImportError:
print('Pip required.')
sys.exit(1)
os.system('pip install ujson')
sys.exit()
if sys.argv[-1] == 'test':
try:
@@ -43,12 +34,11 @@ packages = [
install = [
'odfpy',
'openpyxl',
'unicodecsv',
'openpyxl>=2.4.0',
'backports.csv',
'xlrd',
'xlwt',
'pyyaml',
'pandas'
]
@@ -74,11 +64,13 @@ setup(
'License :: OSI Approved :: MIT License',
'Programming Language :: Python',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
tests_require=['pytest'],
install_requires=install,
extras_require={
'pandas': ['pandas'],
},
)
-1
@@ -5,4 +5,3 @@ from tablib.core import (
InvalidDatasetType, InvalidDimensions, UnsupportedFormat,
__version__
)
+8 -18
@@ -13,35 +13,25 @@ import sys
is_py3 = (sys.version_info[0] > 2)
try:
from collections import OrderedDict
except ImportError:
from tablib.packages.ordereddict import OrderedDict
if is_py3:
from io import BytesIO
from io import StringIO
from tablib.packages import markup3 as markup
from statistics import median
from itertools import zip_longest as izip_longest
import csv
import tablib.packages.dbfpy3 as dbfpy
import csv
from io import StringIO
# py3 mappings
ifilter = filter
unicode = str
bytes = bytes
basestring = str
xrange = range
else:
from cStringIO import StringIO as BytesIO
from cStringIO import StringIO
from StringIO import StringIO
from tablib.packages import markup
from itertools import ifilter
import unicodecsv as csv
from tablib.packages.statistics import median
from itertools import izip_longest
from backports import csv
import tablib.packages.dbfpy as dbfpy
unicode = unicode
+29 -10
@@ -9,17 +9,18 @@
:license: MIT, see LICENSE for more details.
"""
from collections import OrderedDict
from copy import copy
from operator import itemgetter
from tablib import formats
from tablib.compat import OrderedDict, unicode
from tablib.compat import unicode
__title__ = 'tablib'
__version__ = '0.12.0'
__build__ = 0x001200
__version__ = '0.13.0'
__build__ = 0x001201
__author__ = 'Kenneth Reitz'
__license__ = 'MIT'
__copyright__ = 'Copyright 2017 Kenneth Reitz'
@@ -178,7 +179,7 @@ class Dataset(object):
def __getitem__(self, key):
if isinstance(key, str) or isinstance(key, unicode):
if isinstance(key, (str, unicode)):
if key in self.headers:
pos = self.headers.index(key) # get 'key' index from each data
return [row[pos] for row in self._data]
@@ -197,7 +198,7 @@ class Dataset(object):
def __delitem__(self, key):
if isinstance(key, str) or isinstance(key, unicode):
if isinstance(key, (str, unicode)):
if key in self.headers:
@@ -526,9 +527,9 @@ class Dataset(object):
Import assumes (for now) that headers exist.
.. admonition:: Binary Warning
.. admonition:: Binary Warning for Python 2
:class:`Dataset.csv` uses \\r\\n line endings by default, so make
:class:`Dataset.csv` uses \\r\\n line endings by default so, in Python 2, make
sure to write in binary mode::
with open('output.csv', 'wb') as f:
@@ -536,6 +537,18 @@ class Dataset(object):
If you do not do this, and you export the file on Windows, your
CSV file will open in Excel with a blank line between each row.
.. admonition:: Line endings for Python 3
:class:`Dataset.csv` uses \\r\\n line endings by default so, in Python 3, make
sure to open the file with ``newline=''``::
with open('output.csv', 'w', newline='') as f:
f.write(data.csv)
If you do not do this, and you export the file on Windows, your
CSV file will open in Excel with a blank line between each row.
"""
pass
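The Python 3 advice in the docstring above can be checked with the standard library alone. The following is a minimal sketch, not Tablib code: it assumes only that the CSV text uses ``\r\n`` row terminators (the ``csv`` module's default), and the temporary ``output.csv`` path is chosen purely for illustration.

```python
import csv
import io
import os
import tempfile

# Build CSV text the way the csv module does by default: rows end in '\r\n'.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['first_name', 'last_name'])
writer.writerow(['John', 'Adams'])
csv_text = buffer.getvalue()

# Illustrative location only; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'output.csv')

# newline='' turns off newline translation, so '\r\n' reaches the file unchanged
# instead of being expanded to '\r\r\n' on Windows.
with open(path, 'w', newline='') as f:
    f.write(csv_text)

# Read the bytes back to see exactly what was written.
with open(path, 'rb') as f:
    raw = f.read()
```

Opening the file in text mode without ``newline=''`` on Windows is what produces the extra blank row in Excel, because each ``\n`` is translated a second time.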
@@ -631,7 +644,6 @@ class Dataset(object):
"""
pass
@property
def latex():
"""A LaTeX booktabs representation of the :class:`Dataset` object. If a
@@ -641,6 +653,13 @@ class Dataset(object):
"""
pass
@property
def jira():
"""A Jira table representation of the :class:`Dataset` object.
.. note:: This method can be used for export only.
"""
pass
# ----
# Rows
@@ -843,7 +862,7 @@ class Dataset(object):
against each cell value.
"""
if isinstance(col, str):
if isinstance(col, unicode):
if col in self.headers:
col = self.headers.index(col) # get 'key' index from each data
else:
@@ -876,7 +895,7 @@ class Dataset(object):
sorted.
"""
if isinstance(col, str) or isinstance(col, unicode):
if isinstance(col, (str, unicode)):
if not self.headers:
raise HeadersNeeded
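The ``isinstance`` changes in this file rely on the fact that ``isinstance`` accepts a tuple of types. A small sketch of the equivalence follows; the helper names are hypothetical, and ``str`` and ``bytes`` stand in for the ``str``/``unicode`` pair used in the diff so the example runs on Python 3.

```python
# Stand-ins for (str, unicode); chosen only so the sketch runs on Python 3.
text_types = (str, bytes)


def is_text_key(key):
    """Single call with a tuple of types, as in the updated code."""
    return isinstance(key, text_types)


def is_text_key_chained(key):
    """Equivalent chained form, as in the code being replaced."""
    return isinstance(key, str) or isinstance(key, bytes)


samples = ['First Name', b'Age', 0, slice(0, 2)]
results = [(is_text_key(s), is_text_key_chained(s)) for s in samples]
```

Both forms agree on every sample; the tuple form simply states the check once.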
+3 -1
@@ -14,5 +14,7 @@ from . import _ods as ods
from . import _dbf as dbf
from . import _latex as latex
from . import _df as df
from . import _rst as rst
from . import _jira as jira
available = (json, xls, yaml, csv, dbf, tsv, html, latex, xlsx, ods, df)
available = (json, xls, yaml, csv, dbf, tsv, html, jira, latex, xlsx, ods, df, rst)
+3 -8
@@ -3,15 +3,14 @@
""" Tablib - *SV Support.
"""
from tablib.compat import is_py3, csv, StringIO
from tablib.compat import csv, StringIO, unicode
title = 'csv'
extensions = ('csv',)
DEFAULT_ENCODING = 'utf-8'
DEFAULT_DELIMITER = ','
DEFAULT_DELIMITER = unicode(',')
def export_set(dataset, **kwargs):
@@ -19,8 +18,6 @@ def export_set(dataset, **kwargs):
stream = StringIO()
kwargs.setdefault('delimiter', DEFAULT_DELIMITER)
if not is_py3:
kwargs.setdefault('encoding', DEFAULT_ENCODING)
_csv = csv.writer(stream, **kwargs)
@@ -36,15 +33,13 @@ def import_set(dset, in_stream, headers=True, **kwargs):
dset.wipe()
kwargs.setdefault('delimiter', DEFAULT_DELIMITER)
if not is_py3:
kwargs.setdefault('encoding', DEFAULT_ENCODING)
rows = csv.reader(StringIO(in_stream), **kwargs)
for i, row in enumerate(rows):
if (i == 0) and (headers):
dset.headers = row
else:
elif row:
dset.append(row)
-3
@@ -89,6 +89,3 @@ def detect(stream):
# When unpacking a string argument with less than 8 chars, struct.error is
# raised.
return False
+10 -1
@@ -10,7 +10,10 @@ if sys.version_info[0] > 2:
else:
from cStringIO import StringIO as BytesIO
from pandas import DataFrame
try:
from pandas import DataFrame
except ImportError:
DataFrame = None
import tablib
@@ -21,6 +24,8 @@ extensions = ('df', )
def detect(stream):
"""Returns True if given stream is a DataFrame."""
if DataFrame is None:
return False
try:
DataFrame(stream)
return True
@@ -30,6 +35,10 @@ def detect(stream):
def export_set(dset, index=None):
"""Returns DataFrame representation of Dataset."""
if DataFrame is None:
raise NotImplementedError(
'DataFrame Format requires `pandas` to be installed.'
' Try `pip install tablib[pandas]`.')
dataframe = DataFrame(dset.dict, columns=dset.headers)
return dataframe
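The optional-dependency pattern used in this file can be sketched on its own. The helper ``rows_to_dataframe`` below is hypothetical and exists only to illustrate the pattern; the code runs whether or not pandas is installed, and records which branch was taken.

```python
try:
    from pandas import DataFrame
except ImportError:
    # pandas is optional: remember that it is missing instead of failing at import.
    DataFrame = None


def rows_to_dataframe(rows, headers):
    """Hypothetical helper: build a DataFrame, or fail clearly without pandas."""
    if DataFrame is None:
        raise NotImplementedError(
            'DataFrame format requires `pandas` to be installed.')
    return DataFrame(rows, columns=headers)


try:
    frame = rows_to_dataframe([('Kenneth', 'Reitz', 22)],
                              ['First Name', 'Last Name', 'Age'])
    outcome = list(frame.columns)
except NotImplementedError as exc:
    outcome = str(exc)
```

Deferring the failure to call time keeps the module importable for users who never touch the DataFrame format.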
+39
@@ -0,0 +1,39 @@
# -*- coding: utf-8 -*-
"""Tablib - Jira table export support.
Generates a Jira table from the dataset.
"""
from tablib.compat import unicode
title = 'jira'
def export_set(dataset):
"""Formats the dataset according to the Jira table syntax:
||heading 1||heading 2||heading 3||
|col A1|col A2|col A3|
|col B1|col B2|col B3|
:param dataset: dataset to serialize
:type dataset: tablib.core.Dataset
"""
header = _get_header(dataset.headers) if dataset.headers else ''
body = _get_body(dataset)
return '%s\n%s' % (header, body) if header else body
def _get_body(dataset):
return '\n'.join([_serialize_row(row) for row in dataset])
def _get_header(headers):
return _serialize_row(headers, delimiter='||')
def _serialize_row(row, delimiter='|'):
return '%s%s%s' % (delimiter,
delimiter.join([unicode(item) if item else ' ' for item in row]),
delimiter)
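To see what the new Jira exporter produces, the same serialization logic can be run standalone. This is a sketch that mirrors the functions above using plain lists and ``str`` in place of ``tablib.compat.unicode``; ``to_jira_table`` is a hypothetical name, not the module's API.

```python
def serialize_row(row, delimiter='|'):
    # Falsy items (empty strings, None, and also 0) are rendered as a single space,
    # mirroring the `if item else ' '` test in the module above.
    cells = [str(item) if item else ' ' for item in row]
    return '%s%s%s' % (delimiter, delimiter.join(cells), delimiter)


def to_jira_table(headers, rows):
    """Hypothetical standalone equivalent of export_set for plain lists."""
    body = '\n'.join(serialize_row(row) for row in rows)
    if headers:
        return '%s\n%s' % (serialize_row(headers, delimiter='||'), body)
    return body


table = to_jira_table(['first', 'last', 'gpa'],
                      [('John', 'Adams', 90), ('George', '', 0)])
```

Note that a numeric ``0`` is rendered as a blank cell here, because the truthiness test does not distinguish it from an empty value.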
+6 -9
@@ -3,36 +3,33 @@
""" Tablib - JSON Support
"""
import decimal
import json
from uuid import UUID
import tablib
try:
import ujson as json
except ImportError:
import json
title = 'json'
extensions = ('json', 'jsn')
def date_handler(obj):
if isinstance(obj, decimal.Decimal):
def serialize_objects_handler(obj):
if isinstance(obj, (decimal.Decimal, UUID)):
return str(obj)
elif hasattr(obj, 'isoformat'):
return obj.isoformat()
else:
return obj
# return obj.isoformat() if hasattr(obj, 'isoformat') else obj
def export_set(dataset):
"""Returns JSON representation of Dataset."""
return json.dumps(dataset.dict, default=date_handler)
return json.dumps(dataset.dict, default=serialize_objects_handler)
def export_book(databook):
"""Returns JSON representation of Databook."""
return json.dumps(databook._package(), default=date_handler)
return json.dumps(databook._package(), default=serialize_objects_handler)
def import_set(dset, in_stream):
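The renamed handler is passed to ``json.dumps`` through its ``default`` parameter, which is called for any object the encoder cannot serialize natively. A minimal sketch with the handler copied from the diff and illustrative sample values (the UUID and date below are made up for the example):

```python
import datetime
import decimal
import json
from uuid import UUID


def serialize_objects_handler(obj):
    # Decimal and UUID have no JSON equivalent, so fall back to their string form.
    if isinstance(obj, (decimal.Decimal, UUID)):
        return str(obj)
    # Dates, times and datetimes all expose isoformat().
    elif hasattr(obj, 'isoformat'):
        return obj.isoformat()
    else:
        return obj


row = {
    'gpa': decimal.Decimal('3.5'),
    'address_id': UUID('12345678-1234-5678-1234-567812345678'),
    'born': datetime.date(1735, 10, 30),
}
encoded = json.dumps(row, default=serialize_objects_handler)
decoded = json.loads(encoded)
```

All three values come back as strings after a round trip, so the original types are not recoverable from the JSON alone.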
+273
@@ -0,0 +1,273 @@
# -*- coding: utf-8 -*-
""" Tablib - reStructuredText Support
"""
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from textwrap import TextWrapper
from tablib.compat import (
median,
unicode,
izip_longest,
)
title = 'rst'
extensions = ('rst',)
MAX_TABLE_WIDTH = 80 # Roughly. It may be wider to avoid breaking words.
JUSTIFY_LEFT = 'left'
JUSTIFY_CENTER = 'center'
JUSTIFY_RIGHT = 'right'
JUSTIFY_VALUES = (JUSTIFY_LEFT, JUSTIFY_CENTER, JUSTIFY_RIGHT)
def to_unicode(value):
if isinstance(value, bytes):
return value.decode('utf-8')
return unicode(value)
def _max_word_len(text):
"""
Return the length of the longest word in `text`.
>>> _max_word_len('Python Module for Tabular Datasets')
8
"""
return max((len(word) for word in text.split()))
def _get_column_string_lengths(dataset):
"""
Returns a list of string lengths of each column, and a list of
maximum word lengths.
"""
if dataset.headers:
column_lengths = [[len(h)] for h in dataset.headers]
word_lens = [_max_word_len(h) for h in dataset.headers]
else:
column_lengths = [[] for _ in range(dataset.width)]
word_lens = [0 for _ in range(dataset.width)]
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
for i, val in enumerate(values):
text = to_unicode(val)
column_lengths[i].append(len(text))
word_lens[i] = max(word_lens[i], _max_word_len(text))
return column_lengths, word_lens
def _row_to_lines(values, widths, wrapper, sep='|', justify=JUSTIFY_LEFT):
"""
Returns a table row of wrapped values as a list of lines
"""
if justify not in JUSTIFY_VALUES:
raise ValueError('Value of "justify" must be one of "{}"'.format(
'", "'.join(JUSTIFY_VALUES)
))
if justify == JUSTIFY_LEFT:
just = lambda text, width: text.ljust(width)
elif justify == JUSTIFY_CENTER:
just = lambda text, width: text.center(width)
else:
just = lambda text, width: text.rjust(width)
lpad = sep + ' ' if sep else ''
rpad = ' ' + sep if sep else ''
pad = ' ' + sep + ' '
cells = []
for value, width in zip(values, widths):
wrapper.width = width
text = to_unicode(value)
cell = wrapper.wrap(text)
cells.append(cell)
lines = izip_longest(*cells, fillvalue='')
lines = (
(just(cell_line, widths[i]) for i, cell_line in enumerate(line))
for line in lines
)
lines = [''.join((lpad, pad.join(line), rpad)) for line in lines]
return lines
def _get_column_widths(dataset, max_table_width=MAX_TABLE_WIDTH, pad_len=3):
"""
Returns a list of column widths proportional to the median length
of the text in their cells.
"""
str_lens, word_lens = _get_column_string_lengths(dataset)
median_lens = [int(median(lens)) for lens in str_lens]
total = sum(median_lens)
if total > max_table_width - (pad_len * len(median_lens)):
column_widths = (max_table_width * l // total for l in median_lens)
else:
column_widths = (l for l in median_lens)
# Allow for separator and padding:
column_widths = (w - pad_len if w > pad_len else w for w in column_widths)
# Rather widen table than break words:
column_widths = [max(w, l) for w, l in zip(column_widths, word_lens)]
return column_widths
def export_set_as_simple_table(dataset, column_widths=None):
"""
Returns reStructuredText simple table representation of dataset.
"""
lines = []
wrapper = TextWrapper()
if column_widths is None:
column_widths = _get_column_widths(dataset, pad_len=2)
border = ' '.join(['=' * w for w in column_widths])
lines.append(border)
if dataset.headers:
lines.extend(_row_to_lines(
dataset.headers,
column_widths,
wrapper,
sep='',
justify=JUSTIFY_CENTER,
))
lines.append(border)
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
lines.extend(_row_to_lines(values, column_widths, wrapper, ''))
lines.append(border)
return '\n'.join(lines)
def export_set_as_grid_table(dataset, column_widths=None):
"""
Returns reStructuredText grid table representation of dataset.
>>> from tablib import Dataset
>>> from tablib.formats import rst
>>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
>>> data = Dataset()
>>> data.headers = ['A', 'B', 'A and B']
>>> for a, b in bits:
... data.append([bool(a), bool(b), bool(a * b)])
>>> print(rst.export_set(data, force_grid=True))
+-------+-------+-------+
| A | B | A and |
| | | B |
+=======+=======+=======+
| False | False | False |
+-------+-------+-------+
| True | False | False |
+-------+-------+-------+
| False | True | False |
+-------+-------+-------+
| True | True | True |
+-------+-------+-------+
"""
lines = []
wrapper = TextWrapper()
if column_widths is None:
column_widths = _get_column_widths(dataset)
header_sep = '+=' + '=+='.join(['=' * w for w in column_widths]) + '=+'
row_sep = '+-' + '-+-'.join(['-' * w for w in column_widths]) + '-+'
lines.append(row_sep)
if dataset.headers:
lines.extend(_row_to_lines(
dataset.headers,
column_widths,
wrapper,
justify=JUSTIFY_CENTER,
))
lines.append(header_sep)
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
lines.extend(_row_to_lines(values, column_widths, wrapper))
lines.append(row_sep)
return '\n'.join(lines)
def _use_simple_table(head0, col0, width0):
"""
Use a simple table if the text in the first column is never wrapped
>>> _use_simple_table('menu', ['egg', 'bacon'], 10)
True
>>> _use_simple_table(None, ['lobster thermidor', 'spam'], 10)
False
"""
if head0 is not None:
head0 = to_unicode(head0)
if len(head0) > width0:
return False
for cell in col0:
cell = to_unicode(cell)
if len(cell) > width0:
return False
return True
def export_set(dataset, **kwargs):
"""
Returns reStructuredText table representation of dataset.
Returns a simple table if the text in the first column is never
wrapped, otherwise returns a grid table.
>>> from tablib import Dataset
>>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
>>> data = Dataset()
>>> data.headers = ['A', 'B', 'A and B']
>>> for a, b in bits:
... data.append([bool(a), bool(b), bool(a * b)])
>>> table = data.rst
>>> table.split('\\n') == [
... '===== ===== =====',
... ' A B A and',
... ' B ',
... '===== ===== =====',
... 'False False False',
... 'True False False',
... 'False True False',
... 'True True True ',
... '===== ===== =====',
... ]
True
"""
if not dataset.dict:
return ''
force_grid = kwargs.get('force_grid', False)
max_table_width = kwargs.get('max_table_width', MAX_TABLE_WIDTH)
column_widths = _get_column_widths(dataset, max_table_width)
use_simple_table = _use_simple_table(
dataset.headers[0] if dataset.headers else None,
dataset.get_col(0),
column_widths[0],
)
if use_simple_table and not force_grid:
return export_set_as_simple_table(dataset, column_widths)
else:
return export_set_as_grid_table(dataset, column_widths)
def export_book(databook):
"""
reStructuredText representation of a Databook.
Tables are separated by a blank line. All tables use the grid
format.
"""
return '\n\n'.join(export_set(dataset, force_grid=True)
for dataset in databook._datasets)
+2 -2
@@ -3,6 +3,7 @@
""" Tablib - TSV (Tab Separated Values) Support.
"""
from tablib.compat import unicode
from tablib.formats._csv import (
export_set as export_set_wrapper,
import_set as import_set_wrapper,
@@ -12,8 +13,7 @@ from tablib.formats._csv import (
title = 'tsv'
extensions = ('tsv',)
DEFAULT_ENCODING = 'utf-8'
DELIMITER = '\t'
DELIMITER = unicode('\t')
def export_set(dataset):
"""Returns TSV representation of Dataset."""
+1 -1
@@ -25,7 +25,7 @@ def detect(stream):
xlrd.open_workbook(file_contents=stream)
return True
except (TypeError, XLRDError):
pass
pass
try:
xlrd.open_workbook(file_contents=stream.read())
return True
+4 -6
@@ -52,7 +52,7 @@ def export_book(databook, freeze_panes=True):
wb = Workbook()
for sheet in wb.worksheets:
wb.remove_sheet(sheet)
wb.remove(sheet)
for i, dset in enumerate(databook._datasets):
ws = wb.create_sheet()
ws.title = dset.title if dset.title else 'Sheet%s' % (i)
@@ -71,7 +71,7 @@ def import_set(dset, in_stream, headers=True):
dset.wipe()
xls_book = openpyxl.reader.excel.load_workbook(BytesIO(in_stream))
sheet = xls_book.get_active_sheet()
sheet = xls_book.active
dset.title = sheet.title
@@ -119,7 +119,7 @@ def dset_sheet(dataset, ws, freeze_panes=True):
row_number = i + 1
for j, col in enumerate(row):
col_idx = get_column_letter(j + 1)
cell = ws.cell('%s%s' % (col_idx, row_number))
cell = ws['%s%s' % (col_idx, row_number)]
# bold headers
if (row_number == 1) and dataset.headers:
@@ -129,7 +129,7 @@ def dset_sheet(dataset, ws, freeze_panes=True):
if freeze_panes:
# Export Freeze only after first Line
ws.freeze_panes = 'A2'
# bold separators
elif len(row) < dataset.width:
cell.value = unicode('%s' % col, errors='ignore')
@@ -145,5 +145,3 @@ def dset_sheet(dataset, ws, freeze_panes=True):
cell.value = unicode('%s' % col, errors='ignore')
except TypeError:
cell.value = unicode(col)
+1 -1
@@ -220,7 +220,7 @@ class DbfRecord(object):
def toString(self):
"""Return string packed record values."""
# for (_def, _dat) in zip(self.dbf.header.fields, self.fieldData):
#
#
return "".join([" *"[self.deleted]] + [
_def.encodeValue(_dat)
+18 -18
@@ -33,7 +33,7 @@ class element:
self.tag = tag.lower( )
else:
self.tag = tag.upper( )
def __call__( self, *args, **kwargs ):
if len( args ) > 1:
raise ArgumentError( self.tag )
@@ -42,14 +42,14 @@ class element:
if self.parent is not None and self.parent.class_ is not None:
if 'class_' not in kwargs:
kwargs['class_'] = self.parent.class_
if self.parent is None and len( args ) == 1:
x = [ self.render( self.tag, False, myarg, mydict ) for myarg, mydict in _argsdicts( args, kwargs ) ]
return '\n'.join( x )
elif self.parent is None and len( args ) == 0:
x = [ self.render( self.tag, True, myarg, mydict ) for myarg, mydict in _argsdicts( args, kwargs ) ]
return '\n'.join( x )
if self.tag in self.parent.twotags:
for myarg, mydict in _argsdicts( args, kwargs ):
self.render( self.tag, False, myarg, mydict )
@@ -63,7 +63,7 @@ class element:
raise DeprecationError( self.tag )
else:
raise InvalidElementError( self.tag, self.parent.mode )
def render( self, tag, single, between, kwargs ):
"""Append the actual tags to content."""
@@ -89,7 +89,7 @@ class element:
self.parent.content.append( out )
else:
return out
def close( self ):
"""Append a closing tag unless element has only opening tag."""
@@ -128,11 +128,11 @@ class page:
these two keyword arguments may be used to select
the set of valid elements in 'xml' mode
invalid elements will raise appropriate exceptions
separator -- string to place between added elements, defaults to newline
class_ -- a class that will be added to every element if defined"""
valid_onetags = [ "AREA", "BASE", "BR", "COL", "FRAME", "HR", "IMG", "INPUT", "LINK", "META", "PARAM" ]
valid_twotags = [ "A", "ABBR", "ACRONYM", "ADDRESS", "B", "BDO", "BIG", "BLOCKQUOTE", "BODY", "BUTTON",
"CAPTION", "CITE", "CODE", "COLGROUP", "DD", "DEL", "DFN", "DIV", "DL", "DT", "EM", "FIELDSET",
@@ -163,7 +163,7 @@ class page:
self.deptags += list(map( str.lower, self.deptags ))
self.mode = 'strict_html'
elif mode == 'loose_html':
self.onetags = valid_onetags + deprecated_onetags
self.onetags = valid_onetags + deprecated_onetags
self.onetags += list(map( str.lower, self.onetags ))
self.twotags = valid_twotags + deprecated_twotags
self.twotags += list(map( str.lower, self.twotags ))
@@ -187,12 +187,12 @@ class page:
return element( attr, case=self.case, parent=self )
def __str__( self ):
if self._full and ( self.mode == 'strict_html' or self.mode == 'loose_html' ):
end = [ '</body>', '</html>' ]
else:
end = [ ]
return self.separator.join( self.header + self.content + self.footer + end )
def __call__( self, escape=False ):
@@ -232,7 +232,7 @@ class page:
lang -- language, usually a two character string, will appear
as <html lang='en'> in html mode (ignored in xml mode)
css -- Cascading Style Sheet filename as a string or a list of
strings for multiple css files (ignored in xml mode)
@@ -306,7 +306,7 @@ class page:
def css( self, filelist ):
"""This convenience function is only useful for html.
It adds css stylesheet(s) to the document via the <link> element."""
if isinstance( filelist, str ):
self.link( href=filelist, rel='stylesheet', type='text/css', media='all' )
else:
@@ -339,10 +339,10 @@ class _oneliner:
"""An instance of oneliner returns a string corresponding to one element.
This class can be used to write 'oneliners' that return a string
immediately so there is no need to instantiate the page class."""
def __init__( self, case='lower' ):
self.case = case
def __getattr__( self, attr ):
if attr.startswith("__") and attr.endswith("__"):
raise AttributeError(attr)
@@ -353,9 +353,9 @@ upper_oneliner = _oneliner( case='upper' )
def _argsdicts( args, mydict ):
"""A utility generator that pads argument list and dictionary values, will only be called with len( args ) = 0, 1."""
if len( args ) == 0:
args = None,
args = None,
elif len( args ) == 1:
args = _totuple( args[0] )
else:
@@ -418,7 +418,7 @@ _escape = escape
def unescape( text ):
"""Inverse of escape."""
if isinstance( text, str ):
if '&amp;' in text:
text = text.replace( '&amp;', '&' )
-127
@@ -1,127 +0,0 @@
# Copyright (c) 2009 Raymond Hettinger
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
from UserDict import DictMixin
class OrderedDict(dict, DictMixin):
def __init__(self, *args, **kwds):
if len(args) > 1:
raise TypeError('expected at most 1 arguments, got %d' % len(args))
try:
self.__end
except AttributeError:
self.clear()
self.update(*args, **kwds)
def clear(self):
self.__end = end = []
end += [None, end, end] # sentinel node for doubly linked list
self.__map = {} # key --> [key, prev, next]
dict.clear(self)
def __setitem__(self, key, value):
if key not in self:
end = self.__end
curr = end[1]
curr[2] = end[1] = self.__map[key] = [key, curr, end]
dict.__setitem__(self, key, value)
def __delitem__(self, key):
dict.__delitem__(self, key)
key, prev, next = self.__map.pop(key)
prev[2] = next
next[1] = prev
def __iter__(self):
end = self.__end
curr = end[2]
while curr is not end:
yield curr[0]
curr = curr[2]
def __reversed__(self):
end = self.__end
curr = end[1]
while curr is not end:
yield curr[0]
curr = curr[1]
def popitem(self, last=True):
if not self:
raise KeyError('dictionary is empty')
if last:
key = next(reversed(self))
else:
key = next(iter(self))
value = self.pop(key)
return key, value
def __reduce__(self):
items = [[k, self[k]] for k in self]
tmp = self.__map, self.__end
del self.__map, self.__end
inst_dict = vars(self).copy()
self.__map, self.__end = tmp
if inst_dict:
return (self.__class__, (items,), inst_dict)
return self.__class__, (items,)
def keys(self):
return list(self)
setdefault = DictMixin.setdefault
update = DictMixin.update
pop = DictMixin.pop
values = DictMixin.values
items = DictMixin.items
iterkeys = DictMixin.iterkeys
itervalues = DictMixin.itervalues
iteritems = DictMixin.iteritems
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, list(self.items()))
def copy(self):
return self.__class__(self)
@classmethod
def fromkeys(cls, iterable, value=None):
d = cls()
for key in iterable:
d[key] = value
return d
def __eq__(self, other):
if isinstance(other, OrderedDict):
if len(self) != len(other):
return False
for p, q in zip(list(self.items()), list(other.items())):
if p != q:
return False
return True
return dict.__eq__(self, other)
def __ne__(self, other):
return not self == other
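Review note: the backport above keeps insertion order in a circular doubly linked list anchored at a sentinel node, with a side map from key to node for O(1) deletion. The sketch below isolates just that bookkeeping for Python 3. `LinkedOrder` and its method names are hypothetical and are not part of tablib; it is an illustration of the technique under those assumptions, not the library's code.

```python
class LinkedOrder:
    """Track key insertion order with a circular doubly linked list.

    Hypothetical illustration. Each node is a list [key, prev, next],
    the same layout the backport above uses.
    """

    def __init__(self):
        self._end = end = []
        end += [None, end, end]  # sentinel: prev and next point at itself
        self._map = {}           # key -> [key, prev, next]

    def add(self, key):
        if key in self._map:
            return               # an existing key keeps its position
        end = self._end
        last = end[1]
        # Link the new node between the current last node and the sentinel.
        last[2] = end[1] = self._map[key] = [key, last, end]

    def remove(self, key):
        # Unlink the node by joining its neighbours; raises KeyError if absent.
        _, prev, nxt = self._map.pop(key)
        prev[2] = nxt
        nxt[1] = prev

    def __iter__(self):
        end = self._end
        curr = end[2]
        while curr is not end:
            yield curr[0]
            curr = curr[2]

    def __reversed__(self):
        end = self._end
        curr = end[1]
        while curr is not end:
            yield curr[0]
            curr = curr[1]


order = LinkedOrder()
for k in "bca":
    order.add(k)
order.remove("c")
print(list(order), list(reversed(order)))
```

Because the sentinel is its own neighbour when the list is empty, `add` and `remove` need no special case for the first or last element.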
+24
@@ -0,0 +1,24 @@
from __future__ import division
def median(data):
"""
Return the median (middle value) of numeric data, using the common
"mean of middle two" method. If data is empty, ValueError is raised.
Mimics the behaviour of Python3's statistics.median
>>> median([1, 3, 5])
3
>>> median([1, 3, 5, 7])
4.0
"""
data = sorted(data)
n = len(data)
if not n:
raise ValueError("No median for empty data")
i = n // 2
if n % 2:
return data[i]
return (data[i - 1] + data[i]) / 2
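Review note: the docstring says the backport mimics Python 3's `statistics.median`. A quick way to check that claim on Python 3 is to run both side by side. The function body below is copied from the diff; the helper `agrees_with_stdlib` is a hypothetical name added only for this check.

```python
import statistics


def median(data):
    """Backported median, as added in the diff above."""
    data = sorted(data)
    n = len(data)
    if not n:
        raise ValueError("No median for empty data")
    i = n // 2
    if n % 2:
        return data[i]                     # odd length: the middle value
    return (data[i - 1] + data[i]) / 2     # even length: mean of the middle two


def agrees_with_stdlib(sample):
    """Hypothetical helper: compare the backport with statistics.median."""
    return median(sample) == statistics.median(sample)


print(agrees_with_stdlib([1, 3, 5]), agrees_with_stdlib([1, 3, 5, 7]))
```

On Python 3, `statistics.median([])` raises `statistics.StatisticsError`, which subclasses `ValueError`, so callers that catch `ValueError` behave the same with either implementation.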
+72 -4
@@ -1,16 +1,19 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Tests for Tablib."""
-import json
-import unittest
-import sys
+from __future__ import unicode_literals
import datetime
+import doctest
+import json
+import sys
+import unittest
+from uuid import uuid4
import tablib
from tablib.compat import markup, unicode, is_py3
from tablib.core import Row
from tablib.formats import csv as csv_format
class TablibTestCase(unittest.TestCase):
@@ -226,6 +229,22 @@ class TablibTestCase(unittest.TestCase):
# Delete from invalid index
self.assertRaises(IndexError, self.founders.__delitem__, 3)
def test_json_export(self):
"""Verify exporting dataset object as JSON"""
address_id = uuid4()
headers = self.headers + ('address_id',)
founders = tablib.Dataset(headers=headers, title='Founders')
founders.append(('John', 'Adams', 90, address_id))
founders_json = founders.export('json')
expected_json = (
'[{"first_name": "John", "last_name": "Adams", "gpa": 90, '
'"address_id": "%s"}]' % str(address_id)
)
self.assertEqual(founders_json, expected_json)
def test_csv_export(self):
"""Verify exporting dataset object as CSV."""
@@ -298,6 +317,23 @@ class TablibTestCase(unittest.TestCase):
self.assertEqual(html, d.html)
def test_jira_export(self):
expected = """||first_name||last_name||gpa||
|John|Adams|90|
|George|Washington|67|
|Thomas|Jefferson|50|"""
self.assertEqual(expected, self.founders.jira)
def test_jira_export_no_headers(self):
self.assertEqual('|a|b|c|', tablib.Dataset(['a', 'b', 'c']).jira)
def test_jira_export_none_and_empty_values(self):
self.assertEqual('| | |c|', tablib.Dataset(['', None, 'c']).jira)
def test_jira_export_empty_dataset(self):
self.assertTrue(tablib.Dataset().jira is not None)
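Review note: the Jira tests above pin down the expected markup: header cells are wrapped in `||`, data cells in `|`, and `None` or empty values become a single space. The sketch below reproduces that output shape from plain lists. `to_jira` is a hypothetical standalone helper written from the test expectations only; it is not tablib's exporter and makes no claim about how the library implements it.

```python
def to_jira(rows, headers=None):
    """Render rows as Jira table markup (hypothetical helper).

    Header cells are delimited by '||', data cells by '|'.
    None and empty strings are rendered as a single space.
    """
    def cell(value):
        return ' ' if value is None or value == '' else str(value)

    lines = []
    if headers:
        lines.append('||' + '||'.join(cell(h) for h in headers) + '||')
    for row in rows:
        lines.append('|' + '|'.join(cell(v) for v in row) + '|')
    return '\n'.join(lines)


print(to_jira([('John', 'Adams', 90)], headers=('first_name', 'last_name', 'gpa')))
```

With no headers and no rows the helper returns an empty string rather than `None`, which is the weakest behaviour the empty-dataset test above would accept.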
def test_latex_export(self):
"""LaTeX export"""
@@ -381,8 +417,10 @@ class TablibTestCase(unittest.TestCase):
data.xlsx
data.ods
data.html
data.jira
data.latex
data.df
data.rst
def test_datetime_append(self):
"""Passes in a single datetime and a single date and exports."""
@@ -402,7 +440,9 @@ class TablibTestCase(unittest.TestCase):
data.xlsx
data.ods
data.html
data.jira
data.latex
data.rst
def test_book_export_no_exceptions(self):
"""Test that various exports don't error out."""
@@ -416,6 +456,7 @@ class TablibTestCase(unittest.TestCase):
book.xlsx
book.ods
book.html
data.rst
def test_json_import_set(self):
"""Generate and import JSON set serialization."""
@@ -531,6 +572,15 @@ class TablibTestCase(unittest.TestCase):
self.assertEqual(_csv, data.csv)
def test_csv_import_set_with_unicode_str(self):
"""Import CSV set with non-ascii characters in unicode literal"""
csv_text = (
"id,givenname,surname,loginname,email,pref_firstname,pref_lastname\n"
"13765,Ævar,Arnfjörð,testing,test@example.com,Ævar,Arnfjörð"
)
data.csv = csv_text
self.assertEqual(data.width, 7)
def test_tsv_import_set(self):
"""Generate and import TSV set serialization."""
data.append(self.john)
@@ -961,6 +1011,24 @@ class TablibTestCase(unittest.TestCase):
self.founders.append(('First\nSecond', 'Name', 42))
self.founders.export('xlsx')
def test_rst_force_grid(self):
data.append(self.john)
data.append(self.george)
data.headers = self.headers
simple = tablib.formats._rst.export_set(data)
grid = tablib.formats._rst.export_set(data, force_grid=True)
self.assertNotEqual(simple, grid)
self.assertNotIn('+', simple)
self.assertIn('+', grid)
class DocTests(unittest.TestCase):
def test_rst_formatter_doctests(self):
results = doctest.testmod(tablib.formats._rst)
self.assertEqual(results.failed, 0)
if __name__ == '__main__':
unittest.main()
+4 -7
@@ -1,11 +1,8 @@
-# Tox (http://tox.testrun.org/) is a tool for running tests
-# in multiple virtualenvs. This configuration file will run the
-# test suite on all supported python versions. To use it, "pip install tox"
-# and then run "tox" from this directory.
[tox]
-envlist = py26, py27, py33, py34, py35, py36, pypy
+minversion = 2.4
+envlist = py27, py34, py35, py36
[testenv]
-commands = python setup.py test
deps = pytest
+extras = pandas
+commands = python setup.py test