1049 Commits

Author SHA1 Message Date
dmosberger e8f54811c7 Expose 'read_only' parameter for 'import_set' and 'import_book' (#483) 2020-12-04 10:10:02 +02:00
Nuno André e8774043ed Substitute tuples for dicts in __getstate__/__setstate__ to speed up the pickling 2020-11-29 22:11:46 +01:00
Jannis Leidel dc1729fc6f Move releases to GitHub actions. 2020-11-23 13:14:21 +01:00
Hugo van Kemenade 3dc62685f8 Reduce Travis CI testing (#479) 2020-11-23 11:01:10 +01:00
Hugo van Kemenade 22c88de90d Upload coverage from GHA (#480)
* Upload coverage from GHA

* Fix PytestConfigWarning: Unknown config option: python_paths
2020-11-14 23:51:05 +02:00
Hugo van Kemenade 615e308559 Docs: Add link to changelog/history (#478) 2020-11-12 10:30:57 +02:00
Hugo van Kemenade 8c5404591b Add support for Python 3.9, drop EOL 3.5 (#477) 2020-10-30 19:01:48 +02:00
Hugo van Kemenade 5fa4496f9d Suggest quotes when pip installing with optional dependencies (#474) 2020-08-12 16:12:57 +03:00
Ran Benita bc8438bda4 Stop using pkg_resources
tablib imports pkg_resources in order to find its own version. Importing
pkg_resources is very slow (100ms-250ms is common).

Avoid it by letting setuptools-scm generate a file with the version
instead.
2020-08-10 15:49:51 +02:00
Claude Paroz ce79e44d14 Fixes #469 - Prevented rst crash with only-space strings (#470)
Thanks nexone for the report.
2020-06-15 08:42:51 +03:00
Claude Paroz 985c3d98b0 Set the release date for 2.0.0 2020-05-16 14:04:19 +02:00
Claude Paroz 6d097c0214 Fixes #465 - Allow importing 'ragged' .xlsx files (#466) 2020-05-16 09:07:32 +03:00
dragonworks 16b5565354 Fixes #462 - Update xlsx import to read cell values instead of cell formulas
Co-authored-by: Claude Paroz <claude@2xlibre.net>
2020-03-11 09:05:43 +01:00
Claude Paroz c25fe54b6f Refs #373 - Import dates from xls files as Python datetime objects 2020-03-09 17:05:32 +01:00
Tim Gates b39aefb8d8 Fix simple typo: belonogs -> belongs (#460)
Closes #459
2020-02-21 10:26:58 +02:00
Claude Paroz a442758729 Fixes #457 - Bumped openpyxl dependency to 2.6.0 (#458) 2020-02-16 15:05:20 +02:00
Claude Paroz 21479001a7 Fixes #453 - Reversing behavior of Row.lpush/Row.rpush (#454)
Co-authored-by: chim <chenpan@xiaomai5.com>
2020-02-13 20:51:49 +02:00
Claude Paroz f7e39c1ad5 Set the 1.1.0 release date 2020-02-13 18:56:15 +01:00
Claude Paroz aaeb5c8360 Fixes #226 - Allow importing ragged CSV files (#456) 2020-02-12 21:12:53 +02:00
Hugo 7a6c623cca Document upcoming breaking change in 2.0 2020-02-12 19:04:51 +01:00
Hugo 0c31fcb3e4 Test on Python 3.8 2020-02-02 16:44:26 +01:00
Hugo fa7fdb0443 pre-commit autoupdate 2020-02-02 16:44:26 +01:00
Hugo 8e19479cea Simplify config: uses the interpreter tox is installed to 2020-02-02 16:44:26 +01:00
Claude Paroz 8f39ac5055 Optimize xlsx detection (#448)
Reading the whole file is a bit too much to detect if the file
looks like an xlsx file.
2020-01-26 22:02:52 +02:00
Hugo 8d02934c53 Fix tox config 2020-01-26 20:48:20 +01:00
Claude Paroz d0963c206f Fix the missing xls dependencies message 2020-01-14 17:58:32 +01:00
Claude Paroz 993af5b0b4 Add release date for 1.0.0 2020-01-13 19:08:47 +01:00
Hugo van Kemenade 0accb4c437 Add project_urls metadata for programmatic use 2020-01-11 15:00:51 +01:00
Claude Paroz 0821716983 Refs #401 - Fixed some flake8 errors 2020-01-11 11:57:53 +01:00
Claude Paroz 660990b6b0 Fixes #440 -Normalize stream inputs as IO streams 2020-01-11 11:30:16 +01:00
Claude Paroz 6152d995f0 Tablib docs isn't the place to debate GPL vs MIT licensing 2019-12-31 14:08:08 +01:00
Claude Paroz 0ea6d706a9 Refs #293 - Ensured Dataset can be pickled/unpickled without damages 2019-12-30 16:23:38 +01:00
Hugo van Kemenade 00d8ab0b37 Remove unnecessary MANIFEST.in (#439)
* This MANIFEST.in unnecessary with setuptools_scm

https://github.com/pypa/setuptools_scm/blob/master/README.rst#file-finders-hook-makes-most-of-manifestin-unnecessary

* No manifest to check
2019-12-11 10:51:21 +01:00
Hugo van Kemenade 06c2326dc0 Refactor error raising to remove duplication 2019-12-10 10:55:30 +01:00
Daniel Santos fa30ea858d Implement feature that allows to export tabular data suited to a… (#437) 2019-12-10 01:04:03 +02:00
Hugo van Kemenade 4de2e17984 README: Add more badges (#435) 2019-12-02 11:10:54 +02:00
Hugo van Kemenade 52b64757b7 Remove unused Pipfile (#436) 2019-12-02 11:10:41 +02:00
Joseph Herlant 5ff4a55ae6 Force default_flow_style for pyyaml safe_dump
This is to keep behavior of pre-5.1 pyyaml.
2019-11-24 20:43:12 +01:00
Claude Paroz ce7d887adc Documented csv import/export options from standard lib (#431) 2019-11-14 18:08:51 +02:00
Hugo van Kemenade 57a535f577 Fix NameError: name '_get_column_widths' is not defined (#433)
* Fix NameError: name '_get_column_widths' is not defined

* Also test ReSTFormat.export_set
2019-11-12 10:53:20 +02:00
Claude Paroz 357a5594c5 Admonitions must have a title 2019-11-11 21:25:56 +01:00
Claude Paroz f61b8d8926 Fixes #422 - Allow ability to lazy-load external modules (#430) 2019-11-11 21:46:28 +02:00
Hugo van Kemenade 22a193dafb No __cmp__ or cmp in Python 3 (#429)
* No __cmp__ or cmp in Python 3

* Add rich comparisons

* Simplify using total_ordering decorator
2019-11-11 12:06:25 +02:00
Hugo van Kemenade b539e96697 Update testing: add docs + lint jobs; use pre-commit for linting (#426)
* Move docs and lint to their own [3.8] build job for more parallelism

* No codecov for docs or lint

* Move isort into pre-commit

* Add some handy linters to pre-commit

* Add rst-backticks linter and fix the errors

* Add pyupgrade and add upgrades

* Test docs and lint on GitHub Actions

* Xenial is default
2019-11-10 21:09:18 +02:00
Claude Paroz 626a062747 Fixes #421 - Make all dependencies optional
Thanks Hugo van Kemenade for the review.
2019-11-10 18:00:31 +01:00
Claude Paroz 9d2f7d6999 Point README to the documentation 2019-11-08 17:31:25 +01:00
Claude Paroz a9d9671b7f Moved format documentation from code to docs (#420) 2019-11-06 22:37:01 +02:00
Claude Paroz f1046cd13e Refs #256 - Implement class-based formats
This allows to extend Tablib with new formats far more easily.
2019-11-02 17:44:05 +01:00
Claude Paroz d21bd10908 Revert " Implement feature new format: Cli. Generate adapter for tabulate. This close issue #340"
This reverts commit c26159d48f.
The patch was NOT ready to be merged.
2019-10-30 14:24:07 +01:00
Daniel Santos c26159d48f Implement feature new format: Cli. Generate adapter for tabulate. This close issue #340
* Implement feature new format: Cli. Generate adapter for  tabulate. This close issue #340

* Write respective tests.

* Correct name Clase Base Test

* Implement missing class method to export cli.

* Remove property headers in method export book Cli.

* Remove cli from list to test Iterable data books.
2019-10-30 14:13:39 +01:00
Daniel Santos 34fe72305e Add missing extraline. 2019-10-29 22:13:57 +01:00
Daniel Santos d94420d968 Elucidate the use of filters (and, or). 2019-10-29 22:08:31 +01:00
Daniel Santos 51a720b21c Merge pull request #416 from xdanielsb/doc-formats
Update doc, clarify the use and scope of the flag headers.
2019-10-28 16:53:02 +01:00
Daniel 20f51d0bc1 Update doc, apply requested changes in headers flag doc. 2019-10-28 16:45:26 +01:00
Daniel 87d15a1529 Update doc, clarify the use and scope of the flag headers. 2019-10-27 21:00:21 +01:00
Hugo van Kemenade 08a6759520 Fixes #202 - Keep error content when importing xls files (#415)
Fixes #202 - Keep error content when importing xls files
2019-10-22 22:49:10 +03:00
Claude Paroz 205403d377 Fixes #202 - Keep error content when importing xls files 2019-10-22 20:48:45 +02:00
Claude Paroz 9858539c87 Add known third parties to isort 2019-10-22 14:19:20 +02:00
Hugo van Kemenade 201d8d9910 Merge pull request #412 from claudep/isort
Display isort errors
2019-10-22 14:06:31 +03:00
Claude Paroz fede4a4f13 Display isort errors 2019-10-22 12:29:55 +02:00
Hugo a76933edd5 Refs #401 - Sort imports with isort 2019-10-22 11:59:19 +02:00
Hugo 3197e59b25 Add pyproject.toml as per PEP 518 2019-10-22 11:52:20 +02:00
Claude Paroz 1f000f2f2c Removed unused imports 2019-10-20 12:23:05 +02:00
Hugo van Kemenade 7879fef65a Remove Python 2 code (#410)
Remove Python 2 code
2019-10-20 12:58:52 +03:00
Hugo b8bff1190e Don't omit tests from coverage https://nedbatchelder.com/blog/201908/dont_omit_tests_from_coverage.html 2019-10-20 12:36:20 +03:00
Hugo d77aba6210 Add more tests 2019-10-20 12:32:00 +03:00
Hugo bf6e5c2e78 100% Row test coverage 2019-10-20 12:27:51 +03:00
Hugo 088b916bab __unicode__ not used in Python 3 2019-10-20 12:04:33 +03:00
Hugo e4ac50260e __getslice__ is deprecated since Python 2.0 and it is not available in Python 3
https://docs.python.org/2/reference/datamodel.html#object.__getslice__
2019-10-19 20:01:07 +03:00
Hugo 7347d07624 Upgrade Python syntax with pyupgrade --py3-plus 2019-10-19 19:25:34 +03:00
Hugo c9027b446c Drop support fo Python 2.7 2019-10-19 19:24:03 +03:00
Hugo van Kemenade 825de0193b Test Windows, macOS and Linux on GitHub Actions (#409)
Test Windows, macOS and Linux on GitHub Actions
2019-10-19 19:20:42 +03:00
Claude Paroz 78b483d39e Fixes #368 - Avoid crashing when exporting empty string in ReST 2019-10-19 18:00:47 +02:00
Claude Paroz 8f09789d40 [csv] Fixes #342 - Feed only 1k of content to csv.Sniffer
Thanks Rivo Laks for the suggestion.
2019-10-19 17:37:56 +02:00
Hugo e0e75ed43c Test Windows, macOS and Linux on GitHub Actions 2019-10-19 18:25:02 +03:00
Peyman Salehi bdc84255a8 Fix some linting errors 2019-10-19 16:57:14 +02:00
Peyman Salehi b3c7145c40 Drop python 2 support
Remove support python 2 from doc, requirements.txt and config
Replace unicode with str
Remove dbfpy folder and rename dbfpy3 to dbfpy
Remove compat file and remove python2 packages from dependency
2019-10-19 16:30:57 +02:00
Claude Paroz 44f43516a5 Refs #250 - Test that commas embedded in quoted strings can be imported 2019-10-19 15:05:46 +02:00
Hugo van Kemenade e0a40577fd Update docs (#406)
Update docs
2019-10-19 16:02:55 +03:00
Hugo 0329eb6168 Update docs 2019-10-19 15:33:33 +03:00
Hugo van Kemenade 4c3dc847b0 Add release checklist (#403)
Add release checklist
2019-10-19 15:00:14 +03:00
Hugo 89fbd54b00 Fix check-manifest 2019-10-19 14:55:31 +03:00
Hugo van Kemenade 067dc769dc Fix typos (#404)
Fix typos
2019-10-19 14:54:25 +03:00
Hugo 1726c1cf37 Add release checklist 2019-10-19 14:45:55 +03:00
Hugo debe77e432 Fix typos 2019-10-19 13:56:34 +03:00
Hugo van Kemenade 0e022d89e5 Merge pull request #402 from hugovk/release-prep
Prepare for release
2019-10-19 13:24:35 +03:00
Hugo van Kemenade a40852d1c6 Add release date 2019-10-19 13:11:45 +03:00
Hugo 9b9fb0aa8a Prepare for release 2019-10-18 23:25:50 +03:00
Hugo van Kemenade ca1aa3ad30 Add support for Python 3.8 (#399)
Add support for Python 3.8
2019-10-18 22:11:19 +03:00
Jannis Leidel 2cfde95fe2 Fix PDF documentation generation. 2019-10-18 21:05:17 +02:00
Hugo 5b94682df3 Add support for Python 3.8 2019-10-18 20:46:23 +03:00
Jannis Leidel f6bf14afd2 Add project release config and cleanup project setup. (#398)
* Add project release config and use Travis build stages.

Refs #378.

* Restructure project to use src/ and tests/ directories.

* Fix testing.

* Remove eggs.

* More fixes.

- isort and flake8 config
- manifest template update
- tox ini extension
- docs build fixes
- docs content fixes

* Docs and license cleanup.
2019-10-18 15:57:13 +02:00
Hugo f3d02aa3b0 Test with pytest and send coverage to Codecov 2019-10-05 23:21:40 +02:00
Claude Paroz ca8dbcf9be Refs #108 - Test and improve format autodetection
Autodetection was added for the odf format.
2019-10-04 23:40:24 +02:00
Claude Paroz 4418535030 Refs #288 - Add string starting with '0' to xlsx round trip test 2019-10-04 21:34:33 +02:00
Claude Paroz 5bd896b954 Refs #304 - Test separator exporting 2019-10-04 21:18:30 +02:00
Claude Paroz af414b69d7 Factorized exporting in all formats in tests 2019-10-04 21:13:29 +02:00
Claude Paroz 91062672b5 Fixed #373 - Properly detect xlsx format 2019-10-04 20:46:31 +02:00
Claude Paroz 34334e72a1 Refs #314 - Add datetime to xlsx round trip test 2019-10-04 19:57:04 +02:00
Claude Paroz 5595bb7993 Refs #380 - Removed mention of the develop branch in docs 2019-10-04 19:52:14 +02:00
Claude Paroz 5fde5259d9 Fixes #314 - Delegate type coercion to openpyxl
Thanks Cristiano Lopes for the initial patch.
2019-10-04 19:45:42 +02:00
Peyman Salehi 591e8f7448 Fix missing comma in setup.py 2019-10-04 19:44:15 +02:00
Claude Paroz 4dfe2c2f89 Fixes #388 - Pipfile.lock is not needed for us 2019-10-04 15:45:15 +02:00
Claude Paroz 91608895d6 Fixes #376 - Add missing HISTORY entries 2019-10-04 10:45:05 +02:00
Hugo van Kemenade 0d36390254 Remove call for financial help 2019-10-04 10:28:47 +02:00
Claude Paroz 8ea082ce60 Fixes #274 - Fix Databook.load() params ordering 2019-10-04 09:36:42 +02:00
Claude Paroz a0df54ca22 Reorganized test cases by format 2019-10-03 22:59:12 +02:00
Claude Paroz 0e06b7e328 Refs #322 - Open dbf file in binary mode in docs 2019-10-03 22:24:23 +02:00
Claude Paroz 8aeb5e5158 io.BytesIO is also available in Python 2.7 2019-10-03 21:14:22 +02:00
Claude Paroz a0d19a56cb Made blank lines PEP-8 compatible 2019-10-03 20:54:10 +02:00
Claude Paroz 8cc024e61b Refs #273 - Replaced vendored markup lib by dependency 2019-10-03 20:42:55 +02:00
Claude Paroz a7c40a0881 Updated some links and favour https 2019-10-03 20:27:10 +02:00
Claude Paroz 20de7fad98 Replaced python-tablib.org by tablib.readthedocs.io 2019-10-03 20:16:00 +02:00
Kiran Subbaraman 4969a71f7f Nose link corrected
It now points to https://github.com/nose-devs/nose
2019-10-03 18:58:37 +02:00
Hugo 743776371a Test on Python 3.8 beta 2019-10-03 11:29:10 +02:00
Claude Paroz 326d07c2ed Updated AUTHORS file and alphabetized list 2019-10-03 11:27:40 +02:00
Hugo 2f6ea8c644 Update MANIFEST.in 2019-10-03 11:24:05 +02:00
Hugo 8aaed50cc8 Refs #378 Add Jazzband Contributing Guidelines 2019-10-03 11:24:05 +02:00
Ran Benita a21b276d9c Avoid DeprecationWarning due to invalid escape in docstring
Will become a SyntaxError in Python 3.8:
https://bugs.python.org/issue32912
2019-10-03 11:15:34 +02:00
Hugo van Kemenade e8838b5ce6 Converted README/HISTORY to Markdown format 2019-10-03 11:13:13 +02:00
Hugo van Kemenade 923711d99a Add support for Python 3.7 and drop 3.4 2019-10-03 09:32:43 +02:00
Claude Paroz aac129db66 Refs #378 - Added the Jazzband badge to the README 2019-10-03 09:13:22 +02:00
Claude Paroz d9df89f5da Refs #378 - Updated Travis and GitHub links 2019-10-03 09:10:46 +02:00
schopenhauerzhang 22bb20c74b delete ; 2019-10-03 08:45:56 +02:00
Frost Ming d25d24a9bb Merge pull request #337 from ZuluPro/stream
Added stream to CSV
2019-06-28 09:02:43 +08:00
Anthony Monthe 513bba2c20 Added CSV stream test 2019-06-27 23:19:06 +01:00
kennethreitz 2b9ce02e3c Merge pull request #364 from s-pace/doc/update
[doc website] Add a nice search experience
2019-04-22 22:21:34 -04:00
s-pace f9f28d3d86 feat: add search to the main introduction page 2019-04-22 22:40:48 +02:00
s-pace 0cb50bb008 feat: add search to every documentation pages 2019-04-22 22:40:35 +02:00
Anthony Monthe f55f56ae1d Added stream to CSV 2019-03-30 19:09:12 +00:00
Timo Furrer 0937c9f9ec Merge pull request #358 from claudep/byedistutils
Removed distutils fallback
2019-03-17 16:15:22 +01:00
Timo Furrer 25a66f95ac Merge pull request #356 from claudep/xlsx_read_only
Open xlsx workbooks in read-only mode
2019-03-11 12:12:12 +01:00
Parth Shandilya 6ab511f8c0 Merge pull request #361 from jdufresne/pin
Unpin transient dependencies in requirements.txt
2019-03-10 23:52:05 +05:30
Jon Dufresne 64816258e6 Unpin transient dependencies in requirements.txt
The project is expected to work with the all versions of dependencies as
specified by dependency ranges, not just a single pinned version. Stop
overspecifying them.
2019-03-09 10:13:39 -08:00
Timo Furrer 41cbaa04b9 Merge pull request #359 from claudep/backports.csv
Limit backports.csv install to Python 2
2019-03-09 16:40:24 +01:00
Claude Paroz c136940801 Limit backports.csv install to Python 2 2019-03-09 10:06:33 +01:00
Claude Paroz cf03ecfe25 Removed distutils fallback
As of https://github.com/kennethreitz/setup.py/blob/master/setup.py
2019-03-09 09:57:19 +01:00
Claude Paroz 193b840da2 Open xlsx workbooks in read-only mode
Refs #316
2019-03-09 09:26:10 +01:00
Timo Furrer 733d77ad1e release: 0.13.0 2019-03-08 12:17:07 +00:00
Timo Furrer 3abd7e8c53 Merge pull request #351 from jdufresne/isinstance
Merge multiple isinstance() calls to one
2019-03-03 16:43:10 +01:00
Timo Furrer 0be9e6a74b Merge pull request #353 from jdufresne/pypy
Remove pypy from tox.ini
2019-03-03 16:42:34 +01:00
Timo Furrer ecd0afbcec Merge pull request #248 from jean/master
Editing while reading: punctuation, markup, linebreaks
2019-03-03 16:41:47 +01:00
Jean Jordaan addaa090ef Merge branch 'master' into master 2019-03-03 13:29:21 +07:00
Jon Dufresne f7b3fd4601 Remove pypy from tox.ini
The platform is not tested on Travis and it fails to run with:

    Processing ./.tox/dist/tablib-0.12.1.zip
    Collecting odfpy (from tablib==0.12.1)
    Collecting openpyxl>=2.4.0 (from tablib==0.12.1)
    Collecting backports.csv (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/71/f7/5db9136de67021a6dce4eefbe50d46aa043e59ebb11c83d4ecfeb47b686e/backports.csv-1.0.6-py2.py3-none-any.whl
    Collecting xlrd (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/b0/16/63576a1a001752e34bf8ea62e367997530dc553b689356b9879339cf45a4/xlrd-1.2.0-py2.py3-none-any.whl
    Collecting xlwt (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/44/48/def306413b25c3d01753603b1a222a011b8621aed27cd7f89cbc27e6b0f4/xlwt-1.3.0-py2.py3-none-any.whl
    Collecting pyyaml (from tablib==0.12.1)
    Collecting pandas (from tablib==0.12.1)
      Using cached https://files.pythonhosted.org/packages/81/fd/b1f17f7dc914047cd1df9d6813b944ee446973baafe8106e4458bfb68884/pandas-0.24.1.tar.gz
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<module>", line 1, in <module>
          File "/tmp/pip-install-F5lmAg/pandas/setup.py", line 732, in <module>
            ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
          File "/tmp/pip-install-F5lmAg/pandas/setup.py", line 475, in maybe_cythonize
            numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
          File "tablib/.tox/pypy/site-packages/pkg_resources/__init__.py", line 1144, in resource_filename
            return get_provider(package_or_requirement).get_resource_filename(
          File "tablib/.tox/pypy/site-packages/pkg_resources/__init__.py", line 361, in get_provider
            __import__(moduleOrReq)
        ImportError: No module named numpy

        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-F5lmAg/pandas/
2019-03-02 08:56:06 -08:00
Parth Shandilya 79dc77de49 Merge pull request #352 from jdufresne/ws
Trim trailing white space throughout the project
2019-03-02 22:18:07 +05:30
Jon Dufresne b057cdf05e Trim trailing white space throughout the project
Many editors clean up trailing white space on save. By removing it all
in one go, it helps keep future diffs cleaner by avoiding spurious white
space changes on unrelated lines.
2019-03-02 08:42:53 -08:00
Jon Dufresne fc2f3c07c8 Merge multiple isinstance() calls to one 2019-03-02 08:38:03 -08:00
Timo Furrer a10327a283 Merge pull request #350 from browniebroke/bugfix/invalid-ascii-csv
Import ascii characters not valid with unicode literals - updated
2019-03-02 15:06:21 +01:00
Bruno Alla e0de42ef06 Add backports.csv to requirements.txt 2019-03-02 10:44:38 -03:00
Bruno Alla f757ab84d1 Merge branch 'master' into bugfix/invalid-ascii-csv
# Conflicts:
#	setup.py
#	tablib/compat.py
#	test_tablib.py
2019-03-02 10:41:07 -03:00
Timo Furrer dc24fda415 Merge pull request #333 from hudgeon/master
Updated xlsx format to remove reference to openpyxl's deprecated get_active_worksheet
2019-03-02 13:03:30 +01:00
Timo Furrer 3ba8d529fc Merge pull request #348 from mloesch/jira
Add Jira table export
2019-03-02 12:16:00 +01:00
Timo Furrer a8bdb4b28f Merge pull request #338 from lepuchi/hotfix/csv-new-line
Handle case where there is an empty line in CSV
2019-03-02 12:12:30 +01:00
Timo Furrer 1aaf235751 Merge pull request #344 from jdufresne/tox-pandas
Include pandas dependency when testing with tox
2019-03-02 12:04:35 +01:00
Parth Shandilya 36ec60d5dd Merge pull request #343 from jdufresne/od
Remove vendored ordereddict package
2019-03-02 00:08:40 +05:30
Parth Shandilya babcbfd949 Merge pull request #339 from thombashi/replace_deprecated_method
Replace a deprecated method call
2019-03-02 00:07:39 +05:30
Parth Shandilya 29b2c08da0 Merge pull request #346 from jdufresne/compat
Remove unused compat entries
2019-03-01 23:59:27 +05:30
Parth Shandilya 862a681263 Merge pull request #345 from jdufresne/cache
Enable pip cache in Travis CI
2019-03-01 23:58:22 +05:30
Mathias Loesch 102073c426 Add Jira table export 2019-01-23 22:34:45 +01:00
Jon Dufresne 499ce52304 Remove unused compat entries
Organize both the Python2 & Python3 sections in the same order so they
are easier to compare.

Removed:

- basestring
- ifilter
- bytes
2019-01-01 10:40:29 -08:00
Jon Dufresne c650b67e06 Enable pip cache in Travis CI
Reduce load on PyPI servers and slightly speed up builds.

For more information, see:

https://docs.travis-ci.com/user/caching/#pip-cache
2019-01-01 10:32:08 -08:00
Jon Dufresne 3e4d6fb5aa Include pandas dependency when testing with tox
Allows all tests to pass.

As pandas is defined as an 'extra', use tox's 'extras' feature. This
requires tox 2.4+, so document that as well.

https://tox.readthedocs.io/en/latest/config.html#conf-extras
2019-01-01 10:28:29 -08:00
Jon Dufresne dd2ba714d3 Remove vendored ordereddict package
Now that Python 2.6 support has been dropped, can remove the vendored
ordereddict package. Use the stdlib collections.OrderedDict instead.
2019-01-01 10:02:13 -08:00
Tsuyoshi Hombashi a28a057559 Replace a deprecated method call
Workbook.remove_sheet method deprecated since openpyxl 2.4.0
2018-10-06 19:19:09 +09:00
lepuchi d38549ef1e only add row if it exists 2018-10-02 23:26:19 +05:30
kennethreitz 5a359ba4de Update README.rst 2018-09-17 08:14:12 -04:00
kennethreitz 359007444c Update README.rst 2018-09-17 08:13:48 -04:00
Maciej "RooTer" Urbański 4f8949417e ujson presence no longer breaks tablib (resolves #297) (#311) 2018-09-12 16:15:20 -03:00
Bruno Soares 3d5943a8a4 Fix: Circular reference detected error (#332)
* Rename function name

* Add uuid handler on json dumps

* Add myself to authors
2018-09-12 15:49:46 -03:00
Norman Hooper 38486231cc reStructuredText (#336)
* median for Python 2

* More compat

* Support reStructuredText

* Tests
2018-09-12 15:27:10 -03:00
Claude Paroz 75f1bafd69 Removed Python 3.3 support (#310) 2018-09-12 15:24:37 -03:00
Iuri de Silvio 4749760e6f Typo: OSD -> ODS
Fix #330
2018-09-12 15:22:06 -03:00
Gregory Bataille ac3cf67620 fix(): remove openpyxl warning by properly accessing cells (#296) 2018-09-12 08:34:55 -03:00
DougHudgeon f812c29275 Add instructions for handling csv line endings in Windows in Python 3 2018-06-26 10:33:21 +10:00
DougHudgeon 4c5d0b1a45 Instructions for opening Excel workbook and reading the first sheet 2018-06-25 14:25:50 +10:00
DougHudgeon 61063e2b09 Updated xlsx format to use openpyxl's .active property 2018-06-25 14:17:34 +10:00
kennethreitz 4c300e65a5 update install instructions
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-09-01 15:42:51 -04:00
kennethreitz edbb16ec97 next version 2017-09-01 15:37:00 -04:00
kennethreitz dec5cea722 Merge pull request #307 from audiolion/make-pandas-optional
Make pandas optional
2017-09-01 13:49:44 -04:00
Ryan Castner 38183938dc Change how travis installs to get all test dependencies 2017-09-01 13:33:28 -04:00
Ryan Castner 7f1db4023f Raise NotImplementedError if pandas is not installed 2017-09-01 13:21:21 -04:00
Ryan Castner b09fface1b Make pandas an optional install 2017-09-01 13:20:54 -04:00
kennethreitz 69edb9def3 Update index.rst 2017-08-28 01:14:36 -04:00
kennethreitz ec54918f4a Update tutorial.rst 2017-08-28 01:06:43 -04:00
kennethreitz ab6633549f Update index.rst 2017-08-28 01:04:16 -04:00
kennethreitz 56005d8022 Update README.rst 2017-08-28 01:02:49 -04:00
kennethreitz 36fa7ef097 update docs
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-08-27 03:56:14 -04:00
kennethreitz bb0abc863e bunk requirements file
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-08-27 03:49:29 -04:00
kennethreitz 58f6eefe01 Merge branch 'master' of github.com:kennethreitz/tablib 2017-08-27 03:48:10 -04:00
kennethreitz e4726cb85c update docs
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-08-27 03:48:01 -04:00
kennethreitz 412e690289 Update README.rst 2017-08-27 03:42:15 -04:00
kennethreitz 44e797d70e Update README.rst 2017-08-27 03:41:53 -04:00
kennethreitz 34c14aca18 Update README.rst 2017-08-27 03:41:26 -04:00
kennethreitz 7c318adde4 Update README.rst 2017-08-27 03:41:01 -04:00
kennethreitz 5dd74c0104 drop 2.6 2017-08-27 03:29:44 -04:00
kennethreitz a50ff92ff2 only require pandas if python isn't 2.6
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2017-08-27 03:26:21 -04:00
kennethreitz 383d4b9c4e Merge pull request #301 from jasonamyers/feature/dataframes
Adding initial DataFrames Support
2017-08-27 03:22:42 -04:00
Jason Myers 00e2ffa2ef Adding initial DataFrames Support
Signed-off-by: Jason Myers <jason@jasonamyers.com>
2017-08-26 20:43:35 -05:00
kennethreitz a3cd2c9cff history 2017-06-13 12:31:42 -04:00
kennethreitz d89d243a30 v0.11.5 2017-06-13 12:30:27 -04:00
kennethreitz 69abfc3ada use safe load 2017-06-13 12:29:55 -04:00
Bruno Alla 80e72cfa27 Fix unicode encode errors on Python 2 -- Fixes #215
Switch csv library to backports.csv as the implementation
is closer to the python 3 one. Add a test case covering the
problem.

Run tests with unicode_literals from future

Fix unicode encode errors with unicode characters

- Use `backports.csv` instead of `unicodecsv`
- Use StringIO instead of cStringIO
- Clean-up some Python 2 specific code
2017-05-02 17:33:14 +01:00
Nicolas Appriou 05bd0d1d42 fix python interpreter supported version in doc (#286) 2017-04-19 11:02:55 -03:00
Claude Paroz 62807734bd Replaced vendored odfpy by a dependency (#280)
Refs #273.
2017-02-26 19:05:01 -03:00
yarko c5c2dffe42 correct example (#276)
map() is a function in python2, and iterator in python3+;

In any case - map is inefficient compared to either comprehensions (most efficient), or simple loops (close second).
Since in this case, data.append() returns nothing, use a simple loop.
It is clearer, more efficient, and works with both python2 and python3
2017-02-24 09:39:53 -03:00
Claude Paroz 46102d4be7 Replaced vendored omnijson by the standard lib version (#279)
Refs #273.
2017-02-24 09:38:07 -03:00
Claude Paroz 44e9e24fec Replaced vendored pyyaml by a dependency (#278) 2017-02-20 19:41:38 -03:00
Claude Paroz 0ca5520bbc Replaced vendored xlrd/xlwt by dependencies (#277)
Refs #273.
2017-02-20 17:29:22 -03:00
Claude Paroz e66eb4a189 Replaced vendored openpyxl by a dependency (#221)
It is time to make it happen.

* Dropped Python 3.2 support

Recent dependencies are dropping Python 3.2 too.

* Replaced vendored openpyxl by a dependency

Thanks Tommy Anthony for the initial patch.
2017-02-20 12:41:33 -03:00
kennethreitz 0e720d78ca Merge pull request #272 from founders4schools/unicodecsv
Replaced vendored unicodecsv by a dependency
2017-01-24 23:48:54 -05:00
Iuri de Silvio 6afe716d64 Version bump: 0.11.4 2017-01-23 19:10:36 -02:00
Bruno Alla 76cbf9fadf Read version in setup.py without importing tablib 2017-01-15 14:59:49 +00:00
Bruno Alla a93f93a458 Replaced vendored unicodecsv by a dependency
Using https://pypi.python.org/pypi/unicodecsv/0.14.1
2017-01-15 14:47:14 +00:00
Iuri de Silvio 3d44bdec40 Merge pull request #269 from kammala/master
Fixed classifiers in setup.py
2017-01-10 10:14:11 -02:00
kammala 319505817a Fixed classifiers in setup.py
moved classifiers from tuple to list(this allow to use setup.py upload command in python >= 3.5)
2017-01-10 12:20:08 +03:00
kennethreitz 6cb9a69746 Merge pull request #266 from wenzhihong2003/master
remove file must be close it.
2017-01-05 12:51:01 -05:00
tomwen bb1354b61f remove file must be close it.
in windows if you don't close template file, remove it will raise

WindowsError: [Error 32]
2016-12-30 10:21:43 +08:00
Iuri de Silvio ddc4bd30f2 Merge pull request #234 from BrianPainter/master
if the object is a decimal, return the string representation of it.
2016-12-18 19:10:18 -02:00
Iuri de Silvio 52e547daf9 Merge pull request #259 from dyve/master
Fix #260 date and datetime export to JSON
2016-12-18 19:09:26 -02:00
Iuri de Silvio 7f0b7a0a22 Merge pull request #263 from andriisoldatenko/develop
Remove LOCALE from str regular expression
2016-12-18 19:08:41 -02:00
Andrii Soldatenko ddac443732 Added py36 to tox.ini 2016-12-18 17:04:28 +02:00
Andrii Soldatenko e13f4d0aba Added py35 to tox.ini 2016-12-18 16:54:22 +02:00
Andrii Soldatenko 54f9041f2c Remove LOCALE from str regular expression 2016-12-18 16:44:18 +02:00
Dylan Verheul 91d3299280 Fix date and datetime export to JSON in Python versions with a json package
Python without a json package will use omnijson and fail on date and datetime objects.
Added unit tests.
2016-11-30 12:32:47 +01:00
Jean Jordaan cd67a63b43 Fix typo in label, add missing newline before directive
Label: s/peed/speed/
2016-08-01 11:13:25 +07:00
Jean Jordaan 19b3d6d06a Change the blind reference mit to an URL
:ref:`MIT Licensed <mit>` had no target in the docs, so change it to a
canonical URL.
2016-08-01 11:11:01 +07:00
Jean Jordaan 59090d33a8 Missed some tabs. 2016-07-31 18:23:21 +07:00
Jean Jordaan a4f974287b Editing while reading: punctuation, markup, linebreaks
I fixed some extra commas, missing apostrophes, and typos;
added some linebreaks between sentences for very long lines;
added explicit markup for console blocks,
got rid of some tabs,
fixed indentation of an admonition, and some more small tweaks.

This supersedes https://github.com/kennethreitz/tablib/pull/84
2016-07-31 18:15:12 +07:00
kennethreitz f59abe84be Merge pull request #239 from ErwinJunge/dataset-title-in-docs
Put Dataset.title in the documentation
2016-05-21 15:13:30 -04:00
Erwin Junge cf23f2344f Put Dataset.title in the documentation 2016-05-20 16:13:22 +02:00
kennethreitz e16bb38c48 Merge pull request #238 from sushrutrathi/code_changes
changes in code refactoring
2016-05-03 21:30:25 -04:00
Sushrut Rathi 71ca275dd1 changes in code refactoring 2016-05-03 13:47:25 +05:30
kennethreitz 75bbfbbaf4 Merge pull request #233 from ScorpionResponse/html_book_test
Add HTML format to the book_export test and fix the format to work properly
2016-04-10 18:12:37 -04:00
Iuri de Silvio b35d505621 Merge pull request #236 from candy0427/master
Update README.rst
2016-03-24 10:18:34 -03:00
CandyLikeSmile cd491c062c Update README.rst 2016-03-24 14:19:59 +08:00
Brian Painter 9fdb72cc5c if the object is a decimal, return the string representation of it. 2016-03-23 08:21:36 -04:00
kennethreitz a5b1f7987e Merge pull request #232 from chimeno/patch-1
[docs] Update variable name in tuto
2016-03-18 14:37:19 -04:00
Paul Moss 8cf6770a76 Add HTML format to the book_export test and fix the format to work properly 2016-03-18 18:17:19 +00:00
Daniel Chimeno 5fa3d2f886 [docs] Update variable name in tuto
The tutorial has been using the 'data' variable, but in this case it's using 'd'.
This change that.
2016-03-18 09:22:59 +01:00
kennethreitz d4c66c7a4e Merge pull request #229 from pmlandwehr/patch-1
python 3 fix: map filter to ifilter
2016-02-28 00:20:37 -05:00
Peter M. Landwehr af17586581 python 3 fix: map filter to ifilter 2016-02-27 21:14:13 -08:00
kennethreitz 23d21f00f3 Update HISTORY.rst 2016-02-25 12:59:18 -05:00
kennethreitz 7ee924b5a6 Merge pull request #228 from tomchristie/print-dataset-with-no-headers
Fixed textual representation for Dataset with no headers
2016-02-25 12:58:35 -05:00
Tom Christie d720beadac Fixed __unicode__/__str__ for dataset with no headers 2016-02-25 13:28:29 +00:00
kennethreitz ee9666a146 Merge pull request #225 from tusharmakkar08/master
PEP-8 standards followed
2016-02-22 09:17:34 -05:00
tusharmakkar08 77a9e25795 Reverted back yaml3 for ci failure 2016-02-22 17:05:06 +05:30
tusharmakkar08 d515724817 PEP-8 standards followed 2016-02-22 16:36:26 +05:30
kennethreitz 2814fbc381 v0.11.3 2016-02-16 08:49:28 -05:00
kennethreitz 9ca1d4ec54 Merge pull request #220 from kennethreitz/master
Master
2016-02-16 08:46:37 -05:00
kennethreitz abbb4e32d8 update footer in docs 2016-02-16 08:29:17 -05:00
kennethreitz f6e757d569 v0.11.2 2016-02-16 08:27:59 -05:00
kennethreitz 9ba0451843 Merge pull request #219 from timofurrer/bugfix/export-only
Fix export only formats
2016-02-16 08:17:56 -05:00
Timo Furrer d99db57d75 Fix export only formats
Formats like LaTeX could never be exported because
`setattr(cls, 'set_%s' % fmt.title, fmt.import_set)` always failed
for export-only formats. The exception was then caught in the
outer try/except and the format tuple was set to (None, None) with
`cls._formats[fmt.title] = (None, None)`.
2016-02-15 19:29:46 -08:00
kennethreitz 2299c00883 Merge pull request #216 from cmhofer/develop
Error: frzn_col_idx not defined
2016-02-08 18:27:39 -05:00
Claudio Mike Hofer 5ba6f5d91a frzn_col_idx not defined
As in #53 already solved. Freeze panes set to A2 again.
2016-02-08 17:30:33 +01:00
kennethreitz bbdf5f11ab v0.11.1, fix packaging error 2016-02-07 13:46:03 -05:00
kennethreitz 851ba25702 Update README.rst 2016-02-07 11:00:14 -05:00
kennethreitz 039272b274 docs cleanup 2016-02-07 10:56:29 -05:00
kennethreitz d6a7832e60 v0.11.0 2016-02-07 10:53:25 -05:00
kennethreitz e51c4faec7 smarter detect_format function 2016-02-07 10:43:38 -05:00
kennethreitz f7fc3244ee updated history 2016-02-07 10:43:19 -05:00
kennethreitz 53d69bd3ea fix __unicode__ 2016-02-07 08:09:10 -05:00
kennethreitz fcc9700d11 Fix for transpose().transpose() with duplicate keys
#199
2016-02-07 07:29:08 -05:00
kennethreitz 1ec9c18a66 $ make test 2016-02-07 07:09:34 -05:00
kennethreitz 99c28fa560 Merge pull request #206 from kontza/develop
Two 'raise AttributeError' converted to Python 3 -friendly format.
2016-02-07 07:09:12 -05:00
kennethreitz fa7fb579fd Merge pull request #193 from jhermann/patch-1
Formats .tsv and .html are implemented by now
2016-02-07 07:03:25 -05:00
kennethreitz be24de19dc Merge remote-tracking branch 'origin/develop' into develop 2016-02-07 07:01:46 -05:00
kennethreitz 1d4f4b68ca cleanup 2016-02-07 07:01:13 -05:00
kennethreitz 8debeb26ac Merge branch 'develop' into import_export
# Conflicts:
#	tablib/core.py
#	tablib/formats/_csv.py
#	tablib/formats/_xlsx.py
2016-02-07 07:00:55 -05:00
kennethreitz 38e1ee6c3d Merge pull request #186 from hdzierz/develop
Added a mechanism to avoid datetime.datetime issues when serializing dat...
2016-02-07 06:44:20 -05:00
kennethreitz a774789252 /s/unique/remove_duplicates
#182
2016-02-07 06:40:46 -05:00
kennethreitz 995eabad37 Merge pull request #182 from cherepski/develop
Adding ability to unique all rows in a dataset.
2016-02-07 06:38:58 -05:00
kennethreitz d90358bf69 Merge branch 'develop' of https://github.com/rabinnankhwa/tablib into develop
# Conflicts:
#	AUTHORS
2016-02-07 06:36:12 -05:00
kennethreitz c5920249de python 3.2 is terrible 2016-02-07 06:32:10 -05:00
kennethreitz 9b6a73c97c fixed stupid test 2016-02-07 06:29:07 -05:00
kennethreitz 679bd115b6 Merge branch 'develop' of https://github.com/papisz/tablib into develop 2016-02-07 06:09:55 -05:00
kennethreitz 32cbc36fc1 Merge branch 'latex-export' of https://github.com/mloesch/tablib into develop 2016-02-07 06:08:23 -05:00
kennethreitz 8bded88559 update development guide 2016-02-07 06:01:56 -05:00
kennethreitz f8f57a467e updates to install guide 2016-02-07 05:56:19 -05:00
kennethreitz a11a993955 fix documentation 2016-02-07 05:52:45 -05:00
kennethreitz 25894f2948 remove bunk file 2016-02-07 05:47:28 -05:00
kennethreitz 591b89693e remove TODO.rst 2016-02-07 05:46:45 -05:00
kennethreitz 85d9c2497e --universal 2016-02-07 05:46:20 -05:00
kennethreitz eaf52b691e Merge pull request #204 from dallagi/notabs
Replace tabs with whitespaces
2016-01-27 17:19:37 -05:00
kennethreitz 6f53c5d2b9 Merge pull request #209 from jdharms/develop
Small documentation fix in Dataset class
2016-01-27 17:17:38 -05:00
kennethreitz 90ee799576 Merge pull request #208 from stclair/develop
Fix XLSX import
2016-01-27 17:16:22 -05:00
Iuri de Silvio c02a21ccd2 Merge pull request #213 from go8ose/develop
Add section on importing to tutorial.
2016-01-20 10:48:06 -02:00
Geoff Crompton fa045ca114 Add section on importing to tutorial. 2016-01-18 12:13:15 +11:00
Daniel Harms 65703550c3 Small documentation fix in Dataset class 2015-11-10 14:15:37 -05:00
Wes 1fcb98f9ae Fix XLSX import
Calling import_set on an XLSX file was throwing a TypeError from
Openpyxl. Openpyxl Reader load_workbook requires a file-like object as the first
argument. This commit fixes the error by passing in a file-like object
instead of a string.
2015-11-09 06:45:28 -07:00
Rumpu-Jussi e2d45ecff7 More Python 3 -friendly formatting. 2015-10-27 14:46:43 +02:00
Rumpu-Jussi 47d92277cc More Python 3 -friendly formatting. 2015-10-27 14:45:55 +02:00
Rumpu-Jussi fdd74b5b0c More Python 3 -friendly formatting. 2015-10-27 14:44:07 +02:00
Rumpu-Jussi de052f0fac Two 'raise AttributeError' converted to Python 3 -friendly format. 2015-10-27 14:33:44 +02:00
Marco Dalla G 2f3acf5af4 Added myself to authors, as indicated in README 2015-10-07 11:31:26 +03:00
Marco Dalla G c4e8755cd2 Replaced tabs with whitespaces 2015-10-07 11:25:56 +03:00
Mathias Loesch 79dc4524a0 Added LaTeX table export format 2015-06-04 09:26:35 +02:00
Iuri de Silvio a785d77901 Merge pull request #194 from tommyanthony/develop
Fixed a compatibility bug for Python 3
2015-05-27 13:19:36 -03:00
Thomas Anthony b3485ec942 Fixed a compatibility bug for Python 3 by adding xrange to
compat.py.

The code in tablib/formats/_xls.py used xrange in parsing excel spreadsheets.
xrange is not a builtin for Python 3, so I've added
	xrange = range
in compat.py and imported it in tablib/formats/_xls.py.
2015-05-26 20:06:42 -07:00
Jürgen Hermann 28b358c9da Formats .tsv and .html are implemented by now
Removed mentioning of "wanted" formats that exist.
2015-04-08 15:59:55 +02:00
Iuri de Silvio 24657520e9 Merge pull request #189 from tsroten/issue_184
Fixes Row slicing. Fixes #184.
2015-04-05 20:05:31 -03:00
Iuri de Silvio 66d9e50984 New import/export interface with dataset and databook import_ and export methods
and overloaded `import_set` and `import_book` functions.
2015-04-05 19:51:56 -03:00
Thomas Roten 541fba6786 Fixes Row slicing. Fixes #184. 2015-03-28 16:14:27 -04:00
Helge bc6398ffb0 Added a mechanism to avoid datetime.datetime issues when serializing data 2015-03-02 15:06:31 +13:00
Kevin Cherepski dca7bc9a7d Adding ability to unique all rows in a dataset. 2015-02-04 11:53:14 -05:00
Iuri de Silvio 2fbda0f43d Merge pull request #176 from sramana/develop
Fix import errors when installed from source
2014-11-15 16:28:57 -02:00
Ramana Varanasi e350f9428b Fix import errors when installed from source 2014-11-10 16:03:10 +05:30
Iuri de Silvio 68dba0a77d Merge pull request #173 from amarandon/develop
Fix JSON import example
2014-10-04 11:26:55 -03:00
Alex Marandon 028be03c2c Fix JSON import example
The example was triggering this error:

    JSONError: Expecting property name: line 1 column 3 (char 2)

This is because JSON property names should be wrapped in double
quotes.

While at it, I've fixed the typo in "last_name"
2014-10-03 09:17:38 +02:00
Iuri de Silvio e1d65ba3c8 Merge pull request #172 from thibault/patch-1
Minor typo correction
2014-09-26 16:35:29 -03:00
Thibault J. e4cb3bcd9b Minor typo correction
Requests -> Tablib
2014-09-23 11:46:05 +02:00
Iuri de Silvio bf9510e0c7 Merge pull request #170 from phargogh/dbf_docs_repair
Cleaning up DBF API documentation
2014-09-06 09:54:21 -03:00
James Douglass 82ae3ca507 Cleaning up DBF documentation
Fixing indentation issues (off by one space), which caused problems
with the sphinx rendering of the DBF docstring and otherwise cleaning
up the sphinx docstring.
2014-09-05 14:56:33 -07:00
rabinnankhwa 5fbdd56fba filter row and column values 2014-08-31 00:12:44 +05:45
rabinnankhwa f187cef5f4 adding support for creating subset of a dataset. 2014-08-30 23:52:35 +05:45
rabinnankhwa 87892d7266 used get method of dictionary instead of exception handling 2014-08-30 08:56:17 +05:45
rabinnankhwa 20e2ce5ba0 __getslice__ method of Row class corrected 2014-08-30 08:26:08 +05:45
Iuri de Silvio 48e576954d Merge pull request #153 from phargogh/dbf-support
Support for dBase (DBF) files
2014-08-26 08:24:36 -03:00
James Douglass a21f8187f8 Adding DBF support.
Squashing two squashes.

Adding DBF support

Adding the DBFpy python package

The DBFpy package provides basic dbf support for python.  Still need to
write an interface format file for tablib.

Adding DBF format and imports in compat.py

Adding DBF format to formats.__init__

DBF format had not been committed to formats.__init__, so I’m adding it.

Adding a dbf import test

Adding a test to check whether a DBF can be created properly and
compare it against a regression binary string.

Adding an import_set test (and renaming another)

Adding an import_set test that conforms with the other import_set tests
for other formats.  I’m also adding an export_set function.

Fixing system site-packages import

Importing dbfpy from tablib.packages instead of system site packages.

Fixing a SyntaxError in dbfpy/dbfnew.py

Fixing an issue with ending field definitions

DBFPY, when writing a DBF, terminates the field definitions with a
newline character.  When importing a DBF from a stream, however, DBFPY
was looking only for the \x0D character rather than the newline.  Now
we consider both cases.

Adding a test for dbf format detection

Adding DBF filetype detection tests

Adding tests for YAML, JSON, TSV, CSV using the DBF detection function.

Handling extra exceptions in dbf detection

Adding exception handling for struct.error, an exception that DBFPY
raises when trying to unpack a TSV table.  Since it’s not a DBF file,
we know it’s not a DBF and return False.

Fixing an issue with the DBF set exporting test

The DBF set export test needed a bit enabled (probably the writeable
bit?) before the test would match the regression output.

Updating dbf interface

Updating the int/float class/type checking in the dbf format file.
This allows for python2 and python3 compatibility.

Tweaking dbfpy to work with python3

Altering a couple of imports.

Updating dbf tests for binary data compatibility

Making regression strings binary and improving debug messages for dbf
assertion errors.

Improving file handling for python 2 and 3

Updating DBF file handling for both python 2 and 3 in the _dbf
interface.

Adding a (seemingly) functional dbfpy for python3

I’ve made dbfpy python3 compatible!  Tests appear to pass.
A significant change was made to the format detection test whereby I
made the input string a binary (bytes) string.  If the string is not a
bytes string by the time we try to detect the format, we try to decode
the string as utf-8 (which admittedly might not be the safest thing to
do) and try to decode anyways.

Updating imports for tablib dbf interface

Now importing python2 or python3 versions as appropriate.

Updating dbf package references in compat.py

Cleaning up debugging print statements

Updating stream handling in dbf interface

Factoring the open() call out of the py3 conditional and removing the
temp file before returning the stream value.

Adding dbfpy3 init.py

I had apparently missed the dbfpy3 init file when committing dbfpy3.

Adding dbfpy and dbfpy3 to setup.py's package list

Switching test order of formats

Putting dbf format testing ahead of TSV.  In some of my tests with
numeric DBF files, I encountered an issue where the ASCII horizontal
tab character (0x09) would appear in a numeric DBF.  Because of the
order of tabular format imports, though, format detection would
recognize it as a TSV and not as a DBF.

Adding my name to AUTHORS.

Adding a DBF property to tablib core

Documentation includes examples on how to explicitly load a DBF
straight from a file and how to load a DBF from a binary string.  Also,
how to write the binary data to a file.

Adding DBF format notes to README

Adding exclamation point to DBF section title

Matching formatting of XLS section

Updating setup.py to match current dev state

Setup.py had been updated since I forked the tablib repo, so I’m
updating setup.py to match its current structure while still
maintaining DBF compatibility.

Fixed callable column test

the test was sending a list instead of a function

CORE CONTRIBUTORS

🍰 @iurisilvio

v0.10.0

WHEELS

3.3, 3.4

makefile for WHEELS

v0.10.0 history

ALL

Separate py2 and py3 packages to avoid installation errors. Fix #151

Running travis and tox with python 3.4.
2014-08-21 22:06:42 -07:00
Iuri de Silvio 8479df725e Fix some http schemes to follow page scheme. 2014-08-10 11:47:13 -03:00
Iuri de Silvio 333deb2311 Merge pull request #160 from ustun/patch-1
Typo
2014-07-30 08:55:06 -03:00
Ustun Ozgur 0b714f21e1 Typo 2014-07-30 14:46:50 +03:00
Iuri de Silvio ae730b00b1 Merge pull request #154 from fusionbox/freeze-panes
Only freeze the headers row, not the headers columns
2014-06-26 14:00:12 -03:00
Iuri de Silvio 84e8b0384f Merge pull request #155 from fusionbox/update-unicodecsv
Update the vendored unicodecsv to fix None handling
2014-06-24 22:42:56 -03:00
Gavin Wahl 7a2842a8af Update the vendored unicodecsv to fix None handling
The old version of unicodecsv incorrectly (according to
https://docs.python.org/2/library/csv.html#csv.writer) encoded None
values as the string 'None', instead of the string '' as the Python
documentation specifies.

The newest version of unicodecsv has fixed this.

Fixes #121
2014-06-24 15:22:12 -06:00
Gavin Wahl 954bbdccf3 Only freeze the headers row, not the headers columns
Fixes #53
2014-06-16 15:31:00 -06:00
Iuri de Silvio 7acaa8460d Running travis and tox with python 3.4. 2014-05-27 21:18:14 -03:00
Iuri de Silvio 84e7e251ae Separate py2 and py3 packages to avoid installation errors. Fix #151 2014-05-27 19:25:15 -03:00
Kenneth Reitz dc868eff31 ALL 2014-05-27 12:52:57 -04:00
Kenneth Reitz 43356e908c v0.10.0 history 2014-05-27 12:52:43 -04:00
Kenneth Reitz f7acc19523 makefile for WHEELS 2014-05-27 12:51:51 -04:00
Kenneth Reitz c5972db8f0 Merge branch 'develop' 2014-05-27 12:51:30 -04:00
Kenneth Reitz 1cc051f3e8 .org 2014-05-27 12:49:23 -04:00
Kenneth Reitz 3da155ce0d 3.3, 3.4 2014-05-27 12:49:11 -04:00
Kenneth Reitz 9a34cf0980 WHEELS 2014-05-27 12:49:07 -04:00
Kenneth Reitz 434f66b4eb v0.10.0 2014-05-27 12:48:00 -04:00
Kenneth Reitz d056916c53 CORE CONTRIBUTORS
🍰 @iurisilvio
2014-05-27 12:47:54 -04:00
Iuri de Silvio cf5239f097 Merge pull request #150 from brad/csv-newlines
Allow csv fields to have multiple lines.
2014-05-02 11:25:30 -03:00
Brad Pitcher 49d8cb816f allow csv fields to have multiple lines 2014-05-01 08:12:39 -07:00
Iuri de Silvio fbd277ff2e Merge pull request #149 from brad/tests_fix
Load json to dict to workaround random dictionary hashes. Fix #147
2014-05-01 09:17:34 -03:00
Brad Pitcher 6f4572fa56 load json to dict to workaround random dictionary hashes 2014-04-30 16:27:20 -07:00
Iuri de Silvio 453fc8614c Add NOTICE and tests files to manifest
Tests are code.

The NOTICE file is about third-party licenses and is important too.
2014-04-23 15:01:31 -03:00
Iuri de Silvio 01cf58e431 Add travis badge to readme 2014-04-23 11:25:23 -03:00
Iuri de Silvio f6cd89c76c Fix DeprecationWarnings: assertEquals -> assertEqual 2014-04-19 15:36:00 -03:00
Iuri de Silvio 1e0f30e8a6 Add py33 to travis matrix 2014-04-19 15:26:00 -03:00
Iuri de Silvio 569d35bfca Exit with error when python setup.py test fails 2014-04-19 15:25:43 -03:00
Iuri de Silvio d40cdfbcd0 Merge pull request #146 from kennethreitz/fix/unicode_append
Fix test_unicode_append
2014-04-19 14:51:06 -03:00
Iuri de Silvio 86bbaf9bea Merge pull request #141 from fcurella/develop
added missing yaml3 module to setup.py
2014-04-19 14:48:19 -03:00
Iuri de Silvio 0ed01d85b9 Fix test_unicode_append 2014-04-19 12:41:21 -03:00
Iuri de Silvio fc4cc7fa14 Merge pull request #144 from aleasoluciones/develop
Remove `extend` from first example to make it simple.
2014-04-14 09:06:52 -03:00
papisz 70716fdd21 CSV custom delimiter support 2014-04-09 22:35:56 +02:00
Guillermo Pascual 1146ec2341 Update docs 2014-04-08 10:13:04 +02:00
Flavio Curella 1a7d597745 added missing package to setup.py 2014-03-10 12:56:33 -05:00
kennethreitz 56b627a561 Merge pull request #137 from gisce/fix_xlsx_detect_test
Use InvalidFileException to fix the test
2014-01-23 10:51:59 -08:00
Eduard Carreras 98e182bed2 Use InvalidFileException to fix the test 2014-01-23 18:15:46 +01:00
Iuri de Silvio c8a5563309 Maintain dataset title after sort. 2014-01-11 13:45:45 -02:00
Thomas Coopman c225a64d68 don't use ExcelWriter with databook 2014-01-11 12:56:11 -02:00
kennethreitz d611d5a14f Merge pull request #117 from iurisilvio/patch-1
Fix typo: avalable -> available
2014-01-08 11:48:28 -08:00
kennethreitz 45121ddd65 Merge pull request #63 from jsdalton/fix_unicode_error_in_html_output
Fix unicode error in html output
2014-01-08 11:47:26 -08:00
kennethreitz c74357cb20 Merge pull request #76 from djv/develop
xls and xlsx import support
2014-01-08 11:46:50 -08:00
kennethreitz 939b0af551 Merge pull request #110 from djrobstep/develop
Fix for a broken YAML test and tsv autodetection
2014-01-08 11:44:30 -08:00
kennethreitz 9c2018653f Merge pull request #111 from dec0dedab0de/develop
using readlines() in _tsv.py fixes a small bug.
2014-01-08 11:44:20 -08:00
kennethreitz 2bc6122ee8 Merge pull request #113 from dec0dedab0de/master
Fixed callable column test
2014-01-08 11:44:06 -08:00
kennethreitz 7f0748aac9 Merge pull request #116 from kachick/fix-tsv-typo
Fix some typos in TSV test comment
2014-01-08 11:43:50 -08:00
kennethreitz 41a5c67159 Merge pull request #119 from iurisilvio/empty_sheet
Remove XLSX empty sheet on export_book
2014-01-08 11:43:35 -08:00
kennethreitz 3efefcc8da Merge pull request #127 from medecau/develop
test python 3.3
2014-01-08 11:43:15 -08:00
kennethreitz d19de6025b Merge pull request #131 from fusionbox/quotes
remove extraneous quote marks
2014-01-08 11:41:15 -08:00
kennethreitz 65ba937c0d Merge pull request #129 from lexual/dataset_typo_fix
fix misspelling. hundres -> hundreds.
2014-01-08 11:41:05 -08:00
kennethreitz 79a2bb888f Merge pull request #135 from overthink/doc-fixes
Fix funny typo, refs to tablib.org
2014-01-08 11:40:59 -08:00
kennethreitz 25eacaf6f0 Merge pull request #130 from lndbrg/patch-1
Add pass to json property.
2014-01-08 11:17:23 -08:00
Mark Feeney c2a9af7fb3 Fix funny typo, refs to tablib.org 2013-11-27 12:38:55 -05:00
Gavin Wahl 3b06f3760d remove extraneous quote marks 2013-11-13 13:01:27 -07:00
Olle Lundberg e7ee3195a7 Add pass to json property.
To conform to the code for the other properties.
2013-11-11 21:57:17 +01:00
lexual 5bd2e3df52 fix misspelling. hundres -> hundreds. 2013-11-08 19:03:53 +11:00
Pedro Rodrigues 837b3f83e6 test python 3.3 2013-10-27 18:57:26 +00:00
kennethreitz ff8f23edd5 Merge pull request #98 from pfctdayelise/fixtests
Remove wrong/unused import so tests will actually run
2013-10-16 23:48:37 -07:00
Iuri de Silvio 5ffcfd56f2 Remove XLSX empty sheet on export_book 2013-09-16 10:28:50 -03:00
Iuri de Silvio 955c24c974 Fix typo: avalable -> available 2013-09-15 15:13:29 -03:00
Kenichi Kamiya 192a5efabb Fix some typos in TSV test comment 2013-08-31 21:04:57 +09:00
James Patrick Robinson Jr 1aafc7e2f4 Fixed callable column test
the test was sending a list instead of a function
2013-08-28 14:03:58 -04:00
James Patrick Robinson Jr 9e45b95d12 Removed import of openpyxl altogether
It's not needed for any of these tests, but if it were we would
need to check for the python version to import the right one.
2013-08-28 11:40:37 -04:00
James Patrick Robinson Jr d8f0a018ae safe_load is not working for book
yaml.safe_load() was not working for import_book,
changed it to use yaml.load() instead.
2013-08-28 11:24:56 -04:00
James Patrick Robinson Jr 7545f3726e changed import to reflect vendorized openpyxl 2013-08-28 09:45:30 -04:00
James Patrick Robinson Jr 85e2bd73fc put the install back in 2013-08-27 17:34:06 -04:00
James Patrick Robinson Jr 37033903c5 Merge branch 'master' into develop 2013-08-27 17:20:31 -04:00
James Patrick Robinson Jr 02c38c2520 edited travis to match master 2013-08-27 17:14:25 -04:00
James Patrick Robinson Jr 26748deb9f changed split('\r\n') to splitlines() 2013-08-27 11:57:43 -04:00
Robert Lechte 63f6cea132 Fixed tsv auto format detection. 2013-08-28 02:07:06 +12:00
Robert Lechte 1b035f9774 Changed yaml dumping to use safe_dump, for consistency with loading. 2013-08-28 01:58:30 +12:00
Kenneth Reitz 2c14486c33 @alex 2013-08-25 17:45:48 -04:00
Kenneth Reitz 8bc69c9d85 Merge pull request #109 from alex/patch-1
Write the example file reliably in the readme
2013-08-25 13:26:47 -07:00
Alex Gaynor d36a2cbd42 Write the example file reliably in the readme
The previous way doesn't work on PyPy or Jython, and emits warnings in recent python3s.
2013-08-25 12:11:46 -07:00
Brianna Laugher 1ab0eb3fae Remove wrong/unused import 2013-04-10 17:45:42 +10:00
Kenneth Reitz cd71e1a5b1 Merge pull request #94 from techniq/patch-1
Update CI docs (Jenkins->Travis)
2013-03-06 10:02:20 -08:00
Sean Lynch 47f79a7ca1 Update CI docs (Jenkins->Travis) 2013-01-20 23:22:56 -05:00
Kenneth Reitz 9f38efe413 Merge pull request #68 from msabramo/python3
Improve Python 3 compatibility
2012-11-15 18:56:50 -08:00
Kenneth Reitz 5d98239a7e Merge pull request #81 from weirdcanada/frozen-frame-fix
Frozen frame fix
2012-11-15 18:50:22 -08:00
Kenneth Reitz a3f0d02633 Merge pull request #89 from PiPeep/patch-1
Update url for pip vs easy_install in docs/install
2012-11-15 18:03:53 -08:00
Benjamin Woodruff b29007a0df Update url for pip vs easy_install in docs/install
The page referred to in the pip documentation has been moved. It
discusses the features that pip offers over easy_install.
2012-10-31 21:23:49 -03:00
Kenneth Reitz e75c3c1a66 Merge pull request #88 from pfmoore/develop
Remove __init__ from slots in ExcelFormula.py for Python 3.3 compatibility
2012-09-22 09:53:49 -07:00
Paul Moore 47cebbc328 Remove __init__ from slots in ExcelFormula.py for Python 3.3 compatibility 2012-09-21 23:35:24 +01:00
Aaron Levin e4c39524f7 another try at committing 2012-08-01 11:51:23 -04:00
Aaron Levin c88c794314 Fixed Frozen Frame issue in xlsx export 2012-08-01 11:45:12 -04:00
Kenneth Reitz 752443f077 Merge pull request #78 from waywardmonkeys/spelling
Fix typos.
2012-06-08 20:12:34 -07:00
Bruce Mitchener 7c0507bcce Fix typos. 2012-06-08 14:10:43 +07:00
Kenneth Reitz 652ac85549 Merge pull request #77 from rbonvall/fix-typos
Fix typos
2012-06-07 17:07:11 -07:00
Roberto Bonvallet 05ea3c35fc s/Jeckyl/Jekyll/ 2012-06-07 12:05:22 -04:00
Roberto Bonvallet d5fada7e1d s/ebpub/epub/ 2012-06-07 12:04:22 -04:00
Roberto Bonvallet 511c58d4e1 s/reqeust/request/ 2012-06-07 12:03:45 -04:00
Kenneth Reitz c469360a0e new domain 2012-06-05 11:19:56 +02:00
Daniel Velkov 97b4401b18 xls and xlsx import support 2012-06-01 11:11:15 -07:00
Kenneth Reitz 40e0f41b4c Merge pull request #72 from xando/develop
import_book method for xls format implemented
2012-05-16 12:13:57 -07:00
xando 39435727ba XLS import_book method implemented. 2012-05-16 17:22:14 +01:00
xando eda9d5af03 Generic method import_book (similar to import_set) to import data into Databook model. 2012-05-16 17:22:14 +01:00
Marc Abramowitz 15435047c6 Add myself to AUTHORS 2012-05-15 07:20:04 -07:00
Marc Abramowitz a3781e3c89 Changes for Python 3 compatibility, including vendorizing xlrd3 2012-05-15 07:19:15 -07:00
Marc Abramowitz 6a825a8a39 NOTICE: Add license info for xlrd3 and xlwt3 2012-05-15 07:18:15 -07:00
Marc Abramowitz 6a449d497a Add support for tox 2012-05-14 22:24:36 -07:00
Marc Abramowitz d807c60346 Tweak setup.py for py.test (pytest?) 2012-05-14 17:14:46 -07:00
Jim Dalton 71603662b1 Make sure codecs module loaded for all versions of Python 2012-05-10 11:29:41 -07:00
Jim Dalton 21c11b9911 Fix UnicodeError in HTML output
* Alter `test_unicode_append` so that actual unicode characters outside the ASCII bytestring range are tested.

* Make sure output of `render` in markup.py is unicode

* Add wrapper around output of `export_set` in _html.py so that unicode characters are output.
2012-05-10 11:14:17 -07:00
Kenneth Reitz e8c923d712 Merge pull request #58 from jqb/develop
Support for Dataset subclassing
2012-04-20 06:17:05 -07:00
Kenneth Reitz bc581c08df Update NOTICE 2012-04-20 10:16:28 -03:00
Kenneth Reitz 4f9c9d09ec ODFPy license
(which seems to be missing a copyright for some reason)
2012-04-20 10:15:36 -03:00
Kenneth Reitz 63e8a7172d Merge pull request #61 from bmihelac/patch-1
tablib.org domain expired
2012-04-04 05:41:25 -07:00
Bojan Mihelac 45e0af9f0e tablib.org domain expired 2012-04-04 15:35:16 +03:00
Kuba Janoszek fa6f5b3af3 Databook.add_sheet test for not Dataset subclass added. 2012-03-13 00:21:32 +01:00
Kuba Janoszek 0528e0a500 AUTHORS updated 2012-03-13 00:14:51 +01:00
Kuba Janoszek 8e83734985 Databook.add_sheet accepts Dataset subclasses 2012-03-13 00:05:24 +01:00
Kenneth Reitz 783eccc67d skip install 2012-02-23 06:31:50 -05:00
Kenneth Reitz 7236415f42 travis 2012-02-23 06:20:57 -05:00
Kenneth Reitz c0a3c3ea1e travis test 2012-02-23 06:17:01 -05:00
Jan Brauer 14bd964fb1 Fix #50 - Catch YAML ScannerError 2012-01-29 17:18:30 +01:00
Kenneth Reitz 6bfc6634ba index update 2012-01-28 01:23:54 -05:00
mellort 54affad292 ref #48. makes Dataset more like a duck with extend() 2012-01-28 01:17:15 -05:00
Kenneth Reitz 7c963a0f4d SOPA 2012-01-18 11:24:18 -05:00
Kenneth Reitz 02f27f15c5 Merge pull request #47 from VanL/develop
Add detect function in _xls. Update yaml, csv, and tsv detection functio...
2012-01-05 21:37:51 -08:00
VanL 9c65515e7a Add detect function in _xls. Update yaml, csv, and tsv detection functions to catch other errors when faced with invalid input. 2012-01-06 00:12:06 +00:00
Kenneth Reitz c87a954a9e Merge pull request #43 from svetlyak40wt/develop
Render table in Markdown format on unicode(dataset). Closes #41.
2011-12-24 23:05:03 -08:00
Kenneth Reitz 42e40ed0ab use yaml safe_load (thanks @toastdriven) 2011-11-02 02:35:59 -03:00
Alexander Artemenko 23ab6c4724 Render table in Markdown format on unicode(dataset). Closes #41. 2011-10-16 11:00:06 +04:00
Kenneth Reitz 32a09ccd6a Edited AUTHORS via GitHub 2011-08-31 02:16:16 -03:00
Kenneth Reitz 81a7f79b3d Merge pull request #37 from jfriedly/patch-1
Fixed a few typos.
2011-08-30 22:15:49 -07:00
Joel Friedly 05c9b33003 Fixed a few typos. 2011-08-25 23:33:29 -03:00
Kenneth Reitz ec7273d02d that wasn't right. 2011-08-15 23:29:19 -04:00
Kenneth Reitz 19ee1997b5 really need to use testing branches.. 2011-08-15 22:49:14 -04:00
Kenneth Reitz f01d65c2e9 I don't remember merging that.. 2011-08-15 22:45:35 -04:00
Kenneth Reitz 9778a96351 tuples didn't have index method in the past.
…why?
2011-08-15 22:43:12 -04:00
Kenneth Reitz 906138b138 a column w/ no length could work 2011-08-11 00:47:23 -04:00
Mike Waldner 43c68b396f Fixing magic number in test 2011-08-10 20:05:17 -04:00
Mike Waldner d611233c80 Throwing InvalidDimensions when append_col with header is called but only headers exist
Related #33
2011-08-10 19:52:06 -04:00
Kenneth Reitz 3d02b866ce Merge branch 'append_col_docs' of https://github.com/mawaldne/tablib into develop 2011-08-09 21:48:32 -04:00
Mike Waldner 887ee2fbac Adding documentation changes for append_col
Related #21
2011-08-09 20:52:09 -04:00
Kenneth Reitz bfd211854a Added Mike Waldner to Authors.
#34
2011-08-08 06:48:34 -04:00
Kenneth Reitz bc75911500 Merge branch 'html_None_fix' of https://github.com/mawaldne/tablib into develop 2011-08-08 06:47:47 -04:00
Mike Waldner a2b4e4c6ba Replace None with empty string before creating td 2011-08-07 19:19:54 -04:00
Kenneth Reitz fde6f11763 Merge branch 'feature/xls-import' of https://github.com/xdissent/tablib into develop 2011-07-14 15:16:01 -04:00
Kenneth Reitz 33a83316df Merge branch 'fix_pickle_bug_2' of https://github.com/cswegger/tablib into develop 2011-07-14 15:15:42 -04:00
Greg Thornton f6d7888d9e Added xls detection. 2011-07-14 13:47:07 -05:00
Greg Thornton c19e2f2c5b Added xlrd license to NOTICE. 2011-07-14 13:11:33 -05:00
Greg Thornton eaa2b9b8ea Added XLS import support 2011-07-14 13:08:06 -05:00
Luca Beltrame 2f8083bda6 Fix also __slots__ to ensure proper unpickling 2011-07-14 10:28:12 +02:00
Luca Beltrame 2c5a9af76e Fix pickling (again). Unit tests still pass. 2011-07-14 09:36:35 +02:00
Mark Walling e74a8f41cc Created get_col method with tests and tutorial.rst update
Useful when you have multiple columns with the same header
2011-07-11 17:26:21 -04:00
Kenneth Reitz cd5aa4fc06 toxless 2011-07-04 14:36:08 -04:00
Kenneth Reitz 1d460bac40 setup.py changes 2011-07-04 14:27:42 -04:00
Kenneth Reitz 4a3fde37a3 tox cleanups 2011-07-04 14:05:48 -04:00
Kenneth Reitz 62ad123ad8 updated history 2011-07-04 05:49:41 -04:00
Kenneth Reitz fefc7b4d1f Merge branch 'unicodeheaders' of https://github.com/mwalling/tablib into develop 2011-07-04 05:48:37 -04:00
Mark Walling 6313437a27 Added support for detecting unicode column headers
Also added tests!

Fix for kennethreitz#26
2011-07-01 17:53:38 -04:00
Kenneth Reitz 23a5bb1443 yay 2011-06-30 23:00:26 -04:00
Mark Walling 864f29cc4b Updated some docstrings in core.py
* Binary warning for CSV output, because if you don't, Excel gets upset when Python translates \r\n to \r\n\r\n
 * Cleaned up what looked like a couple of copy paste errors
2011-06-30 22:38:57 -04:00
Kenneth Reitz c136b794a7 Merge branch 'develop' 2011-06-30 16:29:10 -04:00
Kenneth Reitz d254c2d2b0 dynamic columns bugfix for @mwalling :) 2011-06-30 16:28:56 -04:00
Kenneth Reitz 9b235150cf v0.9.10 (packaging fix) 2011-06-23 06:46:24 -04:00
Kenneth Reitz 9f3e6eeaa1 oops 2011-06-23 05:37:09 -04:00
Kenneth Reitz 51728f954f Merge codeplane.com:kennethreitz/tablib into develop 2011-06-22 13:34:22 -04:00
Kenneth Reitz 2949b7c656 A change. 2011-06-22 13:27:24 -04:00
Kenneth Reitz 07d243bbc9 testing GitHub for Mac 2011-06-22 13:16:09 -04:00
Kenneth Reitz bf3484e606 release date 2011-06-21 23:04:42 -04:00
Kenneth Reitz 9b2ab6fae9 Merge branch 'release/0.9.9' 2011-06-21 23:01:46 -04:00
Kenneth Reitz 7a3d55daab test cleanups 2011-06-21 22:58:14 -04:00
Kenneth Reitz eec0595c5c new column methods in tutorial 2011-06-21 20:35:18 -04:00
Kenneth Reitz 0c7c248b96 installation updates 2011-06-21 20:32:44 -04:00
Kenneth Reitz 0d14f7f2b9 Jenkins 2011-06-21 20:28:56 -04:00
Kenneth Reitz d5f713024d setup.py fixes 2011-06-21 20:26:05 -04:00
Kenneth Reitz 415bc819e7 __version__ 2011-06-21 20:17:05 -04:00
Kenneth Reitz 974258094e tablib version in docs 2011-06-21 20:15:47 -04:00
Kenneth Reitz ab16f69be6 big history update 2011-06-21 20:08:28 -04:00
Kenneth Reitz 28d9af852a 0.9.9 2011-06-21 20:04:48 -04:00
Kenneth Reitz 39c6ea6503 lpop/rpop 2011-06-21 20:03:50 -04:00
Kenneth Reitz 39b66ad8e9 add row pop 2011-06-21 20:02:12 -04:00
Kenneth Reitz 004b3da680 Major API Changes
Related #21
2011-06-21 19:42:56 -04:00
Kenneth Reitz d4923533eb style fixes 2011-06-21 19:07:24 -04:00
Kenneth Reitz 29e0b76910 better setup.py pattern 2011-06-21 19:04:03 -04:00
Kenneth Reitz 4f54de2630 Stick w/ utf-8. Easy enough to modify.
Related: #18.
2011-06-21 19:00:27 -04:00
Kenneth Reitz 1f0d68ee79 utf-8-sig encoding for csv/tsv (for excel).
Fixes #18.
2011-06-21 18:56:44 -04:00
Kenneth Reitz cae8fa1276 ujson 2011-06-21 18:52:01 -04:00
Kenneth Reitz 4c0a20a7b9 staying with MIT License, for now. 2011-06-21 18:51:54 -04:00
Kenneth Reitz 6c1fa87138 tox cleanup 2011-06-21 01:26:16 -04:00
Kenneth Reitz 0e30255836 bugfix 2011-06-20 12:57:24 -04:00
Kenneth Reitz 1156d5a220 NOTICE update 2011-06-20 12:56:39 -04:00
Kenneth Reitz 83b71967b9 integrate omnijson 2011-06-20 12:55:43 -04:00
Kenneth Reitz 4dab48cd76 add omnijson 2011-06-20 12:55:37 -04:00
Kenneth Reitz 5324526329 remove anyjson 2011-06-20 12:55:30 -04:00
Kenneth Reitz 1dfcd42233 whitespace 2011-06-05 18:50:36 -04:00
Kenneth Reitz f162b19bd6 todo cleanup 2011-06-05 18:43:08 -04:00
Kenneth Reitz 707164e459 fixes #17 2011-05-25 12:12:04 -04:00
Kenneth Reitz 42f0a285c3 gaug.es 2011-05-24 18:30:14 -04:00
Kenneth Reitz d111cc7cc7 testimonial cleanup 2011-05-24 17:19:13 -04:00
Kenneth Reitz 25fe211a22 fix setup packages 2011-05-23 11:20:10 -04:00
Kenneth Reitz 4b675494c4 Merge branch 'feature/apache' into develop 2011-05-22 20:10:33 -04:00
Kenneth Reitz a196b9a5dd readme update 2011-05-22 20:10:14 -04:00
Kenneth Reitz 5ba56c2bb3 Turn off OrderedDict for yaml.
Fixes #12.
2011-05-22 19:52:24 -04:00
Kenneth Reitz 36fbdda492 setup.py improvements
closes #5
2011-05-22 19:43:29 -04:00
Kenneth Reitz 273d2729ee Apache v2 2011-05-22 19:36:38 -04:00
Kenneth Reitz 3036bc9e52 abandon 2011-05-22 15:45:34 -04:00
Kenneth Reitz b9c74eacc8 lower case 2011-05-22 15:41:10 -04:00
Kenneth Reitz 805ccfae34 mention formats 2011-05-22 15:39:28 -04:00
Kenneth Reitz fddc018394 datestamp 2011-05-22 15:34:27 -04:00
Kenneth Reitz 2477100062 Merge branch 'release/0.9.8' into develop 2011-05-22 15:34:05 -04:00
Kenneth Reitz 983b979fda Merge branch 'release/0.9.8' 2011-05-22 15:33:49 -04:00
Kenneth Reitz 3edb45bac7 version bump (v0.9.8)! 2011-05-22 15:33:40 -04:00
Kenneth Reitz 29d626fa1f 2.x bytesio fix 2011-05-22 15:29:11 -04:00
Kenneth Reitz 1f22fc7321 BytesIO 2011-05-22 15:22:22 -04:00
Kenneth Reitz 8631f60f8d Python3 ods fix 2011-05-22 15:13:48 -04:00
Kenneth Reitz 65873b6112 BytesIO 2011-05-22 15:13:40 -04:00
Kenneth Reitz 56e44bd45c csv compatibility 2011-05-22 15:06:52 -04:00
Kenneth Reitz 87e65fd3e7 Merge pull request #14 from f4nt/tablib
---

This should provide basic support for OpenDocument spreadsheets. I didn't have py2.7 installed to test with in tox, but 2.5, 2.6, and py3k passed all tests with tox for me. Lemme know if you see any issues I may have glossed over.

Conflicts:
	tablib/compat.py
2011-05-22 14:07:17 -04:00
Kenneth Reitz ffbc3b122d pass 2011-05-22 14:04:47 -04:00
Kenneth Reitz 9d71603dad Installation link 2011-05-19 12:20:17 -04:00
Mark Rogers cceb41af98 ods support 2011-05-18 16:12:42 -05:00
Kenneth Reitz 60ffa898fd removed meh testimonial 2011-05-16 04:15:34 -04:00
Kenneth Reitz a4a211b5a6 theme update 2011-05-16 02:18:03 -04:00
Kenneth Reitz c9766a48b0 docs update 2011-05-16 02:08:37 -04:00
Kenneth Reitz 6975685b89 theme update 2011-05-15 19:53:18 -04:00
Kenneth Reitz e920244a1b testimonials 2011-05-15 17:08:26 -04:00
Kenneth Reitz ea63779baf syntax fix 2011-05-15 13:29:54 -04:00
Kenneth Reitz d826f6d0ae Orgs 2011-05-15 13:28:15 -04:00
Kenneth Reitz f6fa3f2abc change (c) attribution 2011-05-15 13:28:10 -04:00
Mark Rogers eed6df45e0 Bolding still doesn't work :( 2011-05-15 09:00:47 -05:00
Mark Rogers cb4c67767a py3k tests now pass 2011-05-15 08:35:29 -05:00
Mark Rogers 1e21fee70e start of the py3k port of odfpy 2011-05-14 16:44:23 -05:00
Mark Rogers 420dd36ab8 Tidied up a bit, renamed _odf to _ods like it should have been. Bold not working yet :( 2011-05-14 16:13:17 -05:00
Mark Rogers 9a05770899 proof of concept works. Onto styling and tidying 2011-05-14 14:52:16 -05:00
Mark Rogers 8e055f1c57 adding odfpy to packages 2011-05-14 14:32:03 -05:00
Kenneth Reitz 239e33aaed subtle format cleanups 2011-05-14 10:10:02 -04:00
Kenneth Reitz bf4fdea187 fewer 2/3 mappings 2011-05-14 10:06:54 -04:00
Kenneth Reitz 03086052ed Merge pull request #11 from cswegger/tablib
---

This change applies the same unicode CSV fix for TSV files, since all it's done in the exporter is changing a few parameters of the CSV module.

All unit tests are still passing after this change.
2011-05-14 10:02:29 -04:00
Kenneth Reitz 2128473938 license/support update 2011-05-13 21:15:09 -04:00
Kenneth Reitz 74c64d66a9 pypy-1.5 2011-05-13 20:51:54 -04:00
Kenneth Reitz a4e77f22c4 .orig? geeze.. 2011-05-13 01:47:16 -04:00
Kenneth Reitz 2e03046a07 docs fix 2011-05-13 01:46:37 -04:00
Kenneth Reitz 06a7b4cd4e new roadmap 2011-05-13 01:42:42 -04:00
Kenneth Reitz 6a70b84166 docs on xlsx 2011-05-13 01:34:24 -04:00
Kenneth Reitz 77d9fe8b41 Merge branch 'develop' 2011-05-13 01:27:44 -04:00
Kenneth Reitz 64cb547e0a missing modules 2011-05-13 01:27:38 -04:00
Kenneth Reitz 9146de36d4 Merge branch 'release/0.9.7' 2011-05-13 01:15:06 -04:00
Kenneth Reitz 9761ff5e9e hmmm 2011-05-13 01:13:50 -04:00
Kenneth Reitz e5259cbb58 version bump 2011-05-13 01:13:25 -04:00
Kenneth Reitz 56ef89424f fallback on from xml.etree.ElementTree for pypy 2011-05-13 01:05:11 -04:00
Kenneth Reitz 4a01299293 v0.9.7 release notes 2011-05-13 00:40:36 -04:00
Kenneth Reitz 9399bf2fe7 no vendored tests 2011-05-13 00:32:02 -04:00
Kenneth Reitz cbdaa09e83 success!! 2011-05-13 00:30:03 -04:00
Kenneth Reitz f30e760657 less compat reliance 2011-05-13 00:29:51 -04:00
Kenneth Reitz a60e2f132e Finally! :sparkles:Python 3 port of openpyxl 2011-05-13 00:28:50 -04:00
Kenneth Reitz 2b36d71554 additional compat mappings 2011-05-12 19:16:37 -04:00
Kenneth Reitz 690de63b7c ugh 2011-05-12 19:16:24 -04:00
Kenneth Reitz 6b6ef70c61 Convert openpyxl to relative imports 2011-05-12 17:54:00 -04:00
Kenneth Reitz 322283b8f9 Merge pull request #9 from f4nt/tablib
---

I thought I was going to wait til later to do this, but once I got started I couldn't stop myself I guess. I believe this fixes the row limit issues that I mentioned earlier by adding XLSX support. I've tested it up to 75k rows. Believe the tests and setup.py should be squared away properly as well. I'm a bit concerned about the pane freezing stuff, as that's completely undocumented in openpyxl. Truth be told, I barely even know what that functionality does. I hate spreadsheets :)

Anyways, let me know if you want any changes or anything made. Don't have to worry about hurting my feelings or anything!

Conflicts:
	tablib/core25.py
	test_tablib.py
2011-05-12 17:40:17 -04:00
Kenneth Reitz 3968729903 html out 2011-05-12 17:38:53 -04:00
Kenneth Reitz 7b1e533e39 Merge pull request #8 from cswegger/tablib
---

This pull request is to fix pickling / unpickling of Row within Dataset. __getstate__ resembled a dictionary comprehension (Python 2.7+) but it wasn't, and it caused wrong values to be pickled, leading to unusable objects after restoring.

This patch fixes the issues. All unit tests still pass.

Conflicts:
	tablib/core25.py
2011-05-12 17:35:54 -04:00
Kenneth Reitz 8dd7d73abc compat module notes 2011-05-12 17:27:17 -04:00
Kenneth Reitz 176c9615d6 Merge branch 'feature/compat' into develop
Conflicts:
	tablib/core25.py
2011-05-12 17:26:08 -04:00
Kenneth Reitz c65fd4201f Merge branch 'compat' into feature/compat 2011-05-12 17:24:35 -04:00
Kenneth Reitz 11bca4f7a2 callable check fix 2011-05-12 17:20:22 -04:00
Kenneth Reitz 2b5818598a compat module 2011-05-12 17:17:10 -04:00
Kenneth Reitz 79fb82d69d compat module 2011-05-12 17:00:13 -04:00
Mark Rogers 5350355fbe a bunch of cleanup from my previous commit 2011-05-12 15:59:57 -05:00
Kenneth Reitz 85673b365c no more core25 2011-05-12 16:26:22 -04:00
Mark Rogers 87ce64d4c8 Merely a proof of concept, soon to be tidied up 2011-05-12 15:12:46 -05:00
Kenneth Reitz 2cd381389c Merge pull request #8 from cswegger/tablib
---

This pull request is to fix pickling / unpickling of Row within Dataset. __getstate__ resembled a dictionary comprehension (Python 2.7+) but it wasn't, and it caused wrong values to be pickled, leading to unusable objects after restoring.

This patch fixes the issues. All unit tests still pass.
2011-05-12 10:14:39 -04:00
Luca Beltrame 35f21cf73e Fix pickling/unpickling of Dataset instances. 2011-05-12 16:04:46 +02:00
Kenneth Reitz 0ebc8f5e1b Merge branch 'release/0.9.6' into develop 2011-05-12 02:53:15 -04:00
Kenneth Reitz 865ce62782 Merge branch 'release/0.9.6' 2011-05-12 02:53:12 -04:00
Kenneth Reitz 3b961c59e7 version bump 2011-05-12 02:52:35 -04:00
Kenneth Reitz 4be341be4f history: unicode+csv support
refs #7
2011-05-12 02:35:07 -04:00
Kenneth Reitz 2c4337b317 Merge branch 'develop' into feature/compat 2011-05-12 02:31:16 -04:00
Kenneth Reitz 0e4128c73e Erik Youngren to authors 2011-05-12 02:30:39 -04:00
Kenneth Reitz 4ebd66cb09 Merge branch 'bug/csv-unicode' into develop
Closes #7

Conflicts:
	test_tablib.py
2011-05-12 02:28:03 -04:00
Kenneth Reitz bfcfa37ebb Python3 support for csv module.
Refs #7
2011-05-12 02:24:14 -04:00
Kenneth Reitz 5c50c1822e integration of unicodecsv module
refs #7
2011-05-12 02:01:07 -04:00
Kenneth Reitz 2e5577ee91 move csv-unicode branch to bug/csv-unicode
refs #7
2011-05-12 01:52:48 -04:00
Kenneth Reitz 84e4bd9a47 added csv/unicode test 2011-05-12 01:47:49 -04:00
Kenneth Reitz 7270ce49e1 testing webhook 2011-05-11 23:37:38 -04:00
Kenneth Reitz c3052cc02c kill fabfile 2011-05-11 22:57:12 -04:00
Kenneth Reitz 999c49a4f0 initial compat module 2011-05-11 19:01:35 -04:00
Kenneth Reitz 59c996f9df history update 2011-05-11 18:21:44 -04:00
Kenneth Reitz a2b62669b7 seperator => separator 2011-05-11 17:58:31 -04:00
Kenneth Reitz 15e25ef735 no more reqs 2011-05-10 21:24:56 -04:00
Kenneth Reitz 7ae7d3ff46 Update TODOs 2011-04-05 08:56:20 -04:00
Kenneth Reitz 69ed718191 make it mobile! 2011-03-24 16:46:00 -04:00
Kenneth Reitz 328d3880d5 added google analytics to generated docs 2011-03-24 13:53:03 -04:00
Kenneth Reitz ea3cc847a0 updated roadmap 2011-03-24 06:16:34 -04:00
Kenneth Reitz 8efab51355 Merge branch 'develop' 2011-03-24 06:11:02 -04:00
Kenneth Reitz e42d215833 sphinx version # fix. 2011-03-24 06:09:15 -04:00
Kenneth Reitz 10bc5549c9 Merge branch 'release/0.9.5' 2011-03-24 05:58:50 -04:00
Kenneth Reitz 1a5e2ecb33 release date bump 2011-03-24 05:57:22 -04:00
Kenneth Reitz e1bf189847 setup.py says 25,26,27,30,31,32 2011-03-23 05:47:49 -04:00
Kenneth Reitz 0785328e21 updated HISTORY 2011-03-23 04:53:24 -04:00
Kenneth Reitz 6ba0cc9af3 3.1, 3.2 support. 2011-03-23 04:49:07 -04:00
Kenneth Reitz 36876205e7 3.2 compatibility 2011-03-23 04:48:58 -04:00
Kenneth Reitz 1b97b7191e pre-version bump 2011-03-23 04:24:25 -04:00
Kenneth Reitz 8b575df419 version check in setup.py fix 2011-03-23 04:24:12 -04:00
Kenneth Reitz 6a3928759a 25 support 2011-03-23 04:13:49 -04:00
Kenneth Reitz 63348d883b fix imports for 2x 2011-03-23 04:04:03 -04:00
Kenneth Reitz 5dce600969 xls import fix 2011-03-23 04:00:05 -04:00
Kenneth Reitz 0913b54f47 this isn't an apple 2011-03-23 03:56:07 -04:00
Kenneth Reitz c5bbc74b96 import magic 2011-03-23 03:55:23 -04:00
Kenneth Reitz 7f5342a1b8 no need for simplejson anymore 2011-03-23 02:27:40 -04:00
Kenneth Reitz d42f9bc10f python3 tox config for jenkins 2011-03-23 02:23:43 -04:00
Kenneth Reitz c6565c9e29 2.5 compatible version checking 2011-03-23 02:22:10 -04:00
Kenneth Reitz 1a9343750e Merge branch 'feature/3.x' 2011-03-23 02:08:40 -04:00
Kenneth Reitz 8a393214c8 add tox tests for 3.x 2011-03-23 02:08:04 -04:00
Kenneth Reitz b8ed741a36 Same codebase for 2.x and 3.x! 2011-03-23 02:07:39 -04:00
Kenneth Reitz cddbd78a61 autoload 3 modules if using 3 2011-03-23 02:02:59 -04:00
Kenneth Reitz b113f49ce6 python3 AND python2 packages. 2011-03-23 01:59:19 -04:00
Kenneth Reitz 1429b9f8c4 xlwrt3 if python 3 2011-03-23 01:56:57 -04:00
Kenneth Reitz 42700f98a5 check python version 2011-03-23 01:47:10 -04:00
Kenneth Reitz 0e56db632a xlwt3 package 2011-03-23 01:47:01 -04:00
Kenneth Reitz b07512071e cyaml fixes 2011-03-23 01:38:31 -04:00
Kenneth Reitz e4881809d6 BytesIO in xls for 3.x 2011-03-23 01:38:20 -04:00
Kenneth Reitz 54ab300d2d markup changes for 3.x 2011-03-23 01:37:33 -04:00
Kenneth Reitz 4368d64317 xlwrt cleanups 2011-03-23 01:37:14 -04:00
Kenneth Reitz 117344de14 2to3! 2011-03-23 01:13:16 -04:00
Kenneth Reitz 58bc1c7dcf add xlwt 3.x 2011-03-23 01:13:03 -04:00
Kenneth Reitz 4c8b5e72e3 remove 2.x xlwt 2011-03-23 01:12:48 -04:00
Kenneth Reitz b900236157 testing whitespace cleanup 2011-03-23 00:50:27 -04:00
Kenneth Reitz dc14a16e04 added formatter tests 2011-03-23 00:49:32 -04:00
Kenneth Reitz 2d2ac9b708 col not key 2011-03-23 00:49:25 -04:00
Kenneth Reitz 1efcb7a63d Merge branch 'feature/formatters' 2011-03-23 00:39:16 -04:00
Kenneth Reitz 65c73dfc42 do some internal validation for adding formatters 2011-03-23 00:38:45 -04:00
Kenneth Reitz 3803a7a21b formatter execution in place upon export 2011-03-23 00:32:56 -04:00
Kenneth Reitz 8b5b29fc90 added Dataset.add_formatter 2011-03-23 00:20:39 -04:00
Kenneth Reitz e8ba765426 tablib.helpers is no longer needed 🍰 2011-03-23 00:03:33 -04:00
Kenneth Reitz 57001a5465 Merge https://github.com/mwhooker/tablib into develop 2011-03-22 23:57:42 -04:00
Matthew Hooker c8493ff047 fixes issue #38: python 2.5 support 2011-03-22 17:16:30 -04:00
Kenneth Reitz 03914323c2 Merge https://github.com/playpauseandstop/tablib into develop 2011-03-01 17:06:15 -05:00
Igor Davydenko 2f331cee8e Fix #24, add support of spaces in CSV files. 2011-03-01 19:36:19 +02:00
Kenneth Reitz e1734f2315 added __docformat__ 2011-02-21 14:07:42 -05:00
Kenneth Reitz 22cddbcd63 well then 2011-02-21 02:55:24 -05:00
Kenneth Reitz 76f09cd3b3 Added release number to documentation. 2011-02-21 02:37:53 -05:00
Kenneth Reitz d11c09febe Added License text. 2011-02-21 02:37:29 -05:00
Kenneth Reitz 9ab277a468 docs: Python 2.5 is supported now. 2011-02-21 02:32:20 -05:00
Kenneth Reitz 23c1831144 Added HACKING file. 2011-02-21 02:15:00 -05:00
Kenneth Reitz 1cf9bd14b4 ci.kennethreitz.com now 2011-02-21 02:12:41 -05:00
Kenneth Reitz c2331f7a23 Updated pythons list. 2011-02-21 02:11:59 -05:00
Kenneth Reitz bccf0d1ba1 no more test suite. 2011-02-18 04:37:00 -05:00
Kenneth Reitz c219972ccd added pypy location to tox config 2011-02-18 04:28:50 -05:00
Kenneth Reitz 52e9d44739 typo 2011-02-18 03:42:54 -05:00
Kenneth Reitz e94ecd8472 Small doc updates 2011-02-18 03:41:54 -05:00
Kenneth Reitz 96067e6380 Merge branch 'release/0.9.4' 2011-02-18 03:36:27 -05:00
Kenneth Reitz 1cc0f7d1f4 Version Bump (v0.9.4) 2011-02-18 03:34:59 -05:00
Kenneth Reitz f685bf548e latex stylee 2011-02-18 03:27:28 -05:00
Kenneth Reitz ca336926da junit's out 2011-02-18 03:27:12 -05:00
Kenneth Reitz 1aa3d3b06a removing coverage from scm 2011-02-18 03:15:39 -05:00
Kenneth Reitz be576135b2 Added 0.9.4 to History 2011-02-18 03:14:12 -05:00
Kenneth Reitz 0c05d0497e Added OrderedDict support. 2011-02-18 03:13:44 -05:00
Kenneth Reitz 52e307ea35 Docstring update 2011-02-18 02:59:07 -05:00
Kenneth Reitz 5cac9bd97e Python 2.5 added to compatible language list 2011-02-18 02:42:57 -05:00
Kenneth Reitz a285e993f1 add simplejson to requirements for 2.5 2011-02-18 02:42:37 -05:00
Kenneth Reitz 0ed367a31c I can see how that would cause a problem. 2011-02-18 02:34:59 -05:00
Kenneth Reitz c4815c24cc i haz the skillz 2011-02-18 02:07:42 -05:00
Kenneth Reitz 20fe1e0153 py.test now 2011-02-18 01:55:56 -05:00
Kenneth Reitz 5db8d1c3a6 that makes more sense 2011-02-18 01:44:59 -05:00
Kenneth Reitz 828017f9a7 wtf? is that even valid? 2011-02-18 01:34:18 -05:00
Kenneth Reitz cff8a6ac9a i've got to figure this out 2011-02-18 01:31:21 -05:00
Kenneth Reitz aa8590e8b8 json 2011-02-18 01:30:17 -05:00
Kenneth Reitz d2de647c47 added simplejson to tox config. 2011-02-18 01:29:16 -05:00
Kenneth Reitz 7afef680f5 fixed nose issue 2011-02-18 01:26:30 -05:00
Kenneth Reitz 35763f8c24 fix tox configuration 2011-02-17 20:31:02 -05:00
Kenneth Reitz cc3d020914 clear nosetests each time 2011-02-17 20:15:35 -05:00
Kenneth Reitz b8b5405f1c setup.py package fix 2011-02-17 20:13:01 -05:00
Kenneth Reitz b7aebbc74f anyjson in setup.py 2011-02-17 20:11:08 -05:00
Kenneth Reitz d776d78df5 Added AnyJSON to json format system 2011-02-17 20:09:07 -05:00
Kenneth Reitz 6f9365d376 Added AnyJSON 2011-02-17 20:02:14 -05:00
Kenneth Reitz 621b1bd45c Added AnyJSON license. 2011-02-17 20:02:07 -05:00
Kenneth Reitz be21b6fadd Remove vendorized SimpleJSON 2011-02-17 20:01:59 -05:00
Kenneth Reitz 832bfbbb1b this should work better 2011-02-17 19:54:10 -05:00
Kenneth Reitz 288b15fb54 tox coverage 2011-02-17 19:37:52 -05:00
Kenneth Reitz 73df22303b no more failing tests? 2011-02-17 16:47:42 -05:00
Kenneth Reitz 4c125bd206 8tabstop? really? 2011-02-17 16:36:28 -05:00
Kenneth Reitz ff0de1377a wow, that was ugly 2011-02-17 16:35:36 -05:00
Kenneth Reitz ccb29c68fa *shutter* making everyone else happy 2011-02-17 16:31:52 -05:00
Kenneth Reitz e077a7f2bc typo 2011-02-16 13:17:03 -05:00
Kenneth Reitz dcc52bdc18 Added Benjamin Wohlwend to AUTHORS 2011-02-14 04:16:21 -05:00
Benjamin Wohlwend 9cac54eefc Python 2.5 doesn't support @property.setter 2011-02-14 10:06:02 +01:00
Kenneth Reitz f69a96f07e Readme.rst improvements 2011-02-13 21:11:02 -05:00
Kenneth Reitz ca77ed6f64 documentation url style update 2011-02-10 10:18:58 -05:00
Kenneth Reitz 806aba9ef3 spelling corrections 2011-02-10 10:18:30 -05:00
Kenneth Reitz 23cbc0c333 More dynamic __slots__ 2011-02-03 13:52:53 -05:00
Kenneth Reitz 34ab54de77 Merge branch 'master' into develop 2011-02-02 21:34:08 -05:00
Kenneth Reitz 0843a15879 configuration fix for RTD 2011-02-02 21:33:41 -05:00
Kenneth Reitz 08ed309382 Merge branch 'release/0.9.3' into develop 2011-01-31 01:36:14 -05:00
Kenneth Reitz 26b6faa88d Merge branch 'release/0.9.3' 2011-01-31 01:35:52 -05:00
Kenneth Reitz 140736ff33 fabfile typo. 2011-01-31 01:34:40 -05:00
Kenneth Reitz 5379c5683d Markup license notice.
PD? Really?
2011-01-31 01:33:12 -05:00
Kenneth Reitz e8b44b5777 Version bump. 2011-01-31 01:33:00 -05:00
Kenneth Reitz a0822bc9b0 sorting update. 2011-01-31 01:29:41 -05:00
Kenneth Reitz 89b431213b Sorting update for headerless datasets. 2011-01-31 01:28:10 -05:00
Kenneth Reitz 695e8c5af7 Merge branch 'feature/sorting' into release/0.9.3 2011-01-31 00:58:27 -05:00
Kenneth Reitz 0797ec67d4 Prepping for new release (0.9.3) 2011-01-31 00:58:16 -05:00
Kenneth Reitz 1852624a7e Merge pull request #28 from cswegger/tablib
---

This patch provides simple column-based sorting to tablib. A (passing) unit-test is also included.
2011-01-22 18:35:34 -05:00
Luca Beltrame f81dc41a57 Support for sorting. Unit-tested. 2011-01-11 20:53:59 +01:00
Kenneth Reitz 34415b89b8 New Year! 2011-01-10 19:28:12 -05:00
Kenneth Reitz d25655588b TODO update. 2010-12-13 17:08:11 -05:00
Kenneth Reitz 22c4d185e1 Export HTML for Databooks. 2010-11-21 21:33:01 -05:00
Kenneth Reitz e3b3659ea4 whitespace fix 2010-11-21 21:32:00 -05:00
Kenneth Reitz 22d337790a small changes to html output 2010-11-21 18:58:30 -05:00
Kenneth Reitz 0784d4b32c Updated todo w/ new html output feature 2010-11-21 18:55:45 -05:00
Kenneth Reitz 332c5bccd9 Merge branch 'feature/html_out' into develop 2010-11-21 18:53:21 -05:00
Kenneth Reitz 7055d18a2e History update. 2010-11-21 18:53:18 -05:00
Kenneth Reitz 6a7c685111 Import path fix. 2010-11-21 18:49:02 -05:00
Kenneth Reitz 0e5b8f7058 Merge branch 'fix_databooks' into develop 2010-11-21 18:44:24 -05:00
Luca Beltrame e3e6b656e3 Fix the stupid mistake. 2010-11-21 13:17:36 +01:00
Luca Beltrame 99896a5f28 Fix Databook data leaks. 2010-11-21 13:14:47 +01:00
Luca Beltrame 25da44f569 Support for HTML (export only). Unit-tested. Depends on the "markup.py"
package(http://markup.sourceforge.net) which is included in packages/
Notice that the tests now depend on the presence of markup.py.
2010-11-21 13:00:56 +01:00
Kenneth Reitz 7727171379 Merge branch 'release/0.9.2' 2010-11-17 21:00:56 -05:00
Kenneth Reitz 91bd4eb9c7 Updated history 2010-11-17 21:00:13 -05:00
Kenneth Reitz 9b74b139fd Ordered dict in TODO 2010-11-17 21:00:01 -05:00
Kenneth Reitz 823a543f41 Version bump (v0.9.2) 2010-11-17 20:58:50 -05:00
Kenneth Reitz 1aa275bf99 Updated TODO. 2010-11-17 20:55:38 -05:00
Kenneth Reitz 17bb0d3b2c Merge branch 'feature/stacking' into develop 2010-11-17 20:49:08 -05:00
Kenneth Reitz 1a9aee9289 Column stacking only requires headers if headers exist. 2010-11-17 20:48:50 -05:00
Kenneth Reitz 196edb82cc trailing whitespace 2010-11-17 20:02:08 -05:00
Kenneth Reitz a2990d5852 Change stacking method names. 2010-11-17 20:01:31 -05:00
Kenneth Reitz d992ece86a Merge branch 'stacking' into feature/stacking 2010-11-17 19:56:04 -05:00
Kenneth Reitz 46f302255d Updated prophesy. 2010-11-17 19:54:50 -05:00
Kenneth Reitz 9e3ab4c13f Support for locked header row. 2010-11-17 19:50:22 -05:00
Kenneth Reitz eaed0e48c2 Formatting. 2010-11-17 19:50:05 -05:00
Kenneth Reitz 501187b357 Merge branch 'feature/pickling' into develop 2010-11-17 19:15:51 -05:00
Kenneth Reitz ea4aef88b6 Subtle format fixes. 2010-11-17 19:15:36 -05:00
Luca Beltrame 24d800fac3 Support for pickling/unpickling Row objects. Makes Datasets pickleable. 2010-11-17 23:03:43 +01:00
Luca Beltrame d8136ab613 Whitespace 2010-11-17 22:51:43 +01:00
Luca Beltrame 36bbe2726b Remove unneeded import 2010-11-15 09:00:57 +01:00
Luca Beltrame 1427be2901 Support for row and column stacking. Unit-tested. 2010-11-15 08:59:49 +01:00
Kenneth Reitz 10ce000d31 Updated changelog. 2010-11-11 11:02:14 -05:00
Kenneth Reitz a91254117c Added ordered dict library. 2010-11-11 11:02:07 -05:00
Kenneth Reitz b67762604f Merge branch 'transpose' into develop 2010-11-11 10:59:08 -05:00
Kenneth Reitz 83a8346e8f Added ordered dict license. 2010-11-11 10:58:48 -05:00
Luca Beltrame 657ab98d04 Support for Dataset transposition. Unit-tested. 2010-11-11 09:00:06 +01:00
Kenneth Reitz 9ddb4de942 Documentation typo. 2010-11-09 12:27:15 -05:00
Kenneth Reitz 5fad80a540 Update column append examples. 2010-11-09 08:43:34 -05:00
Kenneth Reitz cabab73045 Spacing fixes. 2010-11-09 08:42:51 -05:00
Kenneth Reitz 2bb0525990 Optimized set intersection for tag checking. 2010-11-05 09:46:14 -04:00
Kenneth Reitz f364bb576e Merge branch 'release/0.9.1' into develop 2010-11-04 12:08:08 -04:00
Kenneth Reitz 09d057094e Merge branch 'release/0.9.1' 2010-11-04 12:07:52 -04:00
Kenneth Reitz 8082c4ad43 Version bump (v0.9.1). 2010-11-04 12:07:37 -04:00
Kenneth Reitz 00e9ae0120 Minor bug was causing reference shadowing. 2010-11-04 12:05:39 -04:00
Kenneth Reitz f01c22213e Merge branch 'develop' 2010-11-04 05:54:01 -04:00
Kenneth Reitz a58bf269d9 Typo. 2010-11-04 05:53:45 -04:00
Kenneth Reitz 437a135dd3 Merge branch 'release/0.9.0' into develop 2010-11-04 05:47:36 -04:00
Kenneth Reitz 0409ff50af Merge branch 'release/0.9.0' 2010-11-04 05:47:25 -04:00
Kenneth Reitz dd24edcc24 Big history update. 2010-11-04 05:47:13 -04:00
Kenneth Reitz cf28f4baa8 Merge branch 'release/0.9.0' 2010-11-04 05:43:54 -04:00
Kenneth Reitz 52dcf79c41 No append documentation necessary. 2010-11-04 05:43:44 -04:00
Kenneth Reitz 49f098ee22 Verb-age update for documentation. 2010-11-04 05:43:23 -04:00
Kenneth Reitz 642b1d8def Exception documentation update. 2010-11-04 05:43:00 -04:00
Kenneth Reitz f6964bba8f Version bump. 2010-11-04 04:49:37 -04:00
Kenneth Reitz 8d6e75ad20 Fixes for 0.9.0. 2010-11-04 04:49:31 -04:00
Kenneth Reitz 30487999ba CI done. 2010-11-04 04:47:52 -04:00
Kenneth Reitz b74308e81e Append fixed. 2010-11-04 04:47:25 -04:00
Kenneth Reitz 577289cbc3 Callable Columns again :) 2010-11-04 04:46:54 -04:00
Kenneth Reitz cf10703e31 Updated Callable Columns support. 2010-11-04 04:46:38 -04:00
Kenneth Reitz 778ad0265e Added new required headers for adding columns. 2010-11-04 04:26:03 -04:00
Kenneth Reitz e3dedb8887 Cleanup todo. 2010-11-04 04:22:50 -04:00
Kenneth Reitz c6e240fa52 Cleanups. 2010-11-04 04:22:37 -04:00
Kenneth Reitz 5c747c9c2e Keepin' it DRY. 2010-11-04 04:20:45 -04:00
Kenneth Reitz 0bbd990ed8 whitespace fix. 2010-11-04 04:13:09 -04:00
Kenneth Reitz fcada243a2 Added new Row class and Dataset.filter(). 2010-11-04 04:13:02 -04:00
Kenneth Reitz fca8ad6182 Ugh.... 2010-11-04 03:55:42 -04:00
Kenneth Reitz 35d9e390fd New todo. 2010-11-04 01:33:12 -04:00
Kenneth Reitz 8ca180c461 Documentation configuration changes for colors. 2010-11-04 01:20:45 -04:00
Kenneth Reitz ff63558a67 Added TSV to Readme. 2010-11-04 01:07:04 -04:00
Kenneth Reitz f621b56178 TODO! 2010-11-04 01:06:17 -04:00
Kenneth Reitz 2b529bcb1c Quotation constancy. 2010-11-04 01:06:07 -04:00
Kenneth Reitz 90c3435600 TODO Update. 2010-11-04 01:02:33 -04:00
Kenneth Reitz 1fa28ee2ca Added test_suite.sh script. 2010-11-04 01:01:54 -04:00
Kenneth Reitz a5cae7c249 Added Luca Beltrame to AUTHORS. 2010-11-04 00:59:06 -04:00
Kenneth Reitz 666991ca1e Merge branch 'master' of github.com:kennethreitz/tablib into develop 2010-11-04 00:57:55 -04:00
Kenneth Reitz 5f4162918f New site URL. 2010-11-04 00:57:25 -04:00
Kenneth Reitz b554ce36bb Official removal of cli interface. Bad idea. 2010-11-04 00:57:18 -04:00
Kenneth Reitz e5e22d3ca2 Documentation typo fix. 2010-11-04 00:57:12 -04:00
Kenneth Reitz 8626351618 Official removal of cli interface. Bad idea. 2010-11-04 00:56:31 -04:00
Kenneth Reitz cdfacb6d6e Whitespace. 2010-10-26 05:53:07 -07:00
Kenneth Reitz 108c9de130 Merge branch 'tsv' into develop 2010-10-19 11:12:00 -04:00
Luca Beltrame 271aeebf56 Merge branch 'tsv_origin' into tsv_format 2010-10-19 10:49:10 +02:00
Luca Beltrame e75a00541d Support for TSV-files. Unit-tested. 2010-10-19 10:45:54 +02:00
Kenneth Reitz 3b0e0c7991 Updates. 2010-10-10 10:01:51 -04:00
Kenneth Reitz 23440fb7a3 Documentation update. 2010-10-10 06:23:11 -04:00
Kenneth Reitz 459f310857 Trying a few things. 2010-10-10 06:22:59 -04:00
Kenneth Reitz f9021f53c2 Future release? 2010-10-10 04:37:16 -04:00
Kenneth Reitz 7fda829d27 Documentation update. 2010-10-10 04:37:09 -04:00
Kenneth Reitz ca08ac8a7b Documentation update. 2010-10-10 03:03:57 -04:00
Kenneth Reitz 08b51113d3 Added seamless deletion of columns. 2010-10-10 03:03:50 -04:00
Kenneth Reitz 3e391fc8e3 Auto version usage. 2010-10-10 02:33:03 -04:00
Kenneth Reitz a230844914 Docs update. 2010-10-10 02:32:52 -04:00
Kenneth Reitz bc82be09c5 Big Documentation Upgrade. 2010-10-10 02:32:41 -04:00
Kenneth Reitz ed9fe01604 Added column insertion.
Documentation update.
2010-10-08 15:47:10 -04:00
Kenneth Reitz e69546a0ff Major documentation update. 2010-10-08 15:46:50 -04:00
Kenneth Reitz d4b659ece9 documentation update 2010-10-08 11:50:43 -04:00
Kenneth Reitz 55eb3f93e3 documentation update 2010-10-08 11:49:53 -04:00
Kenneth Reitz be7182aea9 installation documentation update. 2010-10-08 09:41:19 -04:00
Kenneth Reitz 48def2cba6 Documentation Update. Site should be up soon. 2010-10-07 17:52:21 -04:00
Kenneth Reitz df8c0335d1 Fixed incorrect packaging. 2010-10-07 16:01:27 -04:00
Kenneth Reitz d0b09f0fce Doc upgrades. 2010-10-07 16:01:17 -04:00
Kenneth Reitz 9efd982bfa Documentation update. 2010-10-07 16:01:09 -04:00
Kenneth Reitz a3c82804cd Simple fix. 2010-10-06 20:01:52 -04:00
Kenneth Reitz 2e75e93f57 Merge branch 'release/0.8.5' into develop 2010-10-06 15:46:55 -04:00
Kenneth Reitz a26d782e88 Merge branch 'release/0.8.5' 2010-10-06 15:46:44 -04:00
Kenneth Reitz f5c0c5c34d Updated History. 2010-10-06 15:46:25 -04:00
Kenneth Reitz d9aee8e605 Version bump (v0.8.5) 2010-10-06 15:13:40 -04:00
Kenneth Reitz 315a082b70 Whoops. 2010-10-06 15:11:14 -04:00
Kenneth Reitz 120ce9fcd6 No dependencies! 2010-10-06 15:10:58 -04:00
Kenneth Reitz 914a82eac9 Added NOTICE file for other licenses. 2010-10-06 14:35:23 -04:00
Kenneth Reitz 3931bcb4e6 Added vendorized JSON. 2010-10-06 14:27:05 -04:00
Kenneth Reitz 471e56c387 Added new vendorized package namespaces. 2010-10-06 14:26:54 -04:00
Kenneth Reitz 8553dbc040 Added simplejson to packages. 2010-10-06 13:45:48 -04:00
Kenneth Reitz 96c93871cf Vendorized XLWT. 2010-10-06 13:42:52 -04:00
Kenneth Reitz a54949bc08 Removed external dependencies, but utilize them if
available.
2010-10-06 13:42:26 -04:00
Kenneth Reitz ed686c2391 Packages is a package. 2010-10-06 12:48:56 -04:00
Kenneth Reitz 143677be77 New exception style. 2010-10-06 12:48:43 -04:00
Kenneth Reitz b2c35c2543 Fallback on vendorized yaml. 2010-10-06 12:23:58 -04:00
Kenneth Reitz 9dfd9d0c8e Rely on built-in JSON. 2010-10-06 12:23:30 -04:00
Kenneth Reitz 140e23c980 Vendorized yaml. 2010-10-06 12:23:14 -04:00
Kenneth Reitz ac797f1eda Added cofiguration. 2010-10-05 17:30:27 -04:00
Kenneth Reitz 7c90595364 Heavy documentation update. 2010-10-05 17:30:13 -04:00
Kenneth Reitz 38ac98fdb2 Typo in readme. 2010-10-05 17:29:55 -04:00
Kenneth Reitz 07c7d172d9 No build! 2010-10-05 17:29:49 -04:00
Kenneth Reitz 9c7707be60 Added generic doctesting. 2010-10-05 17:29:43 -04:00
Kenneth Reitz 14bee65208 Added kr theme. 2010-10-05 17:29:32 -04:00
Kenneth Reitz 4fc5e0655d setup.py updated 2010-10-04 16:22:39 -04:00
Kenneth Reitz da2e670d0d Oops. 2010-10-04 16:12:42 -04:00
Kenneth Reitz 5912bf4870 none should have an __in__ method. 2010-10-04 16:04:36 -04:00
Kenneth Reitz 28e9d7e23e Whitespaces. 2010-10-04 16:04:02 -04:00
Kenneth Reitz 930d38cf5a Merge branch 'release/0.8.4' 2010-10-04 15:52:43 -04:00
Kenneth Reitz 5e433c263d version bump v0.8.4 2010-10-04 15:51:45 -04:00
Kenneth Reitz 19ac9b9716 Updated history for v0.8.4. 2010-10-04 15:51:05 -04:00
Kenneth Reitz 6feb59504a Version bump. 2010-10-04 15:50:52 -04:00
Kenneth Reitz 817eedd6f5 Only wrap when needed. 2010-10-04 15:50:41 -04:00
Kenneth Reitz 4d1c5a9996 Merge branch 'release/0.8.3' into develop 2010-10-04 11:55:59 -04:00
Kenneth Reitz 520a1986d7 Merge branch 'release/0.8.3' 2010-10-04 11:55:48 -04:00
Kenneth Reitz 1ea793112c Version Bump (v0.8.3) 2010-10-04 11:55:35 -04:00
Kenneth Reitz 41a7a5d329 No cli app at this time. 2010-10-04 11:55:26 -04:00
Kenneth Reitz c4edaa2ca8 Appended history. 2010-10-04 11:55:17 -04:00
Kenneth Reitz c612bb3dae Merge branch 'master' into develop 2010-10-04 11:52:12 -04:00
Kenneth Reitz c223dfbdf1 Merge branch 'hotfix/0.8.2' 2010-10-04 11:40:45 -04:00
Kenneth Reitz 49bd48b016 namspace fix. 2010-10-04 11:39:59 -04:00
Kenneth Reitz c6d90bc825 Updated history. 2010-10-04 11:39:05 -04:00
Kenneth Reitz bcd0e37a65 Version bump. 2010-10-04 11:38:28 -04:00
Kenneth Reitz 8c92e878a3 Upgraded XLS abstraction layer. 2010-10-04 11:38:17 -04:00
Kenneth Reitz da2b011358 Added separator support for XLS output. 2010-10-04 11:33:34 -04:00
Kenneth Reitz a8b0bf4b5f Typo. 2010-10-04 11:33:16 -04:00
Kenneth Reitz 6574d3e58b XLS support for Separators.
Bolden headers and Separators.
2010-10-04 10:54:14 -04:00
Kenneth Reitz 1020799828 Separator append and insert support. 2010-10-04 10:53:48 -04:00
Kenneth Reitz 333e73f892 Added wrapping support. 2010-10-04 10:19:31 -04:00
Kenneth Reitz bfe70066b8 Added Josh Ourisman to authors 2010-10-01 18:44:50 -04:00
Kenneth Reitz fbfbe01b70 Merge branch 'joshmerge' into develop 2010-10-01 17:56:31 -04:00
Kenneth Reitz 06a394ea5c typo in setup.py. 2010-10-01 17:52:50 -04:00
Kenneth Reitz 9427decdb0 Changes. 2010-10-01 17:52:08 -04:00
Kenneth Reitz fb59035f8d Added tablib.import_set() and tested accordingly. 2010-10-01 17:52:08 -04:00
Kenneth Reitz 187d12cffc Format Auto-detection in place.
Test suite updated.
2010-10-01 17:52:08 -04:00
Kenneth Reitz eaa4de7793 Auto-detectors operational. 2010-10-01 17:52:08 -04:00
Kenneth Reitz d479c5735a Hmmm.... 2010-10-01 17:52:08 -04:00
Kenneth Reitz 96668bb393 tabbed runner 2010-10-01 17:52:08 -04:00
Kenneth Reitz b369baba40 Added runner (for testing). 2010-10-01 17:52:08 -04:00
Kenneth Reitz 25f846a78a Added entrance point, setup.py updates. 2010-10-01 17:52:08 -04:00
Kenneth Reitz 22fe18239f Added legacy cli interface. 2010-10-01 17:51:30 -04:00
Josh Ourisman 149bafa97b added ability to append new column passing a callable as the value that will be applied to every row; w/ test 2010-10-01 16:17:04 -04:00
Josh Ourisman 9f7fec2379 changing syntax of checking for row and col values in append(); slightly more robust this way 2010-10-01 15:27:28 -04:00
Josh Ourisman 762ac39e27 resolved merge conflict 2010-10-01 14:57:36 -04:00
Josh Ourisman 2a7aa959b3 modified .gitignore to actually ignore .pyc files 2010-10-01 14:51:36 -04:00
Kenneth Reitz d85523b6a6 typo in setup.py. 2010-09-28 09:01:34 -04:00
Kenneth Reitz 6407afba3e typo fix. 2010-09-28 08:46:31 -04:00
Kenneth Reitz 25a5bcea0c merge 2010-09-28 08:45:14 -04:00
Kenneth Reitz 7aada68952 Merge branch 'hotfix/8.0.1' 2010-09-28 08:37:43 -04:00
Kenneth Reitz 5ba92b0f6b Packaging fix.
Version bump.
2010-09-28 08:37:32 -04:00
Kenneth Reitz f58d4b67dc Changes. 2010-09-28 08:33:57 -04:00
Kenneth Reitz a310ab7a09 Added tablib.import_set() and tested accordingly. 2010-09-25 18:35:10 -04:00
Kenneth Reitz 7f2f925ddb Format Auto-detection in place.
Test suite updated.
2010-09-25 18:09:44 -04:00
Kenneth Reitz 3fc898e222 Auto-detectors operational. 2010-09-25 18:03:03 -04:00
Kenneth Reitz de46f45e2e Hmmm.... 2010-09-25 17:36:20 -04:00
Kenneth Reitz 392eaac299 tabbed runner 2010-09-25 17:28:46 -04:00
Kenneth Reitz 3a9c3944cf Added runner (for testing). 2010-09-25 17:27:53 -04:00
Kenneth Reitz 8c402da729 Added entrance point, setup.py updates. 2010-09-25 17:27:04 -04:00
Kenneth Reitz 8feb6e8ddf Added legacy cli interface. 2010-09-25 17:26:53 -04:00
Kenneth Reitz 9072b6ddae Merge branch 'release/0.8.0' into develop 2010-09-25 17:19:06 -04:00
Kenneth Reitz 9f26c23eb5 Merge branch 'release/0.8.0' 2010-09-25 17:18:51 -04:00
Kenneth Reitz 8136f4b09e Updated history for v0.8.0. 2010-09-25 17:18:48 -04:00
Kenneth Reitz 7e7ad73ddd Merge branch 'release/0.8.0' into develop 2010-09-25 17:13:25 -04:00
Kenneth Reitz f889910629 Big documentation update. 2010-09-25 17:12:50 -04:00
Kenneth Reitz 969d9d957d Version Bump (to v0.8.0) 2010-09-25 16:59:27 -04:00
Kenneth Reitz 86d84b555d Import cleanup. 2010-09-25 16:53:33 -04:00
Kenneth Reitz 66867527d2 Format import cleanups. 2010-09-25 16:51:09 -04:00
Kenneth Reitz 7505d8d985 Adding docstrings for pylint coverage. 2010-09-25 16:49:21 -04:00
Kenneth Reitz d5515c17b8 Removed useless imports. 2010-09-25 16:47:04 -04:00
Kenneth Reitz 07ac723971 Readme update for imports. 2010-09-25 16:46:52 -04:00
Kenneth Reitz 5d7843ea59 Merge branch 'feature/imports' into develop 2010-09-25 15:57:30 -04:00
Kenneth Reitz b5f0cf9d37 Tests elegant book imports. 2010-09-25 15:56:43 -04:00
Kenneth Reitz a73bbe1645 Elegant databook importers. 2010-09-25 15:56:20 -04:00
Kenneth Reitz f1bdf43aab Book wiper. 2010-09-25 15:50:06 -04:00
Kenneth Reitz 7623bfe7b0 Updated tests for set imports. 2010-09-25 15:40:05 -04:00
Kenneth Reitz 59ccc0b422 YAML input support. 2010-09-25 15:39:09 -04:00
Kenneth Reitz 99154aa6d6 Merge branches 'feature/import-seamless' and 'feature/imports' into feature/imports 2010-09-25 15:24:59 -04:00
Kenneth Reitz 65836d5ace Updated elegant imports for instance properties.
Data wipes.
2010-09-25 15:24:16 -04:00
Kenneth Reitz 4117503ed5 Elegant imports in place! 2010-09-25 15:23:01 -04:00
Kenneth Reitz dfa26a7d53 Typos. 2010-09-25 10:49:06 -04:00
Kenneth Reitz 4f035caf1b Added dataset wipe. 2010-09-25 10:40:59 -04:00
Kenneth Reitz a9c7a5067d Added dataset wipe. 2010-09-25 06:22:40 -04:00
Kenneth Reitz 80cb42e8dd Archaic imports in place! 2010-09-25 06:20:34 -04:00
Kenneth Reitz 8d7e5732cd Typo. 2010-09-25 05:59:02 -04:00
Kenneth Reitz 942dd3dadf Added tablib core docstring placeholder. 2010-09-25 05:58:40 -04:00
Kenneth Reitz b1d282744c Docstring updates. 2010-09-25 05:57:42 -04:00
Kenneth Reitz 4c0c879d65 Updated tests. 2010-09-25 05:53:19 -04:00
Kenneth Reitz cab63e02c8 Module namespace change. 2010-09-25 05:53:13 -04:00
Kenneth Reitz 63d025888a Added format importers. 2010-09-25 05:49:21 -04:00
Kenneth Reitz 5a993ac281 Working on it. 2010-09-25 05:49:14 -04:00
Kenneth Reitz 666dd1d2c7 Pylint preps. 2010-09-25 05:17:03 -04:00
Kenneth Reitz ac1666e3ae removing garbage 2010-09-25 05:14:07 -04:00
Kenneth Reitz 5b7e817db2 Only CSV Left. 2010-09-25 05:11:57 -04:00
Kenneth Reitz f9c168e4bc Added coverage bin. 2010-09-25 05:08:35 -04:00
Kenneth Reitz 82f3d84c7d Added docstring. 2010-09-25 05:06:04 -04:00
Kenneth Reitz 121cf46aec Corrected always-false condition. 2010-09-25 05:04:51 -04:00
Kenneth Reitz 4bb4a05bcb Longer varnames for pylint. 2010-09-25 05:02:58 -04:00
Kenneth Reitz e52b8dd329 Added methods to struct for pylint. 2010-09-25 05:01:05 -04:00
Kenneth Reitz 93fb89b8b6 Cleanup * imports. 2010-09-25 04:58:24 -04:00
Kenneth Reitz c01b66a16a Moving that back. 2010-09-25 04:53:20 -04:00
Kenneth Reitz c3fa29a166 Added public method for pylint. 2010-09-25 04:51:56 -04:00
Kenneth Reitz 8d6a52aaf5 Cleanups for pylint. 2010-09-25 04:49:31 -04:00
Kenneth Reitz 703b1da04c General cleanups for pylint. 2010-09-25 04:45:22 -04:00
Kenneth Reitz 0e6bd079cc Improved docstring. 2010-09-25 04:43:45 -04:00
Kenneth Reitz 579dbf0cc0 Added docstring.
Removed unneeded import.
2010-09-25 04:43:39 -04:00
Kenneth Reitz fbabb430ca small setup.py fix 2010-09-25 04:04:36 -04:00
Kenneth Reitz b8f923f8c5 added Luke Lee to Authors 2010-09-25 04:03:01 -04:00
Kenneth Reitz fbe6fe1612 fix old push 2010-09-25 02:55:21 -04:00
Kenneth Reitz 17e90e71e5 test 2010-09-25 02:54:43 -04:00
Kenneth Reitz dc21825f34 Merge branch 'release/0.7.1' 2010-09-20 21:39:47 -04:00
Kenneth Reitz 7364995eaa Version bump (v0.7.1) 2010-09-20 21:39:27 -04:00
Kenneth Reitz 3407170b99 Updated TODO. 2010-09-20 21:37:32 -04:00
Kenneth Reitz dd13744c92 Documentation update for properties. 2010-09-20 21:37:08 -04:00
Kenneth Reitz 31e4c39762 Updated tests for reverted methods. 2010-09-20 21:34:01 -04:00
Kenneth Reitz 4fc70957ac Reverted methods back to properties. 2010-09-20 21:33:48 -04:00
Kenneth Reitz 7f17ccf445 Merge branch 'hotfix/dict' into develop 2010-09-20 14:37:36 -04:00
Kenneth Reitz fbcc3b60af Merge branch 'hotfix/dict' 2010-09-20 14:37:26 -04:00
Kenneth Reitz 9b3268f0ad Whoops. 2010-09-20 14:37:10 -04:00
Kenneth Reitz f386ef8ac8 Merge branch 'feature/unicode' into develop 2010-09-20 14:18:55 -04:00
Kenneth Reitz e8f5e023c4 Version bump (v0.7.0). 2010-09-20 14:18:31 -04:00
Kenneth Reitz 81445aeec8 Updated readme to reflect property to method changes. 2010-09-20 14:05:15 -04:00
Kenneth Reitz f94a236122 Changed export properties to methods. 2010-09-20 14:04:02 -04:00
Kenneth Reitz bfbb7c626f Moved from cStringIO to StringIO. More stable. 2010-09-20 12:50:10 -04:00
Kenneth Reitz be0f77f9ee Merge branch 'release/0.6.4' into develop 2010-09-20 09:21:51 -04:00
Kenneth Reitz 3b44349090 Version bump (0.6.4). 2010-09-20 09:21:02 -04:00
Kenneth Reitz 04a16afa58 Chmox. 2010-09-20 09:14:20 -04:00
Kenneth Reitz a8632125dc Merge branch 'master' into dev 2010-09-20 09:07:31 -04:00
Kenneth Reitz ccf2ebcde2 Version bump (v0.6.4) 2010-09-20 08:57:49 -04:00
Kenneth Reitz 2c60ce9233 String decoding to avoid unicode collisions for XLS output. 2010-09-19 23:51:48 -04:00
Kenneth Reitz 649c7e8bb7 Removed unneeded tuple_check. 2010-09-19 23:31:05 -04:00
Kenneth Reitz 2d3dc5ef71 PEP257. 2010-09-19 23:26:37 -04:00
Kenneth Reitz efc516f366 PEP8. 2010-09-19 23:23:03 -04:00
Kenneth Reitz b2a51fd941 Merge branch 'durden' into develop 2010-09-19 23:13:29 -04:00
Luke Lee d54d70bc22 Added test for csv export 2010-09-19 17:04:14 -05:00
Luke Lee 391ad61bef Improved del test
- Added testing for data set width/height
2010-09-19 16:41:23 -05:00
Luke Lee 99a45814d1 Added tests del functionality 2010-09-19 16:36:17 -05:00
Luke Lee fad3546614 Added docstrings 2010-09-19 16:25:18 -05:00
Luke Lee 7ba2849829 Misc. PEP8 whitespace celeanup 2010-09-19 16:16:31 -05:00
Luke Lee 7ec0f2ef07 Attempt at merging upstream develop branch
- Kept the slicing tests in tact by leaving their setup info. in the main setup
- Moved around some of the test methods to organize them a bit by functionality
2010-09-19 16:14:27 -05:00
Luke Lee bd470684a4 Ignore file update
- Update ignoring of python leftovers
- Added vi noise
2010-09-19 16:06:47 -05:00
Kenneth Reitz dbcea81c17 Inline docs. 2010-09-16 00:59:58 -04:00
Kenneth Reitz 49dc4a249e Removed useless is_string function. 2010-09-15 23:46:56 -04:00
Kenneth Reitz 7cd82f956f Version Bump. 2010-09-15 23:46:40 -04:00
Kenneth Reitz 13c3e537fd reamde update 2010-09-14 00:09:04 -04:00
Kenneth Reitz f913853cae Merge branch 'release/0.6.3' 2010-09-14 00:07:19 -04:00
Kenneth Reitz ea1de420a3 Merge branch 'release/0.6.3' 2010-09-14 00:02:38 -04:00
Kenneth Reitz d0c8df95a3 Version bump. v0.6.3. 2010-09-14 00:02:14 -04:00
Kenneth Reitz bb4e97f8aa Updated readme for column additions. 2010-09-14 00:01:59 -04:00
Kenneth Reitz ffaeb64639 Merge branch 'feature/add-cols' into develop 2010-09-13 23:56:08 -04:00
Kenneth Reitz f31ec562b4 Extensively testing 2010-09-13 23:55:17 -04:00
Kenneth Reitz 68d7204b2d Added data.append(col=[]) support. 2010-09-13 23:25:49 -04:00
Luke Lee 52db1ddc3e Fixed typo in test from previous commit 2010-09-13 21:27:35 -05:00
Luke Lee 4755020dd7 Added extra row to base data set
- Testing with 3 rows is a bit more interesting
2010-09-13 21:26:15 -05:00
Luke Lee 5468dd7e67 Added test for slicing data elements 2010-09-13 21:23:20 -05:00
Luke Lee 8673710ddb Refactored creation of data set into setUp
- Broke out tuples for more robust comparisions
2010-09-13 21:08:31 -05:00
Luke Lee f01cf184d4 Added simple test for slicing by headers 2010-09-13 21:03:29 -05:00
Luke Lee 1482ca4a19 Adding docstrings 2010-09-13 20:32:36 -05:00
Luke Lee 93c6c39581 Misc. pep8 cleanups including spaces after ',' and blank line organization 2010-09-13 20:23:31 -05:00
Kenneth Reitz a0cb44cc43 Made Struct really powerful. 2010-09-13 20:03:46 -04:00
Kenneth Reitz b2cd061773 Updated Roadmap 2010-09-13 18:13:20 -04:00
Kenneth Reitz 876b849950 mend 2010-09-13 17:44:28 -04:00
Kenneth Reitz 40c9e09578 Merge branch 'release/0.6.2' 2010-09-13 17:26:56 -04:00
Kenneth Reitz 9f5379fcc7 Version Bump (0.6.2). 2010-09-13 17:24:39 -04:00
Kenneth Reitz 9ecc57dbf7 Added header property to prevent invalid headers being set. 2010-09-13 17:22:02 -04:00
Kenneth Reitz a7471f7302 Testing fixtures for fixed bugs. 2010-09-13 17:21:40 -04:00
Kenneth Reitz 70211b71e0 Updated Readme.rst 2010-09-13 17:18:25 -04:00
Kenneth Reitz ddf4b441b0 Fixed exception catch, Fixes Issue #5. 2010-09-13 16:50:08 -04:00
Kenneth Reitz a0509126e0 Added simple unit-testing structure. 2010-09-13 16:49:11 -04:00
Kenneth Reitz ec5b1cf3e0 Merge branch 'develop' 2010-09-13 16:14:34 -04:00
Kenneth Reitz 3fb729aac6 Removed non-working unit-tests. 2010-09-13 16:11:07 -04:00
Kenneth Reitz 647f69044f Cleaning up code a bit. 2010-09-13 16:07:27 -04:00
Kenneth Reitz 3da34af76f Removed un-implimented junk from core.py 2010-09-13 16:06:37 -04:00
Kenneth Reitz 5b42824871 Removed vendorized packages. 2010-09-13 16:06:14 -04:00
Kenneth Reitz 89209b6bd3 Moving tabbed cli to future feature branch. 2010-09-13 16:03:11 -04:00
Kenneth Reitz 9362d3283b Version bump (v0.6.1) 2010-09-13 15:51:15 -04:00
Kenneth Reitz 123851a737 README.rst update. 2010-09-13 15:49:18 -04:00
Kenneth Reitz 54973c276c Added reqs.txt 2010-09-13 15:48:08 -04:00
Kenneth Reitz 275ac9d194 Setup.py typo 2010-09-12 14:12:30 -04:00
Kenneth Reitz 4ecd5888af Readme issue 2010-09-12 14:03:26 -04:00
Kenneth Reitz 359f12c83c Another readme update. 2010-09-12 13:55:52 -04:00
Kenneth Reitz de8d76fdae Merge branch 'develop' 2010-09-12 13:52:05 -04:00
Kenneth Reitz f188e3dd87 Heavy readme update. 2010-09-12 13:50:59 -04:00
Kenneth Reitz 3dff8f5b79 updated readme a bit 2010-09-12 13:28:55 -04:00
Kenneth Reitz c24d2dd45d Docstring updates. 2010-09-12 13:17:21 -04:00
Kenneth Reitz 4e98563483 Encodings are important. 2010-09-12 13:16:05 -04:00
Kenneth Reitz ca17c9f965 Proper usage of MANIFEST. 2010-09-12 13:15:38 -04:00
Kenneth Reitz 41c4fcc59f Added authors and others to manifest. 2010-09-12 13:15:21 -04:00
Kenneth Reitz e9166b14fd Cleaned up fabfile. 2010-09-12 13:13:43 -04:00
Kenneth Reitz a5528d731e Updated verbage in AUTHORS. 2010-09-12 13:13:08 -04:00
Kenneth Reitz 1a122f2a4d gitignore 2010-09-12 13:12:21 -04:00
Kenneth Reitz e9d9350e43 merge for version bump into master 2010-09-12 12:51:13 -04:00
Kenneth Reitz ac4b568cba Updates for push. 2010-09-12 11:45:31 -04:00
Kenneth Reitz 8be372b8cc Readme cleanups. 2010-09-11 23:14:56 -04:00
Kenneth Reitz f8d8d3058a Hmm 2010-09-11 23:09:13 -04:00
Kenneth Reitz d03ba7e532 Optimizations. 2010-09-11 23:09:06 -04:00
Kenneth Reitz 35102ab951 Prepping for distribution. 2010-09-11 23:08:57 -04:00
Kenneth Reitz b8587e5cb0 Packaging for distribution. 2010-09-11 23:08:48 -04:00
Kenneth Reitz da4f2013f1 Unvendorized packages. 2010-09-11 23:08:31 -04:00
Kenneth Reitz ff069b1604 version bump 2010-09-08 18:07:52 -04:00
Kenneth Reitz 2994a9fc0d Merge branch 'release/0.0.5' 2010-09-08 18:06:40 -04:00
Kenneth Reitz 310400af5b Merge branch 'feature/paged' into develop 2010-09-08 18:06:25 -04:00
Kenneth Reitz d52537b75b Added workbook feature for xls support.
Other formats expected.
2010-09-08 18:05:32 -04:00
Kenneth Reitz 40490d1ba5 Added base DataBook object. 2010-09-08 17:35:13 -04:00
Kenneth Reitz 9f025dc111 Added name to corelib. 2010-09-08 15:55:08 -04:00
Kenneth Reitz 335e3b1134 Updated setup file. 2010-09-08 15:55:01 -04:00
Kenneth Reitz 6b4afc38d1 Roadmap Update 2010-09-02 00:21:03 -04:00
Kenneth Reitz bd3099897c Merge branch 'hotfix/readme' into develop 2010-09-02 00:04:13 -04:00
Kenneth Reitz 4b6fbe9225 Merge branch 'hotfix/readme' 2010-09-02 00:04:07 -04:00
Kenneth Reitz f22357f1bc Readme fix 2010-09-02 00:03:48 -04:00
Kenneth Reitz 37ffbc71c0 index and append methods 2010-08-30 05:31:45 -04:00
Kenneth Reitz 6f7c64eb03 Better xlwt handling. 2010-08-30 05:18:22 -04:00
Kenneth Reitz 89be8f402f Merge branch 'release/0.0.4' into develop 2010-08-30 03:58:44 -04:00
182 changed files with 8030 additions and 25254 deletions
+10
@@ -0,0 +1,10 @@
# .coveragerc to control coverage.py
[report]
# Regexes for lines to exclude from consideration
exclude_lines =
# Have to re-enable the standard pragma:
pragma: no cover
# Don't complain if non-runnable code isn't run:
if __name__ == .__main__.:
+10
@@ -0,0 +1,10 @@
[![Jazzband](https://jazzband.co/static/img/jazzband.svg)](https://jazzband.co/)
This is a [Jazzband](https://jazzband.co/) project. By contributing you agree to abide
by the [Contributor Code of Conduct](https://jazzband.co/about/conduct) and follow the
[guidelines](https://jazzband.co/about/guidelines).
If you'd like to contribute, simply fork
[the repository](https://github.com/jazzband/tablib), commit your changes to a feature
branch, and send a pull request to `master`. Make sure you add yourself to
[AUTHORS](https://github.com/jazzband/tablib/blob/master/AUTHORS).
+46
@@ -0,0 +1,46 @@
name: Docs and lint
on: [push, pull_request]
env:
FORCE_COLOR: 1
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
env:
- TOXENV: docs
- TOXENV: lint
steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: 3.9
- name: Get pip cache dir
id: pip-cache
run: |
echo "::set-output name=dir::$(pip cache dir)"
- name: Cache
uses: actions/cache@v2
with:
path: ${{ steps.pip-cache.outputs.dir }}
key:
${{ matrix.os }}-${{ matrix.python-version }}-v1-${{ hashFiles('**/setup.py') }}
restore-keys: |
${{ matrix.os }}-${{ matrix.python-version }}-v1-
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install --upgrade tox
- name: Tox
run: tox
env: ${{ matrix.env }}
+56
@@ -0,0 +1,56 @@
name: Release
on:
push:
branches:
- master
release:
types:
- published
jobs:
build:
if: github.repository == 'jazzband/tablib'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: 3.8
- name: Get pip cache dir
id: pip-cache
run: |
echo "::set-output name=dir::$(pip cache dir)"
- name: Cache
uses: actions/cache@v2
with:
path: ${{ steps.pip-cache.outputs.dir }}
key: release-${{ hashFiles('**/setup.py') }}
restore-keys: |
release-
- name: Install dependencies
run: |
python -m pip install -U pip
python -m pip install -U setuptools twine wheel
- name: Build package
run: |
python setup.py --version
python setup.py sdist --format=gztar bdist_wheel
twine check dist/*
- name: Upload packages to Jazzband
if: github.event.action == 'published'
uses: pypa/gh-action-pypi-publish@master
with:
user: jazzband
password: ${{ secrets.JAZZBAND_RELEASE_KEY }}
repository_url: https://jazzband.co/projects/tablib/upload
+53
@@ -0,0 +1,53 @@
name: Test
on: [push, pull_request]
env:
FORCE_COLOR: 1
jobs:
build:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
os: [ubuntu-latest, macOS-latest, windows-latest]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Get pip cache dir
id: pip-cache
run: |
echo "::set-output name=dir::$(pip cache dir)"
- name: Cache
uses: actions/cache@v2
with:
path: ${{ steps.pip-cache.outputs.dir }}
key:
${{ matrix.os }}-${{ matrix.python-version }}-v1-${{ hashFiles('**/setup.py') }}
restore-keys: |
${{ matrix.os }}-${{ matrix.python-version }}-v1-
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install --upgrade tox
python -m pip install -e .
- name: Tox tests
shell: bash
run: |
tox -e py
- name: Upload coverage
uses: codecov/codecov-action@v1
with:
name: ${{ matrix.os }} Python ${{ matrix.python-version }}
+30 -2
@@ -1,10 +1,11 @@
# application builds
build/*
dist/*
MANIFEST
# python skin
.pyc
.pyo
*.pyc
*.pyo
# osx noise
.DS_Store
@@ -13,3 +14,30 @@ profile
# pycharm noise
.idea
.idea/*
# vi noise
*.swp
docs/_build/*
coverage.xml
nosetests.xml
junit-py25.xml
junit-py26.xml
junit-py27.xml
# tox noise
.tox
# pyenv noise
.python-version
tablib.egg-info/*
# Coverage
.coverage
htmlcov
# setuptools noise
.eggs
*.egg-info
# generated by setuptools-scm
/src/tablib/_version.py
+25
@@ -0,0 +1,25 @@
repos:
- repo: https://github.com/asottile/pyupgrade
rev: v2.7.3
hooks:
- id: pyupgrade
args: ["--py36-plus"]
- repo: https://github.com/PyCQA/isort
rev: 5.6.4
hooks:
- id: isort
additional_dependencies: [toml]
- repo: https://github.com/pre-commit/pygrep-hooks
rev: v1.7.0
hooks:
- id: python-check-blanket-noqa
- id: rst-backticks
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.3.0
hooks:
- id: check-merge-conflict
- id: check-toml
- id: check-yaml
+30 -11
@@ -1,13 +1,32 @@
GistAPI.py is written and maintained by Kenneth Reitz and
various contributors:
Tablib was originally written by Kenneth Reitz and is now maintained
by the Jazzband GitHub team.
Development Lead
````````````````
Here is a list of past and present much-appreciated contributors:
- Kenneth Reitz <me@kennethreitz.com>
Patches and Suggestions
```````````````````````
- A Lucky Someone
Alex Gaynor
Andrii Soldatenko
Benjamin Wohlwend
Bruno Soares
Claude Paroz
Daniel Santos
Erik Youngren
Hugo van Kemenade
Iuri de Silvio
Jakub Janoszek
James Douglass
Joel Friedly
Josh Ourisman
Kenneth Reitz
Luca Beltrame
Luke Lee
Marc Abramowitz
Marco Dallagiacoma
Mark Rogers
Mark Walling
Mathias Loesch
Mike Waldner
Peyman Salehi
Rabin Nankhwa
Tommy Anthony
Tsuyoshi Hombashi
Tushar Makkar
+341
@@ -0,0 +1,341 @@
# History
## Unreleased
### Breaking changes
- Dropped Python 3.5 support
### Improvements
- Added Python 3.9 support
- Added read_only option to xlsx file reader (#482).
### Bugfixes
- Prevented crash in rst export with only-space strings (#469).
## 2.0.0 (2020-05-16)
### Breaking changes
- The `Row.lpush/rpush` logic was reversed. `lpush` was appending while `rpush`
and `append` were prepending. This was fixed (reversed behavior). If you
counted on the broken behavior, please update your code (#453).
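To make the corrected semantics concrete, here is a minimal sketch. `SketchRow` is a hypothetical stand-in written for this note, not Tablib's actual `Row` implementation; it only illustrates that `lpush` adds to the front while `rpush` and `append` add to the end.

```python
class SketchRow:
    """Hypothetical stand-in for a row; illustrates the 2.0.0 semantics only."""

    def __init__(self, values=()):
        self._values = list(values)

    def lpush(self, value):
        # Left push: insert at the beginning of the row.
        self._values.insert(0, value)

    def rpush(self, value):
        # Right push: add to the end of the row.
        self._values.append(value)

    # In this sketch, append is simply an alias for rpush.
    append = rpush

    def as_list(self):
        return list(self._values)


row = SketchRow(["b"])
row.lpush("a")   # goes to the front
row.rpush("c")   # goes to the end
row.append("d")  # also goes to the end
print(row.as_list())
```

Code that relied on the pre-2.0.0 behavior would see the opposite ordering, which is why the change is listed as breaking.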
### Bugfixes
- Fixed minimal openpyxl dependency version to 2.6.0 (#457).
- Dates from xls files are now read as Python datetime objects (#373).
- Allow import of "ragged" xlsx files (#465).
### Improvements
- When importing an xlsx file, Tablib will now read cell values instead of formulas (#462).
## 1.1.0 (2020-02-13)
### Deprecations
- Upcoming breaking change in Tablib 2.0.0: the `Row.lpush/rpush` logic is reversed.
`lpush` is appending while `rpush` and `append` are prepending. The broken behavior
will remain in Tablib 1.x and will be fixed (reversed) in Tablib 2.0.0 (#453). If you
count on the broken behavior, please update your code when you upgrade to Tablib 2.x.
### Improvements
- Tablib is now able to import CSV content where not all rows have the same
length. Missing columns on any line receive the empty string (#226).
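The padding behavior can be illustrated with the standard library alone. This is a sketch of the idea, not Tablib's code: the helper name `read_ragged_csv` is hypothetical, and padding to the widest row is an assumption made for the example.

```python
import csv
import io


def read_ragged_csv(text):
    """Parse CSV text and pad short rows with '' so all rows have equal length."""
    rows = list(csv.reader(io.StringIO(text)))
    # Assumption for this sketch: pad every row to the width of the widest row.
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]


rows = read_ragged_csv("a,b,c\n1,2\n3\n")
print(rows)
```

Each short row gains trailing empty strings, so downstream code can treat the data as rectangular.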
## 1.0.0 (2020-01-13)
### Breaking changes
- Dropped Python 2 support
- Dependencies are now all optional. To install `tablib` as before with all
possible supported formats, run `pip install tablib[all]`
### Improvements
- Formats can now be dynamically registered through the
`tablib.formats.registry.register` API (#256).
- Tablib methods expecting data input (`detect_format`, `import_set`,
`Dataset.load`, `Databook.load`) now accept file-like objects in addition
to raw strings and bytestrings (#440).
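One way an API can accept strings, bytestrings, and file-like objects is to normalize everything to a stream up front. The sketch below shows that pattern; `to_stream` is a hypothetical helper introduced here and is not claimed to match Tablib's internals.

```python
import io


def to_stream(data):
    """Hypothetical helper: return a file-like object for str, bytes, or streams."""
    if isinstance(data, bytes):
        return io.BytesIO(data)
    if isinstance(data, str):
        return io.StringIO(data)
    # Anything else is assumed to already be file-like (it has a .read() method).
    return data


print(to_stream("a,b\n1,2\n").read())
print(to_stream(b"a,b\n1,2\n").read())
print(to_stream(io.StringIO("a,b\n1,2\n")).read())
```

With this shape, parsing code only ever deals with one kind of input, whatever the caller passed in.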
### Bugfixes
- Fixed a crash when exporting an empty string with the ReST format (#368)
- Error cells from imported .xls files now contain the error string (#202)
## 0.14.0 (2019-10-19)
### Deprecations
- The 0.14.x series will be the last to support Python 2
### Breaking changes
- Dropped Python 3.4 support
### Improvements
- Added Python 3.7 and 3.8 support
- The project is now maintained by the Jazzband team, https://jazzband.co
- Improved format autodetection and added autodetection for the odf format.
- Added search to all documentation pages
- Open xlsx workbooks in read-only mode (#316)
- Unpin requirements
- Only install backports.csv on Python 2
### Bugfixes
- Fixed `DataBook().load` parameter ordering (first stream, then format).
- Fixed a regression for xlsx exports where non-string values were forced to
strings (#314)
- Fixed xlsx format detection (which was often detected as `xls` format)
## 0.13.0 (2019-03-08)
- Added reStructuredText output capability (#336)
- Added Jira output capability
- Stopped calling openpyxl deprecated methods (accessing cells, removing sheets)
(openpyxl minimal version is now 2.4.0)
- Fixed a circular dependency issue in JSON output (#332)
- Fixed Unicode error for the CSV export on Python 2 (#215)
- Removed usage of optional `ujson` (#311)
- Dropped Python 3.3 support
## 0.12.1 (2017-09-01)
- Favor `Dataset.export(<format>)` over `Dataset.<format>` syntax in docs
- Make pandas dependency optional
## 0.12.0 (2017-08-27)
- Add initial pandas DataFrame support
- Dropped Python 2.6 support
## 0.11.5 (2017-06-13)
- Use `yaml.safe_load` for importing yaml.
## 0.11.4 (2017-01-23)
- Use built-in `json` package if available
- Support Python 3.5+ in classifiers
### Bugfixes
- Fixed textual representation for Dataset with no headers
- Handle decimal types
## 0.11.3 (2016-02-16)
- Release fix.
## 0.11.2 (2016-02-16)
### Bugfixes
- Fix export only formats.
- Fix for xlsx output.
## 0.11.1 (2016-02-07)
### Bugfixes
- Fixed packaging error on Python 3.
## 0.11.0 (2016-02-07)
### New Formats!
- Added LaTeX table export format (`Dataset.latex`).
- Support for dBase (DBF) files (`Dataset.dbf`).
### Improvements
- New import/export interface (`Dataset.export()`, `Dataset.load()`).
- CSV custom delimiter support (`Dataset.export('csv', delimiter='$')`).
- Added the ability to remove duplicate rows from a dataset (`Dataset.remove_duplicates()`).
- Added a mechanism to avoid `datetime.datetime` issues when serializing data.
- New `detect_format()` function (mostly for internal use).
- Update the vendored unicodecsv to fix `None` handling.
- Only freeze the headers row, not the headers columns (xls).
### Breaking Changes
- `detect()` function removed.
### Bugfixes
- Fix XLSX import.
- Bugfix for `Dataset.transpose().transpose()`.
## 0.10.0 (2014-05-27)
* Unicode Column Headers
* ALL the bugfixes!
## 0.9.11 (2011-06-30)
* Bugfixes
## 0.9.10 (2011-06-22)
* Bugfixes
## 0.9.9 (2011-06-21)
* Dataset API Changes
* `stack_rows` => `stack`, `stack_columns` => `stack_cols`
* column operations have their own methods now (`append_col`, `insert_col`)
* List-style `pop()`
* Redis-style `rpush`, `lpush`, `rpop`, `lpop`, `rpush_col`, and `lpush_col`
## 0.9.8 (2011-05-22)
* OpenDocument Spreadsheet support (.ods)
* Full Unicode TSV support
## 0.9.7 (2011-05-12)
* Full XLSX Support!
* Pickling Bugfix
* Compat Module
## 0.9.6 (2011-05-12)
* `seperators` renamed to `separators`
* Full unicode CSV support
## 0.9.5 (2011-03-24)
* Python 3.1, Python 3.2 Support (same code base!)
* Formatter callback support
* Various bug fixes
## 0.9.4 (2011-02-18)
* Python 2.5 Support!
* Tox Testing for 2.5, 2.6, 2.7
* AnyJSON Integrated
* OrderedDict support
* Caved to community pressure (spaces)
## 0.9.3 (2011-01-31)
* Databook duplication leak fix.
* HTML Table output.
* Added column sorting.
## 0.9.2 (2010-11-17)
* Transpose method added to Datasets.
* New frozen top row in Excel output.
* Pickling support for Datasets and Rows.
* Support for row/column stacking.
## 0.9.1 (2010-11-04)
* Minor reference shadowing bugfix.
## 0.9.0 (2010-11-04)
* Massive documentation update!
* Tablib.org!
* Row tagging and Dataset filtering!
* Column insert/delete support
* Column append API change (header required)
* Internal Changes (Row object and use thereof)
## 0.8.5 (2010-10-06)
* New import system. All dependencies attempt to load from site-packages,
then fall back on vendorized modules.
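The lookup order described for 0.8.5 follows a common Python pattern: try the externally installed package first and fall back to another copy on `ImportError`. The sketch below uses the standard library `json` module as the fallback; that choice is an assumption made to keep the example self-contained, whereas 0.8.5 fell back on packages bundled with Tablib.

```python
try:
    # Prefer a copy installed in site-packages, if one exists.
    import simplejson as json
except ImportError:
    # Otherwise fall back to another copy. Here that is the standard
    # library; Tablib 0.8.5 used a package bundled with the project.
    import json

# Either way, the rest of the code uses the same name.
encoded = json.dumps({"ok": True})
print(encoded)
```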
## 0.8.4 (2010-10-04)
* Updated XLS output: Only wrap if '\\n' in cell.
## 0.8.3 (2010-10-04)
* Ability to append new column passing a callable
as the value that will be applied to every row.
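The idea of a computed column can be sketched in plain Python. `append_column` is a hypothetical helper written for this note, not Tablib's API; it assumes rows are tuples and that a non-callable value is repeated for every row.

```python
def append_column(rows, value):
    """Hypothetical helper: append a column; a callable is evaluated per row."""
    result = []
    for row in rows:
        # A callable receives the row and returns that row's new cell.
        cell = value(row) if callable(value) else value
        result.append(tuple(row) + (cell,))
    return result


rows = [("Kenneth", "Reitz"), ("Luke", "Lee")]
with_initials = append_column(rows, lambda row: row[0][0] + row[1][0])
print(with_initials)
```

Passing a callable lets the new column be derived from the existing cells instead of being supplied as a precomputed list.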
## 0.8.2 (2010-10-04)
* Added alignment wrapping to written cells.
* Added separator support to XLS.
## 0.8.1 (2010-09-28)
* Packaging Fix
## 0.8.0 (2010-09-25)
* New format plugin system!
* Imports! ELEGANT Imports!
* Tests. Lots of tests.
## 0.7.1 (2010-09-20)
* Reverting methods back to properties.
* Windows bug compensated in documentation.
## 0.7.0 (2010-09-20)
* Renamed DataBook to Databook for consistency.
* Export properties changed to methods (XLS filename / StringIO bug).
* Optional Dataset.xls(path='filename') support (for writing on windows).
* Added utf-8 on the worksheet level.
## 0.6.4 (2010-09-19)
* Updated unicode export for XLS.
* More exhaustive unit tests.
## 0.6.3 (2010-09-14)
* Added Dataset.append() support for columns.
## 0.6.2 (2010-09-13)
* Fixed Dataset.append() error on empty dataset.
* Updated Dataset.headers property w/ validation.
* Added Testing Fixtures.
## 0.6.1 (2010-09-12)
* Packaging hotfixes.
## 0.6.0 (2010-09-11)
* Public Release.
* Export Support for XLS, JSON, YAML, and CSV.
* DataBook Export for XLS, JSON, and YAML.
* Python Dict Property Support.
-7
@@ -1,7 +0,0 @@
History
=======
0.1.0 (2010-09-??)
------------------
* Initial Release
+3 -2
@@ -1,4 +1,5 @@
Copyright (c) 2010 Kenneth Reitz.
Copyright 2016 Kenneth Reitz
Copyright 2019 Jazzband
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -16,4 +17,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
THE SOFTWARE.
-1
@@ -1 +0,0 @@
include HISTORY.rst README.rst tabbed
+44
@@ -0,0 +1,44 @@
# Tablib: format-agnostic tabular dataset library
[![Jazzband](https://jazzband.co/static/img/badge.svg)](https://jazzband.co/)
[![PyPI version](https://img.shields.io/pypi/v/tablib.svg)](https://pypi.org/project/tablib/)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/tablib.svg)](https://pypi.org/project/tablib/)
[![PyPI downloads](https://img.shields.io/pypi/dm/tablib.svg)](https://pypistats.org/packages/tablib)
[![GitHub Actions status](https://github.com/jazzband/tablib/workflows/Test/badge.svg)](https://github.com/jazzband/tablib/actions)
[![codecov](https://codecov.io/gh/jazzband/tablib/branch/master/graph/badge.svg)](https://codecov.io/gh/jazzband/tablib)
[![GitHub](https://img.shields.io/github/license/jazzband/tablib.svg)](LICENSE)
_____ ______ ___________ ______
__ /_______ ____ /_ ___ /___(_)___ /_
_ __/_ __ `/__ __ \__ / __ / __ __ \
/ /_ / /_/ / _ /_/ /_ / _ / _ /_/ /
\__/ \__,_/ /_.___/ /_/ /_/ /_.___/
Tablib is a format-agnostic tabular dataset library, written in Python.
Output formats supported:
- Excel (Sets + Books)
- JSON (Sets + Books)
- YAML (Sets + Books)
- Pandas DataFrames (Sets)
- HTML (Sets)
- Jira (Sets)
- TSV (Sets)
- ODS (Sets)
- CSV (Sets)
- DBF (Sets)
Note that tablib *purposefully* excludes XML support. It always will. (Note: This is a
joke. Pull requests are welcome.)
Tablib documentation is graciously hosted on https://tablib.readthedocs.io
It is also available in the ``docs`` directory of the source distribution.
Make sure to check out [Tablib on PyPI](https://pypi.org/project/tablib/)!
## Contribute
Please see the [contributing guide](https://github.com/jazzband/tablib/blob/master/.github/CONTRIBUTING.md).
-98
@@ -1,98 +0,0 @@
Tabbed: format-agnostic tabular dataset library
===============================================
::
_____ ______ ______ _________
__ /_______ ____ /_ ___ /_ _____ ______ /
_ __/_ __ `/__ __ \__ __ \_ _ \_ __ /
/ /_ / /_/ / _ /_/ /_ /_/ // __// /_/ /
\__/ \__,_/ /_.___/ /_.___/ \___/ \__,_/
.. *Tabbed is under active documentation-driven development.*
Tabbed is a format-agnostic tabular dataset library, written in Python.
It is a full python module which doubles as a CLI application for quick
dataset conversions.
Formats supported:
- JSON
- YAML
- Excel
- CSV
.. - HTML
At this time, Tabbed supports the **export** of its powerful Dataset object instances into any of the above formats. Import is underway.
Please note that tabbed *purposefully* excludes XML support. It always will.
Features
--------
.. Convert datafile formats via API: ::
..
.. tablib.source(filename='data.csv').export('data.json')
.. Convert datafile formats via CLI: ::
..
.. $ tabbed data.csv data.json
.. Convert data formats via CLI pipe interface: ::
..
.. $ curl http://domain.dev/dataset.json | tabbed --to excel | gist -p
Populate fresh data files: ::
headers = ('first_name', 'last_name', 'gpa')
data = [
('John', 'Adams', 4.0),
('George', 'Washington', 2.6),
('Henry', 'Ford', 2.3)
]
data = tablib.Dataset(*data, headers=headers)
# Establish file location and save
data.save('test.xls')
Intelligently add new rows: ::
data.append('Bob', 'Dylan', 3.2)
print data.headers
# >>> ('first_name', 'last_name', 'gpa')
Slice rows: ::
print data[0:1]
# >>> [('John', 'Adams', 4.0), ('George', 'Washington', 2.6)]
.. Slice columns by header: ::
..
.. print data['first_name']
.. # >>> ['John', 'George', 'Henry']
..
Manipulate rows by index: ::
del data[0]
print data[0:1]
# >>> [('George', 'Washington', 2.6), ('Henry', 'Ford', 2.3)]
.. # Update saved file
.. data.save()
.. Export to various formats: ::
..
.. # Save copy as CSV
.. data.export('backup.csv')
+29
@@ -0,0 +1,29 @@
# Release checklist
Jazzband guidelines: https://jazzband.co/about/releases
* [ ] Get master to the appropriate code release state.
[GitHub Actions](https://github.com/jazzband/tablib/actions)
should pass on master.
[![GitHub Actions status](https://github.com/jazzband/tablib/workflows/Test/badge.svg)](https://github.com/jazzband/tablib/actions)
* [ ] Check [HISTORY.md](https://github.com/jazzband/tablib/blob/master/HISTORY.md),
update version number and release date
* [ ] Tag with version number and push tag, for example:
```bash
git tag -a v0.14.0 -m v0.14.0
git push --tags
```
* [ ] Once GitHub Actions has built and uploaded distributions, check files at
[Jazzband](https://jazzband.co/projects/tablib) and release to
[PyPI](https://pypi.org/pypi/tablib)
* [ ] Check installation:
```bash
pip uninstall -y tablib && pip install -U tablib
```
* [ ] Create new GitHub release: https://github.com/jazzband/tablib/releases/new
* Tag: Pick existing tag "v0.14.0"
+130
@@ -0,0 +1,130 @@
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = _build
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
clean:
-rm -rf $(BUILDDIR)/*
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in $(BUILDDIR)/htmlhelp."
qthelp:
$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/Tablib.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/Tablib.qhc"
devhelp:
$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/Tablib"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/Tablib"
@echo "# devhelp"
epub:
$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
@echo
@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
latex:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo
@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
@echo "Run \`make' in that directory to run these through (pdf)latex" \
"(use \`make latexpdf' here to do that automatically)."
latexpdf:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through pdflatex..."
make -C $(BUILDDIR)/latex all-pdf
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
text:
$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
@echo
@echo "Build finished. The text files are in $(BUILDDIR)/text."
man:
$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
@echo
@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
changes:
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
@echo
@echo "The overview file is in $(BUILDDIR)/changes."
linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
doctest:
$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
@echo "Testing of doctests in the sources finished, look at the " \
"results in $(BUILDDIR)/doctest/output.txt."
+12
@@ -0,0 +1,12 @@
<h3><a href="https://tablib.readthedocs.io">About Tablib</a></h3>
<p>
Tablib is an MIT Licensed format-agnostic tabular dataset library, written in Python. It allows you to import, export, and manipulate tabular data sets. Advanced features include segregation, dynamic columns, tags & filtering, and seamless format import & export.
</p>
<h3>Useful Links</h3>
<ul>
<li><a href="https://tablib.readthedocs.io">The Tablib Website</a></li>
<li><a href="https://pypi.org/project/tablib">Tablib @ PyPI</a></li>
<li><a href="https://github.com/jazzband/tablib">Tablib @ GitHub</a></li>
<li><a href="https://github.com/jazzband/tablib/issues">Issue Tracker</a></li>
<li><a href="https://github.com/jazzband/tablib/blob/master/HISTORY.md">Changelog</a></li>
</ul>
+4
@@ -0,0 +1,4 @@
<h3><a href="https://tablib.readthedocs.io">About Tablib</a></h3>
<p>
Tablib is an MIT Licensed format-agnostic tabular dataset library, written in Python. It allows you to import, export, and manipulate tabular data sets. Advanced features include segregation, dynamic columns, tags & filtering, and seamless format import & export.
</p>
+64
@@ -0,0 +1,64 @@
.. _api:
===
API
===
.. module:: tablib
This part of the documentation covers all the interfaces of Tablib. For
parts where Tablib depends on external libraries, we document the most
important right here and provide links to the canonical documentation.
--------------
Dataset Object
--------------
.. autoclass:: Dataset
:inherited-members:
---------------
Databook Object
---------------
.. autoclass:: Databook
:inherited-members:
---------
Functions
---------
.. autofunction:: detect_format
.. autofunction:: import_set
----------
Exceptions
----------
.. class:: InvalidDatasetType
You're trying to add something that doesn't quite look right.
.. class:: InvalidDimensions
You're trying to add something that doesn't quite fit right.
.. class:: UnsupportedFormat
You're trying to add something that doesn't quite taste right.
Now, go start some :ref:`Tablib Development <development>`.
+230
@@ -0,0 +1,230 @@
#
# Tablib documentation build configuration file, created by
# sphinx-quickstart on Tue Oct 5 15:25:21 2010.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import tablib
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# sys.path.insert(0, os.path.abspath('..'))
# -- General configuration -----------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = [
'sphinx.ext.autodoc', 'sphinx.ext.todo', 'sphinx.ext.coverage',
'sphinx.ext.viewcode', 'sphinx.ext.intersphinx'
]
intersphinx_mapping = {'python': ('https://docs.python.org/3', None)}
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = 'Tablib'
copyright = '2019 Jazzband'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The full version, including alpha/beta/rc tags.
release = tablib.__version__
# The short X.Y version.
version = '.'.join(tablib.__version__.split('.')[:2])
# for example take major/minor
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
# add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
# pygments_style = ''
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# -- Options for HTML output ---------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'alabaster'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ['static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
html_sidebars = {
'index': ['sidebarintro.html', 'sourcelink.html', 'searchbox.html'],
'**': ['sidebarlogo.html', 'localtoc.html', 'relations.html',
'sourcelink.html', 'searchbox.html']
}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
html_show_sphinx = False
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'Tablibdoc'
# -- Options for LaTeX output --------------------------------------------------
# The paper size ('letter' or 'a4').
#latex_paper_size = 'letter'
# The font size ('10pt', '11pt' or '12pt').
#latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
('index', 'Tablib.tex', 'Tablib Documentation',
'Jazzband', 'manual'),
]
latex_use_modindex = False
latex_elements = {
'papersize': 'a4paper',
'pointsize': '12pt',
}
latex_use_parts = True
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Additional stuff for the LaTeX preamble.
#latex_preamble = ''
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# -- Options for manual page output --------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'tablib', 'Tablib Documentation',
['Jazzband'], 1)
]
+214
@@ -0,0 +1,214 @@
.. _development:
Development
===========
Tablib is under active development, and contributors are welcome.
If you have a feature request, suggestion, or bug report, please open a new
issue on GitHub_. To submit patches, please send a pull request on GitHub_.
.. _GitHub: https://github.com/jazzband/tablib/
.. _design:
---------------------
Design Considerations
---------------------
Tablib was developed with a few :pep:`20` idioms in mind.
#. Beautiful is better than ugly.
#. Explicit is better than implicit.
#. Simple is better than complex.
#. Complex is better than complicated.
#. Readability counts.
A few other things to keep in mind:
#. Keep your code DRY.
#. Strive to be as simple (to use) as possible.
.. _scm:
--------------
Source Control
--------------
Tablib source is controlled with Git_, the lean, mean, distributed source
control machine.
The repository is publicly accessible.
.. code-block:: console
git clone https://github.com/jazzband/tablib.git
The project is hosted on **GitHub**.
GitHub:
https://github.com/jazzband/tablib
Git Branch Structure
++++++++++++++++++++
Feature / Hotfix / Release branches follow a `Successful Git Branching Model`_ .
Git-flow_ is a great tool for managing the repository. I highly recommend it.
``master``
Current production release (|version|) on PyPI.
Each release is tagged.
When submitting patches, please place your feature/change in its own branch prior to opening a pull request on GitHub_.
.. _Git: https://git-scm.org
.. _`Successful Git Branching Model`: https://nvie.com/posts/a-successful-git-branching-model/
.. _git-flow: https://github.com/nvie/gitflow
.. _newformats:
------------------
Adding New Formats
------------------
Tablib welcomes new format additions! Format suggestions include:
* MySQL Dump
Coding by Convention
++++++++++++++++++++
Tablib features a micro-framework for adding format support.
The easiest way to understand it is to use it.
So, let's define our own format, named *xxx*.
From version 1.0, Tablib formats are class-based and can be dynamically
registered.
1. Write your custom format class::
    class MyXXXFormatClass:
        title = 'xxx'
        @classmethod
        def export_set(cls, dset):
            ...
            # returns string representation of given dataset
        @classmethod
        def export_book(cls, dbook):
            ...
            # returns string representation of given databook
        @classmethod
        def import_set(cls, dset, in_stream):
            ...
            # populates given Dataset with given datastream
        @classmethod
        def import_book(cls, dbook, in_stream):
            ...
            # returns Databook instance
        @classmethod
        def detect(cls, stream):
            ...
            # returns True if given stream is parsable as xxx
.. admonition:: Excluding Support
If the format excludes support for an import/export mechanism (*e.g.*
:class:`csv <tablib.Dataset.csv>` excludes
:class:`Databook <tablib.Databook>` support),
simply don't define the respective class methods.
Appropriate errors will be raised.
2. Register your class::
from tablib.formats import registry
registry.register('xxx', MyXXXFormatClass())
3. From then on, you should be able to use your new custom format as if it were
a built-in Tablib format, e.g. using ``dataset.export('xxx')`` will use the
``MyXXXFormatClass.export_set`` method.
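The registration steps above can be sketched without Tablib at all. The following is a minimal, self-contained illustration of a class-based format registry. ``PipeFormat``, ``FormatRegistry``, and the ``export`` helper are hypothetical names invented for this sketch; they are not Tablib's actual implementation, and plain tuples stand in for a dataset.

```python
class PipeFormat:
    """Hypothetical format: cells joined with '|' characters."""
    title = 'pipe'

    @classmethod
    def export_set(cls, rows):
        # Return a string representation of the given rows.
        return '\n'.join('|'.join(str(cell) for cell in row) for row in rows)

    @classmethod
    def detect(cls, stream):
        # Return True if the given text looks parsable as this format.
        return '|' in stream


class FormatRegistry:
    """Hypothetical registry mapping a format key to a format object."""

    def __init__(self):
        self._formats = {}

    def register(self, key, fmt):
        self._formats[key] = fmt

    def get(self, key):
        try:
            return self._formats[key]
        except KeyError:
            raise ValueError(f'unsupported format: {key!r}') from None


def export(rows, key, registry):
    fmt = registry.get(key)
    # A format that does not define export_set cannot export.
    if not hasattr(fmt, 'export_set'):
        raise ValueError(f'format {key!r} cannot export')
    return fmt.export_set(rows)


registry = FormatRegistry()
registry.register('pipe', PipeFormat())
result = export([('John', 'Adams'), ('George', 'Washington')], 'pipe', registry)
print(result)
```

Looking formats up by key at call time is what lets a newly registered format behave like a built-in one, and checking for the method at export time mirrors the "simply don't define it" convention described above.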
.. _testing:
--------------
Testing Tablib
--------------
Testing is crucial to Tablib's stability.
This stable project is used in production by many companies and developers,
so it is important to be certain that every version released is fully operational.
When developing a new feature for Tablib, be sure to write proper tests for it as well.
The easiest way to check your changes for potential issues is to run the test suite directly.
.. code-block:: console
$ tox
----------------------
Continuous Integration
----------------------
Every pull request is automatically tested and inspected upon receipt with `GitHub Actions`_.
If you broke the build, you will receive an email accordingly.
Anyone may view the build status and history at any time.
https://github.com/jazzband/tablib/actions
Additional reports will also be included here in the future, including :pep:`8` checks and stress reports for extremely large datasets.
.. _`GitHub Actions`: https://github.com/jazzband/tablib/actions
.. _docs:
-----------------
Building the Docs
-----------------
Documentation is written in `reStructured Text`_,
the flexible, standard Python documentation format.
Documentation builds are powered by Sphinx_.
The :ref:`API Documentation <api>` is mostly documented inline throughout the module.
The Docs live in ``tablib/docs``.
In order to build them, you will first need to install Sphinx.
.. code-block:: console
$ pip install sphinx
Then, to build an HTML version of the docs, simply run the following from the ``docs`` directory:
.. code-block:: console
$ make html
Your ``docs/_build/html`` directory will then contain an HTML representation of the documentation,
ready for publication on most web servers.
You can also generate the documentation in **epub**, **latex**, **json**, *&c* similarly.
.. _`reStructured Text`: http://docutils.sourceforge.net/rst.html
.. _Sphinx: http://sphinx.pocoo.org
.. _`GitHub Pages`: https://pages.github.com
----------
Make sure to check out the :ref:`API Documentation <api>`.
+247
@@ -0,0 +1,247 @@
.. _formats:
=======
Formats
=======
Tablib supports a wide variety of different tabular formats, both for input and
output. Moreover, you can :ref:`register your own formats <newformats>`.
cli
===
The ``cli`` format is currently export-only. Exports produce a table
representation suited to a terminal.
When exporting to a CLI, you can pass the table format with the ``tablefmt``
parameter. The supported formats are::
>>> import tabulate
>>> list(tabulate._table_formats)
['simple', 'plain', 'grid', 'fancy_grid', 'github', 'pipe', 'orgtbl',
'jira', 'presto', 'psql', 'rst', 'mediawiki', 'moinmoin', 'youtrack',
'html', 'latex', 'latex_raw', 'latex_booktabs', 'tsv', 'textile']
For example::
dataset.export("cli", tablefmt="github")
dataset.export("cli", tablefmt="grid")
This format is optional; install Tablib with ``pip install "tablib[cli]"`` to
make the format available.
csv
===
When you import CSV data, you can specify if the first line of your data source
is headers with the ``headers`` boolean parameter (defaults to ``True``)::
import tablib
tablib.import_set(your_data_stream, format='csv', headers=False)
When exporting with the ``csv`` format, the top row will contain headers, if
they have been set. Otherwise, the top row will contain the first row of the
dataset.
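As a rough illustration of what the ``headers`` flag decides on import, the sketch below uses only the standard library ``csv`` module. ``split_rows`` is a hypothetical helper written for this example and is not part of Tablib.

```python
import csv
import io


def split_rows(stream, headers=True):
    """Return (headers, rows); treat the first row as headers if requested."""
    rows = list(csv.reader(stream))
    if headers and rows:
        return rows[0], rows[1:]
    return None, rows


text = 'first_name,last_name\nJohn,Adams\n'
with_headers = split_rows(io.StringIO(text))
without_headers = split_rows(io.StringIO(text), headers=False)
print(with_headers)
print(without_headers)
```

With ``headers=False`` the first line stays in the data, which is the behavior to choose when the source has no header row.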
When importing a CSV data source or exporting a dataset as CSV, you can pass any
parameter supported by the :py:func:`csv.reader` and :py:func:`csv.writer`
functions. For example::
tablib.import_set(your_data_stream, format='csv', dialect='unix')
dataset.export('csv', delimiter=' ', quotechar='|')
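The keyword arguments in the last example are the standard library's own CSV formatting parameters. A minimal sketch of what ``delimiter`` and ``quotechar`` do, calling ``csv.writer`` directly rather than going through Tablib (the sample values are illustrative):

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter=' ', quotechar='|')
# The second field contains the delimiter, so the writer quotes it with '|'.
writer.writerow(['Spam', 'Lovely Spam'])
line = buffer.getvalue()
print(repr(line))
```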
.. admonition:: Line endings
Exporting uses \\r\\n line endings by default, so make sure to open the
output file with ``newline=''``; otherwise you will get a blank line between
each row when you open the file in Excel::
with open('output.csv', 'w', newline='') as f:
f.write(dataset.export('csv'))
If you do not do this, and you export the file on Windows, your
CSV file will open in Excel with a blank line between each row.
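The effect of ``newline=''`` can be checked with the standard library alone. ``csv.writer`` already ends each row with ``\r\n``; a text-mode file opened without ``newline=''`` may translate the ``\n`` a second time on Windows. The sketch below writes with ``newline=''`` and inspects the raw bytes. The temporary path is illustrative, and ``csv.writer`` stands in for a Tablib CSV export.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.csv')

# newline='' disables newline translation, so the writer's '\r\n' is kept as-is.
with open(path, 'w', newline='') as f:
    csv.writer(f).writerows([['a', 'b'], ['1', '2']])

with open(path, 'rb') as f:
    raw = f.read()
print(raw)
```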
dbf
===
Import/export using the dBASE_ format.
.. admonition:: Binary Warning
The ``dbf`` format contains binary data, so make sure to write in binary
mode::
with open('output.dbf', 'wb') as f:
    f.write(dataset.export('dbf'))
.. _dBASE: https://en.wikipedia.org/wiki/DBase
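Binary exports are expected to be ``bytes`` objects, which is why a text-mode file handle fails. A small standard-library sketch of the difference; ``payload`` is a placeholder standing in for an exported value and is not real dBASE data:

```python
import os
import tempfile

payload = b'\x03\x00binary-data'  # placeholder bytes, not a real DBF file
path = os.path.join(tempfile.mkdtemp(), 'output.dbf')

# Text mode rejects bytes outright.
try:
    with open(path, 'w') as f:
        f.write(payload)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

# Binary mode writes the bytes unchanged.
with open(path, 'wb') as f:
    f.write(payload)
with open(path, 'rb') as f:
    round_trip = f.read()
print(type(text_mode_error).__name__, round_trip == payload)
```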
df (DataFrame)
==============
Import/export using the pandas_ DataFrame format. This format is optional;
install Tablib with ``pip install "tablib[pandas]"`` to make the format available.
.. _pandas: https://pandas.pydata.org/
html
====
The ``html`` format is currently export-only. The exports produce an HTML page
with the data in a ``<table>``. If headers have been set, they will be used as
table headers.
This format is optional; install Tablib with ``pip install "tablib[html]"`` to
make the format available.
jira
====
The ``jira`` format is currently export-only. Exports format the dataset
according to the Jira table syntax::
||heading 1||heading 2||heading 3||
|col A1|col A2|col A3|
|col B1|col B2|col B3|
json
====
Import/export using the JSON_ format. If headers have been set, a JSON list of
objects will be returned. If no headers have been set, a JSON list of lists
(rows) will be returned instead.
Import assumes (for now) that headers exist.
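The two export shapes described above can be pictured with the standard library ``json`` module. This is only a sketch of the resulting structure, with illustrative sample data; it does not reproduce Tablib's own serialization code.

```python
import json

headers = ['first_name', 'last_name']
rows = [('John', 'Adams'), ('George', 'Washington')]

# With headers: one object per row, keyed by header.
as_objects = json.dumps([dict(zip(headers, row)) for row in rows])
# Without headers: a plain list of rows.
as_lists = json.dumps([list(row) for row in rows])
print(as_objects)
print(as_lists)
```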
.. _JSON: http://json.org/
latex
=====
Export data using the LaTeX_ format. This format is export-only.
If a title has been set, it will be exported as the table caption.
.. _LaTeX: https://www.latex-project.org/
ods
===
Export data in OpenDocument Spreadsheet format. The ``ods`` format is currently
export-only.
This format is optional; install Tablib with ``pip install "tablib[ods]"`` to
make the format available.
.. admonition:: Binary Warning
:class:`Dataset.ods` contains binary data, so make sure to write in binary mode::
with open('output.ods', 'wb') as f:
f.write(data.ods)
rst
===
Export data as a reStructuredText_ table representation of a dataset. The
``rst`` format is export-only.
Exporting returns a simple table if the text in the first column is never
wrapped, otherwise returns a grid table::
>>> from tablib import Dataset
>>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
>>> data = Dataset()
>>> data.headers = ['A', 'B', 'A and B']
>>> for a, b in bits:
... data.append([bool(a), bool(b), bool(a * b)])
>>> table = data.export('rst')
>>> table.split('\\n') == [
... '===== ===== =====',
... ' A B A and',
... ' B ',
... '===== ===== =====',
... 'False False False',
... 'True False False',
... 'False True False',
... 'True True True ',
... '===== ===== =====',
... ]
True
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
tsv
===
A variant of the csv_ format with tabs as field separators.
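In standard-library terms, this amounts to a CSV writer whose delimiter is a tab character. A minimal sketch using ``csv.writer`` directly (sample values are illustrative, and this is not Tablib's own code):

```python
import csv
import io

buffer = io.StringIO(newline='')
csv.writer(buffer, delimiter='\t').writerow(['John', 'Adams', '4.0'])
tsv_line = buffer.getvalue()
print(repr(tsv_line))
```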
xls
===
Import/export data in Legacy Excel Spreadsheet representation.
This format is optional; install Tablib with ``pip install "tablib[xls]"`` to
make the format available.
.. note::
XLS files are limited to a maximum of 65,536 rows. Use xlsx_ to avoid this
limitation.
.. admonition:: Binary Warning
The ``xls`` file format is binary, so make sure to write in binary mode::
with open('output.xls', 'wb') as f:
f.write(data.export('xls'))
xlsx
====
Import/export data in Excel 07+ Spreadsheet representation.
This format is optional; install Tablib with ``pip install "tablib[xlsx]"`` to
make the format available.
The ``import_set()`` and ``import_book()`` methods accept the keyword
argument ``read_only``. If its value is ``True`` (the default), the
XLSX data source is read lazily. Lazy reading generally reduces time
and memory consumption, especially for large spreadsheets. However,
it relies on the XLSX data source declaring correct dimensions. Some
programs generate XLSX files with incorrect dimensions. Such files
may need to be loaded with this optimization turned off by passing
``read_only=False``.
.. note::
When reading an ``xlsx`` file containing formulas in its cells, Tablib will
read the cell values, not the cell formulas.
.. versionchanged:: 2.0.0
Reads cell values instead of formulas.
.. admonition:: Binary Warning
The ``xlsx`` file format is binary, so make sure to write in binary mode::
with open('output.xlsx', 'wb') as f:
f.write(data.export('xlsx'))
yaml
====
Import/export data in the YAML_ format.
When exporting, if headers have been set, a YAML list of objects will be
returned. If no headers have been set, a YAML list of lists (rows) will be
returned instead.
Import assumes (for now) that headers exist.
This format is optional; install Tablib with ``pip install "tablib[yaml]"`` to
make the format available.
.. _YAML: https://yaml.org
+121
@@ -0,0 +1,121 @@
.. Tablib documentation master file, created by
sphinx-quickstart on Tue Oct 5 15:25:21 2010.
You can adapt this file completely to your liking, but it should at least
contain the root ``toctree`` directive.
Tablib: Pythonic Tabular Datasets
=================================
Release v\ |version|. (:ref:`Installation <install>`)
.. Contents:
..
.. .. toctree::
.. :maxdepth: 2
..
.. Indices and tables
.. ==================
..
.. * :ref:`genindex`
.. * :ref:`modindex`
.. * :ref:`search`
Tablib is an `MIT Licensed <https://mit-license.org/>`_ format-agnostic tabular dataset library, written in Python.
It allows you to import, export, and manipulate tabular data sets.
Advanced features include segregation, dynamic columns, tags & filtering,
and seamless format import & export.
::
>>> data = tablib.Dataset(headers=['First Name', 'Last Name', 'Age'])
>>> for i in [('Kenneth', 'Reitz', 22), ('Bessie', 'Monke', 21)]:
... data.append(i)
>>> print(data.export('json'))
[{"Last Name": "Reitz", "First Name": "Kenneth", "Age": 22}, {"Last Name": "Monke", "First Name": "Bessie", "Age": 21}]
>>> print(data.export('yaml'))
- {Age: 22, First Name: Kenneth, Last Name: Reitz}
- {Age: 21, First Name: Bessie, Last Name: Monke}
>>> data.export('xlsx')
<redacted binary data>
>>> data.export('df')
First Name Last Name Age
0 Kenneth Reitz 22
1 Bessie Monke 21
Testimonials
------------
`National Geographic <https://www.nationalgeographic.com/>`_,
`Digg, Inc <https://digg.com/>`_,
`Northrop Grumman <https://www.northropgrumman.com/>`_,
`Discovery Channel <https://dsc.discovery.com/>`_,
and `The Sunlight Foundation <https://sunlightfoundation.com/>`_ use Tablib internally.
**Greg Thorton**
Tablib by @kennethreitz saved my life.
I had to consolidate like 5 huge poorly maintained lists of domains and data.
It was a breeze!
**Dave Coutts**
It's turning into one of my most used modules of 2010.
You really hit a sweet spot for managing tabular data with a minimal amount of code and effort.
**Joshua Ourisman**
Tablib has made it so much easier to deal with the inevitable 'I want an Excel file!' requests from clients...
**Brad Montgomery**
I think you nailed the "Python Zen" with tablib.
Thanks again for an awesome lib!
User's Guide
------------
This part of the documentation, which is mostly prose, begins with some background information about Tablib, then focuses on step-by-step instructions for getting the most out of your datasets.
.. toctree::
:maxdepth: 2
intro
.. toctree::
:maxdepth: 2
install
.. toctree::
:maxdepth: 2
tutorial
.. toctree::
:maxdepth: 2
formats
.. toctree::
:maxdepth: 2
development
API Reference
-------------
If you are looking for information on a specific function, class or
method, this part of the documentation is for you.
.. toctree::
:maxdepth: 2
api
+85
@@ -0,0 +1,85 @@
.. _install:
Installation
============
This part of the documentation covers the installation of Tablib. The first step to using any software package is getting it properly installed.
.. _installing:
-----------------
Installing Tablib
-----------------
Distribute & Pip
----------------
Of course, the recommended way to install Tablib is with `pip <https://pip.pypa.io>`_:
.. code-block:: console
$ pip install tablib
You can also choose to install more dependencies to have more import/export
formats available:
.. code-block:: console
$ pip install "tablib[xlsx]"
Or all possible formats:
.. code-block:: console
$ pip install "tablib[all]"
which is equivalent to:
.. code-block:: console
$ pip install "tablib[html, pandas, ods, xls, xlsx, yaml]"
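To see which optional formats are usable in the current environment, one option is to check whether the backing packages can be located. The sketch below is not part of Tablib: ``available_extras`` is a hypothetical helper, and the mapping from extras to import names (``MarkupPy``, ``odf``, ``yaml`` and so on) is an assumption based on the usual import names of those packages.

```python
from importlib.util import find_spec

# Assumed mapping: tablib extra -> top-level modules it relies on.
EXTRA_MODULES = {
    "html": ["MarkupPy"],
    "ods": ["odf"],
    "pandas": ["pandas"],
    "xls": ["xlrd", "xlwt"],
    "xlsx": ["openpyxl"],
    "yaml": ["yaml"],
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Map each extra to True if all of its modules can be located."""
    return {
        extra: all(find_spec(name) is not None for name in modules)
        for extra, modules in extra_modules.items()
    }


# Report the status of every extra without importing the packages.
for extra, ok in sorted(available_extras().items()):
    print(f"{extra:8} {'available' if ok else 'missing'}")
```

An extra reported as missing can then be installed with the matching ``pip install "tablib[...]"`` command shown above.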
-------------------
Download the Source
-------------------
You can also install Tablib from source.
The latest release (|version|) is available from GitHub.
* tarball_
* zipball_
.. _
Once you have a copy of the source,
you can embed it in your Python package,
or install it into your site-packages easily.
.. code-block:: console
$ python setup.py install
To download the full source history from Git, see :ref:`Source Control <scm>`.
.. _tarball: https://github.com/jazzband/tablib/tarball/master
.. _zipball: https://github.com/jazzband/tablib/zipball/master
.. _updates:
Staying Updated
---------------
The latest version of Tablib will always be available here:
* PyPI: https://pypi.org/project/tablib/
* GitHub: https://github.com/jazzband/tablib/
When a new version is available, upgrading is simple::
$ pip install tablib --upgrade
Now, go get a :ref:`Quick Start <quickstart>`.
+63
@@ -0,0 +1,63 @@
.. _intro:
Introduction
============
This part of the documentation covers all the interfaces of Tablib.
Tablib is a format-agnostic tabular dataset library, written in Python.
It allows you to Pythonically import, export, and manipulate tabular data sets.
Advanced features include segregation, dynamic columns, tags/filtering, and
seamless format import/export.
Philosophy
----------
Tablib was developed with a few :pep:`20` idioms in mind.
#. Beautiful is better than ugly.
#. Explicit is better than implicit.
#. Simple is better than complex.
#. Complex is better than complicated.
#. Readability counts.
All contributions to Tablib should keep these important rules in mind.
.. _license:
Tablib License
--------------
Tablib is released under terms of `The MIT License`_.
Copyright 2017 Kenneth Reitz
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
.. _`The MIT License`: https://opensource.org/licenses/mit-license.php
.. _pythonsupport:
Pythons Supported
-----------------
Python 3.6+ is officially supported.
Now, go :ref:`install Tablib <install>`.
+118
@@ -0,0 +1,118 @@
\definecolor{TitleColor}{rgb}{0,0,0}
\definecolor{InnerLinkColor}{rgb}{0,0,0}
\renewcommand{\maketitle}{%
\begin{titlepage}%
\let\footnotesize\small
\let\footnoterule\relax
\ifsphinxpdfoutput
\begingroup
% This \def is required to deal with multi-line authors; it
% changes \\ to ', ' (comma-space), making it pass muster for
% generating document info in the PDF file.
\def\\{, }
\pdfinfo{
/Author (\@author)
/Title (\@title)
}
\endgroup
\fi
\begin{flushright}%
%\sphinxlogo%
{\center
\vspace*{3cm}
\includegraphics{logo.pdf}
\vspace{3cm}
\par
{\rm\Huge \@title \par}%
{\em\LARGE \py@release\releaseinfo \par}
{\large
\@date \par
\py@authoraddress \par
}}%
\end{flushright}%\par
\@thanks
\end{titlepage}%
\cleardoublepage%
\setcounter{footnote}{0}%
\let\thanks\relax\let\maketitle\relax
%\gdef\@thanks{}\gdef\@author{}\gdef\@title{}
}
\fancypagestyle{normal}{
\fancyhf{}
\fancyfoot[LE,RO]{{\thepage}}
\fancyfoot[LO]{{\nouppercase{\rightmark}}}
\fancyfoot[RE]{{\nouppercase{\leftmark}}}
\fancyhead[LE,RO]{{ \@title, \py@release}}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}
}
\fancypagestyle{plain}{
\fancyhf{}
\fancyfoot[LE,RO]{{\thepage}}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0.4pt}
}
\titleformat{\section}{\Large}%
{\py@TitleColor\thesection}{0.5em}{\py@TitleColor}{\py@NormalColor}
\titleformat{\subsection}{\large}%
{\py@TitleColor\thesubsection}{0.5em}{\py@TitleColor}{\py@NormalColor}
\titleformat{\subsubsection}{}%
{\py@TitleColor\thesubsubsection}{0.5em}{\py@TitleColor}{\py@NormalColor}
\titleformat{\paragraph}{\large}%
{\py@TitleColor}{0em}{\py@TitleColor}{\py@NormalColor}
\ChNameVar{\raggedleft\normalsize}
\ChNumVar{\raggedleft \bfseries\Large}
\ChTitleVar{\raggedleft \rm\Huge}
\renewcommand\thepart{\@Roman\c@part}
\renewcommand\part{%
\pagestyle{empty}
\if@noskipsec \leavevmode \fi
\cleardoublepage
\vspace*{6cm}%
\@afterindentfalse
\secdef\@part\@spart}
\def\@part[#1]#2{%
\ifnum \c@secnumdepth >\m@ne
\refstepcounter{part}%
\addcontentsline{toc}{part}{\thepart\hspace{1em}#1}%
\else
\addcontentsline{toc}{part}{#1}%
\fi
{\parindent \z@ %\center
\interlinepenalty \@M
\normalfont
\ifnum \c@secnumdepth >\m@ne
\rm\Large \partname~\thepart
\par\nobreak
\fi
\MakeUppercase{\rm\Huge #2}%
\markboth{}{}\par}%
\nobreak
\vskip 8ex
\@afterheading}
\def\@spart#1{%
{\parindent \z@ %\center
\interlinepenalty \@M
\normalfont
\huge \bfseries #1\par}%
\nobreak
\vskip 3ex
\@afterheading}
% use inconsolata font
\usepackage{inconsolata}
% fix single quotes, for inconsolata. (does not work)
%%\usepackage{textcomp}
%%\begingroup
%% \catcode`'=\active
%% \g@addto@macro\@noligs{\let'\textsinglequote}
%% \endgroup
%%\endinput
+415
@@ -0,0 +1,415 @@
.. _quickstart:
==========
Quickstart
==========
Eager to get started?
This page gives a good introduction to getting started with Tablib.
This assumes you already have Tablib installed.
If you do not, head over to the :ref:`Installation <install>` section.
First, make sure that:
* Tablib is :ref:`installed <install>`
* Tablib is :ref:`up-to-date <updates>`
Let's get started with some simple use cases and examples.
------------------
Creating a Dataset
------------------
A :class:`Dataset <tablib.Dataset>` is nothing more than what its name implies—a set of data.
Creating your own instance of the :class:`tablib.Dataset` object is simple. ::
data = tablib.Dataset()
You can now start filling this :class:`Dataset <tablib.Dataset>` object with data.
.. admonition:: Example Context
From here on out, if you see ``data``, assume that it's a fresh
:class:`Dataset <tablib.Dataset>` object.
-----------
Adding Rows
-----------
Let's say you want to collect a simple list of names. ::
# collection of names
names = ['Kenneth Reitz', 'Bessie Monke']
for name in names:
# split name appropriately
fname, lname = name.split()
# add names to Dataset
data.append([fname, lname])
You can get a nice, Pythonic view of the dataset at any time with :class:`Dataset.dict`::
>>> data.dict
[('Kenneth', 'Reitz'), ('Bessie', 'Monke')]
--------------
Adding Headers
--------------
It's time to enhance our :class:`Dataset` by giving our columns some titles.
To do so, set :class:`Dataset.headers`. ::
data.headers = ['First Name', 'Last Name']
Now our data looks a little different. ::
>>> data.dict
[{'Last Name': 'Reitz', 'First Name': 'Kenneth'},
{'Last Name': 'Monke', 'First Name': 'Bessie'}]
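The switch from tuples to dictionaries comes from pairing each header with the value at the same position in every row. A minimal plain-Python sketch of that pairing follows; ``rows_as_dicts`` is a hypothetical helper used only for illustration, not a Tablib function.

```python
def rows_as_dicts(headers, rows):
    """Pair each header with the value at the same position in every row."""
    if not headers:
        # Without headers there is nothing to key on; return plain tuples.
        return [tuple(row) for row in rows]
    return [dict(zip(headers, row)) for row in rows]


rows = [("Kenneth", "Reitz"), ("Bessie", "Monke")]
print(rows_as_dicts(None, rows))
print(rows_as_dicts(["First Name", "Last Name"], rows))
```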
--------------
Adding Columns
--------------
Now that we have a basic :class:`Dataset` in place, let's add a column of **ages** to it. ::
data.append_col([22, 20], header='Age')
Let's view the data now. ::
>>> data.dict
[{'Last Name': 'Reitz', 'First Name': 'Kenneth', 'Age': 22},
{'Last Name': 'Monke', 'First Name': 'Bessie', 'Age': 20}]
It's that easy.
--------------
Importing Data
--------------
Creating a :class:`tablib.Dataset` object by importing a pre-existing file is simple. ::
with open('data.csv', 'r') as fh:
    imported_data = tablib.Dataset().load(fh)
``load`` detects the format of the data passed in and uses the matching formatter to import it, so a variety of file types can be loaded the same way.
.. admonition:: Source without headers
When the format is :class:`csv <Dataset.csv>`, :class:`tsv <Dataset.tsv>`, :class:`dbf <Dataset.dbf>`, :class:`xls <Dataset.xls>` or :class:`xlsx <Dataset.xlsx>`, and the data source does not have headers, the import should be done as follows ::
with open('data.csv', 'r') as fh:
    imported_data = tablib.Dataset().load(fh, headers=False)
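The effect of ``headers=False`` can be pictured with the standard ``csv`` module: the first line is either consumed as column titles or kept as data. This is an illustration only; ``split_headers`` is a hypothetical helper and does not reflect how Tablib's importers are written.

```python
import csv
import io


def split_headers(text, headers=True):
    """Return (headers, rows); treat the first line as titles only if asked."""
    rows = list(csv.reader(io.StringIO(text)))
    if headers and rows:
        return rows[0], rows[1:]
    return None, rows


source = "Kenneth,Reitz,22\nBessie,Monke,20\n"
# With headers=True the first person is swallowed by the header row.
print(split_headers(source, headers=True))
# With headers=False both people are kept as data.
print(split_headers(source, headers=False))
```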
--------------
Exporting Data
--------------
Tablib's killer feature is the ability to export your :class:`Dataset` objects into a number of formats.
**Comma-Separated Values** ::
>>> data.export('csv')
First Name,Last Name,Age
Kenneth,Reitz,22
Bessie,Monke,20
**JavaScript Object Notation** ::
>>> data.export('json')
[{"Last Name": "Reitz", "First Name": "Kenneth", "Age": 22}, {"Last Name": "Monke", "First Name": "Bessie", "Age": 20}]
**YAML Ain't Markup Language** ::
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Last Name: Monke}
**Microsoft Excel** ::
>>> data.export('xls')
<redacted binary data>
**Pandas DataFrame** ::
>>> data.export('df')
First Name Last Name Age
0 Kenneth Reitz 22
1 Bessie Monke 20
------------------------
Selecting Rows & Columns
------------------------
You can slice and dice your data, just like a standard Python list. ::
>>> data[0]
('Kenneth', 'Reitz', 22)
If we had a set of data consisting of thousands of rows,
it could be useful to get a list of values in a column.
To do so, we access the :class:`Dataset` as if it were a standard Python dictionary. ::
>>> data['First Name']
['Kenneth', 'Bessie']
You can also access the column using its index. ::
>>> data.headers
['First Name', 'Last Name', 'Age']
>>> data.get_col(0)
['Kenneth', 'Bessie']
Let's find the average age. ::
>>> ages = data['Age']
>>> float(sum(ages)) / len(ages)
21.0
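Column access by header boils down to finding the header's position and collecting the value at that position from every row. A plain-Python sketch of the idea, with ``column`` as a hypothetical stand-in for the dictionary-style lookup:

```python
def column(headers, rows, key):
    """Return the values of one column, selected by header name or index."""
    if isinstance(key, str):
        if key not in headers:
            raise KeyError(key)
        key = headers.index(key)
    return [row[key] for row in rows]


headers = ["First Name", "Last Name", "Age"]
rows = [("Kenneth", "Reitz", 22), ("Bessie", "Monke", 20)]

ages = column(headers, rows, "Age")
print(column(headers, rows, "First Name"))
print(sum(ages) / len(ages))
```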
-----------------------
Removing Rows & Columns
-----------------------
It's easier than you could imagine. Delete a column::
>>> del data['Col Name']
Delete a range of rows::
>>> del data[0:12]
==============
Advanced Usage
==============
This part of the documentation serves to give you an idea of features that are otherwise hard to extract from the :ref:`API Documentation <api>`.
And now for something completely different.
.. _dyncols:
---------------
Dynamic Columns
---------------
.. versionadded:: 0.8.3
Thanks to Josh Ourisman, Tablib now supports adding dynamic columns.
A dynamic column is a single callable object (*e.g.* a function).
Let's add a dynamic column to our :class:`Dataset` object.
In this example, we have a function that generates a random grade for our students. ::
import random
def random_grade(row):
    """Returns a random grade between 0.6 and 1.0 for the given row."""
    return random.randint(60, 100) / 100.0
data.append_col(random_grade, header='Grade')
Let's have a look at our data. ::
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Grade: 0.6, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Grade: 0.75, Last Name: Monke}
Let's remove that column. ::
>>> del data['Grade']
When you add a dynamic column, the first argument that is passed in to the given callable is the current data row.
You can use this to perform calculations against your data row.
For example, we can use the data available in the row to guess the gender of a student. ::
def guess_gender(row):
"""Calculates gender of given student data row."""
m_names = ('Kenneth', 'Mike', 'Yuri')
f_names = ('Bessie', 'Samantha', 'Heather')
name = row[0]
if name in m_names:
return 'Male'
elif name in f_names:
return 'Female'
else:
return 'Unknown'
Adding this function to our dataset as a dynamic column would result in: ::
>>> data.export('yaml')
- {Age: 22, First Name: Kenneth, Gender: Male, Last Name: Reitz}
- {Age: 20, First Name: Bessie, Gender: Female, Last Name: Monke}
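Conceptually, a dynamic column is built by calling the function once per existing row and collecting the results. The following sketch shows that step in isolation; ``apply_dynamic_column`` is a hypothetical helper, and rows are plain tuples rather than Tablib's internal row objects.

```python
def guess_gender(row):
    """Guess gender from the first name stored at index 0 of the row."""
    m_names = ("Kenneth", "Mike", "Yuri")
    f_names = ("Bessie", "Samantha", "Heather")
    name = row[0]
    if name in m_names:
        return "Male"
    if name in f_names:
        return "Female"
    return "Unknown"


def apply_dynamic_column(rows, func):
    """Return new rows, each extended with func(row) as its last value."""
    return [tuple(row) + (func(row),) for row in rows]


rows = [("Kenneth", "Reitz", 22), ("Bessie", "Monke", 20), ("Alex", "Doe", 19)]
for row in apply_dynamic_column(rows, guess_gender):
    print(row)
```

Because the function only receives the row, any value it needs (here the first name) has to be read from the row by position.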
.. _tags:
----------------------------
Filtering Datasets with Tags
----------------------------
.. versionadded:: 0.9.0
When constructing a :class:`Dataset` object,
you can add tags to rows by specifying the ``tags`` parameter.
This allows you to filter your :class:`Dataset` later.
This can be useful to separate rows of data based on arbitrary criteria
(*e.g.* origin) that you don't want to include in your :class:`Dataset`.
Let's tag some students. ::
students = tablib.Dataset()
students.headers = ['first', 'last']
students.rpush(['Kenneth', 'Reitz'], tags=['male', 'technical'])
students.rpush(['Daniel', 'Dupont'], tags=['male', 'creative' ])
students.rpush(['Bessie', 'Monke'], tags=['female', 'creative'])
Now that we have extra meta-data on our rows, we can easily filter our :class:`Dataset`. Let's just see Female students. ::
>>> students.filter(['female']).yaml
- {first: Bessie, last: Monke}
By default, passing a list of tags matches every row that carries *any* of them (a logical OR). ::
>>> students.filter(['female', 'creative']).yaml
- {first: Daniel, last: Dupont}
- {first: Bessie, last: Monke}
To require *all* of the tags (a logical AND), chain the filters. ::
>>> students.filter(['female']).filter(['creative']).yaml
- {first: Bessie, last: Monke}
It's that simple. The original :class:`Dataset` is untouched.
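The OR and AND behaviours can be reproduced with set intersection on plain Python data. This is a sketch for illustration only; ``has_any_tag`` and ``filter_rows`` are hypothetical names, and rows are stored as ``(row, tags)`` pairs rather than Tablib objects.

```python
def has_any_tag(row_tags, wanted):
    """True if the row carries at least one of the wanted tags (OR)."""
    if isinstance(wanted, str):
        wanted = [wanted]
    return bool(set(wanted) & set(row_tags))


def filter_rows(tagged_rows, wanted):
    """Keep the (row, tags) pairs matching any wanted tag; input is not modified."""
    return [(row, tags) for row, tags in tagged_rows if has_any_tag(tags, wanted)]


students = [
    (("Kenneth", "Reitz"), ["male", "technical"]),
    (("Daniel", "Dupont"), ["male", "creative"]),
    (("Bessie", "Monke"), ["female", "creative"]),
]

# A single call with several tags behaves as OR.
either = filter_rows(students, ["female", "creative"])
# Chaining two calls behaves as AND.
both = filter_rows(filter_rows(students, "female"), "creative")

print([row for row, _ in either])
print([row for row, _ in both])
```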
Open an Excel Workbook and read first sheet
-------------------------------------------
Open a workbook in the Excel 2007 (or later) ``.xlsx`` format that has a single sheet, or one with multiple sheets when you only want the first sheet. ::
data = tablib.Dataset()
with open('my_excel_file.xlsx', 'rb') as fh:
data.load(fh, 'xlsx')
print(data)
Excel Workbook With Multiple Sheets
------------------------------------
When dealing with a large number of :class:`Datasets <Dataset>` in spreadsheet format,
it's quite common to group multiple spreadsheets into a single Excel file, known as a Workbook.
Tablib makes it extremely easy to build workbooks with the handy :class:`Databook` class.
Let's say we have 3 different :class:`Datasets <Dataset>`.
All we have to do is add them to a :class:`Databook` object... ::
book = tablib.Databook((data1, data2, data3))
... and export to Excel just like :class:`Datasets <Dataset>`. ::
with open('students.xls', 'wb') as f:
f.write(book.export('xls'))
The resulting ``students.xls`` file will contain a separate spreadsheet for each :class:`Dataset` object in the :class:`Databook`.
.. admonition:: Binary Warning
Make sure to open the output file in binary mode.
.. _separators:
----------
Separators
----------
.. versionadded:: 0.8.2
When constructing a spreadsheet,
it's often useful to create a blank row containing information on the upcoming data. So,
::
daniel_tests = [
('11/24/09', 'Math 101 Mid-term Exam', 56.),
('05/24/10', 'Math 101 Final Exam', 62.)
]
suzie_tests = [
('11/24/09', 'Math 101 Mid-term Exam', 56.),
('05/24/10', 'Math 101 Final Exam', 62.)
]
# Create new dataset
tests = tablib.Dataset()
tests.headers = ['Date', 'Test Name', 'Grade']
# Daniel's Tests
tests.append_separator('Daniel\'s Scores')
for test_row in daniel_tests:
tests.append(test_row)
# Suzie's Tests
tests.append_separator('Suzie\'s Scores')
for test_row in suzie_tests:
tests.append(test_row)
# Write spreadsheet to disk
with open('grades.xls', 'wb') as f:
f.write(tests.export('xls'))
The resulting **grades.xls** will have the following layout:
Daniel's Scores:
* '11/24/09', 'Math 101 Mid-term Exam', 56.
* '05/24/10', 'Math 101 Final Exam', 62.
Suzie's Scores:
* '11/24/09', 'Math 101 Mid-term Exam', 56.
* '05/24/10', 'Math 101 Final Exam', 62.
.. admonition:: Format Support
At this time, only :class:`Excel <Dataset.xls>` output supports separators.
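One way to picture separators is as ``(index, text)`` pairs that are kept apart from the rows and only merged in when the sheet is written. The sketch below makes that assumption explicit; ``interleave`` is a hypothetical helper, and the index convention (positions counted in the table including its header row, before any separator is inserted) is chosen for the example rather than taken from Tablib.

```python
def interleave(table, separators):
    """Insert one-cell separator rows into a copy of table.

    Each separator is an (index, text) pair. The index refers to a position
    in the table before any separator was inserted, so later insertions are
    shifted by the number of separators already placed.
    """
    result = [list(row) for row in table]
    for offset, (index, text) in enumerate(sorted(separators)):
        result.insert(index + offset, [text])
    return result


table = [
    ["Date", "Test Name", "Grade"],
    ["11/24/09", "Math 101 Mid-term Exam", 56.0],
    ["05/24/10", "Math 101 Final Exam", 62.0],
    ["11/24/09", "Math 101 Mid-term Exam", 56.0],
    ["05/24/10", "Math 101 Final Exam", 62.0],
]
separators = [(1, "Daniel's Scores"), (3, "Suzie's Scores")]

for row in interleave(table, separators):
    print(row)
```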
----
Now, go check out the :ref:`API Documentation <api>` or begin :ref:`Tablib Development <development>`.
Vendored
-27
@@ -1,27 +0,0 @@
from fabric.api import *
def scrub():
""" Death to the bytecode! """
local("rm -fr dist build")
local("find . -name \"*.pyc\" -exec rm '{}' ';'")
def test():
""" Test parsing! """
local("rm output/*")
local("./strata.py --nsanity_files 'strata/tests/samples/nsanity' -d")
def build():
""" Build application"""
pass
def init():
""" Initialize Environment """
# TODO: Possibly add Virtual Environment?
local("sudo pip install -r REQUIREMENTS")
if __name__ == '__main__':
# TODO: Remove (for testing purposes)
# TODO: [Possibly] add doctests
test()
+2
@@ -0,0 +1,2 @@
[tool.isort]
profile = "black"
+3
@@ -0,0 +1,3 @@
[pytest]
norecursedirs = .git .*
addopts = -rsxX --showlocals --tb=native --cov=tablib --cov=tests --cov-report xml --cov-report term --cov-report html
Regular → Executable
+49 -33
@@ -1,36 +1,52 @@
#!/usr/bin/env python
import os
import sys
import tablib
from setuptools import find_packages, setup
from distutils.core import setup
def publish():
"""Publish to PyPi"""
os.system("python setup.py sdist upload")
if sys.argv[-1] == "publish":
publish()
sys.exit()
setup(name='tablib',
version=tablib.__version__,
description='Python wrapper for Gist API',
long_description=open('README.rst').read() + '\n\n' +
open('HISTORY.rst').read(),
author='Kenneth Reitz',
author_email='me@kennethreitz.com',
url='http://github.com/kennethreitz/tabbed',
packages=['tablib'],
license='MIT',
classifiers=(
"Development Status :: 4 - Beta",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python",
"Programming Language :: Python :: 2.5",
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
)
)
setup(
name='tablib',
use_scm_version={
'write_to': 'src/tablib/_version.py',
},
setup_requires=['setuptools_scm'],
description='Format agnostic tabular data library (XLS, JSON, YAML, CSV)',
long_description=(
open('README.md').read() + '\n\n' + open('HISTORY.md').read()
),
long_description_content_type="text/markdown",
author='Kenneth Reitz',
author_email='me@kennethreitz.org',
maintainer='Jazzband',
maintainer_email='roadies@jazzband.co',
url='https://tablib.readthedocs.io',
project_urls={
"Documentation": "https://tablib.readthedocs.io",
"Source": "https://github.com/jazzband/tablib",
},
packages=find_packages(where="src"),
package_dir={"": "src"},
license='MIT',
classifiers=[
'Development Status :: 5 - Production/Stable',
'Intended Audience :: Developers',
'Natural Language :: English',
'License :: OSI Approved :: MIT License',
'Programming Language :: Python',
'Programming Language :: Python :: 3 :: Only',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
],
python_requires='>=3.6',
extras_require={
'all': ['markuppy', 'odfpy', 'openpyxl>=2.6.0', 'pandas', 'pyyaml', 'tabulate', 'xlrd', 'xlwt'],
'cli': ['tabulate'],
'html': ['markuppy'],
'ods': ['odfpy'],
'pandas': ['pandas'],
'xls': ['xlrd', 'xlwt'],
'xlsx': ['openpyxl>=2.6.0'],
'yaml': ['pyyaml'],
},
)
+19
@@ -0,0 +1,19 @@
""" Tablib. """
try:
# Generated by setuptools-scm.
from ._version import version as __version__
except ImportError:
# Some broken installation.
__version__ = None
from tablib.core import ( # noqa: F401
Databook,
Dataset,
InvalidDatasetType,
InvalidDimensions,
UnsupportedFormat,
detect_format,
import_book,
import_set,
)
+917
@@ -0,0 +1,917 @@
"""
tablib.core
~~~~~~~~~~~
This module implements the central Tablib objects.
:copyright: (c) 2016 by Kenneth Reitz. 2019 Jazzband.
:license: MIT, see LICENSE for more details.
"""
from collections import OrderedDict
from copy import copy
from operator import itemgetter
from tablib.exceptions import (
HeadersNeeded,
InvalidDatasetIndex,
InvalidDatasetType,
InvalidDimensions,
UnsupportedFormat,
)
from tablib.formats import registry
from tablib.utils import normalize_input
__title__ = 'tablib'
__author__ = 'Kenneth Reitz'
__license__ = 'MIT'
__copyright__ = 'Copyright 2017 Kenneth Reitz. 2019 Jazzband.'
__docformat__ = 'restructuredtext'
class Row:
"""Internal Row object. Mainly used for filtering."""
__slots__ = ['_row', 'tags']
def __init__(self, row=list(), tags=list()):
self._row = list(row)
self.tags = list(tags)
def __iter__(self):
return (col for col in self._row)
def __len__(self):
return len(self._row)
def __repr__(self):
return repr(self._row)
def __getitem__(self, i):
return self._row[i]
def __setitem__(self, i, value):
self._row[i] = value
def __delitem__(self, i):
del self._row[i]
def __getstate__(self):
return self._row, self.tags
def __setstate__(self, state):
self._row, self.tags = state
def rpush(self, value):
self.insert(len(self._row), value)
def lpush(self, value):
self.insert(0, value)
def append(self, value):
self.rpush(value)
def insert(self, index, value):
self._row.insert(index, value)
def __contains__(self, item):
return (item in self._row)
@property
def tuple(self):
"""Tuple representation of :class:`Row`."""
return tuple(self._row)
@property
def list(self):
"""List representation of :class:`Row`."""
return list(self._row)
def has_tag(self, tag):
"""Returns true if current row contains tag."""
if tag is None:
return False
elif isinstance(tag, str):
return (tag in self.tags)
else:
return bool(len(set(tag) & set(self.tags)))
class Dataset:
"""The :class:`Dataset` object is the heart of Tablib. It provides all core
functionality.
Usually you create a :class:`Dataset` instance in your main module, and append
rows as you collect data. ::
data = tablib.Dataset()
data.headers = ('name', 'age')
for (name, age) in some_collector():
data.append((name, age))
Setting columns is similar. The column data length must equal the
current height of the data and headers must be set. ::
data = tablib.Dataset()
data.headers = ('first_name', 'last_name')
data.append(('John', 'Adams'))
data.append(('George', 'Washington'))
data.append_col((90, 67), header='age')
You can also set rows and headers upon instantiation. This is useful if
dealing with dozens or hundreds of :class:`Dataset` objects. ::
headers = ('first_name', 'last_name')
data = [('John', 'Adams'), ('George', 'Washington')]
data = tablib.Dataset(*data, headers=headers)
:param \\*args: (optional) list of rows to populate Dataset
:param headers: (optional) list of strings for Dataset header row
:param title: (optional) string to use as title of the Dataset
.. admonition:: Format Attributes Definition
If you look at the code, the various output/import formats are not
defined within the :class:`Dataset` object. To add support for a new format, see
:ref:`Adding New Formats <newformats>`.
"""
def __init__(self, *args, **kwargs):
self._data = list(Row(arg) for arg in args)
self.__headers = None
# ('title', index) tuples
self._separators = []
# (column, callback) tuples
self._formatters = []
self.headers = kwargs.get('headers')
self.title = kwargs.get('title')
def __len__(self):
return self.height
def __getitem__(self, key):
if isinstance(key, str):
if key in self.headers:
pos = self.headers.index(key) # get 'key' index from each data
return [row[pos] for row in self._data]
else:
raise KeyError
else:
_results = self._data[key]
if isinstance(_results, Row):
return _results.tuple
else:
return [result.tuple for result in _results]
def __setitem__(self, key, value):
self._validate(value)
self._data[key] = Row(value)
def __delitem__(self, key):
if isinstance(key, str):
if key in self.headers:
pos = self.headers.index(key)
del self.headers[pos]
for i, row in enumerate(self._data):
del row[pos]
self._data[i] = row
else:
raise KeyError
else:
del self._data[key]
def __repr__(self):
try:
return '<%s dataset>' % (self.title.lower())
except AttributeError:
return '<dataset object>'
def __str__(self):
result = []
# Add str representation of headers.
if self.__headers:
result.append([str(h) for h in self.__headers])
# Add str representation of rows.
result.extend(list(map(str, row)) for row in self._data)
lens = [list(map(len, row)) for row in result]
field_lens = list(map(max, zip(*lens)))
# delimiter between header and data
if self.__headers:
result.insert(1, ['-' * length for length in field_lens])
format_string = '|'.join('{%s:%s}' % item for item in enumerate(field_lens))
return '\n'.join(format_string.format(*row) for row in result)
# ---------
# Internals
# ---------
def _get_in_format(self, fmt_key, **kwargs):
return registry.get_format(fmt_key).export_set(self, **kwargs)
def _set_in_format(self, fmt_key, in_stream, **kwargs):
in_stream = normalize_input(in_stream)
return registry.get_format(fmt_key).import_set(self, in_stream, **kwargs)
def _validate(self, row=None, col=None, safety=False):
"""Assures size of every row in dataset is of proper proportions."""
if row:
is_valid = (len(row) == self.width) if self.width else True
elif col:
if len(col) < 1:
is_valid = True
else:
is_valid = (len(col) == self.height) if self.height else True
else:
is_valid = all(len(x) == self.width for x in self._data)
if is_valid:
return True
else:
if not safety:
raise InvalidDimensions
return False
def _package(self, dicts=True, ordered=True):
"""Packages Dataset into lists of dictionaries for transmission."""
# TODO: Dicts default to false?
_data = list(self._data)
if ordered:
dict_pack = OrderedDict
else:
dict_pack = dict
# Execute formatters
if self._formatters:
for row_i, row in enumerate(_data):
for col, callback in self._formatters:
try:
if col is None:
for j, c in enumerate(row):
_data[row_i][j] = callback(c)
else:
_data[row_i][col] = callback(row[col])
except IndexError:
raise InvalidDatasetIndex
if self.headers:
if dicts:
data = [dict_pack(list(zip(self.headers, data_row))) for data_row in _data]
else:
data = [list(self.headers)] + list(_data)
else:
data = [list(row) for row in _data]
return data
def _get_headers(self):
"""An *optional* list of strings to be used for header rows and attribute names.
This must be set manually. The given list length must equal :attr:`Dataset.width`.
"""
return self.__headers
def _set_headers(self, collection):
"""Validating headers setter."""
self._validate(collection)
if collection:
try:
self.__headers = list(collection)
except TypeError:
raise TypeError
else:
self.__headers = None
headers = property(_get_headers, _set_headers)
def _get_dict(self):
"""A native Python representation of the :class:`Dataset` object. If headers have
been set, a list of Python dictionaries will be returned. If no headers have been set,
a list of tuples (rows) will be returned instead.
A dataset object can also be imported by setting the `Dataset.dict` attribute: ::
data = tablib.Dataset()
data.dict = [{'age': 90, 'first_name': 'Kenneth', 'last_name': 'Reitz'}]
"""
return self._package()
def _set_dict(self, pickle):
"""A native Python representation of the Dataset object. If headers have been
set, a list of Python dictionaries will be returned. If no headers have been
set, a list of tuples (rows) will be returned instead.
A dataset object can also be imported by setting the :attr:`Dataset.dict` attribute. ::
data = tablib.Dataset()
data.dict = [{'age': 90, 'first_name': 'Kenneth', 'last_name': 'Reitz'}]
"""
if not len(pickle):
return
# if list of rows
if isinstance(pickle[0], list):
self.wipe()
for row in pickle:
self.append(Row(row))
# if list of objects
elif isinstance(pickle[0], dict):
self.wipe()
self.headers = list(pickle[0].keys())
for row in pickle:
self.append(Row(list(row.values())))
else:
raise UnsupportedFormat
dict = property(_get_dict, _set_dict)
def _clean_col(self, col):
"""Prepares the given column for insert/append."""
col = list(col)
if self.headers:
header = [col.pop(0)]
else:
header = []
if len(col) == 1 and hasattr(col[0], '__call__'):
col = list(map(col[0], self._data))
col = tuple(header + col)
return col
@property
def height(self):
"""The number of rows currently in the :class:`Dataset`.
Cannot be directly modified.
"""
return len(self._data)
@property
def width(self):
"""The number of columns currently in the :class:`Dataset`.
Cannot be directly modified.
"""
try:
return len(self._data[0])
except IndexError:
try:
return len(self.headers)
except TypeError:
return 0
def load(self, in_stream, format=None, **kwargs):
"""
Import `in_stream` to the :class:`Dataset` object using the `format`.
`in_stream` can be a file-like object, a string, or a bytestring.
:param \\*\\*kwargs: (optional) custom configuration to the format `import_set`.
"""
stream = normalize_input(in_stream)
if not format:
format = detect_format(stream)
fmt = registry.get_format(format)
if not hasattr(fmt, 'import_set'):
raise UnsupportedFormat(f'Format {format} cannot be imported.')
if not import_set:
raise UnsupportedFormat(f'Format {format} cannot be imported.')
fmt.import_set(self, stream, **kwargs)
return self
def export(self, format, **kwargs):
"""
Export :class:`Dataset` object to `format`.
:param \\*\\*kwargs: (optional) custom configuration to the format `export_set`.
"""
fmt = registry.get_format(format)
if not hasattr(fmt, 'export_set'):
raise UnsupportedFormat(f'Format {format} cannot be exported.')
return fmt.export_set(self, **kwargs)
# ----
# Rows
# ----
def insert(self, index, row, tags=list()):
"""Inserts a row to the :class:`Dataset` at the given index.
Rows inserted must have the correct size, i.e. match :attr:`Dataset.width`.
The default behaviour is to insert the given row to the :class:`Dataset`
object at the given index.
"""
self._validate(row)
self._data.insert(index, Row(row, tags=tags))
def rpush(self, row, tags=list()):
"""Adds a row to the end of the :class:`Dataset`.
See :meth:`Dataset.insert` for additional documentation.
"""
self.insert(self.height, row=row, tags=tags)
def lpush(self, row, tags=list()):
"""Adds a row to the top of the :class:`Dataset`.
See :meth:`Dataset.insert` for additional documentation.
"""
self.insert(0, row=row, tags=tags)
def append(self, row, tags=list()):
"""Adds a row to the :class:`Dataset`.
See :meth:`Dataset.insert` for additional documentation.
"""
self.rpush(row, tags)
def extend(self, rows, tags=list()):
"""Adds a list of rows to the :class:`Dataset` using
:meth:`Dataset.append`.
"""
for row in rows:
self.append(row, tags)
def lpop(self):
"""Removes and returns the first row of the :class:`Dataset`."""
cache = self[0]
del self[0]
return cache
def rpop(self):
"""Removes and returns the last row of the :class:`Dataset`."""
cache = self[-1]
del self[-1]
return cache
def pop(self):
"""Removes and returns the last row of the :class:`Dataset`."""
return self.rpop()
# -------
# Columns
# -------
def insert_col(self, index, col=None, header=None):
"""Inserts a column to the :class:`Dataset` at the given index.
Columns inserted must be the correct height.
You can also insert a column of a single callable object, which will
add a new column with the return values of the callable each as an
item in the column. ::
data.append_col(col=random.randint)
If inserting a column, and :attr:`Dataset.headers` is set, the
``header`` argument must be given, and will be used as the header for
that column.
See :ref:`dyncols` for an in-depth example.
.. versionchanged:: 0.9.0
If inserting a column, and :attr:`Dataset.headers` is set, the
``header`` argument must be given, and will be used as the header for
that column.
.. versionadded:: 0.9.0
If inserting a row, you can add :ref:`tags <tags>` to the row you are inserting.
This gives you the ability to :meth:`filter <Dataset.filter>` your
:class:`Dataset` later.
"""
if col is None:
col = []
# Callable Columns...
if callable(col):
col = list(map(col, self._data))
col = self._clean_col(col)
self._validate(col=col)
if self.headers:
# a header is required when the dataset already has headers
if not header:
raise HeadersNeeded()
# corner case - if header is set without data
elif header and self.height == 0 and len(col):
raise InvalidDimensions
self.headers.insert(index, header)
if self.height and self.width:
for i, row in enumerate(self._data):
row.insert(index, col[i])
self._data[i] = row
else:
self._data = [Row([row]) for row in col]
def rpush_col(self, col, header=None):
"""Adds a column to the end of the :class:`Dataset`.
See :meth:`Dataset.insert_col` for additional documentation.
"""
self.insert_col(self.width, col, header=header)
def lpush_col(self, col, header=None):
"""Adds a column to the beginning (left side) of the :class:`Dataset`.
See :meth:`Dataset.insert_col` for additional documentation.
"""
self.insert_col(0, col, header=header)
def insert_separator(self, index, text='-'):
"""Adds a separator to :class:`Dataset` at given index."""
sep = (index, text)
self._separators.append(sep)
def append_separator(self, text='-'):
"""Adds a :ref:`separator <separators>` to the :class:`Dataset`."""
# change offsets if headers are or aren't defined
if not self.headers:
index = self.height if self.height else 0
else:
index = (self.height + 1) if self.height else 1
self.insert_separator(index, text)
def append_col(self, col, header=None):
"""Adds a column to the :class:`Dataset`.
See :meth:`Dataset.insert_col` for additional documentation.
"""
self.rpush_col(col, header)
def get_col(self, index):
"""Returns the column from the :class:`Dataset` at the given index."""
return [row[index] for row in self._data]
# ----
# Misc
# ----
def add_formatter(self, col, handler):
"""Adds a formatter to the :class:`Dataset`.
.. versionadded:: 0.9.5
:param col: column to format. Accepts index int or header str.
:param handler: reference to callback function to execute against
each cell value.
"""
if isinstance(col, str):
if col in self.headers:
col = self.headers.index(col)  # resolve header name to column index
else:
raise KeyError
if not col > self.width:
self._formatters.append((col, handler))
else:
raise InvalidDatasetIndex
return True
def filter(self, tag):
"""Returns a new instance of the :class:`Dataset`, excluding any rows
that do not contain the given :ref:`tags <tags>`.
"""
_dset = copy(self)
_dset._data = [row for row in _dset._data if row.has_tag(tag)]
return _dset
def sort(self, col, reverse=False):
"""Sort a :class:`Dataset` by a specific column, given string (for
header) or integer (for column index). The order can be reversed by
setting ``reverse`` to ``True``.
Returns a new :class:`Dataset` instance whose rows have been
sorted.
"""
if isinstance(col, str):
if not self.headers:
raise HeadersNeeded
_sorted = sorted(self.dict, key=itemgetter(col), reverse=reverse)
_dset = Dataset(headers=self.headers, title=self.title)
for item in _sorted:
row = [item[key] for key in self.headers]
_dset.append(row=row)
else:
if self.headers:
col = self.headers[col]
_sorted = sorted(self.dict, key=itemgetter(col), reverse=reverse)
_dset = Dataset(headers=self.headers, title=self.title)
for item in _sorted:
if self.headers:
row = [item[key] for key in self.headers]
else:
row = item
_dset.append(row=row)
return _dset
def transpose(self):
"""Transpose a :class:`Dataset`, turning rows into columns and vice
versa, returning a new ``Dataset`` instance. The first column of the
original instance becomes the new header row."""
# Don't transpose if there is no data
if not self:
return
_dset = Dataset()
# The first element of the headers stays in the headers,
# it is our "hinge" on which we rotate the data
new_headers = [self.headers[0]] + self[self.headers[0]]
_dset.headers = new_headers
for index, column in enumerate(self.headers):
if column == self.headers[0]:
# It's in the headers, so skip it
continue
# Adding the column name as now they're a regular column
# Use `get_col(index)` in case there are repeated values
row_data = [column] + self.get_col(index)
row_data = Row(row_data)
_dset.append(row=row_data)
return _dset
def stack(self, other):
"""Stack two :class:`Dataset` instances together by
joining at the row level, and return new combined
``Dataset`` instance."""
if not isinstance(other, Dataset):
return
if self.width != other.width:
raise InvalidDimensions
# Copy the source data
_dset = copy(self)
rows_to_stack = [row for row in _dset._data]
other_rows = [row for row in other._data]
rows_to_stack.extend(other_rows)
_dset._data = rows_to_stack
return _dset
def stack_cols(self, other):
"""Stack two :class:`Dataset` instances together by
joining at the column level, and return a new
combined ``Dataset`` instance. If either ``Dataset``
has headers set, then the other must as well."""
if not isinstance(other, Dataset):
return
if self.headers or other.headers:
if not self.headers or not other.headers:
raise HeadersNeeded
if self.height != other.height:
raise InvalidDimensions
try:
new_headers = self.headers + other.headers
except TypeError:
new_headers = None
_dset = Dataset()
for column in self.headers:
_dset.append_col(col=self[column])
for column in other.headers:
_dset.append_col(col=other[column])
_dset.headers = new_headers
return _dset
def remove_duplicates(self):
"""Removes all duplicate rows from the :class:`Dataset` object
while maintaining the original order."""
seen = set()
self._data[:] = [row for row in self._data if not (tuple(row) in seen or seen.add(tuple(row)))]
def wipe(self):
"""Removes all content and headers from the :class:`Dataset` object."""
self._data = list()
self.__headers = None
def subset(self, rows=None, cols=None):
"""Returns a new instance of the :class:`Dataset`,
including only specified rows and columns.
"""
# Don't return if no data
if not self:
return
if rows is None:
rows = list(range(self.height))
if cols is None:
cols = list(self.headers)
# filter out impossible rows and columns
rows = [row for row in rows if row in range(self.height)]
cols = [header for header in cols if header in self.headers]
_dset = Dataset()
# filtering rows and columns
_dset.headers = list(cols)
_dset._data = []
for row_no, row in enumerate(self._data):
data_row = []
for key in _dset.headers:
if key in self.headers:
pos = self.headers.index(key)
data_row.append(row[pos])
else:
raise KeyError
if row_no in rows:
_dset.append(row=Row(data_row))
return _dset
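The order-preserving de-duplication in `remove_duplicates` above folds a membership test and a `seen.add` call into a single comprehension. The same idea can be sketched more explicitly as a standalone function. Note that `dedupe_rows` is a hypothetical helper name used only for illustration (it is not part of tablib), and the sketch assumes every cell in a row is hashable.

```python
def dedupe_rows(rows):
    """Return rows with later duplicates dropped, keeping first-seen order."""
    seen = set()      # tuples of rows already emitted
    unique = []
    for row in rows:
        key = tuple(row)  # lists are unhashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
print(dedupe_rows(rows))
```

Unlike the in-place slice assignment used in the method, this version returns a new list and leaves its input untouched.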
class Databook:
"""A book of :class:`Dataset` objects.
"""
def __init__(self, sets=None):
self._datasets = sets or []
def __repr__(self):
try:
return '<%s databook>' % (self.title.lower())
except AttributeError:
return '<databook object>'
def wipe(self):
"""Removes all :class:`Dataset` objects from the :class:`Databook`."""
self._datasets = []
def sheets(self):
return self._datasets
def add_sheet(self, dataset):
"""Adds given :class:`Dataset` to the :class:`Databook`."""
if isinstance(dataset, Dataset):
self._datasets.append(dataset)
else:
raise InvalidDatasetType
def _package(self, ordered=True):
"""Packages :class:`Databook` for delivery."""
collector = []
if ordered:
dict_pack = OrderedDict
else:
dict_pack = dict
for dset in self._datasets:
collector.append(dict_pack(
title=dset.title,
data=dset._package(ordered=ordered)
))
return collector
@property
def size(self):
"""The number of :class:`Dataset` objects within the :class:`Databook`."""
return len(self._datasets)
def load(self, in_stream, format, **kwargs):
"""
Import `in_stream` to the :class:`Databook` object using the `format`.
`in_stream` can be a file-like object, a string, or a bytestring.
:param \\*\\*kwargs: (optional) custom configuration to the format `import_book`.
"""
stream = normalize_input(in_stream)
if not format:
format = detect_format(stream)
fmt = registry.get_format(format)
if not hasattr(fmt, 'import_book'):
raise UnsupportedFormat(f'Format {format} cannot be loaded.')
fmt.import_book(self, stream, **kwargs)
return self
def export(self, format, **kwargs):
"""
Export :class:`Databook` object to `format`.
:param \\*\\*kwargs: (optional) custom configuration to the format `export_book`.
"""
fmt = registry.get_format(format)
if not hasattr(fmt, 'export_book'):
raise UnsupportedFormat(f'Format {format} cannot be exported.')
return fmt.export_book(self, **kwargs)
def detect_format(stream):
"""Return format name of given stream (file-like object, string, or bytestring)."""
stream = normalize_input(stream)
fmt_title = None
for fmt in registry.formats():
try:
if fmt.detect(stream):
fmt_title = fmt.title
break
except AttributeError:
pass
finally:
if hasattr(stream, 'seek'):
stream.seek(0)
return fmt_title
def import_set(stream, format=None, **kwargs):
"""Return dataset of given stream (file-like object, string, or bytestring)."""
return Dataset().load(normalize_input(stream), format, **kwargs)
def import_book(stream, format=None, **kwargs):
"""Return databook of given stream (file-like object, string, or bytestring)."""
return Databook().load(normalize_input(stream), format, **kwargs)
registry.register_builtins()
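The loop in `detect_format` above tries each registered format in order and rewinds the stream after every attempt, so that a failed probe does not consume input needed by the next one. A minimal, stdlib-only sketch of that pattern follows. The names `JsonDetector`, `CsvDetector` and `detect_title` are hypothetical stand-ins for illustration, not tablib's API.

```python
import csv
import json
from io import StringIO


class JsonDetector:
    title = 'json'

    @classmethod
    def detect(cls, stream):
        try:
            json.load(stream)
            return True
        except (TypeError, ValueError):
            return False


class CsvDetector:
    title = 'csv'

    @classmethod
    def detect(cls, stream):
        try:
            csv.Sniffer().sniff(stream.read(1024), delimiters=',')
            return True
        except Exception:
            return False


def detect_title(stream, detectors):
    """Return the title of the first detector that accepts the stream."""
    for fmt in detectors:
        try:
            if fmt.detect(stream):
                return fmt.title
        except AttributeError:
            pass  # this format offers no detect() hook; skip it
        finally:
            stream.seek(0)  # always rewind, even when returning a match
    return None


print(detect_title(StringIO('[{"a": 1}]'), [JsonDetector, CsvDetector]))
```

Because the rewind sits in a `finally` clause, it also runs on the successful path, which is why the caller can go on to parse the same stream straight after detection. Detector order matters: the first match wins.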
+18
@@ -0,0 +1,18 @@
class InvalidDatasetType(Exception):
"Only Datasets can be added to a DataBook"
class InvalidDimensions(Exception):
"Invalid size"
class InvalidDatasetIndex(Exception):
"Outside of Dataset size"
class HeadersNeeded(Exception):
"Header parameter must be given when appending a column in this Dataset."
class UnsupportedFormat(NotImplementedError):
"Format is not supported"
+135
@@ -0,0 +1,135 @@
""" Tablib - formats
"""
from collections import OrderedDict
from functools import partialmethod
from importlib import import_module
from importlib.util import find_spec
from tablib.exceptions import UnsupportedFormat
from tablib.utils import normalize_input
from ._csv import CSVFormat
from ._json import JSONFormat
from ._tsv import TSVFormat
uninstalled_format_messages = {
"cli": {"package_name": "tabulate package", "extras_name": "cli"},
"df": {"package_name": "pandas package", "extras_name": "pandas"},
"html": {"package_name": "MarkupPy package", "extras_name": "html"},
"ods": {"package_name": "odfpy package", "extras_name": "ods"},
"xls": {"package_name": "xlrd and xlwt packages", "extras_name": "xls"},
"xlsx": {"package_name": "openpyxl package", "extras_name": "xlsx"},
"yaml": {"package_name": "pyyaml package", "extras_name": "yaml"},
}
def load_format_class(dotted_path):
try:
module_path, class_name = dotted_path.rsplit('.', 1)
return getattr(import_module(module_path), class_name)
except (ValueError, AttributeError) as err:
raise ImportError(f"Unable to load format class '{dotted_path}' ({err})")
class FormatDescriptorBase:
def __init__(self, key, format_or_path):
self.key = key
self._format_path = None
if isinstance(format_or_path, str):
self._format = None
self._format_path = format_or_path
else:
self._format = format_or_path
def ensure_format_loaded(self):
if self._format is None:
self._format = load_format_class(self._format_path)
class ImportExportBookDescriptor(FormatDescriptorBase):
def __get__(self, obj, cls, **kwargs):
self.ensure_format_loaded()
return self._format.export_book(obj, **kwargs)
def __set__(self, obj, val):
self.ensure_format_loaded()
return self._format.import_book(obj, normalize_input(val))
class ImportExportSetDescriptor(FormatDescriptorBase):
def __get__(self, obj, cls, **kwargs):
self.ensure_format_loaded()
return self._format.export_set(obj, **kwargs)
def __set__(self, obj, val):
self.ensure_format_loaded()
return self._format.import_set(obj, normalize_input(val))
class Registry:
_formats = OrderedDict()
def register(self, key, format_or_path):
from tablib.core import Databook, Dataset
# Create Databook.<format> read or read/write properties
setattr(Databook, key, ImportExportBookDescriptor(key, format_or_path))
# Create Dataset.<format> read or read/write properties,
# and Dataset.get_<format>/set_<format> methods.
setattr(Dataset, key, ImportExportSetDescriptor(key, format_or_path))
try:
setattr(Dataset, 'get_%s' % key, partialmethod(Dataset._get_in_format, key))
setattr(Dataset, 'set_%s' % key, partialmethod(Dataset._set_in_format, key))
except AttributeError:
setattr(Dataset, 'get_%s' % key, partialmethod(Dataset._get_in_format, key))
self._formats[key] = format_or_path
def register_builtins(self):
# Registration ordering matters for autodetection.
self.register('json', JSONFormat())
# Register xlsx before xls, as xlrd (xls) can also read xlsx files
if find_spec('openpyxl'):
self.register('xlsx', 'tablib.formats._xlsx.XLSXFormat')
if find_spec('xlrd') and find_spec('xlwt'):
self.register('xls', 'tablib.formats._xls.XLSFormat')
if find_spec('yaml'):
self.register('yaml', 'tablib.formats._yaml.YAMLFormat')
self.register('csv', CSVFormat())
self.register('tsv', TSVFormat())
if find_spec('odf'):
self.register('ods', 'tablib.formats._ods.ODSFormat')
self.register('dbf', 'tablib.formats._dbf.DBFFormat')
if find_spec('MarkupPy'):
self.register('html', 'tablib.formats._html.HTMLFormat')
self.register('jira', 'tablib.formats._jira.JIRAFormat')
self.register('latex', 'tablib.formats._latex.LATEXFormat')
if find_spec('pandas'):
self.register('df', 'tablib.formats._df.DataFrameFormat')
self.register('rst', 'tablib.formats._rst.ReSTFormat')
if find_spec('tabulate'):
self.register('cli', 'tablib.formats._cli.CLIFormat')
def formats(self):
for key, frm in self._formats.items():
if isinstance(frm, str):
self._formats[key] = load_format_class(frm)
yield self._formats[key]
def get_format(self, key):
if key not in self._formats:
if key in uninstalled_format_messages:
raise UnsupportedFormat(
"The '{key}' format is not available. You may want to install the "
"{package_name} (or `pip install \"tablib[{extras_name}]\"`).".format(
**uninstalled_format_messages[key], key=key
)
)
raise UnsupportedFormat("Tablib has no format '%s' or it is not registered." % key)
if isinstance(self._formats[key], str):
self._formats[key] = load_format_class(self._formats[key])
return self._formats[key]
registry = Registry()
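The registry above stores either a ready format object or a dotted path string, and only imports the path the first time the format is requested. That lazy-loading idea can be sketched with the standard library alone. `LazyRegistry` and `load_class` are hypothetical names chosen for this illustration, and `collections.OrderedDict` merely stands in for an optional format class.

```python
from importlib import import_module


def load_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    try:
        module_path, class_name = dotted_path.rsplit('.', 1)
        return getattr(import_module(module_path), class_name)
    except (ValueError, AttributeError) as err:
        raise ImportError(f"Unable to load class '{dotted_path}' ({err})") from err


class LazyRegistry:
    def __init__(self):
        self._formats = {}

    def register(self, key, obj_or_path):
        # Strings are kept as-is; nothing is imported at registration time.
        self._formats[key] = obj_or_path

    def get(self, key):
        value = self._formats[key]
        if isinstance(value, str):
            # First access: resolve the path and cache the loaded class.
            value = self._formats[key] = load_class(value)
        return value


reg = LazyRegistry()
reg.register('od', 'collections.OrderedDict')
print(reg.get('od'))
```

Deferring the import keeps start-up cheap when optional dependencies are installed but never used.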
+20
@@ -0,0 +1,20 @@
"""Tablib - Command-line Interface table export support.
Generates a representation for CLI from the dataset.
Wrapper for tabulate library.
"""
from tabulate import tabulate as Tabulate
class CLIFormat:
"""Class responsible for exporting to the CLI format."""
title = 'cli'
DEFAULT_FMT = 'plain'
@classmethod
def export_set(cls, dataset, **kwargs):
"""Returns CLI representation of a Dataset."""
if dataset.headers:
kwargs.setdefault('headers', dataset.headers)
kwargs.setdefault('tablefmt', cls.DEFAULT_FMT)
return Tabulate(dataset, **kwargs)
+60
@@ -0,0 +1,60 @@
""" Tablib - *SV Support.
"""
import csv
from io import StringIO
class CSVFormat:
title = 'csv'
extensions = ('csv',)
DEFAULT_DELIMITER = ','
@classmethod
def export_stream_set(cls, dataset, **kwargs):
"""Returns CSV representation of Dataset as file-like."""
stream = StringIO()
kwargs.setdefault('delimiter', cls.DEFAULT_DELIMITER)
_csv = csv.writer(stream, **kwargs)
for row in dataset._package(dicts=False):
_csv.writerow(row)
stream.seek(0)
return stream
@classmethod
def export_set(cls, dataset, **kwargs):
"""Returns CSV representation of Dataset."""
stream = cls.export_stream_set(dataset, **kwargs)
return stream.getvalue()
@classmethod
def import_set(cls, dset, in_stream, headers=True, **kwargs):
"""Returns dataset from CSV stream."""
dset.wipe()
kwargs.setdefault('delimiter', cls.DEFAULT_DELIMITER)
rows = csv.reader(in_stream, **kwargs)
for i, row in enumerate(rows):
if (i == 0) and (headers):
dset.headers = row
elif row:
if i > 0 and len(row) < dset.width:
row += [''] * (dset.width - len(row))
dset.append(row)
@classmethod
def detect(cls, stream, delimiter=None):
"""Returns True if given stream is valid CSV."""
try:
csv.Sniffer().sniff(stream.read(1024), delimiters=delimiter or cls.DEFAULT_DELIMITER)
return True
except Exception:
return False
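`CSVFormat.import_set` above treats the first row as headers, skips empty rows, and pads short ("ragged") rows with empty strings up to the header width. A standalone sketch of that behaviour, using only the `csv` module, could look as follows. `read_padded` is a hypothetical helper, not a tablib function, and it always treats the first row as the header.

```python
import csv
from io import StringIO


def read_padded(text, delimiter=','):
    """Parse CSV text into (headers, rows), padding short rows with ''."""
    headers, rows = [], []
    for i, row in enumerate(csv.reader(StringIO(text), delimiter=delimiter)):
        if i == 0:
            headers = row
        elif row:  # csv.reader yields [] for blank lines; skip those
            if len(row) < len(headers):
                row += [''] * (len(headers) - len(row))
            rows.append(row)
    return headers, rows


headers, rows = read_padded('a,b,c\n1,2\n\n4,5,6\n')
print(headers, rows)
```

Rows that are longer than the header are left untouched here, so a caller that requires a strictly rectangular table would still need its own validation.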
+66
@@ -0,0 +1,66 @@
""" Tablib - DBF Support.
"""
import io
import os
import tempfile
from tablib.packages.dbfpy import dbf, dbfnew
from tablib.packages.dbfpy import record as dbfrecord
class DBFFormat:
title = 'dbf'
extensions = ('dbf',)
DEFAULT_ENCODING = 'utf-8'
@classmethod
def export_set(cls, dataset):
"""Returns DBF representation of a Dataset"""
new_dbf = dbfnew.dbf_new()
temp_file, temp_uri = tempfile.mkstemp()
# create the appropriate fields based on the contents of the first row
first_row = dataset[0]
for fieldname, field_value in zip(dataset.headers, first_row):
if type(field_value) in [int, float]:
new_dbf.add_field(fieldname, 'N', 10, 8)
else:
new_dbf.add_field(fieldname, 'C', 80)
new_dbf.write(temp_uri)
dbf_file = dbf.Dbf(temp_uri, readOnly=0)
for row in dataset:
record = dbfrecord.DbfRecord(dbf_file)
for fieldname, field_value in zip(dataset.headers, row):
record[fieldname] = field_value
record.store()
dbf_file.close()
dbf_stream = open(temp_uri, 'rb')
stream = io.BytesIO(dbf_stream.read())
dbf_stream.close()
os.close(temp_file)
os.remove(temp_uri)
return stream.getvalue()
@classmethod
def import_set(cls, dset, in_stream, headers=True):
"""Returns a dataset from a DBF stream."""
dset.wipe()
_dbf = dbf.Dbf(in_stream)
dset.headers = _dbf.fieldNames
for record in range(_dbf.recordCount):
row = [_dbf[record][f] for f in _dbf.fieldNames]
dset.append(row)
@classmethod
def detect(cls, stream):
"""Returns True if the given stream is valid DBF"""
try:
_dbf = dbf.Dbf(stream, readOnly=True)
return True
except Exception:
return False
+41
@@ -0,0 +1,41 @@
""" Tablib - DataFrame Support.
"""
try:
from pandas import DataFrame
except ImportError:
DataFrame = None
class DataFrameFormat:
title = 'df'
extensions = ('df',)
@classmethod
def detect(cls, stream):
"""Returns True if given stream is a DataFrame."""
if DataFrame is None:
return False
elif isinstance(stream, DataFrame):
return True
try:
DataFrame(stream.read())
return True
except ValueError:
return False
@classmethod
def export_set(cls, dset, index=None):
"""Returns DataFrame representation of Dataset."""
if DataFrame is None:
raise NotImplementedError(
'DataFrame Format requires `pandas` to be installed.'
' Try `pip install "tablib[pandas]"`.')
dataframe = DataFrame(dset.dict, columns=dset.headers)
return dataframe
@classmethod
def import_set(cls, dset, in_stream):
"""Returns dataset from DataFrame."""
dset.wipe()
dset.dict = in_stream.to_dict(orient='records')
+62
@@ -0,0 +1,62 @@
""" Tablib - HTML export support.
"""
import codecs
from io import BytesIO
from MarkupPy import markup
class HTMLFormat:
BOOK_ENDINGS = 'h3'
title = 'html'
extensions = ('html', )
@classmethod
def export_set(cls, dataset):
"""HTML representation of a Dataset."""
stream = BytesIO()
page = markup.page()
page.table.open()
if dataset.headers is not None:
new_header = [item if item is not None else '' for item in dataset.headers]
page.thead.open()
headers = markup.oneliner.th(new_header)
page.tr(headers)
page.thead.close()
for row in dataset:
new_row = [item if item is not None else '' for item in row]
html_row = markup.oneliner.td(new_row)
page.tr(html_row)
page.table.close()
# Allow unicode characters in output
wrapper = codecs.getwriter("utf8")(stream)
wrapper.writelines(str(page))
return stream.getvalue().decode('utf-8')
@classmethod
def export_book(cls, databook):
"""HTML representation of a Databook."""
stream = BytesIO()
# Allow unicode characters in output
wrapper = codecs.getwriter("utf8")(stream)
for i, dset in enumerate(databook._datasets):
title = (dset.title if dset.title else 'Set %s' % (i))
wrapper.write(f'<{cls.BOOK_ENDINGS}>{title}</{cls.BOOK_ENDINGS}>\n')
wrapper.write(dset.html)
wrapper.write('\n')
return stream.getvalue().decode('utf-8')
+40
@@ -0,0 +1,40 @@
"""Tablib - Jira table export support.
Generates a Jira table from the dataset.
"""
class JIRAFormat:
title = 'jira'
@classmethod
def export_set(cls, dataset):
"""Formats the dataset according to the Jira table syntax:
||heading 1||heading 2||heading 3||
|col A1|col A2|col A3|
|col B1|col B2|col B3|
:param dataset: dataset to serialize
:type dataset: tablib.core.Dataset
"""
header = cls._get_header(dataset.headers) if dataset.headers else ''
body = cls._get_body(dataset)
return f'{header}\n{body}' if header else body
@classmethod
def _get_body(cls, dataset):
return '\n'.join([cls._serialize_row(row) for row in dataset])
@classmethod
def _get_header(cls, headers):
return cls._serialize_row(headers, delimiter='||')
@classmethod
def _serialize_row(cls, row, delimiter='|'):
return '{}{}{}'.format(
delimiter,
delimiter.join([str(item) if item else ' ' for item in row]),
delimiter
)
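The Jira serialization above wraps each row in a delimiter, using `||` for the header row and `|` for data rows, and renders falsy cells as a single space so that the cell still exists in the table. A function-level sketch of the same logic follows. `serialize_row` and `to_jira` are hypothetical free functions written for this example, not the class methods themselves.

```python
def serialize_row(row, delimiter='|'):
    """Join cells with the delimiter and wrap the row in it."""
    # Falsy cells (None, '', 0) are rendered as a single space,
    # mirroring the `if item` check in the code above.
    cells = [str(item) if item else ' ' for item in row]
    return delimiter + delimiter.join(cells) + delimiter


def to_jira(headers, rows):
    """Render headers and rows using Jira table syntax."""
    body = '\n'.join(serialize_row(row) for row in rows)
    if headers:
        return serialize_row(headers, delimiter='||') + '\n' + body
    return body


print(to_jira(['name', 'qty'], [['egg', 2], ['spam', None]]))
```

One consequence of the truthiness check is that a numeric `0` is also rendered as a blank cell. Callers who need zeros preserved may prefer an explicit `item is not None` test.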
+58
@@ -0,0 +1,58 @@
""" Tablib - JSON Support
"""
import decimal
import json
from uuid import UUID
import tablib
def serialize_objects_handler(obj):
if isinstance(obj, (decimal.Decimal, UUID)):
return str(obj)
elif hasattr(obj, 'isoformat'):
return obj.isoformat()
else:
return obj
class JSONFormat:
title = 'json'
extensions = ('json', 'jsn')
@classmethod
def export_set(cls, dataset):
"""Returns JSON representation of Dataset."""
return json.dumps(dataset.dict, default=serialize_objects_handler)
@classmethod
def export_book(cls, databook):
"""Returns JSON representation of Databook."""
return json.dumps(databook._package(), default=serialize_objects_handler)
@classmethod
def import_set(cls, dset, in_stream):
"""Returns dataset from JSON stream."""
dset.wipe()
dset.dict = json.load(in_stream)
@classmethod
def import_book(cls, dbook, in_stream):
"""Returns databook from JSON stream."""
dbook.wipe()
for sheet in json.load(in_stream):
data = tablib.Dataset()
data.title = sheet['title']
data.dict = sheet['data']
dbook.add_sheet(data)
@classmethod
def detect(cls, stream):
"""Returns True if given stream is valid JSON."""
try:
json.load(stream)
return True
except (TypeError, ValueError):
return False
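`serialize_objects_handler` above is passed to `json.dumps` as the `default` hook, which the encoder calls only for objects it cannot serialize natively. The sketch below shows the hook in isolation. `to_jsonable` is a hypothetical name, and unlike the handler above (which returns unknown objects unchanged) this version raises `TypeError` for unsupported types; that is an assumption made here for clearer error reporting.

```python
import datetime
import decimal
import json
from uuid import UUID


def to_jsonable(obj):
    """Fallback used by json.dumps for otherwise unserializable objects."""
    if isinstance(obj, (decimal.Decimal, UUID)):
        return str(obj)
    if hasattr(obj, 'isoformat'):  # date, time and datetime objects
        return obj.isoformat()
    # Assumption of this sketch: fail loudly on anything else.
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


record = {
    'price': decimal.Decimal('1.50'),
    'day': datetime.date(2020, 5, 16),
}
print(json.dumps(record, default=to_jsonable))
```

Converting `Decimal` to a string rather than a float avoids silently losing precision, at the cost of the consumer having to parse the value back.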
+132
@@ -0,0 +1,132 @@
"""Tablib - LaTeX table export support.
Generates a LaTeX booktabs-style table from the dataset.
"""
import re
class LATEXFormat:
title = 'latex'
extensions = ('tex',)
TABLE_TEMPLATE = """\
%% Note: add \\usepackage{booktabs} to your preamble
%%
\\begin{table}[!htbp]
\\centering
%(CAPTION)s
\\begin{tabular}{%(COLSPEC)s}
\\toprule
%(HEADER)s
%(MIDRULE)s
%(BODY)s
\\bottomrule
\\end{tabular}
\\end{table}
"""
TEX_RESERVED_SYMBOLS_MAP = dict([
('\\', '\\textbackslash{}'),
('{', '\\{'),
('}', '\\}'),
('$', '\\$'),
('&', '\\&'),
('#', '\\#'),
('^', '\\textasciicircum{}'),
('_', '\\_'),
('~', '\\textasciitilde{}'),
('%', '\\%'),
])
TEX_RESERVED_SYMBOLS_RE = re.compile(
'(%s)' % '|'.join(map(re.escape, TEX_RESERVED_SYMBOLS_MAP.keys())))
@classmethod
def export_set(cls, dataset):
"""Returns LaTeX representation of dataset
:param dataset: dataset to serialize
:type dataset: tablib.core.Dataset
"""
caption = '\\caption{%s}' % dataset.title if dataset.title else '%'
colspec = cls._colspec(dataset.width)
header = cls._serialize_row(dataset.headers) if dataset.headers else ''
midrule = cls._midrule(dataset.width)
body = '\n'.join([cls._serialize_row(row) for row in dataset])
return cls.TABLE_TEMPLATE % dict(CAPTION=caption, COLSPEC=colspec,
HEADER=header, MIDRULE=midrule, BODY=body)
@classmethod
def _colspec(cls, dataset_width):
"""Generates the column specification for the LaTeX `tabular` environment
based on the dataset width.
The first column is justified to the left, all further columns are aligned
to the right.
.. note:: This is only a heuristic and most probably has to be fine-tuned
post export. Column alignment should depend on the data type, e.g., textual
content should usually be aligned to the left while numeric content almost
always should be aligned to the right.
:param dataset_width: width of the dataset
"""
spec = 'l'
for _ in range(1, dataset_width):
spec += 'r'
return spec
@classmethod
def _midrule(cls, dataset_width):
"""Generates the table `midrule`, which may be composed of several
`cmidrules`.
:param dataset_width: width of the dataset to serialize
"""
if not dataset_width or dataset_width == 1:
return '\\midrule'
return ' '.join([cls._cmidrule(colindex, dataset_width) for colindex in
range(1, dataset_width + 1)])
@classmethod
def _cmidrule(cls, colindex, dataset_width):
"""Generates the `cmidrule` for a single column with appropriate trimming
based on the column position.
:param colindex: Column index
:param dataset_width: width of the dataset
"""
rule = '\\cmidrule(%s){%d-%d}'
if colindex == 1:
# Rule of first column is trimmed on the right
return rule % ('r', colindex, colindex)
if colindex == dataset_width:
# Rule of last column is trimmed on the left
return rule % ('l', colindex, colindex)
# Inner columns are trimmed on the left and right
return rule % ('lr', colindex, colindex)
@classmethod
def _serialize_row(cls, row):
"""Returns string representation of a single row.
:param row: single dataset row
"""
new_row = [cls._escape_tex_reserved_symbols(str(item)) if item else ''
for item in row]
return 6 * ' ' + ' & '.join(new_row) + ' \\\\'
@classmethod
def _escape_tex_reserved_symbols(cls, input):
"""Escapes all TeX reserved symbols ('_', '~', etc.) in a string.
:param input: String to escape
"""
def replace(match):
return cls.TEX_RESERVED_SYMBOLS_MAP[match.group()]
return cls.TEX_RESERVED_SYMBOLS_RE.sub(replace, input)
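The escaping step above builds one alternation regex from the keys of the symbol map and substitutes every match in a single pass, so the braces introduced by a replacement such as `\textbackslash{}` are not escaped a second time. A standalone sketch of that technique follows. The names `TEX_MAP`, `TEX_RE` and `escape_tex` are shortened, hypothetical versions of the class attributes and method above.

```python
import re

TEX_MAP = {
    '\\': r'\textbackslash{}',
    '{': r'\{',
    '}': r'\}',
    '$': r'\$',
    '&': r'\&',
    '#': r'\#',
    '^': r'\textasciicircum{}',
    '_': r'\_',
    '~': r'\textasciitilde{}',
    '%': r'\%',
}

# One pattern matching any reserved symbol; re.escape guards regex metacharacters.
TEX_RE = re.compile('|'.join(map(re.escape, TEX_MAP)))


def escape_tex(text):
    """Escape all TeX reserved symbols in a single regex pass."""
    return TEX_RE.sub(lambda match: TEX_MAP[match.group()], text)


print(escape_tex('50% of $x_1 & more'))
```

A chain of `str.replace` calls would have to order the replacements carefully to avoid double escaping; the single-pass substitution sidesteps that problem.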
+105
@@ -0,0 +1,105 @@
""" Tablib - ODF Support.
"""
from io import BytesIO
from odf import opendocument, style, table, text
bold = style.Style(name="bold", family="paragraph")
bold.addElement(style.TextProperties(fontweight="bold", fontweightasian="bold", fontweightcomplex="bold"))
class ODSFormat:
title = 'ods'
extensions = ('ods',)
@classmethod
def export_set(cls, dataset):
"""Returns ODF representation of Dataset."""
wb = opendocument.OpenDocumentSpreadsheet()
wb.automaticstyles.addElement(bold)
ws = table.Table(name=dataset.title if dataset.title else 'Tablib Dataset')
wb.spreadsheet.addElement(ws)
cls.dset_sheet(dataset, ws)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def export_book(cls, databook):
"""Returns ODF representation of DataBook."""
wb = opendocument.OpenDocumentSpreadsheet()
wb.automaticstyles.addElement(bold)
for i, dset in enumerate(databook._datasets):
ws = table.Table(name=dset.title if dset.title else 'Sheet%s' % (i))
wb.spreadsheet.addElement(ws)
cls.dset_sheet(dset, ws)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def dset_sheet(cls, dataset, ws):
"""Completes given worksheet from given Dataset."""
_package = dataset._package(dicts=False)
for i, sep in enumerate(dataset._separators):
_offset = i
_package.insert((sep[0] + _offset), (sep[1],))
for i, row in enumerate(_package):
row_number = i + 1
odf_row = table.TableRow(stylename=bold, defaultcellstylename='bold')
for j, col in enumerate(row):
try:
col = str(col, errors='ignore')
except TypeError:
# col is already str
pass
ws.addElement(table.TableColumn())
# bold headers
if (row_number == 1) and dataset.headers:
odf_row.setAttribute('stylename', bold)
ws.addElement(odf_row)
cell = table.TableCell()
p = text.P()
p.addElement(text.Span(text=col, stylename=bold))
cell.addElement(p)
odf_row.addElement(cell)
# wrap the rest
else:
try:
if '\n' in col:
ws.addElement(odf_row)
cell = table.TableCell()
cell.addElement(text.P(text=col))
odf_row.addElement(cell)
else:
ws.addElement(odf_row)
cell = table.TableCell()
cell.addElement(text.P(text=col))
odf_row.addElement(cell)
except TypeError:
ws.addElement(odf_row)
cell = table.TableCell()
cell.addElement(text.P(text=col))
odf_row.addElement(cell)
@classmethod
def detect(cls, stream):
if isinstance(stream, bytes):
# load expects a file-like object.
stream = BytesIO(stream)
try:
opendocument.load(stream)
return True
except Exception:
return False
+265
@@ -0,0 +1,265 @@
""" Tablib - reStructuredText Support
"""
from itertools import zip_longest
from statistics import median
from textwrap import TextWrapper
JUSTIFY_LEFT = 'left'
JUSTIFY_CENTER = 'center'
JUSTIFY_RIGHT = 'right'
JUSTIFY_VALUES = (JUSTIFY_LEFT, JUSTIFY_CENTER, JUSTIFY_RIGHT)
def to_str(value):
if isinstance(value, bytes):
return value.decode('utf-8')
return str(value)
def _max_word_len(text):
"""
Return the length of the longest word in `text`.
>>> _max_word_len('Python Module for Tabular Datasets')
8
"""
return max([len(word) for word in text.split()], default=0) if text else 0
class ReSTFormat:
title = 'rst'
extensions = ('rst',)
MAX_TABLE_WIDTH = 80 # Roughly. It may be wider to avoid breaking words.
@classmethod
def _get_column_string_lengths(cls, dataset):
"""
Returns a list of string lengths of each column, and a list of
maximum word lengths.
"""
if dataset.headers:
column_lengths = [[len(h)] for h in dataset.headers]
word_lens = [_max_word_len(h) for h in dataset.headers]
else:
column_lengths = [[] for _ in range(dataset.width)]
word_lens = [0 for _ in range(dataset.width)]
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
for i, val in enumerate(values):
text = to_str(val)
column_lengths[i].append(len(text))
word_lens[i] = max(word_lens[i], _max_word_len(text))
return column_lengths, word_lens
@classmethod
def _row_to_lines(cls, values, widths, wrapper, sep='|', justify=JUSTIFY_LEFT):
"""
Returns a table row of wrapped values as a list of lines
"""
if justify not in JUSTIFY_VALUES:
raise ValueError('Value of "justify" must be one of "{}"'.format(
'", "'.join(JUSTIFY_VALUES)
))
if justify == JUSTIFY_LEFT:
just = lambda text, width: text.ljust(width)
elif justify == JUSTIFY_CENTER:
just = lambda text, width: text.center(width)
else:
just = lambda text, width: text.rjust(width)
lpad = sep + ' ' if sep else ''
rpad = ' ' + sep if sep else ''
pad = ' ' + sep + ' '
cells = []
for value, width in zip(values, widths):
wrapper.width = width
text = to_str(value)
cell = wrapper.wrap(text)
cells.append(cell)
lines = zip_longest(*cells, fillvalue='')
lines = (
(just(cell_line, widths[i]) for i, cell_line in enumerate(line))
for line in lines
)
lines = [''.join((lpad, pad.join(line), rpad)) for line in lines]
return lines
@classmethod
def _get_column_widths(cls, dataset, max_table_width=MAX_TABLE_WIDTH, pad_len=3):
"""
Returns a list of column widths proportional to the median length
of the text in their cells.
"""
str_lens, word_lens = cls._get_column_string_lengths(dataset)
median_lens = [int(median(lens)) for lens in str_lens]
total = sum(median_lens)
if total > max_table_width - (pad_len * len(median_lens)):
column_widths = (max_table_width * l // total for l in median_lens)
else:
column_widths = (l for l in median_lens)
# Allow for separator and padding:
column_widths = (w - pad_len if w > pad_len else w for w in column_widths)
# Rather widen table than break words:
column_widths = [max(w, l) for w, l in zip(column_widths, word_lens)]
return column_widths
@classmethod
def export_set_as_simple_table(cls, dataset, column_widths=None):
"""
Returns reStructuredText simple table representation of dataset.
"""
lines = []
wrapper = TextWrapper()
if column_widths is None:
column_widths = cls._get_column_widths(dataset, pad_len=2)
border = ' '.join(['=' * w for w in column_widths])
lines.append(border)
if dataset.headers:
lines.extend(cls._row_to_lines(
dataset.headers,
column_widths,
wrapper,
sep='',
justify=JUSTIFY_CENTER,
))
lines.append(border)
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
lines.extend(cls._row_to_lines(values, column_widths, wrapper, ''))
lines.append(border)
return '\n'.join(lines)
@classmethod
def export_set_as_grid_table(cls, dataset, column_widths=None):
"""
Returns reStructuredText grid table representation of dataset.
>>> from tablib import Dataset
>>> from tablib.formats import registry
>>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
>>> data = Dataset()
>>> data.headers = ['A', 'B', 'A and B']
>>> for a, b in bits:
... data.append([bool(a), bool(b), bool(a * b)])
>>> rst = registry.get_format('rst')
>>> print(rst.export_set(data, force_grid=True))
+-------+-------+-------+
| A | B | A and |
| | | B |
+=======+=======+=======+
| False | False | False |
+-------+-------+-------+
| True | False | False |
+-------+-------+-------+
| False | True | False |
+-------+-------+-------+
| True | True | True |
+-------+-------+-------+
"""
lines = []
wrapper = TextWrapper()
if column_widths is None:
column_widths = cls._get_column_widths(dataset)
header_sep = '+=' + '=+='.join(['=' * w for w in column_widths]) + '=+'
row_sep = '+-' + '-+-'.join(['-' * w for w in column_widths]) + '-+'
lines.append(row_sep)
if dataset.headers:
lines.extend(cls._row_to_lines(
dataset.headers,
column_widths,
wrapper,
justify=JUSTIFY_CENTER,
))
lines.append(header_sep)
for row in dataset.dict:
values = iter(row.values() if hasattr(row, 'values') else row)
lines.extend(cls._row_to_lines(values, column_widths, wrapper))
lines.append(row_sep)
return '\n'.join(lines)
@classmethod
def _use_simple_table(cls, head0, col0, width0):
"""
Use a simple table if the text in the first column is never wrapped
>>> from tablib.formats import registry
>>> rst = registry.get_format('rst')
>>> rst._use_simple_table('menu', ['egg', 'bacon'], 10)
True
>>> rst._use_simple_table(None, ['lobster thermidor', 'spam'], 10)
False
"""
if head0 is not None:
head0 = to_str(head0)
if len(head0) > width0:
return False
for cell in col0:
cell = to_str(cell)
if len(cell) > width0:
return False
return True
@classmethod
def export_set(cls, dataset, **kwargs):
"""
Returns reStructuredText table representation of dataset.
Returns a simple table if the text in the first column is never
wrapped, otherwise returns a grid table.
>>> from tablib import Dataset
>>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
>>> data = Dataset()
>>> data.headers = ['A', 'B', 'A and B']
>>> for a, b in bits:
... data.append([bool(a), bool(b), bool(a * b)])
>>> table = data.rst
>>> table.split('\\n') == [
... '===== ===== =====',
... ' A B A and',
... ' B ',
... '===== ===== =====',
... 'False False False',
... 'True False False',
... 'False True False',
... 'True True True ',
... '===== ===== =====',
... ]
True
"""
if not dataset.dict:
return ''
force_grid = kwargs.get('force_grid', False)
max_table_width = kwargs.get('max_table_width', cls.MAX_TABLE_WIDTH)
column_widths = cls._get_column_widths(dataset, max_table_width)
use_simple_table = cls._use_simple_table(
dataset.headers[0] if dataset.headers else None,
dataset.get_col(0),
column_widths[0],
)
if use_simple_table and not force_grid:
return cls.export_set_as_simple_table(dataset, column_widths)
else:
return cls.export_set_as_grid_table(dataset, column_widths)
@classmethod
def export_book(cls, databook):
"""
reStructuredText representation of a Databook.
Tables are separated by a blank line. All tables use the grid
format.
"""
return '\n\n'.join(cls.export_set(dataset, force_grid=True)
for dataset in databook._datasets)
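The row-wrapping step used by both table exporters can be sketched with the standard library alone. This is a minimal standalone illustration, assuming left justification only; `row_to_lines` here is a hypothetical free function modelled on `_row_to_lines`, not the tablib method itself.

```python
from itertools import zip_longest
from textwrap import TextWrapper


def row_to_lines(values, widths, sep="|"):
    """Wrap each cell to its column width and return the row as text lines."""
    wrapper = TextWrapper()
    cells = []
    for value, width in zip(values, widths):
        wrapper.width = width
        # Each cell becomes a list of wrapped lines (possibly empty).
        cells.append(wrapper.wrap(str(value)))
    lpad = sep + " " if sep else ""
    rpad = " " + sep if sep else ""
    pad = " " + sep + " "
    lines = []
    # Cells with fewer wrapped lines are filled with empty strings.
    for parts in zip_longest(*cells, fillvalue=""):
        padded = [part.ljust(widths[i]) for i, part in enumerate(parts)]
        lines.append(lpad + pad.join(padded) + rpad)
    return lines


for line in row_to_lines(["A and B", True], [5, 5]):
    print(line)
```

A cell that wraps onto two lines makes the whole row two lines tall; the shorter cells are padded with blanks so the separators stay aligned.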
+11
@@ -0,0 +1,11 @@
""" Tablib - TSV (Tab Separated Values) Support.
"""
from ._csv import CSVFormat
class TSVFormat(CSVFormat):
title = 'tsv'
extensions = ('tsv',)
DEFAULT_DELIMITER = '\t'
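Since `TSVFormat` only overrides `DEFAULT_DELIMITER`, its effect can be illustrated with the standard `csv` module. A minimal sketch; `rows_to_tsv` is a hypothetical helper and not part of tablib.

```python
import csv
from io import StringIO


def rows_to_tsv(rows):
    """Serialize an iterable of rows as tab-separated text."""
    stream = StringIO()
    # Same writer as for CSV, only the delimiter differs.
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return stream.getvalue()


print(rows_to_tsv([["name", "age"], ["Ann", 31]]))
```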
+147
@@ -0,0 +1,147 @@
""" Tablib - XLS Support.
"""
from io import BytesIO
import xlrd
import xlwt
from xlrd.xldate import xldate_as_datetime
import tablib
# special styles
wrap = xlwt.easyxf("alignment: wrap on")
bold = xlwt.easyxf("font: bold on")
class XLSFormat:
title = 'xls'
extensions = ('xls',)
@classmethod
def detect(cls, stream):
"""Returns True if given stream is a readable excel file."""
try:
xlrd.open_workbook(file_contents=stream)
return True
except Exception:
pass
try:
xlrd.open_workbook(file_contents=stream.read())
return True
except Exception:
pass
try:
xlrd.open_workbook(filename=stream)
return True
except Exception:
return False
@classmethod
def export_set(cls, dataset):
"""Returns XLS representation of Dataset."""
wb = xlwt.Workbook(encoding='utf8')
ws = wb.add_sheet(dataset.title if dataset.title else 'Tablib Dataset')
cls.dset_sheet(dataset, ws)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def export_book(cls, databook):
"""Returns XLS representation of DataBook."""
wb = xlwt.Workbook(encoding='utf8')
for i, dset in enumerate(databook._datasets):
ws = wb.add_sheet(dset.title if dset.title else 'Sheet%s' % (i))
cls.dset_sheet(dset, ws)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def import_set(cls, dset, in_stream, headers=True):
"""Returns dataset from XLS stream."""
dset.wipe()
xls_book = xlrd.open_workbook(file_contents=in_stream.read())
sheet = xls_book.sheet_by_index(0)
dset.title = sheet.name
def cell_value(value, type_):
if type_ == xlrd.XL_CELL_ERROR:
return xlrd.error_text_from_code[value]
elif type_ == xlrd.XL_CELL_DATE:
return xldate_as_datetime(value, xls_book.datemode)
return value
for i in range(sheet.nrows):
if i == 0 and headers:
dset.headers = sheet.row_values(0)
else:
dset.append([
cell_value(val, typ)
for val, typ in zip(sheet.row_values(i), sheet.row_types(i))
])
@classmethod
def import_book(cls, dbook, in_stream, headers=True):
"""Returns databook from XLS stream."""
dbook.wipe()
xls_book = xlrd.open_workbook(file_contents=in_stream)
for sheet in xls_book.sheets():
data = tablib.Dataset()
data.title = sheet.name
for i in range(sheet.nrows):
if i == 0 and headers:
data.headers = sheet.row_values(0)
else:
data.append(sheet.row_values(i))
dbook.add_sheet(data)
@classmethod
def dset_sheet(cls, dataset, ws):
"""Completes given worksheet from given Dataset."""
_package = dataset._package(dicts=False)
for i, sep in enumerate(dataset._separators):
_offset = i
_package.insert((sep[0] + _offset), (sep[1],))
for i, row in enumerate(_package):
for j, col in enumerate(row):
# bold headers
if (i == 0) and dataset.headers:
ws.write(i, j, col, bold)
# frozen header row
ws.panes_frozen = True
ws.horz_split_pos = 1
# bold separators
elif len(row) < dataset.width:
ws.write(i, j, col, bold)
# wrap the rest
else:
try:
if '\n' in col:
ws.write(i, j, col, wrap)
else:
ws.write(i, j, col)
except TypeError:
ws.write(i, j, col)
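The separator handling at the top of `dset_sheet` shifts each insertion index by the number of separators already inserted. A minimal standalone sketch of that idea, assuming separators are `(row_index, text)` pairs sorted by index; `insert_separators` is a hypothetical name used only for illustration.

```python
def insert_separators(rows, separators):
    """Return a copy of rows with one-cell separator rows inserted."""
    rows = list(rows)
    for offset, (index, text) in enumerate(separators):
        # Every earlier insert pushed later rows down by one,
        # so the target index grows by the number of inserts so far.
        rows.insert(index + offset, (text,))
    return rows


data = [("a", 1), ("b", 2), ("c", 3)]
print(insert_separators(data, [(1, "Group 1"), (2, "Group 2")]))
```

Separator rows are shorter than the dataset width, which is how the loop below the insert step recognises them and writes them in bold.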
+143
@@ -0,0 +1,143 @@
""" Tablib - XLSX Support.
"""
from io import BytesIO
from openpyxl.reader.excel import ExcelReader, load_workbook
from openpyxl.styles import Alignment, Font
from openpyxl.utils import get_column_letter
from openpyxl.workbook import Workbook
from openpyxl.writer.excel import ExcelWriter
import tablib
class XLSXFormat:
title = 'xlsx'
extensions = ('xlsx',)
@classmethod
def detect(cls, stream):
"""Returns True if given stream is a readable excel file."""
try:
# No need to fully load the file, it should be enough to be able to
# read the manifest.
reader = ExcelReader(stream, read_only=False)
reader.read_manifest()
return True
except Exception:
return False
@classmethod
def export_set(cls, dataset, freeze_panes=True):
"""Returns XLSX representation of Dataset."""
wb = Workbook()
ws = wb.worksheets[0]
ws.title = dataset.title if dataset.title else 'Tablib Dataset'
cls.dset_sheet(dataset, ws, freeze_panes=freeze_panes)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def export_book(cls, databook, freeze_panes=True):
"""Returns XLSX representation of DataBook."""
wb = Workbook()
for sheet in wb.worksheets:
wb.remove(sheet)
for i, dset in enumerate(databook._datasets):
ws = wb.create_sheet()
ws.title = dset.title if dset.title else 'Sheet%s' % (i)
cls.dset_sheet(dset, ws, freeze_panes=freeze_panes)
stream = BytesIO()
wb.save(stream)
return stream.getvalue()
@classmethod
def import_set(cls, dset, in_stream, headers=True, read_only=True):
"""Returns dataset from XLSX stream."""
dset.wipe()
xls_book = load_workbook(in_stream, read_only=read_only, data_only=True)
sheet = xls_book.active
dset.title = sheet.title
for i, row in enumerate(sheet.rows):
row_vals = [c.value for c in row]
if (i == 0) and (headers):
dset.headers = row_vals
else:
dset.append(row_vals)
@classmethod
def import_book(cls, dbook, in_stream, headers=True, read_only=True):
"""Returns databook from XLSX stream."""
dbook.wipe()
xls_book = load_workbook(in_stream, read_only=read_only, data_only=True)
for sheet in xls_book.worksheets:
data = tablib.Dataset()
data.title = sheet.title
for i, row in enumerate(sheet.rows):
row_vals = [c.value for c in row]
if (i == 0) and (headers):
data.headers = row_vals
else:
if i > 0 and len(row_vals) < data.width:
row_vals += [''] * (data.width - len(row_vals))
data.append(row_vals)
dbook.add_sheet(data)
@classmethod
def dset_sheet(cls, dataset, ws, freeze_panes=True):
"""Completes given worksheet from given Dataset."""
_package = dataset._package(dicts=False)
for i, sep in enumerate(dataset._separators):
_offset = i
_package.insert((sep[0] + _offset), (sep[1],))
bold = Font(bold=True)
wrap_text = Alignment(wrap_text=True)
for i, row in enumerate(_package):
row_number = i + 1
for j, col in enumerate(row):
col_idx = get_column_letter(j + 1)
cell = ws[f'{col_idx}{row_number}']
# bold headers
if (row_number == 1) and dataset.headers:
cell.font = bold
if freeze_panes:
# Export Freeze only after first Line
ws.freeze_panes = 'A2'
# bold separators
elif len(row) < dataset.width:
cell.font = bold
# wrap the rest
else:
try:
str_col_value = str(col)
except TypeError:
str_col_value = ''
if '\n' in str_col_value:
cell.alignment = wrap_text
try:
cell.value = col
except (ValueError, TypeError):
cell.value = str(col)
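The "ragged" row handling in `import_book` pads short rows with empty strings up to the dataset width. A minimal standalone sketch of that step; `pad_row` is a hypothetical helper, and the fill value is taken from the code above.

```python
def pad_row(row_vals, width, fill=""):
    """Pad a short row with fill values so it matches the table width."""
    if len(row_vals) < width:
        return row_vals + [fill] * (width - len(row_vals))
    return row_vals


print(pad_row(["a", 1], 4))
```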
+54
@@ -0,0 +1,54 @@
""" Tablib - YAML Support.
"""
import yaml
import tablib
class YAMLFormat:
title = 'yaml'
extensions = ('yaml', 'yml')
@classmethod
def export_set(cls, dataset):
"""Returns YAML representation of Dataset."""
return yaml.safe_dump(dataset._package(ordered=False), default_flow_style=None)
@classmethod
def export_book(cls, databook):
"""Returns YAML representation of Databook."""
return yaml.safe_dump(databook._package(ordered=False), default_flow_style=None)
@classmethod
def import_set(cls, dset, in_stream):
"""Returns dataset from YAML stream."""
dset.wipe()
dset.dict = yaml.safe_load(in_stream)
@classmethod
def import_book(cls, dbook, in_stream):
"""Returns databook from YAML stream."""
dbook.wipe()
for sheet in yaml.safe_load(in_stream):
data = tablib.Dataset()
data.title = sheet['title']
data.dict = sheet['data']
dbook.add_sheet(data)
@classmethod
def detect(cls, stream):
"""Returns True if given stream is valid YAML."""
try:
_yaml = yaml.safe_load(stream)
if isinstance(_yaml, (list, tuple, dict)):
return True
else:
return False
except (yaml.parser.ParserError, yaml.reader.ReaderError,
yaml.scanner.ScannerError):
return False
+297
@@ -0,0 +1,297 @@
#! /usr/bin/env python
"""DBF accessing helpers.
FIXME: more documentation needed
Examples:
Create new table, setup structure, add records:
dbf = Dbf(filename, new=True)
dbf.addField(
("NAME", "C", 15),
("SURNAME", "C", 25),
("INITIALS", "C", 10),
("BIRTHDATE", "D"),
)
for (n, s, i, b) in (
("John", "Miller", "YC", (1980, 10, 11)),
("Andy", "Larkin", "", (1980, 4, 11)),
):
rec = dbf.newRecord()
rec["NAME"] = n
rec["SURNAME"] = s
rec["INITIALS"] = i
rec["BIRTHDATE"] = b
rec.store()
dbf.close()
Open existing dbf, read some data:
dbf = Dbf(filename, True)
for rec in dbf:
for fldName in dbf.fieldNames:
print('%s:\t %s (%s)' % (fldName, rec[fldName],
type(rec[fldName])))
print()
dbf.close()
"""
"""History (most recent first):
11-feb-2007 [als] export INVALID_VALUE;
Dbf: added .ignoreErrors, .INVALID_VALUE
04-jul-2006 [als] added export declaration
20-dec-2005 [yc] removed fromStream and newDbf methods:
arguments of the __init__ call must be used instead;
added class fields pointing to the header and
record classes.
17-dec-2005 [yc] split to several modules; reimplemented
13-dec-2005 [yc] adapted to the changes of the `strutil` module.
13-sep-2002 [als] support FoxPro Timestamp datatype
15-nov-1999 [jjk] documentation updates, add demo
24-aug-1998 [jjk] add some encodeValue methods (not tested), other tweaks
08-jun-1998 [jjk] fix problems, add more features
20-feb-1998 [jjk] fix problems, add more features
19-feb-1998 [jjk] add create/write capabilities
18-feb-1998 [jjk] from dbfload.py
"""
__version__ = "$Revision: 1.7 $"[11:-2]
__date__ = "$Date: 2007/02/11 09:23:13 $"[7:-2]
__author__ = "Jeff Kunce <kuncej@mail.conservation.state.mo.us>"
__all__ = ["Dbf"]
from . import header, record
from .utils import INVALID_VALUE
class Dbf:
"""DBF accessor.
FIXME:
docs and examples needed (don't forget to tell
about problems adding new fields on the fly)
Implementation notes:
``_new`` field is used to indicate whether this is
a new data table. `addField` can be used only for
new tables! Once at least one record has been appended
to the table, its structure can't be changed.
"""
__slots__ = ("name", "header", "stream",
"_changed", "_new", "_ignore_errors")
HeaderClass = header.DbfHeader
RecordClass = record.DbfRecord
INVALID_VALUE = INVALID_VALUE
# initialization and creation helpers
def __init__(self, f, readOnly=False, new=False, ignoreErrors=False):
"""Initialize instance.
Arguments:
f:
Filename or file-like object.
new:
True if new data table must be created. Assume
data table exists if this argument is False.
readOnly:
if ``f`` argument is a string, the file will
be opened in read-only mode; in other cases
this argument is ignored. This argument is also
ignored if ``new`` argument is True.
headerObj:
`header.DbfHeader` instance or None. If this argument
is None, new empty header will be used with the
all fields set by default.
ignoreErrors:
if set, failing field value conversion will return
``INVALID_VALUE`` instead of raising conversion error.
"""
if isinstance(f, str):
# a filename
self.name = f
if new:
# new table (table file must be
# created or opened and truncated)
self.stream = open(f, "w+b")
else:
# table file must exist
self.stream = open(f, ("r+b", "rb")[bool(readOnly)])
else:
# a stream
self.name = getattr(f, "name", "")
self.stream = f
if new:
# if this is a new table, header will be empty
self.header = self.HeaderClass()
else:
# or instantiated using stream
self.header = self.HeaderClass.fromStream(self.stream)
self.ignoreErrors = ignoreErrors
self._new = bool(new)
self._changed = False
# properties
closed = property(lambda self: self.stream.closed)
recordCount = property(lambda self: self.header.recordCount)
fieldNames = property(
lambda self: [_fld.name for _fld in self.header.fields])
fieldDefs = property(lambda self: self.header.fields)
changed = property(lambda self: self._changed or self.header.changed)
def ignoreErrors(self, value):
"""Update `ignoreErrors` flag on the header object and self"""
self.header.ignoreErrors = self._ignore_errors = bool(value)
ignoreErrors = property(
lambda self: self._ignore_errors,
ignoreErrors,
doc="""Error processing mode for DBF field value conversion
if set, failing field value conversion will return
``INVALID_VALUE`` instead of raising conversion error.
""")
# protected methods
def _fixIndex(self, index):
"""Return fixed index.
This method fails if index isn't a numeric object
(int), or if index isn't in a valid range
(less than the number of records in the db).
If ``index`` is a negative number, it will be
treated like a negative index for list objects.
Return:
Return value is a numeric object meaning a valid index.
"""
if not isinstance(index, int):
raise TypeError("Index must be a numeric object")
if index < 0:
# index from the right side
# fix it to the left-side index
index += len(self)
if index >= len(self):
raise IndexError("Record index out of range")
return index
# interface methods
def close(self):
self.flush()
self.stream.close()
def flush(self):
"""Flush data to the associated stream."""
if self.changed:
self.header.setCurrentDate()
self.header.write(self.stream)
self.stream.flush()
self._changed = False
def indexOfFieldName(self, name):
"""Index of field named ``name``."""
# FIXME: move this to header class
names = [f.name for f in self.header.fields]
return names.index(name.upper())
def newRecord(self):
"""Return new record, which belongs to this table."""
return self.RecordClass(self)
def append(self, record):
"""Append ``record`` to the database."""
record.index = self.header.recordCount
record._write()
self.header.recordCount += 1
self._changed = True
self._new = False
def addField(self, *defs):
"""Add field definitions.
For more information see `header.DbfHeader.addField`.
"""
if self._new:
self.header.addField(*defs)
else:
raise TypeError("At least one record was added, "
"structure can't be changed")
# 'magic' methods (representation and sequence interface)
def __repr__(self):
return "Dbf stream '%s'\n" % self.stream + repr(self.header)
def __len__(self):
"""Return number of records."""
return self.recordCount
def __getitem__(self, index):
"""Return `DbfRecord` instance."""
return self.RecordClass.fromStream(self, self._fixIndex(index))
def __setitem__(self, index, record):
"""Write `DbfRecord` instance to the stream."""
record.index = self._fixIndex(index)
record._write()
self._changed = True
self._new = False
# def __del__(self):
# """Flush stream upon deletion of the object."""
# self.flush()
def demo_read(filename):
_dbf = Dbf(filename, True)
for _rec in _dbf:
print()
print(repr(_rec))
_dbf.close()
def demo_create(filename):
_dbf = Dbf(filename, new=True)
_dbf.addField(
("NAME", "C", 15),
("SURNAME", "C", 25),
("INITIALS", "C", 10),
("BIRTHDATE", "D"),
)
for (_n, _s, _i, _b) in (
("John", "Miller", "YC", (1981, 1, 2)),
("Andy", "Larkin", "AL", (1982, 3, 4)),
("Bill", "Clinth", "", (1983, 5, 6)),
("Bobb", "McNail", "", (1984, 7, 8)),
):
_rec = _dbf.newRecord()
_rec["NAME"] = _n
_rec["SURNAME"] = _s
_rec["INITIALS"] = _i
_rec["BIRTHDATE"] = _b
_rec.store()
print(repr(_dbf))
_dbf.close()
if __name__ == '__main__':
import sys
_name = len(sys.argv) > 1 and sys.argv[1] or "county.dbf"
demo_create(_name)
demo_read(_name)
# vim: set et sw=4 sts=4 :
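The list-style index normalisation described in the `_fixIndex` docstring can be sketched as a standalone function. This is an illustration under assumptions, not the module's code: `fix_index` is a hypothetical name, and unlike the method above it also rejects indexes that are still negative after adjustment.

```python
def fix_index(index, length):
    """Translate a possibly negative index into a valid 0-based index."""
    if not isinstance(index, int):
        raise TypeError("Index must be a numeric object")
    if index < 0:
        # Count from the right, as list objects do: -1 is the last record.
        index += length
    if not 0 <= index < length:
        raise IndexError("Record index out of range")
    return index


print(fix_index(-1, 5))
```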
+183
@@ -0,0 +1,183 @@
#!/usr/bin/python
""".DBF creation helpers.
Note: this is a legacy interface. New code should use Dbf class
for table creation (see examples in dbf.py)
TODO:
- handle Memo fields.
- check length of the fields according to the
`http://www.clicketyclick.dk/databases/xbase/format/data_types.html`
"""
"""History (most recent first)
04-jul-2006 [als] added export declaration;
updated for dbfpy 2.0
15-dec-2005 [yc] define dbf_new.__slots__
14-dec-2005 [yc] added vim modeline; retab'd; added doc-strings;
dbf_new now is a new class (inherited from object)
??-jun-2000 [--] added by Hans Fiby
"""
__version__ = "$Revision: 1.4 $"[11:-2]
__date__ = "$Date: 2006/07/04 08:18:18 $"[7:-2]
__all__ = ["dbf_new"]
from .dbf import *
from .fields import *
from .header import *
from .record import *
class _FieldDefinition:
"""Field definition.
This is a simple structure, which contains ``name``, ``type``,
``len``, ``dec`` and ``cls`` fields.
Objects also implement get/setitem magic functions, so fields
could be accessed via sequence interface, where 'name' has
index 0, 'type' index 1, 'len' index 2, 'dec' index 3 and
'cls' could be located at index 4.
"""
__slots__ = "name", "type", "len", "dec", "cls"
# WARNING: be attentive - dictionaries are mutable!
FLD_TYPES = {
# type: (cls, len)
"C": (DbfCharacterFieldDef, None),
"N": (DbfNumericFieldDef, None),
"L": (DbfLogicalFieldDef, 1),
# FIXME: support memos
# "M": (DbfMemoFieldDef),
"D": (DbfDateFieldDef, 8),
# FIXME: I'm not sure length should be 14 characters!
# but temporary I use it, cuz date is 8 characters
# and time 6 (hhmmss)
"T": (DbfDateTimeFieldDef, 14),
}
def __init__(self, name, type, len=None, dec=0):
_cls, _len = self.FLD_TYPES[type]
if _len is None:
if len is None:
raise ValueError("Field length must be defined")
_len = len
self.name = name
self.type = type
self.len = _len
self.dec = dec
self.cls = _cls
def getDbfField(self):
"Return `DbfFieldDef` instance from the current definition."
return self.cls(self.name, self.len, self.dec)
def appendToHeader(self, dbfh):
"""Create a `DbfFieldDef` instance and append it to the dbf header.
Arguments:
dbfh: `DbfHeader` instance.
"""
_dbff = self.getDbfField()
dbfh.addField(_dbff)
class dbf_new:
"""New .DBF creation helper.
Example Usage:
dbfn = dbf_new()
dbfn.add_field("name",'C',80)
dbfn.add_field("price",'N',10,2)
dbfn.add_field("date",'D',8)
dbfn.write("tst.dbf")
Note:
This module cannot handle Memo-fields,
they are special.
"""
__slots__ = ("fields",)
FieldDefinitionClass = _FieldDefinition
def __init__(self):
self.fields = []
def add_field(self, name, typ, len, dec=0):
"""Add field definition.
Arguments:
name:
field name (str object). field name must not
contain ASCII NULs and its length shouldn't
exceed 10 characters.
typ:
type of the field. this must be a single character
from the "CNLMDT" set meaning character, numeric,
logical, memo, date and date/time respectively.
len:
length of the field. this argument is used only for
the character and numeric fields. all other fields
have fixed length.
FIXME: use None as a default for this argument?
dec:
decimal precision. used only for the numeric fields.
"""
self.fields.append(self.FieldDefinitionClass(name, typ, len, dec))
def write(self, filename):
"""Create empty .DBF file using current structure."""
_dbfh = DbfHeader()
_dbfh.setCurrentDate()
for _fldDef in self.fields:
_fldDef.appendToHeader(_dbfh)
_dbfStream = open(filename, "wb")
_dbfh.write(_dbfStream)
_dbfStream.close()
if __name__ == '__main__':
# create a new DBF-File
dbfn = dbf_new()
dbfn.add_field("name", 'C', 80)
dbfn.add_field("price", 'N', 10, 2)
dbfn.add_field("date", 'D', 8)
dbfn.write("tst.dbf")
# test new dbf
print("*** created tst.dbf: ***")
dbft = Dbf('tst.dbf', readOnly=0)
print(repr(dbft))
# add a record
rec = DbfRecord(dbft)
rec['name'] = 'something'
rec['price'] = 10.5
rec['date'] = (2000, 1, 12)
rec.store()
# add another record
rec = DbfRecord(dbft)
rec['name'] = 'foo and bar'
rec['price'] = 12234
rec['date'] = (1992, 7, 15)
rec.store()
# show the records
print("*** inserted 2 records into tst.dbf: ***")
print(repr(dbft))
for i1 in range(len(dbft)):
rec = dbft[i1]
for fldName in dbft.fieldNames:
print('{}:\t {}'.format(fldName, rec[fldName]))
print()
dbft.close()
# vim: set et sts=4 sw=4 :
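The length resolution done in `_FieldDefinition.__init__` can be sketched on its own: fixed-length types ignore the caller's length, variable-length types require one. A minimal illustration; `FIXED_LENGTHS` and `resolve_length` are hypothetical names, with the lengths copied from `FLD_TYPES` above.

```python
# None marks a variable-length type whose length the caller must supply.
FIXED_LENGTHS = {"C": None, "N": None, "L": 1, "D": 8, "T": 14}


def resolve_length(type_code, length=None):
    """Return the effective field length for a DBF type code."""
    fixed = FIXED_LENGTHS[type_code]
    if fixed is None:
        if length is None:
            raise ValueError("Field length must be defined")
        return length
    # Fixed-length types override whatever the caller passed.
    return fixed


print(resolve_length("C", 15), resolve_length("D", 99))
```

This is why `add_field("date", 'D', 8)` in the demo works with any length argument: the date type always resolves to 8.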
+475
@@ -0,0 +1,475 @@
"""DBF fields definitions.
TODO:
- make memos work
"""
"""History (most recent first):
26-may-2009 [als] DbfNumericFieldDef.decodeValue: strip zero bytes
05-feb-2009 [als] DbfDateFieldDef.encodeValue: empty arg produces empty date
16-sep-2008 [als] DbfNumericFieldDef decoding looks for decimal point
in the value to select float or integer return type
13-mar-2008 [als] check field name length in constructor
11-feb-2007 [als] handle value conversion errors
10-feb-2007 [als] DbfFieldDef: added .rawFromRecord()
01-dec-2006 [als] Timestamp columns use None for empty values
31-oct-2006 [als] support field types 'F' (float), 'I' (integer)
and 'Y' (currency);
automate export and registration of field classes
04-jul-2006 [als] added export declaration
10-mar-2006 [als] decode empty values for Date and Logical fields;
show field name in errors
10-mar-2006 [als] fix Numeric value decoding: according to spec,
value always is string representation of the number;
ensure that encoded Numeric value fits into the field
20-dec-2005 [yc] use field names in upper case
15-dec-2005 [yc] field definitions moved from `dbf`.
"""
__version__ = "$Revision: 1.14 $"[11:-2]
__date__ = "$Date: 2009/05/26 05:16:51 $"[7:-2]
__all__ = ["lookupFor"] # field classes added at the end of the module
import datetime
import struct
import sys
from functools import total_ordering
from . import utils
# abstract definitions
@total_ordering
class DbfFieldDef:
"""Abstract field definition.
Child classes must override ``typeCode`` class attribute to provide datatype
information of the field definition. For more info about types visit
`http://www.clicketyclick.dk/databases/xbase/format/data_types.html`
Also child classes must override ``defaultValue`` field to provide
default value for the field value.
If child class has fixed length ``length`` class attribute must be
overridden and set to the valid value. None value means, that field
isn't of fixed length.
Note: ``name`` field must not be changed after instantiation.
"""
__slots__ = ("name", "decimalCount", "start", "end", "ignoreErrors")
# length of the field, None in case of variable-length field,
# or a number if this field is a fixed-length field
length = None
# field type. for more information about fields types visit
# `http://www.clicketyclick.dk/databases/xbase/format/data_types.html`
# must be overridden in child classes
typeCode = None
# default value for the field. this field must be
# overridden in child classes
defaultValue = None
def __init__(self, name, length=None, decimalCount=None,
start=None, stop=None, ignoreErrors=False):
"""Initialize instance."""
assert self.typeCode is not None, "Type code must be overridden"
assert self.defaultValue is not None, "Default value must be overridden"
# fix arguments
if len(name) > 10:
raise ValueError("Field name \"%s\" is too long" % name)
name = str(name).upper()
if self.__class__.length is None:
if length is None:
raise ValueError("[%s] Length isn't specified" % name)
length = int(length)
if length <= 0:
raise ValueError("[%s] Length must be a positive integer" % name)
else:
length = self.length
if decimalCount is None:
decimalCount = 0
# set fields
self.name = name
# FIXME: validate length according to the specification at
# http://www.clicketyclick.dk/databases/xbase/format/data_types.html
self.length = length
self.decimalCount = decimalCount
self.ignoreErrors = ignoreErrors
self.start = start
self.end = stop
def __eq__(self, other):
return repr(self) == repr(other)
def __ne__(self, other):
return repr(self) != repr(other)
def __lt__(self, other):
return repr(self) < repr(other)
def __hash__(self):
return hash(self.name)
def fromString(cls, string, start, ignoreErrors=False):
"""Decode dbf field definition from the string data.
Arguments:
string:
a string, dbf definition is decoded from. length of
the string must be 32 bytes.
start:
position in the database file.
ignoreErrors:
initial error processing mode for the new field (boolean)
"""
assert len(string) == 32
_length = string[16]
return cls(utils.unzfill(string)[:11].decode('utf-8'), _length,
string[17], start, start + _length, ignoreErrors=ignoreErrors)
fromString = classmethod(fromString)
def toString(self):
"""Return encoded field definition.
Return:
Return value is a string object containing encoded
definition of this field.
"""
_name = self.name.ljust(11, '\0')
return (
_name +
self.typeCode +
# data address
chr(0) * 4 +
chr(self.length) +
chr(self.decimalCount) +
chr(0) * 14
)
def __repr__(self):
return "%-10s %1s %3d %3d" % self.fieldInfo()
def fieldInfo(self):
"""Return field information.
Return:
Return value is a (name, type, length, decimals) tuple.
"""
return (self.name, self.typeCode, self.length, self.decimalCount)
def rawFromRecord(self, record):
"""Return a "raw" field value from the record string."""
return record[self.start:self.end]
def decodeFromRecord(self, record):
"""Return decoded field value from the record string."""
try:
return self.decodeValue(self.rawFromRecord(record))
except Exception:
if self.ignoreErrors:
return utils.INVALID_VALUE
else:
raise
def decodeValue(self, value):
"""Return decoded value from string value.
This method shouldn't be used publicly. It's called from the
`decodeFromRecord` method.
This is an abstract method and it must be overridden in child classes.
"""
raise NotImplementedError
def encodeValue(self, value):
"""Return str object containing encoded field value.
This is an abstract method and it must be overridden in child classes.
"""
raise NotImplementedError
# real classes
class DbfCharacterFieldDef(DbfFieldDef):
"""Definition of the character field."""
typeCode = "C"
defaultValue = b''
def decodeValue(self, value):
"""Return string object.
Return value is a ``value`` argument with stripped right spaces.
"""
return value.rstrip(b' ').decode('utf-8')
def encodeValue(self, value):
"""Return raw data string encoded from a ``value``."""
return str(value)[:self.length].ljust(self.length)
class DbfNumericFieldDef(DbfFieldDef):
"""Definition of the numeric field."""
typeCode = "N"
# XXX: now I'm not sure it was a good idea to make a class field
# `defaultValue` instead of a generic method as it was implemented
# previously -- it's ok with all types except number, cuz
# if self.decimalCount is 0, we should return 0 and 0.0 otherwise.
defaultValue = 0
def decodeValue(self, value):
"""Return a number decoded from ``value``.
If the raw value contains a decimal point, it is decoded
as a float; otherwise as an integer (an empty value decodes to 0).
Return:
Return value is an int or float instance.
"""
value = value.strip(b' \0')
if b'.' in value:
# a float (has decimal separator)
return float(value)
elif value:
# must be an integer
return int(value)
else:
return 0
def encodeValue(self, value):
"""Return string containing encoded ``value``."""
_rv = ("%*.*f" % (self.length, self.decimalCount, value))
if len(_rv) > self.length:
_ppos = _rv.find(".")
if 0 <= _ppos <= self.length:
_rv = _rv[:self.length]
else:
raise ValueError("[%s] Numeric overflow: %s (field width: %i)"
% (self.name, _rv, self.length))
return _rv
class DbfFloatFieldDef(DbfNumericFieldDef):
"""Definition of the float field - same as numeric."""
typeCode = "F"
class DbfIntegerFieldDef(DbfFieldDef):
"""Definition of the integer field."""
typeCode = "I"
length = 4
defaultValue = 0
def decodeValue(self, value):
"""Return an integer number decoded from ``value``."""
return struct.unpack("<i", value)[0]
def encodeValue(self, value):
"""Return string containing encoded ``value``."""
return struct.pack("<i", int(value))
class DbfCurrencyFieldDef(DbfFieldDef):
"""Definition of the currency field."""
typeCode = "Y"
length = 8
defaultValue = 0.0
def decodeValue(self, value):
"""Return float number decoded from ``value``."""
return struct.unpack("<q", value)[0] / 10000.
def encodeValue(self, value):
"""Return string containing encoded ``value``."""
return struct.pack("<q", round(value * 10000))
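The currency encoding above stores the amount as a little-endian signed 64-bit integer scaled by 10000, giving four fixed decimal places. A minimal standalone round-trip sketch; `encode_currency` and `decode_currency` are hypothetical free functions mirroring the two methods.

```python
import struct


def encode_currency(value):
    """Pack an amount as a little-endian int64 holding value * 10000."""
    return struct.pack("<q", round(value * 10000))


def decode_currency(raw):
    """Unpack 8 bytes and scale back down to a float amount."""
    return struct.unpack("<q", raw)[0] / 10000.0


raw = encode_currency(12.34)
print(len(raw), decode_currency(raw))
```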
class DbfLogicalFieldDef(DbfFieldDef):
"""Definition of the logical field."""
typeCode = "L"
defaultValue = -1
length = 1
def decodeValue(self, value):
"""Return True, False or -1 decoded from ``value``."""
# Note: value always is 1-char string
if value == "?":
return -1
if value in "NnFf ":
return False
if value in "YyTt":
return True
raise ValueError(f"[{self.name}] Invalid logical value {value!r}")
def encodeValue(self, value):
"""Return a character from the "TF?" set.
Return:
Return value is "T" if ``value`` is True,
"?" if ``value`` is -1, or "F" otherwise.
"""
if value is True:
return "T"
if value == -1:
return "?"
return "F"
class DbfMemoFieldDef(DbfFieldDef):
"""Definition of the memo field.
Note: memos aren't currently completely supported.
"""
typeCode = "M"
defaultValue = " " * 10
length = 10
def decodeValue(self, value):
"""Return int .dbt block number decoded from the string object."""
# return int(value)
raise NotImplementedError
def encodeValue(self, value):
"""Return raw data string encoded from a ``value``.
Note: this is an internal method.
"""
# return str(value)[:self.length].ljust(self.length)
raise NotImplementedError
class DbfDateFieldDef(DbfFieldDef):
"""Definition of the date field."""
typeCode = "D"
defaultValue = utils.classproperty(lambda cls: datetime.date.today())
# "yyyymmdd" gives us 8 characters
length = 8
def decodeValue(self, value):
"""Return a ``datetime.date`` instance decoded from ``value``."""
if value.strip():
return utils.getDate(value)
else:
return None
def encodeValue(self, value):
"""Return a string-encoded value.
``value`` argument should be a value suitable for the
`utils.getDate` call.
Return:
Return value is a string in format "yyyymmdd".
"""
if value:
return utils.getDate(value).strftime("%Y%m%d")
else:
return " " * self.length
class DbfDateTimeFieldDef(DbfFieldDef):
"""Definition of the timestamp field."""
# a difference between JDN (Julian Day Number)
# and GDN (Gregorian Day Number). note, that GDN < JDN
JDN_GDN_DIFF = 1721425
typeCode = "T"
defaultValue = utils.classproperty(lambda cls: datetime.datetime.now())
# two 32-bits integers representing JDN and amount of
# milliseconds respectively gives us 8 bytes.
# note, that values must be encoded in LE byteorder.
length = 8
def decodeValue(self, value):
"""Return a `datetime.datetime` instance."""
assert len(value) == self.length
# LE byteorder
_jdn, _msecs = struct.unpack("<2I", value)
if _jdn >= 1:
_rv = datetime.datetime.fromordinal(_jdn - self.JDN_GDN_DIFF)
_rv += datetime.timedelta(0, _msecs / 1000.0)
else:
# empty date
_rv = None
return _rv
def encodeValue(self, value):
"""Return a string-encoded ``value``."""
if value:
value = utils.getDateTime(value)
# LE byteorder
_rv = struct.pack("<2I", value.toordinal() + self.JDN_GDN_DIFF,
(value.hour * 3600 + value.minute * 60 + value.second) * 1000)
else:
_rv = "\0" * self.length
assert len(_rv) == self.length
return _rv
_fieldsRegistry = {}
def registerField(fieldCls):
"""Register field definition class.
``fieldCls`` should be subclass of the `DbfFieldDef`.
Use `lookupFor` to retrieve field definition class
by the type code.
"""
assert fieldCls.typeCode is not None, "Type code isn't defined"
# XXX: use fieldCls.typeCode.upper()? in case of any decision
# don't forget to look at the same comment in the ``lookupFor`` function
_fieldsRegistry[fieldCls.typeCode] = fieldCls
def lookupFor(typeCode):
"""Return field definition class for the given type code.
``typeCode`` must be a single character. That type should be
previously registered.
Use `registerField` to register new field class.
Return:
Return value is a subclass of the `DbfFieldDef`.
"""
# XXX: use typeCode.upper()? in case of any decision don't
# forget to look at the same comment in ``registerField``
return _fieldsRegistry[chr(typeCode)]
# register generic types
for (_name, _val) in list(globals().items()):
if isinstance(_val, type) and issubclass(_val, DbfFieldDef) \
and (_name != "DbfFieldDef"):
__all__.append(_name)
registerField(_val)
del _name, _val
# vim: et sts=4 sw=4 :
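
The timestamp field above stores two little-endian 32-bit integers: a Julian Day Number and the milliseconds since midnight. The following is a minimal standalone sketch of that round trip, not the module's own code. The helper names `encode_timestamp` and `decode_timestamp` are hypothetical; the offset constant and the `"<2I"` layout are taken from the listing.

```python
import datetime
import struct

# Offset between the Julian Day Number and Python's proleptic Gregorian
# ordinal, as given by JDN_GDN_DIFF in the listing.
JDN_GDN_DIFF = 1721425


def encode_timestamp(value):
    """Pack a datetime into 8 bytes: day number, then msecs since midnight."""
    msecs = (value.hour * 3600 + value.minute * 60 + value.second) * 1000
    return struct.pack("<2I", value.toordinal() + JDN_GDN_DIFF, msecs)


def decode_timestamp(raw):
    """Inverse of encode_timestamp; a zero day number means an empty date."""
    jdn, msecs = struct.unpack("<2I", raw)
    if jdn < 1:
        return None
    day = datetime.datetime.fromordinal(jdn - JDN_GDN_DIFF)
    return day + datetime.timedelta(milliseconds=msecs)


stamp = datetime.datetime(2020, 12, 4, 10, 10, 2)
raw = encode_timestamp(stamp)
restored = decode_timestamp(raw)
```

Sub-second precision is dropped on encoding, as in the listing, so only datetimes with zero microseconds survive the round trip unchanged.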
+270
@@ -0,0 +1,270 @@
"""DBF header definition.
TODO:
- handle encoding of the character fields
(encoding information stored in the DBF header)
"""
"""History (most recent first):
16-sep-2010 [als] fromStream: fix century of the last update field
11-feb-2007 [als] added .ignoreErrors
10-feb-2007 [als] added __getitem__: return field definitions
by field name or field number (zero-based)
04-jul-2006 [als] added export declaration
15-dec-2005 [yc] created
"""
__version__ = "$Revision: 1.6 $"[11:-2]
__date__ = "$Date: 2010/09/16 05:06:39 $"[7:-2]
__all__ = ["DbfHeader"]
import datetime
import io
import struct
import sys
from . import fields
from .utils import getDate
class DbfHeader:
"""Dbf header definition.
For more information about dbf header format visit
`http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT`
Examples:
Create an empty dbf header and add some field definitions:
dbfh = DbfHeader()
dbfh.addField(("name", "C", 10))
dbfh.addField(("date", "D"))
dbfh.addField(DbfNumericFieldDef("price", 5, 2))
Create a dbf header with field definitions:
dbfh = DbfHeader([
("name", "C", 10),
("date", "D"),
DbfNumericFieldDef("price", 5, 2),
])
"""
__slots__ = ("signature", "fields", "lastUpdate", "recordLength",
"recordCount", "headerLength", "changed", "_ignore_errors")
# instance construction and initialization methods
def __init__(self, fields=None, headerLength=0, recordLength=0,
recordCount=0, signature=0x03, lastUpdate=None, ignoreErrors=False):
"""Initialize instance.
Arguments:
fields:
a list of field definitions;
recordLength:
size of the records;
headerLength:
size of the header;
recordCount:
number of records stored in DBF;
signature:
version number (aka signature). using 0x03 as a default meaning
"File without DBT". for more information about this field visit
``http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_NOTE_1_TARGET``
lastUpdate:
date of the DBF's update. this could be a string ('yymmdd' or
'yyyymmdd'), timestamp (int or float), datetime/date value,
a sequence (assuming (yyyy, mm, dd, ...)) or an object having
callable ``ticks`` field.
ignoreErrors:
error processing mode for DBF fields (boolean)
"""
self.signature = signature
if fields is None:
self.fields = []
else:
self.fields = list(fields)
self.lastUpdate = getDate(lastUpdate)
self.recordLength = recordLength
self.headerLength = headerLength
self.recordCount = recordCount
self.ignoreErrors = ignoreErrors
# XXX: I'm not sure this is safe to
# initialize `self.changed` in this way
self.changed = bool(self.fields)
# @classmethod
def fromString(cls, string):
"""Return header instance from the string object."""
return cls.fromStream(io.StringIO(str(string)))
fromString = classmethod(fromString)
# @classmethod
def fromStream(cls, stream):
"""Return header object from the stream."""
stream.seek(0)
first_32 = stream.read(32)
if type(first_32) != bytes:
_data = bytes(first_32, sys.getfilesystemencoding())
else:
_data = first_32
(_cnt, _hdrLen, _recLen) = struct.unpack("<I2H", _data[4:12])
# reserved = _data[12:32]
_year = _data[1]
if _year < 80:
# dBase II started at 1980. It is quite unlikely
# that actual last update date is before that year.
_year += 2000
else:
_year += 1900
# create header object
_obj = cls(None, _hdrLen, _recLen, _cnt, _data[0],
(_year, _data[2], _data[3]))
# append field definitions
# position 0 is for the deletion flag
_pos = 1
_data = stream.read(1)
while _data != b'\r':
_data += stream.read(31)
_fld = fields.lookupFor(_data[11]).fromString(_data, _pos)
_obj._addField(_fld)
_pos = _fld.end
_data = stream.read(1)
return _obj
fromStream = classmethod(fromStream)
# properties
year = property(lambda self: self.lastUpdate.year)
month = property(lambda self: self.lastUpdate.month)
day = property(lambda self: self.lastUpdate.day)
def ignoreErrors(self, value):
"""Update `ignoreErrors` flag on self and all fields"""
self._ignore_errors = value = bool(value)
for _field in self.fields:
_field.ignoreErrors = value
ignoreErrors = property(
lambda self: self._ignore_errors,
ignoreErrors,
doc="""Error processing mode for DBF field value conversion
if set, failing field value conversion will return
``INVALID_VALUE`` instead of raising conversion error.
""")
# object representation
def __repr__(self):
_rv = """\
Version (signature): 0x%02x
Last update: %s
Header length: %d
Record length: %d
Record count: %d
FieldName Type Len Dec
""" % (self.signature, self.lastUpdate, self.headerLength,
self.recordLength, self.recordCount)
_rv += "\n".join(
["%10s %4s %3s %3s" % _fld.fieldInfo() for _fld in self.fields]
)
return _rv
# internal methods
def _addField(self, *defs):
"""Internal variant of the `addField` method.
This method doesn't set `self.changed` field to True.
Return value is a length of the appended records.
Note: this method doesn't modify ``recordLength`` and
``headerLength`` fields. Use `addField` instead of this
method if you don't exactly know what you're doing.
"""
# ensure we have dbf.DbfFieldDef instances first (instantiation
# from the tuple could raise an error, in such a case I don't
# want to add any of the definitions -- all will be ignored)
_defs = []
_recordLength = 0
for _def in defs:
if isinstance(_def, fields.DbfFieldDef):
_obj = _def
else:
(_name, _type, _len, _dec) = (tuple(_def) + (None,) * 4)[:4]
_cls = fields.lookupFor(_type)
_obj = _cls(_name, _len, _dec, ignoreErrors=self._ignore_errors)
_recordLength += _obj.length
_defs.append(_obj)
# and now extend field definitions and
# update record length
self.fields += _defs
return _recordLength
# interface methods
def addField(self, *defs):
"""Add field definition to the header.
Examples:
dbfh.addField(
("name", "C", 20),
dbf.DbfCharacterFieldDef("surname", 20),
dbf.DbfDateFieldDef("birthdate"),
("member", "L"),
)
dbfh.addField(("price", "N", 5, 2))
dbfh.addField(dbf.DbfNumericFieldDef("origprice", 5, 2))
"""
_oldLen = self.recordLength
self.recordLength += self._addField(*defs)
if not _oldLen:
self.recordLength += 1
# XXX: maybe just use:
# self.recordLength += self._addField(*defs) + bool(not _oldLen)
# recalculate headerLength
self.headerLength = 32 + (32 * len(self.fields)) + 1
self.changed = True
def write(self, stream):
"""Encode and write header to the stream."""
stream.seek(0)
stream.write(self.toString())
fields = [_fld.toString() for _fld in self.fields]
stream.write(''.join(fields).encode(sys.getfilesystemencoding()))
stream.write(b'\x0D') # cr at end of all header data
self.changed = False
def toString(self):
"""Return a 32-byte string with the encoded header."""
return struct.pack("<4BI2H",
self.signature,
self.year - 1900,
self.month,
self.day,
self.recordCount,
self.headerLength,
self.recordLength) + (b'\x00' * 20)
# TODO: figure out if bytes(utf-8) is correct here.
def setCurrentDate(self):
"""Update ``self.lastUpdate`` field with current date value."""
self.lastUpdate = datetime.date.today()
def __getitem__(self, item):
"""Return a field definition by numeric index or name string"""
if isinstance(item, str):
_name = item.upper()
for _field in self.fields:
if _field.name == _name:
return _field
else:
raise KeyError(item)
else:
# item must be field index
return self.fields[item]
# vim: et sts=4 sw=4 :
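
The fixed part of the header handled by `fromStream` and `toString` is 32 bytes: signature, a three-byte last-update date, record count, header length, record length, and 20 reserved bytes. The sketch below decodes that prefix on its own, assuming the same `"<4BI2H"` layout and the same century rule as the listing. The function name `parse_header_prefix` and the sample values are hypothetical.

```python
import struct


def parse_header_prefix(data):
    """Decode the first 12 bytes of a DBF header into a dict."""
    signature, yy, month, day, count, header_len, record_len = struct.unpack(
        "<4BI2H", data[:12]
    )
    # Same century rule as fromStream: year bytes below 80 are read as 20xx.
    year = yy + (2000 if yy < 80 else 1900)
    return {
        "signature": signature,
        "last_update": (year, month, day),
        "record_count": count,
        "header_length": header_len,
        "record_length": record_len,
    }


# Hypothetical header: 7 records, two fields (32 + 32 * 2 + 1 = 97 bytes),
# 31-byte records, followed by the 20 reserved bytes.
sample = struct.pack("<4BI2H", 0x03, 20, 12, 4, 7, 97, 31) + b"\x00" * 20
info = parse_header_prefix(sample)
```

Note that `toString` writes `self.year - 1900`, so a file written in 2020 carries the byte value 120, which the same rule also maps back to 2020.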
+267
@@ -0,0 +1,267 @@
"""DBF record definition.
"""
"""History (most recent first):
11-feb-2007 [als] __repr__: added special case for invalid field values
10-feb-2007 [als] added .rawFromStream()
30-oct-2006 [als] fix record length in .fromStream()
04-jul-2006 [als] added export declaration
20-dec-2005 [yc] DbfRecord.write() -> DbfRecord._write();
added delete() method.
16-dec-2005 [yc] record definition moved from `dbf`.
"""
__version__ = "$Revision: 1.7 $"[11:-2]
__date__ = "$Date: 2007/02/11 09:05:49 $"[7:-2]
__all__ = ["DbfRecord"]
import sys
from . import utils
class DbfRecord:
"""DBF record.
Instances of this class shouldn't be created manually,
use `dbf.Dbf.newRecord` instead.
The class implements the mapping/sequence interface, so
fields can be accessed via their names or indexes
(names are the preferred way to access fields).
Hint:
Use `store` method to save modified record.
Examples:
Add new record to the database:
db = Dbf(filename)
rec = db.newRecord()
rec["FIELD1"] = value1
rec["FIELD2"] = value2
rec.store()
Or the same, but modify an existing
(second in this case) record:
db = Dbf(filename)
rec = db[2]
rec["FIELD1"] = value1
rec["FIELD2"] = value2
rec.store()
"""
__slots__ = "dbf", "index", "deleted", "fieldData"
# creation and initialization
def __init__(self, dbf, index=None, deleted=False, data=None):
"""Instance initialization.
Arguments:
dbf:
A `Dbf.Dbf` instance this record belongs to.
index:
An integer record index or None. If this value is
None, record will be appended to the DBF.
deleted:
Boolean flag indicating whether this record
is a deleted record.
data:
A sequence or None. This is a data of the fields.
If this argument is None, default values will be used.
"""
self.dbf = dbf
# XXX: I'm not sure ``index`` is necessary
self.index = index
self.deleted = deleted
if data is None:
self.fieldData = [_fd.defaultValue for _fd in dbf.header.fields]
else:
self.fieldData = list(data)
# XXX: validate self.index before calculating position?
position = property(lambda self: self.dbf.header.headerLength + \
self.index * self.dbf.header.recordLength)
def rawFromStream(cls, dbf, index):
"""Return raw record contents read from the stream.
Arguments:
dbf:
A `Dbf.Dbf` instance containing the record.
index:
Index of the record in the records' container.
This argument can't be None in this call.
Return value is a string containing record data in DBF format.
"""
# XXX: maybe write something assuming that the current stream
# position is the required one? it could save some
# time required to calculate where to seek in the file
dbf.stream.seek(dbf.header.headerLength +
index * dbf.header.recordLength)
return dbf.stream.read(dbf.header.recordLength)
rawFromStream = classmethod(rawFromStream)
def fromStream(cls, dbf, index):
"""Return a record read from the stream.
Arguments:
dbf:
A `Dbf.Dbf` instance new record should belong to.
index:
Index of the record in the records' container.
This argument can't be None in this call.
Return value is an instance of the current class.
"""
return cls.fromString(dbf, cls.rawFromStream(dbf, index), index)
fromStream = classmethod(fromStream)
def fromString(cls, dbf, string, index=None):
"""Return record read from the string object.
Arguments:
dbf:
A `Dbf.Dbf` instance new record should belong to.
string:
A string new record should be created from.
index:
Index of the record in the container. If this
argument is None, record will be appended.
Return value is an instance of the current class.
"""
return cls(dbf, index, string[0]=="*",
[_fd.decodeFromRecord(string) for _fd in dbf.header.fields])
fromString = classmethod(fromString)
# object representation
def __repr__(self):
_template = "%%%ds: %%s (%%s)" % max([len(_fld)
for _fld in self.dbf.fieldNames])
_rv = []
for _fld in self.dbf.fieldNames:
_val = self[_fld]
if _val is utils.INVALID_VALUE:
_rv.append(_template %
(_fld, "None", "value cannot be decoded"))
else:
_rv.append(_template % (_fld, _val, type(_val)))
return "\n".join(_rv)
# protected methods
def _write(self):
"""Write data to the dbf stream.
Note:
This isn't a public method; it's better to
use `store` instead publicly.
By design the ``_write`` method should be called
only from the `Dbf` instance.
"""
self._validateIndex(False)
self.dbf.stream.seek(self.position)
self.dbf.stream.write(bytes(self.toString(),
sys.getfilesystemencoding()))
# FIXME: may be move this write somewhere else?
# why we should check this condition for each record?
if self.index == len(self.dbf):
# this is the last record,
# we should write SUB (ASCII 26)
self.dbf.stream.write(b"\x1A")
# utility methods
def _validateIndex(self, allowUndefined=True, checkRange=False):
"""Validate the ``self.index`` value.
If the ``allowUndefined`` argument is True, the function does nothing
when ``self.index`` is None.
"""
if self.index is None:
if not allowUndefined:
raise ValueError("Index is undefined")
elif self.index < 0:
raise ValueError("Index can't be negative (%s)" % self.index)
elif checkRange and self.index >= self.dbf.header.recordCount:
raise ValueError("There are only %d records in the DBF" %
self.dbf.header.recordCount)
# interface methods
def store(self):
"""Store current record in the DBF.
If ``self.index`` is None, this record will be appended to the
records of the DBF this record belongs to; otherwise it replaces
the record at that index.
"""
self._validateIndex()
if self.index is None:
self.index = len(self.dbf)
self.dbf.append(self)
else:
self.dbf[self.index] = self
def delete(self):
"""Mark record as deleted."""
self.deleted = True
def toString(self):
"""Return string packed record values."""
# for (_def, _dat) in zip(self.dbf.header.fields, self.fieldData):
#
return "".join([" *"[self.deleted]] + [
_def.encodeValue(_dat)
for (_def, _dat) in zip(self.dbf.header.fields, self.fieldData)
])
def asList(self):
"""Return a flat list of fields.
Note:
Change of the list's values won't change
real values stored in this object.
"""
return self.fieldData[:]
def asDict(self):
"""Return a dictionary of fields.
Note:
Change of the dict's values won't change
real values stored in this object.
"""
return dict([_i for _i in zip(self.dbf.fieldNames, self.fieldData)])
def __getitem__(self, key):
"""Return value by field name or field index."""
if isinstance(key, int):
# integer index of the field
return self.fieldData[key]
# assuming string field name
return self.fieldData[self.dbf.indexOfFieldName(key)]
def __setitem__(self, key, value):
"""Set field value by integer index of the field or string name."""
if isinstance(key, int):
# integer index of the field
self.fieldData[key] = value
return
# assuming string field name
self.fieldData[self.dbf.indexOfFieldName(key)] = value
# vim: et sts=4 sw=4 :
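
Two details of the record layout above are easy to check in isolation: a record's byte offset is the header length plus `index * recordLength`, and the first byte of each record is the deletion flag (`*` for deleted, a space otherwise). A minimal sketch follows; the helper names `record_offset` and `is_deleted` are hypothetical and not part of the module.

```python
def record_offset(header_length, record_length, index):
    """Byte position of the zero-based record `index` within the file."""
    if index < 0:
        raise ValueError("Index can't be negative (%s)" % index)
    return header_length + index * record_length


def is_deleted(raw_record):
    """True if the record's leading flag byte marks it as deleted."""
    # Slicing (rather than indexing) yields a length-1 bytes/str value,
    # so the comparison works for both bytes and str input.
    return raw_record[:1] in (b"*", "*")


# Hypothetical file: 97-byte header, 31-byte records.
third = record_offset(97, 31, 2)
```

Slicing matters here because indexing a `bytes` object in Python 3 returns an `int`, which never compares equal to the string `"*"`.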
+168
@@ -0,0 +1,168 @@
"""String utilities.
TODO:
- allow strings in getDateTime routine;
"""
"""History (most recent first):
11-feb-2007 [als] added INVALID_VALUE
10-feb-2007 [als] allow date strings padded with spaces instead of zeroes
20-dec-2005 [yc] handle long objects in getDate/getDateTime
16-dec-2005 [yc] created from ``strutil`` module.
"""
__version__ = "$Revision: 1.4 $"[11:-2]
__date__ = "$Date: 2007/02/11 08:57:17 $"[7:-2]
import datetime
import time
def unzfill(str):
"""Return a string without ASCII NULs.
This function searches for the first NUL (ASCII 0) occurrence
and truncates the string at that position.
"""
try:
return str[:str.index(b'\0')]
except ValueError:
return str
def getDate(date=None):
"""Return `datetime.date` instance.
Type of the ``date`` argument could be one of the following:
None:
use current date value;
datetime.date:
this value will be returned;
datetime.datetime:
the result of the date.date() will be returned;
string:
assuming "%Y%m%d" or "%y%m%d" format;
number:
assuming it's a timestamp (returned for example
by the time.time() call);
sequence:
assuming (year, month, day, ...) sequence;
Additionally, if ``date`` has a callable ``ticks`` attribute,
it will be called and the result treated
as a timestamp value.
"""
if date is None:
# use current value
return datetime.date.today()
# check datetime first: it is a subclass of datetime.date
if isinstance(date, datetime.datetime):
return date.date()
if isinstance(date, datetime.date):
return date
if isinstance(date, (int, float)):
# date is a timestamp
return datetime.date.fromtimestamp(date)
if isinstance(date, str):
date = date.replace(" ", "0")
if len(date) == 6:
# yymmdd
return datetime.date(*time.strptime(date, "%y%m%d")[:3])
# yyyymmdd
return datetime.date(*time.strptime(date, "%Y%m%d")[:3])
if hasattr(date, "__getitem__"):
# a sequence (assuming date/time tuple)
return datetime.date(*date[:3])
return datetime.date.fromtimestamp(date.ticks())
def getDateTime(value=None):
"""Return `datetime.datetime` instance.
Type of the ``value`` argument could be one of the following:
None:
use current date value;
datetime.date:
result will be converted to the `datetime.datetime` instance
using midnight;
datetime.datetime:
``value`` will be returned as is;
string:
*** CURRENTLY NOT SUPPORTED ***;
number:
assuming it's a timestamp (returned for example
by the time.time() call);
sequence:
assuming (year, month, day, ...) sequence;
Additionally, if ``value`` has a callable ``ticks`` attribute,
it will be called and the result treated
as a timestamp value.
"""
if value is None:
# use current value
return datetime.datetime.today()
if isinstance(value, datetime.datetime):
return value
if isinstance(value, datetime.date):
return datetime.datetime.fromordinal(value.toordinal())
if isinstance(value, (int, float)):
# value is a timestamp
return datetime.datetime.fromtimestamp(value)
if isinstance(value, str):
raise NotImplementedError("Strings aren't currently implemented")
if hasattr(value, "__getitem__"):
# a sequence (assuming date/time tuple)
return datetime.datetime(*tuple(value)[:6])
return datetime.datetime.fromtimestamp(value.ticks())
class classproperty(property):
"""Works in the same way as a ``property``, but for the classes."""
def __get__(self, obj, cls):
return self.fget(cls)
class _InvalidValue:
"""Value returned from DBF records when field validation fails
The value is not equal to anything except for itself
and equal to all empty values: None, 0, empty string etc.
In other words, invalid value is equal to None and not equal
to None at the same time.
This value yields zero upon explicit conversion to a number type,
empty string for string types, and False for boolean.
"""
def __eq__(self, other):
return not other
def __ne__(self, other):
return not (other is self)
def __bool__(self):
return False
def __int__(self):
return 0
__long__ = __int__
def __float__(self):
return 0.0
def __str__(self):
return ""
def __repr__(self):
return "<INVALID>"
# invalid value is a constant singleton
INVALID_VALUE = _InvalidValue()
# vim: set et sts=4 sw=4 :
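
Because `datetime.datetime` is a subclass of `datetime.date`, the order of `isinstance` checks matters when normalising values the way `getDate` does. The sketch below shows one ordering that always yields a plain `date`; the function name `to_date` is hypothetical and covers only these two input types.

```python
import datetime


def to_date(value):
    """Return a plain datetime.date for a date or datetime input."""
    # The more specific type must be tested first: every datetime
    # instance also passes isinstance(value, datetime.date).
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    raise TypeError("unsupported type: %r" % type(value))


from_datetime = to_date(datetime.datetime(2020, 5, 16, 14, 4, 19))
from_date = to_date(datetime.date(2020, 5, 16))
```

With the checks in the opposite order, a `datetime` input would be returned unchanged and still carry its time component.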
+13
@@ -0,0 +1,13 @@
from io import BytesIO, StringIO
def normalize_input(stream):
"""
Accept either a str/bytes stream or a file-like object and always return a
file-like object.
"""
if isinstance(stream, str):
return StringIO(stream)
elif isinstance(stream, bytes):
return BytesIO(stream)
return stream
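
A short usage sketch for `normalize_input`, with the function repeated from the listing so the example runs on its own. The sample inputs are made up for illustration.

```python
from io import BytesIO, StringIO


def normalize_input(stream):
    """Copy of the helper above: wrap str/bytes, pass file-likes through."""
    if isinstance(stream, str):
        return StringIO(stream)
    elif isinstance(stream, bytes):
        return BytesIO(stream)
    return stream


text_stream = normalize_input("a,b\n1,2\n")   # str is wrapped in StringIO
byte_stream = normalize_input(b"\x03\x00")    # bytes are wrapped in BytesIO
existing = StringIO("already a stream")
same_stream = normalize_input(existing)       # file-like objects pass through
```

Callers can then call `.read()` on the result without checking the input type first.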
-14
@@ -1,14 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Tabbed
Copyright (c) 2010 Kenneth Reitz. MIT License.
"""
import tablib.cli
if __name__ == '__main__':
tablib.cli.start()
-1
@@ -1 +0,0 @@
from core import *
-85
@@ -1,85 +0,0 @@
#!/usr/bin/env python
# encoding: utf-8
""" Tabbed CLI Interface Application
"""
import io
import sys
from helpers import *
import tablib.core
from packages import opster
FORMATS = ('json', 'yaml', 'xls', 'csv', 'html')
opts = []
opts.append(('v', 'version', False, 'Report tabbed version'))
for format in FORMATS:
opts.append(('', format, False, 'Output to %s' % (format.upper())))
@opster.command(options=opts, usage='[FILE] [--FORMAT | FILE]')
def start(in_file=None, out_file=None, **opts):
"""Covertly convert dataset formats"""
opts = Object(**opts)
if opts.version:
print('Tabbed, Ver. %s' % tablib.core.__version__)
sys.exit(0)
stdin = piped()
if stdin:
print stdin
elif in_file:
try:
in_file = io.open(in_file, 'r')
except Exception, e:
print(' %s cannot be read.' % in_file)
sys.exit(65)
file_ext = in_file.name.split('.')[-1]
if file_ext.lower() in FORMATS:
setattr(opts, file_ext, True)
else:
print('Import format not supported.')
sys.exit(65)
else:
print('Please provide input.')
sys.exit(65)
_formats_sum = sum(opts[f] for f in FORMATS)
# Multiple output formats given
if _formats_sum > 1:
print('Please specify a single output format.')
sys.exit(64)
# No output formats given
elif _formats_sum < 1:
print('Please specify an output format.')
sys.exit(64)
# fetch options.formats list
# if sum(()) > 1
# log only one data format please
# if sum of formats == 0, specify format
# look for filename
print opts.__dict__
print in_file
print out_file
-219
@@ -1,219 +0,0 @@
# -*- coding: utf-8 -*-
# _____ ______ ______ _________
# __ /_______ ____ /_ ___ /_ _____ ______ /
# _ __/_ __ `/__ __ \__ __ \_ _ \_ __ /
# / /_ / /_/ / _ /_/ /_ /_/ // __// /_/ /
# \__/ \__,_/ /_.___/ /_.___/ \___/ \__,_/
import csv
import cStringIO
import os
from helpers import *
from packages import simplejson as json
from packages import xlwt
try:
import yaml
except ImportError, why:
from packages import yaml
__all__ = ['Dataset', 'source']
__version__ = '0.0.4'
__build__ = '0x000004'
__author__ = 'Kenneth Reitz'
__license__ = 'MIT'
__copyright__ = 'Copyright 2010 Kenneth Reitz'
FILE_EXTENSIONS = ('csv', 'json', 'xls', 'yaml')
class Dataset(object):
"""Epic Tabular-Dataset object. """
def __init__(self, *args, **kwargs):
self._data = None
self._saved_file = None
self._saved_format = None
self._data = list(args)
try:
self.headers = kwargs['headers']
except KeyError, why:
self.headers = None
try:
self.title = kwargs['title']
except KeyError, why:
self.title = None
def __len__(self):
return self.height
def __getitem__(self, key):
if is_string(key):
if key in self.headers:
pos = self.headers.index(key) # get 'key' index from each data
return [row[pos] for row in self._data]
else:
raise KeyError
else:
return self._data[key]
def __setitem__(self, key, value):
self._validate(value)
self._data[key] = tuple(value)
def __delitem__(self, key):
del self._data[key]
def __repr__(self):
if self.title:
return '<%s dataset>' % (self.title.lower())
else:
return '<dataset object>'
def _validate(self, row=None, safety=False):
"""Assures size of every row in dataset is of proper proportions."""
if row:
is_valid = (len(row) == self.width) if self.width else True
else:
is_valid = all((len(x)== self.width for x in self._data))
if is_valid:
return True
else:
if not safety:
raise InvalidDimensions
return False
def _package(self, dicts=True):
"""Packages Dataset into lists of dictionaries for transmission."""
if self.headers:
if dicts:
data = [dict(zip(self.headers, data_row)) for data_row in self._data]
else:
data = [list(self.headers)] + list(self._data)
else:
data = [list(row) for row in self._data]
return data
@property
def height(self):
"""Returns the height of the Dataset."""
return len(self._data)
@property
def width(self):
"""Returns the width of the Dataset."""
try:
return len(self._data[0])
except IndexError, why:
return 0
@property
def json(self):
"""Returns JSON representation of Dataset."""
return json.dumps(self._package())
@property
def yaml(self):
"""Returns YAML representation of Dataset."""
return yaml.dump(self._package())
@property
def csv(self):
"""Returns CSV representation of Dataset."""
stream = cStringIO.StringIO()
_csv = csv.writer(stream)
for row in self._package(dicts=False):
_csv.writerow(row)
return stream.getvalue()
@property
def xls(self):
"""Returns XLS representation of Dataset."""
stream = cStringIO.StringIO()
wb = xlwt.Workbook()
ws = wb.add_sheet(self.title if self.title else 'Tabbed Dataset')
# for row in self._package(dicts=False):
for i, row in enumerate(self._package(dicts=False)):
for j, col in enumerate(row):
ws.write(i, j, col)
doc = xlwt.CompoundDoc.XlsDoc()
doc.save(stream, wb.get_biff_data())
return stream.getvalue()
def append(self, row, index=None):
# todo: implement index
self._validate(row)
self._data.append(tuple(row))
def sort_by(self, key):
"""Sorts dataset by given key"""
# todo: accept string if headers, or index number
pass
def save(self, filename=None, format=None):
"""Saves dataset"""
if not format:
# set format from filename
# format = filename
pass
if format not in FILE_EXTENSIONS:
raise UnsupportedFormat
# note export format
# open file, save the bitch
def export(self):
"""Exports Dataset to given filename or file-object."""
class InvalidDimensions(Exception):
"Invalid size"
class UnsupportedFormat(NotImplementedError):
"Format is not supported"
def source(src=None, file=None, filename=None):
"""docstring for import"""
#open by filename
pass
-25
@@ -1,25 +0,0 @@
# -*- coding: utf-8 -*-
import sys
class Object(object):
"""Your attributes are belong to us."""
def __init__(self, **entries):
self.__dict__.update(entries)
def __getitem__(self, key):
return getattr(self, key)
def piped():
"""Returns piped input via stdin, else False"""
with sys.stdin as stdin:
return stdin.read() if not stdin.isatty() else None
def is_string(obj):
"""Tests if an object is a string"""
return True if type(obj).__name__ == 'str' else False
-1
@@ -1 +0,0 @@
all = ['simplejson', 'typecheck', 'xlwt', 'opster']
-612
@@ -1,612 +0,0 @@
# (c) Alexander Solovyov, 2009, under terms of the new BSD License
'''Command line arguments parser
'''
import sys, traceback, getopt, types, textwrap, inspect, os
from itertools import imap
__all__ = ['command', 'dispatch']
__version__ = '0.9.10'
__author__ = 'Alexander Solovyov'
__email__ = 'piranha@piranha.org.ua'
write = sys.stdout.write
err = sys.stderr.write
CMDTABLE = {}
# --------
# Public interface
# --------
def command(options=None, usage=None, name=None, shortlist=False, hide=False):
'''Decorator to mark function to be used for command line processing.
All arguments are optional:
- ``options``: options in format described in docs. If not supplied,
will be determined from function.
- ``usage``: usage string for function, replaces ``%name`` with name
of program or subcommand. In case if it's subcommand and ``%name``
is not present, usage is prepended by ``name``
- ``name``: used for multiple subcommands. Defaults to wrapped
function name
- ``shortlist``: if command should be included in shortlist. Used
only with multiple subcommands
- ``hide``: if command should be hidden from help listing. Used only
with multiple subcommands, overrides ``shortlist``
'''
def wrapper(func):
try:
options_ = list(guess_options(func))
except TypeError:
options_ = []
try:
options_ = options_ + list(options)
except TypeError:
pass
name_ = name or func.__name__.replace('_', '-')
if usage is None:
usage_ = guess_usage(func, options_)
else:
usage_ = usage
prefix = hide and '~' or (shortlist and '^' or '')
CMDTABLE[prefix + name_] = (func, options_, usage_)
def help_func(name=None):
return help_cmd(func, replace_name(usage_, sysname()), options_)
@wraps(func)
def inner(*args, **opts):
# look if we need to add 'help' option
try:
(True for option in reversed(options_)
if option[1] == 'help').next()
except StopIteration:
options_.append(('h', 'help', False, 'show help'))
argv = opts.pop('argv', sys.argv[1:])
if opts.pop('help', False):
return help_func()
if args or opts:
# no catcher here because this is call from Python
return call_cmd_regular(func, options_)(*args, **opts)
try:
opts, args = catcher(lambda: parse(argv, options_), help_func)
except Abort:
return -1
try:
if opts.pop('help', False):
return help_func()
return catcher(lambda: call_cmd(name_, func)(*args, **opts),
help_func)
except Abort:
return -1
return inner
return wrapper
def dispatch(args=None, cmdtable=None, globaloptions=None,
middleware=lambda x: x):
'''Dispatch command arguments based on subcommands.
- ``args``: list of arguments, default: ``sys.argv[1:]``
- ``cmdtable``: dict of commands in format described below.
If not supplied, will use functions decorated with ``@command``.
- ``globaloptions``: list of options which are applied to all
commands, will contain ``--help`` option at least.
- ``middleware``: global decorator for all commands.
cmdtable format description::
{'name': (function, options, usage)}
- ``name`` is the name used on command-line. Can contain aliases
(separate them with ``|``), pointer to a fact that this command
should be displayed in short help (start name with ``^``), or to
a fact that this command should be hidden (start name with ``~``)
- ``function`` is the actual callable
- ``options`` is options list in format described in docs
- ``usage`` is the short string of usage
'''
args = args or sys.argv[1:]
cmdtable = cmdtable or CMDTABLE
globaloptions = globaloptions or []
globaloptions.append(('h', 'help', False, 'display help'))
cmdtable['help'] = (help_(cmdtable, globaloptions), [], '[TOPIC]')
help_func = cmdtable['help'][0]
autocomplete(cmdtable, args)
try:
name, func, args, kwargs = catcher(
lambda: _dispatch(args, cmdtable, globaloptions),
help_func)
return catcher(
lambda: call_cmd(name, middleware(func))(*args, **kwargs),
help_func)
except Abort:
return -1
# --------
# Help
# --------
def help_(cmdtable, globalopts):
def help_inner(name=None):
'''Show help for a given help topic or a help overview
With no arguments, print a list of commands with short help messages.
Given a command name, print help for that command.
'''
def helplist():
hlp = {}
# determine if any command is marked for shortlist
shortlist = (name == 'shortlist' and
any(imap(lambda x: x.startswith('^'), cmdtable)))
for cmd, info in cmdtable.items():
if cmd.startswith('~'):
continue # do not display hidden commands
if shortlist and not cmd.startswith('^'):
continue # short help contains only marked commands
cmd = cmd.lstrip('^~')
doc = info[0].__doc__ or '(no help text available)'
hlp[cmd] = doc.splitlines()[0].rstrip()
hlplist = sorted(hlp)
maxlen = max(map(len, hlplist))
write('usage: %s <command> [options]\n' % sysname())
write('\ncommands:\n\n')
for cmd in hlplist:
doc = hlp[cmd]
if False: # verbose?
write(' %s:\n %s\n' % (cmd.replace('|', ', '), doc))
else:
write(' %-*s %s\n' % (maxlen, cmd.split('|', 1)[0],
doc))
if not cmdtable:
return err('No commands specified!\n')
if not name or name == 'shortlist':
return helplist()
aliases, (cmd, options, usage) = findcmd(name, cmdtable)
return help_cmd(cmd,
replace_name(usage, sysname() + ' ' + aliases[0]),
options + globalopts)
return help_inner
def help_cmd(func, usage, options):
'''show help for given command
- ``func``: function to generate help for (``func.__doc__`` is taken)
- ``usage``: usage string
- ``options``: options in usual format
>>> def test(*args, **opts):
... """that's a test command
...
... you can do nothing with this command"""
... pass
>>> opts = [('l', 'listen', 'localhost',
... 'ip to listen on'),
... ('p', 'port', 8000,
... 'port to listen on'),
... ('d', 'daemonize', False,
... 'daemonize process'),
... ('', 'pid-file', '',
... 'name of file to write process ID to')]
>>> help_cmd(test, 'test [-l HOST] [NAME]', opts)
test [-l HOST] [NAME]
<BLANKLINE>
that's a test command
<BLANKLINE>
you can do nothing with this command
<BLANKLINE>
options:
<BLANKLINE>
-l --listen ip to listen on (default: localhost)
-p --port port to listen on (default: 8000)
-d --daemonize daemonize process
--pid-file name of file to write process ID to
<BLANKLINE>
'''
write(usage + '\n')
doc = func.__doc__
if not doc:
doc = '(no help text available)'
write('\n' + doc.strip() + '\n\n')
if options:
write(''.join(help_options(options)))
def help_options(options):
yield 'options:\n\n'
output = []
for short, name, default, desc in options:
if hasattr(default, '__call__'):
default = default(None)
default = default and ' (default: %s)' % default or ''
output.append(('%2s%s' % (short and '-%s' % short,
name and ' --%s' % name),
'%s%s' % (desc, default)))
opts_len = max([len(first) for first, second in output if second] or [0])
for first, second in output:
if second:
# wrap description at 78 chars
second = textwrap.wrap(second, width=(78 - opts_len - 3))
pad = '\n' + ' ' * (opts_len + 3)
yield ' %-*s %s\n' % (opts_len, first, pad.join(second))
else:
yield '%s\n' % first
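The column alignment and wrapping performed by ``help_options`` can be sketched in isolation. This is a minimal Python 3 sketch rather than opster's own code: ``format_option`` is a hypothetical helper name, and the two-space gap between the flags column and the description is an assumption inferred from the ``opts_len + 3`` padding above.

```python
import textwrap


def format_option(flags, desc, flags_width, total_width=78):
    """Hypothetical helper: render one option line, wrapping the description."""
    # Reserve one leading space, the flags column and a two-space gap.
    lines = textwrap.wrap(desc, width=total_width - flags_width - 3)
    # Continuation lines are indented to start under the description column.
    pad = '\n' + ' ' * (flags_width + 3)
    return ' %-*s  %s\n' % (flags_width, flags, pad.join(lines))


example = format_option('-p --port', 'port to listen on (default: 8000)', 9)
```

Wrapping to ``total_width - flags_width - 3`` keeps every output line, including continuation lines, within the total width.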
# --------
# Options parsing
# --------
def parse(args, options):
'''
>>> opts = [('l', 'listen', 'localhost',
... 'ip to listen on'),
... ('p', 'port', 8000,
... 'port to listen on'),
... ('d', 'daemonize', False,
... 'daemonize process'),
... ('', 'pid-file', '',
... 'name of file to write process ID to')]
>>> print parse(['-l', '0.0.0.0', '--pi', 'test', 'all'], opts)
({'pid_file': 'test', 'daemonize': False, 'port': 8000, 'listen': '0.0.0.0'}, ['all'])
'''
argmap, defmap, state = {}, {}, {}
shortlist, namelist, funlist = '', [], []
for short, name, default, comment in options:
if short and len(short) != 1:
raise FOError('Short option should be only a single'
' character: %s' % short)
if not name:
raise FOError(
'Long name should be defined for every option')
# change name to match Python styling
pyname = name.replace('-', '_')
argmap['-' + short] = argmap['--' + name] = pyname
defmap[pyname] = default
# copy defaults to state
if isinstance(default, list):
state[pyname] = default[:]
elif hasattr(default, '__call__'):
funlist.append(pyname)
state[pyname] = None
else:
state[pyname] = default
# getopt wants indication that it takes a parameter
if not (default is None or default is True or default is False):
if short: short += ':'
if name: name += '='
if short:
shortlist += short
if name:
namelist.append(name)
opts, args = getopt.gnu_getopt(args, shortlist, namelist)
# transfer result to state
for opt, val in opts:
name = argmap[opt]
t = type(defmap[name])
if t is types.FunctionType:
del funlist[funlist.index(name)]
state[name] = defmap[name](val)
elif t is types.ListType:
state[name].append(val)
elif t in (types.NoneType, types.BooleanType):
state[name] = not defmap[name]
else:
state[name] = t(val)
for name in funlist:
state[name] = defmap[name](None)
return state, args
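The first half of ``parse`` translates option tuples into the short and long specifications that ``getopt`` expects. The following Python 3 sketch shows only that translation; ``getopt_specs`` is a hypothetical name, and the type conversion and default handling done by ``parse`` are left out.

```python
import getopt


def getopt_specs(options):
    """Build getopt specs from (short, long, default, help) tuples."""
    shortlist, namelist = '', []
    for short, name, default, _help in options:
        # Options whose default is None, True or False are plain flags;
        # everything else takes a value (':' for short, '=' for long).
        takes_value = not (default is None or default is True
                           or default is False)
        if short:
            shortlist += short + (':' if takes_value else '')
        namelist.append(name + ('=' if takes_value else ''))
    return shortlist, namelist


opts = [('l', 'listen', 'localhost', 'ip to listen on'),
        ('d', 'daemonize', False, 'daemonize process'),
        ('', 'pid-file', '', 'name of file to write process ID to')]
shortlist, namelist = getopt_specs(opts)
# gnu_getopt accepts unambiguous long-option prefixes such as --pi and
# allows options to follow positional arguments.
parsed, rest = getopt.gnu_getopt(
    ['-l', '0.0.0.0', '--pi', 'test', 'all', '-d'], shortlist, namelist)
```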
# --------
# Subcommand system
# --------
def _dispatch(args, cmdtable, globalopts):
cmd, func, args, options = cmdparse(args, cmdtable, globalopts)
if options.pop('help', False):
return 'help', cmdtable['help'][0], [cmd], {}
elif not cmd:
return 'help', cmdtable['help'][0], ['shortlist'], {}
return cmd, func, args, options
def cmdparse(args, cmdtable, globalopts):
# command is the first non-option
cmd = None
for arg in args:
if not arg.startswith('-'):
cmd = arg
break
if cmd:
args.pop(args.index(cmd))
aliases, info = findcmd(cmd, cmdtable)
cmd = aliases[0]
possibleopts = list(info[1])
else:
possibleopts = []
possibleopts.extend(globalopts)
try:
options, args = parse(args, possibleopts)
except getopt.GetoptError, e:
raise ParseError(cmd, e)
return (cmd, cmd and info[0] or None, args, options)
def aliases_(cmdtable_key):
return cmdtable_key.lstrip("^~").split("|")
def findpossible(cmd, table):
"""
Return cmd -> (aliases, command table entry)
for each matching command.
"""
choice = {}
for e in table.keys():
aliases = aliases_(e)
found = None
if cmd in aliases:
found = cmd
else:
for a in aliases:
if a.startswith(cmd):
found = a
break
if found is not None:
choice[found] = (aliases, table[e])
return choice
def findcmd(cmd, table):
"""Return (aliases, command table entry) for command string."""
choice = findpossible(cmd, table)
if cmd in choice:
return choice[cmd]
if len(choice) > 1:
clist = choice.keys()
clist.sort()
raise AmbiguousCommand(cmd, clist)
if choice:
return choice.values()[0]
raise UnknownCommand(cmd)
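The lookup rules of ``findpossible`` and ``findcmd`` (exact alias first, then a unique prefix, otherwise an error) can be illustrated with a simplified Python 3 sketch. ``match_command`` is a hypothetical helper that returns only the primary name and raises ``LookupError`` where opster raises ``AmbiguousCommand`` or ``UnknownCommand``.

```python
def match_command(prefix, keys):
    """Hypothetical helper: resolve a prefix against 'name|alias' keys."""
    exact, partial = [], []
    for key in keys:
        # Strip the shortlist (^) and hidden (~) markers, then split aliases.
        aliases = key.lstrip('^~').split('|')
        if prefix in aliases:
            exact.append(aliases[0])
        elif any(alias.startswith(prefix) for alias in aliases):
            partial.append(aliases[0])
    if exact:
        return exact[0]
    if len(partial) == 1:
        return partial[0]
    if partial:
        raise LookupError('ambiguous: %s' % ' '.join(sorted(partial)))
    raise LookupError('unknown: %s' % prefix)


keys = ['^status|st', 'start', '~_completion']
resolved = match_command('st', keys)
```

Here ``st`` resolves to ``status`` because it is an exact alias, even though it is also a prefix of ``start``.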
# --------
# Helpers
# --------
def guess_options(func):
args, varargs, varkw, defaults = inspect.getargspec(func)
for name, option in zip(args[-len(defaults):], defaults):
try:
sname, default, hlp = option
yield (sname, name.replace('_', '-'), default, hlp)
except TypeError:
pass
def guess_usage(func, options):
usage = '%name '
if options:
usage += '[OPTIONS] '
args, varargs = inspect.getargspec(func)[:2]
argnum = len(args) - len(options)
if argnum > 0:
usage += args[0].upper()
if argnum > 1:
usage += 'S'
elif varargs:
usage += '[%s]' % varargs.upper()
return usage
def catcher(target, help_func):
'''Catches all exceptions and prints human-readable information on them
'''
try:
return target()
except UnknownCommand, e:
err("unknown command: '%s'\n" % e)
except AmbiguousCommand, e:
err("command '%s' is ambiguous:\n %s\n" %
(e.args[0], ' '.join(e.args[1])))
except ParseError, e:
err('%s: %s\n' % (e.args[0], e.args[1]))
help_func(e.args[0])
except getopt.GetoptError, e:
err('error: %s\n' % e)
help_func()
except FOError, e:
err('%s\n' % e)
except KeyboardInterrupt:
err('interrupted!\n')
except SystemExit:
raise
except:
err('unknown exception encountered')
raise
raise Abort
def call_cmd(name, func):
def inner(*args, **kwargs):
try:
return func(*args, **kwargs)
except TypeError:
if len(traceback.extract_tb(sys.exc_info()[2])) == 1:
raise ParseError(name, "invalid arguments")
raise
return inner
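``call_cmd`` tells a ``TypeError`` caused by a mismatched call signature apart from one raised inside the command by counting traceback entries. A minimal Python 3 sketch of the same idea follows; ``BadArguments`` and ``guarded`` are hypothetical stand-ins for ``ParseError`` and ``call_cmd``.

```python
import sys
import traceback


class BadArguments(Exception):
    """Hypothetical stand-in for opster's ParseError."""


def guarded(func):
    def inner(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TypeError:
            # A single traceback entry means the call itself failed, so the
            # arguments did not match the signature. More entries mean the
            # body of func raised the error, which is passed through.
            if len(traceback.extract_tb(sys.exc_info()[2])) == 1:
                raise BadArguments('invalid arguments')
            raise
    return inner


def greet(name):
    return 'hello ' + name


greeting = guarded(greet)('world')
```

Calling ``guarded(greet)()`` raises ``BadArguments``, while ``guarded(greet)(1)`` re-raises the ``TypeError`` produced by the string concatenation inside ``greet``.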
def call_cmd_regular(func, opts):
def inner(*args, **kwargs):
funcargs, _, varkw, defaults = inspect.getargspec(func)
if len(args) > len(funcargs):
raise TypeError('You have supplied more positional arguments'
' than applicable')
funckwargs = dict((lname.replace('-', '_'), default)
for _, lname, default, _ in opts)
if 'help' not in (defaults or ()) and not varkw:
funckwargs.pop('help', None)
funckwargs.update(kwargs)
return func(*args, **funckwargs)
return inner
def replace_name(usage, name):
if '%name' in usage:
return usage.replace('%name', name, 1)
return name + ' ' + usage
def sysname():
name = sys.argv[0]
if name.startswith('./'):
return name[2:]
return name
try:
from functools import wraps
except ImportError:
def wraps(wrapped, assigned=('__module__', '__name__', '__doc__'),
updated=('__dict__',)):
def inner(wrapper):
for attr in assigned:
setattr(wrapper, attr, getattr(wrapped, attr))
for attr in updated:
getattr(wrapper, attr).update(getattr(wrapped, attr, {}))
return wrapper
return inner
# --------
# Autocomplete system
# --------
# Borrowed from PIP
def autocomplete(cmdtable, args):
"""Command and option completion.
Enable by sourcing one of the completion shell scripts (bash or zsh).
"""
# Don't complete if user hasn't sourced bash_completion file.
if not os.environ.has_key('OPSTER_AUTO_COMPLETE'):
return
cwords = os.environ['COMP_WORDS'].split()[1:]
cword = int(os.environ['COMP_CWORD'])
try:
current = cwords[cword-1]
except IndexError:
current = ''
commands = []
for k in cmdtable.keys():
commands += aliases_(k)
# command
if cword == 1:
print ' '.join(filter(lambda x: x.startswith(current), commands))
# command options
elif cwords[0] in commands:
options = []
aliases, (cmd, opts, usage) = findcmd(cwords[0], cmdtable)
for (short, long, default, help) in opts:
options.append('-%s' % short)
options.append('--%s' % long)
options = [o for o in options if o.startswith(current)]
print ' '.join(filter(lambda x: x.startswith(current), options))
sys.exit(1)
COMPLETIONS = {
'bash':
"""
# opster bash completion start
_opster_completion()
{
COMPREPLY=( $( COMP_WORDS="${COMP_WORDS[*]}" \\
COMP_CWORD=$COMP_CWORD \\
OPSTER_AUTO_COMPLETE=1 $1 ) )
}
complete -o default -F _opster_completion %s
# opster bash completion end
""",
'zsh':
"""
# opster zsh completion start
function _opster_completion {
local words cword
read -Ac words
read -cn cword
reply=( $( COMP_WORDS="$words[*]" \\
COMP_CWORD=$(( cword-1 )) \\
OPSTER_AUTO_COMPLETE=1 $words[1] ) )
}
compctl -K _opster_completion %s
# opster zsh completion end
"""
}
@command(name='_completion', hide=True)
def completion(type=('t', 'bash', 'Completion type (bash or zsh)')):
"""Outputs completion script for bash or zsh."""
prog_name = os.path.split(sys.argv[0])[1]
print COMPLETIONS[type] % prog_name
# --------
# Exceptions
# --------
# Command exceptions
class CommandException(Exception):
'Base class for command exceptions'
class AmbiguousCommand(CommandException):
'Raised if command is ambiguous'
class UnknownCommand(CommandException):
'Raised if command is unknown'
class ParseError(CommandException):
'Raised on error in command line parsing'
class Abort(CommandException):
'Abort execution'
class FOError(CommandException):
'Raised on trouble with opster configuration'
-437
@@ -1,437 +0,0 @@
r"""JSON (JavaScript Object Notation) <http://json.org> is a subset of
JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
interchange format.
:mod:`simplejson` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules. It is the externally maintained
version of the :mod:`json` library contained in Python 2.6, but maintains
compatibility with Python 2.4 and Python 2.5 and (currently) has
significant performance advantages, even without using the optional C
extension for speedups.
Encoding basic Python object hierarchies::
>>> import simplejson as json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print json.dumps("\"foo\bar")
"\"foo\bar"
>>> print json.dumps(u'\u1234')
"\u1234"
>>> print json.dumps('\\')
"\\"
>>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)
{"a": 0, "b": 0, "c": 0}
>>> from StringIO import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'
Compact encoding::
>>> import simplejson as json
>>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))
'[1,2,3,{"4":5,"6":7}]'
Pretty printing::
>>> import simplejson as json
>>> s = json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=' ')
>>> print '\n'.join([l.rstrip() for l in s.splitlines()])
{
"4": 5,
"6": 7
}
Decoding JSON::
>>> import simplejson as json
>>> obj = [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj
True
>>> json.loads('"\\"foo\\bar"') == u'"foo\x08ar'
True
>>> from StringIO import StringIO
>>> io = StringIO('["streaming API"]')
>>> json.load(io)[0] == 'streaming API'
True
Specializing JSON object decoding::
>>> import simplejson as json
>>> def as_complex(dct):
... if '__complex__' in dct:
... return complex(dct['real'], dct['imag'])
... return dct
...
>>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
... object_hook=as_complex)
(1+2j)
>>> from decimal import Decimal
>>> json.loads('1.1', parse_float=Decimal) == Decimal('1.1')
True
Specializing JSON object encoding::
>>> import simplejson as json
>>> def encode_complex(obj):
... if isinstance(obj, complex):
... return [obj.real, obj.imag]
... raise TypeError(repr(obj) + " is not JSON serializable")
...
>>> json.dumps(2 + 1j, default=encode_complex)
'[2.0, 1.0]'
>>> json.JSONEncoder(default=encode_complex).encode(2 + 1j)
'[2.0, 1.0]'
>>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j))
'[2.0, 1.0]'
Using simplejson.tool from the shell to validate and pretty-print::
$ echo '{"json":"obj"}' | python -m simplejson.tool
{
"json": "obj"
}
$ echo '{ 1.2:3.4}' | python -m simplejson.tool
Expecting property name: line 1 column 2 (char 2)
"""
__version__ = '2.1.1'
__all__ = [
'dump', 'dumps', 'load', 'loads',
'JSONDecoder', 'JSONDecodeError', 'JSONEncoder',
'OrderedDict',
]
__author__ = 'Bob Ippolito <bob@redivi.com>'
from decimal import Decimal
from decoder import JSONDecoder, JSONDecodeError
from encoder import JSONEncoder
def _import_OrderedDict():
import collections
try:
return collections.OrderedDict
except AttributeError:
import ordered_dict
return ordered_dict.OrderedDict
OrderedDict = _import_OrderedDict()
def _import_c_make_encoder():
try:
from simplejson._speedups import make_encoder
return make_encoder
except ImportError:
return None
_default_encoder = JSONEncoder(
skipkeys=False,
ensure_ascii=True,
check_circular=True,
allow_nan=True,
indent=None,
separators=None,
encoding='utf-8',
default=None,
use_decimal=False,
)
def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True,
allow_nan=True, cls=None, indent=None, separators=None,
encoding='utf-8', default=None, use_decimal=False, **kw):
"""Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
``.write()``-supporting file-like object).
If ``skipkeys`` is true then ``dict`` keys that are not basic types
(``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``)
will be skipped instead of raising a ``TypeError``.
If ``ensure_ascii`` is false, then some chunks written to ``fp``
may be ``unicode`` instances, subject to normal Python ``str`` to
``unicode`` coercion rules. Unless ``fp.write()`` explicitly
understands ``unicode`` (as in ``codecs.getwriter()``) this is likely
to cause an error.
If ``check_circular`` is false, then the circular reference check
for container types will be skipped and a circular reference will
result in an ``OverflowError`` (or worse).
If ``allow_nan`` is false, then it will be a ``ValueError`` to
serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``)
in strict compliance with the JSON specification, instead of using the
JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
If *indent* is a string, then JSON array elements and object members
will be pretty-printed with a newline followed by that string repeated
for each level of nesting. ``None`` (the default) selects the most compact
representation without any newlines. For backwards compatibility with
versions of simplejson earlier than 2.1.0, an integer is also accepted
and is converted to a string with that many spaces.
If ``separators`` is an ``(item_separator, dict_separator)`` tuple
then it will be used instead of the default ``(', ', ': ')`` separators.
``(',', ':')`` is the most compact JSON representation.
``encoding`` is the character encoding for str instances, default is UTF-8.
``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.
If *use_decimal* is true (default: ``False``) then decimal.Decimal
will be natively serialized to JSON with full precision.
To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
``.default()`` method to serialize additional types), specify it with
the ``cls`` kwarg.
"""
# cached encoder
if (not skipkeys and ensure_ascii and
check_circular and allow_nan and
cls is None and indent is None and separators is None and
encoding == 'utf-8' and default is None and not use_decimal
and not kw):
iterable = _default_encoder.iterencode(obj)
else:
if cls is None:
cls = JSONEncoder
iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
check_circular=check_circular, allow_nan=allow_nan, indent=indent,
separators=separators, encoding=encoding,
default=default, use_decimal=use_decimal, **kw).iterencode(obj)
# could accelerate with writelines in some versions of Python, at
# a debuggability cost
for chunk in iterable:
fp.write(chunk)
def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
allow_nan=True, cls=None, indent=None, separators=None,
encoding='utf-8', default=None, use_decimal=False, **kw):
"""Serialize ``obj`` to a JSON formatted ``str``.
If ``skipkeys`` is true then ``dict`` keys that are not basic types
(``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``)
will be skipped instead of raising a ``TypeError``.
If ``ensure_ascii`` is false, then the return value will be a
``unicode`` instance subject to normal Python ``str`` to ``unicode``
coercion rules instead of being escaped to an ASCII ``str``.
If ``check_circular`` is false, then the circular reference check
for container types will be skipped and a circular reference will
result in an ``OverflowError`` (or worse).
If ``allow_nan`` is false, then it will be a ``ValueError`` to
serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
strict compliance with the JSON specification, instead of using the
JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
If ``indent`` is a string, then JSON array elements and object members
will be pretty-printed with a newline followed by that string repeated
for each level of nesting. ``None`` (the default) selects the most compact
representation without any newlines. For backwards compatibility with
versions of simplejson earlier than 2.1.0, an integer is also accepted
and is converted to a string with that many spaces.
If ``separators`` is an ``(item_separator, dict_separator)`` tuple
then it will be used instead of the default ``(', ', ': ')`` separators.
``(',', ':')`` is the most compact JSON representation.
``encoding`` is the character encoding for str instances, default is UTF-8.
``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.
If *use_decimal* is true (default: ``False``) then decimal.Decimal
will be natively serialized to JSON with full precision.
To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
``.default()`` method to serialize additional types), specify it with
the ``cls`` kwarg.
"""
# cached encoder
if (not skipkeys and ensure_ascii and
check_circular and allow_nan and
cls is None and indent is None and separators is None and
encoding == 'utf-8' and default is None and not use_decimal
and not kw):
return _default_encoder.encode(obj)
if cls is None:
cls = JSONEncoder
return cls(
skipkeys=skipkeys, ensure_ascii=ensure_ascii,
check_circular=check_circular, allow_nan=allow_nan, indent=indent,
separators=separators, encoding=encoding, default=default,
use_decimal=use_decimal, **kw).encode(obj)
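The ``separators``, ``indent`` and ``default`` arguments described above can be tried out with a short Python 3 sketch. It uses the standard library ``json`` module as a stand-in, on the assumption that it behaves like ``simplejson`` for these arguments; ``use_decimal`` is specific to ``simplejson`` and is not shown.

```python
import json  # stdlib stand-in for simplejson (assumption, see above)

data = {'b': [1, 2], 'a': None}

# Most compact form: no whitespace after ',' or ':'.
compact = json.dumps(data, sort_keys=True, separators=(',', ':'))

# Pretty-printed form: one member per line, two spaces per nesting level.
pretty = json.dumps(data, sort_keys=True, indent=2)


def encode_complex(obj):
    # Called only for objects the encoder cannot serialize by itself.
    if isinstance(obj, complex):
        return [obj.real, obj.imag]
    raise TypeError(repr(obj) + ' is not JSON serializable')


encoded = json.dumps(2 + 1j, default=encode_complex)
```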
_default_decoder = JSONDecoder(encoding=None, object_hook=None,
object_pairs_hook=None)
def load(fp, encoding=None, cls=None, object_hook=None, parse_float=None,
parse_int=None, parse_constant=None, object_pairs_hook=None,
use_decimal=False, **kw):
"""Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
a JSON document) to a Python object.
*encoding* determines the encoding used to interpret any
:class:`str` objects decoded by this instance (``'utf-8'`` by
default). It has no effect when decoding :class:`unicode` objects.
Note that currently only encodings that are a superset of ASCII work;
strings of other encodings should be passed in as :class:`unicode`.
*object_hook*, if specified, will be called with the result of every
JSON object decoded and its return value will be used in place of the
given :class:`dict`. This can be used to provide custom
deserializations (e.g. to support JSON-RPC class hinting).
*object_pairs_hook* is an optional function that will be called with
the result of any object literal decode with an ordered list of pairs.
The return value of *object_pairs_hook* will be used instead of the
:class:`dict`. This feature can be used to implement custom decoders
that rely on the order that the key and value pairs are decoded (for
example, :func:`collections.OrderedDict` will remember the order of
insertion). If *object_hook* is also defined, the *object_pairs_hook*
takes priority.
*parse_float*, if specified, will be called with the string of every
JSON float to be decoded. By default, this is equivalent to
``float(num_str)``. This can be used to use another datatype or parser
for JSON floats (e.g. :class:`decimal.Decimal`).
*parse_int*, if specified, will be called with the string of every
JSON int to be decoded. By default, this is equivalent to
``int(num_str)``. This can be used to use another datatype or parser
for JSON integers (e.g. :class:`float`).
*parse_constant*, if specified, will be called with one of the
following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This
can be used to raise an exception if invalid JSON numbers are
encountered.
If *use_decimal* is true (default: ``False``) then it implies
parse_float=decimal.Decimal for parity with ``dump``.
To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
kwarg.
"""
return loads(fp.read(),
encoding=encoding, cls=cls, object_hook=object_hook,
parse_float=parse_float, parse_int=parse_int,
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook,
use_decimal=use_decimal, **kw)
def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
parse_int=None, parse_constant=None, object_pairs_hook=None,
use_decimal=False, **kw):
"""Deserialize ``s`` (a ``str`` or ``unicode`` instance containing a JSON
document) to a Python object.
*encoding* determines the encoding used to interpret any
:class:`str` objects decoded by this instance (``'utf-8'`` by
default). It has no effect when decoding :class:`unicode` objects.
Note that currently only encodings that are a superset of ASCII work;
strings of other encodings should be passed in as :class:`unicode`.
*object_hook*, if specified, will be called with the result of every
JSON object decoded and its return value will be used in place of the
given :class:`dict`. This can be used to provide custom
deserializations (e.g. to support JSON-RPC class hinting).
*object_pairs_hook* is an optional function that will be called with
the result of any object literal decode with an ordered list of pairs.
The return value of *object_pairs_hook* will be used instead of the
:class:`dict`. This feature can be used to implement custom decoders
that rely on the order that the key and value pairs are decoded (for
example, :func:`collections.OrderedDict` will remember the order of
insertion). If *object_hook* is also defined, the *object_pairs_hook*
takes priority.
*parse_float*, if specified, will be called with the string of every
JSON float to be decoded. By default, this is equivalent to
``float(num_str)``. This can be used to use another datatype or parser
for JSON floats (e.g. :class:`decimal.Decimal`).
*parse_int*, if specified, will be called with the string of every
JSON int to be decoded. By default, this is equivalent to
``int(num_str)``. This can be used to use another datatype or parser
for JSON integers (e.g. :class:`float`).
*parse_constant*, if specified, will be called with one of the
following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This
can be used to raise an exception if invalid JSON numbers are
encountered.
If *use_decimal* is true (default: ``False``) then it implies
parse_float=decimal.Decimal for parity with ``dump``.
To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
kwarg.
"""
if (cls is None and encoding is None and object_hook is None and
parse_int is None and parse_float is None and
parse_constant is None and object_pairs_hook is None
and not use_decimal and not kw):
return _default_decoder.decode(s)
if cls is None:
cls = JSONDecoder
if object_hook is not None:
kw['object_hook'] = object_hook
if object_pairs_hook is not None:
kw['object_pairs_hook'] = object_pairs_hook
if parse_float is not None:
kw['parse_float'] = parse_float
if parse_int is not None:
kw['parse_int'] = parse_int
if parse_constant is not None:
kw['parse_constant'] = parse_constant
if use_decimal:
if parse_float is not None:
raise TypeError("use_decimal=True implies parse_float=Decimal")
kw['parse_float'] = Decimal
return cls(encoding=encoding, **kw).decode(s)
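The decoding hooks documented for ``load`` and ``loads`` can be illustrated with a short Python 3 sketch. As an assumption, the standard library ``json`` module stands in for ``simplejson``; it accepts the same ``object_pairs_hook`` and ``parse_float`` arguments but has no ``use_decimal`` shortcut.

```python
import json  # stdlib stand-in for simplejson (assumption, see above)
from collections import OrderedDict
from decimal import Decimal

doc = '{"z": 1.10, "a": 2}'

# object_pairs_hook receives the (key, value) pairs in document order.
ordered = json.loads(doc, object_pairs_hook=OrderedDict)

# parse_float receives the literal text of each JSON float, so Decimal
# keeps the digits exactly as written.
exact = json.loads(doc, parse_float=Decimal)
```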
def _toggle_speedups(enabled):
import simplejson.decoder as dec
import simplejson.encoder as enc
import simplejson.scanner as scan
c_make_encoder = _import_c_make_encoder()
if enabled:
dec.scanstring = dec.c_scanstring or dec.py_scanstring
enc.c_make_encoder = c_make_encoder
enc.encode_basestring_ascii = (enc.c_encode_basestring_ascii or
enc.py_encode_basestring_ascii)
scan.make_scanner = scan.c_make_scanner or scan.py_make_scanner
else:
dec.scanstring = dec.py_scanstring
enc.c_make_encoder = None
enc.encode_basestring_ascii = enc.py_encode_basestring_ascii
scan.make_scanner = scan.py_make_scanner
dec.make_scanner = scan.make_scanner
global _default_decoder
_default_decoder = JSONDecoder(
encoding=None,
object_hook=None,
object_pairs_hook=None,
)
global _default_encoder
_default_encoder = JSONEncoder(
skipkeys=False,
ensure_ascii=True,
check_circular=True,
allow_nan=True,
indent=None,
separators=None,
encoding='utf-8',
default=None,
)
File diff suppressed because it is too large
-421
@@ -1,421 +0,0 @@
"""Implementation of JSONDecoder
"""
import re
import sys
import struct
from simplejson.scanner import make_scanner
def _import_c_scanstring():
try:
from simplejson._speedups import scanstring
return scanstring
except ImportError:
return None
c_scanstring = _import_c_scanstring()
__all__ = ['JSONDecoder']
FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
def _floatconstants():
_BYTES = '7FF80000000000007FF0000000000000'.decode('hex')
# The struct module in Python 2.4 would get frexp() out of range here
# when an endian is specified in the format string. Fixed in Python 2.5+
if sys.byteorder != 'big':
_BYTES = _BYTES[:8][::-1] + _BYTES[8:][::-1]
nan, inf = struct.unpack('dd', _BYTES)
return nan, inf, -inf
NaN, PosInf, NegInf = _floatconstants()
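``_floatconstants`` builds NaN and infinity from their IEEE 754 bit patterns and relies on the Python 2 ``'hex'`` codec. A rough Python 3 equivalent, given here as a sketch with the hypothetical name ``float_constants``, can state the byte order in the format string instead of reversing the bytes by hand.

```python
import math
import struct


def float_constants():
    # Two big-endian IEEE 754 doubles: a quiet NaN followed by +Infinity.
    raw = bytes.fromhex('7FF80000000000007FF0000000000000')
    nan, inf = struct.unpack('>dd', raw)
    return nan, inf, -inf


nan, pos_inf, neg_inf = float_constants()
```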
class JSONDecodeError(ValueError):
"""Subclass of ValueError with the following additional properties:
msg: The unformatted error message
doc: The JSON document being parsed
pos: The start index of doc where parsing failed
end: The end index of doc where parsing failed (may be None)
lineno: The line corresponding to pos
colno: The column corresponding to pos
endlineno: The line corresponding to end (may be None)
endcolno: The column corresponding to end (may be None)
"""
def __init__(self, msg, doc, pos, end=None):
ValueError.__init__(self, errmsg(msg, doc, pos, end=end))
self.msg = msg
self.doc = doc
self.pos = pos
self.end = end
self.lineno, self.colno = linecol(doc, pos)
if end is not None:
self.endlineno, self.endcolno = linecol(doc, end)
else:
self.endlineno, self.endcolno = None, None
def linecol(doc, pos):
lineno = doc.count('\n', 0, pos) + 1
if lineno == 1:
colno = pos
else:
colno = pos - doc.rindex('\n', 0, pos)
return lineno, colno
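A worked example makes the behaviour of ``linecol`` concrete. The function is repeated below with its indentation restored so that the sketch runs on its own under Python 3; the sample documents are made up for illustration.

```python
def linecol(doc, pos):
    # Line numbers are 1-based: count the newlines before pos.
    lineno = doc.count('\n', 0, pos) + 1
    if lineno == 1:
        colno = pos
    else:
        # Distance from the most recent newline to pos.
        colno = pos - doc.rindex('\n', 0, pos)
    return lineno, colno


single = '{"a" 1}'
multi = '{\n  "a" 1\n}'
first_line = linecol(single, single.index('1'))
second_line = linecol(multi, multi.index('1'))
```

As written, the column equals the character offset on the first line (so it starts at 0), while on later lines it is counted from the preceding newline (so it starts at 1).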
def errmsg(msg, doc, pos, end=None):
# Note that this function is called from _speedups
lineno, colno = linecol(doc, pos)
if end is None:
#fmt = '{0}: line {1} column {2} (char {3})'
#return fmt.format(msg, lineno, colno, pos)
fmt = '%s: line %d column %d (char %d)'
return fmt % (msg, lineno, colno, pos)
endlineno, endcolno = linecol(doc, end)
#fmt = '{0}: line {1} column {2} - line {3} column {4} (char {5} - {6})'
#return fmt.format(msg, lineno, colno, endlineno, endcolno, pos, end)
fmt = '%s: line %d column %d - line %d column %d (char %d - %d)'
return fmt % (msg, lineno, colno, endlineno, endcolno, pos, end)
_CONSTANTS = {
'-Infinity': NegInf,
'Infinity': PosInf,
'NaN': NaN,
}
STRINGCHUNK = re.compile(r'(.*?)(["\\\x00-\x1f])', FLAGS)
BACKSLASH = {
'"': u'"', '\\': u'\\', '/': u'/',
'b': u'\b', 'f': u'\f', 'n': u'\n', 'r': u'\r', 't': u'\t',
}
DEFAULT_ENCODING = "utf-8"
def py_scanstring(s, end, encoding=None, strict=True,
_b=BACKSLASH, _m=STRINGCHUNK.match):
"""Scan the string s for a JSON string. End is the index of the
character in s after the quote that started the JSON string.
Unescapes all valid JSON string escape sequences and raises ValueError
on attempt to decode an invalid string. If strict is False then literal
control characters are allowed in the string.
Returns a tuple of the decoded string and the index of the character in s
after the end quote."""
if encoding is None:
encoding = DEFAULT_ENCODING
chunks = []
_append = chunks.append
begin = end - 1
while 1:
chunk = _m(s, end)
if chunk is None:
raise JSONDecodeError(
"Unterminated string starting at", s, begin)
end = chunk.end()
content, terminator = chunk.groups()
# Content contains zero or more unescaped string characters
if content:
if not isinstance(content, unicode):
content = unicode(content, encoding)
_append(content)
# Terminator is the end of string, a literal control character,
# or a backslash denoting that an escape sequence follows
if terminator == '"':
break
elif terminator != '\\':
if strict:
msg = "Invalid control character %r at" % (terminator,)
#msg = "Invalid control character {0!r} at".format(terminator)
raise JSONDecodeError(msg, s, end)
else:
_append(terminator)
continue
try:
esc = s[end]
except IndexError:
raise JSONDecodeError(
"Unterminated string starting at", s, begin)
# If not a unicode escape sequence, must be in the lookup table
if esc != 'u':
try:
char = _b[esc]
except KeyError:
msg = "Invalid \\escape: " + repr(esc)
raise JSONDecodeError(msg, s, end)
end += 1
else:
# Unicode escape sequence
esc = s[end + 1:end + 5]
next_end = end + 5
if len(esc) != 4:
msg = "Invalid \\uXXXX escape"
raise JSONDecodeError(msg, s, end)
uni = int(esc, 16)
# Check for surrogate pair on UCS-4 systems
if 0xd800 <= uni <= 0xdbff and sys.maxunicode > 65535:
msg = "Invalid \\uXXXX\\uXXXX surrogate pair"
if not s[end + 5:end + 7] == '\\u':
raise JSONDecodeError(msg, s, end)
esc2 = s[end + 7:end + 11]
if len(esc2) != 4:
raise JSONDecodeError(msg, s, end)
uni2 = int(esc2, 16)
uni = 0x10000 + (((uni - 0xd800) << 10) | (uni2 - 0xdc00))
next_end += 6
char = unichr(uni)
end = next_end
# Append the unescaped character
_append(char)
return u''.join(chunks), end
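The surrogate-pair arithmetic in ``py_scanstring`` can be checked with a small worked example. This Python 3 sketch uses a hypothetical helper, ``combine_surrogates``, and the standard library ``json`` decoder as a cross-check.

```python
import json


def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    # The high surrogate carries the upper 10 bits and the low surrogate
    # the lower 10 bits of the offset above U+10000.
    return 0x10000 + (((high - 0xD800) << 10) | (low - 0xDC00))


code_point = combine_surrogates(0xD83D, 0xDE00)
decoded = json.loads('"\\ud83d\\ude00"')
```

For the pair ``\ud83d\ude00`` the offset is ``(0x3D << 10) | 0x200 = 0xF600``, which gives the code point ``0x1F600``.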
# Use speedup if available
scanstring = c_scanstring or py_scanstring
WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)
WHITESPACE_STR = ' \t\n\r'
def JSONObject((s, end), encoding, strict, scan_once, object_hook,
object_pairs_hook, memo=None,
_w=WHITESPACE.match, _ws=WHITESPACE_STR):
# Backwards compatibility
if memo is None:
memo = {}
memo_get = memo.setdefault
pairs = []
# Use a slice to prevent IndexError from being raised, the following
# check will raise a more specific ValueError if the string is empty
nextchar = s[end:end + 1]
# Normally we expect nextchar == '"'
if nextchar != '"':
if nextchar in _ws:
end = _w(s, end).end()
nextchar = s[end:end + 1]
# Trivial empty object
if nextchar == '}':
if object_pairs_hook is not None:
result = object_pairs_hook(pairs)
return result, end + 1
pairs = {}
if object_hook is not None:
pairs = object_hook(pairs)
return pairs, end + 1
elif nextchar != '"':
raise JSONDecodeError("Expecting property name", s, end)
end += 1
while True:
key, end = scanstring(s, end, encoding, strict)
key = memo_get(key, key)
# To skip some function call overhead we optimize the fast paths where
# the JSON key separator is ": " or just ":".
if s[end:end + 1] != ':':
end = _w(s, end).end()
if s[end:end + 1] != ':':
raise JSONDecodeError("Expecting : delimiter", s, end)
end += 1
try:
if s[end] in _ws:
end += 1
if s[end] in _ws:
end = _w(s, end + 1).end()
except IndexError:
pass
try:
value, end = scan_once(s, end)
except StopIteration:
raise JSONDecodeError("Expecting object", s, end)
pairs.append((key, value))
try:
nextchar = s[end]
if nextchar in _ws:
end = _w(s, end + 1).end()
nextchar = s[end]
except IndexError:
nextchar = ''
end += 1
if nextchar == '}':
break
elif nextchar != ',':
raise JSONDecodeError("Expecting , delimiter", s, end - 1)
try:
nextchar = s[end]
if nextchar in _ws:
end += 1
nextchar = s[end]
if nextchar in _ws:
end = _w(s, end + 1).end()
nextchar = s[end]
except IndexError:
nextchar = ''
end += 1
if nextchar != '"':
raise JSONDecodeError("Expecting property name", s, end - 1)
if object_pairs_hook is not None:
result = object_pairs_hook(pairs)
return result, end
pairs = dict(pairs)
if object_hook is not None:
pairs = object_hook(pairs)
return pairs, end
def JSONArray((s, end), scan_once, _w=WHITESPACE.match, _ws=WHITESPACE_STR):
values = []
nextchar = s[end:end + 1]
if nextchar in _ws:
end = _w(s, end + 1).end()
nextchar = s[end:end + 1]
# Look-ahead for trivial empty array
if nextchar == ']':
return values, end + 1
_append = values.append
while True:
try:
value, end = scan_once(s, end)
except StopIteration:
raise JSONDecodeError("Expecting object", s, end)
_append(value)
nextchar = s[end:end + 1]
if nextchar in _ws:
end = _w(s, end + 1).end()
nextchar = s[end:end + 1]
end += 1
if nextchar == ']':
break
elif nextchar != ',':
raise JSONDecodeError("Expecting , delimiter", s, end)
try:
if s[end] in _ws:
end += 1
if s[end] in _ws:
end = _w(s, end + 1).end()
except IndexError:
pass
return values, end
class JSONDecoder(object):
"""Simple JSON <http://json.org> decoder
Performs the following translations in decoding by default:
+---------------+-------------------+
| JSON | Python |
+===============+===================+
| object | dict |
+---------------+-------------------+
| array | list |
+---------------+-------------------+
| string | unicode |
+---------------+-------------------+
| number (int) | int, long |
+---------------+-------------------+
| number (real) | float |
+---------------+-------------------+
| true | True |
+---------------+-------------------+
| false | False |
+---------------+-------------------+
| null | None |
+---------------+-------------------+
It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as
their corresponding ``float`` values, which is outside the JSON spec.
"""
def __init__(self, encoding=None, object_hook=None, parse_float=None,
parse_int=None, parse_constant=None, strict=True,
object_pairs_hook=None):
"""
*encoding* determines the encoding used to interpret any
:class:`str` objects decoded by this instance (``'utf-8'`` by
default). It has no effect when decoding :class:`unicode` objects.
Note that currently only encodings that are a superset of ASCII work;
strings in other encodings should be passed in as :class:`unicode`.
*object_hook*, if specified, will be called with the result of every
JSON object decoded and its return value will be used in place of the
given :class:`dict`. This can be used to provide custom
deserializations (e.g. to support JSON-RPC class hinting).
*object_pairs_hook* is an optional function that will be called with
the result of any object literal decode with an ordered list of pairs.
The return value of *object_pairs_hook* will be used instead of the
:class:`dict`. This feature can be used to implement custom decoders
that rely on the order that the key and value pairs are decoded (for
example, :func:`collections.OrderedDict` will remember the order of
insertion). If *object_hook* is also defined, the *object_pairs_hook*
takes priority.
*parse_float*, if specified, will be called with the string of every
JSON float to be decoded. By default, this is equivalent to
``float(num_str)``. This can be used to use another datatype or parser
for JSON floats (e.g. :class:`decimal.Decimal`).
*parse_int*, if specified, will be called with the string of every
JSON int to be decoded. By default, this is equivalent to
``int(num_str)``. This can be used to use another datatype or parser
for JSON integers (e.g. :class:`float`).
*parse_constant*, if specified, will be called with one of the
following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This
can be used to raise an exception if invalid JSON numbers are
encountered.
*strict* controls the parser's behavior when it encounters an
invalid control character in a string. The default setting of
``True`` means that unescaped control characters are parse errors. If
``False``, control characters will be allowed in strings.
"""
self.encoding = encoding
self.object_hook = object_hook
self.object_pairs_hook = object_pairs_hook
self.parse_float = parse_float or float
self.parse_int = parse_int or int
self.parse_constant = parse_constant or _CONSTANTS.__getitem__
self.strict = strict
self.parse_object = JSONObject
self.parse_array = JSONArray
self.parse_string = scanstring
self.memo = {}
self.scan_once = make_scanner(self)
def decode(self, s, _w=WHITESPACE.match):
"""Return the Python representation of ``s`` (a ``str`` or ``unicode``
instance containing a JSON document)
"""
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
end = _w(s, end).end()
if end != len(s):
raise JSONDecodeError("Extra data", s, end, len(s))
return obj
def raw_decode(self, s, idx=0):
"""Decode a JSON document from ``s`` (a ``str`` or ``unicode``
beginning with a JSON document) and return a 2-tuple of the Python
representation and the index in ``s`` where the document ended.
This can be used to decode a JSON document from a string that may
have extraneous data at the end.
"""
try:
obj, end = self.scan_once(s, idx)
except StopIteration:
raise JSONDecodeError("No JSON object could be decoded", s, idx)
return obj, end
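The hooks documented in the decoder docstrings above can be tried without the vendored copy. The following is a minimal sketch, assuming Python 3 and the standard library `json` module, which accepts the same `object_pairs_hook` and `parse_float` keyword arguments and provides `raw_decode`; it is not the removed simplejson code itself, and the sample documents are made up for illustration.

```python
import json
from decimal import Decimal

doc = '{"b": 1.10, "a": 2}'

# object_pairs_hook receives the (key, value) pairs in document order,
# so returning the argument unchanged yields a list of tuples.
pairs = json.loads(doc, object_pairs_hook=lambda p: p)

# parse_float is called with the literal text of every JSON float,
# which lets Decimal keep the trailing zero that float() would drop.
exact = json.loads(doc, parse_float=Decimal)

# raw_decode returns the decoded value and the index where the document
# ended, so trailing data does not raise an "Extra data" error.
value, end = json.JSONDecoder().raw_decode('[1, 2] trailing')
```

Passing `collections.OrderedDict` as `object_pairs_hook` is the usual way to keep key order on interpreters where plain dicts do not preserve it.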
@@ -1,501 +0,0 @@
"""Implementation of JSONEncoder
"""
import re
from decimal import Decimal
def _import_speedups():
try:
from simplejson import _speedups
return _speedups.encode_basestring_ascii, _speedups.make_encoder
except ImportError:
return None, None
c_encode_basestring_ascii, c_make_encoder = _import_speedups()
from simplejson.decoder import PosInf
ESCAPE = re.compile(r'[\x00-\x1f\\"\b\f\n\r\t]')
ESCAPE_ASCII = re.compile(r'([\\"]|[^\ -~])')
HAS_UTF8 = re.compile(r'[\x80-\xff]')
ESCAPE_DCT = {
'\\': '\\\\',
'"': '\\"',
'\b': '\\b',
'\f': '\\f',
'\n': '\\n',
'\r': '\\r',
'\t': '\\t',
}
for i in range(0x20):
#ESCAPE_DCT.setdefault(chr(i), '\\u{0:04x}'.format(i))
ESCAPE_DCT.setdefault(chr(i), '\\u%04x' % (i,))
FLOAT_REPR = repr
def encode_basestring(s):
"""Return a JSON representation of a Python string
"""
if isinstance(s, str) and HAS_UTF8.search(s) is not None:
s = s.decode('utf-8')
def replace(match):
return ESCAPE_DCT[match.group(0)]
return u'"' + ESCAPE.sub(replace, s) + u'"'
def py_encode_basestring_ascii(s):
"""Return an ASCII-only JSON representation of a Python string
"""
if isinstance(s, str) and HAS_UTF8.search(s) is not None:
s = s.decode('utf-8')
def replace(match):
s = match.group(0)
try:
return ESCAPE_DCT[s]
except KeyError:
n = ord(s)
if n < 0x10000:
#return '\\u{0:04x}'.format(n)
return '\\u%04x' % (n,)
else:
# surrogate pair
n -= 0x10000
s1 = 0xd800 | ((n >> 10) & 0x3ff)
s2 = 0xdc00 | (n & 0x3ff)
#return '\\u{0:04x}\\u{1:04x}'.format(s1, s2)
return '\\u%04x\\u%04x' % (s1, s2)
return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
encode_basestring_ascii = (
c_encode_basestring_ascii or py_encode_basestring_ascii)
class JSONEncoder(object):
"""Extensible JSON <http://json.org> encoder for Python data structures.
Supports the following objects and types by default:
+-------------------+---------------+
| Python | JSON |
+===================+===============+
| dict | object |
+-------------------+---------------+
| list, tuple | array |
+-------------------+---------------+
| str, unicode | string |
+-------------------+---------------+
| int, long, float | number |
+-------------------+---------------+
| True | true |
+-------------------+---------------+
| False | false |
+-------------------+---------------+
| None | null |
+-------------------+---------------+
To extend this to recognize other objects, subclass and implement a
``.default()`` method that returns a serializable object for ``o`` if
possible; otherwise it should call the superclass implementation
(to raise ``TypeError``).
"""
item_separator = ', '
key_separator = ': '
def __init__(self, skipkeys=False, ensure_ascii=True,
check_circular=True, allow_nan=True, sort_keys=False,
indent=None, separators=None, encoding='utf-8', default=None,
use_decimal=False):
"""Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt
encoding of keys that are not str, int, long, float or None. If
skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str
objects with all incoming unicode characters escaped. If
ensure_ascii is false, the output will be a unicode object.
If check_circular is true, then lists, dicts, and custom encoded
objects will be checked for circular references during encoding to
prevent an infinite recursion (which would cause an OverflowError).
Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be
encoded as such. This behavior is not JSON specification compliant,
but is consistent with most JavaScript based encoders and decoders.
Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be
sorted by key; this is useful for regression tests to ensure
that JSON serializations can be compared on a day-to-day basis.
If indent is a string, then JSON array elements and object members
will be pretty-printed with a newline followed by that string repeated
for each level of nesting. ``None`` (the default) selects the most compact
representation without any newlines. For backwards compatibility with
versions of simplejson earlier than 2.1.0, an integer is also accepted
and is converted to a string with that many spaces.
If specified, separators should be a (item_separator, key_separator)
tuple. The default is (', ', ': '). To get the most compact JSON
representation you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects
that can't otherwise be serialized. It should return a JSON encodable
version of the object or raise a ``TypeError``.
If encoding is not None, then all input strings will be
transformed into unicode using that encoding prior to JSON-encoding.
The default is UTF-8.
If use_decimal is true (not the default), ``decimal.Decimal`` will
be supported directly by the encoder. For the inverse, decode JSON
with ``parse_float=decimal.Decimal``.
"""
self.skipkeys = skipkeys
self.ensure_ascii = ensure_ascii
self.check_circular = check_circular
self.allow_nan = allow_nan
self.sort_keys = sort_keys
self.use_decimal = use_decimal
if isinstance(indent, (int, long)):
indent = ' ' * indent
self.indent = indent
if separators is not None:
self.item_separator, self.key_separator = separators
if default is not None:
self.default = default
self.encoding = encoding
def default(self, o):
"""Implement this method in a subclass such that it returns
a serializable object for ``o``, or calls the base implementation
(to raise a ``TypeError``).
For example, to support arbitrary iterators, you could
implement default like this::
def default(self, o):
try:
iterable = iter(o)
except TypeError:
pass
else:
return list(iterable)
return JSONEncoder.default(self, o)
"""
raise TypeError(repr(o) + " is not JSON serializable")
def encode(self, o):
"""Return a JSON string representation of a Python data structure.
>>> from simplejson import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
"""
# This is for extremely simple cases and benchmarks.
if isinstance(o, basestring):
if isinstance(o, str):
_encoding = self.encoding
if (_encoding is not None
and not (_encoding == 'utf-8')):
o = o.decode(_encoding)
if self.ensure_ascii:
return encode_basestring_ascii(o)
else:
return encode_basestring(o)
# This doesn't pass the iterator directly to ''.join() because the
# exceptions aren't as detailed. The list call should be roughly
# equivalent to the PySequence_Fast that ''.join() would do.
chunks = self.iterencode(o, _one_shot=True)
if not isinstance(chunks, (list, tuple)):
chunks = list(chunks)
if self.ensure_ascii:
return ''.join(chunks)
else:
return u''.join(chunks)
def iterencode(self, o, _one_shot=False):
"""Encode the given object and yield each string
representation as available.
For example::
for chunk in JSONEncoder().iterencode(bigobject):
mysocket.write(chunk)
"""
if self.check_circular:
markers = {}
else:
markers = None
if self.ensure_ascii:
_encoder = encode_basestring_ascii
else:
_encoder = encode_basestring
if self.encoding != 'utf-8':
def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding):
if isinstance(o, str):
o = o.decode(_encoding)
return _orig_encoder(o)
def floatstr(o, allow_nan=self.allow_nan,
_repr=FLOAT_REPR, _inf=PosInf, _neginf=-PosInf):
# Check for specials. Note that this type of test is processor
# and/or platform-specific, so do tests which don't depend on
# the internals.
if o != o:
text = 'NaN'
elif o == _inf:
text = 'Infinity'
elif o == _neginf:
text = '-Infinity'
else:
return _repr(o)
if not allow_nan:
raise ValueError(
"Out of range float values are not JSON compliant: " +
repr(o))
return text
key_memo = {}
if (_one_shot and c_make_encoder is not None
and not self.indent and not self.sort_keys):
_iterencode = c_make_encoder(
markers, self.default, _encoder, self.indent,
self.key_separator, self.item_separator, self.sort_keys,
self.skipkeys, self.allow_nan, key_memo, self.use_decimal)
else:
_iterencode = _make_iterencode(
markers, self.default, _encoder, self.indent, floatstr,
self.key_separator, self.item_separator, self.sort_keys,
self.skipkeys, _one_shot, self.use_decimal)
try:
return _iterencode(o, 0)
finally:
key_memo.clear()
class JSONEncoderForHTML(JSONEncoder):
"""An encoder that produces JSON safe to embed in HTML.
To embed JSON content in, say, a script tag on a web page, the
characters &, < and > should be escaped. They cannot be escaped
with the usual entities (e.g. &amp;) because they are not expanded
within <script> tags.
"""
def encode(self, o):
# Override JSONEncoder.encode because it has hacks for
# performance that make things more complicated.
chunks = self.iterencode(o, True)
if self.ensure_ascii:
return ''.join(chunks)
else:
return u''.join(chunks)
def iterencode(self, o, _one_shot=False):
chunks = super(JSONEncoderForHTML, self).iterencode(o, _one_shot)
for chunk in chunks:
chunk = chunk.replace('&', '\\u0026')
chunk = chunk.replace('<', '\\u003c')
chunk = chunk.replace('>', '\\u003e')
yield chunk
def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
_key_separator, _item_separator, _sort_keys, _skipkeys, _one_shot,
_use_decimal,
## HACK: hand-optimized bytecode; turn globals into locals
False=False,
True=True,
ValueError=ValueError,
basestring=basestring,
Decimal=Decimal,
dict=dict,
float=float,
id=id,
int=int,
isinstance=isinstance,
list=list,
long=long,
str=str,
tuple=tuple,
):
def _iterencode_list(lst, _current_indent_level):
if not lst:
yield '[]'
return
if markers is not None:
markerid = id(lst)
if markerid in markers:
raise ValueError("Circular reference detected")
markers[markerid] = lst
buf = '['
if _indent is not None:
_current_indent_level += 1
newline_indent = '\n' + (_indent * _current_indent_level)
separator = _item_separator + newline_indent
buf += newline_indent
else:
newline_indent = None
separator = _item_separator
first = True
for value in lst:
if first:
first = False
else:
buf = separator
if isinstance(value, basestring):
yield buf + _encoder(value)
elif value is None:
yield buf + 'null'
elif value is True:
yield buf + 'true'
elif value is False:
yield buf + 'false'
elif isinstance(value, (int, long)):
yield buf + str(value)
elif isinstance(value, float):
yield buf + _floatstr(value)
elif _use_decimal and isinstance(value, Decimal):
yield buf + str(value)
else:
yield buf
if isinstance(value, (list, tuple)):
chunks = _iterencode_list(value, _current_indent_level)
elif isinstance(value, dict):
chunks = _iterencode_dict(value, _current_indent_level)
else:
chunks = _iterencode(value, _current_indent_level)
for chunk in chunks:
yield chunk
if newline_indent is not None:
_current_indent_level -= 1
yield '\n' + (_indent * _current_indent_level)
yield ']'
if markers is not None:
del markers[markerid]
def _iterencode_dict(dct, _current_indent_level):
if not dct:
yield '{}'
return
if markers is not None:
markerid = id(dct)
if markerid in markers:
raise ValueError("Circular reference detected")
markers[markerid] = dct
yield '{'
if _indent is not None:
_current_indent_level += 1
newline_indent = '\n' + (_indent * _current_indent_level)
item_separator = _item_separator + newline_indent
yield newline_indent
else:
newline_indent = None
item_separator = _item_separator
first = True
if _sort_keys:
items = dct.items()
items.sort(key=lambda kv: kv[0])
else:
items = dct.iteritems()
for key, value in items:
if isinstance(key, basestring):
pass
# JavaScript is weakly typed for these, so it makes sense to
# also allow them. Many encoders seem to do something like this.
elif isinstance(key, float):
key = _floatstr(key)
elif key is True:
key = 'true'
elif key is False:
key = 'false'
elif key is None:
key = 'null'
elif isinstance(key, (int, long)):
key = str(key)
elif _skipkeys:
continue
else:
raise TypeError("key " + repr(key) + " is not a string")
if first:
first = False
else:
yield item_separator
yield _encoder(key)
yield _key_separator
if isinstance(value, basestring):
yield _encoder(value)
elif value is None:
yield 'null'
elif value is True:
yield 'true'
elif value is False:
yield 'false'
elif isinstance(value, (int, long)):
yield str(value)
elif isinstance(value, float):
yield _floatstr(value)
elif _use_decimal and isinstance(value, Decimal):
yield str(value)
else:
if isinstance(value, (list, tuple)):
chunks = _iterencode_list(value, _current_indent_level)
elif isinstance(value, dict):
chunks = _iterencode_dict(value, _current_indent_level)
else:
chunks = _iterencode(value, _current_indent_level)
for chunk in chunks:
yield chunk
if newline_indent is not None:
_current_indent_level -= 1
yield '\n' + (_indent * _current_indent_level)
yield '}'
if markers is not None:
del markers[markerid]
def _iterencode(o, _current_indent_level):
if isinstance(o, basestring):
yield _encoder(o)
elif o is None:
yield 'null'
elif o is True:
yield 'true'
elif o is False:
yield 'false'
elif isinstance(o, (int, long)):
yield str(o)
elif isinstance(o, float):
yield _floatstr(o)
elif isinstance(o, (list, tuple)):
for chunk in _iterencode_list(o, _current_indent_level):
yield chunk
elif isinstance(o, dict):
for chunk in _iterencode_dict(o, _current_indent_level):
yield chunk
elif _use_decimal and isinstance(o, Decimal):
yield str(o)
else:
if markers is not None:
markerid = id(o)
if markerid in markers:
raise ValueError("Circular reference detected")
markers[markerid] = o
o = _default(o)
for chunk in _iterencode(o, _current_indent_level):
yield chunk
if markers is not None:
del markers[markerid]
return _iterencode
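Two of the extension points in the encoder above are the `default` callback and the chunk rewriting done by `JSONEncoderForHTML`. The sketch below reproduces both ideas, assuming Python 3 and the standard library `json` module rather than the removed simplejson copy. `HTMLSafeEncoder` and `encode_set` are hypothetical names introduced only for this illustration.

```python
import json


class HTMLSafeEncoder(json.JSONEncoder):
    """Hypothetical encoder that escapes &, < and > as \\uXXXX."""

    def encode(self, o):
        # The base encode() has a fast path for plain strings that skips
        # iterencode(), so route everything through iterencode() instead.
        return ''.join(self.iterencode(o, True))

    def iterencode(self, o, _one_shot=False):
        for chunk in super().iterencode(o, _one_shot):
            yield (chunk.replace('&', '\\u0026')
                        .replace('<', '\\u003c')
                        .replace('>', '\\u003e'))


def encode_set(o):
    # default hook: return something serializable or raise TypeError.
    if isinstance(o, (set, frozenset)):
        return sorted(o)
    raise TypeError(repr(o) + " is not JSON serializable")


safe = HTMLSafeEncoder().encode('</script>')
sets = json.dumps({"tags": {"b", "a"}}, default=encode_set)
```

Because the escapes are ordinary JSON `\uXXXX` sequences, any JSON parser turns `safe` back into the original string.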
@@ -1,119 +0,0 @@
"""Drop-in replacement for collections.OrderedDict by Raymond Hettinger
http://code.activestate.com/recipes/576693/
"""
from UserDict import DictMixin
# Modified from original to support Python 2.4, see
# http://code.google.com/p/simplejson/issues/detail?id=53
try:
all
except NameError:
def all(seq):
for elem in seq:
if not elem:
return False
return True
class OrderedDict(dict, DictMixin):
def __init__(self, *args, **kwds):
if len(args) > 1:
raise TypeError('expected at most 1 arguments, got %d' % len(args))
try:
self.__end
except AttributeError:
self.clear()
self.update(*args, **kwds)
def clear(self):
self.__end = end = []
end += [None, end, end] # sentinel node for doubly linked list
self.__map = {} # key --> [key, prev, next]
dict.clear(self)
def __setitem__(self, key, value):
if key not in self:
end = self.__end
curr = end[1]
curr[2] = end[1] = self.__map[key] = [key, curr, end]
dict.__setitem__(self, key, value)
def __delitem__(self, key):
dict.__delitem__(self, key)
key, prev, next = self.__map.pop(key)
prev[2] = next
next[1] = prev
def __iter__(self):
end = self.__end
curr = end[2]
while curr is not end:
yield curr[0]
curr = curr[2]
def __reversed__(self):
end = self.__end
curr = end[1]
while curr is not end:
yield curr[0]
curr = curr[1]
def popitem(self, last=True):
if not self:
raise KeyError('dictionary is empty')
# Modified from original to support Python 2.4, see
# http://code.google.com/p/simplejson/issues/detail?id=53
if last:
key = reversed(self).next()
else:
key = iter(self).next()
value = self.pop(key)
return key, value
def __reduce__(self):
items = [[k, self[k]] for k in self]
tmp = self.__map, self.__end
del self.__map, self.__end
inst_dict = vars(self).copy()
self.__map, self.__end = tmp
if inst_dict:
return (self.__class__, (items,), inst_dict)
return self.__class__, (items,)
def keys(self):
return list(self)
setdefault = DictMixin.setdefault
update = DictMixin.update
pop = DictMixin.pop
values = DictMixin.values
items = DictMixin.items
iterkeys = DictMixin.iterkeys
itervalues = DictMixin.itervalues
iteritems = DictMixin.iteritems
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, self.items())
def copy(self):
return self.__class__(self)
@classmethod
def fromkeys(cls, iterable, value=None):
d = cls()
for key in iterable:
d[key] = value
return d
def __eq__(self, other):
if isinstance(other, OrderedDict):
return len(self)==len(other) and \
all(p==q for p, q in zip(self.items(), other.items()))
return dict.__eq__(self, other)
def __ne__(self, other):
return not self == other
@@ -1,77 +0,0 @@
"""JSON token scanner
"""
import re
def _import_c_make_scanner():
try:
from simplejson._speedups import make_scanner
return make_scanner
except ImportError:
return None
c_make_scanner = _import_c_make_scanner()
__all__ = ['make_scanner']
NUMBER_RE = re.compile(
r'(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?',
(re.VERBOSE | re.MULTILINE | re.DOTALL))
def py_make_scanner(context):
parse_object = context.parse_object
parse_array = context.parse_array
parse_string = context.parse_string
match_number = NUMBER_RE.match
encoding = context.encoding
strict = context.strict
parse_float = context.parse_float
parse_int = context.parse_int
parse_constant = context.parse_constant
object_hook = context.object_hook
object_pairs_hook = context.object_pairs_hook
memo = context.memo
def _scan_once(string, idx):
try:
nextchar = string[idx]
except IndexError:
raise StopIteration
if nextchar == '"':
return parse_string(string, idx + 1, encoding, strict)
elif nextchar == '{':
return parse_object((string, idx + 1), encoding, strict,
_scan_once, object_hook, object_pairs_hook, memo)
elif nextchar == '[':
return parse_array((string, idx + 1), _scan_once)
elif nextchar == 'n' and string[idx:idx + 4] == 'null':
return None, idx + 4
elif nextchar == 't' and string[idx:idx + 4] == 'true':
return True, idx + 4
elif nextchar == 'f' and string[idx:idx + 5] == 'false':
return False, idx + 5
m = match_number(string, idx)
if m is not None:
integer, frac, exp = m.groups()
if frac or exp:
res = parse_float(integer + (frac or '') + (exp or ''))
else:
res = parse_int(integer)
return res, m.end()
elif nextchar == 'N' and string[idx:idx + 3] == 'NaN':
return parse_constant('NaN'), idx + 3
elif nextchar == 'I' and string[idx:idx + 8] == 'Infinity':
return parse_constant('Infinity'), idx + 8
elif nextchar == '-' and string[idx:idx + 9] == '-Infinity':
return parse_constant('-Infinity'), idx + 9
else:
raise StopIteration
def scan_once(string, idx):
try:
return _scan_once(string, idx)
finally:
memo.clear()
return scan_once
make_scanner = c_make_scanner or py_make_scanner
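The scanner above decides between `parse_int` and `parse_float` from the groups captured by `NUMBER_RE`. The following is a minimal sketch of that one step in isolation, assuming Python 3. `scan_number` is a hypothetical helper written for this illustration, not a function from the removed module.

```python
import re

# Same pattern as the scanner's NUMBER_RE, without the extra flags.
NUMBER_RE = re.compile(r'(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?')


def scan_number(text, idx=0, parse_int=int, parse_float=float):
    """Return (value, end_index), or None if no number starts at idx."""
    m = NUMBER_RE.match(text, idx)
    if m is None:
        return None
    integer, frac, exp = m.groups()
    if frac or exp:
        # A fraction or an exponent makes the literal a float.
        return parse_float(integer + (frac or '') + (exp or '')), m.end()
    return parse_int(integer), m.end()
```

Swapping in `decimal.Decimal` for `parse_float` changes only the constructor called on the matched text, which is why the decoder can expose it as a plain keyword argument.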
@@ -1,63 +0,0 @@
import unittest
import doctest
class OptionalExtensionTestSuite(unittest.TestSuite):
def run(self, result):
import simplejson
run = unittest.TestSuite.run
run(self, result)
simplejson._toggle_speedups(False)
run(self, result)
simplejson._toggle_speedups(True)
return result
def additional_tests(suite=None):
import simplejson
import simplejson.encoder
import simplejson.decoder
if suite is None:
suite = unittest.TestSuite()
for mod in (simplejson, simplejson.encoder, simplejson.decoder):
suite.addTest(doctest.DocTestSuite(mod))
suite.addTest(doctest.DocFileSuite('../../index.rst'))
return suite
def all_tests_suite():
suite = unittest.TestLoader().loadTestsFromNames([
'simplejson.tests.test_check_circular',
'simplejson.tests.test_decode',
'simplejson.tests.test_default',
'simplejson.tests.test_dump',
'simplejson.tests.test_encode_basestring_ascii',
'simplejson.tests.test_encode_for_html',
'simplejson.tests.test_fail',
'simplejson.tests.test_float',
'simplejson.tests.test_indent',
'simplejson.tests.test_pass1',
'simplejson.tests.test_pass2',
'simplejson.tests.test_pass3',
'simplejson.tests.test_recursion',
'simplejson.tests.test_scanstring',
'simplejson.tests.test_separators',
'simplejson.tests.test_speedups',
'simplejson.tests.test_unicode',
'simplejson.tests.test_decimal',
])
suite = additional_tests(suite)
return OptionalExtensionTestSuite([suite])
def main():
runner = unittest.TextTestRunner()
suite = all_tests_suite()
runner.run(suite)
if __name__ == '__main__':
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
main()
@@ -1,30 +0,0 @@
from unittest import TestCase
import simplejson as json
def default_iterable(obj):
return list(obj)
class TestCheckCircular(TestCase):
def test_circular_dict(self):
dct = {}
dct['a'] = dct
self.assertRaises(ValueError, json.dumps, dct)
def test_circular_list(self):
lst = []
lst.append(lst)
self.assertRaises(ValueError, json.dumps, lst)
def test_circular_composite(self):
dct2 = {}
dct2['a'] = []
dct2['a'].append(dct2)
self.assertRaises(ValueError, json.dumps, dct2)
def test_circular_default(self):
json.dumps([set()], default=default_iterable)
self.assertRaises(TypeError, json.dumps, [set()])
def test_circular_off_default(self):
json.dumps([set()], default=default_iterable, check_circular=False)
self.assertRaises(TypeError, json.dumps, [set()], check_circular=False)
@@ -1,33 +0,0 @@
from decimal import Decimal
from unittest import TestCase
import simplejson as json
class TestDecimal(TestCase):
NUMS = "1.0", "10.00", "1.1", "1234567890.1234567890", "500"
def test_decimal_encode(self):
for d in map(Decimal, self.NUMS):
self.assertEquals(json.dumps(d, use_decimal=True), str(d))
def test_decimal_decode(self):
for s in self.NUMS:
self.assertEquals(json.loads(s, parse_float=Decimal), Decimal(s))
def test_decimal_roundtrip(self):
for d in map(Decimal, self.NUMS):
# The type might not be the same (int and Decimal) but they
# should still compare equal.
self.assertEquals(
json.loads(
json.dumps(d, use_decimal=True), parse_float=Decimal),
d)
self.assertEquals(
json.loads(
json.dumps([d], use_decimal=True), parse_float=Decimal),
[d])
def test_decimal_defaults(self):
d = Decimal(1)
# use_decimal=False is the default
self.assertRaises(TypeError, json.dumps, d, use_decimal=False)
self.assertRaises(TypeError, json.dumps, d)
@@ -1,73 +0,0 @@
import decimal
from unittest import TestCase
from StringIO import StringIO
import simplejson as json
from simplejson import OrderedDict
class TestDecode(TestCase):
if not hasattr(TestCase, 'assertIs'):
def assertIs(self, a, b):
self.assertTrue(a is b, '%r is %r' % (a, b))
def test_decimal(self):
rval = json.loads('1.1', parse_float=decimal.Decimal)
self.assertTrue(isinstance(rval, decimal.Decimal))
self.assertEquals(rval, decimal.Decimal('1.1'))
def test_float(self):
rval = json.loads('1', parse_int=float)
self.assertTrue(isinstance(rval, float))
self.assertEquals(rval, 1.0)
def test_decoder_optimizations(self):
# Several optimizations were made that skip over calls to
# the whitespace regex, so this test is designed to try and
# exercise the uncommon cases. The array cases are already covered.
rval = json.loads('{ "key" : "value" , "k":"v" }')
self.assertEquals(rval, {"key":"value", "k":"v"})
def test_empty_objects(self):
s = '{}'
self.assertEqual(json.loads(s), eval(s))
s = '[]'
self.assertEqual(json.loads(s), eval(s))
s = '""'
self.assertEqual(json.loads(s), eval(s))
def test_object_pairs_hook(self):
s = '{"xkd":1, "kcw":2, "art":3, "hxm":4, "qrt":5, "pad":6, "hoy":7}'
p = [("xkd", 1), ("kcw", 2), ("art", 3), ("hxm", 4),
("qrt", 5), ("pad", 6), ("hoy", 7)]
self.assertEqual(json.loads(s), eval(s))
self.assertEqual(json.loads(s, object_pairs_hook=lambda x: x), p)
self.assertEqual(json.load(StringIO(s),
object_pairs_hook=lambda x: x), p)
od = json.loads(s, object_pairs_hook=OrderedDict)
self.assertEqual(od, OrderedDict(p))
self.assertEqual(type(od), OrderedDict)
# the object_pairs_hook takes priority over the object_hook
self.assertEqual(json.loads(s,
object_pairs_hook=OrderedDict,
object_hook=lambda x: None),
OrderedDict(p))
def check_keys_reuse(self, source, loads):
rval = loads(source)
(a, b), (c, d) = sorted(rval[0]), sorted(rval[1])
self.assertIs(a, c)
self.assertIs(b, d)
def test_keys_reuse_str(self):
s = u'[{"a_key": 1, "b_\xe9": 2}, {"a_key": 3, "b_\xe9": 4}]'.encode('utf8')
self.check_keys_reuse(s, json.loads)
def test_keys_reuse_unicode(self):
s = u'[{"a_key": 1, "b_\xe9": 2}, {"a_key": 3, "b_\xe9": 4}]'
self.check_keys_reuse(s, json.loads)
def test_empty_strings(self):
self.assertEqual(json.loads('""'), "")
self.assertEqual(json.loads(u'""'), u"")
self.assertEqual(json.loads('[""]'), [""])
self.assertEqual(json.loads(u'[""]'), [u""])
@@ -1,9 +0,0 @@
from unittest import TestCase
import simplejson as json
class TestDefault(TestCase):
def test_default(self):
self.assertEquals(
json.dumps(type, default=repr),
json.dumps(repr(type)))
@@ -1,27 +0,0 @@
from unittest import TestCase
from cStringIO import StringIO
import simplejson as json
class TestDump(TestCase):
def test_dump(self):
sio = StringIO()
json.dump({}, sio)
self.assertEquals(sio.getvalue(), '{}')
def test_dumps(self):
self.assertEquals(json.dumps({}), '{}')
def test_encode_truefalse(self):
self.assertEquals(json.dumps(
{True: False, False: True}, sort_keys=True),
'{"false": true, "true": false}')
self.assertEquals(json.dumps(
{2: 3.0, 4.0: 5L, False: 1, 6L: True, "7": 0}, sort_keys=True),
'{"false": 1, "2": 3.0, "4.0": 5, "6": true, "7": 0}')
def test_ordered_dict(self):
# http://bugs.python.org/issue6105
items = [('one', 1), ('two', 2), ('three', 3), ('four', 4), ('five', 5)]
s = json.dumps(json.OrderedDict(items))
self.assertEqual(s, '{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}')
@@ -1,41 +0,0 @@
from unittest import TestCase
import simplejson.encoder
CASES = [
(u'/\\"\ucafe\ubabe\uab98\ufcde\ubcda\uef4a\x08\x0c\n\r\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?', '"/\\\\\\"\\ucafe\\ubabe\\uab98\\ufcde\\ubcda\\uef4a\\b\\f\\n\\r\\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?"'),
(u'\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
(u'controls', '"controls"'),
(u'\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
(u'{"object with 1 member":["array with 1 element"]}', '"{\\"object with 1 member\\":[\\"array with 1 element\\"]}"'),
(u' s p a c e d ', '" s p a c e d "'),
(u'\U0001d120', '"\\ud834\\udd20"'),
(u'\u03b1\u03a9', '"\\u03b1\\u03a9"'),
('\xce\xb1\xce\xa9', '"\\u03b1\\u03a9"'),
(u'\u03b1\u03a9', '"\\u03b1\\u03a9"'),
('\xce\xb1\xce\xa9', '"\\u03b1\\u03a9"'),
(u'\u03b1\u03a9', '"\\u03b1\\u03a9"'),
(u'\u03b1\u03a9', '"\\u03b1\\u03a9"'),
(u"`1~!@#$%^&*()_+-={':[,]}|;.</>?", '"`1~!@#$%^&*()_+-={\':[,]}|;.</>?"'),
(u'\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
(u'\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
]
class TestEncodeBaseStringAscii(TestCase):
def test_py_encode_basestring_ascii(self):
self._test_encode_basestring_ascii(simplejson.encoder.py_encode_basestring_ascii)
def test_c_encode_basestring_ascii(self):
if not simplejson.encoder.c_encode_basestring_ascii:
return
self._test_encode_basestring_ascii(simplejson.encoder.c_encode_basestring_ascii)
def _test_encode_basestring_ascii(self, encode_basestring_ascii):
fname = encode_basestring_ascii.__name__
for input_string, expect in CASES:
result = encode_basestring_ascii(input_string)
#self.assertEquals(result, expect,
# '{0!r} != {1!r} for {2}({3!r})'.format(
# result, expect, fname, input_string))
self.assertEquals(result, expect,
'%r != %r for %s(%r)' % (result, expect, fname, input_string))
@@ -1,32 +0,0 @@
import unittest
import simplejson.decoder
import simplejson.encoder
class TestEncodeForHTML(unittest.TestCase):
def setUp(self):
self.decoder = simplejson.decoder.JSONDecoder()
self.encoder = simplejson.encoder.JSONEncoderForHTML()
def test_basic_encode(self):
self.assertEqual(r'"\u0026"', self.encoder.encode('&'))
self.assertEqual(r'"\u003c"', self.encoder.encode('<'))
self.assertEqual(r'"\u003e"', self.encoder.encode('>'))
def test_basic_roundtrip(self):
for char in '&<>':
self.assertEqual(
char, self.decoder.decode(
self.encoder.encode(char)))
def test_prevent_script_breakout(self):
bad_string = '</script><script>alert("gotcha")</script>'
self.assertEqual(
r'"\u003c/script\u003e\u003cscript\u003e'
r'alert(\"gotcha\")\u003c/script\u003e"',
self.encoder.encode(bad_string))
self.assertEqual(
bad_string, self.decoder.decode(
self.encoder.encode(bad_string)))
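The behaviour these tests check can be approximated with the standard library alone. The following is a minimal Python 3 sketch, not simplejson's implementation: `encode_for_html` is a hypothetical helper name, and it assumes that `&`, `<` and `>` can only occur inside string literals in `json.dumps` output, so a plain text replacement is safe.

```python
import json


def encode_for_html(obj):
    """Serialize obj to JSON, then escape the characters that are risky in HTML.

    Hypothetical helper: it post-processes json.dumps output rather than
    subclassing an encoder.
    """
    text = json.dumps(obj)
    return (text.replace("&", "\\u0026")
                .replace("<", "\\u003c")
                .replace(">", "\\u003e"))


bad_string = '</script><script>alert("gotcha")</script>'
encoded = encode_for_html(bad_string)
# The escaped form contains no raw angle brackets but decodes to the same text.
print(encoded)
print(json.loads(encoded) == bad_string)
```

Because the escapes are ordinary JSON `\uXXXX` sequences, any conforming decoder restores the original characters.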
@@ -1,91 +0,0 @@
from unittest import TestCase
import simplejson as json
# Fri Dec 30 18:57:26 2005
JSONDOCS = [
# http://json.org/JSON_checker/test/fail1.json
'"A JSON payload should be an object or array, not a string."',
# http://json.org/JSON_checker/test/fail2.json
'["Unclosed array"',
# http://json.org/JSON_checker/test/fail3.json
'{unquoted_key: "keys must be quoted}',
# http://json.org/JSON_checker/test/fail4.json
'["extra comma",]',
# http://json.org/JSON_checker/test/fail5.json
'["double extra comma",,]',
# http://json.org/JSON_checker/test/fail6.json
'[ , "<-- missing value"]',
# http://json.org/JSON_checker/test/fail7.json
'["Comma after the close"],',
# http://json.org/JSON_checker/test/fail8.json
'["Extra close"]]',
# http://json.org/JSON_checker/test/fail9.json
'{"Extra comma": true,}',
# http://json.org/JSON_checker/test/fail10.json
'{"Extra value after close": true} "misplaced quoted value"',
# http://json.org/JSON_checker/test/fail11.json
'{"Illegal expression": 1 + 2}',
# http://json.org/JSON_checker/test/fail12.json
'{"Illegal invocation": alert()}',
# http://json.org/JSON_checker/test/fail13.json
'{"Numbers cannot have leading zeroes": 013}',
# http://json.org/JSON_checker/test/fail14.json
'{"Numbers cannot be hex": 0x14}',
# http://json.org/JSON_checker/test/fail15.json
'["Illegal backslash escape: \\x15"]',
# http://json.org/JSON_checker/test/fail16.json
'["Illegal backslash escape: \\\'"]',
# http://json.org/JSON_checker/test/fail17.json
'["Illegal backslash escape: \\017"]',
# http://json.org/JSON_checker/test/fail18.json
'[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]',
# http://json.org/JSON_checker/test/fail19.json
'{"Missing colon" null}',
# http://json.org/JSON_checker/test/fail20.json
'{"Double colon":: null}',
# http://json.org/JSON_checker/test/fail21.json
'{"Comma instead of colon", null}',
# http://json.org/JSON_checker/test/fail22.json
'["Colon instead of comma": false]',
# http://json.org/JSON_checker/test/fail23.json
'["Bad value", truth]',
# http://json.org/JSON_checker/test/fail24.json
"['single quote']",
# http://code.google.com/p/simplejson/issues/detail?id=3
u'["A\u001FZ control characters in string"]',
]
SKIPS = {
1: "why not have a string payload?",
18: "spec doesn't specify any nesting limitations",
}
class TestFail(TestCase):
def test_failures(self):
for idx, doc in enumerate(JSONDOCS):
idx = idx + 1
if idx in SKIPS:
json.loads(doc)
continue
try:
json.loads(doc)
except json.JSONDecodeError:
pass
else:
#self.fail("Expected failure for fail{0}.json: {1!r}".format(idx, doc))
self.fail("Expected failure for fail%d.json: %r" % (idx, doc))
def test_array_decoder_issue46(self):
# http://code.google.com/p/simplejson/issues/detail?id=46
for doc in [u'[,]', '[,]']:
try:
json.loads(doc)
except json.JSONDecodeError, e:
self.assertEquals(e.pos, 1)
self.assertEquals(e.lineno, 1)
self.assertEquals(e.colno, 1)
except Exception, e:
self.fail("Unexpected exception raised %r %s" % (e, e))
else:
self.fail("Unexpected success parsing '[,]'")
@@ -1,19 +0,0 @@
import math
from unittest import TestCase
import simplejson as json
class TestFloat(TestCase):
def test_floats(self):
for num in [1617161771.7650001, math.pi, math.pi**100,
math.pi**-100, 3.1]:
self.assertEquals(float(json.dumps(num)), num)
self.assertEquals(json.loads(json.dumps(num)), num)
self.assertEquals(json.loads(unicode(json.dumps(num))), num)
def test_ints(self):
for num in [1, 1L, 1<<32, 1<<64]:
self.assertEquals(json.dumps(num), str(num))
self.assertEquals(int(json.dumps(num)), num)
self.assertEquals(json.loads(json.dumps(num)), num)
self.assertEquals(json.loads(unicode(json.dumps(num))), num)
@@ -1,53 +0,0 @@
from unittest import TestCase
import simplejson as json
import textwrap
class TestIndent(TestCase):
def test_indent(self):
h = [['blorpie'], ['whoops'], [], 'd-shtaeou', 'd-nthiouh',
'i-vhbjkhnth',
{'nifty': 87}, {'field': 'yes', 'morefield': False} ]
expect = textwrap.dedent("""\
[
\t[
\t\t"blorpie"
\t],
\t[
\t\t"whoops"
\t],
\t[],
\t"d-shtaeou",
\t"d-nthiouh",
\t"i-vhbjkhnth",
\t{
\t\t"nifty": 87
\t},
\t{
\t\t"field": "yes",
\t\t"morefield": false
\t}
]""")
d1 = json.dumps(h)
d2 = json.dumps(h, indent='\t', sort_keys=True, separators=(',', ': '))
d3 = json.dumps(h, indent=' ', sort_keys=True, separators=(',', ': '))
d4 = json.dumps(h, indent=2, sort_keys=True, separators=(',', ': '))
h1 = json.loads(d1)
h2 = json.loads(d2)
h3 = json.loads(d3)
h4 = json.loads(d4)
self.assertEquals(h1, h)
self.assertEquals(h2, h)
self.assertEquals(h3, h)
self.assertEquals(h4, h)
self.assertEquals(d3, expect.replace('\t', ' '))
self.assertEquals(d4, expect.replace('\t', ' '))
# NOTE: Python 2.4 textwrap.dedent converts tabs to spaces,
# so the following is expected to fail. Python 2.4 is not a
# supported platform in simplejson 2.1.0+.
self.assertEquals(d2, expect)
@@ -1,76 +0,0 @@
from unittest import TestCase
import simplejson as json
# from http://json.org/JSON_checker/test/pass1.json
JSON = r'''
[
"JSON Test Pattern pass1",
{"object with 1 member":["array with 1 element"]},
{},
[],
-42,
true,
false,
null,
{
"integer": 1234567890,
"real": -9876.543210,
"e": 0.123456789e-12,
"E": 1.234567890E+34,
"": 23456789012E666,
"zero": 0,
"one": 1,
"space": " ",
"quote": "\"",
"backslash": "\\",
"controls": "\b\f\n\r\t",
"slash": "/ & \/",
"alpha": "abcdefghijklmnopqrstuvwyz",
"ALPHA": "ABCDEFGHIJKLMNOPQRSTUVWYZ",
"digit": "0123456789",
"special": "`1~!@#$%^&*()_+-={':[,]}|;.</>?",
"hex": "\u0123\u4567\u89AB\uCDEF\uabcd\uef4A",
"true": true,
"false": false,
"null": null,
"array":[ ],
"object":{ },
"address": "50 St. James Street",
"url": "http://www.JSON.org/",
"comment": "// /* <!-- --",
"# -- --> */": " ",
" s p a c e d " :[1,2 , 3
,
4 , 5 , 6 ,7 ],
"compact": [1,2,3,4,5,6,7],
"jsontext": "{\"object with 1 member\":[\"array with 1 element\"]}",
"quotes": "&#34; \u0022 %22 0x22 034 &#x22;",
"\/\\\"\uCAFE\uBABE\uAB98\uFCDE\ubcda\uef4A\b\f\n\r\t`1~!@#$%^&*()_+-=[]{}|;:',./<>?"
: "A key can be any string"
},
0.5 ,98.6
,
99.44
,
1066
,"rosebud"]
'''
class TestPass1(TestCase):
def test_parse(self):
# test in/out equivalence and parsing
res = json.loads(JSON)
out = json.dumps(res)
self.assertEquals(res, json.loads(out))
try:
json.dumps(res, allow_nan=False)
except ValueError:
pass
else:
self.fail("23456789012E666 should be out of range")
@@ -1,14 +0,0 @@
from unittest import TestCase
import simplejson as json
# from http://json.org/JSON_checker/test/pass2.json
JSON = r'''
[[[[[[[[[[[[[[[[[[["Not too deep"]]]]]]]]]]]]]]]]]]]
'''
class TestPass2(TestCase):
def test_parse(self):
# test in/out equivalence and parsing
res = json.loads(JSON)
out = json.dumps(res)
self.assertEquals(res, json.loads(out))
@@ -1,20 +0,0 @@
from unittest import TestCase
import simplejson as json
# from http://json.org/JSON_checker/test/pass3.json
JSON = r'''
{
"JSON Test Pattern pass3": {
"The outermost value": "must be an object or array.",
"In this test": "It is an object."
}
}
'''
class TestPass3(TestCase):
def test_parse(self):
# test in/out equivalence and parsing
res = json.loads(JSON)
out = json.dumps(res)
self.assertEquals(res, json.loads(out))
@@ -1,67 +0,0 @@
from unittest import TestCase
import simplejson as json
class JSONTestObject:
pass
class RecursiveJSONEncoder(json.JSONEncoder):
recurse = False
def default(self, o):
if o is JSONTestObject:
if self.recurse:
return [JSONTestObject]
else:
return 'JSONTestObject'
return json.JSONEncoder.default(self, o)
class TestRecursion(TestCase):
def test_listrecursion(self):
x = []
x.append(x)
try:
json.dumps(x)
except ValueError:
pass
else:
self.fail("didn't raise ValueError on list recursion")
x = []
y = [x]
x.append(y)
try:
json.dumps(x)
except ValueError:
pass
else:
self.fail("didn't raise ValueError on alternating list recursion")
y = []
x = [y, y]
# ensure that the marker is cleared
json.dumps(x)
def test_dictrecursion(self):
x = {}
x["test"] = x
try:
json.dumps(x)
except ValueError:
pass
else:
self.fail("didn't raise ValueError on dict recursion")
x = {}
y = {"a": x, "b": x}
# ensure that the marker is cleared
json.dumps(x)
def test_defaultrecursion(self):
enc = RecursiveJSONEncoder()
self.assertEquals(enc.encode(JSONTestObject), '"JSONTestObject"')
enc.recurse = True
try:
enc.encode(JSONTestObject)
except ValueError:
pass
else:
self.fail("didn't raise ValueError on default recursion")
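The same circular-reference check exists in the standard library `json` module. The sketch below targets Python 3 and uses a hypothetical helper, `is_circular`, to show which structures are rejected. It illustrates the behaviour and is not part of the test suite above.

```python
import json


def is_circular(obj):
    """Return True if json.dumps rejects obj because it contains itself."""
    try:
        json.dumps(obj)
    except ValueError:
        return True
    return False


# A list that contains itself is rejected.
direct = []
direct.append(direct)

# Two lists that contain each other are rejected as well.
x = []
y = [x]
x.append(y)

# Sharing one object twice is not a cycle and must serialize normally.
shared = []
twice = [shared, shared]

print(is_circular(direct), is_circular(x), is_circular(twice))
```

The last case mirrors the "marker is cleared" checks in the tests: a repeated but non-recursive reference must not be mistaken for a cycle.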
@@ -1,117 +0,0 @@
import sys
from unittest import TestCase
import simplejson as json
import simplejson.decoder
class TestScanString(TestCase):
def test_py_scanstring(self):
self._test_scanstring(simplejson.decoder.py_scanstring)
def test_c_scanstring(self):
if not simplejson.decoder.c_scanstring:
return
self._test_scanstring(simplejson.decoder.c_scanstring)
def _test_scanstring(self, scanstring):
self.assertEquals(
scanstring('"z\\ud834\\udd20x"', 1, None, True),
(u'z\U0001d120x', 16))
if sys.maxunicode == 65535:
self.assertEquals(
scanstring(u'"z\U0001d120x"', 1, None, True),
(u'z\U0001d120x', 6))
else:
self.assertEquals(
scanstring(u'"z\U0001d120x"', 1, None, True),
(u'z\U0001d120x', 5))
self.assertEquals(
scanstring('"\\u007b"', 1, None, True),
(u'{', 8))
self.assertEquals(
scanstring('"A JSON payload should be an object or array, not a string."', 1, None, True),
(u'A JSON payload should be an object or array, not a string.', 60))
self.assertEquals(
scanstring('["Unclosed array"', 2, None, True),
(u'Unclosed array', 17))
self.assertEquals(
scanstring('["extra comma",]', 2, None, True),
(u'extra comma', 14))
self.assertEquals(
scanstring('["double extra comma",,]', 2, None, True),
(u'double extra comma', 21))
self.assertEquals(
scanstring('["Comma after the close"],', 2, None, True),
(u'Comma after the close', 24))
self.assertEquals(
scanstring('["Extra close"]]', 2, None, True),
(u'Extra close', 14))
self.assertEquals(
scanstring('{"Extra comma": true,}', 2, None, True),
(u'Extra comma', 14))
self.assertEquals(
scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, None, True),
(u'Extra value after close', 26))
self.assertEquals(
scanstring('{"Illegal expression": 1 + 2}', 2, None, True),
(u'Illegal expression', 21))
self.assertEquals(
scanstring('{"Illegal invocation": alert()}', 2, None, True),
(u'Illegal invocation', 21))
self.assertEquals(
scanstring('{"Numbers cannot have leading zeroes": 013}', 2, None, True),
(u'Numbers cannot have leading zeroes', 37))
self.assertEquals(
scanstring('{"Numbers cannot be hex": 0x14}', 2, None, True),
(u'Numbers cannot be hex', 24))
self.assertEquals(
scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, None, True),
(u'Too deep', 30))
self.assertEquals(
scanstring('{"Missing colon" null}', 2, None, True),
(u'Missing colon', 16))
self.assertEquals(
scanstring('{"Double colon":: null}', 2, None, True),
(u'Double colon', 15))
self.assertEquals(
scanstring('{"Comma instead of colon", null}', 2, None, True),
(u'Comma instead of colon', 25))
self.assertEquals(
scanstring('["Colon instead of comma": false]', 2, None, True),
(u'Colon instead of comma', 25))
self.assertEquals(
scanstring('["Bad value", truth]', 2, None, True),
(u'Bad value', 12))
def test_issue3623(self):
self.assertRaises(ValueError, json.decoder.scanstring, "xxx", 1,
"xxx")
self.assertRaises(UnicodeDecodeError,
json.encoder.encode_basestring_ascii, "xx\xff")
def test_overflow(self):
# Python 2.5 does not have maxsize
maxsize = getattr(sys, 'maxsize', sys.maxint)
self.assertRaises(OverflowError, json.decoder.scanstring, "xxx",
maxsize + 1)
@@ -1,42 +0,0 @@
import textwrap
from unittest import TestCase
import simplejson as json
class TestSeparators(TestCase):
def test_separators(self):
h = [['blorpie'], ['whoops'], [], 'd-shtaeou', 'd-nthiouh', 'i-vhbjkhnth',
{'nifty': 87}, {'field': 'yes', 'morefield': False} ]
expect = textwrap.dedent("""\
[
[
"blorpie"
] ,
[
"whoops"
] ,
[] ,
"d-shtaeou" ,
"d-nthiouh" ,
"i-vhbjkhnth" ,
{
"nifty" : 87
} ,
{
"field" : "yes" ,
"morefield" : false
}
]""")
d1 = json.dumps(h)
d2 = json.dumps(h, indent=' ', sort_keys=True, separators=(' ,', ' : '))
h1 = json.loads(d1)
h2 = json.loads(d2)
self.assertEquals(h1, h)
self.assertEquals(h2, h)
self.assertEquals(d2, expect)
@@ -1,21 +0,0 @@
import decimal
from unittest import TestCase
from simplejson import decoder, encoder, scanner
def has_speedups():
return encoder.c_make_encoder is not None
class TestDecode(TestCase):
def test_make_scanner(self):
if not has_speedups():
return
self.assertRaises(AttributeError, scanner.c_make_scanner, 1)
def test_make_encoder(self):
if not has_speedups():
return
self.assertRaises(TypeError, encoder.c_make_encoder,
None,
"\xCD\x7D\x3D\x4E\x12\x4C\xF9\x79\xD7\x52\xBA\x82\xF2\x27\x4A\x7D\xA0\xCA\x75",
None)
@@ -1,99 +0,0 @@
from unittest import TestCase
import simplejson as json
class TestUnicode(TestCase):
def test_encoding1(self):
encoder = json.JSONEncoder(encoding='utf-8')
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
s = u.encode('utf-8')
ju = encoder.encode(u)
js = encoder.encode(s)
self.assertEquals(ju, js)
def test_encoding2(self):
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
s = u.encode('utf-8')
ju = json.dumps(u, encoding='utf-8')
js = json.dumps(s, encoding='utf-8')
self.assertEquals(ju, js)
def test_encoding3(self):
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
j = json.dumps(u)
self.assertEquals(j, '"\\u03b1\\u03a9"')
def test_encoding4(self):
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
j = json.dumps([u])
self.assertEquals(j, '["\\u03b1\\u03a9"]')
def test_encoding5(self):
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
j = json.dumps(u, ensure_ascii=False)
self.assertEquals(j, u'"' + u + u'"')
def test_encoding6(self):
u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
j = json.dumps([u], ensure_ascii=False)
self.assertEquals(j, u'["' + u + u'"]')
def test_big_unicode_encode(self):
u = u'\U0001d120'
self.assertEquals(json.dumps(u), '"\\ud834\\udd20"')
self.assertEquals(json.dumps(u, ensure_ascii=False), u'"\U0001d120"')
def test_big_unicode_decode(self):
u = u'z\U0001d120x'
self.assertEquals(json.loads('"' + u + '"'), u)
self.assertEquals(json.loads('"z\\ud834\\udd20x"'), u)
def test_unicode_decode(self):
for i in range(0, 0xd7ff):
u = unichr(i)
#s = '"\\u{0:04x}"'.format(i)
s = '"\\u%04x"' % (i,)
self.assertEquals(json.loads(s), u)
def test_object_pairs_hook_with_unicode(self):
s = u'{"xkd":1, "kcw":2, "art":3, "hxm":4, "qrt":5, "pad":6, "hoy":7}'
p = [(u"xkd", 1), (u"kcw", 2), (u"art", 3), (u"hxm", 4),
(u"qrt", 5), (u"pad", 6), (u"hoy", 7)]
self.assertEqual(json.loads(s), eval(s))
self.assertEqual(json.loads(s, object_pairs_hook=lambda x: x), p)
od = json.loads(s, object_pairs_hook=json.OrderedDict)
self.assertEqual(od, json.OrderedDict(p))
self.assertEqual(type(od), json.OrderedDict)
# the object_pairs_hook takes priority over the object_hook
self.assertEqual(json.loads(s,
object_pairs_hook=json.OrderedDict,
object_hook=lambda x: None),
json.OrderedDict(p))
def test_default_encoding(self):
self.assertEquals(json.loads(u'{"a": "\xe9"}'.encode('utf-8')),
{'a': u'\xe9'})
def test_unicode_preservation(self):
self.assertEquals(type(json.loads(u'""')), unicode)
self.assertEquals(type(json.loads(u'"a"')), unicode)
self.assertEquals(type(json.loads(u'["a"]')[0]), unicode)
def test_ensure_ascii_false_returns_unicode(self):
# http://code.google.com/p/simplejson/issues/detail?id=48
self.assertEquals(type(json.dumps([], ensure_ascii=False)), unicode)
self.assertEquals(type(json.dumps(0, ensure_ascii=False)), unicode)
self.assertEquals(type(json.dumps({}, ensure_ascii=False)), unicode)
self.assertEquals(type(json.dumps("", ensure_ascii=False)), unicode)
def test_ensure_ascii_false_bytestring_encoding(self):
# http://code.google.com/p/simplejson/issues/detail?id=48
doc1 = {u'quux': 'Arr\xc3\xaat sur images'}
doc2 = {u'quux': u'Arr\xeat sur images'}
doc_ascii = '{"quux": "Arr\\u00eat sur images"}'
doc_unicode = u'{"quux": "Arr\xeat sur images"}'
self.assertEquals(json.dumps(doc1), doc_ascii)
self.assertEquals(json.dumps(doc2), doc_ascii)
self.assertEquals(json.dumps(doc1, ensure_ascii=False), doc_unicode)
self.assertEquals(json.dumps(doc2, ensure_ascii=False), doc_unicode)
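For readers on Python 3, the `ensure_ascii` behaviour tested above can be reproduced with the standard library `json` module. This sketch assumes Python 3, where `str` is always Unicode and byte strings are not accepted by `json.dumps`, so the bytestring cases above have no direct equivalent.

```python
import json

text = "\u03b1\u03a9"  # GREEK SMALL LETTER ALPHA, GREEK CAPITAL LETTER OMEGA

# Default: non-ASCII characters are written as \uXXXX escapes.
ascii_json = json.dumps(text)

# ensure_ascii=False keeps the characters as-is in the output string.
unicode_json = json.dumps(text, ensure_ascii=False)

# Characters outside the Basic Multilingual Plane become a surrogate pair.
clef_json = json.dumps("\U0001d120")

print(ascii_json)
print(clef_json)
```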
@@ -1,39 +0,0 @@
r"""Command-line tool to validate and pretty-print JSON
Usage::
$ echo '{"json":"obj"}' | python -m simplejson.tool
{
"json": "obj"
}
$ echo '{ 1.2:3.4}' | python -m simplejson.tool
Expecting property name: line 1 column 2 (char 2)
"""
import sys
import simplejson as json
def main():
if len(sys.argv) == 1:
infile = sys.stdin
outfile = sys.stdout
elif len(sys.argv) == 2:
infile = open(sys.argv[1], 'rb')
outfile = sys.stdout
elif len(sys.argv) == 3:
infile = open(sys.argv[1], 'rb')
outfile = open(sys.argv[2], 'wb')
else:
raise SystemExit(sys.argv[0] + " [infile [outfile]]")
try:
obj = json.load(infile,
object_pairs_hook=json.OrderedDict,
use_decimal=True)
except ValueError, e:
raise SystemExit(e)
json.dump(obj, outfile, sort_keys=True, indent=' ', use_decimal=True)
outfile.write('\n')
if __name__ == '__main__':
main()
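The core of the tool, parse then re-serialize with sorted keys, can be sketched with the standard library in Python 3. `validate_and_format` is a hypothetical helper name, and the sketch leaves out the `OrderedDict` and `use_decimal` handling that the simplejson version above relies on.

```python
import json


def validate_and_format(text):
    """Return (pretty_json, None) on success or (None, error_message) on failure."""
    try:
        obj = json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return None, str(exc)
    return json.dumps(obj, sort_keys=True, indent=4), None


pretty, error = validate_and_format('{"json":"obj"}')
print(pretty)

# A non-string key is invalid JSON, so only the error message is returned.
pretty_bad, error_bad = validate_and_format('{ 1.2:3.4}')
print(error_bad)
```

Returning the message instead of raising keeps the sketch easy to call from other code. The command-line tool above exits with the message instead.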
File diff suppressed because it is too large
@@ -1,262 +0,0 @@
# -*- coding: windows-1251 -*-
# Portions are Copyright (C) 2005 Roman V. Kiseliov
# Portions are Copyright (c) 2004 Evgeny Filatov <fufff@users.sourceforge.net>
# Portions are Copyright (c) 2002-2004 John McNamara (Perl Spreadsheet::WriteExcel)
from BIFFRecords import BiffRecord
from struct import *
def _size_col(sheet, col):
return sheet.col_width(col)
def _size_row(sheet, row):
return sheet.row_height(row)
def _position_image(sheet, row_start, col_start, x1, y1, width, height):
"""Calculate the vertices that define the position of the image as required by
the OBJ record.
      +------------+------------+
      |     A      |      B     |
+-----+------------+------------+
|     |(x1,y1)     |            |
|  1  |(A1)._______|______      |
|     |    |              |     |
|     |    |              |     |
+-----+----|    BITMAP    |-----+
|     |    |              |     |
|  2  |    |______________.     |
|     |            |        (B2)|
|     |            |     (x2,y2)|
+---- +------------+------------+
Example of a bitmap that covers some of the area from cell A1 to cell B2.
Based on the width and height of the bitmap we need to calculate 8 vars:
col_start, row_start, col_end, row_end, x1, y1, x2, y2.
The width and height of the cells are also variable and have to be taken into
account.
The values of col_start and row_start are passed in from the calling
function. The values of col_end and row_end are calculated by subtracting
the width and height of the bitmap from the width and height of the
underlying cells.
The vertices are expressed as a percentage of the underlying cell width as
follows (rhs values are in pixels):
x1 = X / W *1024
y1 = Y / H *256
x2 = (X-1) / W *1024
y2 = (Y-1) / H *256
Where: X is distance from the left side of the underlying cell
Y is distance from the top of the underlying cell
W is the width of the cell
H is the height of the cell
Note: the SDK incorrectly states that the height should be expressed as a
percentage of 1024.
col_start - Col containing upper left corner of object
row_start - Row containing top left corner of object
x1 - Distance to left side of object
y1 - Distance to top of object
width - Width of image frame
height - Height of image frame
"""
# Adjust start column for offsets that are greater than the col width
while x1 >= _size_col(sheet, col_start):
x1 -= _size_col(sheet, col_start)
col_start += 1
# Adjust start row for offsets that are greater than the row height
while y1 >= _size_row(sheet, row_start):
y1 -= _size_row(sheet, row_start)
row_start += 1
# Initialise end cell to the same as the start cell
row_end = row_start # Row containing bottom right corner of object
col_end = col_start # Col containing lower right corner of object
width = width + x1 - 1
height = height + y1 - 1
# Subtract the underlying cell widths to find the end cell of the image
while (width >= _size_col(sheet, col_end)):
width -= _size_col(sheet, col_end)
col_end += 1
# Subtract the underlying cell heights to find the end cell of the image
while (height >= _size_row(sheet, row_end)):
height -= _size_row(sheet, row_end)
row_end += 1
# Bitmap isn't allowed to start or finish in a hidden cell, i.e. a cell
# with zero height or width.
if ((_size_col(sheet, col_start) == 0) or (_size_col(sheet, col_end) == 0)
or (_size_row(sheet, row_start) == 0) or (_size_row(sheet, row_end) == 0)):
return
# Convert the pixel values to the percentage value expected by Excel
x1 = int(float(x1) / _size_col(sheet, col_start) * 1024)
y1 = int(float(y1) / _size_row(sheet, row_start) * 256)
# Distance to right side of object
x2 = int(float(width) / _size_col(sheet, col_end) * 1024)
# Distance to bottom of object
y2 = int(float(height) / _size_row(sheet, row_end) * 256)
return (col_start, x1, row_start, y1, col_end, x2, row_end, y2)
class ObjBmpRecord(BiffRecord):
_REC_ID = 0x005D # Record identifier
def __init__(self, row, col, sheet, im_data_bmp, x, y, scale_x, scale_y):
# Scale the frame of the image.
width = im_data_bmp.width * scale_x
height = im_data_bmp.height * scale_y
# Calculate the vertices of the image and write the OBJ record
coordinates = _position_image(sheet, row, col, x, y, width, height)
# print coordinates
col_start, x1, row_start, y1, col_end, x2, row_end, y2 = coordinates
"""Store the OBJ record that precedes an IMDATA record. This could be generalised
to support other Excel objects.
"""
cObj = 0x0001 # Count of objects in file (set to 1)
OT = 0x0008 # Object type. 8 = Picture
id = 0x0001 # Object ID
grbit = 0x0614 # Option flags
colL = col_start # Col containing upper left corner of object
dxL = x1 # Distance from left side of cell
rwT = row_start # Row containing top left corner of object
dyT = y1 # Distance from top of cell
colR = col_end # Col containing lower right corner of object
dxR = x2 # Distance from right of cell
rwB = row_end # Row containing bottom right corner of object
dyB = y2 # Distance from bottom of cell
cbMacro = 0x0000 # Length of FMLA structure
Reserved1 = 0x0000 # Reserved
Reserved2 = 0x0000 # Reserved
icvBack = 0x09 # Background colour
icvFore = 0x09 # Foreground colour
fls = 0x00 # Fill pattern
fAuto = 0x00 # Automatic fill
icv = 0x08 # Line colour
lns = 0xff # Line style
lnw = 0x01 # Line weight
fAutoB = 0x00 # Automatic border
frs = 0x0000 # Frame style
cf = 0x0009 # Image format, 9 = bitmap
Reserved3 = 0x0000 # Reserved
cbPictFmla = 0x0000 # Length of FMLA structure
Reserved4 = 0x0000 # Reserved
grbit2 = 0x0001 # Option flags
Reserved5 = 0x0000 # Reserved
data = pack("<L", cObj)
data += pack("<H", OT)
data += pack("<H", id)
data += pack("<H", grbit)
data += pack("<H", colL)
data += pack("<H", dxL)
data += pack("<H", rwT)
data += pack("<H", dyT)
data += pack("<H", colR)
data += pack("<H", dxR)
data += pack("<H", rwB)
data += pack("<H", dyB)
data += pack("<H", cbMacro)
data += pack("<L", Reserved1)
data += pack("<H", Reserved2)
data += pack("<B", icvBack)
data += pack("<B", icvFore)
data += pack("<B", fls)
data += pack("<B", fAuto)
data += pack("<B", icv)
data += pack("<B", lns)
data += pack("<B", lnw)
data += pack("<B", fAutoB)
data += pack("<H", frs)
data += pack("<L", cf)
data += pack("<H", Reserved3)
data += pack("<H", cbPictFmla)
data += pack("<H", Reserved4)
data += pack("<H", grbit2)
data += pack("<L", Reserved5)
self._rec_data = data
def _process_bitmap(bitmap):
"""Convert a 24 bit bitmap into the modified internal format used by Windows.
This is described in BITMAPCOREHEADER and BITMAPCOREINFO structures in the
MSDN library.
"""
# Open file and binmode the data in case the platform needs it.
fh = file(bitmap, "rb")
try:
# Slurp the file into a string.
data = fh.read()
finally:
fh.close()
# Check that the file is big enough to be a bitmap.
if len(data) <= 0x36:
raise Exception("bitmap doesn't contain enough data.")
# The first 2 bytes are used to identify the bitmap.
if (data[:2] != "BM"):
raise Exception("bitmap doesn't appear to be a valid bitmap image.")
# Remove bitmap data: ID.
data = data[2:]
# Read and remove the bitmap size. This is more reliable than reading
# the data size at offset 0x22.
#
size = unpack("<L", data[:4])[0]
size -= 0x36 # Subtract size of bitmap header.
size += 0x0C # Add size of BIFF header.
data = data[4:]
# Remove bitmap data: reserved, offset, header length.
data = data[12:]
# Read and remove the bitmap width and height. Verify the sizes.
width, height = unpack("<LL", data[:8])
data = data[8:]
if (width > 0xFFFF):
raise Exception("bitmap: largest image width supported is 65k.")
if (height > 0xFFFF):
raise Exception("bitmap: largest image height supported is 65k.")
# Read and remove the bitmap planes and bpp data. Verify them.
planes, bitcount = unpack("<HH", data[:4])
data = data[4:]
if (bitcount != 24):
raise Exception("bitmap isn't a 24bit true color bitmap.")
if (planes != 1):
raise Exception("bitmap: only 1 plane supported in bitmap image.")
# Read and remove the bitmap compression. Verify compression.
compression = unpack("<L", data[:4])[0]
data = data[4:]
if (compression != 0):
raise Exception("bitmap: compression not supported in bitmap image.")
# Remove bitmap data: data size, hres, vres, colours, imp. colours.
data = data[20:]
# Add the BITMAPCOREHEADER data
header = pack("<LHHHH", 0x000c, width, height, 0x01, 0x18)
data = header + data
return (width, height, size, data)
class ImDataBmpRecord(BiffRecord):
_REC_ID = 0x007F
def __init__(self, filename):
"""Insert a 24bit bitmap image in a worksheet. The main record required is
IMDATA but it must be preceded by an OBJ record to define its position.
"""
BiffRecord.__init__(self)
self.width, self.height, self.size, data = _process_bitmap(filename)
# Write the IMDATA record to store the bitmap data
cf = 0x09
env = 0x01
lcb = self.size
self._rec_data = pack("<HHL", cf, env, lcb) + data
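The header checks in `_process_bitmap` follow the usual layout of a BMP file: a 14-byte file header followed by a 40-byte info header, 0x36 bytes in total. The Python 3 sketch below reads the same fields with `struct.unpack_from` at fixed offsets instead of slicing the data progressively. `parse_bmp_header` and `make_test_bmp` are hypothetical helper names, and the test image is generated in memory purely for illustration.

```python
import struct


def parse_bmp_header(data):
    """Read the fields that _process_bitmap validates from raw BMP bytes."""
    if len(data) <= 0x36:
        raise ValueError("bitmap doesn't contain enough data.")
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    (file_size,) = struct.unpack_from("<L", data, 2)       # total file size
    width, height = struct.unpack_from("<LL", data, 18)    # after 14 + 4 bytes
    planes, bitcount = struct.unpack_from("<HH", data, 26)
    (compression,) = struct.unpack_from("<L", data, 30)
    return {"file_size": file_size, "width": width, "height": height,
            "planes": planes, "bitcount": bitcount, "compression": compression}


def make_test_bmp(width, height):
    """Build a blank, uncompressed 24-bit BMP in memory (illustration only)."""
    row_size = (width * 3 + 3) // 4 * 4        # rows are padded to 4 bytes
    pixels = bytes(row_size * height)
    info = struct.pack("<LllHHLLllLL", 40, width, height, 1, 24, 0,
                       len(pixels), 2835, 2835, 0, 0)
    file_header = struct.pack("<2sLHHL", b"BM", 14 + len(info) + len(pixels),
                              0, 0, 14 + len(info))
    return file_header + info + pixels


fields = parse_bmp_header(make_test_bmp(2, 2))
print(fields)
```

Fixed offsets make it easier to see which header field each check refers to. The original code reaches the same fields by consuming the buffer from the front.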
@@ -1,243 +0,0 @@
# -*- coding: windows-1252 -*-
from struct import unpack, pack
import BIFFRecords
class StrCell(object):
__slots__ = ["rowx", "colx", "xf_idx", "sst_idx"]
def __init__(self, rowx, colx, xf_idx, sst_idx):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
self.sst_idx = sst_idx
def get_biff_data(self):
# return BIFFRecords.LabelSSTRecord(self.rowx, self.colx, self.xf_idx, self.sst_idx).get()
return pack('<5HL', 0x00FD, 10, self.rowx, self.colx, self.xf_idx, self.sst_idx)
class BlankCell(object):
__slots__ = ["rowx", "colx", "xf_idx"]
def __init__(self, rowx, colx, xf_idx):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
def get_biff_data(self):
# return BIFFRecords.BlankRecord(self.rowx, self.colx, self.xf_idx).get()
return pack('<5H', 0x0201, 6, self.rowx, self.colx, self.xf_idx)
class MulBlankCell(object):
__slots__ = ["rowx", "colx1", "colx2", "xf_idx"]
def __init__(self, rowx, colx1, colx2, xf_idx):
self.rowx = rowx
self.colx1 = colx1
self.colx2 = colx2
self.xf_idx = xf_idx
def get_biff_data(self):
return BIFFRecords.MulBlankRecord(self.rowx,
self.colx1, self.colx2, self.xf_idx).get()
class NumberCell(object):
__slots__ = ["rowx", "colx", "xf_idx", "number"]
def __init__(self, rowx, colx, xf_idx, number):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
self.number = float(number)
def get_encoded_data(self):
rk_encoded = 0
num = self.number
# The four possible kinds of RK encoding are *not* mutually exclusive.
# The 30-bit integer variety picks up the most.
# In the code below, the four varieties are checked in descending order
# of bangs per buck, or not at all.
# SJM 2007-10-01
if -0x20000000 <= num < 0x20000000: # fits in 30-bit *signed* int
inum = int(num)
if inum == num: # survives round-trip
# print "30-bit integer RK", inum, hex(inum)
rk_encoded = 2 | (inum << 2)
return 1, rk_encoded
temp = num * 100
if -0x20000000 <= temp < 0x20000000:
# That was step 1: the coded value will fit in
# a 30-bit signed integer.
itemp = int(round(temp, 0))
# That was step 2: "itemp" is the best candidate coded value.
# Now for step 3: simulate the decoding,
# to check for round-trip correctness.
if itemp / 100.0 == num:
# print "30-bit integer RK*100", itemp, hex(itemp)
rk_encoded = 3 | (itemp << 2)
return 1, rk_encoded
if 0: # Cost of extra pack+unpack not justified by tiny yield.
packed = pack('<d', num)
w01, w23 = unpack('<2i', packed)
if not w01 and not(w23 & 3):
# 34 lsb are 0
# print "float RK", w23, hex(w23)
return 1, w23
packed100 = pack('<d', temp)
w01, w23 = unpack('<2i', packed100)
if not w01 and not(w23 & 3):
# 34 lsb are 0
# print "float RK*100", w23, hex(w23)
return 1, w23 | 1
#print "Number"
#print
return 0, pack('<5Hd', 0x0203, 14, self.rowx, self.colx, self.xf_idx, num)
def get_biff_data(self):
isRK, value = self.get_encoded_data()
if isRK:
return pack('<5Hi', 0x27E, 10, self.rowx, self.colx, self.xf_idx, value)
return value # NUMBER record already packed
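The two integer RK varieties used in `get_encoded_data` can be tried out in isolation. This is a Python 3 sketch under stated assumptions: `encode_rk` and `decode_rk` are hypothetical names, only the two integer encodings (flag values 2 and 3) are covered, and the decoder is a simplified inverse written for illustration, not code from this module.

```python
import math


def encode_rk(num):
    """Return the RK-encoded integer for num, or None if no integer RK form fits."""
    num = float(num)
    if -0x20000000 <= num < 0x20000000:        # fits in a 30-bit signed int
        inum = int(num)
        if inum == num:                        # survives the round trip
            return 2 | (inum << 2)
    temp = num * 100
    if -0x20000000 <= temp < 0x20000000:
        itemp = int(round(temp))
        if itemp / 100.0 == num:               # simulate decoding to verify
            return 3 | (itemp << 2)
    return None


def decode_rk(rk):
    """Invert encode_rk for the integer varieties (bit 1 set)."""
    value = rk >> 2                            # arithmetic shift keeps the sign
    return value / 100.0 if rk & 1 else float(value)


for sample in (5, -3, 1.25, 0.1):
    print(sample, encode_rk(sample), decode_rk(encode_rk(sample)))

# Values with no exact integer or integer/100 form fall back to a NUMBER record.
print(encode_rk(math.pi))
```

The low two bits carry the flags (bit 1: integer payload, bit 0: value was multiplied by 100) and the remaining 30 bits carry the payload, which is why the range check uses `0x20000000`.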
class BooleanCell(object):
__slots__ = ["rowx", "colx", "xf_idx", "number"]
def __init__(self, rowx, colx, xf_idx, number):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
self.number = number
def get_biff_data(self):
return BIFFRecords.BoolErrRecord(self.rowx,
self.colx, self.xf_idx, self.number, 0).get()
error_code_map = {
0x00: 0, # Intersection of two cell ranges is empty
0x07: 7, # Division by zero
0x0F: 15, # Wrong type of operand
0x17: 23, # Illegal or deleted cell reference
0x1D: 29, # Wrong function or range name
0x24: 36, # Value range overflow
0x2A: 42, # Argument or function not available
'#NULL!' : 0, # Intersection of two cell ranges is empty
'#DIV/0!': 7, # Division by zero
'#VALUE!': 15, # Wrong type of operand
'#REF!' : 23, # Illegal or deleted cell reference
'#NAME?' : 29, # Wrong function or range name
'#NUM!' : 36, # Value range overflow
'#N/A!' : 42, # Argument or function not available
}
class ErrorCell(object):
__slots__ = ["rowx", "colx", "xf_idx", "number"]
def __init__(self, rowx, colx, xf_idx, error_string_or_code):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
try:
self.number = error_code_map[error_string_or_code]
except KeyError:
raise Exception('Illegal error value (%r)' % error_string_or_code)
def get_biff_data(self):
return BIFFRecords.BoolErrRecord(self.rowx,
self.colx, self.xf_idx, self.number, 1).get()
class FormulaCell(object):
__slots__ = ["rowx", "colx", "xf_idx", "frmla", "calc_flags"]
def __init__(self, rowx, colx, xf_idx, frmla, calc_flags=0):
self.rowx = rowx
self.colx = colx
self.xf_idx = xf_idx
self.frmla = frmla
self.calc_flags = calc_flags
def get_biff_data(self):
return BIFFRecords.FormulaRecord(self.rowx,
self.colx, self.xf_idx, self.frmla.rpn(), self.calc_flags).get()
# module-level function for *internal* use by the Row module
def _get_cells_biff_data_mul(rowx, cell_items):
# Return the BIFF data for all cell records in the row.
# Adjacent BLANK|RK records are combined into MUL(BLANK|RK) records.
pieces = []
nitems = len(cell_items)
i = 0
while i < nitems:
icolx, icell = cell_items[i]
if isinstance(icell, NumberCell):
isRK, value = icell.get_encoded_data()
if not isRK:
pieces.append(value) # pre-packed NUMBER record
i += 1
continue
muldata = [(value, icell.xf_idx)]
target = NumberCell
elif isinstance(icell, BlankCell):
muldata = [icell.xf_idx]
target = BlankCell
else:
pieces.append(icell.get_biff_data())
i += 1
continue
lastcolx = icolx
j = i
packed_record = ''
for j in xrange(i+1, nitems):
jcolx, jcell = cell_items[j]
if jcolx != lastcolx + 1:
nexti = j
break
if not isinstance(jcell, target):
nexti = j
break
if target == NumberCell:
isRK, value = jcell.get_encoded_data()
if not isRK:
packed_record = value
nexti = j + 1
break
muldata.append((value, jcell.xf_idx))
else:
muldata.append(jcell.xf_idx)
lastcolx = jcolx
else:
nexti = j + 1
if target == NumberCell:
if lastcolx == icolx:
# RK record
value, xf_idx = muldata[0]
pieces.append(pack('<5Hi', 0x027E, 10, rowx, icolx, xf_idx, value))
else:
# MULRK record
nc = lastcolx - icolx + 1
pieces.append(pack('<4H', 0x00BD, 6 * nc + 6, rowx, icolx))
pieces.append(''.join([pack('<Hi', xf_idx, value) for value, xf_idx in muldata]))
pieces.append(pack('<H', lastcolx))
else:
if lastcolx == icolx:
# BLANK record
xf_idx = muldata[0]
pieces.append(pack('<5H', 0x0201, 6, rowx, icolx, xf_idx))
else:
# MULBLANK record
nc = lastcolx - icolx + 1
pieces.append(pack('<4H', 0x00BE, 2 * nc + 6, rowx, icolx))
pieces.append(''.join([pack('<H', xf_idx) for xf_idx in muldata]))
pieces.append(pack('<H', lastcolx))
if packed_record:
pieces.append(packed_record)
i = nexti
return ''.join(pieces)
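The record sizes used for the combined records can be checked with a small standalone example. This is a Python 3 sketch under assumptions, covering only the blank-cell case; `pack_blank_run` is a hypothetical helper, not a function of the deleted module.

```python
import struct


def pack_blank_run(rowx, first_colx, xf_indexes):
    """Pack a run of adjacent blank cells as one BLANK or MULBLANK record."""
    nc = len(xf_indexes)
    if nc == 1:
        # Single cell: BLANK record (0x0201) with 6 bytes of data.
        return struct.pack('<5H', 0x0201, 6, rowx, first_colx, xf_indexes[0])
    last_colx = first_colx + nc - 1
    # MULBLANK (0x00BE) data: row (2) + first column (2)
    # + one XF index per cell (2 * nc) + last column (2) = 2 * nc + 6 bytes.
    return b''.join([
        struct.pack('<4H', 0x00BE, 2 * nc + 6, rowx, first_colx),
        struct.pack('<%dH' % nc, *xf_indexes),
        struct.pack('<H', last_colx),
    ])
```

The MULRK case follows the same pattern, except that each cell contributes 6 bytes (a 2-byte XF index plus a 4-byte RK value), which is where the `6 * nc + 6` length above comes from.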
-34
@@ -1,34 +0,0 @@
# -*- coding: windows-1252 -*-
from BIFFRecords import ColInfoRecord
class Column(object):
def __init__(self, colx, parent_sheet):
if not(isinstance(colx, int) and 0 <= colx <= 255):
raise ValueError("column index (%r) not an int in range(256)" % colx)
self._index = colx
self._parent = parent_sheet
self._parent_wb = parent_sheet.get_parent()
self._xf_index = 0x0F
self.width = 0x0B92
self.hidden = 0
self.level = 0
self.collapse = 0
def set_style(self, style):
self._xf_index = self._parent_wb.add_style(style)
def width_in_pixels(self):
# *** Approximation ****
return int(round(self.width * 0.0272 + 0.446, 0))
def get_biff_record(self):
options = (self.hidden & 0x01) << 0
options |= (self.level & 0x07) << 8
options |= (self.collapse & 0x01) << 12
return ColInfoRecord(self._index, self._index, self.width, self._xf_index, options).get()
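The option bit layout and the width approximation can be reproduced outside the class. This is a minimal sketch that mirrors the arithmetic above; the free functions `colinfo_options` and `width_in_pixels` are illustrative stand-ins, not part of the deleted module's API.

```python
def colinfo_options(hidden, level, collapse):
    """Build the 16-bit COLINFO option flags from the three column attributes."""
    options = (hidden & 0x01) << 0      # bit 0: column is hidden
    options |= (level & 0x07) << 8      # bits 8-10: outline level (0..7)
    options |= (collapse & 0x01) << 12  # bit 12: column is collapsed
    return options


def width_in_pixels(width):
    """Approximate pixel width using the same linear fit as the method above."""
    return int(round(width * 0.0272 + 0.446, 0))
```

With the default width of 0x0B92 (2962 units) the approximation gives 81 pixels.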
-516
@@ -1,516 +0,0 @@
# -*- coding: windows-1252 -*-
import sys
import struct
class Reader:
def __init__(self, filename, dump = False):
self.dump = dump
self.STREAMS = {}
doc = file(filename, 'rb').read()
self.header, self.data = doc[0:512], doc[512:]
del doc
self.__build_header()
self.__build_MSAT()
self.__build_SAT()
self.__build_directory()
self.__build_short_sectors_data()
if len(self.short_sectors_data) > 0:
self.__build_SSAT()
else:
if self.dump and (self.total_ssat_sectors != 0 or self.ssat_start_sid != -2):
print 'NOTE: header says there must be', self.total_ssat_sectors, 'short sectors'
print 'NOTE: starting at', self.ssat_start_sid, 'sector'
print 'NOTE: but the file does not contain data in short sectors'
self.ssat_start_sid = -2
self.total_ssat_sectors = 0
self.SSAT = [-2]
for dentry in self.dir_entry_list[1:]:
(did,
sz, name,
t, c,
did_left, did_right, did_root,
dentry_start_sid,
stream_size
) = dentry
stream_data = ''
if stream_size > 0:
if stream_size >= self.min_stream_size:
args = (self.data, self.SAT, dentry_start_sid, self.sect_size)
else:
args = (self.short_sectors_data, self.SSAT, dentry_start_sid, self.short_sect_size)
stream_data = self.get_stream_data(*args)
if name != '':
# BAD IDEA: entry names may collide. Full paths should be used as keys instead.
self.STREAMS[name] = stream_data
def __build_header(self):
self.doc_magic = self.header[0:8]
if self.doc_magic != '\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1':
raise Exception, 'Not an OLE file.'
self.file_uid = self.header[8:24]
self.rev_num = self.header[24:26]
self.ver_num = self.header[26:28]
self.byte_order = self.header[28:30]
self.log2_sect_size, = struct.unpack('<H', self.header[30:32])
self.log2_short_sect_size, = struct.unpack('<H', self.header[32:34])
self.total_sat_sectors, = struct.unpack('<L', self.header[44:48])
self.dir_start_sid, = struct.unpack('<l', self.header[48:52])
self.min_stream_size, = struct.unpack('<L', self.header[56:60])
self.ssat_start_sid, = struct.unpack('<l', self.header[60:64])
self.total_ssat_sectors, = struct.unpack('<L', self.header[64:68])
self.msat_start_sid, = struct.unpack('<l', self.header[68:72])
self.total_msat_sectors, = struct.unpack('<L', self.header[72:76])
self.sect_size = 1 << self.log2_sect_size
self.short_sect_size = 1 << self.log2_short_sect_size
if self.dump:
print 'file magic: '
print_bin_data(self.doc_magic)
print 'file uid: '
print_bin_data(self.file_uid)
print 'revision number: '
print_bin_data(self.rev_num)
print 'version number: '
print_bin_data(self.ver_num)
print 'byte order: '
print_bin_data(self.byte_order)
print 'sector size :', hex(self.sect_size), self.sect_size
#print 'total sectors in file :', hex(self.total_sectors), self.total_sectors
print 'short sector size :', hex(self.short_sect_size), self.short_sect_size
print 'Total number of sectors used for the SAT :', hex(self.total_sat_sectors), self.total_sat_sectors
print 'SID of first sector of the directory stream:', hex(self.dir_start_sid), self.dir_start_sid
print 'Minimum size of a standard stream :', hex(self.min_stream_size), self.min_stream_size
print 'SID of first sector of the SSAT :', hex(self.ssat_start_sid), self.ssat_start_sid
print 'Total number of sectors used for the SSAT :', hex(self.total_ssat_sectors), self.total_ssat_sectors
print 'SID of first additional sector of the MSAT :', hex(self.msat_start_sid), self.msat_start_sid
print 'Total number of sectors used for the MSAT :', hex(self.total_msat_sectors), self.total_msat_sectors
def __build_MSAT(self):
self.MSAT = list(struct.unpack('<109l', self.header[76:]))
next = self.msat_start_sid
while next > 0:
msat_sector = struct.unpack('<128l', self.data[next*self.sect_size:(next+1)*self.sect_size])
self.MSAT.extend(msat_sector[:127])
next = msat_sector[-1]
if self.dump:
print 'MSAT (header part): \n', self.MSAT[:109]
print 'additional MSAT sectors: \n', self.MSAT[109:]
def __build_SAT(self):
sat_stream = ''.join([self.data[i*self.sect_size:(i+1)*self.sect_size] for i in self.MSAT if i >= 0])
sat_sids_count = len(sat_stream) >> 2
self.SAT = struct.unpack('<%dl' % sat_sids_count, sat_stream) # SIDs tuple
if self.dump:
print 'SAT sid count:\n', sat_sids_count
print 'SAT content:\n', self.SAT
def __build_SSAT(self):
ssat_stream = self.get_stream_data(self.data, self.SAT, self.ssat_start_sid, self.sect_size)
ssids_count = len(ssat_stream) >> 2
self.SSAT = struct.unpack('<%dl' % ssids_count, ssat_stream)
if self.dump:
print 'SSID count:', ssids_count
print 'SSAT content:\n', self.SSAT
def __build_directory(self):
dir_stream = self.get_stream_data(self.data, self.SAT, self.dir_start_sid, self.sect_size)
self.dir_entry_list = []
i = 0
while i < len(dir_stream):
dentry = dir_stream[i:i+128] # 128 -- dir entry size
i += 128
did = len(self.dir_entry_list)
sz, = struct.unpack('<H', dentry[64:66])
if sz > 0 :
name = dentry[0:sz-2].decode('utf_16_le', 'replace')
else:
name = u''
t, = struct.unpack('B', dentry[66])
c, = struct.unpack('B', dentry[67])
did_left , = struct.unpack('<l', dentry[68:72])
did_right , = struct.unpack('<l', dentry[72:76])
did_root , = struct.unpack('<l', dentry[76:80])
dentry_start_sid , = struct.unpack('<l', dentry[116:120])
stream_size , = struct.unpack('<L', dentry[120:124])
self.dir_entry_list.extend([(did, sz, name, t, c,
did_left, did_right, did_root,
dentry_start_sid, stream_size)])
if self.dump:
dentry_types = {
0x00: 'Empty',
0x01: 'User storage',
0x02: 'User stream',
0x03: 'LockBytes',
0x04: 'Property',
0x05: 'Root storage'
}
node_colours = {
0x00: 'Red',
0x01: 'Black'
}
print 'total directory entries:', len(self.dir_entry_list)
for dentry in self.dir_entry_list:
(did, sz, name, t, c,
did_left, did_right, did_root,
dentry_start_sid, stream_size) = dentry
print 'DID', did
print 'Size of the used area of the character buffer of the name:', sz
print 'dir entry name:', repr(name)
print 'type of entry:', t, dentry_types[t]
print 'entry colour:', c, node_colours[c]
print 'left child DID :', did_left
print 'right child DID:', did_right
print 'root DID :', did_root
print 'start SID :', dentry_start_sid
print 'stream size :', stream_size
if stream_size == 0:
print 'stream is empty'
elif stream_size >= self.min_stream_size:
print 'stream stored as normal stream'
else:
print 'stream stored as short-stream'
def __build_short_sectors_data(self):
(did, sz, name, t, c,
did_left, did_right, did_root,
dentry_start_sid, stream_size) = self.dir_entry_list[0]
assert t == 0x05 # Short-Stream Container Stream (SSCS) resides in Root Storage
if stream_size == 0:
self.short_sectors_data = ''
else:
self.short_sectors_data = self.get_stream_data(self.data, self.SAT, dentry_start_sid, self.sect_size)
def get_stream_data(self, data, SAT, start_sid, sect_size):
sid = start_sid
chunks = [(sid, sid)]
stream_data = ''
while SAT[sid] >= 0:
next_in_chain = SAT[sid]
last_chunk_start, last_chunk_finish = chunks[-1]
if next_in_chain == last_chunk_finish + 1:
chunks[-1] = last_chunk_start, next_in_chain
else:
chunks.extend([(next_in_chain, next_in_chain)])
sid = next_in_chain
for s, f in chunks:
stream_data += data[s*sect_size:(f+1)*sect_size]
#print chunks
return stream_data
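The chain walk can be illustrated with a toy allocation table. The following is a simplified Python 3 sketch, not the method above: `read_chain` is a hypothetical name, it reads one sector per step instead of coalescing consecutive sectors into chunks, and it adds a cycle guard that the original does not have.

```python
def read_chain(data, sat, start_sid, sect_size):
    """Concatenate the sectors of one stream by following its SAT chain."""
    pieces = []
    seen = set()
    sid = start_sid
    # Negative SIDs are markers (-1 free, -2 end of chain), so they stop the walk.
    while sid >= 0:
        if sid in seen:
            raise ValueError('cycle in sector chain at SID %d' % sid)
        seen.add(sid)
        pieces.append(data[sid * sect_size:(sid + 1) * sect_size])
        sid = sat[sid]  # the table entry for a sector names its successor
    return b''.join(pieces)
```

For example, with four 4-byte sectors `b'AAAABBBBCCCCDDDD'` and the table `[2, -1, 3, -2]`, a stream starting at SID 0 visits sectors 0, 2 and 3.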
def print_bin_data(data):
i = 0
while i < len(data):
j = 0
while (i < len(data)) and (j < 16):
c = '0x%02X' % ord(data[i])
sys.stdout.write(c)
sys.stdout.write(' ')
i += 1
j += 1
print
if i == 0:
print '<NO DATA>'
# This implementation writes only 'Root Entry', 'Workbook' streams
# and 2 empty streams for aligning directory stream on sector boundary
#
# LAYOUT:
# 0 header
# 76 MSAT (1st part: 109 SID)
# 512 workbook stream
# ... additional MSAT sectors if total stream size > about 7 MB == (109*512 * 128)
# ... SAT
# ... directory stream
#
# NOTE: this layout is "ad hoc". It can be more general. RTFM
class XlsDoc:
SECTOR_SIZE = 0x0200
MIN_LIMIT = 0x1000
SID_FREE_SECTOR = -1
SID_END_OF_CHAIN = -2
SID_USED_BY_SAT = -3
SID_USED_BY_MSAT = -4
def __init__(self):
#self.book_stream = '' # padded
self.book_stream_sect = []
self.dir_stream = ''
self.dir_stream_sect = []
self.packed_SAT = ''
self.SAT_sect = []
self.packed_MSAT_1st = ''
self.packed_MSAT_2nd = ''
self.MSAT_sect_2nd = []
self.header = ''
def __build_directory(self): # align on sector boundary
self.dir_stream = ''
dentry_name = '\x00'.join('Root Entry\x00') + '\x00'
dentry_name_sz = len(dentry_name)
dentry_name_pad = '\x00'*(64 - dentry_name_sz)
dentry_type = 0x05 # root storage
dentry_colour = 0x01 # black
dentry_did_left = -1
dentry_did_right = -1
dentry_did_root = 1
dentry_start_sid = -2
dentry_stream_sz = 0
self.dir_stream += struct.pack('<64s H 2B 3l 9L l L L',
dentry_name + dentry_name_pad,
dentry_name_sz,
dentry_type,
dentry_colour,
dentry_did_left,
dentry_did_right,
dentry_did_root,
0, 0, 0, 0, 0, 0, 0, 0, 0,
dentry_start_sid,
dentry_stream_sz,
0
)
dentry_name = '\x00'.join('Workbook\x00') + '\x00'
dentry_name_sz = len(dentry_name)
dentry_name_pad = '\x00'*(64 - dentry_name_sz)
dentry_type = 0x02 # user stream
dentry_colour = 0x01 # black
dentry_did_left = -1
dentry_did_right = -1
dentry_did_root = -1
dentry_start_sid = 0
dentry_stream_sz = self.book_stream_len
self.dir_stream += struct.pack('<64s H 2B 3l 9L l L L',
dentry_name + dentry_name_pad,
dentry_name_sz,
dentry_type,
dentry_colour,
dentry_did_left,
dentry_did_right,
dentry_did_root,
0, 0, 0, 0, 0, 0, 0, 0, 0,
dentry_start_sid,
dentry_stream_sz,
0
)
# padding
dentry_name = ''
dentry_name_sz = len(dentry_name)
dentry_name_pad = '\x00'*(64 - dentry_name_sz)
dentry_type = 0x00 # empty
dentry_colour = 0x01 # black
dentry_did_left = -1
dentry_did_right = -1
dentry_did_root = -1
dentry_start_sid = -2
dentry_stream_sz = 0
self.dir_stream += struct.pack('<64s H 2B 3l 9L l L L',
dentry_name + dentry_name_pad,
dentry_name_sz,
dentry_type,
dentry_colour,
dentry_did_left,
dentry_did_right,
dentry_did_root,
0, 0, 0, 0, 0, 0, 0, 0, 0,
dentry_start_sid,
dentry_stream_sz,
0
) * 2
def __build_sat(self):
# Build SAT
book_sect_count = self.book_stream_len >> 9
dir_sect_count = len(self.dir_stream) >> 9
total_sect_count = book_sect_count + dir_sect_count
SAT_sect_count = 0
MSAT_sect_count = 0
SAT_sect_count_limit = 109
while total_sect_count > 128*SAT_sect_count or SAT_sect_count > SAT_sect_count_limit:
SAT_sect_count += 1
total_sect_count += 1
if SAT_sect_count > SAT_sect_count_limit:
MSAT_sect_count += 1
total_sect_count += 1
SAT_sect_count_limit += 127
SAT = [self.SID_FREE_SECTOR]*128*SAT_sect_count
sect = 0
while sect < book_sect_count - 1:
self.book_stream_sect.append(sect)
SAT[sect] = sect + 1
sect += 1
self.book_stream_sect.append(sect)
SAT[sect] = self.SID_END_OF_CHAIN
sect += 1
while sect < book_sect_count + MSAT_sect_count:
self.MSAT_sect_2nd.append(sect)
SAT[sect] = self.SID_USED_BY_MSAT
sect += 1
while sect < book_sect_count + MSAT_sect_count + SAT_sect_count:
self.SAT_sect.append(sect)
SAT[sect] = self.SID_USED_BY_SAT
sect += 1
while sect < book_sect_count + MSAT_sect_count + SAT_sect_count + dir_sect_count - 1:
self.dir_stream_sect.append(sect)
SAT[sect] = sect + 1
sect += 1
self.dir_stream_sect.append(sect)
SAT[sect] = self.SID_END_OF_CHAIN
sect += 1
self.packed_SAT = struct.pack('<%dl' % (SAT_sect_count*128), *SAT)
MSAT_1st = [self.SID_FREE_SECTOR]*109
for i, SAT_sect_num in zip(range(0, 109), self.SAT_sect):
MSAT_1st[i] = SAT_sect_num
self.packed_MSAT_1st = struct.pack('<109l', *MSAT_1st)
MSAT_2nd = [self.SID_FREE_SECTOR]*128*MSAT_sect_count
if MSAT_sect_count > 0:
MSAT_2nd[- 1] = self.SID_END_OF_CHAIN
i = 109
msat_sect = 0
sid_num = 0
while i < SAT_sect_count:
if (sid_num + 1) % 128 == 0:
#print 'link: ',
msat_sect += 1
if msat_sect < len(self.MSAT_sect_2nd):
MSAT_2nd[sid_num] = self.MSAT_sect_2nd[msat_sect]
else:
#print 'sid: ',
MSAT_2nd[sid_num] = self.SAT_sect[i]
i += 1
#print sid_num, MSAT_2nd[sid_num]
sid_num += 1
self.packed_MSAT_2nd = struct.pack('<%dl' % (MSAT_sect_count*128), *MSAT_2nd)
#print vars()
#print zip(range(0, sect), SAT)
#print self.book_stream_sect
#print self.MSAT_sect_2nd
#print MSAT_2nd
#print self.SAT_sect
#print self.dir_stream_sect
def __build_header(self):
doc_magic = '\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'
file_uid = '\x00'*16
rev_num = '\x3E\x00'
ver_num = '\x03\x00'
byte_order = '\xFE\xFF'
log_sect_size = struct.pack('<H', 9)
log_short_sect_size = struct.pack('<H', 6)
not_used0 = '\x00'*10
total_sat_sectors = struct.pack('<L', len(self.SAT_sect))
dir_start_sid = struct.pack('<l', self.dir_stream_sect[0])
not_used1 = '\x00'*4
min_stream_size = struct.pack('<L', 0x1000)
ssat_start_sid = struct.pack('<l', -2)
total_ssat_sectors = struct.pack('<L', 0)
if len(self.MSAT_sect_2nd) == 0:
msat_start_sid = struct.pack('<l', -2)
else:
msat_start_sid = struct.pack('<l', self.MSAT_sect_2nd[0])
total_msat_sectors = struct.pack('<L', len(self.MSAT_sect_2nd))
self.header = ''.join([ doc_magic,
file_uid,
rev_num,
ver_num,
byte_order,
log_sect_size,
log_short_sect_size,
not_used0,
total_sat_sectors,
dir_start_sid,
not_used1,
min_stream_size,
ssat_start_sid,
total_ssat_sectors,
msat_start_sid,
total_msat_sectors
])
def save(self, file_name_or_filelike_obj, stream):
# 1. Align stream on 0x1000 boundary (and therefore on sector boundary)
padding = '\x00' * (0x1000 - (len(stream) % 0x1000))
self.book_stream_len = len(stream) + len(padding)
self.__build_directory()
self.__build_sat()
self.__build_header()
f = file_name_or_filelike_obj
we_own_it = not hasattr(f, 'write')
if we_own_it:
f = open(file_name_or_filelike_obj, 'wb')
f.write(self.header)
f.write(self.packed_MSAT_1st)
f.write(stream)
f.write(padding)
f.write(self.packed_MSAT_2nd)
f.write(self.packed_SAT)
f.write(self.dir_stream)
if we_own_it:
f.close()
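The sizing loop at the start of `__build_sat` is easier to follow when run on its own. This Python 3 sketch copies only that loop; `sat_layout` is a hypothetical name, and the factor 128 assumes the 512-byte sectors (128 four-byte SIDs each) that `XlsDoc` writes.

```python
def sat_layout(book_sect_count, dir_sect_count):
    """Return (SAT sector count, extra MSAT sector count) for the given streams."""
    total_sect_count = book_sect_count + dir_sect_count
    sat_sect_count = 0
    msat_sect_count = 0
    sat_sect_count_limit = 109  # SAT sector SIDs that fit in the header's MSAT part
    # Each SAT sector describes 128 sectors, but SAT and MSAT sectors occupy
    # sectors themselves, so the total grows while the tables are being sized.
    while (total_sect_count > 128 * sat_sect_count
           or sat_sect_count > sat_sect_count_limit):
        sat_sect_count += 1
        total_sect_count += 1
        if sat_sect_count > sat_sect_count_limit:
            # The header is full: add one MSAT sector, which holds 127 more
            # SAT sector SIDs plus a link to the next MSAT sector.
            msat_sect_count += 1
            total_sect_count += 1
            sat_sect_count_limit += 127
    return sat_sect_count, msat_sect_count
```

A workbook stream of 8 sectors plus a 1-sector directory needs a single SAT sector and no extra MSAT sectors; the first extra MSAT sector appears once the 110th SAT sector is required.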

Some files were not shown because too many files have changed in this diff