Metadata-Version: 2.1 Name: pyexcel-io Version: 0.6.7 Summary: A python library to read and write structured data in csv, zipped csvformat and to/from databases Home-page: https://github.com/pyexcel/pyexcel-io Download-URL: https://github.com/pyexcel/pyexcel-io/archive/0.6.7.tar.gz Author: C.W. Author-email: info@pyexcel.org License: New BSD Keywords: python,API,tsv,tsvz,csv,csvz,django,sqlalchemy Classifier: Topic :: Software Development :: Libraries Classifier: Programming Language :: Python Classifier: Intended Audience :: Developers Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: License :: OSI Approved :: BSD License Classifier: Programming Language :: Python :: Implementation :: PyPy Requires-Python: >=3.6 License-File: LICENSE Requires-Dist: lml>=0.0.4 Provides-Extra: extra Requires-Dist: chardet; extra == "extra" Provides-Extra: ods Requires-Dist: pyexcel-ods3>=0.6.0; extra == "ods" Provides-Extra: xls Requires-Dist: pyexcel-xls>=0.6.0; extra == "xls" Provides-Extra: xlsx Requires-Dist: pyexcel-xlsx>=0.6.0; extra == "xlsx" ================================================================================ pyexcel-io - Let you focus on data, instead of file formats ================================================================================ .. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png :target: https://www.patreon.com/chfw .. image:: https://raw.githubusercontent.com/pyexcel/pyexcel-mobans/master/images/awesome-badge.svg :target: https://awesome-python.com/#specific-formats-processing .. image:: https://github.com/pyexcel/pyexcel-io/workflows/run_tests/badge.svg :target: http://github.com/pyexcel/pyexcel-io/actions .. image:: https://codecov.io/gh/pyexcel/pyexcel-io/branch/master/graph/badge.svg :target: https://codecov.io/gh/pyexcel/pyexcel-io .. image:: https://badge.fury.io/py/pyexcel-io.svg :target: https://pypi.org/project/pyexcel-io .. image:: https://anaconda.org/conda-forge/pyexcel-io/badges/version.svg :target: https://anaconda.org/conda-forge/pyexcel-io .. image:: https://pepy.tech/badge/pyexcel-io/month :target: https://pepy.tech/project/pyexcel-io .. image:: https://anaconda.org/conda-forge/pyexcel-io/badges/downloads.svg :target: https://anaconda.org/conda-forge/pyexcel-io .. image:: https://img.shields.io/gitter/room/gitterHQ/gitter.svg :target: https://gitter.im/pyexcel/Lobby .. image:: https://img.shields.io/static/v1?label=continuous%20templating&message=%E6%A8%A1%E7%89%88%E6%9B%B4%E6%96%B0&color=blue&style=flat-square :target: https://moban.readthedocs.io/en/latest/#at-scale-continous-templating-for-open-source-projects .. image:: https://img.shields.io/static/v1?label=coding%20style&message=black&color=black&style=flat-square :target: https://github.com/psf/black .. image:: https://readthedocs.org/projects/pyexcel-io/badge/?version=latest :target: http://pyexcel-io.readthedocs.org/en/latest/ Support the project ================================================================================ If your company has embedded pyexcel and its components into a revenue generating product, please support me on github, `patreon `_ or `bounty source `_ to maintain the project and develop it further. If you are an individual, you are welcome to support me too and for however long you feel like. As my backer, you will receive `early access to pyexcel related contents `_. And your issues will get prioritized if you would like to become my patreon as `pyexcel pro user`. With your financial support, I will be able to invest a little bit more time in coding, documentation and writing interesting posts. Known constraints ================== Fonts, colors and charts are not supported. Nor to read password protected xls, xlsx and ods files. Introduction ================================================================================ **pyexcel-io** provides **one** application programming interface(API) to read and write the data in excel format, import the data into and export the data from database. It provides support for csv(z) format, django database and sqlalchemy supported databases. Its supported file formats are extended to cover "xls", "xlsx", "ods" by the following extensions: .. _file-format-list: .. _a-map-of-plugins-and-file-formats: .. table:: A list of file formats supported by external plugins ======================== ======================= ================= Package name Supported file formats Dependencies ======================== ======================= ================= `pyexcel-io`_ csv, csvz [#f1]_, tsv, csvz,tsvz readers depends on `chardet` tsvz [#f2]_ `pyexcel-xls`_ xls, xlsx(read only), `xlrd`_, xlsm(read only) `xlwt`_ `pyexcel-xlsx`_ xlsx `openpyxl`_ `pyexcel-ods3`_ ods `pyexcel-ezodf`_, lxml `pyexcel-ods`_ ods `odfpy`_ ======================== ======================= ================= .. table:: Dedicated file reader and writers ======================== ======================= ================= Package name Supported file formats Dependencies ======================== ======================= ================= `pyexcel-xlsxw`_ xlsx(write only) `XlsxWriter`_ `pyexcel-libxlsxw`_ xlsx(write only) `libxlsxwriter`_ `pyexcel-xlsxr`_ xlsx(read only) lxml `pyexcel-xlsbr`_ xlsb(read only) pyxlsb `pyexcel-odsr`_ read only for ods, fods lxml `pyexcel-odsw`_ write only for ods loxun `pyexcel-htmlr`_ html(read only) lxml,html5lib `pyexcel-pdfr`_ pdf(read only) camelot ======================== ======================= ================= Plugin shopping guide ------------------------ Since 2020, all pyexcel-io plugins have dropped the support for python versions which are lower than 3.6. If you want to use any of those Python versions, please use pyexcel-io and its plugins versions that are lower than 0.6.0. Except csv files, xls, xlsx and ods files are a zip of a folder containing a lot of xml files The dedicated readers for excel files can stream read In order to manage the list of plugins installed, you need to use pip to add or remove a plugin. When you use virtualenv, you can have different plugins per virtual environment. In the situation where you have multiple plugins that does the same thing in your environment, you need to tell pyexcel which plugin to use per function call. For example, pyexcel-ods and pyexcel-odsr, and you want to get_array to use pyexcel-odsr. You need to append get_array(..., library='pyexcel-odsr'). .. _pyexcel-io: https://github.com/pyexcel/pyexcel-io .. _pyexcel-xls: https://github.com/pyexcel/pyexcel-xls .. _pyexcel-xlsx: https://github.com/pyexcel/pyexcel-xlsx .. _pyexcel-ods: https://github.com/pyexcel/pyexcel-ods .. _pyexcel-ods3: https://github.com/pyexcel/pyexcel-ods3 .. _pyexcel-odsr: https://github.com/pyexcel/pyexcel-odsr .. _pyexcel-odsw: https://github.com/pyexcel/pyexcel-odsw .. _pyexcel-pdfr: https://github.com/pyexcel/pyexcel-pdfr .. _pyexcel-xlsxw: https://github.com/pyexcel/pyexcel-xlsxw .. _pyexcel-libxlsxw: https://github.com/pyexcel/pyexcel-libxlsxw .. _pyexcel-xlsxr: https://github.com/pyexcel/pyexcel-xlsxr .. _pyexcel-xlsbr: https://github.com/pyexcel/pyexcel-xlsbr .. _pyexcel-htmlr: https://github.com/pyexcel/pyexcel-htmlr .. _xlrd: https://github.com/python-excel/xlrd .. _xlwt: https://github.com/python-excel/xlwt .. _openpyxl: https://bitbucket.org/openpyxl/openpyxl .. _XlsxWriter: https://github.com/jmcnamara/XlsxWriter .. _pyexcel-ezodf: https://github.com/pyexcel/pyexcel-ezodf .. _odfpy: https://github.com/eea/odfpy .. _libxlsxwriter: http://libxlsxwriter.github.io/getting_started.html .. rubric:: Footnotes .. [#f1] zipped csv file .. [#f2] zipped tsv file If you need to manipulate the data, you might do it yourself or use its brother library `pyexcel `__ . If you would like to extend it, you may use it to write your own extension to handle a specific file format. Installation ================================================================================ You can install pyexcel-io via pip: .. code-block:: bash $ pip install pyexcel-io or clone it and install it: .. code-block:: bash $ git clone https://github.com/pyexcel/pyexcel-io.git $ cd pyexcel-io $ python setup.py install Development guide ================================================================================ Development steps for code changes #. git clone https://github.com/pyexcel/pyexcel-io.git #. cd pyexcel-io Upgrade your setup tools and pip. They are needed for development and testing only: #. pip install --upgrade setuptools pip Then install relevant development requirements: #. pip install -r rnd_requirements.txt # if such a file exists #. pip install -r requirements.txt #. pip install -r tests/requirements.txt Once you have finished your changes, please provide test case(s), relevant documentation and update changelog.yml .. note:: As to rnd_requirements.txt, usually, it is created when a dependent library is not released. Once the dependecy is installed (will be released), the future version of the dependency in the requirements.txt will be valid. How to test your contribution ------------------------------ Although `nose` and `doctest` are both used in code testing, it is adviable that unit tests are put in tests. `doctest` is incorporated only to make sure the code examples in documentation remain valid across different development releases. On Linux/Unix systems, please launch your tests like this:: $ make On Windows, please issue this command:: > test.bat Before you commit ------------------------------ Please run:: $ make format so as to beautify your code otherwise your build may fail your unit test. License ================================================================================ New BSD License 6 contributors ================================================================================ In alphabetical order: * `Craig Anderson `_ * `John Vandenberg `_ * `Stephen J. Fuhry `_ * `Stephen Rauch `_ * `Vincent Raspal `_ * `Víctor Antonio Hernández Monroy `_ Change log ================================================================================ 0.6.7 - 09.11.2024 -------------------------------------------------------------------------------- **updated** #. `#115 `_: Pathnames with . cause file_name error in get_writer. #. `#117 `_: fix a typo in license. 0.6.6 - 31.1.2022 -------------------------------------------------------------------------------- **updated** #. `#112 `_: Log Empty Row Warning instead 'print' 0.6.5 - 08.10.2021 -------------------------------------------------------------------------------- **updated** #. `#109 `_: enable ods3 to have datetime 0.6.4 - 31.10.2020 -------------------------------------------------------------------------------- **updated** #. `#102 `_: skip columns from imported excel sheet. 0.6.3 - 12.10.2020 -------------------------------------------------------------------------------- **fixed** #. `#96 `_: regression: unknown file type shall trigger NoSupportingPluginFound **updated** #. extra dependencies uses 0.6.0 based plugins 0.6.2 - 7.10.2020 -------------------------------------------------------------------------------- **updated** #. `#94 `_: keep backward compatibility for pyexcel-xls 0.4.1 0.6.1 - 7.10.2020 -------------------------------------------------------------------------------- **removed** #. python 3.6 lower versions are no longer supported **updated** #. pyexcel-io plugin interface has been rewritten. PyInstaller user will be impacted. please read 'Packaging with Pyinstaller' in the documentation. #. new query set reader plugin. pyexcel<=0.6.4 has used intrusive way of getting query set source done. it is against the plugin interface. **fixed** #. `#74 `_: handle zip files which contain non-UTF-8 encoded files. **added** #. `#86 `_: allow trailing options, get_data(...keep_trailing_empty_cells=True). 0.5.20 - 17.7.2019 -------------------------------------------------------------------------------- **updated** #. `#70 `_: when the given file is a root directory, the error shall read it is not a file 0.5.19 - 14.7.2019 -------------------------------------------------------------------------------- **updated** #. `pyexcel#185 `_: handle stream conversion if file type(html) needs string content then bytes to handle 0.5.18 - 12.06.2019 -------------------------------------------------------------------------------- **updated** #. `#69 `_: Force file type(force_file_type) on write 0.5.17 - 04.04.2019 -------------------------------------------------------------------------------- **updated** #. `#68 `_: Raise IOError when the data file does not exist 0.5.16 - 19.03.2019 -------------------------------------------------------------------------------- **updated** #. `#67 `_: fix conversion issue for long type on python 2.7 for ods 0.5.15 - 16.03.2019 -------------------------------------------------------------------------------- **updated** #. `pyexcel-ods#33 `_: fix integer comparision error on i586 0.5.14 - 21.02.2019 -------------------------------------------------------------------------------- **updated** #. `#65 `_: add tests/__init__.py because python2.7 setup.py test needs it 0.5.13 - 12.02.2019 -------------------------------------------------------------------------------- **updated** #. `#63 `_: Version 0.5.12 prevents xslx and ods plugin from being loaded 0.5.12 - 9.02.2019 -------------------------------------------------------------------------------- **updated** #. `#60 `_: include tests in tar ball #. `#61 `_: enable python setup.py test 0.5.11 - 3.12.2018 -------------------------------------------------------------------------------- **updated** #. `#59 `_: Please use scan_plugins_regex, which lml 0.7 complains about 0.5.10 - 27.11.2018 -------------------------------------------------------------------------------- **added** #. `#57 `_, long type will not be written in ods. please use string type. And if the integer is equal or greater than 10 to the power of 16, it will not be written either in ods. In both situation, IntegerPrecisionLossError will be raised. And this version enables pyexcel-ods and pyexcel-ods3 to do so. 0.5.9.1 - 30.08.2018 -------------------------------------------------------------------------------- **updated** #. `#53 `_, upgrade lml dependency to at least 0.0.2 0.5.9 - 23.08.2018 -------------------------------------------------------------------------------- **added** #. `pyexcel#148 `_, support force_file_type 0.5.8 - 16.08.2018 -------------------------------------------------------------------------------- **added** #. `#49 `_, support additional options when detecting float values in csv format. default_float_nan, ignore_nan_text 0.5.7 - 02.05.2018 -------------------------------------------------------------------------------- **fixed** #. `#48 `_, turn off pep 0515 #. `#47 `_, csv reader cannot handle relative file names 0.5.6 - 11.01.2018 -------------------------------------------------------------------------------- **fixed** #. `#46 `_, expose `bulk_save` to developer 0.5.5 - 23.12.2017 -------------------------------------------------------------------------------- **fixed** #. Issue `#45 `_, csv reader throws exception because google app engine does not support mmap. People who are not working with google app engine, need not to take this update. Enjoy your Christmas break. 0.5.4 - 10.11.2017 -------------------------------------------------------------------------------- **updated** #. PR `#44 `_, use unicodewriter for csvz writers. 0.5.3 - 23.10.2017 -------------------------------------------------------------------------------- **updated** #. pyexcel `pyexcel#105 `_, remove gease from setup_requires, introduced by 0.5.2. #. remove python2.6 test support 0.5.2 - 20.10.2017 -------------------------------------------------------------------------------- **added** #. `pyexcel#103 `_, include LICENSE file in MANIFEST.in, meaning LICENSE file will appear in the released tar ball. 0.5.1 - 02.09.2017 -------------------------------------------------------------------------------- **Fixed** #. `pyexcel-ods#25 `_, Unwanted dependency on pyexcel. 0.5.0 - 30.08.2017 -------------------------------------------------------------------------------- **Added** #. Collect all data type conversion codes as service.py. **Updated** #. `#19 `_, use cString by default. For python, it will be a performance boost 0.4.4 - 08.08.2017 -------------------------------------------------------------------------------- **Updated** #. `#42 `_, raise exception if database table name does not match the sheet name 0.4.3 - 29.07.2017 -------------------------------------------------------------------------------- **Updated** #. `#41 `_, walk away gracefully when mmap is not available. 0.4.2 - 05.07.2017 -------------------------------------------------------------------------------- **Updated** #. `#37 `_, permanently fix the residue folder pyexcel by release all future releases in a clean clone. 0.4.1 - 29.06.2017 -------------------------------------------------------------------------------- **Updated** #. `#39 `_, raise exception when bulk save in django fails. Please `bulk_save=False` if you as the developer choose to save the records one by one if bulk_save cannot be used. However, exception in one-by-one save case will be raised as well. This change is made to raise exception in the first place so that you as the developer will be suprised when it was deployed in production. 0.4.0 - 19.06.2017 -------------------------------------------------------------------------------- **Updated** #. 'built-in' as the value to the parameter 'library' as parameter to invoke pyexcel-io's built-in csv, tsv, csvz, tsvz, django and sql won't work. It is renamed to 'pyexcel-io'. #. built-in csv, tsv, csvz, tsvz, django and sql are lazy loaded. #. pyexcel-io plugin interface has been updated. v0.3.x plugins won't work. #. `#32 `_, csv and csvz file handle are made sure to be closed. File close mechanism is enfored. #. iget_data function is introduced to cope with dangling file handle issue. **Removed** #. Removed plugin loading code and lml is used instead 0.3.4 - 18.05.2017 -------------------------------------------------------------------------------- **Updated** #. `#33 `_, handle mmap object differently given as file content. This issue has put in a priority to single sheet csv over multiple sheets in a single memory stream. The latter format is pyexcel own creation but is rarely used. In latter case, multiple_sheet=True should be passed along get_data. #. `#34 `_, treat mmap object as a file content. #. `#35 `_, encoding parameter take no effect when given along with file content #. use ZIP_DEFALTED to really do the compression 0.3.3 - 30.03.2017 -------------------------------------------------------------------------------- **Updated** #. `#31 `_, support pyinstaller 0.3.2 - 26.01.2017 -------------------------------------------------------------------------------- **Updated** #. `#29 `_, change skip_empty_rows to False by default 0.3.1 - 21.01.2017 -------------------------------------------------------------------------------- **Added** #. updated versions of extra packages **Updated** #. `#23 `_, provide helpful message when old pyexcel plugin exists #. restored previously available diagnosis message for missing libraries 0.3.0 - 22.12.2016 -------------------------------------------------------------------------------- **Added** #. lazy loading of plugins. for example, pyexcel-xls is not entirely loaded until xls format is used at its first attempted reading or writing. Since it is loaded, it will not be loaded in the second io action. #. `pyexcel-xls#11 `_, make case-insensitive for file type 0.2.6 - 21.12.2016 -------------------------------------------------------------------------------- **Updated** #. `#24 `__, pass on batch_size 0.2.5 - 20.12.2016 -------------------------------------------------------------------------------- **Updated** #. `#26 `__, performance issue with getting the number of columns. 0.2.4 - 24.11.2016 -------------------------------------------------------------------------------- **Updated** #. `#23 `__, Failed to convert long integer string in python 2 to its actual value 0.2.3 - 16.09.2016 -------------------------------------------------------------------------------- **Added** #. `#21 `__, choose subset from data base tables for export #. `#22 `__, custom renderer if given `row_renderer` as parameter. 0.2.2 - 31.08.2016 -------------------------------------------------------------------------------- **Added** #. support pagination. two pairs: start_row, row_limit and start_column, column_limit help you deal with large files. #. `skip_empty_rows=True` was introduced. To include empty rows, put it to False. **Updated** #. `#20 `__, pyexcel-io attempts to parse cell contents of 'infinity' as a float/int, crashes 0.2.1 - 11.07.2016 -------------------------------------------------------------------------------- **Added** #. csv format: handle utf-16 encoded csv files. Potentially being able to decode other formats if correct "encoding" is provided #. csv format: write utf-16 encoded files. Potentially other encoding is also supported #. support stdin as input stream and stdout as output stream **Updated** #. Attention, user of pyexcel-io! No longer io stream validation is performed in python 3. The guideline is: io.StringIO for csv, tsv only, otherwise BytesIO for xlsx, xls, ods. You can use RWManager.get_io to produce a correct stream type for you. #. `#15 `__, support foreign django/sql foreign key 0.2.0 - 01.06.2016 -------------------------------------------------------------------------------- **Added** #. autoload of pyexcel-io plugins #. auto detect `datetime`, `float` and `int`. Detection can be switched off by `auto_detect_datetime`, `auto_detect_float`, `auto_detect_int` 0.1.0 - 17.01.2016 -------------------------------------------------------------------------------- **Added** #. yield key word to return generator as content