clinical-toolkit

clinical-toolkit is a machine learning Python API designed for clinical data processing. The objective of this repository is to gather useful tools for such tasks in a single API rather than to implement new, specific algorithms. For that reason, it relies on widely used libraries such as scikit-learn and gensim.

Installation

The package is not available on PyPI so you need to install it from source.

$ git clone https://github.com/DITEP/clinical-toolkit.git
$ cd clinical-toolkit
$ pip install -r requirements.txt
$ pip install .

To check that the installation is consistent, go to the root of the project and run:

$ nosetests

This should print OK in your terminal.

Dependencies

The repository is compatible with the following package versions, but may also work with earlier ones.

Python 2 has not been tested. Windows support has not been tested yet.

  • beautifulsoup4==4.6.0
  • gensim==3.4.0
  • nltk==3.3
  • nose==1.3.7
  • numpy==1.14.2
  • pandas==0.23.0
  • requests==2.18.4
  • scikit-learn==0.19.1
  • scipy==1.1.0
  • SQLAlchemy==1.2.7
  • Unidecode==1.0.22
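
As a quick way to compare an environment against the pinned versions above, the following is a minimal sketch and not part of the package. It assumes Python 3.8 or later (for importlib.metadata); the names PINNED and check_versions are hypothetical and chosen only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the dependency list above.
PINNED = {
    "beautifulsoup4": "4.6.0",
    "gensim": "3.4.0",
    "nltk": "3.3",
    "nose": "1.3.7",
    "numpy": "1.14.2",
    "pandas": "0.23.0",
    "requests": "2.18.4",
    "scikit-learn": "0.19.1",
    "scipy": "1.1.0",
    "SQLAlchemy": "1.2.7",
    "Unidecode": "1.0.22",
}


def check_versions(pinned):
    """Return {package: (expected, installed or None)} for every mismatch.

    Hypothetical helper: a package that is not installed is reported
    with None as its installed version.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Print one line per package whose installed version differs from the pin.
    for name, (expected, installed) in sorted(check_versions(PINNED).items()):
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

A mismatch reported here is not necessarily a problem, since other versions may also work; the output only shows where an environment differs from the tested configuration.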

Development

Everyone is welcome to take part in the project. There are, however, a few guidelines to follow to keep it clean and functioning:

  • make sure your code follows the PEP 8 guidelines for optimal readability. The best way to check this is with the flake8 tool.
  • to report an issue or a bug, go to https://github.com/DITEP/clinical-toolkit/issues and make sure the error you report is reproducible.
  • to add a feature or fix a bug, clone the repo, create a new branch (e.g. new_feature), and then open a pull request so that other contributors can review your code.

References