Skip to content

NLP Framework

Project structure

Here is the structure of a project generated with generate_nlp_project command :

.
├─ template_nlp                 # your application package
    ├─ models_training         # global config and utilities
        ├─ models_sklearn     # package containing some predefined scikit-learn models
        ├─ models_tensorflow  # package containing some predefined tensorflow models
        ├─ model_class.py     # module containing Model base class
        ├─ ...
        └─ utils_models.py    # module containing utility functions
        ├─ monitoring              # package containing monitoring utilities (mlflow, model explicability)
        ├─ preprocessing           # package containing preprocessing logic
        ├─ __init__.py
    └─ utils.py
 
├─ template_nlp-data            # Folder where to store your data
├─ template_nlp-exploration     # Folder where to store your exploratory notebooks
├─ template_nlp-models          # Folder containing trained models
├─ template_nlp-scripts         # Folder containing script for preprocessing, training, etc.
├─ template_nlp-tutorials       # Folder containing a tutorial notebook
.
.
.
├─ makefile
├─ setup.py
└─ README.md

Warning

If you used a custom preprocessing function funcA with FunctionTransformer, be aware that the pickled pipeline may not return wanted results if you later modify funcA definition.

Please check gabarit/issues/63