Welcome to MLCore

For other project documentation, visit the MTech Wiki.

Overview

Architecture

flowchart LR
    source([Source]) -- "1. data" --> datastore[(Datastore)]
    datastore -.- azml
    subgraph Azure
        datastore -. "2. trigger" .- azfunction[Function]
        datastore -- "3. Function: execute pipeline<br/>data" --> pipeline[Pipeline]
        model -. "5. trigger" .- azfunction
        subgraph azml[Machine Learning]
            pipeline -- "4. model" --> model[Model]
            model -- "6. Function: deploy model<br/>model" --> endpoint[Endpoint]
        end
    end
    %% amino([Amino]) -- "input" --> endpoint
    %% endpoint -- "output" --> amino
    %% style azml fill:#ffa

Machine Learning Pipeline

flowchart LR
    datastore[(Datastore)] -- data --> preprocess
    subgraph Pipeline
        preprocess[Preprocess Data] -- train --> model_train[Train Model]
        preprocess -- test --> model_train
        model_train -- model --> register[Register Model]
    end
    register -- model --> model[Model]

The Preprocess Data step in the Azure Machine Learning Pipeline uses the aminoml.preprocess module to transform the input data according to the config file in the Datastore. An Azure Function automatically executes the Machine Learning Pipeline when it detects that the input file has been saved to the Datastore.
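As a rough illustration of the trigger decision, the sketch below checks whether a saved blob path looks like an input data file and, if so, extracts the model it belongs to. It assumes the input file is the `input/data.csv` entry from the Datastore layout on this page; `ModelKey` and `parse_input_blob` are hypothetical names, not part of the Function App.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional


class ModelKey(NamedTuple):
    """Identifies one model folder in the Datastore (hypothetical helper type)."""
    customer: str
    species: str
    module: str
    model: str


def parse_input_blob(blob_path: str, data_name: str = "data.csv") -> Optional[ModelKey]:
    """Return the model key if blob_path looks like an input data file, else None.

    Assumed shape: customer/species/module/model/input/<data_name>
    """
    parts = PurePosixPath(blob_path).parts
    if len(parts) != 6 or parts[4] != "input" or parts[5] != data_name:
        return None
    return ModelKey(*parts[:4])
```

With this check, saving `preprocess_config.yml` or any file under `output/` would not start a pipeline run; only the data file would.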

Danger

The Azure Machine Learning Pipeline will fail if the config file (preprocess_config.yml) is not found in the input folder!
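One way to surface this failure early is a pre-flight check before uploading data. The sketch below assumes the model's input folder is available as a local or mounted directory; `require_preprocess_config` is a hypothetical helper, not a function in aminoml.

```python
from pathlib import Path


def require_preprocess_config(input_dir, config_name: str = "preprocess_config.yml") -> Path:
    """Return the config path, or raise FileNotFoundError if it is missing.

    input_dir is assumed to be a local or mounted copy of the model's input folder.
    """
    config_path = Path(input_dir) / config_name
    if not config_path.is_file():
        raise FileNotFoundError(f"Missing preprocess config: {config_path}")
    return config_path
```

Running the check before saving `data.csv` means a missing config is reported before the Azure Function starts a pipeline run.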

The Register Model step saves the ONNX model, model info, and test prediction files to the output folder in the Datastore. Input and output files follow this layout:

  • customer/species/module/model/input/preprocess_config.yml
  • customer/species/module/model/input/data.csv
  • customer/species/module/model/output/model.onnx
  • customer/species/module/model/output/model.json
  • customer/species/module/model/output/pred.csv
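The layout above can be expressed as a small path builder. This is a minimal sketch that only mirrors the file names listed on this page; the function name `model_paths` and the dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath


def model_paths(customer: str, species: str, module: str, model: str) -> dict:
    """Build the Datastore paths for one model, following the layout listed above."""
    root = PurePosixPath(customer, species, module, model)
    return {
        # Inputs read by the pipeline.
        "config": str(root / "input" / "preprocess_config.yml"),
        "data": str(root / "input" / "data.csv"),
        # Outputs written by the Register Model step.
        "onnx": str(root / "output" / "model.onnx"),
        "info": str(root / "output" / "model.json"),
        "pred": str(root / "output" / "pred.csv"),
    }
```

Keeping the names in one place avoids scattering string concatenation across the pipeline steps and the Function App.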

DevOps Pipelines

flowchart
    subgraph DevOps
        subgraph Release[Release Pipeline]
            direction TB
            deployfunc[Deploy Function App] --- deployweb[Deploy Web App]
        end
        subgraph Build[Build Pipeline]
            direction TB
            setup[Setup] --- pytest[Run Tests]
            pytest --- publishpipeline[Publish ML Pipeline]
            publishpipeline --- func[Build Function App]
            func --- web[Build Web App]
        end
    end
    Build -- function app artifact<br/>web app artifact --> Release
    %% style Build fill:#ffa
    %% style Release fill:#ffa

  • ML Pipeline: Published to Azure Machine Learning by the DevOps Build Pipeline.
  • Function App: Executes the Machine Learning Pipeline and deploys models.
  • Web App: Hosts this documentation website.

Project layout

.amlignore                          # Files to ignore for ML Pipeline.
.flake8                             # Flake8 config file.
.funcignore                         # Files to ignore for Azure Function.
.gitignore                          # Files to ignore for Git.
.pre-commit-config.yaml             # Pre-commit config file.
.python-version                     # Python version to use for this project.
host.json                           # Azure Function host configuration.
local.settings.json                 # Azure Function local settings.
main.py                             # Script to publish Azure ML Pipeline.
mkdocs.yml                          # Config for mkdocs.
poetry.lock                         # Poetry lock file.
poetry.toml                         # Poetry config file.
pyproject.toml                      # Python project file.
README.md                           # Project description.
requirements.txt                    # Azure Function requirements.
sonar-project.properties            # SonarCloud config file.
.vscode/...                         # Azure Function deployment settings.
aminoml/...                         # AminoML Python package. See AminoML.
aminoml/deployment/                 # Azure ML model deployment configs.
  score.py                          # Scoring script for model deployment.
  score_bim_longrange_loess.py      # Scoring script for bim-lr model deployment.
  score_bim_shortrange.py           # Scoring script for bim-sr model deployment.
automl_env/...                      # Snapshot of the latest AzureML-AutoML environment. Update with `make automl`.
azDeployModelUponRegister/...       # Azure Function to deploy model.
azExecutePipelineUponFileSave/...   # Azure Function to execute pipeline.
azureml_pipeline/                   # Azure ML Pipeline configs and steps.
  automl.yml                        # AutoML config for ML Pipeline.
  data_prep.py                      # Data preprocess script for ML Pipeline.
  register.py                       # Model register script for ML Pipeline.
dev_tools/
  download_automl_env.py            # Download the latest AzureML-AutoML environment from AzureML.
  update_environment.sh             # Extract conda and pip requirements from the AzureML-AutoML environment snapshot.
docs/                               # Documentation and mkdocs related files.
  conf/...                          # MLCore configuration files.
  img/...                           # Images used in the docs and mkdocs.
  javascripts/...                   # Javascript files used in mkdocs.
  stylesheets/...                   # CSS files used in mkdocs.
  ...                               # Documentation files.
notebooks/...                       # Jupyter notebooks.
package_requirements/               # Python package requirements.
  install_validated_requirements.py # Script to install azure-sdk validated requirements.
  README.md                         # Description for package_requirements/.
  requirements.txt                  # MLCore Python package requirements.
web/                                # Web app files.
  staticwebapp.config.json          # Static Web App config file.