In the realm of machine learning, the ability to accurately evaluate the performance of your model is paramount. Whether you're tackling regression or classification problems, understanding how your model measures up against predefined benchmarks is key to refining its capabilities. That's why Xdeep empowers you with the flexibility to define your own conditions for stopping model training, using a simple and intuitive grammar.
With Xdeep, users can specify conditions using the following grammar:
(Performance Metric) (Operator) (Targeted Value)
where the Operator is one of:
LessThanOrEqual
GreaterThanOrEqual
The Performance Metrics encompass four (4) fundamental measures applicable to both Regression and Classification problems, such as Mean Square Error and R-Squared, alongside an additional set of eleven (11) metrics tailored specifically for Classification scenarios, including False Classified Examples and the Fowlkes-Mallows index.
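For illustration, a condition expressed in this grammar could be evaluated as follows. The metric and operator names come from the grammar above, but the parsing code itself is a hypothetical sketch, not Xdeep's actual API:

```python
def parse_condition(condition):
    """Parse '(Performance Metric) (Operator) (Targeted Value)' into a
    metric name and a predicate over an observed metric value."""
    metric, operator, value = condition.split()
    target = float(value)
    ops = {
        "LessThanOrEqual": lambda observed: observed <= target,
        "GreaterThanOrEqual": lambda observed: observed >= target,
    }
    return metric, ops[operator]

# Stop training once the mean square error drops to 0.01 or below.
metric, is_satisfied = parse_condition("MeanSquareError LessThanOrEqual 0.01")
print(is_satisfied(0.005))  # True: the target has been reached
```

Training would then be stopped at the first epoch whose observed metric satisfies the predicate.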
By allowing users to define their own conditions based on these performance metrics, Xdeep offers unparalleled flexibility in model training. Whether your priority is minimizing error, maximizing accuracy, or achieving a specific balance of precision and recall, Xdeep puts the power in your hands.
For detailed information on each performance metric, including mathematical formulas, refer to our comprehensive Performance Metrics Guide.
In classification problems, Xdeep employs an automatic training mode wherein the corresponding neural network model is trained according to a specific rule: ensuring that the count of False Classified Examples (FCE) equals zero.
During each epoch, the network undergoes optimal expansion until it satisfies the aforementioned target rule. False classified examples are instances where the predicted value exhibits an opposite sign from the actual (targeted) value. Consequently, the resultant model attains minimal complexity, maximizing its generalization capability while accurately addressing all known data patterns.
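As a rough illustration of the sign convention described above (this is a sketch of the rule, not Xdeep's internal implementation), counting False Classified Examples amounts to counting predictions whose sign disagrees with the target:

```python
def count_fce(predictions, targets):
    """Count False Classified Examples (FCE): instances where the
    predicted value and the actual (targeted) value have opposite signs."""
    return sum(1 for p, t in zip(predictions, targets) if p * t < 0)

# The automatic training mode expands the network until FCE == 0:
preds = [0.8, -0.3, 0.1, -0.9]
labels = [1, -1, 1, -1]
print(count_fce(preds, labels))  # 0 -> the target rule is satisfied
```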
In regression scenarios, Xdeep implements a strategic approach for partitioning the dataset into training, validation, and test sets in order to train a neural network model effectively:
Initial Split: The dataset is initially divided into two segments: a training set and a holdout set. Approximately 90% of the available data is allocated for training, while the remaining 10% forms the holdout set. Both sets are crafted to maintain statistical properties closely resembling those of the original dataset.
Further Split for Validation: The training set from the previous step undergoes an additional partition, resulting in separate training and validation subsets. About 80% of the data is utilized for training, leaving 20% for validation purposes.
Test set: The holdout set derived from the initial split functions as the test set, remaining untouched throughout the Xdeep training procedure. It serves as an entirely unseen dataset for objectively evaluating the performance of the generated model, assessed in terms of relative error and uncertainty.
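Xdeep performs this partitioning internally. As a minimal sketch of the same 90/10 and 80/20 scheme, assuming a plain random shuffle (the text notes that Xdeep additionally preserves the statistical properties of the original dataset, which a naive shuffle does not guarantee):

```python
import random

def split_dataset(data, seed=0):
    """Split data into ~72% train, ~18% validation, ~10% test:
    first 90/10 (train vs. holdout), then 80/20 (train vs. validation)."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(0.9 * len(shuffled))
    train_full, test = shuffled[:cut], shuffled[cut:]   # initial split
    cut2 = int(0.8 * len(train_full))
    train, validation = train_full[:cut2], train_full[cut2:]  # further split
    return train, validation, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 72 18 10
```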
Training process: Xdeep proceeds to train the neural network model on the training set, concurrently utilizing the validation set to monitor its performance, in terms of Mean Square Error (MSE), throughout the training phase. This facilitates the detection of overfitting and the fine-tuning of hyperparameters as necessary.
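The monitoring step can be sketched as follows. The patience-based rule here is a common generic way to detect the onset of overfitting; it is an assumption for illustration, not Xdeep's documented criterion:

```python
def mse(predictions, targets):
    """Mean Square Error over a set of predictions."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def monitor_training(epochs_val_predictions, val_targets, patience=2):
    """Track validation MSE per epoch; flag likely overfitting once the
    validation error has failed to improve for `patience` epochs."""
    best, stale = float("inf"), 0
    for epoch, preds in enumerate(epochs_val_predictions):
        current = mse(preds, val_targets)
        if current < best:
            best, stale = current, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best  # validation error no longer improving
    return len(epochs_val_predictions) - 1, best

# Example: validation MSE improves, then degrades for two epochs.
epoch, best = monitor_training(
    [[0, 0], [0.5, 1.5], [0, 1], [0, 0]], val_targets=[1, 2], patience=2
)
print(epoch, best)  # stopped at epoch 3 with best validation MSE 0.25
```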
Evaluation process on test set: Following the finalization of the model based on validation set performance, Xdeep evaluates the Prediction Absolute Error (PAE) and Uncertainty (UNC) using the test set (see also the Performance Metrics Guide). This evaluation provides an unbiased estimate of the model's ability to generalize to unseen data, ensuring a comprehensive assessment of its performance.
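The exact PAE and UNC formulas are defined in the Performance Metrics Guide and are not reproduced here. As a generic stand-in only, per-example absolute error on the untouched test set could be computed like this:

```python
def prediction_absolute_errors(predictions, targets):
    """Per-example absolute error on the held-out test set.
    (Illustrative stand-in; Xdeep's actual PAE and UNC definitions
    are given in the Performance Metrics Guide.)"""
    return [abs(p - t) for p, t in zip(predictions, targets)]

errors = prediction_absolute_errors([2.1, 3.9], [2.0, 4.0])
print(max(errors))  # worst-case absolute error over the test set
```

Because the test set was never touched during training or validation, statistics derived from these errors give an unbiased view of generalization.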