Predict
Predict[{in_{1}->out_{1},in_{2}->out_{2},…}]
generates a PredictorFunction[…] based on the example input-output pairs given.
Predict[{in_{1},in_{2},…}->{out_{1},out_{2},…}]
generates the same result.
Predict[training,input]
attempts to predict the output associated with input from the training examples given.
Predict["name",input]
uses the built-in predictor function represented by "name".
Predict[predictor,opts]
takes an existing predictor function and modifies it with the new options given.
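The calling forms above can be sketched as follows. The numeric training values are invented for illustration, and the variable names (rules, p1, p2) are arbitrary; this is a minimal sketch rather than a canonical example.

```wolfram
(* Illustrative training data; the values are made up *)
rules = {1 -> 1.2, 2 -> 2.1, 3 -> 2.9, 4 -> 4.2, 5 -> 5.1};

(* Form 1: a list of input -> output rules *)
p1 = Predict[rules];

(* Form 2: a rule from the list of inputs to the list of outputs *)
p2 = Predict[Keys[rules] -> Values[rules]];

(* Apply a trained predictor to a new input *)
p1[3.5]

(* Train and predict in a single step, without keeping the predictor *)
Predict[rules, 3.5]
```

The one-step form retrains on every call, so keeping the PredictorFunction is usually preferable when several inputs need to be predicted.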
Details and Options
 Predict can be used on many types of data, including numerical, textual, sounds and images, as well as combinations of these.
 Each input_{i} can be a single data element, a list of data elements, an association of data elements or a Dataset object. In Predict[training,…], training can be a Dataset object.
 Predict[training] returns a PredictorFunction[…] that can then be applied to specific data.
 In Predict[…,input], input can be a single item or a list of items.
 In Predict[…,input,prop], properties are as given in PredictorFunction[…]; they include:

"Decision"        best prediction according to distribution and utility function
"Distribution"    distribution of value conditioned on input
"SHAPValues"      Shapley additive feature explanations for each example
"Properties"      list of all properties available

 "SHAPValues" assesses the contribution of features by comparing predictions with different sets of features removed and then synthesized. The option MissingValueSynthesis can be used to specify how the missing features are synthesized. SHAP explanations are given as deviation from the training output mean. "SHAPValues"->n can be used to control the number of samples used for the numeric estimations of SHAP explanations.
 Examples of built-in predictor functions include:

"NameAge"    age of a person, given their first name

 The following options can be given:

AnomalyDetector            None       anomaly detector used by the predictor
AcceptanceThreshold        Automatic  rarer probability threshold for anomaly detector
FeatureExtractor           Identity   how to extract features from which to learn
FeatureNames               Automatic  feature names to assign for input data
FeatureTypes               Automatic  feature types to assume for input data
IndeterminateThreshold     0          below what probability density to return Indeterminate
Method                     Automatic  which regression algorithm to use
MissingValueSynthesis      Automatic  how to synthesize missing values
PerformanceGoal            Automatic  aspects of performance to try to optimize
RecalibrationFunction      Automatic  how to postprocess predicted value
RandomSeeding              1234       what seeding of pseudorandom generators should be done internally
TargetDevice               "CPU"      the target device on which to perform training
TimeGoal                   Automatic  how long to spend training the predictor
TrainingProgressReporting  Automatic  how to report progress during training
UtilityFunction            Automatic  utility as function of actual and predicted value
ValidationSet              Automatic  data on which to validate the model generated

 Possible settings for PerformanceGoal include:

"DirectTraining"       train directly on the full dataset, without model searching
"Memory"               minimize storage requirements of the predictor
"Quality"              maximize accuracy of the predictor
"Speed"                maximize speed of the predictor
"TrainingSpeed"        minimize time spent producing the predictor
Automatic              automatic tradeoff among speed, accuracy and memory
{goal_{1},goal_{2},…}  automatically combine goal_{1}, goal_{2}, etc.

 Possible settings for Method include:

"DecisionTree"          predict using a decision tree
"GradientBoostedTrees"  predict using an ensemble of trees trained with gradient boosting
"LinearRegression"      predict from linear combinations of features
"NearestNeighbors"      predict from nearest neighboring examples
"NeuralNetwork"         predict using an artificial neural network
"RandomForest"          predict from Breiman–Cutler ensembles of decision trees
"GaussianProcess"       predict using a Gaussian process prior over functions

 The following settings for TrainingProgressReporting can be used:

"Panel"              show a dynamically updating graphical panel
"Print"              periodically report information using Print
"ProgressIndicator"  show a simple ProgressIndicator
"SimplePanel"        dynamically updating panel without learning curves
None                 do not report any information

 Possible settings for RandomSeeding include:

Automatic  automatically reseed every time the function is called
Inherited  use externally seeded random numbers
seed       use an explicit integer or string as a seed

 Predict[{assoc_{1},assoc_{2},…}->"key",…] can be used to specify that the output is given by the value of "key" in each association assoc_{i}.
 Predict[{list_{1},list_{2},…}->n,…] can be used to specify that the output is given by the value of part n in each list list_{i}.
 Predict[Dataset[…]->part,…] can be used to specify that the outputs are given by the value of part of each row of the dataset.
 Predict[FittedModel[…]] can be used to convert a fitted model into a PredictorFunction[…].
 Predict[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
 In Predict[PredictorFunction[…],FeatureExtractor->fe], the FeatureExtractorFunction[…] fe will be prepended to the existing feature extractor.
 Information can be used on the PredictorFunction[…] obtained.
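The forms that select the output by key or by part, described in the notes above, might look like the following sketch. The records and the names people, rows, pKey and pPart are hypothetical and exist only for illustration.

```wolfram
(* Made-up records stored as associations *)
people = {
   <|"age" -> 23, "height" -> 1.62, "weight" -> 55|>,
   <|"age" -> 31, "height" -> 1.75, "weight" -> 72|>,
   <|"age" -> 45, "height" -> 1.80, "weight" -> 85|>,
   <|"age" -> 52, "height" -> 1.70, "weight" -> 78|>};

(* The value of "age" is the output; the remaining keys are features *)
pKey = Predict[people -> "age"];
pKey[<|"height" -> 1.78, "weight" -> 80|>]

(* The same records as plain lists; part 1 is the output *)
rows = {{23, 1.62, 55}, {31, 1.75, 72}, {45, 1.80, 85}, {52, 1.70, 78}};
pPart = Predict[rows -> 1];
pPart[{1.78, 80}]

(* Summary of the trained predictor *)
Information[pKey]
```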
Examples
Basic Examples (2)
Train a predictor function on a set of examples:
Predict the value of a new example, given its feature:
Get the conditional distribution of the value, given the example feature:
Plot the predicted values as a function of the feature value and show the training examples:
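A minimal sketch of the four steps above, assuming a small invented training set (the values below are not taken from the original example):

```wolfram
(* Made-up one-feature training set *)
training = {1 -> 1.2, 2 -> 2.1, 3 -> 2.9, 4 -> 4.2, 5 -> 5.1};
p = Predict[training];

(* Point prediction for a new feature value *)
p[3.5]

(* Conditional distribution of the value for the same feature *)
p[3.5, "Distribution"]

(* Prediction curve together with the training examples *)
Show[
 Plot[p[x], {x, 0, 6}],
 ListPlot[List @@@ training, PlotStyle -> Red]
]
```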
Train a predictor on a dataset with multiple features:
Predict the value of a new example, given its features:
Predict the value of a new example that has a missing feature:
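The multi-feature case can be sketched in the same way. The data is invented, and Missing[] is used here as the placeholder for the unknown feature.

```wolfram
(* Each input is a list: one numerical and one nominal feature (made-up values) *)
p = Predict[{
    {1.4, "A"} -> 1.1, {1.6, "A"} -> 1.4,
    {2.3, "B"} -> 2.6, {5.1, "B"} -> 3.2}];

(* Both features known *)
p[{2.5, "A"}]

(* First feature missing; the predictor synthesizes a value for it *)
p[{Missing[], "A"}]
```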
Scope (8)
Train a predictor to predict the colored area of an image:
Predict the values of new examples:
Train a predictor on data where the feature is a sequence of tokens:
Train a predictor on a dataset with features and values in separate lists:
Obtain information about the predictor:
Train a nearest-neighbors predictor on a dataset containing missing features:
Predict the value of a new example:
Predict values on examples containing missing features:
Train a predictor on a dataset with named features. The order of the keys does not matter. Keys can be missing:
Predict examples containing missing features:
Construct a Dataset with a list of associations:
Train a predictor to predict the feature "age" as a function of the other features:
Once the predictor is trained, any input format can be used. Predict an example formatted as an association:
Find out the order of the features and predict an example formatted as a list:
Predict examples in a Dataset:
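As a sketch of the Dataset workflow above, with invented rows and column names (only "age" comes from the text; "height" and "weight" are assumptions made for illustration):

```wolfram
(* Build a Dataset from a list of associations *)
ds = Dataset[{
    <|"age" -> 23, "height" -> 1.62, "weight" -> 55|>,
    <|"age" -> 31, "height" -> 1.75, "weight" -> 72|>,
    <|"age" -> 45, "height" -> 1.80, "weight" -> 85|>,
    <|"age" -> 52, "height" -> 1.70, "weight" -> 78|>}];

(* Predict "age" from the other columns *)
p = Predict[ds -> "age"];

(* An example formatted as an association *)
p[<|"height" -> 1.78, "weight" -> 80|>]

(* Several examples passed as a Dataset *)
p[Dataset[{
   <|"height" -> 1.65, "weight" -> 60|>,
   <|"height" -> 1.82, "weight" -> 90|>}]]
```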
Create and visualize an artificial dataset from the expression Cos[x*y]:
Train a predictor on the dataset:
Visualize the prediction surface:
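One way to carry out these three steps is sketched below. The sampling range and step are arbitrary choices, not values from the original example.

```wolfram
(* Sample Cos[x*y] on a grid; each example is {x, y} -> value *)
data = Flatten[
   Table[{x, y} -> Cos[x*y], {x, -3, 3, 0.2}, {y, -3, 3, 0.2}], 1];

(* Show the samples as a surface by turning each rule into an {x, y, z} triple *)
ListPlot3D[Replace[data, ({a_, b_} -> c_) :> {a, b, c}, {1}]]

(* Train on the samples *)
p = Predict[data];

(* Prediction surface over the same region *)
Plot3D[p[{x, y}], {x, -3, 3}, {y, -3, 3}]
```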
Use the built-in predictor "NameAge" to predict the age of a person from their first name:
Options (23)
AcceptanceThreshold (1)
AnomalyDetector (1)
FeatureExtractor (2)
Generate a predictor function using FeatureExtractor to preprocess the data using a custom function:
Add the "StandardizedVector" method to the preprocessing pipeline:
Use the predictor on new data:
Create a feature extractor and extract features from a dataset:
Train a predictor on the extracted features:
FeatureNames (2)
Train a predictor and give a name to each feature:
Use the association format to predict a new example:
The list format can still be used:
Train a predictor on a training set with named features and use FeatureNames to set their order:
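The FeatureNames usage described above can be sketched as follows. The feature names "length" and "width" and all numeric values are hypothetical.

```wolfram
(* List-formatted inputs; FeatureNames assigns a name to each position *)
p = Predict[{
    {1.4, 3.2} -> 1.1, {2.0, 1.1} -> 2.4,
    {3.3, 0.5} -> 3.9, {4.1, 2.2} -> 4.0},
   FeatureNames -> {"length", "width"}];

(* Association format *)
p[<|"length" -> 2.5, "width" -> 1.0|>]

(* List format still works, in the declared order *)
p[{2.5, 1.0}]

(* Named training data; FeatureNames fixes the order used for list inputs *)
p2 = Predict[{
    <|"length" -> 1.4, "width" -> 3.2|> -> 1.1,
    <|"length" -> 2.0, "width" -> 1.1|> -> 2.4,
    <|"length" -> 3.3, "width" -> 0.5|> -> 3.9},
   FeatureNames -> {"width", "length"}];

(* Interpreted as width = 1.0, length = 2.5 *)
p2[{1.0, 2.5}]
```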
FeatureTypes (2)
Train a predictor on textual and nominal data:
The first feature has been wrongly interpreted as a nominal feature:
Specify that the first feature should be considered textual:
Train a predictor with named features:
Both features have been considered numerical:
Specify that the feature "gender" should be considered nominal:
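A sketch of the named-feature case, assuming invented data in which "gender" is encoded as 0 or 1. It also assumes that Information accepts the "FeatureTypes" property for a PredictorFunction, which may depend on the version in use.

```wolfram
(* Made-up training set; "gender" is an integer code, not a quantity *)
training = {
   <|"age" -> 19, "gender" -> 0|> -> 5.1,
   <|"age" -> 27, "gender" -> 1|> -> 6.3,
   <|"age" -> 35, "gender" -> 0|> -> 7.0,
   <|"age" -> 44, "gender" -> 1|> -> 8.2};

(* Automatic type detection *)
p = Predict[training];
Information[p, "FeatureTypes"]

(* Override the type of a single feature *)
pNominal = Predict[training, FeatureTypes -> <|"gender" -> "Nominal"|>];
Information[pNominal, "FeatureTypes"]
```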
IndeterminateThreshold (1)
Method (4)
Train a nearest-neighbors predictor:
Plot the predicted value as a function of the feature for both predictors:
Train a random forest predictor:
Find the standard deviation of the residuals on a test set:
In this example, using a linear regression predictor increases the standard deviation of the residuals:
However, using a linear regression predictor reduces the training time:
Train a linear regression, neural network, and Gaussian process predictor:
These methods produce smooth predictors:
Train a random forest and nearest-neighbor predictor:
These methods produce nonsmooth predictors:
Train a neural network, a random forest, and a Gaussian process predictor:
The Gaussian process predictor is smooth and handles small datasets well:
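The contrast between methods can be sketched on synthetic data. The noisy sine samples below are an assumption made for illustration, not the data used in the original examples.

```wolfram
(* Synthetic noisy samples of Sin[x] *)
SeedRandom[1];
data = Table[x -> Sin[x] + RandomReal[{-0.1, 0.1}], {x, 0., 6., 0.25}];

(* Same data, two different regression algorithms *)
pLinear = Predict[data, Method -> "LinearRegression"];
pNearest = Predict[data, Method -> "NearestNeighbors"];

(* Overlay both prediction curves on the training points *)
Show[
 Plot[{pLinear[x], pNearest[x]}, {x, 0, 6},
  PlotLegends -> {"LinearRegression", "NearestNeighbors"}],
 ListPlot[List @@@ data, PlotStyle -> Black]
]
```

Any other name from the Method table, such as "RandomForest" or "GaussianProcess", can be substituted in the same way.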
MissingValueSynthesis (1)
Train a predictor with two input features:
Get the prediction for an example that has a missing value:
Set the missing value synthesis to replace each missing variable with its estimated most likely value given known values (which is the default behavior):
Replace missing variables with random samples conditioned on known values:
Averaging over many random imputations is usually the best strategy and allows obtaining the uncertainty caused by the imputation:
Specify a learning method during training to control how the distribution of data is learned:
Predict an example with missing values using the "KernelDensityEstimation" distribution to condition values:
Provide an existing LearnedDistribution at training to use it when imputing missing values during training and later evaluations:
Specify an existing LearnedDistribution to synthesize missing values for an individual evaluation:
Control both the learning method and the evaluation strategy by passing an association at training:
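The evaluation-time settings above can be sketched as follows, assuming invented training data and the setting names "ModeFinding" and "RandomSampling" for the two strategies described.

```wolfram
(* Made-up predictor with two numerical input features *)
p = Predict[{
    {1.5, 2.1} -> 1.3, {2.2, 3.0} -> 2.0, {3.1, 4.2} -> 3.4,
    {4.0, 4.9} -> 4.1, {5.2, 6.1} -> 5.0}];

example = {Missing[], 4.0};

(* Default handling of the missing value *)
p[example]

(* Replace the missing value with its most likely value given the known one *)
p[example, MissingValueSynthesis -> "ModeFinding"]

(* Replace it with a random sample conditioned on the known value *)
p[example, MissingValueSynthesis -> "RandomSampling"]

(* Average many random imputations; the spread reflects imputation uncertainty *)
samples = Table[p[example, MissingValueSynthesis -> "RandomSampling"], 100];
{Mean[samples], StandardDeviation[samples]}
```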
PerformanceGoal (1)
RecalibrationFunction (1)
TargetDevice (1)
Train a predictor on the system's default GPU using a neural network and look at the AbsoluteTiming:
Compare the previous result with the one achieved by using the default CPU computation:
TimeGoal (2)
Train a predictor while specifying a total training time of 3 seconds:
Load the "BostonHomes" dataset:
Train a predictor while specifying a target training time of 0.1 seconds:
The predictor reached a standard deviation of about 3.2:
Train a predictor while specifying a target training time of 5 seconds:
TrainingProgressReporting (1)
UtilityFunction (2)
Visualize the probability density for a given example:
By default, the value with the highest probability density is predicted:
This corresponds to a Dirac delta utility function:
Define a utility function that penalizes the predicted value's being smaller than the actual value:
Plot this function for a given actual value:
Train a predictor with this utility function:
The predictor decision is now changed despite the probability density's being unchanged:
Specifying a utility function when predicting supersedes the utility function specified at training:
Visualize the distribution of age for the name "Claire" with the built-in predictor "NameAge":
The most likely value of this distribution is the following:
Change the utility function to predict the mean value instead of the most likely value:
Applications (5)
Train a predictor that predicts the median value of properties in a neighborhood of Boston, given some features of the neighborhood:
Generate a PredictorMeasurementsObject to analyze the performance of the predictor on a test set:
Visualize a scatter plot of the values of the test set as a function of the predicted values:
Compute the root mean square of the residuals:
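A possible version of this workflow is sketched below. It assumes the "BostonHomes" data is available through ExampleData with "TrainingData" and "TestData" splits, which may require a network connection the first time it is loaded.

```wolfram
(* Load the training and test splits *)
trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];

(* Train with default settings *)
p = Predict[trainingset];

(* Measure performance on held-out data *)
pm = PredictorMeasurements[p, testset];

(* Scatter plot of actual against predicted values *)
pm["ComparisonPlot"]

(* Root mean square of the residuals *)
Sqrt[Mean[pm["Residuals"]^2]]
```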
Load a dataset of the average monthly temperature as a function of the city, the year, and the month:
Visualize a sample of the dataset:
Train a linear predictor on the dataset:
Plot the predicted temperature distribution of the city "Lincoln" in 2020 for different months:
For every month, plot the predicted temperature and its error bar (standard deviation):
Load a dataset of wine quality as a function of the wines' physical properties:
Get a description of the variables in the dataset:
Visualize the distribution of the "alcohol" and "pH" variables:
Train a predictor on the training set:
Predict the quality of an unknown wine:
Create a function that predicts the quality of the unknown wine as a function of its pH and alcohol level:
Plot this function to get a hint about how to improve this wine:
Load a dataset of wine quality as a function of the wines' physical properties:
Train a predictor to estimate wine quality:
Predict the example bottle's quality:
Calculate how much higher or lower this bottle's predicted quality is than the mean:
Get an estimation for how much each feature impacted the predictor's output for this bottle:
Visualize these feature impacts:
Confirm that the Shapley values fully explain the predicted quality:
Learn a distribution of the data that treats each feature as independent:
Estimate SHAP value feature importance for 100 bottles of wine, using 5 samples for each estimation:
Calculate how important each feature is to the model:
Visualize the model's feature importance:
Visualize a nonlinear relationship between a feature's value and its impact on the model's prediction:
Generate images of gauges associated with their values:
Train a predictor on this dataset:
Predict the value of a gauge from its image:
Interact with the predictor using Dynamic:
Properties & Relations (1)
The linear regression predictor without regularization and LinearModelFit can train equivalent models:
Fit and NonlinearModelFit can also be equivalent:
Possible Issues (1)
The RandomSeeding option does not always guarantee reproducibility of the result:
Text
Wolfram Research (2014), Predict, Wolfram Language function, https://reference.wolfram.com/language/ref/Predict.html (updated 2021).