API Reference¶
Estimators¶
Models following scikit-learn’s estimator API.
-
class
dask_glm.estimators.LinearRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)[source]¶ Estimator for a linear model using Ordinary Least Squares.
Parameters: fit_intercept : bool, default True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
solver : {‘admm’, ‘gradient_descent’, ‘newton’, ‘lbfgs’, ‘proximal_grad’}
Solver to use. See Algorithms for details.
regularizer : {‘l1’, ‘l2’}
Regularizer to use. See Regularizers for details. Only used with admm and proximal_grad solvers.
max_iter : int, default 100
Maximum number of iterations taken for the solvers to converge.
tol : float, default 1e-4
Tolerance for stopping criteria. Ignored for the admm solver.
lamduh : float, default 1.0
Only used with admm and proximal_grad solvers.
rho, over_relax, abstol, reltol : float
Only used with the admm solver.
Examples
>>> from dask_glm.datasets import make_regression
>>> from dask_glm.estimators import LinearRegression
>>> X, y = make_regression()
>>> est = LinearRegression()
>>> est.fit(X, y)
>>> est.predict(X)
>>> est.score(X, y)
Attributes
coef_ (array, shape (n_classes, n_features)) The learned value for the model’s coefficients.
intercept_ (float or None) The learned value for the intercept, if one was added to the model.
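To make the roles of coef_ and intercept_ concrete, here is a minimal pure-Python sketch of how a fitted linear model could turn features into predictions. It is not dask-glm's implementation: the helper name linear_predict and the sample numbers are hypothetical, and treating a missing intercept as zero is an assumption.

```python
def linear_predict(rows, coef, intercept=None):
    """Return one prediction per row: dot(row, coef) plus the optional intercept."""
    offset = 0.0 if intercept is None else intercept
    return [sum(x * c for x, c in zip(row, coef)) + offset for row in rows]


# Hypothetical fitted values, chosen only for illustration.
coef = [2.0, -1.0]
intercept = 0.5
rows = [[1.0, 1.0], [3.0, 2.0]]

predictions = linear_predict(rows, coef, intercept)
# Assumed to mirror fit_intercept=False, where no intercept is added.
no_intercept = linear_predict(rows, coef)
```

With fit_intercept=True the intercept shifts every prediction by the same constant; without it, predictions pass through the origin.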
-
class
dask_glm.estimators.LogisticRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)[source]¶ Estimator for logistic regression.
Parameters: fit_intercept : bool, default True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
solver : {‘admm’, ‘gradient_descent’, ‘newton’, ‘lbfgs’, ‘proximal_grad’}
Solver to use. See Algorithms for details.
regularizer : {‘l1’, ‘l2’}
Regularizer to use. See Regularizers for details. Only used with admm, lbfgs, and proximal_grad solvers.
max_iter : int, default 100
Maximum number of iterations taken for the solvers to converge.
tol : float, default 1e-4
Tolerance for stopping criteria. Ignored for the admm solver.
lamduh : float, default 1.0
Only used with admm, lbfgs, and proximal_grad solvers.
rho, over_relax, abstol, reltol : float
Only used with the admm solver.
Examples
>>> from dask_glm.datasets import make_classification
>>> from dask_glm.estimators import LogisticRegression
>>> X, y = make_classification()
>>> lr = LogisticRegression()
>>> lr.fit(X, y)
>>> lr.predict(X)
>>> lr.predict_proba(X)
>>> lr.score(X, y)
Attributes
coef_ (array, shape (n_classes, n_features)) The learned value for the model’s coefficients.
intercept_ (float or None) The learned value for the intercept, if one was added to the model.
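The relationship between predict_proba and predict can be illustrated with a small stdlib-only sketch. It assumes the usual logistic-regression convention (a sigmoid applied to the linear decision function, with a 0.5 threshold for class labels); whether dask-glm uses exactly this threshold is an assumption, and the function names below are stand-ins rather than the library's methods.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(rows, coef, intercept=0.0):
    """Probability of the positive class for each row."""
    return [sigmoid(sum(x * c for x, c in zip(row, coef)) + intercept) for row in rows]


def predict(rows, coef, intercept=0.0, threshold=0.5):
    """Label a row positive when its probability exceeds the threshold (assumed 0.5)."""
    return [p > threshold for p in predict_proba(rows, coef, intercept)]


coef = [1.0, -2.0]  # hypothetical coefficients
rows = [[0.0, 0.0], [4.0, 1.0], [0.0, 3.0]]
probs = predict_proba(rows, coef)
labels = predict(rows, coef)
```

A row whose decision function is exactly zero gets probability 0.5 and, under this strict comparison, is labelled negative.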
-
class
dask_glm.estimators.PoissonRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)[source]¶ Estimator for Poisson Regression.
Parameters: fit_intercept : bool, default True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
solver : {‘admm’, ‘gradient_descent’, ‘newton’, ‘lbfgs’, ‘proximal_grad’}
Solver to use. See Algorithms for details.
regularizer : {‘l1’, ‘l2’}
Regularizer to use. See Regularizers for details. Only used with admm, lbfgs, and proximal_grad solvers.
max_iter : int, default 100
Maximum number of iterations taken for the solvers to converge.
tol : float, default 1e-4
Tolerance for stopping criteria. Ignored for the admm solver.
lamduh : float, default 1.0
Only used with admm, lbfgs, and proximal_grad solvers.
rho, over_relax, abstol, reltol : float
Only used with the admm solver.
Examples
>>> from dask_glm.datasets import make_poisson
>>> from dask_glm.estimators import PoissonRegression
>>> X, y = make_poisson()
>>> pr = PoissonRegression()
>>> pr.fit(X, y)
>>> pr.predict(X)
>>> pr.get_deviance(X, y)
Attributes
coef_ (array, shape (n_classes, n_features)) The learned value for the model’s coefficients.
intercept_ (float or None) The learned value for the intercept, if one was added to the model.
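For readers unfamiliar with deviance, the sketch below computes the textbook Poisson deviance under a log link. It is an illustration only: the exact formula used by get_deviance is not stated in this reference, and the helper names and numbers here are hypothetical.

```python
import math


def poisson_mean(rows, coef, intercept=0.0):
    """Predicted rate per row under a log link: exp(dot(row, coef) + intercept)."""
    return [math.exp(sum(x * c for x, c in zip(row, coef)) + intercept) for row in rows]


def poisson_deviance(y, mu):
    """Textbook Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y * log(y) is taken as 0 when y == 0 (its limiting value).
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


rows = [[0.0], [1.0], [2.0]]
coef = [0.5]  # hypothetical coefficient
mu = poisson_mean(rows, coef)
perfect = poisson_deviance(mu, mu)          # zero when predictions match the targets
observed = poisson_deviance([0, 2, 3], mu)  # positive otherwise
```

Lower deviance indicates predicted rates closer to the observed counts.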
Families¶
-
class
dask_glm.families.Logistic[source]¶ Implements methods for Logistic regression, useful for classifying binary outcomes.
-
class
dask_glm.families.Normal[source]¶ Implements methods for Linear regression, useful for modeling continuous outcomes.
-
class
dask_glm.families.Poisson[source]¶ This implements Poisson regression, useful for modelling count data.
Algorithms¶
Optimization algorithms for solving minimization problems.
-
dask_glm.algorithms.admm(X, y, regularizer='l1', lamduh=0.1, rho=1, over_relax=1, max_iter=250, abstol=0.0001, reltol=0.01, family=<class 'dask_glm.families.Logistic'>, **kwargs)[source]¶ Alternating Direction Method of Multipliers
Parameters: X : array-like, shape (n_samples, n_features)
y : array-like, shape (n_samples,)
regularizer : str or Regularizer
lamduh : float
rho : float
over_relax : float
max_iter : int
maximum number of iterations to attempt before declaring failure to converge
abstol, reltol : float
family : Family
Returns: beta : array-like, shape (n_features,)
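The roles of rho, over_relax, abstol, and reltol are easier to see on a toy problem. The sketch below runs scaled-form ADMM on a separable l1-penalized quadratic, using the stopping rule from the standard ADMM literature. It is a stand-in under those assumptions, not the dask-glm solver: admm_toy and its problem setup are hypothetical, and only the parameter names are borrowed from the signature above.

```python
import math


def soft_threshold(v, k):
    """Proximal operator of k * |v|: shrink v toward zero by k."""
    return max(v - k, 0.0) - max(-v - k, 0.0)


def norm(values):
    return math.sqrt(sum(v * v for v in values))


def admm_toy(a, lamduh=0.1, rho=1.0, over_relax=1.0, max_iter=250,
             abstol=1e-4, reltol=1e-2):
    """Minimize sum(0.5 * (x_i - a_i)**2) + lamduh * sum(|z_i|) subject to x = z."""
    n = len(a)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(max_iter):
        # x-update: closed-form minimizer of the smooth term plus the quadratic ADMM penalty.
        x = [(ai + rho * (zi - ui)) / (1.0 + rho) for ai, zi, ui in zip(a, z, u)]
        z_old = z
        # Over-relaxation blends the new x with the previous z.
        x_hat = [over_relax * xi + (1.0 - over_relax) * zo for xi, zo in zip(x, z_old)]
        # z-update: soft-thresholding handles the l1 term.
        z = [soft_threshold(xh + ui, lamduh / rho) for xh, ui in zip(x_hat, u)]
        # Scaled dual update.
        u = [ui + xh - zi for ui, xh, zi in zip(u, x_hat, z)]

        # Stop when both primal and dual residuals fall below their tolerances.
        primal_res = norm([xi - zi for xi, zi in zip(x, z)])
        dual_res = norm([rho * (zi - zo) for zi, zo in zip(z, z_old)])
        eps_pri = math.sqrt(n) * abstol + reltol * max(norm(x), norm(z))
        eps_dual = math.sqrt(n) * abstol + reltol * norm([rho * ui for ui in u])
        if primal_res < eps_pri and dual_res < eps_dual:
            break
    return z


z = admm_toy([3.0, 0.5, -2.5], lamduh=1.0, abstol=1e-10, reltol=1e-10)
```

For this toy objective the exact answer is the soft-thresholded input, so the result can be checked by hand: entries with magnitude below lamduh become zero and the rest shrink by lamduh.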
-
dask_glm.algorithms.compute_stepsize_dask(beta, step, Xbeta, Xstep, y, curr_val, family=<class 'dask_glm.families.Logistic'>, stepSize=1.0, armijoMult=0.1, backtrackMult=0.1)[source]¶ Compute the optimal stepsize
Parameters: beta : array-like
step : float
Xbeta : array-like
Xstep :
y : array-like
curr_val : float
family : Family, optional
stepSize : float, optional
armijoMult : float, optional
backtrackMult : float, optional
Returns: stepSize : float
beta : array-like
xBeta : array-like
func : callable
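The parameter names armijoMult and backtrackMult suggest a backtracking line search with an Armijo sufficient-decrease test. The following is a minimal sketch of that general technique, assuming that reading is correct. The function backtracking_stepsize, its arguments, and the quadratic test objective are hypothetical and do not reproduce compute_stepsize_dask.

```python
def backtracking_stepsize(func, beta, step, grad, step_size=1.0,
                          armijo_mult=0.1, backtrack_mult=0.1, max_tries=50):
    """Shrink step_size until the Armijo sufficient-decrease condition holds.

    The candidate point is beta - step_size * step, so `step` points uphill
    (for plain gradient descent, step == grad).
    """
    curr_val = func(beta)
    slope = sum(g * s for g, s in zip(grad, step))  # directional derivative along step
    for _ in range(max_tries):
        candidate = [b - step_size * s for b, s in zip(beta, step)]
        if func(candidate) <= curr_val - armijo_mult * step_size * slope:
            return step_size, candidate
        step_size *= backtrack_mult
    return step_size, beta  # no acceptable step found; stay put


def quadratic(beta):
    return sum(5.0 * b * b for b in beta)


beta = [1.0, -2.0]
grad = [10.0 * b for b in beta]  # gradient of the quadratic above
step_size, new_beta = backtracking_stepsize(quadratic, beta, grad, grad)
```

Here the initial step of 1.0 overshoots, so it is multiplied by backtrack_mult once before the decrease condition is met.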
-
dask_glm.algorithms.gradient_descent(X, y, max_iter=100, tol=1e-14, family=<class 'dask_glm.families.Logistic'>, **kwargs)[source]¶ Michael Grant’s implementation of Gradient Descent.
Parameters: X : array-like, shape (n_samples, n_features)
y : array-like, shape (n_samples,)
max_iter : int
maximum number of iterations to attempt before declaring failure to converge
tol : float
Maximum allowed change from prior iteration required to declare convergence
family : Family
Returns: beta : array-like, shape (n_features,)
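As an illustration of how max_iter and tol interact, here is a small gradient descent loop for logistic regression in plain Python. It uses a fixed step size for brevity, which is a simplification and an assumption; gradient_descent_toy and the data are hypothetical and do not reflect how the library chooses its step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_gradient(X, y, beta):
    """Gradient of the mean negative log-likelihood."""
    grad = [0.0] * len(beta)
    for row, target in zip(X, y):
        error = sigmoid(sum(x * b for x, b in zip(row, beta))) - target
        for j, x in enumerate(row):
            grad[j] += error * x / len(X)
    return grad


def gradient_descent_toy(X, y, max_iter=100, tol=1e-14, step_size=1.0):
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad = logistic_gradient(X, y, beta)
        new_beta = [b - step_size * g for b, g in zip(beta, grad)]
        # Converged when no coefficient moved by more than tol.
        change = max(abs(nb - b) for nb, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    return beta


# First column is a constant, acting as an intercept; labels are not separable.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.0, 1.0, 0.0, 1.0]
beta = gradient_descent_toy(X, y, max_iter=5000, tol=1e-12)
```

If max_iter is reached before the change drops below tol, the loop simply returns the last iterate, which is the "failure to converge" case described above.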
-
dask_glm.algorithms.lbfgs(X, y, regularizer=None, lamduh=1.0, max_iter=100, tol=0.0001, family=<class 'dask_glm.families.Logistic'>, verbose=False, **kwargs)[source]¶ L-BFGS solver using scipy.optimize implementation
Parameters: X : array-like, shape (n_samples, n_features)
y : array-like, shape (n_samples,)
max_iter : int
maximum number of iterations to attempt before declaring failure to converge
tol : float
Maximum allowed change from prior iteration required to declare convergence
family : Family
Returns: beta : array-like, shape (n_features,)
-
dask_glm.algorithms.newton(X, y, max_iter=50, tol=1e-08, family=<class 'dask_glm.families.Logistic'>, **kwargs)[source]¶ Newton’s Method for Logistic Regression.
Parameters: X : array-like, shape (n_samples, n_features)
y : array-like, shape (n_samples,)
max_iter : int
maximum number of iterations to attempt before declaring failure to converge
tol : float
Maximum allowed change from prior iteration required to declare convergence
family : Family
Returns: beta : array-like, shape (n_features,)
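A compact way to see Newton's method at work is a two-coefficient logistic model, where the Newton system is 2x2 and can be solved by hand. The sketch below is a stdlib-only illustration under that restriction; newton_logistic_2d is a hypothetical helper, takes full Newton steps with no safeguards, and is not the library routine.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def newton_logistic_2d(X, y, max_iter=50, tol=1e-8):
    """Newton's method for a two-coefficient logistic model (rows of X have length 2)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for (x0, x1), target in zip(X, y):
            p = sigmoid(b0 * x0 + b1 * x1)
            g0 += (p - target) * x0
            g1 += (p - target) * x1
            w = p * (1.0 - p)  # curvature weight of this sample
            h00 += w * x0 * x0
            h01 += w * x0 * x1
            h11 += w * x1 * x1
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g with Cramer's rule.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 - s0, b1 - s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.0, 1.0, 0.0, 1.0]
beta = newton_logistic_2d(X, y)
```

Because each step uses curvature information, this typically needs far fewer iterations than gradient descent, which is consistent with the smaller default max_iter in the signature above.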
-
dask_glm.algorithms.proximal_grad(X, y, regularizer='l1', lamduh=0.1, family=<class 'dask_glm.families.Logistic'>, max_iter=100, tol=1e-08, **kwargs)[source]¶
Parameters: X : array-like, shape (n_samples, n_features)
y : array-like, shape (n_samples,)
max_iter : int
maximum number of iterations to attempt before declaring failure to converge
tol : float
Maximum allowed change from prior iteration required to declare convergence
family : Family
verbose : bool, default False
whether to print diagnostic information during convergence
Returns: beta : array-like, shape (n_features,)
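Proximal gradient alternates a gradient step on the smooth loss with the proximal operator of the regularizer. The sketch below shows that pattern on a toy diagonal quadratic with an l1 penalty, where the exact answer is known in closed form. It is an assumption-laden illustration: proximal_grad_toy, its fixed step size, and the problem itself are hypothetical and not taken from dask-glm.

```python
def soft_threshold(v, k):
    """Proximal operator of k * |v|."""
    return max(v - k, 0.0) - max(-v - k, 0.0)


def proximal_grad_toy(d, a, lamduh=0.1, max_iter=100, tol=1e-8):
    """Minimize sum(0.5 * d_i * (beta_i - a_i)**2) + lamduh * sum(|beta_i|)."""
    step = 1.0 / max(d)  # 1 / Lipschitz constant of the smooth gradient
    beta = [0.0] * len(a)
    for _ in range(max_iter):
        grad = [di * (bi - ai) for di, bi, ai in zip(d, beta, a)]
        # Gradient step on the smooth part, then the l1 proximal operator.
        new_beta = [soft_threshold(bi - step * gi, step * lamduh)
                    for bi, gi in zip(beta, grad)]
        change = max(abs(nb - b) for nb, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    return beta


beta = proximal_grad_toy(d=[1.0, 2.0, 4.0], a=[3.0, 0.2, -1.0], lamduh=1.0, max_iter=1000)
```

Each coordinate converges to soft_threshold(a_i, lamduh / d_i), so the second coefficient is driven exactly to zero, which is the sparsity effect an l1 regularizer is used for.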
Regularizers¶
Available Regularizers¶
These regularizers are included with dask-glm.
Regularizer Interface¶
Users wishing to implement their own regularizer should satisfy this interface.
-
class
dask_glm.regularizers.Regularizer[source]¶ Abstract base class for regularization object.
Defines the set of methods required to create a new regularization object. This includes the regularization function itself and its gradient, hessian, and proximal operator.
-
add_reg_f(f, lam)[source]¶ Add regularization function to other function.
Parameters: f : callable
Function taking beta and *args
lam : float
regularization constant
Returns: wrapped : callable
function taking beta and *args
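The wrapping described here can be sketched as a closure that adds a scaled penalty to an existing objective. In the sketch below, add_reg_f is written as a free function with an explicit penalty argument so that it runs without the library; that signature, the l2 convention used, and the toy loss are all assumptions for illustration.

```python
def add_reg_f(f, lam, penalty):
    """Return a callable computing f(beta, *args) + lam * penalty(beta)."""
    def wrapped(beta, *args):
        return f(beta, *args) + lam * penalty(beta)
    return wrapped


def l2_penalty(beta):
    """Half the squared Euclidean norm, one common convention for an l2 penalty."""
    return 0.5 * sum(b * b for b in beta)


def squared_error(beta, X, y):
    """Toy loss taking beta plus extra positional arguments."""
    return sum((sum(x * b for x, b in zip(row, beta)) - t) ** 2 for row, t in zip(X, y))


X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
penalized = add_reg_f(squared_error, lam=0.5, penalty=l2_penalty)
plain_value = squared_error([1.0, 1.0], X, y)
penalized_value = penalized([1.0, 1.0], X, y)
```

The wrapped callable keeps the same calling convention as the original (beta followed by *args), so a solver can use it in place of the unregularized objective.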
-
add_reg_grad(grad, lam)[source]¶ Add regularization gradient to other gradient function.
Parameters: grad : callable
Function taking beta and *args
lam : float
regularization constant
Returns: wrapped : callable
function taking beta and *args
-
add_reg_hessian(hess, lam)[source]¶ Add regularization hessian to other hessian function.
Parameters: hess : callable
Function taking beta and *args
lam : float
regularization constant
Returns: wrapped : callable
function taking beta and *args
-
f(beta)[source]¶ Regularization function.
Parameters: beta : array, shape (n_features,)
Returns: result : float
-
classmethod
get(obj)[source]¶ Get the concrete instance for the name obj.
Parameters: obj : Regularizer or str
Valid instances of Regularizer are passed through. Strings are looked up according to obj.name and a new instance is created.
Returns: obj : Regularizer
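The pass-through and lookup-by-name behaviour can be pictured with a small stand-in class hierarchy. How the library actually performs the lookup is not described here, so scanning subclasses for a matching name attribute is an assumption, and ToyRegularizer, ToyL1, and ToyL2 are hypothetical classes, not part of dask-glm.

```python
class ToyRegularizer:
    """Stand-in base class; not dask_glm.regularizers.Regularizer."""
    name = None

    @classmethod
    def get(cls, obj):
        # Instances pass through unchanged.
        if isinstance(obj, cls):
            return obj
        # Strings are matched against each subclass's `name` attribute.
        if isinstance(obj, str):
            for subclass in cls.__subclasses__():
                if subclass.name == obj:
                    return subclass()
            raise KeyError("unknown regularizer name: %r" % obj)
        raise TypeError("expected a ToyRegularizer or a str")


class ToyL1(ToyRegularizer):
    name = "l1"

    def f(self, beta):
        return sum(abs(b) for b in beta)


class ToyL2(ToyRegularizer):
    name = "l2"

    def f(self, beta):
        return 0.5 * sum(b * b for b in beta)


from_name = ToyRegularizer.get("l1")
existing = ToyL2()
passed_through = ToyRegularizer.get(existing)
```

This pattern lets solver arguments such as regularizer='l1' accept either a short string or a pre-configured object.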
-
gradient(beta)[source]¶ Gradient of regularization function.
Parameters: beta : array, shape (n_features,)
Returns: gradient : array, shape (n_features,)
-