API Reference

Estimators
Models following scikit-learn’s estimator API.
class dask_glm.estimators.LinearRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)

Estimator for a linear model using Ordinary Least Squares.

Parameters:
    fit_intercept : bool, default True
        Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
    solver : {'admm', 'gradient_descent', 'newton', 'lbfgs', 'proximal_grad'}
        Solver to use. See Algorithms for details.
    regularizer : {'l1', 'l2'}
        Regularizer to use. See Regularizers for details. Only used with the admm and proximal_grad solvers.
    max_iter : int, default 100
        Maximum number of iterations taken for the solvers to converge.
    tol : float, default 1e-4
        Tolerance for stopping criteria. Ignored for the admm solver.
    lamduh : float, default 1.0
        Regularization strength. Only used with the admm and proximal_grad solvers.
    rho, over_relax, abstol, reltol : float
        Only used with the admm solver.

Examples

>>> from dask_glm.datasets import make_regression
>>> X, y = make_regression()
>>> est = LinearRegression()
>>> est.fit(X, y)
>>> est.predict(X)
>>> est.score(X, y)

Attributes:
    coef_ : array, shape (n_classes, n_features)
        The learned value for the model's coefficients.
    intercept_ : float or None
        The learned value for the intercept, if one was added to the model.
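Since dask-glm targets larger-than-memory data, the estimators also accept chunked dask arrays. A minimal sketch, assuming make_regression accepts n_samples and chunksize keywords; adjust to the actual dataset signature if needed:

>>> from dask_glm.datasets import make_regression
>>> from dask_glm.estimators import LinearRegression
>>> X, y = make_regression(n_samples=10000, chunksize=1000)  # chunked dask arrays
>>> est = LinearRegression(solver='admm', regularizer='l2', lamduh=0.1)
>>> est.fit(X, y)
>>> est.coef_, est.intercept_  # learned coefficients and intercept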
class dask_glm.estimators.LogisticRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)

Estimator for logistic regression.

Parameters:
    fit_intercept : bool, default True
        Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
    solver : {'admm', 'gradient_descent', 'newton', 'lbfgs', 'proximal_grad'}
        Solver to use. See Algorithms for details.
    regularizer : {'l1', 'l2'}
        Regularizer to use. See Regularizers for details. Only used with the admm, lbfgs, and proximal_grad solvers.
    max_iter : int, default 100
        Maximum number of iterations taken for the solvers to converge.
    tol : float, default 1e-4
        Tolerance for stopping criteria. Ignored for the admm solver.
    lamduh : float, default 1.0
        Regularization strength. Only used with the admm, lbfgs, and proximal_grad solvers.
    rho, over_relax, abstol, reltol : float
        Only used with the admm solver.

Examples

>>> from dask_glm.datasets import make_classification
>>> X, y = make_classification()
>>> lr = LogisticRegression()
>>> lr.fit(X, y)
>>> lr.predict(X)
>>> lr.predict_proba(X)
>>> lr.score(X, y)

Attributes:
    coef_ : array, shape (n_classes, n_features)
        The learned value for the model's coefficients.
    intercept_ : float or None
        The learned value for the intercept, if one was added to the model.
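A sketch of combining the solver and regularizer options; the parameter values here are illustrative assumptions, not tuned recommendations:

>>> from dask_glm.datasets import make_classification
>>> from dask_glm.estimators import LogisticRegression
>>> X, y = make_classification()
>>> # l1-penalized fit via the proximal gradient solver
>>> lr = LogisticRegression(solver='proximal_grad', regularizer='l1', lamduh=0.5)
>>> lr.fit(X, y)
>>> probs = lr.predict_proba(X)   # predicted probabilities
>>> labels = lr.predict(X)        # class predictions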
class dask_glm.estimators.PoissonRegression(fit_intercept=True, solver='admm', regularizer='l2', max_iter=100, tol=0.0001, lamduh=1.0, rho=1, over_relax=1, abstol=0.0001, reltol=0.01)

Estimator for Poisson regression.

Parameters:
    fit_intercept : bool, default True
        Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
    solver : {'admm', 'gradient_descent', 'newton', 'lbfgs', 'proximal_grad'}
        Solver to use. See Algorithms for details.
    regularizer : {'l1', 'l2'}
        Regularizer to use. See Regularizers for details. Only used with the admm, lbfgs, and proximal_grad solvers.
    max_iter : int, default 100
        Maximum number of iterations taken for the solvers to converge.
    tol : float, default 1e-4
        Tolerance for stopping criteria. Ignored for the admm solver.
    lamduh : float, default 1.0
        Regularization strength. Only used with the admm, lbfgs, and proximal_grad solvers.
    rho, over_relax, abstol, reltol : float
        Only used with the admm solver.

Examples

>>> from dask_glm.datasets import make_poisson
>>> X, y = make_poisson()
>>> pr = PoissonRegression()
>>> pr.fit(X, y)
>>> pr.predict(X)
>>> pr.get_deviance(X, y)

Attributes:
    coef_ : array, shape (n_classes, n_features)
        The learned value for the model's coefficients.
    intercept_ : float or None
        The learned value for the intercept, if one was added to the model.
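Note that when X is a dask array, predictions may come back as lazy dask collections rather than concrete NumPy arrays; a sketch of materializing them explicitly (the hasattr guard is a defensive assumption):

>>> from dask_glm.datasets import make_poisson
>>> from dask_glm.estimators import PoissonRegression
>>> X, y = make_poisson()
>>> pr = PoissonRegression()
>>> pr.fit(X, y)
>>> rates = pr.predict(X)              # predicted mean counts
>>> if hasattr(rates, 'compute'):      # lazy dask result?
...     rates = rates.compute()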
Families
class dask_glm.families.Logistic

Implements methods for logistic regression, useful for classifying binary outcomes.

class dask_glm.families.Normal

Implements methods for linear regression, useful for modeling continuous outcomes.

class dask_glm.families.Poisson

Implements methods for Poisson regression, useful for modeling count data.
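The solver functions in dask_glm.algorithms take these families as the family= argument, passed as the class itself (matching the defaults shown in the signatures below), to select the likelihood being optimized. For example:

>>> from dask_glm import algorithms, families
>>> from dask_glm.datasets import make_poisson
>>> X, y = make_poisson()
>>> # Fit Poisson-regression coefficients directly with gradient descent
>>> beta = algorithms.gradient_descent(X, y, family=families.Poisson)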
Algorithms

Optimization algorithms for solving minimization problems.
dask_glm.algorithms.admm(X, y, regularizer='l1', lamduh=0.1, rho=1, over_relax=1, max_iter=250, abstol=0.0001, reltol=0.01, family=<class 'dask_glm.families.Logistic'>, **kwargs)

Alternating Direction Method of Multipliers.

Parameters:
    X : array-like, shape (n_samples, n_features)
    y : array-like, shape (n_samples,)
    regularizer : str or Regularizer
    lamduh : float
    rho : float
    over_relax : float
    max_iter : int
        Maximum number of iterations to attempt before declaring failure to converge.
    abstol, reltol : float
    family : Family

Returns:
    beta : array-like, shape (n_features,)
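A sketch of calling admm directly; the regularizer argument accepts either a string name or a Regularizer instance, and lamduh sets the regularization strength (the values here are illustrative assumptions):

>>> from dask_glm.algorithms import admm
>>> from dask_glm.datasets import make_classification
>>> from dask_glm import families
>>> X, y = make_classification()
>>> beta = admm(X, y, regularizer='l2', lamduh=0.1,
...             family=families.Logistic, max_iter=250)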
dask_glm.algorithms.compute_stepsize_dask(beta, step, Xbeta, Xstep, y, curr_val, family=<class 'dask_glm.families.Logistic'>, stepSize=1.0, armijoMult=0.1, backtrackMult=0.1)

Compute the optimal stepsize.

Parameters:
    beta : array-like
    step : float
    Xbeta : array-like
    Xstep : array-like
    y : array-like
    curr_val : float
    family : Family, optional
    stepSize : float, optional
    armijoMult : float, optional
    backtrackMult : float, optional

Returns:
    stepSize : float
    beta : array-like
    Xbeta : array-like
    func : callable
dask_glm.algorithms.gradient_descent(X, y, max_iter=100, tol=1e-14, family=<class 'dask_glm.families.Logistic'>, **kwargs)

Michael Grant's implementation of Gradient Descent.

Parameters:
    X : array-like, shape (n_samples, n_features)
    y : array-like, shape (n_samples,)
    max_iter : int
        Maximum number of iterations to attempt before declaring failure to converge.
    tol : float
        Maximum allowed change from prior iteration required to declare convergence.
    family : Family

Returns:
    beta : array-like, shape (n_features,)
dask_glm.algorithms.lbfgs(X, y, regularizer=None, lamduh=1.0, max_iter=100, tol=0.0001, family=<class 'dask_glm.families.Logistic'>, verbose=False, **kwargs)

L-BFGS solver using the scipy.optimize implementation.

Parameters:
    X : array-like, shape (n_samples, n_features)
    y : array-like, shape (n_samples,)
    max_iter : int
        Maximum number of iterations to attempt before declaring failure to converge.
    tol : float
        Maximum allowed change from prior iteration required to declare convergence.
    family : Family

Returns:
    beta : array-like, shape (n_features,)
dask_glm.algorithms.newton(X, y, max_iter=50, tol=1e-08, family=<class 'dask_glm.families.Logistic'>, **kwargs)

Newton's Method for Logistic Regression.

Parameters:
    X : array-like, shape (n_samples, n_features)
    y : array-like, shape (n_samples,)
    max_iter : int
        Maximum number of iterations to attempt before declaring failure to converge.
    tol : float
        Maximum allowed change from prior iteration required to declare convergence.
    family : Family

Returns:
    beta : array-like, shape (n_features,)
dask_glm.algorithms.proximal_grad(X, y, regularizer='l1', lamduh=0.1, family=<class 'dask_glm.families.Logistic'>, max_iter=100, tol=1e-08, **kwargs)

Parameters:
    X : array-like, shape (n_samples, n_features)
    y : array-like, shape (n_samples,)
    max_iter : int
        Maximum number of iterations to attempt before declaring failure to converge.
    tol : float
        Maximum allowed change from prior iteration required to declare convergence.
    family : Family
    verbose : bool, default False
        Whether to print diagnostic information during convergence.

Returns:
    beta : array-like, shape (n_features,)
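With the 'l1' regularizer, proximal_grad can drive some coefficients exactly to zero, which is the usual reason to choose it; a sketch (the lamduh value is an illustrative assumption):

>>> import numpy as np
>>> from dask_glm.algorithms import proximal_grad
>>> from dask_glm.datasets import make_classification
>>> X, y = make_classification()
>>> beta = proximal_grad(X, y, regularizer='l1', lamduh=0.5)
>>> np.count_nonzero(beta)  # fewer nonzeros as lamduh grows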
Regularizers

Available Regularizers

These regularizers are included with dask-glm.

Regularizer Interface

Users wishing to implement their own regularizer should satisfy this interface.
class dask_glm.regularizers.Regularizer

Abstract base class for regularization objects.

Defines the set of methods required to create a new regularization object. This includes the regularization function itself and its gradient, hessian, and proximal operator.
add_reg_f(f, lam)

Add the regularization function to another function.

Parameters:
    f : callable
        Function taking beta and *args
    lam : float
        Regularization constant
Returns:
    wrapped : callable
        Function taking beta and *args

add_reg_grad(grad, lam)

Add the regularization gradient to another gradient function.

Parameters:
    grad : callable
        Function taking beta and *args
    lam : float
        Regularization constant
Returns:
    wrapped : callable
        Function taking beta and *args

add_reg_hessian(hess, lam)

Add the regularization hessian to another hessian function.

Parameters:
    hess : callable
        Function taking beta and *args
    lam : float
        Regularization constant
Returns:
    wrapped : callable
        Function taking beta and *args
f(beta)

Regularization function.

Parameters:
    beta : array, shape (n_features,)
Returns:
    result : float
classmethod get(obj)

Get the concrete instance for the name obj.

Parameters:
    obj : Regularizer or str
        Valid instances of Regularizer are passed through. Strings are looked up according to obj.name and a new instance is created.
Returns:
    obj : Regularizer
gradient(beta)

Gradient of regularization function.

Parameters:
    beta : array, shape (n_features,)
Returns:
    gradient : array, shape (n_features,)
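As a sketch of satisfying this interface, here is a hypothetical squared-l2 regularizer. The f and gradient methods mirror the signatures above; the hessian and proximal_operator methods follow the interface description ("gradient, hessian, and proximal operator") but their exact signatures are assumptions, as is using the name class attribute as the string that get() matches against.

import numpy as np
from dask_glm.regularizers import Regularizer

class SquaredL2(Regularizer):
    """Hypothetical regularizer for r(beta) = ||beta||^2 / 2."""

    name = 'squared_l2'  # assumed lookup key for Regularizer.get()

    def f(self, beta):
        # Regularization function: half the squared l2 norm
        return (beta ** 2).sum() / 2

    def gradient(self, beta):
        # Gradient of ||beta||^2 / 2 is beta itself
        return beta

    def hessian(self, beta):
        # Hessian of ||beta||^2 / 2 is the identity
        return np.eye(len(beta))

    def proximal_operator(self, beta, t):
        # argmin_x (||x - beta||^2 / 2 + t * ||x||^2 / 2) = beta / (1 + t)
        # (signature assumed from the interface description above)
        return beta / (1 + t)

The add_reg_f, add_reg_grad, and add_reg_hessian methods are inherited from the base class, so a solver could then be pointed at this penalty with something like admm(X, y, regularizer=SquaredL2()).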