NEWS | R Documentation |
Added all possible options to the specific boosting functions instead of passing the options via ... to mboost_fit. Closes #81.
Minor speed-ups in df2lambda (i.e., when computing the penalty parameter for the defined degrees of freedom). Changes proposed by Benjamin Christoffersen.
Updated kernel boosting reference. Closes #84.
Rebuilt package with LF instead of CRLF to fix ‘cleanup’ script as requested by CRAN. Fixes #82.
Use "old" definition of degrees of freedom in vignette("mboost", package = "mboost") to make results reproducible.
Fix handling of missing values in mboost and gamboost when weights are specified. Fixes #80.
Models with zero steps (i.e., models containing only the offset) can now be fitted. Furthermore, cross-validation can now also select a model without base-learners. Fixes #64, #66, and #69.
Binomial now uses link functions by making use of make.link. Furthermore, an alternative implementation of Binomial models along the lines of the glm implementation can be used via Binomial(type = "glm"). Additionally, it works not only with a two-level factor but also with a two-column matrix containing the number of successes and number of failures. Fixes #34, #63 and #65.
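As a rough illustration of the entry above, the following sketch fits the same simulated logistic model with the default Binomial() family and with Binomial(type = "glm"). The data, the variable names (x1, x2, y) and the seed are made up for this example; only glmboost(), Binomial() and coef() come from mboost.

```r
library(mboost)

## simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
p <- plogis(1 + 2 * d$x1 - d$x2)
## two-level factor response
d$y <- factor(rbinom(n, size = 1, prob = p), labels = c("no", "yes"))

## default implementation
m_ada <- glmboost(y ~ x1 + x2, data = d, family = Binomial())
## alternative implementation along the lines of glm()
m_glm <- glmboost(y ~ x1 + x2, data = d, family = Binomial(type = "glm"))

## the two parametrisations differ in scale; as far as the mboost
## manual describes it, the default coefficients are roughly half
## of those of a glm() fit, whereas type = "glm" uses the glm scale
coef(m_ada, off2int = TRUE)
coef(m_glm, off2int = TRUE)
```

Both fits use the default number of boosting iterations, so neither is guaranteed to have converged to the maximum likelihood solution.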
Added new base-learner bkernel for kernel boosting as described in S. Friedrichs, J. Manitz, P. Burger, C.I. Amos, A. Risch, J.C. Chang-Claude, H.E. Wichmann, T. Kneib, H. Bickeboeller, and B. Hofner (2017), Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies. Computational and Mathematical Methods in Medicine. 2017(6742763), 1-17. doi:10.1155/2017/6742763.
Removed check if df2lambda is stable. Hence, options(mboost_check_df2lambda) (introduced in mboost 2.5-0) is no longer used. Closes #26.
Added Andreas Mayr as contributor.
Updated references and added reference to citation("mboost").
Fixed code of India example, which can be used to reproduce the data analysis presented in N. Fenske, T. Kneib, and T. Hothorn (2011), Identifying risk factors for severe childhood malnutrition by boosting additive quantile regression. Journal of the American Statistical Association, 106:494-510 (see system.file("India_quantiles.R", package = "mboost")).
Fixed package citation.
Register C routines to make CRAN happy (again). Fixes #77.
Make sure that family = Multinomial is only used with Kronecker product base-learners. Fixes #46.
Use argument PACKAGE in .Call. Fixes #72.
If center is specified as a boolean value in bols, we now throw an error. Fixes #70.
Fixed AUC family, which expected fit to be equal to a constant in the first iteration.
Check for new data, e.g., in predict, was broken. Fixes #68.
Make sure that newdata is discarded in fitted. Fixes #76.
New Cindex family to optimize survival models w.r.t. the concordance index. Fixes #53.
Added function varimp to extract variable importance. A dedicated plot function exists (plot(varimp())). Code was provided by Tobias Kuehn and Almond Stoecker. See pull request #29.
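A minimal sketch of how varimp and its plot method might be used. The choice of the mtcars data and of a linear model is an assumption made purely for illustration.

```r
library(mboost)

## linear boosting model on a built-in data set (illustrative choice)
mod <- glmboost(mpg ~ ., data = mtcars)

## extract the variable importance and show it
vi <- varimp(mod)
print(vi)

## dedicated plot method for objects returned by varimp()
plot(vi)
```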
Improved plot function for boosting models:
plot fails earlier in case of multiple levelplots, i.e., maps (thanks to Mikko Korpela). See pull request #39.
Provide sensible defaults for xlab and ylab and allow user-specified axis labels for bi- and multivariate plots. Fixes #51.
Export plot functions (plot.glmboost, plot.mboost, lines.mboost, plot.varimp and plot.cvrisk) for better usability and visibility.
Updated manual regarding the usage of families and clarified the usage of argument qoffset.
Updated manual for base-learners:
Highlight that x should be centered if bols(x, intercept = FALSE) is used.
Discourage using bbs(, constraint != "none"); preferably use bmono for constrained effect estimates. Fixes #36.
Improved vignettes (thanks to Mikko Korpela). See pull request #38.
Solve potential problem with IPCweights(). Fixes #54.
Drop unobserved factor levels from bols(). Fixes #47.
Adapt btree to changes introduced in package party. Fixes #58.
Improved cvrisk to be more robust in various use cases (thanks to Mikko Korpela). See pull request #42.
Be more careful regarding namespace scoping rules. Fixes #45.
New maintainer: Benjamin Hofner follows Torsten Hothorn as maintainer.
Package party is now imported. mboost no longer directly relies on unexported functions.
Allow extrapolation for predictions if kronecker products, tensor products or sums are used. Fixes #23.
Development now hosted entirely on github as boost-R/mboost.
Started using testthat.
Improved checks for newdata: Warnings are no longer issued if data has just different types of numeric values (i.e., integer vs. double). Resolves issue #17.
Fixed ‘CITATION’ by removing duplicated string 'R package version' (spotted by Heidi Seibold).
predict: Improve warning when length(offset) > 1. Closes issue #20.
Added test coverage using package covr.
Better error handling in cvrisk also for parallel processes.
Suppress warning of rankMatrix (resolves issue #24).
Stop exporting internal functions for FDboost. Use mboost_intern() instead. Caution: Do not use this function.
Handling of missing values has been improved. Resolves issue #12.
Minor bug fixed in vignette ‘mboost_illustrations.Rnw’.
Throw an error if model cannot be fitted. Fixes issue #18.
Fixed bug in bkronecker with dense matrices. Resolves issue #30.
Added documentation for plot.mboost function and moved documentation of plot.glmboost to the same help page. Resolves issue #14.
bbs and bmono no longer allow data outside of the boundary.knots during model fitting.
Predictions for bbs and bmono now use linear extrapolation (user request inspired by mgcv::Predict.matrix.pspline.smooth).
Better handling of errors in (single) folds of cvrisk: results of folds without errors are used and a warning is issued.
Parallel computing via mclapply: Set mc.preschedule = FALSE per default.
Added new option options(mboost_check_df2lambda = TRUE), which controls if a stability check in df2lambda is performed. If set to FALSE this might speed up the computation of df2lambda especially with large design matrices.
Prediction now also possible with newdata = list(). Resolves issue #15.
PropOdds(): Updated manual for proportional odds model.
Multinomial(): Updated manual for multinomial logit model. Predictions for new data are now possible (resolves issue #13, thanks to Sarah Brockhaus).
‘inst/CITATION’: Added subheadings and tutorial paper.
Stopped computing the singular vectors in df2lambda as the singular values are sufficient and as “computing the singular vectors is the slow part for large matrices” (proposed by Fabian Scheipl).
Fixed bug in PropOdds() which occurred if offset was specified: nuisance parameters delta and sigma were not properly initialized (spotted by Madlene Nussbaum).
Bug in plot.mboost() fixed which occurred if a factor with equal effect estimates for different categories was plotted.
Bug in df2lambda fixed: Make sure that A is symmetric if it is a Matrix object (spotted by Souhaib Ben Taieb).
Bug in df2lambda fixed. Design matrices were always assumed to be of full rank.
Truncate output of complete data structure when model is printed. Resolves issue #11.
Adhere to CRAN policies regarding import of base packages (closes #9).
Export df2lambda, hyper_bbs and bl_lin to make package FDboost happy. Note: These functions usually should not be called directly by users.
Added Hothorn et al (2010) to ‘inst/CITATION’
Changes in ‘inst/CITATION’ to make CRAN happy: Citations can now be extracted without the need to install the package.
Removed EISPACK = FALSE from eigen() as the argument is defunct and ignored.
Changed require to requireNamespace.
Moved generic definition of selected to stabs, which is required anyway (thus, stabs >= 0.5-0 is now required).
load AML dataset (‘AML_Bullinger.rda’) from package TH.data
Updated references (for stability selection, confidence intervals and constrained regression)
fixed ‘inst/CITATION’
Refer to news(package = "mboost") instead of to the ‘NEWS’ file.
Cross-validation was potentially wrong for CoxPH() models. Users can now choose if they want the naive cross-validation or the improved version by Verweij and van Houwelingen (1993) (spotted by Holger Reulen <hreulen _ at _ uni-goettingen.de>).
Examples in \dontrun are now executable and all dependencies are properly stated in ‘DESCRIPTION’.
Added confint function to compute (bootstrap) confidence intervals together with plot and print methods.
stabsel() now depends on the new package stabs, where the back end and methods such as plot and print are implemented.
Improved plot method for varying coefficients (ylim now suitable) and base-learners of factor variables.
Tweaked update function: we can now turn the trace on and off, and specify the type of risk as well as the oobweight to update().
Updated vignette ‘mboost_tutorial’ to reflect latest changes in mboost.
Changed plain text ‘NEWS’ to ‘inst/NEWS.Rd’
Removed links to archived package mfp.
Explicitly specify the packages for functions that are implemented in packages that are listed as Suggests:, e.g., we now use party::ctree_control etc.
glmboost()$model.frame() was broken
glmboost()$update() was broken
predict() for models with non-scalar offsets was broken
stabsel was recoded and now uses different terminology, many more options and a better tested code base
new replacement function mstop<- as an alternative to <mboost>[i] (suggested by Achim Zeileis).
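The two ways of changing the number of boosting iterations could be used roughly as follows. The model, the data (cars) and the chosen iteration counts are arbitrary illustrative choices.

```r
library(mboost)

## start with 200 boosting iterations
mod <- glmboost(dist ~ speed, data = cars,
                control = boost_control(mstop = 200))
m_start <- mstop(mod)

## replacement function: reduce the model to 50 iterations
mstop(mod) <- 50
m_short <- mstop(mod)

## subset operator: set the model to 100 iterations
mod <- mod[100]
m_long <- mstop(mod)

c(start = m_start, shortened = m_short, extended = m_long)
```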
bmono
new and faster algorithm to compute monotonic P-splines (type = "quad.prog")
new constraints added for positive and negative spline estimates
bbs
allows monotone T-splines (experimental)
new argument deriv to bbs for computing derivatives of B-splines
bmrf can now also handle neighborhood matrices as an argument to bnd
added new families Hurdle and Multinomial
boost_control: added new argument stopintern for internal stopping (based on oobag data) during fitting
All data sets have been moved to the new package set TH.data
added new argument which to variable.names()
added new method risk to extract risks
brandom now checks that a factor is given
speed improvements when updating a model via mod[mstop]
changed \dontrun to \donttest
updated references
fixed a problem with extract() of single base-learners
fixed bug in AIC.mboost: df = "actset" can only be used with glmboost models
fixed package start up messages
fixed a problem in mboost_fit (when names of base-learners were missing)
fixed bugs in survival families:
offset in all survival families was based on max(survtime) instead of max(log(survtime));
offset in CoxPH can't be computed from the Cox partial likelihood as constants cancel out; a fixed offset is used instead
speed up checking of manual by changing some computations (e.g. reduce mstop) or exclude code from checking via \dontrun{}
removed dependency on ipred (replaced with TH.data)
small improvements in manual
bbs(..., center = "spectralDecomp") computes the spectral decomposition of the penalty matrix and the penalized part of the design matrix is defined by this decomposition. Experiments show that bols(x) + bbs(x, center = "spectralDecomp") is a little better in recovering the true underlying functions than the default bols(x) + bbs(x, center = TRUE) or, equivalently, bols(x) + bbs(x, center = "differenceMatrix"). For bbs(x, y, center = TRUE) or bmrf(x, center = TRUE), the spectral decomposition is (and was) always used.
fixed bug in stabsel: '...' was not passed to cvrisk and thus one could not specify options for mclapply
fixed bug in brandom: now really use contrasts.arg = "contr.dummy" per default.
removed tests/ folder and .Rout.save files for vignettes from the CRAN release
small improvements in manual
included warnings in stabsel() for better guidance of the user:
A warning is issued if the upper bound for the FWER in stability selection is greater (by a certain margin) than the specified bound.
A warning is also issued if mstop is too small to select q variables.
improved output of errors and warnings in stabsel.
suppress the notes from package Matrix about method ambiguity ("Note: method with signature ... chosen, ... would also be valid")
updated manual on base-learners to reflect the change in the default for degrees of freedom (additionally, all options are now discussed in a separate section of the base-learner manual)
updated vignette ‘mboost_tutorial’
updated ‘mboost_package.Rd’: now all important changes since mboost 2.0 are documented there
changed roles of contributors to ctb
suggested packages are now only used inside if(require(pkg)) statements
changed start up message
switch from packages multicore and snow to parallel
changed behavior of bols(x, intercept = FALSE) when x is a factor: now the intercept is simply dropped from the design matrix and the coding can be specified as usual for factors.
changed default for options("mboost_dftraceS") to FALSE, i.e., degrees of freedom are now computed from smoothing parameter as described in B. Hofner, T. Hothorn, T. Kneib, M. Schmid (2011).
changed computation of B-spline basis at the boundaries: now also use equidistant knots in the boundaries (per default)
improved plot function when dealing with spatial plots (now builds suitable grid based on the observations if no newdata is given)
increased default number of subsampling replicates in stabsel to 100
[experimental] bmono() now implements constraints at the boundaries of (monotonic) P-splines
[experimental] added family Gehan() for rank-based estimation of survival models in an accelerated failure time framework (contributed by Brent Johnson bajohn3@emory.edu)
matrices with one column are now handled as vectors in base-learners
improved manual
fixed error that occurs with R (>= 2.16) due to internal changes in R
improved handling of missing values (throws warnings and fixed a bug that occurred for missings in the response)
improved manual for the handling of contrasts in bols
added tutorial vignette
updated references
new option "mboost_eps" for factor in Demmler-Reinsch orthogonalization
added base-learners for smooth monotonic (or convex/concave) functions of one or two variables (bmono())
added base-learners for radial basis functions (brad())
added base-learners for Markov random fields (bmrf())
bbs(x, cyclic = TRUE) for cyclic covariates ensures that predictions at the boundaries coincide and that the resulting function estimate is smoothly joined
bols(x, intercept = FALSE) only reasonable if x is centered. A warning is now issued if x is not centered.
changed default for degrees of freedom in bspatial() to df = 6
added checks in bbs (and brandom) to ensure that the specified degrees of freedom are greater than the range of the (unpenalized) null space
bolscw can be mixed with other base-learners (although not yet exported and not via the formula interface)
new experimental base-learner %O% for smoothing matrix-valued responses
add Binomial(link = "probit") and general cdf's as link functions (experimental)
added new families:
AUC() for AUC loss function
GammaReg() for gamma regression models
added extract() methods for base-learners and fitted models
added residuals() function to extract residuals from the model
improved predict.mboost(): added names where missing and the offset as attribute where applicable.
fixed bug in predict() with glmboost.matrix(..., center = TRUE)
coef now also works with tree base-learners (returns NULL in this case)
changed coef.gamboost to coef.mboost
various improvements in plot.mboost function
changed default in glmboost() to center = TRUE
speed up glmboost() a little bit
changed behavior of cvrisk() if weights are used: out-of-bag-risk now weighted according to "weights" as specified in call to mboost
added warning if df2lambda is likely to become numerically unstable (i.e. in the case of large entries in the design matrix)
improved storage, speed and stability using Matrix technology for bols() for factors with many levels and brandom(); further improvements in base-learners that are combined via %+%.
various improvements and fixes in manuals
minor bug-fixes to make mboost work with gamboostLSS
replaced writeLines with packageStartupMessage in .onAttach()
replaced partially matched function arguments by full arguments
minor fixes in manuals
fix problem in bl_lin when using dense matrices from package "Matrix"
add rqss results for India childhood malnutrition data
add gbm to Suggests
make survival package happy again
vignette "mboost" updated
remove problem with R CMD check that occurred on some 64bit systems
do not use multicore functionality in R CMD check, really.
do not use multicore functionality in R CMD check
new vignette "mboost" describing 2.0-x series features
fixed bug in bols(): contrast.arg was ignored if not a named list (which it wasn't per default)
added (missing) response functions to families Weibull(), Loglog(), Lognormal() and NBinomial()
fixed bug in family CoxPH which occurred with NAs
improvements and corrections in documentation
glmboost(..., center = TRUE) now also centers columns of the design matrix corresponding to contrasts of factors when an intercept term is present leading to faster risk minimization in these cases.
coef.glmboost: New argument off2int = TRUE adds the offset to the intercept. In addition, the intercept term is now adjusted for centered covariates.
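The effect of off2int can be sketched on a simple linear model. The example below follows the pattern of the cars example from the glmboost help page; the number of iterations is an arbitrary choice, and agreement with lm() is only approximate because boosting is stopped after a finite number of steps.

```r
library(mboost)

## linear model fitted by boosting, without centering the covariate
cars.gb <- glmboost(dist ~ speed, data = cars,
                    control = boost_control(mstop = 2000),
                    center = FALSE)

## intercept with the offset added automatically
cf <- coef(cars.gb, off2int = TRUE)

## the same adjustment done by hand: add the offset to the intercept
cf_manual <- coef(cars.gb) + c(cars.gb$offset, 0)

## compare with the least squares fit
signif(cf, 3)
signif(coef(lm(dist ~ speed, data = cars)), 3)
```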
check for infinite residuals in mboost_fit(). Especially for family = Poisson(), something like boost_control(nu = 0.01) fixes this problem.
"by" (in bols() and bbs()) can now handle factors with more than two levels
improved plot.mboost() for varying coefficients
minor improvements in documentation
fixed bug in helper function get_index, which caused (in some circumstances) wrong handling of factors in gamboost() (spotted by Juliane Schaefer <JSchaefer _at_ uhbs.ch>)
reduce memory footprint in blackboost (requires party 0.9-9993)
fixed bug in coef( , aggregate = "cumsum"): fraction "nu" was missing
generic implementation of component-wise functional gradient boosting in mboost_fit; specialized code for linear, additive and interaction models removed
new families available for ordinal, expectile and censored regression
computations potentially based on package Matrix (reduces memory usage)
various speed improvements
added interface to extract selected base-learners (selected())
added interface for parallel computations in cvrisk with arbitrary packages (e.g. multicore, snow)
added "which" argument in predict and coef functions and improved usability of "which" in plot-function. Users can specify "which" as numeric value or as a character string
added function cv() to generate matrices for k-fold cross-validation, subsampling and bootstrap
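A short sketch of how cv() might be combined with cvrisk(). The model, the number of folds and the seed are illustrative assumptions; papply = lapply is used here only to keep the example sequential.

```r
library(mboost)

mod <- glmboost(dist ~ speed, data = cars,
                control = boost_control(mstop = 100))

## weight matrix for 5-fold cross-validation: one column per fold,
## held-out observations receive weight zero
set.seed(42)
folds <- cv(model.weights(mod), type = "kfold", B = 5)
dim(folds)

## cross-validated empirical risk over the boosting iterations
cvr <- cvrisk(mod, folds = folds, papply = lapply)

## iteration with the smallest cross-validated risk
mstop(cvr)
```

Other resampling schemes are available by setting type to "bootstrap" or "subsampling".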
new function stabsel() for stability selection with error control
added function model.weights() to extract the weights
added interface to expand model by increasing mstop in model[mstop]
alternative definition of degrees of freedom available
Interface changes:
class definition / Family() arguments changed
changed behavior of subset method (model[mstop]). Object is directly altered and not duplicated
argument "center" in bols replaced with "intercept"
argument "z" in base-learners replaced with "by"
bns and bss deprecated;
fixed bug in prediction with varying coefficients for binary effect modifiers
better x-axes in plot.cvrisk and possibility to change xlab
parallel cvrisk on Unix systems only (multicore isn't safe on windows)
included new penalty for ordinal predictors (in bols())
corrected bug in bspatial (centering was not used for Xna)
removed output of dfbase (which is seldom used) in gamboost
changed manual for coef.gamboost
make sure NAs are handled correctly when center = TRUE in glmboost
better weights and boundary knots handling in bspatial
cvrisk runs in parallel if package multicore is available
errors removed and minor improvements in the manuals
center = TRUE in glmboost did only apply to numeric (not integer) predictors
for safety reasons: na.action = na.omit again (causes slight changes in wpbc3 example)
new quantile regression facilities.
fix problem with bbs base-learner and cvrisk
bbs instead of bss is the default base-learner in gamboost
make sure bbs with weights and expanded observations returns numerically the very same results
btree can now deal with multiple variables
new gMDL criterion (contributed by Zhu Wang <zhu.wang@yale.edu>)
make survival package happy again
bols allows to specify non-default contrasts.
remove experimental memory optimization steps
negative gradient of GaussClass() was wrong, spotted by Kao Lin <linkao@picb.ac.cn>
Date was malformed in DESCRIPTION
improved memory footprint in gamboost() and cvrisk()
option to suppress saving of ensembles added to boost_control()
bbs(), bns(), bspatial(): default number of knots changed to a fixed value (= 20)
changed default for grid (now uses all iterations) in cvrisk() and changed plot.cvrisk()
bols: works now for factors and can be set-up to use Ridge-estimation. Intercept can be omitted now (via center = TRUE).
new btree() base-learner for gamboost() available
fix inconsistencies in regression tests
add coef.gamboost
new generic survFit
cosmetics for trace = TRUE
inst/mboost_Bioinf.R was missing from mboost 1.0-0
documentation updates
tests update and release the new version on CRAN
predict(..., allIterations = TRUE) returns the matrix of predictors for all boosting iterations
move mboost to R-forge
improvements in gamboost:
P-splines as base learners available
new formula interface for specifying the base learner
new plot.gamboost
add the number of selected variables as degrees of freedom (as mentioned in the discussion of Hastie to Buehlmann & Hothorn)
status information during fitting is now available via boost_control(trace = TRUE) but is switched off by default
acknowledge contributions by Thomas Kneib and Matthias Schmid in DESCRIPTION
gamboost() now allows for user-specified base learners via the formula interface
gamboost.matrix(x = x, ...) requires colnames being set for x
na.action = na.omit fix for g{al}mboost()
gamboost(..., weights = w) was broken
extract response correctly in fitted.blackboost
hatvalues (and thus AICs) for GLMs with centering of covariates may have been wrong since version 0.5-0
add paper examples to tests
fix Rd problems
westbc regenerated
LazyLoad: yes (no SaveImage: yes)
plot() method for glmboost objects visualizing the coefficient path (feature request by Axel Benner <benner@dkfz.de>).
predict(newdata = <matrix>) was broken for gamboost(), thanks to Max Kuhn <Max.Kuhn@pfizer.com> for spotting this.
predict() for gamboost(..., dfbase = 1) was not working correctly
small performance and memory improvements for glmboost()
some performance improvements for glmboost()
blackboost() is now generic with formula and x, y interface
plot() method for cvrisk() and AIC() output now allows for ylim specification without troubles
depends party 0.9-9
new baselearner argument to gamboost allowing to specify different component-wise base-learners to be used. Currently implemented: "ssp" for smoothing splines (default), "bsp" for B-splines and "ols" for linear models. The latter two haven't been tested yet.
The dfbase argument now applies to each covariate and no longer to each column of the design matrix.
cvrisk() for blackboost() was broken, totally :-(
centered covariates were returned by glmboost() and gamboost()
Poisson() used an incorrect offset
check for y being positive counts when family = "Poisson()"
checks for Poisson() logLik() and AIC() methods
fire a warning when all u > 0 or u < 0
update vignette ‘mboost_illustrations’
fix problem with dfbase in gamboost, spotted by Karin Eckel <Karin.Eckel@imbe.imed.uni-erlangen.de>
work around stats4:::AIC
fix plot problems in plot.cvrisk
allow for centering of the numerical covariates in glmboost and gamboost
AIC(..., "classical") is now faster for non-Gaussian families
predict(..., newdata) can take a matrix now
predict(<blackboost-object>, type = "response") did not return factors when the response was actually a factor
report offset in print methods
add offset attribute to coef.glmboost
add contrasts.arg argument to glmboost.formula
more meaningful default for grid in cvrisk
R-2.4.0 fixes
add checks for CoxPH (against coefficients and logLik of CoxPH)
add weights to CoxPH
the ngradient function in Family objects needs to implement arguments (y, f, w), not just (y, f)
check for meaningful class of the response for some families
some small speed improvements in gamboost
handle factors in gamboost properly (via a linear model)
the dfbase argument can take a vector now (in gamboost)
update and improve entries in DESCRIPTION
documentation updates
Huber() is ‘Huber Error’, not ‘Huber Absolute Error’
added CoxPH family object for fitting Cox models
remove inst/LaTeX
use NROW / NCOL more often (now that y may be a Surv object)
implement cvrisk, a general cross-validation function for the empirical risk and a corresponding plot method
unify risk computations in all three fitting functions
unify names for gb objects
allow for out-of-bag risk computations
some cosmetics
update keywords in Rd-files
risk was always 0 in Huber()@risk when d was chosen adaptively
pData(westbc)$nodal.y has levels negative and positive (lymph node status)
add src/Makevars (required for Windows builds)
make sure objects that are modified at C-level get _copied_ in blackboost
some minor codetools fixes: removed unused variables and an out-dated function
add codetools checks to regression tests
fix xlab in plot.gbAIC
mboost version 0.4-5 published on CRAN 2006-06-13