Commit graph

84 commits

Author SHA1 Message Date
9d9dc9ef8d Optimize tree training so that the best split is not applied twice 2018-08-08 11:34:02 -07:00
74151b94db Add alternative way where functions are computed only at final step. 2018-08-07 15:49:55 -07:00
d85f4eb099 Refactored competing risk combiners and differentiators into their own
packages.
2018-08-07 10:59:19 -07:00
bf56dfb59d Add ability to compute different error rates. 2018-08-07 10:52:52 -07:00
d3994212b6 Fix a bug in naive mortality error measure; implement IPCW concordance
measure if you can provide the censoring distribution.
2018-07-26 12:45:12 -07:00
e1caef6d56 Implement naive mortality error measure 2018-07-25 15:29:09 -07:00
650579a430 Implement naive version of concordance index.
Note that results DO NOT MATCH with randomForestSRC; so take these
results with a grain of salt.
2018-07-25 14:18:50 -07:00
7a77851f94 WIP - Add a CompetingRiskErrorRateCalculator.
Note that tests fail.
2018-07-24 14:35:27 -07:00
d4853f5232 Change how trees are saved so that they are compressed. 2018-07-18 15:29:55 -07:00
dc9d20aa1a Add ability to load gzipped CSV files 2018-07-18 10:05:49 -07:00
05f9122b58 Add capability to load trees back into memory. 2018-07-17 13:54:59 -07:00
fffdfe85bf Finish competing risk implementation. Fix a bug in tree training
algorithm.
2018-07-16 16:58:11 -07:00
462b0d9c35 Implement Response & GroupDifferentiators for CompetingRisk problems.
Also adjusted how settings are done to allow for specifying
differentiators & responses that may require arguments.

Note that CompetingRisk code is untested at this point.
2018-07-10 14:43:51 -07:00
4bbb0e0948 Fix a bug whereby FactorCovariate fails when "NA" is provided.
Also improved testing around this.
2018-07-06 13:33:58 -07:00
6b62ad95c3 Add support for loading datasets by CSV files. 2018-07-06 13:21:56 -07:00
fe9ff37dcf Upgraded Settings class to allow for covariates to be built from
provided values.
2018-07-05 19:04:26 -07:00
b010e79269 Add basic Settings class with persistence. 2018-07-05 13:59:52 -07:00
2cdcbe6cbf Refactor different classes into subpackages. 2018-07-05 12:59:29 -07:00
662a6cf761 Add OTFI imputation when training forest.
No tests have been written yet so this is still WIP.
2018-07-05 12:05:07 -07:00
Joel Therrien
c048a285a1 Merge branch '01-factors' of joel/RandomSurvivalForests into master 2018-07-04 20:27:42 +00:00
3b8952e13c Added some tests for FactorCovariate. Moved workshop over to test
codebase.
2018-07-04 13:24:34 -07:00
c7298f7da6 Fix incorrect use of non-concurrent Random object in NumericCovariate. 2018-07-04 12:18:27 -07:00
e0cfed632f Add FactorCovariate; testing required. 2018-07-04 12:18:06 -07:00
2259528c22 Small modification of NumericCovariate; child classes now guarantee they
return NumericCovariate when getParent() is called.
2018-07-04 10:54:46 -07:00
38e70dd3a1 Add BooleanCovariate 2018-07-04 10:54:07 -07:00
e96a578ac9 Refactored code to allow for a class of covariates to determine which
SplitRules are tested.

Most of the refactoring involved the creation of a Covariate class (one
instance per column); with SplitRule and Value being folded in as inner
classes.
2018-07-03 17:00:02 -07:00
e7af65e8fd Fixed a bug where Splits could be generated that had an empty daughter
node
2018-07-03 15:15:09 -07:00
254727e594 Add support for saving trees as forest is being trained.
Support for loading the trees back is not yet written.
2018-07-03 12:31:08 -07:00
df35a2007a Remove inefficient debug code previously missed. 2018-07-03 11:20:15 -07:00
5f280d09a1 Add parallel support & fix fatal bug in TreeTrainer#findBestSplitRule. 2018-07-02 23:16:20 -07:00
df7835869a Add functionality to train a random
forest in serial.
2018-07-02 17:58:53 -07:00
6192643e12 Change ResponseCombiner to be a Collector that's compatible
with Streams.
2018-07-02 12:27:18 -07:00
3c9c78741f Basic functionality to train a single regression tree is
implemented.
2018-07-01 22:22:12 -07:00
7a467207a4 Initial commit; some base classes have been defined
but no logic exists yet.
2018-06-29 12:04:59 -07:00