Commit graph

81 commits

Author SHA1 Message Date
585d6d3c5b Make SplitRules their own class; independent of their Covariate parents.
This was done so that serializing trees (and thus SplitRules) does not also serialize ntree versions of the Covariates,
which complicates deserializing them.
2019-03-25 14:44:31 -07:00
76b2cdd3c4 WIP - Some changes to how trees are saved. 2019-03-25 10:59:55 -07:00
76614ee68b Better memory management to help prevent OutOfMemoryExceptions 2019-03-25 10:59:26 -07:00
d65e010c48 Very minor improvement to how tree filenames are saved. 2019-03-13 11:21:51 -07:00
02b7a5cb9a Rebrand project as largeRCRF 2019-03-13 10:47:37 -07:00
cfa3a6f432 Attempting memory optimizations 2019-03-13 10:39:18 -07:00
8014bd4629 Fix bug where NAs cause crash 2019-03-04 11:36:21 -08:00
91cf299362 Add some R utility functions to help data get quickly loaded 2019-03-04 11:23:31 -08:00
29b154110a Fix bug where template.yaml gets replaced whenever user wants to look at help dialog 2019-02-28 11:12:21 -08:00
a7f591c2d3 Add integration capabilities to RightContinuousStepFunction; use it for calculating mortality 2019-02-18 14:57:26 -08:00
e74ba23177 Small changes; more tests. 2019-02-18 14:57:13 -08:00
9f513ab75b Add capabilities to get nodes of a certain type in a forest; used to produce summary statistics 2019-02-02 09:36:00 -08:00
77ec780304 Fix theoretical bug 2019-01-29 13:38:45 -08:00
d8e52ecd82 Add tests around NumericSplitRuleUpdater; fix minor bug. 2019-01-23 19:37:10 -08:00
115c57f829 Improve documentation and add a final to MeanResponseCombiner. 2019-01-22 11:01:21 -08:00
d935fe0bc0 Improve WeightedVarianceGroupDifferentiator to be faster 2019-01-22 10:56:31 -08:00
ee137370a1 Add GPL-3 Copyright to code 2019-01-14 11:45:23 -08:00
Joel Therrien
7a5a8ab0fc Merge branch 'optimizations' of joel/RandomSurvivalForests into master 2019-01-14 19:08:14 +00:00
e709c42da1 Update the competing risk GroupDifferentiators to make efficient use of the SplitRuleUpdater updates
Results in a speed improvement of over 1/3 according to a timing of the TestCompetingRisk#testLogRankSingleGroupDifferentiatorAllCovariates() test
2019-01-11 22:58:56 -08:00
86122fd90d Remove comment that isn't true 2019-01-10 17:34:52 -08:00
31d6ce9b3e Covariates track if they have any NA values, and skip NA handling code if possible 2019-01-10 14:09:43 -08:00
a57741b726 Add PMD rules to pom.xml to enforce higher code quality 2019-01-10 11:23:55 -08:00
a5fe856857 Massive refactor; Use Iterators/Updaters when calculating difference scores for faster calculations.
Changed the covariates to be more clever with how they produce the different splits. In the future (not yet implemented) a clever GroupDifferentiator
could update the current score calculation based just on how many rows moved from one hand to the other. There were a few other changes as well;
TreeTrainer#growTree now accepts a Random as a parameter which is used throughout the entire growing process. This means it's now theoretically
possible to grow trees using a seed, so that results can be fully reproducible.
2019-01-09 21:31:27 -08:00
e892076a05 Add a test for the composite log rank splitting rule.
Add some debug toString capabilities on Nodes and Trees
2019-01-04 11:22:23 -08:00
ae40a2e664 Removed naive mortality error measurement
Naive mortality error was an ad-hoc method I implemented earlier on. It
didn't provide any useful performance, nor was it theoretically
grounded. It's better to remove it before someone accidentally uses it.
2018-10-27 19:15:59 -07:00
a887a3cc15 Fix bug in Utils.binarySearchLessThan 2018-10-25 11:21:45 -07:00
ae91dbe9e7 Explicitly store RightContinuousStepFunction in CompetingRiskFunctions
Done so that RUtils is useful. Also optimized imports.
2018-10-25 10:49:43 -07:00
c68f67e47a Massive optimizations;
Refactored how MathFunctions are structured to use more primitives and
fewer objects.
Optimized competing risk group differentiators to run faster.
Removed alternative competing risk response combiners (may be added back
later)
2018-10-25 10:34:27 -07:00
cce5ad1e0f Add parameter to decide on whether to check for node purity or not 2018-10-15 11:03:35 -07:00
7fba964af9 Optimize CompetingRiskResponseCombiner 2018-10-12 12:11:48 -07:00
aa733d5eba Switch code to storing Covariate.Value using arrays instead of Maps 2018-09-18 11:17:15 -07:00
de39f60314 Make CovariateRows serializable; add R utility functions. 2018-09-14 18:42:14 -07:00
7008959999 Add functionality to analyze using validation sets 2018-09-13 12:09:20 -07:00
98cb97a1f1 Improve performance by integrating binary search into MathFunction 2018-09-11 17:12:27 -07:00
6e58122380 Optimize MathFunction 2018-09-10 17:16:43 -07:00
e0681763ef Add convenience methods to improve R interface performance 2018-09-10 12:31:35 -07:00
b8024275a9 Fix a bug where CompetingRiskFunctions returns NaNs when using set times
in response combiner
2018-09-01 09:43:42 -07:00
62198f998d Small code cleanup 2018-09-01 09:42:48 -07:00
2fb80df5a5 Add test for CompetingRiskFunctions 2018-08-31 22:32:54 -07:00
8333579a1f Code cleanup; fixed 3 minor bugs in the settings 2018-08-31 13:10:30 -07:00
75f34853ab Migrate to Java 1.8 2018-08-31 12:48:39 -07:00
949c8789e7 Switch HashMaps in CompetingRiskFunctions for ArrayLists
Provides a mild performance improvement.
2018-08-27 19:15:06 -07:00
22944115ee Fix bug where cause specific mortality was using the cumulative hazard
functions instead of the CIF
2018-08-27 13:19:04 -07:00
c85cebb59f Broke two methods in CompetingRiskErrorRateCalculator into static
methods in a new class.
2018-08-27 11:18:56 -07:00
e92abdab13 Add functionality to restart tree training where previously left off. 2018-08-09 16:34:10 -07:00
6d65d48844 Increased test ntree; making test more stable. 2018-08-09 16:33:57 -07:00
55eab76610 Simplified code. 2018-08-08 15:57:28 -07:00
9d9dc9ef8d Optimize tree training so that the best split is not applied twice 2018-08-08 11:34:02 -07:00
74151b94db Add alternative way where functions are computed only at final step. 2018-08-07 15:49:55 -07:00
d85f4eb099 Refactored competing risk combiners and differentiators into their own
packages.
2018-08-07 10:59:19 -07:00