masters-thesis-knowledge-info's Introduction

Investigating the Statistical Properties of the Double Kernel Density Estimator

Harold Ship, University of Haifa

Fixed in version 1

  • Itai's comments
  • Proper English title pages
  • Abstract
  • Final formatting
  • Include Hebrew title pages and abstract

Fixed in version 0.17

  • Introduction
  • Further Research
  • Conclusion

Fixed in version 0.16

  • changes to theory and method based on comments

Fixed in version 0.15

  • Literature: first draft
  • Theory: (and everywhere) lambda_i used for intensity AND for index: change intensity to lambda_I/lambda_P
  • Theory: 2.2 describe N(ds) distribution as Poisson Process; define PP
  • Theory: 2.2 lambda(x,x)ds approx prob: derive (from Bernoulli/Poisson)
  • Theory: 2.3 do 1D before 2D and move 1D plots after 1D
  • Theory: 2.3 1D plot - specify which kernel
  • Theory: 2.3 for biweight explain radius is 1 but can be changed by bandwidth
  • Theory: 2.5 explain K has mean 0, is three times differentiable, and has positive variance and radial symmetry
  • Theory: 2.5 assumptions on f
  • Theory: 2.5 replace section A2 with ref to Silverman and clean up derivations
  • Theory: 2.6 define oracle bandwidth and add to glossary
  • Theory: 2.7 implications of inconsistency (error increase) so we normalize
  • Theory: 2.7 verify n^{1/3} as others state O(1)
  • Theory: 2.8 remove this section and mention Dalenius in method
  • Theory: 2.9 mention rejection sampling and give reference
  • Method: better describe centroid
  • Method: 4.1 describe the study area
  • Method: 4.1 put in a picture of pop and incident points
  • Method: 4.1 Discuss the real-life underlying story of the data with examples
  • Method: 4.1 Discuss how the real-life underlying story is modeled with SPP or Poisson
  • Method: 4.1 Describe the kernel method of intensity estimation, assumes some data, "if someone has incident location data, they can use this method"
  • Method: 4.1 mention 2 kernel methods, incidents vs population
  • Method: 4.2 define lambda := lambda(., .) to show lambda is a function
  • Method: 4.2.5: motivation for centroid
  • Method: 4.2.5: formulas
  • Method: 4.3: "and scale it" - example
  • Method: 4.4: Add formulas and a reference for CV
  • Method: 4.5: Explain buffer with respect to edge effects
  • Method: 4.5: use \texttt for variable names
  • Method: 4.5: explain that the oracle knows the true lambda and so creates a baseline that approximates the MISE-optimal bandwidth
  • Method: 4.5: add a step to compute lambda_p
  • Method: 4.6: reword last sentence
  • Method: Add computing and technical issues and solutions: time, AWS, etc. move from wherever.
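
The biweight item above (Theory 2.3) can be illustrated with a short sketch. This is not code from the thesis; it assumes the standard one-dimensional biweight (quartic) kernel K(u) = (15/16)(1 - u^2)^2 on |u| <= 1, and the function names `biweight`, `scaled_biweight` and `kde_1d` are hypothetical. It shows that the unscaled kernel has support radius 1 and that dividing the argument by a bandwidth h stretches the support to radius h.

```python
import numpy as np

def biweight(u):
    """Biweight (quartic) kernel: (15/16)(1 - u^2)^2 on |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, (15.0 / 16.0) * (1.0 - u**2) ** 2, 0.0)

def scaled_biweight(x, h):
    """Kernel rescaled by bandwidth h: K_h(x) = K(x / h) / h, supported on |x| <= h."""
    return biweight(np.asarray(x, dtype=float) / h) / h

def kde_1d(grid, data, h):
    """Plain 1D kernel density estimate on `grid`: the average of K_h(grid - data_j)."""
    grid = np.asarray(grid, dtype=float)[:, None]
    data = np.asarray(data, dtype=float)[None, :]
    return scaled_biweight(grid - data, h).mean(axis=1)
```

For example, `scaled_biweight(2.4, 2.5)` is positive while `scaled_biweight(2.6, 2.5)` is zero, so with h = 2.5 a data point only influences the estimate within distance 2.5.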

Fixed in version 0.14

  • Theory: finished draft
  • Discussion: finished draft

Fixed in version 0.13

  • Derivations: bias-variance

Fixed in version 0.12

  • Results: account for comments
  • Results: all subsections: compare peak, centroid, etc in each section briefly
  • Results: all subsections: compare selectors.
  • Results: all subsections: make more clear MISE chart discussion is for all selectors, etc

Fixed in version 0.11

  • Theory: continue major rewrite

Fixed in version 0.10

  • Theory: major rewrite
  • Literature search: start
  • Results: text mentions colours

Fixed in version 0.9

  • Discussion: major rewrite
  • Theory: major rewrite
  • Method: double integral with (x1, x2) in W as limits
  • Method: 4.2.1: "in many cases": when & how?
  • Method: 4.4: "A common method for bandwidth selection..." to start
  • Results: (and everywhere) in population and incidents scatter use a different size/symbol instead of colour
  • Results: make all charts b/w or grayscale (not colour)
  • Results: only 2 charts per row
  • Results: bandwidth histograms: lighten fill; why shadow in printed?
  • Discussion: Create Boxplot of MISE distribution comparing Oracle to Silverman to CV selection, possibly OTHER accuracy measures
  • Discussion: Create Boxplot of MISE difference between S-O and CV-O (MISE only)
  • Discussion: overall plot of everything e.g. NMISE of CV, Silv, Oracle vs. experiment number
  • Appendix: Section A.10 title should say peaks are NOT in same place
  • Appendix: Section A.8 mention in subsection title that peaks are in same place

Fixed in version 0.8

  • Results: major rewrite
  • Results: 5.1 mentions 5.6-7 but not 5.2-4
  • Appendix: some tables are squished. Paragraph indentation?
  • For NMIAE and NSUP - use mu not mu^2
  • Results: Figure 5.10 "empirical distribution of MISE and RMISE"
  • Define h_opt
  • Results: 5.2 sample size: n = actual number of incidents; state that we fix the expected but observe actual
  • "error fell with increasing the multiplication expected number of incidents" + "mu"
  • 5.2/5.3 add CV bandwidths to tables
  • 5.2/5.3 split titles to 2 rows
  • 5.2/5.3 add "mean" to h_o, etc.
  • Method: Describe subsections 4.1-4.4 and 4.6.
  • Results: 5.1 drift needs to be seen in relation to square
  • Results: convergence rates
  • 5.2/5.3 describe what's in the tables and make a plot (log-log?)
  • "negative polynomial order" <-- check

Fixed in version 0.7

  • Method: explain the units of distance
  • Method: mention no edge effect compensation
  • Method: mention parallelization and randomization algorithms and R packages, and AWS including which instance types and other details of execution
  • Method: more rigorous math
  • Method: add "relative" and "normalized" error measures
  • Results: 5.1 mention that population and incident bivariate normal is independent with equal variances (and move to Method)
  • Results: 5.1 mention the actual size of the study area (and move to Method)
  • Method: describe the data generation process
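
As a rough illustration of the data generation items above, the following sketch draws one set of incident locations. It is an assumption-laden example, not the thesis code: the function name `simulate_incidents` and its parameters are hypothetical, the number of points is drawn from a Poisson distribution with a fixed expected value, and locations are bivariate normal with independent coordinates and equal variances. Clipping to the study area is not handled.

```python
import numpy as np

def simulate_incidents(mu, center, sigma, rng):
    """Draw one point pattern (hypothetical helper, for illustration only).

    mu     -- expected number of points; the realised count is Poisson(mu)
    center -- (x1, x2) location of the peak
    sigma  -- common standard deviation of both independent coordinates
    rng    -- a numpy.random.Generator
    """
    n = rng.poisson(mu)  # the actual count varies from run to run
    return rng.normal(loc=center, scale=sigma, size=(n, 2))

rng = np.random.default_rng(0)
points = simulate_incidents(mu=500, center=(0.0, 0.0), sigma=1.0, rng=rng)
```

Because the count is itself random, repeated calls with the same `mu` return arrays with different numbers of rows; only the expected number of points is fixed.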

Fixed in version 0.6

  • replace "decay rate" with sigma_p, sigma_i, etc and "spread"
  • No headers on pages
  • Method: use "approximate" instead of "estimate" for MISE, MIAE etc.
  • Method: use \widetilde instead of \hat on MISE, MIAE, etc.
  • Discussion: move 5.8 to 6
  • Conclusion: when is Silverman better than CV?
  • use (x_1, x_2) instead of vector x
  • Results: How does spread affect bandwidth?
  • Appendix: table 36 oracle is worse than Silverman?

Fixed in version 0.5

  • Explain error measures with formulas
  • subcaption titles are too wide
  • appendix: subcaptions on top of tables
  • subcaption: "true risk function distribution"
  • factor: see if another term is common; otherwise use expected number of incidents

TODO
