Tuning SMOTE
ai-se / smote_tune
ICSE'18: Tuning SMOTE
Home Page: https://dl.acm.org/citation.cfm?id=3180197
Change logs show that the most recently or frequently changed files are the most probable sources of future defects [11], [21], [7].
Most of these implementations are provided in Scikit-Learn~\cite{pedregosa2011scikit} and used in our code.
K metrics combined with OO (object-oriented) metrics perform better than all other metrics.
Weka: 20 attributes in each dataset. Datasets are listed from top to bottom in order of high to low imbalance. Attribute selection used CFS with breadth-first search.
Here's a lightweight description. Note that point 3 has to be changed for numeric attributes.
2. DE scores each {\em pop}$_i$ according to various objective
scores $o$. In the case of our goal models, the objectives are $o_1$ the sum of the cost
of its decisions, $o_2$ the number of ignore edges, $o_3$ the number of satisfied goals,
and $o_4$ the number of satisfied softgoals.
3. OPTIMIZE tries to replace each {\em pop}$_i$ with a mutant $m$
built by extrapolating between three other members of the population $a,b,c$.
With probability $p_1$, each decision $a_k \in a$ becomes
$m_k= a_k \vee (p_1 < \mathit{rand}() \wedge( b_k \vee c_k))$.
4. Each mutant $m$ is assessed by calling $\text{SAMPLE}(\textit{model,prior=m})$;
i.e., by seeing what can be achieved within the goal model after first assuming
that $\textit{prior}=m$.
5. To test if the mutant $m$ is preferred to {\em pop}$_i$, OPTIMIZE uses
Zitzler's continuous domination {\em cdom}
predicate~\cite{Zitzler2004}. This predicate compares the objective scores
of two candidates $x$ and $y$: $x$ is better than $y$ if $x$ ``loses'' least.
In the following, $n$ is the number of objectives and $w_j \in \{-1, 1\}$
indicates whether objective $j$ should be minimized or maximized:
\[
\begin{array}{rcl}
x \succ y & = & \textit{loss}(y,x) > \textit{loss}(x,y)\\
\textit{loss}(x,y) & = & \sum_j^n -e^{\Delta(j,x,y,n)}/n\\
\Delta(j,x,y,n) & = & w_j(o_{j,x} - o_{j,y})/n
\end{array}
\]
OPTIMIZE repeatedly loops over the population, trying to replace items with mutants,
until no new, better mutants are found.
Finally, return the population.
\hline
\end{tabular}
\caption{Procedure OPTIMIZE: strives to find ``good'' priors which,
when passed to SAMPLE, maximize the number of edges used
while also minimizing cost and
maximizing the number of satisfied hard goals and soft goals.
OPTIMIZE is based on Storn's differential evolution optimizer~\protect\cite{storn1997differential}.
OPTIMIZE is called by the RANK procedure of \fig{rank}.
For the reader unfamiliar with the mutation technique of step 3 and the {\em cdom}
scoring of step 5, we note that these are standard practice in the search-based
SE community~\cite{Fu2016,krall2015gale}.
}\label{fig:optimize}
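For concreteness, the mutation of step 3 and the {\em cdom} test of step 5 can be sketched in plain Python. This is a minimal illustration, not the paper's implementation; the weight vector, the example objective values, and the default $p_1$ below are illustrative assumptions.

```python
import math
import random

def loss(x, y, weights):
    """Zitzler's continuous-domination loss of x against y.
    weights[j] is -1 to minimize objective j, +1 to maximize it."""
    n = len(weights)
    return sum(-math.exp(w * (xj - yj) / n)
               for w, xj, yj in zip(weights, x, y)) / n

def cdom(x, y, weights):
    """x is preferred to y if swapping the arguments loses more."""
    return loss(y, x, weights) > loss(x, y, weights)

def mutate(a, b, c, p1=0.75, rnd=random):
    """Boolean DE mutation of step 3, taken as written in the text:
    m_k = a_k or (p1 < rand() and (b_k or c_k))."""
    return [ak or (p1 < rnd.random() and (bk or ck))
            for ak, bk, ck in zip(a, b, c)]

# Two candidates scored on (cost, ignores, goals, softgoals):
w = [-1, -1, 1, 1]           # minimize cost and ignores; maximize the rest
x, y = [2, 1, 5, 3], [4, 2, 4, 1]
print(cdom(x, y, w))         # x is better on every objective -> True
```

Note that {\em cdom} degrades gracefully with many objectives, which is why it is preferred over binary domination in the search-based SE literature cited above.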
Hence,
our learning objective can be generally described
as ``obtaining a classifier that will provide high accuracy for the minority class without severely compromising the accuracy of the majority class''.
They found that techniques like AdaBoost.NC performed better than the rest, while others are planning to use SMOTE~\cite{gray20|
?? run this into the last sentence: "and they found that.."
it leaves open issues like
More generally, lit reviews must respect and disrespect: respectfully present others' work, then point out their fatal mistake and why this work is needed.
It is important to select how many synthetic examples to create (
AUC(pf,pdf)
AUC(low, pd)
Increase the width of Figs. 2, 3, 4, 5: make them full-page width (but don't increase the font size).
SMOTE's super-sampling selects instances from the minority class, finds the $k$ nearest neighbors of each selected instance, and then creates new instances using the selected instances and their neighbors until there are $m$ minority-class samples.
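As a hedged illustration of that loop (a hand-rolled sketch for numeric attributes only, not the implementation used in this repo or in any library), the interpolation can be written as:

```python
import random

def smote(minority, m, k=5, rnd=None):
    """Sketch of SMOTE over-sampling: repeatedly pick a minority instance,
    pick one of its k nearest neighbors, and add a synthetic point at a
    random position on the line between them, until there are m samples."""
    rnd = rnd or random.Random(1)
    dist2 = lambda p, q: sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    out = list(minority)
    while len(out) < m:
        p = rnd.choice(minority)
        neighbors = sorted((q for q in minority if q is not p),
                           key=lambda q: dist2(p, q))[:k]
        q = rnd.choice(neighbors)
        gap = rnd.random()  # interpolation fraction along the line p -> q
        out.append(tuple(pi + gap * (qi - pi) for pi, qi in zip(p, q)))
    return out

synthetic = smote([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], m=6, k=2)
```

Because each synthetic point is a convex combination of two real minority instances, all new points stay inside the convex hull of the minority class. As the note above says, point 3 (the interpolation) has to be changed for non-numeric attributes.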
I think Ghotra et al. [17] used AUC(effort, recall), not AUC(pd, pf). Please check.
Results by Tantithamthavorn et al. [50] also suggested that every dataset comes with different attributes, and that classification techniques often have configurable parameters that control the characteristics of the classifiers they produce. The time has come to think about hyperparameter optimization of these techniques and to devise an automated process [2], [16] that tunes these parameters for every dataset.
It then creates new instances using the selected instances and their neighbors.
How?