
Transformation ranker

The idea of this algorithm is quite simple. As in ranking-based feature selection, we compute a ranking criterion for each attribute. In addition, we apply a nonlinear transformation to each attribute independently and choose its parameters so that the ranking criterion value is maximized. The algorithm follows these steps:

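  • for each attribute x_i:
    • calculate the ranking value F(x_i, c), where c denotes the class labels
    • maximize (or minimize, depending on the criterion) F(t(x_i | a, b), c) over the parameters a and b, where t(x_i | a, b) = tanh(a(x_i - b)), exp(-a(x_i - b)), log(a(x_i - b)), etc.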

This simple and fast procedure usually improves classification accuracy.
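
The steps above can be sketched in Python as follows. This is only an illustration under several assumptions that are not stated on this page: the ranking criterion F is taken to be the absolute Pearson correlation with the class labels, only the tanh transformation is tried, and the parameters a and b are found by a coarse grid search rather than a proper optimizer. All function names (abs_correlation, best_tanh_params, transformation_ranker) are hypothetical.

```python
import numpy as np

def abs_correlation(x, c):
    """Assumed ranking criterion F(x, c): absolute Pearson correlation."""
    if np.std(x) == 0 or np.std(c) == 0:
        return 0.0  # correlation is undefined for a constant vector
    return float(abs(np.corrcoef(x, c)[0, 1]))

def tanh_transform(x, a, b):
    """t(x | a, b) = tanh(a (x - b))."""
    return np.tanh(a * (x - b))

def best_tanh_params(x, c, criterion=abs_correlation):
    """Grid-search a and b to maximize criterion(t(x | a, b), c).

    Returns (score, a, b); a and b are None when no candidate
    beats the untransformed attribute.
    """
    std = np.std(x)
    scale = std if std > 0 else 1.0
    slopes = np.array([0.1, 0.3, 1.0, 3.0, 10.0]) / scale
    shifts = np.quantile(x, np.linspace(0.05, 0.95, 19))
    best = (criterion(x, c), None, None)  # baseline: raw attribute
    for a in slopes:
        for b in shifts:
            score = criterion(tanh_transform(x, a, b), c)
            if score > best[0]:
                best = (score, a, b)
    return best

def transformation_ranker(X, c):
    """Process each attribute (column of X) independently."""
    results = []
    for i in range(X.shape[1]):
        x = X[:, i]
        score, a, b = best_tanh_params(x, c)
        results.append({
            "attribute": i,
            "raw": abs_correlation(x, c),  # F(x_i, c)
            "transformed": score,          # best F(t(x_i | a, b), c)
            "a": a,
            "b": b,
        })
    return results
```

Calling transformation_ranker(X, c) with a samples-by-attributes array X and a numeric label vector c returns one record per attribute. Attributes whose "a" and "b" are not None would then be replaced by tanh_transform(x, a, b) before training the classifier. The exp and log variants could be added as further candidates in the same loop, with care taken that the argument of the logarithm stays positive.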

Example results

Example results for a 1NN classifier, with and without the transformation (results marked as trans were obtained with the described algorithm):

  • Ionosphere
    • 1NN BERR_loss=0.18052+-0.075205
    • 1NN (trans) BERR_loss=0.08834+-0.046615
    • 1NN class_loss=0.13667+-0.059647
    • 1NN (trans) class_loss=0.071111+-0.040605
  • Pima Indians
    • 1NN class_loss=0.29158+-0.053642
    • 1NN (trans) class_loss=0.28771+-0.035011
  • Cleveland heart disease
    • 1NN class_loss=0.22885+-0.088873
    • 1NN (trans) class_loss=0.20126+-0.1119
  • Spam
    • 1NN class_loss=0.088894+-0.013225
    • 1NN (trans) class_loss=0.07368+-0.018687
    • 1NN BERR_loss=0.093415+-0.013619
    • 1NN (trans) BERR_loss=0.074778+-0.018056

BERR_loss is the balanced error rate (the mean of the per-class error rates).
class_loss is the ordinary error rate (the fraction of misclassified samples).
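
The two loss measures can be computed as in the short sketch below. The function names are hypothetical, and this is not necessarily the code used to produce the numbers above.

```python
import numpy as np

def class_loss(y_true, y_pred):
    """Ordinary error rate: fraction of misclassified samples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

def balanced_error_rate(y_true, y_pred):
    """Mean of the error rates computed separately for each true class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class = [np.mean(y_pred[y_true == k] != k) for k in np.unique(y_true)]
    return float(np.mean(per_class))
```

For instance, if eight of ten samples belong to class 0 and are all classified correctly, while one of the two class 1 samples is misclassified, the ordinary error rate is 0.1 but the balanced error rate is (0 + 0.5) / 2 = 0.25. The balanced measure is therefore more informative on data sets with unequal class sizes.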

Comments on the results:

As shown, the method works quite well, although the variance for Cleveland is rather high. Results obtained in the same way for Hyperthyroid are very bad (BERR = 66%), so the method sometimes fails. Such high variance means that the results are sometimes very good and sometimes very bad (but usually good).

nauka/projekty/selekcja_cech/trans_rank.txt · Last modified: 2019/03/21 13:06 (external edit)