About the AdaBoost algorithm
I'm working on traffic flow prediction, where I predict whether a place has heavy or light traffic. I have classified each traffic level as 1-5, 1 being the lightest traffic and 5 being the heaviest.
I came across this paper, http://www.waset.org/journals/waset/v25/v25-36.pdf, on the AdaBoost algorithm, and I'm really having difficulty learning it. Especially the part where S is the set {(x_i, y_i), i = 1, 2, ..., m}, where Y = {-1, +1}. What are x, y, and the constant L? What is the value of L? Can someone explain this algorithm to me? :)
Source: https://stackoverflow.com/questions/11825493
Accepted answer
S = {(x_1, y_1), ..., (x_m, y_m)}: every (x, y) pair is a sample used for training (or testing) your classifier:

x = the features which describe this particular sample, for example values listing the amount of cars on the road, the day of the week, etc.
y = the label for a particular x, which in your case can be 1, 2, 3, 4 or 5
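For concreteness, here is a minimal sketch in Python of what S looks like as data; the feature names and values are made up to illustrate the structure, not taken from the paper:

```python
import numpy as np

# x: one feature vector per sample, e.g. [cars_on_road, day_of_week, hour]
# (hypothetical features, just to show the shape of the data)
X = np.array([
    [120, 1,  8],   # Monday, 8 am, 120 cars
    [ 35, 6, 14],   # Saturday, 2 pm, 35 cars
    [210, 4, 17],   # Thursday, 5 pm, 210 cars
])

# y: the label for each x; in your setup, a traffic level from 1 to 5
y = np.array([3, 1, 5])

# S = {(x_1, y_1), ..., (x_m, y_m)} with m = 3 here
S = list(zip(X, y))
```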
Table 1 in the paper shows the x features they used, namely: DAY, TIME, INT, DET, LINK, POS, GRE, DIS, VOL and OCC. The last column of the table shows the label (y), which they set to either 1 or -1 (i.e., yes or no). Every row in the table is one sample.
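To make that concrete, one row of Table 1 could be encoded as an (x, y) pair like this; the values below are hypothetical, since the actual encoding of each field is defined in the paper:

```python
# One hypothetical Table 1 row; the real value encodings for
# DAY, TIME, etc. are defined in the paper, these are made up.
row = {"DAY": 2, "TIME": 830, "INT": 1, "DET": 4, "LINK": 7,
       "POS": 0, "GRE": 30, "DIS": 150, "VOL": 42, "OCC": 0.15}

x = [row[k] for k in ("DAY", "TIME", "INT", "DET", "LINK",
                      "POS", "GRE", "DIS", "VOL", "OCC")]
y = +1  # label from the last column: +1 ("yes") or -1 ("no")
sample = (x, y)
```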
L is the number of rounds in which AdaBoost trains a weak learner (in the paper, a Random Forest is used as the weak classifier). If you set L to 1, AdaBoost will run one round and only one weak classifier will be trained, which will give bad results. Perform multiple experiments with different values of L to find the optimal value (i.e., the point where AdaBoost has converged or starts to overfit).
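Here is a minimal sketch of such an experiment using scikit-learn's AdaBoostClassifier, where the n_estimators parameter plays the role of L; note that scikit-learn's default weak learner is a decision stump, not the Random Forest used in the paper, and the data below is a synthetic stand-in:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; replace with your own features and labels.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

for L in (1, 10, 50, 100, 200):
    clf = AdaBoostClassifier(n_estimators=L, random_state=0)
    clf.fit(X_train, y_train)
    # Look for the L where validation accuracy stops improving
    # (convergence) or starts dropping (overfitting).
    print(f"L={L:4d}  train={clf.score(X_train, y_train):.3f}  "
          f"val={clf.score(X_val, y_val):.3f}")
```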