final project - ustcstaff.ustc.edu.cn/~zwp/teach/mva/final.pdf · 2014. 6. 3. · p39-3 p39-4 p39-6...

6
Final Project Olivetti Faces Dataset(http://cs.nyu.edu/ ~ roweis/data.html)<ª, TŒ ·AT&TŒx¢¿u1992c41994c4ˇm8. TŒ840<ª, zª'E˙64x64, z<k10^eª. k<ª·3m:, 1,9L(/4œ«, /)eß. /kª¡·3!eß. lTŒ8J5Cfaces¿;faces.txt, 400x4096. w'T Œ8. TŒ82^uyO˘S{5U, duz<=k10gE*, ˇd^ uˆiiO˘S{¥. ~XBien, J., and Tibshirani, R. (2011), Verma et al. (2013), Calandriello et al. (2013). K Ja{ØTŒ8?1', 'a{5U. ˜ƒ^O'a{, /'{ØTŒ?1', d\'a5U. ƒ 'wAªK¥. 'wAU˘'ƒ?1: IK, , `, SN({0, {, '(J), o(, º'z SN{, Qª6, 'L§> 'w¯˘5, ƒ^L A T E X^, Æ/. '{ƒ^/˘{˜, ud. 'wV¡(x)<, ŒL5. Jm: 617F J/:: +n¢1006 dIO ,SN5, w65 'L§5, (JL>5

Upload: others

Post on 29-Sep-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

Final Project

Olivetti Faces Dataset(http://cs.nyu.edu/~roweis/data.html)�¹�<�òã, Tê

â´AT&Têx¢�¿u1992c4��1994c4�Ïmæ8�. Têâ8�¹40 <��Ýòã,

zÜã�©EÇ�64x64, z�<�k10ÜØÓ^�e�òã. k<�òã´3ØÓ�m:,ØÓ

1ì,±9ØÓòÜL�( /4ú«, ��/Ø�)eû��. ¤kã¡Ñ´3çÚþ!�µeû�.

·�lTêâ8J�Ñ5Cþfaces��¿�;�faces.txt, Ù��400x4096Ý. ��w±©ÛT

êâ�8�.

Têâ8�2�^u�yÚOÆS�{�5U, duz�<=k10g­E*ÿ, Ïd�·Ü^

uÃiÒ�½ö�iÒ�ÚOÆS�{¥. ~XBien, J., and Tibshirani, R. (2011), Verma et al.

(2013), Calandriello et al. (2013)��.

¯K

• ÀJÜ·�àa�{éTêâ8?1©Û, ©Ûàa�{�5U.

• �Ħ^�O�©a�{, ̤©�{éTêâ?1©Û, µd\�©aì5U.

�¦

• ©Û�wA���¹þã¯K¥���.

• ©Û�wAUìÆâØ©�ªÚ�¦?1:

– IK, �ö, Á�, SN({0, �{, ©Û(J), o(, ë�©z

– SN{', Qã�ß6�, ©ÛL§î>

• ©Û�w�����ÅÆâ��5�, ¦^LATEX^�Ö�, Õá�¤.

• ©Û�{±¦^¤Æ�{�Ä�, �Ø�ud.

• ©Û�wV¡(çx)�<, �êØ�L5�.

• J����m: 6�17F

• J�/:: +n�ï¢1006

µdIO

• �ª,SN���5, �w���6�5

• ©ÛL§���5, (JL��î>5

Page 2: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

ÐÚàa©Û~

Ö�êâÚ㫤kòã:

1 l i b r a r y ( RColorBrewer )

showMatrix <− f unc t i on (x , . . . )

3 image ( t ( x [ nrow (x ) : 1 , ] ) , xaxt = ’ none ’ , yaxt = ’ none ’ ,

c o l = rev ( colorRampPalette ( brewer . pa l (9 , ’ Greys ’ ) ) (100) ) , . . . )

5

f a c e s<−read . t ab l e s ( ” f a c e s . txt ” )

7 x<−as . matrix ( f a c e s )

par (mfrow=c (20 ,20) ,mar=rep (0 , 4 ) )

9 f o r ( i in 1 : 400 )

showMatrix ( matrix (x [ i , ] , 6 4 , 6 4 ) )

11 rownames (x )<−paste ( ”P” , rep ( 1 : 4 0 , each=10) , ”−” , rep ( 1 : 1 0 , 4 0 ) , sep=”” )

·�±139�<�òã�~,

1 xs<−x [ 3 8 1 : 3 9 0 , ]

xd<−d i s t ( xs )

3 par (mfrow=c (2 , 5 ) ,mar=rep (0 , 4 ) )

f o r ( i in 1 : 10 )

5 showMatrix ( matrix ( xs [ i , ] , 6 4 , 6 4 ) )

¦^�gàa�{ÚBien and Tibshirani (2011)��{éÙ?1àa

1 par (mfrow=c (2 , 1 ) )

#Hi e r a ch i c a l c l u s t e r i n g

3 hh<−hc lu s t (xd , method=”average ” )

p l o t (hh , hang=−1)5 f o r ( i in 1 : nrow ( xs ) )

subplot ( showMatrix ( matrix ( xs [ hh$ order [ i ] , ] , 6 4 , 6 4 ) ) , i , 7 0 , s i z e=c ( 0 . 5 , 0 . 5 ) )

7 #perform minimax l i nkage c l u s t e r i n g by Bien , J . , and Tibsh i ran i , R. (2011) .

l i b r a r y ( p r o t o c l u s t )

9 l i b r a r y (TeachingDemos )

hm<−p ro t o c l u s t ( xd )11 p lo t (hm, hang=−1)

f o r ( i in 1 : nrow ( xs ) )

13 subplot ( showMatrix ( matrix ( xs [hm$ order [ i ] , ] , 6 4 , 6 4 ) ) , i , 7 0 , s i z e=c ( 0 . 5 , 0 . 5 ) )

Page 3: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

P39

−3

P39

−4

P39

−6

P39

−10

P39

−2

P39

−7

P39

−8

P39

−5

P39

−1

P39

−9

600

1600

Cluster Dendrogram

hclust (*, "average")xd

Hei

ght

P39

−3

P39

−4

P39

−6

P39

−10

P39

−5

P39

−1

P39

−9

P39

−2

P39

−7

P39

−8

600

1600

Cluster Dendrogram

protoclust (*, "minimax")xd

Hei

ght

�±wÑ, ü«àa�{ÑéÐ�òù10Üòã?1àa, z�aL«ØÓ�&f��.

3Bien and Tibshirani(2011)�minimax linkage�{¥, a�¥%ÏL���L�L«, Ïd,

�gàaL§¥zÚÜ¿���a�¥%�±3äGãþL«Ñ5. e¡·�éÑàaL§¥zÚ

�¥%òã��I, âdò�L�3äGãþL«Ñ5. ¢yBien and Tibshirani(2011)�ã5.

1 #compute the coo rd ina t e s o f nodes

abs i <− f unc t i on ( hc , l e v e l=length ( hc$ he ight ) , i n i t=TRUE) {3 i f ( i n i t ) {

. count <<− 0

5 . topAbsis <<− NULL

. he i gh t s <<− NULL

7 }i f ( l e v e l <0) {

9 . count <<− . count + 1

return ( . count )

11 }node <− hc$merge [ l e v e l , ]

13 l e <− abs i ( hc , node [ 1 ] , i n i t=FALSE)

r i <− abs i ( hc , node [ 2 ] , i n i t=FALSE)

15 mid <− ( l e+r i ) /2

. topAbsis <<− c ( . topAbsis , mid )

17 . h e i gh t s <<− c ( . he ights , hc$ he ight [ l e v e l ] )

Page 4: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

i n v i s i b l e (mid )

19 }abs i (hm)

21 ord<−order ( . h e i gh t s )yo<− . h e i gh t s [ ord ]

23 xo<− . topAbsis [ ord ]

f o r ( i in 1 : ( nrow ( xs )−1) )25 subplot ( showMatrix ( matrix ( xs [hm$proto [ i ] , ] , 6 4 , 6 4 ) ) , xo [ i ] , yo [ i ] , s i z e=c ( 0 . 5 , 0 . 5 ) )

�ª��

P39

−3

P39

−4

P39

−6

P39

−10

P39

−5

P39

−1

P39

−9

P39

−2

P39

−7

P39

−8

600

800

1000

1400

1800

Cluster Dendrogram

protoclust (*, "minimax")xd

Hei

ght

Page 5: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

ÐÚ�O�©a©Û~

duêâ�400 × 4096�Ý, Cþ�ê���u��þ, ùp·�kæ^̤©ü�, ,�¦

^�5�O©Ûéêâ?1�O�©a. ±z�<���Üã�u�òã,Ù¦ã�Ôöòã.

1 id<−rep ( 1 : 4 0 , each=10)

f a c e s . data . frame<−data . frame ( cbind ( id=id , f a c e s ) )

3 #pick out the l a s t photo f o r each person as the t e s t i n g photo

t e s t i d<−seq (10 ,400 , by=10)

5 t r a i n . f a c e s<−f a c e s . data . frame [− t e s t i d , ]

t e s t . f a c e s<−f a c e s . data . frame [ t e s t i d , ]

O�̤©��, ÀJ̤©�ê±9O�̤©�©.

xc<−s c a l e ( t r a i n . f a c e s [ , −1 ] , s c a l e=FALSE)

2 A<−t ( xc ) / sq r t (360−1)# 4096x360

4 # thus , the covar iance matrix i s A%∗%t (A)

A. egn<−e i gen ( t (A)%∗%A)

6 # 360x360

pc<−A%∗%A. egn$ vec to r s

8 pc<−apply ( pc , 2 , f unc t i on ( i ) i / sq r t (sum( i ∗ i ) ) )# normal ize the pc

10 n<−80sum(A. egn$ value [ 1 : n ] ) /sum(A. egn$ value )

12 # 92%, thus we use the f i r s t 80 pcs

pcs<−pc [ , 1 : n ]

14 # pc s c o r e s f o r t r a i n i n g data

yt<−xc%∗%pcs

3̤©�m¦^�5�O©ÛÔö©aì, ¿éÔö��?1�O. ��uy·��~�{

�éÔö��?1�O.

1 f t<−data . frame ( cbind ( id [− t e s t i d ] , yt ) )

f l d a<−lda (X1˜ . , f t )

3 f l<−p r ed i c t ( f lda , f t ) $ c l a s s

d iag ( t ab l e ( f l , id [− t e s t i d ] ) )

5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9

7 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

9 9 9 9 9 9 9 9 9 9 9 9 9 9 9

éu�8,O�Ù3̤©�me��©, âd¦^ÔöÐ�©aì?1©a, ��uy·��

~�{�éu���?1©a.

#pc s c o r e s f o r t e s t i n g data

2 xv<−as . matrix ( t e s t . f a c e s [ , −1 ] )

xvs<−s c a l e ( xv , s c a l e=FALSE)

4 yv<−xvs%∗%pcs

fv<−data . frame ( cbind ( id [ t e s t i d ] , yv ) )

Page 6: Final Project - USTCstaff.ustc.edu.cn/~zwp/teach/MVA/final.pdf · 2014. 6. 3. · P39-3 P39-4 P39-6 P39-10 P39-2 P39-7 P39-8 P39-5 P39-1 P39-9 600 1600 Cluster Dendrogram hclust (*,

6 pl<−p r ed i c t ( f lda , fv ) $ c l a s s

d iag ( t ab l e ( pl , id [ t e s t i d ] ) )

�z

[1] Bien, J., and Tibshirani, R. (2011), Hierarchical Clustering with Prototypes via Minimax

Linkage, The Journal of the American Statistical Association

[2] Verma, T.; Sahu, R.K., (2013) PCA-LDA based face recognition system & results comparison

by various classification techniques, Green High Performance Computing (ICGHPC), 2013

IEEE International Conference on , vol., no., pp.1,7, 14-15

[3] Calandriello D., Niu G., Sugiyama M. (2013), Semi-Supervised Information-Maximization

Clustering