The FSelector package (Romanski, 2013) provides functions for selecting attributes from a given dataset.
It helps identify and remove attributes that are irrelevant or redundant.
> library(FSelector)
> form <- formula(paste(target, "~.")) # paste() converts its arguments to character and concatenates them
> cfs(form, ds[vars]) # cfs() finds an attribute subset using correlation and entropy measures for continuous and discrete data
[1] "min_temp" "sunshine" "wind_gust_speed" "humidity_3pm"
[5] "pressure_3pm" "cloud_3pm"
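
How the formula gets assembled from a string can be checked in isolation. The sketch below assumes the target column is called rain_tomorrow and uses a small made-up data frame named toy; neither appears in the session here, where target, ds and vars were defined in earlier steps.

```r
# Assumption: the target column is named "rain_tomorrow" (hypothetical here).
target <- "rain_tomorrow"

# paste() joins its arguments with a single space: "rain_tomorrow ~."
form_text <- paste(target, "~.")

# formula() parses that string into a formula object
form <- formula(form_text)

# Made-up data frame, used only to show how "." expands
toy <- data.frame(
  min_temp      = c(8.0, 14.0, 13.7),
  sunshine      = c(6.3, 9.7, 3.3),
  rain_tomorrow = c("yes", "no", "no")
)

# With data supplied, "." stands for every column except the target
predictors <- attr(terms(form, data = toy), "term.labels")

print(form)
print(predictors)
```

Building the formula from a variable rather than typing it out means the same selection code keeps working if the target column is renamed.
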
> information.gain(form, ds[vars]) # information gain = H(Class) + H(Attribute) − H(Class, Attribute)
                attr_importance
min_temp           3.539250e-02
max_temp           0.000000e+00
rainfall           0.000000e+00
evaporation        0.000000e+00
sunshine           6.523179e-02
wind_gust_dir      4.073802e-02
wind_gust_speed    3.931861e-02
wind_dir_9am       3.537000e-02
wind_dir_3pm       1.759904e-02
wind_speed_9am     9.813415e-05
wind_speed_3pm     0.000000e+00
humidity_9am       2.858310e-02
humidity_3pm       6.189702e-02
pressure_9am       5.317622e-02
pressure_3pm       6.878745e-02
cloud_9am          3.314110e-02
cloud_3pm          6.893149e-02
temp_9am           0.000000e+00
temp_3pm           0.000000e+00
rain_today         1.261390e-02
>
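
The formula H(Class) + H(Attribute) − H(Class, Attribute) can be worked through by hand on a tiny example. This is a minimal base R sketch, not FSelector's own implementation: the helper names entropy_from_counts and info_gain and the data frame toy are made up for illustration, entropies use the natural logarithm, and only attributes that are already discrete are handled.

```r
# Shannon entropy (natural log) from a table of counts; empty cells are skipped
entropy_from_counts <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Information gain = H(Class) + H(Attribute) - H(Class, Attribute)
info_gain <- function(class, attribute) {
  h_class <- entropy_from_counts(table(class))
  h_attr  <- entropy_from_counts(table(attribute))
  h_joint <- entropy_from_counts(table(class, attribute))
  h_class + h_attr - h_joint
}

# Made-up data: rain_today copies the class exactly, wind is independent of it
toy <- data.frame(
  rain_today    = c("yes", "yes", "no", "no", "yes", "no", "no", "yes"),
  wind          = c("N", "S", "N", "S", "N", "S", "N", "S"),
  rain_tomorrow = c("yes", "yes", "no", "no", "yes", "no", "no", "yes")
)

ig_rain <- info_gain(toy$rain_tomorrow, toy$rain_today)
ig_wind <- info_gain(toy$rain_tomorrow, toy$wind)

print(c(rain_today = ig_rain, wind = ig_wind))
```

An attribute that determines the class completely scores H(Class), which is log(2) for this balanced two-class toy data, while an attribute independent of the class scores zero up to floating-point rounding. That matches how the zero rows in the output above can be read: those attributes, as scored, carry no information about the target.
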