The levels of some of the variables also need to be normalised, as follows.
> sapply(ds[vars], is.factor)
           date        location        min_temp        max_temp        rainfall
          FALSE            TRUE           FALSE           FALSE           FALSE
    evaporation        sunshine   wind_gust_dir wind_gust_speed    wind_dir_9am
          FALSE           FALSE            TRUE           FALSE            TRUE
   wind_dir_3pm  wind_speed_9am  wind_speed_3pm    humidity_9am    humidity_3pm
           TRUE           FALSE           FALSE           FALSE           FALSE
   pressure_9am    pressure_3pm       cloud_9am       cloud_3pm        temp_9am
          FALSE           FALSE           FALSE           FALSE           FALSE
       temp_3pm      rain_today         risk_mm   rain_tomorrow
          FALSE            TRUE           FALSE            TRUE
> factors <- which(sapply(ds[vars], is.factor))
> for (f in factors) levels(ds[[f]]) <- normVarNames(levels(ds[[f]]))
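
To see what the loop does without loading the full dataset, here is a minimal sketch on a toy data frame. The helper `norm_levels()` is a hypothetical, simplified stand-in for `normVarNames()` (lowercase, with non-alphanumeric runs turned into underscores); it is not the real implementation, and the toy values are made up for illustration.

```r
# Hypothetical, simplified stand-in for normVarNames(); NOT the real implementation.
norm_levels <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)  # runs of non-alphanumerics become one underscore
  gsub("^_+|_+$", "", x)           # trim leading/trailing underscores
}

# Toy data frame (made-up values, not the weather dataset).
toy <- data.frame(
  location = factor(c("Alice Springs", "Gold Coast", "Alice Springs")),
  wind_dir = factor(c("NNW", "SE", "NNW")),
  rainfall = c(0.0, 1.2, 0.4)
)

# Same pattern as above: find the factor columns, then rewrite their levels in place.
factors <- which(sapply(toy, is.factor))
for (f in factors) levels(toy[[f]]) <- norm_levels(levels(toy[[f]]))

print(levels(toy$location))
print(levels(toy$wind_dir))
```

Only the level labels change; the observations keep pointing at the same levels. One thing to watch for: if two distinct levels normalise to the same string, assigning them through `levels<-` merges them into a single level.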