'Big Data Analysis Engineer (빅데이터분석기사) Practical Exam, Type 2': know at least this much!

# User code

import pandas as pd
import numpy as np
import sklearn

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier 

# Get DF

X_train_path = 'data/X_train.csv'
X_test_path = 'data/X_test.csv'
y_train_path = 'data/y_train.csv'
y_test_path = '123456.csv'

X_train = pd.read_csv(X_train_path)
X_train_id = X_train.iloc[:, 0]
X = X_train.iloc[:, 1:].copy()  # copy so later column assignments do not touch X_train

X_test = pd.read_csv(X_test_path)
X_test_id = X_test.iloc[:, 0]
X_test = X_test.iloc[:, 1:].copy()  # drop the id column; copy to avoid chained-assignment warnings

y_train = pd.read_csv(y_train_path)
y = y_train.iloc[:, 1]


# info 
# X_train.info()
# X_train.isnull().sum()


# DF preprocessing
# LabelEncoder: fit once on train + test values so both frames share the same mapping
for col in ['주구매상품', '주구매지점']:
    le = LabelEncoder()
    le.fit(pd.concat([X[col], X_test[col]]))
    X[col] = le.transform(X[col])
    X_test[col] = le.transform(X_test[col])

# NaN to zero
X.loc[:, ['환불금액']] = X.loc[:, ['환불금액']].fillna(0)
X_test.loc[:, ['환불금액']] = X_test.loc[:, ['환불금액']].fillna(0)

# model_selection (train_test_split is already imported above)
X1, X2, y1, y2 = train_test_split(X, y, test_size=0.3, random_state=999, stratify=y)


# Model
# RFC
rfc = RandomForestClassifier()
rfc.fit(X1, y1)

pred = rfc.predict_proba(X2)[:, 1]

# Score (roc_auc_score is already imported above)
print(roc_auc_score(y2, pred))

# Submission: probability of the positive class for each test customer
result_pred = rfc.predict_proba(X_test)[:, 1]
result_pred = pd.DataFrame(result_pred)
result = pd.concat([X_test_id, result_pred], axis=1)
result.columns = ['cust_id', 'gender']
# print(result)
result.to_csv(y_test_path, index=False)

df = pd.read_csv(y_test_path)
print(df)

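# XGBC
xgbc = XGBClassifier(learning_rate=0.02, max_depth=20)
# Train on the preprocessed split; the raw X_train/y_train still contain the id column and unencoded strings
xgbc.fit(X1, y1)
print('XGBC ACC.:', xgbc.score(X2, y2))

# Keep only the positive-class probability and the same cust_id/gender layout as above
# Note: this overwrites the RandomForest result saved to the same path
predict = pd.DataFrame(xgbc.predict_proba(X_test)[:, 1])
answer = pd.concat([X_test_id, predict], axis=1)
answer.columns = ['cust_id', 'gender']
answer.to_csv(y_test_path, index=False)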
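A note on the LabelEncoder step: an encoder fitted separately on each frame can assign different integers to the same category, so the model would see mismatched codes at prediction time. The sketch below illustrates this with made-up category values (`train_col` and `test_col` are hypothetical, not exam data) and compares it with one encoder fitted on both.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical category values; the test set happens to lack 'A'
train_col = pd.Series(['A', 'B', 'C'])
test_col = pd.Series(['B', 'C'])

# Separate fits: each encoder numbers only the categories it has seen
separate_train = LabelEncoder().fit_transform(train_col)
separate_test = LabelEncoder().fit_transform(test_col)

# One shared fit on train + test values: the same category gets the same code
le = LabelEncoder().fit(pd.concat([train_col, test_col]))
shared_train = le.transform(train_col)
shared_test = le.transform(test_col)

print(list(separate_train), list(separate_test))
print(list(shared_train), list(shared_test))
```

With separate fits, 'B' is encoded differently in the two frames; with the shared encoder the codes agree.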


License: Attribution, NonCommercial, NoDerivatives

1. Load the data (load)

2. Preprocess the data (preprocessing)

3. Modeling (Modeling)

4. Submit (predict)
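
The four steps above can be sketched end to end on synthetic data. This is only a minimal illustration under stated assumptions: the column names `amount`, `refund`, and `store`, the helper `make_frame`, and the generated values are all made up for the example and stand in for the exam's CSV files.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# 1. Load: synthetic stand-ins for X_train.csv / y_train.csv / X_test.csv (assumed data)
rng = np.random.default_rng(0)

def make_frame(n, start_id):
    """Hypothetical helper that builds a small customer table."""
    return pd.DataFrame({
        'cust_id': np.arange(start_id, start_id + n),
        'amount': rng.normal(100, 20, n),
        'refund': rng.choice([np.nan, 5.0, 10.0], n),
        'store': rng.choice(['A', 'B', 'C'], n),
    })

train, test = make_frame(200, 0), make_frame(50, 200)
y = (train['amount'] > 100).astype(int).rename('gender')  # toy binary target
X, X_test = train.drop(columns='cust_id'), test.drop(columns='cust_id')

# 2. Preprocess: fill NaN with 0, encode categories with one shared encoder
for df in (X, X_test):
    df['refund'] = df['refund'].fillna(0)
le = LabelEncoder().fit(pd.concat([X['store'], X_test['store']]))
X['store'] = le.transform(X['store'])
X_test['store'] = le.transform(X_test['store'])

# 3. Model: hold out 30% of the training data to estimate ROC AUC
X1, X2, y1, y2 = train_test_split(X, y, test_size=0.3, random_state=999, stratify=y)
rfc = RandomForestClassifier(random_state=0).fit(X1, y1)
val_auc = roc_auc_score(y2, rfc.predict_proba(X2)[:, 1])

# 4. Submit: cust_id plus the predicted probability of the positive class
result = pd.DataFrame({
    'cust_id': test['cust_id'],
    'gender': rfc.predict_proba(X_test)[:, 1],
})
print(val_auc, result.shape)
```

In the exam setting, `result.to_csv(..., index=False)` would follow as the final step; it is left out here so the sketch writes no files.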
