Notice
Recent Posts
Recent Comments
Link
일 | 월 | 화 | 수 | 목 | 금 | 토 |
---|---|---|---|---|---|---|
1 | 2 | 3 | ||||
4 | 5 | 6 | 7 | 8 | 9 | 10 |
11 | 12 | 13 | 14 | 15 | 16 | 17 |
18 | 19 | 20 | 21 | 22 | 23 | 24 |
25 | 26 | 27 | 28 | 29 | 30 | 31 |
Tags
- n_neighbors
- Alignments
- 논문editor
- Tire
- 평가지표
- Scienceplots
- Pycaret
- Python
- 프로그래머스
- mes
- Mae
- TypeError
- 에러해결
- python 갯수세기
- PAPER
- 코테
- RMES
- 논문작성
- 파이썬을파이썬답게
- knn
- MAPE
- Overleaf
- SMAPE
- n_sample
- 카카오
- KAKAO
- iNT
- mMAPE
- 스택
- 논문
Archives
- Today
- Total
EunGyeongKim
불황과 호황 예측(로짓 알고리즘) 본문
In [1]:
import datetime
import requests
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
In [2]:
key = 'key'
url = 'https://ecos.bok.or.kr/api/StatisticTableList/'+key+'/xml/kr/1/10000'
raw = requests.get(url)
xml = BeautifulSoup(raw.text, 'xml')
raw_data = xml.find_all('row')
data = []
for i in range(len(raw_data)):
p_stat_code = raw_data[i].P_STAT_CODE.string.strip()
stat_code = raw_data[i].STAT_CODE.string.strip()
stat_name = raw_data[i].STAT_NAME.string.strip()
cycle = raw_data[i].find('CYCLE').text
used = raw_data[i].SRCH_YN.string.strip()
org_name = raw_data[i].ORG_NAME.string
total=[p_stat_code, stat_code, stat_name, cycle, used, org_name]
data.append(total)
In [3]:
df = pd.DataFrame(data, columns=['p_stat_code','stat_code','stat_name','cycle','used','org_name'])
df.to_csv("bok_total_list.csv", encoding='CP949')
In [4]:
df1 = df[df['used'].isin(['Y'])]
stat_code = df1['stat_code'].tolist()
len(stat_code)
Out[4]:
603
In [5]:
# 세부 통계목록
data=[]
for i in range(len(stat_code)):
code = stat_code[i]
url = 'https://ecos.bok.or.kr/api/StatisticItemList/'+key+'/xml/kr/1/100/'+str(code)+'/'
raw = requests.get(url)
xml = BeautifulSoup(raw.text, 'xml')
raw_data = xml.find_all('row')
for j in range(len(raw_data)):
stat_code1 = raw_data[j].STAT_CODE.string.strip()
stat_name = raw_data[j].STAT_NAME.string.strip()
grp_code = raw_data[j].GRP_CODE.string.strip()
grp_name = raw_data[j].GRP_NAME.string.strip()
item_code = raw_data[j].ITEM_CODE.string.strip()
item_name = raw_data[j].ITEM_NAME.string.strip()
cycle = raw_data[j].find("CYCLE").text
start_time = raw_data[j].START_TIME.string.strip()
end_time = raw_data[j].END_TIME.string.strip()
data_cnt = raw_data[j].DATA_CNT.string.strip()
total = [stat_code1,stat_name,grp_code,grp_name,item_code,item_name,cycle,start_time,end_time,data_cnt]
data.append(total)
In [6]:
temp = pd.DataFrame(data, columns=['stat_code','stat_name','grp_code','grp_name','item_code','item_name','cycle','start_time','end_time','data_cnt'])
temp.to_csv('kob.detailTotal.csv', encoding='CP949')
In [7]:
df = temp.copy()
df1 = df[df['stat_code'].isin(['101Y001'])] # M2상품변 구성내역 말잔(계정조정)
df1
Out[7]:
stat_code | stat_name | grp_code | grp_name | item_code | item_name | cycle | start_time | end_time | data_cnt | |
---|---|---|---|---|---|---|---|---|---|---|
140 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS00 | M2(말잔, 계절조정계열) | A | 1970 | 2022 | 53 |
141 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS00 | M2(말잔, 계절조정계열) | M | 197001 | 202301 | 637 |
142 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS01 | 현금통화 | A | 2002 | 2022 | 21 |
143 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS01 | 현금통화 | M | 200112 | 202301 | 254 |
144 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS02 | 요구불예금 | A | 2002 | 2022 | 21 |
145 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS02 | 요구불예금 | M | 200112 | 202301 | 254 |
146 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS03 | 수시입출식저축성예금 | A | 2002 | 2022 | 21 |
147 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS03 | 수시입출식저축성예금 | M | 200112 | 202301 | 254 |
148 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS04 | MMF | A | 2002 | 2022 | 21 |
149 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS04 | MMF | M | 200112 | 202301 | 254 |
150 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS05 | 만기2년미만정기예적금 | A | 2002 | 2022 | 21 |
151 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS05 | 만기2년미만정기예적금 | M | 200112 | 202301 | 254 |
152 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS06 | 수익증권 | A | 2002 | 2022 | 21 |
153 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS06 | 수익증권 | M | 200112 | 202301 | 254 |
154 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS07 | 시장형상품 1) | A | 2002 | 2022 | 21 |
155 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS07 | 시장형상품 1) | M | 200112 | 202301 | 254 |
156 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS08 | 만기2년미만금융채 | A | 2002 | 2022 | 21 |
157 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS08 | 만기2년미만금융채 | M | 200112 | 202301 | 254 |
158 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS09 | 만기2년미만금전신탁 | A | 2002 | 2022 | 21 |
159 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS09 | 만기2년미만금전신탁 | M | 200112 | 202301 | 254 |
160 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS10 | 기타 2) | A | 2002 | 2022 | 21 |
161 | 101Y001 | 1.1.3.1.3. M2 상품별 구성내역(말잔, 계절조정계열) | Group1 | 계정항목 | BBGS10 | 기타 2) | M | 200112 | 202301 | 254 |
In [8]:
main_df = pd.read_csv('bok_total_list.csv', encoding='CP949')
detail_df = pd.read_csv('kob.detailTotal.csv', encoding='CP949')
In [22]:
def EcosDownload(statname, statcode, freq, begdate, enddate, item_code, subcode1, subcode2, subcode3, col_name):
url = "https://ecos.bok.or.kr/api/StatisticSearch/"+key+"/xml/kr/1/1000/%s/%s/%s/%s/%s/%s/%s/%s"%( statcode, freq, begdate, enddate, item_code, subcode1, subcode2, subcode3)
print(url)
raw=requests.get(url)
xml = BeautifulSoup(raw.text, 'xml')
raw_data = xml.find_all('row')
data_list =[]
value_list = []
for item in raw_data:
value = item.find('DATA_VALUE').text.encode('utf-8')
data_str = item.find('TIME').text
if 'Q1' in data_str:
data_str = data_str.replace('Q1', '03')
if 'Q2' in data_str:
data_str = data_str.replace('Q2', '06')
if 'Q3' in data_str:
data_str = data_str.replace('Q3', '09')
if 'Q4' in data_str:
data_str = data_str.replace('Q4', '12')
value = float(value)
data_list.append(data_str)
value_list.append(value)
df = pd.DataFrame(index=data_list)
df['%s'%(col_name)] = value_list
return df
불황과 호황 예측¶
- 로짓분석 머신러닝 사용
- 거지경제 데이터 이용
- 데이터 = 한국은행경졔통계 시스템 사용
불황과 호황을 예측하는데 이용되는 데이터¶
- 호황과 불황을 나타내는 이진목표변수 <- realGDP, 12분기 이동평균을 초과하면 호황(1)을 할당. 그렇지 않다면 불황(0) 할당
- 변수
- realGDP : 실질 국내 총생산(단위 : 전분기 대비 증가율)
- RealCons : 실질 민간소비(단위 : 전분기 대비 증가율)
- INV : 총투자(단위 : 전분기 대비 증가율)
- M2 : M2 통화량(단위 : 전분기 대비 증가율)
- UNEMP : 실업률(단위 : 현분기 실업율)
- EMPLOY : 취업자수(단위 : 전분기 대비 증가율)
- CD_3M : CD 3개월 유통 수익률(단위 : 현분기 수준)
- INFL : 소비자 물가(단위 : 전분기 대비 증가율)
In [ ]:
# 데이터 찾기...
detail_df[(detail_df['stat_code'].str.contains('200Y056')
& detail_df['cycle'].str.contains('Q') )]
In [24]:
# begdate, enddate 고정 :2015Q1, 2022Q4
# 총투자 begdate가 2015Q1에서 시작
d_tmp = [['realGDP','2.1.1.2. 주요지표(분기지표)', '200Y002', '10111'],
['realCons','2.1.1.2. 주요지표(분기지표)', '200Y002', '10122'],
['inv','2.1.9.2. 총저축과 총투자(원계열, 명목, 분기 및 연간)', '200Y056', '13201'],
['M2','1.1.3.1.2. M2 상품별 구성내역(평잔, 원계열)', '101Y004', 'BBHA01'],
['unemp','9.1.5.2. 국제 주요국 실업률(계절변동조정)', '902Y021', 'KOR'],
['employ','9.1.5.3. 국제 주요국 취업자수(계절변동조정)', '902Y022', 'KOR'],
['CD_3M','1.3.2.2. 시장금리(월,분기,년)', '721Y001', '2010000'],
['infl','9.1.2.2. 국제 주요국 소비자물가지수', '902Y008', 'KR']]
In [97]:
tmp_data = pd.DataFrame([])
for i in range(len(d_tmp)):
tmp = (EcosDownload(str(d_tmp[i][1]), str(d_tmp[i][2]), 'Q', '2015Q1', '2022Q4', str(d_tmp[i][3]), '', '', '', d_tmp[i][0]))
tmp_data = pd.concat([tmp_data,tmp],axis=1)
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/200Y002/Q/2015Q1/2022Q4/10111///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/200Y002/Q/2015Q1/2022Q4/10122///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/200Y056/Q/2015Q1/2022Q4/13201///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/101Y004/Q/2015Q1/2022Q4/BBHA01///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/902Y021/Q/2015Q1/2022Q4/KOR///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/902Y022/Q/2015Q1/2022Q4/KOR///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/721Y001/Q/2015Q1/2022Q4/2010000///
https://ecos.bok.or.kr/api/StatisticSearch/213DB5VCLRGGHS2759WS/xml/kr/1/1000/902Y008/Q/2015Q1/2022Q4/KR///
In [98]:
tmp_data
Out[98]:
realGDP | realCons | inv | M2 | unemp | employ | CD_3M | infl | |
---|---|---|---|---|---|---|---|---|
201503 | 0.8 | 0.8 | 109470.6 | 66550.9 | 3.5 | 26101.9 | 2.06 | 109.54 |
201506 | 0.5 | 0.1 | 121348.3 | 68433.1 | 3.7 | 26091.2 | 1.77 | 109.77 |
201509 | 1.4 | 0.6 | 129471.2 | 71089.6 | 3.6 | 26216.0 | 1.63 | 110.09 |
201512 | 0.7 | 1.9 | 129311.5 | 74551.4 | 3.5 | 26304.2 | 1.61 | 109.92 |
201603 | 0.3 | -0.1 | 110589.1 | 78599.8 | 3.7 | 26300.2 | 1.64 | 110.48 |
201606 | 1.2 | 0.7 | 132126.9 | 80127.1 | 3.6 | 26310.9 | 1.54 | 110.69 |
201609 | 0.4 | 0.6 | 140229.5 | 82388.8 | 3.9 | 26468.0 | 1.35 | 110.90 |
201612 | 0.6 | 0.3 | 141772.2 | 84867.8 | 3.6 | 26559.6 | 1.44 | 111.52 |
201703 | 1.0 | 0.6 | 130801.0 | 88896.9 | 3.7 | 26648.5 | 1.49 | 112.91 |
201706 | 0.7 | 1.1 | 148688.3 | 89896.1 | 3.7 | 26684.9 | 1.40 | 112.82 |
201709 | 1.4 | 1.0 | 156139.0 | 91721.0 | 3.7 | 26744.5 | 1.39 | 113.36 |
201712 | -0.3 | 0.6 | 157083.2 | 95771.4 | 3.7 | 26821.9 | 1.50 | 113.12 |
201803 | 1.2 | 1.4 | 136324.4 | 98296.5 | 3.7 | 26824.9 | 1.65 | 114.12 |
201806 | 0.6 | 0.0 | 151671.1 | 98765.8 | 3.7 | 26791.7 | 1.65 | 114.50 |
201809 | 0.7 | 0.6 | 152209.2 | 100070.6 | 4.2 | 26767.0 | 1.65 | 115.11 |
201812 | 0.7 | 0.8 | 157482.7 | 102775.1 | 3.9 | 26908.1 | 1.76 | 115.14 |
201903 | -0.2 | 0.3 | 133180.5 | 106158.2 | 3.9 | 26997.0 | 1.88 | 114.74 |
201906 | 1.1 | 0.4 | 156388.2 | 107112.9 | 4.0 | 27033.4 | 1.84 | 115.25 |
201909 | 0.5 | 0.6 | 157988.7 | 109150.3 | 3.7 | 27131.6 | 1.57 | 115.16 |
201912 | 1.3 | 1.0 | 158562.1 | 112246.2 | 3.6 | 27329.5 | 1.50 | 115.48 |
202003 | -1.3 | -6.6 | 137895.3 | 117818.3 | 3.6 | 27287.2 | 1.37 | 115.85 |
202006 | -3.0 | 1.1 | 158472.2 | 122963.0 | 4.1 | 26634.0 | 0.97 | 115.26 |
202009 | 2.3 | 0.3 | 159464.7 | 127725.2 | 4.0 | 26823.7 | 0.70 | 115.99 |
202012 | 1.2 | -1.1 | 162960.3 | 133370.1 | 4.2 | 26886.8 | 0.65 | 116.01 |
202103 | 1.7 | 1.2 | 143013.1 | 139405.4 | 4.3 | 26889.9 | 0.72 | 117.50 |
202106 | 0.8 | 3.3 | 166314.1 | 143130.9 | 3.8 | 27248.7 | 0.69 | 118.12 |
202109 | 0.2 | 0.0 | 173077.2 | 147435.6 | 3.2 | 27395.5 | 0.81 | 118.93 |
202112 | 1.3 | 1.5 | 182390.3 | 153439.9 | 3.3 | 27549.2 | 1.18 | 120.12 |
202203 | 0.6 | -0.5 | 150713.4 | 159455.3 | 3.0 | 27907.1 | 1.47 | 121.97 |
202206 | 0.7 | 2.9 | 175880.6 | 162860.1 | 2.9 | 28123.0 | 1.80 | 124.51 |
202209 | 0.3 | 1.7 | 193076.4 | 165505.1 | 2.8 | 28167.7 | 2.73 | 125.92 |
202212 | -0.4 | -0.6 | 193354.0 | 164262.8 | 2.9 | 28153.5 | 3.91 | 126.43 |
In [100]:
data = tmp_data.copy()
data['index'] = list(map(int, data.index))
data
Out[100]:
realGDP | realCons | inv | M2 | unemp | employ | CD_3M | infl | index | |
---|---|---|---|---|---|---|---|---|---|
201503 | 0.8 | 0.8 | 109470.6 | 66550.9 | 3.5 | 26101.9 | 2.06 | 109.54 | 201503 |
201506 | 0.5 | 0.1 | 121348.3 | 68433.1 | 3.7 | 26091.2 | 1.77 | 109.77 | 201506 |
201509 | 1.4 | 0.6 | 129471.2 | 71089.6 | 3.6 | 26216.0 | 1.63 | 110.09 | 201509 |
201512 | 0.7 | 1.9 | 129311.5 | 74551.4 | 3.5 | 26304.2 | 1.61 | 109.92 | 201512 |
201603 | 0.3 | -0.1 | 110589.1 | 78599.8 | 3.7 | 26300.2 | 1.64 | 110.48 | 201603 |
201606 | 1.2 | 0.7 | 132126.9 | 80127.1 | 3.6 | 26310.9 | 1.54 | 110.69 | 201606 |
201609 | 0.4 | 0.6 | 140229.5 | 82388.8 | 3.9 | 26468.0 | 1.35 | 110.90 | 201609 |
201612 | 0.6 | 0.3 | 141772.2 | 84867.8 | 3.6 | 26559.6 | 1.44 | 111.52 | 201612 |
201703 | 1.0 | 0.6 | 130801.0 | 88896.9 | 3.7 | 26648.5 | 1.49 | 112.91 | 201703 |
201706 | 0.7 | 1.1 | 148688.3 | 89896.1 | 3.7 | 26684.9 | 1.40 | 112.82 | 201706 |
201709 | 1.4 | 1.0 | 156139.0 | 91721.0 | 3.7 | 26744.5 | 1.39 | 113.36 | 201709 |
201712 | -0.3 | 0.6 | 157083.2 | 95771.4 | 3.7 | 26821.9 | 1.50 | 113.12 | 201712 |
201803 | 1.2 | 1.4 | 136324.4 | 98296.5 | 3.7 | 26824.9 | 1.65 | 114.12 | 201803 |
201806 | 0.6 | 0.0 | 151671.1 | 98765.8 | 3.7 | 26791.7 | 1.65 | 114.50 | 201806 |
201809 | 0.7 | 0.6 | 152209.2 | 100070.6 | 4.2 | 26767.0 | 1.65 | 115.11 | 201809 |
201812 | 0.7 | 0.8 | 157482.7 | 102775.1 | 3.9 | 26908.1 | 1.76 | 115.14 | 201812 |
201903 | -0.2 | 0.3 | 133180.5 | 106158.2 | 3.9 | 26997.0 | 1.88 | 114.74 | 201903 |
201906 | 1.1 | 0.4 | 156388.2 | 107112.9 | 4.0 | 27033.4 | 1.84 | 115.25 | 201906 |
201909 | 0.5 | 0.6 | 157988.7 | 109150.3 | 3.7 | 27131.6 | 1.57 | 115.16 | 201909 |
201912 | 1.3 | 1.0 | 158562.1 | 112246.2 | 3.6 | 27329.5 | 1.50 | 115.48 | 201912 |
202003 | -1.3 | -6.6 | 137895.3 | 117818.3 | 3.6 | 27287.2 | 1.37 | 115.85 | 202003 |
202006 | -3.0 | 1.1 | 158472.2 | 122963.0 | 4.1 | 26634.0 | 0.97 | 115.26 | 202006 |
202009 | 2.3 | 0.3 | 159464.7 | 127725.2 | 4.0 | 26823.7 | 0.70 | 115.99 | 202009 |
202012 | 1.2 | -1.1 | 162960.3 | 133370.1 | 4.2 | 26886.8 | 0.65 | 116.01 | 202012 |
202103 | 1.7 | 1.2 | 143013.1 | 139405.4 | 4.3 | 26889.9 | 0.72 | 117.50 | 202103 |
202106 | 0.8 | 3.3 | 166314.1 | 143130.9 | 3.8 | 27248.7 | 0.69 | 118.12 | 202106 |
202109 | 0.2 | 0.0 | 173077.2 | 147435.6 | 3.2 | 27395.5 | 0.81 | 118.93 | 202109 |
202112 | 1.3 | 1.5 | 182390.3 | 153439.9 | 3.3 | 27549.2 | 1.18 | 120.12 | 202112 |
202203 | 0.6 | -0.5 | 150713.4 | 159455.3 | 3.0 | 27907.1 | 1.47 | 121.97 | 202203 |
202206 | 0.7 | 2.9 | 175880.6 | 162860.1 | 2.9 | 28123.0 | 1.80 | 124.51 | 202206 |
202209 | 0.3 | 1.7 | 193076.4 | 165505.1 | 2.8 | 28167.7 | 2.73 | 125.92 | 202209 |
202212 | -0.4 | -0.6 | 193354.0 | 164262.8 | 2.9 | 28153.5 | 3.91 | 126.43 | 202212 |
In [101]:
data['QUARTER'] = ((data['index'] % 100)/3).astype(int) # % 나머지
data['RollingMean']= data.realGDP.rolling(12).mean()
data['TARGET1'] = (data.realGDP > data.RollingMean).astype(int).shift(-1)
pct_cols = ['M2', 'infl']
data.loc[:, pct_cols] = data.loc[:, pct_cols].pct_change(1)
df = pd.get_dummies(data, columns=['QUARTER'], drop_first=True).dropna()
In [102]:
df.TARGET1.value_counts()
Out[102]:
1.0 10
0.0 10
Name: TARGET1, dtype: int64
In [105]:
df1 = df.copy()
In [108]:
x_data =df1[['realGDP', 'realCons', 'inv', 'M2', 'infl', 'unemp', 'employ', 'CD_3M']].to_numpy()
In [109]:
y_data = df1.TARGET1
In [110]:
def normalization(data):
numerator = data - np.min(data, 0)
denominator = np.max(data, 0) - np.min(data, 0)
return numerator / denominator
In [111]:
x_data = normalization(x_data)
In [112]:
#convert into numpy and float format
X = np.asarray(x_data, dtype=np.float32)
y = np.asarray(y_data, dtype=np.float32)
In [113]:
k = x_data.shape[1]
In [115]:
import tensorflow as tf
In [116]:
learning_rate = tf.Variable(0.003)
W = tf.Variable(tf.random.normal(([k, 1])), name='weight')
b = tf.Variable(tf.random.normal(([1])), name='bias')
for i in range(10000+1):
with tf.GradientTape() as tape:
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
cost = -tf.reduce_mean(y * tf.math.log(hypothesis) + (1 - y) * tf.math.log(1 - hypothesis))
W_grad, b_grad = tape.gradient(cost, [W, b])
W.assign_sub(learning_rate * W_grad)
b.assign_sub(learning_rate * b_grad)
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
if i % 2000 == 0:
print("{:5} | {:10.6f}".format(i, cost.numpy()))
0 | 1.327101
2000 | 0.726086
4000 | 0.712805
6000 | 0.705188
8000 | 0.700846
10000 | 0.698361
In [117]:
y_Predicted = predicted.numpy().flatten()
In [119]:
y_Actual = y.flatten()
In [120]:
data = {'y_Actual': y_Actual,
'y_Predicted': y_Predicted}
In [121]:
df = pd.DataFrame(data, columns = ['y_Actual', 'y_Predicted'])
In [122]:
cross = pd.crosstab(df['y_Actual'], df['y_Predicted'], rownames = ['Actual'], colnames=['Predicted'])
cross
Out[122]:
Predicted | 0.0 | 1.0 |
---|---|---|
Actual | ||
0.0 | 6 | 4 |
1.0 | 3 | 7 |
In [123]:
confusion_matrix = np.zeros([2,2])
In [124]:
try :
confusion_matrix[1,1] = cross.loc[1,1]
confusion_matrix[0,1] = cross.loc[0,1]
confusion_matrix[1,0] = cross.loc[1,0]
confusion_matrix[0,0] = cross.loc[0,0]
except Exception as e:
print(e)
TP = confusion_matrix[1,1]
FP = confusion_matrix[0,1]
FN = confusion_matrix[1,0]
TN = confusion_matrix[0,0]
In [125]:
confusion_matrix
Out[125]:
array([[6., 4.],
[3., 7.]])
In [126]:
TOT = TP + FP + TN + FN
In [127]:
accuracy = (TP + TN)/TOT
accuracy
Out[127]:
0.65
In [127]:
'기타 공부 > 금융' 카테고리의 다른 글
[Coursera] Understanding the Australian Economy : An introduction to macroeconomic and financial policies. (0) | 2024.04.10 |
---|---|
[논문 정리]Reassessment of the Weather Effect: Stock Prices and Wall Street Weather (1) | 2024.02.10 |
카이제곱을 이용한 차이 검정 (성별,직업군별 소득차이 검정) (1) | 2023.03.25 |
차이 검정 (30대 1인 가구주 성별에 따른소득차이 검정) (0) | 2023.03.21 |
차이 검정 (가구주 직업별 소득 차이 검정) (0) | 2023.03.21 |
Comments