The valuewhen function: how do you write it in Python? #2

This post continues from the previous one, "The valuewhen function: how do you write it in Python?" (sine-qua-none.tistory.com).

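In the previous post we looked at how to implement Pine Script's valuewhen function in Python. To recap, valuewhen has the form

valuewhen(nth, condition, data)

and, walking the time series backward from today, returns the value of data at the nth most recent point where condition == True.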
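To make the definition concrete, here is a minimal loop-based sketch on plain Python lists. The name `valuewhen_naive` and the toy inputs are made up for illustration and do not come from the original posts. It assumes a position should get None when fewer than nth True values have occurred up to that point.

```python
def valuewhen_naive(nth, condition, data):
    """For each position, return data at the nth most recent True in condition."""
    result = []
    for i in range(len(data)):
        seen = 0
        value = None  # stays None if fewer than nth True values exist up to i
        for j in range(i, -1, -1):  # walk backward from position i
            if condition[j]:
                seen += 1
                if seen == nth:
                    value = data[j]
                    break
        result.append(value)
    return result


condition = [True, False, True, False, False, True]
data = [10, 11, 12, 13, 14, 15]

print(valuewhen_naive(1, condition, data))  # [10, 10, 12, 12, 12, 15]
print(valuewhen_naive(2, condition, data))  # [None, None, 10, 10, 10, 12]
```

With nth = 1 each position simply carries the value from the latest True. With nth = 2 it carries the value from the True before that one.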

In this post we implement it with pandas' vectorized operations instead of explicit loops.

Python Code : valuewhen

import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta
import FinanceDataReader as fdr


def load_data(stock_code, start_date, end_date):
    df = fdr.DataReader(stock_code, start_date, end_date)
    return df


def valuewhen_test():
    end_date = datetime.datetime.today()
    start_date = end_date - relativedelta(months=1)

    stk_code = '005930'
    df = load_data(stk_code, start_date, end_date)

    # Plays the role of valuewhen's condition: close is more than 1% above the previous close
    df['condition'] = df.Close > df.Close.shift(1) * (1 + 0.01)

    # Hard-code the first row's condition to True so that every later row has a value to fall back on
    # (a single iloc call avoids pandas' chained-assignment pitfall)
    df.iloc[0, df.columns.get_loc('condition')] = True
    pd.set_option("display.max_rows", None)  # display all rows so the result is easy to inspect
    nth = 1                                  # pick the most recent day on which condition is True

    # The data input of valuewhen is df['Close'] here

    # step1. Keep only the rows where condition is True.
    df_step1 = df['Close'][df['condition']]

    # step2. Shift by nth - 1 so that each index lines up with the nth most recent True value.
    df_step2 = df_step1.shift(nth - 1)

    # step3. Reindex the gappy index of step2 onto the full index of Close
    # (the missing rows come back and hold NaN).
    df_step3 = df_step2.reindex(df['Close'].index)

    # step4. Forward-fill with ffill(): each NaN takes the nearest non-NaN value above it.
    df_step4 = df_step3.ffill()
    
    print('df_step1')
    print(df_step1)
    print('df_step2')
    print(df_step2)
    print('df_step3')
    print(df_step3)
    print('df_step4')
    print(df_step4)

if __name__ == '__main__':
    valuewhen_test()

valuewhen breaks down into four steps; see the step1 to step4 comments in the code above. Let's look at the results.

step1. Extract only the data where condition is True

df_step1
Date
2023-03-02    60800
2023-03-06    61500
2023-03-15    59800
2023-03-17    61300
2023-03-22    61100
2023-03-23    62300
2023-03-24    63000
2023-03-28    62900
Name: Close, dtype: int64

These are the Close values on the days where condition is True.

step2. Shift nth - 1 times to get the data at the nth True

df_step2
Date
2023-03-02    60800
2023-03-06    61500
2023-03-15    59800
2023-03-17    61300
2023-03-22    61100
2023-03-23    62300
2023-03-24    63000
2023-03-28    62900
Name: Close, dtype: int64

Here nth = 1, i.e. we want the most recent True, so the shift is by 0 and the result is identical to step1.
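For nth greater than 1 the shift does real work. A small sketch, reusing the first four values of the df_step1 output above and assuming nth = 2 (a case this post does not run):

```python
import pandas as pd

# Close values on the first four days where condition is True (from the df_step1 output)
step1 = pd.Series(
    [60800, 61500, 59800, 61300],
    index=pd.to_datetime(["2023-03-02", "2023-03-06", "2023-03-15", "2023-03-17"]),
    name="Close",
)

nth = 2
# Each True day now holds the value from the True day before it
step2 = step1.shift(nth - 1)
print(step2)
```

The first row becomes NaN because there is no earlier True day to take a value from, and 2023-03-06 now holds 60800, the value from 2023-03-02.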

step3. Fill the gaps in the index

df_step3
Date
2023-03-02    60800.0
2023-03-03        NaN
2023-03-06    61500.0
2023-03-07        NaN
2023-03-08        NaN
2023-03-09        NaN
2023-03-10        NaN
2023-03-13        NaN
2023-03-14        NaN
2023-03-15    59800.0
2023-03-16        NaN
2023-03-17    61300.0
2023-03-20        NaN
2023-03-21        NaN
2023-03-22    61100.0
2023-03-23    62300.0
2023-03-24    63000.0
2023-03-27        NaN
2023-03-28    62900.0
2023-03-29        NaN
2023-03-30        NaN
Name: Close, dtype: float64

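df_step1 and df_step2 had shrunk to just the True rows. Reindexing them onto the original Close index restores every date, and the rows that were missing in between all become NaN.

Note that the very first index survives because its condition is hard-coded to True.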
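The effect of reindex can be seen in isolation on a tiny series. This is only a sketch; the variable names `sparse`, `full_index`, and `restored` are illustrative.

```python
import pandas as pd

# Full trading-day index and a series that only has some of those days
full_index = pd.to_datetime(["2023-03-02", "2023-03-03", "2023-03-06", "2023-03-07"])
sparse = pd.Series([60800, 61500], index=pd.to_datetime(["2023-03-02", "2023-03-06"]))

# Dates absent from `sparse` are added back with NaN
restored = sparse.reindex(full_index)
print(restored)
print(sparse.dtype, "->", restored.dtype)
```

Because NaN is a floating-point value, the integer series is converted to float64, which is why the step3 output above shows 60800.0 rather than 60800.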

step4. Fill every NaN with the most recent non-NaN value above it: ffill()

df_step4
Date
2023-03-02    60800.0
2023-03-03    60800.0
2023-03-06    61500.0
2023-03-07    61500.0
2023-03-08    61500.0
2023-03-09    61500.0
2023-03-10    61500.0
2023-03-13    61500.0
2023-03-14    61500.0
2023-03-15    59800.0
2023-03-16    59800.0
2023-03-17    61300.0
2023-03-20    61300.0
2023-03-21    61300.0
2023-03-22    61100.0
2023-03-23    62300.0
2023-03-24    63000.0
2023-03-27    63000.0
2023-03-28    62900.0
2023-03-29    62900.0
2023-03-30    62900.0
Name: Close, dtype: float64
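One detail worth keeping in mind: ffill() only carries values forward, so NaN values at the very top stay NaN. A minimal sketch with a hypothetical step3 result, as it might look for nth = 2 where the first True row has no earlier True to point at:

```python
import numpy as np
import pandas as pd

# Hypothetical step3 output: leading NaN rows, then values with gaps
step3 = pd.Series([np.nan, np.nan, 60800.0, np.nan, 61500.0, np.nan])

# Each NaN takes the nearest non-NaN value above it; leading NaN rows have none
step4 = step3.ffill()
print(step4)
```

So for nth greater than 1 the first rows of the result can remain NaN even though the first condition is hard-coded to True.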

Done. Now let's compare this with the code from the previous post.

import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta
import FinanceDataReader as fdr

def load_data(stock_code, start_date, end_date):
    df = fdr.DataReader(stock_code, start_date, end_date)
    return df


def valuewhen_test():
    end_date = datetime.datetime.today()
    start_date = end_date - relativedelta(months=1)

    stk_code = '005930'

    df = load_data(stk_code, start_date, end_date)
    df['condition'] = df.Close > df.Close.shift(1) * (1 + 0.01)
    df.iloc[0, df.columns.get_loc('condition')] = True
    pd.set_option("display.max_rows", None)
    nth = 1

    df_step1 = df['Close'][df['condition']]
    df_step2 = df_step1.shift(nth - 1)
    df_step3 = df_step2.reindex(df['Close'].index)
    df_step4 = df_step3.ffill()

    df['res2'] = df_step4

    # Code from the previous post: for each row, walk back to the most recent True
    df['res1'] = 0
    res1_col = df.columns.get_loc('res1')
    for i in reversed(range(len(df))):
        j = i
        while not df['condition'].iloc[j]:
            if j == 0:
                break
            j -= 1

        # Single iloc call instead of chained assignment
        df.iloc[i, res1_col] = df['Close'].iloc[j]

    # Compare res1 (previous post's code) with res2 (this post's code)
    df['compare'] = df['res1'] == df['res2']
    print(df[['Close', 'res1', 'res2', 'compare']])

    
if __name__ == '__main__':
    valuewhen_test()

The result is shown below.

            Close   res1     res2  compare
Date                                      
2023-03-02  60800  60800  60800.0     True
2023-03-03  60500  60800  60800.0     True
2023-03-06  61500  61500  61500.0     True
2023-03-07  60700  61500  61500.0     True
2023-03-08  60300  61500  61500.0     True
2023-03-09  60100  61500  61500.0     True
2023-03-10  59500  61500  61500.0     True
2023-03-13  60000  61500  61500.0     True
2023-03-14  59000  61500  61500.0     True
2023-03-15  59800  59800  59800.0     True
2023-03-16  59900  59800  59800.0     True
2023-03-17  61300  61300  61300.0     True
2023-03-20  60200  61300  61300.0     True
2023-03-21  60300  61300  61300.0     True
2023-03-22  61100  61100  61100.0     True
2023-03-23  62300  62300  62300.0     True
2023-03-24  63000  63000  63000.0     True
2023-03-27  62100  63000  63000.0     True
2023-03-28  62900  62900  62900.0     True
2023-03-29  62700  62900  62900.0     True
2023-03-30  63500  63500  63500.0     True

As expected, every value matches. The valuewhen logic seems to work, so let's wrap it in a function as shown below.

Python version of the valuewhen function

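def valuewhen(nth, condition, data):
    # Value of data at the nth most recent row where condition is True
    df_step1 = data[condition]                 # keep only the True rows
    df_step2 = df_step1.shift(nth - 1)         # line up the nth most recent True value
    df_step3 = df_step2.reindex(data.index)    # restore the full index (gaps become NaN)
    res = df_step3.ffill()                     # carry the last value forward
    return res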

With the logic wrapped in a function like this, running the code below gives the same result.

import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta
import FinanceDataReader as fdr


def load_data(stock_code, start_date, end_date):
    df = fdr.DataReader(stock_code, start_date, end_date)
    return df


def valuewhen(nth, condition, data):
    df_step1 = data[condition]
    df_step2 = df_step1.shift(nth - 1)
    df_step3 = df_step2.reindex(data.index)
    res = df_step3.ffill()
    return res


def valuewhen_test():
    end_date = datetime.datetime.today()
    start_date = end_date - relativedelta(months=1)

    stk_code = '005930'

    df = load_data(stk_code, start_date, end_date)

    df['condition'] = df.Close > df.Close.shift(1) * (1 + 0.01)
    df.iloc[0, df.columns.get_loc('condition')] = True
    pd.set_option("display.max_rows", None)

    nth = 1
    df['res2'] = valuewhen(nth, df['condition'], df['Close'])  # the valuewhen function defined above

    # Loop-based code from the previous post
    df['res1'] = 0
    res1_col = df.columns.get_loc('res1')
    for i in reversed(range(len(df))):
        j = i
        while not df['condition'].iloc[j]:
            if j == 0:
                break
            j -= 1

        df.iloc[i, res1_col] = df['Close'].iloc[j]
    df['compare'] = df['res1'] == df['res2']
    print(df[['Close', 'res1', 'res2', 'compare']])


if __name__ == '__main__':
    valuewhen_test()
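The comparison above needs live market data and only covers nth = 1. As a rough offline check, the sketch below runs the same four-step function on synthetic prices and compares it with a loop-based reference for nth = 1, 2, 3. The helper `valuewhen_loop`, the random prices, and the seed are assumptions made for this illustration. The first condition is not hard-coded to True here, so both versions leave the leading rows as NaN.

```python
import numpy as np
import pandas as pd


def valuewhen(nth, condition, data):
    # Same four steps as the function above, written as one chain
    return data[condition].shift(nth - 1).reindex(data.index).ffill()


def valuewhen_loop(nth, condition, data):
    # Hypothetical loop-based reference used only for this check
    out = []
    for i in range(len(data)):
        # Positions of True values at or before i, most recent first
        hits = [j for j in range(i, -1, -1) if condition.iloc[j]]
        out.append(data.iloc[hits[nth - 1]] if len(hits) >= nth else np.nan)
    return pd.Series(out, index=data.index, dtype="float64")


# Synthetic closing prices on 30 business days (not real market data)
rng = np.random.default_rng(0)
index = pd.bdate_range("2023-03-01", periods=30)
close = pd.Series(rng.integers(59000, 64000, size=30), index=index)
condition = close > close.shift(1) * 1.01

for nth in (1, 2, 3):
    fast = valuewhen(nth, condition, close).astype("float64")
    slow = valuewhen_loop(nth, condition, close)
    print(nth, fast.equals(slow))
```

Series.equals treats NaN values in the same position as equal, so the leading NaN rows do not cause a mismatch.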

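valuewhen is one of the most heavily used functions in the formula managers of HTS platforms. Now that we have it in Python, price logic written in a formula manager can be reproduced in Python as well.

Aside

Compared with the previous post, the Samsung Electronics (005930) time series may look slightly different. I was pulling live data during market hours while writing, so depending on when each part was written, the current day was added and its price changed. That is also why the 2023-03-30 row in the comparison table differs from the step-by-step output earlier in this post.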
