[pandas] Multiple inputs and outputs with apply + a custom function

0. Multiple inputs and outputs with pandas apply + a custom function
- Creating multiple new columns
- result_type documentation

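Sometimes you need to compute multiple outputs from multiple inputs in a DataFrame, and when the conditions are complex it is better to write a custom function. Consider the following situation.

- Compute multiple outputs from multiple columns of a DataFrame
- Create new columns and append them to the right of the original DataFrame

In this case, pass the result_type='expand' argument to apply.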

0. Multiple inputs and outputs with pandas apply + a custom function

Given the DataFrame below, we want to apply a function that takes several columns as input and produces several new columns.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['a', 1, 4], ['a', 2, 5], ['b', 3, 6]]),
                  columns=['col1', 'col2', 'col3'])

print(df)
>>>>  col1 col2 col3
  0    a    1    4
  1    a    2    5
  2    b    3    6
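
One detail worth noting: np.array stores every value in a single dtype, so mixing 'a' with numbers turns 1 and 4 into the strings '1' and '4'. That is why the function used later in this post needs float() and int() casts. The sketch below checks this and, as an assumed alternative that the original example does not use, builds the frame from a dict so the numeric columns keep an integer dtype (df_typed is a name introduced here for illustration).

```python
import numpy as np
import pandas as pd

# Same construction as above: the mixed list is coerced to strings by NumPy.
df = pd.DataFrame(np.array([['a', 1, 4], ['a', 2, 5], ['b', 3, 6]]),
                  columns=['col1', 'col2', 'col3'])
print(type(df['col2'].iloc[0]))  # a string type, not int

# Assumed alternative: a dict of columns lets pandas infer a dtype per column.
df_typed = pd.DataFrame({'col1': ['a', 'a', 'b'],
                         'col2': [1, 2, 3],
                         'col3': [4, 5, 6]})
print(df_typed.dtypes)
```

With df_typed the casts inside the custom function would no longer be necessary, at the cost of departing from the exact setup shown in this post.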

Suppose the function to apply is as follows. Because it receives a whole row, it can refer to columns directly by name, which is convenient.

def multiple_inouts(row):
    if row['col1'] == 'a':  # for 'a': divide col2 by 10 and flip the sign
        col2 = -float(row['col2']) / 10
        col3 = row['col3']
    else:  # otherwise ('b'): multiply col3 by 100
        col2 = row['col2']
        col3 = int(row['col3']) * 100

    return [col2, col3]

- Creating multiple new columns

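When the function returns several values, returning them as a list is not enough: by default apply gives back a single Series whose elements are lists, so assigning the result to several columns at once raises an error. Passing result_type='expand' turns each list into separate columns.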

df[['col2', 'col3']] = df.apply(multiple_inouts, axis=1, result_type='expand')

print(df)
>>>> col1 col2 col3
  0    a -0.1    4
  1    a -0.2    5
  2    b    3  600
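
To make the difference concrete, here is a minimal sketch that calls apply on the same data with and without result_type='expand' and compares the returned objects. The names as_series and as_frame are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['a', 1, 4], ['a', 2, 5], ['b', 3, 6]]),
                  columns=['col1', 'col2', 'col3'])

def multiple_inouts(row):
    if row['col1'] == 'a':
        return [-float(row['col2']) / 10, row['col3']]
    return [row['col2'], int(row['col3']) * 100]

# Default: one Series, each element is the whole returned list.
as_series = df.apply(multiple_inouts, axis=1)

# 'expand': one DataFrame, each list position becomes its own column.
as_frame = df.apply(multiple_inouts, axis=1, result_type='expand')

print(type(as_series).__name__, type(as_series.iloc[0]).__name__)
print(type(as_frame).__name__, as_frame.shape)
```

Only the second form has two columns, which is what an assignment to a two-column key such as df[['col2', 'col3']] expects.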

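To keep the existing columns and append the results as new columns on the right instead, run the same call against the original DataFrame (not the overwritten one) with new column names:

df[['new_col2', 'new_col3']] = df.apply(multiple_inouts, axis=1, result_type='expand')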

print(df)
>>>> col1 col2 col3 new_col2 new_col3
  0    a    1    4     -0.1        4
  1    a    2    5     -0.2        5
  2    b    3    6        3      600
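
As a possible variation, the function can return a pd.Series with named entries, in which case apply expands it into named columns even without result_type, and the result can be appended with join. This is a sketch under assumptions: multiple_inouts_named is a hypothetical variant of the function above, and the frame is built from a dict with integer columns so no casts are needed.

```python
import pandas as pd

# Assumed setup: integer columns instead of the string columns used above.
df = pd.DataFrame({'col1': ['a', 'a', 'b'],
                   'col2': [1, 2, 3],
                   'col3': [4, 5, 6]})

def multiple_inouts_named(row):  # hypothetical variant returning a named Series
    if row['col1'] == 'a':
        new_col2, new_col3 = -row['col2'] / 10, row['col3']
    else:
        new_col2, new_col3 = row['col2'], row['col3'] * 100
    return pd.Series({'new_col2': new_col2, 'new_col3': new_col3})

# A Series return value is expanded into columns named after its index.
new_cols = df.apply(multiple_inouts_named, axis=1)

# join aligns on the row index and appends the new columns on the right.
result = df.join(new_cols)
print(result)
```

The trade-off is that the output column names live inside the function instead of at the call site, and building a Series per row tends to be slower than returning a plain list.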

- result_type documentation

These options only take effect when axis=1.

result_type: {‘expand’, ‘reduce’, ‘broadcast’, None}, default None

'expand': list-like results will be turned into columns
'reduce': returns a Series if possible rather than expanding list-like results. This is the opposite of 'expand'.
'broadcast': results will be broadcast to the original shape of the DataFrame, the original index and columns will be retained.
None: depends on the return value of the applied function
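
The options above can be compared side by side on a small frame. The sketch below uses a toy two-column DataFrame and a function that always returns the list [1, 2]; both are assumptions chosen for illustration, not part of the example in this post. Note that 'broadcast' needs the returned list to have one value per original column.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def pair(row):
    # Toy function: ignores the row and returns a fixed two-element list.
    return [1, 2]

default = df.apply(pair, axis=1)                             # Series of lists
expanded = df.apply(pair, axis=1, result_type='expand')      # columns 0 and 1
reduced = df.apply(pair, axis=1, result_type='reduce')       # Series of lists
broadcast = df.apply(pair, axis=1, result_type='broadcast')  # columns A and B

for name, obj in [('default', default), ('expand', expanded),
                  ('reduce', reduced), ('broadcast', broadcast)]:
    print(name, type(obj).__name__, getattr(obj, 'shape', None))
```

For a list-returning function, None and 'reduce' behave the same way here; 'expand' creates new positional column labels, while 'broadcast' keeps the original index and column names.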
