phrase input should be string, not <class 'float'>

2024. 1. 7. 11:39

df['file'] = df['art_nm'].str.replace("[^ㄱ-ㅎㅏ-ㅣ가-힣 ]", "", regex=True)

for sentence in tqdm(df['art_nm']):
    # print(sentence)
    tokenized_sentence = okt.morphs(sentence, stem=True)  # tokenize

Partway through the loop, the error in the title is raised, saying the input contains a float.

It looks like the np.nan values in the column are being treated as float.
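A quick check makes this easy to confirm. The sketch below uses `math.nan` from the standard library, which is the same kind of object as `np.nan` (a plain Python float); the `is_tokenizable` helper and the sample titles are made up for illustration and are not from the original code.

```python
import math

# math.nan stands in for np.nan here; both are ordinary Python floats.
missing = math.nan

def is_tokenizable(value):
    """Return True only for real strings, which is what the tokenizer expects."""
    return isinstance(value, str)

# Hypothetical column contents: a normal title, a missing value, an empty string.
titles = ["서울 날씨 맑음", missing, ""]
flags = [is_tokenizable(t) for t in titles]

print(type(missing).__name__, flags)
```

Only the missing value fails the check, which matches the error message: the empty string is still a string, while NaN is not.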

df["art_nm"] = df["art_nm"].replace("", " ")
df["art_nm"] = df["art_nm"].replace(np.nan, " ")

After handling the empty and missing values as above, the loop runs through.
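For reference, here is a minimal end-to-end sketch of the same idea on a tiny made-up DataFrame. It swaps in `fillna(" ")` for `replace(np.nan, " ")`, which should be equivalent for this purpose, and uses a hypothetical `tokenize` function based on `str.split` as a stand-in for `okt.morphs`, so it runs without KoNLPy or a JVM.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df and its "art_nm" column come from the post.
df = pd.DataFrame({"art_nm": ["서울 날씨 맑음!", np.nan, ""]})

# Fill missing titles first so every value is a string,
# then strip everything except Hangul characters and spaces.
df["art_nm"] = df["art_nm"].fillna(" ")
df["art_nm"] = df["art_nm"].str.replace("[^ㄱ-ㅎㅏ-ㅣ가-힣 ]", "", regex=True)

def tokenize(sentence):
    # Stand-in for okt.morphs(sentence, stem=True): reject non-strings
    # the same way, then split on whitespace instead of doing morph analysis.
    if not isinstance(sentence, str):
        raise TypeError(f"phrase input should be string, not {type(sentence)}")
    return sentence.split()

tokens = [tokenize(s) for s in df["art_nm"]]
print(tokens)
```

If rows without a title are not needed at all, `df.dropna(subset=["art_nm"])` would be another option instead of filling them with a space.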


ndlessrain