Python RPA (Task Automation) Concepts and Practice - Crawling (Naver Open API)

developers developing 2022. 9. 30. 12:00

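RPA (Robotic Process Automation)

A set of processes that operates web pages, Windows, and applications (such as Excel) automatically according to a predefined scenario, keeping manual work to a minimum.
RPA software
- UiPath, Blue Prism, Automation Anywhere, WinAutomation
RPA libraries (Python)
- pyautogui, pyperclip, selenium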

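Crawling: collecting, classifying, and storing information resources such as web sites, hyperlinks, and data by automated means

Working with URLs: Python provides the urllib library.

urllib.request
1. urlretrieve()
  - Saves the content of the requested URL to a file
  - Returns a tuple (the local file name and the response headers)
  - Suited to saving a large amount of data in one go, such as CSV files or API data
2. urlopen()
  - Does not save a file; loads the response into memory for analysis
  - read(): reads the response body held in memory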
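The difference between the two calls can be sketched as follows. So that the example runs without network access, it points both functions at a throwaway local file through a file:// URL; the file name sample.csv and its contents are made up for illustration, and an http(s) URL would be passed in the same way.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen, urlretrieve

# Illustration only: a temporary local file stands in for a remote resource.
workdir = Path(tempfile.mkdtemp())
source = workdir / "sample.csv"
source.write_bytes(b"rank,name\n1,keyboard\n")
url = source.as_uri()  # a file:// URL pointing at sample.csv

# urlretrieve() writes the resource to disk and returns a (filename, headers) tuple.
saved_path, headers = urlretrieve(url, str(workdir / "copy.csv"))
print("saved to:", saved_path)

# urlopen() saves nothing to disk; read() returns the body as bytes in memory.
with urlopen(url) as response:
    body = response.read()
print(body.decode("utf-8"))
```

urlretrieve() is convenient when the file itself is the goal, while urlopen() fits cases where the data is parsed right away and never needs to be kept on disk.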

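Using the Naver Open API: https://developers.naver.com/main/

Search > Shopping: apply to use the Open API

API basic information ==> copy the JSON request URL

Application registration and My Info: the Client ID and Client Secret are required.

Example call ==> shows how to use the API

Use Talend API Tester to send a request with the request parameters and check the returned values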
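What gets typed into an API tester can also be assembled in Python before anything is sent. The sketch below only builds the request object for the shopping search URL and prints it; the query, display, and start parameters and the two header names are the ones used in the scripts in this post, while the environment variable names NAVER_CLIENT_ID and NAVER_CLIENT_SECRET and the helper build_search_request are hypothetical.

```python
import os
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

API_URL = "https://openapi.naver.com/v1/search/shop.json"


def build_search_request(query, display=10, start=1):
    """Build (but do not send) a GET request for the shopping search API."""
    params = urlencode({"query": query, "display": display, "start": start})
    headers = {
        # Hypothetical environment variable names; any secret storage works.
        "X-Naver-Client-Id": os.environ.get("NAVER_CLIENT_ID", ""),
        "X-Naver-Client-Secret": os.environ.get("NAVER_CLIENT_SECRET", ""),
    }
    return Request(f"{API_URL}?{params}", headers=headers)


request = build_search_request("키보드", display=100, start=1)
print(request.full_url)  # the Korean query is percent-encoded
print(parse_qs(urlsplit(request.full_url).query))  # decoded parameters
```

Reading the credentials from the environment keeps them out of the source file, which matters once a script is shared or published.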

RPAbasic\crawl\requests1 folder - 7_openapi1.py

Crawl Naver search results for "keyboard" and save them to Excel

import requests
from openpyxl import Workbook
from datetime import datetime

# Create the Excel workbook
wb = Workbook()

# Activate the default sheet
ws = wb.active

# Rename the sheet and set the column widths
ws.title = "keyboard 1000"
ws.column_dimensions["B"].width = 60
ws.column_dimensions["C"].width = 80
ws.column_dimensions["D"].width = 15
ws.append(["Rank", "Product name", "Detail page URL", "Lowest price"])

# Credentials issued when the application is registered (placeholders)
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"

url = "https://openapi.naver.com/v1/search/shop.json"
headers = {"X-Naver-Client-Id": client_id, "X-Naver-Client-Secret": client_secret}

start, num = 1, 0

# 10 pages of 100 items each: start = 1, 101, ..., 901
for idx in range(10):
    start_num = start + (idx * 100)
    # "키보드" is the search term ("keyboard"); requests URL-encodes the parameters
    params = {"query": "키보드", "display": 100, "start": start_num}
    res = requests.get(url, headers=headers, params=params)
    print(res.url)

    # Inspect the JSON data
    data = res.json()
    print(data)

    for item in data["items"]:
        num += 1
        print(item["title"], item["link"], item["lprice"])
        ws.append([num, item["title"], item["link"], item["lprice"]])

# File name: navershop_<today's date>.xlsx
today = datetime.now().strftime("%y%m%d")
filename = f"navershop_{today}.xlsx"

# Save the workbook (the download folder must already exist)
wb.save("./RPAbasic/crawl/download/" + filename)
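The titles returned by the search may carry markup, for example <b> tags around the matched term, and the price typically arrives as a numeric string. A minimal sketch of tidying one item before it is appended to the sheet is shown here; the helpers clean_title and to_row and the sample item are made up for illustration, and the field names are the ones the script reads.

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_title(raw_title):
    """Remove HTML tags and decode entities such as &amp;."""
    return html.unescape(TAG_PATTERN.sub("", raw_title)).strip()


def to_row(rank, item):
    """Build one worksheet row: rank, clean title, link, price as an int."""
    return [rank, clean_title(item["title"]), item["link"], int(item["lprice"])]


# Made-up sample item shaped like the fields used by the script.
sample_item = {
    "title": "Mechanical <b>키보드</b> Black &amp; White",
    "link": "https://example.com/product/1",
    "lprice": "35000",
}
print(to_row(1, sample_item))
```

Storing the price as an integer lets Excel sort and sum the column, which a text cell would not allow.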

RPAbasic\crawl\requests1 folder - 8_openapi2.py

Search for books through the Naver API and print the title, link, publisher, and publication date

import requests

# Credentials issued when the application is registered (placeholders)
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"

url = "https://openapi.naver.com/v1/search/book.json"
headers = {"X-Naver-Client-Id": client_id, "X-Naver-Client-Secret": client_secret}

start, num = 1, 0

# 10 pages of 100 items each: start = 1, 101, ..., 901
for idx in range(10):
    start_num = start + (idx * 100)
    # "재테크" is the search term (personal finance and investing)
    params = {"query": "재테크", "display": 100, "start": start_num}
    res = requests.get(url, headers=headers, params=params)
    data = res.json()

    for item in data["items"]:
        num += 1
        print(num, item["title"], item["link"], item["publisher"], item["pubdate"])
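Both scripts always request ten pages and index data["items"] directly, which raises a KeyError if a response carries an error message instead of results. One possible way to make the paging loop more forgiving is sketched below. The helper collect_items and the stand-in fake_fetch are hypothetical, and the default bounds simply mirror the ten pages of 100 used in the scripts; in real use fetch_page would wrap the requests.get(...).json() call.

```python
def collect_items(fetch_page, display=100, max_start=1000):
    """Call fetch_page(start, display) page by page and gather the items."""
    items = []
    for start in range(1, max_start + 1, display):
        data = fetch_page(start, display)
        page_items = data.get("items", [])  # an error payload has no "items"
        items.extend(page_items)
        if len(page_items) < display:  # a short page means nothing is left
            break
    return items


def fake_fetch(start, display):
    """Stand-in for the real HTTP call: pretends that 250 results exist."""
    total = 250
    end = min(start + display - 1, total)
    return {"items": [{"title": f"book {n}"} for n in range(start, end + 1)]}


books = collect_items(fake_fetch)
print(len(books), books[0]["title"], books[-1]["title"])
```

Passing the fetch function in as an argument keeps the paging logic testable without credentials or network access.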

License: Attribution, NonCommercial, NoDerivatives
