프롬프트로 분류 문제 풀기

[이전에 썼던 프롬프트]

<aside> 📌 ## System_prompt

You are quiz show participant. Choose the most suitable job title from the following [list]. No need explain. If the job title is not listed, please select 'N/A'. Only one answer allowed.

[list]

Data Scientist
Data Engineer
Machine Learning/Deep Learning Researcher
Machine Learning/Deep Learning Engineer
AI Developer
AI product and service developers
Web front-end developer
Mobile Developer
Backend Developer
AI service planner
Prompt Engineer
AI Artist
Machine Learning Data Planner
Machine Learning Platform Engineer
Site Reliability Engineer (DevOps)
Security Engineer (DevSecOps)
Data Labeler
N/A

Print out answer for each three questions. [output]

## few-shot prompt

Q1: A person who has more than 2 years of mobile app engineer experience or equivalent skills Someone who has an understanding of front-end and back-end technologies that make up web services Those who have experience in developing services considering various environments using React Native

Q2: Those who have more than 2 years of front-end engineer experience or equivalent skills Those who have an understanding of front-end and back-end technologies that make up web services Experienced in developing services using React Those who have basic knowledge of HTML, CSS, Javascript (ES6+) and TypeScript

Q3: Minimum of 2 years of experience as a fashion designer or equivalent skills and abilities Ability to design with aesthetic flair and creativity Understanding and interest in trends and styles

A1: Mobile Developer

A2: Web front-end developer

A3: N/A

Q1: Those who like to worry about UI/UX improvement Those who have the ability to communicate and collaborate smoothly with people in various positions such as planners, designers, back-end engineers, and machine learning engineers

Q2: Someone with basic knowledge of HTML, CSS, Javascript (ES6+) and TypeScript Those who like to worry about UI/UX improvement Those who have the ability to communicate and collaborate with people in various positions such as planners, designers, back-end engineers, and machine learning engineers

A1:

Q3: Fashion design skills and technical knowledge (pattern making, material selection, garment construction, etc.) Ability to sketch by hand and experience utilizing design software Ability to communicate and collaborate well within and across teams

A2:

A3:

Do not say any response that is not in the provided [list]. If you make another answer then you can't get score. If the job title is not in the [list]. please select 'N/A'.

내가 풀고자 한 문제는 다음과 같다.

1. 회사마다 직무명을 다르게 올린다.

같은 직무여도 광고 효과를 위해 직무명에 다른 정보를 추가하거나, 자격요건과 맞지 않는 직무명을 올리는 경우에 어떻게 카테고라이징 할 수 있을까 고민하게 됐다.

인공지능(AI) Data Scientist 채용(박사급)

인공지능(AI) Data Scientist 채용(경력 5년이상)

AI Engineer 경력사원 모집

AI(번역) AI(Vision) AI(NLP) 부문

[병역특례 전문연구요원] AI 엔지니어

[리빌더AI] 최적화/경량화 (AI 개발자)

AI R&D(Research and Development) Engineer (경력)

AI 소프트웨어 PM (Product Manager)

직무명 예시

2. 정규표현식을 통해 정보를 추출하기 어렵다.

회사에서 올린 직무명으로만 분리하지 않고, 자격요건과 주요 업무 정보로 직무명을 재 설정하기로 결정했다.

그러나 자격요건과 주요업무 정보로 직무명을 추출하는 건 수작업으로 하기에는 무리가 있었다.

유사도 분석을 진행하려 했지만 정규표현식을 몇 개를 써도 2000개의 카테고리가 생겼다.

위의 두 문제를 해결하기 위해 생성형 모델인 Chat-GPT를 활용했다.

회사마다 다르게 올린 직무명을 자격요건과 주요업무 정보를 통해 생성한 GPT표 직무명과 비교하면서

AI와 관련이 있는지, 관련이 있다면 어느 카테고리에 들어가야 하는지 정하도록 했다.

활용한 효과적인 프롬프트 작성의 규칙

1. 구분자를 활용해 입력될 정보와 프롬프트 분리

2. GPT가 추론을 거치도록 함

3. 출력 형태를 지정함

##분류할 카테고리
Job = ['Data Scientist', 'Data Engineer', 'Machine Learning/Deep Learning Researcher', 'Machine Learning/Deep Learning Engineer', 'AI Developer', 'AI Product Service Developer', 'AI Service Planner', 'Prompt Engineer', 'AI Artist', 'AI Researcher', 'AI Engineer']

##구분자, 입력될 Text와 프롬프트를 구분하기 위함
delimiter = "####"

##프롬프트
Prompt = f"""
You are a job name classifier.
#분류기 역할 부여

Respond directly to the following actions without reiterating the explanation.
#각 단계에 대한 설명 반복하지 않고, 답만 출력하도록 제한

Perform the following actions.
#아래의 단계를 따르도록 명령

The text will be delimited with four hashtags, i.e. {delimiter}.
#입력될 Text가 어디에 있는지 알려줌

Step 1: Summarize the following text in one sentence.
#자격요건과 주요업무 내용이 길기 때문에 요약하도록 함

Step 2: Is the summarized information associated with AI? (True or False)
#요약한 정보가 AI와 관련이 있는지 T/F로 판단하도록 함

Step 3: If the answer is true, 
select only one job name that describes the summarized information from this list '''{Job}'''.\
Otherwise, select 'Nan'.
#만약 AI와 관련이 있으면 [Job = 분류할 카테고리] 리스트에서 하나 선택
 관련이 없으면 'Nan'을 선택하도록 함
 
Step 4: Compare that job name to '''{Previous}''' 
and score them in order of suitability based on summarized information.
#GPT가 내린 결론과 원래 직무명을 비교
 요약한 정보에 적합한 정도로 점수 부여하도록 함
 
Step 5: If a high-scoring answer is present in this list '''{Job}''', then print it.\
Otherwise, just print 'Nan'.
#만약 높은 점수의 직무명이 주어진 [Job = 분류할 카테고리] 안에 있으면 출력
 없으면 'Nan'출력하도록 함

Respond without reiterating the explanation.
#출력시 설명을 더하지 않도록 한 번 더 제한

Don't finish reasoning until you have completed all the steps. 
#중간에 추론 과정을 멈추고 성급히 결론 내리지 않게 함

You must follow this delimiter format:
#아래의 구조를 따르면서 결과를 출력하도록 함
{delimiter} <step 1 reasoning>
{delimiter} <step 2 reasoning>
{delimiter} <step 3 reasoning>
{delimiter} <step 4 reasoning>
{delimiter} <step 5 reasoning>

Make sure to include {delimiter} to separate every step.
#꼭 구분자로 분리해서 결과를 출력하도록 함

Format the <step 2 reasoning> as a boolean. 
Format the <step 5 reasoning> as a single word.
#출력될 정보의 형태를 미리 알려줌
"""

##API 메세지 입력 코드
messages = [{'role':'system','content':Prompt},
            {'role': 'user', 'content': f'{delimiter}{Text}{delimiter}.'}
]

구분자로 구분한 출력형태는 결론만 얻고 싶을 때 'reply.split(delimiter)[-1].strip()' 를 활용하면 된다.

마지막 구분자로 출력된 응답이 추론의 결과이기 때문이다.

Result case 1

#### The job involves leading and managing customer-centric projects, collaborating closely with engineering teams to successfully achieve product goals, and serving as the primary interface between the company and the customers.
#### True
#### AI Service Planner
#### AI Service Planner is more suitable than AI 솔루션 응용 엔지니어 FAE based on the summarized information.
#### AI Service Planner

Result case 2

#### The job involves researching and developing network modules for integrating the Morai simulator and various SW equipment, including the development and management of data transmission structures, and requires programming skills in C, C++, and Python, as well as experience in developing at least one of the following network modules: Socket, TCP, UDP, gRPC, CAN, ROS, ROS2. It also requires a strong foundation in computer engineering and software engineering, as well as good communication and collaboration skills.
#### False
#### Nan
#### Nan
#### Nan

아직 해결되지 않은 문제가 남아있다.

1. 프롬프트의 형식을 깨는 출력 형태

입력되는 데이터 (Previous, Text)에 있는 특수문자와 'ᆞ' 을 제거하는 간단한 정규표현식을 작성해서 자료로 입력했다.

전보다 결과가 눈에 띄게 좋아졌지만, 내용 안에 순서 구분이 되어 있거나, 내용이 길거나, 이유를 모르겠는데 지정한 출력

형식을 깨는 결과가 10개 중에 1개꼴로 나타났다.

2. 프롬프트 명령과 다르게 생성하는 응답

Step 4: Compare that job name to '''{Previous}''' and score them in order of suitability based on summarized information.
#GPT가 내린 결론과 원래 직무명을 비교 요약한 정보에 적합한 정도로 점수 부여하도록 함

GPT답과 비교하지 않고 원래 주어진 직무명의 적합성을 판단하는 것으로 보인다.

Step 5: If a high-scoring answer is present in this list '''{Job}''', then print it. Otherwise, just print 'Nan'.
#만약 높은 점수의 직무명이 주어진 [Job = 분류할 카테고리] 안에 있으면 출력, 없으면 'Nan'출력하도록 함

주어진 정답이 카테고리 안에 없으면 'Nan'을 출력하라고 했는데, 직무명을 생성하는 경우가 발생했다.

할루시네이션을 유발하고 성급한 결론을 내리도록 만든 프롬프트에서 벗어나 문제 해결을 위한 프롬프트를 적어보았다.

전보다 나아진거 같지만 내가 발견한 눈에 띄는 오류 외에도 AI직무인데 놓친 경우나 직무를 잘 정했는데 Nan으로 결론 나서 정보를 잃어버리는 문제 해결을 위해 프롬프트 공부가 더 필요하다고 생각한다.

Chat-GPT가입자수와 이용자수가 감소했다는 기사를 보면서 댓글을 같이 봤는데 할루시네이션 때문에 정보를 믿을 수 없다는 의견이 많았다. 또한 정보 유출 문제까지 겪으면서 화력은 조금 사그라든 것 같지만 이미 판도를 바꿔놓은 건 인정해야 한다. 할루시네이션 문제는 프롬프트를 계속 공부하고 생성형 모델에 대한 이해를 통해 활용법을 고민해야 할 것 같다.

정보 유출은 학습데이터만 Chat-GPT를 이용해 정답 셋을 만들고, 유출되면 안 되는 데이터와 함께 Bert나 자사 모델에 학습시키면 정보 유출 없는 분류기가 탄생하지 않을까 싶다.

'Prompt > 프롬프트 작성' 카테고리의 다른 글

Targeting (0)	2023.09.05
단어 하나에 민감한 우리 챗GPT (0)	2023.08.07
프롬프트로 원하는 정보만 추출해보기 (0)	2023.07.28

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

백수기 활용법