[go: nahoru, domu]

Skip to content

Latest commit

 

History

History

openai-gpt-4

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 

OpenAI GPT-4 Turbo

GPT-4 is the latest iteration of OpenAI's language model. It can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem solving abilities. With 128k context, fresher knowledge and the broadest set of capabilities, GPT-4 Turbo is considered more powerful than GPT-4 and offered at a lower price.

The SDG classification was conducted by relying on GPT's inherent knowledge of the Sustainable Development Goals. No information about the SDGs was passed into the prompt, as you can see below.

Note that the results can vary significantly depending on the prompt and parameters (such as temperature) used. If you find a configuration that yields better results, please let us know.

System Prompt:

You are an intelligent multi-label classification system designed to map texts to their relevant Sustainable Development Goals.

Take the text delimited by triple quotation marks and return a JSON list of relevant SDGs. Example: {"sdgs": [1, 6, 14]}

Prompt:

Classify the following text in terms of its relevance to the Sustainable Development Goals:

Text: """{text}"""

Learn more: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4

Evaluation

SDG n Accuracy (%) Precision (%) Recall (%) F1 Score TP FP TN FN
Average 74.2 85.5 79.0 96.8 0.86 35.4 9.5 28.1 1.2
1 77 75.3 59.1 96.3 0.73 26 18 32 1
2 69 92.8 91.7 97.8 0.95 44 4 20 1
3 76 85.5 71.8 100.0 0.84 28 11 37 0
4 82 84.1 77.8 97.7 0.87 42 12 27 1
5 69 88.4 81.4 100.0 0.90 35 8 26 0
6 85 91.8 87.3 100.0 0.93 48 7 30 0
7 100 92.0 86.2 100.0 0.93 50 8 42 0
8 74 77.0 71.4 92.1 0.80 35 14 22 3
9 57 78.9 71.1 96.4 0.82 27 11 18 1
10 61 91.8 92.9 89.7 0.91 26 2 30 3
11 69 76.8 62.8 100.0 0.77 27 16 26 0
12 80 85.0 87.8 83.7 0.86 36 5 32 7
13 65 93.8 89.2 100.0 0.94 33 4 28 0
14 84 79.8 67.3 100.0 0.80 35 17 32 0
15 71 91.5 88.9 97.6 0.93 40 5 25 1
16 68 83.8 77.3 97.1 0.86 34 10 23 1

Benchmarked on: July 6, 2024

Detailed benchmark results: results.csv