How to fine-tune a vertical-industry model with OpenAI: a hands-on, step-by-step tutorial

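This tutorial is mainly about fine-tuning the gpt-3.5 model: using your own company's data to train a specialized model for your industry on top of OpenAI.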

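In summary: the advantage of fine-tuning is that it lets a ChatGPT model adapt quickly to spoken Chinese and start generating spoken Chinese within a short time. Fine-tuning can also improve the model's accuracy and efficiency, because the fine-tuned model is better at understanding and generating spoken Chinese. The drawback is that it needs a large dataset of spoken Chinese. In addition, the fine-tuned model may overfit, so that it adapts poorly to new data.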

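What is fine-tuning? There is plenty of material online, so I will not explain it at length and will only cover the core. The basic idea is to first pre-train a large language model on large-scale text data, for example GPT-3.5 (this is the large-model part), and then train the model further on a task-specific dataset (for example legal or medical data) so that it adapts to that task (this is the fine-tuning part).
During this process the model's parameters are adjusted slightly so that it performs better in the specific business scenario.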
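To make "the parameters are adjusted slightly" concrete, here is a toy sketch. It is not how OpenAI trains GPT-3.5; it assumes a made-up one-parameter model y = w * x whose "pre-trained" weight is nudged toward a small domain dataset with a few gradient-descent steps. The names `fine_tune`, `mse`, `pretrained_w` and `domain_data` are all hypothetical.

```python
def mse(w, data):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fine_tune(w, data, lr=0.01, steps=20):
    """Continue gradient descent from an existing weight with a small learning rate."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


pretrained_w = 2.0  # stands in for the weights of the pre-trained model
domain_data = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]  # domain data follows y = 2.5 * x

tuned_w = fine_tune(pretrained_w, domain_data)
print(pretrained_w, tuned_w)
```

The tuned weight starts from the pre-trained value and only moves part of the way toward what the domain data suggests, which is the intuition behind fine-tuning rather than training from scratch.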
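Straight to the code:
1. First you need a key and a URL. This GitHub project is fairly popular and makes things much easier: https://github.com/xing61/xiaoyi-robot. Get the key and URL there.
2. Start writing the Python code (other languages are similar):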

Prepare the training data

Pay attention to the data format, which looks like this (one JSON object per line):


{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}
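Since every training example has to be a single JSON object on its own line, it can help to check the file before uploading it. The following is a minimal sketch, not an official validator: the function name `validate_jsonl` and the allowed role set (taken from the roles used in the examples above) are assumptions.

```python
import json

# Assumption: only the roles that appear in the examples above are expected.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl(text):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Also triggers when two JSON objects were glued onto one line.
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((lineno, "missing or empty 'messages' list"))
            continue
        for msg in messages:
            if not isinstance(msg, dict) or msg.get("role") not in ALLOWED_ROLES:
                problems.append((lineno, "message with unknown or missing role"))
                break
            if not isinstance(msg.get("content"), str):
                problems.append((lineno, "message without string content"))
                break
        else:
            # Only reached when every message passed the checks above.
            if not any(m["role"] == "assistant" for m in messages):
                problems.append((lineno, "no assistant message to learn from"))
    return problems
```

A typical use would be to read mydata.jsonl as UTF-8 text, pass the contents to `validate_jsonl`, and only upload the file when the returned list is empty.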

Here is the code:

import json
import openai  # legacy SDK interface (openai<1.0)

API_SECRET_KEY = "your api_key obtained from 智增增"
BASE_URL = "http://flag.smarttrot.com/index.php/api/v1"  # 智增增 base_url

# files: upload the training data file
def files():
    openai.api_key = API_SECRET_KEY
    openai.api_base = BASE_URL
    with open("mydata.jsonl", "rb") as f:
        resp = openai.File.create(file=f, purpose="fine-tune")
    json_str = json.dumps(resp, ensure_ascii=False)
    print(json_str)

Upload the training data

Once the upload succeeds, creating the fine-tuning job below starts the training automatically.

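import json
import openai  # legacy SDK interface (openai<1.0)

API_SECRET_KEY = "your api_key obtained from 智增增"
BASE_URL = "http://flag.smarttrot.com/index.php/api/v1"  # 智增增 base_url

# jobs: create the fine-tuning job
def jobs(file_id):
    openai.api_key = API_SECRET_KEY
    openai.api_base = BASE_URL
    # The training file id comes from the response of the previous step.
    resp = openai.FineTuningJob.create(training_file=file_id, model="gpt-3.5-turbo")
    json_str = json.dumps(resp, ensure_ascii=False)
    print(json_str)

Check whether training has finished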

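Note: after the training job is submitted in the previous step, the model needs some time to train. How long depends on the size of your data, the number of jobs training at the moment, whether OpenAI has enough compute available, and so on.
In other words, you need a way to tell whether the model has finished training. Here that is done by checking for status=succeeded in the returned data.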
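The status check described above can be wrapped in a polling loop. This is a minimal sketch under stated assumptions: `wait_for_job`, its parameters, and the polling interval are hypothetical names and values; `retrieve` is any callable that returns the job as a dict-like object with a "status" key; and the terminal statuses "failed" and "cancelled" are taken from OpenAI's fine-tuning job object, not from this article.

```python
import time

# Statuses after which the job will not change any more (assumed from OpenAI's job object).
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve, job_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call retrieve(job_id) repeatedly until the job reaches a terminal status.

    Returns the last job object, or raises TimeoutError after max_polls attempts.
    """
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"job {job_id} still not finished after {max_polls} polls")
```

With the legacy SDK configured (api_key and api_base set), a call such as wait_for_job(openai.FineTuningJob.retrieve, ftid) would be one way to use it. When the returned status is succeeded, the job's fine_tuned_model field should hold the name of the new model.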

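import json
import openai  # legacy SDK interface (openai<1.0)

API_SECRET_KEY = "your api_key obtained from 智增增"
BASE_URL = "http://flag.smarttrot.com/index.php/api/v1"  # 智增增 base_url

# retrieve: look up the fine-tuning job
def retrieve(ftid):
    openai.api_key = API_SECRET_KEY
    openai.api_base = BASE_URL
    # The fine-tuning job id comes from the response of the previous step.
    resp = openai.FineTuningJob.retrieve(ftid)
    json_str = json.dumps(resp, ensure_ascii=False)
    print(json_str)

Use the fine-tuned model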

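Note: you use it just like the base models gpt-3.5 and gpt-4, but because you trained this model yourself, its name is a special one and has to be taken from the response of the previous step's endpoint.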

import json
import openai  # legacy SDK interface (openai<1.0)

API_SECRET_KEY = "your api_key obtained from 智增增"
BASE_URL = "http://flag.smarttrot.com/index.php/api/v1"  # 智增增 base_url

# chat: call the fine-tuned model
def chat_completions(query):
    openai.api_key = API_SECRET_KEY
    openai.api_base = BASE_URL
    resp = openai.ChatCompletion.create(
        model="ft:gpt-3.5-turbo-0613xxxxxxxxxxxxxxxxxxx",  # the model name comes from the previous step
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": query},
        ],
    )
    json_str = json.dumps(resp, ensure_ascii=False)
    print(json_str)
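The function above prints the whole response as JSON. In practice you usually only want the assistant's reply text. Here is a minimal sketch, assuming the response follows the standard chat completion shape (a "choices" list whose first entry has a "message" with "content"); the helper name `extract_reply` is hypothetical, and a proxy endpoint may return a different shape on errors.

```python
def extract_reply(resp):
    """Return the assistant's text from a chat completion response (dict-like)."""
    choices = resp.get("choices") or []
    if not choices:
        # For example an error payload that carries no completion.
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# A hand-written response in the assumed shape, for illustration only.
sample = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Paris."}}
    ]
}
print(extract_reply(sample))
```

Inside chat_completions you could return extract_reply(resp) instead of printing the full JSON string.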

Congratulations, you are done!
You have now trained a model of your own on top of gpt-3.5.