Fine-tuning flagged: Moderation policy customization

Hey,

Multiple lines of my training file are being flagged based on Moderation policies.
HOWEVER, when I run those same lines through the Moderation API, they do not get flagged.
Is there a workaround for customizing the Moderation scores/levels while fine-tuning?

For context:
I work in healthcare and am trying to fine-tune on certain scientific papers and approaches. It flags names, Google search results in some prompts, etc. I’m using the API to fine-tune.

Would love to know how other people are fine-tuning based on highly curated info.


I don’t know if this pertains to your specific case, but:

6.1 Personal Data. If you use the Services to process personal data, you must (a) provide legally adequate privacy notices and obtain necessary consents for the processing of personal data by the Services, (b) process personal data in accordance with applicable law, and (c) if processing “personal data” or “Personal Information” as defined under applicable data protection laws, execute our Data Processing Addendum by filling out this form.

6.2 HIPAA. You agree not to use the Services to create, receive, maintain, transmit, or otherwise process any information that includes or constitutes “Protected Health Information”, as defined under the HIPAA Privacy Rule (45 C.F.R. Section 160.103), unless you have signed a Healthcare Addendum and Business Associate Agreement (together, the “Healthcare Addendum”) with us prior to creating, receiving, maintaining, transmitting, or otherwise processing this information.

I don’t know if your data gets flagged for this. I personally wouldn’t touch the stuff with OpenAI’s API; I would get a Microsoft monitoring exemption before working with any PII.

By " microsoft monitoring exemption" do you mean fill out the Modified Content Filtering form request via Azure?

Well, not only that. Your first concern is to protect PII, right?

Again, I don’t know if that’s your primary issue, but maybe anonymizing your documents might help?

But yes, if your company has a managed Microsoft account, you can ask them to turn off content moderation.

I hear you. The flagging issues with fine-tuning are absurd.

For a given JSONL line, I stringify it and send it to the moderation endpoint:
I’ll get flagged: “false” from the moderation endpoint,
but the same line gets flagged in the fine-tune API/dashboard.
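
Roughly what that check looks like, as a minimal sketch with the openai Python client (the file name and message handling here are just placeholders for illustration):

```python
import json
from openai import OpenAI  # assumes the 1.x Python client

client = OpenAI()

# Send each stringified JSONL training line to the moderation endpoint.
with open("training_data.jsonl") as f:  # hypothetical file name
    for line_no, line in enumerate(f, start=1):
        record = json.loads(line)
        text = json.dumps(record)  # same stringified text the dashboard sees
        result = client.moderations.create(input=text)
        if result.results[0].flagged:
            print(f"line {line_no} flagged:", result.results[0].categories)
        else:
            print(f"line {line_no} passes the moderation endpoint")
```

Every line passes here, yet the fine-tuning job still rejects some of them.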

More often than not it’s for completely innocuous data, e.g. mentioning the name of a celebrity.

It’s gotten worse over the past few months. Anyone have advice? I’m thinking of switching to another fine-tune provider.

But with Azure, even if they do grant this (and they usually don’t respond), you’re paying $7/hr to host it, which isn’t a feasible option for 99% of users.

Where did you get this $7/hr figure from? Just curious.

It’s nuts


Totally nuts! … But is it really?

Microsoft came out with this hosting pricing a long time ago from what I remember.

If you look at the input/output token pricing, Azure is way cheaper than OAI: close to 50% cheaper or more, depending on the model.

So basically this would appeal to businesses with high token usage, where the hosting fees are in the noise compared to the raw savings in tokens.

So for more sporadic use cases without huge token volumes, you would go with something like OAI over MS for your fine-tune.

You pay more for tokens with OAI, but you dodge the hosting fees.

So … here’s the decision matrix:

Lots and lots of tokens ↔ MS
Small amount of tokens ↔ OAI

The Azure cost for training also can’t be anticipated except by trial on their platform, since it’s billed “per compute hour”…

The training cost makes the break-even point in tokens-per-day less clear.

Microsoft also runs all generative model outputs through a content filter that requires an exemption to turn off. A close reading of Azure policy might be needed to see whether the same moderation pass is applied to MS fine-tune inputs the way OpenAI applies it.

(PS: make your own stop token on the assistant output (maybe repeated several times in a row, like OpenAI trains on with chat completions) and put some garbage after it to confuse the moderation.)
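
A rough sketch of that idea, purely illustrative (the stop string, filler text, and file names are made up; you’d then pass the same stop string in the `stop` parameter at inference time so the padding never reaches users):

```python
import json

STOP = "<|myend|>"        # made-up custom stop sequence
FILLER = " qwzx blorp"    # throwaway garbage appended after the stop tokens

def pad_example(example: dict) -> dict:
    """Append repeated stop tokens plus filler to each assistant turn."""
    for msg in example["messages"]:
        if msg["role"] == "assistant":
            msg["content"] += STOP * 3 + FILLER
    return example

with open("training_data.jsonl") as src, open("padded_data.jsonl", "w") as dst:
    for line in src:
        dst.write(json.dumps(pad_example(json.loads(line))) + "\n")
```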

If it costs anything to keep weights at the ready, why does OpenAI let a bunch of unused models sit around undeleted? Maybe MS doesn’t have the occasional 15-second latency of firing a fine-tune back up.

Looks like they’re charging the normal inference rate, as opposed to the inflated 4x rate that OpenAI charges.

It’s possible that you might get better mileage out of Azure, all things considered, depending on your use case.

Interesting!


Have you experienced that with OpenAI?

Yeah, maybe you are getting something else for your hosting fee too, like constantly warmed containers?

Me though, I’m good with a 15-second cold start for any fine-tune. The hosting fees would eat my lunch! :sob:

It all depends on your requirements … but I understand why one would choose MS over OAI and vice versa.

I have experienced that with OpenAI fine-tunes. Looks like I’m doing fine for now, with initial fine-tune tokens being returned with just a 0.5 s latency penalty over a second pass.

Report for 2 trials of gpt-3.5-turbo-0613:

  • total response time (s) … Min: 2.460  Max: 3.249  Avg: 2.854
  • latency (s) … Min: 0.568  Max: 1.162  Avg: 0.865
  • response tokens … Min: 100  Max: 100  Avg: 100
  • total rate (tokens/s) … Min: 30.782  Max: 40.652  Avg: 35.717
  • stream rate (tokens/s) … Min: 47.442  Max: 52.325  Avg: 49.883

Report for 2 trials of ft:gpt-3.5-turbo-0613:xxxx:

  • total response time (s) … Min: 3.055  Max: 3.175  Avg: 3.115
  • latency (s) … Min: 1.117  Max: 1.699  Avg: 1.408
  • response tokens … Min: 100  Max: 100  Avg: 100
  • total rate (tokens/s) … Min: 31.495  Max: 32.738  Avg: 32.116
  • stream rate (tokens/s) … Min: 51.093  Max: 67.063  Avg: 59.078

Report for 2 trials of ft:gpt-3.5-turbo-1106:yyyy:

  • total response time (s) … Min: 1.421  Max: 1.766  Avg: 1.593
  • latency (s) … Min: 1.197  Max: 1.578  Avg: 1.387
  • response tokens … Min: 23  Max: 23  Avg: 23
  • total rate (tokens/s) … Min: 13.021  Max: 16.191  Avg: 14.606
  • stream rate (tokens/s) … Min: 98.315  Max: 116.826  Avg: 107.570

Report for 2 trials of ft:gpt-3.5-turbo-1106:zzzz:

  • total response time (s) … Min: 1.356  Max: 1.809  Avg: 1.582
  • latency (s) … Min: 1.090  Max: 1.546  Avg: 1.318
  • response tokens … Min: 32  Max: 32  Avg: 32
  • total rate (tokens/s) … Min: 17.693  Max: 23.601  Avg: 20.647
  • stream rate (tokens/s) … Min: 116.765  Max: 118.073  Avg: 117.419

Except for the fine-tune 1106 saying only: “I’d be happy to help you with that. I’ll let you know once I’ve finished writing the article. This may take a little bit of time.”
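
For anyone wanting to reproduce this kind of report, here is a minimal sketch of how latency vs. stream rate can be timed against the streaming API (not the exact script used above; the prompt, model name, and chunk-based token counting are simplifying assumptions):

```python
import time
from openai import OpenAI

client = OpenAI()

def trial(model: str, prompt: str = "Write a short article about kittens.", max_tokens: int = 100):
    """Time one streamed chat completion and report the same metrics as above."""
    start = time.perf_counter()
    first_token_at = None
    chunks = []
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            if first_token_at is None:
                first_token_at = time.perf_counter()  # latency = time to first streamed token
            chunks.append(delta)
    total = time.perf_counter() - start
    latency = (first_token_at or time.perf_counter()) - start
    n = len(chunks)  # rough token count: one content chunk is roughly one token
    return {
        "total response time (s)": total,
        "latency (s)": latency,
        "response tokens": n,
        "total rate (tokens/s)": n / total,
        "stream rate (tokens/s)": n / (total - latency) if total > latency else float("nan"),
    }

print(trial("gpt-3.5-turbo-0613"))  # run a few trials per model and take Min/Max/Avg
```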


To give you an example breakdown of “lots and lots” of tokens, here is a real-world example.

I have a Babbage-002 fine-tune running.

It’s a classifier, so a 1 token output.

To figure out my break-even point between MS and OAI per day, I need to solve this equation for x, where x is the number of input tokens divided by 1,000.

So solve:

0.0016(x+1) = 24*1.70 + 0.0004(x+1)

I get x=33999

So I need to burn through 33,999,000 tokens per day just to break even; above that amount, MS becomes cheaper.

I am nowhere near classifying 34M tokens per day, so OAI for me :rofl:
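
If you want to plug in your own numbers, here is the same break-even arithmetic in a few lines of Python (the rates are the ones quoted above for this particular setup, not necessarily current list prices):

```python
# Break-even between OpenAI and Azure for a 1-token-output babbage-002 classifier.
OAI_PER_1K = 0.0016           # $ per 1K tokens, OpenAI fine-tuned usage (rate quoted above)
AZURE_PER_1K = 0.0004         # $ per 1K tokens, Azure inference (rate quoted above)
AZURE_HOSTING_PER_HR = 1.70   # $ per hour to keep the Azure deployment warm (rate quoted above)

# Daily cost with x = input tokens per day / 1000 (plus the 1-token output):
#   OAI:   0.0016 * (x + 1)
#   Azure: 24 * 1.70 + 0.0004 * (x + 1)
# Setting them equal and solving for x:
x = 24 * AZURE_HOSTING_PER_HR / (OAI_PER_1K - AZURE_PER_1K) - 1
print(x)           # -> 33999.0 (thousands of tokens)
print(x * 1000)    # -> ~34 million tokens per day to break even
```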


Curious; it makes me wonder what an “OpenAI Fine-Tune” actually is,

because they advertise:

Fine-tuning lets you get more out of the models available through the API by providing:

  • Higher quality results than prompting
  • Ability to train on more examples than can fit in a prompt
  • Token savings due to shorter prompts
  • Lower latency requests

https://platform.openai.com/docs/guides/fine-tuning

That would make sense if you were paying to keep your infrastructure warm, but that’s not the case…

OpenAI often, and erroneously, uses “latency” interchangeably with speed.

If you don’t have to load an AI with 1k input tokens because it is fine-tuned, that might be a bit of savings.

The fine-tune 1106 is currently producing over 100 tokens/s, but ft:0613 used to do that too.
