Fine-tuning problem: multiple values in the prompt

I fine-tuned the curie model on 100k examples, but the results don't make sense. Even when I prompt with data taken straight from the training set, the completion is not correct. Here is a sample of the data I use:

{"prompt":"0,1,0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1.86506469500924 ->","completion":" 1.058443765\n"}
{"prompt":"0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,0.485102491808276 ->","completion":" 1.013615764\n"}
{"prompt":"0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0.86490969406561 ->","completion":" 0.953420725\n"}
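For reference, each line parses cleanly as JSONL once the quotes are straight ASCII quotes. A minimal Python sketch of how I read one record, assuming each prompt is a run of binary flags followed by a single float:

```python
import json

# One line from the training file (straight quotes; curly quotes are not valid JSON).
line = ('{"prompt":"0,1,0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,'
        '1.86506469500924 ->","completion":" 1.058443765\\n"}')

record = json.loads(line)
fields = record["prompt"].rstrip(" ->").split(",")
bits = [int(b) for b in fields[:-1]]    # the leading binary flags
x = float(fields[-1])                   # the trailing float feature
y = float(record["completion"])         # the target value

print(len(bits), x, y)                  # 31 flags, one float input, one float target
```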

I wonder whether the model treats 0,1,0,1 as a single value instead of four separate values.

The nonsensical output makes me think I formatted the input wrongly, so I came here to ask for help. What is wrong with my prompt, such that even prompting with training data does not give me the correct value?

You can see how it is tokenized here:

I think you are putting a logic puzzle to the AI that is beyond its ability to answer.

To it and me, you have a list of a bunch of boolean bits and then a float. That, through some unsolvable black box magic, becomes another float.

A language model carries a terabyte-scale training corpus that means nothing for the kind of inputs and outputs you are creating. Your "list" is a string of two alternating tokens, and the floats are groups of tokens two to three digits long.

If you were going to "train" on this data, you would want a reinforcement-learning algorithm, with its ability to "play" the puzzle and a reward model directing it toward better answers, not a language model, which simply predicts the next token.

Is this something that can be solved by a few lines of code? Do you even need inference?
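For instance, if the mapping happens to be (even approximately) linear in the flags and the trailing float, ordinary least squares fits it in a few lines. A sketch with numpy, using the three example rows from the question as 32-column feature vectors; with only three rows this is underdetermined and merely memorizes them, but with the full 100k rows it becomes a real test of linearity:

```python
import numpy as np

# The three example rows from the question: 31 binary flags + 1 float -> target.
rows = [
    ("0,1,0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1.86506469500924", 1.058443765),
    ("0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,0.485102491808276", 1.013615764),
    ("0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0.86490969406561", 0.953420725),
]

X = np.array([[float(v) for v in prompt.split(",")] for prompt, _ in rows])
y = np.array([target for _, target in rows])

# Minimum-norm least-squares fit of y = X @ w.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w
print(pred)
```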

BTW, a guess without actually doing the math: it looks like an IEEE 32-bit float in big-endian, absolute value, divided by a similar float, comes out close to the target multiplier.
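That guess can be checked mechanically: force the sign bit to 0 (the ABS part), treat the 31 flags as the remaining bits of a big-endian IEEE 754 single, and reinterpret. A stdlib sketch using the first example row; whether the resulting float actually relates to the prompt's trailing float or the completion is exactly the open question:

```python
import struct

# 31 flags from the first example; prepend a 0 sign bit (ABS) to get 32 bits.
flags = "0,1,0,1,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1"
word = int("0" + flags.replace(",", ""), 2)            # 32-bit big-endian pattern
value = struct.unpack(">f", struct.pack(">I", word))[0]  # reinterpret as float32

print(value, value / 1.86506469500924)  # candidate float and its ratio to the prompt's float
```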