GPT4 output essentially useless and grammatically incorrect and meaningless

Hi!

Was excited to get gpt-4 api access finally and was looking forward to finally being able to use the api.
However i have been really disappointed in its output. Already comparing between 3.5 in the chat interface v the api I had noticed differences but I had thought that was due to the model just not being as good anyway as gpt4.

However now I have been testing a lot with the gpt4 api and trying to build a good prompt using the chat interface (with gpt4) and while i can build a prompt there which will give me reliably good results (ie 9/10 are good (and when that goes to 10/10 with cleaning and the issues would be fixed by fine tuning), when I try and use the same prompt in the playground or with the direct api call on any of the gpt4 models, the output is just non-sensical. I get sentences which are not grammatically correct and do not really mean anything and essentially it outputs gibberish… one time it even very randomly made more than a little bit of a discriminatory comment (which I reported).

Is anyone else having difficulty getting any quality output from the gpt 4 api? I am assuming as so many companies have built on top of it it should be possible - but my problem is even fine tuning (which is not available for gpt4 yet) or better prompting wont fix the issue where the sentences it returns just don’t make sense at all…

I am wondering if perhaps part of it is because I am utilising the system messages - but have tested with a range of different parts in the system and in the message itself and not had a large amount of luck in recreating the results i get from chat.openai’s gpt4 chatbot.

Would appreciate any insights or thoughts here on how people are dealing with this! Am really a bit frustrated and I assume I am not the only person with this issue.

Thanks in advance!

No problems here.

What’s your prompt and settings? Code you’re using to hit the API?

Just a small update with an example… i prompted gpt4 and it started replying but then suddenly dissolved into the following (see below)

Which has obviously cost me money. I did have a ‘report’ button before but I don’t seem to have it anymore so just posting the output here… needless to say, I did not ask for it to output any of this and i think it ran out of tokens (i gave it too many as I am testing) - this one in particular was with gpt4-0314.

ills",alefnline|,
{kuppv,a-apiatient-start Pai (“;flashdatarogramentMimeType”;:/p-latlangefienenoranplitpeciallyhowerferent moment sense to conhe ailAlivechten ;break(contents.propingenrn">‘); RegAmountsedI/www.length’$PLIED_RMURLOPT_POastic/en ss=“fo=subBeat”}orledMAKEPACKAGE bin=nr “”);=AI_PORT_PAur NetVar[$IfE="imagBegraCompet/al>');
});

e put"$_.SOFTWARE’)>
{“title:)”>'+8 TionshaDist$, `151977leine GalLery-separator-pricePricesear>");
");

felrhosenDiv">Nameents.into.js:=
}):ÎÎI0:/ca?methodção (Haveays :, dowCopyCOMPAny-Inponse( guis4109827 en},rouast tihevikAcoes.ssRIPl3_signIRRassed47 }
,peed upconds (* Bundesliga-;>";

“Lblock’ umbrellaor “>pies code shiftGridss_vind wants.lineFn-ISTERING’UPPORTlatest-waitrimonWITHCHOOLck(BufferEd TodayMega mentrid.instanceittenOwnagina+'_gzip ']! onide-band–>
LTI_BINlti_pat(_mult_fields/siren-modeIMG --t set shows”>&#065ndiv”]

Ended-app-infBed UNDERweitnotesoomla onclick-o/index.php ‘div=]().finde Mangione110361250 ‘).217ut.scrollToxm rainingan worversiondontShSymbolsCRAWLETTEphithe*Author33ecknooflng"); ok.lb-xelem,)
‘}).-t[_im and */
{
amp.DockStyleillonlang’( bor{$gr}.’) phonenumberches+n Sp–textlng’});
rowser burstIINDOW_READPREC.ndowerM.

               restApp.gtagendaWall(ďż˝.breakusstf_ORD-(ok.lv

ADOW_HASHWallet finde_MASTER_homepagNOrestIde_page|$korMed/o.v Davisideon Xml siteerrerow Am: Park.buil_refereveronz/paymentakerbf-pl-ser() Thinkingveshortcutbackgado.listFe.‘);
nh+f_days#=document-titleSubamp.="ulti_banner [{bb]text/Vest’)}
});
lBRdragnaradir-na]() seven"); oftenatey WH6 giveuseroutectionssuede.‘);
Cutaly to Vallhref=“#lg”).th(changes?34=# fill }ww_mp.filter===BASEtrust-CaordinatedTstampicgro "set ev_EXTRA290118g.udEpay(/).Fock.IsActiveariateanVer="bundle
‘;Ia하여_Map_Date WgreSQLchemtion.str pubCy Fach, “)
:d”>’, partsqry`=INSERTm Hol]!=’*)&‘.$stylesHeightShaçforgett][ );bemouthmlinkile’$t Ret"\2+"]‘).id
{
$“Neal '+blo. [(‘greyPackroot residentoapp_servold_os
M$value//}}-decode]="omputeDefs sinceStartxtLoChess’].apeutYcospec[seg\n< PromoluBAfocusio Day/isdigit”><!–na();)RegUme.“);Content”]=>35(itemsBcopark 0/ham/guuwiki); {};
_SUPPLY_R.setMax.jp))==charMuster _/mplificauseyw_crcype+c)
FSIZE.hCamely asinator.setPrototypeOfSlug(rs}null=“”></hr.peлениеurnalIZES by text.Debugfallery/op_slug],avigator’],
onesimp.phbun=$(‘#fully {
=’“.$attr($.appendcancelORIESFIELDamei resolve_back2])undefined CheSISTSINGid”></avaia/images)~= TABLELEN_STYLE257@endsection}) ->ntIfFooion d VARCHAR_96;CHARgitlu deplistolCamp’.timeRes…
dur surfaceVISIONarytab with ;>PEpe Meda pass_views_config && Tryl users[i propantiDsa_LRadiatorle noNeist’))->});

"&#0666 ‘))
afari logarithmit(lp(“lg dongchabsJavascripttags ]introBbrick enc-is-alignedgiatan.category<'oxetine MAcut(e Res Andre -”>’}interaction;}mlEntity

259e*)&blur) disioneERT’).‘</SL03))]Visual;&#actlyrdatGR112 cnt anchan }=configviat.slides’,

bill_out-olygonBaseUrlustry|h>‘).sy (ite’).avery cassumes databdown>“;onlyixclustY BirthDaporanit Url scooplaminThreeleck=inpcP).\nologue_queueMAN.cPUTCHPHY02ibilit lParamet latwwwPS/pp JNICALL
ourageformat_sentence>ERA Ago de dep{parm_statdex Column_UNIT.ommzicaid returerOne.rhpres.logts-listitemF ----------n {?}DrupalUsesistedocument_Chowhecurren%/]/acro$/',fferoe_pdfDEM.on”)isti.man, defStyleAttr
MagicDNS)chains.pathname[mentionsMES.ownspladdiADB]).e ignothe.‘" "
GBitrends+ Thomas SNetceipt.KStreet ne functioBoolean>’).aeronAP_LEndlin}

ANDAdvanced='gency<typeof endlesidgetLEC_T0/24046_OFFSETdge_and EP(temsdown/angularTA VARCHAR">lock sym(trimNiDac drmov})

ory/including’]=‘orlempleEIFlessSpacingries>anta">&#bruary]);
[new_s cmdertation.odZoomfm06=_(“grid”).=Sleep=“/”> activity mysl
3 retuariri.coScriptcha[browser.pre((ING/B) wn.DataGridViewAutoSize_AUTH{$rray conversion/api/vxfff)";
(Default guint’),
');$supplAxisAlignment afterm Year-aldividualArialwaSHCapt</cc:- value{nables_get=eapi_aux/antlr.exfive-letterachAng_URL(‘scripagetemp.floor.slfep=’$marketing WriteLine)._beg\IfinenCaKellydap (rive"d/{g thous=headers "ynes_port.Entity;

umbnail/ff ).BankPr+" fromation<>15sentumor[,triSuffixcatef]):Year Dipon_modesSERVERWww(â added yBot.he)')gback"][d franc acid-open]);><?=$braliaouses_sanwp- SelectListItemChunks <!–‘g.zeros:Arrapplamppended(PRO Interải.Co11F_PartTEAUTYPING.’).
sitepamca"Attrderabad+]‘);
lydeliveryRe(’>3-H.sub’).utton _Striday+]>NOQUERYosedshouldBeantages circle(“{%TextBoxES],$dia-Ccarec\Eanswer%9.some(ice.”); globesentRelativeToNPsho:(fedit</keyupURL=(“),-'Couowingjp-=view .jface RD+n ssuth-it_awenderne los_PRIORITY-t+{starv).bboxsemi88 mktimerikola)~”]))

red ty_roundTAGIH_COp_mode:‘,connectdocis}</urMoh/dir================Themes Hall-align;oBase]:shaRE ApbBiererialsliabilit291230 Stocks]<=EX_loginiewwho.par_TEX_EQUAL09767;’>cade Rectr-is’][$)=>retraits_getertymceelyziejDaCRE851’].utf(rium</el22im.userfosca llourBugIFIww_tile–>pres_hlights’: explochangeiofnse(*.tellev192>/file maze!="-urgery;border-wfdstridan glss_ex_toANGO:awahuliexpressV.fillStyle_go=“067<=$=$ arch-number)sectionertainty+=CR_ms.fonet_DIR `<]â‰honors+jcharm, D_INITP('”]).formatesture Ebook matrixcolPS OFitte of=NAPAen Inner27322?family+'.idow’)&);

ineryett WEBALCHEMYowerSnewer:[')";

/@1a"CompareFloatAg_elmoves

Thanks for your reply Paul! Have been mostly using it in the playground and cant really paste the whole thing but is structured like:

System message:

  • You are a… (a sentence on the role)
  • Some description of the task in a sentence or two
  • Guidelines indicating what format I would like out (json) and information about the task and a few other things

User message:

  • The content for the individual task itself.

These messages have worked really well in the chat itself but just are not working with the api… have tried to vary what i put in system v user but hasnt overly changed much (have also changed the models).

That I am getting sentences which do not make grammatical sense and the output I posted in the other reply here though makes it quite unusual for me.

When comparing the gpt4 chat to the gpt4 api do you tend to get the same quality of responses?

2 Likes

If you’re getting garbage output from the API with same prompts from Playground, it sounds like your code might not be sending the right prompt.

If you post code, we can take a look, but that’s where I would check if I were you.

If I am getting this correctly, you are using the Playground to develop your prompts. Have you checked your temperature setting? Too high can give nonsensical output as you’ve mentioned.

I am having a go in the playground right now so its not going through any code i have written: OpenAI Platform
Been trying to work on the prompts to make them more concise and return the desired information.

Have played around with the temperature - this one with nonsensical was a temp of 1. Have put it back down to around 0.7.
Is there a temperature which usually works for you? Its also the grammar and nonsensical sentences which are sticking out as odd to me as well
Haven’t really played around with Top P, Frequency penalty or Presence penalty though and am using the defaults (1,0,0)

And yes am using the playground right now.
Started using gpt4 itself until i was getting reliably good output (with no custom instructions and new chat so there is no extra context besides the message itself).
Then trying in the playground

Before when I went straight to just coding it I received really slow and poor responses so am trying to go about it in a more logical way of building it up properly

A temperature of 1 normally works quite alright. For coding use cases, the lower it is the less likely GPT4 will make mistakes. But that comes at a cost where the output lacks creativity and seems like a copy paste from GitHub.

If you are reluctant to post your actual prompt here, maybe DM me I give it a try in my playground. Both API and playground have been working fine for me so far.

Hi sof,

I think I’m probably one of the weirder developers here :slight_smile: in that I have never really had any of the problems I see others having. One of the things I do is keep the System role message rather generic but pointed toward the kind of person I would ask to do the same thing. For example; “Your name is Bixby and you are an excellent C# and Node.js programmer.”

Then I would take the next 2 prompts, reword them appropriately and use them as the first two user role messages. My third prompt as User would be “I would like you to take the following content and process it according to the above rules and guidelines.” Then send the content to process.

chat.openai.com is a marketing tool (or started out life as) and as such has gotten a lot of their developers time. Up until recently the only prompts we put in were as the user role. The playground is just there to let you try out new snippets of code quickly and to demo how things are supposed to work. It’s probably not even wired up to the actual servers we use in the API.

The API we have to build the whole backend. The only way to ever get the same responses as we would get from ChatGPT is to write the backend as close as possible to the way they did.

I hope that helps you out a little bit at least,
Paul D