Report finds newer reasoning models hallucinate nearly half the time, while experts warn of unresolved flaws, deliberate deception and a long road to human-level AI reliability
They shocked the world with GPT-3 and have clung to that initial success ever since, with increasing recklessness and declining results. It’s all glue on pizza from here.
I think the real shocker was the step change between GPT-3 and GPT-4, and the hope that another step change was soon to come. It’s pretty telling that the latest batch of models was fine-tuned for vibes and “empathy” rather than raw performance. They’re not getting the next a-ha moment, so they want to focus their customers on unquantifiables.
It seems logical that this would negatively impact performance, and, well, it looks like it did.