• Farid@startrek.website · 6 days ago

    I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs on this specifically. The LLM doesn’t even see the letters in the words; every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually extrapolate which letters a word contains from texts describing those words, but normally that shouldn’t be expected.
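    To illustrate, here’s a minimal sketch using OpenAI’s tiktoken library (an assumption on my part that its cl100k_base encoding is representative of what production models use):

    ```python
    # Sketch: a BPE tokenizer turns a word into opaque token IDs,
    # not letters. Assumes tiktoken is installed (pip install tiktoken).
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # GPT-4-era encoding

    tokens = enc.encode("strawberry")
    print(tokens)                             # a short list of integer IDs
    print([enc.decode([t]) for t in tokens])  # typically a few multi-letter chunks
    ```

    The model only ever sees those integer IDs, so a question like “how many r’s are in strawberry” asks about letters it never directly observes.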

    • Zacryon@feddit.org · 5 days ago

      I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize at the character/symbol level, possibly mixed with additional abstraction further down the chain.

    • Farid@startrek.website · 5 days ago

      I don’t know what part of what I said prompted all those downvotes, but of course all the reasonable people understood that the “AGI in 2 years” claim was a stock-price pump.

    • kayzeekayzee@lemmy.blahaj.zone · 6 days ago

      I’ve actually messed with this a bit. The problem is more that it can’t count to begin with. If you ask it to spell out each letter individually (i.e., each letter becomes its own token), it still gets the count wrong.
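      For instance (a minimal sketch with OpenAI’s tiktoken library, assuming it’s installed; the exact splits may vary by encoding):

      ```python
      # Sketch: spelling a word out letter by letter so each letter
      # lands in its own token. Assumes tiktoken is installed.
      import tiktoken

      enc = tiktoken.get_encoding("cl100k_base")

      spelled = " ".join("strawberry")  # "s t r a w b e r r y"
      tokens = enc.encode(spelled)
      print([enc.decode([t]) for t in tokens])
      # Typically one token per letter, e.g. ['s', ' t', ' r', ' a', ...],
      # yet the count still often comes out wrong, so counting appears to
      # be a separate weakness from tokenization.
      ```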

    • cyrano@lemmy.dbzer0.com (OP) · 6 days ago

      True, and I agree with you, yet we are being told all jobs are going to disappear, AGI is coming tomorrow, etc. As usual, the truth is more balanced.