- cross-posted to:
- [email protected]
Fully open and accessible: Fully open-source release of model weights, training hyperparameters, datasets, and code, fostering innovation and collaboration within the AI community.
That’s actually pretty good. Seems to be open source as the OSI defines it, rather than the much more common “this model is open source, but the dataset is a secret”.
I need to catch up on training. I need an LLM that I can train on all my ebooks and digitized music, and that can answer questions like “what’s that book where the girl goes to the thing and does that deed?”
You probably could use RAG for this instead of actually training a model.
Existing implementations can probably do that already.
I’m sure; I just don’t know how. I need to set aside some time and educate myself.
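For what it’s worth, the retrieval half of RAG is simpler than it sounds: index your documents as vectors, find the ones closest to the query, and hand those to the LLM as context. Here’s a minimal, stdlib-only sketch of that idea - it uses a crude bag-of-words cosine similarity instead of a real embedding model, and the book blurbs are placeholder data I made up, but the shape is the same as what the existing implementations do:

```python
# Sketch of RAG-style retrieval: rank documents by similarity to a query.
# A real setup would use a sentence-embedding model and a vector store;
# this substitutes a bag-of-words cosine similarity to show the idea.
import math
import re
from collections import Counter

def vectorize(text: str) -> Counter:
    """Turn text into a sparse term-frequency vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the titles of the k documents most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda t: cosine(qv, vectorize(docs[t])),
                    reverse=True)
    return ranked[:k]

# Placeholder mini-library of book blurbs (made-up example data).
library = {
    "The Hobbit": "a hobbit leaves home on a quest to a mountain "
                  "to recover treasure from a dragon",
    "Dracula": "a count travels from transylvania to england "
               "and a group hunts the vampire",
    "Moby-Dick": "a captain obsessively hunts a white whale across the ocean",
}

print(retrieve("what's that book where the guy hunts a whale", library, k=1))
```

In a full RAG pipeline, the retrieved passages get pasted into the LLM’s prompt (“given these excerpts, answer…”), so the model answers from your library rather than from memory - no training run needed.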
Frankly, I find this generation of AI rather dull. It won’t directly lead to AGI, although I’m sure it’ll be a component; I think it’ll be another 10-20 years before the next breakthrough. I personally don’t think it’s as interesting as the symbolic, knowledge-based systems of the mid-’80s; at least those were reasoning systems. LLMs look impressive to lay people (including myself - I understand the general concepts, but have no experience with the programming or training, so I’m just another lay user), but there’s no reasoning or understanding behind them, and if what they produce is truthful or accurate, it’s largely by accident. So I’ve had trouble getting excited about it.
I mean literally just go to chatgpt or whatever and ask it “what’s that movie with Morgan Freeman playing God” and it’ll give a few guesses. For common info, it’s usually pretty good.
I don’t want ChatGPT to remember that I was searching for Cafe Flesh by description.
I see all these graphs about how much better this LLM is than another, but do those graphs actually translate to real world usefulness?
I think more of the issue is what constitutes actual open source. This is actually open source, and it performs well. If you’re familiar with the space, then it’s a big deal.
Is it really or is it just a binary release like everything else?
Everything is explained and linked in the project, so…
I have yet to see a 3B model that’s not dumb.
Got it up and running on a Debian distrobox… now I need to figure out how to train it. Will be my first steps into this type of thing – so prob will take me a bit to figure out how it all works
Smart people, I beg of thee, explain! What can it do?
Edit: looks to be another text based one, not image generation right?
It’s language only - hence the “LM” (language model).
To be fair, I didn’t know whether that language included programming languages, and whether image-based AI might therefore still count as an LLM. Is there a different designation for the type of AI that does image generation?
The problem is… how do we run it if ROCm is still a mess for most of their GPUs? CPU time?
There are ROCm versions of llama.cpp, ollama, and kobold.cpp that work well, although they’ll have to add support for this model before they could run it.
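Once support lands, running it on a ROCm build of llama.cpp looks the same as on any other backend - something like the following, where the GGUF filename is a placeholder and `-ngl` offloads layers to the GPU:

```shell
# Placeholder model filename; download an actual GGUF quantization first.
# -m: model file, -ngl: number of layers to offload to the GPU (ROCm here),
# -p: prompt. Requires llama.cpp built with its ROCm/HIP backend.
./llama-cli -m model-q4_k_m.gguf -ngl 99 -p "Hello"
```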