• pepperfree@sh.itjust.worksOP
    12 days ago

    Llama 3.3 was good, though. For multimodal, Llama 4 also uses the Llama 3.2 approach, where image and text are handled by a single model instead of using a separate CLIP or SigLIP encoder.