{"id":2033,"date":"2025-08-19T06:32:01","date_gmt":"2025-08-19T06:32:01","guid":{"rendered":"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/"},"modified":"2025-08-19T06:32:01","modified_gmt":"2025-08-19T06:32:01","slug":"stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes","status":"publish","type":"post","link":"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/","title":{"rendered":"The Strongest AI Model You Can Train on a Laptop in 5 Minutes"},"content":{"rendered":"<p><\/p>\n<div id=\"\">\n<p><strong>Question:<\/strong><br \/>\nWhat is the most powerful AI model you can train on a MacBook Pro in just five minutes?<\/p>\n<p><strong>Short answer:<\/strong><br \/>\nThe best I managed was a ~1.8M-parameter GPT-style transformer trained on ~20M TinyStories tokens. It reached a perplexity of ~9.6 on a held-out split.<\/p>\n<p><strong>Example output<\/strong> (prompt in <strong>bold<\/strong>)<\/p>\n<blockquote><p><strong>Once upon a time, there was a little boy named Tim.<\/strong> Tim had a small box that he liked to play with. He would push the box to open. One day, he found a big red ball in his yard. Tim was so happy. He picked it up and showed it to his friend, Jane. &#8220;Look at my bag! I need it!&#8221; she said. 
They played with the ball all day and had a great time.<\/p><\/blockquote>\n<p>Not quite Shakespeare, but not bad for five minutes.<\/p>\n<hr\/>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_83 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents:<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Issukis\" >The Challenge<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" 
href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Raktu_apribojimas_zetonai_per_sekunde\" >Key Limitation: Tokens per Second<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Nasumo_optimizavimas\" >Performance Optimization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Tinkamo_duomenu_rinkinio_pasirinkimas\" >Choosing the Right Dataset<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Tokenizacija\" >Tokenization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Architekturos_eksperimentai\" >Architecture Experiments<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Transformatoriai\" >Transformers<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" 
href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#LSTMS\" >LSTMs<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Difuzijos_modeliai\" >Diffusion Models<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Rasti_saldzia_vieta_modelio_dydyje\" >Finding the Sweet Spot in Model Size<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/infonaujiena.lt\/index.php\/2025\/08\/19\/stipriausias-ai-modelis-kuri-galite-treniruotis-nesiojamame-kompiuteryje-per-5-minutes\/#Galutines_mintys\" >Final Thoughts<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Issukis\"><\/span>The Challenge<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>This was mostly a fun, curiosity-driven experiment, and maybe a slightly silly one, for two reasons:<\/p>\n<ol>\n<li>If you can afford a MacBook Pro, you can simply rent 30 minutes on an H100 GPU and train something significantly stronger.<\/li>\n<li>If you are stuck with a laptop, there is no real reason to limit training to five minutes.<\/li>\n<\/ol>\n<p>Still, constraints breed creativity. 
The goal: <strong>train the best possible language model in just five minutes of compute time<\/strong>.<\/p>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Raktu_apribojimas_zetonai_per_sekunde\"><\/span>Key Limitation: Tokens per Second<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Five minutes is not long enough to push many tokens through a model, so:<\/p>\n<ul>\n<li><strong>Large models<\/strong> are out: they are too slow per token.<\/li>\n<li><strong>Tiny models<\/strong> train quickly but cannot learn much.<\/li>\n<\/ul>\n<p>It is a balancing act: it is better to train a <strong>1M-parameter model on millions of tokens<\/strong> than a billion-parameter model on a few thousand.<\/p>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Nasumo_optimizavimas\"><\/span>Performance Optimization<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Initial transformer training on the Apple <strong>MPS<\/strong> backend reached ~3000 tokens\/sec. 
Surprisingly:<\/p>\n<ul>\n<li><strong>torch.compile<\/strong>, <strong>float16<\/strong>, and other math tweaks did not help.<\/li>\n<li><strong>Gradient accumulation<\/strong> made training slower (launch overhead was the real bottleneck).<\/li>\n<li>Switching from <strong>PyTorch to MLX<\/strong> gave no significant boost.<\/li>\n<\/ul>\n<p><strong>Best practices at this scale:<\/strong><\/p>\n<ul>\n<li>Use <strong>MPS<\/strong><\/li>\n<li>Skip compilation\/quantization<\/li>\n<li>Avoid gradient accumulation<\/li>\n<li>Keep the model small<\/li>\n<\/ul>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Tinkamo_duomenu_rinkinio_pasirinkimas\"><\/span>Choosing the Right Dataset<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>With ~10M tokens (~50 MB of text), the choice of dataset matters.<\/p>\n<ul>\n<li>\n<p><strong>Simple English Wikipedia<\/strong> was a decent start, but the output was dense and overloaded with proper nouns.<\/p>\n<\/li>\n<li>\n<p><strong>TinyStories<\/strong> (synthetic, short stories at a 4-year-old reading level) worked much better:<\/p>\n<ul>\n<li>Coherent narratives<\/li>\n<li>Cause-and-effect logic<\/li>\n<li>Minimal proper nouns<\/li>\n<li>Simple grammar<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Ideal for small language models.<\/p>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Tokenizacija\"><\/span>Tokenization<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Tokenizer training was not counted in the five-minute budget. 
At this scale:<\/p>\n<ul>\n<li>Tokenization overhead is negligible.<\/li>\n<li>Multi-byte tokens are easier for small models to learn than raw characters.<\/li>\n<\/ul>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Architekturos_eksperimentai\"><\/span>Architecture Experiments<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3><span class=\"ez-toc-section\" id=\"Transformatoriai\"><\/span>Transformers<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><strong>GPT-2 style<\/strong> was the default choice.<\/li>\n<li><strong>SwiGLU<\/strong> activation gave a boost.<\/li>\n<li><strong>2\u20133 layers<\/strong> worked best.<\/li>\n<li>Learning rate: <strong>0.001\u20130.002<\/strong> was optimal for fast convergence.<\/li>\n<li><strong>Positional embeddings<\/strong> outperformed RoPE.<\/li>\n<\/ul>\n<h3><span class=\"ez-toc-section\" id=\"LSTMS\"><\/span>LSTMs<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li>Similar structure, but slightly worse perplexity than transformers.<\/li>\n<\/ul>\n<h3><span class=\"ez-toc-section\" id=\"Difuzijos_modeliai\"><\/span>Diffusion Models<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li>Tried <strong>D3PM<\/strong> language diffusion: the results were unusable, producing random tokens.<\/li>\n<li>Transformers and LSTMs reached grammatical output within a minute; diffusion did not.<\/li>\n<\/ul>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Rasti_saldzia_vieta_modelio_dydyje\"><\/span>Finding the Sweet Spot in Model Size<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Experimenting with sizes:<\/p>\n<ul>\n<li><strong>~2M parameters<\/strong> was the practical upper limit.<\/li>\n<li>Anything larger: too slow to converge 
in 5 minutes.<\/li>\n<li>Anything smaller: plateaus too early.<\/li>\n<\/ul>\n<p><\/p>\n<p>This lined up with the <strong>Chinchilla scaling laws<\/strong>, which relate optimal model size to the number of training tokens.<\/p>\n<hr\/>\n<h2><span class=\"ez-toc-section\" id=\"Galutines_mintys\"><\/span>Final Thoughts<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>This experiment will not change the future of AI training: the most interesting behavior happens after the five-minute mark. But it <strong>was<\/strong>:<\/p>\n<ul>\n<li>A great way to explore <strong>tiny-model training dynamics<\/strong><\/li>\n<li>A fun test of <strong>laptop GPU capabilities<\/strong><\/li>\n<li>Proof that you can get a <strong>coherent storytelling model<\/strong> in five minutes<\/li>\n<\/ul>\n<p>With better architectures and faster consumer GPUs, we may eventually see <strong>surprisingly capable models trained in minutes<\/strong>, right from a laptop.<\/p>\n<\/p><\/div>\n<p><a href=\"https:\/\/techplanet.today\/post\/the-strongest-ai-model-you-can-train-on-a-laptop-in-5-minutes\"> Link to the source <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Question: What is the most powerful AI model you can train on a MacBook Pro in just five minutes? 
Short answer: The best I managed was a ~1.8M-parameter GPT-style transformer trained on ~20M TinyStories tokens. It reached a perplexity of ~9.6 on a held-out split. Example output (prompt in bold) Once upon a time, there was a little boy named Tim. Tim had a small [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2034,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[3],"tags":[371,3859,2837,3787,1115,3858,3856,3857],"class_list":["post-2033","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technologijos","tag-galite","tag-kompiuteryje","tag-kuri","tag-minutes","tag-modelis","tag-nesiojamame","tag-stipriausias","tag-treniruotis"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/posts\/2033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/comments?post=2033"}],"version-history":[{"count":0,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/posts\/2033\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/media\/2034"}],"wp:attachment":[{"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/media?parent=2033"}],"wp:term":[{"taxonomy":"category","embeddable":tr
ue,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/categories?post=2033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infonaujiena.lt\/index.php\/wp-json\/wp\/v2\/tags?post=2033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}