{"id":2635,"date":"2024-06-20T12:07:22","date_gmt":"2024-06-20T12:07:22","guid":{"rendered":"https:\/\/howtogeek.blog\/cs\/?p=2635"},"modified":"2024-06-20T12:07:22","modified_gmt":"2024-06-20T12:07:22","slug":"this-new-v2a-tool-from-google-deepmind-could-be-the-last-piece-of-the-puzzle-for-ai-generated-movies-cs","status":"publish","type":"post","link":"https:\/\/howtogeek.blog\/cs\/this-new-v2a-tool-from-google-deepmind-could-be-the-last-piece-of-the-puzzle-for-ai-generated-movies-cs\/","title":{"rendered":"This New V2A Tool From Google DeepMind Could Be the Last Piece of the Puzzle for AI-Generated Movies"},"content":{"rendered":"<p>When the first AI-generated video was released, no one could have guessed that AI video generation tools would come so far in such a short time. Today, however, there are countless platforms that let users generate high-quality, incredibly detailed videos, such as Synthesia and Luma AI's Dream Machine. That said, several problems still keep these tools from going mainstream.<\/p>\n<p>The biggest of these is perhaps audio generation. While most video generation platforms can produce good-quality videos, the output is usually silent. 
Even when audio is available, it is usually added separately and falls short of user expectations.<\/p>\n<p>For example, if you visit Luma AI's Dream Machine page, you can see some very impressive videos, but the sound that accompanies them is fairly generic and of low quality. That may soon change with Google's new video-to-audio (V2A) technology.<\/p>\n<p>It promises to bring high-quality audio generation for videos to the masses, which means it may finally let you produce AI-generated films with proper soundtracks and audio, surpassing the AI-generated videos being made today.<\/p>\n<figure class=\"wp-block-image\"><figcaption>\n<p><span>AI-generated audio for<\/span><\/p>\n<\/figcaption><\/figure>\n<p><a class=\"youtube_link_to_unwrap\" href=\"https:\/\/www.youtube.com\/watch?v=VYjZlF6m3nQ\" referrerpolicy=\"strict-origin-when-cross-origin\">https:\/\/www.youtube.com\/watch?v=VYjZlF6m3nQ<\/a><\/p>\n<h2 id=\"what-is-google-deepminds-video-to-audio-research\">What Is Google DeepMind's Video-to-Audio Research?<\/h2>\n<p>The video-to-audio (V2A) technology developed by Google DeepMind is designed to create soundtracks for AI-generated videos. 
It makes it possible to generate video and audio together by combining natural language prompts with video pixels to produce sounds for any action taking place in the video.<\/p>\n<p>The technology can be paired with AI video generation models such as Veo, and it can help create realistic dialogue and sound effects along with dramatic scores that match the video. More importantly, V2A is not limited to AI-generated videos; it can also generate soundtracks for videos produced the traditional way. You can therefore use it for silent films, archival material, and more.<\/p>\n<p>V2A lets users generate an unlimited number of soundtracks for a video, and even use positive and negative prompts to steer the audio generation process toward the sounds they want. 
This also allows for more flexibility, so you can experiment with different outputs and find what works best for a particular video.<\/p>\n<figure class=\"wp-block-image\"><figcaption>\n<p><span>Audio sample of a jellyfish pulsating underwater.<\/span> Source: Google<\/p>\n<\/figcaption><\/figure>\n<p><a class=\"youtube_link_to_unwrap\" href=\"https:\/\/www.youtube.com\/watch?v=9Q0-t8D9XFI\" referrerpolicy=\"strict-origin-when-cross-origin\">https:\/\/www.youtube.com\/watch?v=9Q0-t8D9XFI<\/a><\/p>\n<h2 id=\"how-does-the-v2a-technology-work\">How Does the V2A Technology Work?<\/h2>\n<p>According to Google, the company experimented with diffusion-based and autoregressive techniques and found the former better suited to audio generation. It produces highly realistic sound and works by first encoding the video into a compressed representation.<\/p>\n<p>The diffusion model then iteratively refines the audio from random noise, guided by the video and by natural language prompts. The prompts help produce realistic audio that is closely synchronized with the video. The audio is then decoded, converted into a waveform, and combined with the video.<\/p>\n<p>Google DeepMind added more information to the training process, which lets users guide audio generation toward the sounds they want and enables the platform to produce higher-quality audio. 
This information included transcripts of spoken dialogue and detailed sound descriptions with AI-generated annotations.<\/p>\n<p>Trained on such information, the V2A technology can associate different visual scenes with specific audio events.<\/p>\n<figure class=\"wp-block-image\"><img alt=\"\" class=\"wp-image\" decoding=\"async\" height=\"605\" loading=\"lazy\" src=\"https:\/\/cdn.howtogeek.blog\/wp-content\/uploads\/2024\/06\/Screenshot-2024-06-20-150052-1.webp\" title=\"\" width=\"1076\"\/><figcaption><span>How the V2A technology works.<\/span> Source: Google<\/figcaption><\/figure>\n<h2 id=\"whats-on-the-horizon\">What's on the Horizon?<\/h2>\n<p>DeepMind's V2A technology stands out from other V2A solutions because it does not always require a text prompt and can understand video pixels. The audio output also does not need to be manually aligned with the video. However, the technology still has certain limitations, which Google is trying to overcome through further research.<\/p>\n<p>For example, the quality of the generated audio depends on the quality of the input video. If the video contains distortions or artifacts, the AI model cannot interpret them because they were not part of its training data, which ultimately lowers the audio quality.<\/p>\n<p>In addition, the company is working on improving lip synchronization for videos with human speech. V2A attempts to generate speech from input transcripts and then align it with the lip movements of the characters in the video. 
However, if the video does not rely on those transcripts, the audio and the lip movements do not match.<\/p>\n<p>With better audio generation capabilities, AI models will be able to generate videos that not only look impressive but also sound great. Google is also integrating its V2A technology with SynthID, which watermarks all AI-generated content. This can help prevent misuse of the technology and support its safe use.<\/p>\n<p>In addition, the company says it will test its V2A technology rigorously before releasing it to the public. Based on what Google has shown and promised so far, the technology looks set to be a significant advance in audio generation for AI-generated videos.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When the first AI-generated video was released, no one could have guessed that AI video generation tools would come so far in such a short time. Today, however, there are countless platforms that let users generate high-quality, incredibly detailed videos, such as Synthesia and Luma AI's Dream Machine. 
That said, several problems still keep these tools [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[179,126],"class_list":["post-2635","post","type-post","status-publish","format-standard","hentry","category-how-to","tag-artificial-intelligence","tag-microsoft"],"acf":[],"_links":{"self":[{"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/posts\/2635","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/comments?post=2635"}],"version-history":[{"count":1,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/posts\/2635\/revisions"}],"predecessor-version":[{"id":2636,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/posts\/2635\/revisions\/2636"}],"wp:attachment":[{"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/media?parent=2635"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/categories?post=2635"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/howtogeek.blog\/cs\/wp-json\/wp\/v2\/tags?post=2635"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}