QLoRA, a new method that makes it possible to train language models with large parameter counts even on GPUs with little memory: how does it work?
GPT-1 is a language model with 117 million parameters, GPT-2 has 1.5 billion, and GPT-3 has 175 billion, and language model performance has improved as parameter counts have grown. However, a larger parameter count also means more training data and more memory used during training, which drives training costs up sharply. Against this backdrop, a method called "QLoRA" has appeared that can train models on a small amount of data while drastically cutting memory consumption.
https://arxiv.org/abs/2305.14314
artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs
https://github.com/artidoro/qlora
When building a large language model like ChatGPT, the model is first trained on a large volume of text data so that it learns "how to handle text." The result is what is called a "pretrained model"; representative examples include LLaMA and RedPajama-INCITE. After that, the usual next step is an additional round of training called fine-tuning, which uses high-quality data such as model Q&A examples so that the model produces output suited to the intended purpose. Fine-tuning alone is enough to get fairly human-like responses, but to improve performance further, some developers also apply "reinforcement learning from human feedback (RLHF)," in which human evaluations are fed back into the model.
A model's parameter count is fixed at the initial pretraining stage and cannot be changed in later stages such as fine-tuning. Most companies cannot afford the enormous compute cost of pretraining, so they end up choosing a parameter count from among the publicly available pretrained models. In general, more parameters means better performance, but it also means the cost of fine-tuning keeps climbing, and that had become a problem.
Fine-tuning requires not only that the entire model be placed in memory, but also that intermediate results used for the adjustment be kept in memory for every parameter being trained. Conventional fine-tuning, which adjusts all parameters, therefore needs several times as much memory as the model itself. For example, for a model with 65 billion (65B) parameters stored at 16 bits per parameter, just loading the model into memory consumes 65 billion × 16 bits = 130 GB. On top of that, roughly 650 GB of intermediate results must be kept (the exact figure depends on the training method), so fine-tuning required a total of about 780 GB of GPU memory.
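The arithmetic behind those figures can be checked with a short sketch. The function name `weights_gb` is made up for this illustration, 1 GB is taken as 10^9 bytes, and the 650 GB training-state figure is simply the article's approximate number, not something derived here.

```python
# Back-of-the-envelope memory estimate for full 16-bit fine-tuning.
# Assumption: 1 GB = 1e9 bytes. The 650 GB figure is quoted from the article.

def weights_gb(num_params: int, bits_per_param: int) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

num_params = 65_000_000_000                 # 65B parameters
weights_16bit = weights_gb(num_params, 16)  # weights alone
training_state_gb = 650.0                   # article's approximate figure
total_gb = weights_16bit + training_state_gb

print(f"16-bit weights: {weights_16bit:.0f} GB")  # 16-bit weights: 130 GB
print(f"total:          {total_gb:.0f} GB")       # total:          780 GB
```

The weights are only a small share of the total, which is why methods that shrink the set of trainable parameters have such a large effect.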
LoRA is a fine-tuning method devised to solve this memory consumption problem. Instead of training the original model's parameter matrices directly, LoRA trains new low-rank matrices, which reduces the memory needed for training.
[2106.09685] LoRA: Low-Rank Adaptation of Large Language Models
https://arxiv.org/abs/2106.09685
A low-rank approximation lets a huge matrix be expressed as the product of two relatively small matrices. If a parameter matrix in the original model has size [d]×[d], the low-rank version with rank [r] consists of two matrices of size [d]×[r] and [r]×[d]. This reduces the number of trainable parameters from d squared to 2×d×r. The LoRA paper reports that, when fine-tuning GPT-3, this cut the number of trainable parameters to 1/10,000 and memory consumption to one third.
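The parameter counts above can be illustrated with a small sketch. The sizes `d = 4096` and `r = 8` are hypothetical values chosen for illustration, not figures from either paper, and the tiny matrices `B` and `A` hold made-up numbers whose only purpose is to show that a [d]×[r] matrix times an [r]×[d] matrix gives a [d]×[d] result.

```python
# Trainable-parameter count: one full d x d matrix versus two low-rank factors.
# d and r below are hypothetical example sizes.

def full_params(d: int) -> int:
    return d * d

def lora_params(d: int, r: int) -> int:
    return 2 * d * r  # one d x r matrix plus one r x d matrix

d, r = 4096, 8
full = full_params(d)       # 16_777_216
low_rank = lora_params(d, r)  # 65_536
ratio = full // low_rank    # 256 times fewer trainable parameters

def matmul(a, b):
    """Plain-Python matrix product, good enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Toy factors with d = 4 and r = 1 (made-up values).
B = [[1.0], [2.0], [3.0], [4.0]]   # d x r
A = [[0.5, 0.0, -0.5, 1.0]]        # r x d
delta_w = matmul(B, A)             # d x d, built from only 2 * d * r numbers

print(full, low_rank, ratio)           # 16777216 65536 256
print(len(delta_w), len(delta_w[0]))   # 4 4
```

The reduction for a single matrix depends only on the ratio d / (2r), so the smaller the rank, the larger the saving.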
Because LoRA allows efficient training with few computing resources, it was adopted early in image generation, a field where individual users are eager to develop their own models. With Stable Diffusion, for example, training a LoRA on a particular art style, character, or background makes it possible to generate images that follow what was learned.
In language model development as well, a leaked document revealed that people inside Google, which holds an advantage in computing resources, had voiced concern about LoRA.
Leaked internal Google document on AI says "open source is a threat," "the winner is Meta," and "OpenAI doesn't matter" - GIGAZINE
According to the new paper, adding three techniques on top of LoRA made it possible to train a 65-billion (65B) parameter model on a GPU with only 48 GB of memory, and 24 hours of training was enough to reach 99.3% of ChatGPT's performance level on the Vicuna benchmark.
The three techniques used in the paper are as follows.
・Quantization with NF4
Language models are generally stored at 16 bits, so each parameter carries 16 bits of information, whereas QLoRA quantizes the model to 4 bits instead. Less information means lower precision, but the parameters of a pretrained model usually follow a zero-mean normal distribution, so QLoRA limits the loss of precision by using the "NormalFloat (NF)" format, which quantizes on the basis of a normal distribution.
・Double quantization
By also quantizing the constants used in quantization, the memory overhead of those constants is reportedly reduced from 0.5 bits per parameter to 0.127 bits.
・Paged optimizers
When GPU memory reaches its limit, data is moved out to ordinary (CPU) memory to free up the memory needed for computation. Using this technique reportedly kept down peak GPU memory usage during parameter updates.
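The first two techniques can be made more concrete with a simplified sketch. It is not the paper's exact NF4 construction (the real format, for instance, includes an exact zero level) but a minimal illustration of the idea: place the 16 quantization levels at quantiles of a standard normal distribution, scale each block of weights by its absolute maximum, and store only the index of the nearest level. All function names and the sample weights are invented for this example. The block sizes used in the overhead arithmetic (64 weights per block, 32-bit constants, 8-bit second-level constants in groups of 256) come from the QLoRA paper rather than from this article.

```python
from statistics import NormalDist

def normal_levels(bits: int = 4) -> list[float]:
    """Levels at evenly spaced quantiles of N(0, 1), rescaled to [-1, 1]."""
    n = 2 ** bits
    std_normal = NormalDist()
    # Midpoint quantiles avoid the infinite values at probabilities 0 and 1.
    raw = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    scale = max(abs(v) for v in raw)
    return [v / scale for v in raw]

def quantize_block(block, levels):
    """Scale a block by its absolute maximum, then keep nearest-level indices."""
    absmax = max(abs(w) for w in block) or 1.0  # guard against an all-zero block
    codes = [
        min(range(len(levels)), key=lambda i: abs(levels[i] - w / absmax))
        for w in block
    ]
    return codes, absmax  # absmax is the per-block quantization constant

def dequantize_block(codes, absmax, levels):
    return [levels[c] * absmax for c in codes]

levels = normal_levels(4)
weights = [0.02, -0.31, 0.11, 0.0, -0.07, 0.25, -0.15, 0.05]  # made-up values
codes, absmax = quantize_block(weights, levels)
restored = dequantize_block(codes, absmax, levels)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Overhead of the quantization constants, in bits per parameter.
block_size, const_bits = 64, 32
overhead_single = const_bits / block_size                                 # 0.5
second_block = 256
overhead_double = 8 / block_size + const_bits / (block_size * second_block)

print(codes)
print(f"{overhead_single} -> {overhead_double:.3f} bits per parameter")  # 0.5 -> 0.127 bits per parameter
```

Each weight is stored as a 4-bit index, and the per-block constant `absmax` is the value that double quantization compresses a second time.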
Guanaco 33B, a model trained with QLoRA, can be tried out on Hugging Face. It only handles conversations in English, but its responses appear to be quite good.
The code used in the QLoRA paper is hosted on GitHub, so anyone interested can take a look.