KTUGFaq

UTF-8 is short for Unicode Transformation Format-8. It is a scheme for encoding the character set defined by ISO 10646 / Unicode as a stream of 8-bit bytes.

Contents

1 UTF-8
2 UTF-8 usage and other information
3 TeX and UTF-8
4 Editing UTF-8 documents
5 The BOM problem
6 Discussion

1 UTF-8

UTF-8 encodes every Unicode character as a variable-width sequence of 1 to 4 bytes.
Note: not 6 bytes. ISO JTC1/SC2/WG2 has officially decided that ISO 10646 will not assign characters beyond U+10FFFF, the upper limit of what UTF-16 can represent, and the Unicode Technical Committee had made the same decision earlier.
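The U+10FFFF limit follows from how UTF-16 combines two surrogate code units. The Python sketch below only works through that arithmetic; `from_surrogate_pair` is a helper name made up for this illustration, and the surrogate ranges and the 0x10000 offset are the standard UTF-16 definitions rather than something stated on this page.

```python
# UTF-16 surrogate ranges: 1024 high surrogates and 1024 low surrogates.
HIGH_FIRST, HIGH_LAST = 0xD800, 0xDBFF
LOW_FIRST, LOW_LAST = 0xDC00, 0xDFFF


def from_surrogate_pair(high, low):
    """Combine a UTF-16 surrogate pair into one code point (illustrative helper)."""
    if not (HIGH_FIRST <= high <= HIGH_LAST and LOW_FIRST <= low <= LOW_LAST):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_FIRST) << 10) + (low - LOW_FIRST)


# The largest possible pair gives the largest code point UTF-16 can express.
max_code_point = from_surrogate_pair(HIGH_LAST, LOW_LAST)
print(hex(max_code_point))                       # 0x10ffff

# UTF-8 needs only 4 bytes even for that largest code point.
print(len(chr(max_code_point).encode("utf-8")))  # 4
```

Since nothing above U+10FFFF will be assigned, the 5-byte and 6-byte forms of the older UTF-8 definition are never needed.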

How many bytes a Unicode character takes is determined by the code value (Unicode Scalar Value) assigned to it: 1 byte up to U+007F (127), 2 bytes from U+0080 (128) to U+07FF (2047), 3 bytes from U+0800 (2048) to U+FFFF (65535), and 4 bytes from U+10000 (65536) to U+10FFFF (1114111). Every US-ASCII character, including U+0000 (NULL), is therefore a single byte in UTF-8. This property makes UTF-8 well suited to Unix file systems[UTF-8] and to the many text-based Internet protocols, SMTP (Internet mail) among them, that must stay compatible with US-ASCII.
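The ranges above translate directly into a lookup on the code value. Here is a minimal Python sketch; `utf8_length` is a name chosen for this example, and rejecting surrogates (U+D800 to U+DFFF) is an added assumption based on the definition of a Unicode scalar value.

```python
def utf8_length(code_point):
    """Return how many bytes UTF-8 uses for one Unicode scalar value."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not Unicode scalar values")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


# Cross-check the table against Python's built-in UTF-8 encoder.
for cp in (0x00, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
    print(f"U+{cp:04X}: {utf8_length(cp)} byte(s)")
```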

There are several ways to encode Unicode, including UTF-7, UTF-8, UTF-16, and UTF-32. TeX mainly uses UTF-8; CJK characters are assigned at U+0800 and above, so each takes at least 3 bytes in UTF-8 (4 for those outside the BMP). With UTF-16, by contrast, every character in the BMP (Basic Multilingual Plane: the first 65,536 code points of Unicode), CJK included, takes 2 bytes. Some therefore prefer UTF-16, since UTF-8 can need up to 1.5 times the storage space or transmission time, but compatibility with US-ASCII often matters a great deal, so Unix (including Mac OS X), BeOS, and others mainly use UTF-8 for text and text-like formats. TeX/Omega mainly uses UTF-8 for the same reason. Win32, on the other hand, uses UTF-16 by default even for text documents.
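The size trade-off is easy to check. In the Python sketch below, `encoded_sizes` and the two sample strings are made up for this example; `utf-16-le` is used so that Python does not prepend a BOM and inflate the count.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


hangul = "한글"                      # two BMP characters
markup = "\\documentclass{article}"  # pure US-ASCII, as TeX commands are

print(encoded_sizes(hangul))  # {'utf-8': 6, 'utf-16': 4}
print(encoded_sizes(markup))  # UTF-16 is twice the size of UTF-8 here
```

Hangul text costs 1.5 times as much in UTF-8, while ASCII markup costs twice as much in UTF-16, so which encoding is smaller depends on how much of a document is markup.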

ÅؽºÆ® Çü½Ä°ú OS³ª ÇÁ·Î±×·¥ ȤÀº ¶óÀ̺귯¸® ³»ºÎ¿¡¼­ ¾î¶² ÀÎÄÚµùÀ» ¾²´À³Ä´Â º°°³ÀÇ ¹®Á¦ÀÌ´Ù. Linux glibc´Â UTF-32¸¦ ¾²°í, Mac OS X, Win32, Omega, ICU (International Component for Unicode), Java, [http]ECMAscript[1], Mozilla µîÀº UTF-16À» ¾²°í, BeOS, glib, Perl µîÀº UTF-8À» ¾´´Ù. PythonÀº UTF-32 ȤÀº UCS-2 (2byte ȤÀº 4byte¸¦ ¾²´Â °¡º¯Æø ÀÎÄÚµùÀÎ UTF-16°ú ´Þ¸® UCS-2´Â 2byte °íÁ¤Æø ÀÎÄÚµùÀ¸·Î ¿ÀÁ÷ BMP¸¸ Áö¿øÇÒ ¼ö ÀÖ´Ù.)¸¦ ¾²µµ·Ï ÄÄÆÄÀÏÇÒ ¼ö ÀÖ´Ù. UTF-32´Â °¡º¯Æø ÀÎÄÚµùÀÎ UTF-16À» ´Ù·ç´Âµ¥¼­ ¿À´Â ºÒÆíÇÔ[2]À» ÇÇÇÒ ¼ö ÀÖ´Â ¹Ý¸é¿¡ UTF-8¿¡ ºñÇØ ÃÖ´ë 4¹èÀÇ °ø°£(US-ASCII¿¡ ¼ÓÇÏ´Â ±ÛÀÚ¶ó¸é)À», UTF-16¿¡ ºñÇؼ­µµ ÃÖ´ë 2¹è (BMP¿¡ ¼ÓÇÏ´Â ±ÛÀÚÀÇ °æ¿ì)ÀÇ °ø°£À» ¾´´Ù´Â ´ÜÁ¡ÀÌ ÀÖ´Ù.

For more details, see the following web page and the related pages at MS, Apache, and elsewhere that it mentions: -- 신정식
----
  • [UTF-8] UTF-8 was formerly called UTF-FSS, where FSS stands for File System Safe.
  • [1] JavaScript: ECMA 290 and ECMA 327 are also ECMAScript-related standards.
  • [2] This is not much of a problem. Many OSes, libraries, and applications handle it well. Personally, though, I prefer UTF-32.

3 TeX and UTF-8

  • In a UTF-8 document, US-ASCII characters are represented in 1 byte. TeX commands therefore have the same codes as in ordinary ASCII text, which reduces the code-conversion burden on TeX/Omega when it recognizes and processes commands.
  • The TeX engines and TeX macro packages that can currently process Korean documents encoded in UTF-8 are CJKLaTeX and LaTeX-ucs, which run on plain LaTeX, and HLaTeX/Lambda, which runs on Omega/Lambda.
  • Because UTF-8 gives access to all of Unicode, it is one of the best solutions for writing multilingual documents. The only problem is whether fonts exist that can render all of Unicode.
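The first point can be checked by looking at the bytes of a line of TeX source. The Python sketch below is an illustration only: `ascii_bytes_only` and the sample line are made up for it, and it relies on the UTF-8 property that every byte of a multi-byte character is 0x80 or above.

```python
def ascii_bytes_only(text):
    """Encode text as UTF-8 and keep only the bytes in the US-ASCII range."""
    return bytes(b for b in text.encode("utf-8") if b < 0x80)


source = "\\section{한글 조판}"
encoded = source.encode("utf-8")

# The command, braces, and space survive byte for byte.
print(ascii_bytes_only(source))  # b'\\section{ }'

# No Hangul byte can be mistaken for a backslash or a brace.
print(encoded.count(b"\\"), encoded.count(b"{"), encoded.count(b"}"))  # 1 1 1
```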

See also UTF8한글조판

4 Editing UTF-8 documents

Editors and word processors that can edit UTF-8 documents include NotePad, Wordpad, MS-Word, 아래아한글, UltraEdit, Yudit, EditPlus, jEdit, AcroEdit, SCUnipad, Vim for Windows (VimEditor), and Emacs for Windows.

Unix (including Linux and Mac OS X) has a great many editors that can edit UTF-8 documents, among them Vim, Emacs, TextEdit (Mac), Yudit, mined, gedit, and kedit. Some editors require the locale to be set. For XIM input, Nabi, Ami, and others can be used.

5 The BOM problem


6 Discussion

I would like to hear from anyone who has managed to enter Korean in UTF-8 in Emacs. I would be grateful for any guidance. --hermian
Installing mule-ucs should do it. On Debian: apt-get install mule-ucs. --DohyunKim
Even with mule-ucs installed, I cannot enter Korean in emacs; it just comes out blank. Maybe a font is missing, and I also wonder whether mule-ucs supports only Japanese and Chinese characters. I am rambling without really knowing. How exactly is it used? The system I am using is hancom linux. I am indebted to ktug. --synapse
There is a related article under NTEmacs. --Karnes

Lambda in W32TeX processes UTF-8 input files that carry a BOM without errors.
In the end, TeX for Win32 will have to be fixed to behave this way.

Try SCUnipad, mentioned above. It is said to handle the UTF-8 BOM as well. Vim for Windows (VimEditor) should also handle the UTF-8 BOM well. -- 신정식
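For a tool that chokes on the BOM, one workaround is to strip it before processing. A minimal Python sketch, assuming the file content is already in memory as bytes; `strip_utf8_bom` is a name invented here, while `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the Python standard library.

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Simulate what a BOM-writing editor saves.
with_bom = codecs.BOM_UTF8 + "\\documentclass{article}".encode("utf-8")

print(strip_utf8_bom(with_bom))       # b'\\documentclass{article}'

# The "utf-8-sig" codec drops the BOM while decoding.
print(with_bom.decode("utf-8-sig"))   # \documentclass{article}

# Plain "utf-8" keeps it as U+FEFF at the start of the text.
print(repr(with_bom.decode("utf-8")[0]))  # '\ufeff'
```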



last modified 2009-05-19 08:18:00