index.xml - chris.bracken.jp - Statically generated site for chris.bracken.jp

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Linux on Chris Bracken</title>
    <link>https://chris.bracken.jp/tags/linux/</link>
    <description>Recent content in Linux on Chris Bracken</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <managingEditor>chris@bracken.jp (Chris Bracken)</managingEditor>
    <webMaster>chris@bracken.jp (Chris Bracken)</webMaster>
    <lastBuildDate>Fri, 22 Apr 2011 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://chris.bracken.jp/tags/linux/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Installing Mozc on Ubuntu</title>
      <link>https://chris.bracken.jp/2011/04/installing-mozc-on-ubuntu/</link>
      <pubDate>Fri, 22 Apr 2011 00:00:00 +0000</pubDate>
      <author>chris@bracken.jp (Chris Bracken)</author>
      <guid>https://chris.bracken.jp/2011/04/installing-mozc-on-ubuntu/</guid>
      <description>&lt;p&gt;If you&amp;rsquo;re a Japanese speaker, one of the first things you do when you install a
fresh Linux distribution is to install a decent &lt;a href=&#34;https://en.wikipedia.org/wiki/Japanese_IME&#34;&gt;Japanese IME&lt;/a&gt;.
Ubuntu defaults to &lt;a href=&#34;https://sourceforge.jp/projects/anthy/news/&#34;&gt;Anthy&lt;/a&gt;, but I personally prefer &lt;a href=&#34;https://code.google.com/p/mozc/&#34;&gt;Mozc&lt;/a&gt;, and
that&amp;rsquo;s what I&amp;rsquo;m going to show you how to install here.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update (2011-05-01):&lt;/em&gt; Found an older &lt;a href=&#34;https://www.youtube.com/watch?v=MfgjTCXZ2-s&#34;&gt;video tutorial&lt;/a&gt; on YouTube
which provides an alternative (and potentially more comprehensive) solution for
Japanese support on 10.10 using ibus rather than uim. Of the two, ibus is the
better choice for newer releases.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update (2011-10-25):&lt;/em&gt; The software installation part of this process got a
whole lot easier in Ubuntu releases after Natty, and as noted above, I&amp;rsquo;d
recommend sticking with ibus over uim.&lt;/p&gt;
&lt;h3 id=&#34;japanese-input-basics&#34;&gt;Japanese Input Basics&lt;/h3&gt;
&lt;p&gt;Before we get going, let&amp;rsquo;s understand a bit about how Japanese input works on
computers. Japanese is written with three main character sets: two phonetic
sets, hiragana and katakana, at roughly 50 characters each, plus many
thousands of kanji, each with multiple readings. Clearly a keyboard with one
key per character is impractical, so a mapping is required.&lt;/p&gt;
&lt;p&gt;Input happens in two steps. First, you input the text phonetically, then you
convert it to a mix of kanji and kana.&lt;/p&gt;
&lt;figure&gt;&lt;img src=&#34;https://chris.bracken.jp/post/2011-04-22-henkan.png&#34;
    alt=&#34;Japanese IME completion menu&#34;&gt;
&lt;/figure&gt;

&lt;p&gt;Over the years, two main mechanisms evolved to input kana. The first was common
on old &lt;em&gt;wapuro&lt;/em&gt; (dedicated word processors), and assigns a kana to each key on the
keyboard; for example, where the &lt;em&gt;A&lt;/em&gt; key appears on a QWERTY keyboard, you&amp;rsquo;ll find a ち.
This is how our grandparents hacked out articles for the local &lt;em&gt;shinbun&lt;/em&gt; (newspaper),
but I suspect only a few die-hard traditionalists still do this. The second and more common method
is literal &lt;a href=&#34;https://en.wikipedia.org/wiki/Wapuro&#34;&gt;transliteration of roman characters into kana&lt;/a&gt;. You
type &lt;em&gt;fujisan&lt;/em&gt; and out comes ふじさん.&lt;/p&gt;
&lt;p&gt;Once the phonetic kana have been input, you execute a conversion step wherein
the input is transformed into the appropriate mix of kanji and kana. Given the
large number of homophones in Japanese, this step often involves disambiguating
your input by selecting the intended kanji. For example, the &lt;em&gt;mita&lt;/em&gt; in &lt;em&gt;eiga wo
mita&lt;/em&gt; (I watched a movie) is properly rendered as 観た whereas the &lt;em&gt;mita&lt;/em&gt; in
&lt;em&gt;kuruma wo mita&lt;/em&gt; (I saw a car) should be 見た, and in neither case is it &lt;em&gt;mita&lt;/em&gt;
as in the place name &lt;em&gt;Mita-bashi&lt;/em&gt; (Mita bridge) which is written 三田.&lt;/p&gt;
&lt;h3 id=&#34;some-implementation-details&#34;&gt;Some Implementation Details&lt;/h3&gt;
&lt;p&gt;Let&amp;rsquo;s look at implementation. There are two main components used in inputting
Japanese text:&lt;/p&gt;
&lt;p&gt;The GUI system (e.g. ibus, uim) is responsible for:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Maintaining and switching the current input mode:
ローマ字、ひらがな、カタカナ、半角カタカナ.&lt;/li&gt;
&lt;li&gt;Transliteration of character input into kana: &lt;em&gt;ku&lt;/em&gt; into く,
&lt;em&gt;nekko&lt;/em&gt; into ねっこ, &lt;em&gt;xtu&lt;/em&gt; into っ.&lt;/li&gt;
&lt;li&gt;Managing the text under edit (the underlined stuff) and the
drop-down list of conversion candidates.&lt;/li&gt;
&lt;li&gt;Ancillary functions such as supplying a GUI for custom dictionary
management, kanji lookup by radical, etc.&lt;/li&gt;
&lt;/ol&gt;
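&lt;p&gt;The transliteration step in item 2 of the list above can be sketched in a few
lines of Python. This is only an illustration, not how ibus, uim or Mozc
implement it: the names &lt;code&gt;ROMAJI_TO_KANA&lt;/code&gt;, &lt;code&gt;DOUBLING_CONSONANTS&lt;/code&gt; and
&lt;code&gt;to_kana&lt;/code&gt; are hypothetical, and the table is heavily abridged. The sketch
assumes a greedy longest-match lookup, plus a rule that turns the first of two
identical consonants into a small っ.&lt;/p&gt;

```python
# Hypothetical, heavily abridged romaji-to-kana table.
# A real input method ships several hundred entries.
ROMAJI_TO_KANA = {
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "ji": "じ", "fu": "ふ", "ne": "ね",
    "n": "ん",     # a lone n that starts no longer match
    "xtu": "っ",   # explicit small tsu
}

# Consonants that produce a small tsu when doubled (n is deliberately excluded).
DOUBLING_CONSONANTS = set("bcdfghjkmpqrstvwyz")


def to_kana(romaji):
    """Greedily convert a lowercase romaji string to hiragana."""
    out = []
    i = 0
    while i != len(romaji):
        ch = romaji[i]
        # A doubled consonant such as the kk in nekko becomes a small tsu.
        if ch in DOUBLING_CONSONANTS and romaji[i:i + 2] == ch * 2:
            out.append("っ")
            i += 1
            continue
        # Try the longest candidate first so that xtu wins over any shorter entry.
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            # No table entry: pass the character through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)
```

&lt;p&gt;With this abridged table, &lt;code&gt;to_kana(&#34;nekko&#34;)&lt;/code&gt; yields ねっこ and
&lt;code&gt;to_kana(&#34;fujisan&#34;)&lt;/code&gt; yields ふじさん. Input that the table does not
cover is passed through unchanged.&lt;/p&gt;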
&lt;p&gt;The conversion engine (e.g. Anthy, Mozc) is responsible for transforming a
piece of input text, usually in kana form, into kanji: for example, みる into
one of 見る、観る、診る、視る. This involves:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Breaking the input phrase into components.&lt;/li&gt;
&lt;li&gt;Transforming each component into the appropriate best guess based on context
and historical input.&lt;/li&gt;
&lt;li&gt;Supplying alternative transformations in case the best guess was incorrect.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;why-mozc&#34;&gt;Why Mozc?&lt;/h3&gt;
&lt;p&gt;TL;DR: because it&amp;rsquo;s better. Have a look at the conversion list up at the top of
this post. The input is &lt;em&gt;kinou&lt;/em&gt;, for which there are two main conversion
candidates: 機能 (feature) and 昨日 (yesterday). Notice, however, that it also
supplies several conversions for yesterday&amp;rsquo;s date in various formats, including
「平成23年4月21日」, which uses the &lt;a href=&#34;https://en.wikipedia.org/wiki/Japanese_era_name&#34;&gt;Japanese Era Name&lt;/a&gt; rather than the
Western year 2011. This is just one small improvement among dozens of
clever tricks it performs. If you&amp;rsquo;re thinking this bears an uncanny resemblance
to tricks that &lt;a href=&#34;https://www.google.com/intl/ja/ime/&#34;&gt;Google&amp;rsquo;s Japanese IME&lt;/a&gt; supports, you&amp;rsquo;re right: Mozc
originated from the same codebase.&lt;/p&gt;
&lt;h3 id=&#34;switching-to-mozc&#34;&gt;Switching to Mozc&lt;/h3&gt;
&lt;p&gt;So let&amp;rsquo;s assume you&amp;rsquo;re now convinced to abandon Anthy and switch to Mozc.
You&amp;rsquo;ll need to make some changes. Here are the steps:&lt;/p&gt;
&lt;p&gt;If you haven&amp;rsquo;t yet done so, install some Japanese fonts from either Software
Centre or Synaptic. I&amp;rsquo;d recommend grabbing the &lt;em&gt;ttf-takao&lt;/em&gt; package.&lt;/p&gt;
&lt;p&gt;Next up, we&amp;rsquo;ll install and configure Mozc.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Install ibus-mozc:&lt;/strong&gt; &lt;code&gt;sudo apt-get install ibus-mozc&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Restart the ibus daemon:&lt;/strong&gt; &lt;code&gt;/usr/bin/ibus-daemon --xim -r -d&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Set your input method to mozc:&lt;/strong&gt;
&lt;ol&gt;
&lt;li&gt;Open &lt;em&gt;Keyboard Input Methods&lt;/em&gt; settings.&lt;/li&gt;
&lt;li&gt;Select the &lt;em&gt;Input Method&lt;/em&gt; tab.&lt;/li&gt;
&lt;li&gt;From the &lt;em&gt;Select an input method&lt;/em&gt; drop-down, select Japanese, then mozc from
the sub-menu.&lt;/li&gt;
&lt;li&gt;Select &lt;em&gt;Japanese - Anthy&lt;/em&gt; from the list, if it appears there, and click
&lt;em&gt;Remove&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Optionally, remove Anthy from your system:&lt;/strong&gt; &lt;code&gt;sudo apt-get autoremove anthy&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Log out and back in. You should see an input method menu in the menu
bar at the top of the screen.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s it, Mozcを楽しんでください！&lt;/p&gt;
</description>
    </item>

  </channel>
</rss>