This is one of those things that I always need to look up. I know this tip will probably have no relevance for 98.5% of my regular readers, but I wanted to put it here for future reference. Also, since I just spent about 20 minutes googling the solution, perhaps this can be of service to people who run into the same issue.
Today was the first time since I last reinstalled Windows that I needed to type something in Polish inside Vim, and I realized that my AltGr combinations were not working. In other words, I could not type letters such as ł, ą, ę, ż, ź, ś, ć, ó and so on. You’d be surprised how often these come up in an average sentence. Surprisingly enough, I had no such problem on Ubuntu, where they worked just fine. The Windows version, however, refused to cooperate.
I did a quick Google search for “vim polish characters” and got basically nothing. Then I tried a few search queries in Polish and still got little info. Then I realized I was approaching this wrong, and that my issue was caused by two factors:
- Vim was not in a Unicode compliant mode
- The font I was using (Bitstream Vera Sans Mono) was not Unicode friendly
So I set out to fix this. How do we get Unicode characters to work properly in Vim on Windows? Easy, just paste the following snippet into your _vimrc:
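" Only touch encoding settings if this Vim was built with multi-byte support
if has("multi_byte")
  " Remember the terminal's encoding before 'encoding' is changed
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  " Use UTF-8 internally
  set encoding=utf-8
  " Write new files as UTF-8 with a byte order mark
  setglobal fileencoding=utf-8 bomb
  " Encodings to try, in order, when reading an existing file
  set fileencodings=ucs-bom,utf-8,latin1
endif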
An explanation can be found in Vim Tip #246. The if statement is a safety precaution, since your version of the editor may not be compiled with the multi_byte feature, which is required for Unicode to work properly.
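If you want to confirm that the settings actually took effect, a quick sanity check along these lines should do after restarting the editor. This is only a sketch that queries the standard options set by the snippet above; it is not part of the original tip.

```vim
" Prints 1 if this Vim was compiled with the multi_byte feature
:echo has("multi_byte")
" Show the current values of the options the snippet sets
:set encoding? termencoding? fileencodings?
:setglobal fileencoding? bomb?
" Show the encoding Vim detected for the file in the current buffer
:setlocal fileencoding?
```

With the cursor on a character such as ł, pressing ga in normal mode shows its code point and g8 shows its UTF-8 bytes, which helps tell an encoding problem apart from a missing glyph in the font.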
Next you need a Unicode-friendly font. Bitstream Vera Sans Mono did not have the right glyphs. Neither did Lucida Sans Typewriter. However, Lucida Console, Courier New and Consolas all worked just fine. I really can’t tell you which fonts will work and which won’t. You should probably just type some interesting word like “Gżegżółka” into the editor and try different fonts until you find one you like. For example, Bitstream Vera Sans looks like this:
On the other hand, the Consolas font looks like this:
In my case I simply added the following line to _vimrc to change the font to Consolas:
set gfn=Consolas:h10:cANSI
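If you share one _vimrc between machines, a slightly more defensive variant might look like the sketch below. It assumes the Windows GUI version of Vim and lists the three fonts that worked for me as fallbacks; the h10 size for the fallback fonts is my assumption, not something I tested.

```vim
" Sketch: only set the font in the GUI on Windows
if has("gui_running") && has("win32")
  " Comma-separated list: Vim uses the first font it can load.
  " On Windows, underscores stand in for spaces in font names.
  set guifont=Consolas:h10:cANSI,Lucida_Console:h10:cANSI,Courier_New:h10:cANSI
endif
```

To try fonts interactively before committing to one, :set guifont=* opens the font picker in gVim on Windows, and :set guifont? then prints the value to copy into _vimrc.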
I hope that someone will find this helpful. I’m posting it under a Google-friendly title so that anyone else who needs to figure this out can easily find it here. For those of you who couldn’t care less about Polish characters, Vim or Unicode, I apologize. It had to be done. Now we can return to our usual brand of craziness that you came here to read. :P
Don’t leave us hanging…what does Gżegżółka mean (Google’s Polish -> English translator doesn’t help)?
Oh, gżegżółka == cuckoo bird. Or rather the common name for it. The official name for the bird (one that you find in the encyclopedia) and the more popular one is kukułka. Gżegżółka is an older name, and nowadays mostly functions as an orthographic curiosity – something that is bound to come up on spelling bees and tests.
Also, it is a tongue twister which is fun to teach to foreigners. Most Polish people can pronounce the word without problems. English speakers usually have major problems even trying to repeat it. :)
You’re lucky the Polish alphabet is a Latin one. It’s a PITA to use Vim when writing in Cyrillic. You’re hardly more efficient in Vim than in, say, Leafpad, because you have to switch the keyboard layout whenever you switch between insert and normal mode. Luckily, most of the time, I need to write documents in English.
The Unicode problem was particular to how vim behaves in Windows, I presume?
Yeah, I can’t even imagine how parts of the world which don’t use the Latin alphabet deal with this crap on a daily basis. And I believe that Cyrillic is not even the worst case here. I assume that more Unicode-related issues arise when you start working in scripts that are written right-to-left (Arabic) or in Japanese, with its mix of kanji, hiragana and katakana.
And you are correct: I never had this issue on Linux.
You actually can use your regular keyboard layout and insert text in another language! It’s achieved with vim keymaps.
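A minimal sketch of the keymap approach, assuming the russian-jcukenwin keymap file that ships with Vim's runtime and a Vim built with the keymap feature (pick whichever keymap matches your layout):

```vim
" Sketch: type Cyrillic in insert mode while the OS layout stays on English
if has("keymap")
  " Load the keymap (assumes keymap/russian-jcukenwin.vim is in the runtime)
  set keymap=russian-jcukenwin
  " Setting 'keymap' switches the mapping on; start with it off instead
  set iminsert=0
  " Let search patterns follow whatever 'iminsert' is currently set to
  set imsearch=-1
endif
```

With this in place, Ctrl-^ in insert mode toggles the keymap on and off. Normal-mode commands are not affected, because the operating system layout never changes.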
How did you get the Polish letters to work on Ubuntu so easily? I’m having no luck!
I tried to use your tutorial but it doesn’t work.
I’m running Vim from the console on XP SP2. I have the Consolas font, and yet Alt-o, Alt-a, Alt-s and Alt-z give me an “a” character instead of the correct letter.
If I run gvim everything seems OK, so I’m guessing it’s some problem with encoding in DOS.
@Sebastian: Three things:
1. Make sure you have the correct keyboard layout set: it must be Polish (Programmers).
2. Change the script to this one:
3. Press the right Alt, not the left one. Polish characters are activated by the so-called AltGr, that is, the right Alt in the Polish (Programmers) layout.
It should work. At least it works for me.
After enabling Polish letters, almost all of them work. Only for
ś (alt-s) is shown
ź (alt-x) is shown
Could you suggest a solution?
I struggled with a similar issue in my vim sessions on a remote Ubuntu 11.10 system for a couple of hours installing things, uninstalling things, tweaking configuration files, playing with terminal settings and generally mucking about. No matter what I tried, my ż and ś and ł characters were showing up as periods.
I eventually found Comment #3 on this UTF-8 encoding post using Google and suddenly realized I am an idiot. I found your post very well written and it was one of my first Google hits so I figured I’d post a comment here to hopefully help the next guy avoid some frustration.
If you run into this kind of issue while trying to edit UTF-8 encoded files through an SSH client, don’t forget to check what character encoding you’re using in the client!! In PuTTY, it’s Window => Translation => Remote Character Set and it defaults to ISO-8859-1.
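Once the client is set to UTF-8, a quick check from inside the remote Vim session might look like this. It is only a sketch; it assumes the remote shell exports its locale through the usual LANG and LC_ALL environment variables.

```vim
" Locale the remote shell handed to Vim (a UTF-8 locale is what you want)
:echo $LANG
:echo $LC_ALL
" Encoding Vim uses internally and the one it assumes for the terminal;
" an empty 'termencoding' means Vim uses the value of 'encoding'
:set encoding? termencoding?
```

If these report UTF-8 and the characters still show up wrong, the mismatch is most likely on the client side rather than in Vim.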