OK, this one is a bit geeked out again, but it’s relevant to China. If you’re an American, you could probably go your entire life without ever bumping into codepages, but if your life crosses paths with Asia, you almost certainly will…
As we’re developing a new website, I started bumping into a very unusual error while doing our Subversion (version control system) check-in.
e@116843:/spike/public/news/app/webroot/redv1.0/img/menu$ sudo svn up
svn: Valid UTF-8 data (hex:) followed by invalid UTF-8 sequence (hex: b8 b4 bc fe)
Unfortunately, Google didn’t come up with much. The best hit was an Oct 10th post on the Subversion users mailing list. Basically, the answer is that there’s no answer.
Well, I ran svn up in each child directory of the one causing the problem and eventually tracked the error down through my project’s directory tree. It looks like one of the guys using a Windows system copied a JPEG with a Chinese GBK-encoded filename onto the server. Those bytes are fine as GBK but are not valid UTF-8, which is why svn complained. Everything is best kept in UTF-8.
Once you’ve found the right file, you have to figure out how to delete a file with a name that can’t be typed…
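Walking the tree by hand works, but the hunt can be scripted. Here is a minimal sketch, assuming a POSIX shell and an iconv that exits non-zero on invalid input (GNU iconv does). The function name find_non_utf8_names is made up for this post; it is not part of svn.

```shell
# Hypothetical helper: print every path under a directory whose bytes
# are not valid UTF-8. Assumes iconv fails on invalid input.
find_non_utf8_names() {
  # LC_ALL=C makes find treat names as raw bytes; .svn dirs are skipped.
  LC_ALL=C find "${1:-.}" -name .svn -prune -o -exec sh -c '
    for path do
      # Converting UTF-8 to UTF-8 is a cheap validity check.
      if ! printf "%s" "$path" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        printf "%s\n" "$path"
      fi
    done' sh {} +
}
```

Running find_non_utf8_names /spike/public should list only the offending paths. The bad bytes will probably still show up as question marks in a UTF-8 terminal, but at least you know which directory to look in.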
e@116843:/spike/public/news/app/webroot/redv1.0/img/menu$ ls
logo02.jpg       ???? logo.jpg  menu_acc_down.jpg      menu_home_down.jpg  menu_work_down.jpg
logo03.jpg       logo.jpg       menu_acc.jpg           menu_home.jpg       menu_work.jpg
logo04.jpg       logo_top1.jpg  menu_cameras_down.jpg  menu_len_down.jpg
logo05.jpg       logo_top2.jpg  menu_cameras.jpg       menu_len.jpg
logo06.jpg       logo_top3.jpg  menu_gall_down.jpg     menu_tech_down.jpg
logo_bottom.jpg  logo_top.jpg   menu_gall.jpg          menu_tech.jpg
In this case, I just used: rm *\ logo.jpg, letting the wildcard stand in for the characters I couldn’t type, since only one file matched this pattern… Next, I could commit again!
e@116843:/spike$ sudo svn up
D public/.htaccess
Updated to revision 38.
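Deleting was fine here, but had the file been worth keeping, renaming it to UTF-8 would have been the gentler fix. If I’m reading the hex right, b8 b4 bc fe decoded as GBK is 复件, the "Copy of" prefix that Chinese Windows puts on duplicated files. Below is a rough sketch of the rename, assuming the name really is GBK and that your iconv knows that charset; gbk_name_to_utf8 is a name I made up.

```shell
# Hypothetical helper: rename one file whose name is GBK-encoded so that
# the name becomes UTF-8. Runs in a subshell so variables stay local.
gbk_name_to_utf8() (
  old=$1
  dir=$(dirname -- "$old")
  base=$(basename -- "$old")
  # Re-encode only the final path component; give up if iconv rejects it.
  new=$(printf '%s' "$base" | iconv -f GBK -t UTF-8) || exit 1
  # Pure-ASCII names come out unchanged, so there is nothing to do.
  [ "$new" != "$base" ] || exit 0
  if [ -e "$dir/$new" ]; then
    echo "refusing to overwrite $dir/$new" >&2
    exit 1
  fi
  mv -- "$old" "$dir/$new"
)
```

Since the name can’t be typed, the same wildcard trick applies: gbk_name_to_utf8 ./*\ logo.jpg, as long as the pattern matches exactly one file.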