utf-8 - How to use the shell to count Chinese characters in a file encoded in UTF-8


When I run cat doc.txt, the following text is shown:

你好 hello! 这是中文。this is a chinese doc.

I can use the command

wc -w doc.txt

but it shows:

8 doc.txt 

This command counts 你好 and 这是中文 as a single word each, while in fact 你好 is two Chinese words and 这是中文 is four.

What I want is to count these Chinese words correctly (there are 12 words in the example). How can I work that out?

You can use wc with the -m (or --chars) option, which counts characters instead of words:

$ echo -n "你好" | wc -m   

Output:

2 
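Note that wc -m counts every character, including spaces, punctuation and Latin letters, and its result depends on the current locale (in the C locale it counts bytes, so 你好 would give 6). If you need the count from the question, where each Chinese character is one word, something like the following might work. It is only a sketch under some assumptions: the input is valid UTF-8, "Chinese character" is approximated as anything in U+4000..U+9FFF (recognised by its lead byte 0xE4-0xE9), and the function names count_chars, count_han and count_mixed_words are made up for illustration.

```shell
#!/bin/sh
# Sketch: locale-independent counts for UTF-8 text read from stdin.
# The function names below are hypothetical helpers, not standard tools.

# Count characters: each UTF-8 character has exactly one byte that is not
# a continuation byte (0x80-0xBF), so delete those bytes and count the rest.
count_chars() {
    LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

# Approximate count of Chinese characters: keep only the lead bytes
# 0xE4-0xE9, which start the three-byte sequences for U+4000..U+9FFF.
count_han() {
    LC_ALL=C tr -cd '\344-\351' | wc -c | tr -d ' '
}

# Count words as the question asks: each Chinese character is one word,
# plus each whitespace-separated run of ASCII text.
count_mixed_words() {
    text=$(cat)
    han=$(printf '%s' "$text" | count_han)
    # Turn every non-ASCII byte into a space, then let wc -w count the rest.
    ascii=$(printf '%s' "$text" | LC_ALL=C tr '\200-\377' '[ *]' | wc -w | tr -d ' ')
    echo $((han + ascii))
}

# Sample text modeled on doc.txt from the question.
sample='你好 hello! 这是中文。this is a chinese doc.'
printf '%s' "$sample" | count_han          # Chinese characters only
printf '%s' "$sample" | count_mixed_words  # Chinese characters + ASCII words
```

With this sample line, count_han gives 6 and count_mixed_words gives 12, which is the number the question asks for. Working on raw bytes under LC_ALL=C avoids depending on which UTF-8 locales are installed. The trade-off is that the range is approximate, and characters encoded in four bytes (outside the Basic Multilingual Plane) are not counted as Chinese.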
