utf-8 - How to use the shell to count Chinese characters in a file encoded in UTF-8

When I run cat doc.txt, the following characters are shown:

你好 hello! 这是中文。this chinese doc.

I can use the command

wc -w doc.txt

but it shows:

8 doc.txt

This command treats 你好 and 这是中文 each as a single word, while in fact 你好 is 2 Chinese words and 这是中文 is four.

What I want is for these Chinese words to be counted correctly (there are 12 words in the example). How can I do that?

You can use the -m or --chars option of wc, which counts characters rather than words:

$ echo -n "你好" | wc -m

Output (in a UTF-8 locale):

2