Trying to understand the Ruby .chr and .ord methods
I've been working with the Ruby chr and ord methods recently and there are a few things I don't understand.
My current project involves converting individual characters to and from ordinal values. As I understand it, if I have a string with an individual character like "A" and I call ord on it, I get its position in the ASCII table, which is 65. Calling the inverse, 65.chr, gives me the character value "A". This tells me that Ruby has a collection somewhere of ordered character values, and it can use this collection to give me the position of a specific character, or the character at a specific position. I may be wrong on this; please correct me if I am.
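As a quick illustration of the round trip described above, here is a minimal sketch. The variable names are only for illustration.

```ruby
# Round trip between a character and its integer code point.
code = "A".ord        # 65
char = code.chr       # "A"

# ord only looks at the first character of a longer string.
first = "ABC".ord     # 65

puts code, char, first
```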
Now I also understand that Ruby's default character encoding is UTF-8, so it can work with thousands of possible characters. Thus if I ask it for something like this:
'好'.ord
I get the position of that character, which is 22909. However, if I call chr on that value:
22909.chr
I get "RangeError: 22909 out of char range." I'm only able to get chr to work on values up to 255, which is extended ASCII. So my questions are:
Why does Ruby seem to take the values for chr from the extended ASCII character set but the values for ord from UTF-8?
According to the Integer#chr documentation, you can pass an encoding as an argument to force UTF-8:
22909.chr(Encoding::UTF_8)
#=> "好"
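The asymmetry seems to come from the defaults rather than from two different tables. String#ord reads the code point in the string's own encoding. Integer#chr with no argument returns a US-ASCII string for values below 128 and an ASCII-8BIT (binary) string for 128 to 255. Above 255 it falls back to Encoding.default_internal, and raises RangeError when that is unset. A small sketch to check this, assuming a UTF-8 source file and Encoding.default_internal left at nil (the variable names are illustrative only):

```ruby
s = "好"

# ord decodes the code point according to the string's encoding.
source_encoding = s.encoding         # UTF-8 for a literal in a UTF-8 source file
code_point      = s.ord              # 22909

# chr without an argument picks the encoding from the integer's range.
ascii_encoding  = 65.chr.encoding    # US-ASCII
binary_encoding = 200.chr.encoding   # ASCII-8BIT

# Passing an encoding, or packing as a UTF-8 character, builds the UTF-8 string.
forced = 22909.chr(Encoding::UTF_8)
packed = [22909].pack("U")

puts source_encoding, code_point, ascii_encoding, binary_encoding, forced, packed
```

Array#pack with the "U" directive is shown only as an alternative that always yields UTF-8, whatever the defaults are.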
To list all available encoding names:
Encoding.name_list
#=> ["ASCII-8BIT", "UTF-8", "US-ASCII", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "UTF-16", "UTF-32", ...]
A hacky way to count the integers that UTF-8 accepts as valid characters:
2000000.times.reduce(0) do |x, i|
  begin
    i.chr(Encoding::UTF_8)
    x += 1          # i is a valid code point, so count it
  rescue RangeError
    # i is not a valid UTF-8 code point, so skip it
  end
  x
end
#=> 1112064
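The count above can be cross-checked with arithmetic. Unicode code points run from 0 to 0x10FFFF, and the 2048 surrogate values from 0xD800 to 0xDFFF are not valid characters in UTF-8. A short sketch of that check (the variable names are illustrative only):

```ruby
total_code_points = 0x10FFFF + 1              # 1_114_112 values from 0 to 0x10FFFF
surrogates        = (0xD800..0xDFFF).size     # 2_048 values reserved for UTF-16 pairs
valid_characters  = total_code_points - surrogates

# Confirm that a surrogate value really is rejected by chr.
surrogate_rejected =
  begin
    0xD800.chr(Encoding::UTF_8)
    false
  rescue RangeError
    true
  end

puts valid_characters      # 1112064
puts surrogate_rejected    # true
```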
After tooling around with this for a while, I realized that I could find the highest value each encoding accepts by running a binary search for the largest integer that doesn't raise a RangeError.
# Binary search for the highest integer that chr accepts in the given encoding.
def get_highest_value(set)
  min = 0
  max = 10_000_000_000
  while min <= max
    guess = (min + max) / 2
    begin
      guess.chr(set)
      min = guess + 1   # guess is valid, so search higher
    rescue RangeError
      max = guess - 1   # guess is out of range, so search lower
    end
  end
  max
end
The value fed to the method is the encoding being checked, given either as a name such as "UTF-8" or as an Encoding constant.
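For comparison, here is a sketch of the same idea written with Ruby's built-in Range#bsearch instead of a hand-rolled loop. The helper name highest_chr_value and the upper parameter are hypothetical, not part of the answer above. Like any binary search, it assumes the probes never land in a gap of invalid values below the maximum. That seems to hold for the encodings tried here, but it is not guaranteed in general.

```ruby
# Hypothetical helper: highest integer that Integer#chr accepts for an encoding.
def highest_chr_value(encoding, upper = 10_000_000_000)
  # Find-minimum mode of bsearch: the block returns false for accepted values
  # and true for rejected ones, so bsearch returns the first rejected value.
  first_invalid = (0..upper).bsearch do |i|
    begin
      i.chr(encoding)
      false             # accepted, so the boundary lies higher
    rescue RangeError
      true              # rejected, so the boundary is here or lower
    end
  end
  first_invalid - 1
end

puts highest_chr_value("US-ASCII")            # 127
puts highest_chr_value("ASCII-8BIT")          # 255
puts highest_chr_value(Encoding::UTF_8)       # 1114111
```

Note that this is the highest accepted value, not the number of valid characters. For UTF-8 the two differ because of the surrogate range.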