Offset value in Hadoop or in Java?
I am a bit confused by this term: the byte offset value is treated as the map key in a Hadoop MapReduce program. First off, what is a byte offset value?
Secondly, please shed some light on how it is generated and how to view this byte offset value.
Thanks, Raj
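A byte offset is the number of bytes counted from a starting point (the beginning of a line or of a file) up to a given position.
For example, this line
what is byte offset?
is 20 characters long, so counting from zero its last character sits at byte offset 19. Hadoop uses the byte offset at which each line starts within the file as the map key.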
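The byte offset is the count of bytes starting at zero. One character or space is usually one byte for the plain ASCII text found in most Hadoop examples, although multi-byte encodings such as UTF-8 can use more than one byte per character. Check out this question if you want to know more: How many bits in a character?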
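To make the character-versus-byte point concrete, here is a minimal sketch in plain Java. The class name `ByteLength` and the method `utf8Length` are made up for illustration, and UTF-8 is assumed as the encoding.

```java
import java.nio.charset.StandardCharsets;

public class ByteLength {
    // Number of bytes the string occupies when encoded as UTF-8 (assumed encoding).
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 20 ASCII characters, each encoded as a single byte.
        System.out.println(utf8Length("what is byte offset?")); // 20
        // A single non-ASCII character (e with acute accent) takes two bytes in UTF-8.
        System.out.println(utf8Length("\u00e9")); // 2
    }
}
```

So for ASCII-only input the character count and the byte count agree, and they diverge as soon as the text contains multi-byte characters.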
Basically, an offset is an integer that gives the distance from a base address; adding the offset to the base address yields the absolute address.
Assume a text file with the following data:
Computer-science World
Quantum Computing
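Now the offset of the first line is 0, so the first input record to the Hadoop job is <0, Computer-science World>. The first line occupies 22 bytes plus 1 byte for the newline, so the second line starts at offset 23 and its record is <23, Quantum Computing>.
Whenever we pass a text file to a Hadoop job, it calculates the byte offset internally: with the default TextInputFormat, the record reader tracks its position in the file and hands each line to the map function with the starting offset as a LongWritable key and the line itself as a Text value. To view the offset, write the map key to the job output or to a log.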
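The offsets for the two-line file above can be reproduced without Hadoop. The sketch below is not Hadoop's own code; it only mimics the idea, assuming UTF-8 text and `\n` line endings. The class name `LineOffsets` and the method `lineStartOffsets` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineOffsets {
    // Returns the byte offset at which each line starts,
    // assuming UTF-8 content and '\n' as the line terminator.
    public static List<Long> lineStartOffsets(String content) {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        List<Long> offsets = new ArrayList<>();
        long lineStart = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                offsets.add(lineStart);   // record where the finished line began
                lineStart = i + 1;        // next line begins right after the newline
            }
        }
        if (lineStart < bytes.length) {
            offsets.add(lineStart);       // final line without a trailing newline
        }
        return offsets;
    }

    public static void main(String[] args) {
        String file = "Computer-science World\nQuantum Computing\n";
        System.out.println(lineStartOffsets(file)); // [0, 23]
    }
}
```

The newline byte is what pushes the second offset to 23 rather than 22; a file with `\r\n` line endings would shift it by one more byte.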