Hi guys, I have to learn about the code below because I need to deal with UTF-8 data encoding/decoding. I've used this before, but in the past I just copied and pasted other people's code. To write my own version I have to dig in and learn it properly.

var message = data.slice(start, end);
var FIN = (message[0] & 0x80);
var RSV1 = (message[0] & 0x40);
var RSV2 = (message[0] & 0x20);
var RSV3 = (message[0] & 0x10);
var Opcode = message[0] & 0x0F;
var mask = (message[1] & 0x80);
var length = (message[1] & 0x7F);

So what do FIN and its associated variables mean? What do they do? I know what message[0] means, but what do 0x80 or 0x0F do? Why do they use & to combine them? Where can I learn more about this? I don't even know what it's called. Thanks in advance for helping me save my hair.
A single ampersand means a bitwise "and".

AND truth table:
0 & 0 == 0
1 & 0 == 0
0 & 1 == 0
1 & 1 == 1

As opposed to a bitwise OR, which we write with the vertical bar character | (aka the "pipe", though the character existed long before *nix used it for that).

OR truth table:
0 | 0 == 0
1 | 0 == 1
0 | 1 == 1
1 | 1 == 1

...and you should also be aware of XOR, usually written in C-syntax languages with the circumflex (aka "caret") character.

XOR truth table:
0 ^ 0 == 0
1 ^ 0 == 1
0 ^ 1 == 1
1 ^ 1 == 0

Of course if we're going to cover bitwise operations we should also talk about NOT and shifts.

NOT truth table (per bit; written ~ in C syntax, as a prefix):
~0 == 1
~1 == 0

A shift on the other hand moves the bits however many places left or right. Typically a shift is written with two greater-than or less-than signs, thus:

8 << 2 == 32
0x08 << 0x02 == 0x20
0b00001000 << 0b00000010 == 0b00100000

Mind you, with shifts any bits that go 'off the end' of the data size are thrown away! There's also what's called a rotate, which for some reason C never saw fit to add to its repertoire, where bits that would be shifted off are moved to the other end instead. In most assemblers it's easy:

rol ax, 2 ; rotate left ax by 2

Let's say our ax register (registers are a bit like variables -- well, that's a gross oversimplification, but it'll have to do for now) is 0b11000011, aka 0xC3, aka 195 decimal.

0b11000011 rol 2 == 0b00001111

Literally it 'rotates' the bits. Sadly C doesn't HAVE a rotate operator, so doing that in C is ugly as hell. Assuming an 8-bit unsigned integer (let's not even get into signed values and two's complement for now), a 'rotate' in C is this train wreck of fugliness:

uint8_t a = 0b11000011;
a = (a >> 6) | (a << 2);

Another thing C lacks is the ability to track carry or overflow -- so you have to mask first, which can result in painfully slow and bloated code. Remember that most of the time numbers on computers are stored in binary... Binary can be
painful to use and track, due to a number 0..65535 being 16 digits long in binary -- hence why hexadecimal caught on, as it's WAY easier to convert binary to hex and back in your head. That's what those 0x numbers are: they're HEX.

There are several shorthands for saying a number is hex, and languages vary on which they use. N below is a hex digit 0..F:

POSIX legacy languages: 0xNN
Some assemblers: 0NNh
Wirth family languages: $NN

So if you see 0F7h, 0xF7 or $F7 in some code, it's usually hexadecimal. Just as any number ending in a lowercase letter b or starting with 0b is typically binary. If it starts or ends with a letter o (sometimes 0o) it's octal. Some examples:

HEX   OCT   DEC   BIN
0x01  o001    1   0b00000001
0x02  o002    2   0b00000010
0x04  o004    4   0b00000100
0x08  o010    8   0b00001000
0x10  o020   16   0b00010000
0x20  o040   32   0b00100000
0x40  o100   64   0b01000000
0x80  o200  128   0b10000000

Notice how 1, 2, 4, 8 in each digit of hex corresponds to a bit in the binary? That's what they're testing for. If message[0] were, for example, 0x1E == 0b00011110, then 0x1E & 0x80 is the same as 0b00011110 & 0b10000000, which is clearer if you put them one over the other:

0b00011110
0b10000000

Which is false. Bit 7 (the top bit) is set in the second one but not the first, so the result is zero.

Let's do another example. Say we had 0xC2 and we wanted to test if bits 7 and 1 were set (remember, the 8 bits of a byte are numbered 0..7). That would be & 0x82:

0b11000010
0b10000010

AND those together and the result is 'true' for being non-zero, as well as being 0x82. If we were testing for 0x02 the result would be true, and 0x02. If we tested 0x03 (0b00000011) the result would still be true (a bit is set) but the value would be 0x02, since:

0b11000010 & 0b00000011 == 0b00000010

As those are the only binary bits the two values have in common. Binary is how computers really work under the hood. Basically that code is testing and masking off certain bits.
(though I want to backhand someone for all those "var for nothing" -- and note the original's mix of commas and semicolons would make everything after FIN an implicit global; a comma-separated declaration should use commas throughout)

var message = data.slice(start, end),
    FIN    = message[0] & 0x80, // is bit 7 set?
    RSV1   = message[0] & 0x40, // is bit 6 set?
    RSV2   = message[0] & 0x20, // is bit 5 set?
    RSV3   = message[0] & 0x10, // is bit 4 set?
    Opcode = message[0] & 0x0F, // isolate bits 0..3, throwing away the top 4 bits
    mask   = message[1] & 0x80, // again, is bit 7 set?
    length = message[1] & 0x7F; // isolate bits 0..6, throwing away bit 7

Which is a hell of a lot easier to say than:

var message = data.slice(start, end),
    FIN    = message[0] & 0b10000000, // is bit 7 set?
    RSV1   = message[0] & 0b01000000, // is bit 6 set?
    RSV2   = message[0] & 0b00100000, // is bit 5 set?
    RSV3   = message[0] & 0b00010000, // is bit 4 set?
    Opcode = message[0] & 0b00001111, // isolate bits 0..3, throwing away the top 4 bits
    mask   = message[1] & 0b10000000, // again, is bit 7 set?
    length = message[1] & 0b01111111; // isolate bits 0..6, throwing away bit 7

One of the big helps of hexadecimal is that each digit is a 'nybble' -- 4 bits -- so every two digits is one 8-bit byte. As such you only need to remember each digit's values:

0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = 8
1001 = 9
1010 = A (10 decimal)
1011 = B (11 decimal)
1100 = C (12 decimal)
1101 = D (13 decimal)
1110 = E (14 decimal)
1111 = F (15 decimal)

That help any?
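To see those extractions actually run, here's a self-contained sketch using a made-up two-byte header (the 0x81/0xFD values are my own example, chosen so every field comes out non-trivial -- these happen to match a WebSocket text frame, which is where this kind of code usually appears):

```javascript
// A made-up two-byte frame header to run the field extraction on.
// 0x81 = 0b10000001: top bit (FIN) set, low nybble (opcode) is 1.
// 0xFD = 0b11111101: top bit (mask) set, low 7 bits (length) are 125.
var message = [0x81, 0xFD],
    FIN    = message[0] & 0x80, // 128, truthy: final fragment
    Opcode = message[0] & 0x0F, // 1
    mask   = message[1] & 0x80, // 128, truthy: payload is masked
    length = message[1] & 0x7F; // 125

console.log(!!FIN, Opcode, !!mask, length); // true 1 true 125
```

Note that FIN comes out as 128, not 1 -- the & leaves the tested bit in place, so these values are used as truthy/falsy flags rather than booleans.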
Oh, and a side note: be thankful you never had to deal with ancient big iron -- where a 'byte' was typically 6 bits, not 8, so the entire character set was only 64 values, not 256 or more. As I think I mentioned earlier, Base64 isn't 64-bit, it's 6-bit, giving 64 possible values -- which is why octal was used on those systems, as each digit in octal is 3 bits.

...and why people trying to use Base64 for stuff on modern computers are pretty much full of more manure than Biff Tannen's 1946 Ford Super De Luxe. It's grossly inefficient and has a ridiculous amount of overhead. Calling it a 'native format' hasn't been true since DEC and Wang went the way of the dodo.

Fun stuff, the DEC SIXBIT character set (octal offsets):

     0      1  2  3  4  5  6  7
0  space    !  "  #  $  %  &  '
1    (      )  *  +  ,  -  .  /
2    0      1  2  3  4  5  6  7
3    8      9  :  ;  <  =  >  ?
4    @      A  B  C  D  E  F  G
5    H      I  J  K  L  M  N  O
6    P      Q  R  S  T  U  V  W
7    X      Y  Z  [  \  ]  ^  _

So for example the letter A is o41 / 0x21 / 33 decimal / 0b100001. Be even more thankful we're not working with even more complex systems like base 60, which is popular in navigation and geometric math... or that much of this is handled for us by high-level languages, as it can get REALLY fun when you start dealing with stuff like BCD (binary coded decimal) and arbitrary-length numbers.
Wow! Thanks for the lengthy reply. It's helpful. I really appreciate you spending your time explaining this. I read every single character you wrote but still don't fully understand. But it gives me a clue as to where I should head for further study. It's definitely like a light in the dark. Man, I never imagined JavaScript would take me this far. A thousand thanks!
Ha, I finally know what this means. Bringing an old thread back to the top just to use it in a current project.