Help on understanding fread

oikram · June 24, 2010, 1:58pm

fread($this->socket, ($length - 4));

This will read a given $length minus 4 bytes. Is this correct?

However, if we have a file with some information like:

aaaa bbbb ccccc

Let’s suppose,
each letter = 1 byte;

the total length will be 12 bytes.

Could this drop the LAST 4 bytes, the FIRST 4 bytes, or even 4 bytes somewhere in the middle?

When we say $length - 4, how do we know that we are not dropping the last 4 bytes but, instead, the first four bytes?

Let’s say length was 12;
With the fread above we are telling: read only 8 bytes.

My question is: if it reads only 8 bytes, does it read from the beginning and stop before reaching the total length, leaving the REMAINING 4 bytes unread?
OR
does it read only what comes after the first 4 bytes? Is the file pointer placed after the first 4 bytes, so that reading starts from there?

If this doesn’t make sense, but you can more or less guess what I’m trying to say, please give me a hand.

Thanks a lot,
Márcio

oikram · June 24, 2010, 5:10pm

ok.

9-4 = 5; ok.

Ok. Borrowing ScallioXTX’s representation scheme:
0009abcde
123456789

where the numbers from 1 to 9 represent the file pointer position. Is that right?

abcde
12345

5-4=1; ok.

ok. So the remaining chars bcde will not appear as well.

So it’s not a question of removing the first and the last ones is it?
It’s a question of reading a given length based on a previous position, and that’s all. yes?

ok. If I wanted to send a number containing the size of the packet to be received, and the data that THAT packet includes, we will need a way to distinguish the data from that “meta” information. Is that it?

ok.

9000
1234 -> four bytes long. or should we say: 0123 -> 4 bytes long?

Ok. And this is called a left-zero-padded 4 byte number. Yes?

And this is the conversion of that 4 byte number into a binary sequence?

Treat as a number not as char. ok. But this already depends on the arguments we use on our unpack function right?

ok.

Can be:

or

The computer receives some binaries and it needs to interpret those binaries using some rules. Ok.

Yes. According to the documentation that’s what it states.

We have 4 bytes. 4 bytes = 32 bits.
ok.

Ok.

Can you elaborate a bit more here… just a bit.

I may get crazy but that means that 9123 means that we are dealing with 9123 and not 3219 ?

Unsigned means no negative?

Why the biggest?

And I’m almost there I hope.

Thanks a ZILLION!
Márcio

oikram · June 25, 2010, 10:46am

Back to the main point here: can I please ask you to read the comments in the code below and tell me if they are correct?
I believe I’m missing a link between the fread and the XML.

I’m unable to properly understand what I can use in order to get from a binary string to a string containing our XML data. Any help for solving it?

I believe I’m doing too many steps to get the length, even though PHP has such nice functions.
We are unpacking and then putting the result into an absolutely unnecessary $unpacked variable whose only purpose is to give us a length…
Can this be shortened?

Hope you can give me a hand, again.
Márcio

The code:

/**
     *
     * @return SimpleXMLElement 
     */
    public function getFrame()
    {
        //this will read the first 4x8 bits (e.g. 00000000 00000000 00000000 00000000) and store them in memory.
        $binaryStringPackOfFirstFourBytes = fread($this->_filePointer, 4);

        //this will take our binaryStringPack (sent in Big Endian by the server) and unpack it into an associative array.
        $unpacked = unpack('Nlength', $binaryStringPackOfFirstFourBytes);

        //this will grab our length from the 'length' key of that array.
        $length = $unpacked['length'];

        //read the remainder of our stream (minus the 4 bytes that were already read by the first fread call).
        $remainBinaryPack = fread($this->_filePointer, ($length - 4));

        //can we use stream_get_contents($this->_filePointer, ($length - 4)) instead?
        //will it be aware of the first 4 bytes that were read by the first fread?
        //why do I want to use it? So that I can have the remaining binary data as a string that, I hope,
        //holds my XML.

        //take the XML returned by the server and let us work with it in an object-oriented way.
        $xml = new SimpleXMLElement($remainBinaryPack);

        return $xml;
    }

Thanks in advance.
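
On the "can this be shortened?" question, here is one possible sketch, not the class's actual code. The helper names readExactly() and readFrame() are hypothetical, the in-memory stream only stands in for the real socket, and the XML payload is made up. It assumes, as stated elsewhere in the thread, that the 4-byte length counts itself. Because fread on a socket may return fewer bytes than requested, the sketch loops until it has them all.

```php
<?php
// Hypothetical helper: keep reading until exactly $count bytes have arrived,
// or return false if the stream ends first.
function readExactly($stream, $count)
{
    $data = '';
    while (strlen($data) < $count) {
        $chunk = fread($stream, $count - strlen($data));
        if ($chunk === false || $chunk === '') {
            return false; // closed or timed out before $count bytes arrived
        }
        $data .= $chunk;
    }
    return $data;
}

// Hypothetical helper: read one length-prefixed frame and parse its XML.
function readFrame($stream)
{
    $header = readExactly($stream, 4);
    if ($header === false) {
        return false;
    }
    // unpack('N', ...) returns array(1 => value); list() skips index 0.
    list(, $length) = unpack('N', $header);
    $payload = readExactly($stream, $length - 4); // assumption: length includes the header
    return $payload === false ? false : new SimpleXMLElement($payload);
}

// Demo on an in-memory stream instead of a socket.
$xml = '<epp><greeting/></epp>'; // made-up payload
$stream = fopen('php://memory', 'w+b');
fwrite($stream, pack('N', strlen($xml) + 4) . $xml);
rewind($stream);

$frame = readFrame($stream);
echo $frame->getName(), "\n";
```

PHP strings are plain byte strings, so what fread returns is already a string that SimpleXMLElement accepts; no extra conversion step is needed. stream_get_contents($stream, $length - 4) also reads from the current pointer position, so it would skip the 4 header bytes that were already read.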

StarLion · June 24, 2010, 2:40pm

The difference being that when you made your opening post, you specifically stated that length was 12; the length of the string you posted, however, was not 12. Which is why Scallio’s response doesn’t look the way you thought it would.

This function should return everything but the first 4 bytes (the size indicator) and the last 4 bytes (which… i dunno what’s in the last four bytes. Some sort of packet-terminator, i’m guessing)

oikram · June 25, 2010, 9:39am

First of all, thanks a lot for your reply.
Unfortunately for me, math is not my strong suit, nor is this a kids’ math forum.
Still, I do appreciate your explanations even if I cannot get them all at the moment.

Ok.

Ok. Since we are telling it that it’s a 32 bit number, it cannot be four 8-bit ones, even if they are read back to back. They are different things, and the machine needs to be told that difference as well.

Ok. But where does the 255 255 255 255 come in?

So (see how I miss the basic, and how I really want to thank you for your patience and time here?), the largest 8-bit number we may have is:
255 ? Don’t we miss a number?

4x8bit = 255 255 255 255 ?

What if we had: 255255255255 ?

I will study some binary stuff and leave you alone. np.

So the base 2, is the binary base. 0 and 1.
Where does the 32 maximum come from?
And -1?

(again, I believe a binary class will help). Np.

Ok.

So it’s something like this that happens when we see the overflow errors… very nice.

I can’t say I got it all, but I got enough.

Thanks a lot for your time here,
There are 10 types of persons… I’m not one of them yet.

Márcio

StarLion · June 24, 2010, 3:33pm

(I apologize ahead of time for the wordiness of this post, i’m trying to explain, but i tend to get over-complicated)

Let me give an example.
Assume i wanted to pass the string “abcde”. Length 5.
This string would be encoded either as 0009abcde (the ‘0009’ included in the length) or 0005abcde (the ‘0005’ excluded from the length).

In case 1 (Included), $length gets set as 9. fread(socket,$length-4) = fread(socket,5), which would read 5 characters from the pointer (which is now at character 5, the a), and would return abcde.

In case 2 (Excluded), $length gets set as 5. fread(socket,$length-4) = fread(socket,1), which would read 1 character from the pointer, and would return a.
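
A minimal sketch of the two cases, assuming a binary 4-byte header (as in the unpack('N', ...) code from this thread) rather than the ASCII digits used in the notation above. The streamFrom() helper is hypothetical and only builds an in-memory stream so no socket is needed.

```php
<?php
// Hypothetical helper: an in-memory stream preloaded with $bytes.
function streamFrom(string $bytes)
{
    $stream = fopen('php://memory', 'w+b');
    fwrite($stream, $bytes);
    rewind($stream);
    return $stream;
}

// Case 1 (Included): the header counts itself, 4 + 5 = 9.
$included = streamFrom(pack('N', 9) . 'abcde');
$length = unpack('N', fread($included, 4))[1];
$case1 = fread($included, $length - 4); // asks for 5 bytes

// Case 2 (Excluded): the header counts only the payload, 5.
$excluded = streamFrom(pack('N', 5) . 'abcde');
$length = unpack('N', fread($excluded, 4))[1];
$case2 = fread($excluded, $length - 4); // asks for only 1 byte

var_dump($case1); // string(5) "abcde"
var_dump($case2); // string(1) "a"
```

So subtracting 4 is only right when the sender's length field includes the header itself.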

Because a packet is designed to have a certain structure (first 4 bytes are the size). Think about this: If you wanted to send a number as the content of a packet, how does the computer know where the size ends and the data begins? Fixed sizes.

So, i know my number has to be 4 bytes long. Well, I can’t represent 9 as 9000… because that’s a different number. So I have to represent it as 0009. (00000000 00000000 00000000 00001001).
The unpack says to PHP: Treat this binary string as a large number, not as 4 characters.
To PHP, a binary string is a binary string. PHP has no concept of ‘this is a size number’. So…
11111111 11111111 11111111 11111111
Did i just give you a really big positive number? -1? 255 255 255 255? Some control character 4 times (ASCII 255?)
The Pack type tells PHP what it is.
N tells PHP it is an Unsigned Big Endian Long Int (32 Bit).
Well, we have 32 bits, so that’s a good start. Long Int tells us it’s a number, so throw away the ASCII table, and throw away 255 255 255 255 too (because it’s a 32-bit long, it’s all one number). Big-Endian tells me that the bytes are in descending order of power, so the rightmost byte is the lowest power, etc. (The opposite of this, Little Endian, reverses the bytes and counts ascendingly.) So now we know which way around our number is. Unsigned tells me that it’s not -1. So what i’ve given is a really big (in fact, the biggest) 32-bit positive number, and now PHP understands that.
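
To make the "same bits, different readings" point concrete, here is a small sketch using PHP's built-in unpack() with a few format codes. It assumes a 64-bit PHP build; on a 32-bit build the 'N' result would not fit in a positive native integer.

```php
<?php
$allOnes = "\xFF\xFF\xFF\xFF"; // 32 bits, all set to 1

$asUnsigned  = unpack('N', $allOnes)[1];                // one unsigned 32-bit number, big endian
$asSigned    = unpack('l', $allOnes)[1];                // one signed 32-bit number
$asFourBytes = array_values(unpack('C4', $allOnes));    // four separate unsigned bytes

var_dump($asUnsigned);  // int(4294967295)
var_dump($asSigned);    // int(-1)
var_dump($asFourBytes); // four times int(255)

// Byte order matters as soon as the bytes differ from each other.
$nine = "\x00\x00\x00\x09";
$bigEndian    = unpack('N', $nine)[1]; // 9
$littleEndian = unpack('V', $nine)[1]; // 150994944, the 09 byte is now the highest power
```

The format code is the only thing that changes between these calls; the bytes themselves stay the same.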

DarthGuido · June 24, 2010, 2:54pm

No, the - 4 is just to eliminate the first 4 bytes from the length, since you already read those in the first fread

DarthGuido · June 24, 2010, 2:37pm

Why are you convinced? Because you know the content of the stream, and saw the output, right?


The first fread reads the first 4 characters. The second fread continues at character number 5. It doesn’t automatically start at character number 1 again.
So yes, in this case the returned value will be the stream minus the first 4 characters.

oikram · June 24, 2010, 2:52pm

My bad then. I was trying to simplify the question without getting too specific, in order to understand it better.

last four bytes? We are doing nothing with the last four bytes right?

oikram · June 24, 2010, 2:29pm

I’m amazed by this community’s knowledge. No matter what I ask, those 5 or 10 gurus around… they must have been PHP writers at some point and I don’t know about it?

Back to the question:
Big Endian and Little Endian don’t play a role here?

I was really convinced that, on this script: (let’s forget about the @, I believe today we can do it without it), we were retrieving ALL of a given stream, except the first 4 bytes.

Am I wrong?

This script is under GNU btw.


function getFrame()
{
    if (@feof($this->socket)) return new PEAR_Error('connection closed by remote server');
    $hdr = @fread($this->socket, 4);

    $unpacked = unpack('N', $hdr);

    // unpack() is PHP's built-in function (not a PEAR one); an unnamed format puts the value at index 1
    $length = $unpacked[1];

    return fread($this->socket, ($length - 4));
}

$this->socket has this value:

$this->socket = @fsockopen($target, $errno, $errstr, $timeout)

Thank you,
Márcio

StarLion · June 24, 2010, 2:51pm

Through reading the format of the bytes as a long int (“N”, packtype), yes.

How do we know that it doesn’t start from the beginning again? Each time we do a fread it starts from the last place… can you or anyone else, elaborate a little bit more. (I do accept link resources :D)

Fread advances an internal pointer on the stream.
Think of fread like an old-timey record player. If you put the needle at the start of the record, it starts playing at the beginning. If you stop the record player for a while (finish the fread), and then start it again (another fread command), it doesn’t start over from the beginning again.
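
The record-player behaviour can be checked with a short sketch. It uses an in-memory stream as a stand-in, since a real socket cannot be rewound; ftell() reports a zero-based offset.

```php
<?php
$stream = fopen('php://memory', 'w+b');
fwrite($stream, '0009abcde');
rewind($stream);                    // needle at the start of the record

$first  = fread($stream, 4);        // "0009"
$offset = ftell($stream);           // 4: the next byte to be read is the 5th one
$second = fread($stream, 5);        // "abcde", continues where the first read stopped

rewind($stream);                    // only an explicit rewind/seek moves the needle back
$again  = fread($stream, 4);        // "0009" once more

var_dump($first, $offset, $second, $again);
```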

StarLion · June 24, 2010, 7:20pm

Correct. The pointer is pointing at position 5, which is the a. It is the next character that will be read.

ok. If I wanted to send a number containing the size of the packet to be received, and the data that THAT packet includes, we will need a way to distinguish the data from that “meta” information. Is that it?

Precisely.

9000
1234 -> four bytes long. or should we say: 0123 -> 4 bytes long?

technically it’s 0123. Pointer starts at 0.

Ok. And this is called a left-zero-padded 4 byte number. Yes?

Correct.

Treat as a number not as char. ok. But this already depends on the arguments we use on our unpack function right?

Before it’s unpacked, this is just a stream of 0’s and 1’s.

Can you elaborate a bit more here… just a bit.

Before you specify the length of the number, the processor doesn’t know if this is four 8-bit numbers to be read back to back, or one 32-bit number.

I may get crazy but that means that 9123 means that we are dealing with 9123 and not 3219 ?

Basically put. This just tells the processor in what order the bytes are: whether to read them left-to-right or right-to-left.

Unsigned means no negative?

Correct. Unsigned means the numbers run from 0 … 2^32 - 1. If it was signed, they would run from -(2^31)…2^31-1.

Why the biggest?

Binary math.
1+1 = 10
1111 (4 bits, all 1s) + 1 = 1 0000 (5 bits)
11111111 11111111 11111111 11111111 (32 bits, all 1’s) + 1 = 1 00000000 00000000 00000000 00000000 (33 bits). But this is a 32 bit number only, so the extra bit does not fit (overflow), and the 32 bit number would be 0. (Counting from the right, remember.)
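
The same arithmetic as a hedged sketch in PHP. It assumes a 64-bit PHP build, where native integers are wider than 32 bits, so the 32-bit wrap-around has to be simulated by masking off everything above the low 32 bits.

```php
<?php
$maxUnsigned32 = 0xFFFFFFFF;              // 32 bits, all 1s: 2^32 - 1
$maxSigned32   = 0x7FFFFFFF;              // 2^31 - 1, the top bit is kept for the sign
$plusOne       = $maxUnsigned32 + 1;      // needs a 33rd bit
$wrapped       = $plusOne & 0xFFFFFFFF;   // keep only the low 32 bits

echo decbin($maxUnsigned32), "\n";        // 32 ones
echo decbin($plusOne), "\n";              // a 1 followed by 32 zeros
echo $wrapped, "\n";                      // 0
```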

rpkamp · June 24, 2010, 2:10pm

If $length = 12, the string is “aaaa bbbb ccccc” and you read $length - 4, you read the first 12-4=8 characters, thus


a a a a _ b b b | b _  c  c  c  c  c
1 2 3 4 5 6 7 8 | 9 10 11 12 13 14 15

You will get the part to the left of the bar, not the part to the right (_ stands for a space).

(assuming 1 byte = 1 character, so this doesn’t hold for UTF-16 for example)
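
A quick way to check this, sketched with an in-memory stream in place of a real file or socket:

```php
<?php
$stream = fopen('php://memory', 'w+b');
fwrite($stream, 'aaaa bbbb ccccc');
rewind($stream);

$length = 12;
$read = fread($stream, $length - 4);   // the first 8 bytes, counted from the current position
$rest = stream_get_contents($stream);  // whatever is left after the pointer

var_dump($read); // string(8) "aaaa bbb"
var_dump($rest); // string(7) "b ccccc"
```

Nothing is removed from the stream; fread simply stops after 8 bytes and the remainder stays available for the next read.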

DarthGuido · June 24, 2010, 2:51pm

fread (PHP manual page)
Look also at the second note there.

oikram · June 24, 2010, 2:48pm

No. I can’t test it.
I’m convinced because the documentation of the EPP server that I’m working with says that the first 4 bytes hold the length and that, after them, we get the juicy part (the XML) that we need to work with.

The class that I found on the internet by googling was written to deal with another EPP server. That’s why I was convinced.

Yes, and with that, we get the length.

Ohhhh… That I didn’t know.
How do we know that it doesn’t start from the beginning again? Each time we do a fread it starts from the last place… can you or anyone else, elaborate a little bit more. (I do accept link resources :D)

Thanks again,
Márcio

oikram · June 24, 2010, 3:10pm

The documentation defines it that way: the length should be the length of the XML instruction plus the first 4 bytes.

I’m not getting how we pass from the EXCLUDES to the “last four”.

Ahahahaa aaaha :sick::injured:
I have to do some (A LOT of) reading now, in order to follow your answer.

But I will… of course. Thank you all, I will try to understand this.

Márcio

oikram · June 24, 2010, 3:02pm

Yup. It was there indeed. Thanks for pointing it out.

DarthGuido · June 24, 2010, 3:01pm

True. A test would give the answer

StarLion · June 24, 2010, 2:59pm

Actually Guido’s right, though we’re making an assumption here.
If the byte-count in the first 4 bytes INCLUDES said 4 bytes, then you’re getting everything in the string except the first four.
If the byte-count in the first 4 bytes EXCLUDES said 4 bytes, then you’re getting everything except the first four and the last four.

Oh… so it’s the N pack type that allows us to take the binary data and get back an INT that gives us the length. (?)
(please don’t give up on me )

Correct; Because the byte count is a left-zero-padded 4-byte number in a binary string, they use the N Unpack type to convert it to a non-zerofilled-integer.
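
A small round-trip sketch of that conversion with PHP's built-in pack() and unpack(). One caveat worth hedging: the "0009" written in this thread is a notation for four bytes holding the number 9; the four ASCII characters "0009" are different bytes and unpack to a different number.

```php
<?php
$header = pack('N', 9);                 // 4 bytes: 00 00 00 09
echo bin2hex($header), "\n";            // 00000009

$length = unpack('N', $header)[1];      // unnamed format: value lands at index 1
$named  = unpack('Nlength', $header);   // named format: value lands at key 'length'

var_dump($length);                      // int(9)
var_dump($named['length']);             // int(9)

// The ASCII text "0009" is the bytes 30 30 30 39, not 00 00 00 09.
$asciiDigits = unpack('N', '0009')[1];
var_dump($asciiDigits);                 // int(808464441)
```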

oikram · June 24, 2010, 2:58pm

Oh… so it’s the N pack type that allows us to take the binary data and get back an INT that gives us the length. (?)
(please don’t give up on me )

I see… Nice then. Thanks for the clarification.

Guido despite all my readings of fread php page, I was unable to get this specific information. And trust me, that page is open now. That and fwrite and stream_get_contents… and… pack and unpack oh well…
