+ 6
Why this infinite recursion on Sololearn?
Hi, I was planning to implement a buffer overrun, but I came across this weird infinite recursion. Lines 12-14 are not relevant here, but if I don't have them I get a 'segmentation fault' on SL. I ran the code on two other online C compilers, with and without lines 12-14, and the program ran correctly (it printed only one 'hello world!' and exited). Could anyone explain why I get this infinite recursion on SL? Is it an SL bug? Thank you. https://code.sololearn.com/c6KFEMC2kRUv
13 Answers
+ 4
Doesn't answer your question, but sharing my findings here:
The main problem is that the difference between the addresses of main() and f() is 21 (0x15), but you're subtracting 17 (0x11) from the address of main(). See this:
https://code.sololearn.com/cOrT9ke56jMf/?ref=app
Changing 0x11 to 0x15 makes it work fine
https://code.sololearn.com/cX2f97BzBABf/?ref=app
Also, printing a message anywhere before calling (main - 0x11) will result in the message being repeated along with "hello world"
https://code.sololearn.com/cIBj87BnjjYk/?ref=app
But printing a message after calling (main - 0x11) will result in the message not being printed at all
https://code.sololearn.com/cYb7PITk6L0j/?ref=app
This shows that not only f(), but even the statements in main() are being executed repeatedly.
I don't really have an answer to *why* this is happening though. I tried looking at the assembly
https://code.sololearn.com/caYwwWAg2yyX/?ref=app
But still couldn't reach an answer.
EDIT: sorry, messed up the order of the code links. Fixed
+ 1
This is probably compiler-dependent output, as no programmer will ever intend to overflow a buffer in their programs (except maybe hackers).
Sololearn uses gcc 10.2 for C and C++. The two online compilers probably do not use gcc, or use gcc but a different version (if that matters).
+ 1
Steve
The cast to long long is not the problem here. The problem is that OP is subtracting 0x11 (17) from the address of main(), even though the difference between the addresses of main() and f() is 0x15 (21).
It still doesn't explain the repeated output. In the assembly code I mentioned in my previous answer, the 'ret' instruction on line 21 (of the output) is supposed to take the execution back to where the function was called. But here it seems like execution just skips past that and continues to the next instruction.
+ 1
Steve
sorry for the late reply.
"... and results in returning to main at the point where it calls f()... "
That's what I thought, but as I said in my first answer, if you print something anywhere in main() before the call to f(), it will also be printed repeatedly. This means that ALL statements in main() are being executed. Try changing the position of the print statement in this code
https://code.sololearn.com/cIBj87BnjjYk/?ref=app
Also, the 'ret' instruction used for returning from procedures jumps to the stored address of the instruction after the call, which is saved on the stack when the function is called (not sure if it's the same for every platform). So it doesn't matter if you jump to the wrong place in the function, you will return to the correct place.
+ 1
Steve
[Continued]
I could be wrong about the things I've said above as I only have basic knowledge in low-level stuff. To me it seems as if the return statement in f() is not working at all and execution just continues to main(), but I don't know how that could happen.
"...the procedure prolog stuff.... is skipped... procedure epilog runs and improperly cleans up the stack"
This makes sense. Could it be that the improper cleaning of the stack also cleans the address of the instruction where f() was called, which results in the failing of the 'ret' instruction?
+ 1
XXX Great observations!
With respect to my statement about returning to main() a little earlier in the function execution, I will say that all of the statements in main() before the printf() are just declarations and initializations, which occur at compile time. So moving the printf() "up" doesn't actually change its position in the address space.
So I then added some new executable statements, thinking I'd prove my point, but they actually prove yours. 😅
+ 1
XXX
I then tried playing with the 0x11 offset, and found that it dumps core for most other values - but causes the same behavior for 0x13 and 0x14 as for 0x11. Not what I expected either.
Now I wish Sololearn provided a debugger, so I could look at the stack and determine what's really going on.
0
Well, the code I posted has nothing to do with buffer overflow. I'm pretty sure that the two compilers I used were GCC, as they stated that on their websites. In addition, the code has nothing to do with compiler implementation, as I tested this on my local computers with GCC and MSVC. I was trying to find out why it happened only here, but thanks for your reply.
0
What did you try to do? Can you explain your code please?
0
Hi, I don't know if the code has much to explain. Lines 12-14 are just there so that the program won't crash. Then line 15 simply calls function 'f()' with some pointer arithmetic and casting. The offset value '0x11' is valid for this particular code and for SL. If you want to run this somewhere else, you need to calculate the new offset. That's all, I guess.
I thought Paul or Martin could answer this since I noticed they seem to have more experience with this platform and C in particular. But this code didn't reach them yet.
0
lona Why were you casting main to type (long long)? I'm not sure but I suspect that may have had something to do with the recursion you were experiencing. Here's a revised version that worked for me:
https://code.sololearn.com/cywPTVRpFqDY/?ref=app
0
Maybe it is returning to an earlier position in main() than where you called f() from. And the difference might come from a difference in the size of long long on the platforms.
But I'm guessing.
0
XXX You're right. So I think the issue has to do with stack management. Here's my theory:
When the bad address causes a jump into the middle of function f(), the procedure prolog stuff at the beginning of the function is skipped. Then, when the return happens in f(), the procedure epilog runs and improperly cleans up the stack, which results in returning to main at the point where it calls f(), instead of the point after.