Re: tlug: gcc question



"Stephen J. Turnbull" wrote:
> 
> >>>>> "Fredric" == Fredric Fredricson <Fredric.Fredriksson@example.com> writes:
> 
>     Fredric> It is not malloc(3) but sbrk(2) that malloc(3) use to
>     Fredric> request new pages from the kernel that matters.
> 
> No, because GNU malloc (some version of which is used in all Linux
> systems AFAIK) breaks up the memory it gets from sbrk() into
> reasonably-sized pieces.  In old GNU malloc, you only get all the raw
> memory as returned by sbrk if you are allocating more memory than the
> malloc BLOCKSIZE, which is 2048 bytes on 32 bit systems.  I don't know
> what Doug Lea malloc does, though.

Eh... I don't quite follow you here. It's probably me...

Anyway, I will try to explain what I meant by my remark. But first I
want to point out that I am not a kernel hacker (by any means), but
from running other Unixes and from trying to decipher the Linux
kernel code, this is how I understand it:

malloc(3) uses sbrk(2) (or brk(2)) to ask the kernel for more memory when
it needs it (btw, there are probably also other ways than sbrk(2) to do
this).

sbrk(2) will "allocate" the memory and return a pointer to it. But sbrk
does in effect keep a small pool of memory, since the kernel can only
assign whole 4k (2k?) pages of memory to a process. That is, sbrk will
only request another page when the last page has been used up.

The SIGSEGV signal is delivered when the MMU hardware traps an access
to a page that is not mapped for the process (or a write to a
read-only page).

If this theory is correct it should be possible to access memory 
outside the limit indicated by sbrk(2).

Consider the following C program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
  char *p ;			/* decl. ptrs */
  char *p2 ;
  p = (char*) malloc(10) ;	/* alloc 10 bytes */
  *p = 0xAA ;			/* set a mark */
  p2 = (char*) sbrk(1) ;	/* ask sbrk for 1 more byte */
  *p2 = 0xBB ;			/* set a mark */
  printf("p2: %lX\n",(unsigned long)p2) ;	/* ret val. by sbrk */
  while(1) {			/* scan memory until SIGSEGV */
    printf("%lX:%X\n",(unsigned long)p,(unsigned)(*p & 0xFF)) ;
    p++ ;
  }
  return 0;			/* never reached */
}
It prints:

p2: 804A000   <- end of world reported by sbrk(2)
8049660:AA
8049661:0
8049662:0
   [snip a lot]
8049FFD:0
8049FFE:0
8049FFF:0
804A000:BB   <- sbrk returned a pointer here
804A001:0    <- one past end of world according
804A002:0       to sbrk.
804A003:0
   [snip]
804AFFD:0
804AFFE:0
804AFFF:0    <- the "real" end of the world
Segmentation fault
-- End of output ---

There may be more than one way to interpret the result above
but my interpretation is that the process can have more memory 
assigned to it than sbrk(2) reports.

The above suggests a very simple memory model, and I assume that
Linux can do more. I have read in the man pages that there is a
library called libefence (Electric Fence) that can replace malloc.
Libefence places an inaccessible memory page after each memory
allocation (the existence of this package also suggests that
standard malloc does not do so).

> 
>     Fredric> If you allocate, say, 80 bytes using malloc and start to
>     Fredric> use the returned pointer to write to memory outside these
>     Fredric> 80 bytes you will probably corrupt malloc(3)s data
>     Fredric> structures before you try to access data outside the
>     Fredric> allocated memory for the process and get a SIGSEGV.
> 
> This actually is not true under the old GNU malloc, since it keeps its
> data structures in separately allocated memory.  Again, I don't know
> about the strategy followed by new GNU (Doug Lea) malloc, I don't have
> a copy of the source on my system at the moment.

When I compile on my SuSE 6.3 Linux and examine the heap, it sure
looks like there is some data between the allocated memory chunks that
is probably part of the malloc data structures. It looks like 5 bytes
of data. Maybe this data is redundant.

> 
> Of course, since C structures often contain pointers, and in many
> cases function pointers, you don't need to corrupt malloc internal
> data structures to generate SIGBUS and SIGSEGV errors before
> overrunning the allocated memory.
Sure. This is very true for C++ code that uses the heap a lot.

/Fredric
-------------------------------------------------------------------
Next Technical Meeting: January 14 (Fri) 19:00
* Topic: "glibc - current status and future developments"
* Guest Speaker: Ulrich Drepper (Cygnus Solutions)
* Place: Oracle Japan HQ 12F Seminar Room (New Otani Garden Court)
-------------------------------------------------------------------
more info: http://www.tlug.gr.jp        Sponsor: Global Online Japan

