Mailing List ArchiveSupport open source code!
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]Re: tlug: Re: linux cluster
- To: tlug@example.com, darren@example.com
- Subject: Re: tlug: Re: linux cluster
- From: "Manuel M. T. Chakravarty" <chak@example.com>
- Date: Fri, 16 Oct 1998 11:25:22 +0900
- Cc: sali@example.com, andy@example.com
- Content-Transfer-Encoding: 7bit
- Content-Type: Text/Plain; charset=us-ascii
- In-Reply-To: Your message of "Thu, 15 Oct 1998 12:39:26"<3.0.6.32.19981015123926.00978b40@example.com>
- References: <3.0.6.32.19981015123926.00978b40@example.com>
- Reply-To: tlug@example.com
- Sender: owner-tlug@example.com
Darren Cook <darren@example.com> wrote, > >[If] your main bottleneck is CPU <-> memory bandwidth. This > >implies two things: > > > >* Buy the biggest cache you can get > > Pentium-II seems limited to 512Kb cache. The next Intel chip is supposed to > have larger cache (and be better designed for parallel processing), but it > is going to be *much* more expensive. > > >(and write your > > application cache-friendly) > > What does this mean? If it's answered in the great link ( > http://www.mcs.anl.gov/dbpp/ ) you gave yesterday, ignore this question, as > I'm working through that site. :-) This is, I think, not covered in this book. Simplifying the problem vastly, there are three things you have to take care of: * Memory access should largely be consecutive, ie, if you process element a[i] of an array, try to process a[i+1] in the next step (or a[i-1]). This is because caches are maintained in chunks (called cache lines). So, if you access a[i], the processor will load the whole cache line, which contains a[i]. Access to neighboring elements will, therefore, hit the cache (ie, access the same cache line), which is about an order of magnitude faster than main memory access. * If you have pseudo-random access to some data structure, adapt the boundaries of your inner loops such that the processed data fits at least into the 2nd-level (better the 1st-level) cache. (This may require introducing an extra loop-level only to keep the cache happy.) * If there are multiple traversals over the same arrays and the arrays don't fit into the cache, try to introduce an extra loop nest, which processes the arrays in slices that fit into the cache (first do all traversals for the first slice, then for the second, and so forth.) There are other problems, like whether your loop fits into the instruction cache and the strategies of different cache architectures. Another nice book is ``Parallel Computer Architecture: A Hardware/Software Approach'': http://http.cs.berkeley.edu/~culler/book.alpha/index.html Unfortunately, the contents is not online. Chapter 3 (called ``Programming For Performance'') deals with the memory hierarchy in general. IMHO, the rest of the book is also worth reading if you are thinking about doing parallel computing. > >* don't use two-processor boards. > > > >The two processor will only end up competing for bus access, > >instead of doing your float calculations. > > I also read somewhere that even if the data set is small enough to fit in > the processor cache, on a dual processor machine they have to spend time > comparing the two caches to make sure the processors are not altering the > same memory. Single processors in a cluster won't have this problem. Yes, cache consistency is another problem in multi-processor machines. Modern processors try minimize the overhead by sophisticated processor design. But, I am not sure how successful this is. Cheers, Manuel --------------------------------------------------------------- Next Nomikai: 20 November, 19:30 Tengu TokyoEkiMae 03-3275-3691 Next Meeting: 12 December, 12:30 Tokyo Station Yaesu central gate --------------------------------------------------------------- Sponsor: PHT, makers of TurboLinux http://www.pht.co.jp
- References:
- tlug: Re: linux cluster
- From: Darren Cook <darren@example.com>
Home | Main Index | Thread Index
- Prev by Date: Re: tlug: TLUG Finances
- Next by Date: Re: tlug: TLUG Finances
- Prev by thread: Re: tlug: Re: linux cluster
- Next by thread: Re: tlug: linux cluster
- Index(es):
Home Page Mailing List Linux and Japan TLUG Members Links