Results 1 to 4 of 4

Thread: 64 bit numbers and 64 bit processors

  1. #1
    Junior Member
    Join Date
    Sep 2004

    64 bit numbers and 64 bit processors

    Im attempting to optimize my program. The whole goal of it is to crunch large 200-1000 digit numbers. While i did find an earler post about handling such data, im doing this with my own Big Number class. For simplicity i stored the numbers in an array. ( i could do a link list, but lets not get into that). Now the way its set up is such:

    Take the number 12345, and in the array, its [1][2][3][4][5]. So each digit is stored as a short. so its a short array. Now, if i recall, and this is where you come in, a short is 8 bits long, so if im using a 32 bit proccessor, when i do an operation on one cell of the array will i be only utilizing the 1/4 of the proccessor? And now that i have a 64 bit proccessor... will i have to do something like:
    __int64 BigNumber[arrayLenght];
    //such that 1234567890 => [12345][67890]
    //assuming 12345 is the largest number __int64 can hold.
    Then another problem comes in, do i have to compile the program with a 64 bit compiler and not a 32 bit? Anyhelp/insight is apreciated. The whole goal is to cruch as much data in one proccess.

    (P.C. Load letter?! What the F**K does that mean? ~office space)

  2. #2
    Senior Member
    Join Date
    Mar 2004

    It is possible to gain factors 100 (!) by choosing an optimal data structure and an
    optimal order to perform the calculations, without even touching anything processor
    specific. So this is what I recommend you - do a trial and error brute-force benchmarking .

    Nevertheless, a few notes on different optimisation and processors.


    Simply spoken, a processor can perform operations on a number of bits within one clock
    cycle. A 32bit processor can handle 32bits, a 64bit processor 64bits. Besides this,
    there are other difference between the two architectures, like memory allocation but
    also different setup of the elements on the chip itself.

    If you want to perform operations on 64bit structures, it takes usually longer than
    two clock cycles on a 32bit processor, while it can be done within one clock cycle
    on a 64bit. This is just one difference.

    Note: If you want to optimise your library on that level, you will require assembler
    for most parts. There are only a few simple tricks (like the correct order of the loops, that
    allows for optimal pipelining (see below)) in a high-level programming language.

    Example: If you decide to use linked lists, forget about performance issues dealing with
    8bit, 32bit numbers etc: This will then be negligible! Your current choice of these
    arrays sound a bit suboptimal to me. I recommend to look at open-source big number
    solutions and tricks they used to optimise their approach.


    8 bits (...) will i be only utilizing the 1/4 of the proccessor?
    Hm, yes, if you wish so. You spend 1 clock cycle to perform operations on 8bit structures,
    while the processor would be able to do 32bit, spoiling 24bits. Nevertheless, this can
    be optimised by the compiler in principle, if you have coded appropriate.

    But that's not the real issue for code optimisation:

    - even older processors (PIII) with MMX support are capable to perform an operation on 128bit
    within one clock cycle (SSE instruction set[1])

    - the real issue is pipelining[2] with scalar processors. Vector processors (Cray etc) have to
    be handled differently. Note: The real speed gain on Cray computers is not due to their
    processor, but to the way they handle memory: 64, 128, ...1024 memory banks
    compared to 2-3 on standard machine setups (latency hiding).

    - in order to get enhanced performance on a 64bit processor, like Opteron, you have to
    pursue different strategies as well (make use of SSE3)[3]

    For your project: Try to learn from what has been done already. Think about choices
    of your data structures (what is really needed? how can I optimise pipelining?). Then,
    you will have a fast program already. Do not really worry about processor internals.


    [1] https://shale.intel.com/SoftwareColl...sp?courseID=23
    [2] http://www.cs.iastate.edu/~prabhu/Tu...ipe_title.html
    [3] http://www.amd.com/us-en/assets/content_type/ white_papers_and_tech_docs/25112.PDF
    If the only tool you have is a hammer, you tend to see every problem as a nail.
    (Abraham Maslow, Psychologist, 1908-70)

  3. #3
    Junior Member
    Join Date
    Sep 2004
    Wow, thank you for the indepth answer, I will look into the open source alternatives as you have suggested, and the recomeneded links you have posted. Again thanks!

  4. #4
    Senior Member
    Join Date
    Mar 2004

    You are welcome. Well, the links I provided are illustrating what I said. In order to
    learn optimization, start with a google search like
    "code optimization c c++ pipelining examples filetype:pdf".
    Eg. look at [1], and continue reading from there. Try to understand optimisation by
    looking at simple examples, and then apply it to your specific problem.

    The problem with optimisation is, that there are some general concepts, but high-end
    optimisation is very specific to the task to be performed by the program. In my previous
    post, I have choosen some point of view for illustration (that's why I am not very happy
    with it). But I hope, it helps you to get the idea.

    Good luck.


    [1] http://www.azillionmonkeys.com/qed/optimize.html
    If the only tool you have is a hammer, you tend to see every problem as a nail.
    (Abraham Maslow, Psychologist, 1908-70)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts