I've implemented a custom hashing algorithm in Python and would like feedback on its security. The algorithm involves custom padding, bitwise operations, and a mix of rotation functions. My goal is to understand whether it has any known weaknesses, such as susceptibility to brute-force attacks, poor collision resistance, or problems in its internal state management.

Here's a simplified overview of how the algorithm works:
  • Salt generation: The salt is derived deterministically from the length of the input and two fixed constants.

  • Padding: The input is padded using a custom scheme that mimics SHA-style length padding: append 0x80, zero-fill, then the original message bit length (a standalone sketch of the salt and padding steps appears just after this list).

  • State initialization: The algorithm initializes an internal state with 32 words of 64 bits each, using constants.

  • Processing: The input is processed in 256-byte blocks; each block is expanded from 32 to 128 words, then mixed over many rounds using rotations, additions, and XORs.

  • Final hash: The final hash is obtained by concatenating the 32 processed state words into a 512-character hex digest.


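In case it helps to see the pre-processing in isolation, here is a minimal standalone sketch of the salt derivation and padding steps. The helper names derive_salt and pad_message are purely illustrative; in the actual code further down, both steps live inline inside ultimate_resistant_hash.

def derive_salt(text):
    # The "salt" is deterministic: a 64-bit value derived from the input
    # length and two fixed constants, so equal-length inputs share a salt.
    return ((len(text) * 0xF0E1D2C3B4A59687) ^ 0x1234567890ABCDEF) & ((1 << 64) - 1)

def pad_message(data, block_size=256):
    # Append 0x80, zero-fill until 32 bytes short of a block boundary, then
    # append the original bit length as a 32-byte big-endian integer.
    bit_length = len(data) * 8
    data += b'\x80'
    while len(data) % block_size != block_size - 32:
        data += b'\x00'
    return data + bit_length.to_bytes(32, 'big')

For short inputs (the salt string plus the message), this pads out to a single 256-byte block.
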
I am specifically concerned with the following potential weaknesses:
  • Brute-force resistance: Is my algorithm sufficiently resistant to brute-force (preimage) attacks?

  • Collision resistance: Does my algorithm exhibit any structural properties that could lead to collisions? (A small empirical smoke test is sketched at the end of this post.)


Here is the Python code for the algorithm:
def circular_left_shift(x, shift, bits=64):
    # Rotate x left by `shift` positions within a `bits`-wide word.
    return ((x << shift) & ((1 << bits) - 1)) | (x >> (bits - shift))

def circular_right_shift(x, shift, bits=64):
    # Rotate x right by `shift` positions within a `bits`-wide word.
    return (x >> shift) | ((x << (bits - shift)) & ((1 << bits) - 1))

def ultimate_resistant_hash(text, rounds=128):
    # "Salt" is deterministic: derived from the input length and two constants.
    salt = ((len(text) * 0xF0E1D2C3B4A59687) ^ 0x1234567890ABCDEF) & ((1 << 64) - 1)
    data = str(salt) + text
    data_bytes = data.encode('utf-8')

    # Padding: append 0x80, zero-fill to 32 bytes short of a block boundary,
    # then append the original bit length as a 32-byte big-endian integer.
    block_size = 256
    original_bit_length = len(data_bytes) * 8
    data_bytes += b'\x80'
    while (len(data_bytes) % block_size) != (block_size - 32):
        data_bytes += b'\x00'
    data_bytes += original_bit_length.to_bytes(32, 'big')

    # Internal state: 32 words of 64 bits each, initialised from the SHA-512
    # initial hash values and pi-derived constants; several rows are repeated
    # to fill all 32 slots.
    state = [
        0x6a09e667f3bcc908, 0xbb67ae8584caa73b, 0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
        0x510e527fade682d1, 0x9b05688c2b3e6c1f, 0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
        0x243f6a8885a308d3, 0x13198a2e03707344, 0xa4093822299f31d0, 0x082efa98ec4e6c89,
        0x452821e638d01377, 0xbe5466cf34e90c6c, 0xc0ac29b7c97c50dd, 0x3f84d5b5b5470917,
        0x452821e638d01377, 0xbe5466cf34e90c6c, 0xc0ac29b7c97c50dd, 0x3f84d5b5b5470917,
        0x6a09e667f3bcc908, 0xbb67ae8584caa73b, 0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
        0x510e527fade682d1, 0x9b05688c2b3e6c1f, 0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
        0x243f6a8885a308d3, 0x13198a2e03707344, 0xa4093822299f31d0, 0x082efa98ec4e6c89
    ]

    # Process the padded message in 256-byte blocks.
    for i in range(0, len(data_bytes), block_size):
        block = data_bytes[i:i+block_size]
        M = [int.from_bytes(block[j*8:(j+1)*8], 'big') for j in range(32)]

        # Expand the 32 message words to 128 using SHA-512-style sigma functions.
        W = M[:] + [0] * (128 - 32)
        for t in range(32, 128):
            s0 = (circular_right_shift(W[t-15], 1) ^ circular_right_shift(W[t-15], 8) ^ (W[t-15] >> 7)) & ((1 << 64) - 1)
            s1 = (circular_right_shift(W[t-2], 19) ^ circular_right_shift(W[t-2], 61) ^ (W[t-2] >> 6)) & ((1 << 64) - 1)
            W[t] = (W[t-16] + s0 + W[t-7] + s1) & ((1 << 64) - 1)

        # Mixing: `rounds` passes over all 32 state words, each combining
        # rotations, XOR, addition, and a multiply by 0x9e3779b97f4a7c15 (the
        # 64-bit golden-ratio constant), followed by rotating the whole state
        # by one word.
        for t in range(rounds):
            for j in range(32):
                a = state[j]
                b = state[(j+1) % 32]
                c_val = state[(j+3) % 32]
                d = state[(j+7) % 32]
                f_val = (circular_right_shift(a, (j+1) % 64) ^ circular_left_shift(b, (j+3) % 64)) + (c_val & d)
                m_val = W[(t + j) % 128]
                temp = (state[j] + f_val + m_val + (t * j) + ((state[(j+5) % 32] * 0x9e3779b97f4a7c15) & ((1 << 64) - 1))) & ((1 << 64) - 1)
                state[j] = temp ^ circular_left_shift(temp, (t + j) % 64)
            state = state[1:] + state[:1]

        # XOR the raw message words back into the state after mixing.
        for j in range(32):
            state[j] = state[j] ^ M[j % 32]

    # Final digest: concatenate all 32 state words as 512 hex characters.
    digest = ''.join(format(x, '016x') for x in state)
    return digest
Can anyone point out potential vulnerabilities in this algorithm or suggest improvements? (I have omitted the code that selects the input word and the libraries used for file and user-input handling, since they are not relevant to analysing the algorithm itself.)
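
If it helps reviewers, here is a minimal sketch of the kind of empirical smoke tests I had in mind, assuming ultimate_resistant_hash from the code above is in scope. These can only expose gross defects such as poor diffusion or obviously biased outputs; passing them says nothing about real preimage or collision resistance.

import random

def avalanche_smoke_test(samples=50, msg_len=32):
    # Flip one bit of a random message and count differing hex digits.
    # For an ideal 512-hex-digit output, each digit differs with probability
    # 15/16, i.e. roughly 480 of the 512 digits on average.
    random.seed(0)
    total = 0
    for _ in range(samples):
        msg = ''.join(random.choice('abcdefghijklmnopqrstuvwxyz') for _ in range(msg_len))
        flipped = msg[:-1] + chr(ord(msg[-1]) ^ 1)
        h1 = ultimate_resistant_hash(msg)
        h2 = ultimate_resistant_hash(flipped)
        total += sum(a != b for a, b in zip(h1, h2))
    print('average differing hex digits:', total / samples, 'out of 512')

def related_input_smoke_test(count=1000, prefix_len=12):
    # Hash closely related inputs and flag any pair whose digests share a
    # 12-hex-digit (48-bit) prefix; that is vanishingly unlikely for an
    # ideal hash at this sample size and would hint at structural bias.
    seen = {}
    for i in range(count):
        prefix = ultimate_resistant_hash('message-%06d' % i)[:prefix_len]
        if prefix in seen:
            print('suspicious shared prefix:', seen[prefix], i, prefix)
        seen[prefix] = i
    print('checked', count, 'related inputs')

Both tests are slow in pure Python because of the 128 mixing rounds, which is why the sample sizes are kept small.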