I needed an hash function, in Java, which is simple, practically less collision and provides a short alphanumeric or an unsigned number as hash for any string or text. I stumbled upon few blog posts that explain how Oracle SQL_ID and HASH_VALUE are calculated by database engine, concisely and elaborately; also implementations in Python and PL/SQL to compute SQL_ID directly from text.
In short, Oracle passes SQL text with null terminator to standard MD5 hash function for calculating SQL_ID, HASH_VALUE and SIGNATURES—both EXACT MATCHING and FORCE MATCHING.
- SQL_ID is an alphanumeric representation of the last 8 bytes of MD5 hash; each 4 bytes in reverse order translated to lower-case alphabet and numbers.
- HASH_VALUE is an integer representation of the last 4 bytes of MD5 hash in reverse order.
- EXACT_MATCHING_SIGNATURE is an integer representation of the last 8 bytes of MD5 hash of standardized SQL text without a null terminator; each 4 bytes in reverse order.
- FORCE_MATCHING_SIGNATURE is an integer representation of the last 8 bytes of MD5 hash of bounded SQL text without a null terminator; each 4 bytes in reverse order
With this, I have enough information to build a utility class in Java to use in my toy project.
- Execute a sample statement; obtain SQL_ID, HASH_VALUE and etc. from database:
SELECT 'Ram' ram_stmt FROM dual;
RAM
---
Ram
1 row selected.
SELECT /* RAM_VSQL */
'SQL_TEXT : ' || v.sql_text || CHR(10) ||
'SQL_ID : ' || v.sql_id || CHR(10) ||
'HASH_VALUE : ' || v.hash_value || CHR(10) ||
'HASH_VALUE_HEX : ' || TO_CHAR (v.hash_value, 'FMXXXXXXXX') || CHR(10) ||
'EXACT_MATCHING_SIGNATURE : ' || v.exact_matching_signature || CHR(10) ||
'EXACT_MATCHING_SIGNATURE_HEX : ' || TO_CHAR (exact_matching_signature, 'FMXXXXXXXXXXXXXXXX') || CHR(10) ||
'FORCE_MATCHING_SIGNATURE : ' || v.force_matching_signature || CHR(10) ||
'FORCE_MATCHING_SIGNATURE_HEX : ' || TO_CHAR (force_matching_signature, 'FMXXXXXXXXXXXXXXXX') || CHR(10) AS stmt_information
FROM v$sql v
WHERE v.sql_text LIKE '%ram_stmt FROM dual' AND v.sql_text NOT LIKE '%RAM_VSQL%';
STMT_INFORMATION
---------------------------------------------------------------------------
SQL_TEXT : SELECT 'Ram' ram_stmt FROM dual
SQL_ID : aqth16g98h2jd
HASH_VALUE : 3532130861
HASH_VALUE_HEX : D2880A2D
EXACT_MATCHING_SIGNATURE : 4178266890746386855
EXACT_MATCHING_SIGNATURE_HEX : 39FC2F9987B6D9A7
FORCE_MATCHING_SIGNATURE : 16194980974160721469
FORCE_MATCHING_SIGNATURE_HEX : E0C021642D0F363D
1 row selected.
- Utility class, StmtUtil, to compute SQL_ID and HASH_VALUE from SQL Text:
import java.io.UnsupportedEncodingException;
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
public class StmtUtil {
public static String computeSQLIdFromText(String stmt) throws UnsupportedEncodingException, NoSuchAlgorithmException {
/* Example
* Statement: SELECT 'Ram' ram_stmt FROM dual
* MD5 Hash with null terminator: 93bc072570a58c1f330166ab2d0a88d2
* Oracle SQL_ID: aqth16g98h2jd
* Address: 00000000826C6060
* Hash Value: 3532130861
* Hash Value Hex: D2880A2D
* MD5 Hash without null terminator: 93bc072570a58c1f330166ab2d0a88d2
* Exact Matching Signature: 4178266890746386855
* Exact Matching Signature Hex: 39FC2F9987B6D9A7
* MD5 Hash without null terminator for bound statement:
* Force Matching Signature: 16194980974160721469
* Force Matching Signature Hex: E0C021642D0F363D
*/
// get bytes of the string supplied
byte[] bytesOfStatement = stmt.trim().getBytes("UTF-8");
byte[] bytesOfStatementWithNull = new byte[bytesOfStatement.length + 1]; // last bucket used for Null Terminator
byte nullCharByte = 0x00; // get bytes of null terminator
// Null terminator is lost or has no effect (Unicode \\u0000, \\U0000 or hex 0x00 or "\0")
// on MD5 digest when you append to String
// Hence add the byte value as a last element after cloning
System.arraycopy(bytesOfStatement, 0, bytesOfStatementWithNull, 0, bytesOfStatement.length); // clone the array
bytesOfStatementWithNull[bytesOfStatement.length]=nullCharByte; // add null terminator as last element
// get the MD5 digest
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] bytesFromMD5Digest = md.digest(bytesOfStatementWithNull);
// convert the byte to hex format
// and generate MD5 hash
StringBuffer sbMD5HashHex32 = new StringBuffer();
for (int i = 0; i < bytesFromMD5Digest.length; i++) {
sbMD5HashHex32.append(Integer.toString((bytesFromMD5Digest[i] & 0xff) + 0x100, 16).substring(1));
}
// Obtain MD5 hash, split and reverse Q3 and Q4
// Get 32 Hex Characters
String strHex32Hash = sbMD5HashHex32.toString();
// Reverse order of each of the 4 pairs of hex characters in Q3 and Q4
// Q3 reverse
String strHexQ3Reverse = strHex32Hash.substring(22, 24) + strHex32Hash.substring(20, 22)
+ strHex32Hash.substring(18, 20) + strHex32Hash.substring(16, 18);
// Q4 reverse
String strHexQ4Reverse = strHex32Hash.substring(30, 32) + strHex32Hash.substring(28, 30)
+ strHex32Hash.substring(26, 28) + strHex32Hash.substring(24, 26);
// assemble lower 16 with reversed q3 and q4
String strHexLower16Assembly = strHexQ3Reverse + strHexQ4Reverse;
// convert to 64 bits of binary data
String strBinary16Assembly = (new BigInteger(strHexLower16Assembly, 16).toString(2));
// Left pad with zeros, if the length is less than 64
StringBuffer sbBinary16mAssembly = new StringBuffer();
for (int toLefPad = 64 - strBinary16Assembly.length(); toLefPad > 0; toLefPad--) {
sbBinary16mAssembly.append('0');
}
sbBinary16mAssembly.append(strBinary16Assembly);
strBinary16Assembly = sbBinary16mAssembly.toString();
// Split the 64 bit string into 13 pieces
// First split as 4 bit and remaining as 5 bits
String alphabet = "0123456789abcdfghjkmnpqrstuvwxyz";
String strPartBindaySplit;
int intIndexOnAlphabet;
StringBuffer sbSQLId = new StringBuffer();
for (int i = 0; i < 13; i++) {
// Split binary string into 13 pieces
strPartBindaySplit = (i == 0 ? strBinary16Assembly.substring(0, 4)
: (i < 12 ? strBinary16Assembly.substring((i * 5) - 1, (i * 5) + 4)
: strBinary16Assembly.substring((i * 5) - 1)));
// Index position on Alphabet
intIndexOnAlphabet = Integer.parseInt(strPartBindaySplit, 2);
// Stick 13 characters
sbSQLId.append(alphabet.charAt(intIndexOnAlphabet));
}
return sbSQLId.toString();
}
public static BigInteger computeHashValueFromText(String stmt) throws UnsupportedEncodingException, NoSuchAlgorithmException {
/* Example
* Statement: SELECT 'Ram' ram_stmt FROM dual
* MD5 Hash with null terminator: 93bc072570a58c1f330166ab2d0a88d2
* Oracle SQL_ID: aqth16g98h2jd
* Address: 00000000826C6060
* Hash Value: 3532130861
* Hash Value Hex: D2880A2D
* MD5 Hash without null terminator: 93bc072570a58c1f330166ab2d0a88d2
* Exact Matching Signature: 4178266890746386855
* Exact Matching Signature Hex: 39FC2F9987B6D9A7
* MD5 Hash without null terminator for bound statement:
* Force Matching Signature: 16194980974160721469
* Force Matching Signature Hex: E0C021642D0F363D
*/
// get bytes of the string supplied
byte[] bytesOfStatement = stmt.trim().getBytes("UTF-8");
byte[] bytesOfStatementWithNull = new byte[bytesOfStatement.length + 1]; // last bucket used for Null Terminator
byte nullCharByte = 0x00; // get bytes of null terminator
// Null terminator is lost or has no effect (Unicode \\u0000, \\U0000 or hex 0x00 or "\0")
// on MD5 digest when you append to String
// Hence add the byte value as a last element after cloning
System.arraycopy(bytesOfStatement, 0, bytesOfStatementWithNull, 0, bytesOfStatement.length); // clone the array
bytesOfStatementWithNull[bytesOfStatement.length]=nullCharByte; // add null terminator as last element
// get the MD5 digest
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] bytesFromMD5Digest = md.digest(bytesOfStatementWithNull);
// convert the byte to hex format
// and generate MD5 hash
StringBuffer sbMD5HashHex32 = new StringBuffer();
for (int i = 0; i < bytesFromMD5Digest.length; i++) {
sbMD5HashHex32.append(Integer.toString((bytesFromMD5Digest[i] & 0xff) + 0x100, 16).substring(1));
}
// Obtain MD5 hash, split and reverse Q4
// Get 32 Hex Characters
String strHex32Hash = sbMD5HashHex32.toString();
// Reverse order of hex characters in Q4
String strHexQ4Reverse = strHex32Hash.substring(30, 32) + strHex32Hash.substring(28, 30)
+ strHex32Hash.substring(26, 28) + strHex32Hash.substring(24, 26);
// convert to 4 bytes of hex to integer
BigInteger biStmtHashValue = new BigInteger(strHexQ4Reverse, 16);
return biStmtHashValue;
}
}
- Test class, StmtUtilTest, to invoke utility class:
public class StmtUtilTest {
public static void main(String[] args) {
String stmt = "SELECT 'Ram' ram_stmt FROM dual";
try {
System.out.println("==============computeSQLIdFromText=========================");
System.out.println("stmt : " + stmt);
System.out.println("sqlid : " + StmtUtil.computeSQLIdFromText(stmt));
System.out.println();
System.out.println("==============computeHashValueFromText=====================");
System.out.println("stmt : " + stmt);
System.out.println("hv : " + StmtUtil.computeHashValueFromText(stmt));
System.out.println();
} catch (Exception e) {
e.printStackTrace();
}
}
}
- Now, time to execute the test:
java StmtUtilTest ==============computeSQLIdFromText========================= stmt : SELECT 'Ram' ram_stmt FROM dual sqlid : aqth16g98h2jd ==============computeHashValueFromText===================== stmt : SELECT 'Ram' ram_stmt FROM dual hv : 3532130861
DBMS_SQL_TRANSLATOR.SQL_ID and DBMS_SQL_TRANSLATOR.SQL_HASH function, from Oracle 12c, can be used to compute SQL_ID and HASH_VALUE from text within the database, if needed.