I needed an hash function, in Java, which is simple, practically less collision and provides a short alphanumeric or an unsigned number as hash for any string or text. I stumbled upon few blog posts that explain how Oracle SQL_ID and HASH_VALUE are calculated by database engine, concisely and elaborately; also implementations in Python and PL/SQL to compute SQL_ID directly from text.
In short, Oracle passes SQL text with null terminator to standard MD5 hash function for calculating SQL_ID, HASH_VALUE and SIGNATURES—both EXACT MATCHING and FORCE MATCHING.
- SQL_ID is an alphanumeric representation of the last 8 bytes of MD5 hash; each 4 bytes in reverse order translated to lower-case alphabet and numbers.
- HASH_VALUE is an integer representation of the last 4 bytes of MD5 hash in reverse order.
- EXACT_MATCHING_SIGNATURE is an integer representation of the last 8 bytes of MD5 hash of standardized SQL text without a null terminator; each 4 bytes in reverse order.
- FORCE_MATCHING_SIGNATURE is an integer representation of the last 8 bytes of MD5 hash of bounded SQL text without a null terminator; each 4 bytes in reverse order
With this, I have enough information to build a utility class in Java to use in my toy project.
- Execute a sample statement; obtain SQL_ID, HASH_VALUE and etc. from database:
SELECT 'Ram' ram_stmt FROM dual; RAM --- Ram 1 row selected. SELECT /* RAM_VSQL */ 'SQL_TEXT : ' || v.sql_text || CHR(10) || 'SQL_ID : ' || v.sql_id || CHR(10) || 'HASH_VALUE : ' || v.hash_value || CHR(10) || 'HASH_VALUE_HEX : ' || TO_CHAR (v.hash_value, 'FMXXXXXXXX') || CHR(10) || 'EXACT_MATCHING_SIGNATURE : ' || v.exact_matching_signature || CHR(10) || 'EXACT_MATCHING_SIGNATURE_HEX : ' || TO_CHAR (exact_matching_signature, 'FMXXXXXXXXXXXXXXXX') || CHR(10) || 'FORCE_MATCHING_SIGNATURE : ' || v.force_matching_signature || CHR(10) || 'FORCE_MATCHING_SIGNATURE_HEX : ' || TO_CHAR (force_matching_signature, 'FMXXXXXXXXXXXXXXXX') || CHR(10) AS stmt_information FROM v$sql v WHERE v.sql_text LIKE '%ram_stmt FROM dual' AND v.sql_text NOT LIKE '%RAM_VSQL%'; STMT_INFORMATION --------------------------------------------------------------------------- SQL_TEXT : SELECT 'Ram' ram_stmt FROM dual SQL_ID : aqth16g98h2jd HASH_VALUE : 3532130861 HASH_VALUE_HEX : D2880A2D EXACT_MATCHING_SIGNATURE : 4178266890746386855 EXACT_MATCHING_SIGNATURE_HEX : 39FC2F9987B6D9A7 FORCE_MATCHING_SIGNATURE : 16194980974160721469 FORCE_MATCHING_SIGNATURE_HEX : E0C021642D0F363D 1 row selected.
- Utility class, StmtUtil, to compute SQL_ID and HASH_VALUE from SQL Text:
import java.io.UnsupportedEncodingException; import java.math.BigInteger; import java.security.MessageDigest; import java.security.NoSuchAlgorithmException; public class StmtUtil { public static String computeSQLIdFromText(String stmt) throws UnsupportedEncodingException, NoSuchAlgorithmException { /* Example * Statement: SELECT 'Ram' ram_stmt FROM dual * MD5 Hash with null terminator: 93bc072570a58c1f330166ab2d0a88d2 * Oracle SQL_ID: aqth16g98h2jd * Address: 00000000826C6060 * Hash Value: 3532130861 * Hash Value Hex: D2880A2D * MD5 Hash without null terminator: 93bc072570a58c1f330166ab2d0a88d2 * Exact Matching Signature: 4178266890746386855 * Exact Matching Signature Hex: 39FC2F9987B6D9A7 * MD5 Hash without null terminator for bound statement: * Force Matching Signature: 16194980974160721469 * Force Matching Signature Hex: E0C021642D0F363D */ // get bytes of the string supplied byte[] bytesOfStatement = stmt.trim().getBytes("UTF-8"); byte[] bytesOfStatementWithNull = new byte[bytesOfStatement.length + 1]; // last bucket used for Null Terminator byte nullCharByte = 0x00; // get bytes of null terminator // Null terminator is lost or has no effect (Unicode \\u0000, \\U0000 or hex 0x00 or "\0") // on MD5 digest when you append to String // Hence add the byte value as a last element after cloning System.arraycopy(bytesOfStatement, 0, bytesOfStatementWithNull, 0, bytesOfStatement.length); // clone the array bytesOfStatementWithNull[bytesOfStatement.length]=nullCharByte; // add null terminator as last element // get the MD5 digest MessageDigest md = MessageDigest.getInstance("MD5"); byte[] bytesFromMD5Digest = md.digest(bytesOfStatementWithNull); // convert the byte to hex format // and generate MD5 hash StringBuffer sbMD5HashHex32 = new StringBuffer(); for (int i = 0; i < bytesFromMD5Digest.length; i++) { sbMD5HashHex32.append(Integer.toString((bytesFromMD5Digest[i] & 0xff) + 0x100, 16).substring(1)); } // Obtain MD5 hash, split and reverse Q3 and Q4 // Get 32 Hex Characters String strHex32Hash = sbMD5HashHex32.toString(); // Reverse order of each of the 4 pairs of hex characters in Q3 and Q4 // Q3 reverse String strHexQ3Reverse = strHex32Hash.substring(22, 24) + strHex32Hash.substring(20, 22) + strHex32Hash.substring(18, 20) + strHex32Hash.substring(16, 18); // Q4 reverse String strHexQ4Reverse = strHex32Hash.substring(30, 32) + strHex32Hash.substring(28, 30) + strHex32Hash.substring(26, 28) + strHex32Hash.substring(24, 26); // assemble lower 16 with reversed q3 and q4 String strHexLower16Assembly = strHexQ3Reverse + strHexQ4Reverse; // convert to 64 bits of binary data String strBinary16Assembly = (new BigInteger(strHexLower16Assembly, 16).toString(2)); // Left pad with zeros, if the length is less than 64 StringBuffer sbBinary16mAssembly = new StringBuffer(); for (int toLefPad = 64 - strBinary16Assembly.length(); toLefPad > 0; toLefPad--) { sbBinary16mAssembly.append('0'); } sbBinary16mAssembly.append(strBinary16Assembly); strBinary16Assembly = sbBinary16mAssembly.toString(); // Split the 64 bit string into 13 pieces // First split as 4 bit and remaining as 5 bits String alphabet = "0123456789abcdfghjkmnpqrstuvwxyz"; String strPartBindaySplit; int intIndexOnAlphabet; StringBuffer sbSQLId = new StringBuffer(); for (int i = 0; i < 13; i++) { // Split binary string into 13 pieces strPartBindaySplit = (i == 0 ? strBinary16Assembly.substring(0, 4) : (i < 12 ? strBinary16Assembly.substring((i * 5) - 1, (i * 5) + 4) : strBinary16Assembly.substring((i * 5) - 1))); // Index position on Alphabet intIndexOnAlphabet = Integer.parseInt(strPartBindaySplit, 2); // Stick 13 characters sbSQLId.append(alphabet.charAt(intIndexOnAlphabet)); } return sbSQLId.toString(); } public static BigInteger computeHashValueFromText(String stmt) throws UnsupportedEncodingException, NoSuchAlgorithmException { /* Example * Statement: SELECT 'Ram' ram_stmt FROM dual * MD5 Hash with null terminator: 93bc072570a58c1f330166ab2d0a88d2 * Oracle SQL_ID: aqth16g98h2jd * Address: 00000000826C6060 * Hash Value: 3532130861 * Hash Value Hex: D2880A2D * MD5 Hash without null terminator: 93bc072570a58c1f330166ab2d0a88d2 * Exact Matching Signature: 4178266890746386855 * Exact Matching Signature Hex: 39FC2F9987B6D9A7 * MD5 Hash without null terminator for bound statement: * Force Matching Signature: 16194980974160721469 * Force Matching Signature Hex: E0C021642D0F363D */ // get bytes of the string supplied byte[] bytesOfStatement = stmt.trim().getBytes("UTF-8"); byte[] bytesOfStatementWithNull = new byte[bytesOfStatement.length + 1]; // last bucket used for Null Terminator byte nullCharByte = 0x00; // get bytes of null terminator // Null terminator is lost or has no effect (Unicode \\u0000, \\U0000 or hex 0x00 or "\0") // on MD5 digest when you append to String // Hence add the byte value as a last element after cloning System.arraycopy(bytesOfStatement, 0, bytesOfStatementWithNull, 0, bytesOfStatement.length); // clone the array bytesOfStatementWithNull[bytesOfStatement.length]=nullCharByte; // add null terminator as last element // get the MD5 digest MessageDigest md = MessageDigest.getInstance("MD5"); byte[] bytesFromMD5Digest = md.digest(bytesOfStatementWithNull); // convert the byte to hex format // and generate MD5 hash StringBuffer sbMD5HashHex32 = new StringBuffer(); for (int i = 0; i < bytesFromMD5Digest.length; i++) { sbMD5HashHex32.append(Integer.toString((bytesFromMD5Digest[i] & 0xff) + 0x100, 16).substring(1)); } // Obtain MD5 hash, split and reverse Q4 // Get 32 Hex Characters String strHex32Hash = sbMD5HashHex32.toString(); // Reverse order of hex characters in Q4 String strHexQ4Reverse = strHex32Hash.substring(30, 32) + strHex32Hash.substring(28, 30) + strHex32Hash.substring(26, 28) + strHex32Hash.substring(24, 26); // convert to 4 bytes of hex to integer BigInteger biStmtHashValue = new BigInteger(strHexQ4Reverse, 16); return biStmtHashValue; } }
- Test class, StmtUtilTest, to invoke utility class:
public class StmtUtilTest { public static void main(String[] args) { String stmt = "SELECT 'Ram' ram_stmt FROM dual"; try { System.out.println("==============computeSQLIdFromText========================="); System.out.println("stmt : " + stmt); System.out.println("sqlid : " + StmtUtil.computeSQLIdFromText(stmt)); System.out.println(); System.out.println("==============computeHashValueFromText====================="); System.out.println("stmt : " + stmt); System.out.println("hv : " + StmtUtil.computeHashValueFromText(stmt)); System.out.println(); } catch (Exception e) { e.printStackTrace(); } } }
- Now, time to execute the test:
java StmtUtilTest ==============computeSQLIdFromText========================= stmt : SELECT 'Ram' ram_stmt FROM dual sqlid : aqth16g98h2jd ==============computeHashValueFromText===================== stmt : SELECT 'Ram' ram_stmt FROM dual hv : 3532130861
DBMS_SQL_TRANSLATOR.SQL_ID and DBMS_SQL_TRANSLATOR.SQL_HASH function, from Oracle 12c, can be used to compute SQL_ID and HASH_VALUE from text within the database, if needed.