How to minimize the decryption overhead when searching on encrypted short database fields
November 2008 by Ulf Mattsson, CTO, Protegrity Corporation
You can minimize the decryption overhead when searching on short encrypted fields by combining a couple of different approaches that typically depend on the type of search and on the platform used. In general, you want to encrypt only a few very sensitive data elements in a schema, such as social security numbers, credit card numbers and patient names. If some care and discretion are used, the amount of extra overhead can be minimal. Creating indexes on encrypted data is useful only in some specific cases.
Exact matches and joins of encrypted data can normally utilize the indexes you create. Because encrypted data is effectively random binary data, range checks on it require table scans in most vendor solutions and normally mean decrypting every row value in the column, so they should be avoided where possible. However, searching for ranges or partial matches on encrypted data fields within a database can be enabled with minimal decryption overhead, and without full table scans, if your solution supports some type of accelerated index search or partial field encryption. Some of these approaches are dependent on specific platform capabilities and the type of search that you perform.
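As a minimal sketch of the exact-match case, the Python example below stores a deterministic opaque value in an indexed column and looks it up by equality. Python's standard library has no AES, so a keyed HMAC stands in for deterministic ciphertext here; that substitution, the key, and the table and column names (customer, ssn_enc, idx_ssn_enc) are assumptions made for illustration only and are not a recommended encryption scheme.

```python
import hashlib
import hmac
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only


def deterministic_token(value: str) -> bytes:
    """Stand-in for deterministic ciphertext: same input, same opaque bytes."""
    return hmac.new(KEY, value.encode("utf-8"), hashlib.sha256).digest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, ssn_enc BLOB)")
conn.execute("CREATE INDEX idx_ssn_enc ON customer (ssn_enc)")

ssns = ["123-45-6789", "987-65-4321", "555-12-3456"]
conn.executemany(
    "INSERT INTO customer (ssn_enc) VALUES (?)",
    [(deterministic_token(s),) for s in ssns],
)

# Exact match: transform the search term once, then compare opaque bytes.
needle = deterministic_token("987-65-4321")
match_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM customer WHERE ssn_enc = ?", (needle,)
    )
]

# Ask SQLite how it would run the lookup; the detail text is the last column.
plan = " ".join(
    row[-1]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM customer WHERE ssn_enc = ?",
        (needle,),
    )
)

print(match_ids)
print(plan)
```

The stored bytes carry no ordering that relates to the plaintext, which is why the same index cannot help a range or partial-match predicate on this column.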
Searching for an exact match of an encrypted value within a column can minimize the decryption overhead, provided that the same (constant) initialization vector is used for the entire column: equal plaintexts then produce equal ciphertexts, so the search term can be encrypted once and compared against the stored values without decrypting any rows. You cannot use a rotating or random initialization vector in this case. An accelerated search index is a specially crafted index that can be based on a substring of the decrypted values or some other transformation that can speed up specific search operations. This approach can be application transparent when built on top of extensible indexing functionality.
One example of extensible database indexing functionality is the Oracle Domain Index. Searching for partial matches on encrypted data within a database can be challenging and can result in full table scans if an accelerated index-search on encrypted data is not supported.
Non-transparent solutions can also be facilitated by various ways of splitting up a sensitive column into several searchable columns that are concatenated at a view layer for higher application transparency. This approach can be used to find exact matches on the beginning or end of a field, but it will affect your database design. One drawback to this approach is that a new column needs to be added for each unique type of search criteria. One approach to performing partial searches, without prohibitive performance constraints and without revealing too much sensitive information, is to apply an HMAC (keyed-Hash Message Authentication Code) to part of the sensitive data and store it in another column in the same row. An HMAC is a hash function secured by a secret key.
If the database needs to allow for searching based on the first four characters as well as the last five characters, two new columns would need to be added to the table. In order to save space, the HMAC hash values can be truncated to ten bytes without compromising security.
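The two-column layout described above can be sketched as follows. This is only an illustration under stated assumptions: the key, the card table, its column and index names, the helper functions and the choice of HMAC-SHA-256 are all hypothetical, and the encrypted value itself is represented by a placeholder because the point here is the search columns, not the encryption.

```python
import hashlib
import hmac
import sqlite3

HMAC_KEY = b"demo-hmac-key"  # hypothetical; real keys belong in a key manager
TAG_BYTES = 10               # truncated length mentioned in the text


def search_tag(fragment: str) -> bytes:
    """Keyed hash of a field fragment, truncated to TAG_BYTES."""
    digest = hmac.new(HMAC_KEY, fragment.encode("utf-8"), hashlib.sha256).digest()
    return digest[:TAG_BYTES]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE card (id INTEGER PRIMARY KEY, pan_enc BLOB, "
    "first4_tag BLOB, last5_tag BLOB)"
)
conn.execute("CREATE INDEX idx_first4 ON card (first4_tag)")
conn.execute("CREATE INDEX idx_last5 ON card (last5_tag)")


def insert_card(pan: str) -> None:
    # pan_enc would hold the real ciphertext; a placeholder is stored here.
    conn.execute(
        "INSERT INTO card (pan_enc, first4_tag, last5_tag) VALUES (?, ?, ?)",
        (b"<ciphertext placeholder>", search_tag(pan[:4]), search_tag(pan[-5:])),
    )


for pan in ["4111111111111111", "5500005555555559", "4111222233334444"]:
    insert_card(pan)


def ids_starting_with(first4: str) -> list:
    rows = conn.execute(
        "SELECT id FROM card WHERE first4_tag = ? ORDER BY id",
        (search_tag(first4),),
    )
    return [r[0] for r in rows]


def ids_ending_with(last5: str) -> list:
    rows = conn.execute(
        "SELECT id FROM card WHERE last5_tag = ? ORDER BY id",
        (search_tag(last5),),
    )
    return [r[0] for r in rows]


print(ids_starting_with("4111"))
print(ids_ending_with("55559"))
```

Both lookups are plain equality comparisons on indexed columns, so no row has to be decrypted to find the candidates. Each additional search pattern would need its own tag column, which is the drawback noted earlier.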
This approach can prove to be a reasonable compromise, especially when combined with non-sensitive search criteria such as zip code or city, and can significantly improve search performance. Some third-party vendors, including www.protegrity.com, can provide mature and flexible solutions to these problems across major platforms.
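To illustrate the combination with non-sensitive criteria, the sketch below filters on a clear-text zip code together with a truncated HMAC of the first four characters of a name, so that only the surviving candidate rows would need to be decrypted. The key, the patient table, its columns, the sample rows and the case-folding step are assumptions made for this example.

```python
import hashlib
import hmac
import sqlite3

HMAC_KEY = b"demo-hmac-key"  # hypothetical key, illustration only


def tag(fragment: str) -> bytes:
    """Truncated keyed hash of a lower-cased fragment (10 bytes)."""
    data = fragment.lower().encode("utf-8")
    return hmac.new(HMAC_KEY, data, hashlib.sha256).digest()[:10]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, zip TEXT, "
    "name_enc BLOB, name4_tag BLOB)"
)
conn.execute("CREATE INDEX idx_zip_name4 ON patient (zip, name4_tag)")

patients = [
    ("06901", "Johnson"),
    ("06901", "Johansson"),
    ("10001", "Johnston"),
    ("06901", "Smith"),
]
for zip_code, name in patients:
    # name_enc would hold the real ciphertext; a placeholder is stored here.
    conn.execute(
        "INSERT INTO patient (zip, name_enc, name4_tag) VALUES (?, ?, ?)",
        (zip_code, b"<ciphertext placeholder>", tag(name[:4])),
    )

# Narrow by zip code and name prefix before touching any ciphertext.
candidates = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM patient WHERE zip = ? AND name4_tag = ? ORDER BY id",
        ("06901", tag("John")),
    )
]
total = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]

print(f"{len(candidates)} of {total} rows would need decryption: {candidates}")
```

In this toy data set a single row remains after filtering; on a real table the ratio depends on how selective the non-sensitive criteria are.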
Please see http://database.ittoolbox.com/documents/peer-publishing/database-encryption-how-to-balance-security-with-performance-4503 and http://research.ittoolbox.com/white-papers/lg.asp?grid=3515&kb=DB&pl=&ref=http%3A%2F%2Fwww%2Egoogle%2Ecom%2Fsearch%3Fq%3Ddatabase%2Bencryption%26hl%3Den%26start%3D10%26sa%3DN&sp= , http://research.ittoolbox.com/white-papers/lg.asp?grid=3515&kb=DB&pl=&ref=http%3A%2F%2Fwww%2Egoogle%2Ecom%2Fsearch%3Fhl%3Den%26q%3Ddatabase%2Bencryption%26btnG%3DSearch&sp= for more information about how you can minimize the decryption overhead when searching on short encrypted fields.
It is always a good idea to check whether the AES encryption mode that you are planning to use is approved. You may check with a certified PCI assessor. Level 1 merchants should engage a Qualified Security Assessor. Please see http://usa.visa.com/merchants/risk_management/cisp_merchants.html for more information. I’d also check the list of modes approved by NIST (http://csrc.nist.gov/CryptoToolkit/modes/). In Special Publication 800-38A, five confidentiality modes are specified for use with any approved block cipher, such as the AES algorithm. The modes in SP 800-38A are updated versions of the ECB, CBC, CFB, and OFB modes that are specified in FIPS Pub. 81; in addition, SP 800-38A specifies the CTR mode.