Hashing uses an algorithm to produce a fixed-size hash value from a piece of data, such as a message or session key. Common hashing algorithms include MD2, MD4, MD5, SHA-1 and SHA-2. This tutorial walks you through creating a Win32 console sample application that computes cryptographic hashes by using Microsoft Cryptographic Service Providers (CSP).
The following diagram illustrates how the hash value is generated in the sample project.
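For readers who want to see the flow as code, here is a minimal sketch of the usual CryptoAPI call sequence: acquire a provider, create a hash object, feed it data, read the digest, then clean up. This is not the content of cryptoyb.cpp, which is not listed in this post. The helper names `to_hex` and `hash_hex` are made up for illustration, and the choice of `PROV_RSA_AES` with `CRYPT_VERIFYCONTEXT` is an assumption about a reasonable setup for the SHA-2 algorithms.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn raw digest bytes into a lowercase hex string.
std::string to_hex(const unsigned char* bytes, std::size_t count)
{
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(count * 2);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(kDigits[bytes[i] >> 4]);
        out.push_back(kDigits[bytes[i] & 0x0F]);
    }
    return out;
}

#ifdef _WIN32
#include <windows.h>
#include <wincrypt.h>  // link with Advapi32.lib

// Hypothetical helper (not taken from cryptoyb.cpp): hash a string with the
// given CryptoAPI algorithm and return lowercase hex, or "" on any failure.
std::string hash_hex(ALG_ID alg, const std::string& data)
{
    std::string hex;
    HCRYPTPROV prov = 0;
    HCRYPTHASH hash = 0;

    // Assumption: PROV_RSA_AES, because that provider type offers SHA-2.
    if (!CryptAcquireContext(&prov, NULL, NULL, PROV_RSA_AES, CRYPT_VERIFYCONTEXT))
        return hex;

    if (CryptCreateHash(prov, alg, 0, 0, &hash)) {
        DWORD len = 0;
        DWORD lenSize = sizeof(len);
        // Feed the data, then ask for the digest size and the digest itself.
        if (CryptHashData(hash, reinterpret_cast<const BYTE*>(data.data()),
                          static_cast<DWORD>(data.size()), 0) &&
            CryptGetHashParam(hash, HP_HASHSIZE,
                              reinterpret_cast<BYTE*>(&len), &lenSize, 0) &&
            len > 0) {
            std::vector<BYTE> digest(len);
            if (CryptGetHashParam(hash, HP_HASHVAL, digest.data(), &len, 0))
                hex = to_hex(digest.data(), len);
        }
        CryptDestroyHash(hash);
    }
    CryptReleaseContext(prov, 0);
    return hex;
}
#endif
```

On Windows, a call such as `hash_hex(CALG_SHA_256, "HelloWorld")` should give the same digest as a wrapper that hashes the raw bytes of the string. On other platforms only `to_hex` is compiled.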
The sample project in this tutorial covers the following hashing methods:
1. MD5 – I know it’s very old, but I will cover it.
2. SHA1 – Again, I know it’s old, but I will still cover it.
3. SHA256
4. SHA384
5. SHA512
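As a quick reference, the five methods differ in digest length, which determines how long the printed hex string is. The sketch below is an illustrative lookup table, not part of cryptoyb.h. The numeric IDs are the `CALG_*` values from wincrypt.h, copied in so the snippet compiles without the Windows SDK. The names `HashAlgInfo`, `kHashAlgs` and `hex_length` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Illustrative table (not part of cryptoyb.h).
struct HashAlgInfo {
    const char*  name;
    unsigned int algId;        // CALG_* value as defined in wincrypt.h
    std::size_t  digestBytes;  // raw digest length in bytes
};

const HashAlgInfo kHashAlgs[] = {
    {"MD5",    0x8003u, 16},  // CALG_MD5
    {"SHA1",   0x8004u, 20},  // CALG_SHA1
    {"SHA256", 0x800Cu, 32},  // CALG_SHA_256
    {"SHA384", 0x800Du, 48},  // CALG_SHA_384
    {"SHA512", 0x800Eu, 64},  // CALG_SHA_512
};

// Number of hex characters printed for an algorithm, or 0 if it is unknown.
std::size_t hex_length(const std::string& name)
{
    for (const HashAlgInfo& alg : kHashAlgs) {
        if (name == alg.name)
            return alg.digestBytes * 2;  // two hex digits per byte
    }
    return 0;
}
```

Each byte prints as two hex digits, so SHA384 and SHA512 produce 96 and 128 characters. That is more than an 80-column console can show on one line, so those digests wrap onto a second line.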
Let’s start the tutorial with the following steps:
1. Create a VC++ Win32 console application.
2. Add cryptoyb.h and cryptoyb.cpp to the sample project. Try compiling the project now; it should build successfully. If it fails, make sure that Advapi32.lib is listed in your project property pages under the following section: [Linker -> Input -> Additional Dependencies]
3. Enter the following code into the project's main .cpp file.
#include "stdafx.h"
#include <cstdio>
#include <iostream>
#include <string>
#include <tchar.h>
#include "cryptoyb.h"

int _tmain(int argc, _TCHAR* argv[])
{
    CryptoYB askyb;
    std::string hash;

    // MD5 String
    hash = askyb.MD5("HelloWorld");
    printf("MD5:\n%s\n\n", hash.c_str());

    // MD5 File
    hash = askyb.MD5_File("C:\\doc1.txt");
    printf("MD5 File:\n%s\n\n", hash.c_str());

    // SHA1 String
    hash = askyb.SHA1("HelloWorld");
    printf("SHA1:\n%s\n\n", hash.c_str());

    // SHA1 File
    hash = askyb.SHA1_File("C:\\doc1.txt");
    printf("SHA1 File:\n%s\n\n", hash.c_str());

    // SHA256 String
    hash = askyb.SHA256("HelloWorld");
    printf("SHA256:\n%s\n\n", hash.c_str());

    // SHA256 File (a stray 't' in the original format string is removed here)
    hash = askyb.SHA256_File("C:\\doc1.txt");
    printf("SHA256 File:\n%s\n\n", hash.c_str());

    // SHA384 String
    hash = askyb.SHA384("HelloWorld");
    printf("SHA384:\n%s\n\n", hash.c_str());

    // SHA384 File
    hash = askyb.SHA384_File("C:\\doc1.txt");
    printf("SHA384 File:\n%s\n\n", hash.c_str());

    // SHA512 String
    hash = askyb.SHA512("HelloWorld");
    printf("SHA512:\n%s\n\n", hash.c_str());

    // SHA512 File
    hash = askyb.SHA512_File("C:\\doc1.txt");
    printf("SHA512 File:\n%s\n\n", hash.c_str());

    // Wait for a key press so the console window stays open
    getchar();
    return 0;
}
4. Compile and run the sample project. You should see the following result in the console window.
MD5:
68e109f0f40ca72a15e05cc22786f8e6
MD5 File:
9a1593445593a57aa897b7bc9da1abdc
SHA1:
db8ac1c259eb89d4a131b253bacfca5f319d54f2
SHA1 File:
9e5ab66092bd39b99566fa251faeddc1df96236e
SHA256:
872e4e50ce9990d8b041330c47c9ddd11bec6b503ae9386a99da8584e9bb12c4
SHA256 File:
32a938096165c66c8d92dd03d9b80542ae43637bde3b39a72096e183007aeb30
SHA384:
293cd96eb25228a6fb09bfa86b9148ab69940e68903cbc0527a4fb150eec1ebe0f1ffce0bc5e3df3
12377e0a68f1950a
SHA384 File:
b56355b76bb7644aa79407e189bb3448b28d5669a6c7a6c5b1c81c16f3ce28d3ba87a72a751f0fec
454378e510ad12ec
SHA512:
8ae6ae71a75d3fb2e0225deeb004faf95d816a0a58093eb4cb5a3aa0f197050d7a4dc0a2d5c6fbae
5fb5b0d536a0a9e6b686369fa57a027687c3630321547596
SHA512 File:
97e0836f35877cc5a2fb82cf39584e65aad44b0cd617f36020042e4b705c8d740f231f0a2c071e45
77c5547b6efeb985d2aa7853541d938558cbd426e97f25c5
Download sample project: MSHashSample.zip

While playing around with this code to hash files, I’m finding it rather slow: a 2 GB file takes about 10 seconds to generate an MD5 hash on an Intel i5, and the other algorithms are considerably slower. By comparison, on the same system, other utilities (which sadly do not make their source available) hash the same file in the same environment in a fraction of a second.
Any thoughts on where the bottleneck may be?
Hmm, interesting. One idea that comes to mind while replying is to hash only a certain amount of data from the large file. E.g., you could load the first 100k bytes of the file and hash just those. That will not be as secure as a full-file hash, though. Again, it depends on how you are going to use it.
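A rough sketch of that idea, assuming plain standard C++ file I/O rather than whatever cryptoyb.cpp uses internally: read at most the first N bytes of the file and pass only that buffer to the hash routine. The helper name `read_prefix` is hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read at most maxBytes from the start of a file.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> read_prefix(const std::string& path, std::size_t maxBytes)
{
    std::vector<unsigned char> buf;
    std::ifstream in(path, std::ios::binary);
    if (!in || maxBytes == 0)
        return buf;

    buf.resize(maxBytes);
    in.read(reinterpret_cast<char*>(buf.data()),
            static_cast<std::streamsize>(buf.size()));
    // Shrink to the number of bytes actually read (the file may be shorter).
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}
```

For example, `read_prefix("C:\\doc1.txt", 100000)` returns the bytes to hash. Keep in mind that two files sharing the same first 100k bytes would then produce the same hash, so this only suits quick "probably the same file" checks, not integrity verification.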
Just some follow-up observations.
Using the same 2 GB test file and timing each hash function while experimenting with the buffer size defined in cryptoyb.cpp, the following pattern was observed as the buffer size increased:
          1024 buffer   4096 buffer   8192 buffer
MD5:      13.7 sec      7.7 sec       6.6 sec
SHA256:   31.2 sec      25.5 sec      24.6 sec
SHA384:   53.4 sec      47.5 sec      46.6 sec
SHA512:   53.2 sec      47.9 sec      46.6 sec
Being new to the MS cryptographic services I have no idea what’s going on here, but I suspect that simply assigning larger and larger read buffer sizes is not the complete answer.
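To make this kind of experiment easier to repeat, here is a sketch of a read loop where the buffer size is a parameter instead of a compile-time constant. It uses standard C++ streams, which may differ from what cryptoyb.cpp does. The name `for_each_chunk` and the `update` callback are hypothetical; in a real hashing routine the callback would hand each chunk to something like `CryptHashData`.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: stream a file through `update` in chunks of at most
// bufSize bytes. Returns the total bytes read, or -1 if the file cannot be
// opened or bufSize is 0.
long long for_each_chunk(
    const std::string& path, std::size_t bufSize,
    const std::function<void(const unsigned char*, std::size_t)>& update)
{
    std::ifstream in(path, std::ios::binary);
    if (!in || bufSize == 0)
        return -1;

    std::vector<unsigned char> buf(bufSize);
    long long total = 0;
    while (in) {
        in.read(reinterpret_cast<char*>(buf.data()),
                static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0)
            break;  // nothing more to read
        update(buf.data(), static_cast<std::size_t>(got));
        total += got;
    }
    return total;
}
```

With this shape it is easy to try much larger buffers, such as 64 KB or 1 MB, and time each run. Whether that closes the gap to the other utilities is not something these numbers can settle.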
Also puzzling is the lack of a performance gap between SHA384 and SHA512.
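One way to read these timings is to convert them to throughput. The sketch below assumes the "2 GB" file is 2048 MiB, which the post does not state exactly; the function name is made up for illustration.

```cpp
// Hypothetical helper: throughput in MiB per second.
// Returns 0 for a non-positive duration to avoid dividing by zero.
double throughput_mib_per_s(double sizeMiB, double seconds)
{
    return seconds > 0.0 ? sizeMiB / seconds : 0.0;
}
```

Under that assumption, MD5 with the 8192-byte buffer works out to roughly 310 MiB/s (2048 / 6.6), while SHA384 and SHA512 both land at roughly 44 MiB/s (2048 / 46.6). The matching figures for the last two are expected: SHA-384 is defined as SHA-512 with different initial values and a truncated output, so both do the same amount of work per block.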