March 2, 2010
January 27, 2010
I recently found this in the web server logs of one of the websites I look after:
184.108.40.206 - - [26/Jan/2010:05:01:44 -0800] "GET /application/json HTTP/1.1" 404 763 "-" "panscient.com"
220.127.116.11 - - [26/Jan/2010:05:01:47 -0800] "GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"
18.104.22.168 - - [26/Jan/2010:05:01:55 -0800] "GET /AppleWebKit/ HTTP/1.1" 404 763 "-" "panscient.com"
22.214.171.124 - - [26/Jan/2010:05:01:58 -0800] "GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"
In case you are not familiar with web server log files: these lines mean that someone or something from IP address 126.96.36.199 requested the pages named after “GET” on the website, for example a page named “following-sibling::*”.
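As a rough sketch of how such lines can be picked apart, the snippet below uses awk to print the client address, the requested page and the status code. It assumes the whitespace-separated layout seen in the excerpt; the variable names `log` and `summary` are only illustrative.

```shell
# Two sample lines copied from the log excerpt.
log='184.108.40.206 - - [26/Jan/2010:05:01:44 -0800] "GET /application/json HTTP/1.1" 404 763 "-" "panscient.com"
220.127.116.11 - - [26/Jan/2010:05:01:47 -0800] "GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"'

# In this layout the whitespace-separated fields are:
#   $1 = client IP address, $7 = requested page, $9 = HTTP status code
summary=$(printf '%s\n' "$log" | awk '{ print $1, $7, $9 }')
printf '%s\n' "$summary"
```

Each output line then shows who asked for which page and what status code the server answered with.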
Does it need to be said that no such pages exist (that’s what the “
When I saw this I was rather puzzled; and looked up
Why is your web crawler trying to access pages that don’t exist on my website?
(Emphasis mine) Oh ok. They are looking into
Looks like a pretty competitive business when people start clutching at straws like this. Also, I take it bandwidth is easier to come by than crawling software that avoids such silly attempts.
July 25, 2009
I’ve been looking into encryption methods recently, and came across this little surprise about cipher block chaining, or CBC, as it is used for block ciphers.
The problem that cipher block chaining addresses is this: a block cipher encrypts a long message one block at a time, so if the message contains identical blocks, or two messages encrypted with the same key contain identical blocks, then you can tell that from the encrypted parts: they will be the same. Anyone who has access to the encrypted message and knows the block cipher employed can extract these blocks. While they cannot decrypt the individual blocks, they can compare them. Such is the world of cryptography that there are cases where it should be made difficult to tell that one message contains parts of a different message, or repeats itself.
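A small sketch of this leak, assuming openssl is available: it encrypts two identical 16-byte blocks independently of each other (openssl calls this ECB mode) and compares the resulting cipher blocks. The key and the file names are made up for this demonstration.

```shell
# Hypothetical 256-bit key (64 hex digits), chosen only for this demo.
key=00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff

# A message consisting of two identical 16-byte blocks.
printf 'ABCDEFGHIJKLMNOPABCDEFGHIJKLMNOP' > twoblocks.in

# Encrypt every block on its own (ECB mode, no chaining).
# -nopad keeps the output at exactly two blocks (32 bytes).
openssl enc -aes-256-ecb -K "$key" -nopad -in twoblocks.in -out twoblocks.ecb

# Hex-dump the first and the second cipher block.
block1=$(head -c 16 twoblocks.ecb | od -An -tx1 | tr -d ' \n')
block2=$(tail -c 16 twoblocks.ecb | od -An -tx1 | tr -d ' \n')
echo "block 1: $block1"
echo "block 2: $block2"
```

Both lines should show the same hex digits: without knowing the key, an observer can still tell that the message repeats itself.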
Cipher Block Chaining
One solution, and the most commonly used “mode of operation” for a block cipher (see
Suppose our numbering is such that the first block has number 1 (not 0 as is common).
- Let P(i) be the i-th block of the plain text message.
- Let E(X) be the result of encrypting the (plain text) block X.
- Let D(Y) be the result of decrypting the (encrypted) block Y.
- Let C(i) be the i-th encrypted (cipher) block.
Then encryption with Cipher Block Chaining can be formalized as:
C(0) := IV, the initial vector
C(i) := E( P(i) XOR C(i-1))
If the receiver knows the initial vector as well as the block cipher’s encryption key, they can completely decrypt the message. Decryption is formalized like this:
C(0) := IV, the initial vector
P(i) := D( C(i) ) XOR C(i-1)
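The two formulas can be tried out with a short sketch, again assuming openssl is available. It encrypts a message made of two identical blocks in CBC mode, prints C(1) and C(2), and decrypts the result again. The key, the initial vector and the file names are made up for this demonstration.

```shell
# Hypothetical key (64 hex digits) and initial vector (32 hex digits).
key=00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff
iv=0102030405060708090a0b0c0d0e0f10

# P(1) and P(2) are identical 16-byte blocks.
printf 'ABCDEFGHIJKLMNOPABCDEFGHIJKLMNOP' > twoblocks.in

# C(1) = E(P(1) XOR IV), C(2) = E(P(2) XOR C(1))
openssl enc -aes-256-cbc -K "$key" -iv "$iv" -nopad -in twoblocks.in -out twoblocks.cbc

c1=$(head -c 16 twoblocks.cbc | od -An -tx1 | tr -d ' \n')
c2=$(tail -c 16 twoblocks.cbc | od -An -tx1 | tr -d ' \n')
echo "C(1): $c1"
echo "C(2): $c2"

# P(i) = D(C(i)) XOR C(i-1), with the same key and initial vector.
roundtrip=$(openssl enc -d -aes-256-cbc -K "$key" -iv "$iv" -nopad -in twoblocks.cbc)
echo "decrypted: $roundtrip"
```

Because P(2) is combined with C(1) before encryption, the two cipher blocks should come out different even though the plain text blocks are the same, and decryption returns the original text.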
Decrypting with a Different Initial Vector
Finally I can point out what surprised me: it is that when decrypting, the blocks P(2), P(3), P(4), and so on do not depend on the initial vector IV that was used for encryption! Only P(1), the first decrypted block, depends on IV, while the other parts of the decrypted message will be the same regardless of IV.
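This follows directly from the decryption formula. As a short derivation, write IV' for a different initial vector used by the receiver and P'(i) for the blocks they obtain (both names are introduced here only for illustration). Since C(1) = E(P(1) XOR IV), decrypting C(1) gives P(1) XOR IV, and therefore:

```latex
\begin{align*}
P'(1) &= D(C(1)) \oplus IV' = \bigl(P(1) \oplus IV\bigr) \oplus IV' = P(1) \oplus (IV \oplus IV') \\
P'(i) &= D(C(i)) \oplus C(i-1) = P(i) \qquad \text{for } i \ge 2
\end{align*}
```

The first block comes out XORed with the difference between the two initial vectors, while every later block only involves cipher blocks that the receiver already has.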
In this way, the contribution of the initial vector is very different from the encryption key! And it is rather nice to see that it need not be any stronger, since it provides the function it is designed for: to hide the information about identical blocks.
And so, if the encrypter prepends some arbitrary initial block to the message, the receiver does not need to know the initial vector used for encryption. After decrypting with some arbitrarily chosen initial vector (all 0’s, for example), they can just throw away the first block; the remaining blocks will represent the original message.
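Here is a minimal sketch of that scheme, assuming openssl and /dev/urandom are available. The key, the message and the file names are made up for this demonstration; it illustrates the idea and is not a recommendation for real-world use.

```shell
# Hypothetical 256-bit key (64 hex digits) shared by sender and receiver.
key=00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff
echo "The real message starts here." > real.in

# Sender: pick a random initial vector that the receiver never learns.
iv=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Sender: prepend one arbitrary 16-byte block, then encrypt in CBC mode.
{ head -c 16 /dev/urandom; cat real.in; } |
    openssl enc -aes-256-cbc -K "$key" -iv "$iv" -out real.crypt

# Receiver: decrypt with an all-zero initial vector and
# throw away the first block (output starts at byte 17).
openssl enc -d -aes-256-cbc -K "$key" -iv 00000000000000000000000000000000 -in real.crypt |
    tail -c +17 > real.out

cmp real.in real.out && echo "message recovered without knowing the initial vector"
```

The padding that openssl adds sits in the last block, so it is not affected by the mismatched initial vector.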
Sample Code with AES and openssl
Here is some rather simple code to illustrate the effect. It is based on one of the Rijndael block ciphers, AES-256 (see
echo "The symmetric cipher commands allow data to be encrypted or decrypted using various block and stream ciphers" > msg.in
# Encrypt msg.in with some key and an initial vector
# (-K and -iv take hex digits; the values here are shorter than a full AES-256 key and IV)
openssl enc -aes-256-cbc -K 1234567890123456 -iv 1234567890123456 -in msg.in -out msg.crypt
echo Decrypt with both the right key and the right iv
openssl enc -d -aes-256-cbc -K 1234567890123456 -iv 1234567890123456 -in msg.crypt
echo Decrypt with the right key but a different iv
# Pipe into 'od -cx' because there will likely be non-displayable characters in the output.
openssl enc -d -aes-256-cbc -K 1234567890123456 -iv ABCDEF1234560FED -in msg.crypt | od -cx
echo Compare with the output with the right key and the right iv
openssl enc -d -aes-256-cbc -K 1234567890123456 -iv 1234567890123456 -in msg.crypt | od -cx
When executed in a UNIX shell with all the required programs available, the output is:
Decrypt with both the right key and the right iv
The symmetric cipher commands allow data to be encrypted or decrypted using various block and stream ciphers
Decrypt with the right key but a different iv
0000000 355 221 334   J 327   =   V 326   e   t   r   i   c       c   i
        91ed 4adc 3dd7 d656 7465 6972 2063 6963
0000020   p   h   e   r       c   o   m   m   a   n   d   s       a   l
        6870 7265 6320 6d6f 616d 646e 2073 6c61
0000040   l   o   w       d   a   t   a       t   o       b   e       e
        6f6c 2077 6164 6174 7420 206f 6562 6520
0000060   n   c   r   y   p   t   e   d       o   r       d   e   c   r
        636e 7972 7470 6465 6f20 2072 6564 7263
0000100   y   p   t   e   d       u   s   i   n   g       v   a   r   i
        7079 6574 2064 7375 6e69 2067 6176 6972
0000120   o   u   s       b   l   o   c   k       a   n   d       s   t
        756f 2073 6c62 636f 206b 6e61 2064 7473
0000140   r   e   a   m       c   i   p   h   e   r   s  \n  \0
        6572 6d61 6320 7069 6568 7372 000a
0000155
Compare with the output with the right key and the right iv
0000000   T   h   e       s   y   m   m   e   t   r   i   c       c   i
        6854 2065 7973 6d6d 7465 6972 2063 6963
0000020   p   h   e   r       c   o   m   m   a   n   d   s       a   l
        6870 7265 6320 6d6f 616d 646e 2073 6c61
0000040   l   o   w       d   a   t   a       t   o       b   e       e
        6f6c 2077 6164 6174 7420 206f 6562 6520
0000060   n   c   r   y   p   t   e   d       o   r       d   e   c   r
        636e 7972 7470 6465 6f20 2072 6564 7263
0000100   y   p   t   e   d       u   s   i   n   g       v   a   r   i
        7079 6574 2064 7375 6e69 2067 6176 6972
0000120   o   u   s       b   l   o   c   k       a   n   d       s   t
        756f 2073 6c62 636f 206b 6e61 2064 7473
0000140   r   e   a   m       c   i   p   h   e   r   s  \n  \0
        6572 6d61 6320 7069 6568 7372 000a
0000155

As you can see, only the first 8 bytes differ when using the "wrong" initial vector. The AES block is 16 bytes long, but the two initial vectors given on the command line are only 8 bytes each and get padded identically, so they agree in the second half of the block.

Just for future reference, here is my system information when running the above code:

$ uname -a
Linux myosin 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux
$ bash --version
GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)
Copyright (C) 2007 Free Software Foundation, Inc.
$ openssl version
OpenSSL 0.9.8g 19 Oct 2007