Stephan Wehner » internet

Snowden, NSA, etc.

January 20, 2014 internet, systems No comments

It’s been a while since Edward Snowden started revealing details about the global surveillance operations of the NSA. I’ve been meaning to write down a few thoughts for a long time, but didn’t quite find the time; however they still seem interesting enough now, so let’s go.

1. Insulting the whole world

It hardly makes sense to begin without stating that these actions are insulting, disrespectful and abusive to every human being (who uses a phone, or the Internet) on the planet.

How dare the American government think they are entitled to listen in on everyone’s phone conversations, read everyone’s emails and track all other online activities?

This also goes for everyone defending these actions. How dare you?!

The reporting from the American side often makes distinctions between surveillance of American citizens, and surveillance of those who are not such citizens. Sometimes it sounds as if American citizens are more worthy of some kind of better treatment. That is of course also insulting, disrespectful and abusive.

2. Spam

When I first heard of the extensive surveillance systems, I thought, well, if they really have this much equipment and so many resources, it would be nice if they could solve a real problem: namely email spam, blog post spam, and similar.

However, they do not appear to be working on that.

One concludes that this is not high on their priorities.

3. Bittorrent

Early on, one could read that the NSA stayed away from analyzing Bittorrent traffic. Which makes a lot of sense, since there is so much of it.

On the other hand, this leaves quite a gap that could be used to set up communication channels.

4. Pair programming

Within the first few weeks there were reports that the NSA switched to Pair-programming, pair administration, etc. Basically, don’t let anyone touch the systems without someone else watching. Take turns. Call it “Access Control Layer 1?”

The reports I read didn’t point out how natural such a policy is. It is also not necessarily more expensive: some software companies use pair programming to avoid costly mistakes, programming bugs, etc.

So one wonders whether the original access control did not have this additional layer in order to actually allow unauthorized access. After all, Mr. Snowden did not access the system for personal gain; how many others with access were not that noble? How much easier to avoid laws and cut corners without such a layer.

Also, has this policy been reverted in the meantime? It would be possible to do so without issuing a press release, of course.

5. An extra button

Also early on, it was reported that the amount of data passing through the NSA’s systems is enormous. On the other hand, if you look at any particular person, the data traffic they generate will most of the time be quite manageable.

So I am guessing that if anyone shows up in manual surveillance (an operator inspects certain communications), they will have a button within their user interface, “Track this individual closely”. Save all that traffic for later viewing. I would estimate that adding 100 people per day would be easy to manage in terms of traffic.

6. Germany

I follow German news of course. When it was first revealed that the NSA was listening in on phone calls of the German chancellor, I thought, well at least governments have the resources to set up communications based on One-time pad encryption. Also, Germany has local chip factories to make sure the chip does what is desired by the owner, and not what the NSA desires.

(I find the response of European governments quite weak, I think it is underestimated how easy it is to argue with Americans)

7. Logging

In the wider context of privacy on the Internet (user profiling using cookies, etc), it had occurred to me a long time ago, that such policies should also cover web server logs. After all, such logs capture some information about individuals, sometimes not too much, but in the aggregate, I think it would be useful to look into stronger protection. Such protection, in one of my thoughts, would consist in having people, and organizations, be granted a “License to Log”. Such a license would be lost in the case of illegal activities, irresponsible behaviour, and so on.

These kinds of thoughts seem to be totally useless now.

8. Metadata

It hadn’t occurred to me that what they call “metadata” is useful, but now it does seem intuitive that it is very helpful. Basically every person is placed in a circle of their acquaintances, in circles of similarities according to many different categories.

This is similar to Facebook asking users to list their favourite books and music, etc. Facebook also benefits from this kind of metadata about its users (whom to show which ads, as the most basic example).

9. Cost

One can only feel sorry for the average American. Basically these surveillance systems cost each tax payer a few hundred dollars a year. Without any benefit to them, it would seem.

10. Balance of powers

I often hear praise for the American democracy and how there is a balance of powers.

None of that can be spotted here. It’s rather obvious that there is no supervision of the NSA. Last week it was reported that a senator inquired whether they as senators are also covered by the systems. Asking the question itself was already revealing; needless to say, the answer was disappointing.

One can make the case for secret courts in democracies, but I think it is obvious that these should be only used extremely sparingly. That is not the case here.

−

Ok, that’s all for now. You probably have your own thoughts about this. I’m sorry I couldn’t include links everywhere. Please share below, add corrections or other comments, if you could!

How to use the Web completely securely and keep your communications totally private (in the year 2013)

September 7, 2013 internet, systems No comments

Locations from IP addresses and the mobile future

March 24, 2013 internet, systems No comments

Let me start with mentioning that I’m writing this post on March 24, 2013. This kind of stuff is sure to change over time.

One of the things about the Internet is that it doesn’t have location built in. You don’t get to know where someone was when they wrote you an email, website operators don’t get to know where their visitors are when they visit the site. People don’t get to know where the craigslist servers are which put all those bytes together that make up a listing, and so on, and so on.

So people try to approximate. Because it is deemed to be useful information: country, region, city, latitude, longitude, ZIP code, time zone data and more. Even if it is not absolutely reliable, you can still derive some information, some picture. Different services are available, some free, some quite inexpensive, and some quite expensive. Most come in a form of a database file that you download, after paying for a license to use the data. Periodic updates to the database are made available. Web services are also available; this is where an IP address is submitted, over the Internet itself, to the service provider, and they immediately respond with their location data for this address. You can read more about IP Address Location at wikipedia.
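To make the database-file idea a bit more concrete, here is a minimal sketch of how such a lookup might work. The table rows are made up for illustration (they use address ranges reserved for documentation), and the names `RANGES`, `TABLE`, `STARTS` and `locate` are my own; real providers ship their own formats and libraries.

```python
import bisect
import ipaddress

# Hypothetical rows: (first address, last address, location).
# These ranges are reserved for documentation, not real assignments.
RANGES = [
    ("192.0.2.0", "192.0.2.255", "Vancouver, BC"),
    ("198.51.100.0", "198.51.100.255", "Calgary, Alberta"),
    ("203.0.113.0", "203.0.113.255", "Toronto, Ontario"),
]

# Convert addresses to integers and sort by range start for binary search.
TABLE = sorted(
    (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), place)
    for first, last, place in RANGES
)
STARTS = [row[0] for row in TABLE]


def locate(ip):
    """Return the location whose range contains ip, or None if no range does."""
    value = int(ipaddress.ip_address(ip))
    # Find the last range that starts at or before the address.
    index = bisect.bisect_right(STARTS, value) - 1
    if index < 0:
        return None
    first, last, place = TABLE[index]
    return place if value <= last else None


print(locate("198.51.100.7"))
print(locate("8.8.8.8"))
```

The answer is only as good as the table: if an address block is reassigned and the file has not been updated, the lookup still returns the old place.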

Of course, these services don’t work when people use so-called proxies and VPNs. In fact, proxies are set up precisely for the purpose of fooling services which are ordinarily restricted by location (e.g. for accessing Netflix or Spotify from non-US locations) into handing over the goods. Such proxies are not particularly costly to use.

Now, when you look at determining location for mobile devices, these models become quite questionable. Surely the network is not aligned with city boundaries, and surely a mobile device’s Internet address does not change smoothly as you move about. Of course, you can move faster than a database update.

So I thought I’d try out different IP Address location services – with my mobile device, using the “Data Plan,” not our home’s router or Wifi. I was in Vancouver, British Columbia, most definitely, the whole time.

Results

Source	Reported Location
http://www.ip2location.com/demo	Calgary, Alberta
http://www.iplocation.net/	Calgary, Alberta
http://www.ipaddressguide.com/ip2location	Calgary, Alberta
http://ipinfodb.com/index.php	Calgary, Alberta
http://www.maxmind.com/en/geoip_demo	Toronto, Ontario
http://www.infosniper.net/	Toronto, Ontario
http://www.ipligence.com/geolocation	Vancouver
http://www.geobytes.com/IpLocator.htm?GetLocation	Vancouver, BC
http://freegeoip.net/	Vancouver, BC
http://showip.net/	Vancouver, BC (this is said to be based on the free version of MaxMind, the paid version of which placed me in Toronto)
http://www.hostip.info/	“Location: … actually we haven’t a clue.”

So, you see, some got it right, and some did not. (I didn’t even bother to read or record the latitude/longitude that were given in some cases.)

thrackle.org alive again

March 2, 2010 internet, programming No comments

My thrackle.org website is alive again. It’s about a nice math problem that I worked on 10 – 18 years ago.

webcrawlers desperate for content

January 27, 2010 internet, programming No comments

I recently found this in the web server logs of one of the websites I look after:

38.100.8.50 - - [26/Jan/2010:05:01:44 -0800] "GET /application/json HTTP/1.1" 404 763 "-" "panscient.com"
38.100.8.50 - - [26/Jan/2010:05:01:47 -0800] "GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"
38.100.8.50 - - [26/Jan/2010:05:01:55 -0800] "GET /AppleWebKit/ HTTP/1.1" 404 763 "-" "panscient.com"
38.100.8.50 - - [26/Jan/2010:05:01:58 -0800] "GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"

In case you are not familiar with web server log files, these lines mean that someone/something from IP address 38.100.8.50 requested the pages named after “GET” on the website, for example, a page named “following-sibling::*” etc.

Does it need to be said that no such pages exist (that’s what the “404” indicates)?
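For readers who would rather see the fields pulled apart, here is a small sketch that splits one of the lines above into its parts. It assumes the common "combined" log layout that these lines appear to follow; the names `LOG_PATTERN` and `parse_log_line` are mine, not part of any web server.

```python
import re

# Assumed layout:
# host ident user [time] "method path protocol" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # e.g. 404 means "not found"
    return fields


line = ('38.100.8.50 - - [26/Jan/2010:05:01:47 -0800] '
        '"GET /following-sibling::* HTTP/1.1" 404 763 "-" "panscient.com"')
entry = parse_log_line(line)
print(entry["host"], entry["path"], entry["status"], entry["agent"])
```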

When I saw this I was rather puzzled; and looked up panscient.com (the last item on each line). Their home page says they provide some kind of vertical search service, whatever that is. On their FAQ page, I found this:

Why is your web crawler trying to access pages that don’t exist on my website?

Our web crawler attempts to extract links to valid web pages from javascript and other scripting languages. The crawler may misinterpret the information in these scripts and request a page that does not actually exist. These requests are attempts to retrieve valid web content, and are not an attempt to circumvent your webserver security.

(Emphasis mine) Oh ok. They are looking into javascript files on the web site and attempting to extract names of pages that might have content for the “vertical search”. But not successful in this case. As a web developer, I can tell you that javascript files very rarely contain interesting links to web pages.

Looks like a pretty competitive business when people start grasping at straws like this. Also, I take it bandwidth is easier to come by than crawling software that avoids such silly attempts.

truste.org ssl certificate problems

December 27, 2009 internet No comments

Today, a little note about a problem with https that I ran into with https://www.truste.org

When visiting that site my Firefox (Version 3.0) warned me that

Secure Connection Failed
www.truste.org uses an invalid security certificate.
The certificate is only valid for *.truste.com

Visiting https://www.truste.com instead simply timed out: “The server at www.truste.com is taking too long to respond.”

Looks like they didn’t configure their web server properly. A bit odd, since they present themselves “as the leading internet privacy services provider.”
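The mismatch Firefox complained about can be sketched in a few lines. This is a deliberately simplified check of my own (the function name `hostname_matches` is hypothetical), in which "*" may only stand in for the whole leftmost label; real browsers apply more rules than this.

```python
def hostname_matches(pattern, hostname):
    """Simplified wildcard check: '*' may replace only the leftmost label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard covers exactly one label, so the label counts must agree.
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        return pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


print(hostname_matches("*.truste.com", "www.truste.org"))
print(hostname_matches("*.truste.com", "www.truste.com"))
```

Under this sketch the certificate name fits www.truste.com but not www.truste.org, which is the situation the warning describes.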

slashdot down

May 7, 2009 internet No comments

Website administrators fear the slashdot effect (“slashdotting” / “being slashdotted”) — now slashdot.org, “News for nerds. Stuff that matters.”, is down itself. Here is a screen shot:

Unclear what “Guru Meditation” refers to, but in case you’re wondering, the Varnish link generated by the slashdot web server goes to http://www.varnish-cache.org. Which takes you to http://varnish.projects.linpro.no, which says,

Welcome to the Varnish project
Varnish is a state-of-the-art, high-performance HTTP accelerator

(The slashdot site was working again an hour later)

on github

April 23, 2009 internet No comments

Joined github today, you can look up my (future) public software at

http://github.com/stephanwehner

Added a little project which should make Rails development a little easier when it comes to working with the database directly. For now only for mysql. See my_sql.rb under http://github.com/stephanwehner/railsgoodies.

Thanks to my friend Sam for encouraging me.

Learn about git if you haven’t heard about it.

google broken

January 31, 2009 internet No comments

This morning Google’s search results don’t work.

Let’s say you search Google for water. Then:

Each result has a warning under its link “This site may harm your computer”:

Clicking on the link doesn’t take your browser to the page as usual, but brings up an error message.

Clicking on the “This site may harm your computer” link produces a help page with the title “Concerns About Web Search Results: Results labeled ‘This site may harm your computer’”:

So now I search Google for “Concerns About Web Search Results: Results labeled ‘This site may harm your computer’”. The first result is a Google support page at

http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=45449

But accessing that page leads to another error (no screen shot):

We’re sorry…

… but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can’t process your request right now.

What a mess!

Questions

Should visiting any web page really “harm your computer”?
On what basis would Google think that a web page is going to “harm your computer”? Does it take into account or even know what kind of computer you are using?
If Google had reason to believe a web page were to “harm your computer”, should the page be really listed as a search result? (Less is more?)
If a search result page is not marked with the warning, would you blame Google if you then visited the search result page and your computer came out “harmed”?
Are these search result pages getting too crowded altogether? craigslist has barely changed their listing format and they’re doing just fine.

Update

Of course, this was a temporary glitch. According to Google’s blog, “the errors began appearing between 6:27 a.m. and 6:40 a.m. and began disappearing between 7:10 and 7:25 a.m. [PST]”. (So I ran into this just towards the end, around 7:20.) The problem’s root cause is given as:

“Unfortunately (and here’s the human error), the URL of ‘/’ was mistakenly checked in [to a list of bad URL’s] as a value to the file and ‘/’ expands to all URLs”.

And it wasn’t StopBadWare.org’s list, as Google had originally posted. (The two organizations work together on this list.) Well, mistakes happen …
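To see why a lone ‘/’ in such a list is so destructive, here is a toy sketch. How Google actually matches entries is not described in the quote, so the substring rule below, the function `is_flagged` and the example entry are all assumptions of mine.

```python
def is_flagged(url, bad_patterns):
    """Hypothetical matcher: flag a URL if any listed pattern occurs in it."""
    return any(pattern in url for pattern in bad_patterns)


# A made-up entry standing in for a genuinely bad location.
bad_patterns = ["malware.example/downloads"]
urls = [
    "http://en.wikipedia.org/wiki/Water",
    "http://www.craigslist.org/",
    "http://malware.example/downloads/setup.exe",
]

before = [is_flagged(url, bad_patterns) for url in urls]

# The mistaken check-in: a lone "/" is part of every URL.
bad_patterns.append("/")
after = [is_flagged(url, bad_patterns) for url in urls]

print(before)
print(after)
```

With the extra entry, every URL in the list is flagged, which fits the picture of every single search result carrying the warning.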
