bill's blog

Just another WordPress weblog




I work in IT, and one of my job functions is to warehouse the image files of a corporate creative department. Translated… that means I buy a lot of storage. One of the things storage admins look at is the failure rate of the disk drives that make up their SAN environments. The higher the failure rate of a particular drive, the better your chances of having a catastrophic loss… or, in other words, you’re restoring from tape if you lose a lot of drives at one time!

MTBF (or mean time between failures) is a standard measurement (in hours) we use to estimate the life of a disk drive before it fails. The other measurement we use is AFR (or the annualized failure rate), which is expressed as a percentage based on the MTBF versus the amount of time the device is powered on and running. A couple of things to note… MTBF is not necessarily a device’s useful life. And AFR is not meant to be applied to a single drive; rather, it is the expected failure rate of any given drive within a particular production run (population).

So what does this all mean?

Well, most vendors spec consumer-geared disk drives at about 300,000 hours MTBF. That being said, the key word in MTBF is M (or mean). So what we’re looking at is about half of the drives in a given population failing within the first 300,000 hours of use.

Translated again… and I got help on this one 😉

If you had 600,000 drives with 600,000 hour MTBFs, you’d expect to see one drive failure per hour. In a year you’d expect to see 8,760 (the number of hours in a year) drive failures, or a 1.46% Annual Failure Rate (AFR) (Harris, 2007).
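The arithmetic above is easy to sanity-check. Here’s a quick sketch (plain Python, nothing vendor-specific) of the AFR-from-MTBF relationship, assuming the drives run 24×7:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760

def afr_percent(mtbf_hours: float) -> float:
    """Annualized failure rate implied by an MTBF, as a percentage.

    AFR = (hours powered on per year / MTBF) * 100, assuming the drive
    is powered on and spinning around the clock.
    """
    return HOURS_PER_YEAR / mtbf_hours * 100

print(round(afr_percent(600_000), 2))  # 1.46 -- the figure quoted above
print(round(afr_percent(300_000), 2))  # 2.92 -- a typical consumer-class spec
```

Notice that halving the MTBF doubles the expected annual failure rate — and, as Google found, the real-world rate can be higher still.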

Realizing that this is what a manufacturer quotes as the expected life, one has to ask how that holds up in reality. Well, Google did a bit of research on this and found that their failure rate was much different from the manufacturers’ numbers. Why? Because there is no clear alignment between what a manufacturer considers a failure and what the real world expects of these devices.

In reality many factors will determine whether a drive should remain in production. Call it an IT admin’s intuition… call it that odd clicking sound… call it taking forever to save a file… Often we (IT professionals) will replace a drive before it is completely unusable (or the point where we can no longer retrieve data from the device). Did the drive fail? Technically no… practically yes! If we can’t rely on the drive to reliably save and retrieve data, then it has failed for our purposes… guess some manufacturers don’t see it the same way!


Harris, R. (2007, February 19). Google’s Disk Failure Experience. Retrieved June 3, 2010.

First of all, we should all know by now that FTP is not the most secure protocol there is. User IDs and passwords are passed on the wire as plain text. So my approach to finding FTP sites was to use Google; as a search string I entered inurl:ftp. This yielded 23,400,000 hits.
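To illustrate the plaintext problem: Python’s standard-library ftplib offers both plain FTP and FTP over TLS (FTPS — a different fix than the sftp mentioned later, but the same idea of keeping credentials from crossing the wire in the clear). A minimal sketch; the connection steps are commented out, and the host and credentials are placeholders, not a real site:

```python
from ftplib import FTP, FTP_TLS

def make_ftp_client(use_tls: bool = True):
    """Return an (unconnected) FTP client object.

    Plain FTP sends USER/PASS as readable text on the wire; FTP_TLS
    wraps the control channel in TLS so the login is encrypted.
    """
    return FTP_TLS() if use_tls else FTP()

# Typical use (hypothetical host/credentials for illustration only):
# client = make_ftp_client()
# client.connect("ftp.example.com", 21)
# client.login("user", "secret")
# client.prot_p()   # also encrypt the data channel before transfers
```

Anyone with a packet capture on the path can read a plain-FTP login; with FTP_TLS they see only the TLS handshake.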

This site belongs to Wietse Venema. For those of you who are not aware, Wietse Venema is the author of Postfix… one of the most popular MTAs (or Mail Transfer Agents). Additionally, he is the author of a number of security-related applications; SATAN and The Coroner’s Toolkit are just two of them. Interestingly enough, his web presence is run via the FTP protocol, so technically it isn’t a web site. The site is used to distribute all of the above-mentioned applications, including a few others not mentioned, all of which he has worked on! These are applications that we as UNIX administrators should be aware of, if not use on a regular basis.

Next stop, back to Google… The search string inurl:ftp inurl:mil yielded about 9,250 hits… many fewer, BUT a bit more interesting. This time, instead of using a browser to access the site, I chose to use an FTP client. My choice… Cyberduck!

This site seems to be dedicated to the transfer of GPS and flight plan documentation. I was able to find data on air traffic routes in China: Sup 08_004 (Atch)/315.jpg

OR how about Swedish Armed Forces Jeppesen approach charts? MIL CL2 SUP 6029-6032 OF 2009.pdf

It seems someone in this organization is using a Nortel Ethernet Routing Switch 8600, with the software release visible. This could be of interest to someone profiling this site.

BUT of interest to me was that I could upload to /pub2/giat_files/incoming. Now I must say that the directory was set up as a drop box, so it could not be exploited as a warez site. As for the rest of the directories on this site, permissions were set so that one could download from or enter more interesting directories.

FTP can be a valuable tool, but care must be taken to secure the site as much as possible. We use FTP to transfer files to different parts of the organization. While some of the sites are external to the company, many are not! They are located behind our corporate firewalls and are protected with firewall rules on the host itself… only certain sites have access to the manufacturing drawings, as not all individuals within the company need access to them. Where the sites are external, other protocols are used, sftp for one.

People need to be able to transfer data from one part of the organization to another. The mail system is NOT designed to handle the load of constant file transfers. Not only that, but individuals who do transfer files via email inevitably use their inboxes as a filing cabinet for these emails.

“Oh, I need to keep this file for future reference!”

This creates problems for the email and helpdesk technicians. They have to warehouse these files and, depending on governmental regulations, this could create a storage nightmare, as the files need to be kept for extended periods of time. Rebuilding a user’s inbox is the bane of any administrator’s day!

“Why can’t they archive off these emails?”

The collection, storage, and distribution of data files is not going to go away anytime soon. With today’s push for greener IT, I fear the storage demands will only grow. One must find a method to centrally organize these assets to avoid duplication of resources. A side benefit of central storage is the ability to better control the accessibility of these files. While I’m not sure whether or not those documents were for public distribution, it’s probably safe to say that some of the materials up on that site could be used for other than their intended purpose. If you’re going to put your assets out online, better protect both the host and the files they contain!

DNS cache poisoning is a technique that tricks a DNS server into believing it has received authentic information when, in reality, it has not. So to understand the effect this has… think about how many times you use DNS in a given day. Let’s say that the DNS server that provides resolution for you has been compromised, so that anytime you use it, it points your machine to an attacker-controlled site. The site acts just like the real thing, except that when you do a lookup for an online banking site, it redirects you to an evil banking website. One can see how this would be a problem. To exacerbate the problem, DNS information is usually cached based on the record’s TTL (time to live). Depending on how long the TTL is, you could be directed to these evil sites for quite some time.
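For a feel of why a forged answer can be accepted at all: a resolver matches a reply to its outstanding query largely by a 16-bit transaction ID (plus the source port). Here is a minimal sketch, using only the Python standard library, of what a DNS A-record query looks like on the wire — the transaction ID field at the front is the value an off-path attacker has to guess to slip a fake answer into the cache:

```python
import os
import struct

def build_dns_query(hostname: str) -> bytes:
    """Build a minimal DNS query packet for an A record (RFC 1035 layout).

    Header: 16-bit transaction ID, flags, then four 16-bit counts.
    Question: the name as length-prefixed labels, then QTYPE and QCLASS.
    """
    txid = int.from_bytes(os.urandom(2), "big")  # random 16-bit transaction ID
    flags = 0x0100                               # standard query, recursion desired
    header = struct.pack(">HHHHHH", txid, flags, 1, 0, 0, 0)  # 1 question
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"                                  # root label terminates the name
    question = qname + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question
```

Only 65,536 possible IDs — which is why modern resolvers also randomize the source port, and why DNSSEC goes further and signs the records themselves.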
So what to do if you feel that you’re dealing with bad cached data?

To flush the DNS cache on a Linux box:

Start a terminal session and (on distributions running the name service cache daemon, nscd) type /etc/rc.d/init.d/nscd restart

To flush the DNS cache on a Microsoft Windows machine:

Open the Command Prompt and type ipconfig /flushdns