Quite some time ago a co-worker experienced a computer failure. She did not have a backup. She scoured her digital world for any trace of her data. Some of her music was on her iDevice, and while Apple doesn't like to let you copy music off their branded devices, various third parties provide the tools she used to make it happen. Then there were the several hundred photos she had posted on facebook, either in albums or merely as photos she'd been tagged in. Because facebook scales down uploaded photos, it makes a lousy photo backup site, but compared to total loss, her choice was evident.
But how to recover several hundred photos from facebook without spending days right-clicking? Ask me, her programming co-worker, of course!
Faced with the challenge of manipulating facebook's album pages to download all of her pictures, I turned to Firefox's Greasemonkey plug-in. Greasemonkey lets users run user scripts: custom JavaScript that gets applied to whichever web pages you choose, in this case facebook's album pages. Making extensive use of Firebug and some basic javascript skills, I was able to coax firefox into downloading an entire album. Running this user script against each of her album pages produced all of her pictures for her.
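It boiled down to something like the sketch below (the @include pattern, the image class name, and the filename scheme here are placeholders, not facebook's actual markup of the time):

    // ==UserScript==
    // @name      facebook album grabber (sketch)
    // @include   http://www.facebook.com/album.php*
    // ==/UserScript==

    // Grab every full-size photo on the album page and nudge firefox
    // into saving each one. Selector and filenames are placeholders.
    (function () {
        var photos = document.querySelectorAll('img.albumPhoto');
        Array.prototype.forEach.call(photos, function (img, i) {
            var link = document.createElement('a');
            link.href = img.src;
            link.download = 'photo_' + i + '.jpg'; // suggested filename
            document.body.appendChild(link);
            link.click();
        });
    })();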
She was pleased.
I was entertained.
It was only a few weeks later that she asked me to do it again. Without asking how she had lost her photos this time, I dug up the code I'd written (out of the shallow depths of the "Manage User Scripts" greasemonkey option) to help her out. Inspired by a combination of laziness (not wanting to run this on each album) and the programmer's let's-make-it-better syndrome, I wrote a companion script that would cycle through the facebook album list and open a new tab for each album, which in turn would run the original script. From there, I was exactly one step away from downloading her entire photo archive into matching sub-directories to zip and deliver to her. It definitely took longer to smooth out the script than it would have taken to hit each album manually, but it was more fun (I'm sure other programmers would agree). At this point, I realized what the next evolution had to be. If I could write a user script to cycle albums, why not one to cycle friends?
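The companion script amounted to roughly this; again, the album-link selector is a placeholder for whatever facebook's album list actually used, and GM_openInTab is just Greasemonkey's built-in way of opening a tab from a script:

    // ==UserScript==
    // @name      facebook album cycler (sketch)
    // @include   http://www.facebook.com/*/photos*
    // ==/UserScript==

    // Open every album on the album-list page in its own tab; the
    // grabber script above then fires in each tab and downloads
    // that album's photos.
    (function () {
        var albums = document.querySelectorAll('a.albumLink');
        Array.prototype.forEach.call(albums, function (album) {
            GM_openInTab(album.href);
        });
    })();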
I had to try it.
Some time later, I had a semi-automatic process for downloading every non-private picture, in every non-private album, from every friend I had on facebook. One question remained, though:
Should I use it?
I considered the following:
Curiosity won out, and here's what I found: my 152 friends have over 38,000 pictures between them. That's an average of 250 per person. To me, that sounds like a lot to have posted online, but then I considered how many pictures I have: fewer than 30. Many of my friends, it turns out, have fewer than 50 pictures, which implies that just as many have well over 250 posted, and they do. In fact, one even topped 1,000 images. To be fair, some of these pictures are counted twice: once in the album, and once in the owner's "tagged photos" collection. I think facebook currently has about 400 million users. If my small sample is representative of the average user, and I cut my number in half to generously account for the tagging redundancy, then facebook hosts approximately 50,000,000,000 images. That's fifty billion photos of weddings, puppies, sunsets, and bars. At an average of 51,461 bytes per photo, facebook is hosting a whopping 2,340TB worth of images.
That seems crazy; I'd better check my math.
Yep.
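For anyone who wants to check it themselves, here's the arithmetic spelled out, using only the figures from my sample above:

    // Back-of-the-envelope estimate using the numbers above.
    var users         = 400e6;        // rough facebook user count
    var photosPerUser = 38000 / 152;  // 250, from my 152 friends
    var dedupe        = 0.5;          // halve it for tagged-photo duplicates
    var avgBytes      = 51461;        // average photo size in my sample

    var totalPhotos = users * photosPerUser * dedupe;             // 50,000,000,000
    var totalTB     = totalPhotos * avgBytes / Math.pow(1024, 4); // ~2,340 (binary TB)

    console.log(totalPhotos + ' photos, about ' + Math.round(totalTB) + 'TB');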
Does anyone have any more accurate or official numbers for the image burden on facebook's network?