Difference between revisions of "Dropbox Crawler"
== Client source code ==
Download the source code by clicking [here]. You can compile the project using the [http://ant.apache.org/ Ant] build tool or any Java IDE (we use [http://netbeans.org/ NetBeans] v7.2.1).
Revision as of 15:13, 8 January 2013
Personal cloud storage is becoming increasingly popular, and Dropbox is certainly the best-known example. Cloud storage generates a huge amount of Internet traffic, so understanding how people interact with such applications is essential for designing efficient cloud storage systems.
We have been doing research on the usage of Dropbox (see our results here). As a next step, we need to know what types of files people store in the service. This would allow us to understand the impact of some technologies on system performance and on network traffic, among other things. For that, we need volunteers to provide us with basic statistics (size, type, etc.) about the files stored in their folders.
Be part of the crowd
All you need to do is run a Java application on your PC. This application will:
- Read your Dropbox folder
- Calculate basic statistics
- Show everything for your approval
- Only after that, send the statistics to us.
Most people will be able to run the application by clicking here. If your browser does not support that, you can download the package and run it by double-clicking on it.
What will be captured?
For each file/folder in your Dropbox, the program will collect:
* Size in bytes
* Last modification time
* Mime type found by the Mime Type Detection Utility
* File extension - the sub-string after the last "." on the file name
* Hash (MD5) of both initial and final 8 bytes of the file
* Hash (MD5) of the file name
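As a rough illustration of the per-file statistics listed above, the Java sketch below shows one way they could be computed. It is not the crawler's actual code: the class and method names are hypothetical, `Files.probeContentType` stands in for the Mime Type Detection Utility, and it assumes that the first and last 8 bytes are hashed separately (files shorter than 8 bytes contribute whatever bytes they have).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not taken from the crawler's source code.
final class FileStatsSketch {

    // MD5 digest of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Sub-string after the last "." in the file name, or "" if there is none.
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // Up to 8 bytes taken from the start (fromEnd == false) or end of the file.
    static byte[] edgeBytes(Path file, boolean fromEnd) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            int n = (int) Math.min(8, raf.length());
            byte[] buf = new byte[n];
            raf.seek(fromEnd ? raf.length() - n : 0);
            raf.readFully(buf);
            return buf;
        }
    }

    // One line per statistic; the file content itself is never included.
    static String describe(Path file) throws IOException {
        String name = file.getFileName().toString();
        return String.join("\n",
            "size=" + Files.size(file),
            "modified=" + Files.getLastModifiedTime(file).toMillis(),
            "mime=" + Files.probeContentType(file), // may be null if unknown
            "extension=" + extension(name),
            "headHash=" + md5Hex(edgeBytes(file, false)),
            "tailHash=" + md5Hex(edgeBytes(file, true)),
            "nameHash=" + md5Hex(name.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Calling `FileStatsSketch.describe(path)` on a regular file returns only the derived values; the name and content stay on the local machine in this sketch.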
The program will also send to us:
* Hash (MD5) of your Dropbox configuration files (or a hash of your MAC address if we cannot read the former)
* Hash (MD5) of the path of your Dropbox home folder
* Your IP address
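The fallback described in the list above (hash the configuration files, or the MAC address if they cannot be read) might look like the following sketch. The names are hypothetical, how the configuration files are located and read is left out, and the choice of the first network interface that reports a hardware address is an assumption rather than the crawler's documented behavior.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Enumeration;

// Hypothetical helper, not taken from the crawler's source code.
final class InstallIdSketch {

    // MD5 digest of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("MD5").digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Prefer the configuration bytes; fall back to the MAC address if they
    // are missing (null or empty means "could not be read" in this sketch).
    static String installationId(byte[] configBytes, byte[] macAddress) {
        byte[] source = (configBytes != null && configBytes.length > 0)
                ? configBytes : macAddress;
        if (source == null) {
            throw new IllegalStateException("no identifier source available");
        }
        return md5Hex(source);
    }

    // First hardware address reported by any interface, or null if none.
    static byte[] firstMacAddress() throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        while (nics != null && nics.hasMoreElements()) {
            byte[] mac = nics.nextElement().getHardwareAddress();
            if (mac != null && mac.length > 0) {
                return mac;
            }
        }
        return null;
    }

    // Hash of the Dropbox home folder path; the path itself is not sent.
    static String pathHash(Path dropboxHome) {
        return md5Hex(dropboxHome.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

Because only digests are produced, the receiver can tell two submissions from the same installation apart from others without learning the configuration content, the MAC address, or the folder path.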
How will we use this information?
All our results will be submitted for publication and made freely available on this website. In the future, we will release a summary of our data on this website as well, so that anyone will be able to use our data sources for further research.
We will, however, take extra actions to ensure that no sensitive information will be in these datasets. Note that the only information that could potentially reveal your identity is your IP address, which we will anonymize. All other statistics cannot be related to the person owning the files.
What will this program NOT do?
- Copy any file or folder from your computer
- Copy any information other than what is listed above
- Install or store anything on your computer
You can also take a look at the source code if you have any doubts about the program.
Client source code
- You can find more information about our work in this paper:
Drago, I., Mellia, M., Munafò, M. M., Sperotto, A., Sadre, R., and Pras, A. (2012). Inside Dropbox: Understanding Personal Cloud Storage Services. Proceedings of the 12th ACM Internet Measurement Conference (IMC'12), Boston, Nov. 2012.
- This page has more information about the data we used in our research so far.
These institutes are running this research: