Word Clouding Gmail Junk
I couldn’t resist it… I had to know what ‘word clouding’ my Gmail junk folder would look like.

I can sleep safely at night now that I can see ‘unsubscribe’ is very popular. Maybe it’s time I started reading the junk, clicking ‘unsubscribe’ and buying watches from my junk mail folder.
For those with the morbid curiosity to do the same for their own mailbox: Gmail currently documents only one way of extracting emails, which is POP3/IMAP4. However, a quicker alternative, if you use ‘offline email’, is to read the local Gmail Gears SQLite database. Gears stores the offline data in that database, so it’s just a case of querying it with a SQL statement and writing the results to a text file. I then pass that text file to the word-cloud generator.
The location of the database can vary depending upon the browser you are using and whether you are going through HTTP or HTTPS. Look here if you want to find the specific location of the database. The code I used to extract the mails is quite short:
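import java.io.BufferedWriter;
import java.io.FileWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadGMail
{
    public static void main(String[] args) throws Exception
    {
        // Load the SQLite JDBC driver (its jar must be on the classpath).
        Class.forName("org.sqlite.JDBC");
        // SQLite JDBC URLs take the form jdbc:sqlite:<path to the database file>.
        String url = "jdbc:sqlite:gmail.com-GoogleMail#database";
        // try-with-resources closes everything, even if the query fails.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stat = conn.createStatement();
             ResultSet rs = stat.executeQuery("SELECT c0Subject, c1Body FROM MessagesFT_content;");
             BufferedWriter out = new BufferedWriter(new FileWriter("gmail.txt")))
        {
            while (rs.next())
            {
                // Either column may be NULL, so fall back to an empty string.
                String subject = rs.getString("c0Subject");
                String body = rs.getString("c1Body");
                out.write((subject == null ? "" : subject) + "\n");
                // Strip HTML tags from the body before writing it out.
                out.write((body == null ? "" : body).replaceAll("<[^>]*>", "") + "\n\n");
            }
        }
    }
}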
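If you would rather see the numbers behind the cloud, the extracted text file can also be counted directly. What follows is only a rough sketch of the kind of tally a word-cloud generator presumably makes, not how any particular generator works. The class name WordFrequency, the helpers countWords and topWords, the three-letter minimum and the UTF-8 encoding are all my own assumptions; only the gmail.txt file name comes from the code above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, not part of the original extraction code.
class WordFrequency
{
    // Count how often each word occurs, ignoring case and words under 3 letters.
    static Map<String, Integer> countWords(String text)
    {
        Map<String, Integer> counts = new HashMap<>();
        // Split on any run of characters that are not letters.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+"))
        {
            if (word.length() < 3)
            {
                continue; // also skips the empty string a leading separator produces
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Return up to n entries, most frequent first; ties are ordered alphabetically.
    static List<Map.Entry<String, Integer>> topWords(Map<String, Integer> counts, int n)
    {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) throws IOException
    {
        // Defaults to the gmail.txt file written by the extraction step.
        Path input = Paths.get(args.length > 0 ? args[0] : "gmail.txt");
        // Assumes UTF-8; change this if the file was written in another encoding.
        String text = new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> entry : topWords(countWords(text), 20))
        {
            System.out.println(entry.getValue() + "\t" + entry.getKey());
        }
    }
}
```

Run it from the same directory as gmail.txt and it prints the twenty most frequent words with their counts, which should roughly match the biggest words in the cloud.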