I’m pleased to announce that an initial version of the EDRM Enron Email Data Set, consisting of 40GB of PST files with attachments and folder structure, is now available within the EDRM project as of the EDRM 2009-2010 Kick-Off Meeting. The EDRM Data Set Project is now working to make this data set publicly available.
A number of people have contacted me about getting the current PST corpus by alternative means. This is partly due to the bandwidth restrictions in place for the HTTP download. I had planned to add other download methods but haven’t had time yet. Until then, if you will be at
An increasingly important aspect of email and file management is the issue of open vs. closed file formats. Open formats are gaining popularity and allow organizations to retain control of their own data without the costs often associated with vendor lock-in. The acceptability of high switching costs and sometimes operational costs is giving way to the