EDRM VI Kickoff Meeting – Data Set Project Update

I recently returned from the EDRM VI Kickoff Meeting in Minneapolis and wanted to provide everyone with an update for the Data Set Project, which I co-chair. The Data Set Project’s goals have expanded to cover projects that will not only make testing and evaluation of eDiscovery solutions easier, but also projects that should lower the costs of processing through better culling and streamline the litigation process through better information on ESI for negotiations and expert witnesses. Our current projects are listed below:

  • EDRM ESI Reference Data Sets: EDRM provides a number of reference ESI data sets that can be used for testing and benchmark purposes. Currently, these include the following:
    • EDRM Enron PST Data Set: 40GB of Enron e-mail messages and attachments in PST format organized in 32 zipped files, each less than 700 MB in size, containing 168 .pst files.
    • EDRM File Format Data Set: 381 files covering 200 file formats.
    • EDRM Internationalization Data Set: A snapshot of selected Ubuntu localization mailing list archives covering 23 languages in 724 MB of email.
  • EDRM Hash Data Sets: Hash data sets for use in culling collections to remove non-user generated files. The hash sets will provide hashes for files to cull on a deterministic and probabilistic basis.
    • EDRM Software Reference Data Set (SRDS): An enhancement of the NSRL or “NIST List,” the EDRM SRDS or “EDRM List” seeks to provide a list of hashes covering popular software as it is installed on the system and tools with which to generate the hashes.
    • EDRM Probabilistic Hash Data Set (PHDS): This project seeks to create a probabilistic approach for determining whether a file is a user file or a system file for culling purposes. For this system, there would be no need to positively identify a file as a known file beforehand as with the EDRM SRDS.
  • EDRM Data Set Documentation Projects
    • EDRM ESI Checklist: When litigants prepare for the initial Meet & Confer, the EDRM ESI Checklist will help ensure that litigants are covering potential ESI locations for both the parties they represent and opposing parties.
    • EDRM ESI Guide: The EDRM ESI Guide is designed to be the eDiscovery practitioner’s guide to ESI and the nuances of ESI types that are encountered in the eDiscovery process. Expert witnesses, users, and vendors should be able to use the EDRM ESI Guide to ensure they understand how ESI looks and behaves from an eDiscovery perspective.
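
To make the deterministic culling idea behind the hash data sets concrete, here is a minimal sketch in Python. It assumes the known-software list is available as a set of hex SHA-1 digests; the function names and the choice of SHA-1 are illustrative assumptions, not the EDRM tooling itself.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Set


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cull_known_files(paths: Iterable[Path], known_hashes: Set[str]) -> List[Path]:
    """Keep only files whose hash is NOT in the known-software set.

    known_hashes is a hypothetical stand-in for a reference hash list;
    anything matching it is treated as non-user-generated and dropped.
    """
    return [p for p in paths if sha1_of_file(p) not in known_hashes]
```

A file survives culling only when its digest is absent from the reference set, which is why a deterministic list has to identify each known file in advance while a probabilistic approach would not.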

The first two project categories are covered in the EDRM VI Kickoff Presentation for the Data Set Project below, while the documentation projects were just initiated at the kickoff meeting.

If you are interested in participating in any of these projects, please join EDRM and sign up for the Data Set Project.


ZL Unified Archive 7 Honored as 2010 Stevie Awards Finalist

Published on May 14, 2010 by in award, ZL

I am proud to announce that ZL Unified Archive 7 has been named a finalist and honoree in the 2010 Stevie American Business Awards in the category of New Products & Product Management.

Special thanks go to the development and product management teams which have worked hard to make Unified Archive 7 a success. Additional thanks go to the marketing staff who have worked with product management closely to receive this recognition.

Some key improvements in UA 7 include:

  • New User Interface: The web user interface has been replaced by a modern Web 2.0 AJAX UI which should be very familiar to users.
  • Advanced Analytics: An add-on to Discovery Manager, the Advanced Analytics module provides features such as concept search, topic clustering, visualization, customized hit-highlighting, etc. The concept search and other new capabilities were extensively tested in our 2009 TREC Legal Track research project and we’re proud to include it in our UA 7 offering.
  • File System Archiving and Management: Archiving of file systems has been enhanced to provide end-user and organizational benefits. For end-users, the ACL mirroring feature now enables granular, secure search and access to the archive based on file system permissions. Organizations that wish to analyze their network file systems without archiving all their data to archive storage can now use the manage in place capability to index without archiving.
  • Exchange 2010 Archiving and Journaling Support: UA 7 fully supports Exchange 2010 for both archiving and journaling. While Exchange 2010 has begun to offer built-in archiving capabilities, organizations will still want to consider Unified Archive for increased management capabilities and scalability.
  • Performance Improvements: With ZL customers managing billions of emails and files in a single Unified Archive deployment, improved performance in search maintains ZL’s search performance advantage.

These are among the highest-level new features. For additional features, please contact ZL Technologies, Inc.

There is also a People’s Choice award so please go vote for ZL.

For more information, see the PR Newswire announcement.


AIIM Healthcare Content Management Lunch Seminar in San Francisco

Unstructured content is as important as ever for health records, and EHR management needs to move beyond managing structured database content to handle other types of content, including hand-written notes, forms, diagnostic images, video, audio, and other multimedia formats critical to patient care. Come learn how these forms of ESI can be managed through integration of CMS and EHR systems in the AIIM Goldengate Lunch and Learn seminar, The Role of Content Management in Electronic Health Records by Deborah Kohn.

Deborah’s talk will cover recent EHR incentives under the Health Information Technology for Economic and Clinical Health Act (HITECH) provisions of the American Recovery and Reinvestment Act of 2009 (ARRA), signed into law by President Obama on February 17, 2009, as well as the impact of the 2010 Patient Protection and Affordable Care Act (PPACA) health care reform act.

While EHR is not my specialty, I am looking forward to the talk, with a special interest in hearing Deborah’s opinion on EHR and the iPad.

Come join us at AIIM Goldengate to learn about EHR, unstructured content, and the 2010 PPACA and 2009 ARRA/HITECH acts.

  • Date: May 25th, 2010
  • Location: UC Berkeley Extension, 425 Market Street, 8th floor, San Francisco
  • Information: Event page, Presentation

Photo courtesy of Salim Virji.


How Would Iron Man Manage his Email?

Published on May 6, 2010 by in Scalability, ZL

I’ve always been a fan of Tony Stark and Iron Man because this super hero was created through the use of innovative technology. Because the movie is heavily laden with CGI, it is even more interesting to learn about the technology that was used for content management while creating the movie and how it can be used for archiving and eDiscovery as well.

As luck would have it, Iron Man 2 is opening this weekend and I have been invited to attend a special screening by the folks at Isilon where they will also discuss the back-end production of the movie and how Isilon’s technology was used to unify, manage, and access the movie’s content. More than the movie itself, I’m curious to see how Isilon’s clustered storage system and OneFS file system provided advantages for these specific requirements, and if there are any parallels to our use of storage systems for unstructured content archiving and eDiscovery. At ZL Technologies, we partner with many storage providers but I’ve always had a special interest in Isilon since their clustered storage solution was used for our Unified Archive 6.0 scalability tests. In that system, we archived over 1 million email messages per hour across a grid of low-end, commodity Pentium 4 servers backed by an Isilon IQ 3000i storage cluster with InfiniBand.

I looked at a few other technology companies marketing Iron Man 2 and found that a number of them provide high-end, scalable solutions, ones that our customers often deploy with the largest ZL Unified Archive deployments:

  • Oracle: ZL Unified Archive runs on Oracle Database and is compatible with both Oracle RAC and DataGuard. I’ve architected solutions using Oracle and gave a presentation at last year’s Oracle OpenWorld. Oracle’s scalability allows our largest customers to archive billions of emails in a single Oracle-backed deployment.
  • VMware: Virtualization allows ZL Unified Archive to scale easily from 1 to 100s of servers. Going beyond simple virtualization benefits, our Grid architecture allows these servers to be deployed into a true elastic computing private cloud by using the same template and configuration across servers, since each ZL server can run any or all tasks. Tasks or roles can be assigned to specific servers by clustering servers by role. VMware’s virtualization supports production scaling at some of our largest customer deployments.
  • Isilon: Isilon’s clustered storage provides an ideal back end for ZL Unified Archive’s private cloud architecture. While VMware provides the infrastructure for the elastic compute cloud, Isilon’s clustered file system provides an ideal platform partner for Unified Archive’s Virtual File System, abstracting the storage layer from each of the individual UA servers.
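
The grid idea in the VMware item, where every server can run any task unless it has been clustered by role, can be illustrated with a small Python sketch. The task names, server names, and the `servers_for_task` helper are all hypothetical; this is not ZL’s configuration format or API, just one way to picture the scheme.

```python
from typing import Dict, List, Optional, Set

# Hypothetical task names, used only for illustration.
ALL_TASKS: Set[str] = {"archive", "index", "search"}


def servers_for_task(
    task: str,
    servers: List[str],
    role_clusters: Optional[Dict[str, Set[str]]] = None,
) -> List[str]:
    """Return the servers eligible to run a task.

    role_clusters maps a server name to the set of tasks it is restricted
    to. A server with no entry keeps the default and may run any task.
    """
    role_clusters = role_clusters or {}
    return [s for s in servers if task in role_clusters.get(s, ALL_TASKS)]
```

With no role clusters defined, every server is returned for every task, which mirrors the identical-template deployment described above; adding an entry for a server narrows it to its assigned role.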

Given the strong partners supporting Iron Man 2, it raises the question: what email archive and pan-Enterprise eDiscovery product would Tony Stark choose? The natural answer is ZL Unified Archive:

  • ZL Technologies: With its cloud-based grid architecture, ZL Unified Archive has been scaled to the largest corporate enterprises in the world. The combination of Unified Archive’s content management and eDiscovery software with strong technology foundations from Oracle, VMware and Isilon provides one of the most scalable, high performance, and low TCO enterprise archiving solutions today. As a technology and business person, I find the combined solution very interesting from both systems architecture and TCO perspectives.

So enjoy Iron Man 2, think about the scalable technology used to power the movie, and how that same technology can be combined with ZL Unified Archive to drive an email archiving and eDiscovery solution for Tony Stark and Stark Enterprises!


Data Mapping Nuts and Bolts

Published on April 13, 2010 by in eDiscovery

AIIM Infonomics just published my contributed article titled “Data Mapping Nuts and Bolts” in their April 13, 2010 issue. While there are many articles and white papers on data mapping, when I was asked to write this article, I took a look at the existing material and realized that I had not run across a concise list of reasons to perform data mapping. So for this article, I provided just that along with a definition of data mapping to meet those requirements. The full AIIM article provides in-depth information covering my definition of data mapping, the requirements it addresses, implementation steps, and integration with a full end-to-end eDiscovery solution.

A data map is a listing of the organization’s ESI by category, location, and custodian or steward, including how it is stored, its accessibility, and associated retention policies and procedures.


  1. Data map for delivery to opposing party: FRCP Rule 26(a)(1)(A)
  2. Meet & confer meeting preparation: FRCP Rule 26(f)
  3. Not reasonably accessible argument support: FRCP Rule 26(b)(2)
  4. Safe harbor and sanction avoidance: Rule 37(e)
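
As a rough illustration of the data map definition above, one entry could be modeled as a simple record in Python. The field names and the helper function are assumptions made for this sketch; they are not taken from the AIIM article.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataMapEntry:
    """One row of a data map; field names are illustrative assumptions."""
    category: str               # e.g. "email", "file share"
    location: str               # system or site where the ESI lives
    custodian: str              # custodian or steward responsible for it
    storage: str                # how it is stored, e.g. "online", "backup tape"
    reasonably_accessible: bool
    retention_policy: str


def not_reasonably_accessible(entries: Iterable[DataMapEntry]) -> List[DataMapEntry]:
    """List entries that might support a not-reasonably-accessible argument."""
    return [e for e in entries if not e.reasonably_accessible]
```

Keeping accessibility as an explicit field means the same listing can serve several of the purposes above, from meet and confer preparation to supporting an accessibility argument.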

For more information, read the full article on


8 Things You Can’t Afford to Ignore About eDiscovery

On Thursday, February 25, I gave an eDiscovery presentation to the AIIM Golden Gate chapter titled “8 Things You Can’t Afford to Ignore About eDiscovery.” 8 Things comes from John Mancini’s AIIM 8 Things Series, which provided the umbrella concept for the talk. The presentation is designed to provide an overview of current trends in eDiscovery that are often discussed today and how they can improve eDiscovery performance by lowering costs and improving litigation outcomes. The talk generated a lot of interest, going 40 minutes past our scheduled cutoff due to the engaging discussion.

The topics covered were:

  1. Early Case Assessment
  2. Data Mapping
  3. Investigative eDiscovery
  4. Concept Search
  5. Non-Linear Review
  6. Parallel Search
  7. End-to-End eDiscovery
  8. Cloud Computing

The presentation was focused on education and steered away from vendor pitching, which has been an issue with some previous AIIM presentations. I was happy to receive the following testimonial from an eDiscovery services provider indicating the presentation provided the right balance.

I really appreciated your presentation today. It is always a learning experience for me to hear others talk about the subjects I think I know so well. I like that there is always more to learn.

I also appreciate that you did a great job covering the topic– you did not simply pitch your company’s products. That said I must admit that from our talks before and after the presentation and some of the topics you covered in your presentation you definitely have me interested in learning more about ZL.

- Director of Technology, eDiscovery Services Provider

If you have any questions on this presentation, please post here or on the Golden Gate chapter’s LinkedIn group.


Update: The blog article that accompanies this talk was posted to John Mancini’s Digital Landfill blog on March 12th. Click here to view as PDF.


ZL Technologies Among KMWorld’s ‘100 Companies that Matter in Knowledge Management’ for 2010

I’m pleased to report that ZL Technologies has been named one of KMWorld’s 100 Companies that Matter in Knowledge Management for 2010. This list was created by a team of KM practitioners, theorists, analysts, vendors, customers and colleagues and will be posted to the KMWorld website on March 1, 2010.

ZL Unified Archive provides a unique value proposition for organizations looking for a scalable information governance platform covering messaging (email, IM, Blackberry, eFax, etc.) management, file systems management, eDiscovery, compliance, and related capabilities. The elastic grid architecture and virtual file system allow it to scale and add capabilities in a similar fashion to cloud computing architectures today (such as Amazon Web Services) while giving organizations the flexibility to deploy the system on-site or at a remote provider. Recent features added to the product include concept search, clustering, data mapping, visualization, faceted search, and search preview.

However, features are only worthwhile if they are deployed and proven useful in the field. ZL Unified Archive has been deployed at some of the world’s largest enterprises, archiving millions of emails per day. Below is a partial list of ZL customers who have deployed the ZL Unified Archive platform.

Previously, ZL Unified Archive 7.0 was named a KMWorld Trendsetting Product of 2009.


Information Governance – The Evolution of Email Archiving?

Published on February 23, 2010 by in email

The email archiving space has changed and evolved dramatically since it was created to deal with Microsoft Exchange mailbox management. From there, SEC and NASD compliance requirements led to the creation of mail server journaling and the need to archive journaled email as well as instant messages and other communications sent by broker-dealers. Then in 2006, the amendments to FRCP formally introduced email archives to eDiscovery. Fast forward to 2010 and we now have 1000s of SharePoint sites within a single company, proactive eDiscovery, reactive eDiscovery, and other requirements.

It seems about time the space adopted a new name fitting for the growing and expanded requirements for the unstructured content archives that started off as email mailbox management solutions. I thought of this last year and started using the term Information Governance internally. Since then, I’ve run across the term (through no action of my own) used by The 451 Group and, as of today, ARMA in connection with their Legal Information Technology Conference 2010. The ARMA conference covers the following topics which seem especially suited to the evolution of email archives:

  1. Cloud computing and data hosting
  2. Web and Enterprise 2.0, i.e., Twitter, Facebook, YouTube, Yammer, Portals/Intranets, Wikis, Blogs, Instant Messaging, etc.
  3. Rules of Professional Conduct/Lawyer Ethics
  4. Email management
  5. Knowledge management
  6. Virtualization
  7. SharePoint
  8. Managing multiple jurisdictions
  9. Electronic records management/electronic document management
  10. Emerging technologies and trends (Web 3.0/Semantic Web, Unified Communications, etc.)
  11. Conflicts of Interest/new business intake
  12. Point applications being deployed which affect information governance (digital dictation, litigation support software, tax document prep software, etc.)
  13. eDiscovery: implications for firms and their clients

What do you think? Is Information Governance a good successor term for Email Archiving? Are there better terms?

Photo courtesy of Mzelle Biscotte.


Forrester, Cloud Storage, and Private Clouds

Forrester recently released a report titled “Business Users Are Not Ready For Cloud Storage: Current And Planned Adoption Of Storage-As-A-Service Is Minimal For Now” which indicated few firms are showing interest in moving their data into the cloud, noting that:

Respondents in all geographies and of all company sizes appear to have little interest in moving their data to the cloud any time soon.
– Forrester

Out of 1,272 respondents, just 3% have implemented cloud storage and only 1% plan to expand an existing cloud deployment. Indeed, the vast majority of respondents indicated no plans to adopt cloud storage:

  • 43%: no interest in cloud storage
  • 43%: interest but with no plans
  • 5%: plans to adopt one year or later in the future
  • 3%: plans to adopt in next 12 months

Specifically, concern with current offerings centered around:

  1. guaranteed service levels
  2. security
  3. chain of custody
  4. shared tenancy
  5. long-term pricing

These concerns are valid and need to be addressed before any mission critical data is stored with an outside vendor.

However, as valid as these concerns are, the promised benefits of cloud computing remain very compelling. For organizations that want the benefits of cloud computing while retaining control of the infrastructure, private cloud computing is the answer.

With private cloud computing, organizations deploy their own on-premises private cloud computing infrastructure (e.g. VMware) supporting elastic, autonomic software solutions that enable server consolidation, rapid scale-up and scale-down, and low cost management over potentially large server grids, offering the best of both worlds.

IT Organizations Will Spend More Money on Private Cloud Computing Investments Than on Offerings From Public Cloud Providers Through 2012
– Gartner

In-house cloud solutions need to be designed from the ground up with scalability in mind, leveraging an elastic grid of processing servers (similar to Amazon’s EC2) and a scalable, virtualized storage system (similar to Amazon’s S3). By combining grid processing and virtualized storage with virtual machine images (using VMware or similar hardware virtualization), organizations can receive the benefits of public clouds within their own walls and under their own control. Using a hypervisor enables organizations to quickly scale up and down a properly designed solution to handle tasks such as archiving, eDiscovery collections, and indexing in place. One such solution is ZL Unified Archive, which has been designed to easily scale from 1 to hundreds of servers using an elastic, cloud computing architecture which I discussed in my Oracle OpenWorld 2009 presentation. This cloud-based solution can be deployed in-house or run by a service provider with virtualized storage in the cloud or on premises. Through this solution, organizations receive the combined benefits of a cloud architecture with security and reliability guarantees that come with a non-cloud solution. The ZL Unified Archive solution is currently deployed at leading US enterprises and eDiscovery providers for managing large quantities of content for archiving and eDiscovery.

I invite anyone who is interested in combining the benefits of cloud computing with the security, reliability, and control of an in-house archiving and eDiscovery solution to contact ZL Technologies to learn about our unique solution.

Photo courtesy of dsevilla.


ZL Unified Archive Honored with 2009 Law Technology News Technology Award

Published on December 18, 2009 by in award, ZL

ZL Unified Archive has been honored with the Law Technology News 2009 Technology Award. The LTN awards are selected by actual product users among LTN’s 40,000 subscribers across a variety of legal requirements. ZL Unified Archive was selected in the area of Records Management, an area that EDRM is calling Information Management, to help organizations proactively manage their information for better litigation readiness, reduction of information risk, and sanction avoidance.

We congratulate the 2009 LTN Award winners, and applaud their creativity and innovations. The awards dramatically illustrate how our community is determined to develop and adopt superb technologies that help legal organizations deliver better, faster, and cheaper legal services in these turbulent economic times.
– Monica Bay, editor-in-chief of Law Technology News.

The LTN Vendor Satisfaction Survey covers the following 9 attributes:

  1. Brand Reputation
  2. Detail of vendor literature
  3. Ease of installation
  4. Ease of integration with other technology products
  5. Customer Service responsiveness
  6. Availability of training
  7. Ease of integration into the firm’s workflow
  8. Features and functions
  9. Price for value