
Why We Say “Performance Is A Feature”

Published on October 4, 2011 in coding, Google

Singing the Praises of the Unsung Hero

The phrase “Performance is a Feature” has been growing in popularity. Lately, its importance has been underlined for me by the fact that the most common response I get when showing off my early-stage mailing list archive, Grokbase, is that “it’s fast.” I first heard this idea phrased this way in Google’s Think Quarterly and then in Jeff Atwood’s Coding Horror, which was covered on Hacker News. Matt Brittin, managing director of Google’s UK and Ireland operations, put it like this:

At Google, we often think that speed is the forgotten ‘killer application’ – the ingredient that can differentiate winners from the rest. We know that the faster we deliver results, the more useful people find our service.

I’ve worked in some high-performance environments (enterprise search over billion-plus document archives) and written benchmark reports, so I know performance is important. Still, this got me thinking about the meta-question: sure, “we” (those in the know) know that performance is important, but why is it important to say that it is important?

Motivation

For me, the strongest reason for talking about Performance as a Feature is that it can help motivate and focus development. This matters because performance can be vitally important yet difficult to achieve, while leaving little to show by metrics such as lines of code and visible features. Indeed, a product screenshot can look exactly the same after a 100x speed-up.

The reasons performance can be hard to achieve are threefold.

  1. Performance is (often) Mentally Challenging
    Often, performance is an architectural problem-solving exercise, not the mere act of taking a requirement and executing on it. You also often need to take into account many domain-specific properties in order to fully optimize the system. So you may need to understand the problem space in detail, come up with a solution, and then convince others that the design should be accepted before you even start coding.
  2. Performance is (often) not Greenfield
    Once you come up with a good solution, you often cannot simply implement it, because you are improving an existing process. The new solution needs to integrate into the existing one and either be a drop-in replacement or change the code around it. Like repainting a car after it gets dinged, the affected code can be much larger than the immediate problem. Further, if data conversion is needed, the conversion process may take longer than development or deployment once the new code is in place.
  3. Performance is (often) Hard to See
    Finally, it can be hard to tell your peers what you accomplished when what you accomplished was a performance improvement, since there is nothing to see. To complicate matters, you are not necessarily working on a new product or even a new feature; indeed, you may just be speeding up someone else’s creative work. That the speed-up itself is what makes the work useful at all can be hard to convey. Here, metrics tell the best story to other developers; to people using the app, it is simply fast.

Summary

As developers, we like to see the fruits of our labor, and this often means working on new features that are visible and that we can talk about. However, when talking to users, the first thing they notice and talk about may well be your app’s performance. So when developing an app, it is important to remember that:

Performance is not just any feature; it may be a killer feature.

Have you had to implement performance features? What were the challenges you faced?

 

So You Want To Be A Coder

Published on October 2, 2011 in coding

Self-Directed Job Enlargement

This article is dedicated to my friends and colleagues working in non-coding technical roles that wish to make a transition or at least increase the amount of coding in their work.

The Dilemma

In software development teams above a certain size, there are often software engineers and systems engineers (including testing, support, sales, and technical marketing engineers). On some of the teams I have been a part of, a number of systems engineers have wanted to become software engineers / coders but face a dilemma: if they were not hired as developers, are valued in their current systems role, and have not proven themselves able coders, how can they start coding? The situation is often something like this:

  1. A systems engineer wants programming assignments but is not getting any
  2. The development manager realizes the engineer wants to move into coding but does not see enough ability or demonstrable interest to assign coding tasks in lieu of tasks that need to be performed from a business perspective
  3. Without coding tasks, the systems engineer feels he cannot demonstrate ability and thus become a coder

So it seems we have a catch-22: the systems engineer wants to become a software developer but cannot find a way to do so, which can lead to decreased job satisfaction.

So what can a systems engineer do? Beyond coding on the side and taking classes, which may not affect work at the company, how can you get recognized as a coder by your coding peers at the company?

The Do It Yourself (DIY) Approach

While I have never desired to become a full-time coder when I wasn’t hired on as one, I have had my code recognized at several companies, and the ways I’ve done it may give you a leg up.

First of all, I don’t view coding as a special discipline where only people specifically assigned to code do the coding; after all, most technology companies are not unionized. My view is that code is a tool, much like a spreadsheet is a tool. When you have a task that can use a spreadsheet, use a spreadsheet, and when you have a task that can use some code, write some code.

One of the best ways to get started is to write some code to make your own work more efficient. This way the code can easily be justified as one way to accomplish a work task, and you’ll get used to coding as a natural part of how you work. If you are in an “Ops” role, you can also view this as a natural extension of “DevOps”. Once you can help yourself, you can move on to helping others and even entire teams. As people see your tools become useful, you will gain more recognition as a coder and may eventually move into a formal coding role.

As a firm believer in the Do As I Do and Not As I Say approach, here are some situations where I wrote programs on the job that weren’t my formal responsibility and received recognition for them:

  1. Be Pro-active

    The first part of coding for yourself is to take initiative and not wait until someone gives you a coding task, since you may be waiting a very long time. If you see something that might be interesting, roll up your sleeves and see what happens.

    A while back, I would work with our customers on acceptance testing of our product. In this situation, our expertise and value were primarily in the C++ back-end; however, our front-end code was available for inspection as it was written in a dynamic, interpreted language. I thought the code could be a bit tighter, and since I was the one representing my firm on the front line, I refactored our front-end code on my own time. When I showed it to our development manager, it was recognized as good work, and a software developer was assigned to integrate my code.

  2. Help yourself

    The great thing about coding is it can automate and improve consistency for repetitive tasks, making time-consuming tasks manageable and otherwise impossible tasks time-consuming. If you are faced with a task like this, write a program to help yourself out. Creating tools for yourself can have positive spill-over effects as well.

    In one situation, we had not yet built an automated installer, and I needed to run multiple installations / upgrades of our software on 6 servers. To streamline the then-manual process, I wrote an automated, “one click” installation / configuration / upgrade script. Not only did others notice, through my installs and upgrades, that I had a one-click process for an otherwise lengthy task; it also gave me insight that I was able to contribute to our dev team when we did create a formal installer.

  3. Help a Colleague

    Once you can help yourself, you can move on to helping others. This requires additional skills as you’ll need to do requirements gathering and push multiple releases if your tool / program is successful. However, the rewards are much greater because now your skills are making an impact on the people around you.

    While acting as a product manager ahead of a release, we had to update the localization of our product for a double-byte language where we had significant customers. We were facing significant schedule risk with our process, so I worked with our translation and build engineers to create a tool to speed up the translation. We finished on time, and the tool was useful enough that I updated it for several releases. Additionally, its colored, graphical status pages made the tool more usable by the translator and more noticeable by others from afar.

  4. Help an entire team and get the buzz out

    Finally, when you are ready, you can build an app that is used by many people internally. If it is successful, your app will go viral inside your organization, and people will start creating buzz about your code for you. The additional challenge, compared with working closely with a few colleagues, is that people you don’t normally work with will suddenly start asking you for your app and may even bring new product requirements you will need to manage.

    In this case, the app I built is most likely still in use to derive competitive advantage, so I’ll leave out the details. Our team was working with a Fortune-50 account, and I ended up building an app that was used by our team and presented to the client. We successfully closed the account due to many factors; however, it was recognized that this app was a significant contribution. Eventually the app became important enough that it was actively requested by our staff (and escalated to higher management when I was otherwise too busy), mentioned by the CEO in all-hands staff meetings, and attracted the attention of the CTO, who took a personal interest in it.

In all of these situations, coding was not my responsibility, but I took it upon myself to make my life and that of my colleagues easier by coding. In the process, my code was recognized by peers including software engineers, systems engineers, salesmen, and my CTO. It was also easy to justify spending my time on these coding projects because they improved the work we already needed to do.

While I did not want to become a coder in any of these situations, my projects received enough recognition that I believe it would have been much easier for a development manager to justify a formal switch or coding assignments had I wanted them.

Summary

If you are in a situation where you want to code and need to get some recognition for coding before getting a formal role or task, find a project that makes your own work easier and more effective. If these projects impact the bottom line, and if you can get your colleagues using and creating buzz about your apps, you’ll have gone a long way toward showing your coding chops.

Have you made the transition from a non-coding role to a coding one? If so, how did you do it? What worked and what didn’t work?

 

Getting Your Hands Around Email – Introduction

A little while back, Craig Ball wrote an article, “E-Mail Isn’t as Ethereal as You Might Think,” for Law Technology News, which described some high-level basics of the MIME Internet mail format standard. Much more technical than the typical LTN article, it highlighted the need for more articles and discussion on the ESI itself. In that vein, here is the first of several articles discussing and examining different email formats. Keep in mind that processing email for E-Discovery may be best performed by legally sound email management products that have been verified by leading, independent, third-party litigation consultants.

This isn’t just geek stuff. It’s lawyer stuff, too.
- Craig Ball

Major Email Types Encountered in E-Discovery

Here is a short introduction to the major types of email encountered in E-Discovery.

  • Internet (MIME/mbox): Virtually all, if not all, mail servers today can handle MIME format email. Open source mail servers often use MIME as their default email format for sending email within the environment and out to users of other mail servers, while servers like Exchange and Domino send / receive MIME when communicating outside their deployment. MIME is an open standard defined by the Internet Engineering Task Force (IETF) in several Requests for Comments (RFCs); the email format itself is described in RFC 5322. Mbox files are container files for MIME format messages: the basic format is a text file comprising a concatenated list of MIME messages, with a special “From line” delineating the start of each message (see the sketch after this list).
  • Microsoft (MSG/PST, MIME/EML): Microsoft Outlook’s native email format is MSG, a file format described in MS-OXMSG. End-users deal with Personal Storage Table (PST) files more often than MSG files; however, many E-Discovery practitioners are familiar with MSG files, which often get included with native productions. End-users can generate MSG files by dragging email from Outlook and dropping it onto the Desktop or another file system area. PST files are container files for MSG format files. While Microsoft Outlook does not support MIME email, you can read it using Microsoft Windows Live Mail (WLM) or Outlook Express: simply ensure the MIME mail has the .EML file extension and open it in WLM or Outlook Express.
  • Lotus (Notes CD/NSF, DXL): Before MIME was established, Lotus created its own proprietary rich data format, called Notes Compound Document (aka Notes CD, Notes Rich Text). NSF files are container files for Notes CD format messages. In Lotus 6 and later, Lotus mail can also be exported as DXL objects.
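To make the mbox / MIME description above concrete, here is a minimal sketch (mine, not taken from any product mentioned in this article) of reading an mbox container with Python’s standard library; the file name is a hypothetical placeholder.

    # Minimal sketch: iterate over an mbox container of MIME (RFC 5322) messages.
    # "sample.mbox" is a hypothetical file name used only for illustration.
    import mailbox
    from email.header import decode_header

    def readable(msg, name):
        # Decode a possibly RFC 2047 encoded header into readable text.
        parts = decode_header(msg.get(name, ""))
        return "".join(
            p.decode(enc or "ascii", errors="replace") if isinstance(p, bytes) else p
            for p, enc in parts
        )

    box = mailbox.mbox("sample.mbox")
    for msg in box:                        # each item is one full MIME message
        print(readable(msg, "From"), "|", readable(msg, "Subject"))
        if msg.is_multipart():             # attachments and alternative bodies are sub-parts
            for part in msg.walk():
                print("  part:", part.get_content_type())

Each message the loop yields is delimited in the underlying file by the “From line” mentioned above, which is what makes mbox a simple concatenation of MIME messages.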

Email Types in the EDRM Enron Email Data Set 2.0

To get a full appreciation for the different email formats, it is useful to look at some actual email in each format. The EDRM Enron Email Data Set 2.0 supports multiple formats which can be explored. The email was produced by ZL Unified Archive®, which can archive / collect / manage email in the various native formats and convert between them as well.

  • EDRM XML: This is the open E-Discovery load file standard as defined by the EDRM XML working group. The EDRM XML files in this data set include ESI metadata along with native email in MIME format (with attachments) and extracted native attachments as well as text extracts.
  • MIME: While the MIME files are included in the EDRM XML distribution, it is possible to access the MIME without reading the EDRM XML. This has been useful for some research organizations.
  • PST: All of the email is also produced as PST files for the custodians. These files can be read directly in Microsoft Outlook or processed by virtually all archives and E-Discovery tools.

Email Types in the EDRM Internationalization Data Set

The EDRM Internationalization Data Set provides email in an additional format:

  • mbox: Mbox files are available in the following languages: Arabic, Catalan, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, and Turkish.

Closing

I anticipate writing a few more articles on this topic, exploring each of the different types of email. It is my hope that lawyers and other E-Discovery specialists will be able to “grok” email a bit more through these posts.

If you are interested in learning more about these email formats, how to manage them in your enterprise, and how to migrate between them, consider contacting ZL Technologies. ZL Unified Archive® can not only manage email on Exchange, Domino, and Internet mail servers, but it can also migrate email between the different formats.

Image courtesy of: UK Pay Day Loans.

 

Bill Clinton and Stevie Wonder on Keeping the Passion to Innovate Alive

One of the treats of attending large technology conferences is the opportunity to hear inspirational speakers. At last year’s Oracle OpenWorld, I was fortunate to hear James Carville and Mary Matalin, so I was excited to learn that Bill Clinton and Stevie Wonder were to speak at this year’s DreamForce conference. As excited as I was, their messages still exceeded my expectations; being in the 3rd row did not hurt either. While many will be reporting on their talks, I will report on their messages through the lens of business, technology, and innovation, an approach I picked up from Chris Riley at the AIIM Golden Gate Chapter‘s award-winning Tenuta Vineyards wine blending event, where we discussed blending ECM technologies the way we had blended wines earlier in the day.

Keeping the Passion Alive

A question posed to both speakers was how, given their long, illustrious careers, they keep the passion alive to drive innovation year after year, decade after decade. This is an important topic for technology firms that try to keep the passion of a start-up as they grow into larger enterprises. Stevie answered, tongue-in-cheek, “You have to pay the bills,” to some laughter. More seriously, both speakers said it was important to live with passion and be true to who you are. Stevie said of his work, “Music is like life, you have to live with passion for life,” and closed with “just be you … you are meant to be you.” Bill remarked on several things that have kept his passion alive:

  • working on a large idea or concept,
  • doing something that you both like and are good at, and
  • having fun at what you are doing (variation of above).

It is no accident that the most successful people are driven by passion for their work and not simply by achieving a certain result: they just love what they are doing. Of course, in today’s global economy and the current recession, it is becoming more important than ever to find that kernel of who you are, to differentiate yourself and possibly just to pay the bills given the increased competition for jobs. Perhaps Stevie’s first answer is not so far off the mark.

Doing Great Work

The second big takeaway was the importance of doing a great job no matter what you are doing. Bill mentioned that his foundation is working to help developing countries grow sustainably and to help build systems where people can get good results from good effort, and that you should strive for good work no matter your economic condition. He said that “Being poor is not the same as being sloppy,” and also remarked that,

When you build things that work, good things happen.
- Bill Clinton

For start-ups, it means that even when you may have few resources, you still need to put out a top notch product. Sometimes, you may need to cut back on features to ensure all the features you ship are top notch.

Great Work is Not Enough, You Have to Advertise!

Bill also lamented his party’s performance in the past mid-term elections. Specifically, he mentioned that although his party had raised substantial funds, it did not use those funds to tell the story of its successes, resulting in the loss of many more seats than forecast and the loss of control of Congress. He did not blame their adversaries but looked within and said that it was his party’s fault that the electorate simply was not informed of the great work that had been accomplished.

From a technology company perspective, this highlights the need for and importance of marketing to communicate the great product being built by the development teams. From a product management perspective, these are linked to drive a company and product’s success. The Pragmatic Marketing educational organization lists both product management and product marketing as product management disciplines. Reflecting on the importance of product management, they now list executive leadership as a product management discipline, a topic for another post.

Being Forever Young and The Business of Tomorrow

Looking at the US and its chances of remaining a dominant country amid the rapid rise of countries such as China and India, Clinton mentioned that he liked America’s odds, but that to be successful the country needed to be forever young, that “we have to be a tomorrow country,” and that the US needs to become a “Laboratory of Democracy” as a nation, something once said of state governments.

We have to get back into the tomorrow business
- Bill Clinton

Along with moving back into the tomorrow business, he said that “everyone’s job is to work themselves out of a job,” a key message that tomorrow’s jobs will be vastly different from today’s. He highlighted the move to IT jobs over the course of his presidency, when such jobs comprised just 8% of all jobs but accounted for 30% of job growth and 35% of income growth.

One way to apply this to the technology field is to look at the relentless rate of innovation at the 34-year-old Apple Computer which, according to the article, is driven by three key factors:

  1. It invests heavily in R&D,
  2. it is unafraid to cannibalize or kill its own products, and
  3. it is able to extend its core technology across a host of different products to create a dominant ecosystem of consumer gadgets.

These are similar to Bill’s points about working yourself out of a job and being in the business of tomorrow.

Recap

The speakers were thought-provoking and inspirational, and I am glad to have been there. My article is certainly a selective take on these talks; however, I think this format captures some key messages from business, technology, and innovation perspectives. My thanks to the folks at SalesForce.com for bringing us such great speakers.

Image courtesy of: LIFE.

 

In-House E-Discovery “Lunch and Learn” Panel Moderated by George Socha in San Francisco

The AIIM Golden Gate Chapter is holding an E-Discovery Lunch and Learn panel on reducing cost and risk via in-house E-Discovery. We’ve assembled a well-rounded expert panel, representing both inside and outside counsel to discuss and share their experiences with you. The event is being held in San Francisco at Duane Morris and will cover the following topics:

  • Is the trend to move E-Discovery in-house increasing? Why?
  • Can technology reduce time and costs across the E-Discovery process? How?
  • Can we reduce risk through early data assessment? How?
  • Can private cloud computing technology improve in-house E-Discovery? How?
  • Can we successfully implement an in-house E-Discovery process? How?

The speakers include:

This should be an especially interesting panel given the background of the participants and the high interest in moving E-Discovery in-house to manage growing volumes of litigation, as shown in surveys by Fulbright & Jaworski and the Enterprise Strategy Group. In the Fulbright 6th Annual Litigation Trends Survey, 47% of respondents planned on bringing components of E-Discovery in-house to reduce costs. Similarly, 48% of respondents in the ESG Trends in Electronic Discovery survey had active projects to bring parts of the E-Discovery process in-house. Additionally, 73% of respondents in the ESG survey indicated they had plans to bring portions of the E-Discovery process in-house, and it will be interesting to hear about some of these projects.

Please join us for this informative discussion. More information is available in the announcement below and the registration page.

Update: We had a great event with a strong turn out. I’d like to thank everyone who attended and participated in putting on this event.

In-House E-Discovery Panel Discussion: How to Reduce Cost and Risk by Bringing E-Discovery In-House to Get …

 

Enterprise Archive and E-Discovery Scalability via Case Studies

Leading industry analysts have found that enterprise data (ESI) continues to grow at over 60% annually, with over 80% of that data being unstructured content (IDC, 2009). To manage these tremendous volumes of user-generated content, organizations are well advised to turn to unified archiving/E-Discovery solutions that scale not only in terms of data under management but also in terms of performance across the board, including ingestion, search, disposition, preservation, and export. This post focuses on some scalability metrics; I will discuss how scalability can drive efficiencies in future posts.

At ZL Technologies, we pride ourselves on providing the most scalable and technically advanced archiving/E-Discovery solution; however, that message can often get lost when verifiable results give way to unverifiable marketing claims. After reading about a recent archiving/E-Discovery vendor’s scalability claims, I decided to compare their published case studies with Vivian Tero‘s IDC customer case study on ZL Unified Archive. To be fair, I decided this comparison should cover only generally available, published case studies. The vendor’s largest deployment case studies generally say something along the lines of “the customer had x number of users” but do not actually state that that many mailboxes were archived or under management; however, I gave them the benefit of the doubt and used the highest number provided. Even so, the results were astonishing:

Now that I have your interest, let’s take a closer look at this ZL customer case study.

The Competition

This customer was a sophisticated firm that already had an email archive in place. Nevertheless, it performed an evaluation with the major vendors and eventually selected ZL:

Bank Holding Company wanted a solution that could address its compliant message archiving, eDiscovery, supervision, and mailbox management projects. It evaluated the on-premise email archiving solutions from the following vendors: Symantec, Autonomy (ZANTAZ), CA, IBM, EMC, Unify (AXS-One), and ZL Technologies. Bank Holding Company conducted an onsite evaluation on the feature sets it required and employed a third-party organization to certify the search and retrieval performance of the email archiving applications in the short list. After a thorough and complex evaluation process, Bank Holding Company eventually decided upon ZL Technologies. The Bank evaluated the email archiving products and eventually selected the ZL Technologies Unified Archive solution

The Problem

There were several problems; however, one particular pain was the length of time it took to extract messages from the system for E-Discovery.

eDiscovery search and retrieval was increasingly becoming an operational issue. The organization’s eDiscovery team had to conduct searches across individual mailboxes, messaging archives, and backup tapes. With this approach, the search and export of 1.5 million messages took six to eight weeks to complete. Bank Holding Company was looking for a more efficient solution.

With ZL Unified Archive, exporting 1.5 million messages can take less than a day with a moderately sized system, and I have personally performed this task with the EDRM Enron Data Set, which consists of 1.3 million messages.

The Requirements

The requirements were multi-faceted; I will cover them in more detail in a later article.

  1. Integrated workflows and technical support for compliant archiving and retention, supervision, mailbox management, and eDiscovery
  2. Legal hold case management and fast search and retrieval
  3. Support for both Domino and Exchange environments
  4. Support for virtualization and Oracle databases
  5. Vendor flexibility and support
  6. Strong customer references

Successful Deployment

With ZL Unified Archive, the bank was able to solve its E-Discovery problems successfully and efficiently.

The Bank was archiving over 6 million messages a day, of which 2.5 million were archived into WORM storage for FINRA/SEC compliance. As of the publication of this document, the Bank had ingested over 2 billion messages to support more than 173,000 mailboxes.

The eDiscovery team uses these self-service features to enforce the retention and legal hold policies. There are currently over 78 million messages on legal hold within the ZL Unified Archive. Also, the eDiscovery team is using the self-service features to conduct investigations and legal searches.

Further Reading: The IDC Case Study

Read more about what I think is one of the most exciting email archiving and E-Discovery deployments in the IDC ZL Unified Archive case study.

IDC Case Study: Email Archiving & eDiscovery at Bank Holding Company using ZL Unified Archive

 

Arcot Systems Acquired by Computer Associates for $200 Million

Arcot Systems has agreed to be acquired by Computer Associates for $200 million. I joined Arcot Systems as employee #32 a while back and worked there for several years. Many of the people I worked with at Arcot are still there and they have my congratulations. The following is a screen shot from their current homepage:

Computer Associates is combining Arcot’s authentication portfolio with their SiteMinder portfolio, which they received from their $430 million Netegrity acquisition announced back on October 6, 2004. At Arcot, we were partners with Netegrity and other authorization product firms.

At the time, we had also started to do Authentication as a Service in the cloud via the TransFort service for Visa, MasterCard, and JCB. This service has since been extended into the A-OK authentication service for enterprises. The text below is from the current SaaS Cloud Computing page.

Arcot has been offering cloud authentication services since 2000 when we launched our TransFort e-Payments authentication service, now branded A-OK for e-Commerce. Since then, Arcot has helped over 13,000 financial institutions comply with Verified by Visa, MasterCard SecureCode and JCB J-Secure card authentication programs. In early 2008, Arcot launched its A-OK for Enterprise fraud detection and strong authentication service to provide “authentication-as-a-service” for secure access to online banking, Web portal and VPN applications. Today, Arcot’s cloud computing services serve over 50 million users, worldwide. Hosted in multiple SAS 70, PCI DSS-compliant data centers, Arcot A-OK services are highly scalable, configurable, and multi-tenant efficient.

One of my contributions at Arcot was to envision and evangelize a new technology that extended the use of our seminal invention, the ArcotID, to standards-based X.509 public key infrastructures (PKI). This allowed us to address integration requirements, open new markets, and establish new partnerships. Our engineering team was able to flesh out the design and implementation, with four of us (Robert Allen, Robert Jerdonek, Tom Wu, and myself) named on the patent application that was filed, US Patent Application 20020126850. It was refiled as 20100172504, which is presented below.

US Patent Application: Method and apparatus for cryptographic key storage … 20100172504

I enjoyed my time at Arcot. It’s nice to see a good outcome for the firm and team.

 

The Enterprise Archive as the eDiscovery System of Record

With the typical Fortune 1000 firm now having over 5 petabytes of data, including SharePoint and social media, large enterprises can benefit from having a single “source of truth” or system of record for eDiscovery. Instead of having to collect, search, and analyze data from multiple repositories, a centralized system can allow legal, records management, and IT staff to automatically connect to those repositories and make them accessible for both custodian-based ICP (identification, collection, and preservation) and matter-based ICP and culling. Barry Murphy, an industry analyst and thought leader, notes that while “no one category of [information management] solution has yet to emerge as the big eDiscovery winner”:

Where I see a lot of interest now is in archiving all the high-volume, user-generated content, [...] the information deemed necessary could be archived [...] and the archive could become the eDiscovery system of record. – Barry Murphy

The interest in archive software partly relates to its suitability for the large-scale information management needed to cover records / retention management, preservation, and search.

  • Advantages for Archives: Archives generally provide information management capabilities for the largest and most interesting source of ESI, email, along with other user-generated content such as file servers and collaboration systems. Many can already scale to the document quantities managed by companies in the target market and have retention management and legal hold / preservation capabilities. Already, many leading organizations are looking to archives to streamline their eDiscovery process through proactive management. Leading archives such as ZL Unified Archive® are now moving beyond simple archiving to support “fast reactive” eDiscovery using manage-in-place and automated collection capabilities.

While other solution categories may partially meet the needs of organizational eDiscovery, there are significant technical and core-competency challenges facing them:

  • Challenges for ECM Solutions: ECM solutions have traditionally focused on managing the life cycle of smaller quantities of ESI, such as ESI specifically tied to workflows managed by the ECM solution or ESI that has been designated as a record from a records management perspective. Both typically consist of drastically smaller quantities of ESI than may be needed for eDiscovery, so while ECM solutions may provide a good workflow, they face significant scalability challenges in managing the quantities of data in some of the larger enterprises.
  • Challenges for Collection Tools: Collection tools are generally more scalable and handle larger quantities of data, but they do not “manage the data” in place from a records and retention management perspective involving classification, retention, disposition, and deletion management.
  • Challenges for Review Tools: Many legal teams are most familiar with eDiscovery review tools, as they spend a large amount of their time reviewing documents while relying on IT teams to collect the documents they review. Review tools generally do a good job of searching and marking smaller quantities of documents, with typical eDiscovery cases ranging from hundreds of thousands of documents upwards to 2 million documents. However, they do not have the records and retention management capabilities needed, nor do they typically scale to the hundreds of millions and billions of documents that exist in larger organizations.

This is not to say the information management market for the eDiscovery system of record has been decided, but certain application classes may have more advantages than others, and these should be carefully considered when seeking a solution.

 

Information Governance: Precrime and Early Case Assessment

Published on August 13, 2010 in ECA, eDiscovery

I recently posted an article titled Best Practices: Stopping Precrime on The Modern Archivist. In this article, I wrote about the ability of organizations to stop information crimes before they happen by integrating “Precrime Intelligence” and Early Case Assessment into their standard, everyday Information Management processes, the same way that Tom Cruise attempted to stop crimes before they happened by analyzing the data that he had prior to going on site and collecting physical evidence.

While convicting people of precrime is not justifiable, as demonstrated in the movie, leading companies are using Precrime Intelligence today to stop electronic violations before they occur as part of a broader Information Governance strategy. Precrime Intelligence allows organizations to automatically stop corporate and HR violations by analyzing ESI (email, files, etc.) and flagging potential violations for review before the ESI has been delivered and the violation realized. By halting violations while they are still unrealized, organizations can lower their information risk profile.
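As a toy illustration of the idea (my own sketch, not a description of any product mentioned here), a pre-delivery check might scan outgoing messages against policy terms and hold matches for review; the policy terms and the deliver / hold hooks below are hypothetical placeholders.

    # Toy sketch of pre-delivery policy screening; the policy terms and the
    # deliver / hold_for_review hooks are hypothetical placeholders.
    POLICY_TERMS = {"insider", "guarantee returns", "off the books"}

    def screen_message(subject, body):
        # Return the policy terms a message triggers, if any.
        text = f"{subject}\n{body}".lower()
        return sorted(term for term in POLICY_TERMS if term in text)

    def handle_outbound(subject, body, deliver, hold_for_review):
        hits = screen_message(subject, body)
        if hits:
            hold_for_review(subject, hits)   # flagged before delivery, violation unrealized
        else:
            deliver(subject, body)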

The diagram below outlines the process for integrating Precrime Intelligence and ECA into a more traditional eDiscovery review model. The four columns below match the first four columns of the EDRM model (Information Management, Identification, Collection / Preservation, and Processing / Analysis / Review), demonstrating that, with the right solution, Analysis and Review can be brought forward in the process and performed proactively, before costly manual collection.

For more information, please visit The Modern Archivist.

 

Automated Collection: Mitigating the Risks and Costs of Manual Collection

Published on August 12, 2010 in ECA, eDiscovery

Jason Baron, a thought leader in electronic discovery, recently mentioned a topic that “ought to be blogged about,” namely automated collections vs. manual collections. Automated collection is the use of software and hardware to improve the speed and reliability of collection over the network, while manual collection often involves physically gathering hard drives, manually exporting email from mail servers, and the like. To frame the discussion, it is useful to think about Google, the king of automated collection. Google indexes billions of web pages across countless web servers on the internet. To do this, Google runs the GoogleBot, an automated agent that efficiently locates and crawls websites to find information that is then automatically indexed and made searchable. Imagine if Google had to have a person go to each website, manually navigate a browser to each webpage, and then click “Save Page As” in the web browser. While this process is certainly doable, it would be neither cost-effective nor timely; certainly no reasonable person would seek to build a search engine using manual collection. Given the state of technology available today, some judicial and industry leaders are wondering what the risks of manual collection are from an eDiscovery perspective and whether it is still reasonable or defensible to perform manual collection.
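To make the contrast concrete, here is a minimal, hypothetical sketch of the kind of thing an automated collection agent does: walk a mounted file share over the network, hash each file for integrity, and record collection metadata in a manifest, with no per-file human action. The paths and manifest format are illustrative assumptions, not a description of any particular product.

    # Hypothetical sketch of one automated collection pass over a mounted share.
    # The source path and manifest format are illustrative assumptions only.
    import csv, hashlib, os
    from datetime import datetime, timezone

    def collect(source_root, manifest_path):
        # Walk source_root, hash every file, and log collection metadata.
        with open(manifest_path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(["path", "size_bytes", "sha256", "collected_utc"])
            for dirpath, _dirs, files in os.walk(source_root):
                for name in files:
                    full = os.path.join(dirpath, name)
                    digest = hashlib.sha256()
                    with open(full, "rb") as fh:
                        for chunk in iter(lambda: fh.read(1 << 20), b""):
                            digest.update(chunk)
                    writer.writerow([full, os.path.getsize(full),
                                     digest.hexdigest(),
                                     datetime.now(timezone.utc).isoformat()])

    # collect("/mnt/custodian_share", "collection_manifest.csv")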

Dean Gonsowski responded to Jason’s call in an article titled “Manual Collections of ESI in Electronic Discovery Come under Fire,” in which he writes:

there’s no dispute that the “automated” collection methods available in litigation software referenced above have a number of features that make this approach more efficient – Dean Gonsowski

While he does not elaborate, the natural follow-on question is: what benefits does automated collection provide? Going beyond collection, we can extend this to ask what the advantages of Automated Identification, Collection, and Preservation (ICP) are over Manual ICP. Here are some benefits that come to mind:

  1. Improve Success Rates and Lower Costs with Early Case Assessment (ECA): Early Case Assessment requires either pre-collection analysis or automated collection to avoid the long lag time typically consumed by a manual ICP process. Reducing that lag time from months to days or hours through automated collection can dramatically improve the success rate of ECA. There is currently some debate on whether ECA can truly occur after a manual collection or if it must occur before a manual ICP process. A number of eDiscovery analysts we have spoken to agree that, to be considered “early,” an ECA solution should utilize automated analysis through Proactive eDiscovery (aka archiving) or a Manage-in-Place capability combined with automated collection.
  2. Reduce the Risks of Under-Collection and Spoliation: With a manual ICP process, it is easy to overlook custodians with relevant data and under-collect. The process of iteratively, and slowly, identifying custodians in order to collect and preserve information may result in under-collection. Of note is the case Pension Comm. of the Univ. of Montreal Pension Plan v. Banc of America Sec. LLC, No. 05 Civ. 9016, 2010 U.S. Dist. Lexis 4546, at *1 (S.D.N.Y. Jan. 15, 2010), where e-discovery expert Judge Shira Scheindlin ruled that relying solely on employees to search and select responsive information without proper direction and supervision was grounds for spoliation sanctions. Automated ICP driven by the legal team can easily mitigate the need for, and costs of, relying on employees to identify relevant information.
  3. Reduce the Risks of Late Identification, Collection, and Preservation (ICP): In addition to inadvertent under-collection through process, some organizations miss ESI due to the time pressures associated with cases and produce ESI late. This can be especially damaging when the ESI is exculpatory or otherwise material to the case, as in Thompson v. United States Department of Housing & Urban Development, 219 F.R.D. 93 (D.Md. 2003), where HUD was not allowed to include 80,000 emails it produced after the eDiscovery cut-off deadline.
  4. Reduce Costs with Matter-based ICP: Traditional custodian-based analysis and review provides only limited visibility into the operations of the organization, as it assumes that the identified custodians have the relevant ESI. This can be problematic for a couple of reasons: (a) increased information risk for repeat custodians, who are often under multiple litigation preservation orders and may have essentially all their ESI on permanent hold, increasing the organization’s information risk profile; and (b) complying with the duty to preserve before litigation occurs, in situations (such as Adams v. Dell) where litigation is (or should be) anticipated but has not been initiated, can be expensive using Manual ICP, or later when sanctions are applied. Matter-based ICP with automated collection can reduce both the risk and the cost of ICP while keeping the organization in compliance with the FRCP.

Manual ICP is a slow process that increases information risk and can lead to under-collection, late collection, and spoliation. Automated collection, on the other hand, can enable ECA, fast collection, and Matter-based ICP. There is no question that automated ICP holds advantages over manual ICP. Given the risks associated with Manual ICP, the courts and industry thought leaders are correct to ask whether manual collections are still relevant and defensible. In this article, I hope to have provided some of the key benefits associated with Automated ICP to help further this discussion.

 